This article describes the pandas DataFrame data structure: how to create one, how it is indexed, and how to add and delete columns and rows.
In short: it's a two-dimensional data structure (like a table) with rows and columns.
Create DataFrame
What is a Pandas DataFrame
Pandas is a data manipulation module for Python. Its DataFrame lets you easily store and manipulate tabular data, organised in rows and columns.
A dataframe can be created from a list (see below), or from a dictionary or NumPy array (see bottom).
Create DataFrame from list
You can turn a single list into a pandas dataframe:
import pandas as pd
data = [1,2,3]
df = pd.DataFrame(data)
The contents of the dataframe are then:
>>> df
0
0 1
1 2
2 3
>>>
To the left of the contents, each row has an index label (0, 1, 2), and the single column is labelled 0 by default. This works for tables (lists of lists, one inner list per row) too:
import pandas as pd
data = [['Axel',32], ['Alice', 26], ['Alex', 45]]
df = pd.DataFrame(data,columns=['Name','Age'])
This outputs:
>>> df
Name Age
0 Axel 32
1 Alice 26
2 Alex 45
>>>
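As a minimal sketch of the two list-based constructions above, the snippet below builds both frames and inspects their shape and column labels. The variable names single and table are illustrative and do not come from the examples above.

```python
import pandas as pd

# A flat list becomes a single column, labelled 0 by default.
single = pd.DataFrame([1, 2, 3])

# A list of rows becomes a table; column names are passed explicitly.
table = pd.DataFrame([['Axel', 32], ['Alice', 26], ['Alex', 45]],
                     columns=['Name', 'Age'])

print(single.shape)         # (3, 1): three rows, one column
print(table.shape)          # (3, 2): three rows, two columns
print(list(table.columns))  # ['Name', 'Age']
```

Checking shape right after construction is a quick way to confirm that the list was read as rows rather than as columns.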
Columns
Select column
To select a column, you can use the column name.
Step 1: Create frame:
>>> df = pd.DataFrame(data,columns=['Name','Age'])
>>> df
Name Age
0 Axel 32
1 Alice 26
2 Alex 45
Step 2: Select by column name:
>>> df['Name']
0 Axel
1 Alice
2 Alex
Name: Name, dtype: object
>>> df['Age']
0 32
1 26
2 45
Name: Age, dtype: int64
>>>
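Selecting with a single name returns a Series, while selecting with a list of names returns a DataFrame. The sketch below illustrates the difference; the variable names ages, only_age and subset are made up for this example.

```python
import pandas as pd

df = pd.DataFrame([['Axel', 32], ['Alice', 26], ['Alex', 45]],
                  columns=['Name', 'Age'])

# Single brackets with one name return a Series.
ages = df['Age']

# Double brackets (a list of names) return a DataFrame,
# even when the list holds only one name.
only_age = df[['Age']]
subset = df[['Name', 'Age']]

print(type(ages).__name__)      # Series
print(type(only_age).__name__)  # DataFrame
```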
Column Addition
You can add a column to a dataframe. So this:
>>> df
Name Age
0 Axel 32
1 Alice 26
2 Alex 45
Becomes this:
>>> df
Name Age Example
0 Axel 32 1
1 Alice 26 2
2 Alex 45 3
>>>
Here's how to do that:
Step 1: Create the dataframe
>>> data = [['Axel',32], ['Alice', 26], ['Alex', 45]]
>>> df = pd.DataFrame(data,columns=['Name','Age'])
>>>
>>> df
Name Age
0 Axel 32
1 Alice 26
2 Alex 45
Step 2: Create a new dataframe that holds the new column
>>> c = pd.DataFrame([1,2,3], columns=['Example'])
Step 3: Assign that column to a new column name in your dataframe. Values are matched by index label:
>>> df['Example'] = c['Example']
>>> df
Name Age Example
0 Axel 32 1
1 Alice 26 2
2 Alex 45 3
>>>
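A second dataframe is not strictly needed to add a column. As a sketch of two alternatives, the snippet below assigns a plain list directly and then uses assign() to get a new frame without touching the original. The column name DoubleAge is hypothetical and only serves the illustration.

```python
import pandas as pd

df = pd.DataFrame([['Axel', 32], ['Alice', 26], ['Alex', 45]],
                  columns=['Name', 'Age'])

# Assign a list directly; its length must match the number of rows.
df['Example'] = [1, 2, 3]

# assign() returns a new frame with the extra column and leaves df unchanged.
df_with_double = df.assign(DoubleAge=df['Age'] * 2)

print(list(df.columns))
print(list(df_with_double.columns))
```

Direct assignment changes df in place, while assign() is convenient when the original frame should stay as it is.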
Column deletion
To delete a column, you can use the keyword del. The original dataframe:
>>> df
Name Age Example
0 Axel 32 1
1 Alice 26 2
2 Alex 45 3
Then delete it:
>>> del df['Example']
The column is now gone:
>>> df
Name Age
0 Axel 32
1 Alice 26
2 Alex 45
>>>
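The del keyword changes the dataframe in place. If the original should be kept, one option is drop(columns=...), which returns a new frame. A minimal sketch, where the name trimmed is illustrative:

```python
import pandas as pd

df = pd.DataFrame([['Axel', 32, 1], ['Alice', 26, 2], ['Alex', 45, 3]],
                  columns=['Name', 'Age', 'Example'])

# drop() returns a new frame without the listed columns; df is unchanged.
trimmed = df.drop(columns=['Example'])

print(list(df.columns))       # ['Name', 'Age', 'Example']
print(list(trimmed.columns))  # ['Name', 'Age']
```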
Rows
Select row
You can select a row using .loc[label].
>>> df
Name Age
0 Axel 32
1 Alice 26
2 Alex 45
>>>
>>> df.loc[0]
Name Axel
Age 32
Name: 0, dtype: object
>>>
>>> df.loc[2]
Name Alex
Age 45
Name: 2, dtype: object
>>>
You can also select by integer position with .iloc[position]. With the default index, label and position are the same:
>>> df.iloc[0]
Name Axel
Age 32
Name: 0, dtype: object
>>>
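With the default index, .loc and .iloc look interchangeable because labels and positions coincide. The sketch below uses hypothetical string labels ('a', 'b', 'c') to show where they differ.

```python
import pandas as pd

data = [['Axel', 32], ['Alice', 26], ['Alex', 45]]
# The string index labels are made up for this illustration.
df = pd.DataFrame(data, columns=['Name', 'Age'], index=['a', 'b', 'c'])

# .loc looks a row up by its index label.
by_label = df.loc['b']

# .iloc looks a row up by its integer position (0-based).
by_position = df.iloc[1]

print(by_label['Name'])     # Alice
print(by_position['Name'])  # Alice
```

Here df.loc[1] would raise a KeyError, because 1 is a position and not one of the labels.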
Append row
You can append a row with pd.concat(). The older .append() method on the dataframe was deprecated in pandas 1.4 and removed in pandas 2.0. First create a new dataframe:
>>> user = pd.DataFrame([['Vivian',33]], columns= ['Name','Age'])
Then add it to the existing dataframe:
>>> df = pd.concat([df, user])
>>> df
Name Age
0 Axel 32
1 Alice 26
2 Alex 45
0 Vivian 33
>>>
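In the output above the label 0 appears twice, because each frame keeps its own index. A minimal sketch, assuming a freshly numbered index is wanted, is to combine the frames with pd.concat and ignore_index=True. The name combined is illustrative.

```python
import pandas as pd

df = pd.DataFrame([['Axel', 32], ['Alice', 26], ['Alex', 45]],
                  columns=['Name', 'Age'])
user = pd.DataFrame([['Vivian', 33]], columns=['Name', 'Age'])

# ignore_index=True discards the old labels and renumbers from zero.
combined = pd.concat([df, user], ignore_index=True)

print(combined.index.tolist())  # [0, 1, 2, 3]
print(combined)
```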
Delete row
To delete a row, you can use the method .drop(index).
Start by creating a frame:
>>> data = [['Axel',32], ['Alice', 26], ['Alex', 45]]
>>> df = pd.DataFrame(data,columns=['Name','Age'])
>>> df
Name Age
0 Axel 32
1 Alice 26
2 Alex 45
Let's delete the first row:
>>> df = df.drop(0)
>>> df
Name Age
1 Alice 26
2 Alex 45
>>>
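After the drop, the remaining rows keep their old labels (1 and 2). If a continuous index starting at zero is preferred, one possible follow-up is reset_index(drop=True), sketched below. The name remaining is illustrative.

```python
import pandas as pd

df = pd.DataFrame([['Axel', 32], ['Alice', 26], ['Alex', 45]],
                  columns=['Name', 'Age'])

# Drop the row labelled 0, then renumber the index from zero.
# drop=True discards the old labels instead of keeping them as a column.
remaining = df.drop(0).reset_index(drop=True)

print(remaining.index.tolist())  # [0, 1]
print(remaining)
```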
How to Create a Pandas DataFrame
Create DataFrame from dictionary
If you have a dictionary, you can turn it into a dataframe.
>>> import pandas as pd
>>> d = {'one':[1,2,3], 'two':[2,3,4], 'three':[3,4,5] }
>>> df = pd.DataFrame(d)
>>> df
one two three
0 1 2 3
1 2 3 4
2 3 4 5
>>>
The keys in the dictionary become the columns of the DataFrame. The dictionary holds no values for the index, so unless you set it yourself, the default is to count from zero.
>>> df = pd.DataFrame(d, index=['first','second','third'])
>>> df
one two three
first 1 2 3
second 2 3 4
third 3 4 5
>>>
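To tie the two dictionary examples together, the sketch below builds the frame once with the default index and once with custom labels, then reads a single cell by row label and column name. The names default_df and labelled are illustrative.

```python
import pandas as pd

d = {'one': [1, 2, 3], 'two': [2, 3, 4], 'three': [3, 4, 5]}

# Without an index argument, rows are labelled 0, 1, 2.
default_df = pd.DataFrame(d)

# With an index argument, rows get the given labels.
labelled = pd.DataFrame(d, index=['first', 'second', 'third'])

print(default_df.index.tolist())      # [0, 1, 2]
print(labelled.loc['second', 'two'])  # 3
```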
Create DataFrame from array
An array (NumPy array) can be converted into a dataframe too.
>>> import numpy as np
>>> ar = np.array([[1,2,3],[4,5,6],[6,7,8]])
>>> ar
array([[1, 2, 3],
[4, 5, 6],
[6, 7, 8]])
Then turn it into a dataframe with the line:
>>> df = pd.DataFrame(ar)
>>> df
0 1 2
0 1 2 3
1 4 5 6
2 6 7 8
>>>
When you create a DataFrame from a multi-dimensional array, you can assign the columns and index yourself. Otherwise you get the default integer labels, which are harder to read.
>>> df = pd.DataFrame(ar, index=['A','B','C'], columns=['One','Two','Three'])
>>> df
One Two Three
A 1 2 3
B 4 5 6
C 6 7 8
>>>
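As a short check of the labelled frame above, the sketch below looks up one cell by its row and column label and converts the frame back to a NumPy array with to_numpy(). The name round_trip is illustrative.

```python
import numpy as np
import pandas as pd

ar = np.array([[1, 2, 3], [4, 5, 6], [6, 7, 8]])
df = pd.DataFrame(ar, index=['A', 'B', 'C'], columns=['One', 'Two', 'Three'])

# Labels give readable access to individual cells.
print(df.loc['B', 'Two'])  # 5

# to_numpy() returns the values as a NumPy array again, without the labels.
round_trip = df.to_numpy()
print(round_trip.shape)    # (3, 3)
```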
Create from DataFrame
You can copy parts of a dataframe into a new dataframe. Using the dataframe above:
>>> df2 = df[['One','Two']].copy()
>>> df2
One Two
A 1 2
B 4 5
C 6 7
>>>
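The .copy() call matters when the new dataframe will be modified. A minimal sketch showing that a change to the copy does not reach the original (the value 100 is arbitrary):

```python
import numpy as np
import pandas as pd

ar = np.array([[1, 2, 3], [4, 5, 6], [6, 7, 8]])
df = pd.DataFrame(ar, index=['A', 'B', 'C'], columns=['One', 'Two', 'Three'])

# Take two columns as an independent copy.
df2 = df[['One', 'Two']].copy()

# Changing the copy leaves the original frame untouched.
df2.loc['A', 'One'] = 100

print(df2.loc['A', 'One'])  # 100
print(df.loc['A', 'One'])   # 1
```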
Create from CSV
If you have a CSV file (Google Sheets can save as CSV), you can load it like this:
# Import pandas as pd
import pandas as pd
# Import the cats.csv data: cats
cats = pd.read_csv('cats.csv')
# Print out cats
print(cats)
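The example above needs a cats.csv file on disk. As a self-contained sketch, the snippet below feeds read_csv an in-memory string instead. The column names and rows are hypothetical stand-ins, since the contents of cats.csv are not given here.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for cats.csv; names and values are made up.
csv_text = "name,age\nTom,3\nFelix,5\n"

# read_csv accepts a file path or any file-like object, such as StringIO.
cats = pd.read_csv(io.StringIO(csv_text))

print(cats.shape)  # (2, 2)
print(cats)
```

By default the first line of the file is used as the column names, and the rows get the default index counting from zero.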