DataFrame looping (iteration) with a for statement. You can loop over a pandas DataFrame column by column or row by row.
Related course: Data Analysis with Python Pandas
The examples below use the following pandas DataFrame.
import pandas as pd
This outputs the following DataFrame:
age state point
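The full construction code is not shown here, so the following is only a minimal sketch of how a DataFrame with these three columns might be built. The row labels (Alice, Bob) and all cell values are assumptions made up for illustration; only the column names come from the output above.

```python
import pandas as pd

# Hypothetical sample data: only the column names (age, state, point)
# come from the article; the row labels and values are assumptions.
df = pd.DataFrame(
    {"age": [20, 32], "state": ["NY", "CA"], "point": [64, 92]},
    index=["Alice", "Bob"],
)

print(df)
```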
Loop over columns
If you pass the DataFrame directly to a for loop, the column names are retrieved in order as follows:
for column_name in df:
    print(type(column_name))
This outputs:
<class 'str'>
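As a small self-contained sketch of the column loop, assuming a made-up DataFrame with string row labels (the data values are hypothetical):

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the article.
df = pd.DataFrame(
    {"age": [20, 32], "state": ["NY", "CA"], "point": [64, 92]},
    index=["Alice", "Bob"],
)

column_names = []
for column_name in df:
    # Each item is a column label (a str here), not the column data.
    print(type(column_name), column_name)
    column_names.append(column_name)

# The column data itself can be looked up by label.
print(df[column_names[0]])
```

Looping over the DataFrame this way visits the same labels as looping over df.columns.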
Iterate dataframe
.iteritems()
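The iteritems() method yields (column name, Series) tuples, giving you each column name together with that column's data as a pandas.Series. Note that iteritems() was deprecated in pandas 1.5 and removed in pandas 2.0; on current versions use items(), which returns the same tuples.
import pandas as pd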
This outputs:
<class 'str'>
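A minimal sketch of column-wise iteration, written with items() because that is the spelling that works on pandas 2.0 and later. The sample data is hypothetical.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the article.
df = pd.DataFrame(
    {"age": [20, 32], "state": ["NY", "CA"], "point": [64, 92]},
    index=["Alice", "Bob"],
)

seen = []
for column_name, column in df.items():
    # column_name is the label; column is a pandas Series for that column.
    print(type(column_name), column_name)
    print(type(column))
    # Values in the Series can be read by row label or by position.
    print(column["Alice"], column.iloc[1])
    seen.append((column_name, type(column)))
```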
.iterrows()
The iterrows() method yields (index, Series) tuples, one per row: the index label (row name) and that row's data as a pandas.Series.
import pandas as pd
This results in:
<class 'str'>
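A short sketch of row-wise iteration with iterrows(), again on a made-up DataFrame (labels and values are assumptions):

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the article.
df = pd.DataFrame(
    {"age": [20, 32], "state": ["NY", "CA"], "point": [64, 92]},
    index=["Alice", "Bob"],
)

labels = []
ages = []
for index, row in df.iterrows():
    # index is the row label; row is a Series indexed by column name.
    print(type(index), index)
    print(row["age"], row["state"], row["point"])
    labels.append(index)
    ages.append(row["age"])
```

Because each row is packed into a single Series, iterrows() does not preserve per-column dtypes across a row with mixed types.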
.itertuples()
You can use the itertuples() method to retrieve the index label (row name) and the data for that row, one row at a time. The first element of the tuple is the index label.
By default, it returns a namedtuple named Pandas. A namedtuple lets you access each element by attribute name (for example, row.age) in addition to by position with [].
import pandas as pd
This outputs the following:
<class 'pandas.core.frame.Pandas'>
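The following sketch shows both access styles on a namedtuple row, plus the index=False and name=None options for getting plain tuples. The sample data is hypothetical.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the article.
df = pd.DataFrame(
    {"age": [20, 32], "state": ["NY", "CA"], "point": [64, 92]},
    index=["Alice", "Bob"],
)

records = []
row_type_names = set()
for row in df.itertuples():
    row_type_names.add(type(row).__name__)
    # Position 0 is the index label; columns follow in order.
    # row.Index / row.age use attribute access, row[2] uses position.
    print(row.Index, row.age, row[2], row.point)
    records.append((row.Index, row.age, row[2]))

# name=None yields plain tuples; index=False drops the index label.
plain = list(df.itertuples(index=False, name=None))
print(plain)
```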
Retrieve column values
It’s possible to get the values of a specific column in order.
The iterrows() and itertuples() methods described above retrieve elements for all columns in each row. If you only need the elements of a particular column, you can write the following instead:
print(df['age'])
When you iterate over a Series in a for loop, you get its values in order. So if you select a column of the DataFrame and loop over it, you get that column's values in order.
for age in df['age']:
    print(age)
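As a minimal illustration, looping over one column visits only that column's values, without the row labels. The data here is made up.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the article.
df = pd.DataFrame(
    {"age": [20, 32], "state": ["NY", "CA"], "point": [64, 92]},
    index=["Alice", "Bob"],
)

ages = []
for age in df["age"]:
    # Only the values are yielded, in row order.
    print(age)
    ages.append(age)
```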
It is also possible to obtain the values of multiple columns together using the built-in function zip().
for age, point in zip(df['age'], df['point']):
    print(age, point)
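A small sketch of the zip() approach, pairing the values of two columns row by row (sample data is hypothetical):

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the article.
df = pd.DataFrame(
    {"age": [20, 32], "state": ["NY", "CA"], "point": [64, 92]},
    index=["Alice", "Bob"],
)

pairs = []
for age, point in zip(df["age"], df["point"]):
    # zip() walks both columns in lockstep, one row at a time.
    print(age, point)
    pairs.append((age, point))
```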
If you want to get the index (row names), use the index attribute.
print(df.index)
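The index can be looped over like a column, and it can be combined with zip() to pair each row name with a column value. A minimal sketch, assuming made-up row labels and values:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the article.
df = pd.DataFrame(
    {"age": [20, 32], "state": ["NY", "CA"], "point": [64, 92]},
    index=["Alice", "Bob"],
)

# Iterating over the index yields the row labels in order.
names = [name for name in df.index]

labelled_ages = []
for name, age in zip(df.index, df["age"]):
    print(name, age)
    labelled_ages.append((name, age))
```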