To read a csv file as a pandas.DataFrame, use the pandas function read_csv() or read_table().
There is almost no difference between read_csv() and read_table(). In fact, in the pandas source both call the same underlying function; only the default delimiter differs:
- read_csv() uses a comma (,) as the delimiter
- read_table() uses a tab (\t) as the delimiter
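As a minimal sketch of that equivalence, the snippet below reads the same small table once as comma-separated and once as tab-separated text. The data is made up for illustration and is held in memory with io.StringIO, so no file is needed.

```python
from io import StringIO

import pandas as pd

# Hypothetical in-memory data, used only for illustration.
comma_text = "a,b\n1,2\n3,4\n"
tab_text = "a\tb\n1\t2\n3\t4\n"

df_csv = pd.read_csv(StringIO(comma_text))    # default delimiter is ','
df_table = pd.read_table(StringIO(tab_text))  # default delimiter is '\t'

# Overriding sep lets read_table() parse comma-separated text as well.
df_table_comma = pd.read_table(StringIO(comma_text), sep=",")

print(df_csv.equals(df_table))        # True
print(df_csv.equals(df_table_comma))  # True
```

Passing sep explicitly is usually the clearest choice when the delimiter is anything other than the function's default.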
Related course: Data Analysis with Python Pandas
Read CSV
Read csv with Python
The pandas function read_csv() reads in values, where the delimiter is a comma character.
You can export a file into a csv file in any modern office suite including Google Sheets.
Use the following csv data as an example.
name,age,state,point
Alice,24,NY,64
Bob,42,CA,92
Charlie,18,CA,70
Dave,68,TX,70
Ellen,24,CA,88
Frank,30,NY,57
You can load the csv like this:
# Load pandas
import pandas as pd
It then outputs the data frame:
# age state point
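The listing above is cut off, so here is a hedged sketch of what loading the sample data could look like. It keeps the six sample rows in memory with io.StringIO so it runs on its own; with a real file you would pass its path instead. Using index_col=0 is an assumption, chosen because the output header above shows only age, state and point, which suggests the name column was used as the index.

```python
from io import StringIO

import pandas as pd

# The sample data, held in memory so the sketch is self-contained.
csv_text = """name,age,state,point
Alice,24,NY,64
Bob,42,CA,92
Charlie,18,CA,70
Dave,68,TX,70
Ellen,24,CA,88
Frank,30,NY,57
"""

# Assumption: the first column (name) becomes the index.
df = pd.read_csv(StringIO(csv_text), index_col=0)

print(df)
print(df.loc["Bob", "point"])  # 92
```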
If you want to export data from a DataFrame or pandas.Series as a csv file or append it to an existing csv file, use the to_csv() method.
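A small sketch of both uses of to_csv() follows: writing a new file, then appending rows to it. The frames and the file name out.csv are hypothetical, and a temporary directory is used so nothing is left behind.

```python
import os
import tempfile

import pandas as pd

# Hypothetical frames, used only for illustration.
df = pd.DataFrame({"name": ["Alice", "Bob"], "age": [24, 42]})
extra = pd.DataFrame({"name": ["Charlie"], "age": [18]})

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "out.csv")  # hypothetical file name

    # Write a new csv file; index=False leaves out the row index.
    df.to_csv(path, index=False)

    # Append to the existing file; header=False avoids a second header line.
    extra.to_csv(path, mode="a", header=False, index=False)

    result = pd.read_csv(path)

print(result)
```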
Read csv without header
Read a csv file that does not have a header (header line):
11,12,13,14
21,22,23,24
31,32,33,34
Specify the file path as either an absolute path or a path relative to the current directory (the working directory).
If no arguments are set, the first line is treated as a header and its values are assigned as the column names (columns).
import pandas as pd
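To illustrate the default behaviour, the sketch below reads the headerless sample with no extra arguments. The data is held in memory with io.StringIO instead of being read from a file, which is a choice made here to keep the example runnable.

```python
from io import StringIO

import pandas as pd

# The headerless sample data, held in memory.
csv_text = "11,12,13,14\n21,22,23,24\n31,32,33,34\n"

# No arguments: the first line is consumed as the header.
df = pd.read_csv(StringIO(csv_text))

print(df.columns.tolist())  # ['11', '12', '13', '14']
print(df.shape)             # (2, 4)
```

Only two data rows remain, because the first row was taken as the column names.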
If header=None, sequential integers starting at 0 are assigned as the column names (columns).
df_none = pd.read_csv('data/src/sample.csv', header=None)
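As a quick check of what header=None produces, this sketch reads the same headerless sample from memory (io.StringIO stands in for the file path) and inspects the resulting column labels.

```python
from io import StringIO

import pandas as pd

csv_text = "11,12,13,14\n21,22,23,24\n31,32,33,34\n"

# header=None: no line is treated as a header, so every line is data.
df_none = pd.read_csv(StringIO(csv_text), header=None)

print(df_none.columns.tolist())  # [0, 1, 2, 3]
print(df_none.shape)             # (3, 4)
print(df_none[0].tolist())       # [11, 21, 31]
```

The column labels are integers, not strings, so a column is selected with df_none[0] and not df_none['0'].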
With names=('A', 'B', 'C', 'D'), arbitrary values can be set as the column names. Pass them as a list or a tuple.
df_names = pd.read_csv('data/src/sample.csv', names=('A', 'B', 'C', 'D'))
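The sketch below shows names on the headerless sample and, as a related case, on a file that already has a header line. In the second case header=0 is passed as well, so that the existing header is replaced and not read as data. Both inputs are in-memory strings used for illustration.

```python
from io import StringIO

import pandas as pd

no_header_text = "11,12,13,14\n21,22,23,24\n31,32,33,34\n"
with_header_text = "a,b,c,d\n" + no_header_text

# Headerless file: names supplies the column names, all lines are data.
df_names = pd.read_csv(StringIO(no_header_text), names=["A", "B", "C", "D"])
print(df_names.columns.tolist())  # ['A', 'B', 'C', 'D']
print(df_names.shape)             # (3, 4)

# File with a header: header=0 together with names replaces the old names.
df_renamed = pd.read_csv(
    StringIO(with_header_text), names=["A", "B", "C", "D"], header=0
)
print(df_renamed.columns.tolist())  # ['A', 'B', 'C', 'D']
print(df_renamed.shape)             # (3, 4)
```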
Read csv with header
Read the following csv file with header:
a,b,c,d
11,12,13,14
21,22,23,24
31,32,33,34
Specify the zero-based line number of the header, such as header=0. The default (header='infer') behaves like header=0 when names is not given, so if the first line is the header, the result is the same whether or not you pass it.
df_header = pd.read_csv('data/src/sample_header.csv')
The line specified by header is used for the column names, data is read from the line after it, and the lines above it are ignored.
df_header_2 = pd.read_csv('data/src/sample_header.csv', header=2)
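To make the effect of the header argument concrete, this sketch reads the sample with the default and with header=2. The data is kept in memory with io.StringIO, which is a stand-in for the file used above.

```python
from io import StringIO

import pandas as pd

header_text = "a,b,c,d\n11,12,13,14\n21,22,23,24\n31,32,33,34\n"

# Default: the first line (line 0) supplies the column names.
df_header = pd.read_csv(StringIO(header_text))
print(df_header.columns.tolist())  # ['a', 'b', 'c', 'd']
print(df_header.shape)             # (3, 4)

# header=2: line 2 ("21,22,23,24") supplies the names, lines 0-1 are dropped.
df_header_2 = pd.read_csv(StringIO(header_text), header=2)
print(df_header_2.columns.tolist())  # ['21', '22', '23', '24']
print(df_header_2.shape)             # (1, 4)
```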
Read csv with index
Read a csv file with header and index (header column), such as:
,a,b,c,d
ONE,11,12,13,14
TWO,21,22,23,24
THREE,31,32,33,34
If nothing is specified, the index column is not recognized and is read as an ordinary data column.
To use it as the index, add index_col=0. The argument specifies the number of the column to use as the index, counting from 0.
df_header_index_col = pd.read_csv('data/src/sample_header_index.csv', index_col=0)
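The difference can be seen in a short sketch that reads the sample both ways. The data is held in memory with io.StringIO so the example runs without the file.

```python
from io import StringIO

import pandas as pd

index_text = ",a,b,c,d\nONE,11,12,13,14\nTWO,21,22,23,24\nTHREE,31,32,33,34\n"

# Without index_col the label column is ordinary data with a placeholder name.
df_plain = pd.read_csv(StringIO(index_text))
print(df_plain.columns.tolist())  # ['Unnamed: 0', 'a', 'b', 'c', 'd']

# With index_col=0 the first column becomes the row index.
df_indexed = pd.read_csv(StringIO(index_text), index_col=0)
print(df_indexed.index.tolist())   # ['ONE', 'TWO', 'THREE']
print(df_indexed.loc["TWO", "b"])  # 22
```

With the index in place, rows can be selected by label through .loc.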