Read Excel files (extensions: .xlsx, .xls) with Python pandas. To read an Excel file as a DataFrame, use the pandas read_excel() function.

You can read the first sheet, a specific sheet, multiple sheets or all sheets. pandas converts the sheet contents to a DataFrame, a table-like structure of rows and columns.

Excel

In this article we use an example Excel file. The programs we write read it into Python.

Create an Excel file with two sheets, sheet1 and sheet2. You can use any program that can save Excel files, such as Microsoft Excel or Google Sheets.

The contents of each are as follows:

sheet1:

(the top-left cell is empty)

       A   B   C
one    11  12  13
two    21  22  23
three  31  32  33

sheet2:

(the top-left cell is empty)

       AA  BB  CC
ONE    11  12  13
TWO    21  22  23
THREE  31  32  33
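If you would rather generate the example file from Python than from a spreadsheet program, something like the following sketch could work. It assumes openpyxl is installed as the writer engine, and it takes the cell values from the outputs shown later in this article.

```python
import pandas as pd

# Cell values mirror the outputs shown later in this article.
values = [[11, 12, 13], [21, 22, 23], [31, 32, 33]]
sheet1 = pd.DataFrame(values, index=["one", "two", "three"], columns=["A", "B", "C"])
sheet2 = pd.DataFrame(values, index=["ONE", "TWO", "THREE"], columns=["AA", "BB", "CC"])

# Write both frames into one workbook. The index has no name,
# so the top-left cell of each sheet stays empty.
with pd.ExcelWriter("sample.xlsx", engine="openpyxl") as writer:
    sheet1.to_excel(writer, sheet_name="sheet1")
    sheet2.to_excel(writer, sheet_name="sheet2")
```

This writes sample.xlsx into the current working directory, which is where the examples below expect to find it.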

Install xlrd

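pandas.read_excel() relies on an external library to parse Excel files; for a long time this was xlrd. Note that recent pandas versions read .xlsx files with openpyxl instead and use xlrd only for .xls.

xlrd is a library for reading (input) Excel files in Python. Versions before 2.0 read both .xlsx and .xls; versions 2.0 and later read only .xls.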

If you call pandas.read_excel() in an environment where xlrd is not installed, you will receive an error message similar to the following:

ImportError: Install xlrd >= 0.9.0 for Excel support

xlrd can be installed with pip (or pip3, depending on the environment).

$ pip install xlrd
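Before calling read_excel(), you can check from Python whether a reader library is importable in the current environment. The following is a minimal sketch using only the standard library; the helper name available_excel_engines and its default candidate list are assumptions made for illustration, not part of pandas.

```python
import importlib.util


def available_excel_engines(candidates=("xlrd", "openpyxl")):
    """Return the candidate libraries that can be imported in this environment."""
    # find_spec returns None when a top-level module cannot be found.
    return [name for name in candidates if importlib.util.find_spec(name) is not None]


print(available_excel_engines())
```

An empty list means neither library is installed, and read_excel() would fail with an ImportError like the one above.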

Read excel

Specify the path or URL of the Excel file as the first argument. If the file has multiple sheets, pandas reads only the first one by default and returns it as a DataFrame.

import pandas as pd

df = pd.read_excel('sample.xlsx')

print(df)

The code above outputs the content of the first sheet. Because the top-left cell of the sheet is empty, pandas names the first column Unnamed: 0:

  Unnamed: 0   A   B   C
0        one  11  12  13
1        two  21  22  23
2      three  31  32  33
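The first column here holds row labels rather than data. One way to treat it as the index is the index_col argument of read_excel(). The sketch below builds a small stand-in workbook in a temporary directory so that it runs on its own; it assumes openpyxl is installed for writing and reading .xlsx.

```python
import os
import tempfile

import pandas as pd

# Build a small stand-in workbook so the example runs on its own.
path = os.path.join(tempfile.mkdtemp(), "sample.xlsx")
pd.DataFrame(
    [[11, 12, 13], [21, 22, 23], [31, 32, 33]],
    index=["one", "two", "three"],
    columns=["A", "B", "C"],
).to_excel(path, sheet_name="sheet1")

# index_col=0 turns the first column into the row labels.
df = pd.read_excel(path, index_col=0)
print(df)

# Rows can now be selected by label.
print(df.loc["two", "B"])
```

With the index set this way, the Unnamed: 0 column no longer appears among the data columns.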

Get sheet

You can specify the sheet to read with the argument sheet_name.

Specify by number (starting at 0):

# index_col=0 uses the first column as the row labels, as in the output below
df_sheet_index = pd.read_excel('sample.xlsx', sheet_name=1, index_col=0)

print(df_sheet_index)
#        AA  BB  CC
# ONE    11  12  13
# TWO    21  22  23
# THREE  31  32  33

Specify by sheet name:

# index_col=0 uses the first column as the row labels, as in the output below
df_sheet_name = pd.read_excel('sample.xlsx', sheet_name='sheet2', index_col=0)

print(df_sheet_name)
#        AA  BB  CC
# ONE    11  12  13
# TWO    21  22  23
# THREE  31  32  33
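If you do not know the sheet names in advance, pandas.ExcelFile exposes them through its sheet_names attribute. The sketch below is self-contained: it first writes a stand-in workbook to a temporary directory (assuming openpyxl is installed), then lists the sheets and reads the last one by name.

```python
import os
import tempfile

import pandas as pd

# Stand-in workbook with two sheets so the example runs on its own.
path = os.path.join(tempfile.mkdtemp(), "sample.xlsx")
data = [[11, 12, 13], [21, 22, 23], [31, 32, 33]]
first = pd.DataFrame(data, index=["one", "two", "three"], columns=["A", "B", "C"])
second = pd.DataFrame(data, index=["ONE", "TWO", "THREE"], columns=["AA", "BB", "CC"])
with pd.ExcelWriter(path) as writer:
    first.to_excel(writer, sheet_name="sheet1")
    second.to_excel(writer, sheet_name="sheet2")

# Open the workbook once, list its sheets, then read one of them by name.
with pd.ExcelFile(path) as xls:
    names = xls.sheet_names
    last_sheet = pd.read_excel(xls, sheet_name=names[-1], index_col=0)

print(names)
print(last_sheet)
```

Passing the ExcelFile object to read_excel() avoids opening the file a second time.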

Load multiple sheets

You can also pass a list to the sheet_name argument. Each element can be either a 0-based sheet number or a sheet name.

The result is a dictionary (an OrderedDict in older pandas versions) whose keys are the specified numbers or sheet names and whose values are the corresponding DataFrames.

# index_col=0 uses the first column of every sheet as the row labels
df_sheet_multi = pd.read_excel('sample.xlsx', sheet_name=[0, 'sheet2'], index_col=0)

print(df_sheet_multi)

Then you can use it like this:

print(df_sheet_multi[0])
#         A   B   C
# one    11  12  13
# two    21  22  23
# three  31  32  33

print(type(df_sheet_multi[0]))
# <class 'pandas.core.frame.DataFrame'>

print(df_sheet_multi['sheet2'])
#        AA  BB  CC
# ONE    11  12  13
# TWO    21  22  23
# THREE  31  32  33

print(type(df_sheet_multi['sheet2']))
# <class 'pandas.core.frame.DataFrame'>

Load all sheets

If the sheet_name argument is None, all sheets are read.

# index_col=0 uses the first column of every sheet as the row labels
df_sheet_all = pd.read_excel('sample.xlsx', sheet_name=None, index_col=0)
print(df_sheet_all)

In this case, the sheet names become the keys of the returned dictionary.

print(df_sheet_all['sheet1'])
#         A   B   C
# one    11  12  13
# two    21  22  23
# three  31  32  33

print(type(df_sheet_all['sheet1']))
# <class 'pandas.core.frame.DataFrame'>

print(df_sheet_all['sheet2'])
#        AA  BB  CC
# ONE    11  12  13
# TWO    21  22  23
# THREE  31  32  33

print(type(df_sheet_all['sheet2']))
# <class 'pandas.core.frame.DataFrame'>
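Because the result is a dictionary, you can loop over it to process every sheet in turn. The following sketch writes a stand-in workbook to a temporary directory (assuming openpyxl is installed), reads all sheets, and collects the shape of each one.

```python
import os
import tempfile

import pandas as pd

# Stand-in workbook with two sheets so the example runs on its own.
path = os.path.join(tempfile.mkdtemp(), "sample.xlsx")
data = [[11, 12, 13], [21, 22, 23], [31, 32, 33]]
first = pd.DataFrame(data, index=["one", "two", "three"], columns=["A", "B", "C"])
second = pd.DataFrame(data, index=["ONE", "TWO", "THREE"], columns=["AA", "BB", "CC"])
with pd.ExcelWriter(path) as writer:
    first.to_excel(writer, sheet_name="sheet1")
    second.to_excel(writer, sheet_name="sheet2")

# sheet_name=None returns {sheet name: DataFrame} for every sheet.
sheets = pd.read_excel(path, sheet_name=None, index_col=0)

# Collect (rows, columns) per sheet and print a one-line summary of each.
shapes = {name: frame.shape for name, frame in sheets.items()}
for name, frame in sheets.items():
    print(name, frame.shape, list(frame.columns))
```

The keys come back in the same order as the sheets appear in the workbook.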