You can read JSON strings and files into pandas with read_json(). This works for URLs, files, compressed files and anything else that's in JSON format. In this post, you will learn how to do that with Python.
You pass the JSON data to the pandas read_json method, which parses it and returns a pandas DataFrame.
Related course: Data Analysis with Python Pandas
Read JSON
What is JSON?
JSON is shorthand for JavaScript Object Notation. This is a text format that is often used to exchange data on the web.
The format looks like this:
{
    "col1": {"row1": 1, "row2": 2, "row3": 3},
    "col2": {"row1": "x", "row2": "y", "row3": "z"}
}
In practice, this data is often on one line, like so:
{"col1":{"row1":1,"row2":2,"row3":3},"col2":{"row1":"x","row2":"y","row3":"z"}}
Many kinds of data can be stored in this format: strings, numbers (integer or floating point), booleans, null, lists and nested objects.
It's common for a web server to accept and return JSON. This is often how the frontend communicates with the backend.
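To make the value types concrete, here is a small sketch using Python's built-in json module. The document in `raw` and its field names are made up for illustration.

```python
import json

# Hypothetical JSON document that mixes the value types JSON supports.
raw = (
    '{"name": "x", "count": 3, "ratio": 0.5, '
    '"active": true, "note": null, "tags": ["a", "b"]}'
)

# json.loads parses the text into ordinary Python objects.
record = json.loads(raw)

# Each JSON value is mapped to the matching Python type.
for key, value in record.items():
    print(f"{key}: {value!r} ({type(value).__name__})")
```

JSON true and null become Python True and None, and a JSON array becomes a Python list.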
pandas.read_json
The example below parses a JSON string and converts it to a Pandas DataFrame.
# load pandas and json modules
You can run it to see the output:
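A minimal sketch of such an example, assuming the sample JSON string shown earlier. Recent pandas versions deprecate passing a literal JSON string to read_json, so the text is wrapped in io.StringIO here.

```python
from io import StringIO

import pandas as pd

# Sample JSON string from the section above: column -> {row -> value}.
json_text = (
    '{"col1":{"row1":1,"row2":2,"row3":3},'
    '"col2":{"row1":"x","row2":"y","row3":"z"}}'
)

# Wrap the string in a file-like object and parse it into a DataFrame.
df = pd.read_json(StringIO(json_text))

print(df)
```

The resulting DataFrame uses row1 to row3 as its index and has the columns col1 (integers) and col2 (strings).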
Load JSON from URL
To load JSON from a URL (API), you can use this code:
import requests
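One way this could look is sketched below. The helper name `load_json_from_url` and the example URL are hypothetical, and the sketch assumes the endpoint returns JSON that maps cleanly onto a table (for example a list of records).

```python
from io import StringIO

import pandas as pd
import requests


def load_json_from_url(url: str) -> pd.DataFrame:
    """Download a JSON document and parse it into a DataFrame."""
    response = requests.get(url, timeout=10)
    response.raise_for_status()  # fail early on HTTP errors such as 404
    return pd.read_json(StringIO(response.text))


# Example call with a hypothetical URL:
# df = load_json_from_url("https://example.com/data.json")
# print(df.head())
```

pd.read_json also accepts a URL directly; going through requests is mainly useful when you need to set headers, authentication or a timeout.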
Save to JSON file
A DataFrame can be saved as a JSON file. To do so, use the method to_json(filename).
If you want to save to a json file, you can do the following:
import pandas as pd
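As a sketch, a single-column DataFrame could be saved like this. The file name and the use of the system temp directory are assumptions made to keep the example self-contained.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical output location inside the system temp directory.
output_path = Path(tempfile.gettempdir()) / "example_single_column.json"

# A DataFrame with one unnamed column holding three values.
df = pd.DataFrame([1, 2, 3])

# With the default settings, the file maps each column label
# to an object of {row label: value} pairs.
df.to_json(output_path)

print(output_path.read_text())
```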
For a dataframe with several columns:
import pandas as pd
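A possible version with several columns is sketched here. It rebuilds the col1/col2 sample data from earlier in the post; the output path is a hypothetical temp-directory location.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical output location inside the system temp directory.
output_path = Path(tempfile.gettempdir()) / "example_several_columns.json"

# Two named columns and explicit row labels.
df = pd.DataFrame(
    {"col1": [1, 2, 3], "col2": ["x", "y", "z"]},
    index=["row1", "row2", "row3"],
)

df.to_json(output_path)

print(output_path.read_text())
```

The file content has the same column -> {row -> value} layout as the one-line JSON sample shown earlier.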
Load JSON from File
If the json data is stored in a file, you can load it into a DataFrame.
You can use the example above to create a json file, then use this example to load it into a dataframe.
import pandas as pd
df_f = pd.read_json('files/sample_file.json')
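A self-contained sketch of the same idea follows. It first writes the sample JSON to a file and then loads it; the temp-directory path stands in for 'files/sample_file.json' and is an assumption.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical file location inside the system temp directory.
json_path = Path(tempfile.gettempdir()) / "sample_file.json"

# Create the file first so the sketch runs on its own.
json_path.write_text(
    '{"col1":{"row1":1,"row2":2,"row3":3},'
    '"col2":{"row1":"x","row2":"y","row3":"z"}}'
)

# read_json accepts a file path and returns a DataFrame.
df_f = pd.read_json(json_path)

print(df_f)
```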
For a gzip-compressed file (.gz), use:
df_gzip = pd.read_json('sample_file.gz', compression='infer')
If the extension is .gz, .bz2, .zip, or .xz, the corresponding compression method is automatically selected. compression='infer' is the default, so the argument can also be left out.
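The round trip can be sketched as follows, assuming a hypothetical file in the system temp directory. to_json infers the compression from the file extension in the same way, so the same .gz path works for writing and reading.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical compressed file in the system temp directory.
gz_path = Path(tempfile.gettempdir()) / "sample_file.json.gz"

df = pd.DataFrame({"col1": [1, 2, 3], "col2": ["x", "y", "z"]})

# The .gz suffix makes to_json write gzip-compressed output.
df.to_json(gz_path)

# A gzip file starts with the two magic bytes 0x1f 0x8b.
is_gzip = gz_path.read_bytes()[:2] == b"\x1f\x8b"
print("gzip file:", is_gzip)

# Read it back; the compression is inferred from the extension.
df_gzip = pd.read_json(gz_path, compression="infer")
print(df_gzip)
```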
Pandas to JSON example
In the next example, you load data from a CSV file into a DataFrame, which you can then save as a JSON file.
You can load a csv file as a pandas dataframe:
df = pd.read_csv("data.csv")
Then save the DataFrame to JSON format:
# save a dataframe to json format:
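Put together, the two steps might look like the sketch below. The CSV content, the column names and both file paths are made up for illustration.

```python
import tempfile
from pathlib import Path

import pandas as pd

work_dir = Path(tempfile.gettempdir())
csv_path = work_dir / "data.csv"    # hypothetical input file
json_path = work_dir / "data.json"  # hypothetical output file

# Create a small CSV file so the sketch runs on its own.
csv_path.write_text("name,score\nalice,90\nbob,85\n")

# Load the CSV file into a DataFrame.
df = pd.read_csv(csv_path)

# Save the DataFrame in JSON format.
df.to_json(json_path)

print(json_path.read_text())
```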
This also works for Excel files: load them with pd.read_excel() and call to_json() on the resulting DataFrame.