Parse JSON with Python

May 19, 2023

Python has become a popular choice for application backend development and machine learning. Python is also frequently trending on Twitter, another sign of its growing popularity. For many Python developers, parsing JSON is a common task. The Veryfi OCR API Platform integrates AI-driven OCR with a web application backend and returns its results as formatted JSON. Veryfi provides SDKs for popular programming languages, including Python.

What is Python Programming

Python is a programming language suitable for a wide range of tasks, including data analysis, machine learning, and web development. One of the most common data formats used in web development is JSON (JavaScript Object Notation), which is a lightweight, human-readable format for storing and exchanging data. In this article, we will explore how to use Python to parse and manipulate JSON data.

What is JSON

JSON is a data format whose objects resemble Python dictionaries. A JSON object is composed of key-value pairs, where each key is a string and each value can be a string, number, boolean, null, array, or another JSON object. JSON is often used to transmit data between a server and a web application, or between different parts of a web application.
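
To make the correspondence concrete, here is a minimal sketch that parses a document containing each JSON value type and prints the Python type it becomes. The sample keys and values are made up for illustration.

```python
import json

# Hypothetical sample document that uses every JSON value type.
sample = (
    '{"text": "hi", "count": 3, "ratio": 0.5, "flag": true, '
    '"nothing": null, "items": [1, 2], "nested": {"k": "v"}}'
)

parsed = json.loads(sample)

# Print the Python type that each JSON value was converted to.
for key, value in parsed.items():
    print(f"{key}: {type(value).__name__}")
```

JSON objects become dict, arrays become list, strings become str, numbers become int or float, true and false become True and False, and null becomes None.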

Python provides several libraries for working with JSON, including the built-in json module. The json module provides functions for parsing and manipulating JSON data, and is included in the standard library.

How to Parse JSON with Python

The first step in working with JSON in Python is to parse the JSON data. The json.loads() function converts a string of JSON data into the equivalent Python object; a JSON object, as in the example below, becomes a Python dictionary. For example:

import json

json_data = '{"name": "John Smith", "age": 35, "is_employee": true}'

data = json.loads(json_data)

print(data)

This will output the following:

{'name': 'John Smith', 'age': 35, 'is_employee': True}
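
Input is not always valid JSON, and json.loads() raises json.JSONDecodeError when it is not. The sketch below shows one way to handle that. The helper name parse_or_none is hypothetical and not part of the json module.

```python
import json

def parse_or_none(text):
    """Return the parsed value, or None if text is not valid JSON.

    Hypothetical helper for illustration. Note that the JSON literal
    null also parses to None, so callers that need to tell the two
    cases apart should catch the exception themselves.
    """
    try:
        return json.loads(text)
    except json.JSONDecodeError as err:
        # The exception records where parsing failed.
        print(f"Invalid JSON at line {err.lineno}, column {err.colno}: {err.msg}")
        return None

good = parse_or_none('{"name": "John Smith", "age": 35}')
bad = parse_or_none("{'name': 'John Smith'}")  # single quotes are not valid JSON
```

A frequent cause of this error is passing the printed form of a Python dictionary, which uses single quotes, where JSON text is expected.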

The json.loads() function expects a string rather than a file object. To parse a file containing JSON data, open the file with the open() function, read its contents, and pass the resulting string to json.loads():

with open('data.json') as json_file:
    data = json.loads(json_file.read())

print(data)
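
The json module also provides json.load(), without the trailing "s", which reads from an open file object directly. The following is a minimal sketch of that approach. It first writes a sample file to a temporary directory so that it can run on its own; the path and contents are assumptions for illustration.

```python
import json
import tempfile
from pathlib import Path

# Create a small sample file so the example is self-contained.
path = Path(tempfile.mkdtemp()) / "data.json"
path.write_text(
    '{"name": "John Smith", "age": 35, "is_employee": true}',
    encoding="utf-8",
)

# json.load() takes the file object itself, so no read() call is needed.
with open(path, encoding="utf-8") as json_file:
    loaded = json.load(json_file)

print(loaded["age"])
```

Passing encoding="utf-8" explicitly avoids depending on the platform's default text encoding.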

After parsing the JSON data, you can access and manipulate it like a normal Python dictionary. For example, you can access the value of a specific key using the square bracket notation:

print(data['name']) # Outputs: John Smith
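
Square brackets raise KeyError when a key is absent, which is common with optional fields. As a small sketch, dict.get() can supply a default instead, and the parsed dictionary can be modified like any other. The "department" and "skills" keys are invented for this example.

```python
import json

data = json.loads('{"name": "John Smith", "age": 35, "is_employee": true}')

# dict.get() returns the given default instead of raising KeyError.
department = data.get("department", "unknown")

# The parsed result is an ordinary dict, so it can be updated in place.
data["age"] += 1
data["skills"] = ["python", "json"]

print(department)
print(data["age"])
```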

You can also use the json.dumps() function to convert a Python dictionary into a string of JSON data. This is useful for sending JSON data to a server, or for saving JSON data to a file. For example:

json_data = json.dumps(data)

print(json_data)

This will output the following:

{"name": "John Smith", "age": 35, "is_employee": true}
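
json.dumps() accepts optional arguments that control the layout of its output. The sketch below shows two common choices: indented output for people to read, and compact output for transmission.

```python
import json

data = {"name": "John Smith", "age": 35, "is_employee": True}

# Indented output with keys in alphabetical order, easier to read and diff.
pretty = json.dumps(data, indent=2, sort_keys=True)
print(pretty)

# Compact output with no spaces after separators, smaller on the wire.
compact = json.dumps(data, separators=(",", ":"))
print(compact)
```

Both strings parse back to the same dictionary; only the whitespace and key order differ.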

You can also use the json.dump() function to write JSON data to a file. This function works just like the json.dumps() function, but it writes the JSON data to a file instead of returning it as a string:

with open('data.json', 'w') as json_file:
    json.dump(data, json_file)
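
To check that a file was written correctly, it can be read back and compared with the original data. Here is a minimal round-trip sketch, assuming a temporary output path and a made-up record containing non-ASCII characters.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical record; the accented characters exercise ensure_ascii.
record = {"name": "Zoë Müller", "age": 35, "is_employee": True}

out_path = Path(tempfile.mkdtemp()) / "record.json"

# ensure_ascii=False writes the characters as-is instead of \uXXXX escapes.
with open(out_path, "w", encoding="utf-8") as json_file:
    json.dump(record, json_file, indent=2, ensure_ascii=False)

# Read the file back and confirm nothing was lost.
with open(out_path, encoding="utf-8") as json_file:
    restored = json.load(json_file)

print(restored == record)
```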

Third-Party Libraries for Parsing JSON With Python

In addition to the json module, there are also several third-party libraries available to parse JSON with Python. One popular library is the jsonpath-rw library, which allows you to extract specific values from a JSON object using a JSONPath expression. JSONPath is similar to XPath, but enables navigating JSON data rather than XML data.

Here is an example of using the jsonpath-rw library to extract specific values from a JSON object using a JSONPath expression:

from jsonpath_rw import parse
import json

# Sample JSON data
json_data = '{"employees": [{"name": "John Smith", "age": 35, "is_employee": true}, {"name": "Jane Doe", "age": 32, "is_employee": false}]}'

# Parse the JSON data
data = json.loads(json_data)

# Create a JSONPath expression to extract the name of the first employee
jsonpath_expr = parse('$.employees[0].name')

# Use the find() method to collect every match for the expression
matches = jsonpath_expr.find(data)

# Extract the value from the first match object
first_employee_name = matches[0].value

# Print the result
print(first_employee_name) # Outputs: John Smith

In this example, we first import the parse function from the jsonpath_rw library, parse the JSON data with json.loads(), and create a JSONPath expression that selects the name of the first employee. We then call the expression's find() method, which returns a list of match objects, and read the value attribute of the first match. In this case the value is “John Smith”.

Note that the jsonpath-rw library is not included in the Python standard library and must be installed separately with pip (pip install jsonpath-rw). JSONPath syntax resembles XPath syntax but is not identical to it. The jsonpath-rw documentation describes the JSONPath syntax it supports for extracting values from JSON data.
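
For comparison, when a query is as simple as the one above, the same values can be reached with ordinary indexing and list comprehensions from plain Python, with no extra dependency. This sketch is an alternative to JSONPath, not a use of the jsonpath-rw API.

```python
import json

# Same sample data as in the jsonpath-rw example.
json_data = '{"employees": [{"name": "John Smith", "age": 35, "is_employee": true}, {"name": "Jane Doe", "age": 32, "is_employee": false}]}'
data = json.loads(json_data)

# Equivalent of the JSONPath expression $.employees[0].name
first_employee_name = data["employees"][0]["name"]

# Names of all employees, in document order.
all_names = [employee["name"] for employee in data["employees"]]

# Names of only those entries whose is_employee flag is true.
current_staff = [e["name"] for e in data["employees"] if e["is_employee"]]

print(first_employee_name)
print(all_names)
print(current_staff)
```

JSONPath tends to pay off when queries are deeply nested, recursive, or supplied at run time as strings.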

Next Steps

Now that you know how to parse JSON with Python, you’re ready to use the Veryfi OCR API Platform. Read how to capture data from a receipt or invoice in 5 lines of Python code!

Capture data from an invoice in 5 lines of Python code

And, to explore further, please sign up for a free account on the Veryfi platform. Once you have a free account, you can use the Veryfi Python SDK to get started with the OCR API Platform.