When working with data in Python, I often need to convert DataFrames to JSON arrays for web applications, API integration, or data interchange. Converting Pandas DataFrames to JSON format is a common requirement that I encounter regularly in my projects.
In this tutorial, I will show you multiple methods to convert a DataFrame to a JSON array in Python. These approaches have helped me streamline data processing workflows and ensure compatibility across different systems.
So, let's get into the topic.
JSON Array in Python
A JSON array is an ordered collection of values enclosed in square brackets. Each value in the array can be a string, number, boolean, null, object, or another type.
JSON (JavaScript Object Notation) is a lightweight data interchange format that’s easy for humans to read and write, and easy for machines to parse and generate.
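For example, the following snippet (with made-up sample values) parses a small JSON array into a Python list of dictionaries using the standard json module:

```python
import json

# A JSON array: an ordered collection of values in square brackets
json_text = '[{"Product": "Laptop", "Sales": 5000}, {"Product": "Tablet", "Sales": 3000}]'

# json.loads() parses the array into a Python list of dictionaries
records = json.loads(json_text)
print(records[0]["Product"])  # Laptop
print(len(records))           # 2
```

Converting a DataFrame to a JSON array produces exactly this kind of structure: one object per row.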
Let’s get started with the different methods to convert DataFrames to JSON arrays.
Methods to Convert a DataFrame to a JSON Array in Python
Now, I will explain methods to convert a DataFrame to a JSON array in Python.
1. Use DataFrame.to_json() with ‘records’ Orientation
The simplest and most direct way to convert a DataFrame to a JSON array is the DataFrame's built-in to_json() method with the 'records' orientation.
import pandas as pd

# Create a sample DataFrame with US sales data
data = {
    'Product': ['Laptop', 'Smartphone', 'Tablet', 'Smart Watch'],
    'State': ['California', 'New York', 'Texas', 'Florida'],
    'Sales': [5000, 8000, 3000, 2000],
    'Month': ['January', 'February', 'March', 'April']
}
df = pd.DataFrame(data)

# Convert DataFrame to JSON array using to_json() with 'records' orientation
json_array = df.to_json(orient='records')
print(json_array)

Output:
[{"Product":"Laptop","State":"California","Sales":5000,"Month":"January"},{"Product":"Smartphone","State":"New York","Sales":8000,"Month":"February"},{"Product":"Tablet","State":"Texas","Sales":3000,"Month":"March"},{"Product":"Smart Watch","State":"Florida","Sales":2000,"Month":"April"}]

In this example, I used the orient='records' parameter to get each row as a JSON object within an array. This format is particularly useful when working with JavaScript frameworks like React or Angular.
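For comparison, to_json() also supports other orientations; this short sketch (on a smaller, made-up frame) shows how 'records' differs from 'split' and 'values':

```python
import pandas as pd
import json

df = pd.DataFrame({'Product': ['Laptop', 'Tablet'], 'Sales': [5000, 3000]})

# 'records' -> a JSON array of row objects
records = df.to_json(orient='records')

# 'split' -> a single object with 'columns', 'index', and 'data' keys
split = df.to_json(orient='split')

# 'values' -> a JSON array of row arrays, without column names
values = df.to_json(orient='values')

print(records)
print(split)
print(values)
```

Only 'records' and 'values' produce a top-level JSON array; the other orientations produce a JSON object.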
You can also add formatting to make the JSON more readable:
# Pretty print the JSON
json_array_pretty = df.to_json(orient='records', indent=4)
print(json_array_pretty)

2. Use DataFrame.to_dict() with the json Module
Another approach I frequently use combines the DataFrame’s to_dict() method with Python’s built-in json module:
import pandas as pd
import json
# Create a sample DataFrame with US sales data
data = {
    'Product': ['Laptop', 'Smartphone', 'Tablet', 'Smart Watch'],
    'State': ['California', 'New York', 'Texas', 'Florida'],
    'Sales': [5000, 8000, 3000, 2000],
    'Month': ['January', 'February', 'March', 'April']
}
df = pd.DataFrame(data)
# Convert DataFrame to list of dictionaries
dict_records = df.to_dict('records')
# Convert to JSON string
json_array = json.dumps(dict_records)
print(json_array)
# Pretty print the JSON
json_array_pretty = json.dumps(dict_records, indent=4)
print(json_array_pretty)

This method gives you more control over the JSON serialization process. The json.dumps() function offers additional parameters for customizing the output format.
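For instance, here is a brief sketch (with made-up sample data) of a few commonly used json.dumps() parameters: sort_keys, separators, and ensure_ascii:

```python
import pandas as pd
import json

df = pd.DataFrame({'Product': ['Laptop'], 'Sales': [5000]})
records = df.to_dict('records')

# sort_keys=True orders the keys alphabetically in each object
sorted_json = json.dumps(records, sort_keys=True)

# separators=(',', ':') produces the most compact output (no spaces)
compact_json = json.dumps(records, separators=(',', ':'))

# ensure_ascii=False keeps non-ASCII characters as-is instead of \u-escaping them
unicode_json = json.dumps([{'City': 'São Paulo'}], ensure_ascii=False)

print(sorted_json)
print(compact_json)
print(unicode_json)
```

The compact form is useful for APIs where payload size matters; ensure_ascii=False matters whenever your data contains accented or non-Latin text.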
3. Use DataFrame.apply() with json.loads()
For more complex transformations or when you need to modify the data during conversion, I use the apply() method in Python:
import pandas as pd
import json
# Create a sample DataFrame with US customer data
data = {
    'Name': ['John Smith', 'Sarah Johnson', 'Michael Davis', 'Emily Wilson'],
    'City': ['Los Angeles', 'Manhattan', 'Houston', 'Miami'],
    'Age': [34, 28, 45, 31],
    'Premium_Member': [True, False, True, True]
}
df = pd.DataFrame(data)
# Convert each row to a custom JSON object
def row_to_json(row):
    return {
        "customer": {
            "name": row['Name'],
            "location": {
                "city": row['City']
            },
            "details": {
                "age": row['Age'],
                "isPremium": row['Premium_Member']
            }
        }
    }
# Apply the function to each row and convert to list
json_objects = df.apply(row_to_json, axis=1).tolist()
# Convert to JSON string
json_array = json.dumps(json_objects)
print(json_array)

Output:
[{"customer": {"name": "John Smith", "location": {"city": "Los Angeles"}, "details": {"age": 34, "isPremium": true}}}, {"customer": {"name": "Sarah Johnson", "location": {"city": "Manhattan"}, "details": {"age": 28, "isPremium": false}}}, {"customer": {"name": "Michael Davis", "location": {"city": "Houston"}, "details": {"age": 45, "isPremium": true}}}, {"customer": {"name": "Emily Wilson", "location": {"city": "Miami"}, "details": {"age": 31, "isPremium": true}}}]

This method is particularly useful when you need to reshape the data structure or apply custom transformations during the conversion process.
4. Use DataFrame.to_numpy() with json.dumps()
For extremely large DataFrames, or when performance is a concern, converting to a NumPy array first can be more efficient:
import pandas as pd
import json
import numpy as np
# Create a sample DataFrame with US stock data
data = {
    'Stock': ['AAPL', 'MSFT', 'AMZN', 'GOOGL'],
    'Price': [175.25, 326.14, 125.30, 135.90],
    'Volume': [32500000, 28400000, 45200000, 18700000],
    'Change': [2.5, -1.2, 0.8, 1.4]
}
df = pd.DataFrame(data)
# Convert DataFrame to numpy array and then to list
array_data = df.to_numpy().tolist()
# Create a JSON array with column names as keys
column_names = df.columns.tolist()
json_array = json.dumps([dict(zip(column_names, row)) for row in array_data])
print(json_array)

Output:
[{"Stock": "AAPL", "Price": 175.25, "Volume": 32500000, "Change": 2.5}, {"Stock": "MSFT", "Price": 326.14, "Volume": 28400000, "Change": -1.2}, {"Stock": "AMZN", "Price": 125.3, "Volume": 45200000, "Change": 0.8}, {"Stock": "GOOGL", "Price": 135.9, "Volume": 18700000, "Change": 1.4}]

This approach is helpful when dealing with large datasets because NumPy arrays can be more memory-efficient than Python dictionaries.
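One caveat: if NumPy scalar types such as np.int64 reach json.dumps() directly (that is, without the .tolist() conversion above), serialization fails with a TypeError. A custom default handler is one way around this; the handler below is a minimal sketch:

```python
import json
import numpy as np

def numpy_default(obj):
    # Convert NumPy scalar/array types to native Python equivalents
    if isinstance(obj, np.integer):
        return int(obj)
    if isinstance(obj, np.floating):
        return float(obj)
    if isinstance(obj, np.ndarray):
        return obj.tolist()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

# np.int64 is not a Python int, so json.dumps needs the fallback handler
record = {'Volume': np.int64(32500000), 'Change': np.float64(2.5)}
print(json.dumps(record, default=numpy_default))
```

The default callable is only invoked for objects json.dumps() cannot serialize on its own, so it adds no overhead for plain Python values.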
Handle Date and Time Values
When working with dates and time values in DataFrames, I need to be careful about their conversion to JSON:
import pandas as pd
import json
from datetime import datetime
# Create a sample DataFrame with US sales data including dates
data = {
    'Product': ['Laptop', 'Smartphone', 'Tablet', 'Smart Watch'],
    'State': ['California', 'New York', 'Texas', 'Florida'],
    'Sales': [5000, 8000, 3000, 2000],
    'Date': [datetime(2023, 1, 15), datetime(2023, 2, 20),
             datetime(2023, 3, 10), datetime(2023, 4, 5)]
}
df = pd.DataFrame(data)
# Convert DataFrame to JSON with date handling
json_array = df.to_json(orient='records', date_format='iso')
print(json_array)

The date_format='iso' parameter ensures that datetime objects are properly serialized as ISO-formatted strings, which are widely supported across applications.
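If you are converting via to_dict() and json.dumps() instead (as in Method 2), datetime values need similar care, since json.dumps() cannot serialize them natively. One simple option is the default=str fallback, sketched below with made-up data:

```python
import pandas as pd
import json
from datetime import datetime

df = pd.DataFrame({'Product': ['Laptop'], 'Date': [datetime(2023, 1, 15)]})
records = df.to_dict('records')

# default=str falls back to str() for objects json cannot serialize,
# turning pandas Timestamps into readable strings
json_array = json.dumps(records, default=str)
print(json_array)
```

For stricter control over the string format, you could instead convert the column up front with df['Date'].dt.strftime(...).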
Save JSON Array to a File
In many of my projects, I need to save the JSON array to a file for later use:
import pandas as pd
# Create a sample DataFrame
data = {
    'Product': ['Laptop', 'Smartphone', 'Tablet', 'Smart Watch'],
    'State': ['California', 'New York', 'Texas', 'Florida'],
    'Sales': [5000, 8000, 3000, 2000]
}
df = pd.DataFrame(data)
# Save DataFrame as JSON array to a file
df.to_json('sales_data.json', orient='records', indent=4)
print("JSON array saved to 'sales_data.json'")

This creates a file named 'sales_data.json' with the DataFrame data in a JSON array format.
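To verify the round trip, the file can be read back with pd.read_json(); this sketch writes to a temporary path rather than the current directory:

```python
import os
import tempfile
import pandas as pd

df = pd.DataFrame({'Product': ['Laptop', 'Tablet'], 'Sales': [5000, 3000]})

# Write the JSON array to a temporary file
path = os.path.join(tempfile.mkdtemp(), 'sales_data.json')
df.to_json(path, orient='records', indent=4)

# read_json with the same orientation restores the DataFrame
df_restored = pd.read_json(path, orient='records')
print(df_restored)
```

Keeping the orient value consistent between to_json() and read_json() is what makes the round trip lossless for simple column types.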
Convert Nested DataFrames to JSON
When dealing with nested data structures, I use a combination of methods:
import pandas as pd
import json
# Create main DataFrame
main_df = pd.DataFrame({
    'Customer': ['John Smith', 'Sarah Johnson'],
    'Location': ['New York', 'California']
})

# Create nested DataFrames for purchases
purchases_john = pd.DataFrame({
    'Product': ['Laptop', 'Headphones'],
    'Price': [1200, 150]
})
purchases_sarah = pd.DataFrame({
    'Product': ['Smartphone', 'Tablet'],
    'Price': [800, 350]
})
# Convert to dictionaries
result = []
for i, row in main_df.iterrows():
    customer_data = row.to_dict()
    # Add purchases data
    if i == 0:
        customer_data['Purchases'] = purchases_john.to_dict('records')
    else:
        customer_data['Purchases'] = purchases_sarah.to_dict('records')
    result.append(customer_data)
# Convert to JSON
json_array = json.dumps(result, indent=4)
print(json_array)

This approach is particularly useful when you need to create complex, nested JSON structures from multiple DataFrames.
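As an alternative sketch: if the purchases already live in a single flat DataFrame with a customer column (a hypothetical layout, different from the one above), groupby() can produce the same nested shape with less manual bookkeeping:

```python
import pandas as pd
import json

# Hypothetical flat layout: one row per purchase, keyed by customer
purchases = pd.DataFrame({
    'Customer': ['John Smith', 'John Smith', 'Sarah Johnson', 'Sarah Johnson'],
    'Product': ['Laptop', 'Headphones', 'Smartphone', 'Tablet'],
    'Price': [1200, 150, 800, 350]
})

# Build one object per customer, nesting that customer's purchase records
result = []
for customer, group in purchases.groupby('Customer', sort=False):
    result.append({
        'Customer': customer,
        'Purchases': group[['Product', 'Price']].to_dict('records')
    })

json_array = json.dumps(result, indent=4)
print(json_array)
```

sort=False preserves the order in which customers first appear instead of sorting them alphabetically.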
Conclusion
Converting DataFrames to JSON arrays in Python is a common task that I perform regularly. The to_json() method with ‘records’ orientation is the simplest approach for most cases, but each method has its advantages depending on your specific requirements.
For basic conversions, Method 1 is my go-to choice. When I need more control over the output format or require custom transformations, Methods 2 and 3 are more suitable. Method 4 can be useful for large datasets where performance is a concern.
I hope you found this tutorial helpful!
Related tutorials:
- np.where in Pandas Python
- Pandas GroupBy Without Aggregation Function in Python
- Pandas Merge Fill NAN with 0 in Python

I am Bijay Kumar, a Microsoft MVP in SharePoint. Apart from SharePoint, I have been working on Python, machine learning, and artificial intelligence for the last five years. During this time, I have gained expertise in various Python libraries, such as Tkinter, Pandas, NumPy, Turtle, Django, Matplotlib, TensorFlow, SciPy, Scikit-Learn, and more, for various clients in the United States, Canada, the United Kingdom, Australia, and New Zealand. Check out my profile.