โ† Back to Blog

How to Handle JSON Data in Python

2026-04-20 · 5 min read

Python json Module Basics

Python's built-in json module is the standard way to handle JSON and requires no extra dependencies. It provides four core functions: json.loads() (string → Python object), json.dumps() (Python object → string), json.load() (file → Python object), and json.dump() (Python object → file). The s in the names stands for "string": functions with the s operate on strings, those without it operate on file objects. Remembering this rule prevents confusion.

Python's JSON type mapping: JSON object ({}) ↔ Python dict; JSON array ([]) ↔ Python list; JSON string ↔ str; JSON number ↔ int or float; JSON true/false ↔ True/False; JSON null ↔ None.
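A quick round trip confirms the mapping. One asymmetry worth remembering: tuples serialize as JSON arrays, so they come back as lists.

```python
import json

# Round-trip a value of each JSON type
payload = {"s": "text", "n": 3.14, "i": 42, "b": True, "nothing": None, "arr": [1, 2]}
restored = json.loads(json.dumps(payload))

assert restored["nothing"] is None      # null -> None
assert isinstance(restored["b"], bool)  # true -> True, a real bool
assert isinstance(restored["i"], int)   # integers stay int

# Tuples serialize as JSON arrays, so they deserialize as lists
assert json.loads(json.dumps((1, 2))) == [1, 2]
```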

import json

# Parse a JSON string
json_str = '{"name": "Alice", "age": 30, "active": true}'
data = json.loads(json_str)
print(data['name'])    # Alice
print(data['active'])  # True (Python bool, not string)
print(type(data))      # <class 'dict'>

# Serialize to a string
obj = {"name": "Bob", "scores": [95, 87, 92]}
json_str = json.dumps(obj)                          # Compact
pretty_str = json.dumps(obj, indent=2)              # Pretty-printed
sorted_str = json.dumps(obj, indent=2, sort_keys=True)  # Sorted keys
unicode_str = json.dumps(obj, ensure_ascii=False)   # Preserve non-ASCII (e.g. CJK)

Reading and Writing JSON Files

Reading and writing JSON files is one of the most common operations; use json.load() and json.dump() together with open() as a context manager:

import json

# Read a JSON file
with open('data.json', 'r', encoding='utf-8') as f:
    data = json.load(f)

# Write a JSON file
with open('output.json', 'w', encoding='utf-8') as f:
    json.dump(data, f, indent=2, ensure_ascii=False)

# Append to a JSON Lines file (one JSON object per line)
new_record = {"event": "login", "user": "alice"}  # example record for illustration
with open('data.jsonl', 'a', encoding='utf-8') as f:
    f.write(json.dumps(new_record, ensure_ascii=False) + '\n')

# Read a JSON Lines file
records = []
with open('data.jsonl', 'r', encoding='utf-8') as f:
    for line in f:
        if line.strip():  # skip empty lines
            records.append(json.loads(line))
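When input may be malformed, json.loads raises json.JSONDecodeError, which carries the position of the error. A minimal defensive wrapper (safe_loads is an illustrative name, not a stdlib function):

```python
import json

def safe_loads(text, default=None):
    """Parse JSON, returning `default` instead of raising on bad input."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as e:
        # e.lineno / e.colno locate the offending character
        print(f"Invalid JSON at line {e.lineno}, column {e.colno}: {e.msg}")
        return default

print(safe_loads('{"ok": true}'))  # {'ok': True}
print(safe_loads('{broken'))       # reports the error, returns None
```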

Custom Serialization: Handling Special Types

Python's json module doesn't support serializing datetime, Decimal, UUID, custom classes, and other types by default. You can extend it with a custom JSONEncoder or a default function:

import json
from datetime import datetime, date
from decimal import Decimal
from uuid import UUID

# Method 1: pass a default function
def extended_encoder(obj):
    if isinstance(obj, (datetime, date)):
        return obj.isoformat()
    if isinstance(obj, Decimal):
        return float(obj)
    if isinstance(obj, UUID):
        return str(obj)
    if hasattr(obj, '__dict__'):
        return obj.__dict__
    raise TypeError(f'Object of type {type(obj)} is not JSON serializable')

json.dumps(data, default=extended_encoder)

# Method 2: subclass json.JSONEncoder
class ExtendedEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, datetime):
            return obj.isoformat()
        if isinstance(obj, Decimal):
            return str(obj)  # keep precise amounts as strings
        return super().default(obj)

json.dumps(data, cls=ExtendedEncoder, indent=2)

# Custom decoding with object_hook
def datetime_decoder(dct):
    for key, value in dct.items():
        if isinstance(value, str) and 'T' in value:
            try:
                dct[key] = datetime.fromisoformat(value)
            except ValueError:
                pass
    return dct

data = json.loads(json_str, object_hook=datetime_decoder)

JSON Data Validation with Pydantic

Pydantic is the most popular data validation library in the Python ecosystem. You define data models with type annotations, and Pydantic validates the types and formats of incoming JSON automatically, producing clear error messages on failure. It is also the foundation of FastAPI's request and response handling:

from pydantic import BaseModel, field_validator
from typing import Optional, List
from datetime import datetime

class Address(BaseModel):
    city: str
    country: str
    zip_code: Optional[str] = None

class User(BaseModel):
    id: int
    name: str
    email: str
    age: int
    address: Address
    tags: List[str] = []
    created_at: datetime

    @field_validator('age')
    @classmethod
    def validate_age(cls, v):
        if v < 0 or v > 150:
            raise ValueError('Age must be between 0 and 150')
        return v

# Parse and validate from a JSON string
json_str = '''
{
  "id": 1,
  "name": "Alice",
  "email": "[email protected]",
  "age": 30,
  "address": {"city": "Beijing", "country": "China"},
  "created_at": "2025-01-01T00:00:00"
}
'''

user = User.model_validate_json(json_str)  # Pydantic v2
# or, in Pydantic v1:
# user = User.parse_raw(json_str)  # Pydantic v1

# Serialize back to JSON
print(user.model_dump_json(indent=2))
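Those error messages are worth seeing in action: on invalid data, Pydantic v2 raises a ValidationError that lists every failing field with its location and error type (the Order model below is illustrative):

```python
from pydantic import BaseModel, ValidationError

class Order(BaseModel):
    id: int
    quantity: int

try:
    # "id" cannot be coerced to int, and "quantity" is missing entirely
    Order.model_validate({"id": "not-a-number"})
except ValidationError as e:
    print(e.error_count())  # number of failing fields
    for err in e.errors():
        print(err["loc"], err["type"])  # e.g. ('id',) int_parsing
```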

High-Performance JSON Library: orjson

orjson is a very fast JSON library for Python, implemented in Rust; in benchmarks it is often around 10x faster than the standard json module. It also natively serializes datetime, UUID, numpy arrays, and other types, making it a strong choice for large volumes of JSON or high-throughput applications:

# Install:
# pip install orjson

import orjson
from datetime import datetime
from uuid import UUID

# Basic usage mirrors the json module
data = {'name': 'Alice', 'created': datetime.now(), 'id': UUID('...')}

# Serialize (returns bytes, not str)
json_bytes = orjson.dumps(data)
json_bytes = orjson.dumps(data, option=orjson.OPT_INDENT_2)  # pretty-printed

# Deserialize (accepts str or bytes)
data = orjson.loads(json_bytes)

# Rough performance comparison on the same data:
# json.dumps:    ~1000 µs
# orjson.dumps:    ~80 µs  (~12x faster)

# ujson is another fast alternative
# pip install ujson
import ujson
data = ujson.loads(json_str)
output = ujson.dumps(data, ensure_ascii=False, indent=2)

Handling JSON in API Development

Python's most popular API frameworks, FastAPI and Flask, both ship with JSON support. FastAPI (recommended for new projects) builds on Pydantic: request bodies are parsed into Pydantic models automatically, responses are serialized automatically, and an OpenAPI schema is generated from the same model definitions:

# FastAPI example
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class UserCreate(BaseModel):
    name: str
    email: str

class UserResponse(BaseModel):
    id: int
    name: str
    email: str

@app.post('/users', response_model=UserResponse)
async def create_user(user: UserCreate):
    # FastAPI parses the request body into a UserCreate object
    new_user = save_to_db(user)  # save_to_db is a placeholder for your persistence layer
    return new_user  # automatically serialized to JSON

# Flask example
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route('/users', methods=['POST'])
def create_user():
    data = request.get_json(silent=True)  # returns None on missing/invalid JSON
    if not data:
        return jsonify({'error': 'Invalid JSON'}), 400
    # ... process data ...
    return jsonify({'id': 1, 'name': data['name']}), 201

JSON and Python dataclass Integration

The dataclass decorator (Python 3.7+) is a concise way to define data structures. Combined with dataclasses.asdict(), it integrates easily with JSON serialization:

You can also add validation in __post_init__, or use libraries such as dataclass-wizard or cattrs for fuller JSON support, including recursive (de)serialization of nested objects and runtime type checking. For projects with complex data models, these libraries are more robust than hand-written conversion.

from dataclasses import dataclass, asdict, field
from typing import List
import json

@dataclass
class Address:
    city: str
    country: str

@dataclass
class User:
    id: int
    name: str
    address: Address
    tags: List[str] = field(default_factory=list)

# Serialize: dataclass → dict → JSON
user = User(id=1, name="Alice", address=Address(city="Beijing", country="China"))
user_dict = asdict(user)  # recursively converts nested dataclasses to dicts
json_str = json.dumps(user_dict, ensure_ascii=False, indent=2)

# Deserialize: JSON → dict → dataclass (nested objects must be rebuilt manually)
data = json.loads(json_str)
address = Address(**data['address'])
user = User(**{**data, 'address': address})
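The __post_init__ hook mentioned above runs right after the generated __init__, which makes it a natural place for validation. A minimal sketch (the Product class and its fields are illustrative):

```python
from dataclasses import dataclass

@dataclass
class Product:
    name: str
    price: float

    def __post_init__(self):
        # Called automatically after the generated __init__
        if self.price < 0:
            raise ValueError(f"price must be non-negative, got {self.price}")

Product(name="Widget", price=9.99)  # fine
try:
    Product(name="Broken", price=-1.0)
except ValueError as e:
    print(e)  # price must be non-negative, got -1.0
```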
