Structured Outputs and JSON Mode

The Problem with Unstructured LLM Output

LLMs naturally produce free-form text. For applications, you need structured data you can parse reliably.

# ❌ Unreliable — model might format differently each time
response = "The user's name is John, age 30, from Mumbai"
 
# ✅ Reliable — always parseable
response = {"name": "John", "age": 30, "city": "Mumbai"}

Without structured output, you're writing fragile string parsing code that breaks when the model changes its phrasing.
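
To make the fragility concrete, here is a minimal sketch of the kind of string parsing the paragraph above warns about. The regular expression and the rephrased sentence are invented for illustration; the point is only that a pattern tuned to one phrasing silently fails on another, while JSON parses the same way every time.

```python
import json
import re
from typing import Optional

# Hypothetical pattern written against one specific phrasing of the reply.
PATTERN = re.compile(r"name is (\w+), age (\d+), from (\w+)")

def parse_free_text(text: str) -> Optional[dict]:
    """Return the extracted fields, or None when the phrasing does not match."""
    match = PATTERN.search(text)
    if match is None:
        return None
    name, age, city = match.groups()
    return {"name": name, "age": int(age), "city": city}

original = "The user's name is John, age 30, from Mumbai"
rephrased = "John is a 30-year-old user based in Mumbai"  # same facts, new wording

parsed_original = parse_free_text(original)    # matches the pattern
parsed_rephrased = parse_free_text(rephrased)  # no match, so None
parsed_json = json.loads('{"name": "John", "age": 30, "city": "Mumbai"}')
```

The rephrased sentence carries the same information, yet the parser returns nothing for it.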

JSON Mode

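JSON Mode instructs the model to return syntactically valid JSON. This makes the output parseable as long as the response is not cut off by the token limit, but it does not guarantee a match with any specific schema. With OpenAI's API, the messages must also mention the word "JSON", as the system prompt below does.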

from openai import OpenAI
import json
 
client = OpenAI()
 
response = client.chat.completions.create(
    model="gpt-4o",
    response_format={"type": "json_object"},  # enable JSON mode
    messages=[
        {"role": "system", "content": "Return responses as JSON"},
        {"role": "user", "content": "Extract: John is 30 years old from Mumbai"}
    ]
)
 
data = json.loads(response.choices[0].message.content)
# e.g. {"name": "John", "age": 30, "city": "Mumbai"} (the model picks the keys, so they can vary)

Limitation: JSON mode guarantees valid JSON but not a specific structure. The model decides the keys.
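
Because the keys are up to the model, code that consumes JSON Mode output usually has to check the structure itself. The following is one possible sketch of such a check using only the standard library; `REQUIRED_KEYS`, `check_keys`, and both sample replies are hypothetical names and data made up for this example.

```python
import json

# Keys and types this application expects (an assumption for the example).
REQUIRED_KEYS = {"name": str, "age": int, "city": str}

def check_keys(raw: str) -> list[str]:
    """Return a list of problems found in a JSON Mode reply (empty if none)."""
    data = json.loads(raw)
    problems = []
    for key, expected_type in REQUIRED_KEYS.items():
        if key not in data:
            problems.append(f"missing key: {key}")
        elif not isinstance(data[key], expected_type):
            problems.append(f"wrong type for {key}")
    return problems

good = '{"name": "John", "age": 30, "city": "Mumbai"}'
# Valid JSON, but with different key names and a string where an int is expected.
drifted = '{"full_name": "John", "age": "30", "location": "Mumbai"}'

good_problems = check_keys(good)
drifted_problems = check_keys(drifted)
```

Both replies are valid JSON, so JSON Mode alone would accept either one; only the explicit check catches the drift.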

Structured Outputs (Schema-based)

Structured outputs let you define an exact schema using JSON Schema or Pydantic. The model is constrained to match it exactly.

from pydantic import BaseModel
from openai import OpenAI
 
client = OpenAI()
 
class UserInfo(BaseModel):
    name: str
    age: int
    city: str
    is_premium: bool
 
response = client.beta.chat.completions.parse(
    model="gpt-4o",
    messages=[
        {"role": "user", "content": "Extract: John is 30, from Mumbai, premium user"}
    ],
    response_format=UserInfo,
)
 
user = response.choices[0].message.parsed
print(user.name)       # "John"
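
To see what the model is being constrained to, it can help to look at the JSON Schema that a Pydantic model produces. The sketch below assumes Pydantic v2, where `model_json_schema()` is available; the schema the SDK actually sends may carry extra settings on top of this, so treat the output as an approximation.

```python
from pydantic import BaseModel

class UserInfo(BaseModel):
    name: str
    age: int
    city: str
    is_premium: bool

# Export the JSON Schema for the model (Pydantic v2 method).
schema = UserInfo.model_json_schema()

required_fields = schema["required"]                        # every field has no default
age_type = schema["properties"]["age"]["type"]              # int maps to "integer"
premium_type = schema["properties"]["is_premium"]["type"]   # bool maps to "boolean"
```

Every field without a default ends up in `required`, which is why the model cannot simply omit `is_premium`.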

Real-World Use Cases

Data Extraction

class InvoiceData(BaseModel):
    vendor: str
    amount: float
    currency: str
    date: str
    line_items: list[str]
 
# Extract structured data from unstructured invoice text

Classification

from typing import Literal

class SentimentResult(BaseModel):
    # Literal restricts the field to this fixed set of labels
    sentiment: Literal["positive", "negative", "neutral"]
    confidence: float
    reasoning: str
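
As a rough illustration of what the `Literal` field buys you, the sketch below validates two hand-written replies against the same model, assuming Pydantic v2 (`model_validate_json`). Neither reply comes from a real API call; they only show that a label outside the allowed set is rejected.

```python
from typing import Literal

from pydantic import BaseModel, ValidationError

class SentimentResult(BaseModel):
    sentiment: Literal["positive", "negative", "neutral"]
    confidence: float
    reasoning: str

# Hypothetical replies, written by hand for illustration.
valid_reply = '{"sentiment": "positive", "confidence": 0.92, "reasoning": "Praises the product."}'
invalid_reply = '{"sentiment": "mixed", "confidence": 0.5, "reasoning": "Unclear tone."}'

result = SentimentResult.model_validate_json(valid_reply)

try:
    SentimentResult.model_validate_json(invalid_reply)
    rejected = False
except ValidationError:
    rejected = True  # "mixed" is not one of the allowed labels
```

Downstream code can then branch on `result.sentiment` without guarding against unexpected label strings.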

Multi-step Pipelines

class SearchQuery(BaseModel):
    query: str
    filters: list[str]
    max_results: int
 
# LLM converts natural language to structured search params
# Then pass to your search API
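
The second step of such a pipeline might look like the sketch below. `CATALOG` and `run_search` are hypothetical stand-ins for a real search API, and the structured reply is hand-written rather than produced by a model; Pydantic v2 is assumed.

```python
from pydantic import BaseModel

class SearchQuery(BaseModel):
    query: str
    filters: list[str]
    max_results: int

# Hypothetical in-memory catalog standing in for a real search backend.
CATALOG = [
    {"title": "Red running shoes", "tags": ["shoes", "sale"]},
    {"title": "Blue running shoes", "tags": ["shoes"]},
    {"title": "Running socks", "tags": ["socks", "sale"]},
]

def run_search(params: SearchQuery) -> list[str]:
    """Return titles that contain the query text and carry every filter tag."""
    hits = [
        item["title"]
        for item in CATALOG
        if params.query.lower() in item["title"].lower()
        and all(tag in item["tags"] for tag in params.filters)
    ]
    return hits[: params.max_results]

# Hypothetical structured reply for "show me up to 5 running items on sale".
reply = '{"query": "running", "filters": ["sale"], "max_results": 5}'
titles = run_search(SearchQuery.model_validate_json(reply))
```

Because the search function receives typed fields, it needs no knowledge of how the natural-language request was phrased.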

Handling Failures

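Even with structured outputs, handle refusals and validation errors explicitly:

from pydantic import ValidationError
 
# handle_refusal, log_error and use_fallback are placeholders for your own code
try:
    # parse() validates the reply against the schema, so the call belongs inside the try block
    response = client.beta.chat.completions.parse(
        model="gpt-4o",
        messages=messages,  # the message list you would normally send
        response_format=UserInfo,
    )
    result = response.choices[0].message.parsed
    if result is None:
        # Model refused; the reason is in response.choices[0].message.refusal
        handle_refusal()
except ValidationError as e:
    # The reply did not validate against the schema
    log_error(e)
    use_fallback()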
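
One common way to act on a failed reply is to retry a few times before falling back. The sketch below shows that idea with the standard library only; `extract_with_retry` is a hypothetical helper, and the `call_model` argument stands in for whatever function performs the real API call.

```python
import json
from typing import Callable, Optional

REQUIRED = {"name", "age", "city"}  # keys this example expects

def extract_with_retry(call_model: Callable[[], str], max_attempts: int = 3) -> Optional[dict]:
    """Call the model until its reply parses and contains the required keys."""
    for _ in range(max_attempts):
        raw = call_model()
        try:
            data = json.loads(raw)
        except json.JSONDecodeError:
            continue  # malformed or truncated reply: try again
        if isinstance(data, dict) and REQUIRED <= data.keys():
            return data
    return None  # all attempts failed; the caller decides on the fallback

# Stand-in for a real API call: a truncated reply first, then a complete one.
replies = iter(['{"name": "John", "age": 3', '{"name": "John", "age": 30, "city": "Mumbai"}'])
result = extract_with_retry(lambda: next(replies))

# A source that never produces usable output exhausts the attempts.
always_bad = extract_with_retry(lambda: "not json", max_attempts=2)
```

Capping the number of attempts keeps a persistently failing prompt from looping forever and makes the fallback path explicit.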

JSON Mode vs Structured Outputs

Feature	JSON Mode	Structured Outputs
Valid JSON	✅ Always	✅ Always
Exact schema	❌ No	✅ Yes
Type safety	❌ No	✅ Yes
Model support	Broad	GPT-4o, some others
Use when	Simple JSON needed	Exact schema required

Key Takeaway

Free-form LLM output is unreliable for production — use structured outputs
JSON Mode guarantees valid JSON but not a specific schema
Structured Outputs guarantee both valid JSON and an exact schema
Use Pydantic models to define schemas: the OpenAI Python SDK accepts them directly as response_format
Always handle a None result: message.parsed is None when the model refuses to fill the schema