JSON vs YAML: Choosing the Right Format
JSON and YAML are two of the most widely used data serialization formats in modern software development. Both represent structured data as human-readable text, yet they differ significantly in syntax, features, and ideal use cases. Understanding these differences helps you make informed decisions about which format to use in your projects, and how to convert between them when necessary.
History and Design Goals
JSON (JavaScript Object Notation) was formalized by Douglas Crockford in the early 2000s and standardized as ECMA-404 in 2013. It was designed to be a lightweight, language-independent data interchange format with minimal syntax. Its roots in JavaScript object literal syntax made it immediately familiar to web developers, and its simplicity made it easy to parse in any programming language.
YAML (YAML Ain't Markup Language) was first proposed in 2001 by Clark Evans, Ingy döt Net, and Oren Ben-Kiki. YAML's primary design goal was human-friendliness — it prioritizes readability and ease of editing by hand. YAML 1.0 was released in 2004, with the current YAML 1.2 specification published in 2009. YAML 1.2 was explicitly designed to be a superset of JSON, meaning every valid JSON document is also valid YAML.
Syntax Comparison
The most obvious difference between JSON and YAML is their approach to structure. JSON uses braces, brackets, and commas. YAML uses indentation and newlines.
JSON Example
    {
      "server": {
        "host": "localhost",
        "port": 8080,
        "ssl": true,
        "allowed_origins": ["https://example.com", "https://app.example.com"]
      },
      "database": {
        "url": "postgres://localhost:5432/mydb",
        "pool_size": 10
      }
    }

YAML Equivalent
    # Server configuration
    server:
      host: localhost
      port: 8080
      ssl: true
      allowed_origins:
        - https://example.com
        - https://app.example.com

    # Database settings
    database:
      url: postgres://localhost:5432/mydb
      pool_size: 10

Notice how YAML eliminates braces, brackets, commas, and most quotation marks. It also supports inline comments, which JSON does not.
The Superset Relationship
YAML 1.2 was designed as a strict superset of JSON, so a YAML 1.2-compliant parser can read any valid JSON document without modification. You can also mix JSON-style flow syntax (braces and brackets) into otherwise block-style YAML files. The reverse is not true: YAML features like comments, anchors, and unquoted strings have no equivalent in JSON.
Feature Comparison
Comments
YAML supports comments with the # character. JSON has no comment syntax. This is one of the most frequently cited reasons for choosing YAML over JSON for configuration files, where inline documentation is valuable.
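As a quick check of the claim about JSON, the sketch below feeds a document containing a //-style comment to Python's standard json module. The helper name parses_as_json and both sample documents are illustrative assumptions, not part of any library.

```python
import json

# Hypothetical sample: a JSON-like document with a JavaScript-style comment.
DOC_WITH_COMMENT = """{
  // port the server listens on
  "port": 8080
}"""

# The same data without the comment.
DOC_WITHOUT_COMMENT = '{"port": 8080}'


def parses_as_json(text: str) -> bool:
    """Return True if the standard-library JSON parser accepts the text."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(parses_as_json(DOC_WITH_COMMENT))     # the comment makes parsing fail
print(parses_as_json(DOC_WITHOUT_COMMENT))  # plain JSON is accepted
```

Some tools accept comment-tolerant JSON dialects, but those are extensions; a strict parser such as the one above rejects them.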
Multi-line Strings
YAML provides powerful multi-line string support with block scalars. The pipe character (|) preserves newlines, and the greater-than character (>) folds newlines into spaces:
    # Literal block scalar - preserves newlines
    description: |
      This is a multi-line
      description that preserves
      each line break.

    # Folded block scalar - joins lines
    summary: >
      This is a long sentence
      that will be folded into
      a single line.

In JSON, multi-line strings must use \n escape sequences within a single line, making them harder to read and edit.
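To see the JSON side of this, the following sketch serializes a three-line string with Python's standard json module. The variable names are illustrative, and the string is assumed to match what the literal block scalar in the YAML example would load as.

```python
import json

# The text that the "description" block scalar is assumed to represent.
description = (
    "This is a multi-line\n"
    "description that preserves\n"
    "each line break.\n"
)

# Serialize to JSON: real newlines become two-character \n escapes.
encoded = json.dumps({"description": description})
print(encoded)

# Parsing the JSON text restores the original line breaks.
decoded = json.loads(encoded)["description"]
print(decoded)
```

The encoded form is a single physical line, which is why long text blocks in JSON are awkward to review or edit by hand.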
Anchors and Aliases
YAML supports anchors (&) and aliases (*) that allow you to define a value once and reference it elsewhere, reducing duplication:
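    defaults: &defaults
      timeout: 30
      retries: 3

    production:
      <<: *defaults
      host: prod.example.com

    staging:
      <<: *defaults
      host: staging.example.com

The << merge key used here is a YAML 1.1 extension rather than part of the core YAML 1.2 specification, although most popular parsers support it. JSON has no equivalent feature. Reusing values requires duplication or application-level reference resolution.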
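One way application-level reference resolution might look for JSON is sketched below. The "$extends" key is an invented convention for this example (JSON itself assigns it no meaning), and the resolve function is a hypothetical helper that handles a single level of inheritance only.

```python
import json

# Hypothetical JSON config that uses an invented "$extends" key to name a base section.
RAW = json.loads("""{
  "defaults": {"timeout": 30, "retries": 3},
  "production": {"$extends": "defaults", "host": "prod.example.com"},
  "staging": {"$extends": "defaults", "host": "staging.example.com"}
}""")


def resolve(config: dict, extends_key: str = "$extends") -> dict:
    """Merge each section with the section it extends (one level deep)."""
    resolved = {}
    for name, section in config.items():
        if isinstance(section, dict) and extends_key in section:
            base = config[section[extends_key]]
            own = {k: v for k, v in section.items() if k != extends_key}
            # Keys defined in the section itself override inherited ones.
            resolved[name] = {**base, **own}
        else:
            resolved[name] = section
    return resolved


settings = resolve(RAW)
print(settings["production"])
```

The cost of this approach is that every consumer of the file has to know the convention, whereas YAML anchors are resolved by the parser.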
Type Coercion
YAML automatically infers types from values. Unquoted true, false, null, and numeric literals are parsed as their respective types. JSON requires explicit syntax for each type (quoted strings, unquoted numbers, literal true/false/null).
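The JSON half of this comparison can be illustrated with the standard json module: the quotes alone decide whether a value is a string. The key names in the sample document are made up for the example.

```python
import json

# Hypothetical document pairing quoted and unquoted forms of similar values.
values = json.loads(
    '{"quoted_bool": "true", "bool": true,'
    ' "quoted_num": "8080", "num": 8080, "nothing": null}'
)

# Map each key to the name of the Python type the parser produced.
types = {key: type(value).__name__ for key, value in values.items()}
print(types)
```

Because the type follows from the syntax, a JSON producer never has to guess how a consumer will read a value.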
When to Use JSON
- APIs and data exchange: JSON is the standard for REST APIs, GraphQL, and web services. Every language has robust, fast JSON parsers.
- Strict parsing requirements: JSON's small grammar leaves far less room for ambiguity, and for differences between parser implementations, than YAML's.
- Machine-generated data: When data is produced and consumed programmatically, JSON's explicit syntax prevents the implicit type coercion surprises that YAML can introduce.
- Browser environments: Browsers have native JSON.parse and JSON.stringify. YAML requires a third-party library.
When to Use YAML
- Configuration files: YAML's readability and comment support make it ideal for config files that humans edit frequently.
- Kubernetes manifests: The Kubernetes ecosystem is built around YAML. Pod definitions, Deployments, Services, and Helm charts are all YAML.
- Docker Compose: Multi-container Docker configurations use docker-compose.yml.
- CI/CD pipelines: GitHub Actions, GitLab CI, CircleCI, and Azure Pipelines all use YAML for workflow definitions.
- Ansible playbooks: Infrastructure-as-code tools like Ansible rely heavily on YAML for defining automation tasks.
Conversion Gotchas
Converting between JSON and YAML is generally straightforward, but there are several well-known pitfalls to watch out for:
The Norway Problem
In YAML 1.1 (still used by many parsers), bare values like yes, no, on, and off are interpreted as booleans. This is famously known as the "Norway problem" because the country code NO for Norway gets parsed as the boolean false:
    # YAML 1.1 - DANGER!
    countries:
      - GB   # parsed as string "GB"
      - NO   # parsed as boolean false!
      - FR   # parsed as string "FR"

    # Safe version - quote the value
    countries:
      - "GB"
      - "NO"
      - "FR"

YAML 1.2 fixed this by limiting boolean values to true and false only, but many popular parsers (including PyYAML) still default to YAML 1.1 behavior.
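When YAML is generated by hand-written code rather than a YAML library, a small guard can apply the quoting advice automatically. The sketch below is one possible approach; needs_quoting and yaml_scalar are hypothetical helpers, and the word list is deliberately conservative (it matches case-insensitively, so it may quote a few values a given parser would have left alone). It covers boolean-like words only, not numbers or null.

```python
import json

# Lowercase forms of the words the YAML 1.1 boolean type recognizes.
YAML11_BOOLEAN_WORDS = frozenset(
    {"y", "n", "yes", "no", "true", "false", "on", "off"}
)


def needs_quoting(value: str) -> bool:
    """Return True if a bare scalar could be read as a YAML 1.1 boolean."""
    return value.lower() in YAML11_BOOLEAN_WORDS


def yaml_scalar(value: str) -> str:
    """Render a string as a YAML scalar, double-quoting risky values."""
    # A JSON string literal is also a valid YAML double-quoted scalar.
    return json.dumps(value) if needs_quoting(value) else value


countries = ["GB", "NO", "FR"]
lines = ["countries:"] + [f"  - {yaml_scalar(code)}" for code in countries]
print("\n".join(lines))
```

Over-quoting is harmless in YAML, which is why the check errs on the side of adding quotes.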
Implicit Numeric Strings
Values that look like numbers are parsed as numbers in YAML. A version string like 1.10 becomes the float 1.1, losing the trailing zero. Similarly, octal-like values such as 010 may be interpreted as the decimal number 8 in YAML 1.1. Always quote values that should remain strings.
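The information loss described here can be reproduced without a YAML parser. The sketch below uses Python's built-in float and int conversions to mimic what a numeric interpretation does to these values; it is an analogy under that assumption, not a call into any YAML library.

```python
import json

version_text = "1.10"
octal_text = "010"

# Interpreting the version as a number discards the trailing zero.
as_float = float(version_text)
round_tripped = str(as_float)
print(round_tripped)

# Reading the digits as base 8, the way YAML 1.1 treats a leading zero.
as_octal = int(octal_text, 8)
print(as_octal)

# Kept as a quoted string, the value survives a round trip unchanged.
preserved = json.loads(json.dumps({"version": version_text}))["version"]
print(preserved)
```

Once the value has been stored as a number, there is no way to recover whether the source said 1.1 or 1.10, which is why quoting has to happen at the source.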
Indentation Sensitivity
YAML's reliance on indentation means that inconsistent spacing can cause parse errors or, worse, silently change the nesting of your data. YAML requires spaces for indentation; tab characters are not allowed there. When converting JSON to YAML, be aware that deeply nested structures become harder to manage as indentation levels grow.
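The tab rule is easy to check before a file ever reaches a parser. The following sketch scans text for lines whose leading whitespace contains a tab; find_tab_indented_lines is a hypothetical helper and the sample text is made up for the example.

```python
from typing import List


def find_tab_indented_lines(text: str) -> List[int]:
    """Return 1-based numbers of lines whose indentation contains a tab."""
    flagged = []
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.lstrip(" \t")
        # Everything before the first non-whitespace character is indentation.
        indent = line[: len(line) - len(stripped)]
        if "\t" in indent:
            flagged.append(number)
    return flagged


# Hypothetical sample: the third line is indented with a tab.
SAMPLE = "server:\n  host: localhost\n\tport: 8080\n"
print(find_tab_indented_lines(SAMPLE))
```

A check like this only catches tabs; indentation that is consistent but wrong still needs a real parser or schema validation to detect.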
Practical Conversion Examples
When converting between JSON and YAML, you can use libraries in your language of choice or command-line tools:
    # Python: JSON to YAML (yaml is the third-party PyYAML package)
    import json

    import yaml

    with open('data.json') as f:
        data = json.load(f)
    print(yaml.dump(data, default_flow_style=False))

    # Python: YAML to JSON
    with open('config.yml') as f:
        data = yaml.safe_load(f)
    print(json.dumps(data, indent=2))

Always use yaml.safe_load instead of yaml.load in Python to prevent arbitrary code execution from malicious YAML documents.