Not So Fast, TOON

TOON (Token-Oriented Object Notation) is a data format designed for AI and LLM applications. It aims to represent structured data in fewer tokens, reducing the cost of passing that data to language models.

  • Developed and open-sourced by Johann Schopplich, with the GitHub project live since late 2025: https://github.com/toon-format/toon
  • The core library (TypeScript) and the specification live under the toon-format GitHub organisation, which shows active contributions and tooling.

Key Characteristics

  1. Syntax: indentation-based with tabular structure
  2. Efficiency: roughly 30 to 60% fewer tokens than JSON on flat, tabular data
  3. Compactness: removes redundant symbols and keys
  4. Readability: clean, spreadsheet-like representation
  5. Optimisation: purpose-built for AI data flows

What TOON offers

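TOON improves token efficiency mainly by cutting the extra syntax that JSON requires. It drops most of the curly braces, square brackets, and quotes, and it stops repeating every key on every record; those small characters add up quickly when you are sending data to large language models. Commas remain as value delimiters, and a short header carries the array length and the field names.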

Instead, TOON uses indentation and a simple, spreadsheet-like layout where you list keys just once per block. This way, it minimises repeated tokens and keeps things clean. The result? Less clutter without losing clarity or readability, making data easier on both the AI and the humans who work with it.
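
To get a feel for how much of a JSON payload is repeated key syntax, the sketch below counts the characters spent on quoted keys in compact JSON. Characters are only a rough proxy for tokens, the sample records are borrowed from the courses example used in this article, and the function name key_overhead is made up for illustration.

```python
import json

# Sample records (same data as the courses example in this article).
courses = [
    {"id": 101, "name": "APA Mastery", "price": 99, "rating": "4.9"},
    {"id": 102, "name": "AI & Automation", "price": 79, "rating": "4.3"},
    {"id": 103, "name": "AI for Leaders", "price": 89, "rating": "4.5"},
]


def key_overhead(rows):
    """Count characters used by quoted keys and their colons in compact JSON."""
    # json.dumps(key) adds the surrounding quotes; +1 accounts for the colon.
    return sum(len(json.dumps(key)) + 1 for row in rows for key in row)


# Compact JSON (no padding spaces) is the fairest baseline for comparison.
compact = json.dumps({"courses": courses}, separators=(",", ":"))
overhead = key_overhead(courses)
print(f"{overhead} of {len(compact)} characters are repeated key syntax")
```

Every extra record adds the same 29 characters of key syntax here, which is the repetition a once-per-block header avoids.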

Specifically, TOON achieves:

  • 30% to 60% fewer tokens on average for flat, tabular JSON-like data structures.
  • More compact representation that reduces cost and speeds up processing when interacting with LLMs.
  • Higher retrieval accuracy in reported benchmarks (approximately 73.9% vs 69.7% for JSON), attributed to the clearer structure.
  • Optimised for uniform, repeated data objects typical in LLM inputs, but not for deeply nested structures, where JSON may still be better.

Use cases include structured prompts, large AI pipelines, and any high-volume LLM workflow where token efficiency is critical. TOON is designed to serve as an AI-native format to complement existing JSON pipelines rather than fully replace JSON.

In summary, TOON enhances token efficiency by trimming repeated key declarations and punctuation, using indentation for structure, and focusing on flat data to save tokens, reduce costs, and improve LLM interpretation.

Here is a direct comparison of TOON vs JSON formats, showing the same data represented in both ways:

JSON example (list of courses):

    {
      "courses": [
        { "id": 101, "name": "APA Mastery", "price": 99, "rating": "4.9" },
        { "id": 102, "name": "AI & Automation", "price": 79, "rating": "4.3" },
        { "id": 103, "name": "AI for Leaders", "price": 89, "rating": "4.5" }
      ]
    }

TOON example (same data token-efficiently):

    courses[3]{id, name, price, rating}:
      101, APA Mastery, 99, 4.9
      102, AI & Automation, 79, 4.3
      103, AI for Leaders, 89, 4.5
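
The tabular layout can be produced with very little code. The following is a minimal sketch, not the official library: encode_table is a hypothetical helper that handles only a uniform list of flat records, writes delimiters without padding spaces, and skips the quoting and escaping rules a real encoder needs (so the string "4.9" and the number 4.9 would look identical in its output).

```python
def encode_table(name, rows):
    """Render a uniform list of flat dicts as a header line plus one line per row.

    Illustrative only: no quoting, escaping, or nested values are handled.
    """
    if not rows:
        raise ValueError("expected at least one row")
    fields = list(rows[0])
    for row in rows:
        # Every record must have the same keys in the same order.
        if list(row) != fields:
            raise ValueError("rows are not uniform")
    # The header states the row count and lists the field names once.
    header = name + "[" + str(len(rows)) + "]{" + ",".join(fields) + "}:"
    body = ["  " + ",".join(str(row[f]) for f in fields) for row in rows]
    return "\n".join([header] + body)


courses = [
    {"id": 101, "name": "APA Mastery", "price": 99, "rating": "4.9"},
    {"id": 102, "name": "AI & Automation", "price": 79, "rating": "4.3"},
    {"id": 103, "name": "AI for Leaders", "price": 89, "rating": "4.5"},
]

print(encode_table("courses", courses))
```

For anything beyond a quick experiment, the project's own encoder is the safer choice, since it implements the full specification.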

Limitations

TOON’s biggest challenge with nested data comes from how it is designed around flat, table-like structures rather than deeply layered hierarchies. It cleverly reduces repeated keys and uses indentation to keep things compact and efficient for simple or shallow objects.

However, when dealing with deeply nested or recursive data, TOON loses some of its clarity and ease of use because it doesn't have the clear braces or brackets that JSON uses to mark nested levels.

This makes such data harder for both humans and AI models to read and parse, sometimes adding complexity rather than reducing it. So, while TOON is great for straightforward data, JSON still shines when it comes to representing and working with complex, deeply nested structures reliably and clearly.

Here are some of the key limitations:

  • It struggles to represent deeply nested or highly hierarchical data structures.
  • It may become less readable or more ambiguous when used beyond simple key-value pairs and shallow objects.
  • It is primarily optimised for uniform, repeated tabular data, not for diverse, recursive data forms.
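
One practical consequence of these limitations is that it helps to check the shape of the data before picking a format. The sketch below is one possible check, with hypothetical helper names (max_depth, is_uniform_table): it measures nesting depth and tests whether a value is a uniform list of flat records, the case the tabular layout is built for.

```python
def max_depth(value):
    """Nesting depth: 0 for a primitive, 1 for a flat dict or list, and so on."""
    if isinstance(value, dict):
        return 1 + max((max_depth(v) for v in value.values()), default=0)
    if isinstance(value, list):
        return 1 + max((max_depth(v) for v in value), default=0)
    return 0


def is_uniform_table(rows):
    """True for a non-empty list of dicts that share keys and hold only primitives."""
    if not isinstance(rows, list) or not rows:
        return False
    if not all(isinstance(row, dict) for row in rows):
        return False
    fields = set(rows[0])
    return all(
        set(row) == fields and all(max_depth(v) == 0 for v in row.values())
        for row in rows
    )


flat = [{"id": 101, "price": 99}, {"id": 102, "price": 79}]
nested = [{"id": 101, "tags": ["ai"]}, {"id": 102, "tags": []}]

print(is_uniform_table(flat), max_depth(flat))
print(is_uniform_table(nested), max_depth(nested))
```

Data that fails the check is a reasonable candidate for staying in JSON.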

[Figure: JSON v TOON]

Why it won't replace JSON anytime soon

TOON is an exciting new format optimised for token efficiency in AI workflows, but will it replace JSON? Not likely, and here’s why:

  1. Established Ecosystem: JSON is deeply embedded in web standards, APIs, databases, and programming languages. Its mature ecosystem and universal support make displacement extremely difficult.
  2. Versatility & Familiarity: JSON handles complex, nested data structures with clarity. Developers know its syntax well. TOON, while efficient for flat/tabular data, struggles with deeply nested structures, limiting its scope.
  3. Tooling & Standards: JSON benefits from decades of tooling: validators, parsers, schema systems, and debugging support. TOON is new and niche, lacking widespread standards and tools.
  4. Complementary Role: TOON shines in AI-native contexts, where it reduces token usage for LLM prompts, but it is not designed as a universal data interchange format. It is a complement, not a replacement.
  5. Network Effect: JSON’s dominance is self-reinforcing. Migration costs for billions of systems are enormous, even if TOON offers technical advantages.

In conclusion, JSON’s ubiquity, simplicity, and tooling maturity mean TOON will augment, not replace, JSON for the foreseeable future.


More articles by Karthick Thoppe
