JSON vs. TOON Prompting for Large Language Models

Large language models (LLMs) can interpret instructions in many different formats, including plain text, JSON, CSV, Markdown, YAML, XML, and several others. Many developers have tested different input formats to discover which structure can improve accuracy while keeping token usage as low as possible.

JSON (JavaScript Object Notation) prompts have become a common choice both for simple instruction prompts and for adding parts of datasets to the prompt context. A newer format called TOON (Token-Oriented Object Notation) has also gained attention because it can represent the same kind of context as JSON while requiring fewer tokens.

This tutorial explains the differences between JSON and TOON formatted prompts and includes examples that can help you decide whether TOON prompts are suitable for your use case.

Key Takeaways

  • TOON formatted data can be used as an alternative to JSON formatted data when sending requests to LLMs. The TOON project reports that the format can reduce input token usage by around 40%.
  • TOON formatted data has also shown the potential to improve prompt accuracy. This applies specifically to structured data formats, not to prompts that were originally written as plain text and then converted into JSON.
  • TOON is not a replacement for JSON in model outputs. For parsing, function calling, and other situations where JSON outputs are helpful, TOON has not yet proven to be equally effective.

Why JSON Prompting Is Sometimes Used

There are two main reasons why developers use JSON formatting in LLM prompts. The first reason is that some developers prefer JSON formatted prompts over plain text prompts because the structure is clear, consistent, and easy to read.

Example

json_prompt = {
  "task": "summarize_email",
  "email": "Love the app, but it keeps crashing. It shows me a black screen with no words appearing. I've tried logging out and back in again.",
  "output": "list",
  "max_points": 3
}

The JSON shown above can be converted into a Python string and then sent to an LLM to generate a response like the following. This approach is an alternative to writing plain text instructions that describe the LLM’s role and ask it to summarize the email in a list with three points.

Output

 - The user is experiencing repeated app crashes, specifically encountering a black screen without text.

 - They have attempted troubleshooting by logging out and back in without success.

 - The user expresses overall satisfaction with the app despite the prevailing issue.
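The conversion step described above can be sketched with the standard library's json.dumps. This is only a minimal illustration; the variable name prompt_text is an assumption introduced here and is not part of the original example.

```python
import json

json_prompt = {
    "task": "summarize_email",
    "email": "Love the app, but it keeps crashing. It shows me a black screen with no words appearing. I've tried logging out and back in again.",
    "output": "list",
    "max_points": 3,
}

# json.dumps produces valid JSON text (double-quoted keys and strings).
# str(json_prompt) would instead produce Python's dict representation.
prompt_text = json.dumps(json_prompt, indent=2)  # prompt_text is a hypothetical name
print(prompt_text)
```

The resulting string can be sent as the prompt body to whichever LLM endpoint the application uses.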

Replacing plain text with JSON formatted instructions may work well for certain architectures, but it is not generally proven that JSON performs better than plain text in most cases. JSON can often require more tokens than plain text, and formal studies have not consistently shown better accuracy. In one study that compared plain text, Markdown, JSON, and YAML, the results showed that accuracy can differ significantly depending on the model, and no single format clearly performed best across all cases. Whether plain text prompts should be replaced with JSON should therefore be decided by testing the specific model and application use case.

The second reason to use JSON is when a large set of structured data needs to be included in the prompt context for the LLM.

JSON_dataset = {
  "animals": [
    {
      "name": "African Elephant",
      "category": "mammal",
      "habitat": "savanna",
      "diet": "herbivore"
    },
    {
      "name": "Red-Eyed Tree Frog",
      "category": "amphibian",
      "habitat": "rainforest",
      "diet": "carnivore"
    },
    {
      "name": "Bald Eagle",
      "category": "bird",
      "habitat": "forests near water",
      "diet": "carnivore"
    },
    {
      "name": "Great White Shark",
      "category":  "fish",
      "habitat":  "ocean",
      "diet":  "carnivore"
    }
  ]
}

prompt = f"Based on their habitat and diet, which of the following animals would be most difficult to keep in a zoo? Respond with only the name of the animal. {JSON_dataset}"

In the prompt above, example database entries are sent together with a text instruction to an LLM. This enables the model to perform a logic-based query on part of the dataset. In this example, the model is asked which animal would be the most difficult to keep in a zoo based on the listed attributes. The result is shown below.

Output

Great White Shark

In this use case, the JSON context was not created by converting plain text. Instead, it comes from a source that already stores structured data. The data is usually combined with plain text instructions. These are the types of situations where TOON data formats may be useful.
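One detail worth checking when combining instructions with structured data: interpolating a Python dict directly into an f-string inserts Python's dict representation (single quotes) rather than JSON text. The sketch below shows the difference and one way to serialize explicitly. The helper build_prompt and the two-record dataset are assumptions made for brevity.

```python
import json

# Trimmed to two records to keep the example short.
dataset = {
    "animals": [
        {"name": "African Elephant", "category": "mammal", "habitat": "savanna", "diet": "herbivore"},
        {"name": "Bald Eagle", "category": "bird", "habitat": "forests near water", "diet": "carnivore"},
    ]
}

INSTRUCTION = (
    "Based on their habitat and diet, which of the following animals would be "
    "most difficult to keep in a zoo? Respond with only the name of the animal."
)


def build_prompt(instruction, data):
    """Join a plain-text instruction with the data serialized as JSON text."""
    return f"{instruction}\n\n{json.dumps(data, indent=2)}"


as_repr = f"{dataset}"          # Python dict representation, not valid JSON
as_json = json.dumps(dataset)   # valid JSON text

print(as_repr[:30])
print(as_json[:30])

prompt = build_prompt(INSTRUCTION, dataset)
```

Models often cope with either form, but the character sequence, and therefore the token count, can differ between them, so it helps to be explicit about which one is being measured.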

Why Use TOON?

The main reason to use TOON is to keep the benefits of JSON or another structured data format while reducing token usage. If the previous JSON example is encoded into TOON format, the number of tokens drops significantly because TOON is less verbose and avoids many repeated tokens while preserving a similar output.

TOON_dataset = """animals[4]{name,category,habitat,diet}:
  African Elephant,mammal,savanna,herbivore
  Red-Eyed Tree Frog,amphibian,rainforest,carnivore
  Bald Eagle,bird,forests near water,carnivore
  Great White Shark,fish,ocean,carnivore"""

prompt = f"Based on their habitat and diet, which of the following animals would be most difficult to keep in a zoo? Respond with only the name of the animal. {TOON_dataset}"

The dataset above includes the same information as the JSON example, but it has been converted into TOON format. It uses fewer tokens, with 71 tokens compared to 172 tokens in the JSON version, which equals a 59% reduction. The output below produces the same result as the JSON example.
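To make the source of the savings concrete, here is a hand-written sketch of how a list of records with identical fields collapses into one header line plus one row per record. It is not the toon-python implementation. The function name encode_table is hypothetical, and the sketch ignores quoting, escaping, and nested values.

```python
def encode_table(key, rows):
    """Encode a list of flat dicts with identical keys as a TOON-style table.

    Simplified sketch: no quoting or escaping, no nested values.
    """
    if not rows:
        raise ValueError("rows must not be empty")
    fields = list(rows[0])
    for row in rows:
        if list(row) != fields:
            raise ValueError("every row must have the same fields in the same order")

    # Field names are written once in the header instead of once per record.
    field_list = ",".join(fields)
    header = f"{key}[{len(rows)}]{{{field_list}}}:"
    lines = ["  " + ",".join(str(row[f]) for f in fields) for row in rows]
    return "\n".join([header, *lines])


animals = [
    {"name": "African Elephant", "category": "mammal", "habitat": "savanna", "diet": "herbivore"},
    {"name": "Red-Eyed Tree Frog", "category": "amphibian", "habitat": "rainforest", "diet": "carnivore"},
    {"name": "Bald Eagle", "category": "bird", "habitat": "forests near water", "diet": "carnivore"},
    {"name": "Great White Shark", "category": "fish", "habitat": "ocean", "diet": "carnivore"},
]

table = encode_table("animals", animals)
print(table)
```

For this dataset the sketch reproduces the TOON string shown above. The key names, braces, and quotation marks that JSON repeats for every record appear only once.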

Output

Great White Shark

On the GitHub release page, TOON is shown as achieving higher accuracy than JSON while using around 40% fewer tokens across four different model providers. A Python converter is also available, which can convert JSON into TOON format before the data is sent to an LLM. The following sections explain how to use it.

How to Choose the Best Prompting Format

To choose the best prompting format for an application, first determine whether structured data from a database or another source needs to be passed to the LLM as JSON, CSV, XML, YAML, or a similar format. If there is no structured data and the prompts are already written in plain language, TOON is probably not the best option. In that case, continue using plain language prompts and refine them to improve performance.

If structured data does need to be passed to the model, it may be useful to test TOON formatted prompts for the application. As with all prompt formatting options, token count is the most directly measurable criterion, but it only matters once accuracy is acceptable. Test different formats, identify which ones meet the required accuracy level, and then choose the option that uses the fewest tokens. In this tutorial, a TOON Python package is used to create a workflow that converts JSON into TOON before sending it to an LLM endpoint.
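The selection rule described above (filter by required accuracy, then pick the cheapest format) can be sketched as follows. The function name choose_format, the format labels, and every number in results are placeholders invented for illustration. They are not measurements.

```python
def choose_format(results, min_accuracy):
    """Return the format with the fewest tokens among those meeting min_accuracy.

    Returns None when no format reaches the accuracy threshold.
    """
    eligible = {
        name: stats for name, stats in results.items()
        if stats["accuracy"] >= min_accuracy
    }
    if not eligible:
        return None
    return min(eligible, key=lambda name: eligible[name]["tokens"])


# Placeholder values only; replace with figures measured on your own
# model, prompts, and evaluation set.
results = {
    "format_a": {"accuracy": 0.90, "tokens": 1000},
    "format_b": {"accuracy": 0.80, "tokens": 800},
    "format_c": {"accuracy": 0.90, "tokens": 600},
}

print(choose_format(results, min_accuracy=0.85))
```

In this made-up data, format_b is the second cheapest but is excluded because it misses the accuracy threshold, which is the reason to filter before comparing token counts.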

Step One — Setting Up the Environment

First, prepare the environment by installing Python and the TOON Python library.

pip install git+https://github.com/toon-format/toon-python.git

Create a Python script named TOON_demo.py. Make sure that access to a large language model API is available. This tutorial uses a Mistral 3 model that has been deployed on a GPU-based virtual server by following a general Mistral 3 deployment process.

Step Two — Importing Dependencies

Next, import the Python dependencies and create the function that calls the LLM. Replace the your_server_ip placeholder with the public IPv4 address of the instance. Alternatively, this function can be changed to use an LLM API service.

TOON_demo.py

import json

import requests
from toon_format import encode, decode

# Completions endpoint of the self-hosted model server
url = "http://your_server_ip:8000/v1/completions"

def call_LLM(dataset):
    # Serialize dicts as JSON text; TOON input is already a string
    if not isinstance(dataset, str):
        dataset = json.dumps(dataset, indent=2)

    prompt = f"Based on their habitat and diet, which of the following animals would be most difficult to keep in a zoo? Respond with only the name of the animal. {dataset}"

    data = {
        "model": "mistralai/Ministral-3-14B-Instruct-2512",
        "prompt": prompt,
        "max_tokens": 500
    }

    response = requests.post(url, json=data, timeout=60)
    response.raise_for_status()  # Stop early on HTTP errors
    response_message = response.json()['choices'][0]['text']
    print(response_message)
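If no model server is available yet, the request and response handling can be exercised offline. The sketch below separates payload construction from response parsing. The helper names build_payload and extract_text are hypothetical, and fake_response is a hand-written stand-in shaped like the body the function above indexes into, not a real server reply.

```python
def build_payload(prompt, model="mistralai/Ministral-3-14B-Instruct-2512", max_tokens=500):
    """Build the request body for a completions-style endpoint."""
    return {"model": model, "prompt": prompt, "max_tokens": max_tokens}


def extract_text(response_json):
    """Pull the generated text out of a completions-style response body."""
    choices = response_json.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    return choices[0]["text"].strip()


# Hand-written stand-in for a server reply, for illustration only.
fake_response = {"choices": [{"text": " Great White Shark\n"}]}

payload = build_payload("Which animal is hardest to keep in a zoo?")
print(payload["max_tokens"])
print(extract_text(fake_response))
```

Keeping these two steps as plain functions makes it easier to swap the self-hosted endpoint for a hosted API later without touching the prompt logic.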

Step Three — Using the TOON Encoder

Next, use the encode function from the toon_format library to convert the JSON data before making the LLM call.

TOON_demo.py

JSON_dataset = {
  "animals": [
    {
      "name": "African Elephant",
      "category": "mammal",
      "habitat": "savanna",
      "diet": "herbivore"
    },
    {
      "name": "Red-Eyed Tree Frog",
      "category": "amphibian",
      "habitat": "rainforest",
      "diet": "carnivore"
    },
    {
      "name": "Bald Eagle",
      "category": "bird",
      "habitat": "forests near water",
      "diet": "carnivore"
    },
    {
      "name": "Great White Shark",
      "category":  "fish",
      "habitat":  "ocean",
      "diet":  "carnivore"
    }
  ]
}

TOON_dataset = encode(JSON_dataset)
print(TOON_dataset)

Encoding the JSON dataset returns a string in TOON format.

Output

animals[4]{name,category,habitat,diet}:
  African Elephant,mammal,savanna,herbivore
  Red-Eyed Tree Frog,amphibian,rainforest,carnivore
  Bald Eagle,bird,forests near water,carnivore
  Great White Shark,fish,ocean,carnivore

Decoding the TOON dataset returns a Python dictionary with the same content as the original JSON data. When printed, it appears on a single line, without the line breaks and indentation found in formatted JSON.

TOON_demo.py

JSON_formatted_dataset = decode(TOON_dataset)
print(JSON_formatted_dataset)

Output

{'animals': [{'name': 'African Elephant', 'category': 'mammal', 'habitat': 'savanna', 'diet': 'herbivore'}, {'name': 'Red-Eyed Tree Frog', 'category': 'amphibian', 'habitat': 'rainforest', 'diet': 'carnivore'}, {'name': 'Bald Eagle', 'category': 'bird', 'habitat': 'forests near water', 'diet': 'carnivore'}, {'name': 'Great White Shark', 'category': 'fish', 'habitat': 'ocean', 'diet': 'carnivore'}]}
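For intuition about what decoding involves, the following hand-written sketch parses the simple tabular form shown in this tutorial and checks that the row count and row width match the header. It is not the library's decoder. The name decode_table is hypothetical, all values come back as strings, and quoting, escaping, and nesting are not handled.

```python
import re

# Matches headers such as: animals[4]{name,category,habitat,diet}:
HEADER = re.compile(r"^(\w+)\[(\d+)\]\{([^}]*)\}:$")


def decode_table(text):
    """Parse a single TOON-style table into {key: [row dicts]}."""
    lines = text.strip().splitlines()
    if not lines:
        raise ValueError("empty input")
    match = HEADER.match(lines[0].strip())
    if match is None:
        raise ValueError("not a tabular TOON header")

    key = match.group(1)
    declared_rows = int(match.group(2))
    fields = match.group(3).split(",")

    rows = [line.strip().split(",") for line in lines[1:] if line.strip()]
    if len(rows) != declared_rows:
        raise ValueError(f"header declares {declared_rows} rows, found {len(rows)}")
    for row in rows:
        if len(row) != len(fields):
            raise ValueError("row width does not match header")
    return {key: [dict(zip(fields, row)) for row in rows]}


toon_text = """animals[4]{name,category,habitat,diet}:
  African Elephant,mammal,savanna,herbivore
  Red-Eyed Tree Frog,amphibian,rainforest,carnivore
  Bald Eagle,bird,forests near water,carnivore
  Great White Shark,fish,ocean,carnivore"""

decoded = decode_table(toon_text)
print(decoded["animals"][0])
```

The declared row count and field list in the header give a cheap structural check, which can be useful when TOON text comes from an untrusted source such as model output.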

Step Four — Running Both JSON and TOON Datasets

Next, run both the JSON and TOON datasets and compare whether the results are accurate.

TOON_demo.py

print("JSON Dataset")
call_LLM(JSON_dataset)

print("\n\nTOON Dataset")
call_LLM(TOON_dataset)

Output

JSON Dataset
Great White Shark

TOON Dataset
Great White Shark

Step Five — Counting Tokens and Estimating Savings

The toon_format library includes an estimate_savings function that can be used to compare token usage. Import the function and pass in the JSON_dataset.

TOON_demo.py

from toon_format import estimate_savings

savings = estimate_savings(JSON_dataset)
print(f"Estimated savings: {savings}")

Output

Estimated savings: {'json_tokens': 169, 'toon_tokens': 71, 'savings': 98, 'savings_percent': 57.98816568047337}

Token counts may differ depending on the model, but the function returns the estimated token count for each data format as well as the percentage of tokens saved.
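The relationship between the reported fields can be checked by hand. The sketch below assumes the percentage is computed relative to the JSON token count, which is consistent with the figures shown above. The function name savings_report is hypothetical and is not part of toon_format.

```python
def savings_report(json_tokens, toon_tokens):
    """Recompute absolute and percentage savings from two token counts."""
    saved = json_tokens - toon_tokens
    return {
        "json_tokens": json_tokens,
        "toon_tokens": toon_tokens,
        "savings": saved,
        # Percentage is relative to the JSON baseline (an assumption).
        "savings_percent": saved / json_tokens * 100,
    }


report = savings_report(169, 71)
print(report["savings"])
print(round(report["savings_percent"], 2))
```

Applied to the 172 and 71 token counts quoted earlier in this tutorial, the same formula gives roughly 59%. The small gap between the two JSON counts is a reminder that results depend on how the JSON text is serialized and which tokenizer is used.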

FAQ About JSON and TOON Prompting

Does TOON support nested or complex JSON structures?

Yes. TOON can represent nested JSON objects and arrays because the toon-python encoder automatically converts hierarchical data into TOON format. As the depth and complexity of the structure increase, however, you should verify both that the token savings remain worthwhile and that the target model continues to interpret the encoded data correctly.

Can TOON be used as an output format?

Yes, although its practicality depends on the model. You can instruct an LLM to generate responses in TOON format and then convert the result back into JSON for downstream processing. Since TOON is still a relatively new format, most language models have had far less exposure to it than to JSON during training. As a result, models may be less reliable when producing valid TOON output. JSON remains the preferred choice for tasks such as structured parsing and function calling, and TOON-based outputs may require additional testing or fine-tuning to achieve consistent formatting.

Are there compatibility concerns when using TOON with different LLMs?

In general, TOON should work with any language model trained on broad text corpora, and it has already been tested successfully with multiple models. Because it is newer than JSON, however, some models may have encountered relatively few TOON examples during training. For that reason, validating performance with the specific model used in your application is recommended.

Can I write plain-text instructions in TOON format?

You can structure instruction prompts with TOON, but doing so does not necessarily improve model performance. Previous research has shown that converting ordinary text prompts into structured representations such as JSON does not automatically increase response accuracy. TOON is therefore most useful when the contextual information is already organized as structured data rather than when formatting natural-language instructions.

Do JSON or TOON prompts make model outputs more deterministic?

There is no definitive answer. Some practitioners report that structured prompt formats, particularly JSON, can produce more consistent responses. Even so, language models remain inherently non-deterministic, so identical prompts can still generate different outputs. Many techniques exist for improving consistency, and structured prompting is only one of them. Whether TOON offers additional determinism depends on the specific model, dataset, and application, making empirical testing the most reliable way to evaluate its impact.
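One simple way to run such a test is to send the same prompt several times per format and measure how often the answers agree. The sketch below is a minimal version of that idea. The function name agreement_rate is hypothetical, and the sample responses are invented for illustration rather than collected from a model.

```python
from collections import Counter


def agreement_rate(responses):
    """Share of responses equal to the most common one.

    Responses are compared after trimming whitespace and lowercasing.
    """
    normalized = [r.strip().lower() for r in responses]
    if not normalized:
        raise ValueError("responses must not be empty")
    _, top_count = Counter(normalized).most_common(1)[0]
    return top_count / len(normalized)


# Invented responses, standing in for repeated calls with one prompt.
sample = [
    "Great White Shark",
    "great white shark ",
    "African Elephant",
    "Great White Shark",
]

print(agreement_rate(sample))
```

Comparing this rate between a JSON prompt and a TOON prompt, at the same sampling settings, gives a rough indication of whether the format changes consistency for a given model.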

Conclusion

TOON formatted data can be a more efficient and lower-cost alternative to JSON when structured data needs to be included in the context of an LLM prompt. In the project's published benchmarks, it uses around 40% fewer tokens while offering comparable, and in some cases improved, accuracy. Because LLMs process large amounts of data through context tokens, formats that preserve information while reducing token usage are likely to improve efficiency and may also improve accuracy. TOON is one example of this type of format, and more alternatives are likely to appear in the future.

In this tutorial, a TOON encoder was used to convert JSON data and estimate how many tokens could be saved by using a different format. The next step is to implement this approach in an application and test whether the results are acceptable for the specific use case while confirming that token usage is reduced.

Source: digitalocean.com
