Using Zod and zodResponseFormat with OpenAI
Over the last couple of weeks, I’ve been experimenting with Zod and zodResponseFormat
to request “structured outputs” from OpenAI.
It is documented in a rudimentary way here but I still had a bunch of open questions.
Here’s a brief but comprehensive summary of the Zod and zodResponseFormat
implementation for OpenAI’s Structured Outputs feature.
All of this is in a Node.js environment, written as ES modules.
Declarations
Below are the declarations you need.
import OpenAI from "openai";
import { zodResponseFormat } from "openai/helpers/zod";
import { z } from "zod";
const openai = new OpenAI();
This means that you always have to install both openai and Zod (npm install zod openai) for any project if you plan on using Zod.
An alternative approach is to define your schema in accordance with zodResponseFormat
once, archive that schema, and use it. However, the risk of this approach is that OpenAI will update their helper as well as the range of things available using Zod.
I tried to create my own JSON schema and use those, adhering to the rules OpenAI gave. But I never got it to work for anything but the most basic schema.
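For reference, a hand-written schema passed directly as response_format looks roughly like the sketch below. It follows the rules in OpenAI’s Structured Outputs documentation as I understand them (strict set to true, every property listed in required, additionalProperties set to false on every object). The schema name and field names are made up for illustration.

```javascript
// Hand-written equivalent of a small Zod schema (illustrative names only).
const thingResponseFormat = {
  type: "json_schema",
  json_schema: {
    name: "thing", // hypothetical schema name
    strict: true,
    schema: {
      type: "object",
      properties: {
        type: { type: "string", enum: ["alien", "spaceship"] },
        name: { type: "string" },
      },
      required: ["type", "name"], // every property must be listed here
      additionalProperties: false, // needed on every object in the schema
    },
  },
};

// It would then be passed in place of zodResponseFormat(...), e.g.
// response_format: thingResponseFormat
```

Anything more deeply nested means repeating required and additionalProperties on each object, which is where hand-written schemas tend to go wrong.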
Creating Zod Schema
Below are the rules for creating Zod schema for OpenAI calls.
- Use z.object() to define the structure
- Use z.string(), z.number(), z.boolean(), z.array(), and z.enum() for types
- Use z.array() thus, specifying the type inside: z.array(z.string())
- Use z.enum() thus: z.enum(["thing 1", "thing 2", "thing 3"]). Note that z.enum() only takes string values, so z.enum([1, 2, 3]) is not valid Zod.
All fields must be set as “required” by default — and the OpenAI helper sets this, so there’s nothing for you to do.
To make an optional value, use z.union() with z.null(), so that the field is always present but is allowed to be null. E.g.
...
optionalField: z.union([z.string(), z.null()]),
...
You can also nest objects without declaring them explicitly above. But I found that this was problematic. I found it much more reliable to create objects and nest them by name, as shown in the sample code below.
When explicitly writing schema, you can use anyOf. But at present, there’s no way to use anyOf (that I know of… if you know, help!) using Zod.
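For comparison, this is roughly what anyOf looks like in a hand-written schema. It is a sketch under my reading of OpenAI’s rules: each branch is a full object with its own required and additionalProperties, and anyOf sits on a property because the root of the schema still has to be an object. The field names are invented for the example.

```javascript
// "item" may be either of two shapes (illustrative field names only).
const itemSchema = {
  anyOf: [
    {
      type: "object",
      properties: { species: { type: "string" } },
      required: ["species"],
      additionalProperties: false,
    },
    {
      type: "object",
      properties: { crewSize: { type: "number" } },
      required: ["crewSize"],
      additionalProperties: false,
    },
  ],
};

// anyOf is attached to a property; the root itself stays a plain object.
const rootSchema = {
  type: "object",
  properties: { item: itemSchema },
  required: ["item"],
  additionalProperties: false,
};
```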
Warning: Don’t use an LLM to create Zod schema. It will get very creative with things like .optional() or other unsupported bits of schema. It’s likely also to be very complicated. If you do use an LLM, then copy-paste this guide into it as a strict set of guidelines. (And it’ll still probably be cheeky and break the rules…)
Calling OpenAI
There are a couple of small differences (subject to change) in how you call the OpenAI API endpoint. This means that if you have a function for calling LLMs (as I do) then you will have to modify it slightly to be able to use zodResponseFormat.
- You can use two models: gpt-4o-mini (which currently can handle Structured Outputs) or gpt-4o-2024-08-06. At some point, GPT-4o will redirect to the latter.
- You have to call openai.beta.chat.completions.parse
- You have to get message.parsed
The response is always a JSON object.
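The differences above can be folded into a small helper. This is a minimal sketch, not a definitive implementation: callStructured is a hypothetical name, client is assumed to be an OpenAI instance (or anything with the same shape), and responseFormat is assumed to be the object returned by zodResponseFormat. The refusal check assumes a refusal arrives as a string on message.refusal.

```javascript
// Hypothetical helper: makes a Structured Outputs call and returns the parsed object.
async function callStructured(client, { model, messages, responseFormat }) {
  const completion = await client.beta.chat.completions.parse({
    model,
    messages,
    response_format: responseFormat, // the object from zodResponseFormat(), not the Zod schema
  });

  const message = completion.choices[0].message;

  // Assumption: when the model declines, message.refusal holds the reason.
  if (message.refusal) {
    throw new Error(`Model refused: ${message.refusal}`);
  }

  return message.parsed;
}

// Usage (inside an async context), assuming the declarations from earlier:
// const result = await callStructured(openai, {
//   model: "gpt-4o-mini",
//   messages,
//   responseFormat: zodResponseFormat(MySchema, "schemaName"),
// });
```

Taking the client as a parameter keeps the helper easy to exercise with a stub instead of a real API call.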
Sample Code
Below is some sample code for using Zod and zodResponseFormat. Make sure you have the declarations above, and of course, you have to initialise the OpenAI API.
// A reusable object schema, declared once and nested by name below.
const thingSchema = z.object({
  type: z.enum(["alien", "spaceship"]),
  name: z.string()
});

const MySchema = z.object({
  field1: z.string(),
  field2: z.number(),
  // Inline nested object (works, but nesting by name was more reliable for me).
  nestedField: z.object({
    subField: z.boolean()
  }),
  enumField: z.enum(["sausage", "elephant", "telephone"]),
  arrayField: z.array(z.string()),
  arrayOfThings: z.array(thingSchema)
});

const completion = await openai.beta.chat.completions.parse({
  model: "gpt-4o-mini",
  messages: [
    { role: "system", content: "Instruction for the model" },
    { role: "user", content: "User input" }
  ],
  response_format: zodResponseFormat(MySchema, "schemaName")
});

// The parsed result is an object matching MySchema.
const result = completion.choices[0].message.parsed;
Various Implementation Rules
Below are a mix of rules from OpenAI plus some things I learned through experimentation.
- There is a maximum of 5 levels of nesting (which is easy to get to!).
- Objects can have up to 100 properties (though I haven’t gotten near this)
- Use zodResponseFormat(Schema, "schemaName") in the API call
- You have to give the schema a name, but I don’t know why. It can be anything.
- When calling other functions (e.g. if you have a function that makes your LLM calls), you should pass the object created by zodResponseFormat, not the Zod schema.
Some of these rules may change as OpenAI updates its API.
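Since the nesting limit is easy to hit, a rough check on a plain JSON schema can help before sending it. The sketch below is an illustration only: measureSchema is a hypothetical helper, it assumes the root object counts as level 1 and that arrays do not add a level, it counts properties across the whole schema, and it follows only properties, items and anyOf (it ignores $ref). How OpenAI counts levels exactly is an assumption here.

```javascript
// Hypothetical helper: reports object nesting depth and total property count.
function measureSchema(schema, depth = 0) {
  if (schema === null || typeof schema !== "object") {
    return { maxDepth: depth, propertyCount: 0 };
  }

  const isObject = schema.type === "object" && Boolean(schema.properties);
  const level = isObject ? depth + 1 : depth; // only objects add a level
  let maxDepth = level;
  let propertyCount = 0;

  // Collect the sub-schemas to visit.
  const children = [];
  if (isObject) {
    const props = Object.values(schema.properties);
    propertyCount += props.length;
    children.push(...props);
  }
  if (schema.items) children.push(schema.items);
  if (Array.isArray(schema.anyOf)) children.push(...schema.anyOf);

  for (const child of children) {
    const result = measureSchema(child, level);
    maxDepth = Math.max(maxDepth, result.maxDepth);
    propertyCount += result.propertyCount;
  }
  return { maxDepth, propertyCount };
}

// Example with a small hand-written schema:
const sample = {
  type: "object",
  properties: {
    field1: { type: "string" },
    nestedField: {
      type: "object",
      properties: { subField: { type: "boolean" } },
    },
  },
};
const report = measureSchema(sample);
// report.maxDepth is 2 and report.propertyCount is 3 for this sample
```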
Final thoughts
OpenAI’s zodResponseFormat schema seems to be a work in progress. It’s not easy to use, requires too many imports, and is quite limited compared to how much Zod schema can do.
The main limitations, in my mind, are
- The need to use “zod”. It’s yet another package and standard, even if it’s a good package. Unfortunately, creating JSON schema by hand is annoying, so it’s necessary. Google’s release of “JSON mode” with Gemini is easier to parse and use.
- The limited levels of nesting. Five levels?? C’mon.
There are little bits that make me think… Why is this released? For example, why make it slightly complicated to use optional values? Why is there no clear description of how to use anyOf using Zod?
Anyway, it’s a good start, and within its constraints, it works very reliably. But I’d be aware that it will likely change again very soon.