# tsfm — Complete Documentation

> TypeScript SDK for Apple's Foundation Models framework — on-device Apple Intelligence inference in Node.js via FFI. macOS 26+, Apple Silicon only.

---
Source: /guide/getting-started
---

# Getting Started

TSFM gives Node.js applications access to Apple's on-device large language model through the Foundation Models framework. It loads a pre-compiled dynamic library [via FFI](https://koffi.dev/), giving it the same access as native Swift and Objective-C applications.

TSFM is **not** a browser library or a cloud API. TSFM requires Node.js ≥20 on an Apple Silicon Mac running macOS 26+ with Apple Intelligence enabled. No matter what your AI assistant tells you, TSFM **will not work** in browser client-side code, on Windows/Linux, on Intel Macs, or on Macs without Apple Intelligence enabled.

You might use TSFM for CLI tools, local dev tooling, Electron apps, automation scripts, or small Mac-native services written in TypeScript.

## Requirements

- **macOS 26** (Tahoe) or later, Apple Silicon
- **Apple Intelligence** enabled in System Settings
- **Node.js 20+**

## Installation

```bash
npm install tsfm-sdk
```

Xcode is not required to use this package. The npm package ships with a prebuilt dylib for macOS 26.0+. If you know your machine requires a different dylib, see [Building from Source](#building-from-source).

## Quick Start

```ts
import { SystemLanguageModel, LanguageModelSession } from "tsfm-sdk";

const model = new SystemLanguageModel();
const { available } = await model.waitUntilAvailable();
if (!available) process.exit(1);

const session = new LanguageModelSession({
  instructions: "You are a concise assistant.",
});

const reply = await session.respond("What is the capital of France?");
console.log(reply); // "The capital of France is Paris."

session.dispose();
model.dispose();
```

## Key Concepts

**Apple Intelligence** refers to Apple's suite of generative AI features (Siri, Writing Tools, Image Playground, and more).
The **Foundation Models** framework exposes **SystemLanguageModel**, the **on-device** large language model at the core of Apple Intelligence that runs on Macs, iPhones, and iPads with no network required.

TSFM closely mirrors the Swift Foundation Models API (same class names, same method signatures, same concepts), so TypeScript calls translate directly into the same operations against the same underlying model. For the most part, [Apple's own documentation](https://developer.apple.com/documentation/FoundationModels) applies directly.

| SDK class | Role |
| --- | --- |
| `SystemLanguageModel` | Entry point. Wraps the native model pointer and gates availability before you create sessions. |
| `LanguageModelSession` | Holds conversation state. All generation (text, structured, streaming, tool use) goes through a session. |
| `.dispose()` or `Symbol.dispose` | Releases native resources. Required for any object that holds a C pointer. |

## Where To Go From Here

- [Model Configuration](/guide/model-configuration) — Use cases, guardrails, availability
- [Sessions](/guide/sessions) — Creating and using sessions
- [Streaming](/guide/streaming) — Token-by-token response streaming
- [Structured Outputs](/guide/structured-output) — Typed generation with dictionary or JSON schemas
- [Tools](/guide/tools) — Function calling
- [Error Handling](/guide/error-handling) — Error types and recovery
- [Chat API Compatibility](/guide/chat-api) — Drop-in Chat API compatible interface

## Building from Source

If you are working on TSFM as a developer, or need to rebuild the native library, run:

```bash
git clone https://github.com/codybrom/tsfm.git
cd tsfm
npm run build
```

Rebuilding from source requires **Xcode 26+** to compile the libFoundationModels.dylib Swift bridge.

---
Source: /guide/model-configuration
---

# Model Configuration

`SystemLanguageModel` is the entry point for the on-device model. It wraps the native model pointer to gate availability before you create sessions.
::: info
The **Swift** equivalent is [`SystemLanguageModel`](https://developer.apple.com/documentation/foundationmodels/systemlanguagemodel).
:::

## Creating a Model

```ts
import {
  SystemLanguageModel,
  SystemLanguageModelUseCase,
  SystemLanguageModelGuardrails,
} from "tsfm-sdk";

const model = new SystemLanguageModel({
  useCase: SystemLanguageModelUseCase.GENERAL,
  guardrails: SystemLanguageModelGuardrails.DEFAULT,
});
```

Both options are optional and default to the values shown above.

## Guardrails

Guardrails control how the model handles potentially unsafe content in prompts and responses.

::: info
The **Swift** equivalent is [`SystemLanguageModel.Guardrails`](https://developer.apple.com/documentation/foundationmodels/systemlanguagemodel/guardrails).
:::

| Value | Description |
| --- | --- |
| `DEFAULT` | Blocks unsafe content in both prompts and responses. Use this for most applications. |
| `PERMISSIVE_CONTENT_TRANSFORMATIONS` | Allows transforming potentially unsafe text input into text responses. Use this when your app needs to process user-generated content that may contain sensitive material (e.g., content moderation tools, text rewriting). |

```ts
const model = new SystemLanguageModel({
  guardrails: SystemLanguageModelGuardrails.PERMISSIVE_CONTENT_TRANSFORMATIONS,
});
```

With `DEFAULT` guardrails, unsafe content may trigger a `GuardrailViolationError`. With `PERMISSIVE_CONTENT_TRANSFORMATIONS`, the model may attempt to transform the content instead of rejecting it outright.

## Use Cases

Use cases hint to the model what kind of task you're performing.

::: info
The **Swift** equivalent is [`SystemLanguageModel.UseCase`](https://developer.apple.com/documentation/foundationmodels/systemlanguagemodel/usecase).
:::

| Value | Description |
| --- | --- |
| `GENERAL` | General-purpose text generation (default) |
| `CONTENT_TAGGING` | Optimized for classification and labeling tasks |

```ts
const tagger = new SystemLanguageModel({
  useCase: SystemLanguageModelUseCase.CONTENT_TAGGING,
});
```

## Checking Availability

The on-device model may not be available if Apple Intelligence is disabled, assets haven't finished downloading, or the hardware doesn't support it. Always check before creating a session.

### Synchronous Check

```ts
const { available, reason } = model.isAvailable();
if (!available) {
  console.log("Unavailable:", reason);
}
```

### Waiting for Availability

`waitUntilAvailable()` polls until the model is ready, with a default timeout of 30 seconds. If the failure is permanent (`DEVICE_NOT_ELIGIBLE` or `APPLE_INTELLIGENCE_NOT_ENABLED`), it returns immediately rather than waiting the full timeout. It only retries when the reason is `MODEL_NOT_READY`.

```ts
const { available } = await model.waitUntilAvailable();

// With a custom timeout in milliseconds:
const readiness = await model.waitUntilAvailable(10_000);
```

## Unavailability Reasons

When `available` is `false`, the `reason` field indicates why:

| Reason | Description |
| --- | --- |
| `APPLE_INTELLIGENCE_NOT_ENABLED` | Apple Intelligence is turned off in Settings |
| `MODEL_NOT_READY` | Model assets are still downloading |
| `DEVICE_NOT_ELIGIBLE` | Hardware doesn't support Foundation Models |

## Cleanup

Release native resources when you're done with the model:

```ts
model.dispose();
```

---
Source: /guide/sessions
---

# Sessions

`LanguageModelSession` manages conversation state and provides all generation methods. Each session maintains its own context window and [transcript](/guide/transcripts).

::: info
The **Swift** equivalent is [`LanguageModelSession`](https://developer.apple.com/documentation/foundationmodels/languagemodelsession).
:::

## Creating a Session

```ts
import { LanguageModelSession } from "tsfm-sdk";

const session = new LanguageModelSession({
  instructions: "You are a concise assistant.",
});
```

### With a Specific Model

```ts
const model = new SystemLanguageModel({ useCase: SystemLanguageModelUseCase.CONTENT_TAGGING });
const session = new LanguageModelSession({ model });
```

### With Tools

```ts
const session = new LanguageModelSession({
  tools: [weatherTool, calculatorTool],
});
```

## Generating Responses

### Text Response

```ts
const reply = await session.respond("What is the capital of France?");
console.log(reply); // "The capital of France is Paris."
```

### With Generation Options

```ts
const reply = await session.respond("Write a poem", {
  options: {
    temperature: 0.9,
    maximumResponseTokens: 200,
  },
});
```

See [Generation Options](/guide/generation-options) for all available options.

## Concurrency

Sessions serialize concurrent calls automatically. If you call `respond()` while another request is in progress, it queues and runs after the first completes:

```ts
// These run sequentially, not in parallel
const [a, b] = await Promise.all([
  session.respond("First question"),
  session.respond("Second question"),
]);
```

## Cancellation

Cancel an in-progress request with `cancel()`:

```ts
const promise = session.respond("Tell me a long story");
session.cancel();
```

Cancellation is advisory — the response may still complete if the model finishes before the cancel is processed. After cancellation, the session resets to idle and is ready for new requests.
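One common use of advisory cancellation is capping a request's runtime. The helper below is a minimal sketch, not part of the SDK — `respondWithDeadline` is a hypothetical name, and because how the pending promise settles after a cancel isn't guaranteed, it handles completion and rejection alike:

```ts
// Sketch (hypothetical helper, not in the SDK): bound a request's runtime
// by scheduling a cancel. `respond` and `cancel` stand in for
// session.respond / session.cancel.
async function respondWithDeadline(
  respond: () => Promise<string>,
  cancel: () => void,
  deadlineMs: number,
): Promise<string | undefined> {
  const timer = setTimeout(cancel, deadlineMs);
  try {
    return await respond(); // model finished before the cancel took effect
  } catch {
    return undefined; // treat a cancelled/failed request as "no reply"
  } finally {
    clearTimeout(timer);
  }
}

// Usage with a real session might look like:
// const reply = await respondWithDeadline(
//   () => session.respond("Tell me a long story"),
//   () => session.cancel(),
//   5_000,
// );
```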
## Checking State

`isResponding` tells you whether the session is currently processing a request:

```ts
if (session.isResponding) {
  // A generation call is in flight
}
```

## Cleanup

Always dispose sessions when done to release native memory:

```ts
session.dispose();
```

::: tip
If you prefer a higher-level interface, the [Chat API compatibility layer](/guide/chat-api) manages sessions automatically behind a more standard `chat.completions.create()` interface.
:::

---
Source: /guide/streaming
---

# Streaming

TSFM can stream responses token-by-token using an async iterator. The on-device model produces cumulative snapshots, and the SDK diffs them internally so you receive only the new tokens on each iteration.

::: info
The **Swift** equivalent is [`LanguageModelSession.ResponseStream`](https://developer.apple.com/documentation/foundationmodels/languagemodelsession/responsestream).
:::

## Basic Streaming

```ts
import { LanguageModelSession } from "tsfm-sdk";

const session = new LanguageModelSession();

for await (const chunk of session.streamResponse("Tell me a joke")) {
  process.stdout.write(chunk);
}
console.log();

session.dispose();
```

Each `chunk` is a string containing only the **new** tokens since the last iteration.
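Conceptually, the snapshot-to-chunk diffing works like this. This is a simplified sketch, not the SDK's actual implementation; it assumes each cumulative snapshot extends the previous one:

```ts
// Simplified sketch of cumulative-snapshot diffing: each snapshot holds the
// full response so far, and only the newly added suffix is yielded.
function* diffSnapshots(snapshots: Iterable<string>): Generator<string> {
  let previous = "";
  for (const snapshot of snapshots) {
    const chunk = snapshot.slice(previous.length); // text added since last snapshot
    previous = snapshot;
    if (chunk) yield chunk;
  }
}

// ["Why", "Why did", "Why did the…"] yields "Why", " did", " the…"
```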
## With Options

```ts
for await (const chunk of session.streamResponse("Write a story", {
  options: { temperature: 0.8, maximumResponseTokens: 500 },
})) {
  process.stdout.write(chunk);
}
```

## Collecting the Full Response

If you want both streaming output and the complete text:

```ts
let full = "";
for await (const chunk of session.streamResponse("Explain TypeScript")) {
  process.stdout.write(chunk);
  full += chunk;
}
console.log("\n\nFull response length:", full.length);
```

## Chat API Streaming

If you prefer the Chat API streaming interface, the [compatibility layer](/guide/chat-api#streaming) provides `stream: true` with `ChatCompletionChunk` objects:

```ts
import Client from "tsfm-sdk/chat";

const client = new Client();
const stream = await client.chat.completions.create({
  messages: [{ role: "user", content: "Tell me a joke" }],
  stream: true,
});

for await (const chunk of stream) {
  const delta = chunk.choices[0].delta.content;
  if (delta) process.stdout.write(delta);
}

client.close();
```

## Cleanup

The stream reference is released automatically when iteration completes or the session is disposed. The SDK keeps the Node.js event loop alive while streaming, so the process won't exit mid-stream.

---
Source: /guide/structured-output
---

# Structured Output

When you provide a schema, the on-device model uses constrained sampling to guarantee its output matches your types and structure, with no string parsing needed.

::: info
The **Swift** equivalents are the [`@Generable`](https://developer.apple.com/documentation/foundationmodels/generable) macro for compile-time schemas and [`DynamicGenerationSchema`](https://developer.apple.com/documentation/foundationmodels/dynamicgenerationschema) for runtime schemas. TSFM's `GenerationSchema` maps to the same underlying dictionary format.
:::

If you already have or prefer to use JSON Schema objects, you can use `respondWithJsonSchema` instead and the SDK will convert it at runtime.
If you're unsure which schema format you should use, see [Picking a Schema Format](#picking-a-schema-format).

## Defining a Schema (Native Format)

```ts
import { GenerationSchema, GenerationGuide } from "tsfm-sdk";

const schema = new GenerationSchema("Person", "A person profile")
  .property("name", "string", { description: "Full name" })
  .property("age", "integer", {
    description: "Age in years",
    guides: [GenerationGuide.range(0, 120)],
  })
  .property("tags", "array", {
    guides: [GenerationGuide.maxItems(5)],
    optional: true,
  });
```

### Property Types

`"string"` | `"integer"` | `"number"` | `"boolean"` | `"array"` | `"object"`

## Generation Guides

Guides constrain the model's output for a property.

::: info
The **Swift** equivalent is Foundation Models' [`@Guide`](https://developer.apple.com/documentation/foundationmodels/guide()) annotations. See Apple's [Generating Swift Data Structures with Guided Generation](https://developer.apple.com/documentation/foundationmodels/generating-swift-data-structures-with-guided-generation) guide.
:::

| Method | Constrains |
| --- | --- |
| `GenerationGuide.anyOf(["a", "b"])` | Enumerated string values |
| `GenerationGuide.constant("fixed")` | Exact string value |
| `GenerationGuide.range(min, max)` | Numeric range (inclusive) |
| `GenerationGuide.minimum(n)` | Numeric lower bound |
| `GenerationGuide.maximum(n)` | Numeric upper bound |
| `GenerationGuide.regex(pattern)` | String pattern |
| `GenerationGuide.count(n)` | Exact array length |
| `GenerationGuide.minItems(n)` | Minimum array length |
| `GenerationGuide.maxItems(n)` | Maximum array length |
| `GenerationGuide.element(guide)` | Applies a guide to array elements |

## Generating Structured Output

```ts
const session = new LanguageModelSession();
const content = await session.respondWithSchema("Describe a software engineer", schema);
```

### Extracting Values

Use `content.value(key)` to extract typed values:

```ts
const name = content.value("name");
const age = content.value("age");
```

### Full Example

```ts
interface Cat {
  name: string;
  age: number;
  breed: string;
}

const schema = new GenerationSchema("Cat", "A rescue cat")
  .property("name", "string", { description: "The cat's name" })
  .property("age", "integer", {
    description: "Age in years",
    guides: [GenerationGuide.range(0, 20)],
  })
  .property("breed", "string", { description: "The cat's breed" });

const content = await session.respondWithSchema("Generate a rescue cat", schema);

const cat: Cat = {
  name: content.value("name"),
  age: content.value("age"),
  breed: content.value("breed"),
};
```

## Generating Structured Output with JSON Schema

If you already have a JSON Schema definition, or are porting from OpenAI or another API, you can pass it directly with `respondWithJsonSchema` instead of building a `GenerationSchema` first:

```ts
const content = await session.respondWithJsonSchema("Generate a person profile", {
  type: "object",
  properties: {
    name: { type: "string", description: "Full name" },
    age: { type: "integer", description: "Age in years" },
    occupation: { type: "string", description: "Job title" },
  },
  required: ["name", "age", "occupation"],
});

const person = content.toObject();
// { name: "Ada Lovelace", age: 36, occupation: "Mathematician" }
```

The SDK converts JSON Schema to Apple's native format automatically. Use `toObject()` to get the full result as a plain object instead of extracting properties individually.

## Picking a Schema Format

Both methods produce constrained output. The choice comes down to whether you need [generation guides](#generation-guides) and what format you already have.

Use **`respondWithSchema`** when you need the extra constraints that only generation guides provide. It takes a `GenerationSchema` built with TSFM, which is the native [dictionary](https://developer.apple.com/documentation/swift/dictionary) format that Foundation Models uses internally, and it is the only path that supports [generation guides](#generation-guides). Guides like `constant`, `anyOf`, and `element` have no JSON Schema equivalent, and guides constrain token selection at generation time rather than validating output after.

Use **`respondWithJsonSchema`** when you already have JSON schemas or don't need those extra constraints. It accepts a standard JSON Schema object and TSFM converts it to the model's dictionary format at runtime. Standard constraints like `enum`, `minimum`/`maximum`, and `pattern` all work, but the model's more specific generation guides aren't available if you pass a JSON Schema.

If you don't need guides, either works. If you already have JSON schemas or are porting from another API, `respondWithJsonSchema` is the faster path.

::: tip
The [Chat API compatibility layer](/guide/chat-api#structured-output) also supports structured output via `response_format: { type: "json_schema" }`, using the same JSON Schema format as the Chat Completions API.
:::

---
Source: /guide/tools
---

# Tools

Tools let the model call your functions during generation. The model decides when a tool can help, generates arguments matching your schema, calls the tool, receives the result, and continues generating.

::: info
The **Swift** equivalent is the Foundation Models [`Tool`](https://developer.apple.com/documentation/foundationmodels/tool) protocol.
:::

## Defining a Tool

Extend the abstract `Tool` class:

```ts
import { Tool, GenerationSchema, GeneratedContent, GenerationGuide } from "tsfm-sdk";

class WeatherTool extends Tool {
  readonly name = "get_weather";
  readonly description = "Gets current weather for a city.";
  readonly argumentsSchema = new GenerationSchema("WeatherParams", "")
    .property("city", "string", { description: "City name" })
    .property("units", "string", {
      description: "Temperature units",
      guides: [GenerationGuide.anyOf(["celsius", "fahrenheit"])],
    });

  async call(args: GeneratedContent): Promise<string> {
    const city = args.value("city");
    const units = args.value("units");
    return `Sunny, 22°C in ${city} (${units})`;
  }
}
```

### Required Members

| Member | Type | Description |
| --- | --- | --- |
| `name` | `string` | Unique tool identifier |
| `description` | `string` | What the tool does (shown to the model) |
| `argumentsSchema` | `GenerationSchema` | Schema for the tool's arguments |
| `call(args)` | `async (GeneratedContent) => string` | Handler that returns a string result |

## Using Tools in a Session

Pass tools when creating a session:

```ts
const tool = new WeatherTool();
const session = new LanguageModelSession({
  instructions: "You are a helpful assistant.",
  tools: [tool],
});

const reply = await session.respond("What's the weather in Tokyo?");
// The model calls get_weather, receives the result, and formulates a response
```

## Error Handling

If `call()` throws, it's wrapped in a `ToolCallError`:

```ts
try {
  await session.respond("...");
} catch (e) {
  if (e instanceof ToolCallError) {
    console.log(e.message); // includes tool name and original error
  }
}
```

## Cleanup

Tools register a native callback that must be released:

```ts
session.dispose();
tool.dispose();
```

Tools can be reused across sessions — just dispose after all sessions are done.

## Best Practices

The Foundation Models [`Tool` documentation](https://developer.apple.com/documentation/foundationmodels/tool) recommends:

- **Limit to 3–5 tools per session.** Tool schemas and descriptions consume context window space. More tools means less room for conversation. If your session exceeds the context size, split work across new sessions.
- **Keep descriptions short.** A brief phrase is enough. Long descriptions add latency and use up context.
- **Pre-run essential tools.** If a tool's output is always needed, call it yourself and include the result in the prompt or instructions rather than waiting for the model to discover it needs the tool.

## Tool Chaining

The model can call multiple tools in sequence within a single `respond()` call. If the first tool's output informs a second tool call, the model handles the chaining automatically — you don't need to loop.

## Chat API Tool Calling

If you prefer the Chat API tool calling interface, the [compatibility layer](/guide/chat-api#tool-calling) supports `tools` with the standard `ChatCompletionTool` format. You define tools as JSON objects instead of extending the `Tool` class, and handle tool execution yourself between requests.

---
Source: /guide/transcripts
---

# Transcripts

Transcripts let you save and restore session history, enabling persistent conversations across process restarts. The transcript records instructions, user prompts, responses, and tool results as a linear history.

::: info
The **Swift** equivalent is Foundation Models' [`Transcript`](https://developer.apple.com/documentation/foundationmodels/transcript).
:::

## Entry Types

A transcript is a linear sequence of entries.
::: info
The **Swift** equivalent is [`Transcript.Entry`](https://developer.apple.com/documentation/foundationmodels/transcript).
:::

| Role | Description |
| --- | --- |
| `instructions` | Behavioral directives provided to the model when creating the session. |
| `user` | User input passed to `respond()` or `streamResponse()`. |
| `response` | Model-generated output (text, structured content, or tool calls). |
| `tool` | Results returned from executed tools. |

## Inspecting Entries

Use `entries()` to access typed transcript entries without manually parsing JSON:

```ts
const entries = session.transcript.entries();

for (const entry of entries) {
  if (entry.role === "response" && entry.contents) {
    for (const content of entry.contents) {
      if (content.type === "text") console.log(content.text);
    }
  }
}
```

Each entry has a `role` (`"instructions"`, `"user"`, `"response"`, or `"tool"`) and role-specific fields:

| Field | Roles | Description |
| --- | --- | --- |
| `contents` | all | Array of text or structured content items. |
| `tools` | `instructions` | Tool definitions registered with the session. |
| `options` | `user` | Generation options for this prompt. |
| `responseFormat` | `user` | Schema constraint for structured output. |
| `toolCalls` | `response` | Tool invocations with name and arguments. |
| `assets` | `response` | Asset references in the response. |
| `toolName` | `tool` | Name of the tool that produced this output. |
| `toolCallID` | `tool` | ID linking this output to its tool call. |
## Exporting a Transcript

Every session has a `transcript` property:

```ts
const session = new LanguageModelSession();
await session.respond("My name is Cody.");
await session.respond("I work on open source.");

// Export as JSON string
const json = session.transcript.toJson();

// Or as a dictionary object
const dict = session.transcript.toDict();
```

## Restoring a Session

Create a new session from a saved transcript:

```ts
import { Transcript, LanguageModelSession } from "tsfm-sdk";

// From JSON string
const transcript = Transcript.fromJson(json);
const resumed = LanguageModelSession.fromTranscript(transcript);
```

```ts
// From dictionary object
const transcript = Transcript.fromDict(dict);
const resumed = LanguageModelSession.fromTranscript(transcript);
```

The restored session has full context of the previous conversation:

```ts
const reply = await resumed.respond("What's my name?");
// The model remembers: "Your name is Cody."
```

## Full Example

```ts
// First session
const session = new LanguageModelSession();
await session.respond("My name is Cody.");
const json = session.transcript.toJson();
session.dispose();

// Later — resume from saved transcript
const resumed = LanguageModelSession.fromTranscript(Transcript.fromJson(json));
const recall = await resumed.respond("What's my name?");
console.log(recall); // References "Cody"
resumed.dispose();
```

::: warning
You must access `session.transcript` *before* calling `session.dispose()`. Transcripts are read from the native session pointer and will be lost when dispose runs.
:::

---
Source: /guide/generation-options
---

# Generation Options

Control temperature, token limits, and sampling strategy for any generation method.

::: info
The **Swift** equivalent is Foundation Models' [`GenerationOptions`](https://developer.apple.com/documentation/foundationmodels/generationoptions).
:::

## Usage

Pass `options` as part of the second argument to any generation method:

```ts
import { SamplingMode } from "tsfm-sdk";

const reply = await session.respond("Write a haiku about rain", {
  options: {
    temperature: 0.9,
    maximumResponseTokens: 100,
    sampling: SamplingMode.random({ top: 50, seed: 42 }),
  },
});
```

## Options

| Option | Type | Description |
| --- | --- | --- |
| `temperature` | `number` | Influences the confidence of the model's response. Higher values produce more varied output; lower values produce more deterministic output. |
| `maximumResponseTokens` | `number` | Maximum tokens the model is allowed to produce. Enforcing a strict limit can lead to truncated or grammatically incorrect responses. |
| `sampling` | `SamplingMode` | Controls how the model picks tokens from its probability distribution (see below). |

## Sampling Modes

The model builds its response token by token. At each step it produces a probability distribution over its vocabulary. The sampling mode controls how a token is selected from that distribution.

::: info
The **Swift** equivalent is Foundation Models' [`SamplingMode`](https://developer.apple.com/documentation/foundationmodels/generationoptions/samplingmode).
:::

### Greedy (Most Deterministic)

Always chooses the most likely token. The same prompt should always produce the same output.

```ts
SamplingMode.greedy()
```

### Random

Samples from a subset of likely tokens. You must choose **one** of `top` or `probabilityThreshold`, but not both. Either can be combined with `seed` for reproducibility:

| Parameter | Description |
| --- | --- |
| `top` | Pick from the K most likely tokens (fixed set). Cannot be combined with `probabilityThreshold`. Maps to Apple's `random(top:seed:)`. |
| `probabilityThreshold` | Pick from the smallest set of tokens whose probabilities sum to this threshold. Cannot be combined with `top`. Maps to Apple's `random(probabilityThreshold:seed:)`. |
| `seed` | Random seed for reproducible output. Works with either constraint. |
```ts
// Top-K: pick from the 50 most likely tokens
SamplingMode.random({ top: 50, seed: 42 })

// Top-P (nucleus): pick from the smallest set of tokens whose probabilities add up to 0.9
SamplingMode.random({ probabilityThreshold: 0.9 })
```

---
Source: /guide/error-handling
---

# Error Handling

All SDK errors extend `FoundationModelsError`. Generation-specific errors extend `GenerationError`, which itself extends `FoundationModelsError`. TSFM also adds `ServiceCrashedError` and `ToolCallError`.

::: info
The **Swift** equivalent is [`LanguageModelSession.GenerationError`](https://developer.apple.com/documentation/foundationmodels/languagemodelsession/generationerror).
:::

## Error Hierarchy

::: info FoundationModelsError
All errors inherit from `FoundationModelsError`.

**GenerationError** — errors during generation:

- `ExceededContextWindowSizeError`
- `AssetsUnavailableError`
- `GuardrailViolationError`
- `UnsupportedGuideError`
- `UnsupportedLanguageOrLocaleError`
- `DecodingFailureError`
- `RateLimitedError`
- `ConcurrentRequestsError`
- `RefusalError`
- `InvalidGenerationSchemaError`
- `ServiceCrashedError`

**ToolCallError** — a tool's `call()` method threw
:::

## Catching Errors

```ts
import {
  ExceededContextWindowSizeError,
  GuardrailViolationError,
  RateLimitedError,
} from "tsfm-sdk";

try {
  await session.respond("...");
} catch (e) {
  if (e instanceof ExceededContextWindowSizeError) {
    // Start a new session — context window is full
  } else if (e instanceof GuardrailViolationError) {
    // Content policy was triggered
  } else if (e instanceof RateLimitedError) {
    // Too many requests — wait and retry
  }
}
```

## Error Reference

### ExceededContextWindowSizeError

The session's accumulated context has exceeded the model's limit. All content (instructions, prompts, responses, tool schemas, tool calls, and tool output) shares one context window. Long conversations or large tool outputs will eventually hit this.
Dispose the session and start a new one, optionally seeding it with a trimmed [transcript](/guide/transcripts). Apple recommends splitting large tasks across multiple sessions.

### AssetsUnavailableError

The on-device model files haven't finished downloading. This typically happens right after enabling Apple Intelligence or after a macOS update. Call `model.waitUntilAvailable()` before creating a session — it will resolve once the assets are ready.

### GuardrailViolationError

The model's safety [guardrails](/guide/model-configuration#guardrails) flagged the prompt or the generated response. With `DEFAULT` guardrails, this means unsafe content was detected and blocked. With `PERMISSIVE_CONTENT_TRANSFORMATIONS`, you should see this less often, as the model will attempt to transform content instead of rejecting it outright. Either way, catch this error and surface a user-friendly message.

### UnsupportedGuideError

A `GenerationGuide` on one of your schema properties isn't supported by the current model version. This can happen if you use a guide that was introduced in a newer OS version than the user is running. Check your guide types against the [guides reference](/guide/structured-output#generation-guides).

### UnsupportedLanguageOrLocaleError

The system locale or the language of the prompt isn't supported by the on-device model. Foundation Models supports a subset of languages — this error means you've hit one it can't handle.

### DecodingFailureError

The model generated output during structured generation, but it couldn't be decoded into your schema. This can happen with complex or deeply nested schemas. Simplify the schema or add more descriptive property descriptions to guide the model.

### RateLimitedError

Too many requests to the on-device model in a short window. This is an OS-level rate limit, not a network API limit. Back off and retry after a short delay.
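A simple recovery pattern for rate limiting is to back off exponentially between retries. The helper below is a generic sketch, not part of the SDK; in practice the `isRetryable` predicate would typically be `(e) => e instanceof RateLimitedError`:

```ts
// Sketch (not in the SDK): retry a call with exponential backoff.
// The caller decides which errors are worth retrying.
async function withBackoff<T>(
  fn: () => Promise<T>,
  isRetryable: (e: unknown) => boolean,
  maxAttempts = 3,
  baseDelayMs = 250,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (e) {
      if (!isRetryable(e) || attempt + 1 >= maxAttempts) throw e;
      // Wait 250ms, 500ms, 1000ms, ... before the next attempt
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
    }
  }
}

// Usage might look like:
// const reply = await withBackoff(
//   () => session.respond("..."),
//   (e) => e instanceof RateLimitedError,
// );
```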
### ConcurrentRequestsError

You called a generation method on a session that's already processing a request. The SDK serializes calls internally via `_enqueue()`, so you shouldn't normally hit this. If you do, check that you're `await`ing calls, or use `session.isResponding` to check state before calling.

### RefusalError

The model declined to generate a response. This is distinct from `GuardrailViolationError` — refusal means the model chose not to answer (e.g., the prompt asks for something outside its capabilities), not that a content filter triggered.

### InvalidGenerationSchemaError

Your `GenerationSchema` is malformed or was rejected by the on-device model. Common causes: unsupported property types, conflicting guides, or schemas that are too complex for the model to constrain. Also thrown when the native layer returns a `ModelManagerError Code=1041` rejection.

### ServiceCrashedError

The Apple Intelligence background service (`generativeexperiencesd`) has crashed. This is an OS-level issue, not an SDK bug. The error message includes the restart command:

```bash
launchctl kickstart -k gui/$(id -u)/com.apple.generativeexperiencesd
```

After restarting the service, create a new session and retry.

### ToolCallError

Your tool's `call()` method threw during execution. The SDK wraps the original error with the tool name so you can identify which tool failed and why. Access the original error via `err.cause`.

## Catching All SDK Errors

```ts
import { FoundationModelsError } from "tsfm-sdk";

try {
  await session.respond("...");
} catch (e) {
  if (e instanceof FoundationModelsError) {
    console.error("SDK error:", e.message);
  }
}
```

---
Source: /guide/chat-api
---

# Chat & Responses APIs

## SDK Options

TSFM offers two ways to interact with the on-device Foundation Model:

1. The **Native SDK** covered in the rest of this guide
(mostly mirrors [the original Swift FoundationModels API](https://developer.apple.com/documentation/foundationmodels)) 2. **Compatibility APIs** that mirror popular cloud interfaces The `tsfm-sdk/chat` module translates familiar OpenAI-style calls into native Foundation Models operations, so you can swap in on-device Apple Intelligence with minimal code changes. For full control over sessions, schemas, and tools, use the [native SDK](/guide/sessions) instead. ## Tradeoffs If you use `tsfm-sdk/chat` you will lose access to some features: - Each `create()` call builds and tears down a session (slightly higher overhead) - No direct access to [generation guides](/guide/structured-output#generation-guides) (e.g. `anyOf`, `regex`, `range` constraints) - No persistent sessions (you manage conversation history yourself and pass it each call) - No direct access to the underlying [transcript](/guide/transcripts) (though you can build your own from the messages array) ## When to Use - Migrating an existing OpenAI-based codebase to on-device inference - Building apps that switch between cloud and on-device models - Prototyping quickly with a familiar interface ```ts import Client from "tsfm-sdk/chat"; const client = new Client(); // Responses API (recommended) const response = await client.responses.create({ model: "SystemLanguageModel", instructions: "You are a helpful assistant.", input: "What is the capital of France?", }); console.log(response.output_text); // Chat Completions API const completion = await client.chat.completions.create({ model: "SystemLanguageModel", messages: [ { role: "system", content: "You are a helpful assistant." }, { role: "user", content: "What is the capital of France?" }, ], }); console.log(completion.choices[0].message.content); client.close(); ``` If you've used the OpenAI Node SDK or similar APIs, the interface should feel familiar. 
The biggest difference is that the `model` parameter can be omitted or set to `"SystemLanguageModel"`.

## What TSFM Supports

Both APIs support the same core capabilities:

| Feature | Responses API | Chat Completions API | tsfm Support |
| --- | --- | --- | --- |
| Text generation | `input: "..."` | `messages: [...]` | Full |
| Multi-turn conversations | `input: [...]` (message array) | `messages: [...]` | Full |
| Streaming | `stream: true` | `stream: true` | Full |
| Structured output | `text: { format: { type: "json_schema" } }` | `response_format: { type: "json_schema" }` | Full |
| Tool calling | `tools: [{ type: "function", name, ... }]` | `tools: [{ type: "function", function: { name, ... } }]` | Full |
| `temperature`, `max_output_tokens` | `temperature`, `max_output_tokens` | `temperature`, `max_tokens` / `max_completion_tokens` | Full |
| `top_p`, `seed` | `top_p`, `seed` | `top_p`, `seed` | Full |
| Image/audio content | `input_image`, `input_file` | Image URLs | Not supported (warns) |
| `usage` / token counts | `usage` | `usage` | Always `null` |

---

## Responses API

The Responses-style API uses a `client.responses.create()` function with a simpler input model and richer output structure.

### Basic Usage

The simplest `responses.create()` call takes a string `input`:

```ts
const response = await client.responses.create({
  input: "What is the capital of France?",
});

// Outputs are available on the response object
console.log(response.output_text);
```

### Instructions

In the Responses API, system instructions are a top-level parameter rather than a message role:

```ts
const response = await client.responses.create({
  instructions: "You are a concise math tutor.",
  input: "What is 2 + 2?",
});
```

### Multi-turn Conversations

For multi-turn conversations with `responses.create()`, pass an array of input items:

```ts
const response = await client.responses.create({
  instructions: "You are a math tutor.",
  input: [
    { role: "user", content: "What is 2 + 2?"
}, { role: "assistant", content: "4" }, { role: "user", content: "Multiply that by 3" }, ], }); ``` ### Streaming Pass `stream: true` to `responses.create()` to get a `ResponseStream` of typed events: ```ts const stream = await client.responses.create({ input: "Tell me a story", stream: true, }); for await (const event of stream) { if (event.type === "response.output_text.delta") { process.stdout.write(event.delta); } } ``` Key event types: | Event type | Description | | --- | --- | | `response.created` | Response object created | | `response.in_progress` | Generation started | | `response.output_item.added` | New output item (message or function call) | | `response.output_text.delta` | Text token | | `response.output_text.done` | Full text complete | | `response.function_call_arguments.delta` | Function arguments chunk | | `response.function_call_arguments.done` | Full function call complete | | `response.output_item.done` | Output item complete | | `response.completed` | Full response complete | | `response.incomplete` | Generation stopped early | ::: warning When streaming structured output or tool calls, the full response is generated before any events are emitted. This is because Foundation Models uses constrained generation (a grammar that forces valid JSON), which cannot be interrupted mid-token. Plain text generation is the only mode that streams incrementally as tokens are produced. 
::: ### Structured Output The Responses API uses `text.format` with `type: "json_schema"` for structured output: ```ts const response = await client.responses.create({ input: "Extract: Alice is 28 and lives in Seattle", text: { format: { type: "json_schema", name: "Person", schema: { type: "object", properties: { name: { type: "string" }, age: { type: "integer" }, city: { type: "string" }, }, required: ["name", "age", "city"], }, }, }, }); const person = JSON.parse(response.output_text); // { name: "Alice", age: 28, city: "Seattle" } ``` ### Tool Calling The Responses API uses a flat tool format with `name` and `parameters` at the top level (not nested under `function` like Chat Completions): ```ts const response = await client.responses.create({ input: "What's the weather in Tokyo?", tools: [ { type: "function", name: "get_weather", description: "Get current weather for a city", parameters: { type: "object", properties: { city: { type: "string", description: "City name" }, }, required: ["city"], }, }, ], }); // Check for function calls in the output for (const item of response.output) { if (item.type === "function_call") { console.log(item.name); // "get_weather" console.log(item.arguments); // '{"city":"Tokyo"}' console.log(item.call_id); // "call_" — use this to send results back } } ``` ### Sending Tool Results Back Send results using `function_call_output` input items. Pass back the original `function_call` item alongside its output: ```ts const fc = response.output.find((item) => item.type === "function_call")!; const followUp = await client.responses.create({ input: [ { role: "user", content: "What's the weather in Tokyo?" }, fc, // pass the function_call back { type: "function_call_output", call_id: fc.call_id, output: JSON.stringify({ temp: 22, condition: "Sunny" }), }, ], tools: [/* same tools */], }); console.log(followUp.output_text); // "It's currently 22°C and sunny in Tokyo." 
``` ### Generation Options ```ts const response = await client.responses.create({ input: "Write a creative haiku", temperature: 0.8, max_output_tokens: 50, seed: 42, }); ``` ### Error Mapping | Native error | Responses API equivalent | | --- | --- | | `ExceededContextWindowSizeError` | `status: "incomplete"`, `incomplete_details.reason: "max_output_tokens"` | | `RefusalError` | Output contains `{ type: "refusal", refusal: "..." }` | | `GuardrailViolationError` | `status: "incomplete"`, `incomplete_details.reason: "content_filter"` | | `RateLimitedError` | Thrown as error with status `429` | ### Response Object ```ts { id: "resp_...", object: "response", created_at: 1710000000, model: "SystemLanguageModel", output: [ { id: "msg_...", type: "message", role: "assistant", status: "completed", content: [{ type: "output_text", text: "...", annotations: [] }] } ], output_text: "...", // convenience: concatenated text status: "completed", // "completed" | "failed" | "incomplete" error: null, incomplete_details: null, // { reason: "max_output_tokens" | "content_filter" } instructions: "...", usage: null // not tracked } ``` --- ## Chat Completions API The Chat Completions API uses the classic `client.chat.completions.create()` interface. ### Messages The Chat Completions API accepts all standard message roles: | Role | Behavior | | --- | --- | | `system` | Mapped to the session's `instructions`. Only the first system message becomes instructions — subsequent ones are treated as user messages with a `[System]` prefix. | | `developer` | Same as `system`. | | `user` | Mapped to a user transcript entry. The last user message becomes the prompt. | | `assistant` | Mapped to a response transcript entry. Tool calls are preserved. | | `tool` | Mapped to a user message formatted as `[Tool result for toolName]: content`. | #### Chat: Multi-turn Conversations Pass the full conversation history in the `messages` array. 
The client converts it to a native Foundation Models [transcript](/guide/transcripts) behind the scenes — each `create()` call builds a fresh session from the messages you provide. ```ts const response = await client.chat.completions.create({ messages: [ { role: "system", content: "You are a math tutor." }, { role: "user", content: "What is 2 + 2?" }, { role: "assistant", content: "4" }, { role: "user", content: "Multiply that by 3" }, ], }); ``` ### Chat: Streaming Pass `stream: true` to get an async iterable of `ChatCompletionChunk` objects: ```ts const stream = await client.chat.completions.create({ messages: [{ role: "user", content: "Tell me a story" }], stream: true, }); for await (const chunk of stream) { const delta = chunk.choices[0].delta.content; if (delta) process.stdout.write(delta); } ``` The `Stream` object supports: - **`for await...of`** — iterates chunks, auto-closes on completion or `break` - **`stream.close()`** — eagerly release resources without finishing iteration - **`stream.toReadableStream()`** — convert to a Web `ReadableStream` for HTTP responses ::: warning Structured output and tool call responses are buffered — the model must finish constrained generation before the response is emitted. Only plain text streams token-by-token. ::: ### Chat: Structured Output Use `response_format` with `type: "json_schema"` to get guaranteed JSON output: ```ts const response = await client.chat.completions.create({ messages: [{ role: "user", content: "Extract: Alice is 28 and lives in Seattle" }], response_format: { type: "json_schema", json_schema: { name: "Person", schema: { type: "object", properties: { name: { type: "string" }, age: { type: "integer" }, city: { type: "string" }, }, required: ["name", "age", "city"], }, }, }, }); const person = JSON.parse(response.choices[0].message.content!); // { name: "Alice", age: 28, city: "Seattle" } ``` The JSON schema is converted to Apple's native generation schema format at runtime. 
The model uses constrained sampling to guarantee valid output — no retry or validation needed. ### Chat: Tool Calling Define tools using the standard function tool format: ```ts const tools = [ { type: "function" as const, function: { name: "get_weather", description: "Get current weather for a city", parameters: { type: "object", properties: { city: { type: "string", description: "City name" }, }, required: ["city"], }, }, }, ]; const response = await client.chat.completions.create({ messages: [{ role: "user", content: "What's the weather in Tokyo?" }], tools, }); ``` When the model decides to call a tool, the response has `finish_reason: "tool_calls"` and `message.tool_calls` contains the calls: ```ts const choice = response.choices[0]; if (choice.finish_reason === "tool_calls" && choice.message.tool_calls) { const call = choice.message.tool_calls[0]; console.log(call.function.name); // "get_weather" console.log(call.function.arguments); // '{"city":"Tokyo"}' } ``` #### Chat: Sending Tool Results Back After executing the tool, send the result back with a follow-up request that includes the full conversation: ```ts const followUp = await client.chat.completions.create({ messages: [ { role: "user", content: "What's the weather in Tokyo?" }, { role: "assistant", content: null, tool_calls: [call] }, { role: "tool", tool_call_id: call.id, content: JSON.stringify({ temp: 22, condition: "Sunny" }), }, ], tools, }); console.log(followUp.choices[0].message.content); // "It's currently 22°C and sunny in Tokyo." ``` ::: info Under the hood, tool calling uses structured output with a discriminated schema. The model chooses between `"text"` and `"tool_call"` as the first generated token, then fills in the tool name and arguments. Tools are suppressed when the last message is a tool result to prevent the model from re-calling the same tool. 
::: ### Chat: Generation Options | Param | Maps to | | --- | --- | | `temperature` | `GenerationOptions.temperature` | | `max_tokens` / `max_completion_tokens` | `GenerationOptions.maximumResponseTokens` (`max_completion_tokens` takes priority) | | `top_p` | `SamplingMode.random({ probabilityThreshold })` | | `seed` | `SamplingMode.random({ seed })` | ```ts const response = await client.chat.completions.create({ messages: [{ role: "user", content: "Say hello" }], temperature: 0, max_tokens: 50, seed: 42, }); ``` ## Chat Completions Error Mapping | Native error | Chat Completions equivalent | | --- | --- | | `ExceededContextWindowSizeError` | `finish_reason: "length"` | | `RefusalError` | `message.refusal` set, `content: null` | | `GuardrailViolationError` | `finish_reason: "content_filter"` | | `RateLimitedError` | Thrown as error with status `429` | ## Chat Completions Response Format ```ts { id: "chatcmpl-...", // Unique ID object: "chat.completion", // Or "chat.completion.chunk" for streaming created: 1710000000, // Unix timestamp (seconds) model: "SystemLanguageModel", choices: [{ index: 0, message: { role: "assistant", content: "...", // null when tool_calls present refusal: null, // Set on RefusalError tool_calls: [...] // Present when finish_reason is "tool_calls" }, finish_reason: "stop" // "stop" | "length" | "tool_calls" | "content_filter" }], usage: null, // Not tracked system_fingerprint: null } ``` --- ## Cleanup Call `client.close()` when you're done to release the native model pointer: ```ts const client = new Client(); // ... use client ... client.close(); ``` Each `create()` call manages its own session lifecycle internally — sessions are created from the messages array and disposed after the response completes (or after streaming finishes). 
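Because `close()` must run even when a request throws, a try/finally wrapper is a reasonable pattern. The sketch below uses a stub client so it stands alone; in real code you would construct `new Client()` from `tsfm-sdk/chat` instead of `makeClient()`, which is a hypothetical stand-in:

```typescript
// Sketch: guarantee cleanup of the native model pointer with try/finally.
// `ClosableClient` and `makeClient` are stand-ins for the real Client so this
// example is self-contained.
interface ClosableClient {
  closed: boolean;
  close(): void;
}

function makeClient(): ClosableClient {
  return {
    closed: false,
    close() {
      this.closed = true; // the real Client.close() releases the native pointer
    },
  };
}

async function withClient<T>(work: (client: ClosableClient) => Promise<T>): Promise<T> {
  const client = makeClient();
  try {
    return await work(client);
  } finally {
    client.close(); // runs whether work() resolved or threw
  }
}
```

The same shape works with the real client: create it, await the work, and close in `finally` so the pointer is released on every code path.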
## What's Next

- [Structured Output](/guide/structured-output) — Schema-based generation with the native SDK
- [Tools](/guide/tools) — Native tool calling with the `Tool` class
- [Streaming](/guide/streaming) — Native streaming API
- [Error Handling](/guide/error-handling) — Full error reference

---
Source: /api/
---

# API Reference

Complete reference for all public exports from `tsfm-sdk`.

## Classes

| Class | Description |
| --- | --- |
| [SystemLanguageModel](/api/system-language-model) | On-device model access and availability |
| [LanguageModelSession](/api/language-model-session) | Conversation session with all generation methods |
| [GenerationSchema](/api/generation-schema) | Schema builder for structured output |
| [Tool](/api/tool) | Abstract base class for tool calling |
| [Transcript](/api/transcript) | Session history export and import |

## Types and Enums

| Export | Description |
| --- | --- |
| [GenerationOptions](/api/generation-options) | Options for temperature, tokens, sampling |
| [SamplingMode](/api/generation-options#samplingmode) | Greedy or random sampling strategies |
| [GenerationGuide](/api/generation-schema#generationguide) | Output constraints for schema properties |
| [GeneratedContent](/api/generation-schema#generatedcontent) | Structured generation result |
| [Errors](/api/errors) | Error hierarchy and error codes |

## Chat & Responses APIs

| Export | Description |
| --- | --- |
| [Client](/api/chat) | Chat-style and Responses-style API client backed by on-device Apple Intelligence |

```ts
import Client from "tsfm-sdk/chat";
```

See the [Chat & Responses API reference](/api/chat) for full type documentation.
## Installation

```ts
import {
  SystemLanguageModel,
  LanguageModelSession,
  GenerationSchema,
  GenerationGuide,
  Tool,
  Transcript,
  SamplingMode,
} from "tsfm-sdk";

// Chat API compatible interface
import Client from "tsfm-sdk/chat";
```

---
Source: /api/system-language-model
---

# SystemLanguageModel

Represents the on-device Foundation Models language model. Provides availability checking and model configuration.

## Constructor

```ts
new SystemLanguageModel(options?: {
  useCase?: SystemLanguageModelUseCase;
  guardrails?: SystemLanguageModelGuardrails;
})
```

| Parameter | Default | Description |
| --- | --- | --- |
| `useCase` | `GENERAL` | Model use case |
| `guardrails` | `DEFAULT` | Guardrail configuration |

## Methods

### `isAvailable()`

Synchronously checks if the model is ready.

```ts
isAvailable(): AvailabilityResult
```

Returns `{ available: true }` or `{ available: false, reason: SystemLanguageModelUnavailableReason }`.

### `waitUntilAvailable()`

Polls until the model is available or the timeout expires.

```ts
waitUntilAvailable(timeoutMs?: number): Promise<AvailabilityResult>
```

| Parameter | Default | Description |
| --- | --- | --- |
| `timeoutMs` | `30000` | Maximum wait time in milliseconds |

### `dispose()`

Releases the native model reference.
```ts
dispose(): void
```

## Enums

### `SystemLanguageModelUseCase`

| Value | Description |
| --- | --- |
| `GENERAL` | General-purpose generation |
| `CONTENT_TAGGING` | Classification and labeling |

### `SystemLanguageModelGuardrails`

| Value | Description |
| --- | --- |
| `DEFAULT` | Standard content safety guardrails |

### `SystemLanguageModelUnavailableReason`

| Value | Description |
| --- | --- |
| `APPLE_INTELLIGENCE_NOT_ENABLED` | Apple Intelligence is off |
| `MODEL_NOT_READY` | Model assets still downloading |
| `DEVICE_NOT_ELIGIBLE` | Hardware not supported |

## Types

### `AvailabilityResult`

```ts
type AvailabilityResult =
  | { available: true }
  | { available: false; reason: SystemLanguageModelUnavailableReason };
```

---
Source: /api/language-model-session
---

# LanguageModelSession

Manages conversation state and provides all generation methods — text, streaming, structured, and JSON Schema.

## Constructor

```ts
new LanguageModelSession(options?: {
  instructions?: string;
  model?: SystemLanguageModel;
  tools?: Tool[];
})
```

| Parameter | Default | Description |
| --- | --- | --- |
| `instructions` | `undefined` | System prompt for the session |
| `model` | Default model | A configured `SystemLanguageModel` |
| `tools` | `[]` | Tools available during generation |

## Methods

### `respond()`

Generate a text response.

```ts
respond(prompt: string, options?: { options?: GenerationOptions }): Promise<string>
```

### `respondWithSchema()`

Generate structured output matching a `GenerationSchema`.

```ts
respondWithSchema(prompt: string, schema: GenerationSchema, options?: { options?: GenerationOptions }): Promise<GeneratedContent>
```

Returns a [`GeneratedContent`](/api/generation-schema#generatedcontent) with typed property access.

### `respondWithJsonSchema()`

Generate structured output from a JSON Schema object.
```ts
respondWithJsonSchema(prompt: string, schema: object, options?: { options?: GenerationOptions }): Promise<GeneratedContent>
```

Returns a [`GeneratedContent`](/api/generation-schema#generatedcontent) with `toObject()` for the full result.

### `streamResponse()`

Stream a response token-by-token.

```ts
streamResponse(prompt: string, options?: { options?: GenerationOptions }): AsyncIterable<string>
```

Each yielded string contains only the new tokens since the last iteration.

### `cancel()`

Cancel an in-progress request. Advisory — the response may complete before cancellation takes effect.

```ts
cancel(): void
```

### `dispose()`

Release the native session. Access `transcript` before calling this.

```ts
dispose(): void
```

## Properties

### `isResponding`

```ts
readonly isResponding: boolean
```

`true` while a generation request is in progress.

### `transcript`

```ts
readonly transcript: Transcript
```

The session's conversation history. See [Transcript](/api/transcript).

## Static Methods

### `fromTranscript()`

Create a new session from a saved transcript.

```ts
static fromTranscript(transcript: Transcript, options?: {
  instructions?: string;
  model?: SystemLanguageModel;
  tools?: Tool[];
}): LanguageModelSession
```

---
Source: /api/generation-schema
---

# GenerationSchema

Builder for typed schemas that constrain structured generation output.

## Constructor

```ts
new GenerationSchema(name: string, description: string)
```

## Methods

### `property()`

Add a property to the schema. Returns `this` for chaining.

```ts
property(name: string, type: PropertyType, options?: {
  description?: string;
  guides?: GenerationGuide[];
  optional?: boolean;
}): this
```

### `toDict()`

Export the schema as a JSON Schema-compatible dictionary.

```ts
toDict(): object
```

## Types

### `PropertyType`

```ts
type PropertyType = "string" | "integer" | "number" | "boolean" | "array" | "object"
```

## GenerationGuide

Factory methods that create output constraints for schema properties.
### String Guides

```ts
GenerationGuide.anyOf(values: string[])   // enumerated values
GenerationGuide.constant(value: string)   // exact value
GenerationGuide.regex(pattern: string)    // regex pattern
```

### Numeric Guides

```ts
GenerationGuide.range(min: number, max: number)  // inclusive range
GenerationGuide.minimum(n: number)               // lower bound
GenerationGuide.maximum(n: number)               // upper bound
```

### Array Guides

```ts
GenerationGuide.count(n: number)                 // exact length
GenerationGuide.minItems(n: number)              // minimum length
GenerationGuide.maxItems(n: number)              // maximum length
GenerationGuide.element(guide: GenerationGuide)  // constrain elements
```

## GeneratedContent

Returned by `respondWithSchema()` and `respondWithJsonSchema()`.

### `value()`

Extract a typed property value:

```ts
value<T>(key: string): T
```

### `toObject()`

Get the full result as a plain object:

```ts
toObject(): Record<string, unknown>
```

## GenerationSchemaProperty

Represents a single property in a schema. Created internally by `GenerationSchema.property()`.

## GuideType

Enum of guide types used internally:

| Value | Description |
| --- | --- |
| `ANY_OF` | Enumerated values |
| `CONSTANT` | Fixed value |
| `RANGE` | Numeric range |
| `MINIMUM` | Lower bound |
| `MAXIMUM` | Upper bound |
| `REGEX` | Pattern match |
| `COUNT` | Exact array length |
| `MIN_ITEMS` | Minimum array length |
| `MAX_ITEMS` | Maximum array length |
| `ELEMENT` | Element constraint |

---
Source: /api/generation-options
---

# GenerationOptions

Options that control generation behavior across all response methods.

## Interface

```ts
interface GenerationOptions {
  temperature?: number;
  maximumResponseTokens?: number;
  sampling?: SamplingMode;
}
```

| Property | Type | Description |
| --- | --- | --- |
| `temperature` | `number` | Controls randomness. Higher = more varied. |
| `maximumResponseTokens` | `number` | Max tokens in the response. |
| `sampling` | `SamplingMode` | Sampling strategy. |

## Usage

```ts
await session.respond("prompt", {
  options: {
    temperature: 0.8,
    maximumResponseTokens: 500,
    sampling: SamplingMode.greedy(),
  },
});
```

## SamplingMode

### `SamplingMode.greedy()`

Deterministic sampling — always picks the most likely token.

```ts
static greedy(): SamplingMode
```

### `SamplingMode.random()`

Stochastic sampling with optional constraints.

```ts
static random(options?: {
  top?: number;
  seed?: number;
  probabilityThreshold?: number;
}): SamplingMode
```

| Parameter | Description |
| --- | --- |
| `top` | Top-K: only consider the K most likely tokens |
| `seed` | Random seed for reproducible output |
| `probabilityThreshold` | Top-P / nucleus: cumulative probability threshold |

### `SamplingModeType`

```ts
type SamplingModeType = "greedy" | "random"
```

---
Source: /api/tool
---

# Tool

Abstract base class for defining tools the model can call during generation.

## Abstract Members

Subclasses must implement:

```ts
abstract class Tool {
  abstract readonly name: string;
  abstract readonly description: string;
  abstract readonly argumentsSchema: GenerationSchema;
  abstract call(args: GeneratedContent): Promise<string>;
}
```

| Member | Type | Description |
| --- | --- | --- |
| `name` | `string` | Unique tool identifier |
| `description` | `string` | What the tool does (visible to the model) |
| `argumentsSchema` | `GenerationSchema` | Schema defining the tool's arguments |
| `call(args)` | `async (GeneratedContent) => string` | Handler invoked when the model calls this tool |

## Properties

### `onCall`

Optional callback fired at the start of each tool invocation, before `call()` runs. Useful for showing UI indicators (e.g. "Using tool: search") while the model waits for the tool result.

```ts
onCall?: (toolName: string) => void;
```

```ts
const tool = new WeatherTool();
tool.onCall = (name) => console.log(`Tool invoked: ${name}`);
```

## Methods

### `dispose()`

Release the native callback. Call after all sessions using this tool are done.
```ts
dispose(): void
```

## Example

```ts
import { Tool, GenerationSchema, GeneratedContent, GenerationGuide } from "tsfm-sdk";

class WeatherTool extends Tool {
  readonly name = "get_weather";
  readonly description = "Gets current weather for a city.";
  readonly argumentsSchema = new GenerationSchema("WeatherParams", "")
    .property("city", "string", { description: "City name" })
    .property("units", "string", {
      description: "Temperature units",
      guides: [GenerationGuide.anyOf(["celsius", "fahrenheit"])],
    });

  async call(args: GeneratedContent): Promise<string> {
    const city = args.value<string>("city");
    const units = args.value<string>("units");
    return `Sunny, 22°C in ${city} (${units})`;
  }
}
```

## Lifecycle

1. Create the tool instance
2. Pass to `LanguageModelSession({ tools: [tool] })`
3. The tool's callback is registered internally when the session is created
4. After all sessions are disposed, call `tool.dispose()`

Tools can be shared across multiple sessions. The native callback remains registered until `dispose()` is called.

---
Source: /api/transcript
---

# Transcript

Represents a session's conversation history. Used to export and restore sessions.

## Accessing

Every session exposes its transcript:

```ts
const transcript = session.transcript;
```

::: warning
Access the transcript before calling `session.dispose()`. The transcript reads from the native session pointer.
:::

## Methods

### `toJson()`

Export the transcript as a JSON string.

```ts
toJson(): string
```

### `toDict()`

Export the transcript as a dictionary object.

```ts
toDict(): object
```

## Static Methods

### `fromJson()`

Create a transcript from a JSON string.

```ts
static fromJson(json: string): Transcript
```

### `fromDict()`

Create a transcript from a dictionary object.
```ts static fromDict(dict: object): Transcript ``` ## Restoring a Session ```ts const transcript = Transcript.fromJson(savedJson); const session = LanguageModelSession.fromTranscript(transcript); ``` See [LanguageModelSession.fromTranscript()](/api/language-model-session#fromtranscript) for full options. --- Source: /api/errors --- # Errors All SDK errors extend `FoundationModelsError`. Import specific error classes to handle them individually. ## Hierarchy ``` FoundationModelsError ├── GenerationError │ ├── ExceededContextWindowSizeError │ ├── AssetsUnavailableError │ ├── GuardrailViolationError │ ├── UnsupportedGuideError │ ├── UnsupportedLanguageOrLocaleError │ ├── DecodingFailureError │ ├── RateLimitedError │ ├── ConcurrentRequestsError │ ├── RefusalError │ └── InvalidGenerationSchemaError └── ToolCallError ``` ## Error Reference | Error | Code | When | | --- | --- | --- | | `ExceededContextWindowSizeError` | 1 | Session history too long | | `AssetsUnavailableError` | 2 | Model not downloaded | | `GuardrailViolationError` | 3 | Content policy violation | | `UnsupportedGuideError` | 4 | Unsupported generation guide | | `UnsupportedLanguageOrLocaleError` | 5 | Language not supported | | `DecodingFailureError` | 6 | Structured output parse failure | | `RateLimitedError` | 7 | Too many requests | | `ConcurrentRequestsError` | 8 | Session already responding | | `RefusalError` | 9 | Model declined to answer | | `InvalidGenerationSchemaError` | 10 | Malformed schema | | `ToolCallError` | — | Tool's `call()` threw | ## GenerationErrorCode Enum mapping status codes to error types: ```ts enum GenerationErrorCode { SUCCESS = 0, EXCEEDED_CONTEXT_WINDOW_SIZE = 1, ASSETS_UNAVAILABLE = 2, GUARDRAIL_VIOLATION = 3, UNSUPPORTED_GUIDE = 4, UNSUPPORTED_LANGUAGE_OR_LOCALE = 5, DECODING_FAILURE = 6, RATE_LIMITED = 7, CONCURRENT_REQUESTS = 8, REFUSAL = 9, INVALID_SCHEMA = 10, UNKNOWN_ERROR = 255, } ``` ## Usage ```ts import { FoundationModelsError, GenerationError, 
  ExceededContextWindowSizeError,
  GuardrailViolationError,
  ToolCallError,
} from "tsfm-sdk";

try {
  await session.respond("...");
} catch (e) {
  if (e instanceof ToolCallError) {
    // Tool handler threw
  } else if (e instanceof GenerationError) {
    // Any generation error
  } else if (e instanceof FoundationModelsError) {
    // Any SDK error
  }
}
```

---
Source: /api/chat
---

# Chat & Responses API Reference

API reference for `tsfm-sdk/chat`. This module provides a compatibility layer with a Responses API and Chat Completions API backed by on-device Apple Intelligence.

```ts
import Client, { Stream, ResponseStream, MODEL_DEFAULT } from "tsfm-sdk/chat";
```

## Client

Main client class. Provides Chat-style and Responses-style API interfaces backed by on-device Apple Intelligence.

### Constructor

```ts
const client = new Client();
```

No arguments. No API key needed.

### Properties

| Property | Type | Description |
| --- | --- | --- |
| `responses` | `Responses` | Responses API endpoint |
| `chat.completions` | `Completions` | Chat Completions API endpoint |

### Methods

#### `close()`

Releases the native model pointer. Call when you're done with the client.

```ts
client.close(): void
```

---

## Responses API

### Responses

Accessed via `client.responses`. Similar to the modern Responses API interface used by OpenAI.

#### `responses.create(params)`

Creates a response.

```ts
// Non-streaming
create(params: ResponseCreateParams & { stream?: false | null }): Promise<Response>

// Streaming
create(params: ResponseCreateParams & { stream: true }): Promise<ResponseStream>
```

---

### ResponseCreateParams

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `input` | `string \| ResponseInputItem[]` | Yes | Text prompt or array of input items |
| `model` | `string` | No | Ignored. Always uses on-device model. |
| `instructions` | `string` | No | System instructions |
| `stream` | `boolean` | No | Enable streaming |
| `temperature` | `number` | No | Sampling temperature |
| `max_output_tokens` | `number` | No | Maximum response tokens |
| `top_p` | `number` | No | Probability threshold for sampling |
| `seed` | `number` | No | Random seed for reproducibility |
| `tools` | `FunctionTool[]` | No | Tool definitions |
| `tool_choice` | `string \| object` | No | Accepted but ignored |
| `text` | `ResponseTextConfig` | No | Structured output configuration |

All other params (`previous_response_id`, `conversation`, `store`, `truncation`, `metadata`, `reasoning`, etc.) are accepted but ignored with a runtime warning.

---

### Input Types

#### ResponseInputItem

```ts
type ResponseInputItem = EasyInputMessage | ResponseFunctionToolCall | FunctionCallOutput;
```

#### EasyInputMessage

```ts
{
  role: "user" | "assistant" | "system" | "developer";
  content: string | ResponseInputContent[];
  type?: "message";
}
```

#### ResponseFunctionToolCall

Passed back to continue a conversation after a function call:

```ts
{
  type: "function_call";
  name: string;
  arguments: string;
  call_id: string;
  status?: "in_progress" | "completed" | "incomplete";
}
```

#### FunctionCallOutput

Provides the result of a function call:

```ts
{
  type: "function_call_output";
  call_id: string;
  output: string;
}
```

#### ResponseInputContent

```ts
type ResponseInputContent =
  | { type: "input_text"; text: string }
  | { type: "input_image"; image_url?: string }  // not supported
  | { type: "input_file"; file_data?: string };  // not supported
```

Only `input_text` is supported. Other types log a warning and are skipped.
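As an illustration of the rule above, here is a sketch of the equivalent user-side filtering. `extractText` is hypothetical (not an SDK export), and joining the kept parts with newlines is an assumption made for the example:

```typescript
// Mirrors the documented behavior: keep input_text parts, warn on and skip the rest.
type ResponseInputContent =
  | { type: "input_text"; text: string }
  | { type: "input_image"; image_url?: string }
  | { type: "input_file"; file_data?: string };

function extractText(content: string | ResponseInputContent[]): string {
  if (typeof content === "string") return content;
  const parts: string[] = [];
  for (const part of content) {
    if (part.type === "input_text") {
      parts.push(part.text);
    } else {
      console.warn(`Skipping unsupported content part: ${part.type}`);
    }
  }
  return parts.join("\n"); // join strategy is an assumption for illustration
}
```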
---

### Tool Types (Responses API)

#### FunctionTool

Flat format — `name` and `parameters` are top-level (not nested under `function`):

```ts
{
  type: "function";
  name: string;
  parameters: Record<string, unknown> | null; // JSON Schema
  description?: string;
  strict?: boolean | null;
}
```

---

### Structured Output (Responses API)

#### ResponseTextConfig

```ts
{ format?: ResponseFormatConfig }
```

#### ResponseFormatConfig

```ts
type ResponseFormatConfig =
  | { type: "text" }
  | { type: "json_object" }
  | {
      type: "json_schema";
      name: string;
      schema: Record<string, unknown>;
      description?: string;
      strict?: boolean | null;
    };
```

Only `json_schema` triggers constrained generation.

---

### Response Object

```ts
{
  id: string; // "resp_"
  object: "response";
  created_at: number; // Unix timestamp (seconds)
  model: string; // "SystemLanguageModel"
  output: ResponseOutputItem[];
  output_text: string; // convenience: concatenated text from output messages
  status: "completed" | "failed" | "incomplete";
  error: ResponseError | null;
  incomplete_details: { reason?: "max_output_tokens" | "content_filter" } | null;
  instructions: string | null;
  metadata: Record<string, unknown> | null;
  temperature: number | null;
  top_p: number | null;
  max_output_tokens: number | null;
  tool_choice: "none" | "auto" | "required" | { type: "function"; name: string };
  tools: FunctionTool[];
  parallel_tool_calls: boolean;
  text: ResponseTextConfig;
  truncation: "auto" | "disabled" | null;
  usage: null; // not tracked
}
```

### ResponseOutputItem

```ts
type ResponseOutputItem = ResponseOutputMessage | ResponseOutputFunctionToolCall;
```

### ResponseOutputMessage

```ts
{
  id: string;
  type: "message";
  role: "assistant";
  status: "completed" | "incomplete" | "in_progress";
  content: Array<ResponseOutputText | ResponseOutputRefusal>;
}
```

### ResponseOutputText

```ts
{ type: "output_text"; text: string; annotations: unknown[] }
```

### ResponseOutputRefusal

```ts
{ type: "refusal"; refusal: string }
```

### ResponseOutputFunctionToolCall

```ts
{
  type: "function_call";
  id: string;
  call_id: string; // use this in FunctionCallOutput
  name: string;
  arguments: string; // JSON string
  status: "completed";
}
```

---

### ResponseStream

Async iterable wrapper for Responses API streaming events.

```ts
class ResponseStream implements AsyncIterable<ResponseStreamEvent>
```

| Method | Description |
| --- | --- |
| `[Symbol.asyncIterator]()` | Iterate events with `for await...of` |
| `close()` | Eagerly release resources |
| `toReadableStream()` | Convert to Web `ReadableStream` |

### ResponseStreamEvent

Union of all event types. See [Streaming Events](#streaming-events-reference) for the full list.

---

### Streaming Events Reference

| Event type | Key fields |
| --- | --- |
| `response.created` | `response: Response` |
| `response.in_progress` | `response: Response` |
| `response.completed` | `response: Response` |
| `response.failed` | `response: Response` |
| `response.incomplete` | `response: Response` |
| `response.output_item.added` | `item: ResponseOutputItem`, `output_index` |
| `response.output_item.done` | `item: ResponseOutputItem`, `output_index` |
| `response.content_part.added` | `part`, `item_id`, `output_index`, `content_index` |
| `response.content_part.done` | `part`, `item_id`, `output_index`, `content_index` |
| `response.output_text.delta` | `delta: string`, `item_id`, `output_index`, `content_index` |
| `response.output_text.done` | `text: string`, `item_id`, `output_index`, `content_index` |
| `response.refusal.delta` | `delta: string`, `item_id` |
| `response.refusal.done` | `refusal: string`, `item_id` |
| `response.function_call_arguments.delta` | `delta: string`, `item_id`, `output_index` |
| `response.function_call_arguments.done` | `arguments: string`, `name`, `call_id`, `item_id` |

All events include a `sequence_number` field.

---

## Chat Completions API

### Completions

Accessed via `client.chat.completions`.

#### `chat.completions.create(params)`

Creates a chat completion.
```ts
// Non-streaming
create(params: ChatCompletionCreateParams & { stream?: false | null }): Promise<ChatCompletion>

// Streaming
create(params: ChatCompletionCreateParams & { stream: true }): Promise<Stream>
```

---

## ChatCompletionCreateParams

Request parameters for `create()`.

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| `messages` | `ChatCompletionMessageParam[]` | Yes | Conversation messages |
| `model` | `string` | No | Ignored. Always uses the on-device model. |
| `stream` | `boolean` | No | Enable streaming |
| `temperature` | `number` | No | Sampling temperature |
| `max_tokens` | `number` | No | Maximum response tokens |
| `max_completion_tokens` | `number` | No | Same as `max_tokens` (takes priority) |
| `top_p` | `number` | No | Probability threshold for sampling |
| `seed` | `number` | No | Random seed for reproducibility |
| `tools` | `ChatCompletionTool[]` | No | Tool definitions |
| `response_format` | `ResponseFormat` | No | Output format constraint |

All other Chat Completions parameters (`n`, `stop`, `logprobs`, `frequency_penalty`, `presence_penalty`, `logit_bias`, `tool_choice`, `parallel_tool_calls`, etc.) are accepted but ignored. A warning is logged at runtime for each unsupported parameter that has a non-null value.
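The `max_tokens` / `max_completion_tokens` precedence in the table can be expressed as a small standalone helper. This is a hypothetical sketch of the documented behavior, not the SDK's internals — `effectiveMaxTokens` is not an exported function:

```typescript
// Hypothetical illustration of the documented precedence:
// when both limits are set, max_completion_tokens wins.
interface TokenLimitParams {
  max_tokens?: number;
  max_completion_tokens?: number;
}

function effectiveMaxTokens(params: TokenLimitParams): number | undefined {
  // Nullish coalescing: prefer max_completion_tokens when present.
  return params.max_completion_tokens ?? params.max_tokens;
}

console.log(effectiveMaxTokens({ max_tokens: 100, max_completion_tokens: 50 })); // 50
console.log(effectiveMaxTokens({ max_tokens: 100 })); // 100
```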
---

## Message Types

### ChatCompletionMessageParam

Union of all message types:

```ts
type ChatCompletionMessageParam =
  | ChatCompletionSystemMessageParam
  | ChatCompletionDeveloperMessageParam
  | ChatCompletionUserMessageParam
  | ChatCompletionAssistantMessageParam
  | ChatCompletionToolMessageParam;
```

### ChatCompletionSystemMessageParam

```ts
{ role: "system"; content: string; name?: string }
```

### ChatCompletionDeveloperMessageParam

```ts
{ role: "developer"; content: string; name?: string }
```

### ChatCompletionUserMessageParam

```ts
{ role: "user"; content: string | ChatCompletionContentPart[]; name?: string }
```

### ChatCompletionAssistantMessageParam

```ts
{
  role: "assistant";
  content?: string | null;
  tool_calls?: ChatCompletionMessageToolCall[];
  refusal?: string | null;
  name?: string;
}
```

### ChatCompletionToolMessageParam

```ts
{ role: "tool"; content: string; tool_call_id: string }
```

### ChatCompletionContentPart

```ts
type ChatCompletionContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } }
  | { type: "input_audio"; input_audio: { data: string; format: string } }
  | { type: "file"; file: { file_data: string; filename: string } }
  | { type: "refusal"; refusal: string };
```

Only `text` parts are supported. Other content types log a warning and are skipped.

---

## Tool Types

### ChatCompletionTool

```ts
{
  type: "function";
  function: {
    name: string;
    description?: string;
    parameters?: Record<string, unknown>; // JSON Schema
    strict?: boolean | null;
  };
}
```

### ChatCompletionMessageToolCall

```ts
{
  id: string; // "call_"
  type: "function";
  function: {
    name: string;
    arguments: string; // JSON string
  };
}
```

---

## Response Format

### ResponseFormat

```ts
type ResponseFormat =
  | { type: "text" }
  | { type: "json_object" }
  | {
      type: "json_schema";
      json_schema: {
        name: string;
        description?: string;
        schema?: Record<string, unknown>;
        strict?: boolean | null;
      };
    };
```

Only `json_schema` triggers constrained generation. `text` and `json_object` are treated as plain text generation.

---

## Response Types

### ChatCompletion

```ts
{
  id: string; // "chatcmpl-"
  object: "chat.completion";
  created: number; // Unix timestamp (seconds)
  model: string; // "SystemLanguageModel"
  choices: ChatCompletionChoice[];
  usage: null;
  system_fingerprint: null;
}
```

### ChatCompletionChoice

```ts
{
  index: number;
  message: ChatCompletionMessage;
  finish_reason: "stop" | "length" | "tool_calls" | "content_filter";
}
```

### ChatCompletionMessage

```ts
{
  role: "assistant";
  content: string | null;
  refusal: string | null;
  tool_calls?: ChatCompletionMessageToolCall[];
}
```

---

## Streaming Types

### Stream

Async iterable wrapper with resource cleanup.

```ts
class Stream implements AsyncIterable<ChatCompletionChunk>
```

| Method | Description |
| --- | --- |
| `[Symbol.asyncIterator]()` | Iterate chunks with `for await...of` |
| `close()` | Eagerly release resources |
| `toReadableStream()` | Convert to Web `ReadableStream` |

The stream auto-closes on iteration completion, `break`, or error. A `FinalizationRegistry` ensures cleanup if the stream is abandoned without being fully consumed.

### ChatCompletionChunk

```ts
{
  id: string;
  object: "chat.completion.chunk";
  created: number;
  model: string;
  choices: ChatCompletionChunkChoice[];
  usage: null;
  system_fingerprint: null;
}
```

### ChatCompletionChunkDelta

```ts
{
  role?: "assistant";
  content?: string | null;
  tool_calls?: Array<{
    index: number;
    id?: string;
    type?: "function";
    function?: { name?: string; arguments?: string };
  }>;
  refusal?: string | null;
}
```

---

## Constants

### MODEL_DEFAULT

```ts
const MODEL_DEFAULT = "SystemLanguageModel";
```

Placeholder model identifier for the on-device foundation model. It can be omitted, since only one model is available.

---

## CompatError

Error class with an HTTP-style status code. Thrown when the underlying SDK raises a `RateLimitedError` (surfaced with status 429).
```ts
class CompatError extends Error {
  status: number;
}
```

---
Source: /examples/
---

# Examples

Runnable examples demonstrating each feature of the SDK. All examples are in the [`examples/`](https://github.com/codybrom/tsfm/tree/main/examples) directory.

## Available Examples

| Example | Description |
| --- | --- |
| [Basic](/examples/basic) | Simple prompt and response |
| [Streaming](/examples/streaming) | Token-by-token streaming |
| [Structured Output](/examples/structured-output) | Typed schemas with `GenerationSchema` |
| [JSON Schema](/examples/json-schema) | JSON Schema-based structured output |
| [Tools](/examples/tools) | Tool calling with a calculator |
| [Generation Options](/examples/generation-options) | Temperature, sampling, token limits |
| [Transcripts](/examples/transcript) | Session history persistence |
| [Content Tagging](/examples/content-tagging) | Content tagging use case |
| [Chat & Responses APIs](/examples/chat-api) | Chat-style and Responses-style API interfaces |

---
Source: /examples/basic
---

# Basic

A simple prompt and response using `SystemLanguageModel` and `LanguageModelSession`.

<<< @/../examples/basic/basic.ts

## What This Shows

1. Create a `SystemLanguageModel` and wait for availability
2. Create a session with system instructions
3. Generate a text response with `respond()`
4. Dispose both session and model to free native resources

---
Source: /examples/streaming
---

# Streaming

Token-by-token streaming using `streamResponse()`.

<<< @/../examples/streaming/streaming.ts

## What This Shows

1. Create a session (the model availability check can be skipped for brevity)
2. Use `for await...of` to iterate over response chunks
3. Each chunk contains only new tokens — write directly to stdout

---
Source: /examples/structured-output
---

# Structured Output

Generate typed objects using `GenerationSchema` and `respondWithSchema()`.

<<< @/../examples/structured-output/structured-output.ts

## What This Shows

1. Define a schema with `GenerationSchema` and typed properties
2. Use `GenerationGuide.range()` to constrain numeric values
3. Extract typed values with `content.value(key)`
4. Map results to a TypeScript interface

---
Source: /examples/json-schema
---

# JSON Schema

Generate structured output using a standard JSON Schema object.

<<< @/../examples/json-schema/json-schema.ts

## What This Shows

1. Build a schema using the `GenerationSchema` builder, then export with `toDict()`
2. Pass the schema object to `respondWithJsonSchema()`
3. Get the full result as a plain object with `content.toObject()`

---
Source: /examples/tools
---

# Tools

Tool calling with a calculator that performs arithmetic operations.

<<< @/../examples/tools/tools.ts

## What This Shows

1. Extend `Tool` with `name`, `description`, `argumentsSchema`, and `call()`
2. Use `GenerationGuide.anyOf()` to constrain argument values
3. Pass tools to a session via `{ tools: [calculator] }`
4. The model decides when to call the tool and incorporates the result
5. Dispose both the session and tool when done

---
Source: /examples/generation-options
---

# Generation Options

Control temperature, sampling strategy, and token limits.

<<< @/../examples/generation-options/generation-options.ts

## What This Shows

1. Set `temperature` for creative output
2. Use `SamplingMode.random()` with top-K and a seed for reproducible randomness
3. Limit response length with `maximumResponseTokens`

---
Source: /examples/transcript
---

# Transcripts

Save and restore session history across process restarts.

<<< @/../examples/transcript/transcript.ts

## What This Shows

1. Build context in a session with multiple `respond()` calls
2. Export the transcript with `toJson()`
3. Restore a session with `Transcript.fromJson()` and `LanguageModelSession.fromTranscript()`
4. The restored session retains full conversation context

---
Source: /examples/content-tagging
---

# Content Tagging

Use the `CONTENT_TAGGING` use case for classification tasks.
<<< @/../examples/content-tagging/content-tagging.ts

## What This Shows

1. Create a model with `SystemLanguageModelUseCase.CONTENT_TAGGING`
2. Pass the model to a session
3. The model is optimized for classification rather than general chat

---
Source: /examples/chat-api
---

# Chat & Responses APIs

Examples using the Chat-style and Responses-style API interfaces at `tsfm-sdk/chat`. Both the Responses API and Chat Completions API are shown.

## Responses API

### Basic Text Generation

```ts
import Client from "tsfm-sdk/chat";

const client = new Client();

// String input — the simplest form
const response = await client.responses.create({
  instructions: "You are a helpful assistant. Be concise.",
  input: "What is the capital of France?",
});

console.log(response.output_text);
// "The capital of France is Paris."

client.close();
```

### Multi-turn Conversation

```ts
const response = await client.responses.create({
  instructions: "You are a math tutor. Be concise.",
  input: [
    { role: "user", content: "What is 2 + 2?" },
    { role: "assistant", content: "4" },
    { role: "user", content: "Multiply that by 3" },
  ],
});

console.log(response.output_text); // "12"
```

### Streaming

```ts
const stream = await client.responses.create({
  input: "Count from 1 to 5, one per line.",
  stream: true,
});

for await (const event of stream) {
  if (event.type === "response.output_text.delta") {
    process.stdout.write(event.delta);
  }
}
console.log();
```

### Structured Output

```ts
const response = await client.responses.create({
  input: "Extract: Alice is 28 years old and lives in Seattle",
  text: {
    format: {
      type: "json_schema",
      name: "Person",
      schema: {
        type: "object",
        properties: {
          name: { type: "string" },
          age: { type: "integer" },
          city: { type: "string" },
        },
        required: ["name", "age", "city"],
      },
    },
  },
});

const person = JSON.parse(response.output_text);
console.log(person); // { name: "Alice", age: 28, city: "Seattle" }
```

### Tool Calling

```ts
const tools = [
  {
    type: "function" as const,
    name: "get_weather",
    description: "Get current weather for a city",
    parameters: {
      type: "object",
      properties: {
        city: { type: "string", description: "City name" },
      },
      required: ["city"],
    },
  },
];

// Step 1: Model decides to call a tool
const response = await client.responses.create({
  input: "What's the weather in Tokyo?",
  tools,
});

const fc = response.output.find((item) => item.type === "function_call");
if (fc && fc.type === "function_call") {
  console.log("Tool:", fc.name);
  console.log("Args:", fc.arguments);

  // Step 2: Execute the tool and send results back
  const result = JSON.stringify({ temp: 22, condition: "Sunny" });
  const followUp = await client.responses.create({
    input: [
      { role: "user", content: "What's the weather in Tokyo?" },
      fc, // pass the function_call back
      { type: "function_call_output", call_id: fc.call_id, output: result },
    ],
    tools,
  });

  console.log(followUp.output_text);
  // "It's currently 22°C and sunny in Tokyo."
}
```

### Generation Options

```ts
const response = await client.responses.create({
  input: "Write a creative haiku",
  temperature: 0.8,
  max_output_tokens: 50,
  seed: 42,
});

console.log(response.output_text);
```

### Handling Errors

```ts
const response = await client.responses.create({
  input: "...",
});

if (response.status === "incomplete") {
  console.log("Incomplete:", response.incomplete_details?.reason);
  // "max_output_tokens" or "content_filter"
}

// Check for refusals
for (const item of response.output) {
  if (item.type === "message") {
    for (const content of item.content) {
      if (content.type === "refusal") {
        console.log("Refused:", content.refusal);
      }
    }
  }
}
```

---

## Chat Completions API

### Chat: Basic Text Generation

```ts
import Client from "tsfm-sdk/chat";

const client = new Client();

const response = await client.chat.completions.create({
  messages: [
    { role: "system", content: "You are a helpful assistant. Be concise." },
    { role: "user", content: "What is the capital of France?" },
  ],
});

console.log(response.choices[0].message.content);
// "The capital of France is Paris."

client.close();
```

### Chat: Multi-turn Conversation

```ts
const response = await client.chat.completions.create({
  messages: [
    { role: "system", content: "You are a math tutor. Be concise." },
    { role: "user", content: "What is 2 + 2?" },
    { role: "assistant", content: "4" },
    { role: "user", content: "Multiply that by 3" },
  ],
});

console.log(response.choices[0].message.content); // "12"
```

### Chat: Streaming

```ts
const stream = await client.chat.completions.create({
  messages: [{ role: "user", content: "Count from 1 to 5, one per line." }],
  stream: true,
});

for await (const chunk of stream) {
  const delta = chunk.choices[0].delta.content;
  if (delta) process.stdout.write(delta);
}
console.log();
```

### Chat: Structured Output

```ts
const response = await client.chat.completions.create({
  messages: [
    { role: "user", content: "Extract: Alice is 28 years old and lives in Seattle" },
  ],
  response_format: {
    type: "json_schema",
    json_schema: {
      name: "Person",
      schema: {
        type: "object",
        properties: {
          name: { type: "string" },
          age: { type: "integer" },
          city: { type: "string" },
        },
        required: ["name", "age", "city"],
      },
    },
  },
});

const person = JSON.parse(response.choices[0].message.content!);
console.log(person); // { name: "Alice", age: 28, city: "Seattle" }
```

### Chat: Tool Calling

```ts
const tools = [
  {
    type: "function" as const,
    function: {
      name: "get_weather",
      description: "Get current weather for a city",
      parameters: {
        type: "object",
        properties: {
          city: { type: "string", description: "City name" },
        },
        required: ["city"],
      },
    },
  },
];

// Step 1: Model decides to call a tool
const response = await client.chat.completions.create({
  messages: [{ role: "user", content: "What's the weather in Tokyo?" }],
  tools,
});

const choice = response.choices[0];
if (choice.finish_reason === "tool_calls" && choice.message.tool_calls) {
  const call = choice.message.tool_calls[0];
  console.log("Tool:", call.function.name);
  console.log("Args:", call.function.arguments);

  // Step 2: Execute the tool and send results back
  const result = JSON.stringify({ temp: 22, condition: "Sunny" });
  const followUp = await client.chat.completions.create({
    messages: [
      { role: "user", content: "What's the weather in Tokyo?" },
      { role: "assistant", content: null, tool_calls: [call] },
      { role: "tool", tool_call_id: call.id, content: result },
    ],
    tools,
  });

  console.log(followUp.choices[0].message.content);
  // "It's currently 22°C and sunny in Tokyo."
}
```

---
Source: /changelog
---

# Changelog