The Foundation Models framework is Apple's API, announced at WWDC 2025, that gives developers direct access to the on-device large language model powering Apple Intelligence. Released in beta on June 9, 2025, the framework lets apps integrate AI capabilities while keeping user data private through on-device processing.
Key benefits:
- On-device processing: All AI inference runs locally on the device
- Privacy-focused: No data leaves the device or is sent to cloud servers
- Offline capable: Works without an internet connection
- Zero cost: No API fees or cloud computing charges
- Small footprint: Built into the OS, doesn't increase app size
- Swift-native: Integrates seamlessly with Swift using as few as 3 lines of code
Supported platforms:
- iOS 26
- iPadOS 26
- macOS Tahoe 26
- visionOS 26
Supported devices:
- iPhone 16 (all models)
- iPhone 15 Pro and iPhone 15 Pro Max
- iPad mini (A17 Pro)
- iPad models with M1 chip or later
- Mac models with M1 chip or later
- Apple Vision Pro
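Because availability depends on the device, the OS, and whether Apple Intelligence is enabled, apps should check at runtime before showing AI features. A minimal sketch using the availability API shown at WWDC (the exact set of unavailability reasons may shift across betas):

```swift
import FoundationModels

let model = SystemLanguageModel.default

switch model.availability {
case .available:
    // Show AI-powered features
    break
case .unavailable(.deviceNotEligible):
    // Hardware can't run the model; hide the feature or fall back
    break
case .unavailable(.appleIntelligenceNotEnabled):
    // Ask the user to enable Apple Intelligence in Settings
    break
case .unavailable(.modelNotReady):
    // Model assets are still downloading; retry later
    break
case .unavailable(_):
    // Any other (or future) reason
    break
}
```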
Supported languages at launch:
- English, French, German, Italian, Portuguese (Brazil), Spanish, Japanese, Korean, Chinese (simplified)
Coming by end of 2025:
- Danish, Dutch, Norwegian, Portuguese (Portugal), Swedish, Turkish, Chinese (traditional), Vietnamese
Model specifications:
- Parameters: ~3 billion
- Quantization: 2-bit (with 3.5-3.7 bits-per-weight average using mixed 2-bit and 4-bit configuration)
- Architecture: Optimized for Apple Silicon
- Performance:
  - Time-to-first-token latency: ~0.6 ms per prompt token
  - Generation rate: 30 tokens per second on iPhone 15 Pro
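As a rough back-of-envelope using these figures, a 500-token prompt costs about 500 × 0.6 ms ≈ 0.3 s before the first token appears, and a 100-token response then takes roughly 100 / 30 ≈ 3.3 s to generate on an iPhone 15 Pro.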
The on-device model excels at:
- Text summarization
- Entity extraction
- Text understanding and refinement
- Short dialog generation
- Creative content generation
- Classification tasks
- Content tagging
- Natural language search
The model is NOT designed for:
- General world knowledge queries
- Advanced reasoning tasks
- Chatbot-style conversations
- Server-scale LLM tasks
Guided Generation is the framework's core feature that ensures reliable structured output from the model using Swift's type system.
```swift
import FoundationModels

@Generable
struct SearchSuggestions {
    @Guide(description: "A list of suggested search terms", .count(4))
    var searchTerms: [String]
}
```
Generable types can include:
- Primitives: String, Int, Double, Float, Decimal, Bool
- Arrays: [String], [Int], etc.
- Composed types: Nested structs
- Recursive types: Self-referencing structures (see the sketch below)
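Recursion works through containers such as arrays; a minimal sketch of a self-referencing Generable type:

```swift
import FoundationModels

// Each comment can carry nested replies of the same type
// (recursion via an array).
@Generable
struct Comment {
    var text: String
    var replies: [Comment]
}
```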
The @Guide macro provides constraints and natural-language descriptions for properties:
```swift
@Generable
struct Person {
    @Guide(description: "Person's full name")
    var name: String

    @Guide(description: "Age in years", .range(0...120))
    var age: Int

    // Constrain the string to a regex pattern
    @Guide(.pattern(/^[A-Z]{2}-\d{4}$/))
    var id: String
}
```
```swift
let session = LanguageModelSession()
let prompt = "Generate search suggestions for a travel app"

let response = try await session.respond(
    to: prompt,
    generating: SearchSuggestions.self
)
print(response.content.searchTerms)
```
The framework uses a unique snapshot-based streaming approach instead of traditional delta streaming.
The @Generable macro automatically generates a PartiallyGenerated type with all optional properties:
```swift
@Generable
struct Itinerary {
    var destination: String
    var days: [DayPlan]   // DayPlan is another @Generable type (sketched later in this section)
    var summary: String
}

// The macro also emits Itinerary.PartiallyGenerated,
// a mirror of Itinerary in which every property is optional.
```
```swift
import SwiftUI
import FoundationModels

struct ItineraryView: View {
    let session: LanguageModelSession
    @State private var itinerary: Itinerary.PartiallyGenerated?

    var body: some View {
        VStack {
            // UI that renders the partial itinerary as it fills in

            Button("Generate") {
                Task {
                    let stream = session.streamResponse(
                        to: "Plan a 3-day trip to Tokyo",
                        generating: Itinerary.self
                    )
                    // Each element is a progressively more complete snapshot
                    for try await partial in stream {
                        self.itinerary = partial
                    }
                }
            }
        }
    }
}
```
Streaming best practices:
- Use SwiftUI animations to hide latency
- Consider view identity when generating arrays
- Property order matters: properties are generated in declaration order
- Place summaries last for better quality output (see the sketch after this list)
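Because properties are generated in declaration order, a summary should be the last stored property so the model writes it only after the concrete fields exist. A minimal sketch (also defining the DayPlan type referenced earlier):

```swift
@Generable
struct DayPlan {
    var title: String
    var activities: [String]
    // Declared last on purpose: the model generates the summary
    // after it has already produced the fields above.
    var summary: String
}
```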
Tool calling allows the model to execute custom code to retrieve information or perform actions.
```swift
import FoundationModels

struct WeatherTool: Tool {
    let name = "getWeather"
    let description = "Get current weather for a location"

    @Generable
    struct Arguments {
        let city: String
        let unit: String?
    }

    func call(arguments: Arguments) async throws -> ToolOutput {
        // getTemperature(for:) is a placeholder for your own lookup,
        // e.g. via WeatherKit or another weather API.
        let temperature = try await getTemperature(for: arguments.city)
        return ToolOutput("The temperature in \(arguments.city) is \(temperature)°")
    }
}
```
```swift
let weatherTool = WeatherTool()
let session = LanguageModelSession(tools: [weatherTool])

let response = try await session.respond(
    to: "What's the weather like in San Francisco?"
)
// The model calls the weather tool automatically when the prompt needs it
```
Sessions maintain context across multiple interactions.
```swift
let session = LanguageModelSession(
    instructions: """
        You are a helpful travel assistant.
        Provide concise, actionable recommendations.
        Focus on local experiences and hidden gems.
        """
)

// First turn
let response1 = try await session.respond(to: "Recommend a restaurant in Paris")

// Second turn - the model remembers context from the first
let response2 = try await session.respond(to: "What about one with vegetarian options?")

// Access the conversation history
let transcript = session.transcript

// Check whether the model is currently generating
if session.isResponding {
    // Show a loading indicator
}
```
```swift
// Check availability (the property lives on a model instance,
// typically SystemLanguageModel.default)
if case .available = SystemLanguageModel.default.availability {
    // The model is available on this device
}
```
```swift
// Load the specialized content-tagging model; the initializer
// takes a use case: SystemLanguageModel(useCase: .contentTagging)
let tagger = SystemLanguageModel(useCase: .contentTagging)
let session = LanguageModelSession(model: tagger)

@Generable
struct ContentTags {
    var topics: [String]
    var sentiment: String
    var keywords: [String]
}

// userContent is whatever text your app wants to tag
let tags = try await session.respond(
    to: userContent,
    generating: ContentTags.self
)
```
Test prompts directly in your code with the #Playground macro:
```swift
import FoundationModels
import Playgrounds

#Playground {
    let session = LanguageModelSession()
    let response = try await session.respond(
        to: "Generate a haiku about coding"
    )
}
```
Use the new Foundation Models template in Instruments to:
- Profile model request latency
- Identify optimization opportunities
- Quantify performance improvements
Generation errors are thrown as cases of LanguageModelSession.GenerationError:

```swift
do {
    let response = try await session.respond(to: prompt)
    print(response.content)
} catch LanguageModelSession.GenerationError.guardrailViolation {
    // Handle a safety violation
} catch LanguageModelSession.GenerationError.unsupportedLanguageOrLocale {
    // Handle an unsupported language
} catch LanguageModelSession.GenerationError.exceededContextWindowSize {
    // Handle a context that has grown too long
} catch {
    // Handle other errors
}
```
Prompting:
- Keep prompts focused - break complex tasks into smaller pieces
- Use instructions wisely - static developer guidance, not user input
- Leverage guided generation - let the framework handle output formatting
- Test extensively - use Xcode Playgrounds for rapid iteration
Performance:
- Prewarm models when appropriate (see the sketch after these lists)
- Stream responses for better perceived performance
- Use appropriate verbosity in prompts
- Profile with Instruments to identify bottlenecks
Safety:
- Never interpolate user input into instructions
- Use tool calling for external data instead of prompt injection
- Handle errors gracefully, including guardrail violations
- Validate generated content before using it in production
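When you know a session will be used soon (for example, when the user opens a compose screen), prewarming loads model resources before the first request. A minimal sketch using the session's prewarm() method:

```swift
import FoundationModels

let session = LanguageModelSession(
    instructions: "You are a concise writing assistant."
)

// Load model resources ahead of time so the first
// respond(to:) call starts faster.
session.prewarm()
```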
Example: generating a quiz from a student's notes:

```swift
@Generable
struct Quiz {
    var questions: [Question]

    @Generable
    struct Question {
        var text: String
        var options: [String]
        var correctAnswer: Int
    }
}

// studentNotes holds the user's study material
let quiz = try await session.respond(
    to: "Generate a quiz about \(studentNotes)",
    generating: Quiz.self
)
```
Example: modeling a travel itinerary:

```swift
@Generable
struct TravelItinerary {
    var destination: String
    var duration: Int
    var activities: [Activity]
    var estimatedBudget: Decimal

    @Generable
    struct Activity {
        var name: String
        var description: String
        var duration: String
        var cost: Decimal?
    }
}
```
Automattic's Day One app uses the framework for:
- Intelligent prompts based on user entries
- Privacy-preserving content analysis
- Mood and theme detection
For structures whose shape is only known at runtime, build a DynamicGenerationSchema. The sketch below is a best-effort reading of the beta API; exact initializer signatures may differ:

```swift
// Describe an object with a string "name" and a numeric "values" array
let root = DynamicGenerationSchema(
    name: "Record",
    properties: [
        .init(name: "name", schema: DynamicGenerationSchema(type: String.self)),
        .init(name: "values",
              schema: DynamicGenerationSchema(arrayOf: DynamicGenerationSchema(type: Double.self)))
    ]
)

// Compile the dynamic schema into a GenerationSchema
let schema = try GenerationSchema(root: root, dependencies: [])

let response = try await session.respond(to: prompt, schema: schema)
// response.content is a GeneratedContent value you can inspect at runtime
```
For specialized use cases, train custom adapters using Apple's Python toolkit:
- Rank 32 LoRA adapters
- Must be retrained with each base model update
- Consider only after exhausting base model capabilities
The framework must NOT be used for:
- Illegal activities or law violations
- Generating pornographic or sexual content
- Child exploitation or abuse
- Employment-related assessments
- Circumventing safety guardrails
- Reverse engineering training data
- Generating harmful or discriminatory content
- Developer Beta (available now):
  - Join the Apple Developer Program
  - Download from developer.apple.com
- Public Beta (July 2025):
  - Join the Apple Beta Software Program
  - Download from beta.apple.com
Quick start:

```swift
import FoundationModels

// Create a session
let session = LanguageModelSession()

// Generate a response
let response = try await session.respond(to: "Summarize this text: \(userText)")

// Use the response
print(response.content)
```
WWDC25 sessions:
- Meet the Foundation Models framework
- Deep dive into the Foundation Models framework
- Integrating Foundation Models into your app
- Prompt design and safety
- Available through Apple Developer Program
- Xcode 26 includes playground templates
- Use Feedback Assistant with the framework's Encodable attachment structure for structured reports
- Help improve models by sharing use cases
The Foundation Models framework represents a significant advancement in on-device AI for Apple platforms. By combining powerful language models with Swift's type system and Apple's focus on privacy, developers can create intelligent features that work offline, protect user data, and provide responsive experiences without cloud infrastructure costs.
As the framework evolves, Apple plans to improve model capabilities and add new specialized adapters based on developer feedback. The tight integration with Swift and Apple's development tools makes it easier than ever to add AI-powered features to apps while maintaining the high standards of user experience and privacy that Apple users expect.