Apple Foundation Models Framework Documentation

Overview

The Foundation Models framework is Apple's API, announced at WWDC 2025, that gives developers direct access to the on-device large language model powering Apple Intelligence. Released in developer beta on June 9, 2025, the framework lets apps integrate AI capabilities directly while preserving user privacy through on-device processing.

Key Features

  • On-device processing: All AI inference runs locally on the device
  • Privacy-focused: No data leaves the device or is sent to cloud servers
  • Offline capable: Works without an internet connection
  • Zero cost: No API fees or cloud computing charges
  • Small footprint: Built into the OS, doesn't increase app size
  • Swift-native: Integrates seamlessly with Swift using as few as 3 lines of code

Platform Availability

  • iOS 26
  • iPadOS 26
  • macOS Tahoe 26
  • visionOS 26

Device Requirements

  • iPhone 16 (all models)
  • iPhone 15 Pro and iPhone 15 Pro Max
  • iPad mini (A17 Pro)
  • iPad models with M1 chip or later
  • Mac models with M1 chip or later
  • Apple Vision Pro

Language Support

Available at launch:

  • English, French, German, Italian, Portuguese (Brazil), Spanish, Japanese, Korean, Chinese (simplified)

Coming by end of 2025:

  • Danish, Dutch, Norwegian, Portuguese (Portugal), Swedish, Turkish, Chinese (traditional), Vietnamese

The Foundation Model

Model Specifications

  • Parameters: ~3 billion
  • Quantization: 2-bit, averaging roughly 3.5-3.7 bits per weight through a mixed 2-bit and 4-bit configuration
  • Architecture: Optimized for Apple Silicon
  • Performance:
    • Time-to-first-token latency: ~0.6ms per prompt token
    • Generation rate: 30 tokens per second on iPhone 15 Pro

Model Capabilities

The on-device model excels at:

  • Text summarization
  • Entity extraction
  • Text understanding and refinement
  • Short dialog generation
  • Creative content generation
  • Classification tasks
  • Content tagging
  • Natural language search

Model Limitations

The model is NOT designed for:

  • General world knowledge queries
  • Advanced reasoning tasks
  • Chatbot-style conversations
  • Server-scale LLM tasks

Core Features

1. Guided Generation

Guided Generation is the framework's core feature that ensures reliable structured output from the model using Swift's type system.

The @Generable Macro

import FoundationModels

@Generable
struct SearchSuggestions {
    @Guide(description: "A list of suggested search terms", .count(4))
    var searchTerms: [String]
}

Supported Types

Generable types can include:

  • Primitives: String, Int, Double, Float, Decimal, Bool
  • Arrays: [String], [Int], etc.
  • Composed types: Nested structs
  • Recursive types: Self-referencing structures
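
For example, composed and recursive types can be declared directly; a minimal sketch (the type names are illustrative):

@Generable
struct Recipe {
    var name: String
    var ingredients: [Ingredient]

    @Generable
    struct Ingredient {
        var name: String
        var quantity: String
    }
}

// A self-referencing structure: replies are themselves Comments
@Generable
struct Comment {
    var text: String
    var replies: [Comment]
}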

The @Guide Macro

Provides constraints and natural language descriptions for properties:

@Generable
struct Person {
    @Guide(description: "Person's full name")
    var name: String
    
    @Guide(description: "Age in years", .range(0...120))
    var age: Int
    
    @Guide(.pattern(/^[A-Z]{2}-\d{4}$/))
    var id: String
}

Basic Usage

let session = LanguageModelSession()
let prompt = "Generate search suggestions for a travel app"
let response = try await session.respond(
    to: prompt,
    generating: SearchSuggestions.self
)
print(response.content.searchTerms)

2. Snapshot Streaming

The framework uses a unique snapshot-based streaming approach instead of traditional delta streaming.

PartiallyGenerated Types

The @Generable macro automatically generates a PartiallyGenerated type whose properties are all optional:

@Generable
struct Itinerary {
    var destination: String
    var days: [DayPlan]
    var summary: String
}

// Automatically generates:
// Itinerary.PartiallyGenerated with all optional properties

Streaming Implementation

struct ItineraryView: View {
    let session: LanguageModelSession
    @State private var itinerary: Itinerary.PartiallyGenerated?
    
    var body: some View {
        VStack {
            // UI components
            Button("Generate") {
                Task {
                    let stream = session.streamResponse(
                        to: "Plan a 3-day trip to Tokyo",
                        generating: Itinerary.self
                    )
                    
                    for try await partial in stream {
                        self.itinerary = partial
                    }
                }
            }
        }
    }
}

Best Practices for Streaming

  1. Use SwiftUI animations to hide latency
  2. Consider view identity when generating arrays
  3. Property order matters - properties are generated in declaration order
  4. Place summaries last for better quality output
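
Points 3 and 4 matter because the model emits properties in declaration order; a minimal sketch of a type laid out for streaming:

@Generable
struct DayPlan {
    // Generated first, grounding the rest of the plan
    var destination: String
    // Generated next, in declaration order
    var activities: [String]
    // Generated last, so it can draw on everything above
    var summary: String
}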

3. Tool Calling

Tool calling allows the model to execute custom code to retrieve information or perform actions.

Defining a Tool

struct WeatherTool: Tool {
    let name = "get_weather"
    let description = "Get current weather for a location"
    
    @Generable
    struct Arguments {
        let city: String
        let unit: String?
    }
    
    func call(arguments: Arguments) async throws -> ToolOutput {
        // Use WeatherKit or other APIs (getTemperature is a placeholder)
        let temperature = try await getTemperature(for: arguments.city)
        
        return ToolOutput("The temperature in \(arguments.city) is \(temperature)°")
    }
}

Using Tools in a Session

let weatherTool = WeatherTool()
let session = LanguageModelSession(tools: [weatherTool])

let response = try await session.respond(
    to: "What's the weather like in San Francisco?"
)
// Model will automatically call the weather tool when needed

4. Stateful Sessions

Sessions maintain context across multiple interactions.

Creating a Session with Instructions

let session = LanguageModelSession(
    instructions: """
    You are a helpful travel assistant. 
    Provide concise, actionable recommendations.
    Focus on local experiences and hidden gems.
    """
)

Multi-turn Conversations

// First turn
let response1 = try await session.respond(to: "Recommend a restaurant in Paris")

// Second turn - model remembers context
let response2 = try await session.respond(to: "What about one with vegetarian options?")

// Access conversation history
let transcript = session.transcript

Session Properties

// Check if model is currently generating
if session.isResponding {
    // Show loading indicator
}

// Check availability
if case .available = SystemLanguageModel.default.availability {
    // Model is available
}
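
A fuller availability check can switch over the unavailability reasons to drive the UI (a sketch based on the beta API; the reason cases shown are assumptions that may change):

switch SystemLanguageModel.default.availability {
case .available:
    // Safe to create a session
    break
case .unavailable(.deviceNotEligible):
    // Hide AI features on unsupported hardware
    break
case .unavailable(.appleIntelligenceNotEnabled):
    // Ask the user to turn on Apple Intelligence in Settings
    break
case .unavailable(.modelNotReady):
    // Model assets are still downloading; retry later
    break
default:
    // Any other or future reason
    break
}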

Specialized Adapters

Content Tagging Adapter

let tagger = SystemLanguageModel(useCase: .contentTagging)
let session = LanguageModelSession(model: tagger)

@Generable
struct ContentTags {
    var topics: [String]
    var sentiment: String
    var keywords: [String]
}

let tags = try await session.respond(
    to: userContent,
    generating: ContentTags.self
)

Development Tools

Xcode Playgrounds

Test prompts directly in your code:

import FoundationModels
import Playgrounds

#Playground {
    let session = LanguageModelSession()
    let response = try await session.respond(
        to: "Generate a haiku about coding"
    )
}

Instruments Profiling

Use the new Foundation Models template in Instruments to:

  • Profile model request latency
  • Identify optimization opportunities
  • Quantify performance improvements

Error Handling

do {
    let response = try await session.respond(to: prompt)
} catch LanguageModelSession.GenerationError.guardrailViolation {
    // Handle safety violation
} catch LanguageModelSession.GenerationError.unsupportedLanguageOrLocale {
    // Handle language not supported
} catch LanguageModelSession.GenerationError.exceededContextWindowSize {
    // Handle context too long
} catch {
    // Handle other errors
}

Best Practices

Prompt Design

  1. Keep prompts focused - Break complex tasks into smaller pieces (see the sketch after this list)
  2. Use instructions wisely - Static developer guidance, not user input
  3. Leverage guided generation - Let the framework handle output formatting
  4. Test extensively - Use Xcode Playgrounds for rapid iteration
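
For point 1, two focused requests are often more reliable than one prompt that asks for everything at once (a sketch; articleText is a placeholder):

let session = LanguageModelSession()

// First focused task: summarize
let summary = try await session.respond(
    to: "Summarize this article in two sentences: \(articleText)"
)

// Second focused task: build on the first result
let title = try await session.respond(
    to: "Suggest a short title for this summary: \(summary.content)"
)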

Performance Optimization

  1. Prewarm models when appropriate (see the sketch after this list)
  2. Stream responses for better perceived performance
  3. Use appropriate verbosity in prompts
  4. Profile with Instruments to identify bottlenecks
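
For point 1, a session can be prewarmed before the user commits to a request, so the model is already loaded when the first prompt arrives (a minimal sketch):

let session = LanguageModelSession()

// Call when a request is likely, e.g. when a text field gains focus
session.prewarm()

// The first real request then starts with lower latency
let response = try await session.respond(to: "Suggest three tags for this note")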

Security Considerations

  1. Never interpolate user input into instructions (see the sketch after this list)
  2. Use tool calling for external data instead of prompt injection
  3. Handle errors gracefully including guardrail violations
  4. Validate generated content before using in production
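
For point 1, keep developer-authored guidance in instructions and route untrusted user content through the prompt instead (a sketch; userInput is a placeholder for untrusted text):

// Trusted, static guidance belongs in instructions
let session = LanguageModelSession(
    instructions: "You summarize user-provided notes in one short paragraph."
)

// Untrusted user content goes in the prompt, never in instructions
let response = try await session.respond(to: "Summarize this note: \(userInput)")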

Real-World Examples

Education App

@Generable
struct Quiz {
    var questions: [Question]
    
    @Generable
    struct Question {
        var text: String
        var options: [String]
        var correctAnswer: Int
    }
}

let response = try await session.respond(
    to: "Generate a quiz about \(studentNotes)",
    generating: Quiz.self
)
let quiz = response.content

Travel App

@Generable
struct TravelItinerary {
    var destination: String
    var duration: Int
    var activities: [Activity]
    var estimatedBudget: Decimal
    
    @Generable
    struct Activity {
        var name: String
        var description: String
        var duration: String
        var cost: Decimal?
    }
}
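
Generating an itinerary then follows the same guided-generation pattern shown earlier (the prompt here is illustrative):

let session = LanguageModelSession()
let response = try await session.respond(
    to: "Plan a 5-day trip to Kyoto on a moderate budget",
    generating: TravelItinerary.self
)
let itinerary = response.content  // fully typed TravelItinerary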

Journaling App (Day One)

Automattic's Day One app uses the framework for:

  • Intelligent prompts based on user entries
  • Privacy-preserving content analysis
  • Mood and theme detection

Advanced Features

Dynamic Schemas

For structures determined only at runtime, build a DynamicGenerationSchema and convert it into a GenerationSchema (a sketch of the beta API; exact initializer signatures may differ):

// Describe the object's properties at runtime
let root = DynamicGenerationSchema(
    name: "Record",
    properties: [
        .init(name: "name", schema: DynamicGenerationSchema(type: String.self)),
        .init(name: "values", schema: DynamicGenerationSchema(
            arrayOf: DynamicGenerationSchema(type: Double.self)
        ))
    ]
)

// Validate the dynamic schema and convert it for use in a session
let schema = try GenerationSchema(root: root, dependencies: [])

let response = try await session.respond(
    to: prompt,
    schema: schema
)
// The result arrives as GeneratedContent, queried by property name

Custom Adapters

For specialized use cases, train custom adapters using Apple's Python toolkit:

  • Rank-32 LoRA adapters
  • Must be retrained with each base model update
  • Consider only after exhausting base model capabilities

Acceptable Use Requirements

The framework must NOT be used for:

  • Illegal activities or law violations
  • Generating pornographic or sexual content
  • Child exploitation or abuse
  • Employment-related assessments
  • Circumventing safety guardrails
  • Reverse engineering training data
  • Generating harmful or discriminatory content

Getting Started

Installation

  1. Developer Beta (available now):
    • Join the Apple Developer Program
    • Download from developer.apple.com
  2. Public Beta (July 2025):
    • Join the Apple Beta Software Program
    • Download from beta.apple.com

Minimum Code Example

import FoundationModels

// Create a session
let session = LanguageModelSession()

// Generate response
let response = try await session.respond(to: "Summarize this text: \(userText)")

// Use the response
print(response.content)

Resources

WWDC 2025 Sessions

  • Meet the Foundation Models framework
  • Deep dive into the Foundation Models framework
  • Integrating Foundation Models into your app
  • Prompt design and safety

Sample Code

  • Available through Apple Developer Program
  • Xcode 26 includes playground templates

Feedback

  • Use Feedback Assistant; the framework provides an encodable attachment structure for packaging session details with reports
  • Help improve models by sharing use cases

Conclusion

The Foundation Models framework represents a significant advancement in on-device AI for Apple platforms. By combining powerful language models with Swift's type system and Apple's focus on privacy, developers can create intelligent features that work offline, protect user data, and provide responsive experiences without cloud infrastructure costs.

As the framework evolves, Apple will continue to improve model capabilities and add new specialized adapters based on developer feedback. The tight integration with Swift and Apple's development tools makes it easier than ever to add AI-powered features to apps while maintaining the high standards of user experience and privacy that Apple users expect.
