A step-by-step guide to running an LLM on an iPhone with MLX Swift

This guide is adapted from an original post by Christopher Charles.

Project Setup

1. Get the source:

Clone the MLX Swift Examples GitHub repository:

git clone git@github.com:ml-explore/mlx-swift-examples.git

Open Xcode and click "Open Existing Project".
Navigate to the directory where you cloned mlx-swift-examples and open the file mlx-swift-examples.xcodeproj.
In Xcode, select the LLMEval example application (see screenshot).
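
The clone command above uses SSH, which requires an SSH key registered with GitHub. As a minimal sketch, the same steps can be done from Terminal over HTTPS instead. The function name `clone_and_open` and the two variables are hypothetical helpers introduced here for illustration; `open` is the standard macOS launcher.

```shell
# Sketch: clone over HTTPS (no SSH key needed) and open the project in Xcode.
REPO_URL="https://github.com/ml-explore/mlx-swift-examples.git"
REPO_DIR="mlx-swift-examples"

clone_and_open() {
  # Clone only if the directory does not exist yet.
  [ -d "$REPO_DIR" ] || git clone "$REPO_URL" "$REPO_DIR" || return 1
  # On macOS, `open` hands the project file to Xcode.
  open "$REPO_DIR/mlx-swift-examples.xcodeproj"
}
```

After pasting this into a Terminal session on your Mac, run `clone_and_open` from the directory where you want the checkout to live.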

2. Set the Team in Signing & Capabilities

Click on the mlx-swift-examples project in Xcode’s Project Navigator (left sidebar).
Choose the LLMEval target under the “Targets” section.
Select the Signing & Capabilities tab (see screenshot).
Under Team, choose the Apple Developer account to use for code signing. This step requires an Apple Developer account.

The LLMEval app uses two extra capabilities. These should already be enabled, but you can verify this in the Signing & Capabilities tab:

Outgoing Connections (Client) to allow the app to download models from Hugging Face.
The Increased Memory Limit entitlement (on supported devices and OS versions) which is useful for larger models.
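
The two capabilities above are stored as keys in the target's .entitlements file, so they can also be checked from Terminal. The sketch below assumes they map to the keys `com.apple.security.network.client` and `com.apple.developer.kernel.increased-memory-limit`; the function name `check_entitlements` is hypothetical, and the location of the entitlements file in the repository is not assumed (find it with `find . -name '*.entitlements'`).

```shell
# Sketch: report whether the two expected entitlement keys appear in a file.
check_entitlements() {
  # $1: path to an .entitlements (property list) file
  for key in com.apple.security.network.client \
             com.apple.developer.kernel.increased-memory-limit; do
    if grep -q "<key>$key</key>" "$1"; then
      echo "present: $key"
    else
      echo "missing: $key"
    fi
  done
}
```

Call it as `check_entitlements path/to/file.entitlements`, with the path reported by the `find` command. A "missing" line suggests re-adding that capability in the Signing & Capabilities tab.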

3. Configure "Release" Build

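Press "⌘+⌥+R" (command + option + R) to edit the build scheme.
Select the "Info" tab.
Select "Release" for the "Build configuration" (see screenshot). Release builds are compiled with optimizations, so generation typically runs much faster than in a Debug build.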
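
If you prefer the command line, a Release build can also be requested with `xcodebuild`. This is a sketch only: it assumes the scheme is named LLMEval (matching the application selected earlier), that it is run from the repository root, and that signing is already configured as in step 2. The function name `build_release` and the `DRY_RUN` switch are hypothetical. The command builds the app but does not install it on a device.

```shell
# Sketch: build the LLMEval scheme in the Release configuration for iOS.
build_release() {
  # Assemble the command as positional parameters so it can be printed or run.
  set -- xcodebuild \
    -project mlx-swift-examples.xcodeproj \
    -scheme LLMEval \
    -configuration Release \
    -destination 'generic/platform=iOS' \
    build
  if [ "${DRY_RUN:-0}" = 1 ]; then
    echo "$*"   # print the command without running it
  else
    "$@"        # run the build
  fi
}
```

Running `DRY_RUN=1 build_release` prints the command so you can inspect it first.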

Build & Run

1. Connect Your Device

For example, to build on an iPhone:

Plug your iPhone into your Mac via USB (or be on the same Wi-Fi network with wireless debugging enabled).
In the Xcode toolbar (top), select your iPhone from the device list (see screenshot).
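
If the iPhone does not show up in the device list, it can help to check whether Xcode's tooling sees it at all. A minimal sketch using `xcrun xctrace list devices`, which ships with Xcode; the wrapper name `list_devices` is hypothetical.

```shell
# Sketch: list the devices and simulators that Xcode's tools can see.
list_devices() {
  if ! command -v xcrun >/dev/null 2>&1; then
    echo "xcrun not found; install Xcode and its command line tools" >&2
    return 1
  fi
  xcrun xctrace list devices
}
```

Run `list_devices`; a connected, trusted iPhone should be listed alongside your Mac.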

2. Build & Run

Press "⌘+R" (command + R), or click the play button, to build and install LLMEval on your device.
If this is your first time building an app for the device, you may be prompted to put it in Developer Mode (on the iPhone: Settings > Privacy & Security > Developer Mode). See these instructions.
You may be prompted to "Verify that the Developer App certificate for your account is trusted on your device". Follow the instructions in the prompt.
Open the app.
If you haven't used the model before, LLMEval downloads it first. Wait for the download to finish; this can take a while.
Enter a prompt in the app’s UI to generate text on the device.

3. Try Different Models (optional)

By default, ModelConfiguration.phi4bit is used (see ContentView.swift). Other models are listed in LLMModelFactory.swift. Make sure the model you choose has a supported architecture and fits within your device’s memory.
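
To locate where the default model is selected and where the alternatives are defined, you can search the Swift sources from the repository root. This is a small sketch; the helper name `find_model_refs` is hypothetical and makes no assumption about the repository's directory layout.

```shell
# Sketch: find Swift source lines that mention a model identifier.
find_model_refs() {
  # $1: identifier to search for; $2: directory to search (defaults to .)
  grep -rn --include='*.swift' -- "$1" "${2:-.}"
}
```

For example, `find_model_refs phi4bit` prints each matching file, line number, and line, which shows where to swap in a different configuration.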

See the mlx-swift-examples documentation for more information on adding and using models.

Troubleshooting & Tips

Entitlements / Code Signing: If you get build errors about code signing or entitlements, double-check that:

You’ve selected your Team.
Outgoing network connections are enabled.
The Increased Memory Limit entitlement is enabled, if you need it.

Model Size & Crashes: Large models might cause crashes if your device does not have enough memory. Try a smaller or more quantized model. Visit the MLX Community on Hugging Face to see many more MLX-compatible models.
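
To judge whether a model is likely to fit, a rough rule of thumb (an assumption of this sketch, not a figure from the MLX documentation) is that the weights take about parameters × bits per weight / 8 bytes. Runtime overhead such as the KV cache and activations comes on top of that, so treat the result as a lower bound. The function name `estimate_weights_gb` is hypothetical.

```shell
# Sketch: approximate size of model weights in GB (1 GB = 10^9 bytes).
estimate_weights_gb() {
  # $1: parameter count in billions; $2: bits per weight after quantization
  awk -v p="$1" -v b="$2" 'BEGIN { printf "%.2f\n", p * b / 8 }'
}
```

For a hypothetical 3.8-billion-parameter model, `estimate_weights_gb 3.8 4` prints 1.90 and `estimate_weights_gb 3.8 16` prints 7.60, which illustrates why 4-bit quantized models are the practical choice on a phone.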
