Raw notes are at the top
Human-readable blog post is in the middle
`llbuild` is the build system used by Xcode and the Swift Package Manager.
Understanding `llbuild` will allow you to understand how Xcode builds your app.
Run this in your terminal:
defaults write com.apple.dt.XCBuild EnableBuildDebugging -bool YES
This will tell `XCBuild` to enable build debugging, which will in turn tell `llbuild` to enable build debugging.
You'll want to turn this off when you're done as it will produce build artifacts that take up a fair amount of disk space if you're building many times a day, every day.
- `manifest.xcbuild` - This is the build file that describes your build as a YAML file with a bunch of JSON in it.
- `build.db` - This is an SQLite database that holds build information. Essentially, this is the cache behind the build that enables incremental compilation. It stores cache keys for the artifacts that were built previously, which are stored in the derived data folder.
- `build.trace` - The trace file is the log of what `llbuild` actually did as it executed `manifest.xcbuild`.
The `manifest.xcbuild` file describes your build.
It has a few different parts: `client`, `target`, `nodes`, and `commands`.
I've not seen `client` or `target` contain particularly useful information. `nodes` and `commands` are where it's at.
You can find the source documentation for each of these things here, but I'll cover them in my own words now.
A `node` represents some input or output of the build process. Typically, this is a file.
Nodes can have attributes.
By default, if a node ends in a `/`, then it is considered a directory.
By default, if a node matches the pattern `<.*>` (where `.*` is the regex for "any number of characters"), like `<link>`, then it is considered "virtual" and is assumed to not represent a file.
If a node has the attribute `is-command-timestamp`, then it will be used to hold the timestamp when a command was completed. This timestamp can be used to see if a command that was run in a previous build needs to be re-run.
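To make the two default naming rules concrete, here's a minimal sketch in Python. The `classify_node` helper and its return labels are my own invention for illustration, not an llbuild API, and checking the trailing `/` before the `<.*>` pattern is an assumption.

```python
import re

# Hypothetical helper (not an llbuild API): classify a node by its name alone,
# following the two default naming rules described above.
VIRTUAL_NAME = re.compile(r"<.*>")

def classify_node(name: str) -> str:
    """Return "directory", "virtual", or "plain" for a node name."""
    if name.endswith("/"):
        return "directory"
    if VIRTUAL_NAME.fullmatch(name):
        return "virtual"
    return "plain"

for example in ["/tmp/DerivedData/Objects/", "<link>", "/tmp/DerivedData/main.o"]:
    print(example, "->", classify_node(example))
```

Explicit attributes on a node in the manifest would override these defaults, which this sketch doesn't model.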
A `command` is some task to be run by the build system. This can be something simple like "copy the contents of this framework somewhere else" (`ProcessXCFramework`), or "compile these swift files" (`Debug:CompileSwiftSources`).
A command has a `tool`, `description`, probably `inputs` and `outputs`, and then a few other attributes depending on what `tool` is specified.
There are a few built in tools listed here.
Here are all the tools used by a build I just ran:
phony
mkdir
shell
stale-file-removal
auxiliary-file
copy-plist
copy-strings-file
create-build-directory
embed-swift-stdlib
file-copy
info-plist-processor
process-product-entitlements
process-xcframework
register-execution-policy-exception
The `inputs` and `outputs` parts of `commands` are quite important - they determine whether the command should be run, or whether the output from a previous run of the command is sufficient and we can all save a bunch of time by skipping it. This is what's known as "incremental compilation" or "compilation avoidance".
In the parlance of `llbuild`, a `command` creates a `rule`, which creates a `task`.
`commands` will always be run unless the output they produce is the same from one build to the next.
`rules` will run if their "signature" has changed from one build to the next OR any of their inputs have different "file info" from one build to the next.
A `Shell` command is the type of command that will compile your source code. Since compiling source code is probably the biggest thing you want to avoid, let's take a look at what causes `Shell` commands to be re-run on an incremental build.
Firstly, a `Shell` command is a subclass of `ExternalCommand` - you can find more information on when an `ExternalCommand` is run by looking at the source code here.
An `ExternalCommand` is run if:
- it's deemed to be `always out of date`, which is an attribute that can be specified on a `command` within the `build file`.
- the prior value wasn't "successful", where "successful" means that the command was 1. executed and 2. for every output specified for the command, that output's `stat` information is recorded
- for each output of a command
  - if the output is "virtual", don't consider that output
  - if the output has different "information" than before, run the command. "information" is the result of the unix `stat` command and is retrieved here. It includes a file's `st_dev, st_ino, st_mode, st_size, st_mtime` - you can read more about what each of those mean here.
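The three conditions above can be sketched in Python like this. It's a rough model under my own assumptions, not llbuild's implementation: `PriorResult`, `file_info`, and `needs_to_run` are hypothetical names, and a "virtual" output is detected purely by the `<...>` naming convention.

```python
import os
from dataclasses import dataclass, field
from typing import Dict, Optional, Sequence, Tuple

FileInfo = Tuple[int, int, int, int, int]

def file_info(path: str) -> Optional[FileInfo]:
    """Return (st_dev, st_ino, st_mode, st_size, mtime in ns), or None if the path is missing."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return None
    return (st.st_dev, st.st_ino, st.st_mode, st.st_size, st.st_mtime_ns)

@dataclass
class PriorResult:
    """Hypothetical stand-in for what the previous build recorded about a command."""
    successful: bool
    output_infos: Dict[str, Optional[FileInfo]] = field(default_factory=dict)

def needs_to_run(outputs: Sequence[str],
                 prior: Optional[PriorResult],
                 always_out_of_date: bool = False) -> bool:
    if always_out_of_date:
        return True
    if prior is None or not prior.successful:
        return True
    for out in outputs:
        if out.startswith("<") and out.endswith(">"):
            continue  # virtual outputs are not files, so they are skipped
        if file_info(out) != prior.output_infos.get(out):
            return True  # the output's stat information differs from last time
    return False
```

Note that nothing here reads file contents: touching an output, or replacing it with a file of the same contents but a new inode, is enough to make the command run again.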
If a `command` is run (which is controlled by the `outputs` for the command), it will create a new `rule`. The rule may or may not be run based on its `inputs`.
BuildEngine#scanRule
- Check if it needs to run ("checking-rule-needs-to-run")
  - Does it need to run because the signature changed? ("signature-changed") `ruleInfo.rule->signature != ruleInfo.result.signature`
    - `ruleInfo.rule` is from this run
    - `ruleInfo.result` is from last run
      - how are rules identified between runs? do they have a stable ID?
        - they are identified through their KeyID
        - a KeyID is a sequential database ID (integer) that maps to an actual key (the string representation of a rule, like "N/Users/.../GraphQLSchema.swift") through the `key_names` table
      - how are rules created? `BuildEngine#addRule`
        - a rule is looked up from the db (`SQLiteBuildDB#lookupRuleResult`)
          - if it's found, return it
          - if it's not, don't do anything and allow the newly created rule to be empty
    - Where do these rule signatures get computed?
      - `BuildSystem#lookupRule` is where new rules get their values (including their signature)
        - `BuildKey::Kind::Command` - `command->getSignature()`
        - `BuildKey::Kind::Node` - `node->getSignature()`
      - Only `Commands` and `Nodes` have signatures
    - What's in a `signature`?
      - `BuildKey::Kind::Command` - `command->getSignature()`
        - combine the name of the command, the names of all the inputs to the command, the names of all the outputs, whether or not the command allows missing inputs, whether or not the command allows modified outputs, and whether or not the command is always out of date
      - `BuildKey::Kind::Node` - `node->getSignature()`
        - combine the type of the node with the names of all the producers for the node
          - what is the type of a node? `BuildNode.h` - Plain, Directory, DirectoryStructure, Virtual
          - what's a producer for a node?
            - a producer is a Command that can produce a node
    - So based on the above, the signature is not going to change when you, e.g., change the contents of a file. It will change when you rename a file, or add or remove one. It's more of a structural thing that will only ever change when files or dependencies are added or removed.
  - Does it need to run because it's "invalid"? ("invalid-value")
    - How does this differ from the signature check above?
      - The signature is computed based on file names, compiler arguments, and the like
      - Whether or not a rule is valid depends on whether the contents or modification times of the files involved have changed since last time
    - `BuildKey::Kind::Command` - `CommandTask::isResultValid(engine, *command, BuildValue::fromData(value))`
      - Delegate to the command - `command.isResultValid(buildSystem.getBuildSystem(), value)`
        - See above for this chain of events - checking if a command is valid is essentially checking if the outputs are all there
    - `BuildKey::Kind::Node` - `FileInputNodeTask::isResultValid(engine, *node, BuildValue::fromData(value))`
      - `BuildSystem.cpp#FileInputNodeTask#isResultValid`
        - compares "value" to current file "info"
          - value
            - is stored in the db
            - where's it created?
              - anywhere `getFileInfo` is called
          - info
            - `BuildInfo.cpp#BuildNode::getFileInfo`
            - `FileSystem.cpp#FileInfo getFileInfo`
            - `FileInfo.cpp#FileInfo::getInfoForPath`
            - { device: st_dev (device the file's inode resides on), inode: st_ino (inode number), mode: st_mode (inode protection mode), size: st_size (size, in bytes, of the file), modTime.seconds: st_mtim.tv_sec (time of last file modification in seconds), modTime.nanoseconds: st_mtim.tv_nsec (time of last file modification in nanoseconds) }
  - Does it need to run because the input has been rebuilt? ("input-rebuilt")
    - The system that enqueues the rules for running (`BuildEngine.cpp#executeTasks`) is basically a big while loop. It goes like this:
      - While there are new rules in the queue, pick the top one off the stack
        - If the rule is picked up but has not finished scanning, skip it and pick up the next rule ("rule-scanning-deferred-on-input")
        - Do all the checks above (signature changed, invalid value)
        - Check to make sure that the rule has all the inputs it needs to run
          - Inputs from previous builds are still valid (this is how caching/derived data works)
          - If it doesn't, enqueue new rules to scan all its inputs and then mark the rule as "deferred" and put it back in the queue
            - The trace includes "rule-scanning-deferred-on-task" when this happens
            - When the inputs are all scanned and built and the task is picked up again, it needs to be run, and the trace reports "input-rebuilt"
          - If it does, run it
  - Do any of the inputs for a rule need to be run? ("rule-does-not-need-to-run")
    - If nothing depends on the rule, or all the inputs for a rule have already been run, it does not need to be run.
  - If the rule passes all the tests above and does actually need to run, enqueue rule for scanning (aka task execution in `BuildEngine.cpp#executeTasks`) ("rule-scheduled-for-scanning")
- Scan / Execute the rule (`BuildEngine.cpp#executeTasks`)
  - For every rule enqueued for scanning `BuildEngine.cpp#processRuleScanRequest`
    - For every input for the rule
      - "Check if it needs to run" (see above). If it does, scan it.
      - If the rule has been scanned already and does in fact need to run, scan the input ("rule-scanning-next-input") `BuildEngine.cpp#demandRule`
        - Assert that the rule has already been scanned
        - If the rule is "complete", exit
        - If the rule is "in progress" already, exit
        - If the rule doesn't actually need to run ("rule-does-not-need-to-run" from before), mark it "complete" and exit
        - If all of those checks don't cause an early exit, create a new `task` for this `rule` ("created-task-for-rule") `BuildSystem.cpp#createTask`
          - the type of `task` created for each rule is determined in `BuildSystem.cpp#BuildSystemEngineDelegate::lookupRule`
          - the most relevant `tasks` for investigating build slowness are probably `VirtualInputNodeTask`, `DirectoryInputNodeTask`, `DirectoryStructureInputNodeTask`, `FileInputNodeTask`, `ProducedNodeTask`
        - Start the task by enqueuing it into a queue called `readyTaskInfos`
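The order of the checks in the `scanRule` notes above can be condensed into a small Python sketch. This is my simplified mental model, not the engine's code: the `Rule` class and its field names are hypothetical, validity is passed in as a precomputed boolean, and inputs are scanned by plain recursion, whereas the real engine uses a work queue and compares timestamps after the inputs are built.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Rule:
    """Hypothetical, simplified stand-in for a rule plus its stored result."""
    key: str
    signature: int                   # computed for this build
    prior_signature: Optional[int]   # stored by the last build, None if never built
    built_at: int = 0                # build iteration in which it last ran (0 = never)
    computed_at: int = 0             # build iteration in which its value last changed
    is_valid: bool = True            # result of the stat-based validity check
    inputs: List["Rule"] = field(default_factory=list)

def why_rule_needs_to_run(rule: Rule) -> Optional[str]:
    """Return the trace reason a rule must run, or None if it can be skipped."""
    if rule.built_at == 0:
        return "never-built"
    if rule.signature != rule.prior_signature:
        return "signature-changed"
    if not rule.is_valid:
        return "invalid-value"
    for inp in rule.inputs:
        # Simplification: an input that has to run is treated as rebuilt.
        if why_rule_needs_to_run(inp) is not None or inp.computed_at > rule.built_at:
            return "input-rebuilt"
    return None
```

The useful takeaway is the ordering: structural changes (signature) are checked before file metadata (validity), and inputs are only consulted when the rule itself looks fine.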
`BuildEngine.cpp#executeTasks` is the thing that writes the rule results to the db with `setRuleResult`
`BuildEngine.cpp#taskIsComplete` is where task results (signatures and values and computed_at) are recorded
- `new-task` - when a new task is created (`BuildEngine.cpp#ruleInfo.rule->createTask`)
- `new-rule` - when a new rule is created (`BuildSystem.cpp#new BuildSystemRule`)
- `build-started` - when the build is started (`BuildEngine.cpp#trace->buildStarted()`)
- `handling-build-input-request` - when a build input request is handled from `BuildEngine.cpp#inputRequests`. A "build input request" is a request made by a `rule` to do some work
- `created-task-for-rule` - when a rule creates a task to do some work on its behalf
- `handling-task-input-request` - when a task input request is handled from `BuildEngine.cpp#inputRequests`. A "task input request" is a request made by a `task` to do some work
- `paused-input-request-for-rule-scan` - when a rule is scanned, but already marked as "pending scan", so it's skipped and not scanned twice
- `readying-task-input-request` - when a rule's inputs are computed/completed and the work that the rule represents is enqueued
- `added-rule-pending-task` - when a rule's inputs are not computed/completed and the work that the rule represents is attempted to be enqueued (but fails because its inputs are not ready)
- `completed-task-input-request` - when a rule is dequeued after it's been enqueued by "readying-task-input-request"
- `updated-task-wait-count` - when a task is no longer waiting on an input (tasks wait on all their inputs before they're run)
- `unblocked-task` - when a task is no longer waiting on any inputs (happens right after "updated-task-wait-count")
- `readied-task` - when a task is dequeued from the `readyTaskInfos` queue and is ready to run. The `readyTaskInfos` queue contains tasks that are waiting on no inputs
- `finished-task` - when a task is dequeued from `finishedTaskInfos`. Tasks are placed on this queue when they are completed. A task is "changed" if its value was computed in the current build (and not pulled from a prior build).
- `build-ended` - when the build ends
- `checking-rule-needs-to-run` - when a rule is scanned to determine whether or not it needs to be run
- `rule-scheduled-for-scanning` - when it is determined that a rule needs to be run and it is enqueued for processing (where its inputs are checked to make sure it's ready to run, then it's executed)
- `rule-scanning-next-input` - while a rule is processed, when one of its inputs is retrieved and enqueued for scanning, and has been scanned already
- `rule-scanning-deferred-on-input` - while a rule is processed, when one of its inputs is retrieved, has not been scanned, and is therefore enqueued for scanning
- `rule-scanning-deferred-on-task` - while a rule is processed, when one of its inputs is retrieved, has been scanned already, but the task representing that input has not been completed
- `rule-needs-to-run, never-built` - when a rule is scanned and has not been run and therefore is marked as "needs to run"
- `rule-needs-to-run, signature-changed` - when a rule is scanned and the file associated with the rule for this run has a different `signature` than that of the previous cached build
- `rule-needs-to-run, invalid-value` - when a rule is scanned and the file associated with the rule for this run has a different `stat` output (file modification time and other file metadata) than that of the previous cached build
- `rule-needs-to-run, input-missing` - this is a possible trace output, but it isn't currently used anywhere
- `rule-needs-to-run, input-rebuilt` - when the rule has been computed at a certain time, but has an input that's been computed more recently
- `rule-does-not-need-to-run` - if the rule has no dependencies
- `cycle-force-rule-needs-to-run` - force a rule to be run in order to break a build cycle `llbuild` has detected
- `cycle-supply-prior-value` - when a rule is forced to be run in order to break a build cycle and the value from the previous build is set as the rule result
A "signature" is computed by hashing some inputs with `hash_combine`, like in `ExternalCommand#getSignature`
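Here's a small Python sketch of why a signature is "structural". It is not llbuild's `hash_combine`: I've swapped in SHA-256 over a tuple of names and flags so the result is deterministic, and `command_signature` is a hypothetical function that only mirrors the list of ingredients from the notes above.

```python
import hashlib

def command_signature(name, inputs, outputs,
                      allow_missing_inputs=False,
                      allow_modified_outputs=False,
                      always_out_of_date=False):
    """Hash the *names* and flags of a command; file contents are never read."""
    structure = (name, tuple(inputs), tuple(outputs),
                 allow_missing_inputs, allow_modified_outputs, always_out_of_date)
    return hashlib.sha256(repr(structure).encode("utf-8")).hexdigest()

before = command_signature("CompileSwiftSources", ["A.swift", "B.swift"], ["A.o", "B.o"])
# Editing the contents of B.swift changes nothing the function looks at.
after_edit = command_signature("CompileSwiftSources", ["A.swift", "B.swift"], ["A.o", "B.o"])
# Renaming B.swift to C.swift changes the list of input and output names.
after_rename = command_signature("CompileSwiftSources", ["A.swift", "C.swift"], ["A.o", "C.o"])

print(before == after_edit)
print(before == after_rename)
```

So a "signature-changed" line in the trace points at a change in the shape of the build (files or flags added, removed, or renamed), never at an edit inside a file.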
Questions left to answer:
- What's the difference between the BuildEngine and the BuildSystem?
- The BuildEngine seems to be the thing doing things - enqueuing commands, rules, tasks, telling them to run, printing trace output
- The BuildSystem seems to be the thing that contains the logic for all the parts - "what does this type of task do?", "how does this rule know it needs to run"?
Things left to do:
- Approach this document as if I didn't know anything and I had a goal: I notice my iOS build is taking too long, what can I do about it?
- Maybe create an architecture diagram of how all the data moves around and where it's stored
Xcode is a powerful tool that allows developers to create amazing apps, games, and full-featured applications. I've used it as a mobile developer in the past and now as a mobile infrastructure engineer. Most of the time it does its job without complaint, but sometimes it doesn't, and it gives little indication as to why.
I've explored the nitty-gritty of how Android apps get built efficiently, and here I'll document the same for llbuild, the build system used by Xcode.
llbuild was integrated into Xcode fairly recently (Xcode 10, 2018) and is still sometimes referred to as "the new build system". The next version is in the works, but that's probably a long way out and will change a fair bit before it's used.
Both build systems for Android and iOS (Gradle and llbuild) operate with similar fundamental concepts. Gradle operates on "tasks" with "inputs" and "outputs". llbuild uses "commands", "rules", and "tasks" - all of which also have "inputs" and "outputs". In order for builds to operate efficiently, the inputs and outputs from a build are recorded and stored. On subsequent builds those outputs are reused if the inputs haven't changed. That's the basic idea behind incremental compilation, also called compilation avoidance.
So if a clean build takes 10 minutes, subsequent builds with no changes should hypothetically run in seconds. The power of caching!
But what happens when the dream of incremental compilation doesn't come true? Up until recently I chalked it up to "just the way things are". Today, though, let's dive in and figure out what's really going on and how we can reach the promised land of build system performance and 0-second incremental compilation.
Before we get into the details, let's find a part of the build to investigate. For this I recommend Xcode's "build with timing summary" or the third-party tool XCLogParser. A ton has already been written on how to use these tools to find slow aspects of a build, so I won't duplicate that here - a quick search will turn up some great guides.
Now we'll get some files in front of us and explore what they do and how they're structured.
Firstly, run this in your terminal: `defaults write com.apple.dt.XCBuild EnableBuildDebugging -bool YES`.
That will tell Xcode to enable build debugging output, which will tell llbuild to do the same. Once you're done debugging your builds, change that `YES` to a `NO` and rerun the command. The artifacts that get produced are a bit less than 100 MB per run, which can really add up if you're building all day every day.
Now, build your project with Xcode. The build logs (View > Navigators > Reports) will tell you about a few new files:
- `manifest.xcbuild` - This is the build file that describes your build as a YAML file with a bunch of JSON in it. It's fed to the build system as a set of instructions.
- `build.db` - This is an SQLite database that holds prior build metadata. It's one half of the cache that allows incremental compilation. The other half is the actual files stored in DerivedData.
- `build.trace` - The trace file is the log of what `llbuild` actually did as it executed the instructions in `manifest.xcbuild`.
Now, let's look at those files and investigate what's going on. Open up the `manifest.xcbuild` and `build.trace` in your text editor of choice (careful - they're big!). You can find the exact location of those files by looking at the build log file in the Report Navigator.
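`build.db` isn't a text file, but since it's SQLite you can poke at it with a few lines of Python. The sketch below lists rule keys that mention a given file. It assumes a `key_names` table with `id` and `key` columns: the table name comes from my notes, but the column names are an assumption, so check the real schema first (for example with `.schema` in the `sqlite3` shell). `list_rule_keys` is a hypothetical helper name.

```python
import sqlite3

def list_rule_keys(db_path, needle=""):
    """Return (id, key) rows from key_names whose key contains `needle`.

    Assumes a table key_names(id, key); the column names are a guess.
    """
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute(
            "SELECT id, key FROM key_names WHERE key LIKE ? ORDER BY id",
            (f"%{needle}%",),
        ).fetchall()
    finally:
        conn.close()

# Example (run against a copy of the database, not the live one):
#   for key_id, key in list_rule_keys("build.db", "GraphQLSchema.swift"):
#       print(key_id, key)
```

Working on a copy keeps you from holding a lock on the file Xcode is using.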
This next part will be fairly freeform - you're going to need to put on your detective hat and explore:
- Start by taking a look at the `build.trace` and try and find any log lines related to the slow aspect of the build you're investigating.
Eventually you should find a line that mentions the framework you're interested in and whatever operation is taking a long time - maybe the string `:Debug:CompileSwiftSources` if you're curious about why the framework is being recompiled. Try and find the line that describes the entire slow operation you're interested in and not just one part, i.e. compiling an entire framework rather than an individual file.
- After you've located the slow operation, the next question to answer is "why did this happen?" To answer this, just employ the standard process you use every day to debug regular programs: start from the observed behavior, walk backwards to find its cause, and repeat until you find the root of the problem.
As you're working through the trace, use the handy glossary at the bottom to understand what each line means.
Eventually you'll reach a trace line that makes you say "huh, I didn't modify that file!" or "why'd that change?". What you do next will totally depend on the aspect of the build you're investigating, the cause of the slowness, and so on. I'll leave that up to you!
Assuming you've followed the steps above, you've walked backwards from the observed slow behavior and found its root cause, whether that's an errant file modification, mis-configured build settings, or something else. Hopefully you've been able to remedy that root cause and are on your way to faster, more consistent builds.
Remember that you don't have to tackle every single slowdown all at once! Any non-trivial Xcode project will have multiple steps that can be improved or optimized.
Good luck, and keep working towards 0-second incremental builds in Xcode!
{ "new-rule", "R7897", "N/Users/blah/foo/bar/file.json" }
The first element is the "trace keyword" - it describes what this trace line is doing
The second element is the rule ID, formatted something like R###
The third element is the rule key, and it usually looks like the path to a file. The N at the front is a marker for the build system. It might be a different capital letter sometimes.
{ "rule-scanning-next-input", "R7853", "R7854" }
The first element is the trace keyword
The second element is a rule ID for the rule that's being scanned
The third element is a rule ID for the input of the rule that's being scanned.
Think of it this way: the third element is an input to the second, and therefore to scan the second element, all its inputs must be scanned as well - that's what's happening here.
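Since the trace files are big, a short script can do the first pass of detective work. This Python sketch pulls the quoted elements out of each trace line, maps rule IDs to their keys using `new-rule` lines, and collects the reasons rules had to run. It assumes `rule-needs-to-run` lines put the rule ID second and the reason third (for example `{ "rule-needs-to-run", "R7897", "invalid-value" }`), which is my reading of the format rather than something documented here. The function names are hypothetical.

```python
import re
from collections import defaultdict

# Matches one double-quoted element, allowing backslash escapes inside it.
TOKEN = re.compile(r'"((?:[^"\\]|\\.)*)"')

def parse_trace_line(line):
    """Return the quoted elements of a trace line, e.g. ['new-rule', 'R7897', 'N/...']."""
    return TOKEN.findall(line)

def summarize(lines):
    """Map each rule key to the list of reasons it needed to run."""
    keys = {}                    # rule ID -> rule key
    reasons = defaultdict(list)  # rule ID -> reasons
    for line in lines:
        parts = parse_trace_line(line)
        if len(parts) >= 3 and parts[0] == "new-rule":
            keys[parts[1]] = parts[2]
        elif len(parts) >= 3 and parts[0] == "rule-needs-to-run":
            reasons[parts[1]].append(parts[2])
    # Fall back to the raw rule ID when no new-rule line was seen for it.
    return {keys.get(rule_id, rule_id): why for rule_id, why in reasons.items()}

# Example:
#   with open("build.trace") as trace:
#       for key, why in summarize(trace).items():
#           print(why, key)
```

Filtering the result for the framework or file you care about gets you to the "why did this happen?" step much faster than scrolling.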
- `new-task` - when a new task is created
- `new-rule` - when a new rule is created
- `build-started` - when the build is started
- `handling-build-input-request` - when a build input request is handled from the BuildEngine. A "build input request" is a request made by a `rule` to do some work
- `created-task-for-rule` - when a rule creates a task to do some work on its behalf
- `handling-task-input-request` - when a task input request is handled from the BuildEngine. A "task input request" is a request made by a `task` to do some work
- `paused-input-request-for-rule-scan` - when a rule is scanned, but already marked as "pending scan", so it's skipped and not scanned twice
- `readying-task-input-request` - when a rule's inputs are computed/completed and the work that the rule represents is enqueued
- `added-rule-pending-task` - when a rule's inputs are not computed/completed and the work that the rule represents is attempted to be enqueued (but fails because its inputs are not ready)
- `completed-task-input-request` - when a rule is dequeued after it's been enqueued by "readying-task-input-request"
- `updated-task-wait-count` - when a task is no longer waiting on an input (tasks wait on all their inputs before they're run)
- `unblocked-task` - when a task is no longer waiting on any inputs (happens right after "updated-task-wait-count")
- `readied-task` - when a task is dequeued from the `readyTaskInfos` queue and is ready to run. The `readyTaskInfos` queue contains tasks that are waiting on no inputs
- `finished-task` - when a task is dequeued from `finishedTaskInfos`. Tasks are placed on this queue when they are completed. A task is "changed" if its value was computed in the current build (and not pulled from a prior build).
- `build-ended` - when the build ends
- `checking-rule-needs-to-run` - when a rule is scanned to determine whether or not it needs to be run
- `rule-scheduled-for-scanning` - when it is determined that a rule needs to be run and it is enqueued for processing (where its inputs are checked to make sure it's ready to run, then it's executed)
- `rule-scanning-next-input` - while a rule is processed, when one of its inputs is retrieved and enqueued for scanning, and has been scanned already
- `rule-scanning-deferred-on-input` - while a rule is processed, when one of its inputs is retrieved, has not been scanned, and is therefore enqueued for scanning
- `rule-scanning-deferred-on-task` - while a rule is processed, when one of its inputs is retrieved, has been scanned already, but the task representing that input has not been completed
- `rule-needs-to-run, never-built` - when a rule is scanned and has not been run and therefore is marked as "needs to run"
- `rule-needs-to-run, signature-changed` - when a rule is scanned and the file associated with the rule for this run has a different `signature` than that of the previous cached build
- `rule-needs-to-run, invalid-value` - when a rule is scanned and the file associated with the rule for this run has a different `stat` output (file modification time and other file metadata) than that of the previous cached build
- `rule-needs-to-run, input-missing` - this is a possible trace output, but it isn't currently used anywhere
- `rule-needs-to-run, input-rebuilt` - when the rule has been computed at a certain time, but has an input that's been computed more recently
- `rule-does-not-need-to-run` - if the rule has no dependencies
- `cycle-force-rule-needs-to-run` - force a rule to be run in order to break a build cycle `llbuild` has detected
- `cycle-supply-prior-value` - when a rule is forced to be run in order to break a build cycle and the value from the previous build is set as the rule result
- Gist from Daniel Dunbar (works at Apple on build systems) describing how to turn on build debugging
- `swift-llbuild` repo
- Docs on `llbuild`
- Blog post on this whole thing