buck2
is very explicit re: the interactions dep files have w/caching:
- dep files can (after execution) narrow the set of inputs for an action
- the input set of the action as known to your
buck2
daemon will have this narrowed set but...- the actual action cache key (as is given to disk/remote caches, etc.) will still have the full set of inputs
- ultimately this means that as far as remote caching for cold builds is concerned, the action at hand is still keyed on all of its inputs — including the ones that are pruned
- this means that — for cold builds — if there have been any changes to any of the "pruned out" files since the last run of the action (+ cache population), the action will not hit in the cache
- additionally, in the event that there are analysis-level changes that result in the starlark action being reconstructed (i.e. BUILD file is edited), we also lose the narrowed set of inputs
- for changes that materially affect the produced action (i.e. new
.cc
file added to a target or even to a dependent of a target) this makes sense.. - but in cases where the produced
Action
is actually "identical" to the original (prior to pruning) action this is unnecessary and unfortunate
- for changes that materially affect the produced action (i.e. new
If we think about what's happening at the action/remote cache level, the behavior w.r.t. to cold builds makes sense.
As an example:
- Analysis produces action
A
A
has inputs{ 1, 2, 3, 4 }
A
producesU
(a list of unused inputs) andO
(an output file)- At this point
A
's cache key ishash({ 1 ..= 4 }, ...)
A
executes, yielding a narrowed list of inputs:{ 1 }
- i.e.
U
contains{ 2, 3, 4 }
- At this point
A
(as it exists in-memory in the daemon) uses the narrowed list of inputs — we'll call this modified version ofA
,B
- However, the remote cache is given
{ 1 ..= 4 } => { U, O }
instead of{ 1 } => { U, O }
- ...
- So long as there aren't changes invalidating
B
(i.e. changes to aBUILD
/BUCK
file, repo rule restart, etc.),B
will continue to remain in memory- However, if analysis runs again for any reason, (assuming there are no material changes to the analysis time constructs producing
A
),B
will be replaced with a freshly computedA
- However, if analysis runs again for any reason, (assuming there are no material changes to the analysis time constructs producing
- On future builds involving
B
, when Skyframe/DICE askB
whether it is up-to-date,B
consults its narrowed inputs list and thus isn't sensitive to changes to{ 2, 3, 4 }
- i.e.
NOTE: a side-effect of B
not being sensitive to changes in { 2, 3, 4 }
is that changes to those files will not trigger the addition of new cache entries...
Future cold builds where { 1, 2, 3, 4 }
are the same (and the action A
as yielded by analysis is the same..) will have a cache hit:
- they'll then download
U
, modifyA
such that it has the narrowed input list (i.e.A
becomesB
) and subsequent builds of the action will (subject to the same caveats above) not be sensitive to changes in{ 2, 3, 4 }
Cold builds where { 2, 3, 4 }
have changes will not get a cache hit.
This is unfortunate but I don't think there's a way around it? There's kind of a circular dependency here.
- consider: if we had entered our initial build of
A
into the cache as{ 1 } => { U, O }
this wouldn't solve the problem- future cold builds would still need to know to use that key for
A
instead of{ 1, 2, 3, 4 }
- future cold builds would still need to know to use that key for
- if we introduced a special cache blob for "unused inputs lists" and rewired our action execution machinery to query this cache + narrow inputs on actions before looking for actual cache hits, we still have an issue: what would this "unused inputs lists" cache be keyed on?
- to be clear:
unused_input_list_cache[ ??? ] = U
- if we key it on
{ 1, 2, 3, 4 }
that lines up nicely with the version ofA
we get out of analysis but... it's too sensitive!- changes to
{ 2, 3, 4 }
would cause us to miss the cache which defeats the point — we'd only get hits in this cache in cases where{ 2, 3, 4 }
haven't changed since the original run which is the status quo
- changes to
- if we key it on just
{ 1 }
, how do we know to use this key in a cold build?- also: if we already know to use this narrowed key,
U
has no extra information for us...
- also: if we already know to use this narrowed key,
- using only file paths instead of file contents for
{ 1, 2, 3, 4 }
as the key is tempting and seems like it sidesteps the over-sensitivity issue but...- this is only safe to do for files that were in the unused list!
- modifications to files that were used can easily affect the list of files that are used/unused and so we do want to be sensitive to changes to such files
- knowing which files we can ignore content changes for is... once again, equivalent to knowing the list of unused inputs
- ultimately we want some way of associating the action with the list of unused inputs
- fuzzy schemes that use the action's target label and such all have correctness issues
- the only canonical identifier we have for the action is its hash which by definition is too sensitive
- what we really want is a caching scheme that lets us express:
cache[action { inputs(specific 1, any 2, any 3, any 4), ... }]
- such that actions that are structurally identical but have different hashes for
2
,3
, and4
still match this cache entry - unfortunately I think ^ is fundamentally incompatible with merkle-tree based caching schemes
- we could have the lookup process start by querying for a "structure-only" version of the action (i.e. paths only, not content hashes) that then produces a manifest describing the sensitivity for each of the inputs (i.e. requires exact match or not)
- i.e.
structure_cache[A: action { cmd, inputs(path1, path2, path3, path4), ... }] = { sensitivity(S1, S2, S3, S4) }
- i.e.
component_sensitivity[S1] = exact_hash(...)
- i.e.
component_sensitivity[S2 .. S4] = unused
- i.e.
- but... this has some issues:
- expensive: many lookups
- the structure-only cache entries are not unique!
- as discussed, changes in inputs can very much affect the list of inputs that are unused or not
- this doesn't exactly pose a correctness issue — consumers would need to check that all the listed sensitivity items match before using the cache entry — but it's still annoying
- to account for this we'd really need to have the
structure_cache
in the above return a list of "sensitivity"s...- not clear how we "resolve" any particular one with hashing; seems like we'd have to test them in sequence (though we can sort and do clever things, etc.)
- bottom line is that it's expensive
- we could have the lookup process start by querying for a "structure-only" version of the action (i.e. paths only, not content hashes) that then produces a manifest describing the sensitivity for each of the inputs (i.e. requires exact match or not)
- the approach implemented by Bazel/buck2 works because it uses an out-of-band method of associating the action with its "narrowed" input list; i.e. relying on the fact that the action won't be replaced in memory until/unless its full input list/structure is modified
- to be clear:
note: actual dynamic deps sidesteps this issue
Note
You can see the pruning in action by running (without caching) bazel aquery
followed by bazel build
:
❯ bazel aquery //:b
action '//:b: a.txt -> bazel-out/k8-fastbuild/bin/b.out'
Mnemonic: Ex
Target: //:b
Configuration: k8-fastbuild
Execution platform: @local_config_platform//:host
ActionKey: f504c1367cbd0ab79ea4a3d6e8d099d32678fb86e994d9b9d5e68eba619ab6a4
Inputs: [a.txt, b.txt]
Outputs: [bazel-out/k8-fastbuild/bin/b.out]
Command Line: (exec cp \
b.txt \
bazel-out/k8-fastbuild/bin/b.out)
# Configuration: 7e9df8fa2f6dac8029396f42123b7f41cd174371128cf0a776032c570d3d69c8
# Execution platform: @@local_config_platform//:host
ExecutionInfo: {internal-inline-outputs: bazel-out/k8-fastbuild/bin/b.unused}
❯ bazel build //:b
Target //:b up-to-date:
bazel-bin/b.out
bazel-bin/b.unused
INFO: Elapsed time: 0.253s, Critical Path: 0.05s
INFO: 3 processes: 2 internal, 1 linux-sandbox.
INFO: Build completed successfully, 3 total actions
❯ bazel aquery //:b
action '//:b: b.txt -> bazel-out/k8-fastbuild/bin/b.out'
Mnemonic: Ex
Target: //:b
Configuration: k8-fastbuild
Execution platform: @local_config_platform//:host
ActionKey: f504c1367cbd0ab79ea4a3d6e8d099d32678fb86e994d9b9d5e68eba619ab6a4
Inputs: [b.txt] # !!!
Outputs: [bazel-out/k8-fastbuild/bin/b.out]
Command Line: (exec cp \
b.txt \
bazel-out/k8-fastbuild/bin/b.out)
# Configuration: 7e9df8fa2f6dac8029396f42123b7f41cd174371128cf0a776032c570d3d69c8
# Execution platform: @@local_config_platform//:host
ExecutionInfo: {internal-inline-outputs: bazel-out/k8-fastbuild/bin/b.unused}
Can verify the behavior described by running the follow sequences:
- unused inputs are respected:
- build
b
, modifya.txt
->b
should not be re-built
- build
- the cache key covers all inputs (including those that are pruned):
- build
a
,b
with a disk cache - do a
bazel clean --expunge
- modify
a
- build
b
-> should re-build
- build
- the unused inputs list is used from the cache, if available
- build
a
,b
with a disk cache - do a
bazel clean --expunge
- build
b
-> should hit in the cache - modify
a
- build
b
-> should not be re-built
- build
- the cache is not populated when there are changes to the unused inputs:
- build
a
,b
with a disk cache - modify
a
- build
b
-> should not be re-built - do a
bazel clean --expunge
, leavea
in its modified state - build
b
-> should rebuild (i.e. won't hit in the cache..)
- build
- the on-disk analysis cache includes the updated (pruned) input list:
- build
a
,b
- do a
bazel shutdown
(note: notclean --expunge
) - modify
a
- build
b
-> should not be re-built
- build
- changes to things that influence the key of
Action
cause the in-memory and on-disk action to be replaced with a freshly computed version:- ...
- note: in practice this is actually pretty good; the set of things that can influence an action's key is pretty tight
- also, the
Action
in memory retains its original (full input list) key which is exactly what we want re: comparing with newly constructed versions of thisAction
- see here
misc:
- https://github.com/bazelbuild/bazel/blob/9ca7c3666c2b54a75ea40f924b29fd0d4671c328/src/main/java/com/google/devtools/build/lib/actions/ActionCacheAwareAction.java#L16-L24
- https://github.com/bazelbuild/bazel/commit/e6cce762d846f3915b834c278fcb7fa0072ea530
- https://github.com/bazelbuild/bazel/commit/4784d735f5c13ab7ad0c4c1068c50a3a6a23f855
- codepath that ensures that — even when an
Action
has been executed and has had its inputs narrowed —Action
s use the full set of inputs for cache key computation (I think): https://github.com/bazelbuild/bazel/commit/9ca7c3666c2b54a75ea40f924b29fd0d4671c328- see: https://github.com/bazelbuild/bazel/blob/17f678b2bbbc59d36ba51ae6d7a86ab672214a2b/src/main/java/com/google/devtools/build/lib/skyframe/ActionExecutionFunction.java#L626-L653
- see: https://github.com/bazelbuild/bazel/blob/0a2aac4e3c6f5ef98fc233e2ce83e44930edb188/src/main/java/com/google/devtools/build/lib/actions/ActionCacheChecker.java#L768-L831
Also see "input discovery in Bazel"