@beccadax
Last active December 6, 2019 01:06

Concise magic file names

Introduction

Today, #file evaluates to a string literal containing the full path to the current source file. We propose to instead have it evaluate to a human-readable string containing the filename and module name, while preserving the existing behavior in a new #filePath expression.

Swift-evolution thread: We need #fileName

Motivation

In Swift today, the magic identifier #file evaluates to a string literal containing the "full path"[1] to the current file. It's a nice way to trace the location of logic occurring in a Swift process, but its use of a full path has a lot of drawbacks:

  • It clutters the debug output with irrelevant information. The path is usually very long and only a little bit of that information is necessary to locate the file in question. In a hundred-character path, the developer usually only cares about the last ten or twenty.

  • It's not portable. The same project may be located at different paths on different machines; a developer looking at a crash log doesn't care about a path on a build server.

  • It can inadvertently reveal private or sensitive information. The full path to a source file may contain a developer's username, hints about the configuration of a build farm, proprietary versions or identifiers, or the Sailor Scout you named an external disk after. Users probably don't know that this information is embedded in their binaries and may not want it to be there.

  • It bloats the final size of the binary. In testing with the Swift benchmark suite, a shorter #file string reduced code size by up to 5%. The larger code also hurts runtime performance; in the same tests, a couple dozen benchmarks ran noticeably faster with the shorter string, with several taking 22% less time.

  • It introduces artificial differences between binaries built on different machines. For instance, the same code built in two different environments might produce different binaries with different hashes. This makes life difficult for anyone trying to do distributed builds or find the differences between two binaries.

[1] Specifically, the "full path" is the path passed to the Swift compiler by the build system. This might be relative or absolute. Xcode and SwiftPM both pass absolute paths, but you might see a relative path in current Swift's #file if you invoked swift foo.swift on the command line or if you built a Swift project with the Bazel build system.

Situations where the full path is needed

While the full path is not needed when printing messages for the developer, some uses of #file do rely on it. In particular, Swift tests sometimes use #file to compute paths to fixtures relative to the source file that uses them. This has historically been necessary in SwiftPM because it did not support resources, but SE-0271 has added that feature and there is little need to resort to these tricks anymore.

An analysis of the 1,073 places where #file is written in the Swift Source Compatibility Suite suggests that well over 90% of uses would be better served by a #file that did not include a full path. However, we do need to make some concession to the small portion of uses that need a full path for some reason.

Methodology

We applied several regular expressions to all 108 projects in the Source Compatibility Suite to try to classify uses of #file.

980 uses matched patterns that we believe represent display to humans:

  • 419 uses matched a pattern for StaticString = #file; we take these to be default arguments that are eventually passed to StaticString-taking APIs like fatalError or XCTAssertEqual, since there is little other reason to use StaticString.

  • 281 uses matched patterns for <StaticString typealias> = #file where the project usually passes values of that type to APIs like fatalError or XCTAssertEqual.

  • 148 uses matched a pattern for String = #file, but also referenced #line on the same line. We take these to be attempts to capture a full source location for display to the user.

  • 132 uses matched a pattern for interpolations of #file; we take these to be interpolated into a string that is then displayed to a user.

41 uses matched patterns that we believe represent path computation:

  • 10 uses matched a pattern for String = #file, but did not have #line on the same line. We take these to be default arguments that will eventually be passed to String-taking file APIs like URL.init(fileURLWithPath:).

  • 31 uses matched a pattern for uses in parenthesized lists (but didn't match the interpolation pattern); we take these to be passed to file APIs.

52 uses did not match any of these patterns.
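
The classification above can be sketched roughly as follows. The proposal does not publish its actual regular expressions, so the patterns, the `FileUse` enum, and the `classify` function here are all hypothetical stand-ins that only illustrate the order in which the checks were described.

```swift
import Foundation

// Hypothetical categories mirroring the three groups in the methodology.
enum FileUse {
    case display          // value is shown to a human
    case pathComputation  // value is used to build another path
    case unclassified     // matched none of the patterns
}

// Assumed patterns; the real analysis may have used different ones.
func classify(_ line: String) -> FileUse {
    func matches(_ pattern: String) -> Bool {
        return line.range(of: pattern, options: .regularExpression) != nil
    }

    // `StaticString = #file` default arguments.
    if matches(#"StaticString\s*=\s*#file"#) { return .display }
    // String interpolation of #file.
    if matches(#"\\\(#file\)"#) { return .display }
    // `String = #file`: display if #line is on the same line, else path computation.
    if matches(#"String\s*=\s*#file"#) {
        return line.contains("#line") ? .display : .pathComputation
    }
    // #file inside a parenthesized list that is not an interpolation.
    if matches(#"\([^)]*#file[^)]*\)"#) { return .pathComputation }
    return .unclassified
}
```

This sketch omits the typealias case, which would need per-project knowledge of which names alias StaticString.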

We therefore estimate that about 6% (±3%) of uses actually want a full path so they can compute paths to other files, while 94% (±3%) would be better served by a more succinct string.
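
The proposal does not say how the interval was derived. One plausible reading, shown here purely as an assumption, is that the 52 unclassified uses could fall on either side, which bounds the path-computation share from below and above:

```swift
// Counts reported in the methodology above.
let totalUses = 1_073
let pathUses = 41        // matched a path-computation pattern
let unclassified = 52    // matched no pattern

// Assumption: every unclassified use could belong to either category.
let lowShare = Double(pathUses) / Double(totalUses)
let highShare = Double(pathUses + unclassified) / Double(totalUses)
let midpoint = (lowShare + highShare) / 2
let halfWidth = (highShare - lowShare) / 2

/// Converts a fraction to a percentage rounded to one decimal place.
func percent(_ fraction: Double) -> Double {
    return (fraction * 1000).rounded() / 10
}

print("path computation: \(percent(lowShare))% to \(percent(highShare))%")
print("midpoint \(percent(midpoint))%, half-width \(percent(halfWidth))%")
```

Under that assumption the share lies between roughly 3.8% and 8.7%, a midpoint of about 6.2% with a half-width of about 2.4%, which is consistent with the rounded figures quoted above.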

A manual check of 172 uses in 16 projects suggested that about 95% displayed the #file value to the user; this is in line with the regex-based estimate.

Proposed solution

We propose changing the string #file evaluates to. To preserve implementation flexibility, we do not specify the exact format of this string; we merely specify that it must:

  1. Be likely to uniquely identify the source file among all files whose code is present in a given process.
  2. Be easy for a human to read and map to the matching source file.
  3. Not contain the full path to the source file.

#file will otherwise behave as it did before, including its special behavior in default arguments. Standard library assertion functions will continue to use #file, and we encourage developers to use it in test helpers and most other places where they use #file today.
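
As an illustration of the default-argument behavior, here is a minimal sketch of a helper that captures its caller's location. The names `describeCallSite` and `caller` are hypothetical and not part of the proposal.

```swift
// When `file` and `line` are omitted, #file and #line are evaluated at the
// call site, not at the declaration of the helper.
func describeCallSite(file: StaticString = #file, line: UInt = #line) -> String {
    return "\(file):\(line)"
}

// The returned string identifies this function's file and the line of the
// call below, because the defaults are filled in where the call is written.
func caller() -> String {
    return describeCallSite()
}

print(caller())
```

Under this proposal the helper needs no source changes; only the text of the file portion becomes shorter.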

For those rare cases where developers actually need a full path, we propose adding a #filePath magic identifier with the same behavior that #file had in previous versions of Swift. That is, it contains the path to the file as passed to the Swift compiler.
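
For example, a test that locates fixtures next to its source file might be adapted along these lines. This is a sketch that assumes a compiler implementing the proposed #filePath; the `fixtureURL` helper and the `Fixtures` directory name are hypothetical.

```swift
import Foundation

/// Hypothetical helper: builds the URL of `Fixtures/<name>` in the same
/// directory as the calling source file.
func fixtureURL(named name: String, sourcePath: String = #filePath) -> URL {
    return URL(fileURLWithPath: sourcePath)
        .deletingLastPathComponent()          // directory containing the source file
        .appendingPathComponent("Fixtures")   // assumed fixture directory name
        .appendingPathComponent(name)
}
```

A call such as `fixtureURL(named: "input.json")` from a test file then resolves against that file's path as it was passed to the compiler, so it is relative whenever the build system passes relative paths.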

Detailed design

We do not specify the exact string produced by #file. In today's compiler, the filename and module name are sufficient to uniquely identify a file[2], so a string containing those two pieces of information should be unique within the process.

In a module named MagicFile and a file named NNNN-magic-file.swift at /Users/brent/Desktop, the prototype implementation currently produces output like this:

print(#file)     // => "NNNN-magic-file.swift (MagicFile)"
print(#filePath) // => "/Users/brent/Desktop/NNNN-magic-file.swift"

fatalError("Something bad happened!")
// => "Fatal error: Something bad happened!: file NNNN-magic-file.swift (MagicFile), line 1"

However, future versions of Swift may change the format of this string.
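
To make the prototype's format concrete, the following sketch reproduces it from a path and a module name. It is not the compiler's implementation, and `conciseFileString` is a hypothetical name used only for illustration.

```swift
/// Hypothetical reconstruction of the prototype's "<file name> (<module>)" format.
func conciseFileString(path: String, module: String) -> String {
    // Keep only the last path component; fall back to the whole string
    // if the path contains no separators.
    let fileName = path.split(separator: "/").last.map(String.init) ?? path
    return "\(fileName) (\(module))"
}

print(conciseFileString(path: "/Users/brent/Desktop/NNNN-magic-file.swift",
                        module: "MagicFile"))
```

Because the format is deliberately unspecified, code should treat the #file string as opaque text for display rather than parse it back into a file name and module name.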

[2] This is sufficient to uniquely identify a file because the Swift compiler will not build a module which contains two identically-named source files, even if they're in different directories. This limitation ensures that identically-named private and fileprivate declarations in different files will have unique mangled names.

A future version of the Swift compiler could lift this limitation. If this happens, not fully specifying the format of the #file string preserves flexibility for that version to adjust its #file strings to, for instance, contain enough parent directories to distinguish between the files.

Disabling #filePath

Although it is not technically part of this proposal, we are considering adding a new compiler flag which privacy- or security-conscious developers can use to disable #filePath in some fashion.

Source compatibility

All existing source code will continue to compile, but the compiler will generate different strings for #file expressions. We anticipate that this will change the behavior of a small amount of existing code in non-trivial ways. However, we believe that this will most heavily impact tests and test support libraries, resulting in easily detected test failures rather than hidden bugs, and that adding #filePath makes these failures easy to correct.

Effect on ABI stability

None. #file is a compile-time feature.

Effect on API resilience

Compilers that do not support #filePath will not be able to compile swiftinterface files that have adopted it. The module stability design is not intended to avoid this kind of breakage, but it's still unfortunate. To mitigate it, we may want to roll out #filePath (and #fileName, if the alternative including it is accepted) as simple aliases for #file in Swift 5.2, and then make the other changes proposed here in the release after that.

Alternatives considered

Deprecate #file and introduce two new syntaxes

Rather than changing the meaning of #file, we could keep its existing behavior, deprecate it, and provide two alternatives:

  • #filePath would continue to use the full path.
  • #fileName would use this new name-and-module string.

This is a more conservative approach that would avoid breaking any existing uses. We choose not to propose it for three reasons:

  1. The name #fileName is misleading because it sounds like the string only contains the file name, but it also contains the module name. #file is more vague, so we're more comfortable saying that it's "a string that identifies the file".

  2. This alternative will force users to update every use of #file to one or the other option. We feel this is burdensome and unnecessary given how much more frequently the #fileName behavior would be appropriate.

  3. This alternative gives users no guidance on which feature they ought to use. We feel that giving #file a shorter name gives users a soft push towards using it when they can and resorting to #filePath only when necessary.

However, if the core team feels that changing #file's behavior is too radical for our source stability guarantees, this option exists and would not be difficult to implement.

Support more than two #file variants

We considered introducing additional #file-like features to generate other strings, selecting between them either with a compiler flag or with different magic identifiers. The full set of behaviors we considered included:

  1. Path as written in the compiler invocation
  2. Guaranteed-absolute path
  3. Path relative to the Xcode SOURCE_DIR value, or some equivalent
  4. Last component of the path (file name only)
  5. File name plus module name
  6. Empty string (sensible as a compiler flag)

We ultimately decided that supporting only 1 (as #filePath) and 5 (as #file) would adequately cover the use cases for #file. Five different syntaxes would devote a lot of language surface area to a small niche, and controlling the behavior with a compiler flag would create six language dialects that might break some code. Some of these behaviors would also require introducing new concepts into the compiler or would cause trouble for distributed build systems.

Make #filePath always absolute

While we're looking at this area of the language, we could change #filePath to always generate an absolute path. This would make #filePath more stable and useful, but it would cause problems for distributed builds unless it respected -debug-prefix-map or something similar. It would also mean that there'd be no simple way to get the exact same behavior as Swift previously provided, which would make it more difficult to adapt code to this change.

Ultimately, we think changing to an absolute path is severable from this proposal and that, if we want to do this, we should consider it separately.

Other alternatives

We considered introducing a new alternative to #file (e.g. #fileName) while preserving the existing meaning of #file. However, a great deal of code already uses #file and would in practice probably never be converted to #fileName. The vast majority of this code would benefit from the new behavior, so we think it would be better to automatically adopt it. (Note that clang supports a __FILE_NAME__ alternative, but most code still uses __FILE__ anyway.)

We considered switching between the old and new #file behavior with a compiler flag. However, this creates a language dialect, and compiler flags are not a natural interface for users.

Finally, we could change the behavior of #file without offering an escape hatch. However, we think that the existing behavior is useful in rare circumstances and should not be totally removed.
