@fxn
Last active October 11, 2024 16:41

Ruby: The future of frozen string literals

What is a literal?

In programming languages, literals are textual representations of values in the source code. This is a syntactical concept.

Some examples:

7     # integer literal
'foo' # string literal
[]    # array literal

In contrast,

Math::PI
String.new
Array.new

are not literals.

In the case of Math::PI, while it may store a fixed number, syntactically that is a constant path, not a literal.

Literals and object allocations

Everything is an object in Ruby. Does Ruby create a new object when it encounters a literal?

It depends. There are three possibilities:

1. You get the same object for the same literal everywhere

This happens with nil, true, false, symbols, small integers (fixnums), and others:

p 7.object_id # => 15
p 7.object_id # => 15

As the example illustrates, 7 always evaluates to the same object.
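The same holds for the other literals in this group. A minimal sketch, where the method name `same_object_literals` and its hash keys are made up for illustration:

```ruby
# Each entry compares one literal with the same literal written a second time.
# equal? checks object identity, not value equality.
def same_object_literals
  {
    nil_literal:     nil.equal?(nil),
    true_literal:    true.equal?(true),
    symbol_literal:  :foo.equal?(:foo),
    integer_literal: 7.equal?(7)
  }
end

# Every value in the returned hash is true.
p same_object_literals.values.all? # => true
```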

2. You get the same object for the same literal in the same spot

This happens with literals for regular expressions or rational numbers, for example. Check this out:

def m1 = 0.5r
def m2 = 0.5r

2.times { p m1.object_id } # prints the same object ID twice
2.times { p m2.object_id } # prints the same object ID twice, but a different one

0.5r is the same literal for the fraction 1/2 in both methods. You get the same object every time you invoke m1. You also get the same object every time you invoke m2. But those two objects are different, because the literals are located in different places.
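The same check can be written with `equal?`, which compares identity rather than value. This is only a sketch: the method names `ratio_a`, `ratio_b` and `pattern` are invented for the example, and the regular expression is there because the text above puts it in the same category.

```ruby
def ratio_a = 0.5r
def ratio_b = 0.5r
def pattern = /foo/

# Same spot: the literal evaluates to the same object on every call.
p ratio_a.equal?(ratio_a) # => true
p pattern.equal?(pattern) # => true

# Different spots: distinct objects, as described above, even though
# the two rationals are equal by value.
p ratio_a.equal?(ratio_b)
p ratio_a == ratio_b      # => true
```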

3. You always get different objects

This happens for example with arrays, hashes, or strings (by default):

2.times { p ''.object_id } # prints different object IDs
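Arrays and hashes show why this behavior exists: each evaluation yields an independent object, so mutating one result never affects another. A small sketch with hypothetical method names:

```ruby
def fresh_array
  []
end

def fresh_hash
  {}
end

first  = fresh_array
second = fresh_array
first << 1 # mutates only the object returned by the first call

p first.equal?(second)          # => false
p second                        # => []
p fresh_hash.equal?(fresh_hash) # => false
```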

However, in practice, we'd often prefer string literals to behave as in (1). Is that possible?

The magic comment

Yes, Ruby 2.3 introduced this magic comment:

# frozen_string_literal: true

If a file has it at the top, string literals in that file evaluate to frozen (immutable) string instances:

# frozen_string_literal: true

s = 'foo'
s.frozen?       # => true
s.equal?('foo') # => true

Additionally, string literals behave as in (1) now, as the last line shows. That is, any 'foo', anywhere, evaluates to the same object.
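One practical consequence is that code which builds a string incrementally can no longer start from a bare literal. A common idiom is to ask for an unfrozen string with `String#+@`. The sketch below assumes a made-up helper, `csv_line`, purely to illustrate it:

```ruby
# frozen_string_literal: true

# Joins fields with commas, appending to a buffer in place.
# Under the magic comment '' is frozen, so +'' is used to get a
# mutable string (String#+@ returns an unfrozen string).
def csv_line(fields)
  line = +''
  fields.each_with_index do |field, i|
    line << ',' unless i.zero?
    line << field.to_s
  end
  line
end

p csv_line(['a', 'b', 1]) # => "a,b,1"
```

`String#dup` works as well. The unary plus has the small advantage of returning the receiver itself when it is already mutable.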

This is important because it reduces allocations and, therefore, the time spent in garbage collection. It is no big deal when the string initializes a constant, since that literal is evaluated only once, but it can matter for literals inside method definitions, which are evaluated on every call.

Impact depends on the application but, in general terms, this is more performant. For instance, the Lobsters benchmark is about 5% slower with frozen string literals disabled.
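To get a feel for the allocation difference in a single file, one option is to count allocations with `GC.stat`. This is a rough sketch under stated assumptions: `allocations`, `hidden_mutable?` and `hidden_frozen?` are hypothetical names, and `'.'.freeze` stands in for what the magic comment does to every literal (Ruby compiles a literal followed by `.freeze` to a shared frozen string).

```ruby
# frozen_string_literal: false

# Returns the number of objects allocated while the block runs.
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

def hidden_mutable?(basename)
  basename.start_with?('.')        # a new String per call
end

def hidden_frozen?(basename)
  basename.start_with?('.'.freeze) # one shared frozen String
end

name = '.foo'
mutable_count = allocations { 1_000.times { hidden_mutable?(name) } }
frozen_count  = allocations { 1_000.times { hidden_frozen?(name) } }

puts "mutable literal: #{mutable_count} allocations"
puts "frozen literal:  #{frozen_count} allocations"
```

Exact counts vary by Ruby version, so the numbers are best read as a comparison rather than as absolute figures.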

Let me underline that this optimization applies only to frozen string literals, not to arbitrary frozen strings:

# frozen_string_literal: true

s = String.new('foo').freeze
t = String.new('foo').freeze

s.equal?(t) # false because while 'foo' is a literal, String.new('foo') is not

The vision

The Ruby community has fully embraced this feature, and modern codebases normally have that magic comment in all their files, to the point that we would like to have (1) by default, without the need for the magic comment.

That possibility was discussed for Ruby 3, but Matz considered that the ecosystem was just not ready (see #11473).

The goal is to make the switch in Ruby 4.

Ruby 3.4

Ruby 3.4 is going to ship with a new feature that will help make the transition.

Ruby committer (and Rails Core Team member) Jean Boussier is championing this effort. To me, that is admirable: an epic like this takes determination.

In Ruby 3.4, by default, if a file does not have the magic comment and a string object that was instantiated with a literal gets mutated, Ruby still allows the mutation, but it now issues a warning:

s = 'foo'
s << 'bar' # warning: literal string will be frozen in the future

The mutation does not need to happen in the same file; it can happen elsewhere.

Deprecation warnings have to be enabled to see them. For example, by passing -W:deprecated to ruby, or by setting Warning[:deprecated] = true. It is worth noting that nowadays minitest has deprecation warnings enabled. RSpec does not have them enabled, though there is a pull request for it. In any case, you can just add Warning[:deprecated] = true to spec/spec_helper.rb.
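If you would rather have the test suite fail than scroll through warnings, one possible approach is to hook into `Warning.warn`. The following is only a sketch: the module name `RaiseOnLiteralMutation` is made up, and matching on the message text is an assumption based on the warning shown above.

```ruby
# In spec/spec_helper.rb or test/test_helper.rb.
Warning[:deprecated] = true

# Hypothetical helper: turns the "literal string will be frozen" deprecation
# into an exception, and passes every other warning through untouched.
module RaiseOnLiteralMutation
  def warn(message, category: nil)
    if category == :deprecated && message.include?('literal string will be frozen')
      raise message
    end

    super
  end
end

Warning.extend(RaiseOnLiteralMutation)
```

Raising gives you a backtrace pointing at the mutation, at the cost of stopping at the first offender.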

You can tell ruby to err instead of warn with --enable-frozen-string-literal. With that option, string literals are frozen by default globally, without magic comments (that is, unless you opt out manually with # frozen_string_literal: false).
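The error in question is a FrozenError. The sketch below freezes the string explicitly so that it runs without any flag. That explicit `.freeze`, and the method name `mutate_frozen_literal`, are assumptions made for the example.

```ruby
# Attempts to mutate a frozen string and returns the resulting exception.
def mutate_frozen_literal
  s = 'foo'.freeze # stands in for a literal frozen by the option
  s << 'bar'
rescue FrozenError => e
  e
end

error = mutate_frozen_literal
puts error.class   # FrozenError
puts error.message # can't modify frozen String: "foo"
p error.receiver   # the string that could not be modified
```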

As a curiosity, in the current 3.4.0-preview1, s.frozen? returns true even if the string is mutable. This was subject to discussion and has been revised: in 3.4.0 it will return false.

Can I delete the magic comments in Ruby 3.4?

In general, no.

By default, if you delete the magic comments in Ruby 3.4, the optimizations you enabled with the comment are disabled. As we saw, strings are not frozen, and string objects are not reused.

You could get frozen string literals by passing --enable-frozen-string-literal to ruby, but since that has a global effect, right now that can be risky in production due to transitive dependencies.

On the other hand, gems supporting Ruby < 4 may want to leave the magic comment in place for now. If they remove the comment, clients running in those Rubies without --enable-frozen-string-literal will lose the optimizations. Furthermore, string literals in your gem would all of a sudden evaluate to mutable objects, which is in itself a logic concern if the code relied on them being immutable.

How to help?

For frozen string literals to become the default in the future, gems have to be ready for the switch, as much as possible.

This is going to be a community effort 💪.

To help in this transition, you can enable warnings in CI and note which gems issue warnings. Then, report them to the gem maintainers.

Basic GitHub Actions configuration would be something like:

- run: "RUBYOPT=-W:deprecated bundle exec rake"

Once warnings are clean, you can keep an eye on this by enabling errors:

- run: "RUBYOPT='--enable=frozen-string-literal --debug=frozen-string-literal' bundle exec rake"

The option --debug=frozen-string-literal helps, because it reports the locations of both the allocation and the mutation.

Thanks

We have polished this post together with Jean Boussier, thanks man.

@drgcms

drgcms commented May 28, 2024

If I understand correctly, there is no performance advantage if the source file doesn't contain # frozen_string_literal: true magic comment.

@fxn
Author

fxn commented May 28, 2024

@drgcms in the general case, it does have a performance impact.

If frozen string literals are not enabled, methods like

def hidden?(basename)
  basename.start_with?(".")
end

that have a string literal in them create a new string object in each call. That means 1) we are doing the work of creating a new object in each call, and 2) the garbage collector has to get rid of them. If you add the magic comment, none of that happens.

You can see the impact by yourself with this script, for example:

require "benchmark"

def hidden?(basename)
  basename.start_with?(".")
end

puts Benchmark.measure {
  i = 0
  while i < 100_000_000
    i += 1
    hidden?(".foo")
  end
}

If you add the magic comment, you'll see the script runs about twice as fast.

@chaadow

chaadow commented May 29, 2024

@fxn Thanks for sharing this!

This helped me clean some of my application code as well as opening PRs on other gems.

However I have an issue with this:

- run: "RUBYOPT='-W:deprecated --debug-frozen-string-literal' bundle exec rake"

This seems to work only for a script that contains the pragma # frozen_string_literal: true.
Here are three cases:
[screenshot omitted]

From my testing:

  • adding -W:deprecated behaves no differently from '-W:deprecated --debug-frozen-string-literal'
  • the debug frozen string literal option only adds the location, BUT it still needs frozen_string_literal: true and it does not issue any warning

Here is another screenshot, with the same '-W:deprecated --debug-frozen-string-literal' but with no frozen_string_literal: true pragma:

[screenshot omitted]

==> Nothing happens :/

Here is the script used for reference (using Ruby 3.3.1):

# frozen_string_literal: true

def modify_string
  str = "immutable"
  str << " change"
end

modify_string

@fxn
Author

fxn commented May 29, 2024

@chaadow ya, the warning is a new feature in the forthcoming Ruby 3.4. If you install 3.4.0-preview1 you'll be able to experiment with it.

@chaadow

chaadow commented Jun 8, 2024

@fxn Thanks. So I went ahead and installed 3.4.0-preview1:

  • when I add --debug-frozen-string-literal the warning is never shown
  • without it, it works ( screenshot below)

Also on ruby's website there is no mention of the --debug-frozen-string-literal flag. Maybe it was removed recently?

[screenshot omitted]

@fxn
Author

fxn commented Jun 24, 2024

@byroot @chaadow is right, if we pass -W:deprecated --debug-frozen-string-literal no warning is issued. Could that be a bug?

(I'd swear I tested by hand everything said in this post, I don't know what happened with that line.)

@casperisfine

if we pass -W:deprecated --debug-frozen-string-literal no warning is issued. Could that be a bug?

Seems so, I'll have a look.

That flag is supposed to be used in conjunction with --enable-frozen-string-literal:

$ ruby -W:deprecated --enable-frozen-string-literal --debug-frozen-string-literal /tmp/foo.rb
/tmp/foo.rb:1:in '<main>': can't modify frozen String: "foo", created at /tmp/foo.rb:1 (FrozenError)

When used alone it has no effect, but it shouldn't turn off the chilled string warnings. I'll fix that.

@fxn
Author

fxn commented Jun 24, 2024

That flag is supposed to be used in conjunction with --enable-frozen-string-literal

I see, then the recommendation in the post to use it also in conjunction with warnings in a first pass seems to be wrong.

@fxn
Author

fxn commented Jun 24, 2024

I have edited the post to mention --debug-frozen-string-literal only in the context of --enable-frozen-string-literal.

@casperisfine

Alright, ruby/ruby#11052 should fix it.

@casperisfine

to use it also in conjunction with warnings in a first pass seems to be wrong.

Yeah, that's something I'd want to support. Enrich the warning to include the allocation location when --debug-frozen-string-literal is on. I need to look at it, I think it wouldn't be too hard.

@fxn
Author

fxn commented Jun 24, 2024

Yeah, that's something I'd want to support. Enrich the warning to include the allocation location when --debug-frozen-string-literal is on. I need to look at it, I think it wouldn't be too hard.

Awesome. What in the end matters for this post is what ships in 3.4.0 final. If that makes it, then we could add it back.

I have also edited the options to be passed as --enable= and --debug=, because while the long forms are good, these alternatives match what ruby --help documents.

@chaadow

chaadow commented Jun 24, 2024

@casperisfine Maybe I'm missing something, but why do we need to add --debug-frozen-string-literal in conjunction with --enable-frozen-string-literal? --enable-frozen-string-literal alone (and in conjunction as well) raises an error, and there is no "debugging" involved (or maybe I poorly understand the meaning of "debug" in this particular context).

/tmp/foo.rb:1:in '<main>': can't modify frozen String: "foo", created at /tmp/foo.rb:1 (FrozenError)

@casperisfine

there is no "debugging" involved

The debugging part is to add the created at /tmp/foo.rb:1 :

$ ruby --enable-frozen-string-literal -e '"foo".upcase!'
-e:1:in `upcase!': can't modify frozen String: "foo" (FrozenError)
        from -e:1:in `<main>'
$ ruby --enable-frozen-string-literal --debug-frozen-string-literal -e '"foo".upcase!'
-e:1:in `upcase!': can't modify frozen String: "foo", created at -e:1 (FrozenError)
        from -e:1:in `<main>'

In some cases the literal may be mutated far away from where it was allocated, so this helps find where the string comes from. But it also increases memory usage significantly (to record the location), so it can't be enabled by default.

@chaadow

chaadow commented Jun 24, 2024

Thanks, very clear.

@chaadow

chaadow commented Jun 24, 2024

@fxn Thank you for editing the post! 🙏

@Earlopain

If you use RuboCop and your codebase is the end-product (like a Rails application) you can now do the following:

AllCops:
  StringLiteralsFrozenByDefault: true

Style/FrozenStringLiteralComment:
  EnforcedStyle: never

This gives correct analysis when forcing frozen string literals through the environment variable, and it allows you to get rid of the magic comments early. Requires RuboCop 1.66.
