In programming languages, literals are textual representations of values in the source code. This is a syntactical concept.
Some examples:
7 # integer literal
'foo' # string literal
[] # array literal
In contrast,
Math::PI
String.new
Array.new
are not literals.
In the case of Math::PI, while it may store a fixed number, syntactically that is a constant path, not a literal.
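One way to see that the distinction is purely syntactic is to ask a parser. The sketch below uses Ripper from the standard library. The helper name top_node_type is made up for illustration, and the node names are Ripper's own vocabulary rather than official grammar terms.

```ruby
require 'ripper'

# Hypothetical helper: returns the type of the first top-level node Ripper sees.
def top_node_type(source)
  Ripper.sexp(source).dig(1, 0, 0)
end

p top_node_type('7')          # => :@int
p top_node_type("'foo'")      # => :string_literal
p top_node_type('[]')         # => :array
p top_node_type('Math::PI')   # => :const_path_ref
p top_node_type('String.new') # => :call
```

The first three are reported as literal-like nodes, while Math::PI shows up as a constant path and String.new as a method call.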
Everything is an object in Ruby. So, does Ruby create a new object when it encounters a literal?
It depends. There are three possibilities:
(1) The literal always evaluates to the same object, wherever it appears. This happens with nil, true, false, symbols, small integers (fixnums), and others:
p 7.object_id # => 15
p 7.object_id # => 15
As the example illustrates, 7 always evaluates to the same object.
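The same can be checked for other literals of this kind, even when they appear in different places. A small sketch, where sym_a and sym_b are hypothetical method names:

```ruby
# Two separate occurrences of the same symbol literal.
def sym_a
  :foo
end

def sym_b
  :foo
end

p sym_a.equal?(sym_b) # => true
p nil.equal?(nil)     # => true
p 7.equal?(7)         # => true
```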
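(2) The literal evaluates to the same object every time that particular occurrence is evaluated, but different occurrences yield different objects. This happens with literals for regular expressions or rational numbers, for example. Check this out: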
def m1 = 0.5r
def m2 = 0.5r
2.times { p m1.object_id } # prints the same object ID twice
2.times { p m2.object_id } # prints the same object ID twice, but a different one
0.5r is the same literal for the fraction 1/2 in both methods. You get the same object every time you invoke m1. You also get the same object every time you invoke m2. But those two objects are different, because the literals are located in different places.
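Regular expression literals (without interpolation) can be checked the same way. A minimal sketch, with r1 and r2 as hypothetical method names:

```ruby
# The same regexp literal written in two different places.
def r1
  /foo/
end

def r2
  /foo/
end

p r1.equal?(r1) # => true, same occurrence
p r1.equal?(r2) # => false, different occurrences
p r1 == r2      # => true, they are still equal as values
```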
(3) The literal evaluates to a new object every time. This happens, for example, with arrays, hashes, or strings (by default):
2.times { p ''.object_id } # prints different object IDs
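Array and hash literals behave the same way, even when it is the very same occurrence being evaluated. A short sketch, with hypothetical method names:

```ruby
def fresh_array
  []
end

def fresh_hash
  {}
end

p fresh_array.equal?(fresh_array) # => false, a new array per evaluation
p fresh_hash.equal?(fresh_hash)   # => false, a new hash per evaluation
```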
However, in practice, we'd often prefer string literals to behave as in (1). Is that possible?
Yes, Ruby 2.3 introduced this magic comment:
# frozen_string_literal: true
If a file has it at the top, string literals in that file evaluate to frozen (immutable) string instances:
# frozen_string_literal: true
s = 'foo'
s.frozen? # => true
s.equal?('foo') # => true
Additionally, string literals now behave as in (1), as the last line shows. That is, any 'foo', anywhere in a file that has the magic comment, evaluates to the same object.
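Since the magic comment works per file, one way to try this from any script is to write a small file and load it. This is only a sketch: the file contents and the method name frozen_demo_greeting are hypothetical.

```ruby
require 'tempfile'

# Hypothetical file contents. The magic comment sits at the very top.
source = <<~'RUBY'
  # frozen_string_literal: true

  def frozen_demo_greeting
    'hello'
  end
RUBY

Tempfile.create(['frozen_demo', '.rb']) do |file|
  file.write(source)
  file.flush
  load file.path # defines frozen_demo_greeting
end

a = frozen_demo_greeting
b = frozen_demo_greeting

p a.frozen?   # => true
p a.equal?(b) # => true, no new allocation per call

begin
  a << '!'
rescue FrozenError => e
  puts e.class # FrozenError: the literal cannot be mutated
end
```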
This is important because it reduces allocations and, therefore, reduces the time spent in garbage collection. No big deal if the string is used to initialize a constant, but it might be for those in method definitions, for example.
Impact depends on the application but, in general terms, this is more performant. For instance, the Lobsters benchmark is about 5% slower with frozen string literals disabled.
Let me underline that this optimization applies only to frozen string literals, not to arbitrary frozen strings:
# frozen_string_literal: true
s = String.new('foo').freeze
t = String.new('foo').freeze
s.equal?(t) # false because while 'foo' is a literal, String.new('foo') is not
The Ruby community has fully embraced this feature, and modern codebases normally have that magic comment in all their files, to the point that we would like to have (1) by default, without the need for the magic comment.
That possibility was discussed for Ruby 3, but Matz considered that the ecosystem was just not ready (see #11473).
The goal is to make the switch in Ruby 4.
Ruby 3.4 is going to ship with a new feature that will help make the transition.
Ruby committer (and Rails Core Team member) Jean Boussier is championing this effort. To me, that is admirable; this epic needs determination.
In Ruby 3.4, by default, if a file does not have the magic comment and a string object that was instantiated with a literal gets mutated, Ruby still allows the mutation, but it now issues a warning:
s = 'foo'
s << 'bar' # warning: literal string will be frozen in the future
The mutation does not need to happen in the same file; it can happen elsewhere.
Deprecation warnings have to be enabled to see them. For example, by passing -W:deprecated to ruby, or by setting Warning[:deprecated] = true. It is worth noting that nowadays minitest has deprecation warnings enabled. RSpec does not have them enabled, though there is a pull request for it. In any case, you can just add Warning[:deprecated] = true to spec/spec_helper.rb.
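If you want to check from a script whether a piece of code triggers the warning, one possible approach is to capture what gets written to $stderr. A sketch, assuming warnings go through the default handler. The helper capture_stderr is hypothetical, not part of Ruby.

```ruby
require 'stringio'

# Hypothetical helper: runs the block with $stderr swapped for an in-memory
# buffer and returns whatever was written to it.
def capture_stderr
  original = $stderr
  $stderr = StringIO.new
  yield
  $stderr.string
ensure
  $stderr = original
end

Warning[:deprecated] = true # programmatic counterpart of -W:deprecated

output = capture_stderr do
  s = 'foo'
  s << 'bar' # mutation of a string created from a literal
end

puts(output.empty? ? 'no warning emitted on this Ruby' : output)
```

On Ruby 3.4 the captured output is expected to contain the deprecation warning. On earlier versions it should stay empty.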
You can tell ruby to raise an error instead of warning with --enable-frozen-string-literal. With that option, string literals are frozen by default globally, without magic comments (that is, unless you opt out manually with # frozen_string_literal: false).
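To see the effect of the flag without touching your own process, you can run a one-liner in a child interpreter. A sketch using Open3 and RbConfig from the standard library, where the script string is just an illustration:

```ruby
require 'open3'
require 'rbconfig'

# The same mutation as above, run in a child ruby with the flag enabled.
script = "s = 'foo'; s << 'bar'"

_out, err, status = Open3.capture3(
  RbConfig.ruby, '--enable-frozen-string-literal', '-e', script
)

p status.success? # => false
puts err          # the error output mentions FrozenError
```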
As a curiosity, in the current 3.4.0-preview1, s.frozen? returns true, even if the string is mutable. This was subject to discussion and has been revised: in 3.4.0 it will return false.
Should you delete the magic comments in Ruby 3.4? In general, no.
By default, if you delete the magic comments in Ruby 3.4, the optimizations you enabled with the comment are disabled. As we saw, strings are not frozen, and string objects are not reused.
You could get frozen string literals by passing --enable-frozen-string-literal to ruby, but since that has a global effect, right now that can be risky in production due to transitive dependencies.
On the other hand, gems supporting Ruby < 4 may want to leave the magic comment in place for now. If they remove the comment, clients running in those Rubies without --enable-frozen-string-literal will lose the optimizations. Furthermore, string literals in your gem would all of a sudden evaluate to mutable objects, which is in itself a logic concern if the code relied on them being immutable.
In order to be able to have frozen string literals by default in the future, gems have to be ready for the switch. As much as possible.
This is going to be a community effort 💪.
To help in this transition, you can enable warnings in CI and note which gems issue warnings. Then, report them to the gem maintainers.
Basic GitHub Actions configuration would be something like:
- run: "RUBYOPT=-W:deprecated bundle exec rake"
Once warnings are clean, you can keep an eye on this by enabling errors:
- run: "RUBYOPT='--enable=frozen-string-literal --debug=frozen-string-literal' bundle exec rake"
The option --debug=frozen-string-literal helps, because it reports the locations of both the allocation and the mutation.
We have polished this post together with Jean Boussier, thanks man.
@dougc84 regarding concatenation, an expression such as 'foo' + 'bar' gives you a mutable string, because the expression is not a literal. The operands are literals, but not their addition.
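A small sketch of that point. The operands are frozen explicitly here, so the result does not depend on magic comments or interpreter flags:

```ruby
left  = 'foo'.freeze
right = 'bar'.freeze

sum = left + right # String#+ builds a new string

p left.frozen? # => true
p sum.frozen?  # => false
sum << '!'     # fine, the result of the addition is mutable
p sum          # => "foobar!"
```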
Sometimes you want a string buffer to push to it, yes, but statistically that use case is less frequent, so we optimize for the common and most performant case. Still, the options are simple:
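The usual ways to get a mutable buffer might look like this sketch, where the variable names are arbitrary:

```ruby
buffer = +''       # String#+@ returns the receiver if mutable, else an unfrozen copy
buffer << 'foo' << 'bar'

other = String.new # empty mutable string (binary encoding when called without arguments)
copy  = ''.dup     # a dup of a string is mutable even if the original is frozen

p buffer                                       # => "foobar"
p [buffer, other, copy].none?(&:frozen?)       # => true
```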
About other languages, it depends. In some languages all strings are immutable; Java is an example.