When looking at RuboCop's code, I noticed a big number of frozen arrays being used only to later call include? on them. This is O(n) instead of O(1).
Trying to convert them to Sets causes major compatibility issues, as well as very frustrating situations (See set.join and array + set) and where the fact that they are now Sets makes them much less efficient (See array & set).
Here are the improvements that would improve Sets:
I'd like to add #join to Set:
# Now:
Set[1, 2, 3].join('x') # => NoMethodError
# After
Set[1, 2, 3].join('x') # => "1x2x3"I'd make this join never recursive. (I've never wanted to use the recursivity of Array#join, but it may very well just be my particular experiences)
Reminder: Why there is no Enumerable#join: https://bugs.ruby-lang.org/issues/1893
We have set <op> array work fine:
Set[1] + [2] # => Set[1, 2]Nothing works in the reverse order:
[1] + Set[2] # => no implicit conversion of Set into Array
# should be:
[1] + Set[2] # => [1, 2]Note that the situation is particularly frustrating for & and |.
If someone wants to do ary & set (and keep the results in the order of ary), one has to do ary & set.to_a which will, internally, do a to_set, so we have set.to_a.to_set!! (assuming ary is over SMALL_ARRAY_LEN == 16 size, otherwise it's still O(n) instead of O(1)).
See order issue as to why this can not (officially) be done any other way.
Reminder:
ary & ary.reverse # => ary
Set[*ary] & Set[*ary.reverse] # => Set[*ary.reverse], officially order indeterminateOfficially, set elements have uncertain order. This predades when Hash started being ordered (Ruby 1.9.0, Xmas 2007). Sets have since been de-facto insertion-ordered. FYI, in those 13 years, there have been about 70 commits to lib/set.rb.
I have the impression that a non-negligible amount of code in the wild rely on sets being ordered, at least under most circumstances. I feel that this should be officialized.
If sets are truly unordered, then why do we hesitate to make an optimization of & and |: https://bugs.ruby-lang.org/issues/15281
See also: https://bugs.ruby-lang.org/issues/14069
To create a set from hash keys currently implies a temporary array for all keys, rehashing all those keys and rebuilding a hash. Instead, the hash could be copied and its values set to true.
h = {a: 1}
# Now:
Set.new(h.keys) # => Set[:a]
# After
h.key_set # => Set[:a], efficiently.I would like a shorthand syntax for frozen sets of symbols or of strings.
%ws{hello world} # => Set['hello', 'world'].freeze
%is{hello world} # => Set[:hello, :world].freezeWe should consider these sets to return a unique frozen to_a. The strings should be frozen. These literals would be created once at parse time (like regex):
def foo
p %ws{hello world}.object_id
end
foo
foo # => prints the same id twiceReminder: Ruby has literal notations for Rational and Complex. I've sadly never had to use either.
I would venture to say that Complex is much less used than Sets.
Reminder: previous discussion for builtin syntax was not for frozen literal, strings or symbols specifically: https://bugs.ruby-lang.org/issues/5478
Quite minor, but Set#flatten should be accept an argument to level the level of flattening.
Quite minor, but Set#<=> should be defined.
The official stated reason for Set to not implement is that some sets are not comparable. That is exactly what nil result type is for.
Set[1] < Set[1, 2] # => true
[Set[1], Set[1, 2]] # => ArgumentError, should be [Set[1], Set[1, 2]]
[Set[1], Set[2]] # => ArgumentError, ok, not comparable
Great ideas!
I've often wondered by this myself.
Excellent observation! I never had to use those literals myself. I've rarely touched Sets in Ruby, but I use them all the time in other languages where they are first-class citizens.
I've been a big proponent of sets being promoted to Ruby's core. Unfortunately, as the issue indicates, there has been little upstream interest in this.