The Real Deal

condensing fact from the vapor of nuance

Why We Do What We Do


Recently I came across some old nostalgic bits from one of my companies. One of those bits was an email I wrote to the entire company, following a particularly large website and product launch the night before. An incredible amount of time and effort had gone into making that launch successful, and succeed it did.

What stands out to me now, reading this again, is how well it speaks to why I love what I do, and why the great people I’ve been privileged to work with did so alongside me.

This is what all companies should be about:

Everyone,

        I just wanted to share with you some feelings I had from last
night, which relate to the very core of what I think makes Cloudmark
so great.

It was only a short few years ago that we were trying to decide how to
"launch" our products and services, and settled on creating an Outlook
plugin that would establish us commercially in the consumer market.
At the time, SpamNet was about as Beta as you can get -- we had no QA
beyond development itself -- and quite frankly it really wasn't ready
for prime-time.  But, we set an aggressive goal for ourselves, a firm,
finite deadline for us to work towards.  PR/Marcom aimed all the
reviews and newspaper articles and press releases for that date, Web
had gotten all the content prepped and ready to go, and like it or not
we had to ship or otherwise lose the tremendous groundswell of
interest and buzz that most startups only dream of.

Then came the night of shipping.

Most people were still in the office as the 12:00am deadline
approached, huddled around computers, watching and waiting.  There was
a palpable anxiety in the air, but it was the good kind -- everyone
could sense how well we would be received, and we were about to make
our first concerted mark upon the world -- us against those who
doubted us, our technology and even the viability of an anti-spam
industry.  There were last minute compiles, last minute bug fixes,
last minute web content tweaks, last minute server configuration
changes, last minute adjustments to any and every thing that was going
out.  Articles from the Wall Street Journal and LA Times were waiting
to go live, our press release and web site were almost literally under
a big red button, ready to hit the wire at a moment's notice.  And
while we were dashing back and forth to make everything as perfect as
possible, the deadline loomed.  Under an impossible time constraint,
we weren't calmly waiting for it, we were frantically trying to beat
it.

And we did.  We came together as a Team, under immense pressure and
Great Expectations, and we did it -- we shipped, we shipped, we
shipped!  That single moment culminating into release is still as
intoxicating to me now as it was that very night.

And now I am once again proud, almost awed to see that same relentless
energy and unbreakable spirit alive and well.  Huddled around
computers, one group focused on QA, another on development/bugfixing,
yet others working on website content and deployment, all in unison
and everyone pitching in where possible -- hey try this, whoops we
missed that, wow that looks great!  Our deadline loomed, lots of last
minute changes right up until the end, and due to the inevitable
problems from such tight deadlines there was an ever-so-slight patina
of despair in the air -- but nothing broke our determination.  We
would be damned if anything was going to keep us from our goal, come
hell or high water we were going to ship our new website and 4 -- 4!!
products -- all in one night.  No one left until their job was done,
and even then some stayed longer to help, if not in function then in
presence and spirit.  And those that weren't there, either polling the
website, providing suggestions on SILC, or just thinking about
shipping, well, they were there too.

Folks, this is who we are, what we stand for -- as a united team of
talented, passionate, determined people we are a force to be reckoned
with, and one that is easily underestimated.  We can achieve anything
we put our efforts towards so long as we work hard, work smart, and
work together.  While late nights against impossible deadlines
shouldn't be the norm, I consider it a privilege to share in them and
hope that this great spirit of ours never dies.

Cheers, and I hope you all have a festive, restful 4th of July
weekend.

--jordan

Indeed, the great spirit does live on.

A Real Ruby 1.8.7 → 1.9.3 Migration


Recently it came to pass that my company needed to update its Platform Framework and Applications, which was based on Ruby 1.8.7 (Ruby Enterprise Edition). New features required new gems which required newer versions of existing gems which either broke existing conventions/APIs or were no longer supported on 1.8.7. Then we learned REE had reached End of Life. Nuts!

I’ll save a deeper analysis of the why for a separate post. The short of it is that we were already faced with a full regression test impact to get the changes we needed, so it made sense to invest a few more weeks and bite the whole bullet: hop onto the next Ubuntu LTS release in production, upgrade all of the (mature) gems, and upgrade the VM to the now-stable Ruby 1.9.3. This was a Big Deal.

How to go about this, then? Well, the ChangeLog could be enlightening, and of course there’s tons of blog posts about 1.9.3 itself. But when it came to actual conversions, the wisdom was surprisingly scarce. My sense is that the Ruby community is still relatively small overall, only a small subset of the community has ever (successfully) built stable, mature, long-lived platforms on the language, and maybe only a tiny fraction of them have ever had to work through the EOL of their particular VM and get on a new version.

So, here’s the skinny on converting a large, mature Ruby codebase from 1.8.7 (REE) to 1.9.3. While it’s not exhaustive, it will be representative of the sort of things to look out for. To be fair, a number of the problems encountered were due to “liberties” we had taken with the syntax – a core value of Ruby, and a big reason why so many love it. But that’s harsh criticism that misses the practical reality of a team of developers working hard and fast through the evolution of a large codebase over an extended period of time.

This post is by no means an indictment of 1.9.3. No, 1.9.3 is Better, and in many cases also Faster. But it ain’t perfect, and you’ll do better if you have some idea of what to expect.

Syntax

First up, syntax. While some of this was expected, some things were pretty subtle (e.g. array splat behavior).

  • rescue gains higher precedence than not
foo and not (nil.nonexistent_method rescue nil) # => raises exception
foo and !(nil.nonexistent_method rescue nil)    # => works
  • parentheses generally required where precedence gets confusing
raise Foo, "bar"  unless blort # => works on 1.8.7
raise Foo, "bar"  unless blort # => raises exception on 1.9.3
raise(Foo, "bar") unless blort # => works on 1.9.3
  • can’t use multi-value assignment as conditional anymore
if foo and (a, b = 1, 2) # => raises exception
  • can’t use instance variables as formal block params
foo.inject({}) { |@h, v| .. } # => raises exception
  • splat Array/List deref remains Array regardless of # elements
def foo(*args) bar = *args end
a = foo(1); a # => 1     (1.8.7)
a = foo(1); a # => [1]   (1.9.3)
  • splat behavior exception: when assigning to list of lvalues
def foo(*args) bar = *args end
a, b = foo(1); a # => 1     (1.8.7)
a, b = foo(1); a # => 1     (1.9.3)

Encoding

Anyone who has worked with non-ASCII character sets in 1.8.7 will know of Ruby’s shortcomings when it comes to other encodings and codepages. 1.9 improves the situation significantly, and actually deprecates the iconv library.

  • embedded UTF-8 chars/strings require explicit file-wide encoding
    • insert # encoding: UTF-8 at top of each affected file
  • manual “guards” against invalid byte sequences require forced binary encoding

We had several cases using a Regex to check a String for invalid byte sequences, as in user-provided inputs or validating email addresses. It simplifies to this:

foo = "\xff"   # => "\xFF"
bar = /#{foo}/ # => ArgumentError: invalid multibyte character

foo.force_encoding('binary')
bar = /#{foo}/               # => /\xFF/

Note: You’ll get a SyntaxError exception instead if you run this in irb.

  • $KCODE is gone; ruby -K is not its equivalent - __ENCODING__ is
    • __ENCODING__ only set by shell locale encoding (*nix) or codepage (Win32)
*nix
LC_CTYPE=en_US.iso8859-1 ruby -e "puts __ENCODING__" # => "ISO-8859-1"
LC_CTYPE=en_US.utf-8     ruby -e "puts __ENCODING__" # => "UTF-8"
Win32
ruby -e "puts __ENCODING__" # => "IBM437"
chcp 65001
ruby -e "puts __ENCODING__" # => "UTF-8"

Beware: Win32 Ruby gets unstable quickly (i.e. crashes) when the default codepage is UTF-8

  • (Win32) win32console completely breaks $stdout encoding
    • given the right encoding, UTF-8 chars will display to $stdout correctly
    • after loading win32console, UTF-8 is garbled again
puts "#{foo = File.read('foo.txt').chomp} -> #{foo.force_encoding('utf-8')}"
require 'win32console'
puts "#{foo = File.read('foo.txt').chomp} -> #{foo.force_encoding('utf-8')}"
# FIXME: update with better example

Threading (Win32)

  • Thread is no longer Green - it’s now a native thread

We wrap Microsoft Word with an asynchronous control channel using OLE and EventMachine-driven AMQP, primarily for high fidelity split/combine & track-changes operations on Word documents.

Each AMQP message spawns off in a new Thread. With green threads in 1.8.7, we were effectively isolated from the vagaries of the various COM threading models underlying OLE. Under 1.9.3, however, each Thread is native and thus needs its own OLE object handle; otherwise you get a WIN32OLERuntimeError exception with the following message:

The application called an interface that was marshalled for a different thread

We found a good discussion on the topic and a potential solution involving calling CoInitialize/CoUninitialize per Thread; however, it didn't solve our problem.

Instead, we simply memoized the OLE object handle in Thread-local storage:

def msword
    Thread.current[:msword] ||= WIN32OLE.connect('word.application') rescue WIN32OLE.new('word.application')
end

Kernel

  • Kernel method enumerators produce an Array of Symbol, not String

Module

  • Module.constants produces an Array of Symbol, not String
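Both the Kernel and Module changes are easy to sanity-check on a current Ruby; a minimal sketch:

```ruby
# Method lists and constant lists now hold Symbols, not Strings.
puts Object.new.methods.first.class  # => Symbol
puts Module.constants.first.class    # => Symbol

# So code that did string matching against these lists must compare
# Symbols (or convert): methods.include?("inspect") is always false now,
# while methods.include?(:inspect) works.
```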

Enumerable

  • Enumerable.map behavior changes
    • implications for a loop that expects the yielded value to work like a String (e.g. value#[])
("a".."c").map      # => ["a", "b", "c"]             (1.8.7)
("a".."c").map      # => #<Enumerator: "a".."c":map> (1.9.3)

Exception

  • Exception#message is now immutable

Many of our Platform Framework libraries wrap various vendor-specific Exception classes with our own, to keep our library consumers within their own domain and to guard against coupling consumer unit tests to indirect vendor-specific behaviors. In some cases we add our own prefix/postfix to existing #message strings. Can’t do that in 1.9.3:

Exception.new.message << "foo" # => RuntimeError: can't modify frozen String
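If you still need to decorate a message, building a new String (rather than mutating the frozen one) works on both versions; a minimal sketch, with the wrapper class name purely illustrative:

```ruby
err = RuntimeError.new("boom")

# Concatenation creates a fresh String; the original stays untouched.
decorated = "while saving: " + err.message

# Then re-raise under your own wrapper, e.g.:
#   raise OurPlatformError, decorated
puts decorated  # => while saving: boom
```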
  • Exception#message now forces #to_s on value

We had some cases where we wrapped vendor exceptions with our own and tunneled the original Exception in #message. That won't work by default in 1.9.3, so we added this:

class Exception
    attr_reader :message
    def initialize(*data)
        @message = data.first
        super
    end
end

Hash

  • Hash now natively respects insert order
    • e.g. beware web tests that compare stringified JSON results
  • inserting into Hash while enumerating it raises an exception
foo = { :a => :b }
foo.each { |h| foo[:c] = :d }
  # => RuntimeError: can't add a new key into hash during iteration
  • Hash#merge doesn’t use Hash#[]= anymore
    • e.g. stringify’ing things upon assignment doesn’t work with Hash#merge
class A < Hash; def []=(k, v) super(k.to_s, v.to_s) end end
foo = A.new
foo[:a] = :b        # => {"a"=>"b"}
foo.merge(:c => :d) # => {"a"=>"b", :c => :d}
  • Hash enumerable methods now return an Enumerator
Hash.new.map.class         # => Array       (1.8.7)
Hash.new.map.flatten.uniq  # => works       (1.8.7)
Hash.new.map.class         # => Enumerator  (1.9.3)
Hash.new.to_a.flatten.uniq # => works       (1.9.3)
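The insertion-order change at the top of this section is easy to demonstrate, and is exactly what bites tests that compare stringified JSON results:

```ruby
h = {}
h[:z] = 1
h[:a] = 2

# 1.9.3 preserves insertion order; 1.8.7 made no such guarantee.
p h.keys  # => [:z, :a]
```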

Symbol

  • Symbol#match now exists but doesn’t work like String#match
"foo".match(/foo/) # => #<MatchData "foo">
:foo.match(/foo/)  # => 0

String

  • String no longer directly enumerable, so #map, #grep etc. are gone
    • but you do get String#each_{byte,char,codepoint,line}
"foo".map # => ["foo"]                                              (1.8.7)
"foo".map # => NoMethodError: undefined method `map' for "":String  (1.9.3)
  • String#[Symbol] doesn’t (silently) work anymore
"foo"[:a] # => nil                                           (1.8.7)
"foo"[:a] # => TypeError: can't convert Symbol into Integer  (1.9.3)
  • String#[Fixnum] returns char-as-string, not ordinal
"foo"[1] # => 111  (1.8.7)
"foo"[1] # => "o"  (1.9.3)
  • String#include? no longer takes a Fixnum (as ordinal)
"123".include?("2") # => true
"123".include?(50)  # => true                                         (1.8.7)
"123".include?(50)  # => TypeError: can't convert Fixnum into String  (1.9.3)
  • String#strip doesn’t work on non-ASCII whitespace
" \xc2\xa0".strip                    # => "\302\240" (1.8.7)
" \xc2\xa0".strip                    # => " "        (1.9.3)

But using the POSIX character class does:

" \xc2\xa0".gsub(/[[:space:]]/, '')  # => "\302"  (1.8.7)
" \xc2\xa0".gsub(/[[:space:]]/, '')  # => ""      (1.9.3)

Binding

  • getting an object’s binding requires #instance_eval instead of #send

A good example is the pattern of using an OpenStruct initialized with a Hash (which provides method accessors for keys) as a Context object for template interpolation systems like Erubis.

This used to work in 1.8.7:

template = ERB.new("foo.erb")
context = OpenStruct.new(some_object_with_methods).send(:binding)
output = template.result(context)

This works in 1.9.3 and 1.8.7:

template = ERB.new("foo.erb")
context = OpenStruct.new(some_object_with_methods).instance_eval { binding() }
output = template.result(context)

Fixnum

  • Fixnum#to_sym no longer (silently) works
1.to_sym # => nil                                                    (1.8.7)
1.to_sym # => NoMethodError: undefined method `to_sym' for 1:Fixnum  (1.9.3)

Float

  • fraction-less Float becomes a Fixnum through #to_s (precision lost)
    • e.g. cucumber test with a table containing a Float like 10.0
Float(0.0).to_s # => "0.0"  (1.8.7)
Float(0.0).to_s # => "0"    (1.9.3)

BigDecimal

BigDecimal problems were among the most difficult in the conversion. There are differences in behavior between VM versions and amongst BigDecimal representations within the 1.9.3 VM itself.

  • BD init by String skips precision altogether
Float(80.784).to_d.to_s('F')             # => "80.78400000000001"
BigDecimal.new("80.784") == 80.784.to_d  # => false
  • BD compared to Float makes a BD implicitly from Float
80.784.to_d == 80.784              # => true   (but it shouldn't!)
BigDecimal.new("80.784") == 80.784 # => false  (but you might think it should!)
  • BD + Float makes a Float
BigDecimal.new("80.784") == 80.784       # => false  (OK we just learned that)
BigDecimal.new("80.784") + 0.0 == 80.784 # => true   (ZOMGGGGG)
  • Float#to_d ignores class-level precision limit setting entirely
BigDecimal.limit(5)  # docs say this sets default precision for newly created BDs
80.784.to_d          # => #<BigDecimal:7fbe331c7e58,'0.8078400000 000001E2',27(45)>

Note: Float#to_d comes from the bigdecimal library.

  • Float#to_d uses a different “default precision” than BigDecimal.new
BigDecimal.limit                                    # => 0
BigDecimal(80.784, BigDecimal.limit).precs          # => [45, 54]
80.784.to_d.precs                                   # => [27, 45]
BigDecimal(80.784, BigDecimal.limit) == 80.784.to_d # => false
  • BD invocation differs depending on init param type

This makes writing generalized code for handling a BigDecimal overly complicated.

BigDecimal.new("80.784") # => 80.784
BigDecimal.new(80.784)   # => ArgumentError: can't omit precision for a Rational.

A Practical Look at the Problem

The Float vs. BigDecimal issue is about precision and “true representations” of Rational numbers vs. representations that are incorrectly assumed to be precise (like Float). The story almost always ends the same way: mixing differing-precision numbers in math calculations is just a Bad Idea. Mixing BigDecimal and Float certainly qualifies as this.

The problem in 1.9.3 is that different invocations of BigDecimal also qualify as differing-precision numbers. A String-initialized BigDecimal will skip any precision. Convenience methods like #to_d will produce a BigDecimal using a different “default precision” than the normal #new. #to_d doesn’t even respect BigDecimal.limit.

Sure there are ways to compare the precision of a BigDecimal, and it is possible to write extra code to ensure same-precision computation and comparison. But on a practical level that’s terribly obnoxious and simply not going to happen. Your average application developer just doesn’t care, and shouldn’t. It’s hard enough to avoid mixing Float and BigDecimal, and getting that right should be enough. Almost.

A Practical Solution

After adhering to the “don’t mix fractional types” rule, we were left with a bunch of broken tests around an ORM property type designed for persisting a BigDecimal. In the end, those all resolved to this simple, new behavior in 1.9.3:

BigDecimal.new("80.784") == 80.784  # => false

So a BigDecimal is being compared against a Float – big surprise that a mixed precision comparison causes problems! But practically speaking, it’s simply unrealistic for it to happen any other way: no one’s going to adjust every line of their code and tests to detect+match precision, no one’s going to update fixture loaders to know about and adjust to ORM-specific representations of precision/scale-limited rational numbers, no one will investigate+adjust test harnesses like Rspec and Cucumber/Gherkin to understand when a rational should be a Float vs. a BigDecimal and at what precision. Realistically, fuck all of that.

Instead, we just changed the implicit type conversion when comparing against a Float to cast to a String first:

class BigDecimal
    def ==(other)
        other = BigDecimal(other.to_s) if other.kind_of?(Float)
        return super(other)
    end
end

Then the test cases start working again, without having hacked any representation’s real scale/precision:

BigDecimal.new("80.784") == 80.784  # => true

And just like that, the old behavior. Awesome.

FYI we used 80.784 because it was a (calculated) value in some tests that triggered the BD-related problems. There’s nothing else special about the value.

Closures

  • lambda rules about arity have changed
lambda{}.call(self) # => nil                                                 (1.8.7)
lambda{}.call(self) # => ArgumentError: wrong number of arguments (1 for 0)  (1.9.3)

This manifested as a result of an old rspec gem (1.2.9) that wasn't getting upgraded with the mix. "it" declaratives without closures would raise a special NotImplementedYetError exception, automatically marking them as pending. rspec did so using a lambda, which it later called with an argument -- which doesn't work anymore in 1.9.3. Swapping it to a proc with a monkey patch solved that.
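The underlying arity difference between lambda and proc is easy to see, and is why swapping one for the other works; a minimal sketch:

```ruby
strict = lambda { :pending }
loose  = proc   { :pending }

# Procs silently discard surplus arguments...
p loose.call(:extra_arg)  # => :pending

# ...while 1.9 lambdas enforce their declared arity.
begin
  strict.call(:extra_arg)
rescue ArgumentError => e
  puts e.message  # wrong number of arguments (...)
end
```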

  • initializing a proc with a proc no longer catches return-from-inside-closure

This was a hack originally invented to wrap our ORM’s transactional layer with our own, in order to provide better commit/rollback semantics and deferred AMQP message delivery (outside the transaction window).

In general, we try to avoid obfuscating our code paths with too much nesting by early-exiting from subroutines whenever possible, usually by calling return. And sometimes those subroutines end up being transaction closures.

From a pedantic POV that’s a terrible idea – closures aren’t methods, they have slightly different semantics, blah blah blah. Practically speaking, who cares. It should Just Work. So we made it work. And then it totally broke in 1.9.3.

It essentially resolved to this:

def orm_txn
    ret = yield
ensure
    return ret
end

def transaction(&block)
    orm_txn { proc(&block).call }
end

transaction { return 1 } # => 1    (1.8.7)
transaction { return 1 } # => nil  (1.9.3)

The idea of adhering to the Principle of Least Surprise around return from closures wasn't unique to us; Sinatra had the same concept too, just using a different mechanism: unbound methods.

So we use that now, and the method works like a charm on both 1.8.7 and 1.9.3:

def orm_txn
    ret = yield
ensure
    return ret
end

def transaction(&block)
    orm_txn { Class.new.instance_eval { define_method("", &block) }.call }
end

transaction { return 1 } # => 1
transaction { return 1 } # => 1

Hooray!

Library / Gems

  • FasterCSV renamed to CSV and included in the core Ruby distribution
  • json (v1.5.4) included in the core Ruby distribution
    • beware non-rubygems $LOAD_PATH manipulations – Ruby’s libs always come first
  • soap4r no longer bundled in the core Ruby distribution
    • old soap4r gem is incompatible with 1.9 – use soap4r-ruby1.9
  • sha1 library is gone
    • use digest/sha1 instead
  • ruby-debug gem et al. renamed to debugger gem et al.
  • the current working directory (".") is not in $LOAD_PATH by default anymore
  • XML parsing is more strict
    • can’t embed dash-dash ("--") inside XML comments
  • YAML parsing is more strict
    • bare regexes need to be quoted
    • bare regexes containing “escaped”-style character classes require single-quoting
  • YAML parsing also got a little more convenient
    • Time/DateTime/Date values serialized as String are auto-converted to native type
  • Net::HTTP::Post.set_form_data now uses URI.encode_www_form

This subtly broke signature calculation on our Oauth consumers vs. our provider.

Oauth consumers (oauth gem) use Net::HTTP::Post while Oauth providers (oauth-provider gem) use Rack::Request, but param collection and serialization for signature calculation was effectively equivalent in 1.8.7.

Now Net::HTTP::Post uses URI.encode_www_form, and that produces a different result when given a param key with a nil value. Thus signature calculation on the consumer would differ from what the provider arrives at, and request authentication would fail.

Our “fix” was to just guard against nil parameters before making requests.

params = { :foo => nil }
post = Net::HTTP::Post.new("/")
post.set_form_data(params)

post.body # => "foo="  (1.8.7)
post.body # => "foo"   (1.9.3)

Evil!
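For reference, the nil-parameter guard we settled on amounts to stripping nil-valued entries before building the request; a minimal sketch (param names are illustrative):

```ruby
params = { :foo => nil, :bar => "baz" }

# Drop nil-valued params so both 1.8.7 and 1.9.3 serialize identically.
clean = params.reject { |_key, value| value.nil? }

p clean  # => {:bar=>"baz"}
```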

  • open-uri is a rainbow-hating, puppy-murdering Nazi who wants to make your life harder

If the userinfo portion of a URI is set, it raises an exception:

    if target.userinfo && "1.9.0" <= RUBY_VERSION
      # don't raise for 1.8 because compatibility.
      raise ArgumentError, "userinfo not supported.  [RFC3986]"
    end

This turned out to be one of those tired, irritating pedantic engineer vs. practical utility arguments where the pedant won. Hooray for pedantic idiots!

Conclusion

Well, that’s all I got for now! Check out Harvest’s Ruby 1.9.3 upgrade blog post for some additional pitfalls to watch out for.

5 Years Later, Why I Don’t Blog Much


It’s been almost 5 years since the last update, and now I’ve decided it’s finally time. Time to update my website with a new look, time to participate in the Public Think and make my contribution again.

For the record, I’ve resisted blogging anything forever, mainly because:

  • Who has time? (Not me. I spend my time applying myself to the things I love.)
  • Who can risk the exposure? (Not me. I’m a Professional with a Career.)
  • Who cares? (Usually, not me.)

I’m perpetually floored by how many people qualify against these criteria. Really? Don’t you have a day job? Aren’t you any good at it? (Practice takes time, and you’re obviously not practicing.) Do you love to do anything else in your life besides narcissistically plaster your verbal diarrhea bullshit all over the Web? (You should probably go do that instead.) Did you somehow miss the time-honored wisdom of not making a total ass of yourself in public, especially when the Internet will never forget it? Don’t you have anything to lose? I mean, seriously!

Sure, LinkedIn is great - it’s about jobs, careers, credibility. Facebook - ok fine, it’s an interesting concept to have 10 times more relationships, each worth 1/100th the value of a normal friendship. But garbage like Twitter? They’re ruining the Internet with noise - 140 chars isn’t enough space to express anything truly meaningful. In the CS domain, statistically, 140 chars looks almost like the definition of uselessness. Yeah sure people use it usefully, but the signal-to-noise ratio is intergalactically, unforgivably huge.

But, that doesn’t mean I shouldn’t keep trying. So, down the rabbit hole we go. Again.