The Real Deal

condensing fact from the vapor of nuance

Why We Do What We Do


Recently I came across some old nostalgic bits from one of my companies. One of those bits was an email I wrote to the entire company, following a particularly large website and product launch the night before. An incredible amount of time and effort had gone into making that launch successful, and succeed it did.

What stands out to me now, reading this again, is how well it speaks to why I love what I do, and why the great people I’ve been privileged to work with did so alongside me.

This is what all companies should be about:

Everyone,

        I just wanted to share with you some feelings I had from last
night, which relate to the very core of what I think makes Cloudmark
so great.

It was only a short few years ago that we were trying to decide how to
"launch" our products and services, and settled on creating an Outlook
plugin that would establish us commercially in the consumer market.
At the time, SpamNet was about as Beta as you can get -- we had no QA
beyond development itself -- and quite frankly it really wasn't ready
for prime-time.  But, we set an aggressive goal for ourselves, a firm,
finite deadline for us to work towards.  PR/Marcom aimed all the
reviews and newspaper articles and press releases for that date, Web
had gotten all the content prepped and ready to go, and like it or not
we had to ship or otherwise lose the tremendous groundswell of
interest and buzz that most startups only dream of.

Then came the night of shipping.

Most people were still in the office as the 12:00am deadline
approached, huddled around computers, watching and waiting.  There was
a palpable anxiety in the air, but it was the good kind -- everyone
could sense how well we would be received, and we were about to make
our first concerted mark upon the world -- us against those who
doubted us, our technology and even the viability of an anti-spam
industry.  There were last minute compiles, last minute bug fixes,
last minute web content tweaks, last minute server configuration
changes, last minute adjustments to any and every thing that was going
out.  Articles from the Wall Street Journal and LA Times were waiting
to go live, our press release and web site were almost literally under
a big red button, ready to hit the wire at a moment's notice.  And
while we were dashing back and forth to make everything as perfect as
possible, the deadline loomed.  Under an impossible time constraint,
we weren't calmly waiting for it, we were frantically trying to beat
it.

And we did.  We came together as a Team, under immense pressure and
Great Expectations, and we did it -- we shipped, we shipped, we
shipped!  That single moment culminating into release is still as
intoxicating to me now as it was that very night.

And now I am once again proud, almost awed to see that same relentless
energy and unbreakable spirit alive and well.  Huddled around
computers, one group focused on QA, another on development/bugfixing,
yet others working on website content and deployment, all in unison
and everyone pitching in where possible -- hey try this, whoops we
missed that, wow that looks great!  Our deadline loomed, lots of last
minute changes right up until the end, and due to the inevitable
problems from such tight deadlines there was an ever-so-slight patina
of despair in the air -- but nothing broke our determination.  We
would be damned if anything was going to keep us from our goal, come
hell or high water we were going to ship our new website and 4 -- 4!!
products -- all in one night.  No one left until their job was done,
and even then some stayed longer to help, if not in function then in
presence and spirit.  And those that weren't there, either polling the
website, providing suggestions on SILC, or just thinking about
shipping, well, they were there too.

Folks, this is who we are, what we stand for -- as a united team of
talented, passionate, determined people we are a force to be reckoned
with, and one that is easily underestimated.  We can achieve anything
we put our efforts towards so long as we work hard, work smart, and
work together.  While late nights against impossible deadlines
shouldn't be the norm, I consider it a privilege to share in them and
hope that this great spirit of ours never dies.

Cheers, and I hope you all have a festive, restful 4th of July
weekend.

--jordan

Indeed, the great spirit does live on.

A Real Ruby 1.8.7 → 1.9.3 Migration


Recently it came to pass that my company needed to update its Platform Framework and Applications, which was based on Ruby 1.8.7 (Ruby Enterprise Edition). New features required new gems which required newer versions of existing gems which either broke existing conventions/APIs or were no longer supported on 1.8.7. Then we learned REE had reached End of Life. Nuts!

I’ll save a deeper analysis of the why for a separate post. The short of it is that we were already faced with a full regression test impact to get the changes we needed, so it made sense to invest a few more weeks and bite the whole bullet: hop onto the next Ubuntu LTS release in production, upgrade all of the (mature) gems, and upgrade the VM to the now-stable Ruby 1.9.3. This was a Big Deal.

How to go about this, then? Well, the ChangeLog could be enlightening, and of course there’s tons of blog posts about 1.9.3 itself. But when it came to actual conversions, the wisdom was surprisingly scarce. My sense is that the Ruby community is still relatively small overall, only a small subset of the community has ever (successfully) built stable, mature, long-lived platforms on the language, and maybe only a tiny fraction of them have ever had to work through the EOL of their particular VM and get on a new version.

So, here’s the skinny on converting a large, mature Ruby codebase from 1.8.7 (REE) to 1.9.3. While it’s not exhaustive, it will be representative of the sort of things to look out for. To be fair, a number of the problems encountered were due to “liberties” we had taken with the syntax – a core value of Ruby, and a big reason why so many love it. But that’s harsh criticism that misses the practical reality of a team of developers working hard and fast through the evolution of a large codebase over an extended period of time.

This post is by no means an indictment of 1.9.3. No, 1.9.3 is Better, and in many cases also Faster. But it ain’t perfect, and you’ll do better if you have some idea of what to expect.

Syntax

First up, syntax. While some of this was expected, some things were pretty subtle (e.g. array splat behavior).

  • rescue gains higher precedence than not
foo and not (nil.nonexistent_method rescue nil) # => raises exception
foo and !(nil.nonexistent_method rescue nil)    # => works
  • parentheses generally required where precedence gets confusing
raise Foo, "bar"  unless blort # => works on 1.8.7
raise Foo, "bar"  unless blort # => raises exception on 1.9.3
raise(Foo, "bar") unless blort # => works on 1.9.3
  • can’t use multi-value assignment as conditional anymore
if foo and (a, b = 1, 2) # => raises exception
  • can’t use instance variables as formal block params
foo.inject({}) { |@h, v| .. } # => raises exception
  • splat Array/List deref remains Array regardless of # elements
def foo(*args) bar = *args end
a = foo(1); a # => 1     (1.8.7)
a = foo(1); a # => [1]   (1.9.3)
  • splat behavior exception: when assigning to list of lvalues
def foo(*args) bar = *args end
a, b = foo(1); a # => 1     (1.8.7)
a, b = foo(1); a # => 1     (1.9.3)

Encoding

Anyone who has worked with non-ASCII character sets in 1.8.7 will know of Ruby’s shortcomings when it comes to other encodings and codepages. 1.9 improves the situation significantly, and actually deprecates the iconv library.

  • embedded UTF-8 chars/strings require explicit file-wide encoding
    • insert # encoding: UTF-8 at top of each affected file
  • manual “guards” against invalid byte sequences require forced binary encoding

We had several cases using a Regex to check a String for invalid byte sequences, as in user-provided inputs or validating email addresses. It simplifies to this:

foo = "\xff"   # => "\xFF"
bar = /#{foo}/ # => ArgumentError: invalid multibyte character

foo.force_encoding('binary')
bar = /#{foo}/               # => /\xFF/

Note: You’ll get a SyntaxError exception instead if you run this in irb.

  • $KCODE is gone; ruby -K is not its equivalent - __ENCODING__ is
    • __ENCODING__ only set by shell locale encoding (*nix) or codepage (Win32)
*nix
LC_CTYPE=en_US.iso8859-1 ruby -e "puts __ENCODING__" # => "ISO-8859-1"
LC_CTYPE=en_US.utf-8     ruby -e "puts __ENCODING__" # => "UTF-8"
Win32
ruby -e "puts __ENCODING__" # => "IBM437"
chcp 65001
ruby -e "puts __ENCODING__" # => "UTF-8"

Beware: Win32 Ruby gets unstable quickly (i.e. crashes) when the default codepage is UTF-8

  • (Win32) win32console completely breaks $stdout encoding
    • given the right encoding, UTF-8 chars will display to $stdout correctly
    • after loading win32console, UTF-8 is garbled again
puts "#{foo = File.read('foo.txt').chomp} -> #{foo.force_encoding('utf-8')}"
require 'win32console'
puts "#{foo = File.read('foo.txt').chomp} -> #{foo.force_encoding('utf-8')}"
# FIXME: update with better example

Threading (Win32)

  • Thread is no longer Green - it’s now a native thread

We wrap Microsoft Word with an asynchronous control channel using OLE and EventMachine-driven AMQP, primarily for high fidelity split/combine & track-changes operations on Word documents.

Each AMQP message spawns off in a new Thread. With green threads in 1.8.7, we were effectively isolated from the vagaries of the various COM threading models underlying OLE. Under 1.9.3, however, each Thread is native and thus needs its own OLE object handle; otherwise you get a WIN32OLERuntimeError exception with the following message:

The application called an interface that was marshalled for a different thread

We found a good discussion on the topic and a potential solution involving calling CoInitialize/CoUninitialize per Thread; however, it didn't solve our problem.

Instead, we simply memoized the OLE object handle in Thread-local storage:

def msword
    Thread.current[:msword] ||= WIN32OLE.connect('word.application') rescue WIN32OLE.new('word.application')
end

Kernel

  • Kernel method enumerators produce an Array of Symbol, not String

Module

  • Module.constants produces an Array of Symbol, not String
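Both the Kernel and Module changes are easy to sanity-check on a current Ruby; a minimal sketch:

```ruby
# Method lists and constant lists now hold Symbols, not Strings.
puts Object.new.methods.first.class  # => Symbol
puts Module.constants.first.class    # => Symbol

# So code that did string matching against these lists must compare
# Symbols (or convert): methods.include?("inspect") is always false now,
# while methods.include?(:inspect) works.
```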

Enumerable

  • Enumerable.map behavior changes
    • implications for a loop that expects the yielded value to work like a String (e.g. value#[])
("a".."c").map      # => ["a", "b", "c"]             (1.8.7)
("a".."c").map      # => #<Enumerator: "a".."c":map> (1.9.3)

Exception

  • Exception#message is now immutable

Many of our Platform Framework libraries wrap various vendor-specific Exception classes with our own, to keep our library consumers within their own domain and to guard against coupling consumer unit tests to indirect vendor-specific behaviors. In some cases we add our own prefix/postfix to existing #message strings. Can’t do that in 1.9.3:

Exception.new.message << "foo" # => RuntimeError: can't modify frozen String
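If you still need to decorate a message, building a new String (rather than mutating the frozen one) works on both versions; a minimal sketch, with the wrapper class name purely illustrative:

```ruby
err = RuntimeError.new("boom")

# Concatenation creates a fresh String; the original stays untouched.
decorated = "while saving: " + err.message

# Then re-raise under your own wrapper, e.g.:
#   raise OurPlatformError, decorated
puts decorated  # => while saving: boom
```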
  • Exception#message now forces #to_s on value

We had some cases where we wrapped vendor exceptions with our own and tunneled the original Exception in #message. That won't work by default in 1.9.3, so we added this:

class Exception
    attr_reader :message
    def initialize(*data)
        @message = data.first
        super
    end
end

Hash

  • Hash now natively respects insert order
    • e.g. beware web tests that compare stringified JSON results
  • inserting into Hash while enumerating it raises an exception
foo = { :a => :b }
foo.each { |h| foo[:c] = :d }
  # => RuntimeError: can't add a new key into hash during iteration
  • Hash#merge doesn’t use Hash#[]= anymore
    • e.g. stringify’ing things upon assignment doesn’t work with Hash#merge
class A < Hash; def []=(k, v) super(k.to_s, v.to_s) end end
foo = A.new
foo[:a] = :b        # => {"a"=>"b"}
foo.merge(:c => :d) # => {"a"=>"b", :c => :d}
  • Hash enumerable methods now return an Enumerator
Hash.new.map.class         # => Array       (1.8.7)
Hash.new.map.flatten.uniq  # => works       (1.8.7)
Hash.new.map.class         # => Enumerator  (1.9.3)
Hash.new.to_a.flatten.uniq # => works       (1.9.3)
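The insertion-order change at the top of this section is easy to demonstrate, and is exactly what bites tests that compare stringified JSON results:

```ruby
h = {}
h[:z] = 1
h[:a] = 2

# 1.9.3 preserves insertion order; 1.8.7 made no such guarantee.
p h.keys  # => [:z, :a]
```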

Symbol

  • Symbol#match now exists but doesn’t work like String#match
"foo".match(/foo/) # => #<MatchData "foo">
:foo.match(/foo/)  # => 0

String

  • String no longer directly enumerable, so #map, #grep etc. are gone
    • but you do get String#each_{byte,char,codepoint,line}
"foo".map # => ["foo"]                                              (1.8.7)
"foo".map # => NoMethodError: undefined method `map' for "":String  (1.9.3)
  • String#[Symbol] doesn’t (silently) work anymore
"foo"[:a] # => nil                                           (1.8.7)
"foo"[:a] # => TypeError: can't convert Symbol into Integer  (1.9.3)
  • String#[Fixnum] returns char-as-string, not ordinal
"foo"[1] # => 111  (1.8.7)
"foo"[1] # => "o"  (1.9.3)
  • String#include? no longer takes a Fixnum (as ordinal)
"123".include?("2") # => true
"123".include?(50)  # => true                                         (1.8.7)
"123".include?(50)  # => TypeError: can't convert Fixnum into String  (1.9.3)
  • String#strip doesn’t work on non-ASCII whitespace
" \xc2\xa0".strip                    # => "\302\240" (1.8.7)
" \xc2\xa0".strip                    # => " "        (1.9.3)

But using the POSIX character class does:

" \xc2\xa0".gsub(/[[:space:]]/, '')  # => "\302"  (1.8.7)
" \xc2\xa0".gsub(/[[:space:]]/, '')  # => ""      (1.9.3)

Binding

  • getting an object’s binding requires #instance_eval instead of #send

A good example is the pattern of using an OpenStruct initialized with a Hash (which provides method accessors for keys) as a Context object for template interpolation systems like Erubis.

This used to work in 1.8.7:

template = ERB.new("foo.erb")
context = OpenStruct.new(some_object_with_methods).send(:binding)
output = template.result(context)

This works in 1.9.3 and 1.8.7:

template = ERB.new("foo.erb")
context = OpenStruct.new(some_object_with_methods).instance_eval { binding() }
output = template.result(context)

Fixnum

  • Fixnum#to_sym no longer (silently) works
1.to_sym # => nil                                                    (1.8.7)
1.to_sym # => NoMethodError: undefined method `to_sym' for 1:Fixnum  (1.9.3)

Float

  • fraction-less Float becomes a Fixnum through #to_s (precision lost)
    • e.g. cucumber test with a table containing a Float like 10.0
Float(0.0).to_s # => "0.0"  (1.8.7)
Float(0.0).to_s # => "0"    (1.9.3)

BigDecimal

BigDecimal problems were among the most difficult in the conversion. There are differences in behavior between VM versions and amongst BigDecimal representations within the 1.9.3 VM itself.

  • BD init by String skips precision altogether
Float(80.784).to_d.to_s('F')             # => "80.78400000000001"
BigDecimal.new("80.784") == 80.784.to_d  # => false
  • BD compared to Float makes a BD implicitly from Float
80.784.to_d == 80.784              # => true   (but it shouldn't!)
BigDecimal.new("80.784") == 80.784 # => false  (but you might think it should!)
  • BD + Float makes a Float
BigDecimal.new("80.784") == 80.784       # => false  (OK we just learned that)
BigDecimal.new("80.784") + 0.0 == 80.784 # => true   (ZOMGGGGG)
  • Float#to_d ignores class-level precision limit setting entirely
BigDecimal.limit(5)  # docs say this sets default precision for newly created BDs
80.784.to_d          # => #<BigDecimal:7fbe331c7e58,'0.8078400000 000001E2',27(45)>

Note: Float#to_d comes from the bigdecimal library.

  • Float#to_d uses a different “default precision” than BigDecimal.new
BigDecimal.limit                                    # => 0
BigDecimal(80.784, BigDecimal.limit).precs          # => [45, 54]
80.784.to_d.precs                                   # => [27, 45]
BigDecimal(80.784, BigDecimal.limit) == 80.784.to_d # => false
  • BD invocation differs depending on init param type

This makes writing generalized code for handling a BigDecimal overly complicated.

BigDecimal.new("80.784") # => 80.784
BigDecimal.new(80.784)   # => ArgumentError: can't omit precision for a Rational.

A Practical Look at the Problem

The Float vs. BigDecimal issue is about precision and “true representations” of Rational numbers vs. representations that are incorrectly assumed to be precise (like Float). The story almost always ends the same way: mixing differing-precision numbers in math calculations is just a Bad Idea. Mixing BigDecimal and Float certainly qualifies as this.

The problem in 1.9.3 is that different invocations of BigDecimal also qualify as differing-precision numbers. A String-initialized BigDecimal will skip any precision. Convenience methods like #to_d will produce a BigDecimal using a different “default precision” than the normal #new. #to_d doesn’t even respect BigDecimal.limit.

Sure there are ways to compare the precision of a BigDecimal, and it is possible to write extra code to ensure same-precision computation and comparison. But on a practical level that’s terribly obnoxious and simply not going to happen. Your average application developer just doesn’t care, and shouldn’t. It’s hard enough to avoid mixing Float and BigDecimal, and getting that right should be enough. Almost.

A Practical Solution

After adhering to the “don’t mix fractional types” rule, we were left with a bunch of broken tests around an ORM property type designed for persisting a BigDecimal. In the end, those all resolved to this simple, new behavior in 1.9.3:

BigDecimal.new("80.784") == 80.784  # => false

So a BigDecimal is being compared against a Float – big surprise that a mixed precision comparison causes problems! But practically speaking, it’s simply unrealistic for it to happen any other way: no one’s going to adjust every line of their code and tests to detect+match precision, no one’s going to update fixture loaders to know about and adjust to ORM-specific representations of precision/scale-limited rational numbers, no one will investigate+adjust test harnesses like Rspec and Cucumber/Gherkin to understand when a rational should be a Float vs. a BigDecimal and at what precision. Realistically, fuck all of that.

Instead, we just changed the implicit type conversion when comparing against a Float to cast to a String first:

class BigDecimal
    def ==(other)
        other = BigDecimal(other.to_s) if other.kind_of?(Float)
        return super(other)
    end
end

Then the test cases start working again, without having hacked any representation’s real scale/precision:

BigDecimal.new("80.784") == 80.784  # => true

And just like that, the old behavior. Awesome.

FYI we used 80.784 because it was a (calculated) value in some tests that triggered the BD-related problems. There’s nothing else special about the value.

Closures

  • lambda rules about arity have changed
lambda{}.call(self) # => nil                                                 (1.8.7)
lambda{}.call(self) # => ArgumentError: wrong number of arguments (1 for 0)  (1.9.3)

This manifested as a result of an old rspec gem (1.2.9) that wasn't getting upgraded with the mix. "it" declaratives without closures would raise a special NotImplementedYetError exception, automatically marking them as pending. rspec did so using a lambda, which it later called with an argument -- which doesn't work anymore in 1.9.3. Swapping it to a proc with a monkey patch solved that.
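The underlying arity difference between lambda and proc is easy to see, and is why swapping one for the other works; a minimal sketch:

```ruby
strict = lambda { :pending }
loose  = proc   { :pending }

# Procs silently discard surplus arguments...
p loose.call(:extra_arg)  # => :pending

# ...while 1.9 lambdas enforce their declared arity.
begin
  strict.call(:extra_arg)
rescue ArgumentError => e
  puts e.message  # wrong number of arguments (...)
end
```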

  • initializing a proc with a proc no longer catches return-from-inside-closure

This was a hack originally invented to wrap our ORM’s transactional layer with our own, in order to provide better commit/rollback semantics and deferred AMQP message delivery (outside the transaction window).

In general, we try to avoid obfuscating our code paths with too much nesting by early-exiting from subroutines whenever possible, usually by calling return. And sometimes those subroutines end up being transaction closures.

From a pedantic POV that’s a terrible idea – closures aren’t methods, they have slightly different semantics, blah blah blah. Practically speaking, who cares. It should Just Work. So we made it work. And then it totally broke in 1.9.3.

It essentially resolved to this:

def orm_txn
    ret = yield
ensure
    return ret
end

def transaction(&block)
    orm_txn { proc(&block).call }
end

transaction { return 1 } # => 1    (1.8.7)
transaction { return 1 } # => nil  (1.9.3)

The idea of adhering to the Principle of Least Surprise around return from closures wasn't unique to us; Sinatra had the same concept too, just using a different mechanism: unbound methods.

So we use that now, and the method works like a charm on both 1.8.7 and 1.9.3:

def orm_txn
    ret = yield
ensure
    return ret
end

def transaction(&block)
    orm_txn { Class.new.instance_eval { define_method("", &block) }.call }
end

transaction { return 1 } # => 1
transaction { return 1 } # => 1

Hooray!

Library / Gems

  • FasterCSV renamed to CSV and included in the core Ruby distribution
  • json (v1.5.4) included in the core Ruby distribution
    • beware non-rubygems $LOAD_PATH manipulations – Ruby’s libs always come first
  • soap4r no longer bundled in the core Ruby distribution
    • old soap4r gem is incompatible with 1.9 – use soap4r-ruby1.9
  • sha1 library is gone
    • use digest/sha1 instead
  • ruby-debug gem et al. renamed to debugger gem et al.
  • the current working directory (".") is not in $LOAD_PATH by default anymore
  • XML parsing is more strict
    • can’t embed dash-dash ("--") inside XML comments
  • YAML parsing is more strict
    • bare regexes need to be quoted
    • bare regexes containing “escaped”-style character classes require single-quoting
  • YAML parsing also got a little more convenient
    • Time/DateTime/Date values serialized as String are auto-converted to native type
  • Net::HTTP::Post.set_form_data now uses URI.encode_www_form

This subtly broke signature calculation on our Oauth consumers vs. our provider.

Oauth consumers (oauth gem) use Net::HTTP::Post while Oauth providers (oauth-provider gem) use Rack::Request, but param collection and serialization for signature calculation was effectively equivalent in 1.8.7.

Now Net::HTTP::Post uses URI.encode_www_form, and that produces a different result when given a param key with a nil value. Thus signature calculation on the consumer would differ from what the provider arrives at, and request authentication would fail.

Our “fix” was to just guard against nil parameters before making requests.

params = { :foo => nil }
post = Net::HTTP::Post.new("/")
post.set_form_data(params)

post.body # => "foo="  (1.8.7)
post.body # => "foo"   (1.9.3)

Evil!
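For reference, the nil-parameter guard we settled on amounts to stripping nil-valued entries before building the request; a minimal sketch (param names are illustrative):

```ruby
params = { :foo => nil, :bar => "baz" }

# Drop nil-valued params so both 1.8.7 and 1.9.3 serialize identically.
clean = params.reject { |_key, value| value.nil? }

p clean  # => {:bar=>"baz"}
```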

  • open-uri is a rainbow-hating, puppy-murdering Nazi who wants to make your life harder

If the userinfo portion of a URI is set, it raises an exception:

    if target.userinfo && "1.9.0" <= RUBY_VERSION
      # don't raise for 1.8 because compatibility.
      raise ArgumentError, "userinfo not supported.  [RFC3986]"
    end

This turned out to be one of those tired, irritating pedantic engineer vs. practical utility arguments where the pedant won. Hooray for pedantic idiots!

Conclusion

Well, that’s all I got for now! Check out Harvest’s Ruby 1.9.3 upgrade blog post for some additional pitfalls to watch out for.

5 Years Later, Why I Don’t Blog Much


It’s been almost 5 years since the last update, and now I’ve decided it’s finally time. Time to update my website with a new look, time to participate in the Public Think and make my contribution again.

For the record, I’ve resisted blogging anything forever, mainly because:

  • Who has time? (Not me. I spend my time applying myself to the things I love.)
  • Who can risk the exposure? (Not me. I’m a Professional with a Career.)
  • Who cares? (Usually, not me.)

I’m perpetually floored by how many people qualify against these criteria. Really? Don’t you have a day job? Aren’t you any good at it? (Practice takes time, and you’re obviously not practicing.) Do you love to do anything else in your life besides narcissistically plaster your verbal diarrhea bullshit all over the Web? (You should probably go do that instead.) Did you somehow miss the time-honored wisdom of not making a total ass of yourself in public, especially when the Internet will never forget it? Don’t you have anything to lose? I mean, seriously!

Sure, LinkedIn is great - it’s about jobs, careers, credibility. Facebook - ok fine, it’s an interesting concept to have 10 times more relationships, each worth 1/100th the value of a normal friendship. But garbage like Twitter? They’re ruining the Internet with noise - 140 chars isn’t enough space to express anything truly meaningful. In the CS domain, statistically, 140 chars looks almost like the definition of uselessness. Yeah sure people use it usefully, but the signal-to-noise ratio is intergalactically, unforgivably huge.

But, that doesn’t mean I shouldn’t keep trying. So, down the rabbit hole we go. Again.