Zack Hobson's blog

  • Object#let in Ruby

    Oct 09, 2009

    A common construct in functional programming languages is the let macro, used to define lexically scoped names. The let macro exists in most (all?) Lisp dialects and spiritual descendants, including Clojure. Since it's possible to define lexically scoped variables at any time in Ruby, there isn't much perceived need for a construct like let. However, Ruby's block syntax makes it particularly easy to experiment with functional programming techniques, and in fact there is already a method in Ruby's standard library that is maddeningly similar to let: the tap method.

    The reason I assert that tap is "maddeningly similar" to let is that tap does not return the result of the block; instead, it always returns the receiver. This makes it possible to take a value or values and create lexically scoped names referring to them in a block, but there is no way to access the results of that block unless you explicitly stash them in the receiver or in a variable that already exists outside the block. Even so, tap has its uses, which is why it's been built into Ruby since 1.9.

    # output: up is down
    "down".tap { |up| puts "up is #{up}" }

    A Ruby implementation of let would be similar to tap in that it would allow you to create a lexical block, but it would differ in that it evaluates to the block result:

    # output: up is down
    puts "down".let { |up| "up is #{up}" }

    If you don't care about the result of the block, you can use either one. If you want the block result rather than the receiver, use let instead of tap:

    # output: UP IS DOWN
    puts "down".let { |up| "up is #{up}" }.upcase
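    The difference in return values can be checked directly. This is a minimal sketch that assumes the one-line Object#let given later in this post; the variable names from_tap and from_let are only for illustration.

```ruby
# Assumes the one-line Object#let from this post; tap is already built in.
class Object
  def let
    yield self
  end
end

from_tap = "down".tap { |up| "up is #{up}" }
from_let = "down".let { |up| "up is #{up}" }

p from_tap  # => "down" (the receiver; the block result is discarded)
p from_let  # => "up is down" (the block result)
```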

    I find myself wanting to use tap or let in cases when I need to use a calculated value multiple times. I could just create a temporary variable, of course:

    c = some_complex_expression
    do_something if c
    other_method(c)

    Instead I can use tap (or let if it's available) to achieve the same effect:

    (some_complex_expression).tap do |c|
      do_something if c
      other_method(c)
    end

    This is quite useful in ERB templates, where creating temporary variables feels especially wrong.
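    As a rough sketch of what that might look like, here is a small ERB template that uses tap to name a computed value once and reuse it. The template text, the names array, and the use of names.sort.first as a stand-in for some_complex_expression are all made up for illustration.

```ruby
require "erb"

# Hypothetical template: names.sort.first stands in for some_complex_expression.
# tap gives the value a block-scoped name without a separate assignment tag.
template = ERB.new(<<~TEMPLATE)
  <% names.sort.first.tap do |first| %>
  <% if first %>First name: <%= first %><% end %>
  <% end %>
TEMPLATE

names = ["zack", "justin", "dan"]

# Render against the current binding so the template can see `names`.
output = template.result(binding)
puts output  # the rendered text includes "First name: dan"
```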

    You can also exploit a feature of Ruby to get multiple lexical names per block. If the receiver is an Array and you provide multiple names in the block arguments, the contents will automatically be broken out. This works with either tap or let:

    (1..3).to_a.tap do |one, two, three|
      puts "one is #{one}"
      puts "two is #{two}"
      puts "three is #{three}"
    end
     
    puts((1..3).to_a.let { |one, two, three|
      ["one is #{one}", "two is #{two}", "three is #{three}"]
    }.join("\n"))

    The implementation of both methods is trivial. It's not as if a developer who wants to use let needs to wait for some Ruby core developer to code it up for them. Here's the complete implementation of Object#let:

    class Object
      def let
        yield self
      end
    end

    The tap method (which is already in the standard library) is about twice as large, weighing in at two trivial lines:

    class Object
      def tap
        yield self
        self
      end
    end

    To other developers who stumble across this, I pose this question: what techniques and/or idioms do you favor when a scoped temporary variable is called for?

  • Beware of Top Level Methods in Ruby [updated]

    Jun 19, 2009

    In a Rails project we're currently developing, I ran into an odd failure while running rake:

    $ rake gems:unpack
    (in /home/hobson/railsproject)
    rake aborted!
    undefined method `[]' for :sourdough:Symbol
    /home/hobson/railsproject/lib/tasks/import.rake:113:in `method_missing'
    ...

    Why does this trace end in a totally unrelated rakefile? A look at the source of this file provided a simple explanation: in order to break up the logic of a complicated Rake task, someone had defined a large number of constants and utility methods (including method_missing!) without a surrounding class or module namespace. Other Rake tasks were accidentally triggering this method_missing implementation because it was defined at the top level. This is why it's never a good idea to define methods in the top level of any Ruby file that might be included as part of a larger system. The solution is to wrap the methods in a module, and include that module in the task namespace:

    module ImportUtil
      # you can define any method you want here, including method_missing.
    end
     
    namespace :import do
     
      include ImportUtil
     
      task :something do
        # you can now call everything defined in ImportUtil as a top level method
      end
    end

    This avoids poisoning the top level of your Ruby environment with Rake utility methods. The lesson here is that since Ruby makes it so easy to include modules at any level, it's almost never necessary to define a method at the top level. Especially not method_missing.
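    To see why a top-level method_missing is so dangerous, here is a simplified reproduction. The method body is invented for illustration (the original rakefile's version isn't shown here), and the sketch removes the method again right away so it doesn't affect anything else.

```ruby
# A top-level def becomes a private instance method of Object, so this
# method_missing is inherited by every object in the process.
def method_missing(name, *args)
  "swallowed #{name}"
end

# A Symbol has nothing to do with our helpers, yet the call lands in them.
leaked = :sourdough.no_such_method

# Clean up so the rest of the program gets normal NoMethodError behavior back.
Object.send(:remove_method, :method_missing)

p leaked  # => "swallowed no_such_method"
```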

    UPDATE: Whoops, it turns out the solution I proposed above does not work. Since rake "namespaces" are really just blocks executed at the same level at which they're declared, the module include causes the methods to be included into the top level anyway. While the final lesson of this post remains true, the only solution I'd recommend (and the one we eventually took with the code above) is to remove the offending code completely, and don't ever define method_missing (or any other method, if you can help it) at the top level.
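    The leak is easy to reproduce without Rake. In this sketch, FakeRake.namespace is a hypothetical stand-in that only yields to its block (real Rake namespaces also scope task names, which is left out here), and import_helper is a made-up helper method.

```ruby
module ImportUtil
  def import_helper
    :helper
  end
end

# Hypothetical stand-in for Rake's namespace: it just yields, so the block
# runs with the same self as the code around it.
module FakeRake
  def self.namespace(_name)
    yield
  end
end

FakeRake.namespace(:import) do
  include ImportUtil  # self is still main, so this includes into Object
end

p Object.include?(ImportUtil)  # => true
p "unrelated".import_helper    # => :helper
```

    If the helpers really are needed, defining them as module-level methods and calling them with an explicit receiver keeps them out of Object, at the cost of a little extra typing.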

  • Fixture Replacements Are Catching On

    Jun 12, 2009

    Ruby on Rails is well known in its capacity as opinionated software, but there are a couple of things on which Rails and I never agreed. First among these is test fixtures, a feature that I learned to loathe as soon as I recognized the alternatives. My favored approach for creating test data is to generate it at testing time using a factory API, an idea I first saw described by Dan Manges a few years back. This idea has since caught on, and now there are many implementations of what are most commonly termed fixture replacements.

    The problems with fixtures are now fairly widely understood: they are difficult to maintain, they separate your test data from your tests, and they can increase the brittleness of your test suite. At the time, however, this idea hadn't caught on widely. Without many alternatives that met our needs, and inspired by Dan's post, Justin Balthrop and I collaborated on a package that directly implemented Dan's proposed API, called ModelFactory. Since then I've used this tool in a handful of projects as a replacement for fixtures, and Justin and I have expanded the functionality of ModelFactory as needs have arisen.

    The purpose of ModelFactory is to generate valid ActiveRecord objects that can be used instead of fixture-generated objects. ModelFactory allows you to clearly demonstrate your intent because you only specify attribute values that you care about, while everything else is valid but essentially (and intentionally) opaque. Here's an example usage of the original ModelFactory API:

    require 'model_factory'
    module Factory
      extend ModelFactory
      default User, {
        :name => 'Factory User',
        :login => 'factoryuser',
        :email => 'factoryuser@example.com',
        :active => true
      }
    end
     
    class SomeTest < Test::Unit::TestCase
      def test_inactive_user
        assert !Factory.create_user(:active => false).active?
      end
    end
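    The core idea, valid defaults plus per-test overrides, fits in a few lines of plain Ruby. This is only an illustrative sketch, not ModelFactory's implementation: TinyFactory, its methods, and the Struct-based User are hypothetical stand-ins so the example runs without ActiveRecord.

```ruby
# Stand-in for an ActiveRecord model so the sketch runs on its own.
User = Struct.new(:name, :login, :email, :active, keyword_init: true)

# Hypothetical factory: stores defaults per class, merges overrides on create.
module TinyFactory
  @defaults = {}

  def self.default(klass, attributes)
    @defaults[klass] = attributes
  end

  def self.create(klass, overrides = {})
    # Overrides win over defaults; everything else stays valid but opaque.
    klass.new(**@defaults.fetch(klass).merge(overrides))
  end
end

TinyFactory.default(User,
  name: "Factory User",
  login: "factoryuser",
  email: "factoryuser@example.com",
  active: true)

user = TinyFactory.create(User, active: false)
p user.active  # => false
p user.login   # => "factoryuser"
```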

    Occasionally I'd check to see what new alternatives had become available. It was becoming clear to me that the ModelFactory API we'd adopted had some inherent limitations, mostly having to do with the creation of unique values and support for namespaced classes. Even so, I hadn't seen anything more appealing until I encountered machinist. As with ModelFactory, you use machinist by defining a set of default properties for your models:

    require 'machinist/active_record'
     
    User.blueprint do
      name   { "Factory User" }
      login  { 'factoryuser' }
      email  { "#{login}@example.com" }
      active { true }
    end
     
    class SomeTest < Test::Unit::TestCase
      def test_inactive_user
        assert !User.make(:active => false).active?
      end
    end

    One of the smartest features of machinist (and one that I borrowed for ModelFactory) is the use of blocks for assigning values. Since these blocks execute in the context of the new instance, it's possible to build values on top of one another. Machinist also includes an additional facility for generating unique values for your test data called Sham. This API can be combined with Faker to generate realistic-looking data for your tests:

    require 'machinist/active_record'
    require 'sham'
    require 'faker'
     
    Sham.name  { Faker::Name.name }
    Sham.email { Faker::Internet.email }
     
    User.blueprint do
      name
      email
      active { true }
    end

    While a facility for generating unique values is obviously needed, I don't see the advantage of having random, realistic-looking test data. Luckily, the use of Faker is optional:

    require 'machinist/active_record'
    require 'sham'
     
    Sham.name  {|i| "Factory User #{i}" }
    Sham.login {|i| "factoryuser#{i}" }
     
    User.blueprint do
      name
      login
      email { "#{login}@example.com" }
      active { true }
    end

    When used in this way, the Sham API seems like an unnecessary component. If all I want is a counter, why not just pass it in to the block generating the values? Here's an example of this technique as it appears in the latest ModelFactory:

    require 'modelfactory'
     
    ModelFactory.configure do
      default(User) do
        name   {|i| "Factory User #{i}" }
        login  {|i| "factoryuser#{i}" } 
        email  { "#{login}@example.com" }
        active { true }
      end
    end

    In this case the blocks are still being called in the context of the new instance, but a block that declares exactly one argument (an arity of one) also gets a counter passed in. This is accomplished using the excellent Ruby 1.9 method instance_exec, backported to earlier versions by ActiveSupport.
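    Here is a minimal sketch of that mechanism, assuming nothing about ModelFactory's internals: Person, BLUEPRINT, and PersonFactory are hypothetical names. Each block runs via instance_exec on the new object, and the counter is passed only to blocks whose arity is one.

```ruby
# Hypothetical model stand-in with plain accessors.
class Person
  attr_accessor :name, :login, :email
end

# Hypothetical blueprint: attribute name => block producing its value.
BLUEPRINT = {
  name:  proc { |i| "Factory User #{i}" },
  login: proc { |i| "factoryuser#{i}" },
  email: proc { "#{login}@example.com" }  # reads login from the new instance
}

module PersonFactory
  @counter = 0

  def self.build
    @counter += 1
    person = Person.new
    BLUEPRINT.each do |attribute, block|
      # Pass the counter only to blocks that declare one argument.
      args = block.arity == 1 ? [@counter] : []
      # instance_exec runs the block with self set to the new person.
      value = person.instance_exec(*args, &block)
      person.public_send("#{attribute}=", value)
    end
    person
  end
end

first = PersonFactory.build
second = PersonFactory.build
p first.email   # => "factoryuser1@example.com"
p second.login  # => "factoryuser2"
```

    Because Ruby hashes keep insertion order, login is assigned before email, which is what lets the email block build on it.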

    In this way I can have the best of both worlds while using less code. In order to improve support for namespaced classes, instances are created differently using the new ModelFactory API:

    class SomeTest < Test::Unit::TestCase
      def test_inactive_user
        assert !User.factory.create(:active => false).active?
      end
    end

    For a while I wasn't sure if I wanted to keep ModelFactory around when so many of its problems are solved by other packages. The combination of legacy compatibility and the features described above has granted ModelFactory a reprieve, at least until I find something that I'd rather be using instead.

    It's important to evaluate your own solutions for continued relevance, and in the case of ModelFactory it took some work for me to get something that was worth keeping around, given the excellent alternatives. However, whether you use ModelFactory, machinist, Factory Girl or something else entirely, you'll be doing yourself a favor by avoiding stock Rails fixtures.

  • Zack Hobson

    OpenSourcery Alumnus