Sunday, May 13, 2007

Ruby Snippet: Caching object attributes

Here's another little snippet. I'm not sure if I'm extracting boilerplate code prematurely, but it seems right.

A good general rule of thumb for refactoring is to ask an object for its value each time you need it, rather than storing the value in a temporary variable. In short stretches of code, you can get away with assigning the value to a temporary variable and then using that variable for subsequent calculations:
    # c = Collection.new((0..500).to_a)
    collection_sum = c.sum
    collection_sum * 3 + 2     # some calculation with the sum
    if collection_sum == 4     # some comparison with the sum
      # do something else
    end
You're essentially caching the value outside of the object, 'c'. This is good if 'sum' is an expensive call. However, if the code segment gets large, this approach tends to get confusing, especially if you store values in temporary variables at different times. (Another good rule of thumb is that a scope should never be bigger than about 10-20 lines.) It's better to refer to the value directly from the object:
    # c = Collection.new((0..500).to_a)
    c.sum * 3 + 2     # some calculation with the sum
    if c.sum == 4     # some comparison with the sum
      # do something else
    end

This is less of a concern in functional programming languages, where variables are effectively constants within a scope. But for object-oriented imperative languages, this generally makes for less messy code.

However, performance becomes a problem if you have to calculate the sum each time you need it. An easy solution is to cache the value inside the object. Take summing the values across some collection: normally that wouldn't be a problem, unless you had hundreds of thousands of entries. If nothing has been added to the collection, the sum stays the same. If something is added, mark the sum as needing an update, and the next time sum() is called, recompute the value.
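Written by hand for a single attribute, that idea might look like the sketch below. The class name `ManualCollection` and the `recomputations` counter are hypothetical; the counter is only there to make the caching visible.

```ruby
# Hypothetical hand-rolled version: one flag plus one cached value.
class ManualCollection
  attr_reader :recomputations # illustration only: counts real recalculations

  def initialize
    @array = []
    @sum = 0
    @sum_outdated = true
    @recomputations = 0
  end

  def add(x)
    @sum_outdated = true # anything added invalidates the cached sum
    @array << x
  end

  def sum
    if @sum_outdated
      @sum = @array.inject(0) { |total, e| total + e }
      @sum_outdated = false
      @recomputations += 1
    end
    @sum
  end
end

c = ManualCollection.new
c.add(1)
c.add(2)
c.sum # recalculates
c.sum # served from the cache
```

Calling `sum` repeatedly between additions only does the work once.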

You'd need both a variable to keep track of whether the attribute needs updating, and another variable to hold the result. If you only have one attribute, that's ok. If you have more than one, it starts to get a little messy. This is where metaprogramming comes in. I wrote a piece of code that does the bookkeeping of caching attributes for you. You'd use it like:
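    class Collection
      include AttributeCache

      def initialize
        @array = []
        cache :sum, :initial => 0
      end

      def add(x)
        outdate_sum   # the cached sum is stale once something is added
        @array << x
      end

      def sum
        # inject(0) so that an empty collection sums to 0 rather than nil
        cached_sum { @array.inject(0) { |t, e| t + e } }
      end

    end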

The code is simple. In the initialization method, there is a call to cache an attribute, sum. Then in the other methods, you mark where the sum becomes outdated, and in the actual call to sum, you specify how to recompute it. The module itself looks like this:
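    module AttributeCache

      # The singleton class of this particular object.
      def metaclass; class << self; self; end; end

      def cache(attr_name, options)
        instance_variable_set "@#{attr_name}", options[:initial]
        # Start out stale so the first read computes the real value.
        instance_variable_set "@#{attr_name}_outdated", true

        # Defines outdate_<attr_name>, which marks the cached value as stale.
        metaclass.instance_eval do
          define_method("outdate_#{attr_name}") do
            instance_variable_set "@#{attr_name}_outdated", true
          end
        end

        # Defines cached_<attr_name>, which reruns the block only when stale.
        metaclass.class_eval %Q{
          def cached_#{attr_name}(&block)
            if @#{attr_name}_outdated
              @#{attr_name}_outdated = false
              @#{attr_name} = block.call
            end
            return @#{attr_name}
          end
        }

      end

    end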

The biggest hurdle was figuring out how to use metaprogramming to define a method that accepts a block. The best I came up with was to use class_eval with a string. If you have better suggestions, let me know.
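One possible answer, assuming Ruby 1.9 or later (which postdates this post): a block passed to `define_method` or `define_singleton_method` can declare a `&block` parameter, so the string-based class_eval is not strictly needed. The sketch below is a rewrite under that assumption, not the original module. `AttributeCacheSketch` and `SketchCollection` are hypothetical names, and `define_singleton_method` stands in for the metaclass helper.

```ruby
# Hypothetical variant of AttributeCache without string evaluation.
# Assumes Ruby 1.9+, where blocks may take a &block parameter.
module AttributeCacheSketch
  def cache(attr_name, options = {})
    value_var = "@#{attr_name}"
    stale_var = "@#{attr_name}_outdated"

    instance_variable_set value_var, options[:initial]
    instance_variable_set stale_var, true

    # outdate_<attr_name>: mark the cached value as stale.
    define_singleton_method("outdate_#{attr_name}") do
      instance_variable_set stale_var, true
    end

    # cached_<attr_name>: rerun the given block only when stale.
    define_singleton_method("cached_#{attr_name}") do |&block|
      if instance_variable_get(stale_var)
        instance_variable_set stale_var, false
        instance_variable_set value_var, block.call
      end
      instance_variable_get value_var
    end
  end
end

class SketchCollection
  include AttributeCacheSketch

  def initialize
    @array = []
    cache :sum, :initial => 0
  end

  def add(x)
    outdate_sum
    @array << x
  end

  def sum
    cached_sum { @array.inject(0) { |t, e| t + e } }
  end
end
```

The trade-off is a few `instance_variable_get` and `instance_variable_set` calls in place of direct `@` access, in exchange for not building method source as a string.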

2 comments:

  1. Anonymous, 4:42 PM

    You could also use memoize to get the same effect

  2. Thanks for the tip~
