Thursday, December 06, 2007

Communicating your intent in Ruby

I've been using Ruby almost every day for about two years now. While I'm no expert, I know enough to be fairly productive in it. And beyond liking the succinctness and power that you often hear other people talk about, I find it's made me a better programmer. But there's an aspect of Ruby that worries me somewhat.

To start, programming is rightfully recognized as a means to build something from pure thought. But it's also a form of communication, both to other programmers who will touch your code later and to yourself when you look at it months from now. We're at a point where, outside of embedded and spacecraft programming, we have the luxury of using programming languages that focus on ease for the programmer rather than ease for the machine. Fundamentally, that's the philosophy that Ruby takes.

And while Ruby's nice in a lot of ways, I'm not sure about how it communicates an object's interface. When you're allowed to modify objects and classes on the fly, how do you communicate the interfaces between the modules you mix in and the methods and modules you add? By interface, I mean: how do you use this class so that it does what it's supposed to? Normally, it's pretty obvious--you look at the names of the methods declared in the code. A well-written class exposes its public methods, or you look at its ancestors' public methods. You might need some documentation to figure out how to call them in the right order, but generally, you have some idea just by looking at the method signatures.

However, when you throw mixins and metaprogramming into the mix, it becomes harder to tell how to use a class just from the method signatures in the code--that is, from the structure of the code. You have to actually read the method bodies, or you have to rely on someone who knew the intent to document it in detail.

An example of communicating interfaces for mixins: the Enumerable module contains a lot of the collection-related methods. The cool thing is that if you want these methods in your own class, all you have to do is define each() in your class and mix in the Enumerable module, and you get all of them "for free". However, unless the documentation explicitly states it, nothing in the method signatures makes it obvious that this is what you have to do in order to use it. It's only after scanning through the entire code that you notice each() being used by all the methods.
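To make the contract concrete, here is a minimal sketch. NumberBag is a hypothetical class invented for illustration; the only thing it supplies is each(), and everything else comes from Enumerable.

```ruby
# NumberBag is a hypothetical class, made up for this example.
class NumberBag
  include Enumerable

  def initialize(*numbers)
    @numbers = numbers
  end

  # The one method Enumerable expects the host class to supply.
  # Nothing in this class's signatures says so; you just have to know.
  def each(&block)
    @numbers.each(&block)
    self
  end
end

bag = NumberBag.new(3, 1, 2)
bag.sort                 # => [1, 2, 3]
bag.map { |n| n * 2 }    # => [6, 2, 4]
bag.include?(2)          # => true
```

Note that sort, map, and include? never appear in NumberBag itself, and the dependency on each() appears nowhere in a signature either.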

Of course, Ruby has enough metaprogramming power to let you protect yourself against this. One can do something like this:

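# Raised when a class mixes in a module without defining a method it relies on.
class MethodNeededError < RuntimeError
  def initialize(method_symbol, klass)
    super "Method #{method_symbol.to_s} needs to be in client class #{klass.inspect}"
  end
end

module Enumerable
  # Hook that Ruby calls whenever Enumerable is included in a class or module.
  def self.included(mod)
    raise MethodNeededError.new(:each, mod) unless mod.method_defined?(:each)
  end
end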

This only works if you put the include after you define each(). That's just asking for trouble, since the order of the definitions in your class now matters.
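A small sketch of that ordering problem follows. To avoid patching the core Enumerable module, it uses a hypothetical stand-in module called NeedsEach with the same kind of included hook; GoodOrder and BadOrder are likewise made-up names.

```ruby
class MethodNeededError < RuntimeError
  def initialize(method_symbol, klass)
    super("Method #{method_symbol} needs to be in client class #{klass.inspect}")
  end
end

# Hypothetical stand-in for Enumerable, so the core module is left untouched.
module NeedsEach
  def self.included(mod)
    raise MethodNeededError.new(:each, mod) unless mod.method_defined?(:each)
  end
end

# each is defined before the include, so the hook is satisfied.
class GoodOrder
  def each
    yield 1
  end
  include NeedsEach
end

# The include comes first, so the hook fires before each exists
# and the rest of the class body never runs.
begin
  class BadOrder
    include NeedsEach
    def each
      yield 1
    end
  end
rescue MethodNeededError => e
  puts e.message
end
```

Both classes contain exactly the same definitions; only the position of the include line differs, which is the sort of thing a reader would not expect to matter.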

A fair number of people are writing mini-DSLs in Ruby using metaprogramming tricks. One of the common ones is the use of method_missing to define or execute methods on the fly. ActiveRecord's dynamic finders are implemented this way. The cost to communicating the interface through the structure of the code is obvious: unless it's documented well, you can't tell what methods are available just by looking at the method signatures.
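As a rough illustration of the technique, here is a sketch of a method_missing-based finder. PersonFinder and FINDER_PATTERN are hypothetical names, and this is not how ActiveRecord actually implements its finders; it only shows how an interface ends up living inside a regex.

```ruby
# PersonFinder is a hypothetical class; this is not ActiveRecord's code.
class PersonFinder
  # The real "interface" lives in this regex, not in any method signature.
  FINDER_PATTERN = /\Afind_by_(\w+)\z/

  def initialize(people)
    @people = people # array of hashes, e.g. { name: "Ann", city: "Oslo" }
  end

  def method_missing(name, *args, &block)
    match = FINDER_PATTERN.match(name.to_s)
    return super unless match

    attribute = match[1].to_sym
    @people.find { |person| person[attribute] == args.first }
  end

  # Keeps respond_to? consistent with what method_missing handles.
  def respond_to_missing?(name, include_private = false)
    FINDER_PATTERN.match?(name.to_s) || super
  end
end

finder = PersonFinder.new([
  { name: "Ann", city: "Oslo" },
  { name: "Bob", city: "Lima" }
])

finder.find_by_city("Lima")[:name]           # => "Bob"
finder.respond_to?(:find_by_city)            # => true
PersonFinder.method_defined?(:find_by_city)  # => false
```

The last line is the point: the call works, but a tool that only looks at defined methods has no way to know that find_by_city exists.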

Why do I harp on interface signatures? I mean, in the case of requiring each(), it works to just let it fail inside the Enumerable methods, since Ruby will complain about each itself. In the case of method_missing, just read the regex in the body. While both of these are true, neither allows rdoc to generate proper documentation. The whole point of documentation is to show you the interface--how to use that piece of code. I'm just afraid that, given Ruby's philosophy of being able to write clear, powerful, and succinct code, it might fall short as people use metaprogramming tricks like alias_method_chain and method_missing more and more. Maybe rdoc needs to be more powerful and read code bodies for regexes in method_missing? It already documents yields in code bodies, but that seems awfully specific.

I'm not exactly a fan of dictating interfaces like in Java. When you're first coding something up, you're sketching, so things are bound to change. Having plumbing like interface declarations gets in the way, imo. However, when something's a bit more nailed down, it'd be nice to be able to communicate your intent to other programmers without them having to read code bodies all the time.

In the end, I side with flexibility. However, I kinda wish Ruby had some type of pattern matching for methods so I didn't have to read method_missing all the time. But then again, that would be messy in all but the simplest schemes. Can you imagine a class that responded to email addresses as method calls? I guess I'd have to file this one under "bad ideas".
