Saturday, December 29, 2007

Painite teaser

It seemed like "What the heck is a Ycombinator" struck a chord with people. In retrospect, I probably should have done less rambling. So after a silence of a week, I'll get to the point. Here's a little teaser of what I've been working on over the holidays.

require 'painite'

p_es = Painite::EventSpace.new do |es|  # constructor call was garbled in the original post; reconstructed
  es << Event.new("cancer") { |pdf|     # event-builder call also garbled; reconstructed
    pdf["cancer = :sick"]    = 0.01
    pdf["cancer = :healthy"] = 0.99
  }

  es << Event.new("test | cancer") { |pdf|
    pdf["test = :pos | cancer = :sick"]    = 0.8
    pdf["test = :neg | cancer = :sick"]    = 0.2
    pdf["test = :pos | cancer = :healthy"] = 0.096
    pdf["test = :neg | cancer = :healthy"] = 0.904
  }
end

p_es["cancer = :sick | test = :pos"] # => 0.8 * 0.01 / (0.8 * 0.01 + 0.096 * 0.99)
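For the curious, the posterior in that last comment is just Bayes' rule. Here's the same arithmetic in plain Ruby (the variable names are mine and have nothing to do with painite's API):

```ruby
# P(sick | pos) = P(pos | sick) P(sick) / P(pos),
# with P(pos) expanded over both values of cancer.
p_sick        = 0.01
p_pos_sick    = 0.8
p_pos_healthy = 0.096

p_sick_given_pos = (p_pos_sick * p_sick) /
                   (p_pos_sick * p_sick + p_pos_healthy * (1 - p_sick))
# => about 0.0776 -- a positive test still leaves you probably healthy
```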

It's still not quite complete, as it's harder than I expected to generalize it. The interface is still a little bit in flux, but I hope to nail it down soon.

Friday, December 21, 2007

Nice little irb console tidbits

Slash7 with Amy Hoy - Secrets of the Rails Console Ninjas

Not a big deal, but it's always nice to be more familiar with tools you use all the time. I knew about "reload!", but had no idea about "_", or the especially YMLicious "y @bob".

referring tip!

Wednesday, December 19, 2007

What the heck is a Ycombinator

I woke up at 4am this morning for no reason and decided to go through Ycombinator in Ruby. Given that I read Hacker News daily, it's a shame I didn't know what a Ycombinator was. I thought the article was pretty clear, at least compared to Raganwald's article. As an aside, purely as a learning experience, it was kinda fun and eye-opening. At least I can see why geeks think it's neat.

I had written a reply to a comment on Hacker News about it, and it turned out to be really long, so I just extracted it out here. I was supposed to start some coding, but I got wrapped up in writing the comment. It ended up pretty long, but shorter than it could have been. You guys get to suffer through the long version. Haha. But I'm not going to go into details. You can read those in the linked articles. Instead, I'll paint some broad strokes.

Y = λf·(λx·f (x x)) (λx·f (x x))

or in Ruby, copied from the aforementioned article:

y = proc { |generator|
  proc { |x|
    proc { |*args| generator.call(x.call(x)).call(*args) }
  }.call(proc { |x|
    proc { |*args| generator.call(x.call(x)).call(*args) }
  })
}

or if you prefer elegance, Tom's solution in response to Raganwald

def y(&f)
  lambda { |x| x[x] }[
    lambda { |yf| lambda { |*args| f[yf[yf]][*args] } }]
end

I found the lambda calculus above hard to read. However, if you go through the code in Y Combinator in Ruby, you'll find it's not too bad. I find that this lecture is also pretty good, as it takes you through it step by step, with a little bit of humor as well.

If I had to take a stab at it: a Y combinator is a way to implement a recursion mechanism when the language doesn't provide named recursion, loops, or iterators, and all you get are first-class functions and a few substitution rules.

Functions are just mappings from one set of things to another set of things--i.e. give it a number, and it'll point to another number. The Y combinator relies on a property of functions that sometimes, when you put something into a function, you get the exact same thing out, i.e. something gets mapped to itself. With f(x) = x^2, f(1) = 1 is an example of this. They call such a value a fixed point of the function.

The thing about functions is that they don't just have to map numbers to numbers. They can map functions to other functions. A derivative is an example of a function that takes one function as an input and spits another function back out, like d(x^2) = 2x. Where is the fixed point of the derivative? One of them is e^x, since d(e^x) = e^x. I'm sure there are more.

This is important because if you can find the point at which a function returns a function unchanged, you can use that to call the function again, which is what we call recursion. All the trickiness you see in the Y combinator is mainly a result of functional programming not keeping state, so you have to pass everything you need into a function. So if you want recursion, you need a mechanism to pass the function itself along, so you can call it again. That mechanism kinda bends back onto itself, and that's why you see a part of the Y combinator repeated twice in the code above, and in the lambda calculus.

It seems pretty inane to use a Y combinator given that modern high-level languages provide named recursion, and if anything, for loops and iterators with closures. But what if you don't have those? How do you process lists/arrays without loops or named recursion? Generally, you'd have to make your own recursion mechanism.
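To make that concrete, here is Tom's y from above put to work defining factorial with no named recursion anywhere (the factorial generator is my own toy example, not from the linked articles):

```ruby
def y(&f)
  lambda { |x| x[x] }[
    lambda { |yf| lambda { |*args| f[yf[yf]][*args] } }]
end

# The generator receives a handle to "itself" (rec) as an argument,
# which is the only way to recurse without a name.
fact = y { |rec| lambda { |n| n.zero? ? 1 : n * rec[n - 1] } }

fact[5]  # => 120
```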

When would you ever have languages that don't have those things? Probably not often. But Forth comes to mind. I've never used Forth, but from what little I know about it, the language starts off with some basic primitives and that's it. No loops, no if statements. How do you get anything done? Well, you build your own control structures. People use Forth because it's possible to build your own very small compiler from the ground up, written in Forth itself, and still understand the entire thing. I suspect it's used in embedded programming for that reason.

So you'd pretty much use it when you want to have a powerful system from first principles. I'm guessing it's possible to build your own computer by implementing only functions and substitution rules in hardware, and then you can derive everything else, numbers, pairings, and even recursions in software. That way, you keep your hardware costs down, while retaining the power of a Turing machine.

Speculating here...but another thing that might be interesting is that a Y combinator might be a way to compress code. Software size should be reducible if the code is compressible. And recursion can be thought of as compressing code, which gets expanded when the recursion runs. I wonder if there are other ways to reduce software bloat with a Y combinator besides recursion?

Tuesday, December 18, 2007

(Aha!) Part of the reason why great hackers are 10 times as productive

I knew in college that some dudes were faster than I was in terms of programming. Since pair programming wasn't exactly encouraged in college, and at work I did mostly prototyping work, I never really knew how fast other programmers worked.

So when I read Paul Graham (and Joel's) claim that great hackers are at least ten times as productive as average programmers (too lazy to cite right now), I was kinda shocked. Surely, ten times is an order of magnitude! Something that takes an average programmer a 40 hour week to do the great hacker can do in a 4 hour afternoon?

I wondered about that, since there are times when I get stuck on something, then I just start stabbing around randomly out of frustration. I had assumed that great hackers were faster only because they had either the experience or insight to side-step whatever I was doing wrong.

But lately, I've been re-reading everyone's essays that write about programming productivity. And one thing that caught my eye the second time around was when Paul Graham was talking about bottom up programming and how he didn't really believe in objects, but rather, he bent the language to his will. He was building new blocks for himself so he could think about the problem at a higher level of abstraction.

This is basic stuff! I mean, that's the whole point of higher-level programming. When you refactor methods out, you're creating a vernacular so that you can express the problem in terms of the problem domain, not in terms of general computing. This is much like if you want to drive a car, you'd want to be able to step on the gas, rather than time the firings of the pistons. And if you want to control traffic in a city, you'd rather tell all cars to go to a destination, rather than stepping on the gas and turning the steering wheel for each one.

But taken into the light of bending a language to your will, it makes it more clear for me as to how great hackers are ten times as productive. Great hackers are productive not only because they know what problems to sidestep and can problem solve systematically and quickly, but they also build a set of tools for the problem domain as they go along. They are very good pattern recognizers and will be able to generalize a particular pattern of code, so that they can use it again. But not only that, great hackers will create an implicit understanding attached to the abstraction, ie. what we might call common sense.

A case in point. Before Ruby, I'd used for loops over and over again, never really thinking that I could abstract the for loop itself. It wasn't until they were taken away in Ruby that I realized map, inject, and their cousins are all abstractions of the for loop. When I see "map", I know that it performs a transformation on every element. But I also know that the array I get back will be the same size, and that each element's operation doesn't affect the other elements, among other things. These things are implicit, and they allow for shorter code.
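A quick illustration of the difference (a made-up example of mine):

```ruby
words = ["apple", "fig", "cherry"]

# The for-loop version: the intent is buried in bookkeeping.
lengths = []
for w in words
  lengths << w.length
end

# The map version: the transformation is the entire statement, and you
# know for free that the result is the same size as the input.
lengths = words.map { |w| w.length }  # => [5, 3, 6]
```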

When that happens, you can simply read "map", and get all the connotations it comes with, and hence it comes with meaning. It also becomes easier to remember, since it's a generalized concept that you can apply in different places in the code. The more times you use it, the easier it is to remember, instead of having specialized cases of the same kind of code where the behavior is different in different parts of the code.

A great hacker will take the initial time upfront to create this generalized code, and will save time in the long run by being able to reuse it. Done over and over again, it all adds up. So it's not that for any given problem, a great hacker will be done in 4 hours with what takes an average programmer 40 hours, but that over time, a great hacker will invest the time to create the tools and vocabulary that let him express things more easily. That leads to substantial savings in time over the long haul.

I hesitated writing about it, as it's nothing I (nor you) haven't heard before. But I noticed that until recently, I almost never lifted my level of abstraction beyond what the library gave me. I would always be programming at the level of the framework, not at the level of the domain. It wasn't until I started writing plugins for Rails extracted from my own work and reading the Paul Graham article that a light went on for me. It was easier to plug things like act_as_votable together than to keep messing around with associations (at the level of the framework). I still believe you should know how things work underneath the hood, but afterwards, by all means, go up to the level of abstraction appropriate for the problem domain.

DSLs (domain-specific languages) are really just tool-making and vernacular creation taken to the level of bending the language itself. It's especially powerful if you can add implicit meaning to the vernacular in your DSL. It's not only a way of giving your client power in their expression, but also a refactoring tool, so that you can better express your problem in the language of the problem domain. Instead of only adding methods to your vernacular, you can change how the language works. It was with this in mind that I did a talk on DSLs this past weekend at the local Ruby meetup. The first part is on Dwemthy's Array, and the second is on using pattern matching to parse Logo. Both seemed pretty neat when I first read about them. Enjoy!

DSL in Ruby through metaprogramming and pattern matching

Monday, December 17, 2007

Getting rid of emacs 23 splash screen

Pretty Emacs Reloaded

The latest emacs 23 is pretty neat, but it noticeably has that splash screen in the beginning. It's generally good practice for an app not to present new users with just a blank screen. There should be something there to say, 'hey, these are the next steps'.

That said, I've found it annoying, probably because I'm used to it not being there. So I've found that you can turn it off with this setting.

(setq inhibit-splash-screen t)

Just put that in your .emacs file in your home directory, and you're all set. tip~

Thursday, December 13, 2007

Generating rdoc for gems that refuse to generate docs

I recently upgraded to capistrano 2.1, and it's woefully lacking in documentation. Jamis Buck had already picked a documentation coordinator about a month ago, but nothing seems to have happened since.

So it was time to go dumpster diving in cappy code. I at least wanted to see what the standard variables were. To my surprise, there were some docs in the code, but I couldn't generate them with

gem rdoc capistrano

For those of us that have never made a gem, all you have to do to force it is to edit the associated specification for the gem (/usr/lib/ruby/gems/1.8/specifications/), and add

s.has_rdoc = true

Maybe if I dig enough stuff out of it, I'll post some prelim documentation for cappy 2.1 and donate it to the documentation effort.

As an aside and musing: ideally, the code itself would be the documentation. However, just because you're reading code doesn't mean you can get the overall picture of how to use it. Even if you filtered out the details and saw just the class and method declarations, that still wouldn't be enough, since you can't see how things fit together. I don't know what a good solution would be. The simplest API is often complex enough to be nearly useless without good documentation.

Small tip!

Tuesday, December 11, 2007

The user owns this and that

In most any community-based website, your users are creating objects. And sometimes, you'd only like to display them when they're owned by the user. Usually, it's easy enough to do the check based on the association in the template views.

<% if @story.account == session[:member] %>
  <%= render :partial => "story" %>
<% end %>

However, at the risk of over-reaching, we can try something more readable:

class Story < ActiveRecord::Base
  belongs_to :account

  def owned_by?(user)
    self.account == user
  end
end

<% if @story.owned_by?(session[:member]) %>
  <%= render :partial => "story" %>
<% end %>

But this gets to suck, because you have duplicate code in different classes that are just slightly different due to association names. One way to solve it is to put it in a module, and include it in different classes. After all, that's what mixins are for.

module Ownership
  def owned_by?(user)
    self.account == user
  end
end

class Story < ActiveRecord::Base
  include Ownership
  belongs_to :account
  # blah blah blah
end

class Post < ActiveRecord::Base
  include Ownership
  belongs_to :account
end

Or alternatively, you can put it in the account, but in reverse. But now, you have to search through the associations of the owned ActiveRecord object.

class Account < ActiveRecord::Base
  def owns?(ar_object)
    ar_object.class.reflect_on_all_associations(:belongs_to).detect { |assoc|
      ar_object.send(assoc.name) == self
    } ? true : false
  end
end
I find the ternary operator to be kinda ugly there, but it doesn't make sense to return the reflection object. Regardless, this lets you do:

<% if session[:member].owns?(@story) %>
  <%= render :partial => "story" %>
<% end %>

However, this doesn't work for self-associated AR objects, or objects that have two or more belongs_to associations to the same account. It relies on a unique belongs_to association for every object belonging to an account. I'm not sure yet which way's better, and in the end it probably doesn't matter much, but I do like being able to say user.owns?(anything) for any object without really thinking about what the associations are. half-tip.

Friday, December 07, 2007

A simple distributed crawler in Ruby

A week ago, I took a break from Mobtropolis, and...of all things ended up writing a simple distributed crawler in Ruby. I hesitated posting it at first, since crawlers are conceptually pretty simple. But eh, what the heck.

This was just an exercise in DRb and Hpricot, so don't use it for your production work, whatever that may be. An actual crawler is far more robust than what I wrote. And don't leave it running hammering at stuff, since that'll get you banned.

First, this is how you use it:

WebCrawler.start("") do |doc|  # URL left blank in the original
  puts doc.search("title").inner_html
end
And that's it. It returns documents in an XPath traversable form, courtesy of Hpricot.

A web crawler is a program that simply downloads pages, takes note of what links there are on each page, and puts those links on its queue of links to crawl. Then it takes the next link off its queue, downloads that page, and does the same thing. Rinse and repeat.

First, we create a class method named start that creates an instance of a webcrawler and then starts it. Of course, we could have done without this helper method, but it makes it easier to call.

module Crawler
  class WebCrawler
    class << self
      def start(url)
        crawler = self.new
        crawler.start(url) do |doc|
          yield doc
        end
        return crawler
      end
    end
  end
end
So next, we define the initialization method.

module Crawler
  class WebCrawler
    def initialize
      puts "Starting WebCrawler..."
      DRb.start_service "druby://localhost:9999"
      puts "Initializing first crawler"
      puts "Starting RingServer..."
      Rinda::RingServer.new(Rinda::TupleSpace.new)  # call garbled in the original; reconstructed

      puts "Starting URL work queue"
      @urls_to_crawl = Rinda::TupleSpace.new
      @work_provider = Rinda::RingProvider.new(     # provider args garbled; reconstructed
        :urls_to_crawl, @urls_to_crawl, "Queue of URLs to crawl")
      @work_provider.provide

      puts "Starting URL visited tuple"
      @urls_status = {}
      @visited_provider = Rinda::RingProvider.new(
        :urls_status, @urls_status, "Tuplespace of URLs visited")
      @visited_provider.provide

      @delay = 1
    rescue Errno::EADDRINUSE => e
      puts "Initialize other crawlers"
      puts "Looking for RingServer..."
      @ring_server = Rinda::RingFinger.primary

      @urls_to_crawl = @ring_server.read([:name, :urls_to_crawl, nil, nil])[2]
      @urls_status   = @ring_server.read([:name, :urls_status, nil, nil])[2]
      @delay = 1
    end
  end
end

This bears a little explaining. The first webcrawler you start will create a DRb server if one doesn't already exist and do the setup. Every subsequent webcrawler will connect to the server and start picking URLs off the work queue.

So when you start a DRb server, you call start_service with a URI, then you start a RingServer. What a RingServer provides is a way for subsequent clients to find services provided by the server or other clients.

Next, we register a URL work queue and a URLs visited hash as services. The URL work queue is a TupleSpace. If you haven't heard of TupleSpace, the easiest way to think of it is as like a bulletin board. Clients post items on there, and other clients can take them out. This is what we'll use as a work queue of URLs to crawl.
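If TupleSpaces are new to you, the bulletin-board behavior is easy to try locally, without any of the DRb plumbing. This is a standalone sketch of mine, not part of the crawler:

```ruby
require 'rinda/tuplespace'

ts = Rinda::TupleSpace.new

# One client posts work items on the board...
ts.write([:url, "http://example.com/a"])
ts.write([:url, "http://example.com/b"])

# ...and another takes them off. take() removes a tuple matching the
# pattern, and blocks if nothing matches yet.
tuple = ts.take([:url, nil])
tuple[1]  # one of the two URLs posted above
```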

The URLs visited is a Hash, so we can check which URLs we've already visited. Ideally, we'd use the work queue's TupleSpace for this too, but Rinda seems to only provide blocking calls for reading/taking from a TupleSpace. That seems odd, but I couldn't find a non-blocking call that day. Lemme know if I'm wrong.

module Crawler
  class WebCrawler
    def start(start_url)
      @urls_to_crawl.write([:url, URI(start_url)])
      crawl do |doc|
        yield doc
      end
    end

    def crawl
      loop do
        url = @urls_to_crawl.take([:url, nil])[1]
        @urls_status[url.to_s] = true

        doc = download_resource(url) do |file|
          Hpricot(file)  # block body garbled in the original; Hpricot per the prose below
        end or next
        yield doc

        time_begin = Time.now
        add_new_urls(extract_urls(doc, url))
        puts "Elapsed: #{Time.now - time_begin}"
      end
    end
  end
end

Here are the guts of the crawler. It loops forever, taking a url off the work queue using take(), which looks for a pattern in the TupleSpace and finds the first tuple that matches. Then, we mark the url as 'visited' in @urls_status. Next, we download the resource at the url, use Hpricot to parse it into a document, and yield it. If we can't download it for whatever reason, we grab the next URL. Lastly, we extract all the urls in the document and add them to the work queue TupleSpace. Then we do it again.

The private methods download_resource(), extract_urls(), and add_new_urls() are merely details, and I won't go over them. But if you want to check them out, you can download the entire file. There are weaknesses I haven't solved, of course. If the first client goes down, everyone goes down. Also, there's no central place to put the processing done by the clients. But like lazy textbook writers, I'll leave those as an exercise for the reader. snippet!


Thursday, December 06, 2007

Communicating your intent in Ruby

I've been using Ruby most everyday for about two years now. While I'm no expert, I know enough to be fairly productive in it. And beyond liking the succinctness and power that you often hear other people talk about, it's made me a better programmer. But there's an aspect of Ruby that worries me somewhat.

To start, programming is rightfully recognized as a means to build something from pure thought. But it's also a form of communication--to other programmers who will touch your code later, and to yourself when you look at it months from now. We're at a point where, other than in embedded and spacecraft programming, we have the luxury of using programming languages that focus on ease for the programmer, rather than ease for the machine. Fundamentally, that's the philosophy Ruby takes.

And while Ruby's nice in a lot of ways, I'm not sure about how it communicates an object's interface. When you're allowed to modify objects and classes on the fly, how do you communicate the interfaces between the modules you mix in and the methods/modules you add? By interface, I mean: how do you use this class so that it does what it's supposed to? Normally, it's pretty obvious--you look at the names of the methods declared in the code. A well-written class has its public methods exposed, or you look at its ancestors' public methods. You might need some documentation to figure out how to call them in the right order, but generally, you have some idea just by looking at the method signatures.

However, when you throw mixins and metaprogramming into the mix, it becomes harder to tell just from looking at the method signatures--the structure of the code. You have to actually read the code, or rely on someone who knew the intent to document it in detail.

An example of communicating interfaces for mixins: the module Enumerable contains a lot of the collection-related methods. The cool thing is that if you want these methods in your own class, all you have to do is define each() in your class, mix in the Enumerable module, and you get all of them "for free". However, outside of documentation explicitly stating it, it's not immediately obvious from the method signatures that this is what you have to do in order to use it. It's only after scanning through the entire code that you notice each() being used by all the methods.
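Here's a minimal made-up class showing the deal: define each(), mix in Enumerable, and map, sort, and friends come along for free:

```ruby
class NumberBag
  include Enumerable

  def initialize(*nums)
    @nums = nums
  end

  # The one method Enumerable silently requires of us.
  def each
    @nums.each { |n| yield n }
  end
end

bag = NumberBag.new(3, 1, 2)
bag.map { |n| n * 2 }  # => [6, 2, 4]
bag.sort               # => [1, 2, 3]
```

Nothing in NumberBag's own code says each() is the linchpin; you only know because Enumerable's docs tell you.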

Of course, Ruby contains enough metaprogramming power to protect yourself against this. One can do something like this:

class MethodNeededError < RuntimeError
  def initialize(method_symbol, klass)
    super "Method #{method_symbol.to_s} needs to be in client class #{klass.inspect}"
  end
end

module Enumerable
  def self.included(mod)
    raise MethodNeededError.new(:each, mod) unless mod.method_defined?(:each)
  end
end
This only works if you put the include after you define each(). That's just asking for trouble when the order of definitions in your class matters.

A fair number of people are writing mini-DSLs in Ruby using metaprogramming tricks. One of the common ones is to use method_missing to define or execute methods on the fly. ActiveRecord's dynamic finders are implemented this way. The cost to communicating the interface through the structure of the code is obvious: unless it's documented well, you can't tell just by looking at the method signatures.
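Here's a toy sketch of the trick (this is not ActiveRecord's actual implementation; the Finder class and its records are made up):

```ruby
class Finder
  RECORDS = [{ :name => "alice" }, { :name => "bob" }]

  # Catch find_by_<attribute> calls that were never defined anywhere.
  # Nothing in the class's method list hints that these calls exist.
  def method_missing(sym, *args)
    if sym.to_s =~ /^find_by_(\w+)$/
      key = $1.to_sym
      RECORDS.detect { |r| r[key] == args.first }
    else
      super
    end
  end
end

Finder.new.find_by_name("bob")  # => { :name => "bob" }
```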

Why do I harp on interface signatures? I mean, in the instance of requiring each(), things work out by just letting it fail in the enumerated methods, since it'll complain about each itself. In the instance of method_missing, you can just read the regex in the body. While these are true, none of them allow rdoc to generate proper documentation. The whole point of documentation is to show you the interface--how to use that piece of code. I'm just afraid that given Ruby's philosophy of being able to write clear, powerful, and succinct code, it might fall short when people start using metaprogramming tricks like alias_method_chain and method_missing more and more. Maybe rdoc needs to be more powerful and read code bodies for regexes in method_missing? It already documents yields in code bodies, but that seems awfully specific.

I'm not exactly a fan of dictating interfaces like in Java. When you're first coding something up, you're sketching, so things are bound to change. Having plumbing like interface declarations gets in the way, imo. However, when something's a bit more nailed down, it'd be nice to be able to communicate your intent to other programmers without them having to read code bodies all the time.

In the end, I side with flexibility. However, I kinda wish Ruby had some type of pattern matching for methods so I didn't have to read method_missing bodies all the time. But then again, that would be messy in all but the simplest schemes. Can you imagine a class that responded to email addresses as method calls? I guess I'd have to file this one under "bad ideas".

Don't reopen ActiveRecord in another file

The power of Ruby lies partially in how one can reopen classes to redefine them. Namespace clashes aside, this is usually a good way to extend and refine classes for your own uses. However, last night, I got bitten in the ass trying to refactor a couple classes. In Rails, you're allowed to extend associations by passing a module to the association call.

class User < ActiveRecord::Base
  has_many :stories, :through => :entries, :source => :story,
           :extend => StoryAssociationExtensions
end

where StoryAssociationExtensions is a module holding methods, like expired(), that I can call on the stories association, so I can do stuff like

@user = User.find(:first)
@user.stories.expired # gives all expired stories

So while refactoring and cleaning up, I renamed StoryAssociationExtensions to AssociationExtensions, reopened the Story class, and put it in there. I just wanted to clean up the namespace and put the association extensions somewhere that made semantic sense. Naturally, I thought association extensions for a class belong in that class. Well, it doesn't work. And don't do it. Hopefully, I'm saving you some pain.

class Story < ActiveRecord::Base
  module AssociationExtensions
    def expired
      select { |c| c.expired? }  # method body garbled in the original; reconstructed
    end
  end
end
Well, this works if you've reopened the class within the same model file, story.rb in this case. However, if you reopen the class in another file elsewhere, your model definition won't get loaded properly, which leads to associations and methods you defined not existing.

So imagine my bewilderment when associations didn't work on only certain ActiveRecord models. In addition, they worked in the unit tests and script/console, but not when the server was running. All that at 3 in the morning. :(

Good thing for source control, so I could revert (but I have to say, svn isn't as easy to use as it could be).

I ended up creating a directory in model called collection_associations and putting the associations in there under a module CollectionAssociations namespace. Not exactly the best arrangement but it'll do for now.

I'm still not sure why ActiveRecord::Base subclasses don't like being reopened, but I'm guessing it has something to do with the model files only getting loaded once. If anyone has an explanation, I'd like to read about it.

free warning!

Monday, December 03, 2007

Book Review: GIS for Web Developers

I'm involved in running the local Ruby Meetup in Chicago, and one of the co-organizers got O'Reilly to send us some books for free. In return, I suppose the deal was to give them some exposure for their books, so I got to review this one.

I had actually ordered the book before I found out about the free copy, so now I have two. It's on Amazon now under the wil822 seller if anyone wants to buy the first copy off of me.
With that, here we go:

GIS stands for Geographic Information Systems. These technologies existed previously, but were often in the domain of military, scientific, or government applications. Geospatial datasets were either public but hard to decipher, or easy to view but expensive. However, with the advent of Google Earth, NASA World Wind, Yahoo Maps, and Google Maps, GIS has come into mainstream awareness and attention.

“Earth materializes, rotating majestically in front of his face. Hiro reaches out and grabs it. He twists it around so he's looking at Oregon. Tells it to get rid of the clouds, and it does, giving him a crystalline view of the mountains and the seashore.” - SnowCrash by Neal Stephenson
Even with the explosion of what's available to us geospatially on the web, it would be naïve to think that the development of GIS has plateaued. There are several complementary technologies starting to mature that will develop it even further, such as open mobile phone platforms, ubiquitous wireless connectivity, and open social networks. These are but a few of the reasons to look at GIS as a programmer, as it opens up more possibilities for innovation, creativity, and making useful things.

A word of warning to start: GIS for Web Developers is not a book telling you how to use the Google or Yahoo maps APIs to embed maps in your web application. If you don't need much more than to plop down points on a map, then there's no need for most of what's in GIS for Web Developers.

However, if you need to suck down geospatial data from different sources to better provide services for your users, then this book might have something for you. GIS has a host of terminology and difficulties that get abstracted away in a consumer application like Google or Yahoo maps. GIS for Web Developers starts with basic GIS terminology, takes you through spatial databases, washes you in OGC web services, and finally plops you down in front of OGC clients. In addition, it focuses on free (as in free speech, not just free beer) open source solutions.

Like a tour guide for the novice, it starts by introducing the basics, and even tells you where to look for different datasets, how to interpret them, and the difficulties that might arise. Geospatial datasets aren't exactly widely publicized; they're available, but you have to know where to look. For example, the US Census Bureau provides a geocoding database called TIGER, and it also provides geospatial data on the demographics of each county in the United States. Many other US government agencies provide geographic data as well.

Even if you know where to look, the file formats of geospatial data are many and varied. The book preps you on various popular file formats and translators to help you along when you encounter them in your searches for geospatial data. In addition, it goes into difficulties that might arise even once you've got the file format straight. For geospatial data, there are many different ways to project a 3D sphere onto a 2D plane. Because of that, you have to make sure all your data is in the same projection, or else it won't line up correctly. Throw in the fact that the earth isn't really a sphere, and you'll be glad you had this book to figure out how to make sure everything lines up.

Next, it takes you on a short tour of spatial databases. Up until now, I would have thought that a query like "all points of interest within 20 miles of here" would be something implemented at the application level. However, GIS for Web Developers shows this is not the case. PostgreSQL (through the PostGIS extension) has spatial capabilities, and one can have table attributes that indicate location and make queries based on distance.

The rest of the book is devoted to getting your feet wet about servers and client standards by the Open Geospatial Consortium. It shows you how to setup an OGC standard geospatial server implementation called GeoServer, so you can serve up geospatial data in a RESTful way. Then a brief romp through using OpenLayers to embed a map widget, before tying together the odds and ends at the end.

This book is a good start to exploring GIS, but it only scratches the surface. It suggests possibilities towards the end about going beyond putting push pins on the earth. There's still a lot of room to explore. A word on editing, however. On page 60 in chapter 4 on rasters, a whole paragraph was repeated. It was as if someone cut and pasted and hit paste one time too many. That should have been caught.

Overall, it's a good book in terms of information content, provided you're looking to do more than just embed Google Maps in your web application. It takes you through the terminology of GIS, database concerns, and then the standard OGC web services and clients. By tech book standards this one is brief, but that's intentional, as it's an introduction. The writing isn't spectacular, but it's clear and not a drudgery to read through. If you can stand the occasional corny joke and you're looking to suck down geospatial datasets, then this is the book for you.

Friday, November 30, 2007

Nerd Time issue 13 - Shoes, javascript, and physics

Hope you had a good Thanksgiving. Lately, I've been watching math and tech lectures, and dipping into various physics stuff, so this nerd time is more hardcore than the usual news clippings. Next time, I think it'll be more about geospatial stuff; I've been looking into that too. I'm also going to start stating why I think something's significant, so you can figure out whether you should look at it or not. As usual, the easy stuff is up top.

Secret strategies behind many viral videos
As usual, when there's a new medium of expression, it's lawlessness and wild fun for a while, but where there are people, there are advertisers and marketers right on their heels trying to grab their attention. I don't know how I feel about this, but I am a bit disgusted for some reason I can't yet put my finger on. Perhaps it's big corporations posing as 'homegrown'. In any case, if anything positive has come out of this whole thing, it's that advertising has become funnier over the years. Props to Geico.

Verizon opens its network to any device
Traditionally, telcos have seen themselves as both line and content providers. So in 50 years, we've only had call waiting, *69, and three-way calling. For a telco to open up its wireless network to any device is pretty big. Of course, it's in response to Google's open phone platform, Android, and its bid for the 700MHz spectrum. I'm cautiously optimistic about the openness of the mobile web.

Google goes into renewable energy
If I built big datacenters that sucked down lots of energy, I'd be interested in this too.

How to destroy the web 2.0 look
I think none of us here are designers, but I check in on this once in a while, since I have to do front-end design. The so-called web 2.0 has a look: the gradients, the beveled edges, and the rounded corners. But I'm also seeing more designers move away from that and try to break out of an obvious grid layout, so you might see that bleed over into web apps.

Metalayer over web pages

The web was always meant to be a read/write medium. In the beginning, it was a predominantly read medium, until wikis and whatnot came along. Some people are still trying to push the envelope by putting a metalayer over web pages that you can write on, to communicate with others visiting the same space. So far, nothing in this space has made huge waves, but I expect there to be more developments on this front.

Running in Shoes in Ruby
Ruby is a nice language, but there are some problems with its standard library. One of them is a poor GUI toolkit: it ships with the old Tk toolkit, which is super ugly. Shoes is a GUI toolkit by _why_the_lucky_stiff for native apps that are meant to be written like web pages. That makes it pretty easy to figure out. Check out some of the screenshots with the accompanying source; it makes Java GUIs seem terribly verbose.

MIT's Open Courseware
For those of you that would like to brush up on various undergraduate and graduate topics. It's probably less relevant to those of you at the lab, due to the free master's program. They have courses on other topics besides math too.

Future of Javascript 2
Javascript, as I've said before, has surprised me. My previous impression of it was a dinky little language on browsers that you used to do some form validation. It's evolved into the most used language on the widest platform on the planet. It supports references, OOP, and closures, and this slide deck details more of what's to come. Beyond Ajax, I think you'll start to see more and more flexible interfaces in javascript, starting with SVG. Various browsers are making their javascript interpreters faster and meaner, so you'll see more sites pushing the envelope by making their interfaces more expressive.

Jquery vs Prototype
Jquery and Prototype are two javascript libraries that provide lots of syntactic sugar, as well as hiding browser incompatibilities from the application developer. People often dish it out between the two, so here are two perspectives.

Forth is a stack-based programming language, and this is a piece of hardware implemented with Forth on top. I actually didn't read too much of it, because I didn't get everything they were saying, but Mike and I were talking about Forth the other day, and this reminded me of it.

Quantum Mechanics from a computational point of view
I never really got the wave equation when I was an undergrad, and quantum mechanics had seemed odd and spooky to me. However, this article on the math behind it is fairly clear. It explains clearly how you get negative probabilities, but gets kinda murky when it starts talking about mixed states. Currently, in machine learning and search, statistical methods dominate the field over ontological methods. I wonder how long it might be before probabilistic methods from quantum mechanics find a use in machine learning.

String Theory in two minutes
This is something fun that isn't hard and doesn't take too long. It's just a short video on string theory in two minutes! If you want to know more about string theory, click on the second link; it's a tutorial.

3D mouse from electric field sensing
Minority Report certainly inspired some HCI people to get cracking. This is a discussion of how to detect hand gestures using an electric field. I wasn't able to get all the way through it, cuz I started watching math lectures. However, it is an interesting piece. I don't think we'll see 3D mice any time soon, but with mobile devices having small keyboards, this sort of technology might become very useful.

Similarity Search
And lastly, a talk on similarity search. It's a different measure of similarity: rather than putting everything in a parameterized space and using Mahalanobis distance, you calculate distance based on the graphical structure the data makes. I've watched it twice, and I feel like I'm still missing something. At least the accent reminds me of "Hokey, here's the earf".

Monday, November 19, 2007

Nerd Time issue 12 - Android, Social Ads, Hardware, Networking

I owe you a beer
I'm not sure how many of you have heard of Twitter by now, but Twitter is like... microblogging. You say what you're doing, or just quips, over your cell phone in 140 characters or less, and your friends can get updates on their phones about what you're doing. If any of you use Facebook, it's much like the status update feature. Most people find Twittering useless and inane, but lots of people seem to use it the world over. Twitter also released an API, which someone took advantage of with Foamee, which is why this is even on here. Foamee keeps track of who you owe beers to. So even on a seemingly inane platform, I thought that use is actually pretty creative.

Google's Android Platform
Submitted: Metlis
As most of you have probably heard, Google released its mobile OS platform, not an actual phone as rumored. I took a moderately deep look into it. It's a full stack that runs on Linux. It compiles Java (rather, a flavor of Java) to its own virtual machine, named Dalvik, so with JRuby and Jython around, it should only be a matter of time before we get Ruby and Python on there. The way they've organized the application lifecycle is simple to understand. The UI uses XML to declare the view, rather than connecting it together in code, like in Swing. Outside of standard UI components like text fields, there are also map views. You can do interprocess communication by broadcasting an "intent", and the system will pick the application best suited to fulfill that intent. So if your app needs to pick a photo, it sends out an intent to pick a photo, and the photo gallery will respond; the user picks a photo, and that's what gets returned to your app. The API isn't done in full. Some of it isn't completely implemented yet, and an actual Android phone isn't due out til mid to late 2008, I think. So we'll see if this all pans out, but it'd be exciting if it does.

Facebook's new Ad platform
Last week, on the 5th, Facebook released its new ad platform, which uses what people do when interacting with their friends to advertise. Most of us don't make buying decisions independently; we ask our friends what to buy, especially if we don't know much about the domain. Facebook will allow companies to sell their wares on it, and if you buy, say, Nike shoes, it'll show your friends on their news feeds that you bought shoes. IBM did a paper on how advertising, as we know it, will start to fade out. Advertising isn't "advertising" when it's targeted and relevant.

Seiko comes out with thin ebook reader
It's a prototype, but all the same, pretty impressive. I can't wait until eBook readers become more popular.

Intel releases Penryn Processor
I don't know too much about this topic, other than, "Nick would know more". Got any light to shine on this one?

Nokia comes out with a tactile touchscreen
This should be of interest to hardware nerds like Mike. Nokia came out with a touch screen that feels like you're typing at a keyboard. What they do is put an array of piezoelectrics behind the screen to move it, and then time it correctly to fool your senses. That way, it feels like you're actually clickity-clacking away at a keyboard on a touchscreen.

Amazon comes out with an eBook reader
I totally didn't see this coming, but it makes sense in hindsight. The coolest thing about it is that the device can download directly over wireless cellular internet, and the subscription to the IP service is included in the price of the device.

Giggling Robot becomes one of the kids
I've always thought that intelligence was partly social. A QRIO robot does enough to fool toddlers into thinking that it's one of them. Eventually, I think we'll have the same type of stuff for adults, but it'll fool us into thinking of them as pets with utility, rather than as equals.

Lets you control a real person in real life
Submitted: Howard
Lots of people are experimenting with connecting the real world with the virtual. I think we'll probably see more and more of this type of stuff as mobile phones become more powerful and connected.

Where am I? Firefox extension
Things have been brewing in the mobile world, with iPhone and Android making waves. One thing is for sure: people will want web browsers on their mobile phones. I think I remember firefox wanting to move to mobile platforms. Anyway, we'll probably eventually see geo-location aware browsers. Here's a neat firefox extension that helps patch that need for now.

Just an interesting tidbit on cracking MD5
Usually MD5 hashes are used to obscure a string; the resulting hash is supposed to be hard to "reverse", so you can't tell what the original string was. This guy used Google to search for an MD5 hash and got back the original string. Let that be a lesson to you: always salt your passwords!
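A small Ruby illustration of the point (the salt value here is arbitrary): an unsalted MD5 of a common word is trivially searchable, while a salted one won't match any precomputed table.

```ruby
require 'digest/md5'

# MD5 is a one-way hash, not encryption; you can't decrypt it, but
# unsalted hashes of common strings are trivially searchable.
plain = Digest::MD5.hexdigest("password")
# => "5f4dcc3b5aa765d61d8327deb882cf99" (findable with a search engine)

# A salt (arbitrary value here) defeats precomputed lookup tables:
salt   = "x9$Q"
salted = Digest::MD5.hexdigest(salt + "password")
```

The salted digest is just as easy to compute server-side, but an attacker's table of common-password hashes no longer matches it.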

A New way to look at Networking
I imagine most of you don't ever watch the lectures, but I only list the good ones! This is a pretty good lecture taking you through the history of networking, from telephony all the way to present-day TCP/IP and its problems. The proposal Van Jacobson makes is to request data from the network by name rather than by source. So instead of asking a particular server whose content you assume is the nytimes, you'd ask the network "give me the New York Times", and you don't care where on the network it comes from. Think bittorrent for smaller files, without the existence of a tracker. "Change your point of view to focus on the data, not where the data lives, because it doesn't have to live anywhere." That means nodes will cache content they receive and give it to anyone that asks for it. Of course, updating that distributed content will be tougher, as will implementing security for content providers. If you want to skip to the meat, start at 40:00.
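A toy sketch of the idea in Ruby (the node and content names are made up): any node that has cached the named content can answer a request, and the requester doesn't care which one does.

```ruby
# Toy sketch of "ask the network for data by name": a request is
# satisfied by whichever node happens to hold the named content.
class Node
  def initialize
    @cache = {}
  end

  def put(name, data)
    @cache[name] = data
  end

  def get(name)
    @cache[name]
  end
end

nodes = [Node.new, Node.new, Node.new]
nodes[1].put("nytimes/front-page", "today's front page")

# The requester asks every node by name and takes any answer.
answer = nodes.map { |n| n.get("nytimes/front-page") }.compact.first
```

In a real content-centric network the request wouldn't be broadcast naively like this, but the key property holds: the answer is identified by name, not by the address it came from.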

Shared Memory Must Die
It seems like programmers will have to figure out how to program with concurrency models other than locks and shared memory. I've already mentioned this when I talked about Erlang before.

Pattern matching method dispatch and DSL
Ian asks me, "Have you heard of Lua?", to which I said, "It was in nerd time a couple issues back!" Lua apparently makes it easy to embed custom languages in your applications, what people call DSLs. Ruby has been pretty good at doing that too. This is Ruby envying functional programming languages and their weird features, like pattern matching method dispatch and lisp's s-expressions. A guy uses pattern matching to write a DSL to parse Logo, the turtle drawing language. This wouldn't have been a way I'd ever think to solve the problem, so it opened my eyes a bit.
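Ruby has since grown pattern matching of its own (case/in, available from Ruby 2.7). Here's a toy dispatcher in that style for Logo-like commands; the command shapes are invented for illustration, not taken from the article:

```ruby
# Toy interpreter for Logo-like commands using Ruby pattern matching.
# The [:forward, n] command shapes are invented for illustration.
def run(cmd)
  case cmd
  in [:forward, Integer => n] then "move #{n}"
  in [:turn, Integer => deg]  then "rotate #{deg}"
  in [:repeat, Integer => n, Array => body]
    # expand the body n times, interpreting each sub-command
    (body.map { |c| run(c) } * n).join("; ")
  end
end

run([:forward, 10])               # => "move 10"
run([:repeat, 2, [[:turn, 90]]])  # => "rotate 90; rotate 90"
```

Dispatching on the shape of the data, rather than on a method name, is exactly the trick the Logo article exploits.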

Friday, November 16, 2007

State change observer for ActiveRecord

When I started writing some code recently, I noticed that my controllers were getting fat. There was a bunch of stuff in there that didn't have anything to do with actually carrying out the action, things like sending notifications. ActiveRecord already has observers that take action on certain callbacks, but what I needed was to take action on certain state transitions. Not seeing any immediate solutions in the Rails API, I decided to test myself and try writing one. I was bored, too. So while I'm not sure it was worth the time, it certainly was kinda interesting.

Just as a contrived example, let's say we're modeling the transmission of a car. It has three modes: "park", "reverse", and "drive". We want to send a notification when a user tries to change it from "reverse" to "drive", but not from "park" to "drive". If that didn't matter, and we just wanted to send notifications whenever the state changed to drive, we'd use the observers that come with ActiveRecord. But since we do care where the state transition came from, here's what I came up with:

class CreateCarTransmission < ActiveRecord::Migration
  def self.up
    create_table :car_transmission do |t|
      t.column :engine_id, :integer, :null => false
      t.column :mode, :string, :null => false, :default => "park"
    end
  end

  def self.down
    drop_table :car_transmission
  end
end

class CarTransmission < ActiveRecord::Base
  include StateTransition::Observable
  state_observable CarTransmissionNotifier, :state_name => :mode
end

So then for my notifier I have:

class CarTransmissionNotifier < StateTransition::Observer
  def mode_from_reverse_to_drive(transmission)
    # send out mail and flash lights about how this is bad.
  end
end

And that's it. Whenever in the controller I change the state from "reverse" to "drive", lights will flash and emails will be sent out condemning the action, and my controllers stay small and lean.

class CarController < ApplicationController
  def dismantle
    @car = Car.find(params[:id])
    @car.update_attribute :mode, "reverse"
    @car.update_attribute :mode, "drive"
  end
end

So where's the magic? It took a bit of digging around. There were two major things I had to do: insert observers during initialization, and override attribute setters so that an update notifies the observers.

ActiveRecord doesn't exactly allow you to override the constructor, and I don't think I tried too hard to mess around with it. Looking on the web, I happened upon has_many :through again, which has some good tips that helped me through Rails' rough edges. While I didn't exactly follow his advice, I did find out about the callback :after_initialize. It must be something new, because I don't see it in the 2nd edition of the Rails book, and the current official API doesn't list it. Other Rails API manuals seem to be more comprehensive, like RailsBrain and Rails Manual.

Overriding attributes has always been a bit of a mystery. I found a listing of the attribute update semantics, which helped me figure out what I was looking for, but it's wrong in that you can't use the first form (article.attributes[:attr_name] = value) to set an attribute. Looking in the Rails 1.2.3 code shows that attributes is a read-only hash. But it's right that you should override the second form (article.attr_name = value), since update_attribute() and update_attributes() depend on it.

Again, the function I was looking for isn't listed in the official API as a method; there's only a short mention in the description of ActiveRecord under "Overriding Attributes", which makes it harder to find. Ends up that we can use write_attribute().

So that's pretty much it. Using some standard meta-programming like how plugins do it, you wrap it up, and it's pretty simple:

require 'observer'

module StateTransition
  module Observable
    class StateNameNotFoundError < RuntimeError
      def message
        "option :state_name needs to be set to the name of an attribute"
      end
    end

    def self.included(mod)
      mod.extend(ClassMethods)
    end

    module ClassMethods
      def state_observable(observer_class, options)
        raise StateNameNotFoundError if options[:state_name].nil?
        state_name = options[:state_name].to_s

        include Object::Observable

        define_method(:after_initialize) do
          # attach the observer when the record is instantiated
          add_observer(observer_class.new)
        end

        define_method("#{state_name}=") do |new_state|
          old_state = read_attribute(state_name)
          if old_state != new_state
            write_attribute(state_name, new_state) # TODO yield the update method
            changed # mark dirty so the stdlib Observable actually notifies
            notify_observers(self, state_name, old_state, new_state)
          end
        end
      end
    end
  end

  class Observer
    def update(observable, state_name, old_state, new_state)
      send("#{state_name}_from_#{old_state}_to_#{new_state}", observable)
    rescue NoMethodError => e
      # ignore any methods not found here
    end
  end
end

I had a difficult time figuring out how to define methods for instances of a class. The only things I came up with were to use define_method, or to include a module with the instance methods in it; instance_eval() didn't work. Meta-programming in Ruby gets rather confusing when you're doing it inside a method, since it's hard to keep track of which context you're in.
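The define_method route, stripped of all the Rails machinery, looks like this (class and method names here are invented for the sketch):

```ruby
# A class-level macro (names invented) that defines instance methods
# at class-definition time via define_method, the same trick used by
# state_observable above.
class Widget
  def self.add_reader(name)
    define_method(name) { instance_variable_get("@#{name}") }
  end

  add_reader :color

  def initialize
    @color = "red"
  end
end

Widget.new.color  # => "red"
```

Because define_method is called on the class, the block becomes a real instance method, and the block's closure over `name` is what lets you generate a family of methods from data.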

So if you can make use of this, great. If you think it's worth moving into a plugin, let me know. If you know of a better way, by all means, let me know that too.

Saturday, November 10, 2007

$100 a page

A short tidbit, no insights: I didn't know that Amazon sold analyses of products, like Sony Ericsson EDGE modem cards...for $100 a page.

I guess IT managers (or whoever makes buying decisions at big companies) have a lot on the line, and they'd be willing to spend the money on this sort of thing. I just wouldn't have thought it'd be available on Amazon.

Wednesday, November 07, 2007

Nabaztag and pet appliances

A couple weeks ago, I happened upon this strange internet rabbit. Nowadays, there are lots of electronic pets that react to people, so that wasn't anything new. But some aspects of Nabaztag were intriguing and worth some musing. Nabaztag is a kind of appliance/toy that's connected to the web. It's a little hard to describe at first, but this "how it works" page does a good job of giving examples.

Generally, I see it as a simplified interface to the web, embodied in a pet avatar. If you've ever watched any anime, you'll be familiar with pet sidekicks, usually for comic relief or raising the cuteness factor. If you imagine a sidekick through whom you can channel communication with others, or receive news, that'd be pretty close to what a Nabaztag does.

But why do I think it's worth posting?

When the internet was conceived, there were many users on a few computers. This has changed significantly: there are now a few users on many computers. Computers are not only mainframes, but desktops, then laptops, and now budding mobile devices. Eventually, there will be many more devices retrofitted for the web, such as refrigerators, stoves, and clocks. But that doesn't mean there won't be communications appliances made specifically for the web.

Nabaztag seems to be one of the early steps in making communications appliances in a form that people bond with. This can work one of two ways.

1) Just on the news today was a piece on how toddlers treated a QRIO as one of their own and bonded with it.

I can see something QRIO-like that will do just enough to fool us in the right ways for us to bond with it, like the toddlers did, and make it part of the family, like a home robot. Not really a robot for heavy labor, but more of a companion/pet that can give you the weather and channel your friends to you.

2) Alternatively, we can have electronic pets that don't fool us; rather, they are representatives and reflections of ourselves, and we use them to interact with our friends' pets (also representatives of their owners). We already do this to some extent through MMORPGs and the many Sim games, though the difference is that we play those avatars. Here, the pets are recognized to be separate from ourselves, but they are our delegates. Just as people socialize through their dogs at dog parks, I think people will start to socialize based on their physical electronic pets.

If a pet learns its owner's habits, then when it meets other people's pets, it might be able to trade information/gossip, or judge how well the owners would get along based on how well the pets got along with each other.

So why check the weather through a robot rabbit rather than through the browser on your computer? Sometimes it's a lot faster through the rabbit, since it has lights and indicators you can check at a glance. Presumably, that's why the Ambient Orbs have been making money: information becomes part of a person's environment, rather than something that's queried.

But an even more compelling reason is that it becomes another dimension for interacting with other humans, and for self-expression. The pet becomes an extension of self that one uses to interact with others, and with the web, the interaction doesn't have to be physical. One of the interesting demonstrations of Nabaztag is that two rabbits can get married, and thus imitate each other's ears. If a user on one end can control the ears of a friend's pet, you can communicate tactile touch, and your pets would be channeling you to your friend. Extend that to a gel-like tactile substance that changes shape, like a piezoelectric, and it would be even more real.

And while I don't know for sure whether any of this will happen, it seems like an exciting area to explore.

Saturday, November 03, 2007

Conversion boxes for the old spectrum

As we've all heard, with the advent of high-definition digital TV, old analog TVs aren't going to work anymore. This is because HD digital TV will be broadcast on different frequencies, freeing up the old TV frequencies for other uses. The FCC is going to auction this spectrum off to the highest bidder. The likes of Google are bidding on parts of the famed 700MHz spectrum with the intent of providing free wireless internet to the masses. It's considered "beachfront property" on the wireless spectrum because of its propagation properties: apparently, 700MHz has a wider coverage area.

But my question is: what do we do with all the analog TVs? While more than half of America watches TV through satellite or cable now, there is still a significant portion of people receiving TV signals the old-school way. Sure, after the conversion, our old analog TVs can still receive signals, but they won't be able to correctly interpret them. Most people have assumed that if you want to use your old TV, you want to use it to watch TV; thus, the solution is to buy converter boxes that receive digital TV signals and convert them to the old analog signals.

If we're going to have converter boxes, why not have different types of converter boxes, not only ones for watching TV? As of now, the auction for those bands hasn't happened yet, so we don't know what types of signals are going to be on those channels. But if there are signals that provide free wireless internet, I don't think it's too far-fetched to make converter boxes that are thin clients using the old TV as a monitor. That way, it can not only be a telephone through VOIP, it can still be a television, but an internet television.

Going one step further, the converter box could have a software radio in it, so you could tune to a wide range of radio frequencies and make use of the data on a signal if you had the software to interpret it. However, that's not happening soon, as today's computational prowess isn't enough to digitally process signals at high frequencies. You need to sample the incoming signal at at least two times the highest frequency in the signal in order to recover it, and we're just not there yet digitally (at least that's what I read about a year ago).
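That sampling constraint is the Nyquist rate. A quick back-of-the-envelope, using the 700MHz band discussed above as the example frequency:

```ruby
# Nyquist: to recover a signal digitally, you must sample at a rate of
# at least 2x the highest frequency present in the signal.
highest_freq_hz = 700e6            # e.g. the 700MHz band
min_sample_rate = 2 * highest_freq_hz
# => 1.4 billion samples per second, before any processing even starts
```

Which is why software radio at those frequencies usually mixes the signal down to a lower intermediate frequency in analog hardware first, rather than sampling the raw band directly.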

I hope that people see the opportunities here and take advantage of them, so we can see some innovation in reusing what was once old for the new.


Thursday, November 01, 2007

Nerd time 11 - Open social, web on desktop, random tidbits

I used to work at a research lab, and most of the people there are playing around with toys outside the web. I send this out as a mailing list just to keep them updated.
Google Gphone
The rumor mill's on full churn with speculation about Google's gPhone, so even if you read a bit of tech news, you'll probably have heard something about it. I personally don't think they're looking to compete with Apple's iPhone. To me, it makes more sense for them to license an open phone to other phone manufacturers, making it a platform for mobile and location-based advertising. We'll see what actually happens.

Google Open Social API
Google's going to announce the OpenSocial API, which is supposed to out-open Facebook. There are moves here and elsewhere to make your social network portable across different web services. That way, if you sign up for a new service, you don't have to tell it who your friends are all over again. The last link is by Marc Andreessen of Netscape fame. His latest thing is Ning, a tool that lets you build social networks, so obviously he has a vested interest in the topic.

This is something I've been waiting for for about two years now: an SD card that's also a wifi card. It makes any camera wifi-enabled, so you can take pictures and have them uploaded to the web at the same time. This sort of thing is a boon to Mobtropolis, as it lowers the barrier between taking a picture and sharing it. Hopefully, people will stop taking pictures of the same group pose with all different cameras soon.

Mozilla Prism
Prism is still experimental, but both Mozilla and Adobe are thinking of taking the web experience and putting it back onto the desktop; Prism is Mozilla's take on this. Every web application will seem like a native application, regardless of whether you're actually connected or not, so you can browse your mail or feed reader even when offline. In addition, such apps can take advantage of local hardware acceleration for graphics. This seems similar in concept to Java's Web Start, except it's built on top of web technologies. While nothing's for sure, all the stars seem to point in this direction, so web developers might start moving onto desktop developers' territory in the near future.

How can I use spreadsheets to answer some of my many questions about the world?
One example of mixing the web with traditional desktop applications: you can actually put queries in your spreadsheet, such as the number of users in Paraguay or the ERA of Roger Clemens. Just a tidbit I thought was neat.

Evidence Based Scheduling
Joel came out with this article a couple days ago. I thought it was pretty neat, and obvious in hindsight. He comes up with a way to estimate shipping dates of software with a specific probability, adapting a Monte Carlo method to do it. While I don't know if it works as well as it claims in practice, I assume that Joel eats his own dog food, and it seems to make sense. If you're interested in software scheduling, definitely give it a read.
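The gist, as I understand it, sketched in Ruby. The numbers and the velocity model below are my own simplification of Joel's idea, not his actual algorithm: sample a developer's historical estimate/actual ratios to simulate many possible futures for the remaining tasks, then read ship dates off the percentiles.

```ruby
# Simplified evidence-based scheduling sketch (my own reconstruction).
velocities = [1.0, 0.8, 1.2, 0.5, 1.0]  # hypothetical past estimate/actual ratios
estimates  = [4, 8, 2, 16]              # hours estimated for remaining tasks

trials = 1000
totals = trials.times.map do
  # one simulated future: divide each estimate by a randomly sampled velocity
  estimates.sum { |est| est / velocities.sample }
end.sort

# percentile totals give hedged completion times instead of a single date
p50 = totals[trials / 2]
p95 = totals[(trials * 0.95).to_i]
```

The payoff is that you report "95% chance of shipping within p95 hours" rather than one point estimate, and a developer's chronic optimism (velocities below 1.0) automatically inflates their simulated futures.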

The 4 boneheaded biases of voters
As some of you know, I'm pretty interested in decentralized systems, especially since Mobtropolis will have social problems at larger scales if I don't pay attention to them. Capitalist economies and voting systems are two examples. This is an article detailing the biases that people have about large-scale decentralized systems, specifically the economy. It's an interesting read.

State Machine Compiler
Ragel generates state machines for you. I found this interesting because I was wondering how to do minimal aspect-oriented programming without a full-fledged AOP system in place.
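To make concrete what Ragel is generating for you, here's a tiny hand-rolled state machine in Ruby (my own toy, not Ragel output): it accepts strings of binary digits that end in "1".

```ruby
# A tiny hand-written state machine: accepts binary strings ending in "1".
# Ragel would compile a grammar into a (much faster) table of these
# state transitions for you.
def accepts?(input)
  state = :reject
  input.each_char do |c|
    case c
    when "0" then state = :reject
    when "1" then state = :accept
    else return false   # any non-binary character kills the match
    end
  end
  state == :accept
end

accepts?("1011")  # => true
accepts?("10")    # => false
```

Even this toy shows why generated machines appeal for aspect-like hooks: every transition goes through one place, so you can attach actions to specific edges.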

Ruby's Object Model
When you grow up with OOP, you think you know it all. But the object model changes when you're using a dynamically typed language; it's a rather different beast altogether. In order to keep straight the meta-programming things that people do, it helps to know and understand the object model. These are the two best explanations I've seen of Ruby's object model, and particularly its metaclasses.
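As a quick taste of what makes the model different: methods can live on a single object's singleton (meta) class, without touching the object's class at all. A standalone sketch, with the names invented:

```ruby
# Methods can be defined on one object's singleton class,
# leaving the class itself untouched.
class Dog; end

rex = Dog.new

def rex.speak       # defines :speak on rex's singleton class only
  "woof"
end

rex.speak                    # => "woof"
Dog.new.respond_to?(:speak)  # => false
```

Method lookup checks the singleton class before the class, which is the mechanism behind per-object behavior, class methods, and much of Ruby meta-programming.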

MapReduce in a Week
If you've got a week to spare. MapReduce is how Google churns through embarrassingly parallel problems; it has its roots in functional programming. If you don't know what MapReduce is, check out Joel's article below. Though I'm sure I posted that link before, it's worth checking out; he gives a good overview.
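The core idea fits in a few lines of plain Ruby: map each document to key/value pairs, then reduce by key. A word-count toy, not Google's implementation:

```ruby
# Word count as a miniature map/reduce, using plain Ruby Enumerable.
docs = ["the cat", "the dog", "cat and dog"]

# map: each document emits [word, 1] pairs
mapped = docs.flat_map { |d| d.split.map { |w| [w, 1] } }

# shuffle: group the pairs by key
grouped = mapped.group_by(&:first)

# reduce: sum the counts for each key
counts = grouped.transform_values { |pairs| pairs.sum(&:last) }

counts["the"]  # => 2
counts["and"]  # => 1
```

The point of the real thing is that the map and reduce steps are independent per-document and per-key, so the framework can scatter them across thousands of machines.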

Nerd time - issue 10

Nerd time is just a short mailing list I put out to my ex-coworkers at APL. They're in applied engineering research fields, so what's going on in the web world isn't well known to them in their daily work, and I fill them in from time to time. If you regularly read techcrunch, proggit, or slashdot, I'm sure you've seen these before.

Hey all,

I was going to make this one about new services I use that might not be well known, since there's nothing terribly interesting going on lately. But after a month of haphazardly collecting interesting things, no particular pattern appeared; it's just a hodgepodge of things I found interesting. There's nothing terribly hard this time. All easy reading, except for the one on APL at the end.

Take screenshots to measure your productivity.
This is something Ian's been asking for, and I thought he'd like to check it out. No Linux client yet though.

Prof. Randy Pausch's Last Lecture
This is a CMU prof who is dying of cancer, giving a last lecture. You can skip all the intros and outros; the actual lecture is about an hour. It's pretty good and entertaining. I found his lecture on time management to be pretty helpful.

Voice tracking camera
This is one of those "simple" things you wish you'd done. Theoretically, it's pretty easy: you use microphones to triangulate where the voice is coming from. But when you see his setup, he uses seven microphones all around the room, so it might be a bit complicated. It's a long way from our own two ears.

Commenting engines
Commenting is one of those fundamental aspects of web interaction that gets implemented over and over again: in wikis, in forums, in social apps, in blogs. But with commenting comes a host of problems. Some are technical, such as spam bots, cross-referencing comments, and keeping the most relevant ones. Some are social, such as trolls, scaling a conversation, etc. These two services implement all that for you, and commenting becomes just a widget. Not a bad idea, especially if they can thread conversations across different blogs.

Build your own car
I've always wanted a hackable Linux based car. Everything from the onboard entertainment system to the safety system. While this isn't it, it's a step closer. I think they'll ship you all the parts you need to build your own car.

Dopplr is a service for telling your friends, "hey, I'm going to [town], who's already there, let's hang out, or I need a place to crash" sort of thing. It's a social network focused on travelers. I've often been somewhere and found out later that a friend was there at the same time, but neither of us knew. It's still in private beta.

Mozilla Labs' social network in a browser.
This is an experimental add-on from Mozilla that tells you what your friends are doing online. It's like the news feed in Facebook. Any time anyone posts a link, updates their status, etc., you'll see it. And sending links to people is easy: you just drag a link to their photo in the sidebar. So instead of sending nerd time over email, I might as well blog it or use something like "The Coop".

OAuth is Open Authorization.
I think I posted something about OpenID way back. OpenID is an open way of having users verify to you that they are who they say they are, so you don't need a separate login/pass every time you want to use a new service. OAuth is a way for users to grant a new service permission to access their data through another service's API. So if you signed up for mobtropolis, and your social network were elsewhere, you'd use OAuth to authorize Mobtropolis to look up your friends.
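The flavor of it can be sketched with a signed request: the consumer proves it holds a token secret the provider issued, and never touches the user's password. This is a toy sketch loosely modeled on OAuth 1.0's HMAC-SHA1 signatures; all names and secrets here are made up:

```ruby
require 'openssl'

# Toy sketch of OAuth-style delegation: sign the request with a token
# secret the provider issued, rather than sending a password along.
def sign_request(method, url, params, token_secret)
  base = [method, url, params.sort.map { |k, v| "#{k}=#{v}" }.join("&")].join("&")
  hmac = OpenSSL::HMAC.digest(OpenSSL::Digest.new("SHA1"), token_secret, base)
  [hmac].pack("m0")  # base64-encode the signature
end

sig = sign_request("GET", "http://provider.example/api/friends",
                   { "oauth_token" => "abc123" }, "token-secret")
# The provider recomputes the signature with its own copy of the secret;
# a match proves the user really granted this consumer access.
```

The real spec adds nonces, timestamps, and careful percent-encoding, but the shape is the same: a shared secret per grant, never a shared password.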

XFN microformats and FOAF
This is also part of the effort to open up your social network. Microformats are basically little bits of metadata inside HTML tags, part of the effort to make the web more semantic. They can be used in conjunction with OAuth to make your social networks portable. We'll see if people make headway. You can view microformats with the Operator plugin for Firefox. Microformats are actually on quite a few web pages now.

Forth is a stack-based programming language. I don't know as much as I should about it, but it's mind-expanding. The core language lacked conditional branching and loops, but apparently that's because you can write your own, not to mention any other weird control structures you can think of. In fact, it is said you can write your own Forth-based PC, its environment, OS, and language in about 2000 lines of code.
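The stack discipline is easy to get a feel for with a toy evaluator. This is a sketch in Ruby, nothing like a real Forth: each word pops its operands off the stack and pushes results back.

```ruby
# Toy stack machine in the spirit of Forth: numbers are pushed,
# words pop their operands and push their results.
def forth(program)
  stack = []
  program.split.each do |word|
    case word
    when /\A-?\d+\z/ then stack.push(word.to_i)
    when "+"    then b, a = stack.pop, stack.pop; stack.push(a + b)
    when "*"    then b, a = stack.pop, stack.pop; stack.push(a * b)
    when "dup"  then stack.push(stack.last)
    when "swap" then b, a = stack.pop, stack.pop; stack.push(b, a)
    else raise "unknown word: #{word}"
    end
  end
  stack
end

forth("2 3 + 4 *")  # => [20]
forth("5 dup *")    # => [25]
```

Real Forth goes further: you define new words (including control structures) in terms of old ones, which is where the 2000-lines-for-everything claim comes from.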

APL -- the language
I had heard of this language, but never managed to see any code. I can see why: you need a whole other keyboard to program in it. But it is pretty neat. You can write Conway's Game of Life in one line, I expect because APL's array operations collapse whole-grid computations into a few built-in functions. Neat idea.

Friday, October 26, 2007

ArbCamp 2007

Well, I'm headed out to ArbCamp 2007. It's my first conference in a long while, and my first "unconference". I've been stuck with my nose to the grindstone the last couple of months, so it'll be nice to see what other people are doing. If any of you happen to be out in Ann Arbor or going to ArbCamp, drop me a line, and we can meet up.

There should be some more exciting things to post after this weekend, from ArbCamp and otherwise. Til Monday.

Monday, October 22, 2007

Where do you put the rules of Monopoly?

It's apparently a favorite interview question. I have to admit, the first response in my head, "everywhere", wasn't a great one.

What are game rules? When browsing the rules of popular games like Monopoly and Scrabble, they seem to follow a similar format:

  • The initial conditions of the game (the setup)
  • Then given a condition,
    • the set of allowable actions for the player to do
    • the effects of the condition
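That format translates almost directly into code. Here's a hypothetical sketch (all names made up) of rules as data: a setup block for the initial conditions, then named conditions mapped to their effects:

```ruby
# Hypothetical sketch of rules-as-data, following the outline above:
# a setup block plus condition => effects pairs.
class GameRules
  def initialize
    @setup = lambda { |state| }
    @rules = {}
  end

  def setup(&block)
    @setup = block
  end

  def on(condition, &effects)
    @rules[condition] = effects
  end

  def start(state)
    @setup.call(state)
    state
  end

  def fire(condition, state)
    @rules[condition].call(state) if @rules[condition]
    state
  end
end

monopoly = GameRules.new
monopoly.setup { |state| state[:cash] = 1500 }         # the initial conditions
monopoly.on(:pass_go) { |state| state[:cash] += 200 }  # a condition and its effect

state = monopoly.start({})
monopoly.fire(:pass_go, state)
state[:cash]  # => 1700
```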
I embarked on somewhat the same problem, though on a much smaller scale. I wanted to implement a point system that gave points to users who participated in a Rails app. My first reaction was to make a Rule abstract class and have different rule classes subclass it. Something like this (variable names have been changed to protect the innocent):
class SceneRules
  def initialize(model)
    @scene = model
    @user = @scene.submitting_user
  end

  def on_submit
    @user.points += 1
  end
end

class SceneshotRules
  def initialize(model)
    @sceneshot = model
    @user = @sceneshot.sceneshot_uploader
  end

  def on_submit
    @user.points += 10
    @sceneshot.scene.submitting_user.points += 5
  end
end

Then I'd be able to call it from the controllers. However, on second thought, it's rather ugly, since I'd be updating karma everywhere in the controllers. If I understand cross-cutting concerns correctly, scoring karma is a good example of one. I suppose it's a good candidate for aspect-oriented programming (AOP), so I scrapped the code above.

Logging is often cited as the poster-child problem to solve with AOP. Logging needs to be done everywhere in the code, but it really has nothing to do with the responsibilities of the class that it's performed in. So you have the same code doing the same thing, duplicated everywhere because there's no one place to put it to make things easy to change.
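In Ruby, the usual workaround is a bit of metaprogramming that wraps methods, so the logging code lives in one place instead of being sprinkled through every class. A minimal sketch, with hypothetical names:

```ruby
# Sketch of cross-cutting logging: capture the original method, then
# redefine it to log before delegating. The logging code lives here, once.
module LoggedCalls
  def log_calls(*names)
    names.each do |name|
      original = instance_method(name)
      define_method(name) do |*args|
        puts "calling #{name} with #{args.inspect}"
        original.bind(self).call(*args)
      end
    end
  end
end

class Account
  extend LoggedCalls

  def deposit(amount)
    @balance = (@balance || 0) + amount
  end
  log_calls :deposit
end

Account.new.deposit(10)  # logs the call, then returns 10
```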

By the same token, game rules and scoring are of the same nature. And because game rules involve lots of different objects at once, and scoring is interspersed throughout, I think that makes them a good candidate for AOP. However, Ruby has no direct support for AOP. Instead, the closest things we have are observers, before/after/around filters (in Rails), and some metaprogramming.

I wanted something that let me list out rules the way games like Monopoly and Scrabble do: a setup, then conditions and their effects. Scoring is simplified here because the only time you can score is when one of the models is created or changes state. That's a good fit for the observers and filters available in Rails.

class ScoringRules < ActiveRecord::Observer
  observe Sceneshot, Scene
  include Rules

  setup({ :scene => lambda { |sceneshot| sceneshot.scene } },
        { :sceneshot_uploader    => lambda { |sceneshot| sceneshot.sceneshot_uploader },
          :scene_submitting_user => lambda { |sceneshot| sceneshot.scene.submitting_user } })

  rule :after_create_sceneshot do |board, players|
    players[:scene_submitting_user].score += 5
  end

  # put more rules here

  def after_create(model)
    rule_dispatch(:after_create, model)
  end
end

Here, the Rules module is what encapsulates the setup, rule, and rule_dispatch calls. I needed setup so that I can access different "game elements" (the board and the players) to update the scoring. It basically stores the setup as a list of lambdas that it can execute at a later time when the rule needs to be executed. Now, when a model is created, we ask the rule dispatcher to figure out which rules execute based on the rules we've named, and then execute the attached block. The block is passed a hash of different game pieces that it needs to update the score and the game conditions. That's it.
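For the curious, here's my best guess at a minimal version of that Rules module. This is a reconstruction for illustration, not the actual Mobtropolis code: the `"#{event}_#{model class}"` dispatch naming and the stand-in model classes are assumptions.

```ruby
# Guessed-at minimal Rules module: setup stores lambdas naming the game
# pieces; rule_dispatch builds the rule name from the event and model
# class, evaluates the lambdas against the model, and runs the block.
module Rules
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    def setup(board_pieces, player_pieces)
      @board_pieces  = board_pieces
      @player_pieces = player_pieces
    end

    def rule(name, &block)
      (@rules ||= {})[name] = block
    end

    def board_pieces;  @board_pieces  || {}; end
    def player_pieces; @player_pieces || {}; end
    def rules;         @rules         || {}; end
  end

  def rule_dispatch(event, model)
    name = :"#{event}_#{model.class.name.downcase}"
    block = self.class.rules[name]
    return unless block
    # Evaluate the stored lambdas now, against the model that fired the event.
    board   = materialize(self.class.board_pieces, model)
    players = materialize(self.class.player_pieces, model)
    block.call(board, players)
  end

  def materialize(pieces, model)
    pieces.each_with_object({}) { |(key, fn), h| h[key] = fn.call(model) }
  end
end

# Stand-in models to exercise it:
Player = Struct.new(:score)
Sceneshot = Struct.new(:uploader)

class ScoringSketch
  include Rules

  setup({}, { :uploader => lambda { |sceneshot| sceneshot.uploader } })

  rule :after_create_sceneshot do |board, players|
    players[:uploader].score += 10
  end

  def after_create(model)
    rule_dispatch(:after_create, model)
  end
end

shot = Sceneshot.new(Player.new(0))
ScoringSketch.new.after_create(shot)
shot.uploader.score  # => 10
```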

I thought it was an interesting way to go about it, and it probably warrants some criticism. Is there any particular disadvantage to doing it this way? And if you can think of a way to avoid explicitly stating the model relationships in the setup, that'd be nice. half-tip!

Thursday, October 11, 2007

Operator for Microformats

I've put microformats in the stuff I've built before, and to be honest, I wasn't sure if they were going to take off. The formats were there, but none of the tools to actually use them were around. For a long time, I just used the Tails Export add-on in Firefox. You couldn't really do anything with it, but you could at least see microformats.

Operator takes it one step further, and adds actions to microformats. In a way, it's a godsend, and now I'm wondering why I didn't write it. I HATE having to cut and paste addresses into Google Maps. Now, if there's a microformatted address on a page, all I have to do is use Operator--two clicks and I'm there. (As an aside, Humanized Enso makes it even easier to include maps.)

Microformats have been gaining momentum for a year now, and I think they will be important in the web to come. Not because they're one more thing to have to know, but because the guys over at social network portability are using them as part of their solution to open up the walled gardens of today's social network stovepipes.

Given how we're all used to using web applications today, it's as if we're 'reborn' every time we sign up for a new service. It's an odd idea to be able to "take your network with you". If you can, especially as mobile devices get more powerful, more and more web applications can act as mediators in the social contexts of users as they go about the world.

One nice little app would be a name-recaller. At a conference or a party, my mobile phone would detect which other mobile phones are around. It could then query the web application whether I've met any of these people before, and what their names are. So when I actually go to shake their hand, I can address them by name. And as we all know, the sweetest sound to a person's ears is his/her own name.

But that's still a long way from now--probably not for another 4 or 5 years, if not more. However, I'd keep an eye on this area. I'm sure you'll hear more about it soon.

Model Reduction of Complex Dynamics


Alright, for some actual news. This is something that has been hard to achieve in computer graphics--that is believable fluid mechanics like water or smoke. Traditionally, they've been modeled as really large particle systems, which gets to be pretty expensive computationally.

These guys have managed to reduce the amount of computation required to simulate fluids, detailed in this paper.

I've only glanced at the paper, but it looks like they were able to frame the problem in such a way that they were able to use dimension reduction techniques to reduce the number of computations they need to do, but have the least noticeable effect. By noticeable, I mean not only to the eye, but also to physics. It also conserves kinetic energy in the simulation.

I don't understand much of the math in there yet, as it'll take some time to go through, but it reminds me of lossy compression algorithms and search engines. Not every piece of information is important, or important in the same way. If you can frame the problem the right way, you can throw away the less important information and still end up with approximately the same thing. It's kind of novel to think of that as applied to a computational process, rather than a stream of data.
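A crude illustration of the lossy-compression idea, nothing like the paper's actual math: keep only the few largest coefficients of a signal and check how little the reconstruction differs.

```ruby
# Toy sketch of throwing away less important information: zero out all
# but the k largest-magnitude coefficients, then measure the error.
def truncate(coeffs, k)
  keep = coeffs.each_with_index.sort_by { |c, _| -c.abs }.first(k).map { |_, i| i }
  coeffs.each_with_index.map { |c, i| keep.include?(i) ? c : 0.0 }
end

signal = [9.0, 0.1, -7.5, 0.05, 3.2, -0.02]
approx = truncate(signal, 3)  # keeps 9.0, -7.5, 3.2
error  = signal.zip(approx).map { |a, b| (a - b).abs }.max
# approx => [9.0, 0.0, -7.5, 0.0, 3.2, 0.0]; the worst error is only 0.1
```

Dimension reduction on a dynamical system is the same bet made about the computation itself: most of the interesting motion lives in a few directions, so simulate only those.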

Tuesday, October 09, 2007

First Review!

Admittedly, I submitted mobtropolis to Geekheartland, but it was nice to get a first review. You'll probably see less and less of these self promotions, since there are only so many "firsts" you can have.

All in all, there was nothing scathing about the review, and it actually said it was interesting. In general, everyone I've ever talked to who got it thought it was "interesting". Honestly, that has me puzzled. From everything I've read and heard about the history of innovation, great and revolutionary ideas polarize people. So far, no one who got it was turned off by the idea or thought it could never work. Update: Probably because not enough people have heard about it. I'm sure if the bloggers at Uncov got wind of it, they'd rip it to pieces.

We'll see what happens as I expand on Mobtropolis's vision.

Friday, October 05, 2007

Mergesort in Erlang was to be my redemption--alas it was not

Well, once I got started, I kinda couldn't help myself. I really should be debugging and cleaning up Mobtropolis, but sometimes, you need a break. To add to my shame of Erlang neophytism (probably not even a word), I tried out merge sort.

mergesort([Last]) ->
    [Last];
mergesort(List) ->
    [Left, Right] = split(List, [], []),
    io:format("~w ~w~n", [Left, Right]),
    merge(mergesort(Left), mergesort(Right)).

split([Last], Acc1, Acc2) ->
    [[Last | Acc1], Acc2];
split([First | Rest], Acc1, Acc2) when length(Acc1) < length(Acc2) ->
    split(Rest, [First | Acc1], Acc2);
split([First | Rest], Acc1, Acc2) ->
    split(Rest, Acc1, [First | Acc2]).

merge([], R) ->
    R;
merge(L, []) ->
    L;
merge([L_h | L_t], [R_h | R_t]) when R_h =< L_h ->
    [R_h | merge([L_h | L_t], R_t)];
merge([L_h | L_t], [R_h | R_t]) ->
    [L_h | merge(L_t, [R_h | R_t])].

Ugh. I thought mergesort would be my redemption, since in my mind it looked shorter. But I ended up having to write split(), since I didn't know the libraries well enough and couldn't think of a good list comprehension to split the list. Admittedly, it's like forgoing inventing the combustion engine and deciding to start walking--I just built it out of what little I knew. If you're so inclined, you can also try one-liner-ing (also not a word) mergesort, as the bar I set was pretty low. :( Like Bryan says:

"Get back into Erlang! it's good for you. ;)"

Speaking of Bryan, he posted a pretty good solution to bubblesort in the last post. It was a good lesson. Like anything worth learning, languages are often easy enough to pick up but hard to master. This was a good wakeup call to go through the examples in the book and master more Erlang before trying to write anything else on my own.
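For what it's worth, the same algorithm comes out fairly short in Ruby, where array slicing makes a hand-rolled split() unnecessary. A sketch, not an Erlang redemption:

```ruby
# Compact mergesort: each_slice does the splitting for free.
def merge_sort(list)
  return list if list.size <= 1
  left, right = list.each_slice((list.size + 1) / 2).to_a
  merge(merge_sort(left), merge_sort(right))
end

def merge(l, r)
  result = []
  result << (l.first <= r.first ? l.shift : r.shift) until l.empty? || r.empty?
  result + l + r
end

merge_sort([5, 3, 8, 1, 9, 2])  # => [1, 2, 3, 5, 8, 9]
```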