Thursday, December 31, 2009

Bubbles and triangles

Lately, I've been looking at more information visualizations, and it's not been said enough that simple geometry is often ignored.

Oftentimes, I'll see visualizations like this, where bubbles are employed to visually compare different records. It seems likely that people judge and compare differences in size by area. However, the artist/designer makes the mistake of mapping the data to the radius instead. This doesn't work for circles because of your old 4th grade math: A = pi * r^2, so the area grows with the square of the radius rather than in linear proportion to it. Double the radius and the bubble looks four times as big.
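To make the arithmetic concrete, here's a minimal sketch in Ruby. The method name radius_for and the max_value/max_radius parameters are my own inventions for illustration; the idea is just to take the square root before scaling, so the area tracks the data:

```ruby
# Scale a bubble so that its AREA, not its radius, is proportional to the value.
# max_value and max_radius are assumed inputs picked by whoever draws the chart.
def radius_for(value, max_value, max_radius)
  max_radius * Math.sqrt(value.to_f / max_value)
end

naive   = 10 * (25.0 / 100)        # data mapped straight to the radius
correct = radius_for(25, 100, 10)  # data mapped to the area

puts "naive radius: #{naive}, area-correct radius: #{correct}"
```

With the naive mapping, a record that is a quarter of the largest gets a bubble with one sixteenth of the area. With the square root, it gets a quarter of the area.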

For the record, it's done correctly here in the visualization above, as far as I can tell.

Here, the designer decided to use triangles. If he mapped the data to the height of the triangles, that's fine as long as the base stays fixed, because for triangles, A = 0.5 * b * h, and hence area varies in linear proportion to height.

However, looking at Hungary, the red triangle doesn't seem quite a quarter of the black triangle.

Beyond that, for these two examples, I really see no reason to use circles or triangles. People are able to judge differences in position and length much more easily than differences in size. It would have been far more effective to use bar charts and rectangles instead of shapes like triangles and circles. In my opinion, you only use sized shapes if the spatial x and y axes are already being used to convey other information.

Posted via web from The Web and all that Jazz

Wednesday, December 23, 2009

Practice coding faster

map_with_index. Why is there no map_with_index in Ruby? Turns out it's because you don't need it. You can simply do this:
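The original snippet didn't survive here, so this is a minimal sketch of the usual idiom rather than the exact code. It assumes Ruby 1.8.7 or later, where each_with_index without a block returns an enumerator that you can chain map onto. The variable names are just for illustration:

```ruby
letters = %w[a b c]

# each_with_index without a block returns an enumerator, so map receives
# both the element and its index.
labeled = letters.each_with_index.map { |letter, index| "#{index}: #{letter}" }

puts labeled.inspect
```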

Even now, I'm learning things about Ruby. The rabbit hole is deep.

Anyway, a post about learning how to code fast came across my desk. I knew that I wasn't quite as fast as other coders, but I had always thought that I thought deeper on the solution. But I think what he says makes sense. I know that violinists practice by slowly ramping up their speed to the point where they're almost making mistakes, faster than they'd actually perform the piece, so that playing it at the correct speed feels easy. Same with drawing. The more you practice drawing faster, the better you get at being economical with your strokes. So I figured I'd try the same with programming, since I've never done much of this type of exercise.

I decided to do the first Ruby Quiz, since I was most familiar with Ruby, and I should be able to do it quickly. It took me about three hours, including reading the instructions, going on bathroom breaks, etc. I think I should have been faster, and I noticed where I slowed down. I found myself trying things out in irb a lot because I didn't know the exact behavior of some array and string functions. Also, I spent some time in the beginning pondering how to structure it--should it be a class, or just a collection of functions, or should I extend the classes?

I'm not thrilled about how it's structured, but it works. Well, there's a small bug in there, but I'm going to refrain from fixing it. It'd tack on another 30 mins. The point of the exercise is that I can see what I need to work on. I'll try again next time with the next Ruby Quiz.

Posted via email from The Web and all that Jazz

Monday, December 21, 2009

On to the new old thing

Ever since I quit my job at the lab 4 years ago to pursue a startup, I'd been fumbling around learning all sorts of things. And though I had determination and persistence, I simply didn't know a lot of things outside of coding. Who wants this? Why would they pay money for it? Where do you find them? How do you change your idea if it doesn't work? Going into it, I knew I didn't know anything and that I'd learn, but I also didn't know what I didn't know.

For the past year and a half, I've been with a startup that went into the YC program. So as much as I wanted to pursue my own ideas, I was advised to join up and watch other people and learn. And learn a lot I did. What to do, and what not to do. And just seeing founders from other startups helped. What their thought processes and attitudes about their line of work were. Meeting role models is easier than reading about them, I guess.

And yet, while I've been learning about things outside of code, I felt like I've been sailing downwind when it came to technical things. Sure, I'd mess around with chickens flocking (which embarrassingly, I haven't gotten back to), but for the most part, I was consumed by work. Getting the tickets done and getting better at communicating with other team members.

However, much of my creativity was sapped. It was hard to fire up the editor afterwards and explore something new. I have a whole list of things I wanted to dive into more deeply. Haskell's type system. Erlang servers. Spatial trees for Frock. Arc language. Potion language. Prof Strang's Linear Algebra lectures. Visualizations and info graphics. Mobile web apps. 3D printers. And though I've dabbled in all of the above, it's not yet been enough to satiate my cravings.

But I have learned that it's also a lesson in keeping things small and simple at first. Much kudos to those who work a full-time job and are still able to get a side project up and running.

This blog also suffered as a result. I was afraid that the things I was learning outside of code might reveal too much about the internals of the startup I was with, so I just left nothing to chance. I simply fell out of the habit of blogging about what I've learned, and as a result I feel like my writing skills have deteriorated.

But now I've left the startup, moved to Mountain View, and I'm pursuing my own ideas once again. It's not a secret what I'm working on, but I'd just rather talk about it in a separate post. I've also started exploring other technical topics, as I hinted in the last post about Potion. I'll start finishing up the backlogged technical posts.

So for those of you that still are subscribed, well, thanks for your faith. I'll be writing more, and I hope you'll be able to learn something from reading this blog as well. Have a great holiday!


Saturday, December 19, 2009

Playing with potion

For fun this morning, I cloned _why's potion and started going through the tutorial. After cloning it, I was able to compile it on my mac, and then I tried to run:

"hello world" print

and it seg faulted. Boo. So in case any of you are looking for a work around, this worked:

And looking through the source, I found the about function's easter eggs. Humorous.

1 times:
about("_why") print
about("stage fright") print


Friday, November 13, 2009

jQuery live events can only bind once

I've been doing some more javascript on the side as of late, and I ran into a snag with jQuery.  It was an odd case of live events that didn't seem to be taking hold.  In addition, it only looked like it was happening on Chrome.  As always, it helps to read the docs.
Unlike .bind(), only a single event can be bound in each call to the .live() method.

That means that my overlapping live bindings were getting overwritten, and that Firefox and Chrome were merely adding the events in different orders. If it's not apparent in the docs, start looking in the code. As an aside, I've heard that the jQuery code is a good example of great javascript code. I don't understand a lot of what I saw. Well, tip. Been working on some other things on the side, and will reveal them in due time.


Wednesday, November 04, 2009

Tarsnap docs as an example of confusing typography

Because in 1960, the Bureau International des Poids et Mesures decided that the SI prefix G- meant 10^9.

But it means 2^30, really!

No it doesn't. Let's look at some examples:

Just a quick thought. It's pretty basic, but this was the first time I had a front-row seat demonstration of basic design principles and why they're suggested.

You can't really see it in the quotes, but if you follow the link, you'll see that the headings and text are the same size. Not only that, but the spacing between a heading and a paragraph is the same as the spacing between two paragraphs.

That confused me. I had thought that the bolded statements were part of the text, and hence something the author was saying, rather than, as intended, something we as readers would be saying, set as headings of different sections.

We differentiate importance and grouping by weight, size, color, and spacing. It takes a combination of these to discern what we're reading.


Thursday, October 29, 2009

You probably don't need an OLAP

It's "well known" that relational databases are bad at multi-column "slice and dice" calculations. So when you have data that you'd like to represent as an aggregated trend, it's easy to reach for that OLAP. Chances are, you don't need it. Here's an example of something where you want a count of the number of comments from an author by day.

select DATE(created_at) as DateOnly, count(*) 
from comments 
where author_id = 877081418 
group by DateOnly
order by DateOnly

The trick here is the DATE() function provided by various database vendors.  This returns any datetime as simply a date that can be aggregated.  

To be honest, I looked into this way too late and didn't contest the OLAP architectural decision until it was too late. We ended up dragging along the legacy of a big fat OLAP, with all its trappings complicating our architecture. If you end up with a complex architecture, there's probably a simpler way you're not seeing. The simpler your setup, the easier it will be for you to hold it all in your head and understand it when things go wrong.

The only OLAPs we found to be available were the open source Mondrian and Microsoft's Analysis Services. To be honest, I found both to be way harder to use than they should have been. If someone else wants to write another OLAP that's simpler to use, without a lot of baggage, and blow those two out of the water, the time is nigh.


Thursday, October 08, 2009

Scope in JavaScript is just from which door you entered

Put simply, we entered BigComputer via new, so this meant “the new object.” On the other hand, we entered the_question via deep_thought, so while we’re executing that method, this means “whatever deep_thought refers to”. this is not read from the scope chain as other variables are, but instead is reset on a context by context basis.

Javascript's scoping has been one of the most confusing things about it, just as Ruby's metaclass and object model is one of the most confusing things about Ruby. If you're looking to expand the horizon of what you understand about programming languages, it's worth it to figure out javascript scoping.

The paragraph gave a good way to think about it:  this changes based on the object that calls the method.  It only gets confusing when you start passing around functions and using callbacks, which is most of the power of functional programming.

As an example, here, I was using an anonymous function as a callback in the request() method.  But it doesn't work!

So that's just one way to solve it. If you're using Prototype, you can also try using the bind() method. jQuery doesn't have an equivalent bind method, as hard as I looked for it at one time. I was just about to write it myself (as it's not too hard), but according to the A List Apart article on Getting out of binding situations in javascript:

jQuery does not provide such a binding facility. The library’s philosophy favors closures over binding and forces users to jump through hoops (that is, manually combine lexical closures and apply or call, much as other libraries do internally) when they actually need to pass along a piece of code referring to “instance members.”

So while I use closures extensively in Ruby, I haven't had to explicitly think about the scope until I was using closures in Javascript.  Huzzah.  Hopefully, it'll prompt you to take a deeper look at Javascript.


Wednesday, October 07, 2009

Concurrency and integrity with validates_uniqueness_of

Here's another tidbit that I hadn't noticed in the rails docs before.  I was looking at validations for uniqueness and I saw this:

Using this validation method in conjunction with ActiveRecord::Base#save does not guarantee the absence of duplicate record insertions, because uniqueness checks on the application level are inherently prone to race conditions.

And the docs also offer some solutions:

This could even happen if you use transactions with the ‘serializable’ isolation level. There are several ways to get around this problem:  
By locking the database table before validating, and unlocking it after saving. However, table locking is very expensive, and thus not recommended.   
By locking a lock file before validating, and unlocking it after saving. This does not work if you‘ve scaled your Rails application across multiple web servers (because they cannot share lock files, or cannot do that efficiently), and thus not recommended. 
Creating a unique index on the field, by using ActiveRecord::ConnectionAdapters::SchemaStatements#add_index. In the rare case that a race condition occurs, the database will guarantee the field‘s uniqueness.

This typically isn't something you'd need to worry about until you have enough traffic that two requests can plausibly try to save the same value at the same moment. So don't worry about it too much until you get there, but be aware of the problem. Read the docs for more details and information. tip!
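To see why the unique index is the fix that holds up, here's a toy simulation in plain Ruby. None of this is ActiveRecord: FakeTable, taken?, insert, and simulate_race are hypothetical stand-ins I made up to mimic "validate first, save later" with and without a database-level uniqueness guarantee.

```ruby
# Hypothetical in-memory stand-in for a database table; not ActiveRecord.
class FakeTable
  class DuplicateKey < StandardError; end

  attr_reader :rows

  def initialize(unique_index)
    @unique_index = unique_index
    @rows = []
  end

  # What an application-level uniqueness validation does: look first.
  def taken?(email)
    @rows.include?(email)
  end

  # With a unique index, the insert itself rejects duplicates.
  def insert(email)
    raise DuplicateKey, email if @unique_index && @rows.include?(email)
    @rows << email
  end
end

# Two requests both validate before either one saves, then both save.
def simulate_race(table, email)
  first_ok  = !table.taken?(email)
  second_ok = !table.taken?(email)
  table.insert(email) if first_ok
  begin
    table.insert(email) if second_ok
  rescue FakeTable::DuplicateKey
    # The second request would report an error to the user here.
  end
  table.rows.count(email)
end

without_index = simulate_race(FakeTable.new(false), "a@example.com")
with_index    = simulate_race(FakeTable.new(true), "a@example.com")

puts "rows without unique index: #{without_index}"
puts "rows with unique index: #{with_index}"
```

Both validations pass in either case, because neither request has saved yet. Only the table with the index refuses the second insert.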


Tuesday, October 06, 2009

Amazon S3 and Paperclip plugin

Even after reading all the documentation, paperclip still has its quirks.  I've been pretty busy, but here's a short tip to tide you over.  When using paperclip with S3, make sure that you have the :path option set when using has_attached_file.  

It didn't take too long to figure out, but just in case, make sure the bucket option is also set, either in the has_attached_file declaration or in the s3 config file pointed to by the :s3_credentials option. Otherwise, you'll get a mysterious

"MethodNotAllowed: The specified method is not allowed against this resource." Error.

So head on over to Scott Mottes and learn it step by step.  tip.


Tuesday, September 22, 2009

If only

"If only I had ____ I would succeed."

These simple words will kill your dreams faster than anything else you could say or think. There are so many self-defeating thoughts that an entrepreneur can have, and they often take this very simple form.

Garry takes it in the direction of getting your hands dirty and building, and the recent HN discussion talks about whether one should sell or not. Reading these compels me to take it in a different direction this morning before work--I'd like to speak a little about mental blocks.

There are many reasons why you'd want to sell your company. Your business deals with fads and the market will go away. You're done with this thing and want to move on. But there is a bad reason I want to focus on: "it'll give me freedom to do what I want". I think when people say this, they mean two different things: 1) if I have lots of money to take care of life's annoyances like bills and college tuition, then my mind will be free to work on anything 2) if I have lots of money, I can fund whatever I want to work on. The latter, I find to be an unconvincing reason.

My dad is retired. He talks about starting a foundation to help education in Taiwan, and seems rather passionate about it. He spends a lot of time watching and reading Taiwanese news. Given a chance, he'll talk your ear off about it. However, he says, "if only I had a million dollars", he could start his foundation. And the way he usually thinks of getting the million dollars is through the lotto. Now, my dad is no fool. He knows the odds. And I don't know if it's a generation gap in the way jokes are told, but if he's serious, it's a mental block that I see in some friends also. It's an excuse to do nothing because of the perceived notion that the external world hasn't given you permission.

By contrast, a couple years back Oprah had some special on TV about a new school she was building in South Africa. Though she put in a hefty sum, I was surprised to find out that she didn't put in all the money herself. She had other people help her with donations. That's why she had Nelson Mandela, Mariah Carey, and others visit the school--to help bring in donations. Even when she could pay for it all herself, she enlisted other people to help. In a more recent example, Breadpig and xkcd joined forces to put a school in Laos. They're putting in the work, yes, but as far as I can tell, it's none of their personal money.

Just because something takes a million dollars to do, doesn't mean it has to be your million. 

Perhaps this is obvious to some of you, but I was a little bit surprised when I realized this. Growing up, I never thought about it too much, because in movies like Batman, Bruce Wayne funded his own crazy toys. So I naturally assumed that if you want to do huge things, you do it all with your own money. As a kid, I thought: If I wanted to build a Mechwarrior, I'd have to do it with my own money. If I wanted to build a loop-de-loop highway, I'd have to do it with my own money. If I wanted to build a giant chicken slingshot, I'd have to do it with my own money.

Of course, this comes with some amount of responsibility and constraint. Pissing away other people's millions is a sure way to get your legs broken, especially with money from a loan shark (or its million dollar equivalent). But I believe constraint in business and philanthropy, like constraint in design, is a good way to focus your efforts. Sometimes, personal money projects fail because they're not as readily subjected to market forces. A bad idea is kept afloat because there's a huge chunk of personal money that keeps getting dumped into it.

In the end, I just want to say, you have a choice.  Don't let a little thing like not having a couple million stop you from doing what you want to do, as there's always more than one way to skin a cat.  But if you want to build a pyramid for your burial site, then yes, please do that with your own money.


Sunday, September 13, 2009

rake task with arguments - Ruby Forum

Of course csh is evil! That's nothing new. This works just fine with bash:

rab://tmp $ cat Rakefile
namespace :foo do
  desc 'lol'
  task :bar, :num do |t, args|
    puts "num = #{args.num}"
  end
end

rab://tmp $ rake foo:bar[123]
(in /private/tmp)
num = 123

Hey look. Arguments in Rake. I've been looking for this for a while now. No more using env variables.


Saturday, September 12, 2009

Another bad data visualization

(click image to enlarge)


This is one of the worst data visualizations I've seen. Problem is it looks pretty, so people send it around, but it's not very informational. Nor does it allow easy comparison of the data. First, it's not apparent that the light green and the dark green sections are the same thing until you realize it's an "O" from "Google", and it actually adds no information. Second, what does the size of the circles represent? Is it combined daily spending or average daily spending per advertiser? It takes a while to find the circumferential text, from which you'd guess that it represents the amount of revenue from the top N advertisers. Then the chart also mixes terminology. While spending by advertiser and revenue for Google are the same thing, you need to do extra work to figure that out. Then, what the heck, the list of logos on the side is distracting. It's supposed to be the advertisers in the blue circle--the top 10 advertisers--but it sits firmly in the red section, which is the long tail of advertisers. Even more confusing, the $59,184,783 is red, but points to the blue list of logos. Lastly, the average daily spending is given the same position and weight as the combined daily spending, but it isn't what the size of the circles represents, which adds even more confusion.

The only thing they did right was to match the area of the circles with the amount of combined daily spending. Oftentimes, people will draw these sorts of graphs using the diameter as the basis for comparison, which is misleading.



Tuesday, September 08, 2009

Using HTML attributes as a mini-DSL for AJAX

At first, I thought being able to return JavaScript as responses from the server was pretty neat, using rjs (now js.erb) files or the render :update call in Rails. However, this often led to some messy code by me and my colleagues. I would see lots of client side code in the application controller like:

Sometimes, this is ok, but when html elements change, the controller methods break, and the effect cascades from the views through the controllers.  That's a code smell that our code is tightly coupled.  Of course, we can refactor it to a separate .rjs template file, and justify it to ourselves that it's the same as having an .html.erb file.  However, because .rjs files are often so short, the cognitive shift to find that other file is often distracting, and it still doesn't solve the problem of code coupling.

Consider the following:  You want to have a link that makes an AJAX GET request to get a list of comments for the post when clicked.  It gets the response as HTML and then dumps it in the target DOM.  You also want to fade in an indicator feedback when the ajax is loading and fade it out when it's finished loading.  Lastly, there's to be a slidedown effect on the target comments DOM element.  

For the situation above, if we were to do it with render :update or with rjs files, you'd have code that is coupled with DOM elements as I said earlier.  What if we can contain it all on the client side, and leave the server side for business logic?

One way is to use all the options that link_to_remote provides.  This way, all the UI effects are contained on the client side when you have an AJAX GET request.  We can now keep all our UI effects code in the views.  

However, what if we can declare this effect in HTML?  Would that work?  What are the advantages and disadvantages?

I went to the local Ruby meetup, and remembered that Ben Johnson of Authlogic was mentioning something about "data-" attributes in HTML5 and its relation to the problem I described above.  I wasn't entirely sure what he was talking about, until I looked around for the "data- attribute" and found good ole Resig blogging about it (last year, no less).  

What this allows us to do is basically insert data into our HTML elements.  And because jQuery events let us separate the "how" in javascript from the "what" in html, we can declaratively use it as a mini-DSL of sorts.  

The basics are pretty easy to implement.  I didn't do the data-indicator and the data-effect because I'm lazy and it's left to the reader "as an exercise".

Note that I'm proposing it to be a mini-DSL, so that very common AJAX idioms are covered, and you can simply declare things in HTML and not have to go into the javascript often, if at all.  That way, you keep working in the same file, working on the same level of abstraction.

There are some advantages to doing it this way.  
  1. Server response can be faster, since we wouldn't have to rely on the server to generate the proper html link.  The UI effects and behavior can be all done on the client side, where it should be.
  2. I think it's a bit cleaner to be able to say things declaratively, more aligned with how html is declarative.
  3. The DOM elements wouldn't couple the controller and the view.  Everything that refers to DOM elements would be in the view and be easier to change without cascading effects.
  4. You can work in the same level of abstraction while in the HTML views and won't have to jump between layers of abstraction.
There are few disadvantages I can see right now, other than not having the correct DSL, or having a method call that uses too many attributes, making the HTML hard to read.  (If you have others, comment below)

Pretty neat. So why use the class attribute as the method "call"? Perhaps it's better to find some other attribute. I've seen facebox use the rel attribute instead. That led me to wonder if other people have thought to do this before. And of course, there's something similar called the AHAH microformat, based on JAH from 2005, which is demoed here.

JAH does something notable, in that it uses the form:

in order to execute the javascript, instead of binding an event to the DOM. This removes the extraneous href="#" from the approach I showed above (more succinct), but it breaks the declarative nature, and it cannot easily be unbound and rebound to something else--one would have to change every place it's called, instead of adjusting the DOM selector (as rare as that may be). I personally don't think it's as easy to read, especially when the method call has parameters, but the implementation would be shorter.

These techniques seem to be by no means widespread at the moment, but one of the contributors is DHH of Rails fame, so I expect to see it in Rails soon. Be on the lookout for something similar in a future Rails. It seems like the technique was talked about back in 2005, but never fully incorporated or used. I have no idea why. In the meantime, I intend to incorporate it into new code I write.

(Apologies if there are typos above.  It's late, and I haven't had my ramen)


Monday, September 07, 2009

Getting all attributes of a DOM element in Javascript/jQuery

Sometimes, you need to iterate over a number of jQuery elements. You pull something like this:

$.each($(".hello"), function() { /* ... */ });

In this context, the "this" variable is actually not a jQuery object. According to the docs:
Whenever you use jQuery's each-method, the context of your callback is set to a DOM element. That is also the case for event handlers.

This is helpful when you want all the attributes of a particular DOM element. You can call attributes property on the "this" variable (this.attributes) inside of the each() method to get all attributes of each element with the class hello.

Lastly, if you have a jQuery element, you can get all attributes by:



Thursday, September 03, 2009

Testing Named Routes in the Rails Console

I finally found out how to do this, from the Rails Routing shortcut by David Black. In the Rails console, do this:

include ActionController::UrlWriter
default_url_options[:host] = 'whatever'

Then you can call your named route methods directly from the console.

This entry was posted on Tuesday, January 8th, 2008 at 1:58 pm and is filed under Programming, Ruby.

I've always just worked around this by trying it out in the templates. I should keep looking things up on Google. I learn much more that way. Anyway, thought the rest of you should know also.


A tiny empty shortcut

I think I've written about this before, by monkeypatching an empty method on the Array class that takes a block and executes it only if the array is empty.  But anyway, for some of you, you might see something like this taking place often:

To clean it up a little, you can do:

Or else, if each of your items is a partial, use the :collection option.
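For reference, the monkeypatch I mentioned at the top can be sketched like this. The method name if_empty is hypothetical (I don't have my earlier post handy), and the example runs in plain Ruby rather than inside a view:

```ruby
# Hypothetical helper name; reopens Array to add a block-taking emptiness check.
class Array
  # Runs the block only when the array is empty, and returns self so calls chain.
  def if_empty
    yield if empty?
    self
  end
end

messages = []
[].if_empty { messages << "No comments yet" }
[1, 2].if_empty { messages << "never reached" }

puts messages.inspect
```

Returning self is a design choice so you can keep chaining, for instance into an each call that renders the items when there are some.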


Wednesday, September 02, 2009

Expected delayed_job.rb to define DelayedJob

Recently, I needed to do some background processing.  I just needed something simple, and looking at all various options, I decided to go with Bj.  It was fairly simple to understand, and best of all, didn't require another daemon to be running by hand--it started one itself if you didn't configure it.  

However, it's incompatible with SQLServer. I'm guessing no one's ever used it with SQLServer before, since I didn't read anything about it in my research. Bj uses a column called "key", which SQLServer reserves as a future keyword and thus automatically renames to "[key]". So consider yourself forewarned.

Thus, I decided to switch to something similarly simple, and I started using Delayed Jobs.  It's all fine, except in some instances, I ran into the following error:

LoadError: Expected /[RAILS_ROOT]/vendor/plugins/delayed_job/lib/delayed_job.rb to define DelayedJob

After reading around on the web for a while, it seemed like any number of things could cause this error. I finally found the lines that explained it, way down in this rails ticket from a while back.

The critical comment at that link is: "Prior to this revision, Rails would happily load files from Ruby’s standard lib via const_missing; you will now need to explicitly require such files."

The rest of the comments also talked about various causes.  It's the case of different problems having the same symptom, and here, the backtrace isn't pointing to where the problem is.  

In the case of Delayed Job, Rails expects delayed_job.rb to define a module or class named DelayedJob. However, that plugin doesn't have any such class or module. It defines Delayed::Job instead. So when loading up dependencies, it's looking for a DelayedJob module or class when there is none.

A workaround is to simply define an empty DelayedJob module in your config/environment.rb file, or feel free to put it in a file under config/initializers if you don't want to pollute your environments file.  Hopefully, that saves you some pain.  tip.
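A minimal sketch of that workaround, assuming you put it in an initializer. The file name config/initializers/delayed_job_fix.rb is my own pick, not something the plugin asks for:

```ruby
# Empty placeholder so the dependency loader finds a DelayedJob constant
# after loading delayed_job.rb. The plugin's real code lives in Delayed::Job.
module DelayedJob
end
```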


Wednesday, August 19, 2009

Sending HTML emails with attachments in Rails

Here's something you might have missed in Rails.

From ActionMailer's Documentation:

Implicit template rendering is not performed if any attachments or parts have been added to the email. This means that you‘ll have to manually add each part to the email and set the content type of the email to multipart/alternative.

If you want to have pretty html templates and have an attached file, you can't just set the content_type to "text/html" and call attachment. You need to do it separately, like so:

There, now you can have html and your attachments too. Reference:

Wednesday, July 08, 2009

Regular Expression Matching and Postfix notation

As the compiler scans the postfix expression, it maintains a stack of computed NFA fragments. Literals push new NFA fragments onto the stack, while operators pop fragments off the stack and then push a new fragment. For example, after compiling the abb in abb.+.a., the stack contains NFA fragments for a, b, and b. The compilation of the . that follows pops the two b NFA fragment from the stack and pushes an NFA fragment for the concatenation bb.. Each NFA fragment is defined by its start state and its outgoing arrows:
The snippet doesn't make much sense unless you read the article, but this part, I thought, was rather neat. Usually, when I wrote my crappy, one-off parsers, I just used regexes to pull out the tokens that I needed. Never thought too much about how it was implemented. But what's detailed here makes sense. Regexes are just state machines where you track whether the string you're matching against lets you traverse all the way through the state machine. And to build that machine, the compiler pushes each fragment of the regex onto a stack until it reaches an operator, which pops fragments off, combines them, and pushes the result back. While I've usually felt postfix notation is ass-backwards from a user perspective, I can see the elegance of the implementation. I suspect Forth and Factor are similar in this regard.
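To convince myself of the stack discipline, here's a rough sketch in Ruby. It's a big simplification of what the article does: instead of NFA fragments, the stack holds plain regex source strings, and postfix_to_infix is a name I made up. The push-on-literal, pop-on-operator shape is the same, though.

```ruby
# Literals push a fragment; operators pop fragments and push a combined one.
# Fragments here are regex source strings, not NFA states as in the article.
def postfix_to_infix(postfix)
  stack = []
  postfix.each_char do |ch|
    case ch
    when "."            # concatenation: pops two fragments, pushes one
      right = stack.pop
      left = stack.pop
      stack.push(left + right)
    when "|"            # alternation: pops two fragments, pushes one
      right = stack.pop
      left = stack.pop
      stack.push("(?:#{left}|#{right})")
    when "?", "*", "+"  # repetition: pops one fragment, pushes one
      stack.push("(?:#{stack.pop})#{ch}")
    else                # literal: pushes a new fragment
      stack.push(Regexp.escape(ch))
    end
  end
  stack.pop
end

source = postfix_to_infix("abb.+.a.")
pattern = Regexp.new("\\A#{source}\\z")

puts source
puts pattern.match?("abba")
```

The input is the same abb.+.a. from the quote, which is the postfix form of a(bb)+a.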

Monday, June 29, 2009

Nerd time issue 16

I used to work at a research lab, and while it was cutting edge in some ways, it seemed unaware of things going on in software and the web.  I've since left the research lab, and Nerd Time started as a mailing list to tell my friends still at the lab about things that they might be able to use in their own projects.


Hey all,

It's that time again.  I never get feedback about whether you guys read this or not.  But apparently some of you do, as I found out on my last trip back to MD.  This is just a collection of things I found interesting since the last nerd time.  Obviously, there are other trends I'm missing.

If you want off, just lemme know. 

This is rather long since it's been a good 6 months since the last nerd time.  Work has kept me busy, and I don't read as much as I used to.  Since there's no theme, but lots of trends, they're in no particular order this time.  Skim through it and see if there's anything that catches your eye.

If you have questions about stuff, feel free to ask me (don't reply all!)

First, some stuff I did on the side:
Senate Majority vs National Debt
I was talking with Ian about graphing public data, and this was what he wanted to know.  This sort of thing should be so much easier.  If you find the process of getting this data to graph, lemme know.  I imagine it goes in line with a lot of the net-centric buzzwording that goes on in DoD projects.

Frock, a chicken flocking simulator
I wanted to get to know the Lua programming language, so I chose this as a project.  I'm getting it to support more chickens still.

And now, the other stuff.

So I'm sure most of you by now have heard of twitter.  Considering that Oprah did a show on it, it's crossed over to mainstream.  A lot of you might not think of it as anything to pay attention to.  However, it's one of those things where its value depends on whom you follow.  Beyond the hype, it's mainly a messaging multicast system that has a dead-simple API, so that other people can build things on top of it.  People have made things that twitter, such as plants that tell you when you need to water them, when bridges go up and down, when a meteor almost hits the earth, etc.

Wolfram Alpha and Google Squared and YQL
Wolfram Alpha and Google Squared were both announced in the last month or so.  Both are looking towards being able to query large amounts of structured data.  However, wolfram curates this data with experts, and google squared attempts to make structured data from indexing tables of data on the web.  In addition, Yahoo released YQL, which is a query language you can use to scrape the web and treat it as just another database.

Real-time search
Real-time search seems to be the wrestling ground for the next generation of search right now.  There are a number of competitors in this field, including giants and startups.  It's evident with the death of Michael Jackson that news doesn't just travel through the old channels anymore.

Google Wave
If you haven't heard, google released a new communication tool called google wave.  It's what email would be like if it was reinvented today.  It's basically combining different aspects of our communication tools and merging them all together.  It's best if you watch the video and play with the demo.  If you want to play with wave yourself, you don't have to wait for an invite, but can sign up with a wave server that someone set up themselves.  I recommend watching the video, as it breaks your presumptions of what's possible with HTML5 and the web.

Git and github
To me, this is really old news, but just in case you're still using SVN, you should check out Git instead for your source control.  It's ass-kicking good, though it has a slight learning curve.  I won't say too much more about it, but you should really look into it.  It'll expand your mind.

Key-value stores
Lately, there's been a flurry of attention on key-value stores.  I've mentioned one of them before, CouchDb.  There are a bunch of others.  Tokyo Cabinet (link #2) is used at, a social network in Japan.  Cassandra (link #3) is used at facebook.  Amazon has SimpleDB and Dynamo.  I've only played with tokyo cabinet and couchdb, so I can't really do a compare and contrast between them all.  But to me, TC, couchdb, and redis seem to be the most interesting.  This marks a shift away from relational dbs as the default data store.  Not that they'll replace relational dbs, but we're finding there's a different class of constraints for the web that relational dbs don't necessarily take care of.  In addition, they have properties not available in relational dbs, such as being schema-less, having an http server built in, replication, being distributed, etc.

The internet of things
It's something further out, but these first two talks from TED got me thinking about where the web was heading.  I don't think that the semantic web, as we imagine it, will come to fruition.  However, having the things we own talk to each other over the internet is not unfathomable.  They'll be able to negotiate with each other to perform a task, or they'll be able to keep a history of what they're doing or how you're interacting with them.

Cheap hardware boards
Hardware is already cheap, but building hardware yourself has still been somewhat of a pain.  I remember having to use Rabbit boards before.  There are better ones now.  I've mentioned arduino before.  Beagle board is a full board that you can run Ubuntu on.  Teensy is a small USB microcontroller.

Quake online
Gaming often is looked on as child's play, when in fact, it's some of the hardest programming around, and often drives innovation and progress in graphical techniques, AI, and hardware.  Carmack, the guy that wrote Quake, wants to put Quake on the browser.  For the longest time, people derided the web, saying it'll never match the performance of desktop apps, and never give the same user experience.  If Carmack can run quake on a native browser, then I believe desktop will lose.  If he's delivering quake as a video stream, then that's another matter altogether.

Reverse HTTP
HTTP is by design a pull model, where the client requests resources from a server.  If you wanted to push data to a browser client, you had to rely on a bit of javascript finagling called Comet (cousin to AJAX), where the client opens an http connection to the server, and the server leaves it open until it wants to push stuff to the client.  This certainly puts a load on servers because you have to keep connections open.  Alternatively, you can have the client keep polling the server.  That sucks too.  Reverse HTTP doesn't need to keep the connection open.  It basically takes advantage of the upgrade field in the HTTP header, normally used to switch to a more appropriate protocol, and instead uses it to turn the connection around, from the server to the client.  It's still experimental, but it makes a lot more things simple instead of messing with javascript on clients to push data to browsers.

Whiteboarding in real time
Many collaboration tools have come out.  We've discovered that the web is essentially a communications medium.  Anyway, this set of collaboration tools lets you whiteboard, compose text, and revise docs in real time as other people are editing them.  The last link shows you a re-play of paul graham writing one of his essays.  This allows people to see how they edit their text over time, and shows others how other people think as they write.  It'd be useful as an educational tool.

Facebook's walled garden

Facebook is the AOL of today.  It's basically a walled garden of data, where users live.  There's a bunch of effort to break it open.  Facebook also wants to open itself out as a fast follower to twitter.  I won't say much more here, but there's an ongoing battle about where data gets to go on this front.

and the sunlight foundation

Since Obama took office, there's been a big push and initiative to open up the government to its citizens in the name of transparency.  One of the things they're doing is and opening raw public data up to developers or anyone that wants to use it.  The sunlight foundation is doing the same for legislator and voting data.  I expect that we'll have more apps that will be able to take advantage of this data in the near future, not just to help the people govern their government, but also to lead more informed lives.

DNA engineering
A front that I don't know too much about, but which is probably a bigger revolution than the information age and the internet, is genetic and bio engineering.  23andme lets you submit cell samples of yourself, and they'll do genetic testing to tell you if you have genetic diseases, among other things.  You can now submit gene sequences and get them built for a modest amount of money (not super expensive, but still out of reach for hobbyists).  As the cost goes down, you'll soon see designer pets and bacteria.  The last post is about a guy that theoretically hacks a more potent variant of swine flu.

GWT, sproutcore, and Cappuccino

Javascript is the most widely used language in the world.  And while it has its merits as a functional language, people are trying to develop frameworks that compile to javascript.  Javascript is not bad when you're using jQuery.  Scriptaculous 2 just got released also.

Chat on couchdb, standalone web apps, and taking your data with you
This was curious.  Couchdb is a key-value, document-oriented database with an http server as its frontend.  They were able to demonstrate that you can fit an entire web app just in the database.  Data is code, and code is data.  Not only that, you can use the database's replication to port your data and sync it wherever you go.  It's an interesting head turn, even if it's just a demo.

Mozilla Ubiquity again, but this time hooks into webapps
I've mentioned ubiquity before, which is like a commandline interface for your browser.  I use it myself, but only in limited amounts.  What's interesting about the direction is that they leverage web services to complete tasks it can't complete for itself.  I think that high level languages will eventually adopt the idea of being able to easily hook into web services as a natural part of the language, without extra libraries.

Mozilla Jetpack lets you write Firefox addons with the web
Traditionally, web developers have stayed out of the realm of desktop developers.  This is one of the many indications I have that a lot of programming--especially those with user interfaces or a social aspect--will move towards web programming constructs.

Clojure, Scala, Haskell, Erlang

I'm not going to say too much about these programming languages, since I've mentioned them before, but just as a reminder, there's more than Java out there.  These four to me, represent the edge of programming languages that have potential in the future.  With the advent of multicores, it's likely that functional programming will lead the way in giving us adequate programming constructs to deal with multicores.  If you're a programmer, it'd probably serve you well to learn at least one of these in the coming 4 years.

ParrotVM and mod_parrot
Admittedly, I don't know much about Parrot.  But the claims it makes are big.  With the rise and popularity of dynamic programming languages like Python and Ruby, we're struggling for a fast virtual machine.  And to have to build a new virtual machine every time we have a new language is a pain.  ParrotVM is supposed to take care of easing that pain.  If that's the case, it might be easier to make languages catered to our problem domain.

Augmented reality and zombies
We've moved closer to having augmented reality.  This is a far cry from the geeky headcam helmets and laptop backpacks that dorky MIT profs wore a decade ago.  It still relies on a 2D barcode, and has limited uses, but now with the iPhone3GS out (it has a compass), we might see more augmented reality apps (as well as on android phones).

Sysadmin tips
Here are some good sysadmin tips.  I'd like to think I know my way around linux, when in fact, I've just started.

Probabilistic chips
I don't know any more than what's written in the article.  So read about it.

Google Moderator
Voting on websites is old hat since about 2005 with the advent of and  I found it curious that google has a moderator app, to help facilitate the asking of questions.  If you want your own, you can create a white label voting site at slinkset.

The rest of these are related to software, but not about code.  If you can only watch/read one, I'd recommend the poisonous people one.  That applies to more than open source projects.  In it, SVN core devs talk about how someone came in and told them they were all wrong.  I have a feeling that was Linus Torvalds, as he rails on the SVN guys in his talk.

The Business and Politics of Software

Pivoting, or knowing when to stop.

How open source projects survive poisonous people.

Linus Torvalds on Git

Build or buy?

Posted via email from The Web and all that Jazz

Friday, June 26, 2009

Netflix Prize barrier of 10% has been broken

Well, looks like they did it. A bunch of teams came together and put their solutions together to do 10%. Congrats.

Wednesday, June 24, 2009

Army Exoskeleton Suit Gives Man Superhuman Strength | Singularity Hub

"it is impressive enough to hear somebody say that they gave up on lifting a 200-pound weight after 500 repetitions not because they were tired but because they were bored."

I've always wanted one of these. I wonder if having mechs is too far off. Perhaps there's no tactical advantage to having a large humanoid robot.

In any case, this makes me wonder what other things DARPA funds.

Saturday, May 30, 2009

Returning the keys of all documents in CouchDb

There's a bit of a learning curve when trying to use CouchDb's mapreduce. One of the harder parts is to write the reduce function, which can have two separate cases: called from the map functions, and called again from reduce functions.

When you emit data from map, the examples show you emitting the document, but you can emit any data structure you care to dream up in the key and value portion of the emit. I needed a mapreduce view that returned all the keys that were present in all the documents. So if I had documents in the db in the form:
{"year": 2008, "birth_rate": 20.0 }
{"year": 2009, "birth_rate": 21.0 }
{"year": 2008, "death_rate": 20.0 }
{"year": 2009, "death_rate": 20.0 }

I wanted something that returned: ["year", "birth_rate", "death_rate"]

Here's one way to do it:
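A sketch of the idea, simulated in plain Ruby rather than written as an actual CouchDB view (real views are JavaScript functions stored in a design document). MAP_FN and REDUCE_FN are made-up stand-ins for the view's map and reduce functions, and the batching of two rows at a time is only there to force the rereduce case.

```ruby
# Plain-Ruby simulation of a CouchDB-style view. MAP_FN and REDUCE_FN are
# hypothetical stand-ins for the JavaScript map and reduce functions.
DOCS = [
  { "year" => 2008, "birth_rate" => 20.0 },
  { "year" => 2009, "birth_rate" => 21.0 },
  { "year" => 2008, "death_rate" => 20.0 },
  { "year" => 2009, "death_rate" => 20.0 }
]

# Map: emit one row per document, key nil, value = the document's field names.
MAP_FN = lambda do |doc|
  [[nil, doc.keys]]
end

# Reduce: called with rows from map (rereduce false) or with the results of
# earlier reduce calls (rereduce true). Either way every value is an array
# of field names, so taking the union covers both cases.
REDUCE_FN = lambda do |_keys, values, _rereduce|
  values.flatten.uniq
end

rows = DOCS.flat_map { |doc| MAP_FN.call(doc) }

# Simulate the database reducing in batches, then re-reducing the partials.
partials = rows.each_slice(2).map do |batch|
  REDUCE_FN.call(batch.map(&:first), batch.map(&:last), false)
end
all_keys = REDUCE_FN.call(nil, partials, true)
p all_keys
```

Making the map value the same shape as the reduce output is what lets one function body serve both cases. In a real database the documents also carry _id and _rev, so those would show up in the list unless the map function skips them.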


Wednesday, May 27, 2009

How to add helpers, controllers, models, and views of your plugin into the Rails loadpath

Sometimes, when you're writing a plugin, you end up writing models,
helpers, and controllers that the main app can use.  However, you
don't want to copy it into the main app all the time.  You'd like to
keep things separate between the plugin, but you'd like to be able to
include it in the path of the main app.

To do this, put the following in your init.rb file in the root
of your plugin.  To add a new view path in your plugin that's at
PLUGIN_ROOT/lib/views (where PLUGIN_ROOT is the root directory of your
plugin):

ActionController::Base.append_view_path(File.join(PLUGIN_ROOT, "lib", "views"))

Any template files (like html.erb) that you put in that path will be
seen in your app.

To add new helper, model, or controller directories in the rails load path:

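%w{ helpers model controller }.each do |dir|
  path = File.join(PLUGIN_ROOT, 'lib', dir)
  # Let plain `require` find files in this directory.
  $LOAD_PATH << path
  # Let Rails autoload constants from it (later Rails 2.x releases
  # namespace this as ActiveSupport::Dependencies).
  Dependencies.load_paths << path
end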

And now, any models you put in lib/model, lib/controller, and
lib/helpers will be in the rails load path.

Of course, this might all be moot with the reintroduction of Rails
engines in 2.3.  I haven't gotten around to using them or figuring it
out yet, but for now, this is how you do it with plugins.  tip!

Friday, May 22, 2009

Never mix package managers

Today, I decided to dive into Python and explore some web frameworks.
Instead, I've spent the better part of the day messing around with
package problems on Mac OS X.
The problem? Mixing package managers. I had installed py-setuptools
from MacPorts. Turns out it's still far behind, so it barfs when
easy_install uses it to install something like mysql-python.egg.
I should know better. On Ubuntu, I only use apt to get the language,
and the rest is managed in rubygems.

Wednesday, April 01, 2009

Senate majority vs. national debt - Getting at public data is a pain

Last weekend, my friend Ian and I were talking, and we got to the topic of trying to find data online to fact check newspapers. As newspapers get squeezed due to readership moving online, they have less money to do in-depth research. We hear more stories about Britney Spears and her ilk not just because of her hot pants, but because she's cheaper to cover. Sometimes, when newspapers put out something, we find out later that their facts weren't exactly right.

One of the things he wanted to fact-check was the national debt, and how it related to our politicians. I imagine he was incensed at the state of financial affairs and wanted to know who were the fools that did it. He wanted to know what the senate majority was plotted against the national debt.

After that, I took some time in the last couple of days to find the data, and then write a ruby script to scrape it and plot it on a javascript webpage. Here's the resulting graph:

The first surprising thing was how fast the national debt has grown in the last 30 years. It's been an exponential growth. I think that while increased spending has happened, the interest on the debt has a significant effect on that growth.

The second surprising thing was how much Democrats dominated the era between 1930 and 1980, with a sliver of Republican majority in the 50's. And before that, it was dominated by the Republicans. Apparently, political ideology shifts back and forth.

One of the things I have to note about the graph is that the first set of Red in the early 1800's is not the modern Republican party we know today. It was another party of the same name, I think also called Jeffersonian-Republicans. The gaps before and after were times when there were no Democrats or Republicans. There were Pro-administration or Anti-administration, or Pro-Jackson, Anti-Jackson, or Whigs and others. In addition, there weren't 100 seats in the Senate at the very beginning, so we see some instances where Senate Majority wasn't more than 50 seats.

Oh, and because I'm lazy, I didn't label the axes. The left y-axis is the # of senate seats held by the majority party. The right y-axis is the national debt in dollars.

We can see from the graph that the explosive rise in national debt occurred in the last two or three decades. In addition, both parties had Senate majority at the time. Not only that, but the Senate Majority party only had a slight majority, which meant that it could tip in favor of the other party from congress to congress.

Seeing how it was exponential, I plotted it as a log-plot. Ian quipped that "it's terrifying that it makes sense to plot the national debt in log scale."

You can see more details here. Remember, a straight line in a log plot means exponential growth. We can see that there are times in history that the US debt dropped or rose at a significantly rapid rate. I was surprised to see that in the mid 1830's, it looked like the US cleared itself of its debt. I don't know enough about history to know whether the US just defaulted or it paid the debt back. The significant rises in debt seem to correlate with the major wars: the 1860's for the US Civil War, the 1910's for WWI, and the 1940's for WWII.
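A quick way to convince yourself of the log-plot point, using a made-up series (a debt that grows a fixed 8% a year, not real debt figures):

```ruby
# Hypothetical series: a debt that grows by a fixed 8% every year.
debt = (0..5).map { |year| 1_000.0 * 1.08**year }

# On a linear scale the yearly increase keeps getting bigger...
linear_steps = debt.each_cons(2).map { |a, b| b - a }

# ...but the log of the series climbs by the same amount every year,
# which is why it plots as a straight line on a log scale.
log_steps = debt.each_cons(2).map { |a, b| Math.log10(b) - Math.log10(a) }

puts linear_steps.map { |s| s.round(2) }.inspect
puts log_steps.map { |s| s.round(4) }.inspect
```

Every entry in log_steps is just log10(1.08), the constant slope of the line, while linear_steps keeps climbing.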

Now the last graph is not for the faint of heart. It's a graph of the rate of change of the national debt on a log scale.

Because the changes have been so dramatic in the last couple of decades, it dwarfs any changes in earlier periods on a linear scale. Here, we can see that for most of the history of the US, the debt change fluctuated up and down. There were periods of lots of spending, but then also periods of cutting back. Sometimes, one party was responsible for doing that, and sometimes, it was another. But in recent years, we've just been accruing debt. See that little dip in the late 90's/early 2000's? That was the big savings from the Clinton era. Note that it's a log scale, so that a little dip up high on the scale means that it's huge when it's lower down. If we had the level of debt back in the 1920's, we'd almost have cleared it.

So all this has been interesting, but is this what falls under the "all that jazz" category? What does it have to do with the web and programming? I've been thinking about all the public data that's out there, and how to get to it. The conclusion was that it was pretty damn hard. There were four steps to tell this story to you. I had to find the data, scrape it, clean it, and then graph it. Of the four steps, the hardest part was scraping and cleaning. It took a good 4+ hours to do it, and I'm a programmer. Most other people that were curious enough could use excel, but last I checked, excel didn't do data scraping on web pages. Hello cut and paste.

I think it should be much much easier for citizens in a country where we elect government officials to be informed and see this data for themselves. Before, we had relied on journalists to give us the straight dope on these facts. But, as I mentioned before, the newspapers have been in decline. As a result there's less budget to pay for good reporting watching the government and what it's doing. Beyond watching the government, I expect that people generally have questions that can be best answered by graphs of public data--and those answers aren't just yes or no.

As an example, another friend of mine, Matt, is single and looking for the ladies. However, living in Columbia, MD, it's a tough dating environment--everyone's under 18 or over 40. So if he could move, which counties in Maryland have the highest number of single females?

Do any of you find that you have similar questions that can be answered by public data in graphs?

Data Sources:

Posted via email from Wil's posterous

Thursday, March 19, 2009

Bastardized recursion

I seem to be posting less. I've been thinking about why that is. Perhaps fewer things are surprising to me now (i.e. I'm not learning as much as before). When doing this Rails stuff, the bulk is standard fare, and only occasionally do you run into something mildly interesting. I have been queuing up posts, however. Between work, small side projects, reading, and hanging, there's less time than before.

I stumbled on something, which I saw in the Rails source once. Thought I'd share.

Say I have a :blog that has_many :posts. But Posts are subclassed to have many different types. But I wanted that post_type information from Blog in different formats. Originally, it looked something like this

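class Blog
  has_many :posts

  def post_types
    # ... (returns the post type classes)
  end

  def post_names
    # strip the Post:: namespace (call garbled in the source; gsub assumed)
    post_types.map { |pt| pt.to_s.gsub('Post::','') }
  end

  def post_string
    post_names.map { |n| "'" + n + "'" }.join(",")
  end
end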

Since they progressively built off of each other, I figured I can use a bastardized recursion, like I saw in find() in ActiveRecord::Base.

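class Blog
  has_many :posts

  def post_types(format = :classes)
    case format
    when :classes
      # ... (returns the post type classes)
    when :names
      # strip the Post:: namespace (call garbled in the source; gsub assumed)
      post_types(:classes).map { |pt| pt.to_s.gsub('Post::','') }
    when :string
      post_types(:names).map { |n| "'" + n + "'" }.join(",")
    end
  end
end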

Seems alright. Reduces the clutter of functions that are related to each other, so I'm on the lookout for being able to reduce related functions together like that. tip~!
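For anyone who wants to poke at the pattern outside Rails, here's a self-contained sketch. Everything in it is a stand-in: Post::Text and Post::Photo are made-up subclasses, Blog takes its posts through a plain constructor instead of has_many, and the :classes branch (distinct classes of the posts) is my assumption about what the base case does.

```ruby
# Stand-ins for the ActiveRecord models; Post::Text and Post::Photo are
# hypothetical subclass names.
class Post
  class Text < Post; end
  class Photo < Post; end
end

class Blog
  def initialize(posts)
    @posts = posts
  end

  # Each format is derived from the previous one by calling post_types again.
  def post_types(format = :classes)
    case format
    when :classes
      @posts.map(&:class).uniq # assumed base case: distinct post classes
    when :names
      post_types(:classes).map { |pt| pt.to_s.gsub("Post::", "") }
    when :string
      post_types(:names).map { |n| "'" + n + "'" }.join(",")
    else
      raise ArgumentError, "unknown format: #{format.inspect}"
    end
  end
end

blog = Blog.new([Post::Text.new, Post::Photo.new, Post::Text.new])
p blog.post_types
p blog.post_types(:names)
puts blog.post_types(:string)
```

The else branch isn't in the original; it just keeps a typo in the format symbol from silently returning nil.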

Found this reverse engineering brief on obfuscated code that recites the 12 days of Christmas. It uses the same technique that I described above. I suppose as always, case statements can be abused.

Friday, February 13, 2009

Reply comments to Frock

Since I can't comment on my own blog, here we go. And people on blogger groups don't know either.

I was reading that some of the tree methods were mainly used for static collision culling, due to their expensive insertion and deletion properties. I don't know yet if Quadtrees have the desirable properties.

Thanks for the link on the Voronoi paper. I'll take a look at it.

@Wally thanks. I'll see what I can do.

@Hal Oddly enough I saw your simulation on youtube when I was browsing around the other day. Do you have source I can look at for that?

Thursday, February 12, 2009

Introducing Frock, a flocking chicken simulation written in Lua with Löve

I recently learned about LÖVE, a 2D game framework for Lua, off HN. It looked simple enough, and when I ran the particle system demo, it was pretty fast. It seemed faster than what Ruby Shoes would be able to do. I got rather excited because
  1. I've always wanted to make my own game. A lawnmowing game comes to mind.
  2. I wanted to see if I could create a large flocking simulation
Also, I've decided a while back to start writing more great projects and do less reading of tech news garbage. While #1 would be on the backburner for the time being, #2 was fairly easy to do in a couple of hours. (I think about a weekend's worth of hours total tweaking things).

I've been fascinated with decentralized systems since right after college--one of them being flocks. But how do they work? For a long time, ornithologists (bird scientists) had no clue how birds flocked. Birds seem to be able to move in unison, as a super-organism, swooping, expanding/contracting, splitting. When you see massive waves of these things (these are starlings), it's something to behold. Who is the leader? How do they coordinate these flocks?

We still don't exactly know, since we don't yet have the capabilities to tap into a bird's decision making mechanism in real time. However, in the late 1980's, Craig Reynolds demonstrated that you can get very convincing flock-like behavior generated procedurally. And it ends up that you don't need a leader to get flocking behavior. All interactions can be local, and each flock member (or boid, as he called it) just needs to follow three simple rules:
  1. Attraction: Move towards the perceived center of the flock according to your neighbors
  2. Repulsion: Avoid colliding with your neighbors
  3. Alignment: Move in the same direction and speed (velocity) as your neighbors.
Add all these vectors together and you get a resultant velocity vector. Different amounts of the three influences lead to different looking types of flocks. As an aside, particle swarm optimization works on the same sort of principles.
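The three rules fit in a few lines. Here's a minimal sketch in Ruby (Frock itself is Lua); the Boid struct, the helper names, and all the weights are illustrative assumptions, not values taken from Frock.

```ruby
# Hypothetical minimal boid: position and velocity as [x, y] arrays.
Boid = Struct.new(:pos, :vel)

def vec_add(a, b)
  [a[0] + b[0], a[1] + b[1]]
end

def vec_sub(a, b)
  [a[0] - b[0], a[1] - b[1]]
end

def vec_scale(a, k)
  [a[0] * k, a[1] * k]
end

def vec_mean(vectors)
  sum = vectors.reduce([0.0, 0.0]) { |acc, v| vec_add(acc, v) }
  vec_scale(sum, 1.0 / vectors.size)
end

# Weights and min_distance are made-up tuning knobs.
def steer(boid, neighbors, attraction: 0.01, repulsion: 0.05, alignment: 0.1, min_distance: 10.0)
  return boid.vel if neighbors.empty?

  # 1. Attraction: head towards the neighbors' center of mass.
  toward_center = vec_sub(vec_mean(neighbors.map(&:pos)), boid.pos)

  # 2. Repulsion: move away from any neighbor closer than min_distance.
  away = neighbors.reduce([0.0, 0.0]) do |acc, other|
    offset = vec_sub(boid.pos, other.pos)
    Math.hypot(*offset) < min_distance ? vec_add(acc, offset) : acc
  end

  # 3. Alignment: nudge velocity towards the neighbors' average velocity.
  match = vec_sub(vec_mean(neighbors.map(&:vel)), boid.vel)

  # Sum the three weighted influences onto the current velocity.
  influences = [
    vec_scale(toward_center, attraction),
    vec_scale(away, repulsion),
    vec_scale(match, alignment)
  ]
  influences.reduce(boid.vel) { |vel, v| vec_add(vel, v) }
end

me        = Boid.new([0.0, 0.0], [1.0, 0.0])
neighbors = [Boid.new([5.0, 0.0], [0.0, 1.0]), Boid.new([100.0, 0.0], [0.0, 1.0])]
new_velocity = steer(me, neighbors)
p new_velocity
```

Changing the three weights is what gives you the different looking flocks: crank up attraction and they clump, crank up alignment and they stream.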

So here it is. Introducing Frock, a flocking simulator in Lua Love.

I release it now, because while it's primitive, it works (release early, release often!). The screenshot doesn't really show it well, they fly about in a flock, hunting for plants to eat. It's rather mesmerizing, and I find I just stare at it, the same way I stare at fish tanks.

It was originally a port of the Ruby Shoes hungry boids, but I used flying chickens lifted from Harvest Moon instead. I originally had cows flying about, but without flapping, it just wasn't the same. I also made the repulsion vector increase as a function of decreasing distance. Otherwise, the chickens didn't mind being right on top of each other if they were on the inside of the flock.

My immediate goal is to make it support more chickens, so I can get a whole swarm of them. Right now, I'm using an inefficient algorithm to calculate which chickens are neighbors (basically n^2 comparisons). So if any of you have good culling techniques applicable here, I'd love to hear it. I'm currently looking at R-trees.
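For reference, one common culling technique that's simpler than a tree is a uniform grid: bucket every chicken by cell, then only distance-check chickens in the surrounding cells. This is a sketch in Ruby of that idea, not what Frock does; CELL_SIZE and the function names are assumptions, and the cell size needs to be at least the neighbor radius for the 3x3 lookup to be complete.

```ruby
# Hypothetical sketch of uniform-grid culling. CELL_SIZE must be at least
# the neighbor radius so all neighbors sit in the 3x3 block of cells
# around a chicken.
CELL_SIZE = 50.0

def cell_of(pos)
  [(pos[0] / CELL_SIZE).floor, (pos[1] / CELL_SIZE).floor]
end

# Bucket every position index by the grid cell it falls in: one O(n) pass.
def build_grid(positions)
  grid = Hash.new { |hash, cell| hash[cell] = [] }
  positions.each_with_index { |pos, i| grid[cell_of(pos)] << i }
  grid
end

# Only chickens in the 3x3 block of cells around `i` get distance-checked.
def neighbors_of(i, positions, grid, radius)
  cx, cy = cell_of(positions[i])
  candidates = (-1..1).flat_map do |dx|
    (-1..1).flat_map { |dy| grid.fetch([cx + dx, cy + dy], []) }
  end
  candidates.select do |j|
    j != i && Math.hypot(positions[i][0] - positions[j][0],
                         positions[i][1] - positions[j][1]) <= radius
  end
end

positions = [[10.0, 10.0], [40.0, 20.0], [55.0, 10.0], [400.0, 400.0]]
grid = build_grid(positions)
p neighbors_of(0, positions, grid, 50.0)
```

Since the chickens move every frame, the grid gets rebuilt each frame, which is cheap compared to keeping a tree balanced under constant insertion and deletion.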

There are different possibilities as to where it could go. I think while lots of people have written boid simulations, they haven't taken it much further than that. While I've seen ones with predators, I haven't seen anything where people try to evolve the flock parameters, or try to scale it up to a large number of chickens. One can also do experiments on whether different flock parameters have different coverage of the field, or which set of parameters minimizes the time a plant is alive.

If, at the very least, it becomes a framework for me to write 'me and my neighbors' decentralized algorithms, that'd be useful too. And since Lua is supposed to be embeddable into other languages, that makes it an even more exciting possibility. Later on, I'll write a little something about Lua.

Well, if you decide to try it out yourself, get to the Frock github repo, and follow the readme. Patches are welcome, but note I haven't decided on a license yet--but it will be open source. If you have questions, feel free to contact me or comment. Have fun!

Tuesday, February 03, 2009

Can't comment

Somehow, blogger doesn't let me comment. I'm not sure why, and I'll fix it some time later, or export to Posterous.

Thanks trevor. I've subscribed to your weekly digest. I've been reading Hacker News all this time. However, it wasn't so much that I didn't have time to read, as I didn't have time to synthesize them.

And oh, thanks for the Perforce slides. They've been really helpful.

Nerd time, issue 15

Nerd time is back, due to some complaints in person about how they missed it. In brief, nerd time is an occasional mailing list I send out to people I know that work at research labs. They're usually slow in getting the tech world news, so I thought it would be fun to pass along some tech news bits. This is issue 15.
Hi all.

Well, after starting work at Frogmetrics last May, I got really busy writing code, learning about business and startup related things, and absorbing sales and marketing stuff, that I simply read a lot less. There was also less tech news going on that I felt was significant. When there's no significant news, the echoes become pretty loud in the echo chamber. Since I didn't hear anyone say anything about it, I figured no one read nerd time. And since I was reading less, I became more intellectually lazy. Hence the 10 month silence on nerd time.

But about a month and a half ago, I got in-person complaints that nerd time was no longer being sent out, so here it is reinstated. Some of these might be old news, but it's what I collected and found was significant over the last couple of months. And what the hell, old news in the tech world takes a while to get to the research lab world, so hey, this might be new to you. This time is mostly about source control and languages. I have other things I'm playing with that might be of interest, but will reveal them as I mature them.

Git the decentralized version control
Decentralized version control isn't anything new, but its adoption is. Git is pretty powerful. I won't go into all the reasons why. You can read about it in my post here. But on a deeper level, git is essentially a versioned filesystem. In fact, what's most interesting about it is how general it is and how you can use it for things other than version control. You can use it to synchronize address books, remote deploy code, or even as a basic wiki or blog.

Github a social network for developers
Github is like sourceforge, but much better designed, and has a social component to it. I can follow the coders and projects that I admire or find useful, and I can also see what projects they are committing to or watching. Thus, it gives me a sort of leading indicator of what the alpha nerds and geeks find interesting. And what they find interesting is what you and I will be using in our jobs 5 to 8 years down the line. An example of the generality of git, as mentioned in the previous entry, is given in the last link. It's of Raganwald, a well known Ruby coder and blogger who blogs on Github. Having a central place to commit and share code sorta defeats a mainstay of decentralized source control, but let's ignore that point for now, and drink the github kool-aid.

Along the same lines, someone combined git with BitTorrent. Whoot.

SUP friendfeed
I'm personally not as excited about SUP, as I don't mess around with feeds that much. But it claims to cut down on the bandwidth needed for servicing feeds.

If you thought Rails was lightweight, Sinatra blows it out of the water. While there have been other micro-frameworks for web apps, Sinatra takes the cake in my opinion. It doesn't take very much at all to get something up and running with Sinatra. So given the amount of prototyping work that the lab does, it helps to just get something demoable up. If you wanted to continue with it, Sinatra runs on a pretty solid web server, and you can optionally switch that out as well. I'd recommend taking a good look at it and brushing up on your Ruby skills too.

Clojure is a Lisp dialect that runs on the JVM. I've heard some good things about it, but I haven't really tried it out myself, so I can't speak to the merits of Clojure. However, as the second link below wonders, could Clojure be to concurrency-oriented programming what Java was to OOP? I've talked about Erlang in the past, and it definitely has some amazing traits as a programming language, barring the syntax.

As for actual programming languages I've been messing with, there are three: Erlang, Javascript, and Lua. I'll only talk about the last one--and I only started messing with it because of LÖVE.

LÖVE is an "unquestionably awesome 2D game engine"
More akin to Pygame in Python than to Shoes in Ruby, LÖVE lets you quickly build a game while still staying within the realm of programming. The reason I find it worth mentioning is the merits of Lua. It's a basic interpreted language, but it's embeddable into other languages, and its total size is pretty small, around 200k. This makes it ideal for use in embedded systems. In addition, it's one of the faster interpreted languages out there, and with LuaJIT, a just-in-time compiler for Lua, it's even faster.

Mozilla Weave and Prism
Google Chrome

While I'm sure the lot of you have heard about Google Chrome, what's interesting to me is how it relates to the direction that Mozilla and Adobe Air have been trying to move in: treating web applications like desktop applications. Not only will they be easy to install and easy to maintain, but they also afford easy collaboration with others. And with the maturity of Google Gears, being offline is not a problem now either. While there are still a couple of bastions where pure desktop applications reign, such as gaming, I think we'll find web-app-style development becoming more pervasive for desktop apps.

SVG and application development
I don't know whether SVG will be the future of application development, but I know that the current HTML and CSS constructs were meant for documents. Web developers are actually bending those tools to fit application needs. While it's useful to think of the web as a collection of resources and documents--it makes for a scalable app--the actual page elements are still stuck in document-page speak. It would make app development easier to have app-specific constructs. SVG may or may not help in that regard.

Stackless Python, PyCUDA
There are a lot of people trying to find suitable languages for our multicore future. I'm not so sure that Erlang will be it. However, functional programming concepts are going to make a comeback, if they haven't already. Python is poised to handle concurrency through Stackless. Some people are also experimenting with using Python to access Nvidia's new CUDA architecture.

Gnip Central
Gnip Central acts as a data middle man. Often times, getting data through an API sucks for various reasons, and it's convenient to have a middle man that converts that data for you, makes it available, or turns it into a push model instead. It's an interesting niche, and I see this as a perennial tar pit of data portability.

New York Times API
The New York Times is probably the most forward-thinking of all the newspapers when it comes to the web. Who else do you know that has released an API?