Thursday, October 29, 2009

You probably don't need an OLAP

It's a "well known" that relational databases are bad at multi-column "slice and dice" calculations.  So when you have data that you'd like to represent as an aggregated trend, it's easy to reach for that OLAP.  Chances are, you don't need it.  Here's an example of something where you want a count of the number of comments from an author by day.

select DATE(created_at) as DateOnly, count(*) 
from comments 
where author_id = 877081418 
group by DateOnly
order by DateOnly

The trick here is the DATE() function provided by various database vendors.  This returns any datetime as simply a date that can be aggregated.  

To be honest, I looked into this way too late and didn't contest the OLAP architectural decision until it was late.  We ended up having the legacy of dragging a big fat OLAP with all its trappings of complicating our architecture.  If you end up with a complex architecture, there's probably a simpler way you're not seeing.  The simpler your setup, the easier it will be for you to hold it all in your head and understand it when things go wrong.  

The only OLAPs we've found to be available was the open source Mondrian and Microsoft's Analysis Services.  To be honest, I found both to be way harder to use than it should have been.  If someone else wants to write another OLAP that's simpler to use without a lot of luggage to blow those two out of the water, the time is nigh.

Posted via email from The Web and all that Jazz

No comments:

Post a Comment