Thursday, 31 May 2007

Google Gears -- useful, but a couple of early limitations?

I just got back from the Google Developer Day in London, where one of the "big announcements" was Google Gears -- a way for AJAX applications to work locally on your computer even when offline. Gears keeps a local copy of the web data an application would use in an SQLite database on your PC; the local data is synced with the remote web data whenever you are online.

There seem to be a couple of yet-to-be-solved issues that some apps will strike:
  1. Merging conflicts
  2. Knowing what data you need ahead of time

Merging conflicts
Imagine if Wikipedia was an AJAX site, working offline. Algernon and Berty could both edit the GoogleGears entry, and then when they both go online, whose edit will be uploaded to the site? Algernon's? Berty's? Or will an error happen?

The Gears team are pretty up-front that they haven't solved this yet. Unfortunately, they're not so clear on what will actually happen at the moment -- I asked one of the developers whether there was at least any way for an application to find out which tables or rows are in conflict, but the answer was a pretty blank "we haven't added anything for that, there's just whatever SQL provides". So I guess that means at the moment the last upload wins (and nothing will even know there was ever a conflict). Maybe there's some way to check a lastUpdate timestamp on the server though...?
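
To make the timestamp idea a bit more concrete, here is a rough sketch of how a server could at least notice a conflict. It is not anything Gears provides: the `entries` table, the `version` column (standing in for a lastUpdate timestamp) and the `upload_edit` function are all made up for illustration, and Python's built-in `sqlite3` module plays the part of the server database.

```python
import sqlite3

def make_server():
    """Create an in-memory stand-in for the server-side table (hypothetical schema)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE entries (id TEXT PRIMARY KEY, body TEXT, version INTEGER)")
    db.execute("INSERT INTO entries VALUES ('GoogleGears', 'original text', 1)")
    db.commit()
    return db

def upload_edit(db, entry_id, new_body, base_version):
    """Apply an offline edit only if the row is unchanged since base_version.

    Returns True if the edit was applied, False if someone else uploaded first.
    """
    cur = db.execute(
        "UPDATE entries SET body = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_body, entry_id, base_version),
    )
    db.commit()
    # rowcount is 0 when the version check failed, i.e. there was a conflict.
    return cur.rowcount == 1

server = make_server()
# Both editors synced at version 1 and then edited offline.
algernon_ok = upload_edit(server, "GoogleGears", "Algernon's edit", 1)
berty_ok = upload_edit(server, "GoogleGears", "Berty's edit", 1)
print(algernon_ok, berty_ok)
```

With this check the first upload wins instead of the last, and the second uploader is told there was a conflict rather than silently overwriting. What the application then does about it (merge, ask the user, discard) is still an open question.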

Knowing how much data you'll need
Storing the data locally will only work for applications that know what data they want to store -- for example your email or calendar. That sounds pretty obvious and unavoidable. But for one recent craze (that Google's keen on) this could be problematic: "mashups of mashups" -- letting users combine information from multiple sites and functionality from multiple mashups.

For example, let's start with a TV guide. Let's call its data t.

Now let's use a mashup that links the TV guide to some reviews from Rotten Tomatoes. Now we need data:
t, r(t)

Let's also use a mashup that uses the IMDB or the BBC's programme data to find out what other TV shows the actors have been in. "Open All Hours": Granville is played by David Jason, who you'd know from "A Touch of Frost" and "Dangermouse". Now we need data:
t, p(t)

But hang on, I don't want it telling me "Open All Hours": Customer Number 3 is played by Joe Bloggs who appeared in "RubbishProgrammeX". I only want to hear about actors who were in quality shows. So let's use the review site again, and combine the mashups so I only hear about the good shows the actors were in.
t, r(t), p(t), r(p(t))

That last one looks a little big in offline mode. A user might click on any show in the guide today, and the app needs to check the reviews of all the other shows each actor has been in. That's probably OK to do online for one show that the user has just clicked on. It means checking about a thousand reviews (say 20 actors, each having 50 other roles). But for the mashup to work offline, well, the user might click on any show on any of 40 channels today. Let's say there are 1,000 shows on TV today. We need to pre-fetch around a thousand reviews for each of those shows. Suddenly we're pre-fetching a million reviews! (OK, minus a significant number of overlaps.)
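
For anyone who wants to play with the numbers, the back-of-envelope sum can be written out like this. The figures are just my guesses from the paragraph above, and the function name is made up; it gives an upper bound that ignores overlaps.

```python
def reviews_to_prefetch(shows, actors_per_show, other_roles_per_actor):
    """Upper bound on reviews to cache, ignoring overlaps between shows."""
    per_show = actors_per_show * other_roles_per_actor
    return per_show, shows * per_show

# Guessed figures: 1,000 shows today, 20 actors each, 50 other roles per actor.
per_show, total = reviews_to_prefetch(1000, 20, 50)
print(per_show, total)  # 1000 1000000
```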

To work offline, a mashup-of-mashups could have to do a number of joins across multiple sites, and pre-cache the (quite large) result.
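
Here is a toy sketch of that pre-caching step, assuming the three sites' data have already been pulled into plain Python structures. The dictionaries and function names are invented for illustration (and Joe Bloggs's career is as fictional as ever); a real mashup would be making requests to each site instead. Collecting the shows into a set is what takes care of the overlaps.

```python
# Hypothetical toy data standing in for the separate sites.
tv_guide = ["Open All Hours", "A Touch of Frost"]  # t
cast = {  # show -> actors
    "Open All Hours": ["David Jason", "Joe Bloggs"],
    "A Touch of Frost": ["David Jason"],
}
roles = {  # actor -> shows they have appeared in
    "David Jason": ["Open All Hours", "A Touch of Frost", "Dangermouse"],
    "Joe Bloggs": ["Open All Hours", "RubbishProgrammeX"],
}

def other_shows(show):
    """p(show): other shows that the cast of `show` have appeared in."""
    return {
        s
        for actor in cast.get(show, [])
        for s in roles.get(actor, [])
        if s != show
    }

def prefetch_plan(guide):
    """Shows whose reviews must be cached offline: covers r(t) and r(p(t))."""
    needed = set(guide)
    for show in guide:
        needed |= other_shows(show)  # the set removes overlapping shows
    return needed

plan = prefetch_plan(tv_guide)
print(sorted(plan))
```

Even in this tiny example, two shows in the guide turn into four shows' worth of reviews to cache, and that is only after the overlaps have been removed.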

Friday, 25 May 2007

First post

I submitted my PhD recently, and with a fresh stage of life comes a fresh blog! More to the point, I wanted to change the address from the rather cryptic 'whb21' to the more comprehensible 'wbillingsley'. I'll leave the old one there for now, though.