MobileMe

A lot of people have been posting about the MobileMe fiasco lately, and Apple has just extended the free MobileMe period by another 60 days.

As someone who’s written a little bit of Sync Services code, and who’s had some involvement with large-scale server-side implementations, I thought I’d throw in my tuppence worth.

I have to say that I’m not all that surprised by the problems that Apple has run into with MobileMe.

The general take seems to be that the major initial problem was inadequate capacity planning.

I don’t have a big problem believing that. Syncing tends to be disk and CPU intensive.

When you’re syncing locally on your Mac, it’s a bit of a drag, but not all that bad, even though an intensive sync session (for example, if some syncing app decides it needs to refresh all of its data) will basically suck up all of your CPU, or at least one of the processors.

But, for MobileMe, the process is at least as CPU and disk intensive on the server side. Given that there are probably hundreds of thousands, if not millions, of MobileMe users, a *lot* of servers will be required to soak up the load.

This is also probably why MobileMe seems to be now using scheduled syncing, instead of trickle syncing. Scheduled syncing basically means “sync every 15 minutes, if necessary” or whatever time period is appropriate. With trickle syncing, every time a syncable record is changed or added, a sync occurs shortly afterwards, so that syncing is almost instantaneous. Even though trickle syncs, by their nature, tend to involve very small volumes, lots of them will add up, and because the overheads for starting and managing a session are being paid many many times over, trickle syncing is almost bound to be more intensive than scheduled.
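To put some rough shape on that overhead argument, here is a back-of-envelope sketch in Python. Every number in it (the cost of starting a session, the cost per record, the times of the changes) is invented for illustration and is not a measurement of MobileMe. It also simplifies trickle syncing to "one session per change".

```python
def sync_cost(sessions, records, session_overhead=50.0, per_record=1.0):
    """Server cost in arbitrary units: fixed overhead per session plus work per record."""
    return sessions * session_overhead + records * per_record

# Hypothetical minutes of the day at which a user changed a record.
# Edits tend to come in bursts, which is what scheduled syncing exploits.
change_minutes = [1, 2, 3, 20, 21, 22, 23, 200, 201, 202]
INTERVAL = 15  # scheduled sync period in minutes (assumed)

# Trickle: one session shortly after every change.
trickle_sessions = len(change_minutes)

# Scheduled: one session per 15-minute window that contains at least one change.
scheduled_sessions = len({minute // INTERVAL for minute in change_minutes})

trickle_cost = sync_cost(trickle_sessions, len(change_minutes))
scheduled_cost = sync_cost(scheduled_sessions, len(change_minutes))

print(f"trickle:   {trickle_sessions} sessions, cost {trickle_cost}")
print(f"scheduled: {scheduled_sessions} sessions, cost {scheduled_cost}")
```

The same ten records get synced either way. The difference is entirely in how many times the session overhead is paid, and that gap grows with whatever the real per-session cost turns out to be.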

Apple’s initial MobileMe publicity and presentations seemed to indicate that MobileMe would use trickle syncing over the air, but this seems to have been dropped pretty quickly, most likely in an attempt to conserve server capacity on Apple’s part.

Even so, I don’t expect capacity to be a big deal in the long run.

Each user’s data is essentially orthogonal to every other user, so no matter how high the load, it can be scaled efficiently, horizontally across whatever number of servers are available.
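As a minimal sketch of what that horizontal scaling could look like, assuming each user’s data lives entirely on one server: hash the user’s ID and use it to pick a server. The server names and the `server_for_user` function are hypothetical, and I have no idea how Apple actually partitions its users.

```python
import hashlib

def server_for_user(user_id, servers):
    """Map a user to one server using a stable hash of the user ID."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Take the first 8 bytes as an integer and reduce it to a server index.
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]

# Hypothetical server pool.
servers = ["sync-01", "sync-02", "sync-03", "sync-04"]

for user in ["alice@example.com", "bob@example.com", "carol@example.com"]:
    print(user, "->", server_for_user(user, servers))
```

Because no user’s sync ever needs another user’s data, adding capacity is mostly a matter of adding servers. Plain modulo hashing like this would reshuffle most users whenever the pool changes size, so a real deployment would more likely use a lookup table or consistent hashing, but the principle is the same.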

If, as rumour has it, Apple is really backing MobileMe with Oracle running on Sun hardware, then scaling will probably be a lot more expensive than if they’d run on MySQL on Xserves (which is what I’d have done, in their position), but Apple’s pockets are probably deep enough to cover that.

But even if the capacity problems are solved, which I expect they will be, I suspect that it will be very difficult for Apple to meet users’ expectations with MobileMe. While the demos were very impressive, in practice, MobileMe seems likely to encounter almost as many problems as .Mac.

The .Mac Apple forums (which seem to have disappeared) show lots of complaints from users about failed syncs and disappearing data with the old service, and these are exactly the kinds of issues that I’m talking about.

When I saw the MobileMe demo, I hoped for a moment that Apple had cracked these problems, but although they’ve obviously put a lot of work into MobileMe, and added lots of functionality, such as the gorgeous (but potentially insecure) WebMail interface, the underlying syncing technology is fundamentally pretty much the same.

Now this is not intended to criticize the Sync Services team (who are a great bunch of guys), or their implementation. Syncing is one of the holy grails of personal computing. Who doesn’t want to be able to read or update their personal data, anywhere and everywhere, and for it all to match up? Apple has worked hard at Sync Services, and done a great job with it, and when it works, it’s a joy.

But even so, getting syncing to work *reliably*, across a wide range of devices and applications, has proved to be very very very difficult, and because of the nature of syncing, any error, anywhere, soon becomes an error everywhere. Unreliable syncing is almost always worse than no syncing at all.

If you don’t know how syncing works, here’s a quick primer. On your Mac, every user account has a “Truth Store”. This is where your syncable data lives. You also have a number of “Sync Clients” on your Mac, for example Address Book, Safari, Keychain, etc. Every application that can synchronise its data is a Sync Client.

.Mac also operates as a Sync Client, and I expect that MobileMe does as well.

When a sync client wants to sync, it connects to the Sync Server (on your Mac), and compares its data against The Truth. The client accepts updates to its data, and writes any updates made via the application to the Truth.

For example, you add a new entry to Address Book, which synchronises with the Truth Store, and writes the new entry to the Truth. Now when other clients sync (e.g. MobileMe), Sync Services tells them “hey, there’s a new address”, which they grab and add to their data (which in the case of MobileMe, is held on Apple’s servers).
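The flow in that example can be sketched as a toy model in Python. This is not the Sync Services API. The `TruthStore` class and its `register`, `push` and `pull` methods are names I’ve made up, and the real thing handles conflicts, schemas and a great deal more.

```python
class TruthStore:
    """Toy stand-in for a per-user Truth Store: current records plus a change log."""

    def __init__(self):
        self.records = {}   # record id -> current value
        self.log = []       # ordered changes as (client, record id, value)
        self.cursor = {}    # client name -> how much of the log it has seen

    def register(self, client):
        """A new client starts out having seen everything logged so far."""
        self.cursor[client] = len(self.log)

    def push(self, client, record_id, value):
        """A client writes a change made in its application to the Truth."""
        self.records[record_id] = value
        self.log.append((client, record_id, value))

    def pull(self, client):
        """Return changes made by other clients since this client last synced."""
        start = self.cursor[client]
        self.cursor[client] = len(self.log)
        return [(rid, val) for (src, rid, val) in self.log[start:] if src != client]


truth = TruthStore()
truth.register("AddressBook")
truth.register("MobileMe")

# You add a contact in Address Book, which writes it to the Truth.
truth.push("AddressBook", "contact-1", "Ada Lovelace")

# When MobileMe next syncs, it is told about the new address.
print(truth.pull("MobileMe"))     # [('contact-1', 'Ada Lovelace')]
print(truth.pull("MobileMe"))     # [] (nothing new since the last sync)
print(truth.pull("AddressBook"))  # [] (a client is not sent its own change)
```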

Most of the time, this all works fine.

However, inevitably, there are bugs. These might be bugs in Apple’s Sync Server code – historically, there have been a few. Or, they might be bugs in client code. Although Apple provides sample code, writing a sync client and correctly handling all of the strange cases that might be thrown at you is highly non-trivial.

Any bug, in any client, can potentially have unwelcome effects on all of the other clients. That’s the whole point of syncing after all. When a change is made, it is reflected everywhere. If that happens to be the wrong change, then it will mess with all of your data. Oops.

You can think of the Truth Store as a well, from which every client drinks. And any badly behaved client can easily, and almost always accidentally, pour poison into the well.

Classic bugs are duplicating all of your data (because the client thinks Sync Services told it to send all of its data to the Truth again, and the Truth then thinks “I already know about this record, so the client must mean that I should create another, identical one”), or deleting all your data (this is also surprisingly easy to do), but there’s all kinds of fun and madness in between.
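Here is a deliberately simplified illustration of the duplication bug, in Python. It assumes a truth store that matches records purely by ID and treats anything arriving without an ID as brand new. The `Truth` class is hypothetical, and real Sync Services record matching is more involved than this.

```python
import itertools

class Truth:
    """Toy truth store that identifies records only by their ID."""

    def __init__(self):
        self.records = {}
        self._next_id = itertools.count(1)

    def add(self, value, record_id=None):
        """Store a record. With no ID it is treated as new and given a fresh one."""
        if record_id is None:
            record_id = next(self._next_id)
        self.records[record_id] = value
        return record_id


truth = Truth()
contacts = ["Ada", "Grace"]

# First sync: the client sends its contacts and remembers the IDs it gets back.
ids = [truth.add(name) for name in contacts]
print(len(truth.records))  # 2

# Well-behaved refresh: resend everything with the known IDs. Nothing changes.
for record_id, name in zip(ids, contacts):
    truth.add(name, record_id)
print(len(truth.records))  # 2

# Buggy refresh: the client has lost its ID mapping and resends everything as new.
for name in contacts:
    truth.add(name)
print(len(truth.records))  # 4, every contact now exists twice
```

Once those duplicates are in the Truth, every other client dutifully picks them up on its next sync.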

There are very very few sync clients that can’t be caught out by some unusual combination of data or events.

So let’s say that syncing is working fine 99% of the time. For many programs, 99% is acceptable. For example, if Acorn crashed one in 100 times that you ran it, it would be a pain, but not that bad.

But in the case of syncing, if you’re making any changes to your syncable data, it’s quite possible for you to be running 20 or so sync sessions a day. Even if the chance of a problem is small, you’re very likely to hit it once a week, or even more frequently.
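The arithmetic is easy to check, using the 99% figure and the 20 sessions a day from above, and assuming (a simplification) that each session fails independently of the others:

```python
def chance_of_problem(success_rate, sessions):
    """Probability of at least one failure across a number of independent sessions."""
    return 1 - success_rate ** sessions

per_day = chance_of_problem(0.99, 20)        # roughly 0.18
per_week = chance_of_problem(0.99, 20 * 7)   # roughly 0.76

print(f"chance of a problem in a day:  {per_day:.0%}")
print(f"chance of a problem in a week: {per_week:.0%}")
```

So a sync that works 99 times out of 100 still gives you about a three in four chance of trouble in any given week.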

And when you do hit a problem, it either means that not all of your data is being synced, or even worse, that some of your data is deleted. Aaaaaaaargh. You are now a very unhappy user. I used to *hate* getting MySync support requests that began “Your app has deleted all of my data …”.

To make this problem even worse, Apple has also taken on synchronisation of email as well as Contact and Calendar data.

Email synchronisation was a frequent request from MySync users, but a challenge that we chose not to take on, because it is so damn hard.

Is this really Apple’s fault? In the case of MobileMe, and .Mac before it, all of the code is essentially Apple’s, but I think this just goes to show how hard it is to get a sync client and the core sync services code right. When you add in all of the third-party Sync client code, and mail synchronization, it just seems inevitable that many users will hit a problem at some point, and become very very unhappy.

Backup and recovery options can lessen those users’ pain, but at the end of the day, what you’re looking at is a very un-Apple-like experience.

Blackberry seems to have done a “better” job of this, but I get the impression (not having looked that hard) that Blackberry syncing occurs between a much smaller range of devices.

Apple’s job is made even harder, in that testing of the .Mac/MobileMe infrastructure is in practice probably pretty limited. And once you go live, you’re messing with millions of people’s real data.

Personally, if it was me, I would have let .Mac die a quiet death. The problem set for ubiquitous syncing is just very very hard, and the consequences of failure, in terms of user dissatisfaction, are too high. I suspect that, in time, MobileMe will go the same way as the Newton …
