Friday, April 09, 2010

Infrastructure

If there's anything Google loves to build, it's infrastructure. Google had entire buildings' worth of machines, and lots of ways to make use of all of them. There's MapReduce, Bigtable, Blobstore, and all sorts of other distributed infrastructure. So much so that engineers frequently told me that developing without all that infrastructure would be crippling, and would slow them down too much.

The irony, of course, is that companies like FriendFeed give the lie to that statement. FriendFeed was launched in weeks! If you're not Google, you don't have to scale to billions of users right away. Existing tools can be made to be extremely scalable. For instance, even MySQL can be made to scale. The truth is, launching products at a big company takes much longer than at a startup for reasons that have nothing to do with coding. In fact, many of the engineers who made the statement above would find creative ways around missing infrastructure if they were at a startup: context is everything.

I remember attending a talk by YouTube engineers after the acquisition (this was at OSCON, so I know it's unclassified information). What impressed me was how close they always seemed to be to falling over completely. Yet they never did. Then it occurred to me that a startup should always be running at the ragged edge of what its systems can handle: to do otherwise would mean that you're not using all your resources efficiently. By contrast, Google can afford a few under-utilized machines. In addition, all that generic infrastructure has overhead. Generic cluster management software, for instance, doesn't (and can't) know enough about the overall job structure of your tasks to put compute-intensive tasks on the same machine as network-bound tasks. But a startup with a customized software stack can do that (and frequently must do so) because it doesn't have enough machines to do otherwise.
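To make the co-location point concrete, here is a minimal sketch of what "knowing your job structure" might buy you. It is not how YouTube or Google actually scheduled anything. The `Task`, `Machine`, and `pack` names are made up for illustration, and the model assumes each task's CPU and network needs are known up front as a fraction of one machine.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    name: str
    cpu: float  # assumed share of one machine's CPU, between 0 and 1
    net: float  # assumed share of one machine's network bandwidth, between 0 and 1


@dataclass
class Machine:
    tasks: list = field(default_factory=list)
    cpu_used: float = 0.0
    net_used: float = 0.0

    def fits(self, task: Task) -> bool:
        # A task fits only if neither resource would be oversubscribed.
        return (self.cpu_used + task.cpu <= 1.0
                and self.net_used + task.net <= 1.0)

    def add(self, task: Task) -> None:
        self.tasks.append(task)
        self.cpu_used += task.cpu
        self.net_used += task.net


def pack(tasks):
    """First-fit packing that checks both CPU and network per machine.

    Tasks are placed largest-first (by their dominant resource), each on the
    first machine with room for it. Because both resources are tracked, a
    CPU-heavy task and a network-heavy task can share one box.
    """
    machines = []
    ordered = sorted(tasks, key=lambda t: max(t.cpu, t.net), reverse=True)
    for task in ordered:
        for machine in machines:
            if machine.fits(task):
                machine.add(task)
                break
        else:
            # No existing machine has room, so bring up another one.
            machine = Machine()
            machine.add(task)
            machines.append(machine)
    return machines
```

With two CPU-heavy tasks (say 0.75 CPU, 0.125 network each) and two network-heavy tasks (0.125 CPU, 0.75 network each), this packs everything onto two machines by pairing one of each kind, where a scheduler that gave every task its own machine would need four.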

In short, I think startups have to be very careful about building generic infrastructure just because that's the way Google did things. Google built generic infrastructure because its big problem (search) had to have massive scalability right away. Even with a single user, a search engine still has to search as much of the web as possible. But what applied to Google doesn't apply to all startups. Build only the tools you need as the need arises.

4 comments:

Anonymous said...

I think this is part of a larger problem that big companies face vs. small ones, which is technological momentum.

Once a company starts using technology and has invested a lot in it, they have to start using it for everything.

At my company, we use a particular version control system I'm not a big fan of. Nearly all of the engineers would prefer not to use it, but a lot of effort has gone into streamlining our use of it, mainly fixing problems that don't exist in other systems. Therefore the justification for using the technology becomes not the reward we are getting from using it, which is minimal, but the amount of energy we expended in making it work in the first place.

This kind of momentum-based thinking tends to favor complicated, heavyweight technologies that cost a lot to put in place.

I've never worked at a startup, but I've been tempted to join one just for the sake of not having to deal with the burden of prior technology decisions, infrastructure, legacy code, etc. Maybe there are other problems, and the environment isn't as ideal as I'm imagining.

ovidiu said...

Google was a startup too. We built our infrastructure as we went along, out of pure necessity. Even now, hardly a startup, the systems are still evolving as new needs appear.

What I loved about the early days was that you could go to a relatively small number of people and tell them what problem you had to solve. They'd come up with creative solutions involving the then-current best-of-breed technologies. They told you what the limitations of the existing systems were, and how you could improve them.

What was missing could be built by you. But you built on top of other people's work, which was extremely productive.

You can do that now as well, but the sheer number of options you have is bewildering! The end result, however, could easily scale to millions of users, if the code and the deployment are done right.

Piaw Na said...

Oh, I agree. Systems evolve as new needs appear, but the evolution is at this point much slower than what you would get at a startup. For instance, for quite a few years, PicasaWeb storage prices were uncompetitive. A startup could not have survived by waiting that long. Even now, I'm not convinced that PicasaWeb is profitable, whereas SmugMug is.

The difference is entirely on the basis of infrastructure. I don't believe SmugMug could have been profitable on Google's infrastructure, no matter what. As a startup, they were not tied to legacy infrastructure built for other apps, but PicasaWeb was. I believe that the legacy of having to live on other people's infrastructure is what made PicasaWeb ultimately a lackluster photo site.

R.E. Xavier said...

I've never worked for a start-up, but I have worked for companies that built infrastructure. So my perspective may be somewhat relevant to your blog.

My take is that it is inevitable that Google's newest "experiment" is in broadband networks. No matter how many "content" services it develops or acquires, Google still needs distribution, currently over somebody else's network. If it builds its own fast, open-source network infrastructure, Google removes that dependence and becomes the master of its own destiny.

If you're interested in reading more, I wrote an article about this recently at www.smallbizdr.net.

Eric Xavier