Google's Search Engine Gets A Caffeine Boost
Google has released its Caffeine indexing technology. Caffeine, the most significant change Google has made since 2006 to the basic technology that crawls the Internet and ranks Web pages, had been in testing for almost a year. According to Google, "Caffeine provides 50 percent fresher results for web searches than our last index, and it's the largest collection of web content we've offered."
In a blog post announcing the completion of Caffeine, Google software engineer Carrie Grimes explained:
Our old index had several layers, some of which were refreshed at a faster rate than others; the main layer would update every couple of weeks. To refresh a layer of the old index, we would analyze the entire web, which meant there was a significant delay between when we found a page and made it available to you.
With Caffeine, we analyze the web in small portions and update our search index on a continuous basis, globally. As we find new pages, or new information on existing pages, we can add these straight to the index. That means you can find fresher information than ever before—no matter when or where it was published.
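The contrast between the two approaches can be illustrated with a toy inverted index. This is only a sketch of the general idea, not Google's implementation: the function names `build_index` and `update_index`, the whitespace tokenization, and the in-memory dictionary are all assumptions made for illustration.

```python
from collections import defaultdict


def build_index(pages):
    """Batch style: rebuild the whole word -> URLs index from every page."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def update_index(index, url, old_text, new_text):
    """Incremental style: touch only the entries affected by one page.

    Pass an empty old_text for a page that has not been indexed before.
    """
    old_words = set(old_text.lower().split())
    new_words = set(new_text.lower().split())
    # Drop the page from words it no longer contains.
    for word in old_words - new_words:
        index[word].discard(url)
        if not index[word]:
            del index[word]
    # Add the page under words that are new to it.
    for word in new_words - old_words:
        index[word].add(url)


# Hypothetical sample data.
pages = {"a": "fresh news today", "b": "old news"}
index = build_index(pages)

# One changed page and one new page are applied without a full rebuild.
update_index(index, "b", "old news", "fresh update")
update_index(index, "c", "", "new page")
```

In this sketch the cost of an update depends on the size of the changed page rather than on the size of the whole collection, which is the property the quoted passage attributes to continuous indexing.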
Grimes pointed out that every second Caffeine processes hundreds of thousands of pages in parallel. To put this in perspective, Google says, "Caffeine takes up nearly 100 million gigabytes of storage in one database and adds new information at a rate of hundreds of thousands of gigabytes per day. You would need 625,000 of the largest iPods to store that much information; if these were stacked end-to-end they would go for more than 40 miles."
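The iPod comparison can be checked with some quick arithmetic. In the sketch below, the index size and the iPod count come from the quote; the 4.1-inch device height is an assumption (roughly the height of a 160 GB iPod classic) and is not stated in the source.

```python
INDEX_SIZE_GB = 100_000_000   # "nearly 100 million gigabytes" (from the quote)
IPOD_COUNT = 625_000          # from the quote
IPOD_HEIGHT_INCHES = 4.1      # assumption: approximate iPod classic height
INCHES_PER_MILE = 63_360      # 5,280 feet x 12 inches


def capacity_per_ipod_gb(total_gb, count):
    """Storage each device must hold to cover the whole index."""
    return total_gb / count


def stack_length_miles(count, height_inches):
    """Length of the devices laid end-to-end, in miles."""
    return count * height_inches / INCHES_PER_MILE


gb_each = capacity_per_ipod_gb(INDEX_SIZE_GB, IPOD_COUNT)
miles = stack_length_miles(IPOD_COUNT, IPOD_HEIGHT_INCHES)
print(f"{gb_each:.0f} GB per iPod, about {miles:.1f} miles end-to-end")
```

Under these assumptions each iPod would need to hold 160 GB, and the stack works out to a little over 40 miles, consistent with the figures Google quotes.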