Monday, May 19, 2014

400 billion... The Wayback Machine now has more pages than there are stars in our galaxy (and here's how they manage that)

High Scalability - A Short On How the Wayback Machine Stores More Pages than Stars in the Milky Way

"How does the Wayback Machine work? Now with over 400 billion webpages indexed, allowing the Internet to be browsed all the way back to 1996, it's an even more compelling question. I've looked several times but I've never found a really good answer.

Here's some information from a thread on Hacker News. It starts with mmagin, a former Archive employee:

...

..."

How awesome is that? If you're interested in the story behind the storage, indexing, and other systems used by the Wayback Machine, read this...
