2008
Google: a fascinating and well-kept mystery
by holyver (via) The infrastructure used by search giant Google is a mystery that many would like to unravel, whether they are competitors or users amazed by how responsive its services remain despite a record number of users.
Google Architecture | High Scalability
by holyver & 1 other (via) Google is the King of scalability. Everyone knows Google for their large, sophisticated, and fast searching, but they don't just shine in search. Their platform approach to building scalable applications allows them to roll out internet-scale applications at an alarmingly high, competition-crushing rate. Their goal is always to build a higher-performing, higher-scaling infrastructure to support their products. How do they do that?
Vdoop - Manage your virtual cluster - Vdoop
by camel Every day, search companies like Google download terabytes of data from the Internet, store it on clusters of thousands of machines, and process it so that it can be easily searched. To make this possible, these companies need sophisticated distributed file systems and parallel programming architectures.
Have you ever heard of the Map/Reduce distributed parallel programming paradigm? If you are a computer scientist, you should have, because every time you submit a Google search, you are using Map/Reduce. Despite growing demand from companies like Google, Yahoo, and Microsoft, few computer science majors have even heard of Map/Reduce, let alone graduate well versed in its use. Unfortunately, several barriers exist to integrating Map/Reduce into computer science curricula. Obtaining a large cluster, configuring it, and installing complicated distributed file system and parallel programming software is difficult, time-consuming, and expensive.
In the past, Google's solution to this problem has been to ship entire clusters pre-configured with Map/Reduce software to select universities. In essence, Vdoop does the same thing, with exactly the same software, except that our clusters are virtual, and hence free.
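For readers who have not met the Map/Reduce paradigm mentioned in this entry, here is a minimal single-process sketch of the idea using word counting. It is only an illustration: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for this example and are not the API of Vdoop or of any Map/Reduce framework, and a real cluster would run the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would do between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values seen for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce sequentially over a list of documents."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["the quick fox", "The lazy dog the end"]))
```

Because each call to `map_phase` depends only on its own document, and each call to `reduce_phase` only on its own key, both steps can be spread over many machines, which is what makes the paradigm attractive for large clusters.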
scalr - Google Code
by camel & 3 others Scalr is a fully redundant, self-curing and self-scaling hosting environment utilizing Amazon's EC2.
It allows you to create server farms through a web-based interface using prebuilt AMIs for load balancers (pound or nginx), app servers (apache, others), databases (mysql master-slave, others), and a generic AMI to build on top of.
The health of the farm is continuously monitored and maintained. When the Load Average on a type of node goes above a configurable threshold, a new node is inserted into the farm to spread the load and the cluster is reconfigured. When a node crashes, a new machine of that type is inserted into the farm to replace it.
Four AMIs are provided for load balancers, mysql databases, application servers, and a generic base image to customize. Scalr allows you to further customize each image, bundle the image and use that for future nodes that are inserted into the farm. You can make changes to one machine and use that for a specific type of node. New machines of this type will be brought online to meet current levels and the old machines are terminated one by one.
The project is still very young, but we're hoping that by open sourcing it the AWS development community can turn this into a robust hosting platform and give users an alternative to the current fee-based services available.
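The two rules this entry describes (replace crashed nodes, and add a node when the load average for a node type passes a threshold) can be sketched roughly as below. This is not Scalr's code or API: the `reconcile` function, the dictionary layout for a node, and the choice to average the load across all nodes of one type are assumptions made purely for illustration.

```python
def reconcile(nodes, load_threshold):
    """Return the node list for one node type after one monitoring pass.

    Each node is a dict such as {"alive": True, "load": 0.7}
    (a hypothetical layout, not Scalr's data model).
    """
    # Rule 1: every crashed node is replaced by a fresh machine of that type.
    healthy = [node for node in nodes if node["alive"]]
    crashed = len(nodes) - len(healthy)
    farm = healthy + [{"alive": True, "load": 0.0} for _ in range(crashed)]

    # Rule 2: if the average load exceeds the threshold, insert one more node.
    if farm:
        average_load = sum(node["load"] for node in farm) / len(farm)
        if average_load > load_threshold:
            farm.append({"alive": True, "load": 0.0})
    return farm


if __name__ == "__main__":
    farm = [{"alive": True, "load": 4.0}, {"alive": False, "load": 0.0}]
    print(len(reconcile(farm, load_threshold=1.5)))
```

Adding only one node per pass is a deliberately conservative choice in this sketch; it lets the load settle before the next check rather than over-provisioning in a single step.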
2007
ganeti - Google Code
by camel & 1 other Ganeti is a virtual server management software tool built on top of the Xen virtual machine monitor and other Open Source software.
However, Ganeti requires pre-installed virtualization software on your servers in order to function. Once installed, the tool will take over the management part of the virtual instances (Xen DomU), e.g. disk creation management, operating system installation for these instances (in co-operation with OS-specific install scripts), and startup, shutdown, and failover between physical systems. It has been designed to facilitate cluster management of virtual servers and to provide fast and simple recovery after physical failures using commodity hardware.
2006
Peeking Into Google
by dcancel & 3 others The key to the speed and reliability of Google search is cutting up data into chunks, its top engineer said. Urs Hoelzle, Google vice president of operations and vice president of engineering, offered a rare behind-the-scenes tour of Google's architecture.