Social networking is the art of connecting with others who share common
interests. This “network” is a community that holds us together and offers many
other benefits. Social networking has revolutionized the way we use the
Internet and is at the forefront of what we now know as Web 2.0.
Facebook is the social network. People have been “facebooking” one another for
about seven years, making Facebook the most used social network, with over 800
million users worldwide. But how does Facebook work?
In this article I’ll discuss the inner workings of Facebook, including its
infrastructure architecture, its front end and back end, and the nuts and bolts
that hold Facebook together.
How does Facebook work – The Front End
Facebook uses a variety of services, tools and programming languages to make up
its basic infrastructure. On the front end, its servers run a LAMP (Linux,
Apache, MySQL and PHP) stack with Memcache. Not a computer science expert? Let’s
take a look at what this means.
Linux & Apache
This part is pretty self-explanatory. Linux is a Unix-like operating system
kernel. It is open source, highly customizable, and good for security. Facebook
runs the Apache HTTP Server on the Linux operating system. Apache is also free
and is the most popular open source web server in use.
MySQL
For the database, Facebook uses MySQL for its speed and reliability. MySQL is
used primarily as a key-value store, with the data randomly distributed among a
large number of logical instances. These logical instances are spread across
physical nodes, and load balancing is done at the physical node level.
As far as customization is concerned, Facebook has developed a custom
partitioning scheme in which a global ID is assigned to all data. It also has a
custom archiving scheme that is based on how frequent and how recent the data
is on a per-user basis. Most of the data is randomly distributed.
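To make the idea of logical instances spread across physical nodes more concrete, here is a minimal sketch. It is not Facebook’s actual scheme: the shard count, the host names and the hash choice are all assumptions of mine, used only to show how a global ID could be resolved to a logical shard and then to a physical node.

```python
import hashlib

# Assumed values for illustration; none of these come from Facebook.
NUM_LOGICAL_SHARDS = 4096
PHYSICAL_NODES = ["db-node-a", "db-node-b", "db-node-c"]  # hypothetical hosts

# Lookup table from logical shard to the physical node that hosts it.
# Moving a shard to another machine only means changing one entry here.
SHARD_TO_NODE = {
    shard: PHYSICAL_NODES[shard % len(PHYSICAL_NODES)]
    for shard in range(NUM_LOGICAL_SHARDS)
}


def logical_shard(global_id):
    """Map a global ID to a logical shard using a stable hash."""
    digest = hashlib.md5(str(global_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_LOGICAL_SHARDS


def locate(global_id):
    """Return (logical shard, physical node) for a piece of data."""
    shard = logical_shard(global_id)
    return shard, SHARD_TO_NODE[shard]
```

Because the hash spreads IDs evenly, the data ends up effectively randomly distributed across the logical shards, while the lookup table keeps the placement on physical machines adjustable.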
PHP
Facebook uses PHP because it is a good web programming language with extensive
support and an active developer community, and it is good for rapid iteration.
PHP is a dynamically typed, interpreted language.
Memcache
Memcache is a caching system used to speed up dynamic, database-driven websites
(like Facebook) by caching data and objects in RAM to reduce read time. Memcache
is Facebook’s main form of caching and helps relieve the load on the database.
Having a caching system is what allows Facebook to recall your information as
fast as it does. If it does not have to go to the database, it simply fetches
your data from the cache based on your user name.
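The read path described above is often called the cache-aside pattern. Here is a small sketch of it in which plain dictionaries stand in for Memcache and MySQL; the class name, the method names and the 60-second expiry are my own assumptions, not anything taken from Facebook’s code.

```python
import time


class CachedProfileStore:
    """Toy cache-aside reader: check the cache first, then the database."""

    def __init__(self, database, ttl_seconds=60.0):
        self.database = database      # stand-in for MySQL (a plain dict)
        self.ttl_seconds = ttl_seconds
        self.cache = {}               # stand-in for Memcache: key -> (value, expiry)
        self.db_reads = 0             # how many times we had to hit the database

    def _read_database(self, user_name):
        self.db_reads += 1
        return self.database.get(user_name)

    def get_profile(self, user_name):
        entry = self.cache.get(user_name)
        if entry is not None:
            value, expires_at = entry
            if expires_at > time.monotonic():
                return value          # cache hit: no database access needed
        # Cache miss or expired entry: read the database and remember the result.
        value = self._read_database(user_name)
        if value is not None:
            self.cache[user_name] = (value, time.monotonic() + self.ttl_seconds)
        return value
```

Asking for the same user twice only touches the database once; the second request is answered straight from memory, which is the whole point of putting a cache in front of the database.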
Disadvantages of using LAMP
Facebook has realized that there are disadvantages to using the LAMP stack.
Notably, PHP is not necessarily optimized for large websites and is therefore
difficult to scale. Furthermore, it is not the fastest-executing language, and
its extension framework is difficult to use.
Facebook’s Vice President of Engineering, Schroepfer, discussed this in an
interview at EmTech@MIT. “Scaling any website is a challenge,” Schroepfer said,
“but scaling a social network has unique challenges.”
He continued by saying that, unlike with other websites, you cannot simply add
more servers to solve the problem, because Facebook has a “huge interconnected
dataset.” New connections are created all the time through user activity.
Facebook has grown so rapidly that it often faces problems relating to database
queries, caching and data storage. Its database is huge and complex in many
ways. To account for this, Facebook has started a lot of open source projects
and back-end services.
How does Facebook Work – The Back End
Facebook’s back-end services are written in a variety of programming languages,
such as C++, Java, Python, and Erlang. Its philosophy for building services is
as follows:
1. Create a service if necessary
2. Create a framework / toolkit to facilitate the creation of services
3. Use the programming language suitable for the task
A list of all of Facebook’s open source developments can be found here. I will
discuss some of the essential tools that Facebook has developed.
Thrift (protocol)
Thrift is a lightweight remote procedure call framework for scalable
cross-language services development. Thrift supports C++, PHP, Python, Perl,
Java, Ruby, Erlang, and others. It’s fast, saves development time, and provides
a division of labor between work on high-performance servers and work on
applications.
Scribe (server logs)
Scribe is a server for aggregating log data streamed in real time from many
other servers. It is a scalable framework useful for logging a wide range of
data. It is built on top of Thrift.
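To give a feel for what aggregating logs from many servers means, here is a toy sketch of a central collector. It is not Scribe’s API: the class and method names are hypothetical, and a real deployment would receive entries over the network (Scribe uses Thrift for that) rather than through direct method calls.

```python
from collections import defaultdict


class LogAggregator:
    """Toy central collector that groups incoming log lines by category."""

    def __init__(self):
        self.by_category = defaultdict(list)

    def log(self, host, category, message):
        """Accept one log entry from a (hypothetical) front-end server."""
        self.by_category[category].append(f"{host}: {message}")

    def flush(self, category):
        """Return and clear everything buffered for one category."""
        return self.by_category.pop(category, [])


aggregator = LogAggregator()
aggregator.log("web1", "access", "GET /home.php")
aggregator.log("web2", "access", "GET /profile.php")
aggregator.log("web1", "error", "database timeout")
```

Grouping by category lets very different kinds of data (access logs, errors, click events) flow through the same pipeline and still end up in separate places.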
Cassandra (database)
Cassandra is a database management system designed to handle large amounts of
data spread out across many servers. It powers Facebook’s Inbox Search feature
and provides a structured key-value store with eventual consistency.
HipHop for PHP
HipHop for PHP is a source code transformer for PHP script code and was created
to save server resources. HipHop transforms PHP source code into optimized C++.
It then uses g++ to compile it to machine code.
Why is Facebook’s performance so good?
Facebook’s 800 million users are spread across America, Europe and Asia. That
means more than one million people are viewing photos, chatting with friends or
updating their status at any given time. How do they do it?
Facebook’s main technologies are PHP and MySQL, which do not have a reputation
for scaling well. To my knowledge, people tend to use a compiled language (like
Java or .NET) for large business applications. Those languages encourage good
practices such as regular refactoring and sound architecture, while PHP does
not. In addition, a scripting language cannot run faster than a compiled one.
Critical Architecture Maps of Facebook Technology
There is no single reason, but rather a lot of reasons:
Extensive use of caching (memcached, APC), which drastically reduces processing
time. Slide 12 compares the load time with APC (~130 ms) versus without it
(4050 ms). That’s about 30 times faster!
The use of HipHop, which converts PHP code into C++ (which is then compiled into
machine code that is much more efficient than standard PHP).
Facebook uses PHP and MySQL, but they are not the only technologies involved.
For example, it uses Erlang for chat and Hadoop clusters for some storage. If
you visit its careers page, you’ll see it is hiring developers with experience
in C++, Java, Python and others.
Facebook has its data distributed across many, many servers. In June 2010,
there were 60,000 Facebook servers. (I don’t think that is too many. Google had
half a million … 5 years ago.)
Facebook sends as little traffic as possible. It uses a CDN to deliver static
content and gzip to compress the data. Cookies, JavaScript, HTML: all of it is
trimmed to reduce the number of bytes sent over the network. It also uses a
technology called “BigPipe”, which sends the page in partial pieces rather than
all at once.
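The idea of sending partial content can be sketched with a generator that emits the page in pieces. This is only an illustration of the general approach, not Facebook’s BigPipe implementation: the function name, the “pagelet” tuples and the client-side `fill()` function are assumptions of mine.

```python
import json


def render_pagelets(pagelets):
    """Yield a page skeleton first, then each pagelet as it becomes ready.

    `pagelets` is a list of (name, build) pairs, where `build` is a
    function returning the content for that section of the page.
    """
    # Send the layout right away so the browser can start rendering.
    yield "<html><body>" + "".join(
        f'<div id="{name}"></div>' for name, _ in pagelets
    )
    for name, build in pagelets:
        # Each section is generated and sent as its own chunk.
        # fill() is a hypothetical client-side function that inserts
        # the content into the matching placeholder <div>.
        yield f"<script>fill({json.dumps(name)}, {json.dumps(build())});</script>"
    yield "</body></html>"


chunks = list(render_pagelets([
    ("feed", lambda: "latest stories"),
    ("chat", lambda: "3 friends online"),
]))
```

In a real web server each yielded chunk would be flushed to the browser immediately, so a slow section of the page no longer delays everything else.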
One of Facebook’s key values is to move fast. Over the past six years Facebook
has been able to accomplish a great deal thanks to the rapid pace of
development that PHP offers. As a programming language, PHP is simple: easy to
learn, easy to write, easy to read, and easy to debug. Facebook is able to get
new engineers up to speed faster with PHP than with other languages, which
allows it to innovate faster.
HipHop for PHP is not technically a compiler itself. Rather, it is a source
code transformer. HipHop programmatically transforms PHP source code into
highly optimized C++ and then uses g++ to compile it. HipHop executes the source
code in a semantically equivalent manner and sacrifices some rarely used
features, such as eval(), in exchange for improved performance. HipHop includes
a code transformer, a reimplementation of PHP’s runtime system, and a rewrite
of many common PHP extensions to take advantage of these performance
optimizations.
Scaling PHP as a scripting language
PHP’s roots are those of a scripting language, like Perl, Python and Ruby, all
of which have important benefits in terms of programmer productivity and the
ability to iterate quickly on products. This contrasts with more traditional
compiled languages such as C++ and Java. On the other hand, scripting languages
are known to be generally less efficient when it comes to CPU and memory usage.
Because of this, it has been a challenge to scale Facebook to over 400 million
PHP-based page views each month.
A common way to address these inefficiencies is to rewrite the most complex
parts of your PHP application directly in C++ as PHP extensions. This largely
turns PHP into a glue language between the front-end HTML and the application
logic in C++. From a technical standpoint this works well, but it dramatically
reduces the number of engineers who are able to work across the whole
application. Learning C++ is just the first step in writing PHP extensions; the
second is understanding the Zend API. Given that Facebook’s engineering team is
relatively small (there are over one million users per engineer), Facebook
cannot afford to make parts of its code base less accessible than others.
Scaling Facebook is particularly difficult because almost every page view comes
from a logged-in user with a personalized experience. When you view your home
page, Facebook has to look up all your friends, query their most relevant
updates (from a custom service it built called multi-feed), filter the results
based on your privacy settings, then fill out the stories with comments, photos,
likes, and all the rich data people love about Facebook. All this in less than
a second. HipHop allows Facebook to write the logic that does the final page
assembly in PHP and iterate quickly, while relying on custom back-end services
in C++, Erlang, Java or Python to serve the news feed, search, Chat, and other
core parts of the site.
Since 2007 Facebook has thought of several different ways to solve these
problems and has even tried implementing some of them. The common suggestion is
simply to rewrite Facebook in another language, but given the complexity and
speed of development of the site, that would take some time to accomplish.
Facebook has rewritten aspects of the Zend Engine (PHP’s internals) and
contributed those patches back to the PHP project, but ultimately it has not
seen the kind of performance gains that are needed. HipHop’s benefits are
nearly transparent to Facebook’s speed of development.