Sunday 14 July 2013

Technology Used by Facebook – Research Report by Jay Thadeshwar

Social networking is the art of connecting with others who share common interests. A social network is a community that helps us stay in touch and offers many other benefits. Social networking has revolutionized the way the Internet is used and is at the forefront of what we now know as Web 2.0.
Facebook is the best-known social network. People have been “facebooking” one another for about 7 years, and it is the most used social network, with over 800 million users worldwide. But how does Facebook work?

In this article I’ll discuss the inner workings of Facebook: its infrastructure architecture, its front end and back end, and the nuts and bolts that hold Facebook together.

How does Facebook Work – The Front End

Facebook uses a variety of services, tools and programming languages to make up its basic infrastructure. On the front end, its servers run a LAMP (Linux, Apache, MySQL and PHP) stack with Memcache. Not a computer science expert? Let’s take a look at what this means.

Linux & Apache

This part is pretty self-explanatory. Linux is a Unix-like operating system kernel. It is open source, highly customizable, and good for security. Facebook runs the Apache HTTP Server on the Linux operating system. Apache is also free and is the most popular open source web server in use.

MySQL

For the database, Facebook uses MySQL for its speed and reliability. MySQL is used primarily as a key-value store, with the data randomly distributed among a large number of logical instances. These logical instances are spread across physical nodes, and load balancing is done at the physical node level.
As far as customization is concerned, Facebook has developed a custom partitioning scheme in which a global ID is assigned to all data. It also has a custom archiving scheme that is based on how frequent and how recent the data is on a per-user basis. Most of the data is randomly distributed.
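
As a rough illustration of the partitioning idea described above, the sketch below maps a global ID to one of many logical shards and then looks up the physical node that hosts that shard. The shard count, the node names and the hash-based placement are all assumptions made for this example; nothing here is taken from Facebook’s actual scheme.

```python
import hashlib
from typing import Dict, Tuple

# Hypothetical sizes and host names, chosen only for this illustration.
NUM_LOGICAL_SHARDS = 4096
PHYSICAL_NODES = ["db-node-a", "db-node-b", "db-node-c"]

# Table recording which physical node currently hosts each logical shard.
SHARD_TO_NODE: Dict[int, str] = {
    shard: PHYSICAL_NODES[shard % len(PHYSICAL_NODES)]
    for shard in range(NUM_LOGICAL_SHARDS)
}


def logical_shard(global_id: int) -> int:
    """Spread IDs pseudo-randomly across the logical shards."""
    digest = hashlib.md5(str(global_id).encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_LOGICAL_SHARDS


def locate(global_id: int) -> Tuple[int, str]:
    """Return (logical shard, physical node) for a piece of data."""
    shard = logical_shard(global_id)
    return shard, SHARD_TO_NODE[shard]


if __name__ == "__main__":
    for gid in (1, 2, 1001):
        print(gid, locate(gid))
```

In a layout like this, an overloaded node could be relieved by pointing some entries of `SHARD_TO_NODE` at a different node, without changing how IDs map to logical shards.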

PHP

Facebook uses PHP because it is a good web programming language with extensive support and an active developer community, and it is good for rapid iteration. PHP is a dynamically typed, interpreted language.

Memcache

What is Memcache? Memcache is a caching system used to speed up dynamic, database-driven websites (like Facebook) by caching data and objects in RAM to reduce read time. Memcache is Facebook’s primary form of caching and helps relieve the load on the database.

Having a caching system is what allows Facebook to recall your information as fast as it does. If it does not have to go to the database, it simply fetches your data from the cache based on your user name.

Disadvantages of using LAMP

Facebook has recognized that there are disadvantages to using the LAMP stack. Notably, PHP is not necessarily optimized for large websites and is therefore difficult to scale. Furthermore, it is not the fastest-executing language, and its extension framework is difficult to use.
Facebook’s Vice President of Engineering, Mike Schroepfer, spoke about this in an interview at EmTech@MIT. “Scaling any website is a challenge,” Schroepfer said, “but scaling a social network has unique challenges.”
He went on to say that, unlike with other websites, you cannot simply add more servers to solve the problem, because of Facebook’s “huge interconnected dataset.” New connections are created all the time as a result of user activity.
Facebook has grown so rapidly that it often faces problems with database queries, caching and data storage. Its database is huge and complex in many ways. To cope with this, Facebook has started a lot of open source projects and back-end services.

How does Facebook Work – The Back End

Facebook’s back-end services are written in a variety of programming languages, including C++, Java, Python, and Erlang. Its philosophy for building services is as follows:
1. Create a service if necessary
2. Create a framework / toolkit to facilitate the creation of services
3. Use the programming language suitable for the task
A list of all of Facebook’s open source projects can be found here. I will discuss some of the essential tools that Facebook has developed.

Thrift (protocol)

Thrift is a lightweight remote procedure call framework for scalable cross-language services development. Thrift supports C++, PHP, Python, Perl, Java, Ruby, Erlang, and others. It is fast, saves development time, and provides a division of labor between work on high-performance servers and work on applications.

Scribe (server logs)

Scribe is a server for aggregating log data streamed in real time from many other servers. It is a scalable framework useful for logging a wide range of data. It is built on top of Thrift.

Cassandra (database)

Cassandra is a database management system designed to handle large amounts of data spread out across many servers. It powers Facebook’s Inbox Search feature and provides a structured key-value store with eventual consistency.
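
To give a feel for what “eventual consistency” means, here is a small sketch in which two replicas accept writes independently and only agree after they have exchanged data. It is a generic illustration using a simplified last-write-wins rule; the function names and the timestamp scheme are assumptions, and it does not reproduce Cassandra’s actual replication protocol.

```python
from typing import Dict, Optional, Tuple

# Each replica stores key -> (timestamp, value).
Replica = Dict[str, Tuple[int, str]]


def write(replica: Replica, key: str, value: str, timestamp: int) -> None:
    """Accept a write locally, keeping only the newest version of a key."""
    current = replica.get(key)
    if current is None or timestamp > current[0]:
        replica[key] = (timestamp, value)


def read(replica: Replica, key: str) -> Optional[str]:
    """Read whatever this replica currently knows, which may be stale."""
    entry = replica.get(key)
    return entry[1] if entry else None


def synchronize(a: Replica, b: Replica) -> None:
    """Exchange entries so both replicas end up with the newest versions."""
    for key in set(a) | set(b):
        for source, target in ((a, b), (b, a)):
            if key in source:
                timestamp, value = source[key]
                write(target, key, value, timestamp)


if __name__ == "__main__":
    replica_1: Replica = {}
    replica_2: Replica = {}
    write(replica_1, "inbox:42", "hello", timestamp=1)
    print(read(replica_2, "inbox:42"))   # stale read before replication
    synchronize(replica_1, replica_2)
    print(read(replica_2, "inbox:42"))   # replicas have converged
```

The trade-off is that a read may briefly return old data, in exchange for every replica being able to accept writes without waiting for the others.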

HipHop for PHP

HipHop for PHP is a source code transformer for PHP script code and was created to save server resources. HipHop transforms PHP source code into optimized C++. After doing this, it uses g++ to compile it to machine code.

Why is Facebook’s performance so good?

Facebook has 800 million users, spread across America, Europe and Asia. That means more than one million people may be viewing photos, chatting with friends or updating their status at any one time. How do they do it?
Facebook’s main technologies are PHP and MySQL, which do not have a reputation for scaling well. To my knowledge, people tend to use a compiled language (such as Java or .NET) for large enterprise implementations. Those languages encourage good practice, the habit of refactoring and good architecture, while PHP does not. In addition, a scripting language cannot run faster than a compiled one.

Critical Architecture Maps of Facebook Technology

There is no single reason, but rather many:
Extensive use of caching (memcached, APC), which drastically reduces processing time. Slide 12 compares the load time with APC (~130 ms) versus without it (4050 ms). That’s about 30 times faster!
The use of HipHop, which converts PHP code into C++ (which is then compiled into machine code that is much more efficient than standard PHP).
Facebook uses PHP and MySQL, but they are not the only technologies. For example, it uses Erlang for chat and Hadoop clusters for some storage. If you visit its careers page, you’ll see it is hiring developers with experience in C++, Java, Python and others.
Facebook has its data distributed across many, many servers. In June 2010, Facebook had 60,000 servers. (I do not think that is too many. Google had half a million … 5 years ago.)
Facebook sends as little traffic as possible. It uses a CDN to deliver static content and gzip to compress the data. Cookies, JavaScript, HTML: all of it is trimmed to reduce the number of bytes sent over the network. It also uses a technology called “BigPipe”, which sends partial content rather than the entire page.
One of the key values of Facebook is to move fast. Over the past six years Facebook has been able to accomplish a great deal thanks to the rapid pace of development that PHP offers. As a programming language, PHP is simple: easy to learn, easy to write, easy to read, and easy to debug. Facebook is able to get new engineers up to speed faster with PHP than with other languages, which allows it to innovate faster.
HipHop for PHP is not technically a compiler itself. Rather, it is a source code transformer. HipHop programmatically transforms PHP source code into highly optimized C++ and then uses g++ to compile it. HipHop executes the source code in a semantically equivalent manner and sacrifices some rarely used features, such as eval(), in exchange for improved performance. HipHop includes a code transformer, a reimplementation of PHP’s runtime system, and a rewrite of many common PHP extensions to take advantage of these performance optimizations.
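
The “BigPipe” idea mentioned in the list above, sending a page in pieces rather than all at once, can be sketched with a generator that yields the page skeleton first and then each section as soon as it has been rendered. This is a simplified illustration only: the section names, the `render_page` function and the client-side `fill` function it refers to are all hypothetical.

```python
import json
from typing import Callable, Dict, Iterator


def render_page(sections: Dict[str, Callable[[], str]]) -> Iterator[str]:
    """Yield the page in chunks instead of building one big string."""
    # Send the skeleton immediately, with an empty placeholder per section.
    placeholders = "".join(f'<div id="{name}"></div>' for name in sections)
    yield f"<html><body>{placeholders}"
    # Then send each section as soon as its content is ready. `fill` is a
    # hypothetical browser-side function that inserts the HTML into the
    # placeholder with the matching id.
    for name, build in sections.items():
        yield f'<script>fill("{name}", {json.dumps(build())});</script>'
    yield "</body></html>"


if __name__ == "__main__":
    page = {
        "news_feed": lambda: "<p>Latest stories</p>",
        "chat": lambda: "<p>Friends online</p>",
    }
    for chunk in render_page(page):
        print(chunk)  # a real server would flush each chunk to the browser
```

The point of the design is that the browser can start displaying the skeleton while the slower sections are still being generated on the server.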

Scaling PHP as a scripting language

PHP’s roots are those of a scripting language, like Perl, Python and Ruby, all of which have important benefits in terms of programmer productivity and the ability to iterate quickly on products. This is in contrast to more traditional compiled languages such as C++ and interpreted languages such as Java. On the other hand, scripting languages are known to be generally less efficient when it comes to CPU and memory usage. Because of this, it has been a challenge to scale Facebook to over 400 billion PHP-based page views each month.
A common way to address these inefficiencies is to rewrite the most complex parts of your PHP application directly in C++ as PHP extensions. PHP then largely becomes a glue language between the front-end HTML and the application logic in C++. From a technical standpoint this works well, but it dramatically reduces the number of engineers who are able to work across the whole application. Learning C++ is just the first step in writing PHP extensions; the second is understanding the Zend API. Given that Facebook’s engineering team is relatively small (there are over one million users for every engineer), Facebook cannot afford to make parts of its code base less accessible than others.
Scaling Facebook is particularly difficult because almost every page view comes from a logged-in user with a personalized experience. When you view your home page, Facebook has to look up all your friends, query their most relevant updates (from a custom service it built called Multifeed), filter the results based on your privacy settings, then fill out the stories with comments, photos, likes, and all the rich data that people love about Facebook. All this in less than a second. HipHop allows Facebook to write the logic that does the final page assembly in PHP and iterate on it quickly, while relying on custom back-end services in C++, Erlang, Java or Python to serve the News Feed, search, Chat, and other core parts of the site.
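
The home page steps described above (look up friends, gather their updates, filter by privacy, fill out the stories) can be sketched as a small pipeline. Everything in this example is hypothetical: the in-memory dictionaries stand in for back-end services, and the single `visible_to` field is a much simpler privacy model than any real one.

```python
from typing import Dict, List

# Hypothetical in-memory data standing in for back-end services.
FRIENDS: Dict[str, List[str]] = {"alice": ["bob", "carol"]}
UPDATES: Dict[str, List[dict]] = {
    "bob": [{"id": 1, "author": "bob", "text": "New photo",
             "visible_to": "friends"}],
    "carol": [{"id": 2, "author": "carol", "text": "Private note",
               "visible_to": "only_me"}],
}
COMMENTS: Dict[int, List[str]] = {1: ["Nice!"]}


def build_home_page(user: str) -> List[dict]:
    """Assemble the stories shown on a user's home page."""
    # 1. Look up the user's friends.
    friends = FRIENDS.get(user, [])
    # 2. Gather their recent updates.
    updates = [u for friend in friends for u in UPDATES.get(friend, [])]
    # 3. Drop anything the viewer is not allowed to see.
    visible = [u for u in updates if u["visible_to"] == "friends"]
    # 4. Fill out each story with its comments.
    return [{**u, "comments": COMMENTS.get(u["id"], [])} for u in visible]


if __name__ == "__main__":
    for story in build_home_page("alice"):
        print(story["author"], story["text"], story["comments"])
```

Because this has to happen for every logged-in page view, each step is a natural candidate for its own fast back-end service, with only the final assembly left to the page-rendering code.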
Since 2007 Facebook has thought about several different ways to solve these problems and has even tried implementing some of them. The common suggestion is simply to rewrite Facebook in another language, but given the complexity of the site and its speed of development, that would take some time to accomplish. Facebook has rewritten aspects of the Zend Engine (PHP’s internals) and contributed those patches back to the PHP project, but ultimately it has not seen the kind of performance gains that are needed. HipHop’s benefits are nearly transparent to Facebook’s speed of development.

