Engineering Blog

Blog posts tagged 'Infra'

Alethea Power, Site Reliability Engineer at Facebook

Making Facebook Self-Healing

Posted about 7 years ago

When your infrastructure is the size of Facebook’s, there are always broken servers and pieces of software that have gone down or are generally misbehaving. In most cases, our systems are engineered such that these issues cause little or no impact to people using the site. But sometimes small outages can become bigger outages, causing errors or poor performance on the site. If a piece of broken software or hardware does impact the site, then it's important that we fix it or replace it as quickly as possible. Even if it's not causing issues for users yet, it could in the future, so we need to take care of it quickly. Read more...

Qiang Wu, Infrastructure Software Engineer at Facebook

Keeping the Site Reliable While Moving Fast

Posted about 7 years ago
blog post · Infra · Web · Data · Culture · Optimization · Performance · Backend · Platform · Chat · PHP


Carlos Bueno, Fixer at Facebook

Doppler: Internet Radar

Posted about 7 years ago
blog post · Infra · Data · Web · Mobile · Performance · Optimization

The basic strategy for all performance and optimization work is the delicious measurement sandwich: measure, change something, then measure again. Detailed network measurements are especially hard to do because we only control one side of the transaction, our own servers. So we design network experiments that are lightweight, continuous, and gather as many samples as possible, even at the expense of detail and accuracy. A billion data points can cover a lot of methodological sins. Read more...

Yael Maguire, Engineering
Donn Lee, Engineering at Facebook

Facebook and World IPv6 Day

Posted about 7 years ago
blog post · Infra · Data · Testing


Greg Schechter, Engineering

Visualizing Facebook's PHP Codebase

Posted about 7 years ago
blog post · Infra · PHP


Sanjeev Kumar, Director Engineering at Facebook

How Project Triforce Prepared our Software Stack for Prineville

Posted about 7 years ago
blog post · Infra · Data · Compute · Hardware · Open Compute · Prineville · Data Centers · MySQL

For the first few years of Facebook’s existence, we served our users from data centers in a single region in Northern California. As the site grew, we added a second region of data centers in Virginia in 2007, and this year, we launched our third region in Prineville, Oregon. Read more...

Mark Slee, Under the fluorescent bulbs at Facebook

Thoughts on Software Quality

Posted about 7 years ago
blog post · Infra · Culture

or: Ramblings from the anti-Hedonic Treadmill of Software Quality. Read more...

Prashant Malik, Engineering

Scaling the Messages Application Back End

Posted about 7 years ago
blog post · Infra · Messages

Facebook Messages seamlessly integrates many communication channels: email, SMS, Facebook Chat, and the existing Facebook Inbox. Combining all this functionality and offering a powerful user experience involved building an entirely new infrastructure stack from the ground up. Read more...

Amir Michael, Engineering

Inside the Open Compute Project Server

Posted about 7 years ago
blog post · Infra · Data · Data Centers · Hardware · Open Compute · Open Source · Optimization

Yesterday we launched the Open Compute Project, an effort to nurture industry collaboration on the best practices and implementation of power- and cost-efficient compute infrastructure. At the heart of the project lies the Open Compute server, a highly optimized server layout developed by Facebook engineers and industry partners. Read more...

Jonathan Heiliger, VP Tech Operations at Facebook

Building Efficient Data Centers with the Open Compute Project

Posted about 7 years ago

A small team of Facebook engineers spent the past two years tackling a big challenge: how to scale our computing infrastructure in the most efficient and economical way possible. Read more...

Xin Qi, Research Scientist at Facebook

HipHop for PHP: More Optimizations for Efficient Servers

Posted about 7 years ago

Facebook switched all its production servers to HipHop in early 2010, also releasing the project’s source code at that time. At the time of the switch, HipHop reduced our average CPU usage by 50%. The six months after its release saw an additional 1.8x performance improvement, and in the past six months the team, in conjunction with the open source community, has made an additional 1.7x improvement. Read more...
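
As a rough illustration of how the figures in this excerpt combine, the sketch below multiplies them together. It rests on two assumptions the excerpt does not state: that the 50% CPU reduction can be read as a 2x gain, and that the later 1.8x and 1.7x improvements compound multiplicatively against the same kind of workload. The function and variable names are hypothetical.

```python
def compound_speedup(factors):
    """Multiply a sequence of successive speedup factors together."""
    total = 1.0
    for factor in factors:
        total *= factor
    return total


# Assumption: halving CPU usage is treated as a 2x speedup.
initial_speedup = 1 / (1 - 0.50)

# Assumption: the two later improvements stack multiplicatively.
post_release_speedup = compound_speedup([1.8, 1.7])

# Combined gain relative to the pre-HipHop baseline, under both assumptions.
overall_speedup = initial_speedup * post_release_speedup

# CPU needed for the same work, as a fraction of the original.
relative_cpu = 1 / overall_speedup

print(f"post-release: about {post_release_speedup:.2f}x")
print(f"overall: about {overall_speedup:.2f}x")
print(f"relative CPU: about {relative_cpu:.0%} of the original")
```

Under these assumptions the two post-release improvements come to roughly 3x on their own, and roughly 6x once the initial reduction is included.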

Najam Ahmad, Engineering

Accelerating Network Innovation with the Open Networking Foundation

Posted about 7 years ago
blog post · Infra · Networking and Traffic

Facebook runs one of the largest networks in the world. Delivering traffic over that network as quickly and efficiently as possible is of prime importance to us and our users, which is why we're a founding member of the Open Networking Foundation (ONF), along with Deutsche Telekom, Google, Microsoft, Verizon, and Yahoo!. The ONF is a non-profit organization dedicated to promoting a new approach to networking, called Software-Defined Networking (SDN). Read more...

Nagavamsi Ponnekanti, Engineering at Facebook

Hybrid Incremental MySQL Backups

Posted about 7 years ago
blog post · Infra · Web · Data · MySQL · PHP · Storage · Performance

This post discusses enhancements to our database backups. As we deploy these enhancements to production servers, we may write additional posts about other improvements made along the way. Read more...

Scott MacVicar, Engineering

Supercell: test infrastructure for any open source project

Posted about 7 years ago
blog post · Infra · Open Source

Testing is an important part of any project; it allows the engineer to discover errors that could have occurred while adding a new feature or a regression introduced while fixing a bug. These are all a normal part of the software development process. Read more...

Donn Lee, Engineering at Facebook

World IPv6 Day: Solving the IP Address Chicken-and-Egg Challenge

Posted about 7 years ago
blog post · Infra · Data · Networking and Traffic

We’re announcing today our participation in World IPv6 Day, along with Google, Yahoo!, Akamai, Limelight Networks, and the Internet Society. June 8, 2011, will be the first global-scale "test flight" of IPv6, the next generation protocol for the Internet. And best of all, it’s open to everyone who’s interested in testing their IPv6 service. Read more...

Jason Evans, Engineering

Scalable memory allocation using jemalloc

Posted about 7 years ago
blog post · Infra · Data · Storage

The Facebook website comprises a diverse set of server applications, most of which run on dedicated machines with 8+ CPU cores and 8+ GiB of RAM. These applications typically use POSIX threads for concurrent computations, with the goal of maximizing throughput by fully utilizing the CPUs and RAM. This environment poses some serious challenges for memory allocation, in particular: Read more...

Liyin Tang, Software engineer at Facebook

Join Optimization in Apache Hive

Posted about 7 years ago
blog post · Infra · Data · Open Source · Optimization · Performance

With more than 500 million users sharing a billion pieces of content daily, Facebook stores a vast amount of data, and needs a solid infrastructure to store and retrieve that data. This is why we use Apache Hive and Apache Hadoop so widely at Facebook. Hive is a data warehouse infrastructure built on top of Hadoop that can compile SQL queries as MapReduce jobs and run the jobs in the cluster. Read more...
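
To make the idea of running a SQL join as a MapReduce job more concrete, here is a minimal in-memory sketch of a reduce-side join. It is not Hive's compiler or any of its join optimizations; the table contents, function names, and the single-process "shuffle" are all hypothetical stand-ins chosen for illustration. It mimics a query of the form SELECT ... FROM users JOIN posts ON users.id = posts.user_id.

```python
from collections import defaultdict


def map_phase(table_name, rows, key_index):
    """Emit (join_key, (table_tag, row)) pairs, as a mapper would."""
    for row in rows:
        yield row[key_index], (table_name, row)


def shuffle(pairs):
    """Group mapper output by key, standing in for the framework's shuffle."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """For each key, pair every left-table row with every right-table row."""
    for tagged_rows in groups.values():
        left = [row for tag, row in tagged_rows if tag == "left"]
        right = [row for tag, row in tagged_rows if tag == "right"]
        for left_row in left:
            for right_row in right:
                yield left_row + right_row


def mapreduce_join(left_rows, right_rows, left_key=0, right_key=0):
    """Inner join of two lists of tuples via map, shuffle, and reduce steps."""
    pairs = list(map_phase("left", left_rows, left_key))
    pairs += list(map_phase("right", right_rows, right_key))
    return sorted(reduce_phase(shuffle(pairs)))


# Hypothetical sample tables: (id, name) and (user_id, text).
users = [(1, "ann"), (2, "bob")]
posts = [(1, "hello"), (1, "again"), (3, "orphan")]

joined = mapreduce_join(users, posts)
for row in joined:
    print(row)
```

Rows whose key appears in only one table (user 2 and the post for user 3 here) produce no output, which matches inner-join semantics. In a real cluster the shuffle step moves data between machines, which is one reason join strategy matters for performance.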

Dhruba Borthakur, Engineering

Looking at the code behind our three uses of Apache Hadoop

Posted about 7 years ago
blog post · Data · Infra · Open Source · MySQL · Storage · Messages

The size of the data warehouse cluster at Facebook has been increasing tremendously over the past few years. We use several pieces of open source software in our data warehouse including Apache Hadoop, Apache Hive, Apache HBase, Apache Thrift and Facebook Scribe. Together they keep this data processing engine humming. Read more...

Carlos Bueno, Fixer at Facebook

The Full Stack, Part I

Posted about 7 years ago
blog post · Infra · Data · Storage · Networking and Traffic · Compute · Hardware

One of my most vivid memories from school was the day our chemistry teacher let us in on the Big Secret: every chemical reaction is a joining or separating of links between atoms. Which links form or break is completely governed by the energy involved and the number of electrons each atom has. The principle stuck with me long after I'd forgotten the details. There existed a simple reason for all of the strange rules of chemistry, and that reason lived at a lower level of reality. Maybe other things in the world were like that too. Read more...

Kannan Muthukkaruppan, Technical Lead at Facebook

The Underlying Technology of Messages

Posted about 7 years ago
blog post · Infra · Data · Messages

We're launching a new version of Messages today that combines chat, SMS, email, and Messages into a real-time conversation. The product team spent the last year building out a robust, scalable infrastructure. As we launch the product, we wanted to share some details about the technology. Read more...

Jay Park, Engineering

New Cooling Strategies for Greater Data Center Energy Efficiency

Posted about 7 years ago
blog post · Infra · Hardware

Energy efficiency in the data center has both environmental and economic payoffs. Lately at Facebook we've been implementing strategies and industry best practices that allow us to run the most energy efficient data centers. We also made an interesting discovery along the way: you can realize greater efficiency gains from newer data centers, which are already much more energy efficient than their predecessors. Read more...

David Recordon, Engineering Director at Facebook

Using HTML5 Today

Posted about 7 years ago
blog post · Infra · HTML5

We are excited about HTML5 and wanted to share with you some of the things we’re already using it for! Read more...

Robert Johnson, Director, Software Engineering at Facebook

More Details on Today's Outage

Posted about 7 years ago
blog post · Infra · User Experience · Performance · Caching

Early today Facebook was down or unreachable for many of you for approximately 2.5 hours. This is the worst outage we’ve had in over four years, and we wanted to first of all apologize for it. We also wanted to provide much more technical detail on what happened and share one big lesson learned. Read more...


Facebook © 2018