Engineering Blog

Blog posts tagged 'Storage'

Yoshinori Matsunobu, Database Engineer at Facebook

MyRocks: A space- and write-optimized MySQL database

Posted about 5 months ago
blog post · Data · Infra · Storage · MySQL · Backend · Data Infrastructure

Deploying MyRocks to a database tier in one of our data center regions enabled a 50 percent reduction in storage requirements. Read more...

Smaller and faster data compression with Zstandard

Posted about 5 months ago
blog post · Data · Performance · Storage

With a performance-first design optimized for modern CPUs, Facebook's new compression algorithm translates directly to faster data transfer and smaller storage requirements. Read more...

Facebook Seattle moves into Dexter Station

Posted about 8 months ago
blog post · Infra · Culture · Seattle · Data Infrastructure · Storage · Platform

The open layout fosters Facebook's open and transparent culture, helping connect teams as they work together to connect the world. Read more...

Chris Petersen, Engineering

Introducing Lightning: A flexible NVMe JBOF

Posted about 11 months ago
blog post · Open Compute · Storage · Data Centers

Lightning is a new NVMe-based storage platform built to improve flash capacity and performance while providing a common driver and management interface. Read more...

Thomas Furlong, Engineering

Facebook in Ireland: Our newest data center

Posted about 12 months ago
blog post · Data Centers · Compute · Open Compute · Storage

The facility will be one of the most advanced data centers in the world, featuring OCP server and storage hardware. Read more...

Shaohua Li, Software Engineer at Facebook

Improving software RAID with a write-ahead log

Posted about a year ago

Software RAID has some drawbacks, which can be problematic at Facebook's scale. Using a write-ahead log can address some of these issues and improve reliability of the array. Read more...

Grantland Chew, Engineering

The Parse SDK: What's inside?

Posted about a year ago
blog post · Mobile · Data · Storage · Open Source

In this post, we'll unpack a few of the most challenging aspects of building the Parse SDKs — structuring an asynchronous API, decoupling architecture, and achieving API consistency. Read more...

George Xie, Software Engineer at Facebook

Improving Facebook's performance on Android with FlatBuffers

Posted about a year ago
blog post · Mobile · Android · Performance · Storage

In the last six months, we have transitioned most of Facebook on Android to use FlatBuffers as the storage format. Read more...

Ganapathy Krishnamoorthy, Engineering at Facebook

Inside Data@Scale 2015

Posted about 2 years ago
blog post · Data · @Scale · Storage

This week, hundreds of engineers gathered in Seattle for the Data@Scale event for discussion and collaboration toward better solutions for scaling data storage and processing. Read more...

Under the hood: Facebook’s cold storage system

Posted about 2 years ago

Finding a place for images to live so they can be instantly available is a recurring scale challenge for Facebook. Read more...

How RocksDB is used in osquery

Posted about 2 years ago
blog post · Infra · Data · Backend · Security · Framework · Analytics · Storage · Open Source

Using RocksDB as osquery's embedded database allows osquery to store and access data in a fast, persistent way, enabling our team to solve some technical problems we'll detail in this blog. Read more...

Tyrone Nicholas, Software Engineer at Facebook

Introducing Fresco: A new image library for Android

Posted about 2 years ago
blog post · Infra · Android · Open Source · Photos · Performance · Storage · Java · Development Tools

Today we're open-sourcing a library we're calling Fresco — it manages images and the memory they use. Read more...

Tomer Bar, Engineering at Facebook

Faster Photos in Facebook for iOS

Posted about 2 years ago
blog post · Mobile · Data · iOS · Photos · Performance · User Experience · Storage · News Feed

Faster Photos in Facebook for iOS. Read more...

Adam Ernst, iOS Developer at Facebook

Making News Feed nearly 50% faster on iOS

Posted about 2 years ago
blog post · Mobile · Data · iOS · News Feed · User Experience · Optimization · Performance · Framework · Caching · Storage · Testing

We realized that while Core Data had served us well in the beginning, we needed to go without some of its features to accommodate our scale. We set about replacing it with our own solution, resulting in News Feed performing nearly 50% faster on iOS. Read more...

Building Mobile-First Infrastructure for Messenger

Posted about 2 years ago
blog post · Mobile · Infra · Messages · Production Engineering · Backend · Storage

Messages have been part of Facebook for many years, beginning as direct messaging similar to email (available in your inbox the next time you visited the site) and then eventually evolving into a real-time messaging platform that provides access to your messages on a number of mobile apps or in a browser. But until recently the back-end systems hadn't evolved much from early iterations, and Messenger's performance and data usage started to lag behind — especially on networks with costly data plans and limited bandwidth. To fix this, we needed to completely re-imagine how data is synchronized to the device and change how data is processed in the back end to support our new synchronization protocol. Read more...

Introducing mcrouter: A memcached protocol router for scaling memcached deployments

Posted about 2 years ago

Most web-based services begin as a collection of front-end application servers paired with databases used to manage data storage. As they grow, the databases are augmented with caches to store frequently-read pieces of data and improve site performance. Often, the ability to quickly access data moves from being an optimization to a requirement for a site. This evolution of cache from neat optimization to necessity is a common path that has been followed by many large web scale companies, including Facebook, Twitter[1], Instagram, Reddit, and many others. Read more...

HydraBase – The evolution of HBase@Facebook

Posted about 3 years ago
blog post · Data · Infra · Messages · Analytics · Storage · Platform · Open Source

When we revamped Messages in 2010 to integrate SMS, chat, email and Facebook Messages into one inbox, we built the product on open-source Apache HBase, a distributed key-value data store running on top of HDFS, and extended it to meet our requirements. At the time, HBase was chosen as the underlying durable data store because it provided the high write throughput and low-latency random-read performance necessary for our Messages platform. In addition, it provided other important features, including horizontal scalability, strong consistency, and high availability via automatic failover. Since then, we’ve expanded the HBase footprint across Facebook, using it not only for point-read, online transaction processing workloads like Messages, but also for online analytics processing workloads where large data scans are prevalent. Today, in addition to Messages, HBase is used in production by other Facebook services, including our internal monitoring system, the recently launched Nearby Friends feature, search indexing, streaming data analysis, and data scraping for our internal data warehouses. Read more...

Looking back on “Look Back” videos

Posted about 3 years ago

Facebook’s mission is to help people connect with one another, and as our 10th anniversary approached last month, we wanted to do something that would let everyone participate in the event together. After some discussion, we settled on the Look Back feature, which allows people to generate one-minute videos that highlight memorable photos and posts from their time on Facebook. Read more...

Subodh Iyengar, Software Engineer at Facebook

Introducing Conceal: Efficient storage encryption for Android

Posted about 3 years ago
blog post · Web · Infra · Data · Security · Open Source · Android · Java · Development Tools · Caching · Storage · Performance

Caching and storage are tricky problems for mobile developers because they directly impact performance and data usage on a mobile device. Caching helps developers speed up their apps and reduce network costs for the device owner by storing information directly on the phone for later access. However, internal storage capacity on Android phones is often limited, especially with lower- to mid-range phone models. A common solution for Android is to store some data on an expandable SD card to mitigate the storage cost. What many people don't realize is that Android's privacy model treats the SD card storage as a publicly accessible directory. This allows data to be read by any app (with the right permissions). Thus, external storage is normally not a good place to store private information. Read more...

Dhruba Borthakur, Engineering

Under the Hood: Building and open-sourcing RocksDB

Posted about 3 years ago
blog post · Data · Infra · Backend · Production Engineering · Open Source · Storage

Every time one of the 1.2 billion people who use Facebook visits the site, they see a completely unique, dynamically generated home page. There are several different applications powering this experience--and others across the site--that require global, real-time data fetching. Read more...

Domas Mituzas, Infrastructure Engineer at Facebook

Flashcache at Facebook: From 2010 to 2013 and beyond

Posted about 3 years ago
blog post · Infra · Data · Storage · Caching · Performance · Optimization

We recently released a new version of Flashcache, kicking off the flashcache-3.x series. We’ve spent the last few months working on this new version, and our work has resulted in some performance improvements, including increasing the average hit rate from 60% to 80% and cutting the disk operation rate nearly in half. Read more...

Lachlan Mulcahy, Engineering

Windex: Automation for database provisioning

Posted about 4 years ago
blog post · Data · Infra · Production Engineering · Storage

Windex was originally developed to wipe data from hosts coming out of production, reinstall everything from the OS through to MySQL, and then configure them so they could be placed back into the spares pool all shiny and new. Now, Windex has expanded its role to cover all provisioning of MySQL DB hosts, whether they are freshly racked and set up by our site operations team or taken out of production for an offline repair like RAM replacement. Read more...

Scaling memcache at Facebook

Posted about 4 years ago
blog post · Data · Infra · Caching · Production Engineering · Storage

Facebook started using memcached in August 2005 when Mark Zuckerberg downloaded it from the Internet and installed it on our Apache web servers. At that time, Facebook was starting to make increasingly sizable database queries on every page load, and page load times were significantly increasing. Providing a fast, snappy user experience has always been a high priority for Facebook, and memcached came to the rescue. Read more...

Tim Armstrong, Engineering

LinkBench: A database benchmark for the social graph

Posted about 4 years ago
blog post · Data · Infra · Graph · MySQL · Performance · Optimization · Open Source · Testing · Storage

MySQL offers a good mix of flexibility, performance, and administrative ease, but the database engineering team continues to explore alternatives to MySQL for storing social graph data. There are several generic open-source benchmarks that could provide a starting point for comparing database systems. However, the gold standard for database benchmarking is to test the performance of a system on the real production workload, since synthetic benchmarks often don't exercise systems in the same way. When making decisions about a significant component of Facebook's infrastructure, we need to understand how a database system will really perform in Facebook's production workload. Read more...

Alex Gartrell, Engineering

McDipper: A key-value cache for Flash storage

Posted about 4 years ago
blog post · Infra · Data · Web · Storage · Caching · Performance · Server Infrastructure · Data Centers · Photos

Memcache has been used at Facebook for everything from a look-aside cache for MySQL to a semi-reliable data-store for ads impression data. Of course RAM is relatively expensive, and for working sets that had very large footprints but moderate to low request rates, we believed we could make Memcached much more efficient. Compared with memory, flash provides up to 20 times the capacity per server and still supports tens of thousands of operations per second, so it was the obvious answer to this problem. Read more...

Under the Hood: Scheduling MapReduce jobs more efficiently with Corona

Posted about 4 years ago
blog post · Data · Infra · Open Source · Storage

Nearly every team at Facebook depends on our custom-built data infrastructure for warehousing and analytics, with roughly 1,000 people across the company – technical and non-technical – using these technologies every day. Over half a petabyte of new data arrives in the warehouse every 24 hours, and ad-hoc queries, data pipelines, and custom MapReduce jobs process this raw data around the clock to generate more meaningful features and aggregations. Read more...

Sean Lynch, Engineering

Monitoring cache with Claspin

Posted about 4 years ago

When I started at Facebook, I joined the newly formed cache performance team in production engineering. Our goal was to get a handle on the health of our various cache systems and to facilitate quick troubleshooting, starting with answering the question, "Is this problem being caused by the cache?" Read more...

Facebook © 2017