Data @Scale 2017 Recap

On Thursday, June 8th, we welcomed 350 engineers to Seattle for the third consecutive year of Data @Scale.

In addition to focusing on the challenges of building services and solutions for large-scale storage systems and analytics, this year’s speakers from Facebook, Google, LinkedIn, Microsoft, Pinterest, Uber, and Yandex examined the ways in which Big Data is transforming machine learning, even as new machine learning techniques are leading to an evolution in infrastructure, hardware engineering, and data center design.

Ahead of the day’s presentations, we hosted a Women in Engineering Breakfast & Panel, where attendees could connect with fellow researchers and industry leaders who share a passion for technology and take part in a discussion with panelists from Amazon, Facebook, and Google.

For a recap of the conference and the presentations, check out the videos below. The @Scale community is focused on bringing people together to openly discuss the challenges of large-scale data systems and collaborate on the development of new solutions. If you’re interested in joining the next event, visit the @Scale website or join the @Scale community.

Accelerating Machine Learning for Computer Vision

Pieter Noordhuis, Facebook

Facebook engineer Pieter Noordhuis shares insights from a newly released paper, “Accurate, Large Minibatch SGD: Training ImageNet in 1 Hour.” The paper demonstrates how creative infrastructure design can contribute to more efficient deep learning at scale.

Next Generation of Globally-Distributed Databases in Azure

Rimma Nehme, Microsoft

Rimma describes the next generation of globally distributed databases at Microsoft. These databases can run on millions of nodes across hundreds of data centers and handle up to trillions of data objects, 24/7 – all backed by industry-leading comprehensive SLAs.

Yandex ClickHouse: A DBMS for Interactive Analytics at Scale

Alexey Milovidov, Yandex

In his session, Alexey walks through the development of ClickHouse and how an iterative approach to data storage organization resulted in a system that ingests clickstream data in real time, generates interactive reports on non-aggregated data, processes 100 billion rows per second on HDDs, scales linearly, supports a dialect of SQL, and is open source.

Evolution of Storage and Serving at Pinterest

Yongsheng Wu, Pinterest

Yongsheng covers how storage and serving at scale have evolved as Pinterest has grown. He shares insight into building a machine learning serving platform that addresses a new challenge: serving feeds efficiently and with very low latency, even though they depend on complicated machine-learned ranking models and on features scattered across many data sets. The goal is to deliver a delightful experience.

Cadence: Micro-service Architecture Beyond Request/Reply

Maxim Fateev, Uber

Uber’s Maxim Fateev offers a technical review of Cadence, an open source solution for building and running micro-services that expose asynchronous, long-running operations in a scalable and resilient way. Cadence borrows ideas from the AWS Simple Workflow service, is written in Go, and relies on Cassandra for storage.

How Reporting and Experimentation Fuel Product Innovation at LinkedIn

Kapil Surlaker, LinkedIn

Kapil describes UMP and XLNT, platforms built for metrics computation and experimentation, respectively. Over the last few years, these platforms have allowed LinkedIn to perform measurement and experimentation efficiently at scale while preserving trust in data.

Spanner’s SQL Evolution

Sergey Melnik, Google

Sergey offers a look at the technical challenges behind Spanner, a globally distributed data management system that backs hundreds of mission-critical services at Google.

Spanner is built on ideas from both the systems and database communities. Initially, Spanner focused on the systems aspects such as scalability, automatic sharding, fault tolerance, consistent replication, external consistency, and wide-area distribution. More recently, Google has been working on turning Spanner into a SQL DBMS.

Sergey describes distributed query execution in the presence of resharding, query restarts upon transient failures, range extraction that drives query routing and index seeks, and the improved blockwise-columnar storage format. He touches upon migrating Spanner to the common SQL dialect shared with other systems at Google.

Architectures for the New Era of Cloud Specialization

Doug Burger, Microsoft

Doug Burger walks the audience through some of Microsoft’s efforts around large-scale deployments of programmable hardware in the Microsoft cloud, including both the hardware and the resource management interfaces. He frames this work against the approaching end of Moore’s law and ever-increasing computational needs, driven in part by big data and machine learning, that will force systems to become more heterogeneous.