Migrating Etsy infrastructure from On-premises to Google Cloud Platform

Chris Bohn (15.Sep.2018 at 14:30, 40 min)
Talk at Highload fwdays '18 (English - UK)


Etsy is one of the largest and best-known specialty online marketplaces worldwide, with gross sales in 2017 exceeding $3 billion. Etsy was founded in 2005, before the emergence of viable cloud platforms. Until recently, all of Etsy's critical systems, including the production and analytics data stacks, were hosted and managed on premises. In 2017, the decision was made to migrate all infrastructure to Google Cloud Platform (GCP), with the new environment to become operational in 2018. This talk describes the migration, with a focus on moving Etsy's analytics data systems. The Etsy analytics data stack consists of Hadoop for large batch jobs, Vertica for data analysis, and Kafka for clickstream and production data distribution, as well as custom tools for data science projects and ETL processes. In addition to migrating legacy technologies to GCP, Etsy has also integrated native GCP data products such as BigQuery (big data processing) and Airflow (workflow management, replacing Oozie).

The talk covers the technical challenges and cloud economics of the migration. This has been a very large project that has gone well, thanks to good planning and building the right teams. Anyone considering migrating infrastructure to the cloud, especially to GCP, will benefit from hearing about Etsy's challenges and solutions.
