SBTB 2021: Spark

Alexy Khrabrov
Chief Scientist
Published in
1 min read · Oct 7, 2021

Apache Spark is a core open-source technology that has been discussed at Scale By the Bay since the inception of both. SBTB as a conference started at SF Scala, as did the very first Spark meetup in the world, the Bay Area Spark Meetup. Matei Zaharia has keynoted several SBTB editions, and Databricks has sent many speakers to describe various aspects of its stack. The Spark ecosystem is global and well represented by many startups and enterprises.

This year, we are happy to welcome Databricks’ own John O’Dwyer describing an incremental ETL architecture.

Itay Yaffe of Databricks demonstrates Druid in practice alongside cocohub.ai’s founder Yakir Buskilla.

Jean-Yves Stephan, of Data Mechanics, conducts an Apache Spark performance tuning session — with delight!

Microsoft Azure’s Adi Polak rethinks the machine learning ecosystem with Spark.

Aporia’s Alon Golubkin builds an ML platform from scratch, using tools like DVC and MLflow.

Uber’s Mohit Jaggi moves us from Python to Scala, Presto to Spark, elbow grease to automation.

See additional SBTB 2021 Cloud track talks integrating with Spark: https://chief.sc/sbtb2021-cloud

See the SBTB 2021 AI track talks, often built on top of Spark and the Databricks platform and tools: https://chief.sc/sbtb2021-ai

See the complete SBTB 2021 program and register for the October 28–29 online sessions: https://scale.bythebay.io/

Open-Source Science Founder and Chair, NumFOCUS. Founder and organizer, Scale By the Bay and Bay Area AI. Dad of 4.