Delta Live Tables
Chris Hoshino-Fish introduces Delta Live Tables, an optimized system for data management in the cloud.
With the advent of cloud computing, the software and hardware industry has been developing new paradigms enabling businesses to scale in the age of data. One of the greatest advantages realized was the separation of storage and compute for data; previous data collection projects were constantly constrained by either storage or compute, and enabling independent horizontal scaling of both disrupted the legacy database and data warehousing industries.
However, enterprises have used databases for decades, and over that time the industry developed thousands of techniques and optimizations that depend on coupled storage and compute. When the compute engine can rely on certain guarantees from the storage layer, it can optimize its data access patterns. Some of these techniques remain useful in the world of cloud computing, alongside newer techniques developed specifically for the cloud.
Delta Live Tables is a new system for data engineering from Databricks that builds upon technologies like Delta Lake, Apache Spark, and Spark’s Structured Streaming. It focuses on processing data incrementally and optimizing the compute system for cost and performance, while actively managing the data created by a business’s data teams. It’s a key component of a data lakehouse, which requires fast data processing to provide data practitioners with real-time data. Additionally, Delta Live Tables provides data quality monitoring capabilities, helping data teams analyze the quality of their data and remediate problems before they can affect data-driven decisions.
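To make the data quality monitoring idea concrete, here is a minimal plain-Python sketch of declarative quality rules applied to a batch of records. It is not the Delta Live Tables API and does not run on Spark. The names `Expectation`, `QualityReport`, and `apply_expectations`, along with the sample records, are hypothetical and exist only for illustration. The sketch assumes a "drop failing rows and count violations per rule" policy, which is one of several policies such a system might offer.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterable, List, Tuple

Row = Dict[str, Any]


@dataclass
class Expectation:
    """A named rule that every row is expected to satisfy (hypothetical type)."""
    name: str
    predicate: Callable[[Row], bool]


@dataclass
class QualityReport:
    """Counts of rows that passed all rules and of violations per rule."""
    passed: int = 0
    failed: Dict[str, int] = field(default_factory=dict)


def apply_expectations(
    rows: Iterable[Row], expectations: List[Expectation]
) -> Tuple[List[Row], QualityReport]:
    """Keep rows that satisfy every expectation and record each violation."""
    report = QualityReport()
    kept: List[Row] = []
    for row in rows:
        violations = [e.name for e in expectations if not e.predicate(row)]
        for name in violations:
            report.failed[name] = report.failed.get(name, 0) + 1
        if not violations:
            kept.append(row)
            report.passed += 1
    return kept, report


# Illustrative sample data, not taken from any real pipeline.
rows = [
    {"user_id": 1, "amount": 25.0},
    {"user_id": None, "amount": 10.0},
    {"user_id": 3, "amount": -5.0},
]
expectations = [
    Expectation("valid_user_id", lambda r: r["user_id"] is not None),
    Expectation("non_negative_amount", lambda r: r["amount"] >= 0),
]

clean_rows, report = apply_expectations(rows, expectations)
```

In this sketch, `clean_rows` holds only the records that satisfied every rule, while `report` gives a per-rule violation count that a data team could monitor over time before bad records reach downstream consumers.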
If this is your first time using Zoom, please allow a few extra minutes to download the browser plugin or mobile app.
Chris Hoshino-Fish is a lead solutions architect at Databricks. Chris is an active member of the Performance Subject Matter Expert group and a former principal consultant focused on data engineering, working with several Fortune 500 Databricks customers. Prior to Databricks, Chris spent 3.5 years as a data engineer at an adtech company, managing pipelines built on Apache Spark. Chris has a B.A. in computational mathematics from the University of California, Santa Cruz.