Databricks Unveils Zerobus Ingest: A New Approach to Real-Time Data Streaming
Databricks has announced the general availability of Zerobus Ingest, a fully managed, serverless service designed to streamline real-time data streaming directly into data lakehouses. This new offering aims to address the complexities and costs associated with traditional streaming architectures like those built around Apache Kafka.
The Challenge with Traditional Streaming
As organizations increasingly rely on real-time operational intelligence, traditional streaming architectures can become bottlenecks. Managing message buses, schema registries, and connector frameworks creates a significant operational burden, diverting engineering resources. Duplicate storage increases cloud costs, and multi-hop architectures can introduce latency. Data in transit also poses governance and compliance risks.
Introducing Zerobus Ingest
Zerobus Ingest, part of Lakeflow Connect, offers a simplified architecture by streaming data directly into governed Delta tables, eliminating intermediate layers. According to Databricks, this approach slashes costs and reduces tool sprawl. The service supports thousands of concurrent connections and can sustain over 10 GB/s of aggregate throughput to a single table, with data landing in the table in under 5 seconds.1
The Single-Sink Advantage
Traditional message buses like Kafka are designed as multi-sink architectures, routing data to numerous consumers. Zerobus Ingest, however, employs a single-sink architecture optimized for pushing data directly to the lakehouse. This focused approach is intended to deliver significant cost reductions.1
Key Features and Benefits
- Scalable Performance: Zerobus Ingest delivers near real-time data ingestion – within as little as five seconds – with throughput up to 100MB/s per connection and over 10GB/s aggregate table throughput.2
- Simplified Infrastructure: By eliminating unnecessary data hops, Zerobus Ingest reduces the cost and complexity of traditional message bus architectures.2
- Databricks Integration: Built on open standard Delta tables, Zerobus Ingest provides unified governance via Unity Catalog and full integration with Databricks analytics and AI tools.2
- Flexible APIs: Supports persistent gRPC streams and stateless REST APIs for various data volumes.2
- Variant Type Support: Ingests JSON data via the Variant type with durable buffering.2
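To put the quoted throughput figures in perspective, the short Python sketch below works through the arithmetic. It uses only the two numbers cited above (up to 100 MB/s per connection, over 10 GB/s aggregate per table) and assumes decimal units (1 GB = 1,000 MB). The function names are illustrative and are not part of any Databricks API.

```python
import math

# Figures quoted in the article; decimal units assumed (1 GB = 1,000 MB).
PER_CONNECTION_MB_S = 100       # up to 100 MB/s per connection
AGGREGATE_TABLE_MB_S = 10_000   # over 10 GB/s aggregate throughput per table


def connections_needed(target_mb_s: float,
                       per_connection_mb_s: float = PER_CONNECTION_MB_S) -> int:
    """Minimum number of connections, each at its peak rate, to reach target_mb_s."""
    if target_mb_s < 0 or per_connection_mb_s <= 0:
        raise ValueError("rates must be non-negative, per-connection rate positive")
    return math.ceil(target_mb_s / per_connection_mb_s)


def daily_volume_tb(rate_mb_s: float) -> float:
    """Data volume in TB produced by a sustained rate over 24 hours."""
    seconds_per_day = 86_400
    return rate_mb_s * seconds_per_day / 1_000_000


if __name__ == "__main__":
    print(connections_needed(AGGREGATE_TABLE_MB_S))  # connections to reach 10 GB/s
    print(daily_volume_tb(PER_CONNECTION_MB_S))      # TB/day from one saturated connection
```

Under these assumptions, reaching the 10 GB/s aggregate figure takes at least 100 connections running at their per-connection peak, and a single saturated connection would deliver roughly 8.64 TB per day. Real workloads rarely hold peak rates, so these are upper-bound estimates rather than sizing guidance.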
Impact on the Data Ecosystem
Databricks is positioning Zerobus Ingest as an alternative to platforms like Apache Kafka, particularly for organizations focused on a lakehouse architecture. Bilal Aslam, Databricks senior director of product management, noted that 30 to 40 percent of Databricks’ customers already have real-time or near-real-time data streaming applications.3 The service is expected to accelerate the implementation of real-time analytics and AI applications, with potential benefits for cybersecurity, IoT, and manufacturing use cases.
Partner Opportunities
Databricks partners, including system integrators and solution providers, can leverage Zerobus Ingest to deliver faster time-to-value for Databricks-based systems. Implementation times can be reduced from weeks or months to just hours.3 The new product also opens opportunities for modernization projects around legacy systems.