SQLBits 2026
  • 6-24 months
  • Lakehouse
  • Fabric
  • Spark
  • Python
  • Analytics
  • Tooling
  • Data Transformation and Integration
  • Optimising Your Data Platform
  • Fabric – How to Succeed
  • Data Internals – Blow Your Mind
  • Optimising
  • Architecture
  • Development
  • Demo Led Session
  • Regular 50 minute session
  • Intermediate
Room 1C

Spark Unplugged: How In-Process Analytics Is Making Distributed Computing An Expensive Investment

The data landscape is experiencing a fundamental shift. For years, we've been told that serious data work demands distributed computing, cloud infrastructure, and complex pipelines. But what if the most powerful analytics engine is already sitting on your desk?

In this provocative, insight-packed session, we'll challenge the conventional wisdom around big data processing by exploring the emerging "data singularity" - the point where single-node computing power is outpacing the growth of most analytical datasets. We'll demonstrate how tools like DuckDB and Polars are revolutionising analytics by bringing analytical capabilities directly into your application process, eliminating overhead, delivering mind-blowing performance and turbocharging your "inner development loop".

You'll learn how these in-process engines can process millions of rows on your laptop, often outperforming distributed systems like Spark while dramatically reducing complexity, cost, and carbon footprint. We'll share practical code examples showing how to implement these tools in your workflows, with special focus on integrating into your Databricks or Microsoft Fabric environment.

This session is perfect for:

* Data engineers tired of the overhead of working with distributed systems
* Data scientists seeking faster iteration cycles
* Data leaders looking to minimise total cost of ownership and accelerate time to value
* Anyone interested in the future direction of data processing

Walk away with a completely fresh perspective on data architecture, practical techniques to implement tomorrow, and perhaps a nagging question: Have we been overengineering our data solutions all along?

Session Materials

  • Slide Deck

    Spark_Unplugged_How_In-Process_Analytics_Is_Making_Distributed_Computing_An_Expensive_Investment - SQLbits_2026_-_Data_Singularity_-_April_2026.pdf