- 6-24 months
- Lakehouse
- Data Transformation and Integration
- Inspirational Data Solutions
- Architecture
- Presentation
- Regular 50 minute session
- Beginner
Apache Iceberg Data Analytics with Python
Python remains a leading language for data analytics, and Apache Iceberg has emerged as a powerful open table format for managing large-scale datasets. In this talk, we’ll explore how to put Iceberg to work for data analytics in Python. From PyIceberg for low-level table operations to Apache Polaris for catalog management, we’ll cover the essential tools and libraries available to Python users.
We’ll also dive into Apache DataFusion, a high-performance query engine that can query Iceberg tables, and the dremio-simple-query library, which simplifies querying Iceberg tables through Dremio. The session includes hands-on examples, best practices, and real-world scenarios that combine Python’s flexibility with Iceberg’s scalability for analytics workloads. Whether you’re a data scientist, engineer, or analyst, you’ll leave with practical insights into building a Python-powered data analytics pipeline with Apache Iceberg.