Welcome to the Eventual blog

Join us as we explore innovative ways to handle multimodal datasets, optimize performance, and simplify your data workflows.

Engineering

June 7, 2026

Daft v0.7.15: Safe Type Conversions, Flight Shuffle Optimizations, and PostgreSQL Support

Daft v0.7.15 ships with try_cast for safe type conversion, Flight shuffle LZ4 compression, UUIDv7 timestamp extraction, and PostgreSQL support.

Case Studies

April 9, 2026

Open-sourcing 43 Billion Tokens of SEC EDGAR

Datamule, Teraflop AI, and Eventual collaborated to release the SEC-EDGAR dataset containing 590 GB of data, spanning 8 million samples and 43 billion tokens from all major filings in the SEC EDGAR database.

Announcements

April 3, 2026

Daft v0.7.7: Parquet Cache Regression Fixed, df.shuffle(), and Coalesce Short-Circuit

Daft v0.7.7 fixes a parquet streaming regression that made aggregations 2-4x slower, adds df.shuffle() for ML data prep, and makes coalesce short-circuit per the SQL spec.

Announcements

March 31, 2026

Daft v0.7.6: Every Major Lake Format, O(1) Scalars, and Swordfish Plan Caching

Daft natively reads and writes every major open lake format — Iceberg, Delta Lake, Hudi, and now Apache Paimon. Plus O(1) scalar columns, fingerprint-based plan caching in Swordfish, and production observability.

Product

March 30, 2026

Daft UDF Patterns: Four Patterns, One Notebook

Row-wise, generator, async, and stateful UDFs — one notebook, one dataset, runnable side by side.

Product

March 23, 2026

GPU Inference with @daft.cls

Run GPU models on millions of rows without OOM. Real patterns from ByteDance, Essential AI, and more.

Product

March 17, 2026

Stateful UDFs with daft.cls: Python Classes that Scale

Turn any Python class into a distributed operator. Hold models, connections, and clients across rows with one decorator.

Engineering

March 11, 2026

Daft v0.7.5: A Plugin System, 5x Faster Parquet, and a Real-Time Query Debugger

Native extensions via a stable C ABI, a live query dashboard, and 2-5x faster Parquet reads on nested types.

Product

March 10, 2026

Stateless UDFs with daft.func - four patterns, one decorator

Row-wise, async, generator, and batch UDFs in Daft — one decorator, zero boilerplate, local or distributed.

Product

March 3, 2026

Daft UDFs: What is a UDF and why do you need one?

Daft User Defined Functions (UDFs) let you run custom Python inside a distributed DataFrame pipeline, with row-wise, async, generator, and batch variants.
