Join us as we explore innovative ways to handle multimodal datasets, optimize performance, and simplify your data workflows.

Daft v0.7.15 ships with try_cast for safe type conversion, Flight shuffle LZ4 compression, UUIDv7 timestamp extraction, and PostgreSQL support.

Build production-ready PDF processing pipelines with distributed computing, OCR, spatial analysis, and GPU embeddings.

Daft makes it easy to express these pipelines end-to-end, while seamlessly scaling them up to handle massive workloads.

Essential AI used Daft's data engine to process a web-scale dataset for large language model (LLM) training.

Learn how to achieve near-100% GPU utilization processing millions of text documents with Qwen3 embeddings.

A Streaming Solution

The Daft community is expanding to China in partnership with the ByteDance team, bridging the gap between English documentation and Chinese innovation cycles.

We've raised $30M to build generational technology for simple, reliable, and performant data processing across all modalities and regardless of scale.

An adventure in AI and data engineering to analyze developers across GitHub.

Learn how Daft integrates with DeepSeek SmallPond 3FS to deliver faster file access and efficient data handling for modern workloads.