Explore Vortex

Vortex is a highly performant, extensible columnar file format, with an associated toolkit for running compute over compressed data in-memory.

Your data without limits

Vortex is our next-generation, open-source columnar format, designed from the ground up for performance on modern hardware. It’s both a file format and an in-memory format, bridging storage and compute.

Pareto-optimal Performance

100x faster random access.
2-10x faster scans.
0-50% faster writes.
Comparable size to modern Parquet.

Cascaded Lightweight Encodings

“White box” encodings support pushdown compute on compressed data.
Designed for random access & decompression via SIMD (CPU) & SIMT (GPU).
Based on the latest research, like FastLanes, ALP, FSST, & BtrBlocks.
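
To make "cascaded" and "pushdown compute on compressed data" concrete, here is a minimal Python sketch. It is not Vortex's API and does not use its encoders: simple dictionary and run-end encoding stand in for the FastLanes, ALP, and FSST family, and every name in it (DictRunEnd, encode, get, count_equal, decode) is hypothetical. The point is only that one lightweight encoding can feed another, and that both random access and a filter can run without decoding the column.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class DictRunEnd:
    """Hypothetical container: dictionary encoding cascaded with run-end encoding."""
    dictionary: list  # distinct values, in first-seen order
    run_codes: list   # one dictionary code per run
    run_ends: list    # exclusive end row index of each run


def encode(values):
    # Stage 1: dictionary-encode the values into small integer codes.
    dictionary, index, codes = [], {}, []
    for v in values:
        if v not in index:
            index[v] = len(dictionary)
            dictionary.append(v)
        codes.append(index[v])
    # Stage 2: run-end-encode the codes produced by stage 1.
    run_codes, run_ends = [], []
    for row, code in enumerate(codes):
        if run_codes and run_codes[-1] == code:
            run_ends[-1] = row + 1  # extend the current run
        else:
            run_codes.append(code)
            run_ends.append(row + 1)
    return DictRunEnd(dictionary, run_codes, run_ends)


def get(col, row):
    # Random access: binary-search the run ends, then do one dictionary lookup.
    length = col.run_ends[-1] if col.run_ends else 0
    if not 0 <= row < length:
        raise IndexError(row)
    run = bisect_right(col.run_ends, row)
    return col.dictionary[col.run_codes[run]]


def count_equal(col, needle):
    # Pushdown: evaluate the predicate once per dictionary entry, then add up
    # the lengths of matching runs. No row is ever decompressed.
    matches = [v == needle for v in col.dictionary]
    total, start = 0, 0
    for code, end in zip(col.run_codes, col.run_ends):
        if matches[code]:
            total += end - start
        start = end
    return total


def decode(col):
    # Full decompression, included only to check the round trip.
    out, start = [], 0
    for code, end in zip(col.run_codes, col.run_ends):
        out.extend([col.dictionary[code]] * (end - start))
        start = end
    return out
```

In this sketch the filter does work proportional to the number of dictionary entries plus the number of runs, not the number of rows, which is the kind of saving a "white box" encoding makes possible.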

Extensible & Future-proof

Rich extension types.
Pluggable encodings & compression strategies.
Self-describing layouts & WASM decoders allow the format to evolve.

Optimized Metadata

Designed for efficient reads from object storage.
Zero-copy deserialization efficiently supports wide schemas.

Interoperable

Decompressed arrays are zero-copy to/from Apache Arrow.
Integrates with Polars, Pandas, DataFusion, DuckDB, Spark, and many other popular frameworks.

Data’s favorite format

vortex.dev