Explore Vortex
Vortex is a highly performant, extensible columnar file format, with an associated toolkit for running compute over compressed data in-memory.
Your data without limits
Vortex is our next-generation, open-source columnar format, designed from the ground up to achieve optimal performance on modern hardware. It’s a file format and a memory format, bridging storage and compute for seamless data processing.
Pareto-optimal Performance
100x faster random access than modern Parquet.
2-10x faster scans.
Up to 50% faster writes.
Comparable file size.
Cascaded Lightweight Encodings
“White box” encodings support pushdown compute on compressed data.
Designed for random access & decompression via SIMD (CPU) & SIMT (GPU).
Based on the latest research, like FastLanes, ALP, FSST, & BtrBlocks.
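To make the idea of cascading and pushdown concrete, here is a minimal sketch in plain Python. It is not Vortex's implementation or API: the function names (`dict_encode`, `run_length_encode`, `count_equal`) are hypothetical, and the two encodings shown (dictionary, then run-length) are stand-ins for the research-grade encodings listed above. The point is only that an equality filter can be answered from the compressed form, by resolving the value to a code once and summing run lengths, without ever rebuilding the original column.

```python
from itertools import groupby


def dict_encode(values):
    """First stage: map each distinct value to a small integer code."""
    dictionary = sorted(set(values))
    index = {value: code for code, value in enumerate(dictionary)}
    return dictionary, [index[value] for value in values]


def run_length_encode(codes):
    """Second stage: collapse consecutive repeats into (code, run_length) pairs."""
    return [(code, sum(1 for _ in run)) for code, run in groupby(codes)]


def count_equal(dictionary, runs, needle):
    """Count rows equal to `needle` directly on the compressed representation."""
    if needle not in dictionary:
        return 0  # the value never occurs, so no run can match
    target = dictionary.index(needle)
    return sum(length for code, length in runs if code == target)


column = ["eu", "eu", "eu", "us", "us", "eu", "ap", "ap", "ap", "ap"]
dictionary, codes = dict_encode(column)      # cascade step 1
runs = run_length_encode(codes)              # cascade step 2
matches = count_equal(dictionary, runs, "eu")
print(runs, matches)
```

The filter touches four runs instead of ten rows here; on long, repetitive columns the gap between run count and row count is what makes compute on compressed data pay off.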
Extensible & Future-proof
Rich extension types.
Pluggable encodings & compression strategies.
Self-describing layouts & WASM decoders allow the format to evolve.
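One way to picture pluggable encodings with self-describing data is a decoder registry keyed by an encoding identifier stored alongside each chunk. The sketch below is an illustration under that assumption, not Vortex's actual plugin mechanism: the `register` decorator, the `DECODERS` table, and the chunk dictionary layout are all hypothetical.

```python
DECODERS = {}  # hypothetical registry: encoding id -> decoder function


def register(encoding_id):
    """Decorator that adds a decoder to the registry under `encoding_id`."""
    def wrap(fn):
        DECODERS[encoding_id] = fn
        return fn
    return wrap


@register("plain")
def decode_plain(payload):
    return list(payload)


@register("delta")
def decode_delta(payload):
    """Rebuild values from a first value followed by successive differences."""
    out, total = [], 0
    for delta in payload:
        total += delta
        out.append(total)
    return out


def decode(chunk):
    """Dispatch on the encoding id that the chunk itself declares."""
    encoding = chunk["encoding"]
    if encoding not in DECODERS:
        raise ValueError(f"no decoder registered for {encoding!r}")
    return DECODERS[encoding](chunk["payload"])


chunks = [
    {"encoding": "plain", "payload": [7, 8]},
    {"encoding": "delta", "payload": [100, 1, 1, 5]},
]
decoded = [decode(chunk) for chunk in chunks]
print(decoded)
```

Because the reader only looks up what each chunk declares, a new encoding can be added by registering one more decoder, with no change to the dispatch code or to chunks written earlier.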
Optimized Metadata
Designed for efficient reads from object storage.
Zero-copy deserialization efficiently supports wide schemas.
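The benefit for wide schemas can be sketched with a fixed-width footer that is read in place. This is an assumed toy layout, not the Vortex file format: a 4-byte column count followed by one 12-byte entry (offset, length) per column, with `build_footer` and `column_entry` as hypothetical helpers. Looking up one column reads only that entry's bytes through a `memoryview`, rather than parsing all entries into objects first.

```python
import struct

# Assumed toy entry layout: u64 byte offset + u32 byte length, little-endian.
ENTRY = struct.Struct("<QI")
HEADER = struct.Struct("<I")  # u32 column count


def build_footer(entries):
    """Pack (offset, length) pairs into one contiguous footer buffer."""
    buf = bytearray(HEADER.size + ENTRY.size * len(entries))
    HEADER.pack_into(buf, 0, len(entries))
    for i, (offset, length) in enumerate(entries):
        ENTRY.pack_into(buf, HEADER.size + i * ENTRY.size, offset, length)
    return bytes(buf)


def column_entry(footer, index):
    """Read one column's entry in place, without decoding the other entries."""
    view = memoryview(footer)  # no copy of the underlying bytes
    (count,) = HEADER.unpack_from(view, 0)
    if not 0 <= index < count:
        raise IndexError(index)
    return ENTRY.unpack_from(view, HEADER.size + index * ENTRY.size)


# A deliberately wide schema: 10,000 columns of 4 KiB each.
entries = [(i * 4096, 4096) for i in range(10_000)]
footer = build_footer(entries)
entry = column_entry(footer, 9_999)
print(entry)
```

The lookup cost is independent of the number of columns, which is the property that matters when a query touches a handful of columns out of thousands.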
Interoperable
Decompressed arrays are zero-copy to/from Apache Arrow.
Integrates with Polars, pandas, DataFusion, DuckDB, Spark, and many other popular frameworks.