FederIQ vs. Trino vs. DuckDB
A pragmatic feature comparison. None of these tools is strictly better; each solves a different pain point.
| Dimension | DuckDB | FederIQ | Trino |
|---|---|---|---|
| Deploy unit | Single binary | Single binary | JVM cluster |
| Minimum memory | ~50 MB | ~100 MB | ~4 GB |
| Cold start | ~50 ms | ~200 ms | ~30 s |
| Multi-source federation | Limited | Yes | Yes |
| Postgres / MySQL / SQLite | Via extensions | Native catalog | Native |
| Parquet / CSV / JSON | Native | Via DuckDB | Via Hive connector |
| HTTP / REST | No | Yes (default feature) | No |
| Mongo | No | Planned | Yes |
| Iceberg | Via extension | In progress | Yes |
| Declarative policy engine | No | Yes (YAML, SQL rewrite) | Via Ranger add-on |
| Column masking / row filters | Manual views | Built-in, catalog-declared | Via Ranger |
| HTTP service mode | No | Yes (auth + TLS + metrics) | Yes |
| Cache layer (memory/disk/Redis) | Manual | Built-in | Materialized views |
| Python / TS clients | Yes / Yes (Node.js) | Yes / Yes | Yes / community |
| License | MIT | Apache 2.0 | Apache 2.0 |
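To make the "Manual views" entry in the masking row concrete, here is a minimal sketch of what hand-rolled column masking and row filtering looks like in an embedded SQL engine. It uses Python's built-in sqlite3 module only so it runs without extra installs; the view is plain CREATE VIEW SQL and should carry over to DuckDB largely unchanged. The table, the view name, and the "EU analysts" policy are all hypothetical.

```python
import sqlite3

# In-memory database standing in for any embedded SQL engine.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE customers (id INTEGER, email TEXT, region TEXT);
INSERT INTO customers VALUES
  (1, 'ana@example.com',  'EU'),
  (2, 'bob@example.com',  'US'),
  (3, 'chen@example.com', 'EU');

-- Hypothetical policy: EU analysts see only EU rows (row filter),
-- and the local part of each email is hidden (column mask).
CREATE VIEW customers_eu_masked AS
SELECT id,
       '***' || substr(email, instr(email, '@')) AS email,
       region
FROM customers
WHERE region = 'EU';
""")

# Consumers query the view instead of the base table.
rows = con.execute(
    "SELECT id, email, region FROM customers_eu_masked ORDER BY id"
).fetchall()
print(rows)  # [(1, '***@example.com', 'EU'), (3, '***@example.com', 'EU')]
```

The cost of this approach is that every policy becomes a view someone has to write, name, and keep in sync with the base schema, and nothing stops a user with access to the base table from bypassing it. That maintenance burden is what a catalog-declared policy engine or a Ranger integration is meant to remove.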
When to pick each
- DuckDB alone: single-node analytics, embedded in a Python/R/Node process, local Parquet/CSV processing.
- FederIQ: you need federated queries across 2–10 sources, a policy engine, or an HTTP service, but don't want the operational cost of a Trino cluster. It runs either embedded or as a sidecar.
- Trino: petabyte-scale, hundreds of concurrent users, existing Hive metastore, or you already run a JVM fleet.
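The guidance above can be read as a small decision procedure. The sketch below is one possible encoding, not an official selection tool: the `Workload` fields, the `suggest_engine` name, the 100-user cutoff for "hundreds of concurrent users", and the choice to send more than 10 sources to Trino are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of a workload; field names are illustrative."""
    source_count: int = 1
    needs_policy_engine: bool = False
    needs_http_service: bool = False
    petabyte_scale: bool = False
    concurrent_users: int = 1
    has_hive_metastore: bool = False
    runs_jvm_fleet: bool = False


def suggest_engine(w: Workload) -> str:
    """Return a first-guess engine based on the bullets above."""
    # Trino signals are checked first because they dominate the others.
    # Assumption: "hundreds of concurrent users" is approximated as >= 100.
    # Assumption: the text is silent on > 10 sources; treated as Trino here.
    if (
        w.petabyte_scale
        or w.concurrent_users >= 100
        or w.has_hive_metastore
        or w.runs_jvm_fleet
        or w.source_count > 10
    ):
        return "Trino"
    # FederIQ: federation across 2-10 sources, a policy engine, or an HTTP service.
    if w.source_count >= 2 or w.needs_policy_engine or w.needs_http_service:
        return "FederIQ"
    # Otherwise: single-node, embedded, local-file analytics.
    return "DuckDB"


print(suggest_engine(Workload()))                         # DuckDB
print(suggest_engine(Workload(source_count=4)))           # FederIQ
print(suggest_engine(Workload(concurrent_users=500)))     # Trino
```

Treat the result as a starting point for discussion; real choices usually also depend on team skills and existing infrastructure, which a handful of booleans cannot capture.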
Benchmarks
Rigorous TPC-H numbers against Trino and Dremio are on the roadmap. When
we publish them, the benchmark harness and raw results will live in the
repo under bench/ so you can reproduce them.