FederIQ vs. Trino vs. DuckDB

A pragmatic feature comparison. None of these tools is strictly better; each solves a different pain point.

Dimension                       | DuckDB         | FederIQ                    | Trino
--------------------------------+----------------+----------------------------+-------------------
Deploy unit                     | Single binary  | Single binary              | JVM cluster
Minimum memory                  | ~50 MB         | ~100 MB                    | ~4 GB
Cold start                      | ~50 ms         | ~200 ms                    | ~30 s
Multi-source federation         | Limited        | Yes                        | Yes
Postgres / MySQL / SQLite       | Via extensions | Native catalog             | Native
Parquet / CSV / JSON            | Native         | Via DuckDB                 | Via Hive connector
HTTP / REST                     | No             | Yes (default feature)      | No
Mongo                           | No             | Planned                    | Yes
Iceberg                         | Via extension  | In progress                | Yes
Declarative policy engine       | No             | Yes (YAML, SQL rewrite)    | Via Ranger add-on
Column masking / row filters    | Manual views   | Built-in, catalog-declared | Via Ranger
HTTP service mode               | No             | Yes (auth + TLS + metrics) | Yes
Cache layer (memory/disk/Redis) | Manual         | Built-in                   | Materialized views
Python / TS clients             | Yes / No       | Yes / Yes                  | Yes / community
License                         | MIT            | Apache 2.0                 | Apache 2.0
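
To make the "Manual views" entry for column masking and row filters concrete, here is a minimal sketch of what hand-maintaining such a view can look like. It uses Python's built-in sqlite3 module purely as a stand-in SQL engine. The table, view, and column names are hypothetical, and it does not show FederIQ's policy syntax or any Ranger configuration.

```python
import sqlite3

# sqlite3 is only a stand-in engine; all table, view, and column names are hypothetical.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE customers (id INTEGER, email TEXT, region TEXT);
    INSERT INTO customers VALUES
        (1, 'ana@example.com', 'EU'),
        (2, 'bob@example.com', 'US');

    -- Column mask: hide the local part of the email, keep the domain.
    -- Row filter: expose only rows from the EU region.
    CREATE VIEW customers_eu_masked AS
    SELECT id,
           '***' || substr(email, instr(email, '@')) AS email,
           region
    FROM customers
    WHERE region = 'EU';
""")

# Consumers query the view instead of the base table.
rows = con.execute(
    "SELECT id, email, region FROM customers_eu_masked"
).fetchall()
print(rows)
```

With this approach every new audience or masking rule means another view to write and keep in sync by hand, which is the maintenance cost that a declarative policy layer aims to remove.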

When to pick each

  • DuckDB alone: single-node analytics, embedded in a Python/R/Node process, local Parquet/CSV processing.
  • FederIQ: you need federated queries across 2–10 sources, a policy engine, or an HTTP service, but don't want the operational cost of Trino. Works great embedded or as a sidecar.
  • Trino: petabyte-scale, hundreds of concurrent users, existing Hive metastore, or you already run a JVM fleet.
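
As a toy illustration of what "federated queries across sources" means in the list above, the sketch below joins tables that live in two separate databases within a single SQL statement. It uses Python's built-in sqlite3 module and ATTACH as an analogy only: both sources are in-memory SQLite databases with hypothetical names (crm, billing), whereas a real federation engine talks to heterogeneous remote systems. None of this is FederIQ, Trino, or DuckDB syntax.

```python
import sqlite3

con = sqlite3.connect(":memory:")

# Each ATTACH creates an independent in-memory database,
# standing in for a separate data source (names are hypothetical).
con.execute("ATTACH DATABASE ':memory:' AS crm")
con.execute("ATTACH DATABASE ':memory:' AS billing")

con.executescript("""
    CREATE TABLE crm.customers (id INTEGER, name TEXT);
    CREATE TABLE billing.invoices (customer_id INTEGER, amount REAL);
    INSERT INTO crm.customers VALUES (1, 'Ana'), (2, 'Bob');
    INSERT INTO billing.invoices VALUES (1, 40.0), (1, 60.0), (2, 25.0);
""")

# One query spans both "sources": join customers to invoices and total per customer.
totals = con.execute("""
    SELECT c.name, SUM(i.amount) AS total
    FROM crm.customers AS c
    JOIN billing.invoices AS i ON i.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
print(totals)
```

The point of the analogy is the shape of the query: the caller writes one join and the engine resolves where each table lives.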

Benchmarks

Rigorous TPC-H numbers against Trino and Dremio are on the roadmap. Once they are published, the benchmark harness and raw numbers will live in the repo under bench/ so you can reproduce the results yourself.