# Apache Iceberg connector
FederIQ reads Iceberg tables through DuckDB's native iceberg extension.
Declare the source once, query it like any other table.
```yaml
sources:
  - name: bronze
    type: iceberg
    path: "s3://warehouse/bronze/events"
    # optional: tolerate data files that moved relative to the manifest
    allow_moved_paths: false
```
Then:
```sql
SELECT event_type, COUNT(*)
FROM bronze
GROUP BY event_type;
```
## How it works
On attach, FederIQ emits:

```sql
INSTALL iceberg;
LOAD iceberg;
CREATE OR REPLACE VIEW bronze AS
SELECT * FROM iceberg_scan('s3://warehouse/bronze/events');
```

DuckDB handles manifest parsing and data file access.
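If `allow_moved_paths` is enabled in the source config, the scan would presumably carry it through as a named parameter. A sketch of the emitted view in that case (check your DuckDB version for the exact `iceberg_scan` options):

```sql
-- Sketch: view emitted when the source sets allow_moved_paths: true
CREATE OR REPLACE VIEW bronze AS
SELECT * FROM iceberg_scan('s3://warehouse/bronze/events', allow_moved_paths = true);
```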
## Credentials
For S3-backed tables, set AWS credentials via environment variables
(`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`, `AWS_REGION`) or via
DuckDB's secrets manager.
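For the secrets-manager route, a minimal sketch of registering an S3 secret in DuckDB (placeholder key values; see the DuckDB docs for temporary vs. persistent secrets):

```sql
-- Register S3 credentials with DuckDB's secrets manager
-- (placeholder values, shown for illustration only)
CREATE SECRET bronze_s3 (
    TYPE S3,
    KEY_ID 'AKIA...',
    SECRET '...',
    REGION 'us-east-1'
);
```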
For local filesystem paths, no credentials are needed.
## Limitations
DuckDB's Iceberg extension is read-only and tracks a subset of the spec:
- v2 tables are supported; older format versions are read best-effort.
- No writes (adding partitions, schema evolution, etc.); use your engine of choice (Spark, Trino, pyiceberg) to author tables and FederIQ to query them.
- Time travel (`FOR TIMESTAMP AS OF ...`) is not yet surfaced through FederIQ's catalog; query DuckDB directly if you need it.
- Expect slower first-query latency than plain Parquet: Iceberg reads the metadata tree on every attach.
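For time travel outside FederIQ, one approach is to list a table's snapshots with DuckDB's `iceberg_snapshots` function and then scan the snapshot you want. A sketch (column names and snapshot-selection parameters vary by DuckDB iceberg extension version):

```sql
-- Run in DuckDB directly: list available snapshots for the table
SELECT snapshot_id, timestamp_ms
FROM iceberg_snapshots('s3://warehouse/bronze/events');
```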