DuckDB

DuckDB is an in-process SQL OLAP database management system.
Like SQLite, it is particularly useful for:
- Processing and storing tabular datasets, e.g. from CSV or Parquet files
- Interactive data analysis, e.g. joining and aggregating multiple large tables
Embed
TODO: Explain how one or more DuckDB files could be attached to a signed PDF file.
Convert
TODO: Explain how a user can drop CSV, Excel, or FEC files and have them converted to a single DuckDB database file.
Attach
The PDF document may contain a link to the database rather than the database file itself (thanks to the ATTACH command). A hash of the linked database should be computed and stored in the PDF document to ensure data integrity.
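As a minimal sketch of the integrity check described above, the following Python snippet computes a SHA-256 digest of a linked file and compares it with a stored value. The function names `file_sha256` and `verify_linked_database` are hypothetical, SHA-256 is an assumed choice of hash, and how the digest is actually stored in and read from the PDF is not covered here. Note that this hashes the raw bytes of the file, not a canonical form of the database.

```python
import hashlib
import hmac


def file_sha256(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`.

    The file is read in chunks so that large database files
    do not need to fit in memory.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_linked_database(path, expected_hex):
    """Return True if the file at `path` matches the stored digest.

    `expected_hex` stands for the value stored in the PDF document
    (hypothetical: the storage mechanism is not defined here).
    """
    return hmac.compare_digest(file_sha256(path), expected_hex.lower())
```

A checker would call `verify_linked_database(path, stored_hash)` before attaching the linked file and refuse to continue if it returns `False`.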
Multiple sources can be linked/attached and processed as a single database.
Multiple formats can then be supported out of the box:
- SQLite
- CSV
- Microsoft Excel
- OpenDocument
- ...
Hash
We need a canonical form of a DuckDB database file to compute a hash.
See this DuckDB discussion.
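One possible approach, sketched below under stated assumptions, is to hash the logical content of the database instead of the bytes of the file, on the assumption that two files holding the same data are not guaranteed to be byte-identical. The function `canonical_hash` and its input shape (a dict mapping each table name to its column names and rows) are hypothetical. The sketch treats row order as irrelevant, serializes values as JSON, and does not address how rows are exported from DuckDB or how types such as floats and timestamps should be normalized.

```python
import hashlib
import json


def canonical_hash(tables):
    """Return a hex SHA-256 digest of the logical content of `tables`.

    `tables` maps a table name to a `(column_names, rows)` pair
    (hypothetical input shape, for illustration only).
    """
    digest = hashlib.sha256()
    # Visit tables in name order so insertion order does not matter.
    for name in sorted(tables):
        columns, rows = tables[name]
        header = json.dumps({"table": name, "columns": list(columns)}, sort_keys=True)
        digest.update(header.encode("utf-8") + b"\n")
        # Serialize each row, then sort the serialized rows so that
        # physical row order does not affect the digest.
        encoded_rows = sorted(json.dumps(list(row), default=str) for row in rows)
        for line in encoded_rows:
            digest.update(line.encode("utf-8") + b"\n")
    return digest.hexdigest()
```

With this definition, two inputs that differ only in table or row order produce the same digest, while any change to a value, a column name, or a table name produces a different one.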
Diff
TODO: link to DeltaV module
Query
To query the database from a checklist, we can use: