This document describes a generic ETL process to load data into a Data Vault model using Pentaho Data Integration. Key points include:
- A single parameterized job and transformation were created for loading each type of Data Vault entity (hub, link, hub satellite, link satellite), driven by metadata stored in Excel and database tables.
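The metadata-driven approach can be illustrated with a short sketch. The code below is hypothetical (the table and column names, the MD5 surrogate key, and the SQLite backend are illustrative assumptions, not taken from the original jobs): one generic function loads any hub, parameterized entirely by a metadata record describing the target hub, the source table, and the business-key column.

```python
import hashlib
import sqlite3
from datetime import datetime, timezone

# Hypothetical metadata record for one hub, in the spirit of the
# Excel/database metadata tables described above.
HUB_METADATA = {
    "hub_table": "hub_customer",
    "source_table": "stg_customer",
    "business_key": "customer_nr",
}

def hash_key(value: str) -> str:
    """Data-Vault-style surrogate key: hash of the business key."""
    return hashlib.md5(value.encode("utf-8")).hexdigest()

def load_hub(conn, meta):
    """Generic hub load: insert only business keys not yet present,
    so reruns are idempotent."""
    rows = conn.execute(
        f"SELECT DISTINCT {meta['business_key']} FROM {meta['source_table']}"
    ).fetchall()
    now = datetime.now(timezone.utc).isoformat()
    for (bk,) in rows:
        conn.execute(
            f"INSERT INTO {meta['hub_table']} "
            f"(hub_key, business_key, load_dts, record_source) "
            f"SELECT ?, ?, ?, ? WHERE NOT EXISTS "
            f"(SELECT 1 FROM {meta['hub_table']} WHERE business_key = ?)",
            (hash_key(str(bk)), str(bk), now, meta["source_table"], str(bk)),
        )
    conn.commit()

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stg_customer (customer_nr TEXT)")
    conn.execute(
        "CREATE TABLE hub_customer "
        "(hub_key TEXT, business_key TEXT, load_dts TEXT, record_source TEXT)"
    )
    conn.executemany(
        "INSERT INTO stg_customer VALUES (?)", [("C001",), ("C002",), ("C001",)]
    )
    load_hub(conn, HUB_METADATA)
    load_hub(conn, HUB_METADATA)  # rerun: no duplicate keys added
    print(conn.execute("SELECT COUNT(*) FROM hub_customer").fetchone()[0])  # 2
```

Because the SQL is generated from the metadata record, the same function serves every hub; links and satellites would follow the same pattern with their own metadata shape.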
- Loading runs in a loop, with each job logging details for easy debugging. Errors are captured in error tables so that design issues can be identified.
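The loop-with-error-capture pattern can be sketched as follows. All names here (`etl_errors`, the `target` metadata field, the demo load function) are illustrative assumptions, not the original job's schema: the point is that a failing entity is recorded in an error table and the loop continues, so one run surfaces every design issue instead of stopping at the first.

```python
import sqlite3

def run_loads(conn, metadata_rows, load_fn):
    """Run each entity load in a loop; record failures in an error
    table and continue, rather than aborting the whole batch."""
    for meta in metadata_rows:
        try:
            load_fn(conn, meta)
        except Exception as exc:
            conn.execute(
                "INSERT INTO etl_errors (entity, error_msg) VALUES (?, ?)",
                (meta["target"], str(exc)),
            )
    conn.commit()

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE etl_errors (entity TEXT, error_msg TEXT)")

    def demo_load(conn, meta):
        # Stand-in for a real entity load; fails for one entity.
        if meta["target"] == "hub_bad":
            raise ValueError("missing business key column")

    run_loads(conn, [{"target": "hub_ok"}, {"target": "hub_bad"}], demo_load)
    for row in conn.execute("SELECT entity, error_msg FROM etl_errors"):
        print(row)  # ('hub_bad', 'missing business key column')
```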
- The process is available for free. It initially supports MySQL and PostgreSQL backends, with Oracle support planned. Interested readers can obtain the code by sending Belgian beer to the author.