The BIA Reporting databases can be populated by executing the ETL's. The databases must be populated in the following order:
- 1. Staging database
- 2. Central database (CDW)
- 3. Datamarts
Within each database the ETL's may need to be run in a particular order. The sequence of running ETL's is described below. The reasons why some ETL's need to be run in a database before others are:
- Control tables : the ETL Control tables in the databases must be populated before the other ETL's are run. This is because each ETL identifies and processes only the data that has been added or updated for each of the tables in a database. The ETL identifies this data through the last written date in the ETL Control tables.
- Dependencies in the CDW : foreign key dependencies exist on some of the tables in the CDW as this database is normalized. This means that some ETL's cannot be run until other ETL's are complete e.g DW_CASESTATUSHISTORY_ETL cannot be run until the DW_CASE_ETL has completed successfully as the prior ETL needs to retrieve the DWCASEID from the latter.
- Datamart Dimensions and Facts : dependencies in the datamarts mean that fact tables cannot be populated until all the dimensions that the fact table reference are populated.
The build script should be used to run ETL's in Oracle or DB2. The run commands allows a user to run control and operational ETL's in the Staging, CDW, or Datamarts. Appendix B details the build run targets, the sequencing of the ETL's in the Staging, CDW, and Datamarts and how to amend any run batch files.