Friday, March 21, 2008

Why staging area is needed?

Unlike OLTP systems that create their own data through a user interface, data warehouses source their data from other systems.
There is physical data movement from source database to data warehouse database.
Staging area is primarily designed to serve as intermediate resting place for data before it is processed and integrated into the target data warehouse.

This staging are serves many purpose above and beyond the primary function.
a.The data is most consistent with the source.
It is devoid of any transformation or has only minor format changes.
b. The staging area in a relational database can be read/ scanned/ queried using SQL without the need of logging into the source system or reading files (text/xml/binary).
c. It is a prime location for validating data quality from source or auditing and tracking down data issues.
d. Staging area acts as a repository for historical data if not truncated

No comments: