Pages

Wednesday, January 8, 2014

Best Practices for ETL Design #10: When to Truncate Tables

Truncating tables is a common practice when you need the table to be totally refreshed. In Microsoft SQL Server Integration Service, it is often implemented through “Execution SQL Task” at the very beginning of ETL packages. You may never think about it would sometimes give trouble. Here is some worst possible scenario:

Let us imagine the target table to be refreshed is a report that users may run at any minute. If we truncate the target table at the beginning. However, for some reasons, there is some problem to get the rest of ETL package to complete immediately. Or it takes more than five minutes ( or the time that the end users could bear with) to finish the ETL package and to load new data to the target table.

Then what will happen to users on the reporting side? They will come back to you and complain the report is down!

Solution

Therefore, in order to avoid the above worst scenario, in your ETL packages, please truncate the final key target table at the LAST minute when its new data is ready to load. Satisfy end users with your reports always on!

No comments:

Post a Comment