SAS® Data Integration Studio: Tips and Techniques for Implementing ELT
Jeff Stander, SAS Institute Inc., Lake Mary, Florida
Are you looking to improve performance of your data integration processes? One best practice is to stage data inside
the database and direct SAS® to do its data integration work inside the database. Extract, load, and transform (ELT)
is an increasingly popular way to minimize data movement and gain efficiencies using your existing database
management system. This paper explains how to enable ELT processing using the newest features of SAS® Data
Integration Studio 4.21 and SAS/ACCESS®.
Several trends are influencing applications of data integration, including increasing data volumes, new database
technologies, and innovations in SAS Data Integration Studio and SAS/ACCESS.
First, there is an ever increasing demand for more data integration processing inside the database management
system. This is due to growing data volumes. Some studies indicate that data volumes are doubling every 11
months. This increase in data volumes means we have to be more efficient on how we process data. Database
consolidation is one strategy organizations are undertaking in order to be more agile.
Another recent trend is the innovations in database technologies and data warehouse appliances. More computing
power is becoming available to do data integration inside the database. These technologies have extremely efficient
ways to handle large data volumes without having to move data back and forth.
SAS Data Integration Studio now has enhanced pass-through Structured Query Language (SQL) generation for most
database management systems, and provides better support for in-database processing.
For many years, users of Base SAS have used implicit and explicit SQL Pass-Through capabilities found in PROC
SQL and SAS/ACCESS software to push processing into the database management system. SAS Data Integration
Studio provides many of the same capabilities, minus the hand coding, by using metadata to control the processing.
SAS DATA INTEGRATION STUDIO
AN EXAMPLE ELT JOB
When using ELT, you make an architectural decision to use the database management system to do the
transformation. Using this technique, you move your data into the database management system (DBMS) and then
use SQL to perform DBMS-specific processing on the data. The basic difference between extract, transform, and
load (ETL) and ELT is that you do not extract the data to perform operations on it. Architecturally you extract the data
from the sources, load by using an ODS or staging area on the target, and then use the database to complete the
transformation. See Figure 1. Some benefits of using an ELT approach include: reduced data movement, ability to
leverage DBMS scalability and parallel processing capabilities, and in some cases reduced storage requirements due
to using a centralized data store.
SAS Global Forum 2010