In the news, Microsoft and Dell have teamed up to provide “out of the box” Business Intelligence systems.
In this week’s feature, we’ll continue the introduction to SQL Server Features and talk about SQL Server Integration services.
The web link is on Replication with Oracle, and this week’s tip is on using Unicode data.
In the News:
Microsoft and Dell have teamed up to provide “out of the box” Business Intelligence systems. Dell provides hardware and configuration, and Microsoft provides the software and documentation. If you’ve been asked to get a BI solution ready for your company, check out the Dell site – it has tons of useful information on BI and SQL Server. Find it at: http://tinyurl.com/6jvxnp
In a previous Podcast I started an overview of the major features you can find in SQL Server 2005. I’ll cover each feature in more depth in future Podcasts, but we need a place to start the discussion for those features, so we’ll cover the basics first. This week we’ll take a look at SQL Server Integration Services.
SQL Server Integration Services, or SSIS, is a mechanism primarily used to bring data in and out of SQL Server. One of the most common uses for SSIS is for the Extract, Transform and Load (or ETL) process in a Business Intelligence landscape, but it can also be used for data movement or data cleansing needs you have. It also includes lots of tools to automate the process.
In many ETL layouts, you normally extract the data from a source system (or more than one), and then place the raw import into a set of staging tables in a database. From there you transform the data into the types, formats and so on that you need for the destination Business Intelligence system. Finally, you load the data from the database into the Business Intelligence system.
With SSIS, you can extract the data from almost any source system, perform a transformation function on the data, and then load it into the final destination all in one step. And destination can be a Business Intelligence system, another database, or even a text file or Excel spreadsheet.
The conceptual framework for SSIS includes the Control Flow and the Data Flow. These components make up a Package, and multiple Packages make up a Project. Multiple Projects make up Solutions. All of these objects are designed in the Business Intelligence Development Studio, or BIDS. This is the second major tool installed with a full copy of SQL Server 2005 and higher.
The Control Flow of SSIS includes any steps you want to take before or after you connect to the source data, and any Data Flow elements. Think of Control Flow objects as all of those things you want to do in addition to moving the data. You might download a file from an FTP location, or start logging the process. All of those objects are placed in an outer “container”, which can also include one or more Data Flow objects. So in effect you create a “bookend” on the front and back of the data movement.
In a simple Project, such as transferring data from one database to another, you might have only one item inside your Control Flow container: a Data Flow object. In more complicated situations you'll create multiple objects, all linked together with constraints and precedents.
The Data Flow container has objects to connect to data sources and their destinations. Using the connections you create you define the inputs and outputs for each one. You connect the output of the source to the input of the destination objects, creating a basic data path.
You can also add to the data path by adding various transformation objects between the sources and destinations. These transformations can combine and aggregate, or split apart and separate data. They can perform numerical calculations, and they can create derived fields.
Keep in mind that you're not limited to working with a single source and destination. You can route the outputs of several sources to the inputs of a single destination, or a single source can feed multiple destinations. You can route the outputs of one source into the inputs of a data object that in turn outputs to another, each having a different transformation.
BIDS has all of the elements and tools you need to design and implement an ETL process. Everything is a graphical component, but you can always get to code if you need to do more granular work.
SSIS is also programmable. The API object model is exposed so that you can create and control the entire system with code.
The web link for this week is a great resource that will help you set up replication with an Oracle system. You can find it here: http://tinyurl.com/63juxe.
Tip of the Week:
When a collation is specified for non-Unicode character data, such as char, varchar, and text data, a particular code page is associated with the collation. Collations specified for Unicode data, such as nchar, nvarchar, and nvarchar(max), do not have specific code pages associated with them, because Unicode data handles virtually all characters of all the world's languages.