Data Generation Wizard


One of the new features in the Visual Studio Team System 2008 Database Edition – Power Tools release is the addition of the Data Generation Wizard. The Data Generation Wizard is an item template wizard that allows you to create and configure a data generation plan by pointing it at an existing database, which has the same schema as the underlying database project, and configure each column inside each table with a sequential data bound generator. When finished you have a configured data generation plan that pulls all rows from the existing database in to your target database.

Lets step through the process.

 

Step 1: Add a new Data Generation plan using the Data Generation Wizard.

image

Inside an existing database project, select Add… New Item… and select the “Data Generation Wizard” option. In this example I am using the good old trusted pubs database as my sample database.

 

Step 2: Launch the Data Generation Wizard

Once you selected the Data Generation Wizard, you will get the main page of the wizard. Above you see the default state of the wizard.

 

Step 3: Select your source database

The next step is to select the database that you want to pull the data from, in this example I am pulling from the pubs sample database. If you have not configured a connection to the specific target database, you can create one using the New Connection… option or you can change existing connection definitions using the Edit Connection… option.

Once you select the database there are a couple of options you can select or set, lets walk through these:

  • Set the number of rows to generate…
    This option will configure the “Rows to Insert” value of data generation plan, to exactly the number of rows that are inside the source database.
  • Replace the foreign key generator with values form the target database…
    If you check this option the wizard will pull the foreign key values from the source database, instead of leveraging the foreign key data generator
  • Number of failed rows…
    This option configures the error threshold for the number of failures to be allowed before terminating the population of a table
  • Number of failed tables…
    This option configures the error threshold for the number of failures to be allowed before the terminating execution of the data generation plan

image

 

Step 4: Run the wizard

When you hit OK, you will see a Status window which will display the progress information of the creation and configuration of the data generation plan.

image

Below you see the status information from creating the data generation plan for the pubs database.

Retrieving data from pubs.
Data retrieval complete.
Starting merge of data generation plan with database schema.
Completed.
Resetting all target row counts to match target database
Setting table [dbo].[authors] to 23 rows
Setting table [dbo].[discounts] to 3 rows
Setting table [dbo].[employee] to 43 rows
Setting table [dbo].[jobs] to 14 rows
Setting table [dbo].[pub_info] to 8 rows
Setting table [dbo].[publishers] to 8 rows
Setting table [dbo].[roysched] to 86 rows
Setting table [dbo].[sales] to 21 rows
Setting table [dbo].[stores] to 6 rows
Setting table [dbo].[titleauthor] to 25 rows
Setting table [dbo].[titles] to 18 rows
Populating columns with Sequential databound generator
Configuring table dbo.authors to be databound
Configuring table dbo.discounts to be databound
Configuring table dbo.employee to be databound
Configuring table dbo.jobs to be databound
Configuring table dbo.pub_info to be databound
Configuring table dbo.publishers to be databound
Configuring table dbo.roysched to be databound
Configuring table dbo.sales to be databound
Configuring table dbo.stores to be databound
Configuring table dbo.titleauthor to be databound
Configuring table dbo.titles to be databound
Saving to file ‘c:\work\pubs\Data Generation Plans\DataTransformPlan1.dgen’
Completed

As you can conclude from the status output the wizard configures the data generation plan in two phases. First it configures the row counts and secondly in configures the generators for each column inside each table. If you did not select the option to propagate row counts the first phase will be skipped.

 

Step 5: The resulting data generation plan

Now that the wizard is finished, lets look at the results. First we will look at the data generation plan. We will see that the row counts are configured for each respective table.

image

Step 6: The resulting column configuration

If we look at the column configurations we see that in our example the foreign keys are generated using the foreign key generator and the other values are pulled from the database. Note that we are using the sequential databound generator, which support multi-instancing. This means that columns values are matched per row, the regular databound generator does not have this ability to correlate column values within a single row!

image

Step 7: Running the generation plan

Now that we have a data generation plan, lets runs it. We created a clean database from our project named pubs2 and we are going to generate straight in to this.

image

If you are using the real pubs database you will find that this will fail. For a good reason, because the employee_insupd trigger inside the schema that will cause a rollback if the job id does not have a certain value. If you are using the default setting, the plan for the employee table will look like this:

image 

The job id foreign key generator is not configured correctly to satisfy the employee_insupd trigger.

If you selected the option “Replace the foreign key generator with values form the target database…” you plan for the employee table looks like this:

image

Note that the job_id column is also pulled from the database, so when you used that option you will satisfy the trigger and the plan will just work!

 

Step 8: Showing the results

Now that we executed the plan, lets check the results, by opening up the T-SQL Editor and fire off a simple query “SELECT * FROM sales” to see if the content made it over.

image

Conclusion:

The Data Generation Wizard allows you to quickly configure a data generation plan that pulls all rows from an existing SQL Server database. The wizard will only allow you to pull information from an SQL Server database, because the schema of the objects inside the database has to match the table inside the database project.

We hope this new feature saves you time!

-GertD
“DataDude” Group Engineering Manager

Comments (12)

  1. Last night the Data Team released a new version of their Power Tools. Well Gert and the Gang have already

  2. Got VSTS 2008 Database Edition ? Then, you’ll love this: Earlier today, a colleague of mine notified

  3. The long-awaited power tools for VSTS 2008 DB edition have been released , which restore the sorely-missed

  4. Download details: Visual Studio 2008 Support for SQL Server 2008 VS2008에서 SQL Server 2008 11월 CTP를 접속할

  5. Download details: Visual Studio 2008 Support for SQL Server 2008 VS2008에서 SQL Server 2008 11월 CTP를 접속할

  6. Nous avons sorti, le 14 février dernier, la dernière version des PowerTools pour Database Edition version

  7. Nous avons sorti, le 14 février dernier, la dernière version des PowerTools pour Database Edition version

  8. Updated: Now that the Power Tools for VS2008 have been released , I recommend that you use the Sequential

  9. Con Diego necesitando generar valores aleatorios en una base de datos, en otra oportunidad estábamos

  10. Aymeric says:

    Hi Gert,

    First off, let me say I love VSDB. I don’t know how I lived without it prior to VS2005.

    Now, down to my question, I realize that a lot of the power tools features were implemented directly into GDR R2. But, one key feature that is still missing is the Data Generator Wizard, please give us a timeline of when the new power tools will be available (been waiting since January when the first "out-of-sync" GDR came out…)

    Thank you,

    Aymeric

  11. sathis says:

    Hello I am trying to execute the .dgen using the following

    DatabaseTestService.GenerateData(.dgenfile,true,""System.Data.SqlClient",connectionstring)

    but iam getting error "Failed to generate data in target database with data generation plan file" could you please provide me some insigth wehter i should setup any thing else apart from the one line of code. Initially i created a sample datageneration plan and executed and it workds fine and generates data inside all the tables. But when i try to run the .dgen file from my test where i need to run the .dgen at runtime i am having the issue.