DQS is a Knowledge-Driven Data Quality Solution.
Knowledge is what every organization and its data users know about their data and want to use for validation and correction.
This knowledge can also be thought of as the Data Quality Policy.
In DQS this knowledge is stored as Data Domains in the DQS Knowledge Base (KB).
There are various methods to build or define knowledge in your KB:
- Manually – Entered directly into the system.
The business user creates the knowledge themselves.
- Discovery – Generate Knowledge from business sample data.
A built-in DQS activity that creates knowledge from samples of business data.
- Import – Bulk Knowledge Creation.
Import data directly into the KB from external data files.
- Reference Data – External Knowledge Sources.
Associate knowledge that is available from external domain-specific experts.
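
To make the idea of "knowledge stored as domains" a little more concrete, here is a minimal conceptual sketch in Python. It is not the DQS API: the `KnowledgeBase` and `Domain` classes and all of their methods are hypothetical names made up for illustration. It only models two of the methods above, manual entry and import, as operations on a set of known values.

```python
import csv
import io
from dataclasses import dataclass, field


@dataclass
class Domain:
    """Hypothetical stand-in for a data domain (not a DQS class)."""
    name: str
    valid_values: set = field(default_factory=set)
    corrections: dict = field(default_factory=dict)  # wrong value -> correct value

    def add_value(self, value):
        """Manual entry: a business user adds one known-good value."""
        self.valid_values.add(value)

    def add_correction(self, wrong, right):
        """Manual entry: record that `wrong` should be corrected to `right`."""
        self.valid_values.add(right)
        self.corrections[wrong] = right

    def import_values(self, csv_text, column):
        """Import: bulk-load values from one column of CSV text.

        Returns the number of values that were new to the domain.
        """
        added = 0
        for row in csv.DictReader(io.StringIO(csv_text)):
            value = row[column].strip()
            if value and value not in self.valid_values:
                self.valid_values.add(value)
                added += 1
        return added

    def check(self, value):
        """Return (status, output_value) for a single input value."""
        if value in self.valid_values:
            return ("valid", value)
        if value in self.corrections:
            return ("corrected", self.corrections[value])
        return ("unknown", value)


@dataclass
class KnowledgeBase:
    """Hypothetical container: a named collection of domains."""
    name: str
    domains: dict = field(default_factory=dict)

    def domain(self, name):
        """Fetch a domain by name, creating it on first use."""
        return self.domains.setdefault(name, Domain(name))


kb = KnowledgeBase("Customers")
country = kb.domain("Country")
country.add_value("United States")               # manual entry
country.add_correction("USA", "United States")   # manual entry
imported = country.import_values("Country\nCanada\nMexico\nCanada\n", "Country")

print(imported)
print(country.check("USA"))
print(country.check("Atlantis"))
```

In this toy model, a value the domain has never seen comes back as "unknown", which is exactly the kind of gap the Discovery and Reference Data methods are meant to help fill.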
So when it comes to building your first Knowledge Base (KB), DQS is a process-driven tool that will support you in the following steps:
- Identify the Data Source you would like to clean.
This will enable you to identify the KB, its domains and the relevant data sources.
For example: a My Customers data source would map to a Customers KB.
- Create a new KB.
To create a new KB, just select the “New Knowledge Base” action from the home screen.
You can then choose how to create it: start from scratch, create it from a different KB
(DQS is installed with a predefined, out-of-the-box KB called DQS Data that includes a number of common domains and values),
or import a previously created DQS file.
Choose the DQS Data KB, select “Next”, and congratulations – a new KB is born.
- Manage Domains.
Now you’re on the Domain Management page. Here you can delete the domains you don’t need and create new domains that do not appear in DQS Data.
Manage the knowledge for your domains: properties, values and rules, and attach reference data to them.
For details about this activity, please consult the DQS documentation, under DQS Knowledge Bases and Domains.
When you’re done, select “Finish”, followed by “Publish” and you’re ready to go.
- Knowledge Discovery.
To further enrich your KB, you can run “Knowledge Discovery”. This is a computer-assisted activity that will help you to validate/test/tune the content of your KB.
What you know about your data is not always actually true… It is recommended to choose a sample of your data and run Knowledge Discovery on it.
This will give you more insights into your data and will allow you to update/enhance the knowledge in your domains.
For details about this activity, please consult the Knowledge Discovery sections of the DQS documentation.
And always – remember to finish your activity and publish your work.
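
To give a feel for what a discovery pass over sample data can surface, here is a small Python sketch. It is an illustration only and does not describe how DQS implements Knowledge Discovery: the `discover` function is a hypothetical name, and it swaps in plain frequency counting plus the standard library's `difflib` string similarity as a stand-in for whatever DQS does internally. The `cutoff` of 0.8 is an arbitrary assumption.

```python
from collections import Counter
from difflib import get_close_matches


def discover(sample_values, known_values, cutoff=0.8):
    """Profile a sample and propose knowledge for a human to review.

    Returns a dict with three entries:
      known       - values already in the domain, with their counts
      new         - unseen values with no close known match, with counts
      suggestions - unseen values mapped to the closest known value
    """
    # Ignore empty and whitespace-only entries, then count each distinct value.
    counts = Counter(v.strip() for v in sample_values if v and v.strip())
    known, new, suggestions = {}, {}, {}
    for value, count in counts.items():
        if value in known_values:
            known[value] = count
            continue
        # Look for a likely misspelling of a value the domain already knows.
        match = get_close_matches(value, sorted(known_values), n=1, cutoff=cutoff)
        if match:
            suggestions[value] = match[0]
        else:
            new[value] = count
    return {"known": known, "new": new, "suggestions": suggestions}


sample = ["Canada", "Canda", "Mexico", "Canada", "Brazil", " ", "Mexico"]
report = discover(sample, {"Canada", "Mexico"})

print(report["known"])
print(report["suggestions"])
print(report["new"])
```

The point of the sketch is the review loop: values in `new` may be legitimate additions to the domain, and entries in `suggestions` are candidate corrections. Either way, a person confirms them before they become knowledge, which matches the "computer-assisted" nature of the activity.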
Congratulations – You have embarked on your Knowledge journey. Now go out there and cleanse some data!!!