Rapid “Data Collection” Solutions – Part 2: designing and implementing a data model

I refer to the business problem already described in the post “Rapid “Data Collection” Solutions – Part 1: defining the business problem”. In this series of posts, I am considering the options available for building a quick and zero-code solution based on a SharePoint web environment.

In this post I will consider the problem of preparing the “storage” for the data that need to be collected. This can be a very trivial problem (…and often it is!): a simple SharePoint list created in 2 minutes can be adequate to the need. In this post I will consider a more complicated scenario, where the data to be collected has relations with other data stored inside or outside the SharePoint environment; I will consider the options for representing those relations in the SharePoint environment.

Let us imagine that we have already designed a "logical data model" describing the entities and the relations for the data that we need to deal with. It may happen that this data model contains:

· some data that is already shared on a custom DB;

· some data that is already shared on a SharePoint site;

· some data that is already available in some kind of other off-line formats (e.g. in Excel files).

As an example, I will consider the logical data model shown in the following figure:

Data Model in the Example

I imagine that:

· the set of departments and company is already available (for example in Excel files) and needs only to be written into SharePoint lists;

· a set of addresses is already available in a SharePoint list; this set also contains the addresses for the considered contacts;

· some other information concerning the contacts is already available in an external DB;

· there is the need to collect the list of contacts with other specific information.

It’s quite easy to implement a data model with the tools provided by SharePoint:

· The entities can be defined in SharePoint as “Content Types” (explicitly or implicitly) associated to “Lists

· The relations between entities can be defined using the “Lookup Columns”.

· The external data (data stored outside SharePoint, e.g. in a custom DB) can be seen and referenced in SharePoint 2010 using “External Content Types” and “External Lists” (in SharePoint 2007 the same concepts are named “BDC Entities” and “Business Data Lists”)

Here below I show how the previous data model can be represented in SharePoint.

  Data Model Base Implementation

The following figure shows some sample data for the considered lists:

Sample Data in the Data Model Implementation

Was it easy? Of course it was… Unfortunately, this simple approach leaves some “usability holes”: often the business users like to have a one-shot view of all the related data; more over often they like to export this data using Excel or to filter the data directly in a SharePoint data sheet view. With the simple implementation shown in the previous picture, a business user exporting a spreadsheet from the “Contacts” list will not have the possibility to see a complete set of information of the related data. For example: the “Company name” field will not be available into the “Contacts” list view, neither it will be available in any Excel spreadsheet exported from the “Contacts” list.

Let us imagine that the business users need the following view on the “Contacts” data:

Desired View

So: how to give a “flexible” and “powerful” view showing all the related columns of the Contacts? I see a few approaches for a zero-code solution.

The first approach consists in accessing the SharePoint data with Microsoft Access 2007/2010, using the “Open with Access” menu command available in every SharePoint List and then selecting the option to create “Linked Tables”. With this option, the resulting Access DB has already all the related tables (mapping to the SharePoint Lists), so it is possible to create a report that contains all the desired columns from the related entities. This approach is the simplest and the “cleanest” one; unfortunately, it happens quite often that not all the users have the possibility and/or the skills to use Access (yes, in my experience I have seen that Excel is the only tool “for every business user”…).

The second approach consists in creating a new ASPX page and adding to it a “Data View Web Part” using SharePoint Designer; this web part can be configured in order to show data from related lists. Unfortunately, with this approach the users browsing this custom page will not have all the configuration flexibility and power of a standard “List View”: they cannot add/remove/move columns, they cannot add filtering, ordering or grouping conditions and – probably the most important lack – they have not the option to export the data into an Excel workbook.

A third approach consists in using the new great Access Services of SharePoint 2010 in order to publish reports designed in Access 2010; these reports are automatically published in SharePoint as RDL (through SQL Server Reporting Services). I still have a poor knowledge about this functionality; AFAIK, with this approach the users maintain the option to export the data in their favorite format (for example, in Excel) but they still miss the possibility to configure a custom view.

A fourth approach consist in adding additional (redundant) columns to the same SharePoint List where this columns need to be shown in a view. This approach is the worst one in terms of “architectural cleanness” for the resulting solution but it also completely solves the problem of showing the related data in a flexible and powerful way.

In the rest of this post I will consider the option to introduce “redundant columns” to the SharePoint List.

The following figure shows a possible SharePoint implementation for the considered example, where the List “Contacts” is configured with the redundant columns needed for the desired view.

Data Model Implementation with Redundant Columns

In the previous picture, some of the additional columns are marked as “projected columns”; I refer to the new functionality of SharePoint 2010 that allows adding columns whose value is automatically tied to the value chosen in a specific “lookup column” of the same list; the value of the projected column is simply the value of another column in the target list pointed by the tied lookup column. The projected columns will not need to be directly filled by the user: in the example shown above, when the department is selected by the user, the related department director is automatically shown into the contacts list. When the data collection should be organized on a SharePoint 2007 environment, there is no option to create projected columns: all the “additional columns” need to be created as “lookup columns”.

The choice to add redundant columns to a SharePoint List opens a new problem: how to automatically set the value in the redundant columns? It is not practical to ask all this information to the user: no one likes to insert redundant data! Moreover, asking this data to the user could be error-prone. There is the need to ask the user a minimal set of information.

The completion of all the redundant data can be automated using a declarative workflow designed and published with SharePoint Designer. This “data-completion workflow” should automatically start every time a user creates or updates an item in the considered list.