The LINQ Effect: Unifying and Extending Data Access

By Matt Duffin.  

 

So, you’re creating an application… no surprise there. As part of that application, you need to retrieve data from a database, read configuration settings from an XML document and manipulate in-memory data structures. All of these three tasks are fairly standard in any application, and many apps exist that mix and match one, two, or all three of these techniques during their routine activities.

During this article, I’m going to take the route of cliché and stick with possibly the most-implemented class in history: ‘customer’. I’ve got six of them, and they live in Reading, Birmingham, Manchester, London, Reading and London respectively. Consider retrieving a list of cities your customers reside in without duplicates – Birmingham, Manchester, London and Reading – using each of those three backing sources.

Click through to the main article to find out how LINQ, an extension to the .NET Framework, can help you query data from disparate sources in a more flexible, powerful and unified way.

Getting that from the database might look something as follows:

        private List<string>GetCustomerTownListFromDatabase()

        {

            List<string> result = new List<string>();

            using (SqlConnection conn =

                new SqlConnection("Connection String"))

            {

                SqlCommand query = new SqlCommand(

                    "SELECT DISTINCT City FROM Customer",

                    conn);

         

                conn.Open();

                using (SqlDataReader data = query.ExecuteReader())

                {

                    while (data.Read())

                    {

                        result.Add(data.GetString(0));

                    }

                    data.Close();

                }

                conn.Close();

            }

            return result;

        }

Here, we make a connection to the database and execute an SQL query against it (ensuring to use the ‘DISTINCT’ keyword to ensure no duplicates), which returns a SqlDataReader. We then read through each row in the reader and add the string value of the first column (which contains the city: it’s the only column returned) to a temporary list. After reading has finished, the reader and connection are closed and the resulting list is returned. Great.

Now, let’s take a look at the same process in XML:

        private List<string>GetCustomerTownListFromXml()

        {

            List<string> result = new List<string>();

            string myXML = "XML Text";

            XmlDocument customers = new XmlDocument();

            customers.Load(new System.IO.StringReader(myXML));

            foreach (XmlElement customer in customers.GetElementsByTagName("customer"))

            {

                string city = customer.GetElementsByTagName("city")[0].InnerText;

                if (!result.Contains(city))

                {

                    result.Add(city);

                }

            }

            return result;

        }

An XML document is constructed from a string literal. Next we iterate over each customer, and the inner text of the node representing the city is retrieved. Unlike with the database, extra logic is added to ensure that the current city does not already exist in the list before it is added. Finally, the list is returned.

Finally, the same process applied to an in-memory collection. Because ‘foreach’ requires the implementation of IEnumerable, this function assumes the existence of a customer class, a customer collection class called ‘Customers’, which implements IEnumerable, and a corresponding IEnumerator class:

        private List<string>GetCustomerTownListFromMemory()

        {

            List<string> result = new List<string>();

            Customers customers = new Customers(new Customer[6]

            {

                new Customer(1, "Reading"),

    ...

            });

            foreach (Customer customer in customers)

            {

                if (!result.Contains(customer.City))

                {

                    result.Add(customer.City);

                }

            }

            return result;

        }

First, the customers are constructed and placed into a collection. Next each customer is iterated over, and the ‘City’ property representing the place is retrieved. Again, extra logic is added to ensure that the current city does not already exist in the list before it is added. Finally, the list is returned.

So these three code samples allow us to retrieve the same list of unique customer cities from a database, an XML document and an in-memory data structure. However, what’s wrong with these code samples? (Apart from the [purposeful] construction of an XML document from a string literal, the hard coding of a connection string, etc.; ignore these for a moment.)

Object-oriented principles have become increasingly popular and concepts such as classes, properties and methods have become an integral part of the programming experience for many. Although the XML and in-memory examples appear quite similar, the differences are already clear to see and it is apparent that the method of retrieving data from the two sources will diverge quickly with more complicated data. Plus, the database example differs much from the other two.

The problem occurs when the familiarity and efficiency of object-oriented data access is lost when accessing non-OO sources, the most prevalent of which today are relational and XML-based data. Learning different methods to get at your data can be a chore. Wouldn’t it be nice if there were a method of uniting these disparate methods of data access?

This is where LINQ steps in! But what is it? The LINQ homepage tells us, “the LINQ Project is a codename for a set of extensions to the Microsoft® .NET Framework that encompass language-integrated query, set, and transform operations. It extends C# and Visual Basic with native language syntax for queries and provides class libraries to take advantage of these capabilities.” Sounds impressive, but what does it mean?

It’s a technology that is added on to the .NET Framework and extends C# and Visual Basic – all the features you’re comfortable with, such as IntelliSense are included (hence the ‘language integrated’ part) - with features that unify and extend data access techniques across disparate sources. As a quick example, let’s take the preceding code and give it a LINQ overhaul.

(The code samples below are based on the LINQ CTP (May 2006); its features may not represent those of the final version.) The in-memory Customer class already implements IEnumerable, and so is compatible with LINQ. The code now becomes:

        private List<string>GetCustomerTownListFromMemory()

        {

            List<string> result = new List<string>();

            Customers customers = new Customers(new Customer[6]

            {

                new Customer(1, "Reading"),

    ...

            });

            var result = customers.Select(c => c.City).Distinct<string>();

            return result;

        }

Several powerful features of LINQ are in play here – a query expression performs the actual interrogation of the data, lambda expressions allow the expression of intent in a concise manner and anonymous types allow the compiler to infer the result of the query expression. For more information about these features, see the resources at the end of this article.

So let’s take a look at database-based data access:

        [Table(Name="Customer")]

        public class LINQCustomer {

          [Column(DBType="int not null", Id=true)]

          public int CustomerID;

          [Column(DBType="varchar(255) not null")]

          public string City;

        }

        private List<string> GetCustomerTownListFromDatabaseLinq()

        {

            DataContext context = new DataContext(@"Data Source=acdevsvr\SQL_2005;Initial Catalog=LINQTest;Integrated Security=True");

            Table<LINQCustomer> customers = context.GetTable<LINQCustomer>();

            var result = customers.Select(c => c.City).Distinct<string>();

            return result;

        }

That looks a little more complicated! But we’re telling the Framework about the database in a much more object-oriented manner, including creating a class representing the customer table decorated using attributes that enable LINQ to query it (a related technology, SqlMetal can automatically generate these classes for you). Then, when it comes to querying the data, exactly the same method is used as with the in-memory collection! (Unfortunately, LINQ can’t disappear the necessity of a connection string, but hopefully the impact isn’t diminished.)

You’ll find that a similar technique is available for querying an XML document, but that’s left as an exercise for the reader.

I hope that this has given you a taste for the possibilities and potential of the LINQ project. The usefulness of unifying and extending data access across domains is tantalising and this article merely scratches the surface (well, as much as a slither of talc article would scratch a towering diamond LINQ block).

I strongly urge you to take some time out to take a look at this technology. There’s a wealth of material available, both at a high level for those interested architecturally in its features and in extremely in-depth dives for, on the LINQ homepage (https://msdn.microsoft.com/data/ref/linq/). One of the best resources for LINQ, and what caused it to ‘click’ for me was an interview with Anders Hejlsberg and Luca Bolognese on Channel9, available at https://channel9.msdn.com/showpost.aspx?postid=114680. For an in-depth look at how the querying and lambda-expression functionality works at a lower level, visit https://blogs.gotdotnet.com/mattwar/archive/2006/05/10/594966.aspx. And finally, for those interested in SqlMetal, a good introduction is available at https://davidhayden.com/blog/dave/archive/2006/05/18/2947.aspx.

I wish you the best of luck exploring LINQ. As always, if you have any questions, please contact the Academic Team at ukacinfo@microsoft.com.