Comparison of a simple select statement in C-omega vs. ADO.Net

In a couple weeks, I'm doing a talk at a .NET users group in Ontario about . Cω is a cool research language from Microsoft Research that extends C# with direct support for data access (SQL and XML) and concurrency abstractions.  I’ve been planning on writing a number of blog entries about Cω, but I haven’t yet been able to make the time.  I was just working on the slides for my talk and thought I’d at least post one of my examples comparing simple database access with ADO.Net and Cω.  One important concept in efficiently building reliable and maintainable software is “discover errors as early (and cheaply) as possible”.  With that in mind, let’s look at some code (disclaimer: I’m by no means an ADO.Net expert).

 

Say we just want to get the names of all the employees who live in a specific city (in the sample Northwind database that comes with SQL Server).  The simplest approach has a number of well-known problems:

 

      SqlDataAdapter da = new SqlDataAdapter(

      "SELECT * FROM Employees WHERE City='"+city+"'", nwindConn );

DataSet ds = new DataSet();

      da.Fill(ds,"Employees");

      foreach (DataRow dr in ds.Tables["Employees"].Rows)

      {

      string name = dr["LastName"].ToString();

int id = (int)dr["EmployeeID"];

            Console.WriteLine( id + ": " + name);

      }

 

  • The argument ‘city’ is subject to a SQL injection attack
  • The data is weakly typed, so we have to cast to get what we want, which could fail at run-time
  • Several string literals (table and column names) are used which could result in run-time errors
  • The query is stored as a string providing no opportunity for validation before run-time

 

We prevent the injection attack by using SqlParameters, and use strongly typed data sets to prevent the problems with string literals and casting:

 

      SqlDataAdapter da = new SqlDataAdapter(

            "SELECT * FROM Employees WHERE City= @city", nwindConn );

      SqlParameter cityParam = da.SelectCommand.Parameters.Add("@city", SqlDbType.VarChar, 80);

      cityParam.Value = city;

      NorthwindDataSet ds = new NorthwindDataSet();

      da.Fill(ds, ds.Employees.TableName );

      foreach (NorthwindDataSet.EmployeesRow dr in ds.Employees.Rows)

      {

            string name = dr.LastName;

            int id = dr.EmployeeID;

            Console.WriteLine( id + ": " + name);

      }

 

This is a lot better (although people are often too lazy to write the extra code to use SqlParameters).  However, the SQL command is still stored as a string meaning we won’t find any errors in it until run-time, and the relationship between the shape of its output and the NorthwindDataSet is implicit and brittle.  We might want to put the SQL into a stored procedure:

 

CREATE PROCEDURE EmployeesForCity

      @City nvarchar(80)

AS

SELECT EmployeeID, LastName FROM Employees WHERE City = @City

 

And then write our code without explicit knowledge of the structure of the underlying database:

 

      SqlCommand cmd = new SqlCommand( "dbo.EmployeesForCity", nwindConn );

      cmd.CommandType = CommandType.StoredProcedure;

      SqlParameter cityParam = cmd.Parameters.Add("@city", SqlDbType.VarChar, 80);

      cityParam.Value = city;

      SqlDataAdapter da = new SqlDataAdapter( cmd );

      NorthwindDataSet ds = new NorthwindDataSet();

      da.Fill(ds, ds.EmployeesForCity.TableName );

      foreach (NorthwindDataSet.EmployeesForCityRow dr in ds.EmployeesForCity.Rows)

      {

            string name = dr.LastName;

            int id = dr.EmployeeID;

            Console.WriteLine( id + ": " + name);

      }

 

This is better; at least now our SQL statement can be checked before we run our app.  However, we still have some literal strings in our code, and we’re still going to get a run-time error if the stored procedure changes in some way (I don’t see any easy way to rebuild the XSD from the database schema in VS.Net 2003).  Plus, we’ve had to write significantly more code here.  At best, it feels like our code has a weak connection to the underlying database.  For small programs like this, it’s no big deal.  But for large data-intensive programs, people complain a lot about spending a lot of the time messing with plumbing code. 

 

Here’s the equivalent code in Cω:

 

      rows = select * from DB.Employees where City == city;

      foreach( row in rows )

      {

            string name = row.LastName.Value;

            int id = row.EmployeeID.Value;

            Console.WriteLine( id.ToString() + ": " + name);

      }

 

Note the following:

  • There is no knowledge embedded in string literals
  • The select statement is strongly typed, meaning that we know exactly at compile-time what the shape of the data will look like (intellisense even shows us), and what valid statements are.
  • We can embed local variables like “city” directly into our sql without worrying about injection attacks.
  • We can rely on type inference so we don’t have to explicitly specify the type of “rows” and “row”, even though it is strongly typed.
  • We can configure our build such that we’ll know at compile time if the database has changed in any way that would make the code fail. Of course the database could still be changed underneath our program resulting in a run-time error if we don’t recompile.

And if you are willing to abandon SQL syntax all together, you could equivalently write:

      DB.Employees [City==city].{

            Console.WriteLine( it.EmployeeID + ": " + it.LastName );

      };

How do you like that for concisely expressing what you mean?

[Update: Added entry with C# 3.0 / DLinq syntax for this comparison]