A LINQ provider for Web queries


To start a series of "LINQ provider" posts, I am uploading today a provider sample that, in some sense, treats the Internet as a database. For a SQL Server database, you make tables accessible to LINQ by writing classes with attributes that define how objects of these classes are retrieved from rows in tables; LINQ can then use these classes to issue queries against the database. Similarly, this provider lets you add attributes to classes to specify how their objects are retrieved from Web pages, and you can then issue LINQ queries against them.


The project "WebLinq" in the attached solution contains this provider. It is not very sophisticated and consists of just three files:
- WebLinqAttributes.cs contains the attributes that are recognized
- WebContext.cs contains the class your WebLinq-enabled classes inherit from
- Utils.cs contains helper functions to GET / POST to a web site and to find substrings in a text


The project "WebSources" defines some classes for 
- Searching for articles in the CiteSeer web sites (see below)
- Searching for articles in the MSDN web sites
- Translating words / sentences
- Integrating functions of one variable
- Looking up the current values of stocks from the company symbol


The project "SimpleDemos" uses these two DLLs to demonstrate the last three classes.

The project "TestWebLinq" demonstrates the access to the CiteSeer web sites.


CiteSeer is a database of computer science articles; you can search for articles by keyword, obtain information about them, and often even retrieve them directly from the Web site.
To use the CiteSeer demo, enter for example "Support Vector Machines" in the text box labeled "Search terms" and click the "Retrieve" button. It takes a while to visit the web pages that list the available articles, visit the web page for each article, retrieve the information about that article, and access another web page for details, but then you should see a list of paragraphs, each containing:
- Author's name(s)
- Title and year
- Some three lines of introduction
- URL for this article
- URL for downloading the article as pdf file
- Information about the rights for this article


If you are only interested in new articles, enter 2002 in the "Publication year >=" text field and click "Retrieve" again (currently I get 3 results back).

Here is how the corresponding query looks in the code:

var doc = new GoogleCiteSeer(searchTerms, 0);
var query = from art in doc.Articles
            where art.details.Document != null
               && art.details.Document.bibtex != null
               && art.details.Document.bibtex.year >= minYear
            select art.details;
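
To see why the two null checks come before the year comparison, here is a minimal sketch that runs the same filter over plain in-memory objects with LINQ to Objects. The classes ArticleInfo, DetailsInfo, DocumentInfo and BibTexInfo, the label field, and the sample values are hypothetical stand-ins whose shape is only inferred from the query above; they are not the classes from the WebSources project, and no Web page is fetched here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins; member names mirror the query above, nothing more.
public class BibTexInfo { public int year; }
public class DocumentInfo { public BibTexInfo bibtex; }
public class DetailsInfo { public DocumentInfo Document; public string label; }
public class ArticleInfo { public DetailsInfo details; }

public static class QuerySketch
{
    // Same filter as in the post, applied to an in-memory sequence.
    // && short-circuits, so bibtex.year is only read after both
    // null checks have passed.
    public static List<DetailsInfo> NewerThan(IEnumerable<ArticleInfo> articles, int minYear)
    {
        var query = from art in articles
                    where art.details.Document != null
                       && art.details.Document.bibtex != null
                       && art.details.Document.bibtex.year >= minYear
                    select art.details;
        return query.ToList();
    }

    // Made-up sample data: one article without a document, one without
    // BibTeX information, one old article and one recent article.
    public static List<ArticleInfo> SampleArticles()
    {
        return new List<ArticleInfo>
        {
            new ArticleInfo { details = new DetailsInfo { label = "no document" } },
            new ArticleInfo { details = new DetailsInfo { label = "no bibtex",
                Document = new DocumentInfo() } },
            new ArticleInfo { details = new DetailsInfo { label = "old",
                Document = new DocumentInfo { bibtex = new BibTexInfo { year = 1998 } } } },
            new ArticleInfo { details = new DetailsInfo { label = "recent",
                Document = new DocumentInfo { bibtex = new BibTexInfo { year = 2004 } } } },
        };
    }
}
```

Calling QuerySketch.NewerThan(QuerySketch.SampleArticles(), 2002) keeps only the entry labeled "recent"; the entries with missing parts are skipped rather than causing a NullReferenceException.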


Here is an example of a class that defines how to read the "BibTeX" part of the Web page with details for an article:


public class CsBibTex {
  [StartPart("author = \"")] [EndPart("\"")] public string author;
  [StartPart("title = \"")]  [EndPart("\"")] public string title;
  [StartPart("year = ")]     [EndPart(",")]  public int year;
}
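
To give a rough idea of how such attributes can drive the extraction, here is a minimal sketch that reads them via reflection and cuts the text between the start and end markers. The real implementation is in WebLinqAttributes.cs, WebContext.cs and Utils.cs in the attached solution and may work quite differently; the attribute definitions, the Marker property and the PartReader class below are assumptions made up for this illustration.

```csharp
using System;
using System.Globalization;
using System.Reflection;

// Hypothetical re-creations of the two attributes, not the ones shipped in WebLinq.
[AttributeUsage(AttributeTargets.Field)]
public sealed class StartPartAttribute : Attribute
{
    public string Marker { get; }
    public StartPartAttribute(string marker) { Marker = marker; }
}

[AttributeUsage(AttributeTargets.Field)]
public sealed class EndPartAttribute : Attribute
{
    public string Marker { get; }
    public EndPartAttribute(string marker) { Marker = marker; }
}

public class CsBibTex
{
    [StartPart("author = \"")] [EndPart("\"")] public string author;
    [StartPart("title = \"")]  [EndPart("\"")] public string title;
    [StartPart("year = ")]     [EndPart(",")]  public int year;
}

// Hypothetical helper: fills each attributed public field with the text
// found between its start marker and the next end marker.
public static class PartReader
{
    public static T Read<T>(string text) where T : new()
    {
        object result = new T();  // boxed, so SetValue also works for structs
        foreach (FieldInfo field in typeof(T).GetFields())
        {
            var start = field.GetCustomAttribute<StartPartAttribute>();
            var end = field.GetCustomAttribute<EndPartAttribute>();
            if (start == null || end == null) continue;  // field is not mapped

            int from = text.IndexOf(start.Marker, StringComparison.Ordinal);
            if (from < 0) continue;                      // start marker missing
            from += start.Marker.Length;
            int to = text.IndexOf(end.Marker, from, StringComparison.Ordinal);
            if (to < 0) continue;                        // end marker missing

            string raw = text.Substring(from, to - from).Trim();
            // Converts "2003" to int for the year field; throws a
            // FormatException if the text does not fit the field type.
            field.SetValue(result,
                Convert.ChangeType(raw, field.FieldType, CultureInfo.InvariantCulture));
        }
        return (T)result;
    }
}
```

With a made-up input such as author = "Jane Doe", title = "An Example Title", year = 2003, the call PartReader.Read<CsBibTex>(text) fills the three fields from the text between the markers. Fields whose markers are not found keep their default values, which is one reason a query over such objects needs null checks.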


This sample code is provided as-is and does not come with any warranty.
You can modify and use the code for commercial and non-commercial purposes.

WebLinq.zip
