Searching for Code Online

While drinking my much-improved coffee this morning and making the rounds through my RSS feeds, I happened upon a Slashdot post about Google offering a new “code search”.  It’s interesting stuff, especially how the FAQ mentions that they can’t guarantee that the code they return is actually under the license they label it with.  That makes it risky business to actually use any of the code you might find on there, and it definitely isn’t appropriate for developers to use to find code snippets or other code blocks that they might want to tie back into their apps.

Our team has been thinking about power toys or any other ways that we could help developers, especially developers new to the Microsoft platform, more readily find code snippets and sample applications that apply to what they are working on.  Searching through code with regular expressions and the like might be good if you want a code sample showing how to use a particular API or library, but it seems lousy for situations where you want an example of how to *do* something (such as the classic example we use on the forums all the time: “How do I find the difference between two DateTimes in C#?”).  Not to mention that regexes aren’t super-accessible for new developers who are just starting to code for the first time… 🙂
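To make that distinction concrete, here is a minimal Python sketch. It is not how Google's code search works; the tiny corpus, the file names, and the `search` helper are all invented for illustration. It runs one API-style query and one intent-style query over three C# fragments.

```python
import re

# Hypothetical mini-corpus: file names and contents are invented for illustration.
CORPUS = {
    "DateMath.cs": "TimeSpan gap = end - start;  // end and start are DateTime values",
    "Report.cs": "// difference between two dates, in days\nint days = (end - start).Days;",
    "Logger.cs": "Console.WriteLine(DateTime.Now);",
}

def search(pattern):
    """Return the sorted names of files whose text matches the regex."""
    regex = re.compile(pattern, re.IGNORECASE)
    return sorted(name for name, text in CORPUS.items() if regex.search(text))

# API-style query: works if you already know the type you are looking for.
api_hits = search(r"\bTimeSpan\b")

# Intent-style query: only matches files whose comments use the same words.
intent_hits = search(r"difference between")

print(api_hits)
print(intent_hits)
```

In this sketch the API query finds only `DateMath.cs`, and the intent query finds only `Report.cs`, because that is the one file whose comment happens to phrase the problem the way the searcher did. `DateMath.cs` answers the forum question just as well, but a text match on intent never sees it.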

The other flaw is that the only way people could contribute and get added to these search results is to have a project in some publicly accessible source control–this really isn’t a “sharing code” site…it’s a “grabbing code anonymously” engine.

What do you think?  What would be the ideal way you would want to find code samples and snippets on the Internet?

Comments (2)

  1. Peter Ritchie says:

    Actually, you kind of hit upon the problem.  Finding code on the Web is easy.  Finding code that 1) includes enough comments/documentation to get indexed, 2) documents what the code *actually* does, 3) does what the author *thinks* it does, 4) actually works, and 5) can be re-used is the hard part.

    It’s more like a conundrum: you really have to be able to write enough of the code to find the code you want to use.  At which point, well, you know…

    There’s been lots of research put into areas like facial recognition, or loosely describing or filtering by the type of image, to the point where searching for images on the Web is reasonably useful.  Even then, I don’t find that image-only searches turn up a statistically significant quantity of images.  It’s much the same with code: indexing images depends on the context around the image.  If there isn’t enough context around the image that matches what you’re searching for, the image isn’t in the results, despite being relevant.

    I guess what someone needs to come up with are unwizards–not code that generates code, but code that recognizes the intent of code.  DSQL: Domain Specific Query Language?

  2. orcmid says:

    Because they will also search within archive files (.zip, .tar.gz, etc.), that sounds like a convenient way to package code with documentation and everything else that might be needed.

    Public access to a source-control server is also kinda nifty for finding code.

    I don’t think those are the barriers (though it would be nice if CodePlex supported something besides team server version control, ya know, if only a web interface for lightweight access), but zipping the builds would work there too.

    What I’m wondering about is the tagging that helps Google and any other search get it right: source code, platform, all of those things that one might want to use to narrow down to acceptable candidates. It would also be great if it was a flex off of something like Windows Live/Desktop Search, rather than one more different search engine to use.