LearnHub: Powered by Rails, Searches with Bing

Introducing LearnHub

LearnHub's home page.

If you’re a student applying to colleges and universities and are looking for help with the process, you should try LearnHub. Based in Toronto, LearnHub is a social learning network that helps students to prepare for standardized tests, assists with finding places to study abroad and provides career counseling. LearnHub’s site has hundreds of thousands of pages of free content, including the world's largest bank of questions that appear in the GMAT and SAT standardized tests. The site has a large following among students worldwide, particularly in India, and has partnerships with 25 universities to recruit domestic and international students.


With those hundreds of thousands of pages, LearnHub needed to provide a way for students to find what they’re looking for. They provide a search function, and it’s powered by Bing.

The people at LearnHub are part of that sector of Toronto tech that’s into Ruby on Rails, open source and founding startups. Founders John Philip Green and Malgosia Green are a husband-and-wife team who are known for building web applications for education and have been active members of Toronto’s tightly-knit open source tech community since the earliest DemoCamps. John caught Rails fever after trying it out and decided to rewrite a major application using it. The core development team of Wesley Moxam, Carsten Nielsen and Libin Pan are fixtures of Toronto’s Ruby on Rails scene; a gathering of local Rubyists doesn’t feel complete without them.

So what are they doing, using Bing?

The main room at LearnHub’s offices. Management are to the left, developers to the right.

In the beginning, they went with their first instinct, which was to use Google. “We launched in March 2008,” said co-founder John Philip Green, “and we needed to provide site-wide search, so we went with Google. We signed up, and for a few hundred bucks a year, we got a search function that covered about 5,000 pages. It seemed like a pretty big number, and we thought that would be more than enough to cover our site.”

They soon found that the results weren’t what they expected. “We weren’t getting good results. We’d use our site-wide search to search for something that we knew was in our site, and it wouldn’t show up in the results.” The same search would work just fine if you did it from Google.com, but not from their Google-powered search function. “The results just weren’t relevant, and we also had a limited number of queries,” John said.

The main room at LearnHub’s offices. That’s management in the foreground, developers in the back.

LearnHub’s page count quickly grew beyond the 5,000 pages covered by their arrangement with Google. “Going up to a bigger package was expensive,” John said. “It would have cost a couple thousand for 50,000 pages, and we were already at hundreds of thousands.”

“We could’ve gotten the functionality for free, but that’s only an option when you show ads in the search results, and the ads that showed up were for our competitors.”

LearnHub's sales team.

There was another problem: Google’s site search returned its results as a web page. In order to make LearnHub’s site-wide search’s results page have the same look and feel as the rest of the site, they had to stick the Google results in an iframe. “And even then, what was inside the iframe didn’t match the rest of the page,” added John.

They started looking at other options for implementing LearnHub’s site-wide search, including running their own spider. “We really didn’t want to do that,” said programmer Wesley Moxam.

Enter Bing

LearnHub developer Wesley Moxam.

While looking around at search options, Wesley found the Live Search API, which is now known as the Bing API. “It was free, well-designed and spits out JSON,” he said. “Google requires a JavaScript interface or SOAP, and SOAP libraries in Ruby are painful.”

“It took a day to implement and get it up and running,” said Wesley. “The entire switch-over project happened over three days, with us working on it on and off, while we were doing other tasks. Best of all, we get consistent results – the results from the API are the same results you’d get if you just used the Bing site.”

“Bing’s API is simple and straightforward. You call it, you get the results, you take those results and use them how you like,” he continued. “It’s good. It’s hard to explain good software; good software is inherently simple.”

Here’s a screenshot of a LearnHub search results page for the search term “accordion” – and yes, the word appears on a handful of LearnHub pages!

LearnHub’s search results page for the term “accordion”.

LearnHub have benefited from using Bing to power their site-wide search, and they’ve decided to share the wealth. Wesley is refactoring the Ruby library he wrote as a wrapper for the Bing API and plans to open-source it for anyone to use. It should be available later this summer. He’ll announce it when it’s released, and I’ll announce it here.

The Bing API


It’s easy to harness the power of Bing in your applications, whether for desktop, web or mobile.

The first step is to get an AppID, which is a string that uniquely identifies you as a registered Bing application developer. Go to the Bing Developer Center, sign in with your Windows Live ID (which you can get for free) and follow the link to create a new AppID. You’ll be asked to supply some very basic information about your application and to review the Bing API’s Terms of Use. If you provide the information and agree to the Terms of Use (which I summarize in plain English below), you'll get an AppID.

Once you have an AppID, you can start experimenting right away with the Bing API. All you need to do is start typing URLs with the format below into your browser’s address bar:

 https://api.search.live.net/xml.aspx?AppID=<AppID>&query=<SearchTerms>&sources=<SourceTypes>

where:

  • <AppID> is the AppID assigned to you
  • <SearchTerms> are your URL-encoded search terms
  • <SourceTypes> specifies the type(s) of search results you want. The different source types are explained in the table below:

| SourceType | Description | Example Search Terms |
|---|---|---|
| Web | Searches for web content | accordion – returns web pages containing the term “accordion” |
| Image | Searches for images on the web | accordion – returns images of accordions |
| News | Searches news stories | accordion – returns news articles about accordions |
| InstantAnswer | Searches Encarta online | what is an accordion – returns the definition of “accordion”; convert 1.6 kilometres to miles – returns “0.9941939 miles”; sin(30 degrees) – returns “0.5” |
| Spell | Searches Encarta Dictionary for spelling suggestions | accordian – returns “accordion” |
| Phonebook | Searches phonebook entries | accordions in Toronto – returns location results for “accordions in Toronto” |
| RelatedSearch | Returns query strings most similar to yours | accordion – returns results like “{piano accordion; button accordion; accordion store}” |
| Ad | Returns advertisements to incorporate with results (use this to make money with your Bing-powered application) | accordion – returns ads relevant to the keyword “accordion” |
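
The URL format above can also be assembled in code instead of typed by hand. Here's a minimal Python sketch, assuming only the endpoint and the three parameters described earlier. The helper name `build_bing_url` and the placeholder AppID are hypothetical and not part of the Bing API, and the sketch only builds the URL string; it doesn't send a request.

```python
from urllib.parse import urlencode

# Endpoint taken from the URL format shown in this article.
BING_XML_ENDPOINT = "https://api.search.live.net/xml.aspx"


def build_bing_url(app_id, search_terms, source_type):
    """Build a request URL (hypothetical helper, not part of the Bing API)."""
    params = {
        "AppID": app_id,
        "query": search_terms,   # urlencode() handles the URL-encoding
        "sources": source_type,  # one SourceType, e.g. "Web" or "Image"
    }
    return BING_XML_ENDPOINT + "?" + urlencode(params)


# "YOUR_APP_ID" is a placeholder; substitute the AppID assigned to you.
print(build_bing_url("YOUR_APP_ID", "accordions in Toronto", "Phonebook"))
```

Letting `urlencode` do the escaping means search terms with spaces or punctuation don't need to be encoded by hand; spaces come out as `+` in the query string.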

 

The default format for results is XML, and that’s the format you get when typing in API calls in your browser. You can also have the results returned as JSON or SOAP if you prefer.

You can find out more about the Bing API in the Bing API section of MSDN.

Bing’s Terms of Use, Explained as Simply as Possible

Here’s a quick explanation of Bing’s Terms of Use for those of us without a law degree. It’s adapted from the Bing documentation and provides a quick summary of what application developers using the Bing API must do and cannot do (besides the obvious "I promise not to use the API to plan a terrorist attack, run a drug smuggling ring or help the band Nickelback take forceful despotic rule of planet Earth").

What you must do:

  • You must display all the results you request. No filtering!

  • You must display your results in the context of a user-facing application or website.

  • You must display attribution to Bing in a manner compliant with Bing’s branding rules. Currently, you may determine the specific manner in which you display attribution. A link to https://www.live.com with the query echo is a suggested example.

  • You must restrict your usage to fewer than 7 queries per second per IP address. You may be permitted to exceed this limit under some conditions, but this must be approved through discussion with the folks at api_tou@microsoft.com.

  • If you interleave data from any source other than the API with data from the API, you must clearly differentiate the respective sources. (Yes, you can interleave Bing results with other data!)
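
One way to stay under the queries-per-second limit in the list above is to throttle requests on the client side. The sketch below is one possible approach, not anything the Bing API provides: the class name `QueryThrottle` is hypothetical, and it assumes that allowing at most 6 queries in any one-second window is a safe reading of "fewer than 7 per second". Note that the limit is per IP address, while this throttle only covers a single process, so several processes behind one IP would need to share it.

```python
import time
from collections import deque


class QueryThrottle:
    """Hypothetical client-side throttle: at most max_per_second calls
    in any rolling one-second window."""

    def __init__(self, max_per_second=6, clock=time.monotonic, sleep=time.sleep):
        self.max_per_second = max_per_second
        self._clock = clock    # injectable so the throttle can be tested
        self._sleep = sleep
        self._sent = deque()   # timestamps of calls in the last second

    def wait_for_slot(self):
        """Block until another query may be sent, then record it."""
        while True:
            now = self._clock()
            # Forget calls that are at least one second old.
            while self._sent and now - self._sent[0] >= 1.0:
                self._sent.popleft()
            if len(self._sent) < self.max_per_second:
                self._sent.append(now)
                return
            # Window is full: wait until the oldest call ages out.
            self._sleep(1.0 - (now - self._sent[0]))
```

Call `wait_for_slot()` immediately before each API request; the first six calls in a burst return at once, and the seventh waits until the window has room.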

What you cannot do:

  • You cannot use API results for search engine optimization (SEO). In particular, using the API for rank checks is explicitly prohibited.
  • You cannot display advertisements in positions other than the mainline and sidebar.
  • You cannot change the order of the results the API returns from a SourceType other than Web. (In other words, you can re-order results from standard searches for web pages!)
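
The ordering rule is easy to get wrong in code that handles several SourceTypes at once, so here's a small Python sketch of one way to enforce it. The function name `order_for_display` and the shape of the result dictionaries (a `"title"` key) are assumptions made for illustration; they aren't the Bing API's actual response format.

```python
def order_for_display(source_type, results, sort_key=None):
    """Return results in display order (hypothetical helper).

    Only Web results may be re-sorted; results from every other
    SourceType keep the order the API returned them in.
    """
    if source_type == "Web" and sort_key is not None:
        return sorted(results, key=sort_key)
    return list(results)


# Made-up sample data standing in for parsed API results.
sample = [{"title": "Button accordion"}, {"title": "Accordion basics"}]

by_title = lambda result: result["title"]
web_order = order_for_display("Web", sample, sort_key=by_title)    # re-sorted
news_order = order_for_display("News", sample, sort_key=by_title)  # unchanged
```

Routing every result list through a single function like this keeps the "Web only" exception in one place, rather than scattered across the display code.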

Bing Your Apps!

From there, the sky’s the limit. The Bing API is straightforward and easy to use, and it costs nothing. Speaking as someone who’s been using Bing as his default search engine since its beta period, I can say the results it provides are great. Go forth and Bing your apps!