Late last year, Microsoft Research, in partnership with Bing, Microsoft’s decision engine, introduced a private beta of Microsoft Web N-gram Services. The goal of Microsoft Web N-gram Services is to support research conducted using large data sets, and particularly to engage the academic community in data-driven research. This week, during the World Wide Web Conference (WWW2010), Microsoft Research and Bing will announce expanded access to the Microsoft Web N-gram Services beta to include professors and students at accredited colleges and universities worldwide.
The technologies included in Microsoft Web N-gram Services have been noted for their usefulness in writing applications for search, translation, and speech processing. One immediate scenario the technology makes possible is understanding misspelled words and ungrammatical sentences by drawing on the sheer volume of language data, for any natural language with a large amount of text published on the web. From a development perspective, this reduces the need for experts to develop grammars for every language; users who search or network on the Internet will be able to share information in free form and still be understood clearly. This is made possible by using the model's predictions to place the initial words of a query in context.
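To make the idea of letting data stand in for a hand-written grammar more concrete, here is a minimal sketch of a bigram model that picks the likelier of two candidate words given the word before it. It is only an illustration: the four-sentence corpus, the add-one smoothing, and the names `train_bigrams`, `log_prob`, and `best_candidate` are assumptions made for this example and are not part of Microsoft Web N-gram Services, whose models are built from web-scale data.

```python
from collections import Counter
import math

# Toy corpus standing in for web-scale text (an assumption for illustration).
CORPUS = [
    "the weather in seattle is rainy",
    "the weather in geneva is sunny",
    "whether or not it rains",
    "the weather report for today",
]


def train_bigrams(sentences):
    """Count single words and adjacent word pairs, with a start marker."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def log_prob(prev, word, unigrams, bigrams):
    """Log P(word | prev) with add-one smoothing so unseen pairs stay finite."""
    vocab_size = len(unigrams)
    return math.log((bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size))


def best_candidate(prev, candidates, unigrams, bigrams):
    """Return the candidate the model finds most likely after `prev`."""
    return max(candidates, key=lambda w: log_prob(prev, w, unigrams, bigrams))


UNIGRAMS, BIGRAMS = train_bigrams(CORPUS)

# A user types "the wether"; suppose a speller proposes two corrections.
print(best_candidate("the", ["weather", "whether"], UNIGRAMS, BIGRAMS))
```

With this toy corpus the script prints `weather`, because "the weather" occurs three times and "the whether" never does. No grammar rule was written; the preference comes entirely from counts, which is the property that lets the same approach carry over to any language with enough text.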
As the technology and corresponding development efforts advance, Microsoft Web N-gram Services is expected to support more accurate, consistent user experiences, such as helping people learn another language or search for information with queries that are spoken rather than typed.
Microsoft Web N-gram Services will be demonstrated in the Microsoft booth during WWW2010.
Call for Papers and Proposals
The evolution of Microsoft Web N-gram Services is the result of ongoing collaboration. If you’re passionate about advancing data-driven research, here are two upcoming opportunities to get involved:
- The National Science Foundation has issued a call for proposals regarding Computing in the Cloud. Microsoft Web N-gram Services is part of the offering made available.
- Paper submissions have been requested by the Programme Committee of the Microsoft Web N-gram Workshop, to be held during the 33rd annual ACM SIGIR conference. The workshop, set for July 23, 2010, in Geneva, Switzerland, will bring together a group of leaders in information retrieval and language modeling to discuss and debate the challenges in their fields and the ways in which language-modeling approaches might help address them. Submissions are due June 11, and authors will be notified June 28.
For those of you attending WWW2010, I look forward to meeting you. And for those of you planning to participate in the upcoming calls for papers and proposals, I’m eager to work with you.
Evelyne Viegas, senior research program manager, Microsoft Research