Research Life Science Projects on the Windows Azure Cloud (Part 1)

Over the last three years we have been helping researchers use the Windows Azure cloud platform to host scientific research projects. We just launched the new Azure4Research program, and a new collection of projects is coming on board. What has been amazing to me is the variety of topics that the research community has come up with. One area with a critical mass of activity is bioinformatics. I will list some of these projects in this series of blog posts and point out a few of the resulting research publications.

One particular standout has been the work of Wuchun Feng of Virginia Tech. Wu was awarded one of the NSF “Computing in the Cloud” awards. I have known Wu for a long time and I was pleased to see him in this program. Microsoft provided the Windows Azure cloud support for the project. Wu worked closely with our HDInsight team to use Hadoop and MapReduce for bioinformatics. He accomplished quite a bit. Here is some of the press coverage of his work.

· Tackling the Big Data Issue Behind Connected Health

· https://www.qmed.com/mpmn/medtechpulse/tackling-big-data-issue-behind-connected-health

· New Computing Model Could Lead to Quicker Advancements in Medical Research

· https://cacm.acm.org/news/169480-new-computing-model-could-lead-to-quicker-advancements-in-medical-research/fulltext

· Microsoft's Big Data Service Available After a Year in Preview

· https://www.cio.com/article/742127/Microsoft_s_Big_Data_Service_Available_After_a_Year_in_Preview

· Virginia Tech researchers: Faster computing can speed medical research

· https://fairfaxnews.com/2013/11/virginia-tech-researchers-faster-computing-can-speed-medical-research/

· New Computing Model Could Lead to Quicker Advancements in Medical Research

· https://www.sciencedaily.com/releases/2013/11/131104101047.htm

· New computing model could lead to quicker advancements in medical research, according to Virginia Tech

· https://www.eurekalert.org/pub_releases/2013-11/vt-ncm110413.php

Recent publications include

Nabeel M. Mohamed, Heshan Lin, Wuchun Feng. Accelerating Data-Intensive Genome Analysis in the Cloud. In Proceedings of the 5th International Conference on Bioinformatics and Computational Biology (BICoB), Honolulu, Hawaii, USA, March 2013

Another really nice project was led by Zhengchang Su, Srinivas Arkela, and Youjie Zhou of the University of North Carolina at Charlotte in an effort entitled “Large-Scale Annotation of Gene Transcription Regulatory Sequences in Bacterial Genomes Using Cloud Computing.” Their work involves annotating regulatory sequences in sequenced bacterial genomes by using comparative genomics-based algorithms. This project was also supported by the NSF/Microsoft “Computing in the Cloud” program.

Recent publications from their work include

o S Zhang, L Jiang, C Du, Z Su, A novel information contents based similarity metric for comparing TFBS motifs. - Systems Biology (ISB), 2012 …, 2012 - ieeexplore.ieee.org

o S Li, X Dong, Z Su, Directional RNA-seq reveals highly complex condition-dependent transcriptomes in E. coli K12 through accurate full-length transcripts assembling. BMC Genomics, 2013 - biomedcentral.com

In the next blog entry in this series I will describe work by Ignacio Blanquer and Paul Watson. This work is being presented this week at the BIO-IT conference in Lisbon.