Run book Run! From physical paper to executable online books

Visual Studio Blog

Python Data Science Handbook

Have you ever wanted to run the code samples while reading a book, without first having to download the sample code and its runtime and configure your environment so that everything is set up the way you need it? What if you could be reading a book and immediately execute (and change!) the code without installing anything on your computer? What if you could do this with no software other than a modern web browser?

Azure Notebooks makes it easy to read a book online and to run and change the code samples, all from a modern web browser. Today, we’re delighted to announce that Jake VanderPlas’ Python Data Science Handbook is now available for free on Azure Notebooks.

The book is published by O’Reilly and is also available in Jupyter notebook format. We are hosting the complete book on Azure Notebooks. You can learn the fundamentals of data science either by purchasing the book or by hopping over to Azure Notebooks, cloning your own copy, and “running” it.

I personally like the feel of an actual book, and now that I can run the book without preparing an environment or installing software, the whole experience is a lot more enjoyable.

Note that the Jupyter version of the book includes not just the code but the entire prose as well.

What’s Azure Notebooks?

Azure Notebooks is a free service for hosting and running Jupyter notebooks. Let’s take a look at Jupyter itself first.

Jupyter is an open source project started by Professors Fernando Perez and Brian Granger that provides a rich, browser-based notebook for general data analysis and exploration. Jupyter notebooks are composed of markdown, executable code, interactive plots, and much more.

Imagine your typical scientific paper in static PDF format, except now the code is executable, plots are interactive, and notebooks can be shared easily for reproducibility – that’s what Jupyter notebooks enable.
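To make the idea of a notebook as a mix of prose and executable code more concrete, here is a minimal sketch that builds a tiny notebook document by hand. It assumes the version 4 notebook file format, in which a notebook is a JSON document holding a list of markdown and code cells. The helper names (`make_notebook`, `markdown_cell`, `code_cell`) and the cell contents are hypothetical and only for illustration; they are not taken from the book or from Azure Notebooks.

```python
import json


def markdown_cell(text):
    """Build a markdown cell; the source is stored as a list of lines."""
    return {
        "cell_type": "markdown",
        "metadata": {},
        "source": text.splitlines(keepends=True),
    }


def code_cell(code):
    """Build a code cell that has not been executed yet (no outputs)."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        "source": code.splitlines(keepends=True),
    }


def make_notebook(cells):
    """Wrap a list of cells in the top-level notebook structure (format v4)."""
    return {
        "cells": cells,
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }


# Hypothetical content: one prose cell followed by one executable cell.
notebook = make_notebook([
    markdown_cell("# Hypothetical example\nProse lives in markdown cells."),
    code_cell("total = sum(range(10))\nprint(total)"),
])

# Serialize to the JSON text that would be saved in an .ipynb file.
notebook_json = json.dumps(notebook, indent=1)
print(notebook_json)
```

Saving `notebook_json` to a file with an `.ipynb` extension should give a document that a Jupyter front end can open. In practice you would create and edit cells in the browser rather than by hand; the sketch only shows why prose and code travel together in a single file.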


Azure Notebooks provides a playground to create, run, and share Jupyter notebooks for free. Python 2, Python 3, R, and F# are currently supported. Beyond languages, you get their supporting distributions, such as Anaconda, which provides access to 500+ packages for just about any domain.

Interactive coding in your browser

Typical scenario for using Azure Notebooks

This O’Reilly title is our first foray into “Executable” books. We hope that it becomes a trend and a convenient (and fun!) way for people to explore books. There are near-infinite possibilities for how you can use Azure Notebooks:

  • Learn a programming language (check out the samples on our homepage)
  • University courses (check out this sample from Georgia Tech)
  • Give a webinar/seminar/talk to 500+ people (without any software installation headaches)
  • Explore Azure (check out the CosmosDB sample running against a live database)

Python Data Science Handbook contents

Beyond data science, Jake’s book is packed with excellent introductions to Python and to Jupyter itself. All the Jupyter material applies to Azure Notebooks as well.

  • Chapter 1: IPython: Beyond Normal Python
  • Chapter 2: Introduction to NumPy
  • Chapter 3: Data Manipulation with Pandas
  • Chapter 4: Visualization with Matplotlib
  • Chapter 5: Machine Learning

On Azure Notebooks, you can view an HTML preview of the book without logging in. If you want to run or edit the book, you need to sign in and “clone” it, which gives you your own private copy to execute and edit.

Python Data Science Handbook on Azure Notebooks

Summary

Thanks to Jake and O’Reilly, we’re at an exciting juncture for learning and exploration using executable books. We’re delighted to partner with O’Reilly on this journey and hope to provide access to other books in the future.


Jake VanderPlas, Data Science fellow at the UW’s eScience Institute

Jake VanderPlas is a data science fellow at the University of Washington’s eScience Institute, where his work focuses on data-intensive physical science research in an interdisciplinary setting. In the Python world, Jake is the author of the Python Data Science Handbook and is active in maintaining and contributing to several well-known Python scientific computing packages, including Scikit-learn, SciPy, Matplotlib, Astropy, Altair, and others. He occasionally blogs on Python-related topics at http://jakevdp.github.io/.

Shahrokh Mortazavi, Partner Dir. of Engineering, Visual Studio

Shahrokh Mortazavi runs the Data Science Developer Tools teams at Microsoft, focused on Python, R, and Jupyter Notebooks. Previously, he was in the High Performance Computing group at Microsoft. He worked on the Phoenix Compiler tool chain (code gen, analysis, JIT) at Microsoft Research and, prior to that, led Sun Microsystems’ Code Generation & Optimization compiler backend teams over a 10-year period.
