Tuesday, November 20, 2012

Back from PyCon Canada 2012

I just got back a few days ago from the 2012 edition of PyCon Canada, which was a great success. I wanted to thank the team who invited me for a fantastic experience: Diana Clarke who as conference chair did an incredible job, Greg Wilson from Software Carpentry with whom I had a chance to interact a lot (he already has a long list of ideas for the IPython notebook in teaching contexts we're discussing), Mike DiBernardo and the rest of the PyConCa team. They ran a conference with a great vibe and tons of opportunity for engaging discussion.

Thanks to Greg I also had a chance to give a couple of more academically-oriented talks at U. Toronto facilities, both at the Sunnybrook hospital and their SciNet HPC center, where we had some great discussions. I look forward to future collaborations with some of the folks there.

The PyConCa team kindly invited me to deliver the closing keynote for the conference, and I tried to give a presentation on the part of the Python world that I've been involved with, namely scientific computing, in a way that would also be of interest to the broader Python development community in attendance. I tried to illustrate where Python has been a great success for modern scientific research, and in doing so I took a deliberately biased view where I spent a good amount of time discussing IPython, which is how I entered that world in the first place.

This is the video of the talk:

and here are the accompanying slides.

I'm too far behind to do a proper recap of the conference itself, but I want to mention one of the highlights for me: a fantastic talk by Elizabeth Leddy, a prominent figure in the Plone world, on how to build sustainable communities. She had a ton of useful insight from in-the-trenches experience with the Plone foundation, and I fortunately got to pick her brain for a while after the talk on these topics. As we gradually build up somewhat similar efforts in the scientific Python world with NumFOCUS, I think she'll be a great person for us to bug every now and then for wisdom.

IPython at the sprints

I managed to stay for the two days of sprints after the end of the main conference, and we had a great time: a number of people made contributions to IPython for the first time, so I'd like to quickly recap here what happened.

Nose extension

Taavi Burns and Greg Ward of distutils fame worked hard on a fairly tricky but extremely useful idea suggested by Greg Wilson: easy in-place use of nose to run tests inside a notebook. This was done by taking inspiration (and I think code) from Catherine Devlin's recent work on integrating doctesting inside the notebook.

The new nose extension hasn't been merged yet, but you can already get the code from github, as usual. Briefly (from Taavi's instructions), this little IPython extension gives you the ability to discover and run tests using Nose in an IPython Notebook.

You start with a cell containing:

%load_ext ipython_nose

Then write tests that conform to Nose conventions, e.g.

  def test_arithmetic():
      assert 1+1 == 2

And where you want to run your tests, you add a cell consisting of

%nose

and run it: that will discover your test_* functions, run them, and report how many passed and how many failed, with stack traces for each failure.

WebGL-based 3d protein visualization

Rishi Ramraj, Christopher Ing and Jonathan Villemaire-Krajden implemented an extremely cool visualization widget that can, using the IPython display protocol, render a protein structure directly in a 3d interactive window. They used Konrad Hinsen's MMTK toolkit, and the resulting code is as simple as:

from MMTK.Proteins import Protein

You can see what the output looks like in this short video shot by Taavi Burns just as they got it working and we were all very excited looking at the result; the code is already available on github.

I very much look forward to many more tools of this kind being developed, and in fact Cyrille Rossant wasted no time at all building off this to provide fast 2-d visualizations rendered via WebGL with his Galry library:

Software Carpentry

In addition to the Nose extension above, Greg Wilson had a ton of ideas on things that could be added to the notebook that he thinks would help in the context of teaching workshops such as those that Software Carpentry presents. Their audience is typically composed of beginning programmers, scientists who may be experts in their discipline but who have little to no formal computational training and are now tasked with managing often quite complex computational workflows. Since SWC recently announced they would be switching to the notebook as their main teaching platform, they obviously are thinking deeply about how to make the best use of it and where the notebook can improve for this kind of use case.

These are still conversations that I hope will turn soon into concrete issues/code repositories to begin testing them, but that kind of validated testing is very useful for us. Since at this point we have too many feature requests from multiple fronts to be able to satisfy them all, we are trying to focus on ensuring that IPython can support individual projects building their own custom tools and extensions. We can't possibly merge every last idea from every front into IPython, but we can work to ensure it's a flexible and coherent enough foundation that others can build their own highly customized experiences on top. Once these get widely tested and validated, it may be that pieces are clearly of generic enough value to percolate into the core, but in the meantime this approach means that other projects (SWC being just one example among many) don't need to wait for us to add every feature they need.

What we will focus on is addressing any limitations that our architecture may have for such extensibility to work well, so the life of third party projects isn't a fight against our interfaces.

A first-time contributor to open source

Last, but not least, I had the great experience of working with David Kua, a CS student from U. Toronto who had never made a contribution to open source and wanted to work on IPython. Right during the sprints we were able to merge his first pull request into nbconvert, and he immediately started working on a new one for IPython that by now has also been merged.

That last one required that he learn how to rebase his git repo (he had some extraneous commits originally) and go through a fair amount of feedback before merging: this is precisely the real world cycle of open source contributions. It's always great to see a brand new contributor in the making, and I very much look forward to much more work from David, whether he decides to do it in IPython or in any other open source project that catches his interest.


Since I am now writing all my posts as IPython notebooks (even when there's no code, it's a really nice way to get instant feedback on markdown), you can get the notebook for this post from my repo.

Sunday, October 14, 2012

Help save open space in the Bay Area by protecting Knowland Park from development

Vote NO on new Tax Measure A1

Update: there is now evidence that Zoo officials have actually violated election laws in their zeal to promote measure A1.

I normally only blog about technical topics, but the destruction of a beautiful piece of open space in the Bay Area is imminent, and I want to at least do a little bit to help prevent this disaster.

In short: there's a tax measure on the November ballot, Measure A1, that would impose a parcel tax on all residences and businesses in Alameda County to fund the Oakland Zoo for the next 25 years.  The way the short text on the ballot is worded makes it appear as something geared towards animal care for a cash-strapped Zoo.  The sad reality is that the full text of the measure allows the Zoo to use these funds for a very controversial expansion plan that includes a 34,000 sq. ft. visitor center, gift shop and restaurant serviced by a ski gondola atop one of the last pristine remaining ridges in Knowland Park, an Oakland city park that sits above the Zoo.

Yes, it's as bad as it sounds; the beautiful ridge in the background:

that is today part of an unspoiled open space, would be closed off by a fence and a restaurant would be built atop it, serviced by a ski gondola that would reach it from the bottom of the hill.  Here are a few more pics from the same album as well as a great photo essay on the park from the AllThingsOakland blog, and some more history of the park.

Restaurant development disguised as animal care

The Zoo claims to be strapped for cash, yet they are spending over $1 million on a media blitz to get this measure passed, and only presenting it as an animal-care issue.  I am a huge animal lover and donate regularly to the San Diego Zoo, but unfortunately the situation with the Oakland Zoo is a different story: they see the 525-acre Knowland Park above the Zoo as their personal back yard, not as a resource that belongs to all of us.  It has been impossible, in years of negotiations, to get the Zoo to sign anything that commits them to respect the boundaries of the park in the future.  They see this tax measure as their strategic "nuclear weapon" to destroy the park, and in order to get it, they are willing to burn through cash they should instead be using for animal care.

I urge you to consider this as you go to the polls in November: all Alameda county voters will end up having a say on whether "nature preservation" in the East Bay is spelled "huge restaurant and a ski gondola on open space". By voting NO on A1 you will help prevent such madness.

More information

 Here are a few relevant links with details and further info

A final note: the citizens' group fighting to save the park can use all the help in the world. You can make donations or join the effort in many other ways; don't hesitate to ask me for more info.  And please share this post as widely as possible!

Friday, September 7, 2012

Blogging with the IPython notebook

Update (May 2014): Please note that these instructions are outdated. While it is still possible (and in fact easier) to blog with the Notebook, the exact process has changed now that IPython has an official conversion framework. However, Blogger isn't the ideal platform for that (though it can be made to work). If you are interested in using the Notebook as a tool for technical blogging, I recommend looking at Jake VanderPlas's Pelican support or Damián Avila's support in Nikola.

Update: made full github repo for blog-as-notebooks, and updated instructions on how to more easily configure everything and use the newest nbconvert for a more streamlined workflow.

Since the notebook was introduced with IPython 0.12, it has proved to be very popular, and we are seeing great adoption of the tool and the underlying file format in research and education. One persistent question we've had since the beginning (even prior to its official release) was whether it would be possible to easily write blog posts using the notebook. The combination of easy editing in markdown with the notebook's ability to contain code, figures and results, makes it an ideal platform for quick authoring of technical documents, so being able to post to a blog is a natural request.

Today, in answering a query about this from a colleague, I decided to check again on the status of our conversion pipeline, and I'm happy to report that with a bit of elbow-grease, at least on Blogger things work pretty well!

This post was entirely written as a notebook, and in fact I have now created a github repo, which means that you can see it directly rendered in IPython's nbviewer app.

The purpose of this post is to quickly provide a set of instructions on how I got it to work, and to test things out. Please note: this requires code that isn't quite ready for prime-time and is still under heavy development, so expect some assembly.

Converting your notebook to html with nbconvert

The first thing you will need is our nbconvert tool that converts notebooks across formats. The README file in the repo contains the requirements for nbconvert (basically python-markdown, pandoc, docutils from SVN and pygments).

Once you have nbconvert installed, you can convert your notebook to Blogger-friendly html with:

nbconvert -f blogger-html your_notebook.ipynb

This will leave two files on your computer, one named your_notebook.html and one named your_notebook_header.html; it might also create a directory called your_notebook_files if needed for ancillary files. The first file will contain the body of your post and can be pasted wholesale into the Blogger editing area. The second file contains the CSS and Javascript material needed for the notebook to display correctly; you should only need to use this once to configure your blogger setup (see below):

# Only one notebook so far
(master)longs[blog]> ls
120907-Blogging with the IPython Notebook.ipynb  fig/  old/

# Now run the conversion:
(master)longs[blog]> nbconvert.py -f blogger-html 120907-Blogging\ with\ the\ IPython\ Notebook.ipynb

# This creates the header and html body files
(master)longs[blog]> ls
120907-Blogging with the IPython Notebook_header.html  fig/
120907-Blogging with the IPython Notebook.html         old/
120907-Blogging with the IPython Notebook.ipynb
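
If you script this workflow, it helps to know in advance which files to look for. The following is only a small sketch of the naming pattern described above (body in `<name>.html`, header in `<name>_header.html`); the helper `blogger_output_names` is a made-up name and is not part of nbconvert.

```python
from pathlib import Path


def blogger_output_names(notebook_path):
    """Return the (body, header) file names expected for a notebook.

    Follows the naming described in the text: the notebook's name without
    the .ipynb suffix, plus ".html" and "_header.html" respectively.
    """
    stem = Path(notebook_path).stem  # file name without the .ipynb suffix
    return stem + ".html", stem + "_header.html"


body, header = blogger_output_names(
    "120907-Blogging with the IPython Notebook.ipynb"
)
print(body)
print(header)
```

Note that the spaces in the notebook name carry over to the generated file names, as in the listing above.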

Configuring your Blogger blog to accept notebooks

The notebook uses a lot of custom CSS for formatting input and output, as well as Javascript from MathJax to display mathematical notation. You will need all this CSS and the Javascript calls in your blog's configuration for your notebook-based posts to display correctly:

  1. Once authenticated, go to your blog's overview page by clicking on its title.
  2. Click on templates (left column) and customize using the Advanced options.
  3. Scroll down the middle column until you see an "Add CSS" option.
  4. Copy the entire contents of the _header file into the CSS box.

That's it, and you shouldn't need to do anything else as long as the CSS we use in the notebooks doesn't drastically change. This customization of your blog needs to be done only once.

While you are at it, I recommend you change the width of your blog so that cells have enough space for clean display; in experimenting I found out that the default template was too narrow to properly display code cells, producing a lot of text wrapping that impaired readability. I ended up using a layout with a single column for all blog contents, putting the blog archive at the bottom. Otherwise, if I kept the right sidebar, code cells got too squished in the post area.

I also had problems using some of the fancier templates available from 'Dynamic Views', in that I could never get inline math to render. But sticking to those from the Simple or 'Picture Window' categories worked fine and they still allow for a lot of customization.

Note: if you change blog templates, Blogger does destroy your custom CSS, so you may need to repeat the above steps in that case.

Adding the actual posts

Now, whenever you want to write a new post as a notebook, simply convert the .ipynb file to blogger-html and copy its entire contents to the clipboard. Then go to the 'raw html' view of the post, remove anything Blogger may have put there by default, and paste. You should also click on the 'options' tab (right hand side) and select both Show HTML literally and Use <br> tag, else your paragraph breaks will look all wrong.

That's it!

What can you put in?

I will now add a few bits of code, plots, math, etc, to show which kinds of content can be put in and work out of the box. These are mostly bits copied from our example notebooks, so the actual content doesn't matter; I'm just illustrating the kind of content that works.

In [1]:
# Let's initialize pylab so we can plot later
%pylab inline
Welcome to pylab, a matplotlib-based Python environment [backend: module://IPython.zmq.pylab.backend_inline].
For more information, type 'help(pylab)'.

With pylab loaded, the usual matplotlib operations work

In [2]:
x = linspace(0, 2*pi)
plot(x, sin(x), label=r'$\sin(x)$')
plot(x, cos(x), 'ro', label=r'$\cos(x)$')
title(r'Two familiar functions')
legend()
Out [2]:
<matplotlib.legend.Legend at 0x3128610>

The notebook, thanks to MathJax, has great LaTeX support, so that you can type inline math $(1,\gamma,\ldots, \infty)$ as well as displayed equations:

$$ e^{i \pi}+1=0 $$

but by loading the sympy extension, it's easy to showcase math output from Python computations, where we don't type the math expressions in text, and instead the results of code execution are displayed in mathematical format:

In [3]:
%load_ext sympyprinting
import sympy as sym
from sympy import *
x, y, z = sym.symbols("x y z")

From simple algebraic expressions

In [4]:
Rational(3,2)*pi + exp(I*x) / (x**2 + y)
Out [4]:
$$\frac{3}{2} \pi + \frac{e^{\mathbf{\imath} x}}{x^{2} + y}$$
In [5]:
eq = ((x+y)**2 * (x+1))
eq
Out [5]:
$$\left(x + 1\right) \left(x + y\right)^{2}$$
In [6]:
expand(eq)
Out [6]:
$$x^{3} + 2 x^{2} y + x^{2} + x y^{2} + 2 x y + y^{2}$$

To calculus

In [7]:
diff(cos(x**2)**2 / (1+x), x)
Out [7]:
$$- 4 \frac{x \operatorname{sin}\left(x^{2}\right) \operatorname{cos}\left(x^{2}\right)}{x + 1} - \frac{\operatorname{cos}^{2}\left(x^{2}\right)}{\left(x + 1\right)^{2}}$$
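
As a sanity check that needs neither sympy nor the notebook, the expansion and the derivative shown above can be verified numerically with the standard library alone. This is just a sketch; the function names `eq`, `expanded`, `f`, `df` and `central_difference` are picked for illustration.

```python
import math


def eq(x, y):
    """The factored expression (x + y)**2 * (x + 1)."""
    return (x + y) ** 2 * (x + 1)


def expanded(x, y):
    """The expanded form shown in the output above."""
    return x**3 + 2 * x**2 * y + x**2 + x * y**2 + 2 * x * y + y**2


def f(x):
    """The function being differentiated: cos(x**2)**2 / (1 + x)."""
    return math.cos(x**2) ** 2 / (1 + x)


def df(x):
    """The derivative shown in the output above."""
    s, c = math.sin(x**2), math.cos(x**2)
    return -4 * x * s * c / (x + 1) - c**2 / (x + 1) ** 2


def central_difference(func, x, h=1e-6):
    """Symmetric finite-difference approximation of func'(x)."""
    return (func(x + h) - func(x - h)) / (2 * h)


# Integer arguments keep the polynomial comparison exact.
expansion_ok = all(
    eq(x, y) == expanded(x, y) for x in range(-5, 6) for y in range(-5, 6)
)
# The derivative is compared against a numerical estimate within a tolerance.
derivative_ok = all(
    math.isclose(df(x), central_difference(f, x), rel_tol=1e-5, abs_tol=1e-6)
    for x in (0.0, 0.3, 0.7, 1.2, 2.0)
)
print(expansion_ok, derivative_ok)
```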

For more examples of how to use sympy in the notebook, you can see our example sympy notebook or go to the sympy website for much more documentation.

You can easily include formatted text and code with markdown

You can italicize, boldface

  • build
  • lists

and embed code meant for illustration instead of execution in Python:

def f(x):
    """a docstring"""
    return x**2

or other languages:

for (i=0; i<n; i++) {
  printf("hello %d\n", i);
  x += 4;
}

And since the notebook can store displayed images in the file itself, you can show images which will be embedded in your post:

In [8]:
from IPython.display import Image
Out [8]:

You can embed YouTube videos using the IPython YouTubeVideo object; this is my recent talk at SciPy'12 about IPython:

In [9]:
from IPython.display import YouTubeVideo
Out [9]:

Including code examples from other languages

Using our various script cell magics, it's easy to include code in a variety of other languages

In [10]:
%%ruby
puts "Hello from Ruby #{RUBY_VERSION}"
Hello from Ruby 1.8.7
In [11]:
%%bash
echo "hello from $BASH"
hello from /bin/bash
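
Conceptually, a script cell hands its body to an external interpreter and shows what comes back. The sketch below illustrates that idea with the standard library only; it is not IPython's implementation, and `run_script_cell` is a hypothetical helper. To stay portable it launches a second Python interpreter rather than Ruby or Bash.

```python
import subprocess
import sys


def run_script_cell(argv, source):
    """Feed `source` to the program `argv` on stdin and return its stdout."""
    result = subprocess.run(
        argv,
        input=source,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # exchange text rather than bytes
        check=True,               # raise if the child exits with an error
    )
    return result.stdout


# "python -" tells the child interpreter to read its program from stdin.
output = run_script_cell(
    [sys.executable, "-"],
    'print("hello from a child interpreter")\n',
)
print(output, end="")
```

Swapping the first argument for another interpreter's command line (assuming it is installed and reads its program from stdin) gives the same pattern for other languages.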

And tools like the Octave and R magics let you interface with entire computational systems directly from the notebook; this is the Octave magic for which our example notebook contains more details:

In [12]:
%load_ext octavemagic
In [13]:
%%octave -s 500,500

# butterworth filter, order 2, cutoff pi/2 radians
b = [0.292893218813452  0.585786437626905  0.292893218813452];
a = [1  0  0.171572875253810];
freqz(b, a, 32);

The rmagic extension does a similar job, letting you call R directly from the notebook, passing variables back and forth between Python and R.

In [14]:
%load_ext rmagic 

Start by creating some data in Python

In [15]:
X = np.array([0,1,2,3,4])
Y = np.array([3,5,4,6,7])

Which can then be manipulated in R, with results available back in Python (in XYcoef):

In [16]:
%%R -i X,Y -o XYcoef
XYlm = lm(Y~X)
XYcoef = coef(XYlm)
print(summary(XYlm))

Call:
lm(formula = Y ~ X)

Residuals:
   1    2    3    4    5 
-0.2  0.9 -1.0  0.1  0.2 

Coefficients:
            Estimate Std. Error t value Pr(>|t|)  
(Intercept)   3.2000     0.6164   5.191   0.0139 *
X             0.9000     0.2517   3.576   0.0374 *
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1 

Residual standard error: 0.7958 on 3 degrees of freedom
Multiple R-squared:  0.81, Adjusted R-squared: 0.7467 
F-statistic: 12.79 on 1 and 3 DF,  p-value: 0.03739 

In [17]:
XYcoef
Out [17]:
[ 3.2  0.9]
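
For readers who want to see where those two numbers come from, here is the same simple linear regression worked out in plain Python 3 with the textbook least-squares formulas. It is only an illustrative cross-check, not what rmagic or R does internally, and the variable names are chosen for this sketch.

```python
# Same data as the Python cell above, as plain lists.
X = [0, 1, 2, 3, 4]
Y = [3, 5, 4, 6, 7]

n = len(X)
mean_x = sum(X) / n
mean_y = sum(Y) / n

# Closed-form least-squares estimates for the model Y = intercept + slope * X.
sxx = sum((x - mean_x) ** 2 for x in X)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, Y))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residuals and the coefficient of determination.
residuals = [y - (intercept + slope * x) for x, y in zip(X, Y)]
ss_res = sum(r**2 for r in residuals)
ss_tot = sum((y - mean_y) ** 2 for y in Y)
r_squared = 1 - ss_res / ss_tot

print(round(intercept, 4), round(slope, 4), round(r_squared, 4))
print([round(r, 4) for r in residuals])
```

Up to floating-point rounding, the intercept, slope, residuals and R-squared agree with the Estimate column, the residuals and the "Multiple R-squared" value in the R summary.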

And finally, in the same spirit, the cython magic extension lets you call Cython code directly from the notebook:

In [18]:
%load_ext cythonmagic
In [19]:
%%cython -lm
from libc.math cimport sin
print 'sin(1)=', sin(1)
sin(1)= 0.841470984808

Keep in mind, this is still experimental code!

Hopefully this post shows that the system is already useful to communicate technical content in blog form with a minimal amount of effort. But please note that we're still in heavy development of many of these features, so things are subject to change in the near future. By all means join the IPython dev mailing list if you'd like to participate and help us make IPython a better tool!