Python Rocks! and other rants
Weblog of Kent S Johnson

#76 2008-05-31 08:24:13

Yet another new job

For those of you keeping track at home I will note that I am now working for Cambridge Research & Instrumentation in Woburn, MA as a Principal Software Developer. I am excited to be working on medical equipment - I am helping develop software for the Maestro and Nuance systems. The coding is primarily C++ so I have been studying a lot - I haven't used C++ in almost ten years.


#75 2008-05-31 08:06:52


I just noticed that my name is in a new book. The book is Refactoring HTML, by Elliotte Rusty Harold. I'm pretty sure I am the 'Kent' mentioned in the section Why Refactor HTML in Chapter 1.

This is the third time, to my knowledge, that I have been mentioned in a book. The first, and most substantial, is the printed Python Cookbook, 2nd Edition in which I am primarily responsible for Recipe 3.2. The second time is in The Definitive Guide to Django where I receive an acknowledgement.


#74 2007-02-14 10:45:06

New Job

I am pleased to say that I have a new job as a Python programmer! I am very happy at the prospect of full-time Python work. I will be working at home for a tiny startup in Chelmsford. The application is a Python web site that will be built in Django. There is also a substantial data collection and analysis piece which should be interesting as well.

Also I will be working on a Mac instead of Windows, another big plus.



#73 2007-02-14 10:13:18

Django rocks

I recently tried out TurboGears and Django for a small project at work. Here are some impressions.

I tried TurboGears first. I really wanted to like TurboGears. I like the concept - find best-of-breed parts, add a little glue to make them work together, stir gently and enjoy. Don't re-invent the wheel. I often do the same thing in my work.

Unfortunately the reality is not so simple. For one thing, the "best-of-breed" seems to change. In TurboGears 1.0, the recommended ORM is SQLObject and the recommended template engine is Kid. But for TG 1.1, this will change to SQLAlchemy and Genshi. This gave me an uneasy feeling that TG is a moving target.

Then there are the TurboGears docs. For information about TG itself, you go to the TG website. These docs are kind of patchy; they don't feel comprehensive at all, more like a collection of recipes. For the component packages like SQLObject, you have to go to the component web site. (I know, there is a TG book. I bought a copy but I switched to Django before it arrived so I don't have anything to say about it.)

In the end TurboGears feels like a patchwork, pieced together out of whatever was near at hand.

OK, what about Django? Simply put, Django rocks! It is an integrated whole. The Django docs are clear and well organized. Most chapters of the upcoming book are online; they give a good overview of Django and dig into a lot of the details. The complete reference fills in the rest.

The functionality available in Django's admin site is phenomenal. The application I was playing with was very simple: take the data in an Excel spreadsheet and convert it to a web interface. The only code I had to write to do this was the Django model and an importer to populate the database. The rest was just configuration, and not much of that. The result is an application with a column view of the database with sortable columns, multiple filters, drill-down by date and a detail view and data entry screen. All basically for free. Oh, it looks great, too. My co-workers were blown away and they are planning to clean up my prototype and deploy it.

I'm convinced.


#72 2006-04-06 15:18:37

lxml for Windows!

lxml now has a Windows binary available. I'm thrilled! lxml is a fast, Pythonic XML library with full XPath support. I don't know of any other Python XML library that has all three of these attributes:

  • ElementTree is fast and Pythonic but only supports a limited subset of XPath.
  • Amara is Pythonic and supports XPath but in my limited experience it seems quite slow.
  • PyXML uses w3c DOM models which are clunky and un-Pythonic.
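
To make the "limited subset" point concrete, here is a minimal sketch using the ElementTree that ships with current Python 3 (not the 2006 release discussed here). The sample document is made up for illustration. Child paths and simple attribute predicates work; general XPath functions and axes such as following-sibling are outside what ElementTree accepts.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, made up for this sketch.
doc = ET.fromstring(
    "<library>"
    "<book year='2005'><title>Python Cookbook</title></book>"
    "<book year='2008'><title>Refactoring HTML</title></book>"
    "</library>"
)

# Child paths and attribute predicates are part of the supported subset.
titles = [t.text for t in doc.findall("./book/title")]
recent = [b.findtext("title") for b in doc.findall("./book[@year='2008']")]

print(titles)
print(recent)
```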

I have raved many times about dom4j, an XML library for Java which is easy to use and includes XPath support. I have long wished for a Python package as powerful and easy-to-use. I'm optimistic that lxml will fill the bill.

Categories: Python


#71 2006-03-31 08:27:19

Switching to Firedrop2

I finally switched this blog from PyDS and Python Community Server to Firedrop2 and static hosting at my ISP.

I made the switch because PyDS wasn't working well for me. It was always more complex than I really needed, even when I was blogging regularly. Now that I write infrequently, this has become more of a problem.

Worse, for some reason PyDS stopped working. First the Upstreaming page started showing a traceback, then the ReST formatter broke. Now it is so broken that I can't even upload a final post redirecting to my new blog.

I don't know what broke it; I rarely use Python 2.3 any more, which is where all the PyDS stuff is installed, but maybe I upgraded some package that it uses. I could probably track it down but it doesn't seem worth it. The few times I looked at the PyDS code it was hard to find what I needed - it uses a plugin architecture that makes it hard to trace through the code.

A few other problems: PyDS and PyCS both can be slow to respond. PyCS occasionally stops responding. I shouldn't complain too much about a free service, but still...

So when Firedrop2 came out it seemed like time for a change. I want my blog files on my computer in text format, not in a database or on an external server. Static hosting on my ISP is just fine, thank you. I don't need to run an embedded server. So Firedrop2 fits my preferences.

Anyway it seems to be working fine, maybe I'll even start writing again :-) I have found a few minor problems with Firedrop2 but the code is easy to understand and fix and the developers are receptive to help.


#70 2005-12-28 10:49:20

Why I love Python 5

Easy introspection and dynamic loading

This example shows off several useful features of Python including introspection, dynamic loading, first-class functions and flexible except clauses.

At work I have some Java code that uses XPath support from the Xalan class org.apache.xpath.XPathAPI. In Java 1.4 this class is provided with the JRE. In Java 1.5 they moved the class to a different package. I need to run with either version of Java. I prefer not to bundle Xalan with my program, so I wrote a wrapper that dynamically locates the correct version and dispatches to it:

// The XPathAPI is in different packages in Java 1.4 and 1.5.
// Use introspection to find the right one
private static Method __selectSingleNode;

static {
    // Look for the XPathAPI class in two places
    Class XPathAPI = null;
    try {
        XPathAPI = Class.forName("org.apache.xpath.XPathAPI");
    } catch (ClassNotFoundException e) {
        try {
            XPathAPI = Class.forName("");
        } catch (ClassNotFoundException e1) {
        }
    }

    // Get the methods we support
    try {
        __selectSingleNode = XPathAPI.getMethod("selectSingleNode",
                             new Class[] { Node.class, String.class} );
    } catch (SecurityException e) {
    } catch (NoSuchMethodException e) {
    }
}

/** XPathAPI.selectSingleNode */
public static Node selectSingleNode(Node node, String xpath) {
    try {
        return (Node)__selectSingleNode.invoke(null, new Object[] { node, xpath });
    } catch (IllegalArgumentException e) {
    } catch (IllegalAccessException e) {
    } catch (InvocationTargetException e) {
    }
    return null;
}

Wow, what an ugly mess! What would it look like in Python?

The initial static block would become a conditional import:

try:
  import org.apache.xpath.XPathAPI as XPathAPI
except ImportError:
  import as XPathAPI

That was easy - and wait - we're done now! The client code can call XPathAPI.selectSingleNode() and it will work!

But suppose for the sake of example we want to get a reference to selectSingleNode using introspection. That is as simple as

__selectSingleNode = getattr(XPathAPI, 'selectSingleNode')

This __selectSingleNode is itself a callable function (not a wrapper around a function) so clients can call it directly; the selectSingleNode() wrapper is not needed at all.
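
The same two ideas can be sketched in plain CPython (Python 3 syntax) without any Java classes. This is only an illustration: simplejson stands in for "the preferred module that may not be installed" and the standard json module is the fallback. Neither is part of the XPath example above.

```python
# Prefer an optional third-party module, fall back to the standard library.
# simplejson is only an illustrative stand-in; it may or may not be installed.
try:
    import simplejson as json
except ImportError:
    import json

# Look up a function by name at run time; the result is the function itself.
dumps = getattr(json, "dumps")

print(dumps({"answer": 42}))
```

Whichever import succeeds, the rest of the program just uses the name json, and dumps can be called directly with no wrapper.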

I have omitted the exception handling in the Python code because these exceptions are fatal and might as well terminate the program. If I wanted to catch them I could use an except clause with multiple exception types, instead of multiple except clauses, something like this:

try:
  __selectSingleNode = ...
except (SecurityException, NoSuchMethodException), e:
  ...
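
For a runnable version of the multiple-exception clause, here is a small sketch with built-in exception types. The parse_port function is hypothetical. Note that the "except (A, B), e" spelling is Python 2; Python 3 writes it with "as".

```python
def parse_port(value):
    """Return value as an int port number, or None if it cannot be parsed."""
    try:
        return int(value)
    except (TypeError, ValueError) as e:
        # One clause handles both exception types; e is the caught instance.
        print("bad port %r: %s" % (value, type(e).__name__))
        return None

print(parse_port("8080"))
print(parse_port("http"))
print(parse_port(None))
```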

Categories: Java, Python


#69 2005-12-06 22:37:36

Simple itertools.groupby() example

Suppose you have a (sorted) list of dicts containing the names of cities and states, and you want to print them out with headings by state:

>>> cities = [
...     { 'city' : 'Harford', 'state' : 'Connecticut' },
...     { 'city' : 'Boston', 'state' : 'Massachusetts' },
...     { 'city' : 'Worcester', 'state' : 'Massachusetts' },
...     { 'city' : 'Albany', 'state' : 'New York' },
...     { 'city' : 'New York City', 'state' : 'New York' },
...     { 'city' : 'Yonkers', 'state' : 'New York' },
... ]

First let me explain operator.itemgetter(). This function is a factory for new functions. It creates functions that access items using a key. In this case I will use it to create a function to access the 'state' item of each record:

>>> from operator import itemgetter
>>> getState = itemgetter('state')
>>> getState
<operator.itemgetter object at 0x00A31D90>
>>> getState(cities[0])
'Connecticut'
>>> [ getState(record) for record in cities ]
['Connecticut', 'Massachusetts', 'Massachusetts', 'New York', 'New York', 'New York']

So the value returned by itemgetter('state') is a function that accepts a dict as an argument and returns the 'state' item of the dict. Calling getState(d) is the same as writing d['state'].
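
Two more things itemgetter() can do, sketched in Python 3 with made-up records: it accepts several keys at once, and the function it returns works as a sort key.

```python
from operator import itemgetter

# Hypothetical records in the same shape as the cities list above.
records = [
    {"city": "Boston", "state": "Massachusetts"},
    {"city": "Albany", "state": "New York"},
    {"city": "Hartford", "state": "Connecticut"},
]

get_state = itemgetter("state")
# With several keys, itemgetter returns a tuple of the items.
get_state_city = itemgetter("state", "city")

print(get_state(records[0]))
print(get_state_city(records[0]))

# The same function works as a sort key.
by_state = sorted(records, key=get_state)
print([r["city"] for r in by_state])
```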

What does this have to do with itertools.groupby()?

>>> from itertools import groupby
>>> help(groupby)
Help on class groupby in module itertools:

class groupby(__builtin__.object)
|  groupby(iterable[, keyfunc]) -> create an iterator which returns
|  (key, sub-iterator) grouped by each value of key(value).

groupby() takes an optional second argument which is a function to extract keys from the data. getState() is just the function we need.

>>> groups = groupby(cities, getState)
>>> groups
<itertools.groupby object at 0x00A88300>

Hmm. That's a bit opaque. groupby() returns an iterator. Each item in the iterator is a pair of (key, group). Let's take a look:

>>> for key, group in groups:
...   print key, group
Connecticut <itertools._grouper object at 0x0089D0F0>
Massachusetts <itertools._grouper object at 0x0089D0C0>
New York <itertools._grouper object at 0x0089D0F0>

Hmm. Still a bit opaque :-) The key part is clear - that's the state, extracted with getState - but group is another iterator. One way to look at its contents is to use a nested loop. Note that I have to call groupby() again because the old iterator was consumed by the last loop:

>>> for key, group in groupby(cities, getState):
...   print key
...   for record in group:
...     print record
Connecticut
{'city': 'Harford', 'state': 'Connecticut'}
Massachusetts
{'city': 'Boston', 'state': 'Massachusetts'}
{'city': 'Worcester', 'state': 'Massachusetts'}
New York
{'city': 'Albany', 'state': 'New York'}
{'city': 'New York City', 'state': 'New York'}
{'city': 'Yonkers', 'state': 'New York'}

Well, that makes more sense! And it's not too far from the original requirement, we just need to pretty up the output a bit. How about this:

>>> for key, group in groupby(cities, getState):
...   print 'State:', key
...   for record in group:
...     print '   ', record['city']
State: Connecticut
    Harford
State: Massachusetts
    Boston
    Worcester
State: New York
    Albany
    New York City
    Yonkers
Other than misspelling Hartford (sheesh, and I grew up in Connecticut!) that's not too bad!
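
The example depends on the list already being sorted by state, because groupby() only groups adjacent items. Here is a sketch in Python 3 syntax of handling unsorted input and keeping the groups around after the loop. The data and the by_state name are made up for illustration.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical unsorted input, in the same shape as the cities list above.
cities = [
    {"city": "Albany", "state": "New York"},
    {"city": "Boston", "state": "Massachusetts"},
    {"city": "Yonkers", "state": "New York"},
    {"city": "Worcester", "state": "Massachusetts"},
]

get_state = itemgetter("state")

# groupby() only groups adjacent items, so sort by the same key first.
ordered = sorted(cities, key=get_state)

# Copy each group into a list right away; a group iterator stops being
# usable once groupby() advances to the next key.
by_state = {
    state: [record["city"] for record in group]
    for state, group in groupby(ordered, key=get_state)
}

for state, names in by_state.items():
    print("State:", state)
    for name in names:
        print("   ", name)
```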

Categories: Python


#68 2005-12-03 07:01:04

How I write code

I tend to design from the bottom up - not exclusively, but in general I make small parts and combine them to make larger parts until I have something that does what I want. I refactor constantly as my understanding of a problem and the solution increases. This way I always have complete working code for some section of the problem. I rarely use stubs of any kind.

To start I will take some small section of the problem and think about what kind of data and operations on the data I need to solve it. For a very simple problem I might just write some functions to operate on the data. As I expand into larger parts of the problem I might find that several functions are operating on the same data and decide that they belong in a class. Or it might be clear from the start that I want to create a class around the data.

When one chunk is done to my satisfaction, I take on another, and another. I am creating building blocks, then using the building blocks to create larger blocks. Some of the blocks are classes, others are functions.

I write unit tests as I go, sometimes test-first, sometimes test-after, but always alternating writing code with writing tests so I know the code works and I have a safety net when I need to refactor or make other major changes.

At any time I may discover that I made a bad decision earlier, or realize that there is a better way to structure the code or data. Then I stop and rework until I am happy with what I have. The unit tests give me confidence that I haven't broken anything in the process. It's a very organic process; I sometimes think of it as growing a program.

(from a post to the Python-tutor list)

Categories: Agile


#67 2005-11-28 20:12:32

Recommended Reading

I have just added a page of recommended books to my main web site. It is very much a work in progress but there is enough there for an initial post.


#66 2005-10-24 21:14:40

What's so great about Ruby?

I'm reading Bruce Tate's latest book, Beyond Java. In it, he argues that Java has become overgrown, unwieldy and vulnerable to replacement in many applications. Prime candidates to replace it are the dynamic languages, particularly Ruby.

As a staunch Python advocate I read his description of Ruby with interest. Most Ruby features that he thinks are cool are available in Python in some form. Some are considered wizard-level tricks in Python instead of the mainstream practices they seem to be in Ruby.

For example, in Ruby you can easily add to a class definition. You just declare the class again and extend the definition. This works even for built-in classes. The Ruby approach is conceptually very simple--it reuses the class definition syntax. In Python you can add methods to a class after it is defined by adding attributes to the class. Python's approach is fairly obscure - getting it right can take a few tries.
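
For reference, a minimal sketch of the Python side (Greeter and shout are made-up names): a plain function assigned as a class attribute becomes a method, even for instances created earlier.

```python
class Greeter(object):
    def __init__(self, name):
        self.name = name

# A plain function whose first parameter plays the role of self.
def shout(self):
    return "HELLO, %s!" % self.name.upper()

g = Greeter("Kent")   # created before the method exists

# Assigning the function as a class attribute makes it a method,
# even for instances that already exist.
Greeter.shout = shout

print(g.shout())
```

Unlike Ruby, this does not extend to built-in types: in CPython, assigning an attribute on str raises TypeError.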

Ruby allows mixins--class fragments that can be added to a class definition to extend it. Python can do the same with multiple inheritance, at the time a class is defined, or by appending to __bases__, which might be considered a hack.
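
A sketch of the multiple-inheritance form, using hypothetical ReprMixin and EqMixin fragments mixed into a City class at definition time:

```python
class ReprMixin(object):
    """Class fragment: supplies __repr__ built from instance attributes."""
    def __repr__(self):
        attrs = ", ".join(
            "%s=%r" % item for item in sorted(vars(self).items())
        )
        return "%s(%s)" % (type(self).__name__, attrs)

class EqMixin(object):
    """Class fragment: supplies equality based on instance attributes."""
    def __eq__(self, other):
        return type(self) is type(other) and vars(self) == vars(other)

# Both fragments are mixed in when the class is defined.
class City(ReprMixin, EqMixin):
    def __init__(self, city, state):
        self.city = city
        self.state = state

boston = City("Boston", "Massachusetts")
print(repr(boston))
print(boston == City("Boston", "Massachusetts"))
```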

Ruby has support for creating aliases of methods and replacing them, and this is considered a good thing. In Python this is called monkeypatching and is generally frowned on.
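
Here is roughly what alias-and-replace looks like in Python, as a sketch with a made-up Account class. The original method is kept under a second name and the public name is rebound to a wrapper that calls through to it.

```python
class Account(object):
    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        self.balance -= amount
        return self.balance

# Keep the original method under a new name (the alias) ...
Account._withdraw_unchecked = Account.withdraw

# ... then replace it with a version that calls through to the alias.
def checked_withdraw(self, amount):
    if amount > self.balance:
        raise ValueError("insufficient funds")
    return self._withdraw_unchecked(amount)

Account.withdraw = checked_withdraw

acct = Account(100)
print(acct.withdraw(30))
```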

So there is not that much difference in capability. In Ruby some of these things are easier, and I don't discount that. But the main difference seems to be philosophical or cultural. In Python classes are thought of as fairly static--once you create one, it's done. Metaprogramming tricks are used during class creation to get some special effect, or to meet some unusual need.

In Ruby, though, classes are thought of as malleable. A class definition is just a starting point for the full functionality of the class. It's an interesting way of looking at it.

Categories: Python

© Kent S Johnson Creative Commons License

Comments about life, the universe and Python, from the imagination of Kent S Johnson.
