Python Rocks! and other rants
Weblog of Kent S Johnson

2004-10-25 20:49:04

When should I use classes?

A question beginners sometimes ask is, "Should I use classes in my program?" This article gives some ideas of when it is appropriate to introduce classes into a design.

One of the great things about Python is that it supports several styles of programming. A really simple program might start out as straight-line code without even any functions. As it gets more complicated you break it up into functions to keep it readable or to reuse bits of code. You may put some functions together in a module.

Many of the Python library modules are written like this, without any use of classes. This approach works well for many relatively simple needs.

On the other hand, some kinds of complexity are best addressed with classes. Here are some indications that you should start using classes, with illustrations taken from the Python standard library modules.

Several functions share state

You may have several functions that use the same state variables, either reading or writing them. You are passing a lot of parameters around. You have nested functions that have to forward their parameters to the functions they use. You are tempted to make some module variables to hold the state.

You could make a class instead! All methods of a class have access to all instance data of the class. By storing the shared state in the class, you avoid the need to pass it as parameters to the methods.

A class is a container for shared state, combined with functions (methods) that operate on that state. When you put state variables into member fields, they are accessible to all the methods of the class without having to be passed as parameters.
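Here is a minimal sketch of the idea. The ReportWriter class and its method names are made up for illustration; the point is only that width and indent are set once and then used by every method without being passed in.

```python
class ReportWriter:
    """Hypothetical example: several methods share the same layout settings."""

    def __init__(self, width, indent):
        # Shared state lives on the instance instead of being passed around.
        self.width = width
        self.indent = indent
        self.lines = []

    def add_heading(self, text):
        # Uses self.width without needing it as a parameter.
        self.lines.append(text.center(self.width))

    def add_item(self, text):
        # Uses self.indent without needing it as a parameter.
        self.lines.append(" " * self.indent + text)

    def render(self):
        return "\n".join(self.lines)


writer = ReportWriter(width=20, indent=2)
writer.add_heading("Totals")
writer.add_item("apples: 3")
report = writer.render()
```

Written as plain functions, each of these would need width, indent and lines as arguments, and every caller would have to carry them along.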

You can accomplish the same thing by using module-level (global) variables, but globals are generally a bad idea: any code in the module can change them, and there is only ever one copy.

The DictReader and DictWriter classes from the csv module are simple examples of this style of class. They hold parameters that describe an input or output stream and methods that operate on the stream.
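A short sketch of these two classes in use. The column names and the in-memory StringIO "files" are invented for the example; each object remembers its stream and field names so the calls that follow don't have to repeat them.

```python
import csv
import io

# In-memory stand-in for a file; the data is made up.
source = io.StringIO("name,qty\napple,3\npear,5\n")

# The reader holds the stream and takes field names from the header line.
reader = csv.DictReader(source)
rows = [dict(row) for row in reader]

# The writer holds the output stream and the field order.
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=["name", "qty"])
writer.writeheader()
writer.writerows(rows)
```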

Another way to solve this problem is to make a class that is only a container for the state. This kind of class resembles a struct in C or C++. The Dialect class in the csv module is an example of this. It holds quite a few configuration values. It has just two methods: a constructor and a validation function.
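A state-only class might look something like the sketch below. ExportOptions and format_row are hypothetical names, not part of the csv module; the shape (a constructor plus a validation method) loosely follows the description of Dialect above.

```python
class ExportOptions:
    """Hypothetical state-only class: configuration values plus a validity check."""

    def __init__(self, delimiter=",", quotechar='"', skip_blank_lines=True):
        self.delimiter = delimiter
        self.quotechar = quotechar
        self.skip_blank_lines = skip_blank_lines
        self.validate()

    def validate(self):
        if len(self.delimiter) != 1:
            raise ValueError("delimiter must be a single character")
        if len(self.quotechar) != 1:
            raise ValueError("quotechar must be a single character")


def format_row(values, options):
    # Functions receive one options object rather than a long parameter list.
    return options.delimiter.join(values)


tab_options = ExportOptions(delimiter="\t")
line = format_row(["a", "b", "c"], tab_options)
```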

A variation on the theme of shared data is to have one function that creates some data, and other functions that operate on the data. You could have the first function return all the data to its client. Then the client would forward the data to the consumer function. If the two functions are closely related, it might be a better solution to put the data into a class.

The FieldStorage class in the cgi module is a good example of this use of classes. Its constructor parses data from an HTTP FORM submission and stores the data in instance data. The other methods of the class provide access to the form data.
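The same pattern in miniature, as a rough sketch: QueryData is a hypothetical class, not how cgi.FieldStorage is actually implemented. The constructor does the parsing once, and the other methods only read what it stored.

```python
from urllib.parse import parse_qs


class QueryData:
    """Hypothetical sketch: parse once in the constructor, read through methods."""

    def __init__(self, query_string):
        # parse_qs maps each field name to a list of values.
        self._fields = parse_qs(query_string)

    def getfirst(self, name, default=None):
        values = self._fields.get(name)
        return values[0] if values else default

    def getlist(self, name):
        return list(self._fields.get(name, []))

    def keys(self):
        return sorted(self._fields)


form = QueryData("user=kent&tag=python&tag=oop")
user = form.getfirst("user")
tags = form.getlist("tag")
```

The client never sees the parsed dictionary, so it has nothing to forward from the producer to the consumers.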

More than one copy of the same state variables

Maybe you decided to put your shared state into global variables instead of creating a class. The single client of your module works great. But what if you have a second client that needs its own copy of the module state, or you need to use the module from multiple threads? You are getting wacky results as the module trips over itself trying to serve two masters. How can you have two copies of the global state? Move it into a class! Each client has its own instance of the class with its own state.

Each separate instance of a class has its own state, independent of the other instances.

For example, a DOM tree built with the xml.dom module may contain many Element objects, each with its own state. This wouldn't be possible without class instances to hold the data.
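A tiny illustration, using a made-up Tally class whose counts dictionary might otherwise have been a module global: two clients each get their own instance and cannot disturb each other.

```python
class Tally:
    """Hypothetical counter whose state would otherwise sit in a module global."""

    def __init__(self):
        # Each instance gets its own dictionary.
        self.counts = {}

    def add(self, key):
        self.counts[key] = self.counts.get(key, 0) + 1


first = Tally()
second = Tally()
first.add("a")
first.add("a")
second.add("b")
```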

To extend the behavior of an existing class

This is classic OOP. Maybe there is an existing class that does almost what you want. You just need to add a little something and it will be perfect. Or maybe a library class is intended to be subclassed in normal use.

Make a class which subclasses the class you want to change, and add the new behavior to it.

Note: This technique can easily be abused. Subclasses should have an is-a relationship to their superclass. If class Derived extends class Base, you should sensibly be able to say Derived is-a Base.

Consider containment (has-a) as an alternative to inheritance (is-a). If the base class is just an implementation detail of the derived class, then maybe instead of inheriting from base it should just contain a base instance as a member.
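To make the distinction concrete, here is a sketch with two invented classes. CountingList really is a list with one extra feature, so it inherits. Stack merely uses a list internally, so it contains one and exposes only the operations a stack should have.

```python
class CountingList(list):
    """Is-a: usable anywhere a list is, and also counts calls to append."""

    def __init__(self, *args):
        super().__init__(*args)
        self.appended = 0

    def append(self, item):
        self.appended += 1
        super().append(item)


class Stack:
    """Has-a: the list is an implementation detail, so it is held, not inherited."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def __len__(self):
        return len(self._items)


counted = CountingList([1])
counted.append(2)

stack = Stack()
stack.push("a")
stack.push("b")
top = stack.pop()
```

If Stack had inherited from list, clients could call sort or insert on it, which makes no sense for a stack.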

Some library classes are meant to be extended. For example, unittest.TestCase is routinely subclassed to make new test cases. cmd.Cmd must be subclassed to be useful. Widget classes in GUI frameworks are commonly subclassed.
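For instance, a unittest.TestCase subclass could look like this. The test class and what it checks are invented for the example; running the suite programmatically with the output sent to a StringIO is just one convenient way to keep the sketch self-contained.

```python
import io
import unittest


class SplitTests(unittest.TestCase):
    """The framework finds and runs each method whose name starts with test."""

    def test_split_on_comma(self):
        self.assertEqual("a,b".split(","), ["a", "b"])

    def test_split_on_whitespace(self):
        self.assertEqual("a b".split(), ["a", "b"])


# Load and run the tests, capturing the runner's report in a string.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SplitTests)
report = io.StringIO()
result = unittest.TextTestRunner(stream=report, verbosity=0).run(suite)
```

The subclass supplies only the tests; setup, discovery, running and reporting are all inherited.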

A callback function needs persistent state

This is a special situation that comes up pretty regularly for me.

For example, I have a database access module with a query method. The query method takes a callback function as an argument. For each row returned by the query, the callback function is called.

Now suppose I want to build a list with all the query rows. I can make a class that holds the list. The class has a __call__ method, which makes instances of the class act like functions. I pass an instance of the class to the query function. The instance accumulates the list, which I pull out at the end:

class Callback:
    def __init__(self):
        # Rows accumulate here across calls
        self.vals = []

    def __call__(self, row):
        # This is the actual callback
        self.vals.append(row)

The client code looks like this:

callback = Callback()
db.query(sql, callback)
result = callback.vals

Note: This could also be done with a closure like this:

result = []
def callback(row):
    result.append(row)
db.query(sql, callback)
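Since db and sql are not shown here, the following is a self-contained sketch of both approaches. The query function is a hypothetical stand-in that just calls the callback once per row, and make_collector is an invented helper showing the closure version wrapped in a function.

```python
def query(rows, callback):
    """Hypothetical stand-in for db.query: call the callback once per row."""
    for row in rows:
        callback(row)


class Collector:
    """Class version: the state lives on the instance."""

    def __init__(self):
        self.vals = []

    def __call__(self, row):
        self.vals.append(row)


def make_collector():
    """Closure version: vals is captured by the inner function."""
    vals = []

    def callback(row):
        vals.append(row)

    return callback, vals


sample_rows = [("kent", 1), ("guido", 2)]

collector = Collector()
query(sample_rows, collector)

callback, collected = make_collector()
query(sample_rows, callback)
```

Both end up with the same list. The class version is easier to extend if the callback later needs more state or extra methods.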

Finally

I have just scratched the surface of object-oriented design. My intent is to help beginners understand when it might be appropriate to use classes, not to catalog every possible use.

I have linked to the documentation pages for the modules I use as examples. If you have Python installed, you can find the source code in the Python library directory. On Windows look in C:\Python23\Lib.

 
© Kent S Johnson Creative Commons License
