39
It’s now possible to define static and class methods, in addition to the instance
methods available in previous versions of Python.
It’s also possible to automatically call methods on accessing or setting an instance
attribute by using a new mechanism called properties. Many uses of
__getattr__()
can be rewritten to use properties instead, making the resulting code simpler and
faster. As a small side benefit, attributes can now have docstrings, too.
The list of legal attributes for an instance can be limited to a particular set using
slots, making it possible to safeguard against typos and perhaps make more
optimizations possible in future versions of Python.
Some users have voiced concern about all these changes. Sure, they say, the new
features are neat and lend themselves to all sorts of tricks that weren’t possible in
previous versions of Python, but they also make the language more complicated. Some
people have said that they’ve always recommended Python for its simplicity, and feel that
its simplicity is being lost.
Personally, I think there’s no need to worry. Many of the new features are quite esoteric,
and you can write a lot of Python code without ever needed to be aware of them. Writing
a simple class is no more difficult than it ever was, so you don’t need to bother learning or
teaching them unless they’re actually needed. Some very complicated tasks that were
previously only possible from C will now be possible in pure Python, and to my mind that’s
all for the better.
I’m not going to attempt to cover every single corner case and small change that were
required to make the new features work. Instead this section will paint only the broad
strokes. See section Related Links, “Related Links”, for further sources of information
about Python 2.2’s new object model.
Old and New Classes
First, you should know that Python 2.2 really has two kinds of classes: classic or old-style
classes, and new-style classes. The old-style class model is exactly the same as the
class model in earlier versions of Python. All the new features described in this section
apply only to new-style classes. This divergence isn’t intended to last forever; eventually
old-style classes will be dropped, possibly in Python 3.0.
So how do you define a new-style class? You do it by subclassing an existing new-style
class. Most of Python’s built-in types, such as integers, lists, dictionaries, and even files,
are new-style classes now. A new-style class named
object
, the base class for all built-in
types, has also been added so if no built-in type is suitable, you can just subclass
object
:
65
class C(object):
def __init__ (self):
...
...
This means that
class
statements that don’t have any base classes are always classic
classes in Python 2.2. (Actually you can also change this by setting a module-level
variable named
__metaclass__
— see PEP 253 for the details — but it’s easier to just
subclass
object
.)
The type objects for the built-in types are available as built-ins, named using a clever
trick. Python has always had built-in functions named
int()
,
float()
, and
str()
. In 2.2,
they aren’t functions any more, but type objects that behave as factories when called.
>>> int
<type 'int'>
>>> int('123')
123
To make the set of types complete, new type objects such as
dict()
and
file()
have
been added. Here’s a more interesting example, adding a
lock()
method to file objects:
class LockableFile(file):
def lock (self, operation, length=0, start=0, whence=0):
import fcntl
return fcntl.lockf(self.fileno(), operation,
length, start, whence)
The now-obsolete
posixfile
module contained a class that emulated all of a file object’s
methods and also added a
lock()
method, but this class couldn’t be passed to internal
functions that expected a built-in file, something which is possible with our new
LockableFile
.
Descriptors
In previous versions of Python, there was no consistent way to discover what attributes
and methods were supported by an object. There were some informal conventions, such
as defining
__members__
and
__methods__
attributes that were lists of names, but often
the author of an extension type or a class wouldn’t bother to define them. You could fall
back on inspecting the
__dict__
of an object, but when class inheritance or an arbitrary
__getattr__()
hook were in use this could still be inaccurate.
The one big idea underlying the new class model is that an API for describing the
attributes of an object using descriptors has been formalized. Descriptors specify the
Online Remove password from protected PDF file If we need a password from you, it will not be read or stored. To hlep protect your PDF document in C# project, XDoc.PDF provides some PDF security settings.
creating secure pdf files; pdf security settings
52
value of an attribute, stating whether it’s a method or a field. With the descriptor API,
static methods and class methods become possible, as well as more exotic constructs.
Attribute descriptors are objects that live inside class objects, and have a few attributes of
their own:
__name__
is the attribute’s name.
__doc__
is the attribute’s docstring.
__get__(object)()
is a method that retrieves the attribute value from object.
__set__(object, value)()
sets the attribute on object to value.
__delete__(object, value)()
deletes the value attribute of object.
For example, when you write
obj.x
, the steps that Python actually performs are:
descriptor = obj.__class__.x
descriptor.__get__(obj)
For methods,
descriptor.__get__()
returns a temporary object that’s callable, and
wraps up the instance and the method to be called on it. This is also why static methods
and class methods are now possible; they have descriptors that wrap up just the method,
or the method and the class. As a brief explanation of these new kinds of methods, static
methods aren’t passed the instance, and therefore resemble regular functions. Class
methods are passed the class of the object, but not the object itself. Static and class
methods are defined like this:
class C(object):
def f(arg1, arg2):
...
f = staticmethod(f)
def g(cls, arg1, arg2):
...
g = classmethod(g)
The
staticmethod()
function takes the function
f()
, and returns it wrapped up in a
descriptor so it can be stored in the class object. You might expect there to be special
syntax for creating such methods (
def static f
,
defstatic f()
, or something like that)
but no such syntax has been defined yet; that’s been left for future versions of Python.
More new features, such as slots and properties, are also implemented as new kinds of
descriptors, and it’s not difficult to write a descriptor class that does something novel. For
example, it would be possible to write a descriptor class that made it possible to write
Eiffel-style preconditions and postconditions for a method. A class that used this feature
might be defined like this:
C# HTML5 Viewer: Deployment on AzureCloudService All. Create a New AzureCloudService Project in RasterEdge.XDoc.PDF.HTML5Editor.dll. validateIntegratedModeConfiguration="false"/> <security> <requestFiltering
convert secure webpage to pdf; decrypt pdf without password C# HTML5 Viewer: Deployment on ASP.NET MVC Create a New ASP.NET MVC3 RasterEdge.XDoc.PDF.HTML5Editor.dll. validateIntegratedM odeConfiguration="false"/> <security> <requestFiltering allowDoubleEscaping
convert locked pdf to word; add security to pdf in reader
62
from eiffel import eiffelmethod
class C(object):
def f(self, arg1, arg2):
# The actual function
...
def pre_f(self):
# Check preconditions
...
def post_f(self):
# Check postconditions
...
f = eiffelmethod(f, pre_f, post_f)
Note that a person using the new
eiffelmethod()
doesn’t have to understand anything
about descriptors. This is why I think the new features don’t increase the basic complexity
of the language. There will be a few wizards who need to know about it in order to write
eiffelmethod()
or the ZODB or whatever, but most users will just write code on top of
the resulting libraries and ignore the implementation details.
Multiple Inheritance: The Diamond Rule
Multiple inheritance has also been made more useful through changing the rules under
which names are resolved. Consider this set of classes (diagram taken from PEP 253 by
Guido van Rossum):
class A:
^ ^ def save(self): ...
/ \
/ \
/ \
/ \
class B class C:
^ ^ def save(self): ...
\ /
\ /
\ /
\ /
class D
The lookup rule for classic classes is simple but not very smart; the base classes are
searched depth-first, going from left to right. A reference to
D.save()
will search the
classes
D
,
B
, and then
A
, where
save()
would be found and returned.
C.save()
would
never be found at all. This is bad, because if
C
‘s
save()
method is saving some internal
state specific to
C
, not calling it will result in that state never getting saved.
New-style classes follow a different algorithm that’s a bit more complicated to explain, but
does the right thing in this situation. (Note that Python 2.3 changes this algorithm to one
80
that produces the same results in most cases, but produces more useful results for really
complicated inheritance graphs.)
1. List all the base classes, following the classic lookup rule and include a class multiple
times if it’s visited repeatedly. In the above example, the list of visited classes is [
D
,
B
,
A
,
C
,
A
].
2. Scan the list for duplicated classes. If any are found, remove all but one occurrence,
leaving the last one in the list. In the above example, the list becomes [
D
,
B
,
C
,
A
]
after dropping duplicates.
Following this rule, referring to
D.save()
will return
C.save()
, which is the behaviour
we’re after. This lookup rule is the same as the one followed by Common Lisp. A new
built-in function,
super()
, provides a way to get at a class’s superclasses without having
to reimplement Python’s algorithm. The most commonly used form will be
super(class,
obj)()
, which returns a bound superclass object (not the actual class object). This form
will be used in methods to call a method in the superclass; for example,
D
‘s
save()
method would look like this:
class D (B,C):
def save (self):
# Call superclass .save()
super(D, self).save()
# Save D's private information here
...
super()
can also return unbound superclass objects when called as
super(class)()
or
super(class1, class2)()
, but this probably won’t often be useful.
Attribute Access
A fair number of sophisticated Python classes define hooks for attribute access using
__getattr__()
; most commonly this is done for convenience, to make code more
readable by automatically mapping an attribute access such as
obj.parent
into a method
call such as
obj.get_parent
. Python 2.2 adds some new ways of controlling attribute
access.
First,
__getattr__(attr_name)()
is still supported by new-style classes, and nothing
about it has changed. As before, it will be called when an attempt is made to access
obj.foo
and no attribute named
foo
is found in the instance’s dictionary.
New-style classes also support a new method,
__getattribute__(attr_name)()
. The
difference between the two methods is that
__getattribute__()
is always called
67
whenever any attribute is accessed, while the old
__getattr__()
is only called if
foo
isn’t
found in the instance’s dictionary.
However, Python 2.2’s support for properties will often be a simpler way to trap attribute
references. Writing a
__getattr__()
method is complicated because to avoid recursion
you can’t use regular attribute accesses inside them, and instead have to mess around
with the contents of
__dict__
.
__getattr__()
methods also end up being called by
Python when it checks for other methods such as
__repr__()
or
__coerce__()
, and so
have to be written with this in mind. Finally, calling a function on every attribute access
results in a sizable performance loss.
property
is a new built-in type that packages up three functions that get, set, or delete
an attribute, and a docstring. For example, if you want to define a
size
attribute that’s
computed, but also settable, you could write:
class C(object):
def get_size (self):
result = ... computation ...
return result
def set_size (self, size):
... compute something based on the size
and set internal state appropriately ...
# Define a property. The 'delete this attribute'
# method is defined as None, so the attribute
# can't be deleted.
size = property(get_size, set_size,
None,
"Storage size of this instance")
That is certainly clearer and easier to write than a pair of
__getattr__()
/
__setattr__()
methods that check for the
size
attribute and handle it specially while retrieving all other
attributes from the instance’s
__dict__
. Accesses to
size
are also the only ones which
have to perform the work of calling a function, so references to other attributes run at
their usual speed.
Finally, it’s possible to constrain the list of attributes that can be referenced on an object
using the new
__slots__
class attribute. Python objects are usually very dynamic; at any
time it’s possible to define a new attribute on an instance by just doing
obj.new_attr=1
. A
new-style class can define a class attribute named
__slots__
to limit the legal attributes
to a particular set of names. An example will make this clear:
41
>>> class C(object):
... __slots__ = ('template', 'name')
...
>>> obj = C()
>>> print obj.template
None
>>> obj.template = 'Test'
>>> print obj.template
Test
>>> obj.newattr = None
Traceback (most recent call last):
File "<stdin>", line 1, in ?
AttributeError: 'C' object has no attribute 'newattr'
Note how you get an
AttributeError
on the attempt to assign to an attribute not listed in
__slots__
.
Related Links
This section has just been a quick overview of the new features, giving enough of an
explanation to start you programming, but many details have been simplified or ignored.
Where should you go to get a more complete picture?
http://www.python.org/2.2/descrintro.html is a lengthy tutorial introduction to the
descriptor features, written by Guido van Rossum. If my description has whetted your
appetite, go read this tutorial next, because it goes into much more detail about the new
features while still remaining quite easy to read.
Next, there are two relevant PEPs, PEP 252 and PEP 253. PEP 252 is titled “Making
Types Look More Like Classes”, and covers the descriptor API. PEP 253 is titled
“Subtyping Built-in Types”, and describes the changes to type objects that make it
possible to subtype built-in objects. PEP 253 is the more complicated PEP of the two, and
at a few points the necessary explanations of types and meta-types may cause your
head to explode. Both PEPs were written and implemented by Guido van Rossum, with
substantial assistance from the rest of the Zope Corp. team.
Finally, there’s the ultimate authority: the source code. Most of the machinery for the type
handling is in
Objects/typeobject.c
, but you should only resort to it after all other
avenues have been exhausted, including posting a question to python-list or python-dev.
PEP 234: Iterators
Another significant addition to 2.2 is an iteration interface at both the C and Python levels.
Objects can define how they can be looped over by callers.
61
In Python versions up to 2.1, the usual way to make
for item in obj
work is to define a
__getitem__()
method that looks something like this:
def __getitem__(self, index):
return <next item>
__getitem__()
is more properly used to define an indexing operation on an object so that
you can write
obj[5]
to retrieve the sixth element. It’s a bit misleading when you’re using
this only to support
for
loops. Consider some file-like object that wants to be looped
over; the index parameter is essentially meaningless, as the class probably assumes that
a series of
__getitem__()
calls will be made with index incrementing by one each time. In
other words, the presence of the
__getitem__()
method doesn’t mean that using
file[5]
to randomly access the sixth element will work, though it really should.
In Python 2.2, iteration can be implemented separately, and
__getitem__()
methods can
be limited to classes that really do support random access. The basic idea of iterators is
simple. A new built-in function,
iter(obj)()
or
iter(C, sentinel)
, is used to get an
iterator.
iter(obj)()
returns an iterator for the object obj, while
iter(C, sentinel)
returns an iterator that will invoke the callable object C until it returns sentinel to signal
that the iterator is done.
Python classes can define an
__iter__()
method, which should create and return a new
iterator for the object; if the object is its own iterator, this method can just return
self
. In
particular, iterators will usually be their own iterators. Extension types implemented in C
can implement a
tp_iter
function in order to return an iterator, and extension types that
want to behave as iterators can define a
tp_iternext
function.
So, after all this, what do iterators actually do? They have one required method,
next()
,
which takes no arguments and returns the next value. When there are no more values to
be returned, calling
next()
should raise the
StopIteration
exception.
56
>>> L = [1,2,3]
>>> i = iter(L)
>>> print i
<iterator object at 0x8116870>
>>> i.next()
1
>>> i.next()
2
>>> i.next()
3
>>> i.next()
Traceback (most recent call last):
File "<stdin>", line 1, in ?
StopIteration
>>>
In 2.2, Python’s
for
statement no longer expects a sequence; it expects something for
which
iter()
will return an iterator. For backward compatibility and convenience, an
iterator is automatically constructed for sequences that don’t implement
__iter__()
or a
tp_iter
slot, so
for i in [1,2,3]
will still work. Wherever the Python interpreter loops
over a sequence, it’s been changed to use the iterator protocol. This means you can do
things like this:
>>> L = [1,2,3]
>>> i = iter(L)
>>> a,b,c = i
>>> a,b,c
(1, 2, 3)
Iterator support has been added to some of Python’s basic types. Calling
iter()
on a
dictionary will return an iterator which loops over its keys:
>>> m = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6,
... 'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12}
>>> for key in m: print key, m[key]
...
Mar 3
Feb 2
Aug 8
Sep 9
May 5
Jun 6
Jul 7
Jan 1
Apr 4
Nov 11
Dec 12
Oct 10
That’s just the default behaviour. If you want to iterate over keys, values, or key/value
Documents you may be interested
Documents you may be interested