s = struct.Struct('ih3s')
data = s.pack(1972, 187, 'abc')
year, number, name = s.unpack(data)
You can also pack and unpack data to and from buffer objects directly using the
pack_into(buffer, offset, v1, v2, ...) and unpack_from(buffer, offset) methods. This
lets you store data directly into an array or a memory-mapped file. (Struct objects
were implemented by Bob Ippolito at the NeedForSpeed sprint. Support for buffer
objects was added by Martin Blais, also at the NeedForSpeed sprint.)
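As a rough sketch of the buffer methods, the snippet below writes two records into one preallocated buffer and reads the second back by offset. It is written for a current Python 3 (hence the bytes literals), and the use of a bytearray as the target buffer is an assumption made for illustration; an array or memory-mapped file should behave the same way.

```python
import struct

# Same layout as above: int, short, 3-byte string.
record = struct.Struct('ih3s')

# A writable buffer with room for two records (bytearray is an assumption).
buf = bytearray(record.size * 2)

# Write two records at explicit byte offsets.
record.pack_into(buf, 0, 1972, 187, b'abc')
record.pack_into(buf, record.size, 2006, 25, b'xyz')

# Read the second record back without slicing the buffer.
year, number, name = record.unpack_from(buf, record.size)
```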
The Python developers switched from CVS to Subversion during the 2.5
development process. Information about the exact build version is available as the
sys.subversion variable, a 3-tuple of (interpreter-name, branch-name, revision-range).
For example, at the time of writing my copy of 2.5 was reporting
('CPython', 'trunk', '45313:45315').
This information is also available to C extensions via the Py_GetBuildInfo() function,
which returns a string of build information like this:
"trunk:45355:45356M, Apr 13 2006, 07:42:19". (Contributed by Barry Warsaw.)
Another new function, sys._current_frames(), returns the current stack frames for
all running threads as a dictionary mapping thread identifiers to the topmost stack
frame currently active in that thread at the time the function is called. (Contributed by
Tim Peters.)
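One possible use of this dictionary is a small debugging helper that formats the stack of every thread. The sketch below assumes a current Python 3; the helper name dump_thread_stacks is hypothetical and not part of the standard library.

```python
import sys
import threading
import traceback

def dump_thread_stacks():
    """Return a dict mapping thread id -> formatted stack text (hypothetical helper)."""
    stacks = {}
    for thread_id, frame in sys._current_frames().items():
        stacks[thread_id] = ''.join(traceback.format_stack(frame))
    return stacks

stacks = dump_thread_stacks()

# The calling thread always appears in the snapshot.
main_id = threading.get_ident()
print(stacks[main_id])
```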
The TarFile class in the tarfile module now has an extractall() method that
extracts all members from the archive into the current working directory. It’s also
possible to set a different directory as the extraction target, and to unpack only a
subset of the archive’s members.
The compression used for a tarfile opened in stream mode can now be autodetected
using the mode 'r|*'. (Contributed by Lars Gustäbel.)
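To make the target-directory and subset options concrete, here is a minimal sketch that builds a throwaway archive and extracts two of its three members into a chosen directory. It assumes a current Python 3; the file names and the temporary directory layout are invented for the example.

```python
import os
import tarfile
import tempfile

workdir = tempfile.mkdtemp()

# Build a small archive to work with (file names are hypothetical).
src = os.path.join(workdir, 'src')
os.mkdir(src)
for name in ('a.txt', 'b.txt', 'c.txt'):
    with open(os.path.join(src, name), 'w') as f:
        f.write('contents of ' + name)

archive = os.path.join(workdir, 'example.tar.gz')
with tarfile.open(archive, 'w:gz') as tar:
    for name in ('a.txt', 'b.txt', 'c.txt'):
        tar.add(os.path.join(src, name), arcname=name)

# Extract only a subset of the members into a chosen directory.
dest = os.path.join(workdir, 'out')
os.mkdir(dest)
with tarfile.open(archive, 'r:*') as tar:
    wanted = [m for m in tar.getmembers() if m.name != 'b.txt']
    tar.extractall(path=dest, members=wanted)

extracted = sorted(os.listdir(dest))
```

When extracting archives from untrusted sources, it is worth checking member names first, since an archive can contain paths that point outside the target directory.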
The threading module now lets you set the stack size used when new threads are
created. The stack_size([size]) function returns the currently configured stack
size, and supplying the optional size parameter sets a new value. Not all platforms
support changing the stack size, but Windows, POSIX threading, and OS/2 all do.
(Contributed by Andrew MacIntyre.)
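A cautious way to use the function is to remember the old value, try the new one, and restore afterwards. The sketch below assumes a current Python 3; the 256 KiB figure is an arbitrary illustration, and the exceptions caught are the ones CPython is expected to raise for a rejected size or an unsupported platform.

```python
import threading

# With no argument, stack_size() reports the current setting;
# 0 means "use the platform default".
previous = threading.stack_size()

results = []
try:
    # Request 256 KiB stacks for threads created from now on.
    threading.stack_size(256 * 1024)
    worker = threading.Thread(
        target=lambda: results.append(threading.stack_size()))
    worker.start()
    worker.join()
except (ValueError, RuntimeError):
    # ValueError: the size was rejected; RuntimeError: not changeable here.
    pass
finally:
    # Restore the earlier setting so other code is unaffected.
    try:
        threading.stack_size(previous)
    except (ValueError, RuntimeError):
        pass
```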
The unicodedata module has been updated to use version 4.1.0 of the Unicode
character database. Version 3.2.0 is required by some specifications, so it’s still
available as unicodedata.ucd_3_2_0.
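The frozen database object offers the same lookup functions as the module itself, exposed as methods. A small sketch, assuming a current Python 3 (where the module-level database version will be newer than the 4.1.0 mentioned above):

```python
import unicodedata

# Version of the Unicode database behind the module-level functions.
current_version = unicodedata.unidata_version

# The frozen 3.2.0 database exposes the same lookups as methods.
legacy = unicodedata.ucd_3_2_0
legacy_version = legacy.unidata_version

# The euro sign predates Unicode 3.2.0, so both databases know its name.
name_now = unicodedata.name('\u20ac')
name_legacy = legacy.name('\u20ac')
```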
New module: the uuid module generates universally unique identifiers (UUIDs)
according to RFC 4122. The RFC defines several different UUID versions that are
generated from a starting string, from system properties, or purely randomly. This
module contains a UUID class and functions named uuid1(), uuid3(), uuid4(), and
uuid5() to generate different versions of UUID. (Version 2 UUIDs are not specified
in RFC 4122 and are not supported by this module.)
>>> import uuid
>>> # make a UUID based on the host ID and current time
>>> uuid.uuid1()
UUID('a8098c1a-f86e-11da-bd1a-00112444be1e')
>>> # make a UUID using an MD5 hash of a namespace UUID and a name
>>> uuid.uuid3(uuid.NAMESPACE_DNS, 'python.org')
UUID('6fa459ea-ee8a-3ca4-894e-db77e160355e')
>>> # make a random UUID
>>> uuid.uuid4()
UUID('16fd2706-8baf-433b-82eb-8c7fada847da')
>>> # make a UUID using a SHA-1 hash of a namespace UUID and a name
>>> uuid.uuid5(uuid.NAMESPACE_DNS, 'python.org')
UUID('886313e1-3b8a-5372-9b90-0c9aee199e5d')
(Contributed by Ka-Ping Yee.)
The weakref module’s WeakKeyDictionary and WeakValueDictionary types gained
new methods for iterating over the weak references contained in the dictionary.
iterkeyrefs() and keyrefs() methods were added to WeakKeyDictionary, and
itervaluerefs() and valuerefs() were added to WeakValueDictionary.
(Contributed by Fred L. Drake, Jr.)
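The following sketch shows what the reference-returning methods hand back. It assumes a current Python 3 and therefore sticks to keyrefs() and valuerefs(); the Node class is a hypothetical stand-in for any weakly referenceable object.

```python
import weakref

class Node(object):
    """Hypothetical class; instances of ordinary classes can be weakly referenced."""
    def __init__(self, name):
        self.name = name

# Strong references keep both objects alive for the whole example.
a, b = Node('a'), Node('b')

by_node = weakref.WeakKeyDictionary({a: 1, b: 2})
by_name = weakref.WeakValueDictionary({'a': a, 'b': b})

# Each method returns weakref objects; calling one yields the referent,
# or None if the referent has already been collected.
key_names = sorted(ref().name for ref in by_node.keyrefs()
                   if ref() is not None)
value_names = sorted(ref().name for ref in by_name.valuerefs()
                     if ref() is not None)
```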
The webbrowser module received a number of enhancements. It’s now usable as a
script with python -m webbrowser, taking a URL as the argument; there are a
number of switches to control the behaviour (-n for a new browser window, -t for a
new tab). New module-level functions, open_new() and open_new_tab(), were added
to support this. The module’s open() function supports an additional feature, an
autoraise parameter that signals whether to raise the open window when possible. A
number of additional browsers were added to the supported list such as Firefox,
Opera, Konqueror, and elinks. (Contributed by Oleg Broytmann and Georg Brandl.)
The xmlrpclib module now supports returning datetime objects for the XML-RPC
date type. Supply use_datetime=True to the loads() function or the Unmarshaller
class to enable this feature. (Contributed by Skip Montanaro.)
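To see the effect of the flag, one can marshal a date and unmarshal it both ways. This sketch assumes a current Python 3, where the module is named xmlrpc.client rather than xmlrpclib; the sample timestamp is arbitrary.

```python
import datetime
import xmlrpc.client  # Python 3 name; called xmlrpclib in Python 2

when = datetime.datetime(2006, 4, 13, 7, 42, 19)

# Marshal a one-element parameter tuple containing a date/time value.
payload = xmlrpc.client.dumps((when,))

# Default behaviour: the value comes back as a DateTime wrapper object.
(default_value,), _ = xmlrpc.client.loads(payload)

# With use_datetime=True it comes back as a datetime.datetime.
(converted,), _ = xmlrpc.client.loads(payload, use_datetime=True)
```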
The zipfile module now supports the ZIP64 version of the format, meaning that a
.zip archive can now be larger than 4 GiB and can contain individual files larger than
4 GiB. (Contributed by Ronald Oussoren.)
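ZIP64 support is controlled by the allowZip64 argument of ZipFile. The sketch below only demonstrates passing the flag, using a tiny in-memory archive, since actually crossing the 4 GiB limit is impractical in an example; it assumes a current Python 3.

```python
import io
import zipfile

buf = io.BytesIO()

# allowZip64=True permits ZIP64 extensions if the archive or a member
# outgrows the classic limits; small archives are written as usual.
with zipfile.ZipFile(buf, 'w', allowZip64=True) as zf:
    zf.writestr('hello.txt', 'hello, world')

# Read the archive back from the same in-memory buffer.
with zipfile.ZipFile(buf) as zf:
    names = zf.namelist()
    text = zf.read('hello.txt').decode('ascii')
```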
The zlib module’s Compress and Decompress objects now support a copy() method
that makes a copy of the object’s internal state and returns a new Compress or
Decompress object. (Contributed by Chris AtLee.)
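A typical reason to copy a compressor is to compress a shared prefix once and then finish several streams with different endings. A minimal sketch, assuming a current Python 3 and made-up sample data:

```python
import zlib

header = b'common prefix ' * 100

# Compress the shared prefix once.
base = zlib.compressobj()
shared = base.compress(header)

# Fork the compressor state, then finish each stream with its own tail.
fork = base.copy()
stream_a = shared + base.compress(b'ending A') + base.flush()
stream_b = shared + fork.compress(b'ending B') + fork.flush()
```

Both results are complete zlib streams that decompress independently of each other.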
The ctypes package
The ctypes package, written by Thomas Heller, has been added to the standard library.
ctypes lets you call arbitrary functions in shared libraries or DLLs. Long-time users may
remember the dl module, which provides functions for loading shared libraries and calling
functions in them. The ctypes package is much fancier.
To load a shared library or DLL, you must create an instance of the CDLL class and
provide the name or path of the shared library or DLL. Once that’s done, you can call
arbitrary functions by accessing them as attributes of the CDLL object.
import ctypes
libc = ctypes.CDLL('libc.so.6')
result = libc.printf("Line of output\n")
Type constructors for the various C types are provided: c_int(), c_float(), c_double(),
c_char_p() (equivalent to char *), and so forth. Unlike Python’s types, the C versions
are all mutable; you can assign to their value attribute to change the wrapped value.
Python integers and strings will be automatically converted to the corresponding C types,
but for other types you must call the correct type constructor. (And I mean must; getting
it wrong will often result in the interpreter crashing with a segmentation fault.)
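The mutability of the wrapper objects can be seen without loading any library. A short sketch, assuming a current Python 3 (where c_char_p wraps a bytes object rather than a str):

```python
import ctypes

# The wrappers are mutable: assign to .value to change the contents.
n = ctypes.c_int(42)
n.value = 7

x = ctypes.c_double(2.5)
x.value *= 2

# c_char_p wraps a pointer to a byte string; assigning .value repoints it.
s = ctypes.c_char_p(b'first')
s.value = b'second'
```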
You shouldn’t use c_char_p() with a Python string when the C function will be modifying
the memory area, because Python strings are supposed to be immutable; breaking this
rule will cause puzzling bugs. When you need a modifiable memory area, use
create_string_buffer():
s = "this is a string"
buf = ctypes.create_string_buffer(s)
libc.strfry(buf)
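Because strfry() is specific to glibc, a more portable way to watch a buffer being modified in place is ctypes.memset, used below purely as a stand-in for a C function that writes into its argument. The sketch assumes a current Python 3, where the initial value must be a bytes object.

```python
import ctypes

original = b'this is a string'

# The buffer holds a private copy of the data plus a trailing NUL byte.
buf = ctypes.create_string_buffer(original)

# Stand-in for a C function that writes into its argument:
# overwrite the first four bytes of the buffer in place.
ctypes.memset(buf, ord('x'), 4)

modified = buf.value         # bytes up to the first NUL
size = ctypes.sizeof(buf)    # includes the terminating NUL
```

The original Python bytes object is untouched, since only the buffer's own copy was changed.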
C functions are assumed to return integers, but you can set the restype attribute of the
function object to change this:
>>> libc.atof('2.71828')
-1783957616
>>> libc.atof.restype = ctypes.c_double
>>> libc.atof('2.71828')
2.71828
ctypes also provides a wrapper for Python’s C API as the ctypes.pythonapi object. This
object does not release the global interpreter lock before calling a function, because the
lock must be held when calling into the interpreter’s code. There’s a py_object() type
constructor that will create a PyObject * pointer. A simple usage:
import ctypes
d = {}
ctypes.pythonapi.PyObject_SetItem(ctypes.py_object(d),
ctypes.py_object("abc"), ctypes.py_object(1))
# d is now {'abc': 1}.
Don’t forget to use py_object(); if it’s omitted you end up with a segmentation fault.
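One way to reduce the risk of forgetting the wrapper is to declare the argument and return types on the function object, so that ctypes performs the conversion itself. The sketch below applies this to two functions from Python's C API; it assumes a current Python 3 running on CPython, and the local names as_double and set_item are chosen just for the example.

```python
import ctypes

# PyFloat_AsDouble takes a PyObject * and returns a C double,
# so both argtypes and restype are declared explicitly.
as_double = ctypes.pythonapi.PyFloat_AsDouble
as_double.argtypes = [ctypes.py_object]
as_double.restype = ctypes.c_double
value = as_double(2.71828)

# PyObject_SetItem(o, key, v) returns an int: 0 on success, -1 on failure.
d = {}
set_item = ctypes.pythonapi.PyObject_SetItem
set_item.argtypes = [ctypes.py_object, ctypes.py_object, ctypes.py_object]
set_item.restype = ctypes.c_int
status = set_item(d, 'abc', 1)
```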
ctypes has been around for a while, but people still write and distribute hand-coded
extension modules because you can’t rely on ctypes being present. Perhaps developers
will begin to write Python wrappers atop a library accessed through ctypes instead of
extension modules, now that ctypes is included with core Python.
See also:
http://starship.python.net/crew/theller/ctypes/
The ctypes web page, with a tutorial, reference, and FAQ.
The documentation for the ctypes module.
The ElementTree package
A subset of Fredrik Lundh’s ElementTree library for processing XML has been added to
the standard library as xml.etree. The available modules are ElementTree, ElementPath,
and ElementInclude from ElementTree 1.2.6. The cElementTree accelerator module is
also included.
The rest of this section will provide a brief overview of using ElementTree. Full
documentation for ElementTree is available at http://effbot.org/zone/element-index.htm.
ElementTree represents an XML document as a tree of element nodes. The text content
of the document is stored as the text and tail attributes of the element nodes. (This is
one of the major differences between ElementTree and the Document Object Model; in
the DOM there are many different types of node, including TextNode.)
The most commonly used parsing function is parse(), which takes either a string
(assumed to contain a filename) or a file-like object and returns an ElementTree
instance:
import urllib
from xml.etree import ElementTree as ET
# Parse from a filename
tree = ET.parse('ex-1.xml')
# Parse from a file-like object
feed = urllib.urlopen(
    'http://planet.python.org/rss10.xml')
tree = ET.parse(feed)
Once you have an ElementTree instance, you can call its getroot() method to get the
root Element node.
There’s also an XML() function that takes a string literal and returns an Element node
(not an ElementTree). This function provides a tidy way to incorporate XML fragments,
approaching the convenience of an XML literal:
svg = ET.XML("""<svg width="10px" version="1.0">
</svg>""")
svg.set('height', '320px')
svg.append(elem1)
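The snippet above refers to an elem1 that is defined elsewhere. A self-contained variant might look like the following sketch, which assumes a current Python 3; the circle element and its attribute are hypothetical, and the result is wrapped in an ElementTree to show the relationship with getroot().

```python
from xml.etree import ElementTree as ET

svg = ET.XML('<svg width="10px" version="1.0"><g id="layer1"/></svg>')
svg.set('height', '320px')

# A hypothetical child element to append.
circle = ET.Element('circle', {'r': '5'})
svg.append(circle)

# Wrap the root Element in an ElementTree and fetch it back.
tree = ET.ElementTree(svg)
root = tree.getroot()
child_tags = [child.tag for child in root]
```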
Each XML element supports some dictionary-like and some list-like access methods.
Dictionary-like operations are used to access attribute values, and list-like operations are
used to access child nodes.
Operation                    Result
elem[n]                      Returns n’th child element.
elem[m:n]                    Returns list of child elements from index m up to,
                             but not including, index n.
len(elem)                    Returns number of child elements.
list(elem)                   Returns list of child elements.
elem.append(elem2)           Adds elem2 as a child.
elem.insert(index, elem2)    Inserts elem2 at the specified location.
del elem[n]                  Deletes n’th child element.
elem.keys()                  Returns list of attribute names.
elem.get(name)               Returns value of attribute name.
elem.set(name, value)        Sets new value for attribute name.
elem.attrib                  Retrieves the dictionary containing attributes.
del elem.attrib[name]        Deletes attribute name.
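The operations in the table can be exercised on a small document. This is only an illustrative sketch, assuming a current Python 3; the element names, attribute names, and values are made up.

```python
from xml.etree import ElementTree as ET

elem = ET.XML('<list kind="todo">'
              '<item>one</item><item>two</item><item>three</item>'
              '</list>')

# List-like access to child elements.
first = elem[0].text
middle = [e.text for e in elem[1:3]]
count = len(elem)

extra = ET.Element('item')
extra.text = 'zero'
elem.insert(0, extra)       # add at the front
del elem[-1]                # drop the last child
texts = [e.text for e in elem]

# Dictionary-like access to attributes.
elem.set('owner', 'amk')
attrs = sorted(elem.keys())
kind = elem.get('kind')
del elem.attrib['kind']
remaining = list(elem.keys())
```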
Comments and processing instructions are also represented as Element nodes. To check
if a node is a comment or a processing instruction:
if elem.tag is ET.Comment:
    ...
elif elem.tag is ET.ProcessingInstruction:
    ...
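To try the check out, a tree containing all three kinds of node can be built by hand with the Comment() and ProcessingInstruction() factory functions. A sketch assuming a current Python 3, with invented tag names and contents:

```python
from xml.etree import ElementTree as ET

root = ET.Element('doc')
root.append(ET.Comment('a comment'))
root.append(ET.ProcessingInstruction('target', 'some data'))
root.append(ET.Element('para'))

# Classify each child by comparing its tag with the factory functions.
kinds = []
for elem in root:
    if elem.tag is ET.Comment:
        kinds.append('comment')
    elif elem.tag is ET.ProcessingInstruction:
        kinds.append('pi')
    else:
        kinds.append(elem.tag)
```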
To generate XML output, you should call the ElementTree.write() method. Like parse(),
it can take either a string or a file-like object:
# Encoding is US-ASCII
tree.write('output.xml')
# Encoding is UTF-8
f = open('output.xml', 'w')
tree.write(f, encoding='utf-8')
(Caution: the default encoding used for output is ASCII. For general XML work, where an
element’s name may contain arbitrary Unicode characters, ASCII isn’t a very useful
encoding because it will raise an exception if an element’s name contains any characters
with values greater than 127. Therefore, it’s best to specify a different encoding such as
UTF-8 that can handle any Unicode character.)
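The effect of the encoding argument can be checked with non-ASCII text content. The sketch below assumes a current Python 3, where write() with a byte encoding expects a binary file object; an in-memory io.BytesIO is used in place of a real file, and the element name and text are invented.

```python
import io
from xml.etree import ElementTree as ET

root = ET.Element('greeting')
root.text = 'caf\u00e9'
tree = ET.ElementTree(root)

# Write with an explicit UTF-8 encoding into an in-memory binary file.
out = io.BytesIO()
tree.write(out, encoding='utf-8')
data = out.getvalue()

# Parsing the bytes back recovers the original text.
round_trip = ET.XML(data).text
```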
This section is only a partial description of the ElementTree interfaces. Please read the
package’s official documentation for more details.
See also:
http://effbot.org/zone/element-index.htm
Official documentation for ElementTree.
The hashlib package
A new hashlib module, written by Gregory P. Smith, has been added to replace the
md5 and sha modules. hashlib adds support for additional secure hashes (SHA-224,
SHA-256, SHA-384, and SHA-512). When available, the module uses OpenSSL for fast
platform-optimized implementations of algorithms.
The old md5 and sha modules still exist as wrappers around hashlib to preserve
backwards compatibility. The new module’s interface is very close to that of the old
modules, but not identical. The most significant difference is that the constructor
functions for creating new hashing objects are named differently.
# Old versions
h = md5.md5()
h = md5.new()
# New version
h = hashlib.md5()
# Old versions
h = sha.sha()
h = sha.new()
# New version
h = hashlib.sha1()
# Hashes that weren't previously available
h = hashlib.sha224()
h = hashlib.sha256()
h = hashlib.sha384()
h = hashlib.sha512()
# Alternative form
h = hashlib.new('md5') # Provide algorithm as a string
Once a hash object has been created, its methods are the same as before:
update(string) hashes the specified string into the current digest state, digest() and
hexdigest() return the digest value as a binary string or a string of hex digits, and
copy() returns a new hashing object with the same digest state.
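A brief sketch of these methods working together, assuming a current Python 3 (where update() takes bytes); the input chunks are arbitrary:

```python
import hashlib

h = hashlib.sha256()
h.update(b'first chunk, ')

# Fork the digest state before feeding different continuations.
fork = h.copy()
h.update(b'second chunk')
fork.update(b'alternative')

digest_hex = h.hexdigest()   # string of hex digits
digest_raw = h.digest()      # raw binary digest

# Hashing the concatenated input in one call gives the same result.
one_shot = hashlib.sha256(b'first chunk, second chunk').hexdigest()
```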
See also: The documentation for the hashlib module.
The sqlite3 package
The pysqlite module (http://www.pysqlite.org), a wrapper for the SQLite embedded
database, has been added to the standard library under the package name sqlite3.
SQLite is a C library that provides a lightweight disk-based database that doesn’t require
a separate server process and allows accessing the database using a nonstandard
variant of the SQL query language. Some applications can use SQLite for internal data
storage. It’s also possible to prototype an application using SQLite and then port the code
to a larger database such as PostgreSQL or Oracle.
pysqlite was written by Gerhard Häring and provides a SQL interface compliant with the
DB-API 2.0 specification described by PEP 249.
If you’re compiling the Python source yourself, note that the source tree doesn’t include
the SQLite code, only the wrapper module. You’ll need to have the SQLite libraries and
headers installed before compiling Python, and the build process will compile the module
when the necessary headers are available.
To use the module, you must first create a Connection object that represents the
database. Here the data will be stored in the /tmp/example file:
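A minimal sketch of creating a Connection and using it, assuming a current Python 3. To keep the example self-contained, a temporary path stands in for /tmp/example, and the table name and columns are hypothetical.

```python
import os
import sqlite3
import tempfile

# A temporary path stands in for a real database location.
db_path = os.path.join(tempfile.mkdtemp(), 'example.db')

# Connecting creates the database file if it does not exist yet.
conn = sqlite3.connect(db_path)
cur = conn.cursor()

# Hypothetical table, used only to show a round trip through the database.
cur.execute('CREATE TABLE stocks (symbol TEXT, qty REAL)')
cur.execute('INSERT INTO stocks VALUES (?, ?)', ('IBM', 1000))
conn.commit()

rows = cur.execute('SELECT symbol, qty FROM stocks').fetchall()
conn.close()
```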