mirror of
https://github.com/python/cpython.git
synced 2025-01-04 07:34:35 +08:00
2285 lines
98 KiB
ReStructuredText
2285 lines
98 KiB
ReStructuredText
****************************
|
|
What's New in Python 2.5
|
|
****************************
|
|
|
|
:Author: A.M. Kuchling
|
|
|
|
.. |release| replace:: 1.01
|
|
|
|
.. $Id: whatsnew25.tex 56611 2007-07-29 08:26:10Z georg.brandl $
|
|
.. Fix XXX comments
|
|
|
|
This article explains the new features in Python 2.5. The final release of
|
|
Python 2.5 is scheduled for August 2006; :pep:`356` describes the planned
|
|
release schedule.
|
|
|
|
The changes in Python 2.5 are an interesting mix of language and library
|
|
improvements. The library enhancements will be more important to Python's user
|
|
community, I think, because several widely-useful packages were added. New
|
|
modules include ElementTree for XML processing (:mod:`xml.etree`),
|
|
the SQLite database module (:mod:`sqlite`), and the :mod:`ctypes`
|
|
module for calling C functions.
|
|
|
|
The language changes are of middling significance. Some pleasant new features
|
|
were added, but most of them aren't features that you'll use every day.
|
|
Conditional expressions were finally added to the language using a novel syntax;
|
|
see section :ref:`pep-308`. The new ':keyword:`with`' statement will make
|
|
writing cleanup code easier (section :ref:`pep-343`). Values can now be passed
|
|
into generators (section :ref:`pep-342`). Imports are now visible as either
|
|
absolute or relative (section :ref:`pep-328`). Some corner cases of exception
|
|
handling are handled better (section :ref:`pep-341`). All these improvements
|
|
are worthwhile, but they're improvements to one specific language feature or
|
|
another; none of them are broad modifications to Python's semantics.
|
|
|
|
As well as the language and library additions, other improvements and bugfixes
|
|
were made throughout the source tree. A search through the SVN change logs
|
|
finds there were 353 patches applied and 458 bugs fixed between Python 2.4 and
|
|
2.5. (Both figures are likely to be underestimates.)
|
|
|
|
This article doesn't try to be a complete specification of the new features;
|
|
instead changes are briefly introduced using helpful examples. For full
|
|
details, you should always refer to the documentation for Python 2.5 at
|
|
http://docs.python.org. If you want to understand the complete implementation
|
|
and design rationale, refer to the PEP for a particular new feature.
|
|
|
|
Comments, suggestions, and error reports for this document are welcome; please
|
|
e-mail them to the author or open a bug in the Python bug tracker.
|
|
|
|
.. ======================================================================
|
|
|
|
|
|
.. _pep-308:
|
|
|
|
PEP 308: Conditional Expressions
|
|
================================
|
|
|
|
For a long time, people have been requesting a way to write conditional
|
|
expressions, which are expressions that return value A or value B depending on
|
|
whether a Boolean value is true or false. A conditional expression lets you
|
|
write a single assignment statement that has the same effect as the following::
|
|
|
|
if condition:
|
|
x = true_value
|
|
else:
|
|
x = false_value
|
|
|
|
There have been endless tedious discussions of syntax on both python-dev and
|
|
comp.lang.python. A vote was even held that found the majority of voters wanted
|
|
conditional expressions in some form, but there was no syntax that was preferred
|
|
by a clear majority. Candidates included C's ``cond ? true_v : false_v``, ``if
|
|
cond then true_v else false_v``, and 16 other variations.
|
|
|
|
Guido van Rossum eventually chose a surprising syntax::
|
|
|
|
x = true_value if condition else false_value
|
|
|
|
Evaluation is still lazy as in existing Boolean expressions, so the order of
|
|
evaluation jumps around a bit. The *condition* expression in the middle is
|
|
evaluated first, and the *true_value* expression is evaluated only if the
|
|
condition was true. Similarly, the *false_value* expression is only evaluated
|
|
when the condition is false.
|
|
|
|
This syntax may seem strange and backwards; why does the condition go in the
|
|
*middle* of the expression, and not in the front as in C's ``c ? x : y``? The
|
|
decision was checked by applying the new syntax to the modules in the standard
|
|
library and seeing how the resulting code read. In many cases where a
|
|
conditional expression is used, one value seems to be the 'common case' and one
|
|
value is an 'exceptional case', used only on rarer occasions when the condition
|
|
isn't met. The conditional syntax makes this pattern a bit more obvious::
|
|
|
|
contents = ((doc + '\n') if doc else '')
|
|
|
|
I read the above statement as meaning "here *contents* is usually assigned a
|
|
value of ``doc+'\n'``; sometimes *doc* is empty, in which special case an empty
|
|
string is returned." I doubt I will use conditional expressions very often
|
|
where there isn't a clear common and uncommon case.
|
|
|
|
There was some discussion of whether the language should require surrounding
|
|
conditional expressions with parentheses. The decision was made to *not*
|
|
require parentheses in the Python language's grammar, but as a matter of style I
|
|
think you should always use them. Consider these two statements::
|
|
|
|
# First version -- no parens
|
|
level = 1 if logging else 0
|
|
|
|
# Second version -- with parens
|
|
level = (1 if logging else 0)
|
|
|
|
In the first version, I think a reader's eye might group the statement into
|
|
'level = 1', 'if logging', 'else 0', and think that the condition decides
|
|
whether the assignment to *level* is performed. The second version reads
|
|
better, in my opinion, because it makes it clear that the assignment is always
|
|
performed and the choice is being made between two values.
|
|
|
|
Another reason for including the brackets: a few odd combinations of list
|
|
comprehensions and lambdas could look like incorrect conditional expressions.
|
|
See :pep:`308` for some examples. If you put parentheses around your
|
|
conditional expressions, you won't run into this case.
|
|
|
|
|
|
.. seealso::
|
|
|
|
:pep:`308` - Conditional Expressions
|
|
PEP written by Guido van Rossum and Raymond D. Hettinger; implemented by Thomas
|
|
Wouters.
|
|
|
|
.. ======================================================================
|
|
|
|
|
|
.. _pep-309:
|
|
|
|
PEP 309: Partial Function Application
|
|
=====================================
|
|
|
|
The :mod:`functools` module is intended to contain tools for functional-style
|
|
programming.
|
|
|
|
One useful tool in this module is the :func:`partial` function. For programs
|
|
written in a functional style, you'll sometimes want to construct variants of
|
|
existing functions that have some of the parameters filled in. Consider a
|
|
Python function ``f(a, b, c)``; you could create a new function ``g(b, c)`` that
|
|
was equivalent to ``f(1, b, c)``. This is called "partial function
|
|
application".
|
|
|
|
:func:`partial` takes the arguments ``(function, arg1, arg2, ... kwarg1=value1,
|
|
kwarg2=value2)``. The resulting object is callable, so you can just call it to
|
|
invoke *function* with the filled-in arguments.
|
|
|
|
Here's a small but realistic example::
|
|
|
|
import functools
|
|
|
|
def log (message, subsystem):
|
|
"Write the contents of 'message' to the specified subsystem."
|
|
print '%s: %s' % (subsystem, message)
|
|
...
|
|
|
|
server_log = functools.partial(log, subsystem='server')
|
|
server_log('Unable to open socket')
|
|
|
|
Here's another example, from a program that uses PyGTK. Here a context-
|
|
sensitive pop-up menu is being constructed dynamically. The callback provided
|
|
for the menu option is a partially applied version of the :meth:`open_item`
|
|
method, where the first argument has been provided. ::
|
|
|
|
...
|
|
class Application:
|
|
def open_item(self, path):
|
|
...
|
|
def init (self):
|
|
open_func = functools.partial(self.open_item, item_path)
|
|
popup_menu.append( ("Open", open_func, 1) )
|
|
|
|
Another function in the :mod:`functools` module is the
|
|
:func:`update_wrapper(wrapper, wrapped)` function that helps you write well-
|
|
behaved decorators. :func:`update_wrapper` copies the name, module, and
|
|
docstring attribute to a wrapper function so that tracebacks inside the wrapped
|
|
function are easier to understand. For example, you might write::
|
|
|
|
def my_decorator(f):
|
|
def wrapper(*args, **kwds):
|
|
print 'Calling decorated function'
|
|
return f(*args, **kwds)
|
|
functools.update_wrapper(wrapper, f)
|
|
return wrapper
|
|
|
|
:func:`wraps` is a decorator that can be used inside your own decorators to copy
|
|
the wrapped function's information. An alternate version of the previous
|
|
example would be::
|
|
|
|
def my_decorator(f):
|
|
@functools.wraps(f)
|
|
def wrapper(*args, **kwds):
|
|
print 'Calling decorated function'
|
|
return f(*args, **kwds)
|
|
return wrapper
|
|
|
|
|
|
.. seealso::
|
|
|
|
:pep:`309` - Partial Function Application
|
|
PEP proposed and written by Peter Harris; implemented by Hye-Shik Chang and Nick
|
|
Coghlan, with adaptations by Raymond Hettinger.
|
|
|
|
.. ======================================================================
|
|
|
|
|
|
.. _pep-314:
|
|
|
|
PEP 314: Metadata for Python Software Packages v1.1
|
|
===================================================
|
|
|
|
Some simple dependency support was added to Distutils. The :func:`setup`
|
|
function now has ``requires``, ``provides``, and ``obsoletes`` keyword
|
|
parameters. When you build a source distribution using the ``sdist`` command,
|
|
the dependency information will be recorded in the :file:`PKG-INFO` file.
|
|
|
|
Another new keyword parameter is ``download_url``, which should be set to a URL
|
|
for the package's source code. This means it's now possible to look up an entry
|
|
in the package index, determine the dependencies for a package, and download the
|
|
required packages. ::
|
|
|
|
VERSION = '1.0'
|
|
setup(name='PyPackage',
|
|
version=VERSION,
|
|
requires=['numarray', 'zlib (>=1.1.4)'],
|
|
obsoletes=['OldPackage']
|
|
download_url=('http://www.example.com/pypackage/dist/pkg-%s.tar.gz'
|
|
% VERSION),
|
|
)
|
|
|
|
Another new enhancement to the Python package index at
|
|
http://cheeseshop.python.org is storing source and binary archives for a
|
|
package. The new :command:`upload` Distutils command will upload a package to
|
|
the repository.
|
|
|
|
Before a package can be uploaded, you must be able to build a distribution using
|
|
the :command:`sdist` Distutils command. Once that works, you can run ``python
|
|
setup.py upload`` to add your package to the PyPI archive. Optionally you can
|
|
GPG-sign the package by supplying the :option:`--sign` and :option:`--identity`
|
|
options.
|
|
|
|
Package uploading was implemented by Martin von Löwis and Richard Jones.
|
|
|
|
|
|
.. seealso::
|
|
|
|
:pep:`314` - Metadata for Python Software Packages v1.1
|
|
PEP proposed and written by A.M. Kuchling, Richard Jones, and Fred Drake;
|
|
implemented by Richard Jones and Fred Drake.
|
|
|
|
.. ======================================================================
|
|
|
|
|
|
.. _pep-328:
|
|
|
|
PEP 328: Absolute and Relative Imports
|
|
======================================
|
|
|
|
The simpler part of PEP 328 was implemented in Python 2.4: parentheses could now
|
|
be used to enclose the names imported from a module using the ``from ... import
|
|
...`` statement, making it easier to import many different names.
|
|
|
|
The more complicated part has been implemented in Python 2.5: importing a module
|
|
can be specified to use absolute or package-relative imports. The plan is to
|
|
move toward making absolute imports the default in future versions of Python.
|
|
|
|
Let's say you have a package directory like this::
|
|
|
|
pkg/
|
|
pkg/__init__.py
|
|
pkg/main.py
|
|
pkg/string.py
|
|
|
|
This defines a package named :mod:`pkg` containing the :mod:`pkg.main` and
|
|
:mod:`pkg.string` submodules.
|
|
|
|
Consider the code in the :file:`main.py` module. What happens if it executes
|
|
the statement ``import string``? In Python 2.4 and earlier, it will first look
|
|
in the package's directory to perform a relative import, finds
|
|
:file:`pkg/string.py`, imports the contents of that file as the
|
|
:mod:`pkg.string` module, and that module is bound to the name ``string`` in the
|
|
:mod:`pkg.main` module's namespace.
|
|
|
|
That's fine if :mod:`pkg.string` was what you wanted. But what if you wanted
|
|
Python's standard :mod:`string` module? There's no clean way to ignore
|
|
:mod:`pkg.string` and look for the standard module; generally you had to look at
|
|
the contents of ``sys.modules``, which is slightly unclean. Holger Krekel's
|
|
:mod:`py.std` package provides a tidier way to perform imports from the standard
|
|
library, ``import py ; py.std.string.join()``, but that package isn't available
|
|
on all Python installations.
|
|
|
|
Reading code which relies on relative imports is also less clear, because a
|
|
reader may be confused about which module, :mod:`string` or :mod:`pkg.string`,
|
|
is intended to be used. Python users soon learned not to duplicate the names of
|
|
standard library modules in the names of their packages' submodules, but you
|
|
can't protect against having your submodule's name being used for a new module
|
|
added in a future version of Python.
|
|
|
|
In Python 2.5, you can switch :keyword:`import`'s behaviour to absolute imports
|
|
using a ``from __future__ import absolute_import`` directive. This absolute-
|
|
import behaviour will become the default in a future version (probably Python
|
|
2.7). Once absolute imports are the default, ``import string`` will always
|
|
find the standard library's version. It's suggested that users should begin
|
|
using absolute imports as much as possible, so it's preferable to begin writing
|
|
``from pkg import string`` in your code.
|
|
|
|
Relative imports are still possible by adding a leading period to the module
|
|
name when using the ``from ... import`` form::
|
|
|
|
# Import names from pkg.string
|
|
from .string import name1, name2
|
|
# Import pkg.string
|
|
from . import string
|
|
|
|
This imports the :mod:`string` module relative to the current package, so in
|
|
:mod:`pkg.main` this will import *name1* and *name2* from :mod:`pkg.string`.
|
|
Additional leading periods perform the relative import starting from the parent
|
|
of the current package. For example, code in the :mod:`A.B.C` module can do::
|
|
|
|
from . import D # Imports A.B.D
|
|
from .. import E # Imports A.E
|
|
from ..F import G # Imports A.F.G
|
|
|
|
Leading periods cannot be used with the ``import modname`` form of the import
|
|
statement, only the ``from ... import`` form.
|
|
|
|
|
|
.. seealso::
|
|
|
|
:pep:`328` - Imports: Multi-Line and Absolute/Relative
|
|
PEP written by Aahz; implemented by Thomas Wouters.
|
|
|
|
http://codespeak.net/py/current/doc/index.html
|
|
The py library by Holger Krekel, which contains the :mod:`py.std` package.
|
|
|
|
.. ======================================================================
|
|
|
|
|
|
.. _pep-338:
|
|
|
|
PEP 338: Executing Modules as Scripts
|
|
=====================================
|
|
|
|
The :option:`-m` switch added in Python 2.4 to execute a module as a script
|
|
gained a few more abilities. Instead of being implemented in C code inside the
|
|
Python interpreter, the switch now uses an implementation in a new module,
|
|
:mod:`runpy`.
|
|
|
|
The :mod:`runpy` module implements a more sophisticated import mechanism so that
|
|
it's now possible to run modules in a package such as :mod:`pychecker.checker`.
|
|
The module also supports alternative import mechanisms such as the
|
|
:mod:`zipimport` module. This means you can add a .zip archive's path to
|
|
``sys.path`` and then use the :option:`-m` switch to execute code from the
|
|
archive.
|
|
|
|
|
|
.. seealso::
|
|
|
|
:pep:`338` - Executing modules as scripts
|
|
PEP written and implemented by Nick Coghlan.
|
|
|
|
.. ======================================================================
|
|
|
|
|
|
.. _pep-341:
|
|
|
|
PEP 341: Unified try/except/finally
|
|
===================================
|
|
|
|
Until Python 2.5, the :keyword:`try` statement came in two flavours. You could
|
|
use a :keyword:`finally` block to ensure that code is always executed, or one or
|
|
more :keyword:`except` blocks to catch specific exceptions. You couldn't
|
|
combine both :keyword:`except` blocks and a :keyword:`finally` block, because
|
|
generating the right bytecode for the combined version was complicated and it
|
|
wasn't clear what the semantics of the combined statement should be.
|
|
|
|
Guido van Rossum spent some time working with Java, which does support the
|
|
equivalent of combining :keyword:`except` blocks and a :keyword:`finally` block,
|
|
and this clarified what the statement should mean. In Python 2.5, you can now
|
|
write::
|
|
|
|
try:
|
|
block-1 ...
|
|
except Exception1:
|
|
handler-1 ...
|
|
except Exception2:
|
|
handler-2 ...
|
|
else:
|
|
else-block
|
|
finally:
|
|
final-block
|
|
|
|
The code in *block-1* is executed. If the code raises an exception, the various
|
|
:keyword:`except` blocks are tested: if the exception is of class
|
|
:class:`Exception1`, *handler-1* is executed; otherwise if it's of class
|
|
:class:`Exception2`, *handler-2* is executed, and so forth. If no exception is
|
|
raised, the *else-block* is executed.
|
|
|
|
No matter what happened previously, the *final-block* is executed once the code
|
|
block is complete and any raised exceptions handled. Even if there's an error in
|
|
an exception handler or the *else-block* and a new exception is raised, the code
|
|
in the *final-block* is still run.
|
|
|
|
|
|
.. seealso::
|
|
|
|
:pep:`341` - Unifying try-except and try-finally
|
|
PEP written by Georg Brandl; implementation by Thomas Lee.
|
|
|
|
.. ======================================================================
|
|
|
|
|
|
.. _pep-342:
|
|
|
|
PEP 342: New Generator Features
|
|
===============================
|
|
|
|
Python 2.5 adds a simple way to pass values *into* a generator. As introduced in
|
|
Python 2.3, generators only produce output; once a generator's code was invoked
|
|
to create an iterator, there was no way to pass any new information into the
|
|
function when its execution is resumed. Sometimes the ability to pass in some
|
|
information would be useful. Hackish solutions to this include making the
|
|
generator's code look at a global variable and then changing the global
|
|
variable's value, or passing in some mutable object that callers then modify.
|
|
|
|
To refresh your memory of basic generators, here's a simple example::
|
|
|
|
def counter (maximum):
|
|
i = 0
|
|
while i < maximum:
|
|
yield i
|
|
i += 1
|
|
|
|
When you call ``counter(10)``, the result is an iterator that returns the values
|
|
from 0 up to 9. On encountering the :keyword:`yield` statement, the iterator
|
|
returns the provided value and suspends the function's execution, preserving the
|
|
local variables. Execution resumes on the following call to the iterator's
|
|
:meth:`next` method, picking up after the :keyword:`yield` statement.
|
|
|
|
In Python 2.3, :keyword:`yield` was a statement; it didn't return any value. In
|
|
2.5, :keyword:`yield` is now an expression, returning a value that can be
|
|
assigned to a variable or otherwise operated on::
|
|
|
|
val = (yield i)
|
|
|
|
I recommend that you always put parentheses around a :keyword:`yield` expression
|
|
when you're doing something with the returned value, as in the above example.
|
|
The parentheses aren't always necessary, but it's easier to always add them
|
|
instead of having to remember when they're needed.
|
|
|
|
(:pep:`342` explains the exact rules, which are that a :keyword:`yield`\
|
|
-expression must always be parenthesized except when it occurs at the top-level
|
|
expression on the right-hand side of an assignment. This means you can write
|
|
``val = yield i`` but have to use parentheses when there's an operation, as in
|
|
``val = (yield i) + 12``.)
|
|
|
|
Values are sent into a generator by calling its :meth:`send(value)` method. The
|
|
generator's code is then resumed and the :keyword:`yield` expression returns the
|
|
specified *value*. If the regular :meth:`next` method is called, the
|
|
:keyword:`yield` returns :const:`None`.
|
|
|
|
Here's the previous example, modified to allow changing the value of the
|
|
internal counter. ::
|
|
|
|
def counter (maximum):
|
|
i = 0
|
|
while i < maximum:
|
|
val = (yield i)
|
|
# If value provided, change counter
|
|
if val is not None:
|
|
i = val
|
|
else:
|
|
i += 1
|
|
|
|
And here's an example of changing the counter::
|
|
|
|
>>> it = counter(10)
|
|
>>> print it.next()
|
|
0
|
|
>>> print it.next()
|
|
1
|
|
>>> print it.send(8)
|
|
8
|
|
>>> print it.next()
|
|
9
|
|
>>> print it.next()
|
|
Traceback (most recent call last):
|
|
File ``t.py'', line 15, in ?
|
|
print it.next()
|
|
StopIteration
|
|
|
|
:keyword:`yield` will usually return :const:`None`, so you should always check
|
|
for this case. Don't just use its value in expressions unless you're sure that
|
|
the :meth:`send` method will be the only method used to resume your generator
|
|
function.
|
|
|
|
In addition to :meth:`send`, there are two other new methods on generators:
|
|
|
|
* :meth:`throw(type, value=None, traceback=None)` is used to raise an exception
|
|
inside the generator; the exception is raised by the :keyword:`yield` expression
|
|
where the generator's execution is paused.
|
|
|
|
* :meth:`close` raises a new :exc:`GeneratorExit` exception inside the generator
|
|
to terminate the iteration. On receiving this exception, the generator's code
|
|
must either raise :exc:`GeneratorExit` or :exc:`StopIteration`. Catching the
|
|
:exc:`GeneratorExit` exception and returning a value is illegal and will trigger
|
|
a :exc:`RuntimeError`; if the function raises some other exception, that
|
|
exception is propagated to the caller. :meth:`close` will also be called by
|
|
Python's garbage collector when the generator is garbage-collected.
|
|
|
|
If you need to run cleanup code when a :exc:`GeneratorExit` occurs, I suggest
|
|
using a ``try: ... finally:`` suite instead of catching :exc:`GeneratorExit`.
|
|
|
|
The cumulative effect of these changes is to turn generators from one-way
|
|
producers of information into both producers and consumers.
|
|
|
|
Generators also become *coroutines*, a more generalized form of subroutines.
|
|
Subroutines are entered at one point and exited at another point (the top of the
|
|
function, and a :keyword:`return` statement), but coroutines can be entered,
|
|
exited, and resumed at many different points (the :keyword:`yield` statements).
|
|
We'll have to figure out patterns for using coroutines effectively in Python.
|
|
|
|
The addition of the :meth:`close` method has one side effect that isn't obvious.
|
|
:meth:`close` is called when a generator is garbage-collected, so this means the
|
|
generator's code gets one last chance to run before the generator is destroyed.
|
|
This last chance means that ``try...finally`` statements in generators can now
|
|
be guaranteed to work; the :keyword:`finally` clause will now always get a
|
|
chance to run. The syntactic restriction that you couldn't mix :keyword:`yield`
|
|
statements with a ``try...finally`` suite has therefore been removed. This
|
|
seems like a minor bit of language trivia, but using generators and
|
|
``try...finally`` is actually necessary in order to implement the
|
|
:keyword:`with` statement described by PEP 343. I'll look at this new statement
|
|
in the following section.
|
|
|
|
Another even more esoteric effect of this change: previously, the
|
|
:attr:`gi_frame` attribute of a generator was always a frame object. It's now
|
|
possible for :attr:`gi_frame` to be ``None`` once the generator has been
|
|
exhausted.
|
|
|
|
|
|
.. seealso::
|
|
|
|
:pep:`342` - Coroutines via Enhanced Generators
|
|
PEP written by Guido van Rossum and Phillip J. Eby; implemented by Phillip J.
|
|
Eby. Includes examples of some fancier uses of generators as coroutines.
|
|
|
|
Earlier versions of these features were proposed in :pep:`288` by Raymond
|
|
Hettinger and :pep:`325` by Samuele Pedroni.
|
|
|
|
http://en.wikipedia.org/wiki/Coroutine
|
|
The Wikipedia entry for coroutines.
|
|
|
|
http://www.sidhe.org/~dan/blog/archives/000178.html
|
|
An explanation of coroutines from a Perl point of view, written by Dan Sugalski.
|
|
|
|
.. ======================================================================
|
|
|
|
|
|
.. _pep-343:
|
|
|
|
PEP 343: The 'with' statement
|
|
=============================
|
|
|
|
The ':keyword:`with`' statement clarifies code that previously would use
|
|
``try...finally`` blocks to ensure that clean-up code is executed. In this
|
|
section, I'll discuss the statement as it will commonly be used. In the next
|
|
section, I'll examine the implementation details and show how to write objects
|
|
for use with this statement.
|
|
|
|
The ':keyword:`with`' statement is a new control-flow structure whose basic
|
|
structure is::
|
|
|
|
with expression [as variable]:
|
|
with-block
|
|
|
|
The expression is evaluated, and it should result in an object that supports the
|
|
context management protocol (that is, has :meth:`__enter__` and :meth:`__exit__`
|
|
methods.
|
|
|
|
The object's :meth:`__enter__` is called before *with-block* is executed and
|
|
therefore can run set-up code. It also may return a value that is bound to the
|
|
name *variable*, if given. (Note carefully that *variable* is *not* assigned
|
|
the result of *expression*.)
|
|
|
|
After execution of the *with-block* is finished, the object's :meth:`__exit__`
|
|
method is called, even if the block raised an exception, and can therefore run
|
|
clean-up code.
|
|
|
|
To enable the statement in Python 2.5, you need to add the following directive
|
|
to your module::
|
|
|
|
from __future__ import with_statement
|
|
|
|
The statement will always be enabled in Python 2.6.
|
|
|
|
Some standard Python objects now support the context management protocol and can
|
|
be used with the ':keyword:`with`' statement. File objects are one example::
|
|
|
|
with open('/etc/passwd', 'r') as f:
|
|
for line in f:
|
|
print line
|
|
... more processing code ...
|
|
|
|
After this statement has executed, the file object in *f* will have been
|
|
automatically closed, even if the :keyword:`for` loop raised an exception part-
|
|
way through the block.
|
|
|
|
.. note::
|
|
|
|
In this case, *f* is the same object created by :func:`open`, because
|
|
:meth:`file.__enter__` returns *self*.
|
|
|
|
The :mod:`threading` module's locks and condition variables also support the
|
|
':keyword:`with`' statement::
|
|
|
|
lock = threading.Lock()
|
|
with lock:
|
|
# Critical section of code
|
|
...
|
|
|
|
The lock is acquired before the block is executed and always released once the
|
|
block is complete.
|
|
|
|
The new :func:`localcontext` function in the :mod:`decimal` module makes it easy
|
|
to save and restore the current decimal context, which encapsulates the desired
|
|
precision and rounding characteristics for computations::
|
|
|
|
from decimal import Decimal, Context, localcontext
|
|
|
|
# Displays with default precision of 28 digits
|
|
v = Decimal('578')
|
|
print v.sqrt()
|
|
|
|
with localcontext(Context(prec=16)):
|
|
# All code in this block uses a precision of 16 digits.
|
|
# The original context is restored on exiting the block.
|
|
print v.sqrt()
|
|
|
|
|
|
.. _new-25-context-managers:
|
|
|
|
Writing Context Managers
|
|
------------------------
|
|
|
|
Under the hood, the ':keyword:`with`' statement is fairly complicated. Most
|
|
people will only use ':keyword:`with`' in company with existing objects and
|
|
don't need to know these details, so you can skip the rest of this section if
|
|
you like. Authors of new objects will need to understand the details of the
|
|
underlying implementation and should keep reading.
|
|
|
|
A high-level explanation of the context management protocol is:
|
|
|
|
* The expression is evaluated and should result in an object called a "context
|
|
manager". The context manager must have :meth:`__enter__` and :meth:`__exit__`
|
|
methods.
|
|
|
|
* The context manager's :meth:`__enter__` method is called. The value returned
|
|
is assigned to *VAR*. If no ``'as VAR'`` clause is present, the value is simply
|
|
discarded.
|
|
|
|
* The code in *BLOCK* is executed.
|
|
|
|
* If *BLOCK* raises an exception, the :meth:`__exit__(type, value, traceback)`
|
|
is called with the exception details, the same values returned by
|
|
:func:`sys.exc_info`. The method's return value controls whether the exception
|
|
is re-raised: any false value re-raises the exception, and ``True`` will result
|
|
in suppressing it. You'll only rarely want to suppress the exception, because
|
|
if you do the author of the code containing the ':keyword:`with`' statement will
|
|
never realize anything went wrong.
|
|
|
|
* If *BLOCK* didn't raise an exception, the :meth:`__exit__` method is still
|
|
called, but *type*, *value*, and *traceback* are all ``None``.
|
|
|
|
Let's think through an example. I won't present detailed code but will only
|
|
sketch the methods necessary for a database that supports transactions.
|
|
|
|
(For people unfamiliar with database terminology: a set of changes to the
|
|
database are grouped into a transaction. Transactions can be either committed,
|
|
meaning that all the changes are written into the database, or rolled back,
|
|
meaning that the changes are all discarded and the database is unchanged. See
|
|
any database textbook for more information.)
|
|
|
|
Let's assume there's an object representing a database connection. Our goal will
|
|
be to let the user write code like this::
|
|
|
|
db_connection = DatabaseConnection()
|
|
with db_connection as cursor:
|
|
cursor.execute('insert into ...')
|
|
cursor.execute('delete from ...')
|
|
# ... more operations ...
|
|
|
|
The transaction should be committed if the code in the block runs flawlessly or
|
|
rolled back if there's an exception. Here's the basic interface for
|
|
:class:`DatabaseConnection` that I'll assume::
|
|
|
|
class DatabaseConnection:
|
|
# Database interface
|
|
def cursor (self):
|
|
"Returns a cursor object and starts a new transaction"
|
|
def commit (self):
|
|
"Commits current transaction"
|
|
def rollback (self):
|
|
"Rolls back current transaction"
|
|
|
|
The :meth:`__enter__` method is pretty easy, having only to start a new
|
|
transaction. For this application the resulting cursor object would be a useful
|
|
result, so the method will return it. The user can then add ``as cursor`` to
|
|
their ':keyword:`with`' statement to bind the cursor to a variable name. ::
|
|
|
|
class DatabaseConnection:
|
|
...
|
|
def __enter__ (self):
|
|
# Code to start a new transaction
|
|
cursor = self.cursor()
|
|
return cursor
|
|
|
|
The :meth:`__exit__` method is the most complicated because it's where most of
|
|
the work has to be done. The method has to check if an exception occurred. If
|
|
there was no exception, the transaction is committed. The transaction is rolled
|
|
back if there was an exception.
|
|
|
|
In the code below, execution will just fall off the end of the function,
|
|
returning the default value of ``None``. ``None`` is false, so the exception
|
|
will be re-raised automatically. If you wished, you could be more explicit and
|
|
add a :keyword:`return` statement at the marked location. ::
|
|
|
|
class DatabaseConnection:
|
|
...
|
|
def __exit__ (self, type, value, tb):
|
|
if tb is None:
|
|
# No exception, so commit
|
|
self.commit()
|
|
else:
|
|
# Exception occurred, so rollback.
|
|
self.rollback()
|
|
# return False
|
|
|
|
|
|
.. _contextlibmod:
|
|
|
|
The contextlib module
|
|
---------------------
|
|
|
|
The new :mod:`contextlib` module provides some functions and a decorator that
|
|
are useful for writing objects for use with the ':keyword:`with`' statement.
|
|
|
|
The decorator is called :func:`contextmanager`, and lets you write a single
|
|
generator function instead of defining a new class. The generator should yield
|
|
exactly one value. The code up to the :keyword:`yield` will be executed as the
|
|
:meth:`__enter__` method, and the value yielded will be the method's return
|
|
value that will get bound to the variable in the ':keyword:`with`' statement's
|
|
:keyword:`as` clause, if any. The code after the :keyword:`yield` will be
|
|
executed in the :meth:`__exit__` method. Any exception raised in the block will
|
|
be raised by the :keyword:`yield` statement.
|
|
|
|
Our database example from the previous section could be written using this
|
|
decorator as::
|
|
|
|
from contextlib import contextmanager
|
|
|
|
@contextmanager
|
|
def db_transaction (connection):
|
|
cursor = connection.cursor()
|
|
try:
|
|
yield cursor
|
|
except:
|
|
connection.rollback()
|
|
raise
|
|
else:
|
|
connection.commit()
|
|
|
|
db = DatabaseConnection()
|
|
with db_transaction(db) as cursor:
|
|
...
|
|
|
|
The :mod:`contextlib` module also has a :func:`nested(mgr1, mgr2, ...)` function
|
|
that combines a number of context managers so you don't need to write nested
|
|
':keyword:`with`' statements. In this example, the single ':keyword:`with`'
|
|
statement both starts a database transaction and acquires a thread lock::
|
|
|
|
lock = threading.Lock()
|
|
with nested (db_transaction(db), lock) as (cursor, locked):
|
|
...
|
|
|
|
Finally, the :func:`closing(object)` function returns *object* so that it can be
|
|
bound to a variable, and calls ``object.close`` at the end of the block. ::
|
|
|
|
import urllib, sys
|
|
from contextlib import closing
|
|
|
|
with closing(urllib.urlopen('http://www.yahoo.com')) as f:
|
|
for line in f:
|
|
sys.stdout.write(line)
|
|
|
|
|
|
.. seealso::
|
|
|
|
:pep:`343` - The "with" statement
|
|
PEP written by Guido van Rossum and Nick Coghlan; implemented by Mike Bland,
|
|
Guido van Rossum, and Neal Norwitz. The PEP shows the code generated for a
|
|
':keyword:`with`' statement, which can be helpful in learning how the statement
|
|
works.
|
|
|
|
The documentation for the :mod:`contextlib` module.
|
|
|
|
.. ======================================================================
|
|
|
|
|
|
.. _pep-352:
|
|
|
|
PEP 352: Exceptions as New-Style Classes
|
|
========================================
|
|
|
|
Exception classes can now be new-style classes, not just classic classes, and
|
|
the built-in :exc:`Exception` class and all the standard built-in exceptions
|
|
(:exc:`NameError`, :exc:`ValueError`, etc.) are now new-style classes.
|
|
|
|
The inheritance hierarchy for exceptions has been rearranged a bit. In 2.5, the
|
|
inheritance relationships are::
|
|
|
|
BaseException # New in Python 2.5
|
|
|- KeyboardInterrupt
|
|
|- SystemExit
|
|
|- Exception
|
|
|- (all other current built-in exceptions)
|
|
|
|
This rearrangement was done because people often want to catch all exceptions
|
|
that indicate program errors. :exc:`KeyboardInterrupt` and :exc:`SystemExit`
|
|
aren't errors, though, and usually represent an explicit action such as the user
|
|
hitting Control-C or code calling :func:`sys.exit`. A bare ``except:`` will
|
|
catch all exceptions, so you commonly need to list :exc:`KeyboardInterrupt` and
|
|
:exc:`SystemExit` in order to re-raise them. The usual pattern is::
|
|
|
|
try:
|
|
...
|
|
except (KeyboardInterrupt, SystemExit):
|
|
raise
|
|
except:
|
|
# Log error...
|
|
# Continue running program...
|
|
|
|
In Python 2.5, you can now write ``except Exception`` to achieve the same
|
|
result, catching all the exceptions that usually indicate errors but leaving
|
|
:exc:`KeyboardInterrupt` and :exc:`SystemExit` alone. As in previous versions,
|
|
a bare ``except:`` still catches all exceptions.
|
|
|
|
The goal for Python 3.0 is to require any class raised as an exception to derive
|
|
from :exc:`BaseException` or some descendant of :exc:`BaseException`, and future
|
|
releases in the Python 2.x series may begin to enforce this constraint.
|
|
Therefore, I suggest you begin making all your exception classes derive from
|
|
:exc:`Exception` now. It's been suggested that the bare ``except:`` form should
|
|
be removed in Python 3.0, but Guido van Rossum hasn't decided whether to do this
|
|
or not.
|
|
|
|
Raising of strings as exceptions, as in the statement ``raise "Error
|
|
occurred"``, is deprecated in Python 2.5 and will trigger a warning. The aim is
|
|
to be able to remove the string-exception feature in a few releases.
|
|
|
|
|
|
.. seealso::
|
|
|
|
:pep:`352` - Required Superclass for Exceptions
|
|
PEP written by Brett Cannon and Guido van Rossum; implemented by Brett Cannon.
|
|
|
|
.. ======================================================================
|
|
|
|
|
|
.. _pep-353:
|
|
|
|
PEP 353: Using ssize_t as the index type
|
|
========================================
|
|
|
|
A wide-ranging change to Python's C API, using a new :ctype:`Py_ssize_t` type
|
|
definition instead of :ctype:`int`, will permit the interpreter to handle more
|
|
data on 64-bit platforms. This change doesn't affect Python's capacity on 32-bit
|
|
platforms.
|
|
|
|
Various pieces of the Python interpreter used C's :ctype:`int` type to store
|
|
sizes or counts; for example, the number of items in a list or tuple were stored
|
|
in an :ctype:`int`. The C compilers for most 64-bit platforms still define
|
|
:ctype:`int` as a 32-bit type, so that meant that lists could only hold up to
|
|
``2**31 - 1`` = 2147483647 items. (There are actually a few different
|
|
programming models that 64-bit C compilers can use -- see
|
|
http://www.unix.org/version2/whatsnew/lp64_wp.html for a discussion -- but the
|
|
most commonly available model leaves :ctype:`int` as 32 bits.)
|
|
|
|
A limit of 2147483647 items doesn't really matter on a 32-bit platform because
|
|
you'll run out of memory before hitting the length limit. Each list item
|
|
requires space for a pointer, which is 4 bytes, plus space for a
|
|
:ctype:`PyObject` representing the item. 2147483647\*4 is already more bytes
|
|
than a 32-bit address space can contain.
|
|
|
|
It's possible to address that much memory on a 64-bit platform, however. The
|
|
pointers for a list that size would only require 16 GiB of space, so it's not
|
|
unreasonable that Python programmers might construct lists that large.
|
|
Therefore, the Python interpreter had to be changed to use some type other than
|
|
:ctype:`int`, and this will be a 64-bit type on 64-bit platforms. The change
|
|
will cause incompatibilities on 64-bit machines, so it was deemed worth making
|
|
the transition now, while the number of 64-bit users is still relatively small.
|
|
(In 5 or 10 years, we may *all* be on 64-bit machines, and the transition would
|
|
be more painful then.)
|
|
|
|
This change most strongly affects authors of C extension modules. Python
|
|
strings and container types such as lists and tuples now use
|
|
:ctype:`Py_ssize_t` to store their size. Functions such as
|
|
:cfunc:`PyList_Size` now return :ctype:`Py_ssize_t`. Code in extension modules
|
|
may therefore need to have some variables changed to :ctype:`Py_ssize_t`.
|
|
|
|
The :cfunc:`PyArg_ParseTuple` and :cfunc:`Py_BuildValue` functions have a new
|
|
conversion code, ``n``, for :ctype:`Py_ssize_t`. :cfunc:`PyArg_ParseTuple`'s
|
|
``s#`` and ``t#`` still output :ctype:`int` by default, but you can define the
|
|
macro :cmacro:`PY_SSIZE_T_CLEAN` before including :file:`Python.h` to make
|
|
them return :ctype:`Py_ssize_t`.
|
|
|
|
:pep:`353` has a section on conversion guidelines that extension authors should
|
|
read to learn about supporting 64-bit platforms.
|
|
|
|
|
|
.. seealso::
|
|
|
|
:pep:`353` - Using ssize_t as the index type
|
|
PEP written and implemented by Martin von Löwis.
|
|
|
|
.. ======================================================================
|
|
|
|
|
|
.. _pep-357:
|
|
|
|
PEP 357: The '__index__' method
|
|
===============================
|
|
|
|
The NumPy developers had a problem that could only be solved by adding a new
|
|
special method, :meth:`__index__`. When using slice notation, as in
|
|
``[start:stop:step]``, the values of the *start*, *stop*, and *step* indexes
|
|
must all be either integers or long integers. NumPy defines a variety of
|
|
specialized integer types corresponding to unsigned and signed integers of 8,
|
|
16, 32, and 64 bits, but there was no way to signal that these types could be
|
|
used as slice indexes.
|
|
|
|
Slicing can't just use the existing :meth:`__int__` method because that method
|
|
is also used to implement coercion to integers. If slicing used
|
|
:meth:`__int__`, floating-point numbers would also become legal slice indexes
|
|
and that's clearly an undesirable behaviour.
|
|
|
|
Instead, a new special method called :meth:`__index__` was added. It takes no
|
|
arguments and returns an integer giving the slice index to use. For example::
|
|
|
|
class C:
|
|
def __index__ (self):
|
|
return self.value
|
|
|
|
The return value must be either a Python integer or long integer. The
|
|
interpreter will check that the type returned is correct, and raises a
|
|
:exc:`TypeError` if this requirement isn't met.
|
|
|
|
A corresponding :attr:`nb_index` slot was added to the C-level
|
|
:ctype:`PyNumberMethods` structure to let C extensions implement this protocol.
|
|
:cfunc:`PyNumber_Index(obj)` can be used in extension code to call the
|
|
:meth:`__index__` function and retrieve its result.
|
|
|
|
|
|
.. seealso::
|
|
|
|
:pep:`357` - Allowing Any Object to be Used for Slicing
|
|
PEP written and implemented by Travis Oliphant.
|
|
|
|
.. ======================================================================
|
|
|
|
|
|
.. _other-lang:
|
|
|
|
Other Language Changes
|
|
======================
|
|
|
|
Here are all of the changes that Python 2.5 makes to the core Python language.
|
|
|
|
* The :class:`dict` type has a new hook for letting subclasses provide a default
|
|
value when a key isn't contained in the dictionary. When a key isn't found, the
|
|
dictionary's :meth:`__missing__(key)` method will be called. This hook is used
|
|
to implement the new :class:`defaultdict` class in the :mod:`collections`
|
|
module. The following example defines a dictionary that returns zero for any
|
|
missing key::
|
|
|
|
class zerodict (dict):
|
|
def __missing__ (self, key):
|
|
return 0
|
|
|
|
d = zerodict({1:1, 2:2})
|
|
print d[1], d[2] # Prints 1, 2
|
|
print d[3], d[4] # Prints 0, 0
|
|
|
|
* Both 8-bit and Unicode strings have new :meth:`partition(sep)` and
|
|
:meth:`rpartition(sep)` methods that simplify a common use case.
|
|
|
|
The :meth:`find(S)` method is often used to get an index which is then used to
|
|
slice the string and obtain the pieces that are before and after the separator.
|
|
:meth:`partition(sep)` condenses this pattern into a single method call that
|
|
returns a 3-tuple containing the substring before the separator, the separator
|
|
itself, and the substring after the separator. If the separator isn't found,
|
|
the first element of the tuple is the entire string and the other two elements
|
|
are empty. :meth:`rpartition(sep)` also returns a 3-tuple but starts searching
|
|
from the end of the string; the ``r`` stands for 'reverse'.
|
|
|
|
Some examples::
|
|
|
|
>>> ('http://www.python.org').partition('://')
|
|
('http', '://', 'www.python.org')
|
|
>>> ('file:/usr/share/doc/index.html').partition('://')
|
|
('file:/usr/share/doc/index.html', '', '')
|
|
>>> (u'Subject: a quick question').partition(':')
|
|
(u'Subject', u':', u' a quick question')
|
|
>>> 'www.python.org'.rpartition('.')
|
|
('www.python', '.', 'org')
|
|
>>> 'www.python.org'.rpartition(':')
|
|
('', '', 'www.python.org')
|
|
|
|
(Implemented by Fredrik Lundh following a suggestion by Raymond Hettinger.)
|
|
|
|
* The :meth:`startswith` and :meth:`endswith` methods of string types now accept
|
|
tuples of strings to check for. ::
|
|
|
|
def is_image_file (filename):
|
|
return filename.endswith(('.gif', '.jpg', '.tiff'))
|
|
|
|
(Implemented by Georg Brandl following a suggestion by Tom Lynn.)
|
|
|
|
.. RFE #1491485
|
|
|
|
* The :func:`min` and :func:`max` built-in functions gained a ``key`` keyword
|
|
parameter analogous to the ``key`` argument for :meth:`sort`. This parameter
|
|
supplies a function that takes a single argument and is called for every value
|
|
in the list; :func:`min`/:func:`max` will return the element with the
|
|
smallest/largest return value from this function. For example, to find the
|
|
longest string in a list, you can do::
|
|
|
|
L = ['medium', 'longest', 'short']
|
|
# Prints 'longest'
|
|
print max(L, key=len)
|
|
# Prints 'short', because lexicographically 'short' has the largest value
|
|
print max(L)
|
|
|
|
(Contributed by Steven Bethard and Raymond Hettinger.)
|
|
|
|
* Two new built-in functions, :func:`any` and :func:`all`, evaluate whether an
|
|
iterator contains any true or false values. :func:`any` returns :const:`True`
|
|
if any value returned by the iterator is true; otherwise it will return
|
|
:const:`False`. :func:`all` returns :const:`True` only if all of the values
|
|
returned by the iterator evaluate as true. (Suggested by Guido van Rossum, and
|
|
implemented by Raymond Hettinger.)
|
|
|
|
* The result of a class's :meth:`__hash__` method can now be either a long
|
|
integer or a regular integer. If a long integer is returned, the hash of that
|
|
value is taken. In earlier versions the hash value was required to be a
|
|
regular integer, but in 2.5 the :func:`id` built-in was changed to always
|
|
return non-negative numbers, and users often seem to use ``id(self)`` in
|
|
:meth:`__hash__` methods (though this is discouraged).
|
|
|
|
.. Bug #1536021
|
|
|
|
* ASCII is now the default encoding for modules. It's now a syntax error if a
|
|
module contains string literals with 8-bit characters but doesn't have an
|
|
encoding declaration. In Python 2.4 this triggered a warning, not a syntax
|
|
error. See :pep:`263` for how to declare a module's encoding; for example, you
|
|
might add a line like this near the top of the source file::
|
|
|
|
# -*- coding: latin1 -*-
|
|
|
|
* A new warning, :class:`UnicodeWarning`, is triggered when you attempt to
|
|
compare a Unicode string and an 8-bit string that can't be converted to Unicode
|
|
using the default ASCII encoding. The result of the comparison is false::
|
|
|
|
>>> chr(128) == unichr(128) # Can't convert chr(128) to Unicode
|
|
__main__:1: UnicodeWarning: Unicode equal comparison failed
|
|
to convert both arguments to Unicode - interpreting them
|
|
as being unequal
|
|
False
|
|
>>> chr(127) == unichr(127) # chr(127) can be converted
|
|
True
|
|
|
|
Previously this would raise a :class:`UnicodeDecodeError` exception, but in 2.5
|
|
this could result in puzzling problems when accessing a dictionary. If you
|
|
looked up ``unichr(128)`` and ``chr(128)`` was being used as a key, you'd get a
|
|
:class:`UnicodeDecodeError` exception. Other changes in 2.5 resulted in this
|
|
exception being raised instead of suppressed by the code in :file:`dictobject.c`
|
|
that implements dictionaries.
|
|
|
|
Raising an exception for such a comparison is strictly correct, but the change
|
|
might have broken code, so instead :class:`UnicodeWarning` was introduced.
|
|
|
|
(Implemented by Marc-André Lemburg.)
|
|
|
|
* One error that Python programmers sometimes make is forgetting to include an
|
|
:file:`__init__.py` module in a package directory. Debugging this mistake can be
|
|
confusing, and usually requires running Python with the :option:`-v` switch to
|
|
log all the paths searched. In Python 2.5, a new :exc:`ImportWarning` warning is
|
|
triggered when an import would have picked up a directory as a package but no
|
|
:file:`__init__.py` was found. This warning is silently ignored by default;
|
|
provide the :option:`-Wd` option when running the Python executable to display
|
|
the warning message. (Implemented by Thomas Wouters.)
|
|
|
|
* The list of base classes in a class definition can now be empty. As an
|
|
example, this is now legal::
|
|
|
|
class C():
|
|
pass
|
|
|
|
(Implemented by Brett Cannon.)
|
|
|
|
.. ======================================================================
|
|
|
|
|
|
.. _25interactive:
|
|
|
|
Interactive Interpreter Changes
|
|
-------------------------------
|
|
|
|
In the interactive interpreter, ``quit`` and ``exit`` have long been strings so
|
|
that new users get a somewhat helpful message when they try to quit::
|
|
|
|
>>> quit
|
|
'Use Ctrl-D (i.e. EOF) to exit.'
|
|
|
|
In Python 2.5, ``quit`` and ``exit`` are now objects that still produce string
|
|
representations of themselves, but are also callable. Newbies who try ``quit()``
|
|
or ``exit()`` will now exit the interpreter as they expect. (Implemented by
|
|
Georg Brandl.)
|
|
|
|
The Python executable now accepts the standard long options :option:`--help`
|
|
and :option:`--version`; on Windows, it also accepts the :option:`/?` option
|
|
for displaying a help message. (Implemented by Georg Brandl.)
|
|
|
|
.. ======================================================================
|
|
|
|
|
|
.. _opts:
|
|
|
|
Optimizations
|
|
-------------
|
|
|
|
Several of the optimizations were developed at the NeedForSpeed sprint, an event
|
|
held in Reykjavik, Iceland, from May 21--28 2006. The sprint focused on speed
|
|
enhancements to the CPython implementation and was funded by EWT LLC with local
|
|
support from CCP Games. Those optimizations added at this sprint are specially
|
|
marked in the following list.
|
|
|
|
* When they were introduced in Python 2.4, the built-in :class:`set` and
|
|
:class:`frozenset` types were built on top of Python's dictionary type. In 2.5
|
|
the internal data structure has been customized for implementing sets, and as a
|
|
result sets will use a third less memory and are somewhat faster. (Implemented
|
|
by Raymond Hettinger.)
|
|
|
|
* The speed of some Unicode operations, such as finding substrings, string
|
|
splitting, and character map encoding and decoding, has been improved.
|
|
(Substring search and splitting improvements were added by Fredrik Lundh and
|
|
Andrew Dalke at the NeedForSpeed sprint. Character maps were improved by Walter
|
|
Dörwald and Martin von Löwis.)
|
|
|
|
.. Patch 1313939, 1359618
|
|
|
|
* The :func:`long(str, base)` function is now faster on long digit strings
|
|
because fewer intermediate results are calculated. The peak is for strings of
|
|
around 800--1000 digits where the function is 6 times faster. (Contributed by
|
|
Alan McIntyre and committed at the NeedForSpeed sprint.)
|
|
|
|
.. Patch 1442927
|
|
|
|
* It's now illegal to mix iterating over a file with ``for line in file`` and
|
|
calling the file object's :meth:`read`/:meth:`readline`/:meth:`readlines`
|
|
methods. Iteration uses an internal buffer and the :meth:`read\*` methods
|
|
don't use that buffer. Instead they would return the data following the
|
|
buffer, causing the data to appear out of order. Mixing iteration and these
|
|
methods will now trigger a :exc:`ValueError` from the :meth:`read\*` method.
|
|
(Implemented by Thomas Wouters.)
|
|
|
|
.. Patch 1397960
|
|
|
|
* The :mod:`struct` module now compiles structure format strings into an
|
|
internal representation and caches this representation, yielding a 20% speedup.
|
|
(Contributed by Bob Ippolito at the NeedForSpeed sprint.)
|
|
|
|
* The :mod:`re` module got a 1 or 2% speedup by switching to Python's allocator
|
|
functions instead of the system's :cfunc:`malloc` and :cfunc:`free`.
|
|
(Contributed by Jack Diederich at the NeedForSpeed sprint.)
|
|
|
|
* The code generator's peephole optimizer now performs simple constant folding
|
|
in expressions. If you write something like ``a = 2+3``, the code generator
|
|
will do the arithmetic and produce code corresponding to ``a = 5``. (Proposed
|
|
and implemented by Raymond Hettinger.)
|
|
|
|
* Function calls are now faster because code objects now keep the most recently
|
|
finished frame (a "zombie frame") in an internal field of the code object,
|
|
reusing it the next time the code object is invoked. (Original patch by Michael
|
|
Hudson, modified by Armin Rigo and Richard Jones; committed at the NeedForSpeed
|
|
sprint.) Frame objects are also slightly smaller, which may improve cache
|
|
locality and reduce memory usage a bit. (Contributed by Neal Norwitz.)
|
|
|
|
.. Patch 876206
|
|
.. Patch 1337051
|
|
|
|
* Python's built-in exceptions are now new-style classes, a change that speeds
|
|
up instantiation considerably. Exception handling in Python 2.5 is therefore
|
|
about 30% faster than in 2.4. (Contributed by Richard Jones, Georg Brandl and
|
|
Sean Reifschneider at the NeedForSpeed sprint.)
|
|
|
|
* Importing now caches the paths tried, recording whether they exist or not so
|
|
that the interpreter makes fewer :cfunc:`open` and :cfunc:`stat` calls on
|
|
startup. (Contributed by Martin von Löwis and Georg Brandl.)
|
|
|
|
.. Patch 921466
|
|
|
|
.. ======================================================================
|
|
|
|
|
|
.. _25modules:
|
|
|
|
New, Improved, and Removed Modules
|
|
==================================
|
|
|
|
The standard library received many enhancements and bug fixes in Python 2.5.
|
|
Here's a partial list of the most notable changes, sorted alphabetically by
|
|
module name. Consult the :file:`Misc/NEWS` file in the source tree for a more
|
|
complete list of changes, or look through the SVN logs for all the details.
|
|
|
|
* The :mod:`audioop` module now supports the a-LAW encoding, and the code for
|
|
u-LAW encoding has been improved. (Contributed by Lars Immisch.)
|
|
|
|
* The :mod:`codecs` module gained support for incremental codecs. The
|
|
:func:`codec.lookup` function now returns a :class:`CodecInfo` instance instead
|
|
of a tuple. :class:`CodecInfo` instances behave like a 4-tuple to preserve
|
|
backward compatibility but also have the attributes :attr:`encode`,
|
|
:attr:`decode`, :attr:`incrementalencoder`, :attr:`incrementaldecoder`,
|
|
:attr:`streamwriter`, and :attr:`streamreader`. Incremental codecs can receive
|
|
input and produce output in multiple chunks; the output is the same as if the
|
|
entire input was fed to the non-incremental codec. See the :mod:`codecs` module
|
|
documentation for details. (Designed and implemented by Walter Dörwald.)
|
|
|
|
.. Patch 1436130
|
|
|
|
* The :mod:`collections` module gained a new type, :class:`defaultdict`, that
|
|
subclasses the standard :class:`dict` type. The new type mostly behaves like a
|
|
dictionary but constructs a default value when a key isn't present,
|
|
automatically adding it to the dictionary for the requested key value.
|
|
|
|
The first argument to :class:`defaultdict`'s constructor is a factory function
|
|
that gets called whenever a key is requested but not found. This factory
|
|
function receives no arguments, so you can use built-in type constructors such
|
|
as :func:`list` or :func:`int`. For example, you can make an index of words
|
|
based on their initial letter like this::
|
|
|
|
words = """Nel mezzo del cammin di nostra vita
|
|
mi ritrovai per una selva oscura
|
|
che la diritta via era smarrita""".lower().split()
|
|
|
|
index = defaultdict(list)
|
|
|
|
for w in words:
|
|
init_letter = w[0]
|
|
index[init_letter].append(w)
|
|
|
|
Printing ``index`` results in the following output::
|
|
|
|
defaultdict(<type 'list'>, {'c': ['cammin', 'che'], 'e': ['era'],
|
|
'd': ['del', 'di', 'diritta'], 'm': ['mezzo', 'mi'],
|
|
'l': ['la'], 'o': ['oscura'], 'n': ['nel', 'nostra'],
|
|
'p': ['per'], 's': ['selva', 'smarrita'],
|
|
'r': ['ritrovai'], 'u': ['una'], 'v': ['vita', 'via']}
|
|
|
|
(Contributed by Guido van Rossum.)
|
|
|
|
* The :class:`deque` double-ended queue type supplied by the :mod:`collections`
|
|
module now has a :meth:`remove(value)` method that removes the first occurrence
|
|
of *value* in the queue, raising :exc:`ValueError` if the value isn't found.
|
|
(Contributed by Raymond Hettinger.)
|
|
|
|
* New module: The :mod:`contextlib` module contains helper functions for use
|
|
with the new ':keyword:`with`' statement. See section :ref:`contextlibmod`
|
|
for more about this module.
|
|
|
|
* New module: The :mod:`cProfile` module is a C implementation of the existing
|
|
:mod:`profile` module that has much lower overhead. The module's interface is
|
|
the same as :mod:`profile`: you run ``cProfile.run('main()')`` to profile a
|
|
function, can save profile data to a file, etc. It's not yet known if the
|
|
Hotshot profiler, which is also written in C but doesn't match the
|
|
:mod:`profile` module's interface, will continue to be maintained in future
|
|
versions of Python. (Contributed by Armin Rigo.)
|
|
|
|
Also, the :mod:`pstats` module for analyzing the data measured by the profiler
|
|
now supports directing the output to any file object by supplying a *stream*
|
|
argument to the :class:`Stats` constructor. (Contributed by Skip Montanaro.)
|
|
|
|
* The :mod:`csv` module, which parses files in comma-separated value format,
|
|
received several enhancements and a number of bugfixes. You can now set the
|
|
maximum size in bytes of a field by calling the
|
|
:meth:`csv.field_size_limit(new_limit)` function; omitting the *new_limit*
|
|
argument will return the currently-set limit. The :class:`reader` class now has
|
|
a :attr:`line_num` attribute that counts the number of physical lines read from
|
|
the source; records can span multiple physical lines, so :attr:`line_num` is not
|
|
the same as the number of records read.
|
|
|
|
The CSV parser is now stricter about multi-line quoted fields. Previously, if a
|
|
line ended within a quoted field without a terminating newline character, a
|
|
newline would be inserted into the returned field. This behavior caused problems
|
|
when reading files that contained carriage return characters within fields, so
|
|
the code was changed to return the field without inserting newlines. As a
|
|
consequence, if newlines embedded within fields are important, the input should
|
|
be split into lines in a manner that preserves the newline characters.
|
|
|
|
(Contributed by Skip Montanaro and Andrew McNamara.)
|
|
|
|
* The :class:`datetime` class in the :mod:`datetime` module now has a
|
|
:meth:`strptime(string, format)` method for parsing date strings, contributed
|
|
by Josh Spoerri. It uses the same format characters as :func:`time.strptime` and
|
|
:func:`time.strftime`::
|
|
|
|
from datetime import datetime
|
|
|
|
ts = datetime.strptime('10:13:15 2006-03-07',
|
|
'%H:%M:%S %Y-%m-%d')
|
|
|
|
* The :meth:`SequenceMatcher.get_matching_blocks` method in the :mod:`difflib`
|
|
module now guarantees to return a minimal list of blocks describing matching
|
|
subsequences. Previously, the algorithm would occasionally break a block of
|
|
matching elements into two list entries. (Enhancement by Tim Peters.)
|
|
|
|
* The :mod:`doctest` module gained a ``SKIP`` option that keeps an example from
|
|
being executed at all. This is intended for code snippets that are usage
|
|
examples intended for the reader and aren't actually test cases.
|
|
|
|
An *encoding* parameter was added to the :func:`testfile` function and the
|
|
:class:`DocFileSuite` class to specify the file's encoding. This makes it
|
|
easier to use non-ASCII characters in tests contained within a docstring.
|
|
(Contributed by Bjorn Tillenius.)
|
|
|
|
.. Patch 1080727
|
|
|
|
* The :mod:`email` package has been updated to version 4.0. (Contributed by
|
|
Barry Warsaw.)
|
|
|
|
.. XXX need to provide some more detail here
|
|
|
|
* The :mod:`fileinput` module was made more flexible. Unicode filenames are now
|
|
supported, and a *mode* parameter that defaults to ``"r"`` was added to the
|
|
:func:`input` function to allow opening files in binary or universal-newline
|
|
mode. Another new parameter, *openhook*, lets you use a function other than
|
|
:func:`open` to open the input files. Once you're iterating over the set of
|
|
files, the :class:`FileInput` object's new :meth:`fileno` returns the file
|
|
descriptor for the currently opened file. (Contributed by Georg Brandl.)
|
|
|
|
* In the :mod:`gc` module, the new :func:`get_count` function returns a 3-tuple
|
|
containing the current collection counts for the three GC generations. This is
|
|
accounting information for the garbage collector; when these counts reach a
|
|
specified threshold, a garbage collection sweep will be made. The existing
|
|
:func:`gc.collect` function now takes an optional *generation* argument of 0, 1,
|
|
or 2 to specify which generation to collect. (Contributed by Barry Warsaw.)
|
|
|
|
* The :func:`nsmallest` and :func:`nlargest` functions in the :mod:`heapq`
|
|
module now support a ``key`` keyword parameter similar to the one provided by
|
|
the :func:`min`/:func:`max` functions and the :meth:`sort` methods. For
|
|
example::
|
|
|
|
>>> import heapq
|
|
>>> L = ["short", 'medium', 'longest', 'longer still']
|
|
>>> heapq.nsmallest(2, L) # Return two lowest elements, lexicographically
|
|
['longer still', 'longest']
|
|
>>> heapq.nsmallest(2, L, key=len) # Return two shortest elements
|
|
['short', 'medium']
|
|
|
|
(Contributed by Raymond Hettinger.)
|
|
|
|
* The :func:`itertools.islice` function now accepts ``None`` for the start and
|
|
step arguments. This makes it more compatible with the attributes of slice
|
|
objects, so that you can now write the following::
|
|
|
|
s = slice(5) # Create slice object
|
|
itertools.islice(iterable, s.start, s.stop, s.step)
|
|
|
|
(Contributed by Raymond Hettinger.)
|
|
|
|
* The :func:`format` function in the :mod:`locale` module has been modified and
|
|
two new functions were added, :func:`format_string` and :func:`currency`.
|
|
|
|
The :func:`format` function's *val* parameter could previously be a string as
|
|
long as no more than one %char specifier appeared; now the parameter must be
|
|
exactly one %char specifier with no surrounding text. An optional *monetary*
|
|
parameter was also added which, if ``True``, will use the locale's rules for
|
|
formatting currency in placing a separator between groups of three digits.
|
|
|
|
To format strings with multiple %char specifiers, use the new
|
|
:func:`format_string` function that works like :func:`format` but also supports
|
|
mixing %char specifiers with arbitrary text.
|
|
|
|
A new :func:`currency` function was also added that formats a number according
|
|
to the current locale's settings.
|
|
|
|
(Contributed by Georg Brandl.)
|
|
|
|
.. Patch 1180296
|
|
|
|
* The :mod:`mailbox` module underwent a massive rewrite to add the capability to
|
|
modify mailboxes in addition to reading them. A new set of classes that include
|
|
:class:`mbox`, :class:`MH`, and :class:`Maildir` are used to read mailboxes, and
|
|
have an :meth:`add(message)` method to add messages, :meth:`remove(key)` to
|
|
remove messages, and :meth:`lock`/:meth:`unlock` to lock/unlock the mailbox.
|
|
The following example converts a maildir-format mailbox into an mbox-format
|
|
one::
|
|
|
|
import mailbox
|
|
|
|
# 'factory=None' uses email.Message.Message as the class representing
|
|
# individual messages.
|
|
src = mailbox.Maildir('maildir', factory=None)
|
|
dest = mailbox.mbox('/tmp/mbox')
|
|
|
|
for msg in src:
|
|
dest.add(msg)
|
|
|
|
(Contributed by Gregory K. Johnson. Funding was provided by Google's 2005
|
|
Summer of Code.)
|
|
|
|
* New module: the :mod:`msilib` module allows creating Microsoft Installer
|
|
:file:`.msi` files and CAB files. Some support for reading the :file:`.msi`
|
|
database is also included. (Contributed by Martin von Löwis.)
|
|
|
|
* The :mod:`nis` module now supports accessing domains other than the system
|
|
default domain by supplying a *domain* argument to the :func:`nis.match` and
|
|
:func:`nis.maps` functions. (Contributed by Ben Bell.)
|
|
|
|
* The :mod:`operator` module's :func:`itemgetter` and :func:`attrgetter`
|
|
functions now support multiple fields. A call such as
|
|
``operator.attrgetter('a', 'b')`` will return a function that retrieves the
|
|
:attr:`a` and :attr:`b` attributes. Combining this new feature with the
|
|
:meth:`sort` method's ``key`` parameter lets you easily sort lists using
|
|
multiple fields. (Contributed by Raymond Hettinger.)
|
|
|
|
* The :mod:`optparse` module was updated to version 1.5.1 of the Optik library.
|
|
The :class:`OptionParser` class gained an :attr:`epilog` attribute, a string
|
|
that will be printed after the help message, and a :meth:`destroy` method to
|
|
break reference cycles created by the object. (Contributed by Greg Ward.)
|
|
|
|
* The :mod:`os` module underwent several changes. The :attr:`stat_float_times`
|
|
variable now defaults to true, meaning that :func:`os.stat` will now return time
|
|
values as floats. (This doesn't necessarily mean that :func:`os.stat` will
|
|
return times that are precise to fractions of a second; not all systems support
|
|
such precision.)
|
|
|
|
Constants named :attr:`os.SEEK_SET`, :attr:`os.SEEK_CUR`, and
|
|
:attr:`os.SEEK_END` have been added; these are the parameters to the
|
|
:func:`os.lseek` function. Two new constants for locking are
|
|
:attr:`os.O_SHLOCK` and :attr:`os.O_EXLOCK`.
|
|
|
|
Two new functions, :func:`wait3` and :func:`wait4`, were added. They're similar
|
|
the :func:`waitpid` function which waits for a child process to exit and returns
|
|
a tuple of the process ID and its exit status, but :func:`wait3` and
|
|
:func:`wait4` return additional information. :func:`wait3` doesn't take a
|
|
process ID as input, so it waits for any child process to exit and returns a
|
|
3-tuple of *process-id*, *exit-status*, *resource-usage* as returned from the
|
|
:func:`resource.getrusage` function. :func:`wait4(pid)` does take a process ID.
|
|
(Contributed by Chad J. Schroeder.)
|
|
|
|
On FreeBSD, the :func:`os.stat` function now returns times with nanosecond
|
|
resolution, and the returned object now has :attr:`st_gen` and
|
|
:attr:`st_birthtime`. The :attr:`st_flags` member is also available, if the
|
|
platform supports it. (Contributed by Antti Louko and Diego Pettenò.)
|
|
|
|
.. (Patch 1180695, 1212117)
|
|
|
|
* The Python debugger provided by the :mod:`pdb` module can now store lists of
|
|
commands to execute when a breakpoint is reached and execution stops. Once
|
|
breakpoint #1 has been created, enter ``commands 1`` and enter a series of
|
|
commands to be executed, finishing the list with ``end``. The command list can
|
|
include commands that resume execution, such as ``continue`` or ``next``.
|
|
(Contributed by Grégoire Dooms.)
|
|
|
|
.. Patch 790710
|
|
|
|
* The :mod:`pickle` and :mod:`cPickle` modules no longer accept a return value
|
|
of ``None`` from the :meth:`__reduce__` method; the method must return a tuple
|
|
of arguments instead. The ability to return ``None`` was deprecated in Python
|
|
2.4, so this completes the removal of the feature.
|
|
|
|
* The :mod:`pkgutil` module, containing various utility functions for finding
|
|
packages, was enhanced to support PEP 302's import hooks and now also works for
|
|
packages stored in ZIP-format archives. (Contributed by Phillip J. Eby.)
|
|
|
|
* The pybench benchmark suite by Marc-André Lemburg is now included in the
|
|
:file:`Tools/pybench` directory. The pybench suite is an improvement on the
|
|
commonly used :file:`pystone.py` program because pybench provides a more
|
|
detailed measurement of the interpreter's speed. It times particular operations
|
|
such as function calls, tuple slicing, method lookups, and numeric operations,
|
|
instead of performing many different operations and reducing the result to a
|
|
single number as :file:`pystone.py` does.
|
|
|
|
* The :mod:`pyexpat` module now uses version 2.0 of the Expat parser.
|
|
(Contributed by Trent Mick.)
|
|
|
|
* The :class:`Queue` class provided by the :mod:`Queue` module gained two new
|
|
methods. :meth:`join` blocks until all items in the queue have been retrieved
|
|
and all processing work on the items have been completed. Worker threads call
|
|
the other new method, :meth:`task_done`, to signal that processing for an item
|
|
has been completed. (Contributed by Raymond Hettinger.)
|
|
|
|
* The old :mod:`regex` and :mod:`regsub` modules, which have been deprecated
|
|
ever since Python 2.0, have finally been deleted. Other deleted modules:
|
|
:mod:`statcache`, :mod:`tzparse`, :mod:`whrandom`.
|
|
|
|
* Also deleted: the :file:`lib-old` directory, which includes ancient modules
|
|
such as :mod:`dircmp` and :mod:`ni`, was removed. :file:`lib-old` wasn't on the
|
|
default ``sys.path``, so unless your programs explicitly added the directory to
|
|
``sys.path``, this removal shouldn't affect your code.
|
|
|
|
* The :mod:`rlcompleter` module is no longer dependent on importing the
|
|
:mod:`readline` module and therefore now works on non-Unix platforms. (Patch
|
|
from Robert Kiendl.)
|
|
|
|
.. Patch #1472854
|
|
|
|
* The :mod:`SimpleXMLRPCServer` and :mod:`DocXMLRPCServer` classes now have a
|
|
:attr:`rpc_paths` attribute that constrains XML-RPC operations to a limited set
|
|
of URL paths; the default is to allow only ``'/'`` and ``'/RPC2'``. Setting
|
|
:attr:`rpc_paths` to ``None`` or an empty tuple disables this path checking.
|
|
|
|
.. Bug #1473048
|
|
|
|
* The :mod:`socket` module now supports :const:`AF_NETLINK` sockets on Linux,
|
|
thanks to a patch from Philippe Biondi. Netlink sockets are a Linux-specific
|
|
mechanism for communications between a user-space process and kernel code; an
|
|
introductory article about them is at http://www.linuxjournal.com/article/7356.
|
|
In Python code, netlink addresses are represented as a tuple of 2 integers,
|
|
``(pid, group_mask)``.
|
|
|
|
Two new methods on socket objects, :meth:`recv_into(buffer)` and
|
|
:meth:`recvfrom_into(buffer)`, store the received data in an object that
|
|
supports the buffer protocol instead of returning the data as a string. This
|
|
means you can put the data directly into an array or a memory-mapped file.
|
|
|
|
Socket objects also gained :meth:`getfamily`, :meth:`gettype`, and
|
|
:meth:`getproto` accessor methods to retrieve the family, type, and protocol
|
|
values for the socket.
|
|
|
|
* New module: the :mod:`spwd` module provides functions for accessing the shadow
|
|
password database on systems that support shadow passwords.
|
|
|
|
* The :mod:`struct` is now faster because it compiles format strings into
|
|
:class:`Struct` objects with :meth:`pack` and :meth:`unpack` methods. This is
|
|
similar to how the :mod:`re` module lets you create compiled regular expression
|
|
objects. You can still use the module-level :func:`pack` and :func:`unpack`
|
|
functions; they'll create :class:`Struct` objects and cache them. Or you can
|
|
use :class:`Struct` instances directly::
|
|
|
|
s = struct.Struct('ih3s')
|
|
|
|
data = s.pack(1972, 187, 'abc')
|
|
year, number, name = s.unpack(data)
|
|
|
|
You can also pack and unpack data to and from buffer objects directly using the
|
|
:meth:`pack_into(buffer, offset, v1, v2, ...)` and :meth:`unpack_from(buffer,
|
|
offset)` methods. This lets you store data directly into an array or a memory-
|
|
mapped file.
|
|
|
|
(:class:`Struct` objects were implemented by Bob Ippolito at the NeedForSpeed
|
|
sprint. Support for buffer objects was added by Martin Blais, also at the
|
|
NeedForSpeed sprint.)
|
|
|
|
* The Python developers switched from CVS to Subversion during the 2.5
|
|
development process. Information about the exact build version is available as
|
|
the ``sys.subversion`` variable, a 3-tuple of ``(interpreter-name, branch-name,
|
|
revision-range)``. For example, at the time of writing my copy of 2.5 was
|
|
reporting ``('CPython', 'trunk', '45313:45315')``.
|
|
|
|
This information is also available to C extensions via the
|
|
:cfunc:`Py_GetBuildInfo` function that returns a string of build information
|
|
like this: ``"trunk:45355:45356M, Apr 13 2006, 07:42:19"``. (Contributed by
|
|
Barry Warsaw.)
|
|
|
|
* Another new function, :func:`sys._current_frames`, returns the current stack
|
|
frames for all running threads as a dictionary mapping thread identifiers to the
|
|
topmost stack frame currently active in that thread at the time the function is
|
|
called. (Contributed by Tim Peters.)
|
|
|
|
* The :class:`TarFile` class in the :mod:`tarfile` module now has an
|
|
:meth:`extractall` method that extracts all members from the archive into the
|
|
current working directory. It's also possible to set a different directory as
|
|
the extraction target, and to unpack only a subset of the archive's members.
|
|
|
|
The compression used for a tarfile opened in stream mode can now be autodetected
|
|
using the mode ``'r|*'``. (Contributed by Lars Gustäbel.)
|
|
|
|
.. patch 918101
|
|
|
|
* The :mod:`threading` module now lets you set the stack size used when new
|
|
threads are created. The :func:`stack_size([*size*])` function returns the
|
|
currently configured stack size, and supplying the optional *size* parameter
|
|
sets a new value. Not all platforms support changing the stack size, but
|
|
Windows, POSIX threading, and OS/2 all do. (Contributed by Andrew MacIntyre.)
|
|
|
|
.. Patch 1454481
|
|
|
|
* The :mod:`unicodedata` module has been updated to use version 4.1.0 of the
|
|
Unicode character database. Version 3.2.0 is required by some specifications,
|
|
so it's still available as :attr:`unicodedata.ucd_3_2_0`.
|
|
|
|
* New module: the :mod:`uuid` module generates universally unique identifiers
|
|
(UUIDs) according to :rfc:`4122`. The RFC defines several different UUID
|
|
versions that are generated from a starting string, from system properties, or
|
|
purely randomly. This module contains a :class:`UUID` class and functions
|
|
named :func:`uuid1`, :func:`uuid3`, :func:`uuid4`, and :func:`uuid5` to
|
|
generate different versions of UUID. (Version 2 UUIDs are not specified in
|
|
:rfc:`4122` and are not supported by this module.) ::
|
|
|
|
>>> import uuid
|
|
>>> # make a UUID based on the host ID and current time
|
|
>>> uuid.uuid1()
|
|
UUID('a8098c1a-f86e-11da-bd1a-00112444be1e')
|
|
|
|
>>> # make a UUID using an MD5 hash of a namespace UUID and a name
|
|
>>> uuid.uuid3(uuid.NAMESPACE_DNS, 'python.org')
|
|
UUID('6fa459ea-ee8a-3ca4-894e-db77e160355e')
|
|
|
|
>>> # make a random UUID
|
|
>>> uuid.uuid4()
|
|
UUID('16fd2706-8baf-433b-82eb-8c7fada847da')
|
|
|
|
>>> # make a UUID using a SHA-1 hash of a namespace UUID and a name
|
|
>>> uuid.uuid5(uuid.NAMESPACE_DNS, 'python.org')
|
|
UUID('886313e1-3b8a-5372-9b90-0c9aee199e5d')
|
|
|
|
(Contributed by Ka-Ping Yee.)
|
|
|
|
* The :mod:`weakref` module's :class:`WeakKeyDictionary` and
|
|
:class:`WeakValueDictionary` types gained new methods for iterating over the
|
|
weak references contained in the dictionary. :meth:`iterkeyrefs` and
|
|
:meth:`keyrefs` methods were added to :class:`WeakKeyDictionary`, and
|
|
:meth:`itervaluerefs` and :meth:`valuerefs` were added to
|
|
:class:`WeakValueDictionary`. (Contributed by Fred L. Drake, Jr.)
|
|
|
|
* The :mod:`webbrowser` module received a number of enhancements. It's now
|
|
usable as a script with ``python -m webbrowser``, taking a URL as the argument;
|
|
there are a number of switches to control the behaviour (:option:`-n` for a new
|
|
browser window, :option:`-t` for a new tab). New module-level functions,
|
|
:func:`open_new` and :func:`open_new_tab`, were added to support this. The
|
|
module's :func:`open` function supports an additional feature, an *autoraise*
|
|
parameter that signals whether to raise the open window when possible. A number
|
|
of additional browsers were added to the supported list such as Firefox, Opera,
|
|
Konqueror, and elinks. (Contributed by Oleg Broytmann and Georg Brandl.)
|
|
|
|
.. Patch #754022
|
|
|
|
* The :mod:`xmlrpclib` module now supports returning :class:`datetime` objects
|
|
for the XML-RPC date type. Supply ``use_datetime=True`` to the :func:`loads`
|
|
function or the :class:`Unmarshaller` class to enable this feature. (Contributed
|
|
by Skip Montanaro.)
|
|
|
|
.. Patch 1120353
|
|
|
|
* The :mod:`zipfile` module now supports the ZIP64 version of the format,
|
|
meaning that a .zip archive can now be larger than 4 GiB and can contain
|
|
individual files larger than 4 GiB. (Contributed by Ronald Oussoren.)
|
|
|
|
.. Patch 1446489
|
|
|
|
* The :mod:`zlib` module's :class:`Compress` and :class:`Decompress` objects now
|
|
support a :meth:`copy` method that makes a copy of the object's internal state
|
|
and returns a new :class:`Compress` or :class:`Decompress` object.
|
|
(Contributed by Chris AtLee.)
|
|
|
|
.. Patch 1435422
|
|
|
|
.. ======================================================================
|
|
|
|
|
|
.. _module-ctypes:
|
|
|
|
The ctypes package
|
|
------------------
|
|
|
|
The :mod:`ctypes` package, written by Thomas Heller, has been added to the
|
|
standard library. :mod:`ctypes` lets you call arbitrary functions in shared
|
|
libraries or DLLs. Long-time users may remember the :mod:`dl` module, which
|
|
provides functions for loading shared libraries and calling functions in them.
|
|
The :mod:`ctypes` package is much fancier.
|
|
|
|
To load a shared library or DLL, you must create an instance of the
|
|
:class:`CDLL` class and provide the name or path of the shared library or DLL.
|
|
Once that's done, you can call arbitrary functions by accessing them as
|
|
attributes of the :class:`CDLL` object. ::
|
|
|
|
import ctypes
|
|
|
|
libc = ctypes.CDLL('libc.so.6')
|
|
result = libc.printf("Line of output\n")
|
|
|
|
Type constructors for the various C types are provided: :func:`c_int`,
|
|
:func:`c_float`, :func:`c_double`, :func:`c_char_p` (equivalent to :ctype:`char
|
|
\*`), and so forth. Unlike Python's types, the C versions are all mutable; you
|
|
can assign to their :attr:`value` attribute to change the wrapped value. Python
|
|
integers and strings will be automatically converted to the corresponding C
|
|
types, but for other types you must call the correct type constructor. (And I
|
|
mean *must*; getting it wrong will often result in the interpreter crashing
|
|
with a segmentation fault.)
|
|
|
|
You shouldn't use :func:`c_char_p` with a Python string when the C function will
|
|
be modifying the memory area, because Python strings are supposed to be
|
|
immutable; breaking this rule will cause puzzling bugs. When you need a
|
|
modifiable memory area, use :func:`create_string_buffer`::
|
|
|
|
s = "this is a string"
|
|
buf = ctypes.create_string_buffer(s)
|
|
libc.strfry(buf)
|
|
|
|
C functions are assumed to return integers, but you can set the :attr:`restype`
|
|
attribute of the function object to change this::
|
|
|
|
>>> libc.atof('2.71828')
|
|
-1783957616
|
|
>>> libc.atof.restype = ctypes.c_double
|
|
>>> libc.atof('2.71828')
|
|
2.71828
|
|
|
|
:mod:`ctypes` also provides a wrapper for Python's C API as the
|
|
``ctypes.pythonapi`` object. This object does *not* release the global
|
|
interpreter lock before calling a function, because the lock must be held when
|
|
calling into the interpreter's code. There's a :class:`py_object()` type
|
|
constructor that will create a :ctype:`PyObject \*` pointer. A simple usage::
|
|
|
|
import ctypes
|
|
|
|
d = {}
|
|
ctypes.pythonapi.PyObject_SetItem(ctypes.py_object(d),
|
|
ctypes.py_object("abc"), ctypes.py_object(1))
|
|
# d is now {'abc', 1}.
|
|
|
|
Don't forget to use :class:`py_object()`; if it's omitted you end up with a
|
|
segmentation fault.
|
|
|
|
:mod:`ctypes` has been around for a while, but people still write and
|
|
distribution hand-coded extension modules because you can't rely on
|
|
:mod:`ctypes` being present. Perhaps developers will begin to write Python
|
|
wrappers atop a library accessed through :mod:`ctypes` instead of extension
|
|
modules, now that :mod:`ctypes` is included with core Python.
|
|
|
|
|
|
.. seealso::
|
|
|
|
http://starship.python.net/crew/theller/ctypes/
|
|
The ctypes web page, with a tutorial, reference, and FAQ.
|
|
|
|
The documentation for the :mod:`ctypes` module.
|
|
|
|
.. ======================================================================
|
|
|
|
|
|
.. _module-etree:
|
|
|
|
The ElementTree package
|
|
-----------------------
|
|
|
|
A subset of Fredrik Lundh's ElementTree library for processing XML has been
|
|
added to the standard library as :mod:`xml.etree`. The available modules are
|
|
:mod:`ElementTree`, :mod:`ElementPath`, and :mod:`ElementInclude` from
|
|
ElementTree 1.2.6. The :mod:`cElementTree` accelerator module is also
|
|
included.
|
|
|
|
The rest of this section will provide a brief overview of using ElementTree.
|
|
Full documentation for ElementTree is available at http://effbot.org/zone
|
|
/element-index.htm.
|
|
|
|
ElementTree represents an XML document as a tree of element nodes. The text
|
|
content of the document is stored as the :attr:`.text` and :attr:`.tail`
|
|
attributes of (This is one of the major differences between ElementTree and
|
|
the Document Object Model; in the DOM there are many different types of node,
|
|
including :class:`TextNode`.)
|
|
|
|
The most commonly used parsing function is :func:`parse`, that takes either a
|
|
string (assumed to contain a filename) or a file-like object and returns an
|
|
:class:`ElementTree` instance::
|
|
|
|
from xml.etree import ElementTree as ET
|
|
|
|
tree = ET.parse('ex-1.xml')
|
|
|
|
feed = urllib.urlopen(
|
|
'http://planet.python.org/rss10.xml')
|
|
tree = ET.parse(feed)
|
|
|
|
Once you have an :class:`ElementTree` instance, you can call its :meth:`getroot`
|
|
method to get the root :class:`Element` node.
|
|
|
|
There's also an :func:`XML` function that takes a string literal and returns an
|
|
:class:`Element` node (not an :class:`ElementTree`). This function provides a
|
|
tidy way to incorporate XML fragments, approaching the convenience of an XML
|
|
literal::
|
|
|
|
svg = ET.XML("""<svg width="10px" version="1.0">
|
|
</svg>""")
|
|
svg.set('height', '320px')
|
|
svg.append(elem1)
|
|
|
|
Each XML element supports some dictionary-like and some list-like access
|
|
methods. Dictionary-like operations are used to access attribute values, and
|
|
list-like operations are used to access child nodes.
|
|
|
|
+-------------------------------+--------------------------------------------+
|
|
| Operation | Result |
|
|
+===============================+============================================+
|
|
| ``elem[n]`` | Returns n'th child element. |
|
|
+-------------------------------+--------------------------------------------+
|
|
| ``elem[m:n]`` | Returns list of m'th through n'th child |
|
|
| | elements. |
|
|
+-------------------------------+--------------------------------------------+
|
|
| ``len(elem)`` | Returns number of child elements. |
|
|
+-------------------------------+--------------------------------------------+
|
|
| ``list(elem)`` | Returns list of child elements. |
|
|
+-------------------------------+--------------------------------------------+
|
|
| ``elem.append(elem2)`` | Adds *elem2* as a child. |
|
|
+-------------------------------+--------------------------------------------+
|
|
| ``elem.insert(index, elem2)`` | Inserts *elem2* at the specified location. |
|
|
+-------------------------------+--------------------------------------------+
|
|
| ``del elem[n]`` | Deletes n'th child element. |
|
|
+-------------------------------+--------------------------------------------+
|
|
| ``elem.keys()`` | Returns list of attribute names. |
|
|
+-------------------------------+--------------------------------------------+
|
|
| ``elem.get(name)`` | Returns value of attribute *name*. |
|
|
+-------------------------------+--------------------------------------------+
|
|
| ``elem.set(name, value)`` | Sets new value for attribute *name*. |
|
|
+-------------------------------+--------------------------------------------+
|
|
| ``elem.attrib`` | Retrieves the dictionary containing |
|
|
| | attributes. |
|
|
+-------------------------------+--------------------------------------------+
|
|
| ``del elem.attrib[name]`` | Deletes attribute *name*. |
|
|
+-------------------------------+--------------------------------------------+
|
|
|
|
Comments and processing instructions are also represented as :class:`Element`
|
|
nodes. To check if a node is a comment or processing instructions::
|
|
|
|
if elem.tag is ET.Comment:
|
|
...
|
|
elif elem.tag is ET.ProcessingInstruction:
|
|
...
|
|
|
|
To generate XML output, you should call the :meth:`ElementTree.write` method.
|
|
Like :func:`parse`, it can take either a string or a file-like object::
|
|
|
|
# Encoding is US-ASCII
|
|
tree.write('output.xml')
|
|
|
|
# Encoding is UTF-8
|
|
f = open('output.xml', 'w')
|
|
tree.write(f, encoding='utf-8')
|
|
|
|
(Caution: the default encoding used for output is ASCII. For general XML work,
|
|
where an element's name may contain arbitrary Unicode characters, ASCII isn't a
|
|
very useful encoding because it will raise an exception if an element's name
|
|
contains any characters with values greater than 127. Therefore, it's best to
|
|
specify a different encoding such as UTF-8 that can handle any Unicode
|
|
character.)
|
|
|
|
This section is only a partial description of the ElementTree interfaces. Please
|
|
read the package's official documentation for more details.
|
|
|
|
|
|
.. seealso::
|
|
|
|
http://effbot.org/zone/element-index.htm
|
|
Official documentation for ElementTree.
|
|
|
|
.. ======================================================================
|
|
|
|
|
|
.. _module-hashlib:
|
|
|
|
The hashlib package
|
|
-------------------
|
|
|
|
A new :mod:`hashlib` module, written by Gregory P. Smith, has been added to
|
|
replace the :mod:`md5` and :mod:`sha` modules. :mod:`hashlib` adds support for
|
|
additional secure hashes (SHA-224, SHA-256, SHA-384, and SHA-512). When
|
|
available, the module uses OpenSSL for fast platform optimized implementations
|
|
of algorithms.
|
|
|
|
The old :mod:`md5` and :mod:`sha` modules still exist as wrappers around hashlib
|
|
to preserve backwards compatibility. The new module's interface is very close
|
|
to that of the old modules, but not identical. The most significant difference
|
|
is that the constructor functions for creating new hashing objects are named
|
|
differently. ::
|
|
|
|
# Old versions
|
|
h = md5.md5()
|
|
h = md5.new()
|
|
|
|
# New version
|
|
h = hashlib.md5()
|
|
|
|
# Old versions
|
|
h = sha.sha()
|
|
h = sha.new()
|
|
|
|
# New version
|
|
h = hashlib.sha1()
|
|
|
|
# Hash that weren't previously available
|
|
h = hashlib.sha224()
|
|
h = hashlib.sha256()
|
|
h = hashlib.sha384()
|
|
h = hashlib.sha512()
|
|
|
|
# Alternative form
|
|
h = hashlib.new('md5') # Provide algorithm as a string
|
|
|
|
Once a hash object has been created, its methods are the same as before:
|
|
:meth:`update(string)` hashes the specified string into the current digest
|
|
state, :meth:`digest` and :meth:`hexdigest` return the digest value as a binary
|
|
string or a string of hex digits, and :meth:`copy` returns a new hashing object
|
|
with the same digest state.
|
|
|
|
|
|
.. seealso::
|
|
|
|
The documentation for the :mod:`hashlib` module.
|
|
|
|
.. ======================================================================
|
|
|
|
|
|
.. _module-sqlite:
|
|
|
|
The sqlite3 package
|
|
-------------------
|
|
|
|
The pysqlite module (http://www.pysqlite.org), a wrapper for the SQLite embedded
|
|
database, has been added to the standard library under the package name
|
|
:mod:`sqlite3`.
|
|
|
|
SQLite is a C library that provides a lightweight disk-based database that
|
|
doesn't require a separate server process and allows accessing the database
|
|
using a nonstandard variant of the SQL query language. Some applications can use
|
|
SQLite for internal data storage. It's also possible to prototype an
|
|
application using SQLite and then port the code to a larger database such as
|
|
PostgreSQL or Oracle.
|
|
|
|
pysqlite was written by Gerhard Häring and provides a SQL interface compliant
|
|
with the DB-API 2.0 specification described by :pep:`249`.
|
|
|
|
If you're compiling the Python source yourself, note that the source tree
|
|
doesn't include the SQLite code, only the wrapper module. You'll need to have
|
|
the SQLite libraries and headers installed before compiling Python, and the
|
|
build process will compile the module when the necessary headers are available.
|
|
|
|
To use the module, you must first create a :class:`Connection` object that
|
|
represents the database. Here the data will be stored in the
|
|
:file:`/tmp/example` file::
|
|
|
|
conn = sqlite3.connect('/tmp/example')
|
|
|
|
You can also supply the special name ``:memory:`` to create a database in RAM.
|
|
|
|
Once you have a :class:`Connection`, you can create a :class:`Cursor` object
|
|
and call its :meth:`execute` method to perform SQL commands::
|
|
|
|
c = conn.cursor()
|
|
|
|
# Create table
|
|
c.execute('''create table stocks
|
|
(date text, trans text, symbol text,
|
|
qty real, price real)''')
|
|
|
|
# Insert a row of data
|
|
c.execute("""insert into stocks
|
|
values ('2006-01-05','BUY','RHAT',100,35.14)""")
|
|
|
|
Usually your SQL operations will need to use values from Python variables. You
|
|
shouldn't assemble your query using Python's string operations because doing so
|
|
is insecure; it makes your program vulnerable to an SQL injection attack.
|
|
|
|
Instead, use the DB-API's parameter substitution. Put ``?`` as a placeholder
|
|
wherever you want to use a value, and then provide a tuple of values as the
|
|
second argument to the cursor's :meth:`execute` method. (Other database modules
|
|
may use a different placeholder, such as ``%s`` or ``:1``.) For example::
|
|
|
|
# Never do this -- insecure!
|
|
symbol = 'IBM'
|
|
c.execute("... where symbol = '%s'" % symbol)
|
|
|
|
# Do this instead
|
|
t = (symbol,)
|
|
c.execute('select * from stocks where symbol=?', t)
|
|
|
|
# Larger example
|
|
for t in (('2006-03-28', 'BUY', 'IBM', 1000, 45.00),
|
|
('2006-04-05', 'BUY', 'MSOFT', 1000, 72.00),
|
|
('2006-04-06', 'SELL', 'IBM', 500, 53.00),
|
|
):
|
|
c.execute('insert into stocks values (?,?,?,?,?)', t)
|
|
|
|
To retrieve data after executing a SELECT statement, you can either treat the
|
|
cursor as an iterator, call the cursor's :meth:`fetchone` method to retrieve a
|
|
single matching row, or call :meth:`fetchall` to get a list of the matching
|
|
rows.
|
|
|
|
This example uses the iterator form::
|
|
|
|
>>> c = conn.cursor()
|
|
>>> c.execute('select * from stocks order by price')
|
|
>>> for row in c:
|
|
... print row
|
|
...
|
|
(u'2006-01-05', u'BUY', u'RHAT', 100, 35.140000000000001)
|
|
(u'2006-03-28', u'BUY', u'IBM', 1000, 45.0)
|
|
(u'2006-04-06', u'SELL', u'IBM', 500, 53.0)
|
|
(u'2006-04-05', u'BUY', u'MSOFT', 1000, 72.0)
|
|
>>>
|
|
|
|
For more information about the SQL dialect supported by SQLite, see
|
|
http://www.sqlite.org.
|
|
|
|
|
|
.. seealso::
|
|
|
|
http://www.pysqlite.org
|
|
The pysqlite web page.
|
|
|
|
http://www.sqlite.org
|
|
The SQLite web page; the documentation describes the syntax and the available
|
|
data types for the supported SQL dialect.
|
|
|
|
The documentation for the :mod:`sqlite3` module.
|
|
|
|
:pep:`249` - Database API Specification 2.0
|
|
PEP written by Marc-André Lemburg.
|
|
|
|
.. ======================================================================
|
|
|
|
|
|
.. _module-wsgiref:
|
|
|
|
The wsgiref package
|
|
-------------------
|
|
|
|
The Web Server Gateway Interface (WSGI) v1.0 defines a standard interface
|
|
between web servers and Python web applications and is described in :pep:`333`.
|
|
The :mod:`wsgiref` package is a reference implementation of the WSGI
|
|
specification.
|
|
|
|
.. XXX should this be in a PEP 333 section instead?
|
|
|
|
The package includes a basic HTTP server that will run a WSGI application; this
|
|
server is useful for debugging but isn't intended for production use. Setting
|
|
up a server takes only a few lines of code::
|
|
|
|
from wsgiref import simple_server
|
|
|
|
wsgi_app = ...
|
|
|
|
host = ''
|
|
port = 8000
|
|
httpd = simple_server.make_server(host, port, wsgi_app)
|
|
httpd.serve_forever()
|
|
|
|
.. XXX discuss structure of WSGI applications?
|
|
.. XXX provide an example using Django or some other framework?
|
|
|
|
|
|
.. seealso::
|
|
|
|
http://www.wsgi.org
|
|
A central web site for WSGI-related resources.
|
|
|
|
:pep:`333` - Python Web Server Gateway Interface v1.0
|
|
PEP written by Phillip J. Eby.
|
|
|
|
.. ======================================================================
|
|
|
|
|
|
.. _build-api:
|
|
|
|
Build and C API Changes
|
|
=======================
|
|
|
|
Changes to Python's build process and to the C API include:
|
|
|
|
* The Python source tree was converted from CVS to Subversion, in a complex
|
|
migration procedure that was supervised and flawlessly carried out by Martin von
|
|
Löwis. The procedure was developed as :pep:`347`.
|
|
|
|
* Coverity, a company that markets a source code analysis tool called Prevent,
|
|
provided the results of their examination of the Python source code. The
|
|
analysis found about 60 bugs that were quickly fixed. Many of the bugs were
|
|
refcounting problems, often occurring in error-handling code. See
|
|
http://scan.coverity.com for the statistics.
|
|
|
|
* The largest change to the C API came from :pep:`353`, which modifies the
|
|
interpreter to use a :ctype:`Py_ssize_t` type definition instead of
|
|
:ctype:`int`. See the earlier section :ref:`pep-353` for a discussion of this
|
|
change.
|
|
|
|
* The design of the bytecode compiler has changed a great deal, no longer
|
|
generating bytecode by traversing the parse tree. Instead the parse tree is
|
|
converted to an abstract syntax tree (or AST), and it is the abstract syntax
|
|
tree that's traversed to produce the bytecode.
|
|
|
|
It's possible for Python code to obtain AST objects by using the
|
|
:func:`compile` built-in and specifying ``_ast.PyCF_ONLY_AST`` as the value of
|
|
the *flags* parameter::
|
|
|
|
from _ast import PyCF_ONLY_AST
|
|
ast = compile("""a=0
|
|
for i in range(10):
|
|
a += i
|
|
""", "<string>", 'exec', PyCF_ONLY_AST)
|
|
|
|
assignment = ast.body[0]
|
|
for_loop = ast.body[1]
|
|
|
|
No official documentation has been written for the AST code yet, but :pep:`339`
|
|
discusses the design. To start learning about the code, read the definition of
|
|
the various AST nodes in :file:`Parser/Python.asdl`. A Python script reads this
|
|
file and generates a set of C structure definitions in
|
|
:file:`Include/Python-ast.h`. The :cfunc:`PyParser_ASTFromString` and
|
|
:cfunc:`PyParser_ASTFromFile`, defined in :file:`Include/pythonrun.h`, take
|
|
Python source as input and return the root of an AST representing the contents.
|
|
This AST can then be turned into a code object by :cfunc:`PyAST_Compile`. For
|
|
more information, read the source code, and then ask questions on python-dev.
|
|
|
|
The AST code was developed under Jeremy Hylton's management, and implemented by
|
|
(in alphabetical order) Brett Cannon, Nick Coghlan, Grant Edwards, John
|
|
Ehresman, Kurt Kaiser, Neal Norwitz, Tim Peters, Armin Rigo, and Neil
|
|
Schemenauer, plus the participants in a number of AST sprints at conferences
|
|
such as PyCon.
|
|
|
|
.. List of names taken from Jeremy's python-dev post at
|
|
.. http://mail.python.org/pipermail/python-dev/2005-October/057500.html
|
|
|
|
* Evan Jones's patch to obmalloc, first described in a talk at PyCon DC 2005,
|
|
was applied. Python 2.4 allocated small objects in 256K-sized arenas, but never
|
|
freed arenas. With this patch, Python will free arenas when they're empty. The
|
|
net effect is that on some platforms, when you allocate many objects, Python's
|
|
memory usage may actually drop when you delete them and the memory may be
|
|
returned to the operating system. (Implemented by Evan Jones, and reworked by
|
|
Tim Peters.)
|
|
|
|
Note that this change means extension modules must be more careful when
|
|
allocating memory. Python's API has many different functions for allocating
|
|
memory that are grouped into families. For example, :cfunc:`PyMem_Malloc`,
|
|
:cfunc:`PyMem_Realloc`, and :cfunc:`PyMem_Free` are one family that allocates
|
|
raw memory, while :cfunc:`PyObject_Malloc`, :cfunc:`PyObject_Realloc`, and
|
|
:cfunc:`PyObject_Free` are another family that's supposed to be used for
|
|
creating Python objects.
|
|
|
|
Previously these different families all reduced to the platform's
|
|
:cfunc:`malloc` and :cfunc:`free` functions. This meant it didn't matter if
|
|
you got things wrong and allocated memory with the :cfunc:`PyMem` function but
|
|
freed it with the :cfunc:`PyObject` function. With 2.5's changes to obmalloc,
|
|
these families now do different things and mismatches will probably result in a
|
|
segfault. You should carefully test your C extension modules with Python 2.5.
|
|
|
|
* The built-in set types now have an official C API. Call :cfunc:`PySet_New`
|
|
and :cfunc:`PyFrozenSet_New` to create a new set, :cfunc:`PySet_Add` and
|
|
:cfunc:`PySet_Discard` to add and remove elements, and :cfunc:`PySet_Contains`
|
|
and :cfunc:`PySet_Size` to examine the set's state. (Contributed by Raymond
|
|
Hettinger.)
|
|
|
|
* C code can now obtain information about the exact revision of the Python
|
|
interpreter by calling the :cfunc:`Py_GetBuildInfo` function that returns a
|
|
string of build information like this: ``"trunk:45355:45356M, Apr 13 2006,
|
|
07:42:19"``. (Contributed by Barry Warsaw.)
|
|
|
|
* Two new macros can be used to indicate C functions that are local to the
|
|
current file so that a faster calling convention can be used.
|
|
:cfunc:`Py_LOCAL(type)` declares the function as returning a value of the
|
|
specified *type* and uses a fast-calling qualifier.
|
|
:cfunc:`Py_LOCAL_INLINE(type)` does the same thing and also requests the
|
|
function be inlined. If :cfunc:`PY_LOCAL_AGGRESSIVE` is defined before
|
|
:file:`python.h` is included, a set of more aggressive optimizations are enabled
|
|
for the module; you should benchmark the results to find out if these
|
|
optimizations actually make the code faster. (Contributed by Fredrik Lundh at
|
|
the NeedForSpeed sprint.)
|
|
|
|
* :cfunc:`PyErr_NewException(name, base, dict)` can now accept a tuple of base
|
|
classes as its *base* argument. (Contributed by Georg Brandl.)
|
|
|
|
* The :cfunc:`PyErr_Warn` function for issuing warnings is now deprecated in
|
|
favour of :cfunc:`PyErr_WarnEx(category, message, stacklevel)` which lets you
|
|
specify the number of stack frames separating this function and the caller. A
|
|
*stacklevel* of 1 is the function calling :cfunc:`PyErr_WarnEx`, 2 is the
|
|
function above that, and so forth. (Added by Neal Norwitz.)
|
|
|
|
* The CPython interpreter is still written in C, but the code can now be
|
|
compiled with a C++ compiler without errors. (Implemented by Anthony Baxter,
|
|
Martin von Löwis, Skip Montanaro.)
|
|
|
|
* The :cfunc:`PyRange_New` function was removed. It was never documented, never
|
|
used in the core code, and had dangerously lax error checking. In the unlikely
|
|
case that your extensions were using it, you can replace it by something like
|
|
the following::
|
|
|
|
range = PyObject_CallFunction((PyObject*) &PyRange_Type, "lll",
|
|
start, stop, step);
|
|
|
|
.. ======================================================================
|
|
|
|
|
|
.. _ports:
|
|
|
|
Port-Specific Changes
|
|
---------------------
|
|
|
|
* MacOS X (10.3 and higher): dynamic loading of modules now uses the
|
|
:cfunc:`dlopen` function instead of MacOS-specific functions.
|
|
|
|
* MacOS X: an :option:`--enable-universalsdk` switch was added to the
|
|
:program:`configure` script that compiles the interpreter as a universal binary
|
|
able to run on both PowerPC and Intel processors. (Contributed by Ronald
|
|
Oussoren; :issue:`2573`.)
|
|
|
|
* Windows: :file:`.dll` is no longer supported as a filename extension for
|
|
extension modules. :file:`.pyd` is now the only filename extension that will be
|
|
searched for.
|
|
|
|
.. ======================================================================
|
|
|
|
|
|
.. _porting:
|
|
|
|
Porting to Python 2.5
|
|
=====================
|
|
|
|
This section lists previously described changes that may require changes to your
|
|
code:
|
|
|
|
* ASCII is now the default encoding for modules. It's now a syntax error if a
|
|
module contains string literals with 8-bit characters but doesn't have an
|
|
encoding declaration. In Python 2.4 this triggered a warning, not a syntax
|
|
error.
|
|
|
|
* Previously, the :attr:`gi_frame` attribute of a generator was always a frame
|
|
object. Because of the :pep:`342` changes described in section :ref:`pep-342`,
|
|
it's now possible for :attr:`gi_frame` to be ``None``.
|
|
|
|
* A new warning, :class:`UnicodeWarning`, is triggered when you attempt to
|
|
compare a Unicode string and an 8-bit string that can't be converted to Unicode
|
|
using the default ASCII encoding. Previously such comparisons would raise a
|
|
:class:`UnicodeDecodeError` exception.
|
|
|
|
* Library: the :mod:`csv` module is now stricter about multi-line quoted fields.
|
|
If your files contain newlines embedded within fields, the input should be split
|
|
into lines in a manner which preserves the newline characters.
|
|
|
|
* Library: the :mod:`locale` module's :func:`format` function's would
|
|
previously accept any string as long as no more than one %char specifier
|
|
appeared. In Python 2.5, the argument must be exactly one %char specifier with
|
|
no surrounding text.
|
|
|
|
* Library: The :mod:`pickle` and :mod:`cPickle` modules no longer accept a
|
|
return value of ``None`` from the :meth:`__reduce__` method; the method must
|
|
return a tuple of arguments instead. The modules also no longer accept the
|
|
deprecated *bin* keyword parameter.
|
|
|
|
* Library: The :mod:`SimpleXMLRPCServer` and :mod:`DocXMLRPCServer` classes now
|
|
have a :attr:`rpc_paths` attribute that constrains XML-RPC operations to a
|
|
limited set of URL paths; the default is to allow only ``'/'`` and ``'/RPC2'``.
|
|
Setting :attr:`rpc_paths` to ``None`` or an empty tuple disables this path
|
|
checking.
|
|
|
|
* C API: Many functions now use :ctype:`Py_ssize_t` instead of :ctype:`int` to
|
|
allow processing more data on 64-bit machines. Extension code may need to make
|
|
the same change to avoid warnings and to support 64-bit machines. See the
|
|
earlier section :ref:`pep-353` for a discussion of this change.
|
|
|
|
* C API: The obmalloc changes mean that you must be careful to not mix usage
|
|
of the :cfunc:`PyMem_\*` and :cfunc:`PyObject_\*` families of functions. Memory
|
|
allocated with one family's :cfunc:`\*_Malloc` must be freed with the
|
|
corresponding family's :cfunc:`\*_Free` function.
|
|
|
|
.. ======================================================================
|
|
|
|
|
|
Acknowledgements
|
|
================
|
|
|
|
The author would like to thank the following people for offering suggestions,
|
|
corrections and assistance with various drafts of this article: Georg Brandl,
|
|
Nick Coghlan, Phillip J. Eby, Lars Gustäbel, Raymond Hettinger, Ralf W. Grosse-
|
|
Kunstleve, Kent Johnson, Iain Lowe, Martin von Löwis, Fredrik Lundh, Andrew
|
|
McNamara, Skip Montanaro, Gustavo Niemeyer, Paul Prescod, James Pryor, Mike
|
|
Rovner, Scott Weikart, Barry Warsaw, Thomas Wouters.
|
|
|