:mod:`stringprep` --- Internet String Preparation
=================================================

.. module:: stringprep
   :synopsis: String preparation, as per RFC 3454
   :deprecated:

.. moduleauthor:: Martin v. Löwis <martin@v.loewis.de>
.. sectionauthor:: Martin v. Löwis <martin@v.loewis.de>

When identifying things (such as host names) on the Internet, it is often
necessary to compare such identifications for "equality". Exactly how this
comparison is executed may depend on the application domain, e.g. whether it
should be case-insensitive or not. It may also be necessary to restrict the
possible identifications, to allow only identifications consisting of
"printable" characters.

:rfc:`3454` defines a procedure for "preparing" Unicode strings in internet
protocols. Before passing strings onto the wire, they are processed with the
preparation procedure, after which they have a certain normalized form. The RFC
defines a set of tables, which can be combined into profiles. Each profile must
define which tables it uses, and what other optional parts of the ``stringprep``
procedure are part of the profile. One example of a ``stringprep`` profile is
``nameprep``, which is used for internationalized domain names.

The module :mod:`stringprep` only exposes the tables from :rfc:`3454`. As these
tables would be very large if represented as dictionaries or lists, the
module uses the Unicode character database internally. The module source code
itself was generated using the ``mkstringprep.py`` utility.

As a result, these tables are exposed as functions, not as data structures.
There are two kinds of tables in the RFC: sets and mappings. For a set,
:mod:`stringprep` provides the "characteristic function", i.e. a function that
returns true if the parameter is part of the set. For mappings, it provides the
mapping function: given the key, it returns the associated value. Below is a
list of all functions available in the module.

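
As a brief illustration of the two kinds of functions, the following sketch
calls one set function and one mapping function. It assumes that each function
is called with a one-character string, which is how the current CPython
implementation behaves; the variable names are only illustrative.

```python
import stringprep

# Set tables are exposed as predicates on a single character.
space_is_c11 = stringprep.in_table_c11(" ")    # ASCII space
letter_is_c11 = stringprep.in_table_c11("a")   # an ordinary letter

# Mapping tables are exposed as functions that return the mapped string.
folded = stringprep.map_table_b2("A")          # case-folding used with NFKC

print(space_is_c11, letter_is_c11, repr(folded))
```
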

.. function:: in_table_a1(code)

   Determine whether *code* is in table A.1 (Unassigned code points in Unicode 3.2).


.. function:: in_table_b1(code)

   Determine whether *code* is in table B.1 (Commonly mapped to nothing).


.. function:: map_table_b2(code)

   Return the mapped value for *code* according to table B.2 (Mapping for
   case-folding used with NFKC).


.. function:: map_table_b3(code)

   Return the mapped value for *code* according to table B.3 (Mapping for
   case-folding used with no normalization).

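
The B tables are typically combined into the mapping step of a profile. The
sketch below shows one possible way to do this: characters from table B.1 are
dropped and the remaining ones are mapped through table B.2. The helper name
``map_step`` is hypothetical and not part of the module, and a complete profile
would usually follow this step with NFKC normalization.

```python
import stringprep


def map_step(text):
    """Drop table B.1 characters, then map each remaining one via table B.2."""
    mapped = []
    for ch in text:
        if stringprep.in_table_b1(ch):
            # Characters "commonly mapped to nothing" are removed.
            continue
        # A single input character may map to more than one output character.
        mapped.append(stringprep.map_table_b2(ch))
    return "".join(mapped)


print(map_step("Stra\u00dfe"))    # contains U+00DF LATIN SMALL LETTER SHARP S
print(map_step("A\u00adB"))       # contains U+00AD SOFT HYPHEN
```
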

.. function:: in_table_c11(code)

   Determine whether *code* is in table C.1.1 (ASCII space characters).


.. function:: in_table_c12(code)

   Determine whether *code* is in table C.1.2 (Non-ASCII space characters).


.. function:: in_table_c11_c12(code)

   Determine whether *code* is in table C.1 (Space characters, union of C.1.1 and
   C.1.2).


.. function:: in_table_c21(code)

   Determine whether *code* is in table C.2.1 (ASCII control characters).


.. function:: in_table_c22(code)

   Determine whether *code* is in table C.2.2 (Non-ASCII control characters).


.. function:: in_table_c21_c22(code)

   Determine whether *code* is in table C.2 (Control characters, union of C.2.1 and
   C.2.2).


.. function:: in_table_c3(code)

   Determine whether *code* is in table C.3 (Private use).


.. function:: in_table_c4(code)

   Determine whether *code* is in table C.4 (Non-character code points).


.. function:: in_table_c5(code)

   Determine whether *code* is in table C.5 (Surrogate codes).


.. function:: in_table_c6(code)

   Determine whether *code* is in table C.6 (Inappropriate for plain text).


.. function:: in_table_c7(code)

   Determine whether *code* is in table C.7 (Inappropriate for canonical
   representation).


.. function:: in_table_c8(code)

   Determine whether *code* is in table C.8 (Change display properties or are
   deprecated).


.. function:: in_table_c9(code)

   Determine whether *code* is in table C.9 (Tagging characters).

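
A profile normally selects some of the C tables as its prohibited output. The
following sketch shows how such a check could be assembled from the set
functions. The particular selection in ``PROHIBITED`` and the helper name
``first_prohibited`` are assumptions made for illustration; each real profile
specifies its own list of tables.

```python
import stringprep

# Hypothetical selection of prohibition tables; a real profile defines its own.
PROHIBITED = (
    stringprep.in_table_c12,
    stringprep.in_table_c22,
    stringprep.in_table_c3,
    stringprep.in_table_c4,
    stringprep.in_table_c5,
    stringprep.in_table_c6,
    stringprep.in_table_c7,
    stringprep.in_table_c8,
    stringprep.in_table_c9,
)


def first_prohibited(text):
    """Return the first character found in any selected table, or None."""
    for ch in text:
        if any(check(ch) for check in PROHIBITED):
            return ch
    return None


print(repr(first_prohibited("example")))
print(repr(first_prohibited("a\u00a0b")))   # U+00A0 NO-BREAK SPACE
```
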

.. function:: in_table_d1(code)

   Determine whether *code* is in table D.1 (Characters with bidirectional property
   "R" or "AL").


.. function:: in_table_d2(code)

   Determine whether *code* is in table D.2 (Characters with bidirectional property
   "L").
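
The two D tables support the bidirectional check described in section 6 of
:rfc:`3454`: a string that contains any character from table D.1 must not
contain characters from table D.2, and must both start and end with a D.1
character. A minimal sketch of that check, using the hypothetical helper name
``bidi_ok``, might look like this:

```python
import stringprep


def bidi_ok(text):
    """Return True if *text* satisfies the RFC 3454 bidirectional rules."""
    is_r_or_al = [stringprep.in_table_d1(ch) for ch in text]
    if not any(is_r_or_al):
        # No "R" or "AL" characters, so the remaining rules do not apply.
        return True
    if any(stringprep.in_table_d2(ch) for ch in text):
        # "R"/"AL" characters must not be mixed with "L" characters.
        return False
    # The first and the last character must both be "R" or "AL".
    return is_r_or_al[0] and is_r_or_al[-1]


print(bidi_ok("abc"))
print(bidi_ok("\u05d0\u05d1"))   # two Hebrew letters
print(bidi_ok("\u05d0a"))        # a Hebrew letter followed by a Latin letter
```
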