Commit Graph

14 Commits

Author SHA1 Message Date
Ashwin Ramaswami
614f17211c
bpo-39073: validate Address parts to disallow CRLF (#19007)
Disallow CR or LF in email.headerregistry.Address arguments to guard against header injection attacks.
2020-03-29 20:38:41 -04:00
Michael Selik
2702638eab bpo-34002: Minor efficiency and clarity improvements in email package. (GH-7999)
* Check intersection of two sets explicitly

Comparing ``len(a) > ``len(a - b)`` is essentially looking for an
intersection between the two sets. If set ``b`` does not intersect ``a``
then ``len(a - b)`` will be equal to ``len(a)``. This logic is more
clearly expressed as ``a & b``.

* Change while/pop to a for-loop

Copying the list, then repeatedly popping the first element was
unnecessarily slow. I also cleaned up a couple other inefficiencies.
There's no need to unpack a tuple, then re-pack and append it. The list
can be created with the first element instead of empty. Secondly, the
``endswith`` method returns a bool, so there's no need for an if-
statement to set ``encoding`` to True or False.

* Use set.intersection to check for intersections

``a.intersection(b)`` method is more clear of purpose than ``not
a.isdisjoint(b)`` and avoids an unnecessary set construction that ``a &
set(b)`` performs.

* Use not isdisjoint instead of intersection

While it reads slightly worse, the isdisjoint method will stop when it
finds a counterexample and returns a bool, rather than looping over the
entire iterable and constructing a new set.
2019-09-19 20:25:55 -07:00
Serhiy Storchaka
662db125cd
bpo-37685: Fixed __eq__, __lt__ etc implementations in some classes. (GH-14952)
They now return NotImplemented for unsupported type of the other operand.
2019-08-08 08:42:54 +03:00
Min ho Kim
96e12d5f4f Fix typos in docs, comments and test assert messages (#14872) 2019-07-21 16:12:33 -04:00
Abhilash Raj
46d88a1131 bpo-35805: Add parser for Message-ID email header. (GH-13397)
* bpo-35805: Add parser for Message-ID header.

This parser is based on the definition of Identification Fields from RFC 5322
Sec 3.6.4.

This should also prevent folding of Message-ID header using RFC 2047 encoded
words and hence fix bpo-35805.

* Prevent folding of non-ascii message-id headers.
* Add fold method to MsgID token to prevent folding.
2019-06-04 10:41:34 -07:00
R. David Murray
85d5c18c9d
bpo-27240 Rewrite the email header folding algorithm. (#3488)
The original algorithm tried to delegate the folding to the tokens so
that those tokens whose folding rules differed could specify the
differences.  However, this resulted in a lot of duplicated code because
most of the rules were the same.

The new algorithm moves all folding logic into a set of functions
external to the token classes, but puts the information about which
tokens can be folded in which ways on the tokens...with the exception of
mime-parameters, which are a special case (which was not even
implemented in the old folder).

This algorithm can still probably be improved and hopefully simplified
somewhat.

Note that some of the test expectations are changed.  I believe the
changes are toward more desirable and consistent behavior: in general
when (re) folding a line the canonical version of the tokens is
generated, rather than preserving errors or extra whitespace.
2017-12-03 18:51:41 -05:00
Jon Dufresne
3972628de3 bpo-30296 Remove unnecessary tuples, lists, sets, and dicts (#1489)
* Replaced list(<generator expression>) with list comprehension
* Replaced dict(<generator expression>) with dict comprehension
* Replaced set(<list literal>) with set literal
* Replaced builtin func(<list comprehension>) with func(<generator
  expression>) when supported (e.g. any(), all(), tuple(), min(), &
  max())
2017-05-18 07:35:54 -07:00
Serhiy Storchaka
6a7b3a77b4 Issue #26778: Fixed "a/an/and" typos in code comment and documentation. 2016-04-17 08:32:47 +03:00
Martin Panter
96a4f07107 Issues #26310, #26311: Fix typos in the documentation and code comments 2016-02-10 01:17:51 +00:00
R David Murray
d2ff243b38 Merge: #21991: make headerregistry params property MappingProxyType. 2014-10-17 19:32:08 -04:00
R David Murray
685b3495e1 #21991: make headerregistry params property MappingProxyType.
It is unlikely anyone is using the fact that the dictionary returned
by the 'params' attribute was previously writable, but even if someone
is the API is provisional so this kind of change is acceptable (and
needed, to get the API "right" before it becomes official).

Patch by Stéphane Wirtel.
2014-10-17 19:30:13 -04:00
Serhiy Storchaka
465e60e654 Issue #22033: Reprs of most Python implemened classes now contain actual
class name instead of hardcoded one.
2014-07-25 23:36:00 +03:00
R David Murray
97f43c019f #15160: Extend the new email parser to handle MIME headers.
This code passes all the same tests that the existing RFC mime header
parser passes, plus a bunch of additional ones.

There are a couple of commented out tests where there are issues with the
folding.  The folding doesn't normally get invoked for headers parsed from
source, and the cases are marginal anyway (headers with invalid binary data)
so I'm not worried about them, but will fix them after the beta.

There are things that can be done to make this API even more convenient, but I
think this is a solid foundation worth having.  And the parser is a full RFC
parser, so it handles cases that the current parser doesn't.  (There are also
probably cases where it fails when the current parser doesn't, but I haven't
found them yet ;)

Oh, yeah, and there are some really ugly bits in the parser for handling some
'postel' cases that are unfortunately common.

I hope/plan to to eventually refactor a lot of the code in the parser which
should reduce the line count...but there is no escaping the fact that the
error recovery is welter of special cases.
2012-06-24 05:03:27 -04:00
R David Murray
ea9766897b Make headerregistry fully part of the provisional api.
When I made the checkin of the provisional email policy, I knew that
Address and Group needed to be made accessible from somewhere.  The more
I looked at it, though, the more it became clear that since this is a
provisional API anyway, there's no good reason to hide headerregistry as
a private API.  It was designed to ultimately be part of the public API,
and so it should be part of the provisional API.

This patch fully documents the headerregistry API, and deletes the
abbreviated version of those docs I had added to the provisional policy
docs.
2012-05-27 15:03:38 -04:00