Mirror of https://github.com/python/cpython.git (synced 2024-12-06 00:05:32 +08:00)
Merge: #20647: Update dictobject.c comments to account for randomized string hashes.
commit ce85acff3a
@@ -88,20 +88,17 @@ it's USABLE_FRACTION (currently two-thirds) full.
 /*
 Major subtleties ahead: Most hash schemes depend on having a "good" hash
 function, in the sense of simulating randomness. Python doesn't: its most
-important hash functions (for strings and ints) are very regular in common
+important hash functions (for ints) are very regular in common
 cases:
 
->>> map(hash, (0, 1, 2, 3))
+>>>[hash(i) for i in range(4)]
 [0, 1, 2, 3]
->>> map(hash, ("namea", "nameb", "namec", "named"))
-[-1658398457, -1658398460, -1658398459, -1658398462]
->>>
 
 This isn't necessarily bad! To the contrary, in a table of size 2**i, taking
 the low-order i bits as the initial table index is extremely fast, and there
-are no collisions at all for dicts indexed by a contiguous range of ints.
-The same is approximately true when keys are "consecutive" strings. So this
-gives better-than-random behavior in common cases, and that's very desirable.
+are no collisions at all for dicts indexed by a contiguous range of ints. So
+this gives better-than-random behavior in common cases, and that's very
+desirable.
 
 OTOH, when collisions occur, the tendency to fill contiguous slices of the
 hash table makes a good collision resolution strategy crucial. Taking only
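The comment's claim about contiguous ints can be checked with a small sketch. This assumes CPython, where small non-negative ints hash to themselves; `initial_slots` is a hypothetical helper written for illustration and is not a function in dictobject.c. It models only the first probe (the low-order i bits), not the collision resolution that follows.

```python
def initial_slots(keys, i):
    """Return the initial table index of each key for a table of size 2**i.

    Illustrative only: models just the first probe, i.e. the low-order
    i bits of the hash, and ignores any collision resolution.
    """
    mask = (1 << i) - 1  # keeps the low-order i bits
    return [hash(k) & mask for k in keys]


# Eight contiguous int keys in a table of size 2**3 == 8.
slots = initial_slots(range(8), 3)
print(slots)                          # [0, 1, 2, 3, 4, 5, 6, 7]
print(len(set(slots)) == len(slots))  # True: every key gets its own slot

# A contiguous range that does not start at 0 also spreads without collisions.
shifted = initial_slots(range(100, 108), 3)
print(len(set(shifted)) == len(shifted))  # True
```

String keys are left out on purpose: with randomized string hashes the values differ from one interpreter run to the next, which is the reason the string example was dropped from the comment.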