Commit Graph

14 Commits

Author SHA1 Message Date
Eric Snow
b72947a8d2
gh-106931: Intern Statically Allocated Strings Globally (gh-107272)
We tried this before with a dict and for all interned strings.  That ran into problems due to interpreter isolation.  However, exclusively using a per-interpreter cache caused some inconsistency that can eliminate the benefit of interning.  Here we circle back to using a global cache, but only for statically allocated strings.  We also use a more-basic _Py_hashtable_t for that global cache instead of a dict.

Ideally we would only have the global cache, but the optional isolation of each interpreter's allocator means that a non-static string object must not outlive its interpreter.  Thus we would have to store a copy of each such interned string in the global cache, tied to the main interpreter.
2023-07-27 13:56:59 -06:00
Victor Stinner
89f9875448
gh-106320: Move private _PyHash API to the internal C API (#107026)
* No longer export most private _PyHash symbols, only export the ones
  which are needed by shared extensions.
* Modules/_xxtestfuzz/fuzzer.c now uses the internal C API.
2023-07-22 13:49:37 +00:00
Christian Heimes
4901ea9526
bpo-41061: Fix incorrect expressions in hashtable (GH-21028)
Signed-off-by: Christian Heimes <christian@python.org>
2020-06-22 00:41:48 -07:00
Victor Stinner
d2dc827d16
bpo-40602: _Py_hashtable_set() reports rehash failure (GH-20077)
If _Py_hashtable_set() fails to grow the hash table (rehash), it now
fails rather than ignoring the error.
2020-05-14 22:44:32 +02:00
Victor Stinner
a482dc500b
bpo-40602: Write unit tests for _Py_hashtable_t (GH-20091)
Cleanup also hashtable.c.
Rename _Py_hashtable_t members:

* Rename entries to nentries
* Rename num_buckets to nbuckets
2020-05-14 21:55:47 +02:00
Victor Stinner
42bae3a3d9
bpo-40602: Optimize _Py_hashtable_get_ptr() (GH-20066)
_Py_hashtable_get_entry_ptr() avoids comparing the entry hash:
compare directly keys.

Move _Py_hashtable_get_entry_ptr() just after
_Py_hashtable_get_entry_generic().
2020-05-13 05:36:23 +02:00
Victor Stinner
5b0a30354d
bpo-40609: _Py_hashtable_t values become void* (GH-20065)
_Py_hashtable_t values become regular "void *" pointers.

* Add _Py_hashtable_entry_t.data member
* Remove _Py_hashtable_t.data_size member
* Remove _Py_hashtable_t.get_func member. It is no longer needed
  to specialize _Py_hashtable_get() for a specific value size, since
  all entries now have the same size (void*).
* Remove the following macros:

  * _Py_HASHTABLE_GET()
  * _Py_HASHTABLE_SET()
  * _Py_HASHTABLE_SET_NODATA()
  * _Py_HASHTABLE_POP()

* Rename _Py_hashtable_pop() to _Py_hashtable_steal()
* _Py_hashtable_foreach() callback now gets key and value rather than
  entry.
* Remove _Py_hashtable_value_destroy_func type. value_destroy_func
  callback now only has a single parameter: data (void*).
2020-05-13 04:40:30 +02:00
Victor Stinner
d95bd4214c
bpo-40609: _tracemalloc allocates traces (GH-20064)
Rewrite _tracemalloc to store "trace_t*" rather than directly
"trace_t" in traces hash tables. Traces are now allocated on the heap
memory, outside the hash table.

Add tracemalloc_copy_traces() and tracemalloc_copy_domains() helper
functions.

Remove _Py_hashtable_copy() function since there is no API to copy a
key or a value.

Remove also _Py_hashtable_delete() function which was commented.
2020-05-13 03:52:11 +02:00
Victor Stinner
2d0a3d682f
bpo-40609: Add destroy functions to _Py_hashtable (GH-20062)
Add key_destroy_func and value_destroy_func parameters to
_Py_hashtable_new_full().

marshal.c and _tracemalloc.c use these destroy functions.
2020-05-13 02:50:18 +02:00
Victor Stinner
f9b3b582b8
bpo-40609: Remove _Py_hashtable_t.key_size (GH-20060)
Rewrite _Py_hashtable_t type to always store the key as
a "const void *" pointer. Add an explicit "key" member to
_Py_hashtable_entry_t.

Remove _Py_hashtable_t.key_size member.

hash and compare functions drop their hash table parameter, and their
'key' parameter type becomes "const void *".
2020-05-13 02:26:02 +02:00
Victor Stinner
f453221c8b
bpo-40602: Add _Py_HashPointerRaw() function (GH-20056)
Add a new _Py_HashPointerRaw() function which avoids replacing -1
with -2 to micro-optimize hash table using pointer keys: using
_Py_hashtable_hash_ptr() hash function.
2020-05-12 18:46:20 +02:00
Victor Stinner
7c6e970775
bpo-40602: Optimize _Py_hashtable for pointer keys (GH-20051)
Optimize _Py_hashtable_get() and _Py_hashtable_get_entry() for
pointer keys:

* key_size == sizeof(void*)
* hash_func == _Py_hashtable_hash_ptr
* compare_func == _Py_hashtable_compare_direct

Changes:

* Add get_func and get_entry_func members to _Py_hashtable_t
* Convert _Py_hashtable_get() and _Py_hashtable_get_entry() functions
  to static nline functions.
* Add specialized get and get entry for pointer keys.
2020-05-12 13:31:59 +02:00
Victor Stinner
d0919f0d6b
bpo-40602: _Py_hashtable_new() uses PyMem_Malloc() (GH-20046)
_Py_hashtable_new() now uses PyMem_Malloc/PyMem_Free allocator by
default, rather than PyMem_RawMalloc/PyMem_RawFree.

PyMem_Malloc is faster than PyMem_RawMalloc for memory blocks smaller
than or equal to 512 bytes.
2020-05-12 03:07:40 +02:00
Victor Stinner
b617993b7c
bpo-40602: Rename hashtable.h to pycore_hashtable.h (GH-20044)
* Move Modules/hashtable.h to Include/internal/pycore_hashtable.h
* Move Modules/hashtable.c to Python/hashtable.c
* Python is now linked to hashtable.c. _tracemalloc is no longer
  linked to hashtable.c. Previously, marshal.c got hashtable.c via
  _tracemalloc.c which is built as a builtin module.
2020-05-12 02:42:19 +02:00