cpython

mirror of https://github.com/python/cpython.git synced 2024-11-23 18:04:37 +08:00

Author	SHA1	Message	Date
Min ho Kim	c4cacc8c5e	Fix typos in comments, docs and test names (#15018) * Fix typos in comments, docs and test names * Update test_pyparse.py account for change in string length * Apply suggestion: splitable -> splittable Co-Authored-By: Terry Jan Reedy <tjreedy@udel.edu> * Apply suggestion: splitable -> splittable Co-Authored-By: Terry Jan Reedy <tjreedy@udel.edu> * Apply suggestion: Dealloccte -> Deallocate Co-Authored-By: Terry Jan Reedy <tjreedy@udel.edu> * Update posixmodule checksum. * Reverse idlelib changes.	2019-07-30 18:16:13 -04:00
Serhiy Storchaka	894263ba80	bpo-24214: Fixed the UTF-8 and UTF-16 incremental decoders. (GH-14304) * The UTF-8 incremental decoders fails now fast if encounter a sequence that can't be handled by the error handler. * The UTF-16 incremental decoders with the surrogatepass error handler decodes now a lone low surrogate with final=False.	2019-06-25 11:54:18 +03:00
Francisco Couzo	9843bc110d	Improve exception message for str.format (GH-12675)	2019-06-01 10:14:00 -07:00
Jeroen Demeyer	530f506ac9	bpo-36974: tp_print -> tp_vectorcall_offset and tp_reserved -> tp_as_async (GH-13464) Automatically replace tp_print -> tp_vectorcall_offset tp_compare -> tp_as_async tp_reserved -> tp_as_async	2019-05-30 19:13:39 -07:00
David Carlier	27ee0f8551	Fix couple of dead code paths (GH-7418)	2019-05-17 19:46:22 -04:00
Victor Stinner	709d23dee6	bpo-36775: _PyCoreConfig only uses wchar_t* (GH-13062) _PyCoreConfig: Change filesystem_encoding, filesystem_errors, stdio_encoding and stdio_errors fields type from char* to wchar_t. Changes: PyInterpreterState: replace fscodec_initialized (int) with fs_codec structure. * Add get_error_handler_wide() and unicode_encode_utf8() helper functions. * Add error_handler parameter to unicode_encode_locale() and unicode_decode_locale(). * Remove _PyCoreConfig_SetString(). * Rename _PyCoreConfig_SetWideString() to _PyCoreConfig_SetString(). * Rename _PyCoreConfig_SetWideStringFromString() to _PyCoreConfig_DecodeLocale().	2019-05-02 14:56:30 -04:00
Serhiy Storchaka	3191391515	bpo-36127: Argument Clinic: inline parsing code for keyword parameters. (GH-12058)	2019-03-14 10:32:22 +02:00
Serhiy Storchaka	4fa9591025	bpo-35582: Argument Clinic: inline parsing code for positional parameters. (GH-11313)	2019-01-11 16:01:14 +02:00
Serhiy Storchaka	32d96a2b5b	bpo-23867: Argument Clinic: inline parsing code for a single positional parameter. (GH-9689)	2018-12-25 13:23:47 +02:00
Serhiy Storchaka	4a934d490f	bpo-33012: Fix invalid function cast warnings with gcc 8 in Argument Clinic. (GH-6748) Fix invalid function cast warnings with gcc 8 for method conventions different from METH_NOARGS, METH_O and METH_VARARGS in Argument Clinic generated code.	2018-11-27 11:27:36 +02:00
Victor Stinner	59423e3ddd	bpo-33954: Fix _PyUnicode_InsertThousandsGrouping() (GH-10623) Fix str.format(), float.__format__() and complex.__format__() methods for non-ASCII decimal point when using the "n" formatter. Changes: * Rewrite _PyUnicode_InsertThousandsGrouping(): it now requires a _PyUnicodeWriter object for the buffer and a Python str object for digits. * Rename FILL() macro to unicode_fill(), convert it to static inline function, add "assert(0 <= start);" and rework its code.	2018-11-26 13:40:01 +01:00
Victor Stinner	3d4226a832	bpo-34523: Support surrogatepass in locale codecs (GH-8995) Add support for the "surrogatepass" error handler in PyUnicode_DecodeFSDefault() and PyUnicode_EncodeFSDefault() for the UTF-8 encoding. Changes: * _Py_DecodeUTF8Ex() and _Py_EncodeUTF8Ex() now support the surrogatepass error handler (_Py_ERROR_SURROGATEPASS). * _Py_DecodeLocaleEx() and _Py_EncodeLocaleEx() now use the _Py_error_handler enum instead of "int surrogateescape" to pass the error handler. These functions now return -3 if the error handler is unknown. * Add unit tests on _Py_DecodeLocaleEx() and _Py_EncodeLocaleEx() in test_codecs. * Rename get_error_handler() to _Py_GetErrorHandler() and expose it as a private function. * _freeze_importlib doesn't need config.filesystem_errors="strict" workaround anymore.	2018-08-29 22:21:32 +02:00
Tal Einat	c929df3b96	bpo-20180: complete AC conversion of Objects/stringlib/transmogrify.h (GH-8039) * converted bytes methods: expandtabs, ljust, rjust, center, zfill * updated char_convertor to properly set the C default value	2018-07-06 13:17:38 +03:00
Siddhesh Poyarekar	55edd0c185	bpo-33012: Fix invalid function cast warnings with gcc 8 for METH_NOARGS. (GH-6030) METH_NOARGS functions need only a single argument but they are cast into a PyCFunction, which takes two arguments. This triggers an invalid function cast warning in gcc8 due to the argument mismatch. Fix this by adding a dummy unused argument.	2018-04-29 21:59:33 +03:00
INADA Naoki	a49ac99029	bpo-32677: Add .isascii() to str, bytes and bytearray (GH-5342)	2018-01-27 14:06:21 +09:00
Barry Warsaw	b2e5794870	bpo-31338 (#3374) * Add Py_UNREACHABLE() as an alias to abort(). * Use Py_UNREACHABLE() instead of assert(0) * Convert more unreachable code to use Py_UNREACHABLE() * Document Py_UNREACHABLE() and a few other macros.	2017-09-14 18:13:16 -07:00
Stefan Krah	f432a3234f	bpo-30923: Silence fall-through warnings included in -Wextra since gcc-7.0. (#3157)	2017-08-21 13:09:59 +02:00
Serhiy Storchaka	5075416b8f	bpo-30978: str.format_map() now passes key lookup exceptions through. (#2790) Previously any exception was replaced with a KeyError exception.	2017-08-03 11:45:23 +03:00
Serhiy Storchaka	0a58f72762	bpo-24821: Fixed the slowing down to 25 times in the searching of some (#505) unlucky Unicode characters.	2017-03-30 09:11:10 +03:00
Serhiy Storchaka	d1302c0154	Issue #28999: Use Py_RETURN_NONE, Py_RETURN_TRUE and Py_RETURN_FALSE wherever possible but Coccinelle couldn't find opportunity.	2017-01-23 10:23:58 +02:00
Xiang Zhang	7a4da324dc	Issue #29145: Merge 3.6.	2017-01-10 10:56:38 +08:00
Serhiy Storchaka	998c9cdd42	Issue #28561: Clean up UTF-8 encoder: remove dead code, update comments, etc. Patch by Xiang Zhang.	2016-10-30 18:25:27 +02:00
Christian Heimes	f051e43b22	Issue #28126: Replace Py_MEMCPY with memcpy(). Visual Studio can properly optimize memcpy().	2016-09-13 20:22:02 +02:00
Benjamin Peterson	621b430a14	remove all usage of Py_LOCAL	2016-09-09 13:54:34 -07:00
Victor Stinner	1a05d6c04d	PEP 7 style for if/else in C Add also a newline for readability in normalize_encoding().	2016-09-02 12:12:23 +02:00
Raymond Hettinger	15f44ab043	Issue #27895: Spelling fixes (Contributed by Ville Skyttä).	2016-08-30 10:47:49 -07:00
Serhiy Storchaka	e09132f2c7	Backed out changeset b0087e17cd5e (issue #26765) For unknown reasons it perhaps caused a crash on 32-bit Windows (issue #).	2016-07-03 13:57:48 +03:00
Serhiy Storchaka	355048970b	Issue #26765: Moved wrappers for bytes and bytearray methods to common header file.	2016-07-01 17:57:30 +03:00
Serhiy Storchaka	bcde10aa7e	Issue #26765: Ensure that bytes- and unicode-specific stringlib files are used with correct type.	2016-05-16 09:42:29 +03:00
Serhiy Storchaka	fb81d3cbe7	Issue #26765: Moved common code for the replace() method of bytes and bytearray to a template file.	2016-05-05 09:26:07 +03:00
Serhiy Storchaka	dd40fc3e57	Issue #26765: Moved common code and docstrings for bytes and bytearray methods to bytes_methods.c.	2016-05-04 22:23:26 +03:00
Serhiy Storchaka	b6a9c9761c	Issue #26778: Fixed "a/an/and" typos in code comment, documentation and error messages.	2016-04-17 09:39:28 +03:00
Serhiy Storchaka	6a7b3a77b4	Issue #26778: Fixed "a/an/and" typos in code comment and documentation.	2016-04-17 08:32:47 +03:00
Serhiy Storchaka	21a663ea28	Issue #26057: Got rid of nonneeded use of PyUnicode_FromObject().	2016-04-13 15:37:23 +03:00
Serhiy Storchaka	413fdcea21	Issue #24821: Refactor STRINGLIB(fastsearch_memchr_1char) and split it on STRINGLIB(find_char) and STRINGLIB(rfind_char) that can be used independedly without special preconditions.	2015-11-14 15:42:17 +02:00
Victor Stinner	6bd525b656	Optimize error handlers of ASCII and Latin1 encoders when the replacement string is pure ASCII: use _PyBytesWriter_WriteBytes(), don't check individual character. Cleanup unicode_encode_ucs1(): * Rename repunicode to rep * Clear rep object on error * Factorize code between bytes and unicode path	2015-10-09 13:10:05 +02:00
Victor Stinner	ce179bf6ba	Add _PyBytesWriter_WriteBytes() to factorize the code	2015-10-09 12:57:22 +02:00
Victor Stinner	ad7715891e	_PyBytesWriter: simplify code to avoid "prealloc" parameters Substract preallocate bytes from min_size before calling _PyBytesWriter_Prepare().	2015-10-09 12:38:53 +02:00
Victor Stinner	e7bf86cd7d	Optimize backslashreplace error handler Issue #25318: Optimize backslashreplace and xmlcharrefreplace error handlers in UTF-8 encoder. Optimize also backslashreplace error handler for ASCII and Latin1 encoders. Use the new _PyBytesWriter API to optimize these error handlers for the encoders. It avoids to create an exception and call the slow implementation of the error handler.	2015-10-09 01:39:28 +02:00
Victor Stinner	fdfbf78114	Issue #25318: Add _PyBytesWriter API Add a new private API to optimize Unicode encoders. It uses a small buffer allocated on the stack and supports overallocation. Use _PyBytesWriter API for UCS1 (ASCII and Latin1) and UTF-8 encoders. Enable overallocation for the UTF-8 encoder with error handlers. unicode_encode_ucs1(): initialize collend to collstart+1 to not check the current character twice, we already know that it is not ASCII.	2015-10-09 00:33:49 +02:00
Victor Stinner	01ada3996b	Issue #25267: The UTF-8 encoder is now up to 75 times as fast for error handlers: ``ignore``, ``replace``, ``surrogateescape``, ``surrogatepass``. Patch co-written with Serhiy Storchaka.	2015-10-01 21:54:51 +02:00
Eric V. Smith	ab2aa6dc91	Fixed an incorrect comment.	2015-08-26 14:10:32 -04:00
Serhiy Storchaka	9ce71a6475	Fixed typos in comments.	2015-05-18 22:20:18 +03:00
Serhiy Storchaka	7e29eea926	Fixed typos in comments.	2015-05-18 22:19:42 +03:00
Serhiy Storchaka	0d4df752ac	Issue #15027: The UTF-32 encoder is now 3x to 7x faster.	2015-05-12 23:12:45 +03:00
Serhiy Storchaka	d9d769fcdd	Issue #23573: Increased performance of string search operations (str.find, str.index, str.count, the in operator, str.split, str.partition) with arguments of different kinds (UCS1, UCS2, UCS4).	2015-03-24 21:55:47 +02:00
Serhiy Storchaka	009b811d67	Removed unintentional trailing spaces in non-external and non-generated C files.	2015-03-18 21:53:15 +02:00
Serhiy Storchaka	4fdb68491e	Issue #22896: Avoid to use PyObject_AsCharBuffer(), PyObject_AsReadBuffer() and PyObject_AsWriteBuffer().	2015-02-03 01:21:08 +02:00
Serhiy Storchaka	b757c83ec6	Issue #22581: Use more "bytes-like object" throughout the docs and comments.	2014-12-05 22:25:22 +02:00
Benjamin Peterson	1cc9520327	s/stringobject/bytesobject/ (closes #22036) Patch by Martin Matusiak.	2014-07-23 21:39:37 -07:00
Benjamin Peterson	d455ce4fd4	merge 3.3	2014-03-30 19:52:39 -04:00
Benjamin Peterson	0ad6098b67	merge 3.2	2014-03-30 19:52:22 -04:00
Benjamin Peterson	23cf403ca1	fix expandtabs overflow detection to be consistent and not rely on signed overflow	2014-03-30 19:47:57 -04:00
Serhiy Storchaka	3079328d29	Reverted changeset b72c5573c5e7 (issue #15027).	2014-01-04 22:44:01 +02:00
Serhiy Storchaka	583a93943c	Issue #15027: Rewrite the UTF-32 encoder. It is now 1.6x to 3.5x faster.	2014-01-04 19:25:37 +02:00
Benjamin Peterson	0ee22bf774	fix format spec recursive expansion (closes #19729)	2013-11-26 19:22:36 -06:00
Serhiy Storchaka	dc2fd5101a	Remove dead code committed in issue #12892.	2013-11-19 15:56:05 +02:00
Serhiy Storchaka	58cf607d13	Issue #12892: The utf-16* and utf-32* codecs now reject (lone) surrogates. The utf-16* and utf-32* encoders no longer allow surrogate code points (U+D800-U+DFFF) to be encoded. The utf-32* decoders no longer decode byte sequences that correspond to surrogate code points. The surrogatepass error handler now works with the utf-16* and utf-32* codecs. Based on patches by Victor Stinner and Kang-Hao (Kenny) Lu.	2013-11-19 11:32:41 +02:00
Ezio Melotti	745d54d2fa	#17806: Added keyword-argument support for "tabsize" to str/bytes.expandtabs().	2013-11-16 19:10:57 +02:00
Victor Stinner	cc64eb5b9f	Issue #18408: Fix bytearrayiter.partition()/rpartition(), handle PyByteArray_FromStringAndSize() failure (ex: on memory allocation failure)	2013-10-29 03:15:37 +01:00
Serhiy Storchaka	8fa8ee3970	Issue #18701: Remove support of old CPython versions (<3.0) from C code.	2013-08-17 00:48:02 +03:00
Raymond Hettinger	d06eeb4a24	merge	2013-08-13 18:20:55 -07:00
Raymond Hettinger	b1b915c796	Issue 18719: Remove a false optimization Remove an unused early-out test from the critical path for dict and set lookups. When the strings already have matching lengths, kinds, and hashes, there is no additional information gained by checking the first characters (the probability of a mismatch is already known to be less than 1 in 2**64).	2013-08-13 18:16:34 -07:00
Antoine Pitrou	9ed5f27266	Issue #18722: Remove uses of the "register" keyword in C code.	2013-08-13 20:18:52 +02:00
Benjamin Peterson	d2b58a9880	only recursively expand in the format spec (closes #17644)	2013-05-17 17:34:30 -05:00
Benjamin Peterson	4d94474ba3	rewrite the parsing of field names to be more consistent wrt recursive expansion	2013-05-17 18:22:31 -05:00
Benjamin Peterson	48953632df	merge 3.3	2013-05-17 17:35:28 -05:00
Ezio Melotti	5263c13801	Merge removal of trailing whitespace from 3.3.	2013-04-21 04:08:18 +03:00
Ezio Melotti	6b02772c13	Remove trailing whitespace.	2013-04-21 04:07:51 +03:00
Victor Stinner	8f674ccd64	Close #17694: Add minimum length to _PyUnicodeWriter * Add also min_char attribute to _PyUnicodeWriter structure (currently unused) * _PyUnicodeWriter_Init() has no more argument (except the writer itself): min_length and overallocate must be set explicitly * In error handlers, only enable overallocation if the replacement string is longer than 1 character * CJK decoders don't use overallocation anymore * Set min_length, instead of preallocating memory using _PyUnicodeWriter_Prepare(), in many decoders * _PyUnicode_DecodeUnicodeInternal() checks for integer overflow	2013-04-17 23:02:17 +02:00
Victor Stinner	76b3b2726c	stringlib: remove unused STRINGLIB_RESIZE macro	2013-04-14 16:29:09 +02:00
Serhiy Storchaka	e2cef885a2	Issue #16061: Speed up str.replace() for replacing 1-character strings.	2013-04-13 22:45:04 +03:00
Victor Stinner	7efa3b8242	Close #13126: "Simplify" FASTSEARCH() code to help the compiler to emit more efficient machine code. Patch written by Antoine Pitrou. Without this change, str.find() was 10% slower than str.rfind() in the worst case.	2013-04-08 00:26:43 +02:00
Victor Stinner	cfc4c13b04	Add _PyUnicodeWriter_WriteSubstring() function Write a function to enable more optimizations: * If the substring is the whole string and overallocation is disabled, just keep a reference to the string, don't copy characters * Avoid a call to the expensive _PyUnicode_FindMaxChar() function when possible	2013-04-03 01:48:39 +02:00
Serhiy Storchaka	06b16f879f	Remove unused defines.	2013-02-23 14:49:09 +02:00
Serhiy Storchaka	18809fa94e	Remove unused defines.	2013-02-23 14:48:16 +02:00
Antoine Pitrou	4de7457009	Issue #17173: Remove uses of locale-dependent C functions (isalpha() etc.) in the interpreter. I've left a couple of them in: zlib (third-party lib), getaddrinfo.c (doesn't include Python.h, and probably obsolete), _sre.c (legitimate use for the re.LOCALE flag).	2013-02-09 23:11:27 +01:00
Serhiy Storchaka	b946af5897	Check for NULL before the pointer aligning in fastsearch_memchr_1char. There is no guarantee that NULL is aligned.	2013-01-15 13:32:41 +02:00
Serhiy Storchaka	18ba40b945	Check for NULL before the pointer aligning in fastsearch_memchr_1char. There is no guarantee that NULL is aligned.	2013-01-15 13:27:28 +02:00
Christian Heimes	5f7e8dab11	Issue #16592: stringlib_bytes_join doesn't raise MemoryError on allocation failure	2012-12-02 07:56:42 +01:00
Victor Stinner	6caa6fb535	(Merge 3.3) Issue #8271: Fix compilation on Windows	2012-11-05 00:00:50 +01:00
Victor Stinner	ab60de478d	Issue #8271: Fix compilation on Windows	2012-11-04 23:59:15 +01:00
Ezio Melotti	cfa9636404	#8271: merge with 3.3.	2012-11-04 23:23:09 +02:00
Ezio Melotti	f7ed5d111b	#8271: the utf-8 decoder now outputs the correct number of U+FFFD characters when used with the "replace" error handler on invalid utf-8 sequences. Patch by Serhiy Storchaka, tests by Ezio Melotti.	2012-11-04 23:21:38 +02:00
Antoine Pitrou	6f7b0da6bc	Issue #12805: Make bytes.join and bytearray.join faster when the separator is empty. Patch by Serhiy Storchaka.	2012-10-20 23:08:34 +02:00
Christian Heimes	743e0cd6b5	Issue #16166: Add PY_LITTLE_ENDIAN and PY_BIG_ENDIAN macros and unified endianess detection and handling.	2012-10-17 23:52:17 +02:00
Antoine Pitrou	cfc22b4a9b	Issue #15958: bytes.join and bytearray.join now accept arbitrary buffer objects.	2012-10-16 21:07:23 +02:00
Antoine Pitrou	ca8aa4acf6	Issue #15144: Fix possible integer overflow when handling pointers as integer values, by using Py_uintptr_t instead of size_t. Patch by Serhiy Storchaka.	2012-09-20 20:56:47 +02:00
Victor Stinner	b3f5501250	Close #15534: Fix a typo in the fast search function of the string library (_s => s) Replace _s with ptr to avoid future confusion. Add also non regression tests.	2012-08-02 23:05:01 +02:00
Mark Dickinson	fb90c0934c	Issue #14700: Fix buggy overflow checks for large precision and width in new-style and old-style formatting.	2012-10-28 10:18:03 +00:00
Mark Dickinson	01ac8b6ab1	Use correct types for ASCII_CHAR_MASK integer constants.	2012-07-07 14:08:48 +02:00
Mark Dickinson	106c4145ff	Issue #14923: Optimize continuation-byte check in UTF-8 decoding. Patch by Serhiy Storchaka.	2012-06-23 21:45:14 +01:00
Antoine Pitrou	a759d4e9f4	Make private function static (from `make smelly`)	2012-06-21 17:26:28 +02:00
Antoine Pitrou	27f6a3b0bf	Issue #15026: utf-16 encoding is now significantly faster (up to 10x). Patch by Serhiy Storchaka.	2012-06-15 22:15:23 +02:00
Victor Stinner	d7b7c7472b	Issue #14993: Use standard "unsigned char" instead of a unsigned char bitfield	2012-06-04 22:52:12 +02:00
Victor Stinner	d3f0882dfb	Issue #14744: Use the new _PyUnicodeWriter internal API to speed up str%args and str.format(args) * Formatting string, int, float and complex use the _PyUnicodeWriter API. It avoids a temporary buffer in most cases. * Add _PyUnicodeWriter_WriteStr() to restore the PyAccu optimization: just keep a reference to the string if the output is only composed of one string * Disable overallocation when formatting the last argument of str%args and str.format(args) * Overallocation allocates at least 100 characters: add min_length attribute to the _PyUnicodeWriter structure * Add new private functions: _PyUnicode_FastCopyCharacters(), _PyUnicode_FastFill() and _PyUnicode_FromASCII() The speed up is around 20% in average.	2012-05-29 12:57:52 +02:00
Antoine Pitrou	63065d761e	Issue #14624: UTF-16 decoding is now 3x to 4x faster on various inputs. Patch by Serhiy Storchaka.	2012-05-15 23:48:04 +02:00
Antoine Pitrou	ca5f91b888	Issue #14738: Speed-up UTF-8 decoding on non-ASCII data. Patch by Serhiy Storchaka.	2012-05-10 16:36:02 +02:00
Victor Stinner	3b1a74a9c3	Rename unicode_write_t structure and its methods to "_PyUnicodeWriter"	2012-05-09 22:25:00 +02:00
Victor Stinner	ee4544c920	Issue #14744: Inline unicode_writer_write_char() and unicode_write_str() Optimize also PyUnicode_Format(): call unicode_writer_prepare() only once per argument.	2012-05-09 22:24:08 +02:00

288 Commits