220 Commits

Author SHA1 Message Date
Yaron de Leeuw
50cd4b6959
bpo-26253: Add compressionlevel to tarfile stream (GH-2962)
`tarfile` already accepts a compressionlevel argument for creating
files. This patch adds the same for stream-based tarfile usage.
The default is 9, the value that was previously hard-coded.
2022-06-25 11:43:54 +03:00
Chris Fernald
c1e19421c2
gh-91387: Strip trailing slash from tarfile longname directories (GH-32423)
Co-authored-by: Brett Cannon <brett@python.org>
2022-06-17 15:38:41 -07:00
Joshua Root
bf2d44ffb0
bpo-45863: tarfile: don't zero out header fields unnecessarily (GH-29693)
Numeric fields of type float, notably mtime, can't be represented
exactly in the ustar header, so the pax header is used. But it is
helpful to set them to the nearest int (i.e. second rather than
nanosecond precision mtimes) in the ustar header as well, for the
benefit of unarchivers that don't understand the pax header.

Add test for tarfile.TarInfo.create_pax_header to confirm correct
behaviour.
2022-02-09 18:06:19 +01:00
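The effect of the pax header on float mtimes can be illustrated with a small round trip. This is a minimal sketch, not the patch's test: the member name, payload, and timestamp are made up for illustration, and it only shows that a pax-aware reader recovers the fractional value.

```python
import io
import tarfile

# Hypothetical member with a fractional modification time.
info = tarfile.TarInfo("example.txt")
info.mtime = 1644426379.25
payload = b"hello"
info.size = len(payload)

# Write with the pax format so the exact float is stored in the pax header.
archive = io.BytesIO()
with tarfile.open(fileobj=archive, mode="w", format=tarfile.PAX_FORMAT) as tar:
    tar.addfile(info, io.BytesIO(payload))

# A pax-aware reader (tarfile itself) recovers the fractional value.
archive.seek(0)
with tarfile.open(fileobj=archive, mode="r") as tar:
    restored_mtime = tar.getmember("example.txt").mtime
```

An unarchiver that ignores pax headers would see only the integer value stored in the ustar header, which is the case this commit improves.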
Andrzej Mateja
128ab092ca
bpo-44289: Keep argument file object's current position in tarfile.is_tarfile (GH-26488) 2022-02-09 08:19:16 -08:00
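On interpreters without this fix, a caller can get the same behaviour by saving and restoring the position around the call. The helper name below is hypothetical, and the sketch assumes a Python version where `is_tarfile()` accepts file objects.

```python
import io
import tarfile


def is_tarfile_keep_position(fileobj):
    """Hypothetical helper: probe fileobj without moving its position."""
    position = fileobj.tell()
    try:
        return tarfile.is_tarfile(fileobj)
    finally:
        # Restore the position whether or not the probe succeeded.
        fileobj.seek(position)


# Build a tiny in-memory archive to probe.
archive = io.BytesIO()
with tarfile.open(fileobj=archive, mode="w") as tar:
    tar.addfile(tarfile.TarInfo("empty.txt"))

archive.seek(0)
result = is_tarfile_keep_position(archive)
position_after = archive.tell()
```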
andrei kulakov
cfadcc31ea
bpo-21987: Fix TarFile.getmember getting a dir with a trailing slash (GH-30283) 2022-01-21 09:40:32 +02:00
Jack DeVries
b6fe857250
bpo-39039: tarfile raises descriptive exception from zlib.error (GH-27766)
* during tarfile parsing, a zlib error indicates invalid data
* tarfile.open now raises a descriptive exception from the zlib error
* this makes it clear to the user that they may be trying to open a
  corrupted tar file
2021-09-29 11:25:48 +02:00
Anthony Sottile
9aea31dedd
bpo-8978: improve tarfile.open error message when lzma / bz2 are missing (GH-24850)
Automerge-Triggered-By: GH:pablogsal
2021-04-27 10:39:01 -07:00
Ethan Furman
b5a6db9111
bpo-39717: [tarfile] update nested exception raising (GH-23739)
- `from None` if the new exception uses, or doesn't need, the previous one
- `from e` if the previous exception is still relevant
2020-12-12 13:26:44 -08:00
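The two chaining styles named above behave as follows. This is a generic sketch of the language feature, not code from tarfile; `ArchiveError`, `parse_octal`, and `load_member` are hypothetical names.

```python
class ArchiveError(Exception):
    """Hypothetical error type used only for this illustration."""


def parse_octal(field):
    try:
        return int(field, 8)
    except ValueError:
        # The new message says everything the caller needs, so hide the
        # original ValueError from the traceback.
        raise ArchiveError(f"invalid octal field: {field!r}") from None


def load_member(table, name):
    try:
        return table[name]
    except KeyError as e:
        # The original KeyError is still relevant, so keep it as the cause.
        raise ArchiveError(f"member {name!r} not found") from e


try:
    parse_octal("9z")
except ArchiveError as exc:
    suppressed = exc

try:
    load_member({}, "missing.txt")
except ArchiveError as exc:
    chained = exc
```

With `from None` the exception's `__cause__` is `None` and its context is suppressed; with `from e` the `__cause__` is the original exception.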
Julien Palard
4fedd7123e
bpo-12800: tarfile: Restore fix from 011525ee9 (GH-21409)
Restore fix from 011525ee92.
2020-11-25 10:23:17 +01:00
Andrey Doroschenko
ec42789e6e
bpo-39693: mention KeyError in tarfile extractfile documentation (GH-18639)
Co-authored-by: Andrey Darascheka <andrei.daraschenka@leverx.com>
2020-10-20 10:05:01 -04:00
Artem Bulgakov
22748a83d9
bpo-41316: Make tarfile follow specs for FNAME (GH-21511)
tarfile writes the full path to the FNAME field of the GZIP header instead of just the basename when the user specifies an absolute path. Some archive viewers may then process the file incorrectly. It is also a security issue, because anyone who has the archive can learn the directory structure of the system, the username, or other personal information.

RFC 1952 says about FNAME:
This is the original name of the file being compressed, with any directory components removed.

So tarfile must remove directory names from FNAME and write only the basename of the file.

Automerge-Triggered-By: @jaraco
2020-09-07 09:46:33 -07:00
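The rule from RFC 1952 can be sketched as a small function. The function name and the example path are hypothetical, and dropping the ".gz" suffix and using ISO-8859-1 are assumptions about a reasonable FNAME encoding rather than a quote of the patch.

```python
import os


def gzip_fname_field(path):
    """Hypothetical helper: build a NUL-terminated FNAME value for a path."""
    # Keep only the final path component so no directory names leak.
    name = os.path.basename(path)
    # Assumption: the stored name is the original, uncompressed file name.
    if name.endswith(".gz"):
        name = name[:-3]
    # RFC 1952 specifies ISO-8859-1 (Latin-1) for FNAME, terminated by NUL.
    return name.encode("iso-8859-1", "replace") + b"\x00"


field = gzip_fname_field("/home/alice/backups/site.tar.gz")
```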
Rishi
5a8d121a1f
bpo-39017: Avoid infinite loop in the tarfile module (GH-21454)
Avoid infinite loop when reading specially crafted TAR files using the tarfile module
(CVE-2019-20907).
2020-07-15 13:51:00 +02:00
William Chargin
674935b8ca
bpo-18819: tarfile: only set device fields for device files (GH-18080)
The GNU docs describe the `devmajor` and `devminor` fields of the tar
header struct only in the context of character and block special files,
suggesting that in other cases they are not populated. Typical utilities
behave accordingly; this patch teaches `tarfile` to do the same.
2020-02-12 11:56:02 -08:00
Serhiy Storchaka
9017e0bd5e bpo-39430: Fix race condition in lazy imports in tarfile. (GH-18161)
Use `from ... import ...` to ensure module is fully loaded before accessing its attributes.
2020-01-24 09:55:52 -08:00
William Woodruff
dd754caf14 bpo-29435: Allow is_tarfile to take a filelike obj (GH-18090)
`is_tarfile()` now supports `name` being a file or file-like object.
2020-01-22 18:24:16 -08:00
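A short usage sketch, assuming a Python version that includes this change; the archive is built in memory and the member name is made up.

```python
import io
import tarfile

# Build a small archive entirely in memory.
archive = io.BytesIO()
with tarfile.open(fileobj=archive, mode="w") as tar:
    info = tarfile.TarInfo("hello.txt")
    payload = b"hi"
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))
archive.seek(0)

# is_tarfile() can probe the file object directly, without a path on disk.
looks_like_tar = tarfile.is_tarfile(archive)
looks_like_text = tarfile.is_tarfile(io.BytesIO(b"plain text, not an archive"))
```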
Raymond Hettinger
a694f23948
Add missing docstrings for TarInfo objects (#12555) 2019-03-27 13:16:34 -07:00
CAM Gerlach
e680c3db80 bpo-36268: Change default tar format to pax from GNU. (GH-12355) 2019-03-21 16:44:51 +02:00
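Code that depends on a particular format can check the default and pass `format` explicitly. A minimal sketch, assuming a Python version that includes this change:

```python
import io
import tarfile

# After this change the module-level default is the pax format.
default_is_pax = tarfile.DEFAULT_FORMAT == tarfile.PAX_FORMAT

# Callers that still need GNU archives can request the format explicitly.
archive = io.BytesIO()
with tarfile.open(fileobj=archive, mode="w", format=tarfile.GNU_FORMAT) as tar:
    chosen_format = tar.format
```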
Anthony Sottile
8377cd4fcd Clean up code which checked presence of os.{stat,lstat,chmod} (#11643) 2019-02-25 23:32:27 +01:00
INADA Naoki
8d130913cb
bpo-34043: Optimize tarfile uncompress performance (GH-8089)
tarfile._Stream has two buffers, one for compressed and one for uncompressed data.
Those buffers are not aligned, so unnecessary bytes slicing happens
for every chunk read.

This commit bypasses the compressed buffering.

In this benchmark [1], user time dropped from 300ms to 250ms.

[1]: https://bugs.python.org/msg320763
2018-07-06 14:06:00 +09:00
hajoscher
12a08c4760 bpo-34010: Fix tarfile read performance regression (GH-8020)
During buffered read, use a list followed by join instead of extending a bytes object.
This is how it was done before; it was changed in commit b506dc32c1.
2018-07-04 17:13:18 +09:00
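The list-then-join pattern mentioned above looks roughly like this. It is a generic sketch, not the tarfile code; `read_exact` and its parameters are hypothetical.

```python
import io


def read_exact(fileobj, size, bufsize=16 * 1024):
    """Hypothetical helper: read up to size bytes using a list of chunks."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = fileobj.read(min(bufsize, remaining))
        if not chunk:
            break  # end of stream reached early
        chunks.append(chunk)
        remaining -= len(chunk)
    # One join at the end avoids re-copying a growing bytes object on
    # every iteration, which is what repeated `data += chunk` does.
    return b"".join(chunks)


data = read_exact(io.BytesIO(b"abcdefghij"), 7, bufsize=3)
```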
INADA Naoki
461a1c4b49
bpo-33842: Remove tarfile.filemode (GH-7661) 2018-06-28 17:10:36 +09:00
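Code that still needs a permission string can use `stat.filemode()` from the standard library. A brief sketch with made-up mode values:

```python
import stat

# Mode bits for a regular file and for a directory.
file_mode_string = stat.filemode(0o100644)
dir_mode_string = stat.filemode(0o040755)
```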
Joffrey F
72d9b2be36 bpo-32713: Fix tarfile.itn for large/negative float values. (GH-5434) 2018-02-27 02:02:21 +02:00
Bernhard M. Wiedemann
84521047e4 bpo-30693: zip+tarfile: sort directory listing (#2263)
tarfile and zipfile now sort directory listing to generate tar and zip archives
in a more reproducible way.

See also https://reproducible-builds.org/docs/stable-inputs/ on that topic.
2018-01-31 11:17:10 +01:00
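The sorted listing can be observed by adding a directory whose files were created out of order. This is a small sketch with made-up file names, assuming a Python version that includes this change.

```python
import io
import os
import tarfile
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Create the files in a deliberately unsorted order.
    for name in ("c.txt", "a.txt", "b.txt"):
        with open(os.path.join(root, name), "w") as f:
            f.write(name)

    archive = io.BytesIO()
    with tarfile.open(fileobj=archive, mode="w") as tar:
        tar.add(root, arcname="data")

    archive.seek(0)
    with tarfile.open(fileobj=archive) as tar:
        names = tar.getnames()
```

Member order is only one input to a reproducible archive; timestamps and ownership still vary between builds unless they are normalized separately.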
Mike
53f7a7c281 bpo-32297: Few misspellings found in Python source code comments. (#4803)
* Fix multiple typos in code comments

* Add spacing in comments (test_logging.py, test_math.py)

* Fix spaces at the beginning of comments in test_logging.py
2017-12-14 13:04:53 +02:00
Alex Gaynor
c7cc14a825 Remove two legacy constants which hopefully have no consumers (#1087)
The data contained in them is nonsensical
2017-04-11 22:41:42 -04:00
Serhiy Storchaka
150cd1916a bpo-29958: Minor improvements to zipfile and tarfile CLI. (#944) 2017-04-07 18:56:12 +03:00
Serhiy Storchaka
bdf6b910f9 bpo-29776: Use decorator syntax for properties. (#585) 2017-03-19 08:40:32 +02:00
Serhiy Storchaka
4f76fb16b7 Issue #29210: Removed support of deprecated argument "exclude" in
tarfile.TarFile.add().
2017-01-13 13:25:24 +02:00
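Callers that relied on `exclude` can use the `filter` argument of `TarFile.add()` instead. A minimal sketch with a hypothetical filter function and made-up file names:

```python
import io
import os
import tarfile
import tempfile


def drop_bytecode(info):
    """Hypothetical filter: return None to skip a member, else keep it."""
    return None if info.name.endswith(".pyc") else info


with tempfile.TemporaryDirectory() as root:
    for name in ("module.py", "module.pyc"):
        with open(os.path.join(root, name), "w") as f:
            f.write("x")

    archive = io.BytesIO()
    with tarfile.open(fileobj=archive, mode="w") as tar:
        tar.add(root, arcname="pkg", filter=drop_bytecode)

    archive.seek(0)
    with tarfile.open(fileobj=archive) as tar:
        names = tar.getnames()
```

Unlike `exclude`, which received a file name, the filter receives a `TarInfo` object, so it can also modify metadata before the member is written.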
Xavier de Gaye
f44abdab1e Issue #26937: The chown() method of the tarfile.TarFile class does not fail now
when the grp module cannot be imported, as for example on Android platforms.
2016-12-09 09:33:09 +01:00
Serhiy Storchaka
2f4453eff8 Issue #28449: tarfile.open() with mode "r" or "r:" now tries to open a tar
file with compression before trying to open it without compression.  Otherwise
it had a 50% chance of failing with ignore_zeros=True.
2016-10-30 20:56:23 +02:00
Serhiy Storchaka
a89d22aff3 Issue #28449: tarfile.open() with mode "r" or "r:" now tries to open a tar
file with compression before trying to open it without compression.  Otherwise
it had a 50% chance of failing with ignore_zeros=True.
2016-10-30 20:52:29 +02:00
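From the caller's side, mode "r" opens compressed and uncompressed archives alike. A short sketch with a made-up member name, using an in-memory gzip-compressed archive:

```python
import io
import tarfile

# Write a gzip-compressed archive into memory.
archive = io.BytesIO()
with tarfile.open(fileobj=archive, mode="w:gz") as tar:
    info = tarfile.TarInfo("notes.txt")
    payload = b"compressed member"
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

# Mode "r" detects the compression transparently when reading.
archive.seek(0)
with tarfile.open(fileobj=archive, mode="r") as tar:
    names = tar.getnames()
    content = tar.extractfile("notes.txt").read()
```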
Łukasz Langa
04bedfa3ce Issue #27199: TarFile exposes copyfileobj bufsize to improve throughput
Patch by Jason Fried.
2016-09-09 19:48:14 -07:00
Larry Hastings
10108a7b9a Issue #27355: Removed support for Windows CE. It was never finished,
and Windows CE is no longer a relevant platform for Python.
2016-09-05 15:11:23 -07:00
Łukasz Langa
5135e9ed51 Merge 3.5, issue #27194 2016-06-11 16:56:18 -07:00
Łukasz Langa
e7f27481a8 Issue #27194: superfluous truncate calls in tarfile.py slow down extraction
Patch by Jason Fried.
2016-06-11 16:42:36 -07:00
Lars Gustäbel
7c3e6848f2 Issue #24838: Merge tarfile fix from 3.5. 2016-04-19 08:53:14 +02:00
Lars Gustäbel
0f450abec4 Issue #24838: tarfile's ustar and gnu formats now correctly calculate name and
link field limits for multibyte character encodings like utf-8.
2016-04-19 08:43:17 +02:00
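The underlying point is that header field limits count encoded bytes, not characters. The sketch below uses a hypothetical helper and assumes the 100-byte ustar name field; it ignores the separate prefix field that ustar can use for longer paths.

```python
USTAR_NAME_FIELD_BYTES = 100  # assumed size of the ustar name field


def fits_name_field(name, encoding="utf-8"):
    """Hypothetical check: does the encoded name fit in the name field?"""
    return len(name.encode(encoding)) <= USTAR_NAME_FIELD_BYTES


ascii_name = "a" * 60        # 60 characters, 60 bytes in UTF-8
umlaut_name = "\u00e4" * 60  # 60 characters, 120 bytes in UTF-8

ascii_fits = fits_name_field(ascii_name)
umlaut_fits = fits_name_field(umlaut_name)
```

Both names have the same character count, but only the ASCII one fits once encoded.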
Serhiy Storchaka
b6a9c9761c Issue #26778: Fixed "a/an/and" typos in code comment, documentation and error
messages.
2016-04-17 09:39:28 +03:00
Serhiy Storchaka
6a7b3a77b4 Issue #26778: Fixed "a/an/and" typos in code comment and documentation. 2016-04-17 08:32:47 +03:00
Martin Panter
2d2d08d2cc Issue #22468: Merge gettarinfo() doc from 3.5 2016-02-19 23:46:59 +00:00
Martin Panter
f817a48d17 Issues #22468, #21996, #22208: Clarify gettarinfo() and TarInfo usage
* The Windows-specific binary notice was probably a Python 2 thing
* Make it more obvious gettarinfo() is based on stat(), and that non-ordinary
  files may need special care
* The file name must be text; suggest dummy arcname as a workaround
* Indicate TarInfo may be used directly, not just via gettarinfo()
2016-02-19 23:34:56 +00:00
Martin Panter
104dcdab59 Issue #23883: Add missing APIs to tarfile.__all__
Patch by Joel Taddei and Jacek Kołodziej.
2016-01-16 06:59:13 +00:00
Serhiy Storchaka
a254921cd4 Issue #22227: The TarFile iterator is reimplemented using a generator.
This implementation is simpler than using a class.
2015-12-19 09:43:14 +02:00
Martin Panter
b82032f935 Issue #22341: Drop Python 2 workaround and document CRC initial value
Also align the parameter naming in binascii to be consistent with zlib.
2015-12-11 05:19:29 +00:00
Lars Gustäbel
e12aa62d68 Merge with 3.4: Issue #24259: tarfile now raises a ReadError if an archive is truncated inside a data segment. 2015-07-06 09:29:41 +02:00
Lars Gustäbel
0357268d96 Issue #24259: tarfile now raises a ReadError if an archive is truncated inside a data segment. 2015-07-06 09:27:24 +02:00
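The behaviour can be reproduced by cutting an archive off in the middle of a member's data. This is a small sketch with made-up sizes, assuming a Python version that includes this fix.

```python
import io
import tarfile

# Build an archive holding one 1024-byte member.
archive = io.BytesIO()
with tarfile.open(fileobj=archive, mode="w:") as tar:
    info = tarfile.TarInfo("data.bin")
    info.size = 1024
    tar.addfile(info, io.BytesIO(b"a" * 1024))

# Keep the 512-byte header plus only part of the data segment.
truncated = io.BytesIO(archive.getvalue()[:600])

error_message = None
try:
    with tarfile.open(fileobj=truncated, mode="r:") as tar:
        for member in tar:
            pass
except tarfile.ReadError as exc:
    error_message = str(exc)
```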
Lars Gustäbel
49c521fd5d Merge with 3.4: Issue #24514: tarfile now tolerates number fields consisting of only whitespace. 2015-07-02 19:41:03 +02:00
Lars Gustäbel
b7a688b3a4 Issue #24514: tarfile now tolerates number fields consisting of only whitespace. 2015-07-02 19:38:38 +02:00
Lars Gustäbel
20703c6969 tarfile.open() with mode 'x' created files without an end of archive marker. 2015-05-27 12:53:44 +02:00
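An archive created with mode 'x' should end with the two zero-filled 512-byte blocks that mark the end of a tar archive. The sketch below checks this on a temporary file; the file and member names are made up, and it assumes a Python version that includes this fix.

```python
import io
import os
import tarfile
import tempfile

with tempfile.TemporaryDirectory() as root:
    path = os.path.join(root, "new.tar")

    # Mode "x" creates the archive and fails if the file already exists.
    with tarfile.open(path, mode="x") as tar:
        info = tarfile.TarInfo("hello.txt")
        payload = b"hello"
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))

    with open(path, "rb") as f:
        raw = f.read()

    try:
        tarfile.open(path, mode="x")
        second_create_failed = False
    except FileExistsError:
        second_create_failed = True

# The end-of-archive marker is two zero-filled 512-byte blocks.
ends_with_marker = raw[-1024:] == bytes(1024)
```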
Eric V. Smith
7a80389ce5 Issue #23193: Add numeric_owner to tarfile.TarFile.extract() and tarfile.TarFile.extractall(). 2015-04-15 10:27:58 -04:00