mirrors/git

mirror of https://github.com/git/git.git synced 2024-12-13 11:54:56 +08:00

Author	SHA1	Message	Date
Junio C Hamano	cf792653ad	Merge branch 'ps/leakfixes' Leakfixes. * ps/leakfixes: builtin/mv: fix leaks for submodule gitfile paths builtin/mv: refactor to use `struct strvec` builtin/mv duplicate string list memory builtin/mv: refactor `add_slash()` to always return allocated strings strvec: add functions to replace and remove strings submodule: fix leaking memory for submodule entries commit-reach: fix memory leak in `ahead_behind()` builtin/credential: clear credential before exit config: plug various memory leaks config: clarify memory ownership in `git_config_string()` builtin/log: stop using globals for format config builtin/log: stop using globals for log config convert: refactor code to clarify ownership of check_roundtrip_encoding diff: refactor code to clarify memory ownership of prefixes config: clarify memory ownership in `git_config_pathname()` http: refactor code to clarify memory ownership checkout: clarify memory ownership in `unique_tracking_name()` strbuf: fix leak when `appendwholeline()` fails with EOF transport-helper: fix leaking helper name	2024-06-06 12:49:23 -07:00
Junio C Hamano	988499e295	Merge branch 'ps/refs-without-the-repository-updates' Further clean-up the refs subsystem to stop relying on the_repository, and instead use the repository associated to the ref_store object. * ps/refs-without-the-repository-updates: refs/packed: remove references to `the_hash_algo` refs/files: remove references to `the_hash_algo` refs/files: use correct repository refs: remove `dwim_log()` refs: drop `git_default_branch_name()` refs: pass repo when peeling objects refs: move object peeling into "object.c" refs: pass ref store when detecting dangling symrefs refs: convert iteration over replace refs to accept ref store refs: retrieve worktree ref stores via associated repository refs: refactor `resolve_gitlink_ref()` to accept a repository refs: pass repo when retrieving submodule ref store refs: track ref stores via strmap refs: implement releasing ref storages refs: rename `init_db` callback to avoid confusion refs: adjust names for `init` and `init_db` callbacks	2024-05-30 14:15:13 -07:00
Junio C Hamano	a60c21b720	Merge branch 'ps/undecided-is-not-necessarily-sha1' Before discovering the repository details, We used to assume SHA-1 as the "default" hash function, which has been corrected. Hopefully this will smoke out codepaths that rely on such an unwarranted assumptions. * ps/undecided-is-not-necessarily-sha1: repository: stop setting SHA1 as the default object hash oss-fuzz/commit-graph: set up hash algorithm builtin/shortlog: don't set up revisions without repo builtin/diff: explicitly set hash algo when there is no repo builtin/bundle: abort "verify" early when there is no repository builtin/blame: don't access potentially unitialized `the_hash_algo` builtin/rev-parse: allow shortening to more than 40 hex characters remote-curl: fix parsing of detached SHA256 heads attr: fix BUG() when parsing attrs outside of repo attr: don't recompute default attribute source parse-options-cb: only abbreviate hashes when hash algo is known path: move `validate_headref()` to its only user path: harden validation of HEAD with non-standard hashes	2024-05-30 14:15:11 -07:00
Patrick Steinhardt	1b261c20ed	config: clarify memory ownership in `git_config_string()` The out parameter of `git_config_string()` is a `const char ` even though we transfer ownership of memory to the caller. This is quite misleading and has led to many memory leaks all over the place. Adapt the parameter to instead be `char `. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-27 11:20:00 -07:00
Patrick Steinhardt	e19488a60a	refs: refactor `resolve_gitlink_ref()` to accept a repository In `resolve_gitlink_ref()` we implicitly rely on `the_repository` to look up the submodule ref store. Now that we can look up submodule ref stores for arbitrary repositories we can improve this function to instead accept a repository as parameter for which we want to resolve the gitlink. Do so and adjust callers accordingly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-17 10:33:38 -07:00
Junio C Hamano	b077cf2679	Merge branch 'jc/no-default-attr-tree-in-bare' Git 2.43 started using the tree of HEAD as the source of attributes in a bare repository, which has severe performance implications. For now, revert the change, without ripping out a more explicit support for the attr.tree configuration variable. * jc/no-default-attr-tree-in-bare: stop using HEAD for attributes in bare repository by default	2024-05-13 10:19:46 -07:00
Patrick Steinhardt	813f17fd6b	attr: fix BUG() when parsing attrs outside of repo If either the `--attr-source` option or the `GIT_ATTR_SOURCE` envvar are set, then `compute_default_attr_source()` will try to look up the value as a treeish. It is possible to hit that function while outside of a Git repository though, for example when using `git grep --no-index`. In that case, Git will hit a bug because we try to look up the main ref store outside of a repository. Handle the case gracefully and detect when we try to look up an attr source without a repository. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-06 22:50:49 -07:00
Patrick Steinhardt	bbb82f8dc8	attr: don't recompute default attribute source The `default_attr_source()` function lazily computes the attr source supposedly once, only. This is done via a static variable `attr_source` that contains the resolved object ID of the attr source's tree. If the variable is the null object ID then we try to look up the attr source, otherwise we skip over it. This approach is flawed though: the variable will never be set to anything else but the null object ID in case there is no attr source. Consequently, we re-compute the information on every call. And in the worst case, when we silently ignore bad trees, this will cause us to try and look up the treeish every single time. Improve this by introducing a separate variable `has_attr_source` to track whether we already computed the attr source and, if so, whether we have an attr source or not. This also allows us to convert the `ignore_bad_attr_tree` to not be static anymore as the code will only be executed once anyway. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-06 22:50:48 -07:00
Junio C Hamano	e9e8dd8801	Merge branch 'jc/no-default-attr-tree-in-bare' into ps/undecided-is-not-necessarily-sha1 * jc/no-default-attr-tree-in-bare: stop using HEAD for attributes in bare repository by default	2024-05-06 22:50:24 -07:00
Taylor Blau	c793f9cb08	attr.c: move ATTR_MAX_FILE_SIZE check into read_attr_from_buf() Commit `3c50032ff5` (attr: ignore overly large gitattributes files, 2022-12-01) added a defense-in-depth check to ensure that .gitattributes blobs read from the index do not exceed ATTR_MAX_FILE_SIZE (100 MB). But there were two cases added shortly after `3c50032ff5` was written which do not apply similar protections: - `47cfc9bd7d` (attr: add flag `--source` to work with tree-ish, 2023-01-14) - `4723ae1007` (attr.c: read attributes in a sparse directory, 2023-08-11) added a similar Ensure that we refuse to process a .gitattributes blob exceeding ATTR_MAX_FILE_SIZE when reading from either an arbitrary tree object or a sparse directory. This is done by pushing the ATTR_MAX_FILE_SIZE check down into the low-level `read_attr_from_buf()`. In doing so, plug a leak in `read_attr_from_index()` where we would accidentally leak the large buffer upon detecting it is too large to process. (Since `read_attr_from_buf()` handles a NULL buffer input, we can remove a NULL check before calling it in `read_attr_from_index()` as well). Co-authored-by: Jeff King <peff@peff.net> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-03 12:44:16 -07:00
Junio C Hamano	51441e6460	stop using HEAD for attributes in bare repository by default With `23865355` (attr: read attributes from HEAD when bare repo, 2023-10-13), we started to use the HEAD tree as the default attribute source in a bare repository. One argument for such a behaviour is that it would make things like "git archive" run in bare and non-bare repositories for the same commit consistent. This changes was merged to Git 2.43 but without an explicit mention in its release notes. It turns out that this change destroys performance of shallowly cloning from a bare repository. As the "server" installations are expected to be mostly bare, and "git pack-objects", which is the core of driving the other side of "git clone" and "git fetch" wants to see if a path is set not to delta with blobs from other paths via the attribute system, the change forces the server side to traverse the tree of the HEAD commit needlessly to find if each and every paths the objects it sends out has the attribute that controls the deltification. Given that (1) most projects do not configure such an attribute, and (2) it is dubious for the server side to honor such an end-user supplied attribute anyway, this was a poor choice of the default. To mitigate the current situation, let's revert the change that uses the tree of HEAD in a bare repository by default as the attribute source. This will help most people who have been happy with the behaviour of Git 2.42 and before. Two things to note: * If you are stuck with versions of Git 2.43 or newer, that is older than the release this fix appears in, you can explicitly set the attr.tree configuration variable to point at an empty tree object, i.e. $ git config attr.tree `4b825dc642` * If you like the behaviour we are reverting, you can explicitly set the attr.tree configuration variable to HEAD, i.e. $ git config attr.tree HEAD The right fix for this is to optimize the code paths that allow accesses to attributes in tree objects, but that is a much more involved change and is left as a longer-term project, outside the scope of this "first step" fix. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-03 09:15:33 -07:00
Joanna Wang	2232a88ab6	attr: add builtin objectmode values support Gives all paths builtin objectmode values based on the paths' modes (one of 100644, 100755, 120000, 040000, 160000). Users may use this feature to filter by file types. For example a pathspec such as ':(attr:builtin_objectmode=160000)' could filter for submodules without needing to have `builtin_objectmode=160000` to be set in .gitattributes for every submodule path. These values are also reflected in `git check-attr` results. If the git_attr_direction is set to GIT_ATTR_INDEX or GIT_ATTR_CHECKIN and a path is not found in the index, the value will be unspecified. This patch also reserves the builtin_* attribute namespace for objectmode and any future builtin attributes. Any user defined attributes using this reserved namespace will result in a warning. This is a breaking change for any existing builtin_* attributes. Pathspecs with some builtin_* attribute name (excluding builtin_objectmode) will behave like any attribute where there are no user specified values. Signed-off-by: Joanna Wang <jojwang@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-28 13:21:52 -08:00
Junio C Hamano	26dd307cfa	Merge branch 'jc/attr-tree-config' The attribute subsystem learned to honor `attr.tree` configuration that specifies which tree to read the .gitattributes files from. * jc/attr-tree-config: attr: add attr.tree for setting the treeish to read attributes from attr: read attributes from HEAD when bare repo	2023-10-30 07:09:55 +09:00
John Cai	9f9c40cf34	attr: add attr.tree for setting the treeish to read attributes from `44451a2` (attr: teach "--attr-source=<tree>" global option to "git", 2023-05-06) provided the ability to pass in a treeish as the attr source. In the context of serving Git repositories as bare repos like we do at GitLab however, it would be easier to point --attr-source to HEAD for all commands by setting it once. Add a new config attr.tree that allows this. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-10-13 11:43:29 -07:00
John Cai	2386535511	attr: read attributes from HEAD when bare repo The motivation for `44451a2e5e` (attr: teach "--attr-source=<tree>" global option to "git" , 2023-05-06), was to make it possible to use gitattributes with bare repositories. To make it easier to read gitattributes in bare repositories however, let's just make HEAD:.gitattributes the default. This is in line with how mailmap works, `8c473cecfd` (mailmap: default mailmap.blob in bare repositories, 2012-12-13). Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-10-13 11:43:29 -07:00
Calvin Wan	b1bda75173	parse: separate out parsing functions from config.h The files config.{h,c} contain functions that have to do with parsing, but not config. In order to further reduce all-in-one headers, separate out functions in config.c that do not operate on config into its own file, parse.h, and update the include directives in the .c files that need only such functions accordingly. Signed-off-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-09-29 15:14:57 -07:00
Junio C Hamano	354356feff	Merge branch 'sl/sparse-check-attr' Teach "git check-attr" work better with sparse-index. * sl/sparse-check-attr: check-attr: integrate with sparse-index attr.c: read attributes in a sparse directory t1092: add tests for 'git check-attr'	2023-08-29 13:51:43 -07:00
Shuqi Liang	4723ae1007	attr.c: read attributes in a sparse directory Before this patch, git check-attr was unable to read the attributes from a .gitattributes file within a sparse directory. The original comment was operating under the assumption that users are only interested in files or directories inside the cones. Therefore, in the original code, in the case of a cone-mode sparse-checkout, we didn't load the .gitattributes file. However, this behavior can lead to missing attributes for files inside sparse directories, causing inconsistencies in file handling. To resolve this, revise 'git check-attr' to allow attribute reading for files in sparse directories from the corresponding .gitattributes files: 1.Utilize path_in_cone_mode_sparse_checkout() and index_name_pos_sparse to check if a path falls within a sparse directory. 2.If path is inside a sparse directory, employ the value of index_name_pos_sparse() to find the sparse directory containing path and path relative to sparse directory. Proceed to read attributes from the tree OID of the sparse directory using read_attr_from_blob(). 3.If path is not inside a sparse directory，ensure that attributes are fetched from the index blob with read_blob_data_from_index(). Change the test 'check-attr with pathspec outside sparse definition' to 'test_expect_success' to reflect that the attributes inside a sparse directory can now be read. Ensure that the sparse index case works correctly for git check-attr to illustrate the successful handling of attributes within sparse directories. Helped-by: Victoria Dye <vdye@github.com> Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-08-11 09:44:51 -07:00
Junio C Hamano	ce481ac8b3	Merge branch 'cw/compat-util-header-cleanup' Further shuffling of declarations across header files to streamline file dependencies. * cw/compat-util-header-cleanup: git-compat-util: move alloc macros to git-compat-util.h treewide: remove unnecessary includes for wrapper.h kwset: move translation table from ctype sane-ctype.h: create header for sane-ctype macros git-compat-util: move wrapper.c funcs to its header git-compat-util: move strbuf.c funcs to its header	2023-07-17 11:30:42 -07:00
Calvin Wan	91c080dff5	git-compat-util: move alloc macros to git-compat-util.h alloc_nr, ALLOC_GROW, and ALLOC_GROW_BY are commonly used macros for dynamic array allocation. Moving these macros to git-compat-util.h with the other alloc macros focuses alloc.[ch] to allocation for Git objects and additionally allows us to remove inclusions to alloc.h from files that solely used the above macros. Signed-off-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-07-05 11:42:31 -07:00
Junio C Hamano	89d62d5e8e	Merge branch 'bc/more-git-var' Add more "git var" for toolsmiths to learn various locations Git is configured with either via the configuration or hardcoded defaults. * bc/more-git-var: var: add config file locations var: add attributes files locations attr: expose and rename accessor functions var: adjust memory allocation for strings var: format variable structure with C99 initializers var: add support for listing the shell t: add a function to check executable bit var: mark unused parameters in git_var callbacks	2023-07-04 16:08:18 -07:00
brian m. carlson	15780bb4f0	attr: expose and rename accessor functions Right now, the functions which determine the current system and global gitattributes files are not exposed. We'd like to use them in a future commit, but they're not ideally named. Rename them to something more suitable as a public interface, expose them, and document them. Signed-off-by: brian m. carlson <bk2204@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-27 11:31:06 -07:00
Elijah Newren	a034e9106f	object-store-ll.h: split this header out of object-store.h The vast majority of files including object-store.h did not need dir.h nor khash.h. Split the header into two files, and let most just depend upon object-store-ll.h, while letting the two callers that need it depend on the full object-store.h. After this patch: $ git grep -h include..object-store \| sort \| uniq -c 2 #include "object-store.h" 129 #include "object-store-ll.h" Diff best viewed with `--color-moved`. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-21 13:39:54 -07:00
Elijah Newren	c339932bd8	repository: remove unnecessary include of path.h This also made it clear that several .c files that depended upon path.h were missing a #include for it; add the missing includes while at it. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-21 13:39:53 -07:00
Elijah Newren	bc5c5ec044	cache.h: remove this no-longer-used header Since this header showed up in some places besides just #include statements, update/clean-up/remove those other places as well. Note that compat/fsmonitor/fsm-path-utils-darwin.c previously got away with violating the rule that all files must start with an include of git-compat-util.h (or a short-list of alternate headers that happen to include it first). This change exposed the violation and caused it to stop building correctly; fix it by having it include git-compat-util.h first, as per policy. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-21 13:39:53 -07:00
Elijah Newren	08c46a499a	read-cache*.h: move declarations for read-cache.c functions from cache.h For the functions defined in read-cache.c, move their declarations from cache.h to a new header, read-cache-ll.h. Also move some related inline functions from cache.h to read-cache.h. The purpose of the read-cache-ll.h/read-cache.h split is that about 70% of the sites don't need the inline functions and the extra headers they include. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-21 13:39:53 -07:00
Junio C Hamano	67a3b2b39f	Merge branch 'jc/attr-source-tree' "git --attr-source=<tree> cmd $args" is a new way to have any command to read attributes not from the working tree but from the given tree object. * jc/attr-source-tree: attr: teach "--attr-source=<tree>" global option to "git"	2023-05-17 10:11:41 -07:00
John Cai	44451a2e5e	attr: teach "--attr-source=<tree>" global option to "git" Earlier, `47cfc9bd` (attr: add flag `--source` to work with tree-ish, 2023-01-14) taught "git check-attr" the "--source=<tree>" option to allow it to read attribute files from a tree-ish, but did so only for the command. Just like "check-attr" users wanted a way to use attributes from a tree-ish and not from the working tree files, users of other commands (like "git diff") would benefit from the same. Undo most of the UI change the commit made, while keeping the internal logic to read attributes from a given tree-ish. Expose the internal logic via a new "--attr-source=<tree>" command line option given to "git", so that it can be used with any git command that runs as part of the main git process. Additionally, add an environment variable GIT_ATTR_SOURCE that is set when --attr-source is passed in, so that subprocesses use the same value for the attributes source tree. Signed-off-by: John Cai <johncai86@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-06 14:34:09 -07:00
Elijah Newren	0e312eaa12	diff.h: reduce unnecessary includes Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-24 12:47:33 -07:00
Elijah Newren	e38da487cc	setup.h: move declarations for setup.c functions from cache.h Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-21 10:56:54 -07:00
Elijah Newren	32a8f51061	environment.h: move declarations for environment.c functions from cache.h Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-21 10:56:53 -07:00
Elijah Newren	f394e093df	treewide: be explicit about dependence on gettext.h Dozens of files made use of gettext functions, without explicitly including gettext.h. This made it more difficult to find which files could remove a dependence on cache.h. Make C files explicitly include gettext.h if they are using it. However, while compat/fsmonitor/fsm-ipc-darwin.c should also gain an include of gettext.h, it was left out to avoid conflicting with an in-flight topic. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-21 10:56:51 -07:00
Elijah Newren	36bf195890	alloc.h: move ALLOC_GROW() functions from cache.h This allows us to replace includes of cache.h with includes of the much smaller alloc.h in many places. It does mean that we also need to add includes of alloc.h in a number of C files. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-02-23 17:25:28 -08:00
Junio C Hamano	577bff3a81	Merge branch 'kn/attr-from-tree' "git check-attr" learned to take an optional tree-ish to read the .gitattributes file from. * kn/attr-from-tree: attr: add flag `--source` to work with tree-ish t0003: move setup for `--all` into new block	2023-01-23 13:39:51 -08:00
Junio C Hamano	60ce816cb6	Merge branch 'rs/dup-array' Code cleaning. * rs/dup-array: use DUP_ARRAY add DUP_ARRAY do full type check in BARF_UNLESS_COPYABLE factor out BARF_UNLESS_COPYABLE mingw: make argv2 in try_shell_exec() non-const	2023-01-21 17:21:58 -08:00
Johannes Schindelin	37537d6472	attr: adjust a mismatched data type On platforms where `size_t` does not have the same width as `unsigned long`, passing a pointer to the former when a pointer to the latter is expected can lead to problems. Windows and 32-bit Linux are among the affected platforms. In this instance, we want to store the size of the blob that was read in that variable. However, `read_blob_data_from_index()` passes that pointer to `read_object_file()` which expects an `unsigned long *`. Which means that on affected platforms, the variable is not fully populated and part of its value is left uninitialized. (On Big-Endian platforms, this problem would be even worse.) The consequence is that depending on the uninitialized memory's contents, we may erroneously reject perfectly fine attributes. Let's address this by passing a pointer to a variable of the expected data type. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-01-17 06:58:20 -08:00
Karthik Nayak	47cfc9bd7d	attr: add flag `--source` to work with tree-ish The contents of the .gitattributes files may evolve over time, but "git check-attr" always checks attributes against them in the working tree and/or in the index. It may be beneficial to optionally allow the users to check attributes taken from a commit other than HEAD against paths. Add a new flag `--source` which will allow users to check the attributes against a commit (actually any tree-ish would do). When the user uses this flag, we go through the stack of .gitattributes files but instead of checking the current working tree and/or in the index, we check the blobs from the provided tree-ish object. This allows the command to also be used in bare repositories. Since we use a tree-ish object, the user can pass "--source HEAD:subdirectory" and all the attributes will be looked up as if subdirectory was the root directory of the repository. We cannot simply use the `<rev>:<path>` syntax without the `--source` flag, similar to how it is used in `git show` because any non-flag parameter before `--` is treated as an attribute and any parameter after `--` is treated as a pathname. The change involves creating a new function `read_attr_from_blob`, which given the path reads the blob for the path against the provided source and parses the attributes line by line. This function is plugged into `read_attr()` function wherein we go through the stack of attributes files. Signed-off-by: Karthik Nayak <karthik.188@gmail.com> Signed-off-by: Toon Claes <toon@iotcl.com> Co-authored-by: toon@iotcl.com Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-01-14 08:49:55 -08:00
René Scharfe	6e57841096	use DUP_ARRAY Add a semantic patch for replace ALLOC_ARRAY+COPY_ARRAY with DUP_ARRAY to reduce code duplication and apply its results. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-01-09 13:28:36 +09:00
Junio C Hamano	fea9f607a8	Sync with Git 2.37.5	2022-12-13 21:23:36 +09:00
Junio C Hamano	8253c00421	Merge branch 'maint-2.35' into maint-2.36	2022-12-13 21:19:11 +09:00
Junio C Hamano	3748b5b7f5	Merge branch 'maint-2.33' into maint-2.34	2022-12-13 21:15:22 +09:00
Junio C Hamano	5f22dcc02d	Sync with Git 2.32.5	2022-12-13 21:13:11 +09:00
Junio C Hamano	8a755eddf5	Sync with Git 2.31.6	2022-12-13 21:09:40 +09:00
Junio C Hamano	16128765d7	Git 2.30.7 -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEE4fA2sf7nIh/HeOzvsLXohpav5ssFAmOYaKAACgkQsLXohpav 5svHThAAhjTaBBBYDM6FbHcFUHv515fSo04AmyXl6QKdjiBroaGj+WpJPrYM3B5G eR0MYfwDfp7rAvxPQwq1LzSQLz2G2Swue1a4X0t3Bomjpqf48OeAUleGHQlBUTm7 wfZIHpgCbMBIHJtVAVPiEOo43ZJ1OareCwVpPOpAXLVgTU2Bbx59K0oUqGszjgE3 anQ0kon6hELZ9aBTx80hUJaYWaxiUqENtRFs6vyOV/MKvW2KR+MJgvu/SQqbRJPy ndBJ5r0gcSbes0OLxKCAFFNVt2p6BeVb4IxyPogJveGwJsNU88DQnarSos7hvPYG DkhTzfpPmFJkP0WiRHr87jWCXNJraq1SmK65ac1CGV/NTrDfX9ZNoGIRFsHfLmw2 1poTxhB/h0F4wCucZu7Wavvgd2NI2V+GK5dx8Mx5NovrC67smBny2W7kQgXJCdZX e6vNuKVK7pz3cVYvo5GbUo2ivY2igm9Xbj3Na1/Ie8wTFaZ0ZX+oRnxxAdwKbL/1 X0VRUTQMgtrrLd24JCApo8r5+Ssg0HvIOpXcUZFpvaYl9kMltatwV1Y01lNAhAgF VFBvUWdFy5tGzPzSCd3w2NyZOJBng2GdKw9YUt/WVWCKeiLXLI3wh10pC+m1qJus HJwQbRRSzC4mhXlkKZ5IG+Xz7x+HrHFnLpQXhtjeSc5WwGQkE2w= =syKo -----END PGP SIGNATURE----- Sync with Git 2.30.7	2022-12-13 21:02:20 +09:00
Patrick Steinhardt	3c50032ff5	attr: ignore overly large gitattributes files Similar as with the preceding commit, start ignoring gitattributes files that are overly large to protect us against out-of-bounds reads and writes caused by integer overflows. Unfortunately, we cannot just define "overly large" in terms of any preexisting limits in the codebase. Instead, we choose a very conservative limit of 100MB. This is plenty of room for specifying gitattributes, and incidentally it is also the limit for blob sizes for GitHub. While we don't want GitHub to dictate limits here, it is still sensible to use this fact for an informed decision given that it is hosting a huge set of repositories. Furthermore, over at GitLab we scanned a subset of repositories for their root-level attribute files. We found that 80% of them have a gitattributes file smaller than 100kB, 99.99% have one smaller than 1MB, and only a single repository had one that was almost 3MB in size. So enforcing a limit of 100MB seems to give us ample of headroom. With this limit in place we can be reasonably sure that there is no easy way to exploit the gitattributes file via integer overflows anymore. Furthermore, it protects us against resource exhaustion caused by allocating the in-memory data structures required to represent the parsed attributes. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-12-05 15:50:03 +09:00
Patrick Steinhardt	dfa6b32b5e	attr: ignore attribute lines exceeding 2048 bytes There are two different code paths to read gitattributes: once via a file, and once via the index. These two paths used to behave differently because when reading attributes from a file, we used fgets(3P) with a buffer size of 2kB. Consequentially, we silently truncate line lengths when lines are longer than that and will then parse the remainder of the line as a new pattern. It goes without saying that this is entirely unexpected, but it's even worse that the behaviour depends on how the gitattributes are parsed. While this is simply wrong, the silent truncation saves us with the recently discovered vulnerabilities that can cause out-of-bound writes or reads with unreasonably long lines due to integer overflows. As the common path is to read gitattributes via the worktree file instead of via the index, we can assume that any gitattributes file that had lines longer than that is already broken anyway. So instead of lifting the limit here, we can double down on it to fix the vulnerabilities. Introduce an explicit line length limit of 2kB that is shared across all paths that read attributes and ignore any line that hits this limit while printing a warning. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-12-05 15:33:07 +09:00
Patrick Steinhardt	d74b1fd54f	attr: fix silently splitting up lines longer than 2048 bytes When reading attributes from a file we use fgets(3P) with a buffer size of 2048 bytes. This means that as soon as a line exceeds the buffer size we split it up into multiple parts and parse each of them as a separate pattern line. This is of course not what the user intended, and even worse the behaviour is inconsistent with how we read attributes from the index. Fix this bug by converting the code to use `strbuf_getline()` instead. This will indeed read in the whole line, which may theoretically lead to an out-of-memory situation when the gitattributes file is huge. We're about to reject any gitattributes files larger than 100MB in the next commit though, which makes this less of a concern. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-12-05 15:29:30 +09:00
Patrick Steinhardt	a60a66e409	attr: harden allocation against integer overflows When parsing an attributes line, we need to allocate an array that holds all attributes specified for the given file pattern. The calculation to determine the number of bytes that need to be allocated was prone to an overflow though when there was an unreasonable amount of attributes. Harden the allocation by instead using the `st_` helper functions that cause us to die when we hit an integer overflow. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-12-05 15:14:16 +09:00
Patrick Steinhardt	e1e12e97ac	attr: fix integer overflow with more than INT_MAX macros Attributes have a field that tracks the position in the `all_attrs` array they're stored inside. This field gets set via `hashmap_get_size` when adding the attribute to the global map of attributes. But while the field is of type `int`, the value returned by `hashmap_get_size` is an `unsigned int`. It can thus happen that the value overflows, where we would now dereference teh `all_attrs` array at an out-of-bounds value. We do have a sanity check for this overflow via an assert that verifies the index matches the new hashmap's size. But asserts are not a proper mechanism to detect against any such overflows as they may not in fact be compiled into production code. Fix this by using an `unsigned int` to track the index and convert the assert to a call `die()`. Reported-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-12-05 15:14:16 +09:00
Patrick Steinhardt	447ac906e1	attr: fix out-of-bounds read with unreasonable amount of patterns The `struct attr_stack` tracks the stack of all patterns together with their attributes. When parsing a gitattributes file that has more than 2^31 such patterns though we may trigger multiple out-of-bounds reads on 64 bit platforms. This is because while the `num_matches` variable is an unsigned integer, we always use a signed integer to iterate over them. I have not been able to reproduce this issue due to memory constraints on my systems. But despite the out-of-bounds reads, the worst thing that can seemingly happen is to call free(3P) with a garbage pointer when calling `attr_stack_free()`. Fix this bug by using unsigned integers to iterate over the array. While this makes the iteration somewhat awkward when iterating in reverse, it is at least better than knowingly running into an out-of-bounds read. While at it, convert the call to `ALLOC_GROW` to use `ALLOC_GROW_BY` instead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-12-05 15:14:16 +09:00

1 2 3 4 5

242 Commits