mirrors/git

mirror of https://github.com/git/git.git synced 2024-11-23 18:05:29 +08:00

Author	SHA1	Message	Date
Patrick Steinhardt	7ac16649ec	path: hide functions using `the_repository` by default The path subsystem provides a bunch of legacy functions that compute paths relative to the "gitdir" and "commondir" directories of the global `the_repository` variable. Use of those functions is discouraged, and it is easy to miss the implicit dependency on `the_repository` that calls to those functions may cause. With `USE_THE_REPOSITORY_VARIABLE`, we have recently introduced a tool that allows us to get rid of such functions over time. With this macro, we can hide away functions that have such implicit dependency such that other subsystems that want to be free of `the_repository` will not use them by accident. Move all path-related functions that use `the_repository` into a block that gets only conditionally compiled depending on whether or not the macro has been defined. This also removes all dependencies on that variable in "path.c", allowing us to remove the definition of said preprocessor macro. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-13 10:01:01 -07:00
Patrick Steinhardt	a973f60dc7	path: stop relying on `the_repository` in `worktree_git_path()` When not provided a worktree, then `worktree_git_path()` will fall back to returning a path relative to the main repository. In this case, we implicitly rely on `the_repository` to derive the path. Remove this dependency by passing a `struct repository` as parameter. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-13 10:01:01 -07:00
Patrick Steinhardt	78f2210b3c	path: stop relying on `the_repository` when reporting garbage We access `the_repository` in `report_linked_checkout_garbage()` both directly and indirectly via `get_git_dir()`. Remove this dependency by instead passing a `struct repository` as parameter. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-13 10:01:01 -07:00
Patrick Steinhardt	61419a42f6	path: expose `do_git_common_path()` as `repo_common_pathv()` With the same reasoning as the preceding commit, expose the function `do_git_common_path()` as `repo_common_pathv()`. While at it, reorder parameters such that they match the order we have in `repo_git_pathv()`. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-13 10:01:00 -07:00
Patrick Steinhardt	b6c6bfef31	path: expose `do_git_path()` as `repo_git_pathv()` We're about to move functions of the "path" subsytem that do not use a `struct repository` into "path.h" as static inlined functions. This will require us to call `do_git_path()`, which is internal to "path.c". Expose the function as `repo_git_pathv()` to prepare for the change. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-08-13 10:01:00 -07:00
Patrick Steinhardt	e7da938570	global: introduce `USE_THE_REPOSITORY_VARIABLE` macro Use of the `the_repository` variable is deprecated nowadays, and we slowly but steadily convert the codebase to not use it anymore. Instead, callers should be passing down the repository to work on via parameters. It is hard though to prove that a given code unit does not use this variable anymore. The most trivial case, merely demonstrating that there is no direct use of `the_repository`, is already a bit of a pain during code reviews as the reviewer needs to manually verify claims made by the patch author. The bigger problem though is that we have many interfaces that implicitly rely on `the_repository`. Introduce a new `USE_THE_REPOSITORY_VARIABLE` macro that allows code units to opt into usage of `the_repository`. The intent of this macro is to demonstrate that a certain code unit does not use this variable anymore, and to keep it from new dependencies on it in future changes, be it explicit or implicit For now, the macro only guards `the_repository` itself as well as `the_hash_algo`. There are many more known interfaces where we have an implicit dependency on `the_repository`, but those are not guarded at the current point in time. Over time though, we should start to add guards as required (or even better, just remove them). Define the macro as required in our code units. As expected, most of our code still relies on the global variable. Nearly all of our builtins rely on the variable as there is no way yet to pass `the_repository` to their entry point. For now, declare the macro in "biultin.h" to keep the required changes at least a little bit more contained. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-14 10:26:33 -07:00
Junio C Hamano	a60c21b720	Merge branch 'ps/undecided-is-not-necessarily-sha1' Before discovering the repository details, We used to assume SHA-1 as the "default" hash function, which has been corrected. Hopefully this will smoke out codepaths that rely on such an unwarranted assumptions. * ps/undecided-is-not-necessarily-sha1: repository: stop setting SHA1 as the default object hash oss-fuzz/commit-graph: set up hash algorithm builtin/shortlog: don't set up revisions without repo builtin/diff: explicitly set hash algo when there is no repo builtin/bundle: abort "verify" early when there is no repository builtin/blame: don't access potentially unitialized `the_hash_algo` builtin/rev-parse: allow shortening to more than 40 hex characters remote-curl: fix parsing of detached SHA256 heads attr: fix BUG() when parsing attrs outside of repo attr: don't recompute default attribute source parse-options-cb: only abbreviate hashes when hash algo is known path: move `validate_headref()` to its only user path: harden validation of HEAD with non-standard hashes	2024-05-30 14:15:11 -07:00
Patrick Steinhardt	0c6bd2b81d	path: move `validate_headref()` to its only user While `validate_headref()` is only called from `is_git_directory()` in "setup.c", it is currently implemented in "path.c". Move it over such that it becomes clear that it is only really used during setup in order to discover repositories. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-06 22:50:48 -07:00
Patrick Steinhardt	a0851cece5	path: harden validation of HEAD with non-standard hashes The `validate_headref()` function takes a path to a supposed "HEAD" file and checks whether its format is something that we understand. It is used as part of our repository discovery to check whether a specific directory is a Git directory or not. Part of the validation is a check for a detached HEAD that contains a plain object ID. To do this validation we use `get_oid_hex()`, which relies on `the_hash_algo`. At this point in time the hash algo cannot yet be initialized though because we didn't yet read the Git config. Consequently, it will always be the SHA1 hash algorithm. In practice this works alright because `get_oid_hex()` only ends up checking whether the prefix of the buffer is a valid object ID. And because SHA1 is shorter than SHA256, the function will successfully parse SHA256 object IDs, as well. It is somewhat fragile though and not really the intent to only check for SHA1. With this in mind, harden the code to use `get_oid_hex_any()` to check whether the "HEAD" file parses as any known hash. One might be hard pressed to tighten the check even further and fully validate the file contents, not only the prefix. In practice though that wouldn't make a lot of sense as it could be that the repository uses a hash function that produces longer hashes than SHA256, but which the current version of Git doesn't understand yet. We'd still want to detect the repository as proper Git repository in that case, and we will fail eventually with a proper error message that the hash isn't understood when trying to set up the repository format. It follows that we could just leave the current code intact, as in practice the code change doesn't have any user visible impact. But it also prepares us for `the_hash_algo` being unset when there is no repository. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-05-06 22:50:48 -07:00
Johannes Schindelin	1c00f92eb5	Sync with 2.44.1 * maint-2.44: (41 commits) Git 2.44.1 Git 2.43.4 Git 2.42.2 Git 2.41.1 Git 2.40.2 Git 2.39.4 fsck: warn about symlink pointing inside a gitdir core.hooksPath: add some protection while cloning init.templateDir: consider this config setting protected clone: prevent hooks from running during a clone Add a helper function to compare file contents init: refactor the template directory discovery into its own function find_hook(): refactor the `STRIP_EXTENSION` logic clone: when symbolic links collide with directories, keep the latter entry: report more colliding paths t5510: verify that D/F confusion cannot lead to an RCE submodule: require the submodule path to contain directories only clone_submodule: avoid using `access()` on directories submodules: submodule paths must not contain symlinks clone: prevent clashing git dirs when cloning submodule in parallel ...	2024-04-29 20:42:30 +02:00
Johannes Schindelin	e5e6663e69	Sync with 2.43.4 * maint-2.43: (40 commits) Git 2.43.4 Git 2.42.2 Git 2.41.1 Git 2.40.2 Git 2.39.4 fsck: warn about symlink pointing inside a gitdir core.hooksPath: add some protection while cloning init.templateDir: consider this config setting protected clone: prevent hooks from running during a clone Add a helper function to compare file contents init: refactor the template directory discovery into its own function find_hook(): refactor the `STRIP_EXTENSION` logic clone: when symbolic links collide with directories, keep the latter entry: report more colliding paths t5510: verify that D/F confusion cannot lead to an RCE submodule: require the submodule path to contain directories only clone_submodule: avoid using `access()` on directories submodules: submodule paths must not contain symlinks clone: prevent clashing git dirs when cloning submodule in parallel t7423: add tests for symlinked submodule directories ...	2024-04-19 12:38:54 +02:00
Johannes Schindelin	be348e9815	Sync with 2.41.1 * maint-2.41: (38 commits) Git 2.41.1 Git 2.40.2 Git 2.39.4 fsck: warn about symlink pointing inside a gitdir core.hooksPath: add some protection while cloning init.templateDir: consider this config setting protected clone: prevent hooks from running during a clone Add a helper function to compare file contents init: refactor the template directory discovery into its own function find_hook(): refactor the `STRIP_EXTENSION` logic clone: when symbolic links collide with directories, keep the latter entry: report more colliding paths t5510: verify that D/F confusion cannot lead to an RCE submodule: require the submodule path to contain directories only clone_submodule: avoid using `access()` on directories submodules: submodule paths must not contain symlinks clone: prevent clashing git dirs when cloning submodule in parallel t7423: add tests for symlinked submodule directories has_dir_name(): do not get confused by characters < '/' docs: document security issues around untrusted .git dirs ...	2024-04-19 12:38:46 +02:00
Johannes Schindelin	f5b2af06f5	Sync with 2.40.2 * maint-2.40: (39 commits) Git 2.40.2 Git 2.39.4 fsck: warn about symlink pointing inside a gitdir core.hooksPath: add some protection while cloning init.templateDir: consider this config setting protected clone: prevent hooks from running during a clone Add a helper function to compare file contents init: refactor the template directory discovery into its own function find_hook(): refactor the `STRIP_EXTENSION` logic clone: when symbolic links collide with directories, keep the latter entry: report more colliding paths t5510: verify that D/F confusion cannot lead to an RCE submodule: require the submodule path to contain directories only clone_submodule: avoid using `access()` on directories submodules: submodule paths must not contain symlinks clone: prevent clashing git dirs when cloning submodule in parallel t7423: add tests for symlinked submodule directories has_dir_name(): do not get confused by characters < '/' docs: document security issues around untrusted .git dirs upload-pack: disable lazy-fetching by default ...	2024-04-19 12:38:42 +02:00
Johannes Schindelin	f4aa8c8bb1	fetch/clone: detect dubious ownership of local repositories When cloning from somebody else's repositories, it is possible that, say, the `upload-pack` command is overridden in the repository that is about to be cloned, which would then be run in the user's context who started the clone. To remind the user that this is a potentially unsafe operation, let's extend the ownership checks we have already established for regular gitdir discovery to extend also to local repositories that are about to be cloned. This protection extends also to file:// URLs. The fixes in this commit address CVE-2024-32004. Note: This commit does not touch the `fetch`/`clone` code directly, but instead the function used implicitly by both: `enter_repo()`. This function is also used by `git receive-pack` (i.e. pushes), by `git upload-archive`, by `git daemon` and by `git http-backend`. In setups that want to serve repositories owned by different users than the account running the service, this will require `safe.*` settings to be configured accordingly. Also note: there are tiny time windows where a time-of-check-time-of-use ("TOCTOU") race is possible. The real solution to those would be to work with `fstat()` and `openat()`. However, the latter function is not available on Windows (and would have to be emulated with rather expensive low-level `NtCreateFile()` calls), and the changes would be quite extensive, for my taste too extensive for the little gain given that embargoed releases need to pay extra attention to avoid introducing inadvertent bugs. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2024-04-17 22:29:54 +02:00
Junio C Hamano	c7a9ec4728	Merge branch 'rs/apply-lift-path-length-limit' "git apply" has been updated to lift the hardcoded pathname length limit, which in turn allowed a mksnpath() function that is no longer used. * rs/apply-lift-path-length-limit: path: remove mksnpath() apply: avoid fixed-size buffer in create_one_file()	2024-04-15 14:11:42 -07:00
René Scharfe	708f7e0590	path: remove mksnpath() Remove the function mksnpath(), which has become unused. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-04-05 09:49:38 -07:00
Patrick Steinhardt	57db2a094d	refs: introduce reftable backend Due to scalability issues, Shawn Pearce has originally proposed a new "reftable" format more than six years ago [1]. Initially, this new format was implemented in JGit with promising results. Around two years ago, we have then added the "reftable" library to the Git codebase via `a4bbd13be3` (Merge branch 'hn/reftable', 2021-12-15). With this we have landed all the low-level code to read and write reftables. Notably missing though was the integration of this low-level code into the Git code base in the form of a new ref backend that ties all of this together. This gap is now finally closed by introducing a new "reftable" backend into the Git codebase. This new backend promises to bring some notable improvements to Git repositories: - It becomes possible to do truly atomic writes where either all refs are committed to disk or none are. This was not possible with the "files" backend because ref updates were split across multiple loose files. - The disk space required to store many refs is reduced, both compared to loose refs and packed-refs. This is enabled both by the reftable format being a binary format, which is more compact, and by prefix compression. - We can ignore filesystem-specific behaviour as ref names are not encoded via paths anymore. This means there is no need to handle case sensitivity on Windows systems or Unicode precomposition on macOS. - There is no need to rewrite the complete refdb anymore every time a ref is being deleted like it was the case for packed-refs. This means that ref deletions are now constant time instead of scaling linearly with the number of refs. - We can ignore file/directory conflicts so that it becomes possible to store both "refs/heads/foo" and "refs/heads/foo/bar". - Due to this property we can retain reflogs for deleted refs. We have previously been deleting reflogs together with their refs to avoid file/directory conflicts, which is not necessary anymore. - We can properly enumerate all refs. With the "files" backend it is not easily possible to distinguish between refs and non-refs because they may live side by side in the gitdir. Not all of these improvements are realized with the current "reftable" backend implementation. At this point, the new backend is supposed to be a drop-in replacement for the "files" backend that is used by basically all Git repositories nowadays. It strives for 1:1 compatibility, which means that a user can expect the same behaviour regardless of whether they use the "reftable" backend or the "files" backend for most of the part. Most notably, this means we artificially limit the capabilities of the "reftable" backend to match the limits of the "files" backend. It is not possible to create refs that would end up with file/directory conflicts, we do not retain reflogs, we perform stricter-than-necessary checks. This is done intentionally due to two main reasons: - It makes it significantly easier to land the "reftable" backend as tests behave the same. It would be tough to argue for each and every single test that doesn't pass with the "reftable" backend. - It ensures compatibility between repositories that use the "files" backend and repositories that use the "reftable" backend. Like this, hosters can migrate their repositories to use the "reftable" backend without causing issues for clients that use the "files" backend in their clones. It is expected that these artificial limitations may eventually go away in the long term. Performance-wise things very much depend on the actual workload. The following benchmarks compare the "files" and "reftable" backends in the current version: - Creating N refs in separate transactions shows that the "files" backend is ~50% faster. This is not surprising given that creating a ref only requires us to create a single loose ref. The "reftable" backend will also perform auto compaction on updates. In real-world workloads we would likely also want to perform pack loose refs, which would likely change the picture. Benchmark 1: update-ref: create refs sequentially (refformat = files, refcount = 1) Time (mean ± σ): 2.1 ms ± 0.3 ms [User: 0.6 ms, System: 1.7 ms] Range (min … max): 1.8 ms … 4.3 ms 133 runs Benchmark 2: update-ref: create refs sequentially (refformat = reftable, refcount = 1) Time (mean ± σ): 2.7 ms ± 0.1 ms [User: 0.6 ms, System: 2.2 ms] Range (min … max): 2.4 ms … 2.9 ms 132 runs Benchmark 3: update-ref: create refs sequentially (refformat = files, refcount = 1000) Time (mean ± σ): 1.975 s ± 0.006 s [User: 0.437 s, System: 1.535 s] Range (min … max): 1.969 s … 1.980 s 3 runs Benchmark 4: update-ref: create refs sequentially (refformat = reftable, refcount = 1000) Time (mean ± σ): 2.611 s ± 0.013 s [User: 0.782 s, System: 1.825 s] Range (min … max): 2.597 s … 2.622 s 3 runs Benchmark 5: update-ref: create refs sequentially (refformat = files, refcount = 100000) Time (mean ± σ): 198.442 s ± 0.241 s [User: 43.051 s, System: 155.250 s] Range (min … max): 198.189 s … 198.670 s 3 runs Benchmark 6: update-ref: create refs sequentially (refformat = reftable, refcount = 100000) Time (mean ± σ): 294.509 s ± 4.269 s [User: 104.046 s, System: 190.326 s] Range (min … max): 290.223 s … 298.761 s 3 runs - Creating N refs in a single transaction shows that the "files" backend is significantly slower once we start to write many refs. The "reftable" backend only needs to update two files, whereas the "files" backend needs to write one file per ref. Benchmark 1: update-ref: create many refs (refformat = files, refcount = 1) Time (mean ± σ): 1.9 ms ± 0.1 ms [User: 0.4 ms, System: 1.4 ms] Range (min … max): 1.8 ms … 2.6 ms 151 runs Benchmark 2: update-ref: create many refs (refformat = reftable, refcount = 1) Time (mean ± σ): 2.5 ms ± 0.1 ms [User: 0.7 ms, System: 1.7 ms] Range (min … max): 2.4 ms … 3.4 ms 148 runs Benchmark 3: update-ref: create many refs (refformat = files, refcount = 1000) Time (mean ± σ): 152.5 ms ± 5.2 ms [User: 19.1 ms, System: 133.1 ms] Range (min … max): 148.5 ms … 167.8 ms 15 runs Benchmark 4: update-ref: create many refs (refformat = reftable, refcount = 1000) Time (mean ± σ): 58.0 ms ± 2.5 ms [User: 28.4 ms, System: 29.4 ms] Range (min … max): 56.3 ms … 72.9 ms 40 runs Benchmark 5: update-ref: create many refs (refformat = files, refcount = 1000000) Time (mean ± σ): 152.752 s ± 0.710 s [User: 20.315 s, System: 131.310 s] Range (min … max): 152.165 s … 153.542 s 3 runs Benchmark 6: update-ref: create many refs (refformat = reftable, refcount = 1000000) Time (mean ± σ): 51.912 s ± 0.127 s [User: 26.483 s, System: 25.424 s] Range (min … max): 51.769 s … 52.012 s 3 runs - Deleting a ref in a fully-packed repository shows that the "files" backend scales with the number of refs. The "reftable" backend has constant-time deletions. Benchmark 1: update-ref: delete ref (refformat = files, refcount = 1) Time (mean ± σ): 1.7 ms ± 0.1 ms [User: 0.4 ms, System: 1.2 ms] Range (min … max): 1.6 ms … 2.1 ms 316 runs Benchmark 2: update-ref: delete ref (refformat = reftable, refcount = 1) Time (mean ± σ): 1.8 ms ± 0.1 ms [User: 0.4 ms, System: 1.3 ms] Range (min … max): 1.7 ms … 2.1 ms 294 runs Benchmark 3: update-ref: delete ref (refformat = files, refcount = 1000) Time (mean ± σ): 2.0 ms ± 0.1 ms [User: 0.5 ms, System: 1.4 ms] Range (min … max): 1.9 ms … 2.5 ms 287 runs Benchmark 4: update-ref: delete ref (refformat = reftable, refcount = 1000) Time (mean ± σ): 1.9 ms ± 0.1 ms [User: 0.5 ms, System: 1.3 ms] Range (min … max): 1.8 ms … 2.1 ms 217 runs Benchmark 5: update-ref: delete ref (refformat = files, refcount = 1000000) Time (mean ± σ): 229.8 ms ± 7.9 ms [User: 182.6 ms, System: 46.8 ms] Range (min … max): 224.6 ms … 245.2 ms 6 runs Benchmark 6: update-ref: delete ref (refformat = reftable, refcount = 1000000) Time (mean ± σ): 2.0 ms ± 0.0 ms [User: 0.6 ms, System: 1.3 ms] Range (min … max): 2.0 ms … 2.1 ms 3 runs - Listing all refs shows no significant advantage for either of the backends. The "files" backend is a bit faster, but not by a significant margin. When repositories are not packed the "reftable" backend outperforms the "files" backend because the "reftable" backend performs auto-compaction. Benchmark 1: show-ref: print all refs (refformat = files, refcount = 1, packed = true) Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms] Range (min … max): 1.5 ms … 2.0 ms 1729 runs Benchmark 2: show-ref: print all refs (refformat = reftable, refcount = 1, packed = true) Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms] Range (min … max): 1.5 ms … 1.8 ms 1816 runs Benchmark 3: show-ref: print all refs (refformat = files, refcount = 1000, packed = true) Time (mean ± σ): 4.3 ms ± 0.1 ms [User: 0.9 ms, System: 3.3 ms] Range (min … max): 4.1 ms … 4.6 ms 645 runs Benchmark 4: show-ref: print all refs (refformat = reftable, refcount = 1000, packed = true) Time (mean ± σ): 4.5 ms ± 0.2 ms [User: 1.0 ms, System: 3.3 ms] Range (min … max): 4.2 ms … 5.9 ms 643 runs Benchmark 5: show-ref: print all refs (refformat = files, refcount = 1000000, packed = true) Time (mean ± σ): 2.537 s ± 0.034 s [User: 0.488 s, System: 2.048 s] Range (min … max): 2.511 s … 2.627 s 10 runs Benchmark 6: show-ref: print all refs (refformat = reftable, refcount = 1000000, packed = true) Time (mean ± σ): 2.712 s ± 0.017 s [User: 0.653 s, System: 2.059 s] Range (min … max): 2.692 s … 2.752 s 10 runs Benchmark 7: show-ref: print all refs (refformat = files, refcount = 1, packed = false) Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms] Range (min … max): 1.5 ms … 1.9 ms 1834 runs Benchmark 8: show-ref: print all refs (refformat = reftable, refcount = 1, packed = false) Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms] Range (min … max): 1.4 ms … 2.0 ms 1840 runs Benchmark 9: show-ref: print all refs (refformat = files, refcount = 1000, packed = false) Time (mean ± σ): 13.8 ms ± 0.2 ms [User: 2.8 ms, System: 10.8 ms] Range (min … max): 13.3 ms … 14.5 ms 208 runs Benchmark 10: show-ref: print all refs (refformat = reftable, refcount = 1000, packed = false) Time (mean ± σ): 4.5 ms ± 0.2 ms [User: 1.2 ms, System: 3.3 ms] Range (min … max): 4.3 ms … 6.2 ms 624 runs Benchmark 11: show-ref: print all refs (refformat = files, refcount = 1000000, packed = false) Time (mean ± σ): 12.127 s ± 0.129 s [User: 2.675 s, System: 9.451 s] Range (min … max): 11.965 s … 12.370 s 10 runs Benchmark 12: show-ref: print all refs (refformat = reftable, refcount = 1000000, packed = false) Time (mean ± σ): 2.799 s ± 0.022 s [User: 0.735 s, System: 2.063 s] Range (min … max): 2.769 s … 2.836 s 10 runs - Printing a single ref shows no real difference between the "files" and "reftable" backends. Benchmark 1: show-ref: print single ref (refformat = files, refcount = 1) Time (mean ± σ): 1.5 ms ± 0.1 ms [User: 0.4 ms, System: 1.0 ms] Range (min … max): 1.4 ms … 1.8 ms 1779 runs Benchmark 2: show-ref: print single ref (refformat = reftable, refcount = 1) Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms] Range (min … max): 1.4 ms … 2.5 ms 1753 runs Benchmark 3: show-ref: print single ref (refformat = files, refcount = 1000) Time (mean ± σ): 1.5 ms ± 0.1 ms [User: 0.3 ms, System: 1.1 ms] Range (min … max): 1.4 ms … 1.9 ms 1840 runs Benchmark 4: show-ref: print single ref (refformat = reftable, refcount = 1000) Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms] Range (min … max): 1.5 ms … 2.0 ms 1831 runs Benchmark 5: show-ref: print single ref (refformat = files, refcount = 1000000) Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms] Range (min … max): 1.5 ms … 2.1 ms 1848 runs Benchmark 6: show-ref: print single ref (refformat = reftable, refcount = 1000000) Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms] Range (min … max): 1.5 ms … 2.1 ms 1762 runs So overall, performance depends on the usecases. Except for many sequential writes the "reftable" backend is roughly on par or significantly faster than the "files" backend though. Given that the "files" backend has received 18 years of optimizations by now this can be seen as a win. Furthermore, we can expect that the "reftable" backend will grow faster over time when attention turns more towards optimizations. The complete test suite passes, except for those tests explicitly marked to require the REFFILES prerequisite. Some tests in t0610 are marked as failing because they depend on still-in-flight bug fixes. Tests can be run with the new backend by setting the GIT_TEST_DEFAULT_REF_FORMAT environment variable to "reftable". There is a single known conceptual incompatibility with the dumb HTTP transport. As "info/refs" SHOULD NOT contain the HEAD reference, and because the "HEAD" file is not valid anymore, it is impossible for the remote client to figure out the default branch without changing the protocol. This shortcoming needs to be handled in a subsequent patch series. As the reftable library has already been introduced a while ago, this commit message will not go into the details of how exactly the on-disk format works. Please refer to our preexisting technical documentation at Documentation/technical/reftable for this. [1]: https://public-inbox.org/git/CAJo=hJtyof=HRy=2sLP0ng0uZ4=S-DpZ5dR1aF+VHVETKG20OQ@mail.gmail.com/ Original-idea-by: Shawn Pearce <spearce@spearce.org> Based-on-patch-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-02-07 08:28:37 -08:00
Patrick Steinhardt	3f921c7591	refs: convert MERGE_AUTOSTASH to become a normal pseudo-ref Similar to the preceding conversion of the AUTO_MERGE pseudo-ref, let's convert the MERGE_AUTOSTASH ref to become a normal pseudo-ref as well. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-01-19 11:10:41 -08:00
Patrick Steinhardt	fd7c6ffa9e	refs: convert AUTO_MERGE to become a normal pseudo-ref In `70c70de616` (refs: complete list of special refs, 2023-12-14) we have inrtoduced a new `is_special_ref()` function that classifies some refs as being special. The rule is that special refs are exclusively read and written via the filesystem directly, whereas normal refs exclucsively go via the refs API. The intent of that commit was to record the status quo so that we know to route reads of such special refs consistently. Eventually, the list should be reduced to its bare minimum of refs which really are special, namely FETCH_HEAD and MERGE_HEAD. Follow up on this promise and convert the AUTO_MERGE ref to become a normal pseudo-ref by using the refs API to both read and write it instead of accessing the filesystem directly. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-01-19 11:10:41 -08:00
Junio C Hamano	ce481ac8b3	Merge branch 'cw/compat-util-header-cleanup' Further shuffling of declarations across header files to streamline file dependencies. * cw/compat-util-header-cleanup: git-compat-util: move alloc macros to git-compat-util.h treewide: remove unnecessary includes for wrapper.h kwset: move translation table from ctype sane-ctype.h: create header for sane-ctype macros git-compat-util: move wrapper.c funcs to its header git-compat-util: move strbuf.c funcs to its header	2023-07-17 11:30:42 -07:00
Junio C Hamano	67e7305e64	Merge branch 'cw/strbuf-cleanup' Move functions that are not about pure string manipulation out of strbuf.[ch] * cw/strbuf-cleanup: strbuf: remove global variable path: move related function to path object-name: move related functions to object-name credential-store: move related functions to credential-store file abspath: move related functions to abspath strbuf: clarify dependency strbuf: clarify API boundary	2023-07-06 11:54:46 -07:00
Calvin Wan	da9502ff4d	treewide: remove unnecessary includes for wrapper.h Signed-off-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-07-05 11:41:59 -07:00
Elijah Newren	a034e9106f	object-store-ll.h: split this header out of object-store.h The vast majority of files including object-store.h did not need dir.h nor khash.h. Split the header into two files, and let most just depend upon object-store-ll.h, while letting the two callers that need it depend on the full object-store.h. After this patch: $ git grep -h include..object-store \| sort \| uniq -c 2 #include "object-store.h" 129 #include "object-store-ll.h" Diff best viewed with `--color-moved`. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-21 13:39:54 -07:00
Calvin Wan	aba0706832	path: move related function to path Move path-related function from strbuf.[ch] to path.[ch] so that strbuf is focused on string manipulation routines with minimal dependencies. repository.h is no longer a necessary dependency after moving this function out. Signed-off-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-12 13:49:36 -07:00
Elijah Newren	65156bb7ec	treewide: remove double forward declaration of read_in_full cache.h's nature of a dumping ground of includes prevented it from being included in some compat/ files, forcing us into a workaround of having a double forward declaration of the read_in_full() function (see commit `14086b0a13` ("compat/pread.c: Add a forward declaration to fix a warning", 2007-11-17)). Now that we have moved functions like read_in_full() from cache.h to wrapper.h, and wrapper.h isn't littered with unrelated and scary #defines, get rid of the extra forward declaration and just have compat/pread.c include wrapper.h. Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-11 08:52:11 -07:00
Elijah Newren	61a7b98264	treewide: remove cache.h inclusion due to setup.h changes By moving several declarations to setup.h, the previous patch made it possible to remove the include of cache.h in several source files. Do so. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-21 10:56:54 -07:00
Elijah Newren	e38da487cc	setup.h: move declarations for setup.c functions from cache.h Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-21 10:56:54 -07:00
Elijah Newren	32a8f51061	environment.h: move declarations for environment.c functions from cache.h Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-21 10:56:53 -07:00
Elijah Newren	0b027f6ca7	abspath.h: move absolute path functions from cache.h This is another step towards letting us remove the include of cache.h in strbuf.c. It does mean that we also need to add includes of abspath.h in a number of C files. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-21 10:56:52 -07:00
Elijah Newren	f394e093df	treewide: be explicit about dependence on gettext.h Dozens of files made use of gettext functions, without explicitly including gettext.h. This made it more difficult to find which files could remove a dependence on cache.h. Make C files explicitly include gettext.h if they are using it. However, while compat/fsmonitor/fsm-ipc-darwin.c should also gain an include of gettext.h, it was left out to avoid conflicting with an in-flight topic. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-21 10:56:51 -07:00
Junio C Hamano	d0732a8120	Merge branch 'jk/unused-post-2.39-part2' More work towards -Wunused. * jk/unused-post-2.39-part2: (21 commits) help: mark unused parameter in git_unknown_cmd_config() run_processes_parallel: mark unused callback parameters userformat_want_item(): mark unused parameter for_each_commit_graft(): mark unused callback parameter rewrite_parents(): mark unused callback parameter fetch-pack: mark unused parameter in callback function notes: mark unused callback parameters prio-queue: mark unused parameters in comparison functions for_each_object: mark unused callback parameters list-objects: mark unused callback parameters mark unused parameters in signal handlers run-command: mark error routine parameters as unused mark "pointless" data pointers in callbacks ref-filter: mark unused callback parameters http-backend: mark unused parameters in virtual functions http-backend: mark argc/argv unused object-name: mark unused parameters in disambiguate callbacks serve: mark unused parameters in virtual functions serve: use repository pointer to get config ls-refs: drop config caching ...	2023-03-17 14:03:09 -07:00
Jeff King	d3dcfa047f	mark "pointless" data pointers in callbacks Both the object_array_filter() and trie_find() functions use callback functions that let the caller specify which elements match. These callbacks take a void pointer in case the caller wants to pass in extra data. But in each case, the single user of these functions just passes NULL, and the callback ignores the extra pointer. We could just remove these unused parameters from the callback interface entirely. But it's good practice to provide such a pointer, as it guides future callers of the function in the right direction (rather than tempting them to access global data). Plus it's consistent with other generic callback interfaces. So let's instead annotate the unused parameters, in order to silence the compiler's -Wunused-parameter warning. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-02-24 09:13:30 -08:00
Elijah Newren	41771fa435	cache.h: remove dependence on hex.h; make other files include it explicitly Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-02-23 17:25:29 -08:00
Junio C Hamano	671bbf7b9d	adjust_shared_perm(): leave g+s alone when the group does not matter Julien Moutinho reports that in an environment where directory does not have BSD group semantics and requires the g+s to be set (aka FORCE_DIR_SET_GID), but the system forbids chmod() to touch the g+s bit, adjust_shared_perm() fails even when the repository is for private use with perm = 0600, because we unconditionally try to set the g+s bit. When we grant extra access based on group membership (i.e. the directory has either g+r or g+w bit set), which group the directory and its contents are owned by matters. But otherwise (e.g. perm is set to 0600, in Julien's case), flipping g+s bit is not necessary. Reported-by: Julien Moutinho <julm+git@sourcephile.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-10-28 14:55:27 -07:00
Junio C Hamano	b3b2ddced2	Merge branch 'ds/bundle-uri' Preliminary code refactoring around transport and bundle code. * ds/bundle-uri: bundle.h: make "fd" version of read_bundle_header() public remote: allow relative_url() to return an absolute url remote: move relative_url() http: make http_get_file() external fetch-pack: move --keep=* option filling to a function fetch-pack: add a deref_without_lazy_fetch_extended() dir API: add a generalized path_match_flags() function connect.c: refactor sending of agent & object-format	2022-06-03 14:30:34 -07:00
Ævar Arnfjörð Bjarmason	9fd512c8d6	dir API: add a generalized path_match_flags() function Add a path_match_flags() function and have the two sets of starts_with_dot_{,dot_}slash() functions added in `63e95beb08` (submodule: port resolve_relative_url from shell to C, 2016-04-15) and `a2b26ffb1a` (fsck: convert gitmodules url to URL passed to curl, 2020-04-18) be thin wrappers for it. As the latter of those notes the fsck version was copied from the initial builtin/submodule--helper.c version. Since the code added in `a2b26ffb1a` was doing really doing the same as win32_is_dir_sep() added in `1cadad6f65` (git clone <url> C:\cygwin\home\USER\repo' is working (again), 2018-12-15) let's move the latter to git-compat-util.h is a is_xplatform_dir_sep(). We can then call either it or the platform-specific is_dir_sep() from this new function. Let's likewise change code in various other places that was hardcoding checks for "'/' \|\| '\\'" with the new is_xplatform_dir_sep(). As can be seen in those callers some of them still concern themselves with ':' (Mac OS classic?), but let's leave the question of whether that should be consolidated for some other time. As we expect to make wider use of the "native" case in the future, define and use two starts_with_dot_{,dot_}slash_native() convenience wrappers. This makes the diff in builtin/submodule--helper.c much smaller. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-16 15:02:09 -07:00
Junio C Hamano	afe8a9070b	tree-wide: apply equals-null.cocci Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-02 09:50:37 -07:00
Johannes Schindelin	93fbff09eb	Sync with 2.33.2 * maint-2.33: Git 2.33.2 Git 2.32.1 Git 2.31.2 GIT-VERSION-GEN: bump to v2.33.1 Git 2.30.3 setup_git_directory(): add an owner check for the top-level directory Add a function to determine whether a path is owned by the current user	2022-03-24 00:31:36 +01:00
Johannes Schindelin	201b0c7af6	Sync with 2.31.2 * maint-2.31: Git 2.31.2 Git 2.30.3 setup_git_directory(): add an owner check for the top-level directory Add a function to determine whether a path is owned by the current user	2022-03-24 00:31:28 +01:00
Johannes Schindelin	fdcad5a53e	Fix `GIT_CEILING_DIRECTORIES` with `C:\` and the likes When determining the length of the longest ancestor of a given path with respect to to e.g. `GIT_CEILING_DIRECTORIES`, we special-case the root directory by returning 0 (i.e. we pretend that the path `/` does not end in a slash by virtually stripping it). That is the correct behavior because when normalizing paths, the root directory is special: all other directory paths have their trailing slash stripped, but not the root directory's path (because it would become the empty string, which is not a legal path). However, this special-casing of the root directory in `longest_ancestor_length()` completely forgets about Windows-style root directories, e.g. `C:\`. These _also_ get normalized with a trailing slash (because `C:` would actually refer to the current directory on that drive, not necessarily to its root directory). In `fc56c7b34b` (mingw: accomodate t0060-path-utils for MSYS2, 2016-01-27), we almost got it right. We noticed that `longest_ancestor_length()` expects a slash _after_ the matched prefix, and if the prefix already ends in a slash, the normalized path won't ever match and -1 is returned. But then that commit went astray: The correct fix is not to adjust the _tests_ to expect an incorrect -1 when that function is fed a prefix that ends in a slash, but instead to treat such a prefix as if the trailing slash had been removed. Likewise, that function needs to handle the case where it is fed a path that ends in a slash (not only a prefix that ends in a slash): if it matches the prefix (plus trailing slash), we still need to verify that the path does not end there, otherwise the prefix is not actually an ancestor of the path but identical to it (and we need to return -1 in that case). With these two adjustments, we no longer need to play games in t0060 where we only add `$rootoff` if the passed prefix is different from the MSYS2 pseudo root, instead we also add it for the MSYS2 pseudo root itself. We do have to be careful to skip that logic entirely for Windows paths, though, because they do are not subject to that MSYS2 pseudo root treatment. This patch fixes the scenario where a user has set `GIT_CEILING_DIRECTORIES=C:\`, which would be ignored otherwise. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>	2022-03-24 00:21:08 +01:00
Junio C Hamano	ed8794ef7a	Merge branch 'lh/systemd-timers' "git maintenance" scheduler learned to use systemd timers as a possible backend. * lh/systemd-timers: maintenance: add support for systemd timers on Linux maintenance: `git maintenance run` learned `--scheduler=<scheduler>` cache.h: Introduce a generic "xdg_config_home_for(…)" function	2021-09-20 15:20:40 -07:00
Lénaïc Huard	cb7db5bbd5	cache.h: Introduce a generic "xdg_config_home_for(…)" function Current implementation of `xdg_config_home(filename)` returns `$XDG_CONFIG_HOME/git/$filename`, with the `git` subdirectory inserted between the `XDG_CONFIG_HOME` environment variable and the parameter. This patch introduces a `xdg_config_home_for(subdir, filename)` function which is more generic. It only concatenates "$XDG_CONFIG_HOME", or "$HOME/.config" if the former isn’t defined, with the parameters, without adding `git` in between. `xdg_config_home(filename)` is now implemented by calling `xdg_config_home_for("git", filename)` but this new generic function can be used to compute the configuration directory of other programs. Signed-off-by: Lénaïc Huard <lenaic@lhuard.fr> Acked-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-07 10:57:04 -07:00
Johannes Schindelin	e394a16023	interpolate_path(): allow specifying paths relative to the runtime prefix Ever since Git learned to detect its install location at runtime, there was the slightly awkward problem that it was impossible to specify paths relative to said location. For example, if a version of Git was shipped with custom SSL certificates to use, there was no portable way to specify `http.sslCAInfo`. In Git for Windows, the problem was "solved" for years by interpreting paths starting with a slash as relative to the runtime prefix. However, this is not correct: such paths _are_ legal on Windows, and they are interpreted as absolute paths in the same drive as the current directory. After a lengthy discussion, and an even lengthier time to mull over the problem and its best solution, and then more discussions, we eventually decided to introduce support for the magic sequence `%(prefix)/`. If a path starts with this, the remainder is interpreted as relative to the detected (runtime) prefix. If built without runtime prefix support, Git will simply interpolate the compiled-in prefix. If a user _wants_ to specify a path starting with the magic sequence, they can prefix the magic sequence with `./` and voilà, the path won't be expanded. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-26 12:17:16 -07:00
Johannes Schindelin	a03b097d63	Use a better name for the function interpolating paths It is not immediately clear what `expand_user_path()` means, so let's rename it to `interpolate_path()`. This also opens the path for interpolating more than just a home directory. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-26 12:17:16 -07:00
Johannes Schindelin	644e6b2c0f	expand_user_path(): clarify the role of the `real_home` parameter The `real_home` parameter only has an effect when expanding paths starting with `~/`, not when expanding paths starting with `~<user>/`. Let's make that clear. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-26 12:17:16 -07:00
Johannes Schindelin	789f6f226b	expand_user_path(): remove stale part of the comment In `395de250d9` (Expand ~ and ~user in core.excludesfile, commit.template, 2009-11-17), the `user_path()` function was refactored into the `expand_user_path()`. During that refactoring, the `buf` parameter was lost, but the code comment above said function still talks about it. Let's remove that stale part of the comment. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-26 12:17:16 -07:00
Jeff King	801ed010bf	t0060: test ntfs/hfs-obscured dotfiles We have tests that cover various filesystem-specific spellings of ".gitmodules", because we need to reliably identify that path for some security checks. These are from `dc2d9ba318` (is_{hfs,ntfs}_dotgitmodules: add tests, 2018-05-12), with the actual code coming from `e7cb0b4455` (is_ntfs_dotgit: match other .git files, 2018-05-11) and `0fc333ba20` (is_hfs_dotgit: match other .git files, 2018-05-02). Those latter two commits also added similar matching functions for .gitattributes and .gitignore. These ended up not being used in the final series, and are currently dead code. But in preparation for them being used in some fsck checks, let's make sure they actually work by throwing a few basic tests at them. Likewise, let's cover .mailmap (which does need matching code added). I didn't bother with the whole battery of tests that we cover for .gitmodules. These functions are all based on the same generic matcher, so it's sufficient to test most of the corner cases just once. Note that the ntfs magic prefix names in the tests come from the algorithm described in `e7cb0b4455` (and are different for each file). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-04 11:52:02 +09:00
Elijah Newren	5291828df8	merge-ort: write $GIT_DIR/AUTO_MERGE whenever we hit a conflict There are a variety of questions users might ask while resolving conflicts: * What changes have been made since the previous (first) parent? * What changes are staged? * What is still unstaged? (or what is still conflicted?) * What changes did I make to resolve conflicts so far? The first three of these have simple answers: * git diff HEAD * git diff --cached * git diff There was no way to answer the final question previously. Adding one is trivial in merge-ort, since it works by creating a tree representing what should be written to the working copy complete with conflict markers. Simply write that tree to .git/AUTO_MERGE, allowing users to answer the fourth question with * git diff AUTO_MERGE I avoided using a name like "MERGE_AUTO", because that would be merge-specific (much like MERGE_HEAD, REBASE_HEAD, REVERT_HEAD, CHERRY_PICK_HEAD) and I wanted a name that didn't change depending on which type of operation the merge was part of. Ensure that paths which clean out other temporary operation-specific files (e.g. CHERRY_PICK_HEAD, MERGE_MSG, rebase-merge/ state directory) also clean out this AUTO_MERGE file. Signed-off-by: Elijah Newren <newren@gmail.com> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-20 12:35:40 -07:00
Han-Wen Nienhuys	b8825ef233	sequencer: treat REVERT_HEAD as a pseudo ref Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-21 11:20:11 -07:00
Han-Wen Nienhuys	c8e4159efd	sequencer: treat CHERRY_PICK_HEAD as a pseudo ref Check for existence and delete CHERRY_PICK_HEAD through ref functions. This will help cherry-pick work with alternate ref storage backends. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-21 11:20:10 -07:00

1 2 3 4 5 ...

312 Commits