Commit Graph

20373 Commits

Author SHA1 Message Date
Phillip Wood
7aed2c0565 rebase -i: match whole word in is_command()
When matching an unabbreviated command is_command() only does a prefix
match which means it parses "pickled" as TODO_PICK. parse_insn_line()
does error out because is_command() only advances as far as the end of
"pick" so it looks like the command name is not followed by a space but
the error message is "missing arguments for pick" rather than telling
the user that the "pickled" is not a valid command.

Fix this by ensuring the match is follow by whitespace or the end of the
string as we already do for abbreviated commands. The (*bol = p) at the
end of the condition is a bit cute for my taste but I decided to leave
it be for now. Rather than add new tests the existing tests for bad
commands are adapted to use a bad command name that triggers the prefix
matching bug.

Signed-off-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 14:25:48 -08:00
Junio C Hamano
2f80d1b42e Merge branch 'rj/branch-copy-and-rename' into maint-2.39
Fix a pair of bugs in 'git branch'.

* rj/branch-copy-and-rename:
  branch: force-copy a branch to itself via @{-1} is a no-op
2023-02-14 14:15:55 -08:00
Junio C Hamano
8ca2b1f248 Merge branch 'rs/t3920-crlf-eating-grep-fix' into maint-2.39
Test fix.

* rs/t3920-crlf-eating-grep-fix:
  t3920: support CR-eating grep
2023-02-14 14:15:54 -08:00
Junio C Hamano
763ae829a3 Merge branch 'js/t3920-shell-and-or-fix' into maint-2.39
Test fix.

* js/t3920-shell-and-or-fix:
  t3920: don't ignore errors of more than one command with `|| true`
2023-02-14 14:15:54 -08:00
Junio C Hamano
81b216e4f7 Merge branch 'ab/t4023-avoid-losing-exit-status-of-diff' into maint-2.39
Test fix.

* ab/t4023-avoid-losing-exit-status-of-diff:
  t4023: fix ignored exit codes of git
2023-02-14 14:15:54 -08:00
Junio C Hamano
54941a5316 Merge branch 'ab/t7600-avoid-losing-exit-status-of-git' into maint-2.39
Test fix.

* ab/t7600-avoid-losing-exit-status-of-git:
  t7600: don't ignore "rev-parse" exit code in helper
2023-02-14 14:15:54 -08:00
Junio C Hamano
2509d0198c Merge branch 'ab/t5314-avoid-losing-exit-status' into maint-2.39
Test fix.

* ab/t5314-avoid-losing-exit-status:
  t5314: check exit code of "git"
2023-02-14 14:15:53 -08:00
Junio C Hamano
db2a91ba36 Merge branch 'rs/t4205-do-not-exit-in-test-script' into maint-2.39
Test fix.

* rs/t4205-do-not-exit-in-test-script:
  t4205: don't exit test script on failure
2023-02-14 14:15:53 -08:00
Junio C Hamano
1f071460d3 Merge branch 'rs/ls-tree-path-expansion-fix' into maint-2.39
"git ls-tree --format='%(path) %(path)' $tree $path" showed the
path three times, which has been corrected.

* rs/ls-tree-path-expansion-fix:
  ls-tree: remove dead store and strbuf for quote_c_style()
  ls-tree: fix expansion of repeated %(path)
2023-02-14 14:15:52 -08:00
Junio C Hamano
4950677b48 Merge branch 'ws/single-file-cone' into maint-2.39
The logic to see if we are using the "cone" mode by checking the
sparsity patterns has been tightened to avoid mistaking a pattern
that names a single file as specifying a cone.

* ws/single-file-cone:
  dir: check for single file cone patterns
2023-02-14 14:15:51 -08:00
Junio C Hamano
f8382a6396 Merge branch 'jk/ext-diff-with-relative' into maint-2.39
"git diff --relative" did not mix well with "git diff --ext-diff",
which has been corrected.

* jk/ext-diff-with-relative:
  diff: drop "name" parameter from prepare_temp_file()
  diff: clean up external-diff argv setup
  diff: use filespec path to set up tempfiles for ext-diff
2023-02-14 14:15:51 -08:00
Junio C Hamano
7cbfd0e572 Merge branch 'ab/bundle-wo-args' into maint-2.39
Fix to a small regression in 2.38 days.

* ab/bundle-wo-args:
  bundle <cmd>: have usage_msg_opt() note the missing "<file>"
  builtin/bundle.c: remove superfluous "newargc" variable
  bundle: don't segfault on "git bundle <subcmd>"
2023-02-14 14:15:50 -08:00
Junio C Hamano
725f293355 Merge branch 'lk/line-range-parsing-fix' into maint-2.39
When given a pattern that matches an empty string at the end of a
line, the code to parse the "git diff" line-ranges fell into an
infinite loop, which has been corrected.

* lk/line-range-parsing-fix:
  line-range: fix infinite loop bug with '$' regex
2023-02-14 14:15:49 -08:00
Johannes Schindelin
3aef76ffd4 Sync with 2.38.4
* maint-2.38:
  Git 2.38.4
  Git 2.37.6
  Git 2.36.5
  Git 2.35.7
  Git 2.34.7
  http: support CURLOPT_PROTOCOLS_STR
  http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
  http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
  Git 2.33.7
  Git 2.32.6
  Git 2.31.7
  Git 2.30.8
  apply: fix writing behind newly created symbolic links
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
  clone: delay picking a transport until after get_repo_path()
  t5619: demonstrate clone_local() with ambiguous transport
2023-02-06 09:43:39 +01:00
Johannes Schindelin
6487e9c459 Sync with 2.37.6
* maint-2.37:
  Git 2.37.6
  Git 2.36.5
  Git 2.35.7
  Git 2.34.7
  http: support CURLOPT_PROTOCOLS_STR
  http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
  http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
  Git 2.33.7
  Git 2.32.6
  Git 2.31.7
  Git 2.30.8
  apply: fix writing behind newly created symbolic links
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
  clone: delay picking a transport until after get_repo_path()
  t5619: demonstrate clone_local() with ambiguous transport
2023-02-06 09:43:28 +01:00
Johannes Schindelin
16004682f9 Sync with 2.36.5
* maint-2.36:
  Git 2.36.5
  Git 2.35.7
  Git 2.34.7
  http: support CURLOPT_PROTOCOLS_STR
  http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
  http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
  Git 2.33.7
  Git 2.32.6
  Git 2.31.7
  Git 2.30.8
  apply: fix writing behind newly created symbolic links
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
  clone: delay picking a transport until after get_repo_path()
  t5619: demonstrate clone_local() with ambiguous transport
2023-02-06 09:38:31 +01:00
Johannes Schindelin
40843216c5 Sync with 2.35.7
* maint-2.35:
  Git 2.35.7
  Git 2.34.7
  http: support CURLOPT_PROTOCOLS_STR
  http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
  http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
  Git 2.33.7
  Git 2.32.6
  Git 2.31.7
  Git 2.30.8
  apply: fix writing behind newly created symbolic links
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
  clone: delay picking a transport until after get_repo_path()
  t5619: demonstrate clone_local() with ambiguous transport
2023-02-06 09:37:52 +01:00
Johannes Schindelin
6a53a59bf9 Sync with 2.34.7
* maint-2.34:
  Git 2.34.7
  http: support CURLOPT_PROTOCOLS_STR
  http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
  http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
  Git 2.33.7
  Git 2.32.6
  Git 2.31.7
  Git 2.30.8
  apply: fix writing behind newly created symbolic links
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
  clone: delay picking a transport until after get_repo_path()
  t5619: demonstrate clone_local() with ambiguous transport
2023-02-06 09:29:44 +01:00
Johannes Schindelin
a7237f5ae9 Sync with 2.33.7
* maint-2.33:
  Git 2.33.7
  Git 2.32.6
  Git 2.31.7
  Git 2.30.8
  apply: fix writing behind newly created symbolic links
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
  clone: delay picking a transport until after get_repo_path()
  t5619: demonstrate clone_local() with ambiguous transport
2023-02-06 09:29:16 +01:00
Johannes Schindelin
87248c5933 Sync with 2.32.6
* maint-2.32:
  Git 2.32.6
  Git 2.31.7
  Git 2.30.8
  apply: fix writing behind newly created symbolic links
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
  clone: delay picking a transport until after get_repo_path()
  t5619: demonstrate clone_local() with ambiguous transport
2023-02-06 09:25:56 +01:00
Johannes Schindelin
aeb93d7da2 Sync with 2.31.7
* maint-2.31:
  Git 2.31.7
  Git 2.30.8
  apply: fix writing behind newly created symbolic links
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
  clone: delay picking a transport until after get_repo_path()
  t5619: demonstrate clone_local() with ambiguous transport
2023-02-06 09:25:08 +01:00
Johannes Schindelin
e14d6b8408 Sync with 2.30.8
* maint-2.30:
  Git 2.30.8
  apply: fix writing behind newly created symbolic links
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
  clone: delay picking a transport until after get_repo_path()
  t5619: demonstrate clone_local() with ambiguous transport
2023-02-06 09:24:06 +01:00
Junio C Hamano
a3033a68ac Merge branch 'ps/apply-beyond-symlink' into maint-2.30
Fix a vulnerability (CVE-2023-23946) that allows crafted input to trick
`git apply` into writing files outside of the working tree.

* ps/apply-beyond-symlink:
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-02-06 09:12:16 +01:00
Patrick Steinhardt
fade728df1 apply: fix writing behind newly created symbolic links
When writing files git-apply(1) initially makes sure that none of the
files it is about to create are behind a symlink:

```
 $ git init repo
 Initialized empty Git repository in /tmp/repo/.git/
 $ cd repo/
 $ ln -s dir symlink
 $ git apply - <<EOF
 diff --git a/symlink/file b/symlink/file
 new file mode 100644
 index 0000000..e69de29
 EOF
 error: affected file 'symlink/file' is beyond a symbolic link
```

This safety mechanism is crucial to ensure that we don't write outside
of the repository's working directory. It can be fooled though when the
patch that is being applied creates the symbolic link in the first
place, which can lead to writing files in arbitrary locations.

Fix this by checking whether the path we're about to create is
beyond a symlink or not. Tightening these checks like this should be
fine as we already have these precautions in Git as explained
above. Ideally, we should update the check we do up-front before
starting to reflect the computed changes to the working tree so that
we catch this case as well, but as part of embargoed security work,
adding an equivalent check just before we try to write out a file
should serve us well as a reasonable first step.

Digging back into history shows that this vulnerability has existed
since at least Git v2.9.0. As Git v2.8.0 and older don't build on my
system anymore I cannot tell whether older versions are affected, as
well.

Reported-by: Joern Schneeweisz <jschneeweisz@gitlab.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-03 14:41:31 -08:00
Taylor Blau
bffc762f87 dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
When using the dir_iterator API, we first stat(2) the base path, and
then use that as a starting point to enumerate the directory's contents.

If the directory contains symbolic links, we will immediately die() upon
encountering them without the `FOLLOW_SYMLINKS` flag. The same is not
true when resolving the top-level directory, though.

As explained in a previous commit, this oversight in 6f054f9fb3
(builtin/clone.c: disallow `--local` clones with symlinks, 2022-07-28)
can be used as an attack vector to include arbitrary files on a victim's
filesystem from outside of the repository.

Prevent resolving top-level symlinks unless the FOLLOW_SYMLINKS flag is
given, which will cause clones of a repository with a symlink'd
"$GIT_DIR/objects" directory to fail.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-24 16:52:16 -08:00
Taylor Blau
cf8f6ce02a clone: delay picking a transport until after get_repo_path()
In the previous commit, t5619 demonstrates an issue where two calls to
`get_repo_path()` could trick Git into using its local clone mechanism
in conjunction with a non-local transport.

That sequence is:

 - the starting state is that the local path https:/example.com/foo is a
   symlink that points to ../../../.git/modules/foo. So it's dangling.

 - get_repo_path() sees that no such path exists (because it's
   dangling), and thus we do not canonicalize it into an absolute path

 - because we're using --separate-git-dir, we create .git/modules/foo.
   Now our symlink is no longer dangling!

 - we pass the url to transport_get(), which sees it as an https URL.

 - we call get_repo_path() again, on the url. This second call was
   introduced by f38aa83f9a (use local cloning if insteadOf makes a
   local URL, 2014-07-17). The idea is that we want to pull the url
   fresh from the remote.c API, because it will apply any aliases.

And of course now it sees that there is a local file, which is a
mismatch with the transport we already selected.

The issue in the above sequence is calling `transport_get()` before
deciding whether or not the repository is indeed local, and not passing
in an absolute path if it is local.

This is reminiscent of a similar bug report in [1], where it was
suggested to perform the `insteadOf` lookup earlier. Taking that
approach may not be as straightforward, since the intent is to store the
original URL in the config, but to actually fetch from the insteadOf
one, so conflating the two early on is a non-starter.

Note: we pass the path returned by `get_repo_path(remote->url[0])`,
which should be the same as `repo_name` (aside from any `insteadOf`
rewrites).

We *could* pass `absolute_pathdup()` of the same argument, which
86521acaca (Bring local clone's origin URL in line with that of a remote
clone, 2008-09-01) indicates may differ depending on the presence of
".git/" for a non-bare repo. That matters for forming relative submodule
paths, but doesn't matter for the second call, since we're just feeding
it to the transport code, which is fine either way.

[1]: https://lore.kernel.org/git/CAMoD=Bi41mB3QRn3JdZL-FGHs4w3C2jGpnJB-CqSndO7FMtfzA@mail.gmail.com/

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-24 16:52:16 -08:00
Taylor Blau
58325b93c5 t5619: demonstrate clone_local() with ambiguous transport
When cloning a repository, Git must determine (a) what transport
mechanism to use, and (b) whether or not the clone is local.

Since f38aa83f9a (use local cloning if insteadOf makes a local URL,
2014-07-17), the latter check happens after the remote has been
initialized, and references the remote's URL instead of the local path.
This is done to make it possible for a `url.<base>.insteadOf` rule to
convert a remote URL into a local one, in which case the `clone_local()`
mechanism should be used.

However, with a specially crafted repository, Git can be tricked into
using a non-local transport while still setting `is_local` to "1" and
using the `clone_local()` optimization. The below test case
demonstrates such an instance, and shows that it can be used to include
arbitrary (known) paths in the working copy of a cloned repository on a
victim's machine[^1], even if local file clones are forbidden by
`protocol.file.allow`.

This happens in a few parts:

 1. We first call `get_repo_path()` to see if the remote is a local
    path. If it is, we replace the repo name with its absolute path.

 2. We then call `transport_get()` on the repo name and decide how to
    access it. If it was turned into an absolute path in the previous
    step, then we should always treat it like a file.

 3. We use `get_repo_path()` again, and set `is_local` as appropriate.
    But it's already too late to rewrite the repo name as an absolute
    path, since we've already fed it to the transport code.

The attack works by including a submodule whose URL corresponds to a
path on disk. In the below example, the repository "sub" is reachable
via the dumb HTTP protocol at (something like):

    http://127.0.0.1:NNNN/dumb/sub.git

However, the path "http:/127.0.0.1:NNNN/dumb" (that is, a top-level
directory called "http:", then nested directories "127.0.0.1:NNNN", and
"dumb") exists within the repository, too.

To determine this, it first picks the appropriate transport, which is
dumb HTTP. It then uses the remote's URL in order to determine whether
the repository exists locally on disk. However, the malicious repository
also contains an embedded stub repository which is the target of a
symbolic link at the local path corresponding to the "sub" repository on
disk (i.e., there is a symbolic link at "http:/127.0.0.1/dumb/sub.git",
pointing to the stub repository via ".git/modules/sub/../../../repo").

This stub repository fools Git into thinking that a local repository
exists at that URL and thus can be cloned locally. The affected call is
in `get_repo_path()`, which in turn calls `get_repo_path_1()`, which
locates a valid repository at that target.

This then causes Git to set the `is_local` variable to "1", and in turn
instructs Git to clone the repository using its local clone optimization
via the `clone_local()` function.

The exploit comes into play because the stub repository's top-level
"$GIT_DIR/objects" directory is a symbolic link which can point to an
arbitrary path on the victim's machine. `clone_local()` resolves the
top-level "objects" directory through a `stat(2)` call, meaning that we
read through the symbolic link and copy or hardlink the directory
contents at the destination of the link.

In other words, we can get steps (1) and (3) to disagree by leveraging
the dangling symlink to pick a non-local transport in the first step,
and then set is_local to "1" in the third step when cloning with
`--separate-git-dir`, which makes the symlink non-dangling.

This can result in data-exfiltration on the victim's machine when
sensitive data is at a known path (e.g., "/home/$USER/.ssh").

The appropriate fix is two-fold:

 - Resolve the transport later on (to avoid using the local
   clone optimization with a non-local transport).

 - Avoid reading through the top-level "objects" directory when
   (correctly) using the clone_local() optimization.

This patch merely demonstrates the issue. The following two patches will
implement each part of the above fix, respectively.

[^1]: Provided that any target directory does not contain symbolic
  links, in which case the changes from 6f054f9fb3 (builtin/clone.c:
  disallow `--local` clones with symlinks, 2022-07-28) will abort the
  clone.

Reported-by: yvvdwf <yvvdwf@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-24 16:52:16 -08:00
René Scharfe
16fb5c54bd ls-tree: fix expansion of repeated %(path)
expand_show_tree() borrows the base strbuf given to us by read_tree() to
build the full path of the current entry when handling %(path).  Only
its indirect caller, show_tree_fmt(), removes the added entry name.
That works fine as long as %(path) is only included once in the format
string, but accumulates duplicates if it's repeated:

   $ git ls-tree --format='%(path) %(path) %(path)' HEAD M*
   Makefile MakefileMakefile MakefileMakefileMakefile

Reset the length after each use to get the same expansion every time;
here's the behavior with this patch:

   $ ./git ls-tree --format='%(path) %(path) %(path)' HEAD M*
   Makefile Makefile Makefile

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-14 19:22:26 -08:00
Jeff King
a0f83e7776 diff: use filespec path to set up tempfiles for ext-diff
When we're going to run an external diff, we have to make the contents
of the pre- and post-images available either by dumping them to a
tempfile, or by pointing at a valid file in the worktree. The logic of
this is all handled by prepare_temp_file(), and we just pass in the
filename and the diff_filespec.

But there's a gotcha here. The "filename" we have is a logical filename
and not necessarily a path on disk or in the repository. This matters in
at least one case: when using "--relative", we may have a name like
"foo", even though the file content is found at "subdir/foo". As a
result, we look for the wrong path, fail to find "foo", and claim that
the file has been deleted (passing "/dev/null" to the external diff,
rather than the correct worktree path).

We can fix this by passing the pathname from the diff_filespec, which
should always be a full repository path (and that's what we want even if
reusing a worktree file, since we're always operating from the top-level
of the working tree).

The breakage seems to go all the way back to cd676a5136 (diff
--relative: output paths as relative to the current subdirectory,
2008-02-12). As far as I can tell, before then "name" would always have
been the same as the filespec's "path".

There are two related cases I looked at that aren't buggy:

  1. the only other caller of prepare_temp_file() is run_textconv(). But
     it always passes the filespec's path field, so it's OK.

  2. I wondered if file renames/copies might cause similar confusion.
     But they don't, because run_external_diff() receives two names in
     that case: "name" and "other", which correspond to the two sides of
     the diff. And we did correctly pass "other" when handling the
     post-image side. Barring the use of "--relative", that would always
     match "two->path", the path of the second filespec (and the rename
     destination).

So the only bug is just the interaction with external diff drivers and
--relative.

Reported-by: Carl Baldwin <carl@ecbaldwin.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-06 21:49:55 +09:00
William Sprent
5842710dc2 dir: check for single file cone patterns
The sparse checkout documentation states that the cone mode pattern set
is limited to patterns that either recursively include directories or
patterns that match all files in a directory. In the sparse checkout
file, the former manifest in the form:

    /A/B/C/

while the latter become a pair of patterns either in the form:

    /A/B/
    !/A/B/*/

or in the special case of matching the toplevel files:

    /*
    !/*/

The 'add_pattern_to_hashsets()' function contains checks which serve to
disable cone-mode when non-cone patterns are encountered. However, these
do not catch when the pattern list attempts to match a single file or
directory, e.g. a pattern in the form:

    /A/B/C

This causes sparse-checkout to exhibit unexpected behaviour when such a
pattern is in the sparse-checkout file and cone mode is enabled.
Concretely, with the pattern like the above, sparse-checkout, in
non-cone mode, will only include the directory or file located at
'/A/B/C'. However, with cone mode enabled, sparse-checkout will instead
just manifest the toplevel files but not any file located at '/A/B/C'.

Relatedly, issues occur when supplying the same kind of filter when
partial cloning with '--filter=sparse:oid=<oid>'. 'upload-pack' will
correctly just include the objects that match the non-cone pattern
matching. Which means that checking out the newly cloned repo with the
same filter, but with cone mode enabled, fails due to missing objects.

To fix these issues, add a cone mode pattern check that asserts that
every pattern is either a directory match or the pattern '/*'. Add a
test to verify the new pattern check and modify another to reflect that
non-directory patterns are caught earlier.

Signed-off-by: William Sprent <williams@unity3d.com>
Acked-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-05 11:14:28 +09:00
Ævar Arnfjörð Bjarmason
891cb09db6 bundle: don't segfault on "git bundle <subcmd>"
Since aef7d75e58 (builtin/bundle.c: let parse-options parse
subcommands, 2022-08-19) we've been segfaulting if no argument was
provided.

The fix is easy, as all of the "git bundle" subcommands require a
non-option argument we can check that we have arguments left after
calling parse-options().

This makes use of code added in 73c3253d75 (bundle: framework for
options before bundle file, 2019-11-10), before this change that code
has always been unreachable. In 73c3253d75 we'd never reach it as we
already checked "argc < 2" in cmd_bundle() itself.

Then when aef7d75e58 (whose segfault we're fixing here) migrated this
code to the subcommand API it removed that "argc < 2" check, but we
were still checking the wrong "argc" in parse_options_cmd_bundle(), we
need to check the "newargc". The "argc" will always be >= 1, as it
will necessarily contain at least the subcommand name
itself (e.g. "create").

As an aside, this could be safely squashed into this, but let's not do
that for this minimal segfault fix, as it's an unrelated refactoring:

	--- a/builtin/bundle.c
	+++ b/builtin/bundle.c
	@@ -55,13 +55,12 @@ static int parse_options_cmd_bundle(int argc,
	 		const char * const usagestr[],
	 		const struct option options[],
	 		char **bundle_file) {
	-	int newargc;
	-	newargc = parse_options(argc, argv, NULL, options, usagestr,
	+	argc = parse_options(argc, argv, NULL, options, usagestr,
	 			     PARSE_OPT_STOP_AT_NON_OPTION);
	-	if (!newargc)
	+	if (!argc)
	 		usage_with_options(usagestr, options);
	 	*bundle_file = prefix_filename(prefix, argv[0]);
	-	return newargc;
	+	return argc;
	 }

	 static int cmd_bundle_create(int argc, const char **argv, const char *prefix) {

Reported-by: Hubert Jasudowicz <hubertj@stmcyber.pl>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Tested-by: Hubert Jasudowicz <hubertj@stmcyber.pl>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-25 16:01:09 +09:00
Lars Kellogg-Stedman
4e57c88e02 line-range: fix infinite loop bug with '$' regex
When the -L argument to "git log" is passed the zero-width regular
expression "$" (as in "-L :$:line-range.c"), this results in an
infinite loop in find_funcname_matching_regexp().

Modify find_funcname_matching_regexp to correctly match the entire line
instead of the zero-width match at eol and update the loop condition to
prevent an infinite loop in the event of other undiscovered corner cases.

The primary change is that we pre-decrement the beginning-of-line marker
('bol') before comparing it to '\n'. In the case of '$', where we match the
'\n' at the end of the line and start the loop with bol == eol, this
ensures that bol will find the beginning of the line on which the match
occurred.

Originally reported in <https://stackoverflow.com/q/74690545/147356>.

Signed-off-by: Lars Kellogg-Stedman <lars@oddbit.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-20 10:00:43 +09:00
Junio C Hamano
96738bb0e1 Sync with 2.38.3 2022-12-13 21:25:15 +09:00
Junio C Hamano
fea9f607a8 Sync with Git 2.37.5 2022-12-13 21:23:36 +09:00
Junio C Hamano
431f6e67e6 Merge branch 'maint-2.36' into maint-2.37 2022-12-13 21:20:35 +09:00
Junio C Hamano
8253c00421 Merge branch 'maint-2.35' into maint-2.36 2022-12-13 21:19:11 +09:00
Junio C Hamano
fbabbc30e7 Merge branch 'maint-2.34' into maint-2.35 2022-12-13 21:17:10 +09:00
Junio C Hamano
3748b5b7f5 Merge branch 'maint-2.33' into maint-2.34 2022-12-13 21:15:22 +09:00
Junio C Hamano
5f22dcc02d Sync with Git 2.32.5 2022-12-13 21:13:11 +09:00
Junio C Hamano
32e357b6df Merge branch 'ps/attr-limits-with-fsck' into maint-2.32 2022-12-13 21:09:56 +09:00
Junio C Hamano
8a755eddf5 Sync with Git 2.31.6 2022-12-13 21:09:40 +09:00
Junio C Hamano
16128765d7 Git 2.30.7
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE4fA2sf7nIh/HeOzvsLXohpav5ssFAmOYaKAACgkQsLXohpav
 5svHThAAhjTaBBBYDM6FbHcFUHv515fSo04AmyXl6QKdjiBroaGj+WpJPrYM3B5G
 eR0MYfwDfp7rAvxPQwq1LzSQLz2G2Swue1a4X0t3Bomjpqf48OeAUleGHQlBUTm7
 wfZIHpgCbMBIHJtVAVPiEOo43ZJ1OareCwVpPOpAXLVgTU2Bbx59K0oUqGszjgE3
 anQ0kon6hELZ9aBTx80hUJaYWaxiUqENtRFs6vyOV/MKvW2KR+MJgvu/SQqbRJPy
 ndBJ5r0gcSbes0OLxKCAFFNVt2p6BeVb4IxyPogJveGwJsNU88DQnarSos7hvPYG
 DkhTzfpPmFJkP0WiRHr87jWCXNJraq1SmK65ac1CGV/NTrDfX9ZNoGIRFsHfLmw2
 1poTxhB/h0F4wCucZu7Wavvgd2NI2V+GK5dx8Mx5NovrC67smBny2W7kQgXJCdZX
 e6vNuKVK7pz3cVYvo5GbUo2ivY2igm9Xbj3Na1/Ie8wTFaZ0ZX+oRnxxAdwKbL/1
 X0VRUTQMgtrrLd24JCApo8r5+Ssg0HvIOpXcUZFpvaYl9kMltatwV1Y01lNAhAgF
 VFBvUWdFy5tGzPzSCd3w2NyZOJBng2GdKw9YUt/WVWCKeiLXLI3wh10pC+m1qJus
 HJwQbRRSzC4mhXlkKZ5IG+Xz7x+HrHFnLpQXhtjeSc5WwGQkE2w=
 =syKo
 -----END PGP SIGNATURE-----

Sync with Git 2.30.7
2022-12-13 21:02:20 +09:00
Victoria Dye
93a7bc8b28 rebase --update-refs: avoid unintended ref deletion
In b3b1a21d1a (sequencer: rewrite update-refs as user edits todo list,
2022-07-19), the 'todo_list_filter_update_refs()' step was added to handle
the removal of 'update-ref' lines from a 'rebase-todo'. Specifically, it
removes potential ref updates from the "update refs state" if a ref does not
have a corresponding 'update-ref' line.

However, because 'write_update_refs_state()' will not update the state if
the 'refs_to_oids' list was empty, removing *all* 'update-ref' lines will
result in the state remaining unchanged from how it was initialized (with
all refs' "after" OID being null). Then, when the ref update is applied, all
refs will be updated to null and consequently deleted.

To fix this, delete the 'update-refs' state file when 'refs_to_oids' is
empty. Additionally, add a tests covering "all update-ref lines removed"
cases.

Reported-by: herr.kaste <herr.kaste@gmail.com>
Helped-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-12-09 19:31:45 +09:00
Patrick Steinhardt
27ab4784d5 fsck: implement checks for gitattributes
Recently, a vulnerability was reported that can lead to an out-of-bounds
write when reading an unreasonably large gitattributes file. The root
cause of this error are multiple integer overflows in different parts of
the code when there are either too many lines, when paths are too long,
when attribute names are too long, or when there are too many attributes
declared for a pattern.

As all of these are related to size, it seems reasonable to restrict the
size of the gitattributes file via git-fsck(1). This allows us to both
stop distributing known-vulnerable objects via common hosting platforms
that have fsck enabled, and users to protect themselves by enabling the
`fetch.fsckObjects` config.

There are basically two checks:

    1. We verify that size of the gitattributes file is smaller than
       100MB.

    2. We verify that the maximum line length does not exceed 2048
       bytes.

With the preceding commits, both of these conditions would cause us to
either ignore the complete gitattributes file or blob in the first case,
or the specific line in the second case. Now with these consistency
checks added, we also grow the ability to stop distributing such files
in the first place when `receive.fsckObjects` is enabled.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 17:07:04 +09:00
Junio C Hamano
e0bfc0b3b9 Merge branch 'ps/attr-limits' into maint-2.32 2022-12-09 17:03:49 +09:00
Junio C Hamano
6662a836eb Merge branch 'ps/attr-limits' into maint-2.30 2022-12-09 16:05:52 +09:00
Patrick Steinhardt
304a50adff pretty: restrict input lengths for padding and wrapping formats
Both the padding and wrapping formatting directives allow the caller to
specify an integer that ultimately leads to us adding this many chars to
the result buffer. As a consequence, it is trivial to e.g. allocate 2GB
of RAM via a single formatting directive and cause resource exhaustion
on the machine executing this logic. Furthermore, it is debatable
whether there are any sane usecases that require the user to pad data to
2GB boundaries or to indent wrapped data by 2GB.

Restrict the input sizes to 16 kilobytes at a maximum to limit the
amount of bytes that can be requested by the user. This is not meant
as a fix because there are ways to trivially amplify the amount of
data we generate via formatting directives; the real protection is
achieved by the changes in previous steps to catch and avoid integer
wraparound that causes us to under-allocate and access beyond the
end of allocated memory reagions. But having such a limit
significantly helps fuzzing the pretty format, because the fuzzer is
otherwise quite fast to run out-of-memory as it discovers these
formatters.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
Patrick Steinhardt
81c2d4c3a5 utf8: fix checking for glyph width in strbuf_utf8_replace()
In `strbuf_utf8_replace()`, we call `utf8_width()` to compute the width
of the current glyph. If the glyph is a control character though it can
be that `utf8_width()` returns `-1`, but because we assign this value to
a `size_t` the conversion will cause us to underflow. This bug can
easily be triggered with the following command:

    $ git log --pretty='format:xxx%<|(1,trunc)%x10'

>From all I can see though this seems to be a benign underflow that has
no security-related consequences.

Fix the bug by using an `int` instead. When we see a control character,
we now copy it into the target buffer but don't advance the current
width of the string.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
Patrick Steinhardt
937b71cc8b utf8: fix overflow when returning string width
The return type of both `utf8_strwidth()` and `utf8_strnwidth()` is
`int`, but we operate on string lengths which are typically of type
`size_t`. This means that when the string is longer than `INT_MAX`, we
will overflow and thus return a negative result.

This can lead to an out-of-bounds write with `--pretty=format:%<1)%B`
and a commit message that is 2^31+1 bytes long:

    =================================================================
    ==26009==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000001168 at pc 0x7f95c4e5f427 bp 0x7ffd8541c900 sp 0x7ffd8541c0a8
    WRITE of size 2147483649 at 0x603000001168 thread T0
        #0 0x7f95c4e5f426 in __interceptor_memcpy /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827
        #1 0x5612bbb1068c in format_and_pad_commit pretty.c:1763
        #2 0x5612bbb1087a in format_commit_item pretty.c:1801
        #3 0x5612bbc33bab in strbuf_expand strbuf.c:429
        #4 0x5612bbb110e7 in repo_format_commit_message pretty.c:1869
        #5 0x5612bbb12d96 in pretty_print_commit pretty.c:2161
        #6 0x5612bba0a4d5 in show_log log-tree.c:781
        #7 0x5612bba0d6c7 in log_tree_commit log-tree.c:1117
        #8 0x5612bb691ed5 in cmd_log_walk_no_free builtin/log.c:508
        #9 0x5612bb69235b in cmd_log_walk builtin/log.c:549
        #10 0x5612bb6951a2 in cmd_log builtin/log.c:883
        #11 0x5612bb56c993 in run_builtin git.c:466
        #12 0x5612bb56d397 in handle_builtin git.c:721
        #13 0x5612bb56db07 in run_argv git.c:788
        #14 0x5612bb56e8a7 in cmd_main git.c:923
        #15 0x5612bb803682 in main common-main.c:57
        #16 0x7f95c4c3c28f  (/usr/lib/libc.so.6+0x2328f)
        #17 0x7f95c4c3c349 in __libc_start_main (/usr/lib/libc.so.6+0x23349)
        #18 0x5612bb5680e4 in _start ../sysdeps/x86_64/start.S:115

    0x603000001168 is located 0 bytes to the right of 24-byte region [0x603000001150,0x603000001168)
    allocated by thread T0 here:
        #0 0x7f95c4ebe7ea in __interceptor_realloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:85
        #1 0x5612bbcdd556 in xrealloc wrapper.c:136
        #2 0x5612bbc310a3 in strbuf_grow strbuf.c:99
        #3 0x5612bbc32acd in strbuf_add strbuf.c:298
        #4 0x5612bbc33aec in strbuf_expand strbuf.c:418
        #5 0x5612bbb110e7 in repo_format_commit_message pretty.c:1869
        #6 0x5612bbb12d96 in pretty_print_commit pretty.c:2161
        #7 0x5612bba0a4d5 in show_log log-tree.c:781
        #8 0x5612bba0d6c7 in log_tree_commit log-tree.c:1117
        #9 0x5612bb691ed5 in cmd_log_walk_no_free builtin/log.c:508
        #10 0x5612bb69235b in cmd_log_walk builtin/log.c:549
        #11 0x5612bb6951a2 in cmd_log builtin/log.c:883
        #12 0x5612bb56c993 in run_builtin git.c:466
        #13 0x5612bb56d397 in handle_builtin git.c:721
        #14 0x5612bb56db07 in run_argv git.c:788
        #15 0x5612bb56e8a7 in cmd_main git.c:923
        #16 0x5612bb803682 in main common-main.c:57
        #17 0x7f95c4c3c28f  (/usr/lib/libc.so.6+0x2328f)

    SUMMARY: AddressSanitizer: heap-buffer-overflow /usr/src/debug/gcc/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:827 in __interceptor_memcpy
    Shadow bytes around the buggy address:
      0x0c067fff81d0: fd fd fd fa fa fa fd fd fd fa fa fa fd fd fd fa
      0x0c067fff81e0: fa fa fd fd fd fd fa fa fd fd fd fd fa fa fd fd
      0x0c067fff81f0: fd fa fa fa fd fd fd fa fa fa fd fd fd fa fa fa
      0x0c067fff8200: fd fd fd fa fa fa fd fd fd fd fa fa 00 00 00 fa
      0x0c067fff8210: fa fa fd fd fd fa fa fa fd fd fd fa fa fa fd fd
    =>0x0c067fff8220: fd fa fa fa fd fd fd fa fa fa 00 00 00[fa]fa fa
      0x0c067fff8230: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff8240: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff8250: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff8260: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
      0x0c067fff8270: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    Shadow byte legend (one shadow byte represents 8 application bytes):
      Addressable:           00
      Partially addressable: 01 02 03 04 05 06 07
      Heap left redzone:       fa
      Freed heap region:       fd
      Stack left redzone:      f1
      Stack mid redzone:       f2
      Stack right redzone:     f3
      Stack after return:      f5
      Stack use after scope:   f8
      Global redzone:          f9
      Global init order:       f6
      Poisoned by user:        f7
      Container overflow:      fc
      Array cookie:            ac
      Intra object redzone:    bb
      ASan internal:           fe
      Left alloca redzone:     ca
      Right alloca redzone:    cb
    ==26009==ABORTING

Now the proper fix for this would be to convert both functions to return
an `size_t` instead of an `int`. But given that this commit may be part
of a security release, let's instead do the minimal viable fix and die
in case we see an overflow.

Add a test that would have previously caused us to crash.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00
Patrick Steinhardt
17d23e8a38 utf8: fix returning negative string width
The `utf8_strnwidth()` function calls `utf8_width()` in a loop and adds
its returned width to the end result. `utf8_width()` can return `-1`
though in case it reads a control character, which means that the
computed string width is going to be wrong. In the worst case where
there are more control characters than non-control characters, we may
even return a negative string width.

Fix this bug by treating control characters as having zero width.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-12-09 14:26:21 +09:00