Explanation
-----------
The problem here is flawed `poll()` implementation. When it tries to
see if pipe can be written without blocking, it eventually calls
`NtQueryInformationFile()` and tests `WriteQuotaAvailable`. However,
the meaning of quota was misunderstood. The value of quota is reduced
when either some data was written to a pipe, *or* there is a pending
read on the pipe. Therefore, if there is a pending read of size >= than
the pipe's buffer size, poll() will think that pipe is not writable and
will hang forever, usually that means deadlocking both pipe users.
I have studied the problem and found that Windows pipes track two values:
`QuotaUsed` and `BytesInQueue`. The code in `poll()` apparently wants to
know `BytesInQueue` instead of quota. Unfortunately, `BytesInQueue` can
only be requested from read end of the pipe, while `poll()` receives
write end.
The git's implementation of `poll()` was copied from gnulib, which also
contains a flawed implementation up to today.
I also had a look at implementation in cygwin, which is also broken in a
subtle way. It uses this code in `pipe_data_available()`:
fpli.WriteQuotaAvailable = (fpli.OutboundQuota - fpli.ReadDataAvailable)
However, `ReadDataAvailable` always returns 0 for the write end of the pipe,
turning the code into an obfuscated version of returning pipe's total
buffer size, which I guess will in turn have `poll()` always say that pipe
is writable. The commit that introduced the code doesn't say anything about
this change, so it could be some debugging code that slipped in.
These are the typical sizes used in git:
0x2000 - default read size in `strbuf_read()`
0x1000 - default read size in CRT, used by `strbuf_getwholeline()`
0x2000 - pipe buffer size in compat\mingw.c
As a consequence, as soon as child process uses `strbuf_read()`,
`poll()` in parent process will hang forever, deadlocking both
processes.
This results in two observable behaviors:
1) If parent process begins sending STDIN quickly (and usually that's
the case), then first `poll()` will succeed and first block will go
through. MAX_IO_SIZE_DEFAULT is 8MB, so if STDIN exceeds 8MB, then
it will deadlock.
2) If parent process waits a little bit for any reason (including OS
scheduler) and child is first to issue `strbuf_read()`, then it will
deadlock immediately even on small STDINs.
The problem is illustrated by `git stash push`, which will currently
read the entire patch into memory and then send it to `git apply` via
STDIN. If patch exceeds 8MB, git hangs on Windows.
Possible solutions
------------------
1) Somehow obtain `BytesInQueue` instead of `QuotaUsed`
I did a pretty thorough search and didn't find any ways to obtain
the value from write end of the pipe.
2) Also give read end of the pipe to `poll()`
That can be done, but it will probably invite some dirty code,
because `poll()`
* can accept multiple pipes at once
* can accept things that are not pipes
* is expected to have a well known signature.
3) Make `poll()` always reply "writable" for write end of the pipe
Afterall it seems that cygwin (accidentally?) does that for years.
Also, it should be noted that `pump_io_round()` writes 8MB blocks,
completely ignoring the fact that pipe's buffer size is only 8KB,
which means that pipe gets clogged many times during that single
write. This may invite a deadlock, if child's STDERR/STDOUT gets
clogged while it's trying to deal with 8MB of STDIN. Such deadlocks
could be defeated with writing less than pipe's buffer size per
round, and always reading everything from STDOUT/STDERR before
starting next round. Therefore, making `poll()` always reply
"writable" shouldn't cause any new issues or block any future
solutions.
4) Increase the size of the pipe's buffer
The difference between `BytesInQueue` and `QuotaUsed` is the size
of pending reads. Therefore, if buffer is bigger than size of reads,
`poll()` won't hang so easily. However, I found that for example
`strbuf_read()` will get more and more hungry as it reads large inputs,
eventually surpassing any reasonable pipe buffer size.
Chosen solution
---------------
Make `poll()` always reply "writable" for write end of the pipe.
Hopefully one day someone will find a way to implement it properly.
Reproduction
------------
printf "%8388608s" X >large_file.txt
git stash push --include-untracked -- large_file.txt
I have decided not to include this as test to avoid slowing down the
test suite. I don't expect the specific problem to come back, and
chances are that `git stash push` will be reworked to avoid sending the
entire patch via STDIN.
Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We can check if certain characters are present in a string by calling
strchr(3) on each of them, or we can pass them all to a single
strpbrk(3) call. The latter is shorter, less repetitive and slightly
more efficient, so let's do that instead.
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When debugging Git, the criss-cross spawning of processes can make
things quite a bit difficult, especially when a Unix shell script is
thrown in the mix that calls a `git.exe` that then segfaults.
To help debugging such things, we introduce the `open_in_gdb()` function
which can be called at a code location where the segfault happens (or as
close as one can get); This will open a new MinTTY window with a GDB
that already attached to the current process.
Inspired by Derrick Stolee.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The final leg of rewriting "add -i/-p" in C.
* js/add-p-leftover-bits:
ci: include the built-in `git add -i` in the `linux-gcc` job
built-in add -p: handle Escape sequences more efficiently
built-in add -p: handle Escape sequences in interactive.singlekey mode
built-in add -p: respect the `interactive.singlekey` config setting
terminal: add a new function to read a single keystroke
terminal: accommodate Git for Windows' default terminal
terminal: make the code of disable_echo() reusable
built-in add -p: handle diff.algorithm
built-in add -p: support interactive.diffFilter
t3701: adjust difffilter test
Work around test breakages caused by custom regex engine used in
libasan, when address sanitizer is used with more recent versions
of gcc and clang.
* jk/asan-build-fix:
Makefile: use compat regex with SANITIZE=address
As with the previous two commits, UBSan with clang-11 complains about
computing offsets from a NULL pointer. The failures in t4013 (and
elsewhere) look like this:
kwset.c:102:23: runtime error: applying non-zero offset 107820859019600 to null pointer
...
not ok 79 - git log -SF master # magic is (not used)
That line is not enlightening:
... = obstack_alloc(&kwset->obstack, sizeof (struct trie));
because obstack is implemented almost entirely in macros, and the actual
problem is five macros deep (I temporarily converted them to inline
functions to get better compiler errors, which was tedious but worked
reasonably well).
The actual problem is in these pointer-alignment macros:
/* If B is the base of an object addressed by P, return the result of
aligning P to the next multiple of A + 1. B and P must be of type
char *. A + 1 must be a power of 2. */
#define __BPTR_ALIGN(B, P, A) ((B) + (((P) - (B) + (A)) & ~(A)))
/* Similar to _BPTR_ALIGN (B, P, A), except optimize the common case
where pointers can be converted to integers, aligned as integers,
and converted back again. If PTR_INT_TYPE is narrower than a
pointer (e.g., the AS/400), play it safe and compute the alignment
relative to B. Otherwise, use the faster strategy of computing the
alignment relative to 0. */
#define __PTR_ALIGN(B, P, A) \
__BPTR_ALIGN (sizeof (PTR_INT_TYPE) < sizeof (void *) ? (B) : (char *) 0, \
P, A)
If we have a sufficiently-large integer pointer type, then we do the
computation using a NULL pointer constant. That turns __BPTR_ALIGN()
into something like:
NULL + (P - NULL + A) & ~A
and UBSan is complaining about adding the full value of P to that
initial NULL. We can fix this by doing our math as an integer type, and
then casting the result back to a pointer. The problem case only happens
when we know that the integer type is large enough, so there should be
no issue with truncation.
Another option would be just simplify out all the 0's from
__BPTR_ALIGN() for the NULL-pointer case. That probably wouldn't work
for a platform where the NULL pointer isn't all-zeroes, but Git already
wouldn't work on such a platform (due to our use of memset to set
pointers in structs to NULL). But I tried here to keep as close to the
original as possible.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Recent versions of the gcc and clang Address Sanitizer produce test
failures related to regexec(). This triggers with gcc-10 and clang-8
(but not gcc-9 nor clang-7). Running:
make CC=gcc-10 SANITIZE=address test
results in failures in t4018, t3206, and t4062.
The cause seems to be that when built with ASan, we use a different
version of regexec() than normal. And this version doesn't understand
the REG_STARTEND flag. Here's my evidence supporting that.
The failure in t4062 is an ASan warning:
expecting success of 4062.2 '-G matches':
git diff --name-only -G "^(0{64}){64}$" HEAD^ >out &&
test 4096-zeroes.txt = "$(cat out)"
=================================================================
==672994==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7fa76f672000 at pc 0x7fa7726f75b6 bp 0x7ffe41bdda70 sp 0x7ffe41bdd220
READ of size 4097 at 0x7fa76f672000 thread T0
#0 0x7fa7726f75b5 (/lib/x86_64-linux-gnu/libasan.so.6+0x4f5b5)
#1 0x562ae0c9c40e in regexec_buf /home/peff/compile/git/git-compat-util.h:1117
#2 0x562ae0c9c40e in diff_grep /home/peff/compile/git/diffcore-pickaxe.c:52
#3 0x562ae0c9cc28 in pickaxe_match /home/peff/compile/git/diffcore-pickaxe.c:166
[...]
In this case we're looking in a buffer which was mmap'd via
reuse_worktree_file(), and whose size is 4096 bytes. But libasan's
regex tries to look at byte 4097 anyway! If we tweak Git like this:
diff --git a/diff.c b/diff.c
index 8e2914c031..cfae60c120 100644
--- a/diff.c
+++ b/diff.c
@@ -3880,7 +3880,7 @@ static int reuse_worktree_file(struct index_state *istate,
*/
if (ce_uptodate(ce) ||
(!lstat(name, &st) && !ie_match_stat(istate, ce, &st, 0)))
- return 1;
+ return 0;
return 0;
}
to use a regular buffer (with a trailing NUL) instead of an mmap, then
the complaint goes away.
The other failures are actually diff output with an incorrect funcname
header. If I instrument xdiff to show the funcname matching like so:
diff --git a/xdiff-interface.c b/xdiff-interface.c
index 8509f9ea22..f6c3dc1986 100644
--- a/xdiff-interface.c
+++ b/xdiff-interface.c
@@ -197,6 +197,7 @@ struct ff_regs {
struct ff_reg {
regex_t re;
int negate;
+ char *printable;
} *array;
};
@@ -218,7 +219,12 @@ static long ff_regexp(const char *line, long len,
for (i = 0; i < regs->nr; i++) {
struct ff_reg *reg = regs->array + i;
- if (!regexec_buf(®->re, line, len, 2, pmatch, 0)) {
+ int ret = regexec_buf(®->re, line, len, 2, pmatch, 0);
+ warning("regexec %s:\n regex: %s\n buf: %.*s",
+ ret == 0 ? "matched" : "did not match",
+ reg->printable,
+ (int)len, line);
+ if (!ret) {
if (reg->negate)
return -1;
break;
@@ -264,6 +270,7 @@ void xdiff_set_find_func(xdemitconf_t *xecfg, const char *value, int cflags)
expression = value;
if (regcomp(®->re, expression, cflags))
die("Invalid regexp to look for hunk header: %s", expression);
+ reg->printable = xstrdup(expression);
free(buffer);
value = ep + 1;
}
then when compiling with ASan and gcc-10, running the diff from t4018.66
produces this:
$ git diff -U1 cpp-skip-access-specifiers
warning: regexec did not match:
regex: ^[ ]*[A-Za-z_][A-Za-z_0-9]*:[[:space:]]*($|/[/*])
buf: private:
warning: regexec matched:
regex: ^((::[[:space:]]*)?[A-Za-z_].*)$
buf: private:
diff --git a/cpp-skip-access-specifiers b/cpp-skip-access-specifiers
index 4d4a9db..ebd6f42 100644
--- a/cpp-skip-access-specifiers
+++ b/cpp-skip-access-specifiers
@@ -6,3 +6,3 @@ private:
void DoSomething();
int ChangeMe;
};
void DoSomething();
- int ChangeMe;
+ int IWasChanged;
};
That first regex should match (and is negated, so it should be telling
us _not_ to match "private:"). But it wouldn't if regexec() is looking
at the whole buffer, and not just the length-limited line we've fed to
regexec_buf(). So this is consistent again with REG_STARTEND being
ignored.
The correct output (compiling without ASan, or gcc-9 with Asan) looks
like this:
warning: regexec matched:
regex: ^[ ]*[A-Za-z_][A-Za-z_0-9]*:[[:space:]]*($|/[/*])
buf: private:
[...more lines that we end up not using...]
warning: regexec matched:
regex: ^((::[[:space:]]*)?[A-Za-z_].*)$
buf: class RIGHT : public Baseclass
diff --git a/cpp-skip-access-specifiers b/cpp-skip-access-specifiers
index 4d4a9db..ebd6f42 100644
--- a/cpp-skip-access-specifiers
+++ b/cpp-skip-access-specifiers
@@ -6,3 +6,3 @@ class RIGHT : public Baseclass
void DoSomething();
- int ChangeMe;
+ int IWasChanged;
};
So it really does seem like libasan's regex engine is ignoring
REG_STARTEND. We should be able to work around it by compiling with
NO_REGEX, which would use our local regexec(). But to make matters even
more interesting, this isn't enough by itself.
Because ASan has support from the compiler, it doesn't seem to intercept
our call to regexec() at the dynamic library level. It actually
recognizes when we are compiling a call to regexec() and replaces it
with ASan-specific code at that point. And unlike most of our other
compat code, where we might have git_mmap() or similar, the actual
symbol name in the compiled compat/regex code is regexec(). So just
compiling with NO_REGEX isn't enough; we still end up in libasan!
We can work around that by having the preprocessor replace regexec with
git_regexec (both in the callers and in the actual implementation), and
we truly end up with a call to our custom regex code, even when
compiling with ASan. That's probably a good thing to do anyway, as it
means anybody looking at the symbols later (e.g., in a debugger) would
have a better indication of which function is which. So we'll do the
same for the other common regex functions (even though just regexec() is
enough to fix this ASan problem).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
With the upgrade, the library names changed from libeay32/ssleay32 to
libcrypto/libssl.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When `interactive.singlekey = true`, we react immediately to keystrokes,
even to Escape sequences (e.g. when pressing a cursor key).
The problem with Escape sequences is that we do not really know when
they are done, and as a heuristic we poll standard input for half a
second to make sure that we got all of it.
While waiting half a second is not asking for a whole lot, it can become
quite annoying over time, therefore with this patch, we read the
terminal capabilities (if available) and extract known Escape sequences
from there, then stop polling immediately when we detected that the user
pressed a key that generated such a known sequence.
This recapitulates the remaining part of b5cc003253 (add -i: ignore
terminal escape sequences, 2011-05-17).
Note: We do *not* query the terminal capabilities directly. That would
either require a lot of platform-specific code, or it would require
linking to a library such as ncurses.
Linking to a library in the built-ins is something we try very hard to
avoid (we even kicked the libcurl dependency to a non-built-in remote
helper, just to shave off a tiny fraction of a second from Git's startup
time). And the platform-specific code would be a maintenance nightmare.
Even worse: in Git for Windows' case, we would need to query MSYS2
pseudo terminals, which `git.exe` simply cannot do (because it is
intentionally *not* an MSYS2 program).
To address this, we simply spawn `infocmp -L -1` and parse its output
(which works even in Git for Windows, because that helper is included in
the end-user facing installations).
This is done only once, as in the Perl version, but it is done only when
the first Escape sequence is encountered, not upon startup of `git add
-i`; This saves on startup time, yet makes reacting to the first Escape
sequence slightly more sluggish. But it allows us to keep the
terminal-related code encapsulated in the `compat/terminal.c` file.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This recapitulates part of b5cc003253 (add -i: ignore terminal escape
sequences, 2011-05-17):
add -i: ignore terminal escape sequences
On the author's terminal, the up-arrow input sequence is ^[[A, and
thus fat-fingering an up-arrow into 'git checkout -p' is quite
dangerous: git-add--interactive.perl will ignore the ^[ and [
characters and happily treat A as "discard everything".
As a band-aid fix, use Term::Cap to get all terminal capabilities.
Then use the heuristic that any capability value that starts with ^[
(i.e., \e in perl) must be a key input sequence. Finally, given an
input that starts with ^[, read more characters until we have read a
full escape sequence, then return that to the caller. We use a
timeout of 0.5 seconds on the subsequent reads to avoid getting stuck
if the user actually input a lone ^[.
Since none of the currently recognized keys start with ^[, the net
result is that the sequence as a whole will be ignored and the help
displayed.
Note that we leave part for later which uses "Term::Cap to get all
terminal capabilities", for several reasons:
1. it is actually not really necessary, as the timeout of 0.5 seconds
should be plenty sufficient to catch Escape sequences,
2. it is cleaner to keep the change to special-case Escape sequences
separate from the change that reads all terminal capabilities to
speed things up, and
3. in practice, relying on the terminal capabilities is a bit overrated,
as the information could be incomplete, or plain wrong. For example,
in this developer's tmux sessions, the terminal capabilities claim
that the "cursor up" sequence is ^[M, but the actual sequence
produced by the "cursor up" key is ^[[A.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Typically, input on the command-line is line-based. It is actually not
really easy to get single characters (or better put: keystrokes).
We provide two implementations here:
- One that handles `/dev/tty` based systems as well as native Windows.
The former uses the `tcsetattr()` function to put the terminal into
"raw mode", which allows us to read individual keystrokes, one by one.
The latter uses `stty.exe` to do the same, falling back to direct
Win32 Console access.
Thanks to the refactoring leading up to this commit, this is a single
function, with the platform-specific details hidden away in
conditionally-compiled code blocks.
- A fall-back which simply punts and reads back an entire line.
Note that the function writes the keystroke into an `strbuf` rather than
a `char`, in preparation for reading Escape sequences (e.g. when the
user hit an arrow key). This is also required for UTF-8 sequences in
case the keystroke corresponds to a non-ASCII letter.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git for Windows' Git Bash runs in MinTTY by default, which does not have
a Win32 Console instance, but uses MSYS2 pseudo terminals instead.
This is a problem, as Git for Windows does not want to use the MSYS2
emulation layer for Git itself, and therefore has no direct way to
interact with that pseudo terminal.
As a workaround, use the `stty` utility (which is included in Git for
Windows, and which *is* an MSYS2 program, so it knows how to deal with
the pseudo terminal).
Note: If Git runs in a regular CMD or PowerShell window, there *is* a
regular Win32 Console to work with. This is not a problem for the MSYS2
`stty`: it copes with this scenario just fine.
Also note that we introduce support for more bits than would be
necessary for a mere `disable_echo()` here, in preparation for the
upcoming `enable_non_canonical()` function.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We are about to introduce the function `enable_non_canonical()`, which
shares almost the complete code with `disable_echo()`.
Let's prepare for that, by refactoring out that shared code.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Forbid pathnames that the platform's filesystem cannot represent on
MinGW.
* js/mingw-reserved-filenames:
mingw: refuse paths containing reserved names
mingw: short-circuit the conversion of `/dev/null` to UTF-16
There are a couple of reserved names that cannot be file names on
Windows, such as `AUX`, `NUL`, etc. For an almost complete list, see
https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file
If one would try to create a directory named `NUL`, it would actually
"succeed", i.e. the call would return success, but nothing would be
created.
Worse, even adding a file extension to the reserved name does not make
it a valid file name. To understand the rationale behind that behavior,
see https://devblogs.microsoft.com/oldnewthing/20031022-00/?p=42073
Let's just disallow them all.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the next commit, we want to disallow accessing any path that contains
any segment that is equivalent to `NUL`. In particular, we want to
disallow accessing `NUL` (e.g. to prevent any repository from being
checked out that contains a file called `NUL`, as that is not a valid
file name on Windows).
However, there are legitimate use cases within Git itself to write to
the Null device. As Git is really a Linux project, it does not abstract
that idea, though, but instead uses `/dev/null` to describe this
intention.
So let's side-step the validation _specifically_ in the case that we
want to write to (or read from) `/dev/null`, via a dedicated short-cut
in the code that skips the call to `validate_win32_path()`.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Avoid gmtime() and localtime() and prefer their reentrant
counterparts.
* dd/time-reentrancy:
mingw: use {gm,local}time_s as backend for {gm,local}time_r
archive-zip.c: switch to reentrant localtime_r
date.c: switch to reentrant {gm,local}time_r
Work around a issue where a FD that is left open when spawning a
child process and is kept open in the child can interfere with the
operation in the parent process on Windows.
* js/mingw-inherit-only-std-handles:
mingw: forbid translating ERROR_SUCCESS to an errno value
mingw: do set `errno` correctly when trying to restrict handle inheritance
mingw: restrict file handle inheritance only on Windows 7 and later
mingw: spawned processes need to inherit only standard handles
mingw: work around incorrect standard handles
mingw: demonstrate that all file handles are inherited by child processes
* maint-2.23: (44 commits)
Git 2.23.1
Git 2.22.2
Git 2.21.1
mingw: sh arguments need quoting in more circumstances
mingw: fix quoting of empty arguments for `sh`
mingw: use MSYS2 quoting even when spawning shell scripts
mingw: detect when MSYS2's sh is to be spawned more robustly
t7415: drop v2.20.x-specific work-around
Git 2.20.2
t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x
Git 2.19.3
Git 2.18.2
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
...
* maint-2.22: (43 commits)
Git 2.22.2
Git 2.21.1
mingw: sh arguments need quoting in more circumstances
mingw: fix quoting of empty arguments for `sh`
mingw: use MSYS2 quoting even when spawning shell scripts
mingw: detect when MSYS2's sh is to be spawned more robustly
t7415: drop v2.20.x-specific work-around
Git 2.20.2
t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x
Git 2.19.3
Git 2.18.2
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
...
* maint-2.21: (42 commits)
Git 2.21.1
mingw: sh arguments need quoting in more circumstances
mingw: fix quoting of empty arguments for `sh`
mingw: use MSYS2 quoting even when spawning shell scripts
mingw: detect when MSYS2's sh is to be spawned more robustly
t7415: drop v2.20.x-specific work-around
Git 2.20.2
t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x
Git 2.19.3
Git 2.18.2
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
...
Previously, we failed to quote characters such as '*', '(' and the
likes. Let's fix this.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When constructing command-lines to spawn processes, it is an unfortunate
but necessary decision to quote arguments differently: MSYS2 has
different dequoting rules (inherited from Cygwin) than the rest of
Windows.
To accommodate that, Git's Windows compatibility layer has two separate
quoting helpers, one for MSYS2 (which it uses exclusively when spawning
`sh`) and the other for regular Windows executables.
The MSYS2 one had an unfortunate bug where a `,` somehow slipped in,
instead of the `;`. As a consequence, empty arguments would not be
enclosed in a pair of double quotes, but the closing double quote was
skipped.
Let's fix this.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
At the point where `mingw_spawn_fd()` is called, we already have a full
path to the script interpreter in that scenario, and we pass it in as
the executable to run, while the `argv` reflect what the script should
receive as command-line.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
* maint-2.20: (36 commits)
Git 2.20.2
t7415: adjust test for dubiously-nested submodule gitdirs for v2.20.x
Git 2.19.3
Git 2.18.2
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
...
* maint-2.19: (34 commits)
Git 2.19.3
Git 2.18.2
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
protect_ntfs: turn on NTFS protection by default
path: also guard `.gitmodules` against NTFS Alternate Data Streams
...
* maint-2.18: (33 commits)
Git 2.18.2
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
protect_ntfs: turn on NTFS protection by default
path: also guard `.gitmodules` against NTFS Alternate Data Streams
is_ntfs_dotgit(): speed it up
...
* maint-2.17: (32 commits)
Git 2.17.3
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
protect_ntfs: turn on NTFS protection by default
path: also guard `.gitmodules` against NTFS Alternate Data Streams
is_ntfs_dotgit(): speed it up
mingw: disallow backslash characters in tree objects' file names
...
* maint-2.16: (31 commits)
Git 2.16.6
test-drop-caches: use `has_dos_drive_prefix()`
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
protect_ntfs: turn on NTFS protection by default
path: also guard `.gitmodules` against NTFS Alternate Data Streams
is_ntfs_dotgit(): speed it up
mingw: disallow backslash characters in tree objects' file names
path: safeguard `.git` against NTFS Alternate Streams Accesses
...
* maint-2.15: (29 commits)
Git 2.15.4
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
protect_ntfs: turn on NTFS protection by default
path: also guard `.gitmodules` against NTFS Alternate Data Streams
is_ntfs_dotgit(): speed it up
mingw: disallow backslash characters in tree objects' file names
path: safeguard `.git` against NTFS Alternate Streams Accesses
clone --recurse-submodules: prevent name squatting on Windows
is_ntfs_dotgit(): only verify the leading segment
...
* maint-2.14: (28 commits)
Git 2.14.6
mingw: handle `subst`-ed "DOS drives"
mingw: refuse to access paths with trailing spaces or periods
mingw: refuse to access paths with illegal characters
unpack-trees: let merged_entry() pass through do_add_entry()'s errors
quote-stress-test: offer to test quoting arguments for MSYS2 sh
t6130/t9350: prepare for stringent Win32 path validation
quote-stress-test: allow skipping some trials
quote-stress-test: accept arguments to test via the command-line
tests: add a helper to stress test argument quoting
mingw: fix quoting of arguments
Disallow dubiously-nested submodule git directories
protect_ntfs: turn on NTFS protection by default
path: also guard `.gitmodules` against NTFS Alternate Data Streams
is_ntfs_dotgit(): speed it up
mingw: disallow backslash characters in tree objects' file names
path: safeguard `.git` against NTFS Alternate Streams Accesses
clone --recurse-submodules: prevent name squatting on Windows
is_ntfs_dotgit(): only verify the leading segment
test-path-utils: offer to run a protectNTFS/protectHFS benchmark
...
While the only permitted drive letters for physical drives on Windows
are letters of the US-English alphabet, this restriction does not apply
to virtual drives assigned via `subst <letter>: <path>`.
To prevent targeted attacks against systems where "funny" drive letters
such as `1` or `!` are assigned, let's handle them as regular drive
letters on Windows.
This fixes CVE-2019-1351.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
On Windows, filenames cannot have trailing spaces or periods, when
opening such paths, they are stripped automatically. Read: you can open
the file `README` via the file name `README . . .`. This ambiguity can
be used in combination with other security bugs to cause e.g. remote
code execution during recursive clones. This patch series fixes that.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Over a decade ago, in 25fe217b86 (Windows: Treat Windows style path
names., 2008-03-05), Git was taught to handle absolute Windows paths,
i.e. paths that start with a drive letter and a colon.
Unbeknownst to us, while drive letters of physical drives are limited to
letters of the English alphabet, there is a way to assign virtual drive
letters to arbitrary directories, via the `subst` command, which is
_not_ limited to English letters.
It is therefore possible to have absolute Windows paths of the form
`1:\what\the\hex.txt`. Even "better": pretty much arbitrary Unicode
letters can also be used, e.g. `ä:\tschibät.sch`.
While it can be sensibly argued that users who set up such funny drive
letters really seek adverse consequences, the Windows Operating System
is known to be a platform where many users are at the mercy of
administrators who have their very own idea of what constitutes a
reasonable setup.
Therefore, let's just make sure that such funny paths are still
considered absolute paths by Git, on Windows.
In addition to Unicode characters, pretty much any character is a valid
drive letter, as far as `subst` is concerned, even `:` and `"` or even a
space character. While it is probably the opposite of smart to use them,
let's safeguard `is_dos_drive_prefix()` against all of them.
Note: `[::1]:repo` is a valid URL, but not a valid path on Windows.
As `[` is now considered a valid drive letter, we need to be very
careful to avoid misinterpreting such a string as valid local path in
`url_is_local_not_ssh()`. To do that, we use the just-introduced
function `is_valid_path()` (which will label the string as invalid file
name because of the colon characters).
This fixes CVE-2019-1351.
Reported-by: Nicolas Joly <Nicolas.Joly@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
When creating a directory on Windows whose path ends in a space or a
period (or chains thereof), the Win32 API "helpfully" trims those. For
example, `mkdir("abc ");` will return success, but actually create a
directory called `abc` instead.
This stems back to the DOS days, when all file names had exactly 8
characters plus exactly 3 characters for the file extension, and the
only way to have shorter names was by padding with spaces.
Sadly, this "helpful" behavior is a bit inconsistent: after a successful
`mkdir("abc ");`, a `mkdir("abc /def")` will actually _fail_ (because
the directory `abc ` does not actually exist).
Even if it would work, we now have a serious problem because a Git
repository could contain directories `abc` and `abc `, and on Windows,
they would be "merged" unintentionally.
As these paths are illegal on Windows, anyway, let's disallow any
accesses to such paths on that Operating System.
For practical reasons, this behavior is still guarded by the
config setting `core.protectNTFS`: it is possible (and at least two
regression tests make use of it) to create commits without involving the
worktree. In such a scenario, it is of course possible -- even on
Windows -- to create such file names.
Among other consequences, this patch disallows submodules' paths to end
in spaces on Windows (which would formerly have confused Git enough to
try to write into incorrect paths, anyway).
While this patch does not fix a vulnerability on its own, it prevents an
attack vector that was exploited in demonstrations of a number of
recently-fixed security bugs.
The regression test added to `t/t7417-submodule-path-url.sh` reflects
that attack vector.
Note that we have to adjust the test case "prevent git~1 squatting on
Windows" in `t/t7415-submodule-names.sh` because of a very subtle issue.
It tries to clone two submodules whose names differ only in a trailing
period character, and as a consequence their git directories differ in
the same way. Previously, when Git tried to clone the second submodule,
it thought that the git directory already existed (because on Windows,
when you create a directory with the name `b.` it actually creates `b`),
but with this patch, the first submodule's clone will fail because of
the illegal name of the git directory. Therefore, when cloning the
second submodule, Git will take a different code path: a fresh clone
(without an existing git directory). Both code paths fail to clone the
second submodule, both because the the corresponding worktree directory
exists and is not empty, but the error messages are worded differently.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Certain characters are not admissible in file names on Windows, even if
Cygwin/MSYS2 (and therefore, Git for Windows' Bash) pretend that they
are, e.g. `:`, `<`, `>`, etc
Let's disallow those characters explicitly in Windows builds of Git.
Note: just like trailing spaces or periods, it _is_ possible on Windows
to create commits adding files with such illegal characters, as long as
the operation leaves the worktree untouched. To allow for that, we
continue to guard `is_valid_win32_path()` behind the config setting
`core.protectNTFS`, so that users _can_ continue to do that, as long as
they turn the protections off via that config setting.
Among other problems, this prevents Git from trying to write to an "NTFS
Alternate Data Stream" (which refers to metadata stored alongside a
file, under a special name: "<filename>:<stream-name>"). This fix
therefore also prevents an attack vector that was exploited in
demonstrations of a number of recently-fixed security bugs.
Further reading on illegal characters in Win32 filenames:
https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
We need to be careful to follow proper quoting rules. For example, if an
argument contains spaces, we have to quote them. Double-quotes need to
be escaped. Backslashes need to be escaped, but only if they are
followed by a double-quote character.
We need to be _extra_ careful to consider the case where an argument
ends in a backslash _and_ needs to be quoted: in this case, we append a
double-quote character, i.e. the backslash now has to be escaped!
The current code, however, fails to recognize that, and therefore can
turn an argument that ends in a single backslash into a quoted argument
that now ends in an escaped double-quote character. This allows
subsequent command-line parameters to be split and part of them being
mistaken for command-line options, e.g. through a maliciously-crafted
submodule URL during a recursive clone.
Technically, we would not need to quote _all_ arguments which end in a
backslash _unless_ the argument needs to be quoted anyway. For example,
`test\` would not need to be quoted, while `test \` would need to be.
To keep the code simple, however, and therefore easier to reason about
and ensure its correctness, we now _always_ quote an argument that ends
in a backslash.
This addresses CVE-2019-1350.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Johannes Sixt pointed out that the `err_win_to_posix()` function
mishandles `ERROR_SUCCESS`: it maps it to `ENOSYS`.
The only purpose of this function is to map Win32 API errors to `errno`
ones, and there is actually no equivalent to `ERROR_SUCCESS`: the idea
of `errno` is that it will only be set in case of an error, and left
alone in case of success.
Therefore, as pointed out by Junio Hamano, it is a bug to call this
function when there was not even any error to map.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since Windows doesn't provide gmtime_r(3) and localtime_r(3),
we're providing a compat version by using non-reentrant gmtime(3) and
localtime(3) as backend. Then, we copy the returned data into the
buffer.
By doing that, in case of failure, we will dereference a NULL pointer
returned by gmtime(3), and localtime(3), and we always return a valid
pointer instead of NULL.
Drop the memcpy(3) by using gmtime_s(), and use localtime_s() as the
backend on Windows, and make sure we will return NULL in case of
failure.
Cc: Johannes Sixt <j6t@kdbg.org>
Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Doan Tran Cong Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Docfix.
* en/doc-typofix:
Fix spelling errors in no-longer-updated-from-upstream modules
multimail: fix a few simple spelling errors
sha1dc: fix trivial comment spelling error
Fix spelling errors in test commands
Fix spelling errors in messages shown to users
Fix spelling errors in names of tests
Fix spelling errors in comments of testcases
Fix spelling errors in code comments
Fix spelling errors in documentation outside of Documentation/
Documentation: fix a bunch of typos, both old and new
In 9a780a384d (mingw: spawned processes need to inherit only standard
handles, 2019-11-22), we taught the Windows-specific part to restrict
which file handles are passed on to the spawned processes.
Since this logic seemed to be a bit fragile across Windows versions (we
_still_ support Windows Vista in Git for Windows, for example), a
fall-back was added to try spawning the process again, this time without
restricting which file handles are to be inherited by the spawned
process.
In the common case (i.e. when the process could not be spawned for
reasons _other_ than the file handle inheritance), the fall-back attempt
would still fail, of course.
Crucially, one thing we missed in that code path was to set `errno`
appropriately.
This should have been caught by t0061.2 which expected `errno` to be
`ENOENT` after trying to start a process for a non-existing executable,
but `errno` was set to `ENOENT` prior to the `CreateProcessW()` call:
while looking for the config settings for trace2, Git tries to access
`xdg_config` and `user_config` via `access_or_die()`, and as neither of
those config files exists when running the test case (because in Git's
test suite, `HOME` points to the test directory), the `errno` has the
expected value, but for the wrong reasons.
Let's fix that by making sure that `errno` is set correctly. It even
appears that `errno` was set in the _wrong_ case previously:
`CreateProcessW()` returns non-zero upon success, but `errno` was set
only in the non-zero case.
It would be nice if we could somehow fix t0061 to make sure that this
does not regress again. One approach that seemed like it should work,
but did not, was to set `errno` to 0 in the test helper that is used by
t0061.2.
However, when `mingw_spawnvpe()` wants to see whether the file in
question is a script, it calls `parse_interpreter()`, which in turn
tries to `open()` the file. Obviously, this call fails, and sets `errno`
to `ENOENT`, deep inside the call chain started from that test helper.
Instead, we force re-set `errno` at the beginning of the function
`mingw_spawnve_fd()`, which _should_ be safe given that callers of that
function will want to look at `errno` if -1 was returned. And if that
`errno` is 0 ("No error"), regression tests like t0061.2 will kick in.
Reported-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Turns out that it don't work so well on Vista, see
https://github.com/git-for-windows/git/issues/1742 for details.
According to https://devblogs.microsoft.com/oldnewthing/?p=8873, it
*should* work on Windows Vista and later.
But apparently there are issues on Windows Vista when pipes are
involved. Given that Windows Vista is past its end of life (official
support ended on April 11th, 2017), let's not spend *too* much time on
this issue and just disable the file handle inheritance restriction on
any Windows version earlier than Windows 7.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>