git/compat
Alexandr Miloslavskiy 94f4d01932 mingw: workaround for hangs when sending STDIN
Explanation
-----------
The problem here is flawed `poll()` implementation. When it tries to
see if pipe can be written without blocking, it eventually calls
`NtQueryInformationFile()` and tests `WriteQuotaAvailable`. However,
the meaning of quota was misunderstood. The value of quota is reduced
when either some data was written to a pipe, *or* there is a pending
read on the pipe. Therefore, if there is a pending read of size >= than
the pipe's buffer size, poll() will think that pipe is not writable and
will hang forever, usually that means deadlocking both pipe users.

I have studied the problem and found that Windows pipes track two values:
`QuotaUsed` and `BytesInQueue`. The code in `poll()` apparently wants to
know `BytesInQueue` instead of quota. Unfortunately, `BytesInQueue` can
only be requested from read end of the pipe, while `poll()` receives
write end.

The git's implementation of `poll()` was copied from gnulib, which also
contains a flawed implementation up to today.

I also had a look at implementation in cygwin, which is also broken in a
subtle way. It uses this code in `pipe_data_available()`:
	fpli.WriteQuotaAvailable = (fpli.OutboundQuota - fpli.ReadDataAvailable)
However, `ReadDataAvailable` always returns 0 for the write end of the pipe,
turning the code into an obfuscated version of returning pipe's total
buffer size, which I guess will in turn have `poll()` always say that pipe
is writable. The commit that introduced the code doesn't say anything about
this change, so it could be some debugging code that slipped in.

These are the typical sizes used in git:
0x2000 - default read size in `strbuf_read()`
0x1000 - default read size in CRT, used by `strbuf_getwholeline()`
0x2000 - pipe buffer size in compat\mingw.c

As a consequence, as soon as child process uses `strbuf_read()`,
`poll()` in parent process will hang forever, deadlocking both
processes.

This results in two observable behaviors:
1) If parent process begins sending STDIN quickly (and usually that's
   the case), then first `poll()` will succeed and first block will go
   through. MAX_IO_SIZE_DEFAULT is 8MB, so if STDIN exceeds 8MB, then
   it will deadlock.
2) If parent process waits a little bit for any reason (including OS
   scheduler) and child is first to issue `strbuf_read()`, then it will
   deadlock immediately even on small STDINs.

The problem is illustrated by `git stash push`, which will currently
read the entire patch into memory and then send it to `git apply` via
STDIN. If patch exceeds 8MB, git hangs on Windows.

Possible solutions
------------------
1) Somehow obtain `BytesInQueue` instead of `QuotaUsed`
   I did a pretty thorough search and didn't find any ways to obtain
   the value from write end of the pipe.
2) Also give read end of the pipe to `poll()`
   That can be done, but it will probably invite some dirty code,
   because `poll()`
   * can accept multiple pipes at once
   * can accept things that are not pipes
   * is expected to have a well known signature.
3) Make `poll()` always reply "writable" for write end of the pipe
   Afterall it seems that cygwin (accidentally?) does that for years.
   Also, it should be noted that `pump_io_round()` writes 8MB blocks,
   completely ignoring the fact that pipe's buffer size is only 8KB,
   which means that pipe gets clogged many times during that single
   write. This may invite a deadlock, if child's STDERR/STDOUT gets
   clogged while it's trying to deal with 8MB of STDIN. Such deadlocks
   could be defeated with writing less than pipe's buffer size per
   round, and always reading everything from STDOUT/STDERR before
   starting next round. Therefore, making `poll()` always reply
   "writable" shouldn't cause any new issues or block any future
   solutions.
4) Increase the size of the pipe's buffer
   The difference between `BytesInQueue` and `QuotaUsed` is the size
   of pending reads. Therefore, if buffer is bigger than size of reads,
   `poll()` won't hang so easily. However, I found that for example
   `strbuf_read()` will get more and more hungry as it reads large inputs,
   eventually surpassing any reasonable pipe buffer size.

Chosen solution
---------------
Make `poll()` always reply "writable" for write end of the pipe.
Hopefully one day someone will find a way to implement it properly.

Reproduction
------------
printf "%8388608s" X >large_file.txt
git stash push --include-untracked -- large_file.txt

I have decided not to include this as test to avoid slowing down the
test suite. I don't expect the specific problem to come back, and
chances are that `git stash push` will be reworked to avoid sending the
entire patch via STDIN.

Signed-off-by: Alexandr Miloslavskiy <alexandr.miloslavskiy@syntevo.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-02-27 14:23:29 -08:00
..
nedmalloc nedmalloc: avoid compiler warning about unused value 2019-08-07 11:54:55 -07:00
poll mingw: workaround for hangs when sending STDIN 2020-02-27 14:23:29 -08:00
regex compat/regex/regcomp.c: define intptr_t and uintptr_t on NonStop 2019-01-03 14:16:20 -08:00
vcbuild msvc: accommodate for vcpkg's upgrade to OpenSSL v1.1.x 2020-01-16 12:18:23 -08:00
win32 Sync with 2.23.1 2019-12-06 16:31:39 +01:00
access.c git-compat-util: work around for access(X_OK) under root 2019-04-25 17:49:44 +09:00
apple-common-crypto.h imap-send: use HMAC() function provided by OpenSSL 2016-04-08 11:45:47 -07:00
basename.c compat/basename.c: provide a dirname() compatibility function 2016-01-12 10:40:54 -08:00
bswap.h compat/bswap: add include header guards 2019-03-07 07:42:14 +09:00
fileno.c git-compat-util: work around for access(X_OK) under root 2019-04-25 17:49:44 +09:00
fopen.c git_fopen: fix a sparse 'not declared' warning 2017-05-26 12:33:55 +09:00
gmtime.c date: recognize bogus FreeBSD gmtime output 2014-04-01 14:39:04 -07:00
hstrerror.c compat/hstrerror: convert sprintf to snprintf 2015-09-25 10:18:18 -07:00
inet_ntop.c compat/inet_ntop: fix off-by-one in inet_ntop4 2015-09-25 10:18:18 -07:00
inet_pton.c Drop system includes from inet_pton/inet_ntop compatibility wrappers 2012-02-05 16:32:33 -08:00
memmem.c
mingw.c Sync with 2.23.1 2019-12-06 16:31:39 +01:00
mingw.h Sync with 2.23.1 2019-12-06 16:31:39 +01:00
mkdir.c compat: some mkdir() do not like a slash at the end 2012-08-24 09:48:51 -07:00
mkdtemp.c
mmap.c compat: make sure git_mmap is not expected to write 2018-10-25 18:51:03 +09:00
msvc.c win32: use our own dirent.h 2010-11-23 16:06:50 -08:00
msvc.h msvc: add pragmas for common warnings 2019-06-25 10:46:57 -07:00
obstack.c compat/obstack: fix -Wcast-function-type warnings 2019-01-17 11:13:38 -08:00
obstack.h obstack: fix compiler warning 2019-06-20 14:03:05 -07:00
pread.c
precompose_utf8.c Support working-tree-encoding "UTF-16LE-BOM" 2019-01-31 10:27:52 -08:00
precompose_utf8.h compat/precompose_utf8.h: use more common include guard style 2018-08-15 11:52:09 -07:00
qsort_s.c compat: add qsort_s() 2017-01-23 11:02:34 -08:00
setenv.c use st_add and st_mult for allocation size computation 2016-02-22 14:51:09 -08:00
sha1-chunked.c sha1: allow limiting the size of the data passed to SHA1_Update() 2015-11-05 10:35:11 -08:00
sha1-chunked.h sha1: allow limiting the size of the data passed to SHA1_Update() 2015-11-05 10:35:11 -08:00
snprintf.c MSVC: vsnprintf in Visual Studio 2015 doesn't need SNPRINTF_SIZE_CORR any more 2016-03-30 11:13:01 -07:00
stat.c compat: convert modes to use portable file type values 2014-12-04 11:58:36 -08:00
strcasestr.c
strdup.c compat: move strdup(3) replacement to its own file 2016-09-07 10:41:45 -07:00
strlcpy.c
strtoimax.c Add strtoimax() compatibility function. 2011-11-02 13:06:30 -07:00
strtoumax.c
terminal.c strbuf: introduce strbuf_getline_{lf,nul}() 2016-01-15 10:12:51 -08:00
terminal.h add generic terminal prompt function 2011-12-12 16:09:38 -08:00
unsetenv.c Revert "compat/unsetenv.c: Fix a sparse warning" 2013-07-21 15:09:56 -07:00
win32.h mingw: rename WIN32 cpp macro to GIT_WINDOWS_NATIVE 2013-05-08 12:14:35 -07:00
win32mmap.c mmap(win32): avoid expensive fstat() call 2016-04-22 15:01:16 -07:00
winansi.c winansi: use FLEX_ARRAY to avoid compiler warning 2019-10-06 09:07:44 +09:00