mirrors/git

mirror of https://github.com/git/git.git synced 2024-11-23 18:05:29 +08:00

Author	SHA1	Message	Date
René Scharfe	0283cd5161	don't report vsnprintf(3) error as bug strbuf_addf() has been reporting a negative return value of vsnprintf(3) as a bug since `f141bd804d` (Handle broken vsnprintf implementations in strbuf, 2007-11-13). Other functions copied that behavior: `7b03c89ebd` (add xsnprintf helper function, 2015-09-24) `5ef264dbdb` (strbuf.c: add `strbuf_insertf()` and `strbuf_vinsertf()`, 2019-02-25) `8d25663d70` (mem-pool: add mem_pool_strfmt(), 2024-02-25) However, vsnprintf(3) can legitimately return a negative value if the formatted output would be longer than INT_MAX. Stop accusing it of being broken and just report the fact that formatting failed. Suggested-by: Jeff King <peff@peff.net> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-04-21 12:27:07 -07:00
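The distinction drawn in this commit can be sketched in C as follows. This is a minimal illustration, not Git's code: `format_checked()` and its error messages are hypothetical, and it returns -1 where Git's helpers would die.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not Git's xsnprintf()): format into dst and
 * return the length written, or -1 if formatting failed or did not fit.
 */
static int format_checked(char *dst, size_t max, const char *fmt, ...)
{
	va_list ap;
	int len;

	va_start(ap, fmt);
	len = vsnprintf(dst, max, fmt, ap);
	va_end(ap);

	if (len < 0) {
		/* A legitimate failure, e.g. output longer than INT_MAX. */
		fprintf(stderr, "error: unable to format message: %s\n", fmt);
		return -1;
	}
	if ((size_t)len >= max) {
		/* The output was truncated; treat that as a failure too. */
		fprintf(stderr, "error: formatted output did not fit: %s\n", fmt);
		return -1;
	}
	return len;
}
```

A negative return is reported as a plain formatting failure rather than as evidence of a broken C library.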
Elijah Newren	eea0e59ffb	treewide: remove unnecessary includes in source files Each of these were checked with gcc -E -I. ${SOURCE_FILE} \| grep ${HEADER_FILE} to ensure that removing the direct inclusion of the header actually resulted in that header no longer being included at all (i.e. that no other header pulled it in transitively). ...except for a few cases where we verified that although the header was brought in transitively, nothing from it was directly used in that source file. These cases were: * builtin/credential-cache.c * builtin/pull.c * builtin/send-pack.c Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-26 12:04:31 -08:00
Calvin Wan	b1bda75173	parse: separate out parsing functions from config.h The files config.{h,c} contain functions that have to do with parsing, but not config. In order to further reduce all-in-one headers, separate out functions in config.c that do not operate on config into its own file, parse.h, and update the include directives in the .c files that need only such functions accordingly. Signed-off-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-09-29 15:14:57 -07:00
Calvin Wan	afd2a1d5f1	wrapper: reduce scope of remove_or_warn() remove_or_warn() is only used by entry.c and apply.c, but it is currently declared and defined in wrapper.{h,c}, so it has a scope much greater than it needs. This needlessly large scope also causes wrapper.c to need to include object.h, when this file is largely unconcerned with Git objects. Move remove_or_warn() to entry.{h,c}. The file apply.c still has access to it, since it already includes entry.h for another reason. Signed-off-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-09-29 15:14:56 -07:00
Derrick Stolee	89024a0ab0	maintenance: add get_random_minute() When we initially created background maintenance -- with its hourly, daily, and weekly schedules -- we considered the effects of all clients launching fetches to the server every hour on the hour. The worry of DDoSing server hosts was noted, but left as something we would consider for a future update. As background maintenance has gained more adoption over the past three years, our worries about DDoSing the big Git hosts have been unfounded. Those systems, especially those serving public repositories, are already resilient to thundering herds of much smaller scale. However, sometimes organizations spin up specific custom server infrastructure either in addition to or on top of their Git host. Some of these technologies are built for a different range of scale, and can hit concurrency limits sooner. Organizations with such custom infrastructures are more likely to recommend tools like `scalar` which furthers their adoption of background maintenance. To help solve for this, create get_random_minute() as a method to help Git select a random minute when creating schedules in the future. The integrations with this method do not yet exist, but will follow in future changes. To avoid multiple sources of randomness in the Git codebase, create a new helper function, git_rand(), that returns a random uint32_t. This is similar to how rand() returns a random nonnegative value, except it is based on csprng_bytes() which is cryptographic and will return values larger than RAND_MAX. One thing that is important for testability is that we notice when we are under a test scenario and return a predictable result. The schedules themselves are not checked for this value, but at least one launchctl test checks that we do not unnecessarily reboot the schedule if it has not changed from a previous version. 
Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-08-10 14:04:16 -07:00
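The idea of a random minute that becomes predictable under test can be sketched like this. It is an illustration under stated assumptions: `random_u32()` stands in for a CSPRNG-backed helper such as the git_rand() described above, and the environment variable name `SKETCH_FIXED_MINUTE` and the fixed value 13 are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Stand-in for a CSPRNG-backed helper (assumption): read four bytes
 * from /dev/urandom, falling back to rand() if that is unavailable.
 */
static uint32_t random_u32(void)
{
	uint32_t v = 0;
	FILE *f = fopen("/dev/urandom", "rb");

	if (f) {
		size_t n = fread(&v, sizeof(v), 1, f);
		fclose(f);
		if (n == 1)
			return v;
	}
	return (uint32_t)rand();
}

/* Return a minute in [0, 59]; fixed when the test variable is set. */
static int random_minute(void)
{
	/* Hypothetical variable name; tests set it to get a stable value. */
	if (getenv("SKETCH_FIXED_MINUTE"))
		return 13;
	return (int)(random_u32() % 60);
}
```

The modulo introduces a very small bias because 2^32 is not a multiple of 60, which is harmless when the goal is only to spread clients across the hour.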
Junio C Hamano	52d9dc20e1	Merge branch 'bb/use-trace2-counters-for-fsync-stats' Instead of inventing custom counter variables for debugging, use the existing trace2 facility in the fsync customization codepath. * bb/use-trace2-counters-for-fsync-stats: wrapper: use trace2 counters to collect fsync stats	2023-08-02 09:37:23 -07:00
Junio C Hamano	0e30958044	Merge branch 'mh/mingw-case-sensitive-build' Names of MinGW header files are spelled in mixed case in some source files, but the build host can be using case sensitive filesystem with header files with their name spelled in all lowercase. * mh/mingw-case-sensitive-build: mingw: use lowercase includes for some Windows headers	2023-07-25 12:05:23 -07:00
Beat Bolli	a27eecea75	wrapper: use trace2 counters to collect fsync stats As mentioned in the thread starting at [1], trace2 counters should be used to count events instead of ad-hoc static variables. Convert the two fsync static variables to trace2 counters, reducing the coupling between wrapper.c and the trace2 subsystem. Adjust t/t5351 to match the trace2 counter output format. The counters are not per-thread because the ones being replaced also were not. [1] https://lore.kernel.org/git/20230627195251.1973421-2-calvinwan@google.com/ Signed-off-by: Beat Bolli <dev+git@drbeat.li> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-07-20 11:52:53 -07:00
Calvin Wan	da9502ff4d	treewide: remove unnecessary includes for wrapper.h Signed-off-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-07-05 11:41:59 -07:00
Mike Hommey	4a53d0d0bc	mingw: use lowercase includes for some Windows headers When cross-compiling with the mingw toolchain on a system with a case sensitive filesystem, the mixed case (which is technically correct as per the contents of MS Visual C++) doesn't work (the corresponding mingw headers are all lowercase for some reason). Signed-off-by: Mike Hommey <mh@glandium.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-12 12:31:52 -07:00
Elijah Newren	5e3f94dfe3	treewide: remove cache.h inclusion due to previous changes Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-24 12:47:33 -07:00
Elijah Newren	d1cbe1e6d8	hash-ll.h: split out of hash.h to remove dependency on repository.h hash.h depends upon and includes repository.h, due to the definition and use of the_hash_algo (defined as the_repository->hash_algo). However, most headers trying to include hash.h are only interested in the layout of the structs like object_id. Move the parts of hash.h that do not depend upon repository.h into a new file hash-ll.h (the "low level" parts of hash.h), and adjust other files to use this new header where the convenience inline functions aren't needed. This allows hash.h and object.h to be fairly small, minimal headers. It also exposes a lot of hidden dependencies on both path.h (which was brought in by repository.h) and repository.h (which was previously implicitly brought in by object.h), so also adjust other files to be more explicit about what they depend upon. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-24 12:47:32 -07:00
Elijah Newren	69a63fe663	treewide: be explicit about dependence on strbuf.h Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-24 12:47:31 -07:00
Elijah Newren	74ea5c9574	treewide: be explicit about dependence on trace.h & trace2.h Dozens of files made use of trace and trace2 functions, without explicitly including trace.h or trace2.h. This made it more difficult to find which files could remove a dependence on cache.h. Make C files explicitly include trace.h or trace2.h if they are using them. Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-11 08:52:08 -07:00
Elijah Newren	d5ebb50dcb	wrapper.h: move declarations for wrapper.c functions from cache.h Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-21 10:56:53 -07:00
Elijah Newren	0b027f6ca7	abspath.h: move absolute path functions from cache.h This is another step towards letting us remove the include of cache.h in strbuf.c. It does mean that we also need to add includes of abspath.h in a number of C files. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-21 10:56:52 -07:00
Elijah Newren	f394e093df	treewide: be explicit about dependence on gettext.h Dozens of files made use of gettext functions, without explicitly including gettext.h. This made it more difficult to find which files could remove a dependence on cache.h. Make C files explicitly include gettext.h if they are using it. However, while compat/fsmonitor/fsm-ipc-darwin.c should also gain an include of gettext.h, it was left out to avoid conflicting with an in-flight topic. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-21 10:56:51 -07:00
Junio C Hamano	a103ad6f3d	Merge branch 'jk/pipe-command-nonblock' Fix deadlocks between main Git process and subprocess spawned via the pipe_command() API, that can kill "git add -p" that was reimplemented in C recently. * jk/pipe-command-nonblock: pipe_command(): mark stdin descriptor as non-blocking pipe_command(): handle ENOSPC when writing to a pipe pipe_command(): avoid xwrite() for writing to pipe git-compat-util: make MAX_IO_SIZE define globally available nonblock: support Windows compat: add function to enable nonblocking pipes	2022-08-25 14:42:32 -07:00
Jeff King	ec4f39b233	git-compat-util: make MAX_IO_SIZE define globally available We define MAX_IO_SIZE within wrapper.c, but it's useful for any code that wants to do a raw write() for whatever reason (say, because they want different EAGAIN handling). Let's make it available everywhere. The alternative would be adding xwrite_foo() variants to give callers more options. But there's really no reason MAX_IO_SIZE needs to be abstracted away, so this gives callers the most flexibility. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-17 09:21:40 -07:00
Ævar Arnfjörð Bjarmason	3a251bac0d	trace2: only include "fsync" events if we git_fsync() Fix the overly verbose trace2 logging added in `9a4987677d` (trace2: add stats for fsync operations, 2022-03-30) (first released with v2.36.0). Since that change every single "git" command invocation has included these "data" events, even though we'll only make use of these with core.fsyncMethod=batch, and even then only have non-zero values if we're writing object data to disk. See `c0f4752ed2` (core.fsyncmethod: batched disk flushes for loose-objects, 2022-04-04) for that feature. As we're needing to indent the trace2_data_intmax() lines let's introduce helper variables to ensure that our resulting lines (which were already too long) don't exceed the recommendations of the CodingGuidelines. Doing that requires either wrapping them twice, or introducing short throwaway variable names, let's do the latter. The result was that e.g. "git version" would previously emit a total of 6 trace2 events with the GIT_TRACE2_EVENT target (version, start, cmd_ancestry, cmd_name, exit, atexit), but afterwards would emit 8. We'd emit 2 "data" events before the "exit" event. The reason we didn't catch this was that the trace2 unit tests added in `a15860dca3` (trace2: t/helper/test-trace2, t0210.sh, t0211.sh, t0212.sh, 2019-02-22) would omit any "data" events that weren't the ones it cared about. Before this change to the C code 6/7 of our "t/t0212-trace2-event.sh" tests would fail if this change was applied to "t/t0212/parse_events.perl". Let's make the trace2 testing more strict, and further append any new events types we don't know about in "t/t0212/parse_events.perl". Since we only invoke the "test-tool trace2" there's no guarantee that we'll catch other overly verbose events in the future, but we'll at least notice if we start emitting new events that are issued every time we log anything with trace2's JSON target. 
We exclude the "data_json" event type, as we would otherwise fail on both "win test" and "win+VS test" CI due to the logging added in `353d3d77f4` (trace2: collect Windows-specific process information, 2019-02-22). It looks like that logging should really be using trace2_cmd_ancestry() instead, which was introduced later in `2f732bf15e` (tr2: log parent process name, 2021-07-21), but let's leave it for now. The fix-up to `aaf81223f4` (unpack-objects: use stream_loose_object() to unpack large objects, 2022-06-11) is needed because we're changing the behavior of these events as discussed above. Since we'd always emit a "hardware-flush" event the test added in `aaf81223f4` wasn't testing anything except that this trace2 data was unconditionally logged. Even if "core.fsyncMethod" wasn't set to "batch" we'd pass the test. Now we'll check the expected number of "writeout" vs. "flush" calls under "core.fsyncMethod=batch", but note that this doesn't actually test if we carried out the sync using that method, on a platform where we'd have to fall back to fsync() each of those "writeout" would really be a "flush" (i.e. a full fsync()). But in this case what we're testing is that the logic in "unpack-objects" behaves as expected, not the OS-specific question of whether we actually were able to use the "bulk" method. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-07-18 09:41:57 -07:00
Junio C Hamano	538dc459a0	Merge branch 'ep/maint-equals-null-cocci' Introduce and apply a coccinelle rule to discourage an explicit comparison between a pointer and NULL, and apply the clean-up to the maintenance track. * ep/maint-equals-null-cocci: tree-wide: apply equals-null.cocci tree-wide: apply equals-null.cocci contrib/coccinnelle: add equals-null.cocci	2022-05-20 15:26:59 -07:00
Junio C Hamano	afe8a9070b	tree-wide: apply equals-null.cocci Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-02 09:50:37 -07:00
Neeraj Singh	9a4987677d	trace2: add stats for fsync operations Add some global trace2 statistics for the number of fsyncs performed during the lifetime of a Git process. These stats are printed as part of trace2_cmd_exit_fl, which is presumably where we might want to print any other cross-cutting statistics. Signed-off-by: Neeraj Singh <neerajsi@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-03-30 11:15:55 -07:00
Neeraj Singh	abf38abec2	core.fsyncmethod: add writeout-only mode This commit introduces the `core.fsyncMethod` configuration knob, which can currently be set to `fsync` or `writeout-only`. The new writeout-only mode attempts to tell the operating system to flush its in-memory page cache to the storage hardware without issuing a CACHE_FLUSH command to the storage controller. Writeout-only fsync is significantly faster than a vanilla fsync on common hardware, since data is written to a disk-side cache rather than all the way to a durable medium. Later changes in this patch series will take advantage of this primitive to implement batching of hardware flushes. When git_fsync is called with FSYNC_WRITEOUT_ONLY, it may fail and the caller is expected to do an ordinary fsync as needed. On Apple platforms, the fsync system call does not issue a CACHE_FLUSH directive to the storage controller. This change updates fsync to do fcntl(F_FULLFSYNC) to make fsync actually durable. We maintain parity with existing behavior on Apple platforms by setting the default value of the new core.fsyncMethod option. Signed-off-by: Neeraj Singh <neerajsi@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-03-10 15:10:22 -08:00
Neeraj Singh	19d3f228c8	wrapper: make inclusion of Windows csprng header tightly scoped Including NTSecAPI.h in git-compat-util.h causes build errors in any other file that includes winternl.h. NTSecAPI.h was included in order to get access to the RtlGenRandom cryptographically secure PRNG. This change scopes the inclusion of ntsecapi.h to wrapper.c, which is the only place that it's actually needed. The build breakage is due to the definition of UNICODE_STRING in NtSecApi.h: #ifndef _NTDEF_ typedef LSA_UNICODE_STRING UNICODE_STRING, PUNICODE_STRING; typedef LSA_STRING STRING, PSTRING ; #endif LsaLookup.h: typedef struct _LSA_UNICODE_STRING { USHORT Length; USHORT MaximumLength; #ifdef MIDL_PASS [size_is(MaximumLength/2), length_is(Length/2)] #endif // MIDL_PASS PWSTR Buffer; } LSA_UNICODE_STRING, PLSA_UNICODE_STRING; winternl.h also defines UNICODE_STRING: typedef struct _UNICODE_STRING { USHORT Length; USHORT MaximumLength; PWSTR Buffer; } UNICODE_STRING; typedef UNICODE_STRING PUNICODE_STRING; Both definitions have equivalent layouts. Apparently these internal Windows headers aren't designed to be included together. This is an oversight in the headers and does not represent an incompatibility between the APIs. Signed-off-by: Neeraj Singh <neerajsi@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-03-10 15:10:22 -08:00
brian m. carlson	47efda967c	wrapper: use a CSPRNG to generate random file names The current way we generate random file names is by taking the seconds and microseconds, plus the PID, and mixing them together, then encoding them. If this fails, we increment the value by 7777, and try again up to TMP_MAX times. Unfortunately, this is not the best idea from a security perspective. If we're writing into TMPDIR, an attacker can guess these values easily and prevent us from creating any temporary files at all by creating them all first. Even though we set TMP_MAX to 16384, this may be achievable in some contexts, even if unlikely to occur in practice. Fortunately, we can simply solve this by using the system cryptographically secure pseudorandom number generator (CSPRNG) to generate a random 64-bit value, and use that as before. Note that there is still a small bias here, but because a six-character sequence chosen out of 62 characters provides about 36 bits of entropy, the bias here is less than 2^-28, which is acceptable, especially considering we'll retry several times. Note that the use of a CSPRNG in generating temporary file names is also used in many libcs. glibc recently changed from an approach similar to ours to using a CSPRNG, and FreeBSD and OpenBSD also use a CSPRNG in this case. Even if the likelihood of an attack is low, we should still be at least as responsible in creating temporary files as libc is. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-01-17 14:17:51 -08:00
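How a random 64-bit value can be turned into a six-character suffix over 62 characters might look roughly like the sketch below. The function name `fill_suffix()` and the ordering of the alphabet are assumptions for illustration, and the value is passed in so that the example stays deterministic; a real caller would obtain it from the system CSPRNG.

```c
#include <stddef.h>
#include <stdint.h>

/* 62-character alphabet (the ordering here is an assumption). */
static const char letters[] =
	"abcdefghijklmnopqrstuvwxyz"
	"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
	"0123456789";

#define NUM_LETTERS (sizeof(letters) - 1)	/* 62, excluding the NUL */
#define NUM_X 6					/* length of "XXXXXX" */

/*
 * Overwrite NUM_X bytes at dst with characters derived from v,
 * consuming v as a base-62 number, least significant digit first.
 */
static void fill_suffix(char *dst, uint64_t v)
{
	size_t i;

	for (i = 0; i < NUM_X; i++) {
		dst[i] = letters[v % NUM_LETTERS];
		v /= NUM_LETTERS;
	}
}
```

Six characters drawn from 62 give 62^6 (about 5.7 * 10^10) possible names, which is the "about 36 bits" mentioned in the message; since 2^64 is not a multiple of 62^6, some names are very slightly more likely than others, which is the small bias the message refers to.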
brian m. carlson	05cd988dce	wrapper: add a helper to generate numbers from a CSPRNG There are many situations in which having access to a cryptographically secure pseudorandom number generator (CSPRNG) is helpful. In the future, we'll encounter one of these when dealing with temporary files. To make this possible, let's add a function which reads from a system CSPRNG and returns some bytes. We know that all systems will have such an interface. A CSPRNG is required for a secure TLS or SSH implementation and a Git implementation which provided neither would be of little practical use. In addition, POSIX is set to standardize getentropy(2) in the next version, so in the (potentially distant) future we can rely on that. For systems which lack one of the other interfaces, we provide the ability to use OpenSSL's CSPRNG. OpenSSL is highly portable and functions on practically every known OS, and we know it will have access to some source of cryptographically secure randomness. We also provide support for the arc4random in libbsd for folks who would prefer to use that. Because this is a security sensitive interface, we take some precautions. We either succeed by filling the buffer completely as we requested, or we fail. We don't return partial data because the caller will almost never find that to be a useful behavior. Specify a makefile knob which users can use to specify one or more suitable CSPRNGs, and turn the multiple string options into a set of defines, since we cannot match on strings in the preprocessor. We allow multiple options to make the job of handling this in autoconf easier. The order of options is important here. On systems with arc4random, which is most of the BSDs, we use that, since, except on MirBSD and macOS, it uses ChaCha20, which is extremely fast, and sits entirely in userspace, avoiding a system call. We then prefer getrandom over getentropy, because the former has been available longer on Linux, and then OpenSSL. 
Finally, if none of those are available, we use /dev/urandom, because most Unix-like operating systems provide that API. We prefer options that don't involve device files when possible because those work in some restricted environments where device files may not be available. Set the configuration variables appropriately for Linux and the BSDs, including macOS, as well as Windows and NonStop. We specifically only consider versions which receive publicly available security support here. For the same reason, we don't specify getrandom(2) on Linux, because CentOS 7 doesn't support it in glibc (although its kernel does) and we don't want to resort to making syscalls. Finally, add a test helper to allow this to be tested by hand and in tests. We don't add any tests, since invoking the CSPRNG is not likely to produce interesting, reproducible results. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-01-17 14:17:48 -08:00
Carlo Marcelo Arenas Belón	6fc527a8d0	wrapper: remove xunsetenv() Remove the unused wrapper function. Reported-by: Randall S. Becker <rsbecker@nexbridge.com> Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-29 14:59:29 -07:00
Ævar Arnfjörð Bjarmason	3540c71ea5	wrapper.c: add x{un,}setenv(), and use xsetenv() in environment.c Add fatal wrappers for setenv() and unsetenv(). In `d7ac12b25d` (Add set_git_dir() function, 2007-08-01) we started checking its return value, and since `48988c4d0c` (set_git_dir: die when setenv() fails, 2018-03-30) we've had set_git_dir_1() die if we couldn't set it. Let's provide a wrapper for both, this will be useful in many other places, a subsequent patch will make another use of xsetenv(). The checking of the return value here is over-eager according to setenv(3) and POSIX. It's documented as returning just -1 or 0, so perhaps we should be checking -1 explicitly. Let's just instead die on any non-zero, if our C library is so broken as to return something else than -1 on error (and perhaps not set errno?) the worst we'll do is die with a nonsensical errno value, but we'll want to die in either case. Let's make these return "void" instead of "int". As far as I can tell there are no other x*() wrappers that needed to make the decision of deviating from the signature in the C library, but since their return value is only used to indicate errors (so we'd die here), we can catch unreachable code such as if (xsetenv(...) < 0) [...]; I think it would be OK to skip the NULL check of the "name" here for the calls to die_errno(). Almost all of our setenv() callers are taking a constant string hardcoded in the source as the first argument, and for the rest we can probably assume they've done the NULL check themselves. Even if they didn't, modern C libraries are forgiving about it (e.g. glibc formatting it as "(null)"), on those that aren't, well, we were about to die anyway. But let's include the check anyway for good measure. 1. https://pubs.opengroup.org/onlinepubs/009604499/functions/setenv.html Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-22 13:15:00 -07:00
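A fatal setenv() wrapper with a "void" return type, as described in this message, could be sketched as below. The name `setenv_or_die()`, the message wording, and the use of exit() in place of Git's die_errno() are assumptions made to keep the example self-contained.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical stand-in for a fatal setenv() wrapper. It returns void,
 * so callers cannot write dead error-handling code around it.
 */
static void setenv_or_die(const char *name, const char *value, int overwrite)
{
	/* Die on any non-zero return, not only on -1. */
	if (setenv(name, value, overwrite)) {
		fprintf(stderr, "fatal: could not setenv '%s': %s\n",
			name ? name : "(null)", strerror(errno));
		exit(128);
	}
}
```

Because the function returns nothing, a call such as `if (setenv_or_die(...) < 0)` fails to compile, which is the unreachable-code check the message mentions.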
René Scharfe	a7439d0f9d	xopen: explicitly report creation failures If the flags O_CREAT and O_EXCL are both given then open(2) is supposed to create the file and error out if it already exists. The error message in that case looks like this: fatal: could not open 'foo' for writing: File exists Without further context this is confusing: Why should the existence of the file pose a problem? Isn't that a requirement for writing to it? Add a more specific error message for that case to tell the user that we actually don't expect the file to preexist, so the example becomes: fatal: unable to create 'foo': File exists Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-08-25 14:39:06 -07:00
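The message selection described here can be sketched as a small function that picks a format string from the open(2) flags. Only the "unable to create" and "for writing" messages are quoted in the commit message; the other two strings and the function name `open_failure_format()` are assumptions for illustration.

```c
#include <fcntl.h>

/*
 * Pick an error message format for a failed open(2) call.
 * Hypothetical helper; a real wrapper would pass the result,
 * the path, and errno to its fatal-error routine.
 */
static const char *open_failure_format(int oflag)
{
	/* O_CREAT|O_EXCL means the file was expected not to exist yet. */
	if ((oflag & (O_CREAT | O_EXCL)) == (O_CREAT | O_EXCL))
		return "unable to create '%s'";
	if ((oflag & O_RDWR) == O_RDWR)
		return "could not open '%s' for reading and writing";
	if ((oflag & O_WRONLY) == O_WRONLY)
		return "could not open '%s' for writing";
	return "could not open '%s' for reading";
}
```

With `O_WRONLY | O_CREAT | O_EXCL` and errno set to EEXIST, this yields the clearer "unable to create 'foo': File exists" form once the system error text is appended.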
Jeff King	00611d8440	add open_nofollow() helper Some callers of open() would like to use O_NOFOLLOW, but it is not available on all platforms. Let's abstract this into a helper function so we can provide system-specific implementations. Some light web-searching reveals that we might be able to get something similar on Windows using FILE_FLAG_OPEN_REPARSE_POINT. I didn't dig into this further. For other systems without O_NOFOLLOW or any equivalent, we have two options for fallback: - we can just open anyway, following symlinks; this may have security implications (e.g., following untrusted in-tree symlinks) - we can determine whether the path is a symlink with lstat(). This is slower (two syscalls instead of one), but that may be acceptable for infrequent uses like looking up .gitattributes files (especially because we can get away with a single syscall for the common case of ENOENT). It's also racy, but should be sufficient for our needs (we are worried about in-tree symlinks that we ourselves would have previously created). We could make it non-racy at the cost of making it even slower, by doing an fstat() on the opened descriptor and comparing the dev/ino fields to the original lstat(). This patch implements the lstat() option in its slightly-faster racy form. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-16 09:41:32 -08:00
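The racy lstat() fallback described above can be sketched as follows. This is an illustration of the technique rather than Git's exact code: the name `open_nofollow_sketch()` is hypothetical, and reporting ELOOP for a symlink in the fallback path is an assumption chosen to match what Linux reports with O_NOFOLLOW.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Open path, refusing to follow a symlink in its final component. */
static int open_nofollow_sketch(const char *path, int flags)
{
#ifdef O_NOFOLLOW
	/* Preferred: let the kernel refuse symlinks atomically. */
	return open(path, flags | O_NOFOLLOW);
#else
	struct stat st;

	/* A missing file costs a single syscall: lstat() fails with ENOENT. */
	if (lstat(path, &st) < 0)
		return -1;
	if (S_ISLNK(st.st_mode)) {
		errno = ELOOP;
		return -1;
	}
	/* Racy: the path could be replaced between lstat() and open(). */
	return open(path, flags);
#endif
}
```

Closing the race would take an fstat() on the opened descriptor and a comparison of its dev/ino fields with the lstat() result, at the cost of a third syscall.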
Jeff King	6479ea4a8a	xrealloc: do not reuse pointer freed by zero-length realloc() This patch fixes a bug where xrealloc(ptr, 0) can double-free and corrupt the heap on some platforms (including at least glibc). The C99 standard says of malloc (section 7.20.3): If the size of the space requested is zero, the behavior is implementation-defined: either a null pointer is returned, or the behavior is as if the size were some nonzero value, except that the returned pointer shall not be used to access an object. So we might get NULL back, or we might get an actual pointer (but we're not allowed to look at its contents). To simplify our code, our xmalloc() handles a NULL return by converting it into a single-byte allocation. That way callers get consistent behavior. This was done way back in `4e7a2eccc2` (?alloc: do not return NULL when asked for zero bytes, 2005-12-29). We also gave xcalloc() and xrealloc() the same treatment. And according to C99, that is fine; the text above is in a paragraph that applies to all three. But what happens to the memory we passed to realloc() in such a case? I.e., if we do: ret = realloc(ptr, 0); and "ptr" is non-NULL, but we get NULL back, is "ptr" still valid? C99 doesn't cover this case specifically, but says (section 7.20.3.4): The realloc function deallocates the old object pointed to by ptr and returns a pointer to a new object that has the size specified by size. So "ptr" is now deallocated, and we must only look at "ret". And since "ret" is NULL, that means we have no allocated object at all. But that's not quite the whole story. It also says: If memory for the new object cannot be allocated, the old object is not deallocated and its value is unchanged. [...] The realloc function returns a pointer to the new object (which may have the same value as a pointer to the old object), or a null pointer if the new object could not be allocated. 
So if we see a NULL return with a non-zero size, we can expect that the original object _is_ still valid. But with a zero size, it's ambiguous. The NULL return might mean a failure (in which case the object is valid), or it might mean that we successfully allocated nothing, and used NULL to represent that. The glibc manpage for realloc() explicitly says: [...]if size is equal to zero, and ptr is not NULL, then the call is equivalent to free(ptr). Likewise, this StackOverflow answer: https://stackoverflow.com/a/2135302 claims that C89 gave similar guidance (but I don't have a copy to verify it). A comment on this answer: https://stackoverflow.com/a/2022410 claims that Microsoft's CRT behaves the same. But our current "retry with 1 byte" code passes the original pointer again. So on glibc, we effectively free() the pointer and then try to realloc() it again, which is undefined behavior. The simplest fix here is to just pass "ret" (which we know to be NULL) to the follow-up realloc(). But that means that a system which _doesn't_ free the original pointer would leak it. It's not clear if any such systems exist, and that interpretation of the standard seems unlikely (I'd expect a system that doesn't deallocate to simply return the original pointer in this case). But it's easy enough to err on the safe side, and just never pass a zero size to realloc() at all. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-02 12:18:14 -07:00
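The "never pass a zero size to realloc()" approach can be sketched as follows. The name `realloc_or_die()` and the use of exit() in place of Git's die() are assumptions to keep the example self-contained.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical fatal realloc() wrapper that always returns a valid,
 * non-NULL pointer, even for a zero-length request.
 */
static void *realloc_or_die(void *ptr, size_t size)
{
	void *ret;

	if (!size) {
		/*
		 * Avoid realloc(ptr, 0) entirely: free the old block
		 * ourselves and hand back a fresh one-byte allocation.
		 */
		free(ptr);
		ret = malloc(1);
	} else {
		ret = realloc(ptr, size);
	}
	if (!ret) {
		fprintf(stderr, "fatal: out of memory\n");
		exit(128);
	}
	return ret;
}
```

Handling the zero-size case explicitly means the old pointer is freed exactly once, whichever way the platform's realloc() treats a zero size.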
brian m. carlson	14570dc67d	wrapper: add function to compare strings with different NUL termination When parsing capabilities for the pack protocol, there are times we'll want to compare the value of a capability to a NUL-terminated string. Since the data we're reading will be space-terminated, not NUL-terminated, we need a function that compares the two strings, but also checks that they're the same length. Otherwise, if we used strncmp to compare these strings, we might accidentally accept a parameter that was a prefix of the expected value. Add a function, xstrncmpz, that takes a NUL-terminated string and a non-NUL-terminated string, plus a length, and compares them, ensuring that they are the same length. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-27 10:07:06 -07:00
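A comparison of this kind can be sketched in a few lines. The name `strncmpz_sketch()` is hypothetical, and the sketch assumes that the first `len` bytes of `t` contain no NUL byte.

```c
#include <stddef.h>
#include <string.h>

/*
 * Compare NUL-terminated s against the first len bytes of t, which
 * need not be NUL-terminated. Returns 0 only if s is exactly those
 * len bytes. Assumes t has no NUL within its first len bytes.
 */
static int strncmpz_sketch(const char *s, const char *t, size_t len)
{
	int res = strncmp(s, t, len);

	if (res)
		return res;
	/* The prefixes match; s must also end right here. */
	return s[len] == '\0' ? 0 : 1;
}
```

For example, comparing "agent" with the five-byte span at the start of "agent=git" returns 0, whereas "agent-extra" against the same span returns non-zero even though a plain strncmp() over five bytes would report a match.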
Junio C Hamano	b660a76d0f	Merge branch 'dl/wrapper-fix-indentation' Coding style fix. * dl/wrapper-fix-indentation: wrapper: indent with tabs	2020-04-22 13:42:47 -07:00
Denton Liu	7cd54d37dc	wrapper: indent with tabs The codebase uses tabs for indentation. Convert an erroneous space indent into a tab indent. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-28 18:06:51 -07:00
Junio C Hamano	6e12570822	Merge branch 'ah/cleanups' Miscellaneous code clean-ups. * ah/cleanups: git_mkstemps_mode(): replace magic numbers with computed value wrapper: use a loop instead of repetitive statements diffcore-break: use a goto instead of a redundant if statement commit-graph: remove a duplicate assignment	2019-10-09 14:01:00 +09:00
Jeff King	53d687bf5f	git_mkstemps_mode(): replace magic numbers with computed value The magic number "6" appears several times in the function, and is related to the size of the "XXXXXX" string we expect to find in the template. Let's pull that "XXXXXX" into a constant array, whose size we can get at compile time with ARRAY_SIZE(). Note that we probably can't just change this value, since callers will be feeding us a certain number of X's, but it hopefully makes the function itself easier to follow. While we're here, let's do the same with the "letters" array (which we _could_ modify if we wanted to include more characters). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-10-03 09:58:25 +09:00
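The idea of deriving the count from a constant array can be illustrated with the sketch below. `ARRAY_SIZE` is defined locally here, and `has_x_placeholder` is a hypothetical helper invented for this example; it is not the body of git_mkstemps_mode().

```c
#include <stddef.h>
#include <string.h>

#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))

/* The placeholder the template must contain; its length replaces the magic "6". */
static const char x_pattern[] = "XXXXXX";
static const size_t num_x = ARRAY_SIZE(x_pattern) - 1;	/* minus the NUL */

/*
 * Hypothetical check: does "tmpl" contain the placeholder immediately
 * before a suffix of "suffix_len" bytes (e.g. ".txt")?
 */
int has_x_placeholder(const char *tmpl, size_t suffix_len)
{
	size_t len = strlen(tmpl);

	if (len < num_x + suffix_len)
		return 0;
	return !strncmp(tmpl + len - num_x - suffix_len, x_pattern, num_x);
}
```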
Alex Henrie	54a80a9ad8	wrapper: use a loop instead of repetitive statements A check into the history of this code revealed no particular reason for the code to be written in this way. All popular compilers are capable of unrolling loops if it benefits performance, and once this code is replaced with a loop, the magic number 6 used in multiple places in this function can be replaced with a named constant. Reviewed-by: Derrick Stolee <stolee@gmail.com> Reviewed-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Alex Henrie <alexhenrie24@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-10-02 15:04:23 +09:00
Jeff King	9827d4c185	packfile: drop release_pack_memory() Long ago, in `97bfeb34df` (Release pack windows before reporting out of memory., 2006-12-24), we taught xmalloc() and friends to try unmapping pack windows when malloc() failed. It's unlikely that this helps a lot in practice, and it has some downsides. First, the downsides: 1. It makes xmalloc() not thread-safe. We've worked around this in pack-objects.c, which installs its own locking version of the try_to_free_routine(). But other threaded code doesn't. 2. It makes the system as a whole harder to reason about. Functions which allocate heap memory under the hood may have farther-reaching effects than expected. That might be worth the tradeoff if there's a benefit. But in practice, it seems unlikely. We're generally dealing with mmap'd files, so the OS is going to do a much better job at responding to memory pressure by dropping individual pages (the exception is systems with NO_MMAP, but even there the OS can probably respond just as well with swapping). So the only thing we're really freeing is address space. On 64-bit systems, we have plenty of that to go around. On 32-bit systems, it could possibly help. But around the same time we made two other changes: `77ccc5bbd1` (Introduce new config option for mmap limit., 2006-12-23) and `60bb8b1453` (Fully activate the sliding window pack access., 2006-12-23). Together that means that a 32-bit system should have no more than 256MB total of packed-git mmaps at one time, split between a few 32MB windows. It's unlikely we have any address space problems since then, but we don't have any data since the features were all added at the same time. Likewise, xmmap() will try to free memory. At first glance, it seems like we'd need this (when we try to mmap a new window, we might need to close an old one to save address space on a 32-bit system). 
But we're saved again by core.packedGitLimit: if we're going to exceed our 256MB limit, we'll close an existing window before we even call mmap(). So it seems unlikely that this feature is actually doing anything useful. And while we don't have reports of it harming anything (probably because it rarely if ever kicks in), it would be nice to simplify the system overall. This patch drops the whole try_to_free system from xmalloc(), as well as the manual pack memory release in xmmap(). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-08-13 12:21:33 -07:00
Carlo Marcelo Arenas Belón	729a9b558b	wrapper: avoid undefined behaviour in macOS `0620b39b3b` ("compat: add a mkstemps() compatibility function", 2009-05-31) included a function based on code from libiberty which would result in undefined behaviour in platforms where timeval's tv_usec is a 32-bit signed type as shown by: wrapper.c:505:31: runtime error: left shift of 594546 by 16 places cannot be represented in type '__darwin_suseconds_t' (aka 'int') interestingly the version of this code from gcc never had this bug and the code had a cast that would have prevented the issue (at least in 64-bit platforms) but was misapplied. change the cast to uint64_t so it also works in 32-bit platforms. Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-06-19 07:41:31 -07:00
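The effect of widening before the shift can be shown with a small sketch. The function `mix_time_seed` and its parameters are hypothetical and only mimic the shape of a "shift the microseconds, then mix in other values" expression; the surrounding code in wrapper.c is not reproduced here.

```c
#include <stdint.h>

/*
 * Hypothetical sketch: mix a microsecond count, a second count and a
 * process id into one 64-bit value. Casting "usec" to uint64_t BEFORE
 * the shift means the shift is performed on a 64-bit unsigned type, so
 * 594546 << 16 (which does not fit in a 32-bit int) is well defined.
 */
uint64_t mix_time_seed(int32_t usec, int64_t sec, int32_t pid)
{
	return ((uint64_t)usec << 16) ^ (uint64_t)sec ^ (uint64_t)pid;
}
```

Writing the shift as `usec << 16` with a 32-bit signed `usec` would overflow for the value quoted in the sanitizer report, which is undefined behaviour for signed types in C.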
Pranit Bauva	e3b1e3bdc0	wrapper: move is_empty_file() and rename it as is_empty_or_missing_file() is_empty_file() can help to refactor a lot of code. This will be very helpful in porting "git bisect" to C. Suggested-by: Torsten Bögershausen <tboegi@web.de> Mentored-by: Lars Schneider <larsxschneider@gmail.com> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-02 10:23:02 -08:00
Johannes Schindelin	033abf97fc	Replace all die("BUG: ...") calls by BUG() ones In `d8193743e0` (usage.c: add BUG() function, 2017-05-12), a new macro was introduced to use for reporting bugs instead of die(). It was then subsequently used to convert one single caller in `588a538ae5` (setup_git_env: convert die("BUG") to BUG(), 2017-05-12). The cover letter of the patch series containing this patch (cf 20170513032414.mfrwabt4hovujde2@sigill.intra.peff.net) is not terribly clear why only one call site was converted, or what the plan is for other, similar calls to die() to report bugs. Let's just convert all remaining ones in one fell swoop. This trick was performed by this invocation: sed -i 's/die("BUG: /BUG("/g' $(git grep -l 'die("BUG' \*.c) Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-06 19:06:13 +09:00
Brandon Williams	eb78e23f22	wrapper: rename 'template' variables Rename C++ keyword in order to bring the codebase closer to being able to be compiled with a C++ compiler. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-02-22 10:08:05 -08:00
Simon Ruderich	0a288d1ee9	wrapper.c: consistently quote filenames in error messages All other error messages in the file use quotes around the file name. This change removes two translations as "could not write to '%s'" and "could not close '%s'" are already translated and these two are the only occurrences without quotes. Signed-off-by: Simon Ruderich <simon@ruderich.org> [jc: adjusted tests I noticed were broken by the change] Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-06 11:53:14 +09:00
Jeff King	06f46f237a	avoid "write_in_full(fd, buf, len) != len" pattern The return value of write_in_full() is either "-1", or the requested number of bytes[1]. If we make a partial write before seeing an error, we still return -1, not a partial value. This goes back to `f6aa66cb95` (write_in_full: really write in full or return error on disk full., 2007-01-11). So checking anything except "was the return value negative" is pointless. And there are a couple of reasons not to do so: 1. It can do a funny signed/unsigned comparison. If your "len" is unsigned (e.g., a size_t) then the compiler will promote the "-1" to its unsigned variant. This works out for "!= len" (unless you really were trying to write the maximum size_t bytes), but is a bug if you check "< len" (an example of which was fixed recently in config.c). We should avoid promoting the mental model that you need to check the length at all, so that new sites are not tempted to copy us. 2. Checking for a negative value is shorter to type, especially when the length is an expression. 3. Linus says so. In `d34cf19b89` (Clean up write_in_full() users, 2007-01-11), right after the write_in_full() semantics were changed, he wrote: I really wish every "write_in_full()" user would just check against "<0" now, but this fixes the nasty and stupid ones. Appeals to authority aside, this makes it clear that writing it this way does not have an intentional benefit. It's a historical curiosity that we never bothered to clean up (and which was undoubtedly cargo-culted into new sites). So let's convert these obviously-correct cases (this includes write_str_in_full(), which is just a wrapper for write_in_full()). [1] A careful reader may notice there is one way that write_in_full() can return a different value. If we ask write() to write N bytes and get a return value that is _larger_ than N, we could return a larger total. 
But besides the fact that this would imply a totally broken version of write(), it would already invoke undefined behavior. Our internal remaining counter is an unsigned size_t, which means that subtracting too many bytes will wrap it around to a very large number. So we'll instantly begin reading off the end of the buffer, trying to write gigabytes (or petabytes) of data. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-14 15:17:59 +09:00
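The "all bytes or -1" contract and the recommended "< 0" check can be sketched as below for a POSIX system. `write_fully` and `save_buffer` are hypothetical names used for illustration; the loop is an assumption about how such a helper could be written, not a copy of write_in_full().

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical "write everything or fail" helper. Returns the full
 * byte count on success and -1 on any error, even if some bytes were
 * already written before the error occurred.
 */
ssize_t write_fully(int fd, const void *buf, size_t count)
{
	const char *p = buf;
	ssize_t total = 0;

	while (count > 0) {
		ssize_t written = write(fd, p, count);

		if (written < 0) {
			if (errno == EINTR)
				continue;	/* interrupted: retry */
			return -1;
		}
		if (written == 0) {
			errno = ENOSPC;	/* no progress: treat as "disk full" */
			return -1;
		}
		count -= (size_t)written;
		p += written;
		total += written;
	}
	return total;
}

/* Caller side: only the sign of the result is checked, never the length. */
int save_buffer(int fd, const char *buf, size_t len)
{
	if (write_fully(fd, buf, len) < 0)
		return -1;
	return 0;
}
```

Checking `write_fully(fd, buf, len) < len` instead would convert the -1 to a huge unsigned value when `len` is a size_t, so the error case would never be detected.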
Junio C Hamano	f31d23a399	Merge branch 'bw/config-h' Fix configuration codepath to pay proper attention to commondir that is used in multi-worktree situation, and isolate config API into its own header file. * bw/config-h: config: don't implicitly use gitdir or commondir config: respect commondir setup: teach discover_git_directory to respect the commondir config: don't include config.h by default config: remove git_config_iter config: create config.h	2017-06-24 14:28:41 -07:00
Brandon Williams	b2141fc1d2	config: don't include config.h by default Stop including config.h by default in cache.h. Instead only include config.h in those files which require use of the config system. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-15 12:56:22 -07:00
Junio C Hamano	b9a7d55d93	Merge branch 'nd/fopen-errors' We often try to open a file for reading whose existence is optional, and silently ignore errors from open/fopen; report such errors if they are not due to missing files. * nd/fopen-errors: mingw_fopen: report ENOENT for invalid file names mingw: verify that paths are not mistaken for remote nicknames log: fix memory leak in open_next_file() rerere.c: move error_errno() closer to the source system call print errno when reporting a system call error wrapper.c: make warn_on_inaccessible() static wrapper.c: add and use fopen_or_warn() wrapper.c: add and use warn_on_fopen_errors() config.mak.uname: set FREAD_READS_DIRECTORIES for Darwin, too config.mak.uname: set FREAD_READS_DIRECTORIES for Linux and FreeBSD clone: use xfopen() instead of fopen() use xfopen() in more places git_fopen: fix a sparse 'not declared' warning	2017-06-13 13:47:09 -07:00
Junio C Hamano	93dd544f54	Merge branch 'jc/noent-notdir' Our code often opens a path to an optional file, to work on its contents when we can successfully open it. We can ignore a failure to open if such an optional file does not exist, but we do want to report a failure in opening for other reasons (e.g. we got an I/O error, or the file is there, but we lack the permission to open). The exact errors we need to ignore are ENOENT (obviously) and ENOTDIR (less obvious). Instead of repeating comparison of errno with these two constants, introduce a helper function to do so. * jc/noent-notdir: treewide: use is_missing_file_error() where ENOENT and ENOTDIR are checked compat-util: is_missing_file_error()	2017-06-13 13:47:07 -07:00
Junio C Hamano	c7054209d6	treewide: use is_missing_file_error() where ENOENT and ENOTDIR are checked Using the is_missing_file_error() helper introduced in the previous step, update all hits from $ git grep -e ENOENT --and -e ENOTDIR There are codepaths that only check ENOENT, and it is possible that some of them should be checking both. Updating them is kept out of this step deliberately, as we do not want to change behaviour in this step. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-05-30 09:29:00 +09:00
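A helper of the kind described for the 'jc/noent-notdir' branch could look like the sketch below. The body of `is_missing_file_error` is an assumption derived from the description (treat ENOENT and ENOTDIR as "the file is simply not there"), and `open_optional` is a hypothetical caller added for illustration.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Sketch: does this errno value just mean "no such file"? */
int is_missing_file_error(int err)
{
	return err == ENOENT || err == ENOTDIR;
}

/*
 * Hypothetical caller: open an optional file for reading. A missing
 * file is silently ignored; any other failure (permissions, I/O
 * error, ...) is reported on stderr. Returns NULL if nothing was opened.
 */
FILE *open_optional(const char *path)
{
	FILE *fp = fopen(path, "r");

	if (!fp && !is_missing_file_error(errno))
		fprintf(stderr, "warning: unable to access '%s': %s\n",
			path, strerror(errno));
	return fp;
}
```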


142 Commits