2005-04-19 04:04:43 +08:00
/*
 * GIT - The information manager from hell
 *
 * Copyright (C) Linus Torvalds, 2005
 */
2023-03-21 14:26:02 +08:00
|
|
|
#include "git-compat-util.h"
|
2023-03-21 14:25:54 +08:00
|
|
|
#include "gettext.h"
|
2023-03-21 14:26:02 +08:00
|
|
|
#include "trace2.h"
|
2005-04-19 04:04:43 +08:00
|
|
|
|
2021-12-08 02:26:34 +08:00
|
|
|
static void vreportf(const char *prefix, const char *err, va_list params)
|
2005-04-19 04:04:43 +08:00
|
|
|
{
|
2017-01-11 22:02:03 +08:00
|
|
|
char msg[4096];
|
2019-10-30 18:44:36 +08:00
|
|
|
char *p, *pend = msg + sizeof(msg);
|
|
|
|
size_t prefix_len = strlen(prefix);
|
vreportf: avoid intermediate buffer
When we call "die(fmt, args...)", we end up in vreportf with
two pieces of information:
1. The prefix "fatal: "
2. The original fmt and va_list of args.
We format item (2) into a temporary buffer, and then fprintf
the prefix and the temporary buffer, along with a newline.
This has the unfortunate side effect of truncating any error
messages that are longer than 4096 bytes.
Instead, let's use separate calls for the prefix and
newline, letting us hand the item (2) directly to vfprintf.
This is essentially undoing d048a96 (print
warning/error/fatal messages in one shot, 2007-11-09), which
tried to have the whole output end up in a single `write`
call.
But we can address this instead by explicitly requesting
line-buffering for the output handle, and by making sure
that the buffer is empty before we start (so that outputting
the prefix does not cause a flush due to hitting the buffer
limit).
We may still break the output into two writes if the content
is larger than our buffer, but there's not much we can do
there; depending on the stdio implementation, that might
have happened even with a single fprintf call.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-12 02:13:59 +08:00
|
|
|
|
2019-10-30 18:44:36 +08:00
	if (sizeof(msg) <= prefix_len) {
		fprintf(stderr, "BUG!!! too long a prefix '%s'\n", prefix);
		abort();
	}
	memcpy(msg, prefix, prefix_len);
	p = msg + prefix_len;
	if (vsnprintf(p, pend - p, err, params) < 0)
		*p = '\0'; /* vsnprintf() failed, clip at prefix */

	for (; p != pend - 1 && *p; p++) {
vreport: sanitize ASCII control chars
Our error() and die() calls may report messages with
arbitrary data (e.g., filenames or even data from a remote
server). Let's make it harder to cause confusion with
mischievous filenames. E.g., try:
git rev-parse "$(printf "\rfatal: this argument is too sneaky")" --
or
git rev-parse "$(printf "\x1b[5mblinky\x1b[0m")" --
Let's block all ASCII control characters, with the exception
of TAB and LF. We use both in our own messages (and we are
necessarily sanitizing the complete output of snprintf here,
as we do not have access to the individual varargs). And TAB
and LF are unlikely to cause confusion (you could put
"\nfatal: sneaky\n" in your filename, but it would at least
not _cover up_ the message leading to it, unlike "\r").
We'll replace the characters with a "?", which is similar to
how "ls" behaves. It might be nice to do something less
lossy, like converting them to "\x" hex codes. But replacing
with a single character makes it easy to do in-place and
without worrying about length limitations. This feature
should kick in rarely enough that the "?" marks are almost
never seen.
We'll leave high-bit characters as-is, as they are likely to
be UTF-8 (though there may be some Unicode mischief you
could cause, which may require further patches).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-01-11 22:02:23 +08:00
|
|
|
if (iscntrl(*p) && *p != '\t' && *p != '\n')
|
|
|
|
*p = '?';
|
2015-08-12 02:13:59 +08:00
|
|
|
}
|
2019-10-30 18:44:36 +08:00
|
|
|
|
|
|
|
*(p++) = '\n'; /* we no longer need a NUL */
|
|
|
|
fflush(stderr);
|
|
|
|
write_in_full(2, msg, p - msg);
|
2011-07-28 05:32:34 +08:00
|
|
|
}
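As a rough illustration of the approach vreportf() takes, the sketch below assembles the prefix, the formatted message, and a trailing newline in one fixed-size buffer so that the result can go out in a single write. `build_report` and its return convention are hypothetical names for this example and are not part of usage.c. It also returns 0 on an over-long prefix where the real function aborts, and it tests for ASCII control bytes directly instead of calling iscntrl().

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper modeled on vreportf(): write "<prefix><message>\n"
 * into msg and return the number of bytes to send. The result is NOT
 * NUL-terminated. Returns 0 if the prefix alone does not fit.
 */
size_t build_report(char *msg, size_t size, const char *prefix,
		    const char *fmt, ...)
{
	va_list params;
	size_t prefix_len = strlen(prefix);
	char *p, *pend = msg + size;

	if (size <= prefix_len)
		return 0; /* assumption: usage.c prints a BUG and aborts here */

	memcpy(msg, prefix, prefix_len);
	p = msg + prefix_len;

	va_start(params, fmt);
	if (vsnprintf(p, (size_t)(pend - p), fmt, params) < 0)
		*p = '\0'; /* formatting failed, clip at prefix */
	va_end(params);

	/* Replace ASCII control bytes with '?', keeping TAB and LF. */
	for (; p != pend - 1 && *p; p++) {
		unsigned char c = (unsigned char)*p;

		if ((c < 0x20 || c == 0x7f) && c != '\t' && c != '\n')
			*p = '?';
	}

	*(p++) = '\n'; /* overwrites the NUL (or the last byte when truncated) */
	return (size_t)(p - msg);
}
```

A caller would pass the returned length to one write call on the error stream. Bytes with the high bit set are left alone, matching the commit message's intent of not disturbing UTF-8.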
|
|
|
|
|
2009-11-09 23:05:02 +08:00
|
|
|
static NORETURN void usage_builtin(const char *err, va_list params)
|
2005-04-19 04:04:43 +08:00
|
|
|
{
|
2022-06-21 21:57:57 +08:00
|
|
|
vreportf(_("usage: "), err, params);
|
2019-02-23 06:25:01 +08:00
	/*
	 * When we detect a usage error *before* the command dispatch in
	 * cmd_main(), we don't know what verb to report. Force it to this
	 * to facilitate post-processing.
	 */
	trace2_cmd_name("_usage_");

	/*
	 * Currently, the (err, params) are usually just the static usage
	 * string which isn't very useful here. Usually, the call site
	 * manually calls fprintf(stderr,...) with the actual detailed
	 * syntax error before calling usage().
	 *
	 * TODO It would be nice to update the call sites to pass both
	 * the static usage string and the detailed error message.
	 */
2005-10-02 04:24:27 +08:00
|
|
|
exit(129);
|
2005-04-19 04:04:43 +08:00
|
|
|
}
|
|
|
|
|
2021-12-08 02:26:29 +08:00
|
|
|
static void die_message_builtin(const char *err, va_list params)
|
|
|
|
{
|
|
|
|
trace2_cmd_error_va(err, params);
|
2022-06-21 21:57:57 +08:00
|
|
|
vreportf(_("fatal: "), err, params);
|
2021-12-08 02:26:29 +08:00
|
|
|
}
|
|
|
|
|
2021-04-13 17:08:19 +08:00
/*
 * We call trace2_cmd_error_va() in the below functions first and
 * expect it to va_copy 'params' before using it (because an 'ap' can
 * only be walked once).
 */
2006-06-24 13:44:33 +08:00
|
|
|
static NORETURN void die_builtin(const char *err, va_list params)
|
2006-06-24 10:34:38 +08:00
|
|
|
{
|
2021-12-08 02:26:29 +08:00
|
|
|
report_fn die_message_fn = get_die_message_routine();
|
2019-02-23 06:25:01 +08:00
|
|
|
|
2021-12-08 02:26:29 +08:00
|
|
|
die_message_fn(err, params);
|
2006-06-24 10:34:38 +08:00
|
|
|
exit(128);
|
|
|
|
}
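The comment above these functions relies on the rule that a va_list can only be walked once, so a routine that needs the arguments twice must walk a va_copy first. A minimal sketch of that pattern follows; `vformat_both` and `format_both` are hypothetical names for illustration and do not exist in usage.c.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format the same arguments into two buffers.
 * The first pass consumes a copy, so 'params' is still unread when the
 * second pass uses it.
 */
void vformat_both(char *first, char *second, size_t n,
		  const char *fmt, va_list params)
{
	va_list copy;

	va_copy(copy, params);
	vsnprintf(first, n, fmt, copy);
	va_end(copy); /* every va_copy needs a matching va_end */

	vsnprintf(second, n, fmt, params);
}

/* Variadic front end, in the same style as die() or error(). */
void format_both(char *first, char *second, size_t n, const char *fmt, ...)
{
	va_list params;

	va_start(params, fmt);
	vformat_both(first, second, n, fmt, params);
	va_end(params);
}
```

This mirrors the division of labor described in the comment: the first consumer (there, trace2_cmd_error_va()) is expected to work on a copy so that the later vreportf() call still sees an untouched list.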
|
|
|
|
|
2006-06-24 13:44:33 +08:00
|
|
|
static void error_builtin(const char *err, va_list params)
|
2006-06-24 10:34:38 +08:00
|
|
|
{
|
2019-02-23 06:25:01 +08:00
|
|
|
trace2_cmd_error_va(err, params);
|
|
|
|
|
2022-06-21 21:57:57 +08:00
|
|
|
vreportf(_("error: "), err, params);
|
2006-06-24 10:34:38 +08:00
|
|
|
}
|
|
|
|
|
2006-12-22 08:48:32 +08:00
|
|
|
static void warn_builtin(const char *warn, va_list params)
|
|
|
|
{
|
2020-11-24 04:45:22 +08:00
|
|
|
trace2_cmd_error_va(warn, params);
|
|
|
|
|
2022-06-21 21:57:57 +08:00
|
|
|
vreportf(_("warning: "), warn, params);
|
2006-12-22 08:48:32 +08:00
|
|
|
}
|
2006-06-24 10:34:38 +08:00
|
|
|
|
2013-04-17 03:46:22 +08:00
|
|
|
static int die_is_recursing_builtin(void)
|
|
|
|
{
|
|
|
|
static int dying;
|
die(): stop hiding errors due to overzealous recursion guard
Change the recursion limit for the default die routine from a *very*
low 1 to 1024. This ensures that infinite recursions are broken, but
doesn't lose the meaningful error messages under threaded execution
where threads concurrently start to die.
The intent of the existing code, as explained in commit
cd163d4b4e ("usage.c: detect recursion in die routines and bail out
immediately", 2012-11-14), is to break infinite recursion in cases
where the die routine itself calls die(), and would thus infinitely
recurse.
However, doing that very aggressively by immediately printing out
"recursion detected in die handler" if we've already called die() once
means that threaded invocations of git can end up only printing out
the "recursion detected" error, while hiding the meaningful error.
An example of this is running a threaded grep which dies on execution
against pretty much any repo, git.git will do:
git grep -P --threads=8 '(*LIMIT_MATCH=1)-?-?-?---$'
With the current version of git this will print some combination of
multiple PCRE failures that caused the abort and multiple "recursion
detected", some invocations will print out multiple "recursion
detected" errors with no PCRE error at all!
Before this change, running the above grep command 1000 times against
git.git[1] and taking the top 20 results will on my system yield the
following distribution of actual errors ("E") and recursion
errors ("R"):
322 E R
306 E
116 E R R
65 R R
54 R E
49 E E
44 R
15 E R R R
9 R R R
7 R E R
5 R R E
3 E R R R R
2 E E R
1 R R R R
1 R R R E
1 R E R R
The exact results are obviously random and system-dependent, but this
shows the race condition in this code. Some small part of the time
we're about to print out the actual error ("E") but another thread's
recursion error beats us to it, and sometimes we print out nothing but
the recursion error.
With this change we get, now with "W" to mean the new warning being
emitted indicating that we've called die() many times:
502 E
160 E W E
120 E E
53 E W
35 E W E E
34 W E E
29 W E E E
16 E E W
16 E E E
11 W E E E E
7 E E W E
4 W E
3 W W E E
2 E W E E E
1 W W E
1 W E W E
1 E W W E E E
1 E W W E E
1 E W W E
1 E W E E W
Which still sucks a bit, due to a still present race-condition in this
code we're sometimes going to print out several errors still, or
several warnings, or two duplicate errors without the warning.
But we will never have a case where we completely hide the actual
error as we do now.
Now, git-grep could make use of the pluggable error facility added in
commit c19a490e37 ("usage: allow pluggable die-recursion checks",
2013-04-16). There's other threaded code that calls set_die_routine()
or set_die_is_recursing_routine().
But this is about fixing the general die() behavior with threading
when we don't have such a custom routine yet. Right now the common
case is not an infinite recursion in the handler, but us losing error
messages by default because we're overly paranoid about our recursion
check.
So let's just set the recursion limit to a number higher than the
number of threads we're ever likely to spawn. Now we won't lose
errors, and if we have a recursing die handler we'll still die within
microseconds.
There are race conditions in this code itself, in particular the
"dying" variable is not thread mutexed, so we e.g. won't be dying at
exactly 1024, or for that matter even be able to accurately test
"dying == 2", see the cases where we print out more than one "W"
above.
But that doesn't really matter, for the recursion guard we just need
to die "soon", not at exactly 1024 calls, and for printing the correct
error and only one warning most of the time in the face of threaded
death this is good enough and a net improvement on the current code.
1. for i in {1..1000}; do git grep -P --threads=8 '(*LIMIT_MATCH=1)-?-?-?---$' 2>&1|perl -pe 's/^fatal: r.*/R/; s/^fatal: p.*/E/; s/^warning.*/W/' | tr '\n' ' '; echo; done | sort | uniq -c | sort -nr | head -n 20
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-22 04:47:42 +08:00
	/*
	 * Just an arbitrary number X where "a < x < b" where "a" is
	 * "maximum number of pthreads we'll ever plausibly spawn" and
	 * "b" is "something less than Inf", since the point is to
	 * prevent infinite recursion.
	 */
	static const int recursion_limit = 1024;

	dying++;
	if (dying > recursion_limit) {
		return 1;
	} else if (dying == 2) {
		warning("die() called many times. Recursion error or racy threaded death!");
		return 0;
	} else {
		return 0;
	}
2013-04-17 03:46:22 +08:00
|
|
|
}
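The behavior of the recursion guard can be sketched in isolation: the first 1024 calls are allowed through, the second call triggers a one-time warning, and only calls beyond the limit are reported as recursion. The names `is_recursing`, `dying_calls`, and `warnings_emitted` are made up for this example, and the counter below stands in for the warning() call in usage.c. Like the original, the sketch is not thread-safe.

```c
/* Same limit as die_is_recursing_builtin(); the rest is hypothetical. */
enum { RECURSION_LIMIT = 1024 };

int dying_calls;      /* how many times the guard has been consulted */
int warnings_emitted; /* stand-in for calling warning() */

/* Returns 1 once the guard has been consulted more than RECURSION_LIMIT times. */
int is_recursing(void)
{
	dying_calls++;
	if (dying_calls > RECURSION_LIMIT)
		return 1;
	if (dying_calls == 2)
		warnings_emitted++; /* usage.c emits its "called many times" warning here */
	return 0;
}
```

Because the limit is far above any plausible thread count, concurrent deaths each get to print their real error, while a die handler that truly recurses is still cut off after a bounded number of calls.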
|
|
|
|
|
2006-06-24 10:34:38 +08:00
|
|
|
/* If we are in a dlopen()ed .so write to a global variable would segfault
|
|
|
|
* (ugh), so keep things static. */
|
2020-10-16 03:30:04 +08:00
|
|
|
static NORETURN_PTR report_fn usage_routine = usage_builtin;
|
|
|
|
static NORETURN_PTR report_fn die_routine = die_builtin;
|
2021-12-08 02:26:29 +08:00
|
|
|
static report_fn die_message_routine = die_message_builtin;
|
2020-10-16 03:30:04 +08:00
|
|
|
static report_fn error_routine = error_builtin;
|
|
|
|
static report_fn warn_routine = warn_builtin;
|
2013-04-17 03:46:22 +08:00
|
|
|
static int (*die_is_recursing)(void) = die_is_recursing_builtin;
|
2006-06-24 10:34:38 +08:00
|
|
|
|
2020-10-16 03:30:04 +08:00
|
|
|
void set_die_routine(NORETURN_PTR report_fn routine)
|
2006-06-24 10:34:38 +08:00
|
|
|
{
|
|
|
|
die_routine = routine;
|
|
|
|
}
|
|
|
|
|
2021-12-08 02:26:29 +08:00
|
|
|
report_fn get_die_message_routine(void)
|
|
|
|
{
|
|
|
|
return die_message_routine;
|
|
|
|
}
|
|
|
|
|
2020-10-16 03:30:04 +08:00
|
|
|
void set_error_routine(report_fn routine)
|
2011-07-28 05:32:34 +08:00
|
|
|
{
|
|
|
|
error_routine = routine;
|
|
|
|
}
|
|
|
|
|
2020-10-16 03:30:04 +08:00
|
|
|
report_fn get_error_routine(void)
|
2016-09-05 04:18:28 +08:00
|
|
|
{
|
|
|
|
return error_routine;
|
|
|
|
}
|
|
|
|
|
2020-10-16 03:30:04 +08:00
|
|
|
void set_warn_routine(report_fn routine)
|
2016-09-05 04:18:27 +08:00
|
|
|
{
|
|
|
|
warn_routine = routine;
|
|
|
|
}
|
|
|
|
|
2020-10-16 03:30:04 +08:00
|
|
|
report_fn get_warn_routine(void)
|
2016-09-05 04:18:28 +08:00
|
|
|
{
|
|
|
|
return warn_routine;
|
|
|
|
}
|
|
|
|
|
2013-04-17 03:46:22 +08:00
|
|
|
void set_die_is_recursing_routine(int (*routine)(void))
|
|
|
|
{
|
|
|
|
die_is_recursing = routine;
|
|
|
|
}
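The setters and getters above all follow one pattern: a file-scope function pointer with a default, replaced at run time by callers that want different reporting. A self-contained sketch of that pattern is below. The `report_fn` typedef is assumed to match the one usage.c gets from its headers, and `report_to_stderr`, `report_to_buffer`, `last_message`, and `report_error` are hypothetical names used only here.

```c
#include <stdarg.h>
#include <stdio.h>

/* Assumed shape of the callback type used by the routines above. */
typedef void (*report_fn)(const char *fmt, va_list params);

char last_message[256]; /* filled by the capturing routine below */

/* Default routine: print to stderr with a trailing newline. */
void report_to_stderr(const char *fmt, va_list params)
{
	vfprintf(stderr, fmt, params);
	fputc('\n', stderr);
}

/* Replacement routine: capture the message instead of printing it. */
void report_to_buffer(const char *fmt, va_list params)
{
	vsnprintf(last_message, sizeof(last_message), fmt, params);
}

static report_fn error_routine = report_to_stderr;

void set_error_routine(report_fn routine)
{
	error_routine = routine;
}

report_fn get_error_routine(void)
{
	return error_routine;
}

/* Variadic front end that dispatches through the current routine. */
int report_error(const char *fmt, ...)
{
	va_list params;

	va_start(params, fmt);
	error_routine(fmt, params);
	va_end(params);
	return -1;
}
```

Saving the result of the getter before installing a replacement lets a caller restore the previous routine afterwards, which is the usual reason to pair each setter with a getter.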
|
|
|
|
|
Fix sparse warnings
Fix warnings from 'make check'.
- These files don't include 'builtin.h' causing sparse to complain that
cmd_* isn't declared:
builtin/clone.c:364, builtin/fetch-pack.c:797,
builtin/fmt-merge-msg.c:34, builtin/hash-object.c:78,
builtin/merge-index.c:69, builtin/merge-recursive.c:22
builtin/merge-tree.c:341, builtin/mktag.c:156, builtin/notes.c:426
builtin/notes.c:822, builtin/pack-redundant.c:596,
builtin/pack-refs.c:10, builtin/patch-id.c:60, builtin/patch-id.c:149,
builtin/remote.c:1512, builtin/remote-ext.c:240,
builtin/remote-fd.c:53, builtin/reset.c:236, builtin/send-pack.c:384,
builtin/unpack-file.c:25, builtin/var.c:75
- These files have symbols which should be marked static since they're
only file scope:
submodule.c:12, diff.c:631, replace_object.c:92, submodule.c:13,
submodule.c:14, trace.c:78, transport.c:195, transport-helper.c:79,
unpack-trees.c:19, url.c:3, url.c:18, url.c:104, url.c:117, url.c:123,
url.c:129, url.c:136, thread-utils.c:21, thread-utils.c:48
- These files redeclare symbols to be different types:
builtin/index-pack.c:210, parse-options.c:564, parse-options.c:571,
usage.c:49, usage.c:58, usage.c:63, usage.c:72
- These files use a literal integer 0 when they really should use a NULL
pointer:
daemon.c:663, fast-import.c:2942, imap-send.c:1072, notes-merge.c:362
While we're in the area, clean up some unused #includes in builtin files
(mostly exec_cmd.h).
Signed-off-by: Stephen Boyd <bebarino@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-22 15:51:05 +08:00
|
|
|
void NORETURN usagef(const char *err, ...)
|
2009-11-09 23:05:02 +08:00
|
|
|
{
|
|
|
|
va_list params;
|
|
|
|
|
|
|
|
va_start(params, err);
|
|
|
|
usage_routine(err, params);
|
|
|
|
va_end(params);
|
|
|
|
}
|
|
|
|
|
2011-03-22 15:51:05 +08:00
|
|
|
void NORETURN usage(const char *err)
|
2006-06-24 10:34:38 +08:00
|
|
|
{
|
2009-11-09 23:05:02 +08:00
|
|
|
usagef("%s", err);
|
2006-06-24 10:34:38 +08:00
|
|
|
}
|
|
|
|
|
2011-03-22 15:51:05 +08:00
|
|
|
void NORETURN die(const char *err, ...)
|
2005-04-19 04:04:43 +08:00
|
|
|
{
|
|
|
|
va_list params;
|
|
|
|
|
2013-04-17 03:46:22 +08:00
|
|
|
if (die_is_recursing()) {
|
2012-11-15 09:45:52 +08:00
|
|
|
fputs("fatal: recursion detected in die handler\n", stderr);
|
|
|
|
exit(128);
|
|
|
|
}
|
|
|
|
|
2005-04-19 04:04:43 +08:00
|
|
|
va_start(params, err);
|
2006-06-24 10:34:38 +08:00
|
|
|
die_routine(err, params);
|
2005-04-19 04:04:43 +08:00
|
|
|
va_end(params);
|
|
|
|
}
|
|
|
|
|
2016-05-08 17:47:21 +08:00
|
|
|
static const char *fmt_with_err(char *buf, int n, const char *fmt)
|
2009-06-27 23:58:44 +08:00
|
|
|
{
|
2009-06-27 23:58:45 +08:00
	char str_error[256], *err;
	int i, j;

	err = strerror(errno);
	for (i = j = 0; err[i] && j < sizeof(str_error) - 1; ) {
		if ((str_error[j++] = err[i++]) != '%')
			continue;
		if (j < sizeof(str_error) - 1) {
			str_error[j++] = '%';
		} else {
			/* No room to double the '%', so we overwrite it with
			 * '\0' below */
			j--;
			break;
		}
	}
	str_error[j] = 0;
2018-05-19 09:58:44 +08:00
|
|
|
/* Truncation is acceptable here */
|
2016-05-08 17:47:21 +08:00
|
|
|
snprintf(buf, n, "%s: %s", fmt, str_error);
|
|
|
|
return buf;
|
|
|
|
}
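fmt_with_err() doubles every '%' in the strerror() text because the combined string is later used as a printf-style format. The doubling loop can be sketched on its own as follows. `escape_percent` is a hypothetical helper for illustration, with the buffer size passed in rather than fixed at 256 bytes.

```c
#include <stddef.h>

/*
 * Hypothetical helper: copy src into dst, writing "%%" for every '%'
 * so the result is safe to embed in a printf-style format string.
 * Truncates if dst is too small and never leaves a lone '%' at the end.
 * Returns the length of the string stored in dst.
 */
size_t escape_percent(char *dst, size_t dst_size, const char *src)
{
	size_t i = 0, j = 0;

	if (!dst_size)
		return 0;

	while (src[i] && j < dst_size - 1) {
		if ((dst[j++] = src[i++]) != '%')
			continue;
		if (j < dst_size - 1) {
			dst[j++] = '%';
		} else {
			/* No room for the second '%': drop the first one too. */
			j--;
			break;
		}
	}
	dst[j] = '\0';
	return j;
}
```

The branch that backs up by one byte matters: leaving a single trailing '%' in the format would make the later vsnprintf() call read a conversion specifier that is not there.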
void NORETURN die_errno(const char *fmt, ...)
{
	char buf[1024];
	va_list params;

	if (die_is_recursing()) {
		fputs("fatal: recursion detected in die_errno handler\n",
			stderr);
		exit(128);
	}
2009-06-27 23:58:44 +08:00
|
|
|
|
|
|
|
va_start(params, fmt);
|
2016-05-08 17:47:21 +08:00
|
|
|
die_routine(fmt_with_err(buf, sizeof(buf), fmt), params);
|
2009-06-27 23:58:44 +08:00
|
|
|
va_end(params);
|
|
|
|
}
|
|
|
|
|
2021-12-08 02:26:29 +08:00
#undef die_message
int die_message(const char *err, ...)
{
	va_list params;

	va_start(params, err);
	die_message_routine(err, params);
	va_end(params);
	return 128;
}
2021-12-08 02:26:33 +08:00
#undef die_message_errno
int die_message_errno(const char *fmt, ...)
{
	char buf[1024];
	va_list params;

	va_start(params, fmt);
	die_message_routine(fmt_with_err(buf, sizeof(buf), fmt), params);
	va_end(params);
	return 128;
}
2016-08-31 11:41:22 +08:00
|
|
|
#undef error_errno
|
2016-05-08 17:47:22 +08:00
int error_errno(const char *fmt, ...)
{
	char buf[1024];
	va_list params;

	va_start(params, fmt);
	error_routine(fmt_with_err(buf, sizeof(buf), fmt), params);
	va_end(params);
	return -1;
}
make error()'s constant return value more visible
When git is compiled with "gcc -Wuninitialized -O3", some
inlined calls provide an additional opportunity for the
compiler to do static analysis on variable initialization.
For example, with two functions like this:
    int get_foo(int *foo)
    {
        if (something_that_might_fail() < 0)
            return error("unable to get foo");
        *foo = 0;
        return 0;
    }
    int some_fun(void)
    {
        int foo;
        if (get_foo(&foo) < 0)
            return -1;
        printf("foo is %d\n", foo);
        return 0;
    }
If get_foo() is not inlined, then when compiling some_fun,
gcc sees only that a pointer to the local variable is
passed, and must assume that it is an out parameter that
is initialized after get_foo returns.
However, when get_foo() is inlined, the compiler may look at
all of the code together and see that some code paths in
get_foo() do not initialize the variable. As a result, it
prints a warning. But what the compiler can't see is that
error() always returns -1, and therefore we know that either
we return early from some_fun, or foo ends up initialized,
and the code is safe. The warning is a false positive.
If we can make the compiler aware that error() will always
return -1, it can do a better job of analysis. The simplest
method would be to inline the error() function. However,
this doesn't work, because gcc will not inline a variadic
function. We can work around this by defining a macro. This
relies on two gcc extensions:
1. Variadic macros (these are present in C99, but we do
not rely on that).
2. Gcc treats the "##" paste operator specially between a
comma and __VA_ARGS__, which lets our variadic macro
work even if no format parameters are passed to
error().
Since we are using these extra features, we hide the macro
behind an #ifdef. This is OK, though, because our goal was
just to help gcc.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-12-16 01:37:36 +08:00
|
|
|
#undef error
|
2005-04-19 04:04:43 +08:00
|
|
|
int error(const char *err, ...)
|
|
|
|
{
|
|
|
|
va_list params;
|
|
|
|
|
|
|
|
va_start(params, err);
|
2006-06-24 10:34:38 +08:00
|
|
|
error_routine(err, params);
|
2005-04-19 04:04:43 +08:00
|
|
|
va_end(params);
|
|
|
|
return -1;
|
|
|
|
}
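The `#undef error` before this definition exists because a macro of the same name wraps the function so that the compiler can see the constant return value. A minimal sketch of that idea is below, written with the C99 form of variadic macros rather than the gcc `##` extension the commit message mentions. `report_error`, `report_count`, and this version of `get_foo` are hypothetical names for the example.

```c
#include <stdarg.h>
#include <stdio.h>

int report_count; /* number of messages reported so far */

/* Plain variadic function: the compiler cannot inline it. */
int report_error(const char *fmt, ...)
{
	va_list params;

	va_start(params, fmt);
	vfprintf(stderr, fmt, params);
	va_end(params);
	fputc('\n', stderr);
	report_count++;
	return -1;
}

/*
 * Wrapper macro: call the function, then yield a literal -1 through the
 * comma operator. The name inside the expansion is not expanded again,
 * so it refers to the function defined above.
 */
#define report_error(...) (report_error(__VA_ARGS__), -1)

/* The caller's return value is now visibly constant on the error path. */
int get_foo(int ok, int *foo)
{
	if (!ok)
		return report_error("unable to get foo");
	*foo = 0;
	return 0;
}
```

With the macro in place, static analysis of an inlined get_foo() can conclude that either the function returned a negative value or `*foo` was assigned, which is the false positive the commit set out to remove.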
|
2006-12-22 08:48:32 +08:00
|
|
|
|
2016-05-08 17:47:22 +08:00
void warning_errno(const char *warn, ...)
{
	char buf[1024];
	va_list params;

	va_start(params, warn);
	warn_routine(fmt_with_err(buf, sizeof(buf), warn), params);
	va_end(params);
}
2007-03-31 07:07:05 +08:00
|
|
|
void warning(const char *warn, ...)
|
2006-12-22 08:48:32 +08:00
|
|
|
{
|
|
|
|
va_list params;
|
|
|
|
|
|
|
|
va_start(params, warn);
|
|
|
|
warn_routine(warn, params);
|
|
|
|
va_end(params);
|
|
|
|
}
|
usage.c: add BUG() function
There's a convention in Git's code base to write assertions
as:
    if (...some_bad_thing...)
        die("BUG: the terrible thing happened");
with the idea that users should never see a "BUG:" message
(but if they do, it at least gives a clue what happened). We
use die() here because it's convenient, but there are a few
draw-backs:
1. Without parsing the messages, it's hard for callers to
distinguish BUG assertions from regular errors.
For instance, it would be nice if the test suite could
check that we don't hit any assertions, but
test_must_fail will pass BUG deaths as OK.
2. It would be useful to add more debugging features to
BUG assertions, like file/line numbers or dumping core.
3. The die() handler can be replaced, and might not
actually exit the whole program (e.g., it may just
pthread_exit()). This is convenient for normal errors,
but for an assertion failure (which is supposed to
never happen), we're probably better off taking down
the whole process as quickly and cleanly as possible.
We could address these by checking in die() whether the
error message starts with "BUG", and behaving appropriately.
But there's little advantage at that point to sharing the
die() code, and only downsides (e.g., we can't change the
BUG() interface independently). Moreover, converting all of
the existing BUG calls reveals that the test suite does
indeed trigger a few of them.
Instead, this patch introduces a new BUG() function, which
prints an error before dying via SIGABRT. This gives us test
suite checking and core dumps. The function is actually a
macro (when supported) so that we can show the file/line
number.
We can convert die("BUG") invocations to BUG() in further
patches, dealing with any test fallouts individually.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-13 11:28:50 +08:00
|
|
|
|
2018-05-02 17:38:28 +08:00
|
|
|
/* Only set this, ever, from t/helper/, when verifying that bugs are caught. */
|
|
|
|
int BUG_exit_code;
|
|
|
|
|
2022-06-02 20:25:33 +08:00
|
|
|
static void BUG_vfl_common(const char *file, int line, const char *fmt,
|
|
|
|
va_list params)
|
2017-05-13 11:28:50 +08:00
|
|
|
{
|
|
|
|
char prefix[256];
|
|
|
|
|
|
|
|
/* truncation via snprintf is OK here */
|
C99: remove hardcoded-out !HAVE_VARIADIC_MACROS code
Remove the "else" branches of the HAVE_VARIADIC_MACROS macro, which
have been unconditionally omitted since 765dc168882 (git-compat-util:
always enable variadic macros, 2021-01-28).
Since they were always omitted, anyone trying to use a compiler without
variadic macro support to compile git v2.31.0 or later would have had a
compilation error. 10 months
across a few releases since then should have been enough time for
anyone who cared to run into that and report the issue.
In addition to that, for anyone unsetting HAVE_VARIADIC_MACROS we've
been emitting extremely verbose warnings since at least
ee4512ed481 (trace2: create new combined trace facility,
2019-02-22). That's because there is no such thing as a
"region_enter_printf" or "region_leave_printf" format, so at least
under GCC and Clang everything that includes trace.h (almost every
file) emits a couple of warnings about that.
There's a large benefit to being able to have a hard dependency rely
on variadic macros, the code surrounding usage.c is hard to maintain
if we need to write two implementations of everything, and by relying
on "__FILE__" and "__LINE__" along with "__VA_ARGS__" we can in the
future make error(), die() etc. log where they were called from. We've
also recently merged d67fc4bf0ba (Merge branch 'bc/require-c99',
2021-12-10) which further cements our hard dependency on C99.
So let's delete the fallback code, and update our CodingGuidelines to
note that we depend on this. The added bullet-point starts with
lower-case for consistency with other bullet-points in that section.
The diff in "trace.h" is relatively hard to read, since we need to
retain the existing API docs, which were comments on the code used if
HAVE_VARIADIC_MACROS was not defined.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-22 00:05:27 +08:00
|
|
|
snprintf(prefix, sizeof(prefix), "BUG: %s:%d: ", file, line);
|
usage.c: add BUG() function
There's a convention in Git's code base to write assertions
as:
if (...some_bad_thing...)
die("BUG: the terrible thing happened");
with the idea that users should never see a "BUG:" message
(but if they, it at least gives a clue what happened). We
use die() here because it's convenient, but there are a few
draw-backs:
1. Without parsing the messages, it's hard for callers to
distinguish BUG assertions from regular errors.
For instance, it would be nice if the test suite could
check that we don't hit any assertions, but
test_must_fail will pass BUG deaths as OK.
2. It would be useful to add more debugging features to
BUG assertions, like file/line numbers or dumping core.
3. The die() handler can be replaced, and might not
actually exit the whole program (e.g., it may just
pthread_exit()). This is convenient for normal errors,
but for an assertion failure (which is supposed to
never happen), we're probably better off taking down
the whole process as quickly and cleanly as possible.
We could address these by checking in die() whether the
error message starts with "BUG", and behaving appropriately.
But there's little advantage at that point to sharing the
die() code, and only downsides (e.g., we can't change the
BUG() interface independently). Moreover, converting all of
the existing BUG calls reveals that the test suite does
indeed trigger a few of them.
Instead, this patch introduces a new BUG() function, which
prints an error before dying via SIGABRT. This gives us test
suite checking and core dumps. The function is actually a
macro (when supported) so that we can show the file/line
number.
We can convert die("BUG") invocations to BUG() in further
patches, dealing with any test fallouts individually.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
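The shape of such a function might look roughly like the sketch below. It is an illustration under assumptions, not the patch itself: the sketch_* names and SKETCH_BUG() are hypothetical, and the real code routes its output through vreportf() rather than writing to stderr directly.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: builds the "BUG: <file>:<line>: " prefix. */
static int sketch_bug_prefix(char *out, size_t size, const char *file, int line)
{
	assert(size > 0);
	/* truncation via snprintf is OK for a diagnostic prefix */
	return snprintf(out, size, "BUG: %s:%d: ", file, line);
}

/* Prints the message, then dies via SIGABRT instead of calling exit(). */
static void sketch_bug_fl(const char *file, int line, const char *fmt, ...)
{
	char prefix[256];
	va_list ap;

	sketch_bug_prefix(prefix, sizeof(prefix), file, line);
	fputs(prefix, stderr);

	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	fputc('\n', stderr);

	/* no replaceable die() handler is involved; abort() can dump core */
	abort();
}

/* The macro form supplies the caller's file and line automatically. */
#define SKETCH_BUG(...) sketch_bug_fl(__FILE__, __LINE__, __VA_ARGS__)
```

A caller would then write something like SKETCH_BUG("unexpected state %d", state) where it previously wrote die("BUG: ..."), and a test harness can tell the two apart because the process is killed by a signal rather than exiting with die()'s status.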
2017-05-13 11:28:50 +08:00
	vreportf(prefix, fmt, params);
2022-06-02 20:25:33 +08:00
}

static NORETURN void BUG_vfl(const char *file, int line, const char *fmt, va_list params)
{
	va_list params_copy;
	static int in_bug;

	va_copy(params_copy, params);
	BUG_vfl_common(file, line, fmt, params);
2021-02-06 04:09:08 +08:00
	if (in_bug)
		abort();
	in_bug = 1;

	trace2_cmd_error_va(fmt, params_copy);
2018-05-02 17:38:28 +08:00
	if (BUG_exit_code)
		exit(BUG_exit_code);
2017-05-13 11:28:50 +08:00
	abort();
}
2017-05-22 06:25:39 +08:00
NORETURN void BUG_fl(const char *file, int line, const char *fmt, ...)
2017-05-13 11:28:50 +08:00
{
	va_list ap;
2022-06-02 20:25:33 +08:00
	bug_called_must_BUG = 0;
2017-05-13 11:28:50 +08:00
	va_start(ap, fmt);
	BUG_vfl(file, line, fmt, ap);
	va_end(ap);
}
add UNLEAK annotation for reducing leak false positives
It's a common pattern in git commands to allocate some
memory that should last for the lifetime of the program and
then not bother to free it, relying on the OS to throw it
away.
This keeps the code simple, and it's fast (we don't waste
time traversing structures or calling free at the end of the
program). But it also triggers warnings from memory-leak
checkers like valgrind or LSAN. They know that the memory
was still allocated at program exit, but they don't know
_when_ the leaked memory stopped being useful. If it was
early in the program, then it's probably a real and
important leak. But if it was used right up until program
exit, it's not an interesting leak and we'd like to suppress
it so that we can see the real leaks.
This patch introduces an UNLEAK() macro that lets us do so.
To understand its design, let's first look at some of the
alternatives.
Unfortunately the suppression systems offered by
leak-checking tools don't quite do what we want. A
leak-checker basically knows two things:
1. Which blocks were allocated via malloc, and the
callstack during the allocation.
2. Which blocks were left un-freed at the end of the
program (and which are unreachable, but more on that
later).
Their suppressions work by mentioning the function or
callstack of a particular allocation, and marking it as OK
to leak. So imagine you have code like this:
int cmd_foo(...)
{
/* this allocates some memory */
char *p = some_function();
printf("%s", p);
return 0;
}
You can say "ignore allocations from some_function(),
they're not leaks". But that's not right. That function may
be called elsewhere, too, and we would potentially want to
know about those leaks.
So you can say "ignore the callstack when main calls
some_function". That works, but your annotations are
brittle. In this case it's only two functions, but you can
imagine that the actual allocation is much deeper. If any of
the intermediate code changes, you have to update the
suppression.
What we _really_ want to say is that "the value assigned to
p at the end of the function is not a real leak". But
leak-checkers can't understand that; they don't know about
"p" in the first place.
However, we can do something a little bit tricky if we make
some assumptions about how leak-checkers work. They
generally don't just report all un-freed blocks. That would
report even globals which are still accessible when the
leak-check is run. Instead they take some set of memory
(like BSS) as a root and mark it as "reachable". Then they
scan the reachable blocks for anything that looks like a
pointer to a malloc'd block, and consider that block
reachable. And then they scan those blocks, and so on,
transitively marking anything reachable from a global as
"not leaked" (or at least leaked in a different category).
So we can mark the value of "p" as reachable by putting it
into a variable with program lifetime. One way to do that is
to just mark "p" as static. But that actually affects the
run-time behavior if the function is called twice (you
aren't likely to call main() twice, but some of our cmd_*()
functions are called from other commands).
Instead, we can trick the leak-checker by putting the value
into _any_ reachable bytes. This patch keeps a global
linked-list of bytes copied from "unleaked" variables. That
list is reachable even at program exit, which confers
recursive reachability on whatever values we unleak.
In other words, you can do:
int cmd_foo(...)
{
char *p = some_function();
printf("%s", p);
UNLEAK(p);
return 0;
}
to annotate "p" and suppress the leak report.
But wait, couldn't we just say "free(p)"? In this toy
example, yes. But UNLEAK()'s byte-copying strategy has
several advantages over actually freeing the memory:
1. It's recursive across structures. In many cases our "p"
is not just a pointer, but a complex struct whose
fields may have been allocated by a sub-function. And
in some cases (e.g., dir_struct) we don't even have a
function which knows how to free all of the struct
members.
By marking the struct itself as reachable, that confers
reachability on any pointers it contains (including those
found in embedded structs, or reachable by walking
heap blocks recursively).
2. It works on cases where we're not sure if the value is
allocated or not. For example:
char *p = argc > 1 ? argv[1] : some_function();
It's safe to use UNLEAK(p) here, because it's not
freeing any memory. In the case that we're pointing to
argv here, the reachability checker will just ignore
our bytes.
3. Likewise, it works even if the variable has _already_
been freed. We're just copying the pointer bytes. If
the block has been freed, the leak-checker will skip
over those bytes as uninteresting.
4. Because it's not actually freeing memory, you can
UNLEAK() before we are finished accessing the variable.
This is helpful in cases like this:
char *p = some_function();
return another_function(p);
Writing this with free() requires:
int ret;
char *p = some_function();
ret = another_function(p);
free(p);
return ret;
But with unleak we can just write:
char *p = some_function();
UNLEAK(p);
return another_function(p);
This patch adds the UNLEAK() macro and enables it
automatically when Git is compiled with SANITIZE=leak. In
normal builds it's a noop, so we pay no runtime cost.
It also adds some UNLEAK() annotations to show off how the
feature works. On top of other recent leak fixes, these are
enough to get t0000 and t0001 to pass when compiled with
LSAN.
Note the case in commit.c which actually converts a
strbuf_release() into an UNLEAK. This code was already
non-leaky, but the free didn't do anything useful, since
we're exiting. Converting it to an annotation means that
non-leak-checking builds pay no runtime cost. The cost is
minimal enough that it's probably not worth going on a
crusade to convert these kinds of frees to UNLEAKS. I did it
here for consistency with the "sb" leak (though it would
have been equally correct to go the other way, and turn them
both into strbuf_release() calls).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
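The byte-copying idea described above can be sketched as follows. This is a simplified stand-in rather than the patch's code: the sketch_* names and SKETCH_UNLEAK() are hypothetical, plain malloc() replaces Git's allocation helpers, and the macro is always active here instead of being compiled in only for leak-checking builds.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the global list of "unleaked" bytes. */
struct sketch_leak_root {
	struct sketch_leak_root *next;
	unsigned char data[]; /* C99 flexible array member */
};

/* A global, so a leak checker treats everything on the list as reachable. */
static struct sketch_leak_root *sketch_roots;

/* Copy the raw bytes of a variable into memory reachable from the global. */
static void sketch_unleak_memory(const void *ptr, size_t len)
{
	struct sketch_leak_root *root;

	assert(ptr != NULL);
	root = malloc(sizeof(*root) + len);
	if (!root)
		return; /* only an annotation; failing to record it is harmless */

	memcpy(root->data, ptr, len);
	root->next = sketch_roots;
	sketch_roots = root;
}

/* Pass the variable itself; its bytes (e.g. a pointer value) are copied. */
#define SKETCH_UNLEAK(var) sketch_unleak_memory(&(var), sizeof(var))
```

Nothing is freed and the original variable is untouched, so the annotation can be placed before the last use of "p". That property is what makes the "return another_function(p)" case above work.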
2017-09-08 14:38:41 +08:00
2022-06-02 20:25:33 +08:00
int bug_called_must_BUG;
void bug_fl(const char *file, int line, const char *fmt, ...)
{
bug_fl(): correctly initialize trace2 va_list
The code added 0cc05b044f (usage.c: add a non-fatal bug() function to go
with BUG(), 2022-06-02) sets up two va_list variables: one to output to
stderr, and one to trace2. But the order of initialization is wrong:
va_list ap, cp;
va_copy(cp, ap);
va_start(ap, fmt);
We copy the contents of "ap" into "cp" before it is initialized, meaning
it is full of garbage. The two should be swapped.
However, there's another bug, noticed by Johannes Schindelin: we forget
to call va_end() for the copy. So instead of just fixing the copy's
initialization, let's do two separate start/end pairs. This is allowed
by the standard, and we don't need to use copy here since we have access
to the original varargs. Matching the pairs with the calls makes it more
obvious that everything is being done correctly.
Note that we do call bug_fl() in the tests, but it didn't trigger this
problem because our format string doesn't have any placeholders. So even
though we were passing a garbage va_list through the stack, nobody ever
needed to look at it. We can easily adjust one of the trace2 tests to
trigger this, both for bug() and for BUG(). The latter isn't broken, but
it's nice to exercise both a bit more. Without the fix in this patch
(but with the test change), the bug() case causes a segfault.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
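The "two separate start/end pairs" pattern can be shown in isolation with a small sketch. format_twice() is a hypothetical function invented for this example; its two buffers merely stand in for the stderr and trace2 consumers.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical example: formats the same varargs twice, once per
 * consumer. Returns 1 if both passes produced identical text.
 */
static int format_twice(char *a, char *b, size_t size, const char *fmt, ...)
{
	va_list ap;

	assert(size > 0);

	/* first pass: start, consume, end */
	va_start(ap, fmt);
	vsnprintf(a, size, fmt, ap);
	va_end(ap);

	/* second pass: restarting is fine once the first va_end() has run */
	va_start(ap, fmt);
	vsnprintf(b, size, fmt, ap);
	va_end(ap);

	return !strcmp(a, b);
}
```

Because the format string here is expected to contain placeholders, a call such as format_twice(a, b, sizeof(a), "%s has %d bugs", "usage.c", 2) exercises the argument list in both passes, which is the case the original tests missed.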
2022-06-17 04:04:25 +08:00
	va_list ap;
2022-06-02 20:25:33 +08:00
	bug_called_must_BUG = 1;

	va_start(ap, fmt);
	BUG_vfl_common(file, line, fmt, ap);
	va_end(ap);
2022-06-17 04:04:25 +08:00
	va_start(ap, fmt);
	trace2_cmd_error_va(fmt, ap);
	va_end(ap);
2022-06-02 20:25:33 +08:00
}
2017-09-08 14:38:41 +08:00
#ifdef SUPPRESS_ANNOTATED_LEAKS
void unleak_memory(const void *ptr, size_t len)
{
	static struct suppressed_leak_root {
		struct suppressed_leak_root *next;
		char data[FLEX_ARRAY];
	} *suppressed_leaks;
	struct suppressed_leak_root *root;

	FLEX_ALLOC_MEM(root, data, ptr, len);
	root->next = suppressed_leaks;
	suppressed_leaks = root;
}
#endif