git/t/helper/test-delta.c

79 lines
1.8 KiB
C
Raw Normal View History

/*
* test-delta.c: test code to exercise diff-delta.c and patch-delta.c
*
* (C) 2005 Nicolas Pitre <nico@fluxnic.net>
*
* This code is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License version 2 as
* published by the Free Software Foundation.
*/
#include "git-compat-util.h"
#include "delta.h"
#include "cache.h"
static const char usage_str[] =
"test-delta (-d|-p) <from_file> <data_file> <out_file>";
add an extra level of indirection to main() There are certain startup tasks that we expect every git process to do. In some cases this is just to improve the quality of the program (e.g., setting up gettext()). In others it is a requirement for using certain functions in libgit.a (e.g., system_path() expects that you have called git_extract_argv0_path()). Most commands are builtins and are covered by the git.c version of main(). However, there are still a few external commands that use their own main(). Each of these has to remember to include the correct startup sequence, and we are not always consistent. Rather than just fix the inconsistencies, let's make this harder to get wrong by providing a common main() that can run this standard startup. We basically have two options to do this: - the compat/mingw.h file already does something like this by adding a #define that replaces the definition of main with a wrapper that calls mingw_startup(). The upside is that the code in each program doesn't need to be changed at all; it's rewritten on the fly by the preprocessor. The downside is that it may make debugging of the startup sequence a bit more confusing, as the preprocessor is quietly inserting new code. - the builtin functions are all of the form cmd_foo(), and git.c's main() calls them. This is much more explicit, which may make things more obvious to somebody reading the code. It's also more flexible (because of course we have to figure out _which_ cmd_foo() to call). The downside is that each of the builtins must define cmd_foo(), instead of just main(). This patch chooses the latter option, preferring the more explicit approach, even though it is more invasive. We introduce a new file common-main.c, with the "real" main. It expects to call cmd_main() from whatever other objects it is linked against. We link common-main.o against anything that links against libgit.a, since we know that such programs will need to do this setup. Note that common-main.o can't actually go inside libgit.a, as the linker would not pick up its main() function automatically (it has no callers). The rest of the patch is just adjusting all of the various external programs (mostly in t/helper) to use cmd_main(). I've provided a global declaration for cmd_main(), which means that all of the programs also need to match its signature. In particular, many functions need to switch to "const char **" instead of "char **" for argv. This effect ripples out to a few other variables and functions, as well. This makes the patch even more invasive, but the end result is much better. We should be treating argv strings as const anyway, and now all programs conform to the same signature (which also matches the way builtins are defined). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 13:58:58 +08:00
int cmd_main(int argc, const char **argv)
{
int fd;
struct stat st;
void *from_buf, *data_buf, *out_buf;
unsigned long from_size, data_size, out_size;
if (argc != 5 || (strcmp(argv[1], "-d") && strcmp(argv[1], "-p"))) {
fprintf(stderr, "usage: %s\n", usage_str);
return 1;
}
fd = open(argv[2], O_RDONLY);
if (fd < 0 || fstat(fd, &st)) {
perror(argv[2]);
return 1;
}
from_size = st.st_size;
from_buf = mmap(NULL, from_size, PROT_READ, MAP_PRIVATE, fd, 0);
if (from_buf == MAP_FAILED) {
perror(argv[2]);
close(fd);
return 1;
}
close(fd);
fd = open(argv[3], O_RDONLY);
if (fd < 0 || fstat(fd, &st)) {
perror(argv[3]);
return 1;
}
data_size = st.st_size;
data_buf = mmap(NULL, data_size, PROT_READ, MAP_PRIVATE, fd, 0);
if (data_buf == MAP_FAILED) {
perror(argv[3]);
close(fd);
return 1;
}
close(fd);
if (argv[1][1] == 'd')
out_buf = diff_delta(from_buf, from_size,
data_buf, data_size,
&out_size, 0);
else
out_buf = patch_delta(from_buf, from_size,
data_buf, data_size,
&out_size);
if (!out_buf) {
fprintf(stderr, "delta operation failed (returned NULL)\n");
return 1;
}
fd = open (argv[4], O_WRONLY|O_CREAT|O_TRUNC, 0666);
avoid "write_in_full(fd, buf, len) != len" pattern The return value of write_in_full() is either "-1", or the requested number of bytes[1]. If we make a partial write before seeing an error, we still return -1, not a partial value. This goes back to f6aa66cb95 (write_in_full: really write in full or return error on disk full., 2007-01-11). So checking anything except "was the return value negative" is pointless. And there are a couple of reasons not to do so: 1. It can do a funny signed/unsigned comparison. If your "len" is signed (e.g., a size_t) then the compiler will promote the "-1" to its unsigned variant. This works out for "!= len" (unless you really were trying to write the maximum size_t bytes), but is a bug if you check "< len" (an example of which was fixed recently in config.c). We should avoid promoting the mental model that you need to check the length at all, so that new sites are not tempted to copy us. 2. Checking for a negative value is shorter to type, especially when the length is an expression. 3. Linus says so. In d34cf19b89 (Clean up write_in_full() users, 2007-01-11), right after the write_in_full() semantics were changed, he wrote: I really wish every "write_in_full()" user would just check against "<0" now, but this fixes the nasty and stupid ones. Appeals to authority aside, this makes it clear that writing it this way does not have an intentional benefit. It's a historical curiosity that we never bothered to clean up (and which was undoubtedly cargo-culted into new sites). So let's convert these obviously-correct cases (this includes write_str_in_full(), which is just a wrapper for write_in_full()). [1] A careful reader may notice there is one way that write_in_full() can return a different value. If we ask write() to write N bytes and get a return value that is _larger_ than N, we could return a larger total. But besides the fact that this would imply a totally broken version of write(), it would already invoke undefined behavior. Our internal remaining counter is an unsigned size_t, which means that subtracting too many byte will wrap it around to a very large number. So we'll instantly begin reading off the end of the buffer, trying to write gigabytes (or petabytes) of data. Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 01:16:03 +08:00
if (fd < 0 || write_in_full(fd, out_buf, out_size) < 0) {
perror(argv[4]);
return 1;
}
return 0;
}