git/connected.c
Junio C Hamano d21c463d55 fetch/receive: remove over-pessimistic connectivity check
Git 1.7.8 introduced an object and history re-validation step after
"fetch" or "push" causes new history to be added to a receiving
repository. This is to protect against a malicious server or pushing
client corrupting the repository by taking advantage of an existing
corrupt object that is unconnected to existing history.

But this check is way over-pessimistic.  During "fetch" or "receive-pack"
(the server side of "push"), unpack-objects and index-pack already
validate individual objects that are received, and the only thing we would
want to catch are corrupted objects that already happen to exist in our
repository but are not referenced from our refs.  Such objects must have
been written by an earlier run of our codepaths that write out loose
objects or packfiles, and they must have done the validation of individual
objects when they did so.  The only thing left to worry about is the
connectivity integrity, which can be checked with "rev-list --objects",
which is much cheaper.  We have been paying the 5x to 8x runtime overhead
that "rev-list --verify-objects" often adds, for no real gain.

Revert check_everything_connected() not to use this over-pessimistic
check.

Credit goes to Nguyễn Thái Ngọc Duy, who originally identified the
performance regression and endured multiple rounds of reviews to fix it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-03-15 15:23:17 -07:00

#include "cache.h"
#include "run-command.h"
#include "sigchain.h"
#include "connected.h"

/*
 * If we feed all the commits we want to verify to this command
 *
 *  $ git rev-list --objects --stdin --not --all
 *
 * and if it does not error out, that means everything reachable from
 * these commits locally exists and is connected to our existing refs.
 * Note that this does _not_ validate the individual objects.
 *
 * Returns 0 if everything is connected, non-zero otherwise.
 */
int check_everything_connected(sha1_iterate_fn fn, int quiet, void *cb_data)
{
	struct child_process rev_list;
	/* The spare NULL slot at argv[5] is filled with "--quiet" on request. */
	const char *argv[] = {"rev-list", "--objects",
			      "--stdin", "--not", "--all", NULL, NULL};
	char commit[41];
	unsigned char sha1[20];
	int err = 0;

	/* fn yields no object at all: nothing to verify. */
	if (fn(cb_data, sha1))
		return err;

	if (quiet)
		argv[5] = "--quiet";

	memset(&rev_list, 0, sizeof(rev_list));
	rev_list.argv = argv;
	rev_list.git_cmd = 1;
	rev_list.in = -1;	/* have start_command() create a pipe to stdin */
	rev_list.no_stdout = 1;
	rev_list.no_stderr = quiet;
	if (start_command(&rev_list))
		return error(_("Could not run 'git rev-list'"));

	/* If rev-list goes away early, get a write error instead of dying. */
	sigchain_push(SIGPIPE, SIG_IGN);

	/* Feed one "<40-hex>\n" line per object; sha1 already holds the first. */
	commit[40] = '\n';
	do {
		memcpy(commit, sha1_to_hex(sha1), 40);
		if (write_in_full(rev_list.in, commit, 41) < 0) {
			/* EPIPE and EINVAL are not reported, but still count as failure. */
			if (errno != EPIPE && errno != EINVAL)
				error(_("failed write to rev-list: %s"),
				      strerror(errno));
			err = -1;
			break;
		}
	} while (!fn(cb_data, sha1));

	if (close(rev_list.in)) {
		error(_("failed to close rev-list's stdin: %s"), strerror(errno));
		err = -1;
	}

	sigchain_pop(SIGPIPE);
	/* Non-zero if rev-list failed or if feeding it failed. */
	return finish_command(&rev_list) || err;
}
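
The pattern used above (fetch the first item, spawn a child, ignore SIGPIPE, stream one line per item to the child's stdin, close the pipe, then combine the child's exit status with any write error) can be sketched outside Git with plain POSIX calls. This is only an illustration under stated assumptions: it uses fork/execvp/waitpid rather than Git's run-command and sigchain APIs, and the names `line_iterate_fn`, `id_list`, `next_id`, `write_all` and `feed_child` are hypothetical, not part of Git.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical iterator: returns 0 and fills `line`, non-zero when exhausted. */
typedef int (*line_iterate_fn)(void *cb_data, char *line);

/* Hypothetical callback data: a fixed array of 40-character hex IDs. */
struct id_list {
	const char **ids;
	size_t nr, pos;
};

int next_id(void *cb_data, char *line)
{
	struct id_list *list = cb_data;

	if (list->pos >= list->nr)
		return 1;
	assert(strlen(list->ids[list->pos]) == 40);
	memcpy(line, list->ids[list->pos++], 40);
	line[40] = '\n';
	return 0;
}

/* Write the whole buffer, retrying on short writes and EINTR. */
int write_all(int fd, const char *buf, size_t len)
{
	while (len) {
		ssize_t n = write(fd, buf, len);
		if (n < 0 && errno == EINTR)
			continue;
		if (n < 0)
			return -1;
		buf += n;
		len -= (size_t)n;
	}
	return 0;
}

/*
 * Feed every line produced by fn to the stdin of the command in argv.
 * Returns 0 if all writes succeeded and the child exited with status 0.
 */
int feed_child(char *const argv[], line_iterate_fn fn, void *cb_data)
{
	int fds[2], status, err = 0;
	char line[41];
	void (*old_handler)(int);
	pid_t pid;

	if (fn(cb_data, line))
		return 0; /* nothing to feed, so do not spawn anything */
	if (pipe(fds) < 0)
		return -1;
	pid = fork();
	if (pid < 0) {
		close(fds[0]);
		close(fds[1]);
		return -1;
	}
	if (!pid) {
		/* Child: the pipe's read end becomes stdin, stdout is discarded. */
		int null_fd = open("/dev/null", O_WRONLY);
		if (null_fd >= 0) {
			dup2(null_fd, 1);
			close(null_fd);
		}
		dup2(fds[0], 0);
		close(fds[0]);
		close(fds[1]);
		execvp(argv[0], argv);
		_exit(127);
	}
	close(fds[0]);

	/* A child that exits early should cause EPIPE, not kill this process. */
	old_handler = signal(SIGPIPE, SIG_IGN);
	do {
		if (write_all(fds[1], line, sizeof(line)) < 0) {
			err = -1;
			break;
		}
	} while (!fn(cb_data, line));
	if (close(fds[1]))
		err = -1;
	signal(SIGPIPE, old_handler);

	while (waitpid(pid, &status, 0) < 0)
		if (errno != EINTR)
			return -1;
	return (WIFEXITED(status) && !WEXITSTATUS(status)) ? err : -1;
}
```

As in the function above, the first item is fetched before anything is spawned, so an empty input costs no child process, and a failed write never masks the child's exit status: either one makes the result non-zero.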