refs: add DO_FOR_EACH_OMIT_DANGLING_SYMREFS flag

When the DO_FOR_EACH_INCLUDE_BROKEN flag is used, we include both actual
corrupt refs (illegal names, missing objects), but also symrefs that
point to nothing. This latter is not really a corruption, but just
something that may happen normally. For example, the symref at
refs/remotes/origin/HEAD may point to a tracking branch which is later
deleted. (The local HEAD may also be unborn, of course, but we do not
access it through ref iteration).

Most callers of for_each_ref() etc, do not care. They don't pass
INCLUDE_BROKEN, so don't see it at all. But for those which do pass it,
this somewhat-normal state causes extra warnings (e.g., from
for-each-ref) or even aborts operations (destructive repacks with
GIT_REF_PARANOIA set).

This patch just introduces the flag and the mechanism; there are no
callers yet (and hence no tests). Two things to note on the
implementation:

  - we actually skip any symref that does not resolve to a ref. This
    includes ones which point to an invalidly-named ref. You could argue
    this is a more serious breakage than simple dangling. But the
    overall effect is the same (we could not follow the symref), as well
    as the impact on things like REF_PARANOIA (either way, a symref we
    can't follow won't impact reachability, because we'll see the ref
    itself during iteration). The underlying resolution function doesn't
    distinguish these two cases (they both get REF_ISBROKEN).

  - we change the iterator in refs/files-backend.c where we check
    INCLUDE_BROKEN. There's a matching spot in refs/packed-backend.c,
    but we don't know need to do anything there. The packed backend does
    not support symrefs at all.

The resulting set of flags might be a bit easier to follow if we broke
this down into "INCLUDE_CORRUPT_REFS" and "INCLUDE_DANGLING_SYMREFS".
But there are a few reasons not do so:

  - adding a new OMIT_DANGLING_SYMREFS flag lets us leave existing
    callers intact, without changing their behavior (and some of them
    really do want to see the dangling symrefs; e.g., t5505 has a test
    which expects us to report when a symref becomes dangling)

  - they're not actually independent. You cannot say "include dangling
    symrefs" without also including refs whose objects are not
    reachable, because dangling symrefs by definition do not have an
    object. We could tweak the implementation to distinguish this, but
    in practice nobody wants to ask for that. Adding the OMIT flag keeps
    the implementation simple and makes sure we don't regress the
    current behavior.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit is contained in:
Jeff King 2021-09-24 14:41:32 -04:00 committed by Junio C Hamano
parent 9aab952e85
commit 8dccb2244c
2 changed files with 11 additions and 0 deletions

View File

@ -744,6 +744,11 @@ static int files_ref_iterator_advance(struct ref_iterator *ref_iterator)
ref_type(iter->iter0->refname) != REF_TYPE_PER_WORKTREE)
continue;
if ((iter->flags & DO_FOR_EACH_OMIT_DANGLING_SYMREFS) &&
(iter->iter0->flags & REF_ISSYMREF) &&
(iter->iter0->flags & REF_ISBROKEN))
continue;
if (!(iter->flags & DO_FOR_EACH_INCLUDE_BROKEN) &&
!ref_resolves_to_object(iter->iter0->refname,
iter->iter0->oid,

View File

@ -268,6 +268,12 @@ enum do_for_each_ref_flags {
* per-worktree refs.
*/
DO_FOR_EACH_PER_WORKTREE_ONLY = (1 << 1),
/*
* Omit dangling symrefs from output; this only has an effect with
* INCLUDE_BROKEN, since they are otherwise not included at all.
*/
DO_FOR_EACH_OMIT_DANGLING_SYMREFS = (1 << 2),
};
/*