Commit Graph

226 Commits

Author SHA1 Message Date
Junio C Hamano
143682ec43 Merge branch 'ps/pack-refs-auto-heuristics'
"git pack-refs --auto" for the files backend was too aggressive,
which has been a bit tamed.

* ps/pack-refs-auto-heuristics:
  refs/files: use heuristic to decide whether to repack with `--auto`
  t0601: merge tests for auto-packing of refs
  wrapper: introduce `log2u()`
2024-09-12 11:47:23 -07:00
Patrick Steinhardt
d343068e4a wrapper: introduce log2u()
We have an implementation of a function that computes the log2 for an
integer. While we could instead use log2(3P), that involves floating
point numbers and is thus needlessly complex and inefficient.

We're about to add a second caller that wants to compute log2 for a
`size_t`. Let's thus move the function into "wrapper.h" such that it
becomes generally available.

While at it, tweak the implementation a bit:

  - The parameter is converted from `int` to `uintmax_t`. This
    conversion is safe to do in "bisect.c" because we already check that
    the argument is positive.

  - The return value is an `unsigned`. It cannot ever be negative, so it
    is pointless for it to be a signed integer.

  - Loop until `!n` instead of requiring that `n > 1` and then subtract
    1 from the result and add a special case for `!sz`. This helps
    compilers to generate more efficient code.

Compilers recognize the pattern of this function and optimize
accordingly. On GCC 14.2 x86_64:

    log2u(unsigned long):
            test    rdi, rdi
            je      .L3
            bsr     rax, rdi
            ret
    .L3:
            mov     eax, -1
            ret

Clang 18.1 does not yet recognize the pattern, but starts to do so on
Clang trunk x86_64. The code isn't quite as efficient as the one
generated by GCC, but still manages to optimize away the loop:

    log2u(unsigned long):
            test    rdi, rdi
            je      .LBB0_1
            shr     rdi
            bsr     rcx, rdi
            mov     eax, 127
            cmovne  rax, rcx
            xor     eax, -64
            add     eax, 65
            ret
    .LBB0_1:
            mov     eax, -1
            ret

The pattern is also recognized on other platforms like ARM64 GCC 14.2.0,
where we end up using `clz`:

    log2u(unsigned long):
            clz     x2, x0
            cmp     x0, 0
            mov     w1, 63
            sub     w0, w1, w2
            csinv   w0, w0, wzr, ne
            ret

Note that we have a similar function `fastlog2()` in the reftable code.
As that codebase is separate from the Git codebase we do not adapt it to
use the new function.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-04 08:03:24 -07:00
John Cai
e8207717f1 refs: add referent to each_ref_fn
Add a parameter to each_ref_fn so that callers to the ref APIs
that use this function as a callback can have acess to the
unresolved value of a symbolic ref.

Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-09 08:47:34 -07:00
Patrick Steinhardt
e7da938570 global: introduce USE_THE_REPOSITORY_VARIABLE macro
Use of the `the_repository` variable is deprecated nowadays, and we
slowly but steadily convert the codebase to not use it anymore. Instead,
callers should be passing down the repository to work on via parameters.

It is hard though to prove that a given code unit does not use this
variable anymore. The most trivial case, merely demonstrating that there
is no direct use of `the_repository`, is already a bit of a pain during
code reviews as the reviewer needs to manually verify claims made by the
patch author. The bigger problem though is that we have many interfaces
that implicitly rely on `the_repository`.

Introduce a new `USE_THE_REPOSITORY_VARIABLE` macro that allows code
units to opt into usage of `the_repository`. The intent of this macro is
to demonstrate that a certain code unit does not use this variable
anymore, and to keep it from new dependencies on it in future changes,
be it explicit or implicit

For now, the macro only guards `the_repository` itself as well as
`the_hash_algo`. There are many more known interfaces where we have an
implicit dependency on `the_repository`, but those are not guarded at
the current point in time. Over time though, we should start to add
guards as required (or even better, just remove them).

Define the macro as required in our code units. As expected, most of our
code still relies on the global variable. Nearly all of our builtins
rely on the variable as there is no way yet to pass `the_repository` to
their entry point. For now, declare the macro in "biultin.h" to keep the
required changes at least a little bit more contained.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 10:26:33 -07:00
Patrick Steinhardt
2e5c4758b7 cocci: apply rules to rewrite callers of "refs" interfaces
Apply the rules that rewrite callers of "refs" interfaces to explicitly
pass `struct ref_store`. The resulting patch has been applied with the
`--whitespace=fix` option.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-07 10:06:59 -07:00
Peter Krefting
8198993c81 bisect: report the found commit with "show"
When "git bisect" finds the first bad commit and shows it to the user,
it calls "git diff-tree" to do so, whose output is meant to be stable
and deliberately ignores end-user customizations.

As the output is supposed to be consumed by humans, replace this with
a call to "git show". This command honors configuration options (such
as "log.date" and "log.mailmap") and other UI improvements (renames
are detected).

Pass some hard-coded options to "git show" to make the output similar
to the one we are replacing, such as showing a patch summary only.

Reported-by: Michael Osipov <michael.osipov@innomotics.com>
Signed-off-By: Peter Krefting <peter@softwolves.pp.se>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-15 11:29:09 -07:00
Johannes Schindelin
5317380521 commit-reach(repo_get_merge_bases_many): pass on "missing commits" errors
The `merge_bases_many()` function was just taught to indicate parsing
errors, and now the `repo_get_merge_bases_many()` function is aware of
that, too.

Naturally, there are a lot of callers that need to be adjusted now, too.

Next stop: `repo_get_merge_bases_dirty()`.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-29 08:06:01 -08:00
Junio C Hamano
c59ba68ea7 Merge branch 'js/check-null-from-read-object-file'
The code paths that call repo_read_object_file() have been
tightened to react to errors.

* js/check-null-from-read-object-file:
  Always check the return value of `repo_read_object_file()`
2024-02-14 15:36:06 -08:00
Johannes Schindelin
568459bf5e Always check the return value of repo_read_object_file()
There are a couple of places in Git's source code where the return value
is not checked. As a consequence, they are susceptible to segmentation
faults.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-06 10:42:28 -08:00
Junio C Hamano
492ee03f60 Merge branch 'en/header-cleanup'
Remove unused header "#include".

* en/header-cleanup:
  treewide: remove unnecessary includes in source files
  treewide: add direct includes currently only pulled in transitively
  trace2/tr2_tls.h: remove unnecessary include
  submodule-config.h: remove unnecessary include
  pkt-line.h: remove unnecessary include
  line-log.h: remove unnecessary include
  http.h: remove unnecessary include
  fsmonitor--daemon.h: remove unnecessary includes
  blame.h: remove unnecessary includes
  archive.h: remove unnecessary include
  treewide: remove unnecessary includes in source files
  treewide: remove unnecessary includes from header files
2024-01-08 14:05:15 -08:00
Elijah Newren
eea0e59ffb treewide: remove unnecessary includes in source files
Each of these were checked with
   gcc -E -I. ${SOURCE_FILE} | grep ${HEADER_FILE}
to ensure that removing the direct inclusion of the header actually
resulted in that header no longer being included at all (i.e. that
no other header pulled it in transitively).

...except for a few cases where we verified that although the header
was brought in transitively, nothing from it was directly used in
that source file.  These cases were:
  * builtin/credential-cache.c
  * builtin/pull.c
  * builtin/send-pack.c

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26 12:04:31 -08:00
Patrick Steinhardt
0a06892ddd bisect: consistently write BISECT_EXPECTED_REV via the refdb
We're inconsistently writing BISECT_EXPECTED_REV both via the filesystem
and via the refdb, which violates the newly established rules for how
special refs must be treated. This works alright in practice with the
reffiles reference backend, but will cause bugs once we gain additional
backends.

Fix this issue and consistently write BISECT_EXPECTED_REV via the refdb
so that it is no longer a special ref.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-14 09:25:27 -08:00
Elijah Newren
a034e9106f object-store-ll.h: split this header out of object-store.h
The vast majority of files including object-store.h did not need dir.h
nor khash.h.  Split the header into two files, and let most just depend
upon object-store-ll.h, while letting the two callers that need it
depend on the full object-store.h.

After this patch:
    $ git grep -h include..object-store | sort | uniq -c
          2 #include "object-store.h"
        129 #include "object-store-ll.h"

Diff best viewed with `--color-moved`.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:54 -07:00
Elijah Newren
c339932bd8 repository: remove unnecessary include of path.h
This also made it clear that several .c files that depended upon path.h
were missing a #include for it; add the missing includes while at it.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
Elijah Newren
d812c3b6a0 treewide: remove cache.h inclusion due to object.h changes
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:10 -07:00
Elijah Newren
dabab1d6e6 object-name.h: move declarations for object-name.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:09 -07:00
Junio C Hamano
e7dca80692 Merge branch 'ab/remove-implicit-use-of-the-repository' into en/header-split-cache-h
* ab/remove-implicit-use-of-the-repository:
  libs: use "struct repository *" argument, not "the_repository"
  post-cocci: adjust comments for recent repo_* migration
  cocci: apply the "revision.h" part of "the_repository.pending"
  cocci: apply the "rerere.h" part of "the_repository.pending"
  cocci: apply the "refs.h" part of "the_repository.pending"
  cocci: apply the "promisor-remote.h" part of "the_repository.pending"
  cocci: apply the "packfile.h" part of "the_repository.pending"
  cocci: apply the "pretty.h" part of "the_repository.pending"
  cocci: apply the "object-store.h" part of "the_repository.pending"
  cocci: apply the "diff.h" part of "the_repository.pending"
  cocci: apply the "commit.h" part of "the_repository.pending"
  cocci: apply the "commit-reach.h" part of "the_repository.pending"
  cocci: apply the "cache.h" part of "the_repository.pending"
  cocci: add missing "the_repository" macros to "pending"
  cocci: sort "the_repository" rules by header
  cocci: fix incorrect & verbose "the_repository" rules
  cocci: remove dead rule from "the_repository.pending.cocci"
2023-04-04 08:25:52 -07:00
Ævar Arnfjörð Bjarmason
bab821646a cocci: apply the "pretty.h" part of "the_repository.pending"
Apply the part of "the_repository.pending.cocci" pertaining to
"pretty.h".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:45 -07:00
Ævar Arnfjörð Bjarmason
bc726bd075 cocci: apply the "object-store.h" part of "the_repository.pending"
Apply the part of "the_repository.pending.cocci" pertaining to
"object-store.h".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:45 -07:00
Ævar Arnfjörð Bjarmason
cb338c23d6 cocci: apply the "commit-reach.h" part of "the_repository.pending"
Apply the part of "the_repository.pending.cocci" pertaining to
"commit-reach.h".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 07:36:36 -07:00
Elijah Newren
32a8f51061 environment.h: move declarations for environment.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:53 -07:00
Elijah Newren
f394e093df treewide: be explicit about dependence on gettext.h
Dozens of files made use of gettext functions, without explicitly
including gettext.h.  This made it more difficult to find which files
could remove a dependence on cache.h.  Make C files explicitly include
gettext.h if they are using it.

However, while compat/fsmonitor/fsm-ipc-darwin.c should also gain an
include of gettext.h, it was left out to avoid conflicting with an
in-flight topic.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:51 -07:00
Elijah Newren
41771fa435 cache.h: remove dependence on hex.h; make other files include it explicitly
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 17:25:29 -08:00
Johannes Schindelin
de54b5fec4 bisect: no longer try to clean up left-over .git/head-name files
As per the code comment, the `.git/head-name` files were cleaned up for
backwards-compatibility: an old version of `git bisect` could have left
them behind.

Now, just how old would such a version be? As of 0f497e75f0 (Eliminate
confusing "won't bisect on seeked tree" failure, 2008-02-23), `git
bisect` does not write that file anymore. Which corresponds to Git
v1.5.4.4.

Even if the likelihood is non-nil that there might still be users out
there who use such an old version to start a bisection, but then decide
to continue bisecting with a current Git version, it is highly
improbable.

So let's remove that code, at long last.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-13 14:17:14 -08:00
René Scharfe
ddbb47fde9 replace and remove run_command_v_opt()
Replace the remaining calls of run_command_v_opt() with run_command()
calls and explict struct child_process variables.  This is more verbose,
but not by much overall.  The code becomes more flexible, e.g. it's easy
to extend to conditionally add a new argument.

Then remove the now unused function and its own flag names, simplifying
the run-command API.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-30 14:04:51 -04:00
René Scharfe
48750b2d0d bisect: simplify building "checkout" argument list
Reduce the scope of argv_checkout, which allows to fully build it during
initialization.  Use oid_to_hex() instead of oid_to_hex_r(), because
that's simpler and using the static buffer of the former is just as safe
as the old static argv_checkout.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-10-30 14:04:35 -04:00
Ævar Arnfjörð Bjarmason
5cf88fd8b0 git-compat-util.h: use "UNUSED", not "UNUSED(var)"
As reported in [1] the "UNUSED(var)" macro introduced in
2174b8c75d (Merge branch 'jk/unused-annotation' into next,
2022-08-24) breaks coccinelle's parsing of our sources in files where
it occurs.

Let's instead partially go with the approach suggested in [2] of
making this not take an argument. As noted in [1] "coccinelle" will
ignore such tokens in argument lists that it doesn't know about, and
it's less of a surprise to syntax highlighters.

This undoes the "help us notice when a parameter marked as unused is
actually use" part of 9b24034754 (git-compat-util: add UNUSED macro,
2022-08-19), a subsequent commit will further tweak the macro to
implement a replacement for that functionality.

1. https://lore.kernel.org/git/220825.86ilmg4mil.gmgdl@evledraar.gmail.com/
2. https://lore.kernel.org/git/220819.868rnk54ju.gmgdl@evledraar.gmail.com/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-09-01 10:49:48 -07:00
Jeff King
63e14ee2d6 refs: mark unused each_ref_fn parameters
Functions used with for_each_ref(), etc, need to conform to the
each_ref_fn interface. But most of them don't need every parameter;
let's annotate the unused ones to quiet -Wunused-parameter.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-19 12:18:54 -07:00
Ævar Arnfjörð Bjarmason
f92dbdbc6a revisions API: don't leak memory on argv elements that need free()-ing
Add a "free_removed_argv_elements" member to "struct
setup_revision_opt", and use it to fix several memory leaks.

We have various memory leaks in APIs that take and munge "const
char **argv", e.g. parse_options(). Sometimes these APIs are given the
"argv" we get to the "main" function, in which case we don't leak
memory, but other times we're giving it the "v" member of a "struct
strvec" we created.

There's several potential ways to fix those sort of leaks, we could
add a "nodup" mode to "struct strvec", which would work for the cases
where we push constant strings to it. But that wouldn't work as soon
as we used strvec_pushf(), or otherwise needed to duplicate or create
a string for that "struct strvec".

Let's instead make it the responsibility of the revisions API. If it's
going to clobber elements of argv it can also free() them, which it
will now do if instructed to do so via "free_removed_argv_elements".

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-03 11:12:36 -07:00
Ævar Arnfjörð Bjarmason
57efebb9b9 bisect.c: partially fix bisect_rev_setup() memory leak
Partially fix the memory leak noted in in 8a534b6124 (bisect: use
argv_array API, 2011-09-13), which added the "XXX" comment seen in the
context. We can partially fix it by having the bisect_rev_setup()
function take a "struct strvec", rather than constructing it.

As the comment notes we need to keep the construct "rev_argv" around
while the "struct rev_info" is around, which as seen in the newly
added "strvec_clear()" calls here we do after "release_revisions()".

This "partially" fixes the memory leak because we're leaking the "--"
added to the "rev_argv" here still, which will be addressed in a
subsequent commit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-03 11:01:03 -07:00
Ævar Arnfjörð Bjarmason
c5365e93fd bisect.c: add missing "goto" for release_revisions()
Add a missing "goto cleanup", this fixes a bug in
f196c1e908 (revisions API users: use release_revisions() needing
REV_INFO_INIT, 2022-04-13).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-03 10:12:12 -07:00
Ævar Arnfjörð Bjarmason
f196c1e908 revisions API users: use release_revisions() needing REV_INFO_INIT
Use release_revisions() to various users of "struct rev_list" which
need to have their "struct rev_info" zero-initialized before we can
start using it.

For the bundle.c code see the early exit case added in
3bbbe467f2 (bundle verify: error out if called without an object
database, 2019-05-27).

For the relevant bisect.c code see 45b6370812 (bisect: libify
`check_good_are_ancestors_of_bad` and its dependents, 2020-02-17).

For the submodule.c code see the "goto" on "(!left || !right || !sub)"
added in 8e6df65015 (submodule: refactor show_submodule_summary with
helper function, 2016-08-31).

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-04-13 23:56:08 -07:00
Ævar Arnfjörð Bjarmason
2108fe4a19 revisions API users: add straightforward release_revisions()
Add a release_revisions() to various users of "struct rev_list" in
those straightforward cases where we only need to add the
release_revisions() call to the end of a block, and don't need to
e.g. refactor anything to use a "goto cleanup" pattern.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-04-13 23:56:08 -07:00
René Scharfe
48af1fdee3 bisect--helper: double-check run command on exit code 126 and 127
When a run command cannot be executed or found, shells return exit code
126 or 127, respectively.  Valid run commands are allowed to return
these codes as well to indicate bad revisions, though, for historical
reasons.  This means typos can cause bogus bisect runs that go over the
full distance and end up reporting invalid results.

The best solution would be to reserve exit codes 126 and 127, like
71b0251cdd (Bisect run: "skip" current commit if script exit code is
125., 2007-10-26) did for 125, and abort bisect run when we get them.
That might be inconvenient for those who relied on the documentation
stating that 126 and 127 can be used for bad revisions, though.

The workaround used by this patch is to run the command on a known-good
revision and abort if we still get the same error code.  This adds one
step to runs with scripts that use exit codes 126 and 127, but keeps
them supported, with one exception: It won't work with commands that
cannot recognize the (manually marked) known-good revision as such.

Run commands that use low exit codes are unaffected.  Typos are reported
after executing the missing command twice and three checkouts (the first
step, the known good revision and back to the revision of the first
step).

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-01-19 09:35:03 -08:00
Junio C Hamano
1fcc40cd1d bisect: simplify return code from bisect_checkout()
The function was designed to return only BISECT_OK (0) or
BISECT_FAILED (-1) and no other values, but there were two issues:

 - The comment misspelled BISECT_FAILED as BISECT_FAILURE, even
   though the logic it described (i.e. any non-zero return should be
   reported as a single BISECT_FAILED) was correct.

 - It took the return value from run_command_v_opt(), and assumed it
   was either -1 or 1 upon error, which is not the case; it can relay
   errors from wait_or_whine(), which can report exit status of the
   child process.

Translate any error return from run_command_v_opt() to BISECT_FAILED,
and simplify the resulting code by losing the 'res' variable that is
no longer needed.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-28 10:57:37 -07:00
Junio C Hamano
ffcb4e94d3 bisect: do not run show-branch just to show the current commit
In scripted versions of "git bisect", we used "git show-branch" to
describe a single commit in the bisect log and also to the interactive
user after checking out the next version to be tested.

The former use of "git show-branch" was lost when the helper
function that wrote bisect log entries was rewritten at 0f30233a
(bisect--helper: `bisect_write` shell function in C, 2019-01-02) in
C

But we've kept the latter ever since 0871984d (bisect: make "git
bisect" use new "--next-all" bisect-helper function, 2009-05-09)
started using the faithful C-rewrite introduced at ef24c7ca
(bisect--helper: add "--next-exit" to output bisect results,
2009-04-19).

Showing "[<full hex>] <subject>" is simple enough with our helper
pretty.c::format_commit_message() and spawning show-branch is an
overkill.  Let's lose one external process.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-28 10:57:26 -07:00
René Scharfe
ca56dadb4b use CALLOC_ARRAY
Add and apply a semantic patch for converting code that open-codes
CALLOC_ARRAY to use it instead.  It shortens the code and infers the
element size automatically.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-13 16:00:09 -08:00
Junio C Hamano
1c8f5dfa42 Merge branch 'js/params-vs-args'
Messages update.

* js/params-vs-args:
  replace "parameters" by "arguments" in error messages
2021-02-25 16:43:32 -08:00
Johannes Sixt
b865734760 replace "parameters" by "arguments" in error messages
When an error message informs the user about an incorrect command
invocation, it should refer to "arguments", not "parameters".

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-23 13:30:45 -08:00
Martin Ågren
bc62692757 hash-lookup: rename from sha1-lookup
Change all remnants of "sha1" in hash-lookup.c and .h and rename them to
reflect that we're not just able to handle SHA-1 these days.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-04 13:01:55 -08:00
Junio C Hamano
2557c1183a Merge branch 'sg/bisect-approximately-halfway'
"git bisect start/next" in a large span of history spends a lot of
time trying to come up with exactly the half-way point; this can be
optimized by stopping when we see a commit that is close enough to
the half-way point.

* sg/bisect-approximately-halfway:
  bisect: loosen halfway() check for a large number of commits
2020-11-25 15:24:52 -08:00
SZEDER Gábor
0afcea70b1 bisect: loosen halfway() check for a large number of commits
'git bisect start ...' and subsequent 'git bisect (good|bad)' commands
can take quite a while when the given/remaining revision range between
good and bad commits is big and contains a lot of merge commits, e.g.
in git.git:

  $ git rev-list --count v1.6.0..v2.28.0
  44284
  $ time git bisect start v2.28.0 v1.6.0
  Bisecting: 22141 revisions left to test after this (roughly 15 steps)
  [e197c21807] unable_to_lock_die(): rename function from unable_to_lock_index_die()

  real    0m15.472s
  user    0m15.220s
  sys     0m0.255s

The majority of the runtime is spent in do_find_bisection(), where we
try to find a commit as close as possible to the halfway point between
the bad and good revisions, i.e. a commit from which the number of
reachable commits that are in the good-bad range is half the total
number of commits in that range.  So we count how many commits are
reachable in the good-bad range for each commit in that range, which
is quick and easy for a linear history, even over 300k commits in a
linear range are handled in ~0.3s on my machine.  Alas, handling merge
commits is non-trivial and quite expensive as the algorithm used seems
to be quadratic, causing the long runtime shown above.

Interestingly, look at what a big difference one additional commit
can make:

  $ git rev-list --count v1.6.0^..v2.28.0
  44285
  $ time git bisect start v2.28.0 v1.6.0^
  Bisecting: 22142 revisions left to test after this (roughly 15 steps)
  [565301e416] Sync with 2.1.2

  real  0m5.848s
  user  0m5.600s
  sys   0m0.252s

The difference is caused by one of the optimizations attempting to cut
down the runtime added in 1c4fea3a40 (git-rev-list --bisect:
optimization, 2007-03-21):

    Another small optimization is whenever we find a half-way commit
    (that is, a commit that can reach exactly half of the commits),
    we stop giving counts to remaining commits, as we will not find
    any better commit than we just found.

In this second 'git bisect start' command we happen to find a commit
exactly at the halfway point and can return early, but in the first
case there is no such commit, so we can't return early and end up
counting the number of reachable commits from all commits in the
good-bad range.

However, when we have thousands of commits it's not all that important
to find the _exact_ halfway point, a few commits more or less doesn't
make any real difference for the bisection.

So let's loosen the check in the halfway() helper to consider commits
within about 0.1% of the exact halfway point as halfway as well, and
rename the function to approx_halfway() accordingly.  This will allow
us to return early on a bigger good-bad range, even when there is no
commit exactly at the halfway point, thereby reducing the runtime of
the first command above considerably, from ~15s to 4.901s.
Furthermore, even if there is a commit exactly at the halfway point,
we might still stumble upon a commit within that 0.1% range before
finding the exact halfway point, allowing us to return a bit earlier,
slightly reducing the runtime of the second command from 5.848s to
5.058s.  Note that this change doesn't affect good-bad ranges
containing ~2000 commits or less, because that 0.1% tolerance becomes
zero due to integer arithmetic; however, if the range is that small
then counting the reachable commits for all commits is already fast
enough anyway.

Naturally, this will likely change which commits get picked at each
bisection step, and, in turn, might change how many bisection steps
are necessary to find the first bad commit.  If the number of
necessary bisection steps were to increase often, then this change
could backfire, because building and testing at each step might take
much longer than the time spared.  OTOH, if the number of steps were
to decrease, then it would be a double win.

So I ran some tests to see how often that happens: picked random good
and bad starting revisions at least 50k commits apart and a random
first bad commit in between in git.git, and used 'git bisect run git
merge-base --is-ancestor HEAD $first_bad_commit' to check the number
of necessary bisection steps.  After repeating all this 1000 times
both with and without this patch I found that:

  - 146 cases needed one more bisection step than before, 149 cases
    needed one less step, while in the remaining 705 cases the number
    of steps didn't change.  So the number of bisection steps does
    indeed change in a non-negligible number of cases, but it seems
    that the average number of steps doesn't change in the long run.

  - The first 'git bisect start' command got over 3x faster in 456
    cases, so this "no commit at the exact halfway point" case seems
    to be common enough to care about.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-12 09:36:48 -08:00
René Scharfe
0795df4b9b bisect: clear flags in passed repository
69d2cfe6e8 (bisect.c: remove the_repository reference, 2018-11-10) kept
the implicit the_repository reference in clear_commit_marks_all, which
was made explicit by the previous commit (and which also renamed it to
repo_clear_commit_marks).  Replace it as well.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-31 10:46:34 -07:00
René Scharfe
cd8888452c object: allow clear_commit_marks_all to handle any repo
Allow callers to specify the repository to use.  Rename the function to
repo_clear_commit_marks to document its new scope.  No functional change
intended.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-31 10:46:34 -07:00
Pranit Bauva
517ecb3161 bisect--helper: reimplement bisect_next and bisect_auto_next shell functions in C
Reimplement the `bisect_next()` and the `bisect_auto_next()` shell functions
in C and add the subcommands to `git bisect--helper` to call them from
git-bisect.sh .

bisect_auto_next() function returns an enum bisect_error type as whole
`git bisect` can exit with an error code when bisect_next() does.

Return an error when `bisect_next()` fails, that fix a bug on shell script
version.

Using `--bisect-next` and `--bisect-auto-next` subcommands is a
temporary measure to port shell function to C so as to use the existing
test suite. As more functions are ported, `--bisect-auto-next`
subcommand will be retired and will be called by some other methods.

Mentored-by: Lars Schneider <larsxschneider@gmail.com>
Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
Signed-off-by: Tanushree Tumane <tanushreetumane@gmail.com>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-24 12:06:30 -07:00
Miriam Rubio
c7a7f48f4f bisect: call 'clear_commit_marks_all()' in 'bisect_next_all()'
As there can be other revision walks after bisect_next_all(),
let's add a call to a function to clear all the marks at the
end of bisect_next_all().

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Miriam Rubio <mirucam@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-24 12:06:30 -07:00
Junio C Hamano
47f0f94bc7 Merge branch 'al/bisect-first-parent'
"git bisect" learns the "--first-parent" option to find the first
breakage along the first-parent chain.

* al/bisect-first-parent:
  bisect: combine args passed to find_bisection()
  bisect: introduce first-parent flag
  cmd_bisect__helper: defer parsing no-checkout flag
  rev-list: allow bisect and first-parent flags
  t6030: modernize "git bisect run" tests
2020-08-17 17:02:45 -07:00
Junio C Hamano
7d7f4e3a3e Merge branch 'rs/bisect-oid-to-hex-fix'
Code cleanup.

* rs/bisect-oid-to-hex-fix:
  bisect: use oid_to_hex_r() instead of memcpy()+oid_to_hex()
2020-08-10 10:24:01 -07:00
Aaron Lipman
ad464a4e84 bisect: combine args passed to find_bisection()
Now that find_bisection() accepts multiple boolean arguments, these may
be combined into a single unsigned integer in order to declutter some of
the code in bisect.c

Also, rename the existing "flags" bitfield to "commit_flags", to
explicitly differentiate it from the new "bisect_flags" bitfield.

Based-on-patch-by: Harald Nordgren <haraldnordgren@gmail.com>
Signed-off-by: Aaron Lipman <alipman88@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-07 15:13:03 -07:00
Aaron Lipman
e8861ffc20 bisect: introduce first-parent flag
Upon seeing a merge commit when bisecting, this option may be used to
follow only the first parent.

In detecting regressions introduced through the merging of a branch, the
merge commit will be identified as introduction of the bug and its
ancestors will be ignored.

This option is particularly useful in avoiding false positives when a
merged branch contained broken or non-buildable commits, but the merge
itself was OK.

Signed-off-by: Aaron Lipman <alipman88@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-07 15:13:03 -07:00