Commit Graph

74158 Commits

Author SHA1 Message Date
Junio C Hamano
7c01dcd018 Merge branch 'as/pathspec-h-typofix'
Typofix.

* as/pathspec-h-typofix:
  pathspec: fix typo "glossary-context.txt" -> "glossary-content.txt"
2024-07-12 08:41:57 -07:00
Piotr Szlazak
8d20119551 doc: update http.cookieFile with in-memory cookie processing
Documentation only mentions how to read cookies from the given file
and how to save them to the file using http.saveCookies.

But the underlying libcurl can also keep HTTP cookies only in memory:
when an empty string is given as the filename, cookies from the server
are accepted and sent back in successive requests within the same
connection.  Document this.

Signed-off-by: Piotr Szlazak <piotr.szlazak@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-11 08:50:30 -07:00
Rubén Justo
8c1d6691bc test-lib: GIT_TEST_SANITIZE_LEAK_LOG enabled by default
As we currently describe in t/README, it can happen that:

    Some tests run "git" (or "test-tool" etc.) without properly checking
    the exit code, or git will invoke itself and fail to ferry the
    abort() exit code to the original caller.

Therefore, GIT_TEST_SANITIZE_LEAK_LOG=true needs to be set to
capture all memory leaks triggered by our tests.

It seems unnecessary to force users to remember this option, as
forgetting it could lead to missed memory leaks.

We could solve the problem by making it "true" by default, but that
might suggest we think "false" makes sense, which isn't the case.

Therefore, the best approach is to remove the option entirely while
maintaining the capability to detect memory leaks in blind spots of our
tests.

Signed-off-by: Rubén Justo <rjusto@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-11 08:37:44 -07:00
Jeff King
55fe61559e t/.gitattributes: ignore whitespace in chainlint expect files
The ".expect" files in t/chainlint/ are snippets of expected output from
the chainlint script, and do not necessarily conform to our usual code
style. Especially with the recent change to retain line numbers, blank
lines in the input script end up with trailing whitespace as we print
"3 " for line 3, for example. The point of these files is to match the
output verbatim, so let's not complain about the trailing spaces.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-10 10:15:40 -07:00
Jeff King
f6b75726b2 t: convert some here-doc test bodies
The t1404 script checks a lot of output from Git which contains single
quotes. Because the test snippets are themselves wrapped in the same
single-quotes, we have to resort to using $SQ to match them.  This is
error-prone and makes the tests harder to read.

Instead, let's use the new here-doc feature added in the previous
commit, which lets us write anything in the test body we want (except
the here-doc end marker on a line by itself, of course).

Note that we do use "\" in our marker to avoid interpolation (which is
the whole point). But we don't use "<<-", as we want to preserve
whitespace in the snippet (and running with "-v" before and after shows
that we produce the exact same output, except with the ugly $SQ
references fixed).
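
As a rough illustration of the quoting point above (plain shell, not the test-lib code itself), a here-doc whose end marker is written as "\EOT" passes its body through without interpolation, so single quotes and "$" need no escaping. The temporary file and the sample body are made up for this sketch.

```shell
# Scratch file to hold the here-doc body (illustrative only).
body_file=$(mktemp)

# Quoted marker (\EOT): the body is taken literally, so the single
# quotes and the "$" below reach the file unchanged.
cat >"$body_file" <<\EOT
echo 'it is $HOME'
EOT

cat "$body_file"
```

With an unquoted marker (plain "EOT") the shell would expand $HOME before the body is written, which is what the "\" in the marker avoids.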

I just converted every test here, even though only some of them use
$SQ. But it would be equally correct to mix-and-match styles if we don't
mind the inconsistency.

I've also converted a few tests in t0600 which were moved from t1404 (I
had written this patch before they were moved, but it seemed worth
porting over the changes rather than losing them).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-10 10:14:23 -07:00
Jeff King
1d133ae91f test-lib: allow test snippets as here-docs
Most test snippets are wrapped in single quotes, like:

  test_expect_success 'some description' '
          do_something
  '

This sometimes makes the snippets awkward to write, because you can't
easily use single quotes within them. We sometimes work around this with
$SQ, or by loosening regexes to use "." instead of a literal quote, or
by using double quotes when we'd prefer to use single-quotes (and just
adding extra backslash-escapes to avoid interpolation).

This commit adds another option: feeding the snippet via the function's
stdin. This doesn't conflict with anything the snippet would want to do,
because we always redirect its stdin from /dev/null anyway (which we'll
continue to do).

A few notes on the implementation:

  - it would be nice to push this down into test_run_, but we can't, as
    test_expect_success and test_expect_failure want to see the actual
    script content to report it for verbose-mode. A helper function
    limits the amount of duplication in those callers here.

  - The helper function is a little awkward to call, as you feed it the
    name of the variable you want to set. The more natural thing in
    shell would be command substitution like:

      body=$(body_or_stdin "$2")

    but that loses trailing whitespace. There are tricks around this,
    like:

      body=$(body_or_stdin "$2"; printf .)
      body=${body%.}

    but we'd prefer to keep such tricks in the helper, not in each
    caller.

  - I implemented the helper using a sequence of "read" calls. Together
    with "-r" and unsetting the IFS, this preserves incoming whitespace.
    An alternative is to use "cat" (which then requires the gross "."
    trick above). But this saves us a process, which is probably a good
    thing. The "read" builtin does use more read() syscalls than
    necessary (one per byte), but that is almost certainly a win over a
    separate process.

    Both are probably slower than passing a single-quoted string, but
    the difference is lost in the noise for a script that I converted as
    an experiment.

  - I handle test_expect_success and test_expect_failure here. If we
    like this style, we could easily extend it to other spots (e.g.,
    lazy_prereq bodies) on top of this patch.

  - even though we are using "local", we have to be careful about our
    variable names. Within test_expect_success, any variable we declare
    with local will be seen as local by the test snippets themselves (so
    it wouldn't persist between tests like normal variables would).
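
The notes above on command substitution and on "read" can be sketched roughly as follows. This is not the helper from the patch: the name read_stdin_into and its details are hypothetical. Strictly speaking, what command substitution strips is trailing newlines.

```shell
# Hypothetical helper: read all of stdin into the variable named by $1,
# keeping leading blanks and trailing newlines intact.
read_stdin_into () {
	_acc=
	# With an empty IFS and -r, "read" does not trim blanks or
	# interpret backslashes.
	while IFS= read -r _line
	do
		_acc="$_acc$_line
"
	done
	# A final line without a newline is still left in $_line.
	_acc="$_acc$_line"
	eval "$1=\$_acc"
}

# Command substitution drops the trailing newlines...
plain=$(printf 'a\n\n')
# ...unless a sentinel character protects them and is removed afterwards.
kept=$(printf 'a\n\n'; printf .)
kept=${kept%.}

read_stdin_into body <<\EOT
line one
  indented line
EOT

echo "plain=${#plain} chars, kept=${#kept} chars"
```

Passing the variable name lets the caller avoid the sentinel trick entirely, at the cost of an "eval" inside the helper.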

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-10 10:14:23 -07:00
Jeff King
0c7d630220 chainlint.pl: add tests for test body in heredoc
The chainlint.pl script recently learned about the upcoming:

  test_expect_success 'some test' - <<\EOT
	TEST_BODY
  EOT

syntax, where TEST_BODY should be checked in the usual way. Let's make
sure this works by adding a few tests. The "here-doc-body" file tests
the basic syntax, including an embedded here-doc which we should still
be able to recognize.

Likewise the "here-doc-body-indent" checks the same thing, but using the
"<<-" operator. We wouldn't expect this to be used normally, but we
would not want to accidentally miss a body that uses it. The
"pathological" variant checks the opposite: we don't get confused by an
indented tag within the here-doc body.

The "here-doc-double" tests the handling of two here-doc tags on the
same line. This is not something we'd expect anybody to do in practice,
but the code was written defensively to handle this, so let's make sure
it works.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-10 10:14:22 -07:00
Eric Sunshine
a4a5f282f5 chainlint.pl: recognize test bodies defined via heredoc
In order to check tests for semantic problems, chainlint.pl scans test
scripts, looking for tests defined as:

    test_expect_success [prereq] title '
        body
    '

where `body` is a single string which is then treated as a standalone
chunk of code and "linted" to detect semantic issues. (The same happens
for `test_expect_failure` definitions.)

The introduction of test definitions in which the test body is instead
presented via a heredoc rather than as a single string creates a blind
spot in the linting process since such invocations are not recognized by
chainlint.pl.

Prepare for this new style by also recognizing tests defined as:

    test_expect_success [prereq] title - <<\EOT
        body
    EOT

A minor complication is that chainlint.pl has never considered heredoc
bodies significant since it doesn't scan them for semantic problems,
thus it has always simply thrown them away. However, with the new
`test_expect_success` calling sequence, heredoc bodies become
meaningful, thus need to be captured.

Signed-off-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-10 10:14:22 -07:00
Jeff King
03763e68fb chainlint.pl: check line numbers in expected output
While working on chainlint.pl recently, we introduced some bugs that
showed incorrect line numbers in the output. But it was hard to notice,
since we sanitize the output by removing all of the line numbers! It
would be nice to retain these so we can catch any regressions.

The main reason we sanitize is for maintainability: we concatenate all
of the test snippets into a single file, so it's hard for each ".expect"
file to know at which offset its test input will be found. We can handle
that by storing the per-test line numbers in the ".expect" files, and
then dynamically offsetting them as we build the concatenated test and
expect files together.
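
The offsetting idea can be sketched in a few lines of shell and awk. The patch itself uses a perl script; the helper name below is hypothetical, and the sketch assumes each numbered line of an ".expect" file starts with the line number followed by a space.

```shell
# Hypothetical helper: add an offset ($1) to the line number that starts
# each input line; lines without a leading number pass through as-is.
offset_line_numbers () {
	awk -v off="$1" '
		/^[0-9]+ / {
			n = $1 + off
			sub(/^[0-9]+/, n)
		}
		{ print }
	'
}

# A snippet numbered from its own line 2, placed 10 lines into the
# concatenated file.
printf '2 foo &&\n3 bar\n' | offset_line_numbers 10
```

Each ".expect" file can then keep numbers relative to its own test file, and only the concatenation step needs to know the running offset.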

The changes to the ".expect" files look like tedious boilerplate, but it
actually makes adding new tests easier. You can now just run:

  perl chainlint.pl chainlint/foo.test |
  tail -n +2 >chainlint/foo.expect

to save the output of the script minus the comment headers (after
checking that it is correct, of course). Whereas before you had to strip
the line numbers. The conversions here were done mechanically using
something like the script above, and then spot-checked manually.

It would be possible to do all of this in shell via the Makefile, but it
gets a bit complicated (and requires a lot of extra processes). Instead,
I've written a short perl script that generates the concatenated files
(we already depend on perl, since chainlint.pl uses it). Incidentally,
this improves a few other things:

  - we incorrectly used $(CHAINLINTTMP_SQ) inside a double-quoted
    string. So if your test directory required quoting, like:

       make "TEST_OUTPUT_DIRECTORY=/tmp/h'orrible"

    we'd fail the chainlint tests.

  - the shell in the Makefile didn't handle &&-chaining correctly in its
    loops (though in practice the "sed" and "cat" invocations are not
    likely to fail).

  - likewise, the sed invocation to strip numbers was hiding the exit
    code of chainlint.pl itself. In practice this isn't a big deal;
    since there are linter violations in the test files, we expect it to
    exit non-zero. But we could later use exit codes to distinguish
    serious errors from expected ones.

  - we now use a constant number of processes, instead of scaling with
    the number of test scripts. So it should be a little faster (on my
    machine, "make check-chainlint" goes from 133ms to 73ms).

There are some alternatives to this approach, but I think this is still
a good intermediate step:

  1. We could invoke chainlint.pl individually on each test file, and
     compare it to the expected output (and possibly using "make" to
     avoid repeating already-done checks). This is a much bigger change
     (and we'd have to figure out what to do with the "# LINT" lines in
     the inputs). But in this case we'd still want the "expect" files to
     be annotated with line numbers. So most of what's in this patch
     would be needed anyway.

  2. Likewise, we could run a single chainlint.pl and feed it all of the
     scripts (with "--jobs=1" to get deterministic output). But we'd
     still need to annotate the scripts as we did here, and we'd still
     need to either assemble the "expect" file, or break apart the
     script output to compare to each individual ".expect" file.

So we may pursue those in the long run, but this patch gives us more
robust tests without too much extra work or moving in a useless
direction.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-10 10:14:22 -07:00
Jeff King
382f6edaee chainlint.pl: force CRLF conversion when opening input files
The lexer in chainlint.pl can't handle CRLF line endings; it complains
about an internal error in scan_token() if we see one. For example, in
our Windows CI environment:

  $ perl chainlint.pl chainlint/for-loop.test | cat -v
  Thread 2 terminated abnormally: internal error scanning character '^M'

This doesn't break "make check-chainlint" (yet), because we assemble a
concatenated input by passing the contents of each file through "sed".
And the "sed" we use will strip out the CRLFs. But the next patch is
going to rework this a bit, which does break check-chainlint on Windows.
Plus it's probably nicer to folks on Windows who might work on chainlint
itself and write new tests.

In theory we could fix the parser to handle this, but it's not really
worth the trouble. We should be able to ask the input layer to translate
the line endings for us. In fact, I'd expect this to happen by default,
as perl's documentation claims Win32 uses the ":unix:crlf" PERLIO layer
by default ("unix" here just refers to using read/write syscalls, and
then "crlf" layers the translation on top). However, this doesn't seem
to be the case in our Windows CI environment. I didn't dig into the
exact reason, but it is perhaps because we are using an msys build of
perl rather than a "true" Win32 build.

At any rate, it is easy-ish to just ask explicitly for the conversion.
In the above example, setting PERLIO=crlf in the environment is enough
to make it work. Curiously, though, this doesn't work when invoking
chainlint via "make". Again, I didn't dig into it, but it may have to do
with msys programs calling Windows programs or vice versa.

We can make it work consistently by just explicitly asking for CRLF
translation when we open the files. This will even work on non-Windows
platforms, though we wouldn't really expect to find CRLF files there.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-10 10:14:22 -07:00
Jeff King
d558509e25 chainlint.pl: do not spawn more threads than we have scripts
The chainlint.pl script spawns worker threads to check many scripts in
parallel. This is good if you feed it a lot of scripts. But if you give
it few (or one), then the overhead of spawning the threads dominates. We
can easily notice that we have fewer scripts than threads and scale back
as appropriate.
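
The scaling-back amounts to clamping the worker count to the number of scripts. A minimal sketch in shell (the real logic lives in the perl script; the function and variable names here are made up):

```shell
# Hypothetical helper: $1 is the requested number of jobs, $2 the number
# of scripts; print how many workers are worth starting.
worker_count () {
	want=$1
	nscripts=$2
	if [ "$nscripts" -lt "$want" ]
	then
		echo "$nscripts"
	else
		echo "$want"
	fi
}

worker_count 16 1
worker_count 4 100
```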

This patch reduces the time to run:

  time for i in chainlint/*.test; do
	perl chainlint.pl $i
  done >/dev/null

on my system from ~4.1s to ~1.1s, where I have 8+8 cores.

As with the previous patch, this isn't the usual way we run chainlint
(we feed many scripts at once, which is why it supports threading in the
first place). So this won't make a big difference in the real world, but
it may help us out in the future, and it makes experimenting with and
debugging the chainlint tests a bit more pleasant.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-10 10:14:21 -07:00
Jeff King
a7c1c10256 chainlint.pl: only start threads if jobs > 1
If the system supports threads, chainlint.pl will always spawn worker
threads to do the real work. But when --jobs=1, this is pointless, since
we could just do the work in the main thread. And spawning even a single
thread has a high overhead. For example, on my Linux system, running:

  for i in chainlint/*.test; do
	perl chainlint.pl --jobs=1 $i
  done >/dev/null

takes ~1.7s without this patch, and ~1.1s after. We don't usually spawn
a bunch of individual chainlint.pl processes (instead we feed several
scripts at once, and the parallelism outweighs the setup cost). But it's
something we've considered doing, and since we already have fallback
code for systems without thread support, it's pretty easy to make this
work.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-10 10:14:21 -07:00
Jeff King
a5e450144d chainlint.pl: add test_expect_success call to test snippets
The chainlint tests are a series of individual files, each holding a
test body. The "make check-chainlint" target assembles them into a
single file, adding a "test_expect_success" function call around each.
Let's instead include that function call in the files themselves. This
is a little more boilerplate, but has several advantages:

  1. You can now run chainlint manually on snippets with just "perl
     chainlint.pl chainlint/foo.test". This can make developing and
     debugging a little easier.

  2. Many of the tests implicitly relied on the syntax of the lines
     added by the Makefile (in particular the use of single-quotes).
     This assumption is much easier to see when the single-quotes are
     alongside the test body.

  3. We had no way to test how the chainlint program handled
     various test_expect_success lines themselves. Now we'll be able to
     check variations.

The change to the .test files was done mechanically, using the same
test names they would have been assigned by the Makefile (this is
important to match the expected output). The Makefile has the minimal
change to drop the extra lines; there are more cleanups possible but a
future patch in this series will rewrite this substantially anyway.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-10 10:14:21 -07:00
Junio C Hamano
4f5822076f http.c: cookie file tightening
The http.cookiefile configuration variable is used to call
curl_easy_setopt() to set CURLOPT_COOKIEFILE and if http.savecookies
is set, the same value is used for CURLOPT_COOKIEJAR.  The former is
used only to read cookies at startup, the latter is used to write
cookies at the end.

The manual pages https://curl.se/libcurl/c/CURLOPT_COOKIEFILE.html
and https://curl.se/libcurl/c/CURLOPT_COOKIEJAR.html talk about two
interesting special values.

 * "" (an empty string) given to CURLOPT_COOKIEFILE means not to
   read cookies from any file upon startup.

 * It is not specified what "" (an empty string) given to
   CURLOPT_COOKIEJAR does; presumably open a file whose name is an
   empty string and write cookies to it?  In any case, that is not
   what we want to see happen, ever.

 * "-" (a dash) given to CURLOPT_COOKIEFILE makes cURL read cookies
   from the standard input, and given to CURLOPT_COOKIEJAR makes
   cURL write cookies to the standard output.  Neither of which we
   want ever to happen.

So, let's make sure we avoid these nonsense cases.  Specifically,
when http.cookiefile is set to "-", ignore it with a warning, and when
it is set to "" and http.savecookies is set, ignore http.savecookies
with a warning.
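
The two rules above can be restated as a small decision function. This is an illustration in shell only: the actual change is in http.c, and the function name, output format and warning texts below are invented for the sketch.

```shell
# Illustrative only. $1 is the http.cookiefile value, $2 is
# http.savecookies ("true" or "false"). Prints what would be used.
cookie_plan () {
	if [ "$1" = "-" ]
	then
		# "-" would mean stdin/stdout to cURL; ignore the setting.
		echo "warning: ignoring http.cookiefile '-'" >&2
		echo "cookiefile=ignored"
	elif [ -z "$1" ] && [ "$2" = "true" ]
	then
		# An empty name cannot serve as a cookie jar to write to.
		echo "warning: ignoring http.savecookies for empty http.cookiefile" >&2
		echo "cookiefile=empty savecookies=ignored"
	else
		echo "cookiefile=$1 savecookies=$2"
	fi
}

cookie_plan "-" true
cookie_plan "" true
cookie_plan "$HOME/.gitcookies" true
```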

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-09 21:28:38 -07:00
brian m. carlson
610cbc1dfb http: allow authenticating proactively
When making a request over HTTP(S), Git only sends authentication if it
receives a 401 response.  Thus, if a repository is open to the public
for reading, Git will typically never ask for authentication for fetches
and clones.

However, there may be times when a user would like to authenticate
nevertheless.  For example, a forge may give higher rate limits to users
who authenticate because they are easier to contact in case of excessive
use.  Or it may be useful for a known heavy user, such as an internal
service, to proactively authenticate so its use can be monitored and, if
necessary, throttled.

Let's make this possible with a new option, "http.proactiveAuth".  This
option specifies a type of authentication which can be used to
authenticate against the host in question.  This is necessary because we
lack the WWW-Authenticate header to provide us details; similarly, we
cannot accept certain types of authentication because we require
information from the server, such as a nonce or challenge, to
successfully authenticate.

If we're in auto mode and we got a username and password, set the
authentication scheme to Basic.  libcurl will not send authentication
proactively unless there's a single choice of allowed authentication,
and we know in this case we didn't get an authtype entry telling us what
scheme to use, or we would have taken a different codepath and written
the header ourselves.  In any event, of the other schemes that libcurl
supports, Digest and NTLM require a nonce or challenge, which means that
they cannot work with proactive auth, and GSSAPI does not use a username
and password at all, so Basic is the only logical choice among the
built-in options.
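
Why Basic can be sent without a prior challenge is visible from how its header is formed: it depends only on the username and password. A minimal sketch, assuming a "base64" utility is available; the helper name and the credentials are made up, and this is not how Git or libcurl builds the header internally.

```shell
# Hypothetical helper: print the HTTP Basic authentication header for a
# username ($1) and password ($2). Nothing from the server is needed.
basic_auth_header () {
	printf 'Authorization: Basic %s\n' \
		"$(printf '%s:%s' "$1" "$2" | base64)"
}

basic_auth_header alice secret
```

Digest and NTLM, by contrast, need a server-supplied nonce or challenge before any comparable header can be computed.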

Note that the existing http_proactive_auth variable signifies proactive
auth if there are already credentials, which is different from the
functionality we're adding, which always seeks credentials even if none
are provided.  Nonetheless, t5540 tests the existing behavior for
WebDAV-based pushes to an open repository without credentials, so we
preserve it.  While at first this may seem an insecure and bizarre
decision, it may be that authentication is done with TLS certificates,
in which case it might actually provide a quite high level of security.
Expand the variable to use an enum to handle the additional cases and a
helper function to distinguish our new cases from the old ones.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-09 21:27:51 -07:00
brian m. carlson
70405acf60 doc: mention that proxies must be completely transparent
We already document in the FAQ that proxies must be completely
transparent and not modify the request or response in any way, but add
similar documentation to the http.proxy entry.  We know that while the
FAQ is very useful, users sometimes skip it in favor of
the documentation specific to an option or command, so adding it in both
places will help users be adequately informed.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-09 21:24:42 -07:00
brian m. carlson
804ecbcfd1 gitfaq: add entry about syncing working trees
Users very commonly want to sync their working tree with uncommitted
changes across machines, often to carry across in-progress work or
stashes.  Despite this not being a recommended approach, users want to
do it and are not dissuaded by suggestions not to, so let's recommend a
sensible technique.

The technique that many users are using is their preferred cloud syncing
service, which is a bad idea.  Users have reported problems where they
end up with duplicate files that won't go away (with names like "file.c
2"), broken references, oddly named references that have date stamps
appended to them, missing objects, and general corruption and data loss.
That's because almost all of these tools sync file by file, which is a
great technique if your project is a single word processing document or
spreadsheet, but is utterly abysmal for Git repositories because they
don't necessarily snapshot the entire repository correctly.  They also
tend to sync the files immediately instead of when the repository is
quiescent, so writing multiple files, as occurs during a commit or a gc,
can confuse the tools and lead to corruption.

We know that the old standby, rsync, is up to the task, provided that
the repository is quiescent, so let's suggest that and dissuade people
from using cloud syncing tools.  Let's tell people about common things
they should be aware of before doing this and that this is still
potentially risky.  Additionally, let's tell people that Git's security
model does not permit sharing working trees across users in case they
planned to do that.  While we'd still prefer users didn't try to do
this, hopefully this will lead them in a safer direction.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-09 21:24:42 -07:00
brian m. carlson
c98f78b806 gitfaq: give advice on using eol attribute in gitattributes
In the FAQ, we tell people how to use the text attribute, but we fail to
explain what to do with the eol attribute.  As we ourselves have
noticed, most shell implementations do not care for carriage returns,
and as such, people will practically always want them to use LF endings.
Similar things can be said for batch files on Windows, except with CRLF
endings.

Since these are common things to have in a repository, let's help users
make a good decision by recommending that they use the gitattributes
file to correctly check out the endings.
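
A gitattributes file along these lines might look like the sketch below. The "*.sh" and "*.bat" patterns are assumptions chosen for illustration, and the file is written to a scratch directory rather than a real repository.

```shell
# Scratch directory standing in for a repository's top level.
dir=$(mktemp -d)

cat >"$dir/.gitattributes" <<\EOF
# Shell scripts: always LF in the working tree.
*.sh text eol=lf
# Windows batch files: always CRLF in the working tree.
*.bat text eol=crlf
EOF

cat "$dir/.gitattributes"
```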

In addition, let's correct the cross-reference to this question, which
originally referred to "the following entry", even though a new entry
has been inserted in between.  The cross-reference notation should
prevent this from occurring and provide a link in formats, such as HTML,
which support that.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-09 21:24:42 -07:00
brian m. carlson
2101341484 gitfaq: add documentation on proxies
Many corporate environments and local systems have proxies in use.  Note
the situations in which proxies can be used and how to configure them.
At the same time, note what standards a proxy must follow to work with
Git.  Explicitly call out certain classes that are known to routinely
have problems reported in various places online, including in the Git for
Windows issue tracker and on Stack Overflow, and recommend against the
use of such software, noting that they are associated with myriad
security problems (including, for example, breaking sandboxing and image
integrity[0], and, for TLS middleboxes, the use of insecure protocols
and ciphers and lack of certificate verification[1]). Don't mention the
specific nature of these security problems in the FAQ entry because they
are extremely numerous and varied and we wish to keep the FAQ entry
relatively brief.

[0] https://issues.chromium.org/issues/40285192
[1] https://faculty.cc.gatech.edu/~mbailey/publications/ndss17_interception.pdf

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-09 21:24:42 -07:00
Junio C Hamano
58696bfcaa ci: unify bash calling convention
Under ci/ hierarchy, we run scripts under either "sh" (any Bourne
compatible POSIX shell would work) or specifically "bash" (as they
require features from bash, e.g., ${parameter/pattern/string}
expansion).  As we have the CI environment under our control, we can
expect that /bin/sh will always be fine to run the scripts that only
require a Bourne shell, but we may not know where "bash" is
installed depending on the distro used.

So let's make sure we start these scripts with either one of these:

	#!/bin/sh
	#!/usr/bin/env bash

Yes, the latter has to assume that everybody installs "env" at that
path and not as /bin/env or /usr/local/bin/env, but this currently
is the best we could do.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-08 16:23:05 -07:00
Junio C Hamano
557ae147e6 The nineteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-08 14:53:11 -07:00
Junio C Hamano
a43b001cce Merge branch 'ds/sparse-lstat-caching'
The code to deal with modified paths that are out-of-cone in a
sparsely checked out working tree has been optimized.

* ds/sparse-lstat-caching:
  sparse-index: improve lstat caching of sparse paths
  sparse-index: count lstat() calls
  sparse-index: use strbuf in path_found()
  sparse-index: refactor path_found()
  sparse-checkout: refactor skip worktree retry logic
2024-07-08 14:53:11 -07:00
Junio C Hamano
125e389470 Merge branch 'xx/bundie-uri-fixes'
When the bundleURI interface fetched multiple bundles, Git failed to
take full advantage of all bundles and ended up slurping duplicated
objects.

* xx/bundie-uri-fixes:
  unbundle: extend object verification for fetches
  fetch-pack: expose fsckObjects configuration logic
  bundle-uri: verify oid before writing refs
2024-07-08 14:53:11 -07:00
Junio C Hamano
3997614c24 Merge branch 'ps/leakfixes-more'
More memory leaks have been plugged.

* ps/leakfixes-more: (29 commits)
  builtin/blame: fix leaking ignore revs files
  builtin/blame: fix leaking prefixed paths
  blame: fix leaking data for blame scoreboards
  line-range: plug leaking find functions
  merge: fix leaking merge bases
  builtin/merge: fix leaking `struct cmdnames` in `get_strategy()`
  sequencer: fix memory leaks in `make_script_with_merges()`
  builtin/clone: plug leaking HEAD ref in `wanted_peer_refs()`
  apply: fix leaking string in `match_fragment()`
  sequencer: fix leaking string buffer in `commit_staged_changes()`
  commit: fix leaking parents when calling `commit_tree_extended()`
  config: fix leaking "core.notesref" variable
  rerere: fix various trivial leaks
  builtin/stash: fix leak in `show_stash()`
  revision: free diff options
  builtin/log: fix leaking commit list in git-cherry(1)
  merge-recursive: fix memory leak when finalizing merge
  builtin/merge-recursive: fix leaking object ID bases
  builtin/difftool: plug memory leaks in `run_dir_diff()`
  object-name: free leaking object contexts
  ...
2024-07-08 14:53:10 -07:00
Junio C Hamano
ecf7fc600a Merge branch 'tb/path-filter-fix'
The Bloom filter used for path limited history traversal was broken
on systems whose "char" is unsigned; update the implementation and
bump the format version to 2.

* tb/path-filter-fix:
  bloom: introduce `deinit_bloom_filters()`
  commit-graph: reuse existing Bloom filters where possible
  object.h: fix mis-aligned flag bits table
  commit-graph: new Bloom filter version that fixes murmur3
  commit-graph: unconditionally load Bloom filters
  bloom: prepare to discard incompatible Bloom filters
  bloom: annotate filters with hash version
  repo-settings: introduce commitgraph.changedPathsVersion
  t4216: test changed path filters with high bit paths
  t/helper/test-read-graph: implement `bloom-filters` mode
  bloom.h: make `load_bloom_filter_from_graph()` public
  t/helper/test-read-graph.c: extract `dump_graph_info()`
  gitformat-commit-graph: describe version 2 of BDAT
  commit-graph: ensure Bloom filters are read with consistent settings
  revision.c: consult Bloom filters for root commits
  t/t4216-log-bloom.sh: harden `test_bloom_filters_not_used()`
2024-07-08 14:53:10 -07:00
Junio C Hamano
6f75d230a1 Merge branch 'db/date-underflow-fix'
The date parser has been updated to be more careful about underflowing
epoch-based timestamps.

* db/date-underflow-fix:
  date: detect underflow/overflow when parsing dates with timezone offset
  t0006: simplify prerequisites
2024-07-08 14:53:09 -07:00
Junio C Hamano
4e18cd5ef7 Merge branch 'rj/pager-die-upon-exec-failure'
When GIT_PAGER failed to spawn, depending on the code path taken,
we either failed immediately (correct) or just spewed the payload to
the standard output (incorrect).  The code now always fails immediately
when GIT_PAGER fails.

* rj/pager-die-upon-exec-failure:
  pager: die when paging to non-existing command
2024-07-08 14:53:08 -07:00
Junio C Hamano
2fa5ae30da Merge branch 'ss/doc-eol-attr-fix'
Doc update.

* ss/doc-eol-attr-fix:
  doc: fix case error of eol attribute in example
2024-07-08 14:53:08 -07:00
Junio C Hamano
87f4164124 Merge branch 'jc/archive-prefix-with-add-virtual-file'
"git archive --add-virtual-file=<path>:<contents>" never paid
attention to the --prefix=<prefix> option but the documentation
said it would. The documentation has been corrected.

* jc/archive-prefix-with-add-virtual-file:
  archive: document that --add-virtual-file takes full path
2024-07-08 14:53:07 -07:00
Derrick Stolee
9479a31d60 advice: warn when sparse index expands
Typically, forcing a sparse index to expand to a full index means that
Git could not determine the status of a file outside of the
sparse-checkout and needed to expand sparse trees into the full list of
sparse blobs. This operation can be very slow when the sparse-checkout
is much smaller than the full tree at HEAD.

When users are in this state, there is usually a modified or untracked
file outside of the sparse-checkout mentioned by the output of 'git
status'. There are a number of reasons why this is insufficient:

 1. Users may not have a full understanding of which files are inside or
    outside of their sparse-checkout. This is more common in monorepos
    that manage the sparse-checkout using custom tools that map build
    dependencies into sparse-checkout definitions.

 2. In some cases, an empty directory could exist outside the
    sparse-checkout and these empty directories are not reported by 'git
    status' and friends.

 3. If the user has '.gitignore' or 'exclude' files, then 'git status'
    will squelch the warnings and not demonstrate any problems.

In order to help users who are in this state, add a new advice message
to indicate that a sparse index is expanded to a full index. This
message should be written at most once per process, so add a static
global 'give_advice_on_expansion' to sparse-index.c. Further, there is a
case in 'git sparse-checkout set' that uses the sparse index as an
in-memory data structure (even when writing a full index) so we need to
disable the message in that kind of case.

The t1092-sparse-checkout-compatibility.sh test script compares the
behavior of several Git commands across full and sparse repositories,
including sparse repositories with and without a sparse index. We need
to disable the advice in the sparse-index repo to avoid differences in
stderr. By leaving the advice on in the sparse-checkout repo (without
the sparse index), we can test the behavior of disabling the advice in
convert_to_sparse(). (Indeed, these tests are how that necessity was
discovered.) Add a test that reenables the advice and demonstrates that
the message is output.

The advice message is defined outside of expand_index() to avoid super-
wide lines. It is also defined as a macro to avoid compile issues with
-Werror=format-security.

Signed-off-by: Derrick Stolee <stolee@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-08 12:23:59 -07:00
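
The "at most once per process" behavior described in the commit above can be sketched roughly like this. It is a simplified illustration, not the code in sparse-index.c: the message text, the `ADVICE_MSG` macro name, and the `maybe_advise_on_expansion()` function are assumptions, and the real code goes through Git's advice machinery rather than `fprintf()`.

```c
#include <stdio.h>

/*
 * Hypothetical message text. Defining it as a macro keeps the call
 * site short and means the format argument is a string literal, which
 * is what -Werror=format-security wants to see.
 */
#define ADVICE_MSG "hint: the sparse index is expanding to a full index\n"

/* Cleared after the first message, or by callers that want silence. */
static int give_advice_on_expansion = 1;

/* Returns 1 if the advice was printed by this call, 0 otherwise. */
int maybe_advise_on_expansion(FILE *out)
{
	if (!give_advice_on_expansion)
		return 0;
	give_advice_on_expansion = 0;
	fprintf(out, ADVICE_MSG);
	return 1;
}
```

A caller that uses the sparse index purely as an in-memory structure would clear the flag before doing its work, so that no message is shown.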
Rikita Ishikawa
428c40da61 doc: fix the max number of branches shown by "show-branch"
The number to be displayed is calculated by the following defined in
object.h:

    #define REV_SHIFT        2
    #define MAX_REVS        (FLAG_BITS - REV_SHIFT)

FLAG_BITS is currently 28, so 26 is the correct number.

Signed-off-by: Rikita Ishikawa <lagrange.resolvent@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-08 08:26:46 -07:00
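
The arithmetic in the commit above can be checked with a few lines of C. The two `#define` lines for REV_SHIFT and MAX_REVS are as quoted in the message, and FLAG_BITS is hard-coded to the value it states; the helper function name is hypothetical and exists only for illustration.

```c
/* Value stated in the commit message above. */
#define FLAG_BITS	28

/* As quoted from object.h in the commit message above. */
#define REV_SHIFT	 2
#define MAX_REVS	(FLAG_BITS - REV_SHIFT)

/* Hypothetical helper: the branch limit that the documentation should state. */
int max_show_branch_revs(void)
{
	return MAX_REVS;
}
```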
Jesús Ariel Cabello Mateos
cf6ead095b gitweb: rss/atom change published/updated date to committer date
The author date is used for published/updated date in the rss/atom
feed stream.  Change it to the committer date that reflects the
"published/updated" definition better and makes rss/atom feeds more
linear.  GitLab/GitHub rss/atom feeds use the committer date.

Additionally, to be consistent, also use the committer date to
determine the date of the last commit to send in the feed
instead of the author date.

Signed-off-by: Jesús Ariel Cabello Mateos <080ariel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-07 23:04:41 -07:00
Junio C Hamano
5c9be4c9d6 Merge https://github.com/j6t/git-gui
* https://github.com/j6t/git-gui:
  git-gui: fix inability to quit after closing another instance
  git-gui: sv.po: Update Swedish translation (576t0f0u)
  git-gui: note the new maintainer
  Makefile(s): do not enforce "all indents must be done with tab"
  Makefile(s): avoid recipe prefix in conditional statements
  doc: switch links to https
  doc: update links to current pages
  git-gui: po: fix typo in French "aperçu"
2024-07-07 22:50:59 -07:00
Johannes Sixt
2864e85593 Merge branch 'os/catch-rename'
The problem can be reproduced on Linux with this sequence:

1. Run git gui from a terminal.
2. Edit the commit message and wait for at least 2 seconds.
3. Terminate the instance from the terminal, for example with Ctrl-C,
   to simulate crash. This leaves the file .git/GITGUI_BCK behind.
4. Start two instances of git gui &.

At this point the first instance can be closed (it renames
.git/GITGUI_BCK to .git/GITGUI_MSG), but the second brings up an error
message about the absent file, cannot be closed thereafter, and must
be killed from the command line.

The renaming that happens by the first instance is the correct action
and need not be repeated by the second instance. It is the correct
action to ignore the failed renaming.

On the other hand, the second instance could just edit the commit
message again, wait 2 seconds to write GITGUI_BCK, and then can be
closed without failing. At this point, since the user has edited the
message, it is again correct to preserve the edited version in
GITGUI_MSG.

* os/catch-rename:
  git-gui: fix inability to quit after closing another instance
2024-07-07 14:14:59 +02:00
René Scharfe
1457dff9be clang-format: include kh_foreach* macros in ForEachMacros
The command for generating the list of ForEachMacros searches for
macros whose name contains the string "for_each".  Include those whose
name contains "foreach" as well.  That brings in kh_foreach and
kh_foreach_value from khash.h.

Regenerating the list also brings in hashmap-based macros added by
87571c3f71 (hashmap: use *_entry APIs for iteration, 2019-10-06),
f0e63c4113 (hashmap: use *_entry APIs to wrap container_of, 2019-10-06),
4fa1d501f7 (strmap: add functions facilitating use as a string->int map,
2020-11-05), b70c82e6ed (strmap: add more utility functions,
2020-11-05), and 1201eb628a (strmap: add a strset sub-type, 2020-11-06).

for_each_abbrev is no longer found because its definition was removed by
d850b7a545 (cocci: apply the "cache.h" part of "the_repository.pending",
2023-03-28).  Note that it had been a false positive, though, as it had
been a function wrapper, not a for-like macro.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-06 15:12:36 -07:00
Taylor Blau
df32729866 config.mak.dev: fix typo when enabling -Wpedantic
In ebd2e4a13a (Makefile: restrict -Wpedantic and -Wno-pedantic-ms-format
better, 2021-09-28), we tightened our Makefile's behavior to only enable
-Wpedantic when compiling with either gcc5/clang4 or greater as older
compiler versions did not have support for -Wpedantic.

Commit ebd2e4a13a was looking for either "gcc5" or "clang4" to appear in
the COMPILER_FEATURES variable, combining the two "$(filter ...)"
searches with an "$(or ...)".

But ebd2e4a13a has a typo where instead of writing:

    ifneq ($(or $(filter ...),$(filter ...)),)

we wrote:

    ifneq (($or $(filter ...),$(filter ...)),)

Causing our Makefile (when invoked with DEVELOPER=1, and a sufficiently
recent compiler version) to barf:

    $ make DEVELOPER=1
    config.mak.dev:13: extraneous text after 'ifneq' directive
    [...]

Correctly combine the results of the two "$(filter ...)" operations by
using "$(or ...)", not "$or".

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-06 15:10:29 -07:00
René Scharfe
c6f35e529e t-strvec: use test_msg()
check_strvec_loc() checks each strvec item by looping through them and
comparing them with expected values.  If a check fails then we'd like
to know which item is affected.  It reports that information by building
a strbuf and delivering its contents using a failing assertion, e.g.
if there are fewer items in the strvec than expected:

   # check "vec->nr > nr" failed at t/unit-tests/t-strvec.c:19
   #    left: 1
   #   right: 1
   # check "strvec index 1" failed at t/unit-tests/t-strvec.c:71

Note that the index variable is "nr" and thus the interesting value is
reported twice in that example (in lines three and four).

Stop printing the index explicitly for checks that already report it.
The message for the same condition as above becomes:

   # check "vec->nr > nr" failed at t/unit-tests/t-strvec.c:19
   #    left: 1
   #   right: 1

For the string comparison, whose error message doesn't include the
index, report it using the simpler and more appropriate test_msg()
instead.  Report the index using its actual variable name and format the
line like the preceding ones.  The message for an unexpected string
value becomes:

   # check "!strcmp(vec->v[nr], str)" failed at t/unit-tests/t-strvec.c:24
   #    left: "foo"
   #   right: "bar"
   #      nr: 0

Reported-by: Phillip Wood <phillip.wood@dunelm.org.uk>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-06 15:01:13 -07:00
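
The shape of the resulting diagnostics for the string comparison can be sketched in plain C. This does not use Git's unit-test framework or test_msg(); `check_strings()` and its parameters are hypothetical stand-ins that only mimic the layout shown in the commit message above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for the per-item string check. Compares the
 * first n items of got against want and writes diagnostics for the
 * first mismatch to out.
 * Returns the index of the first mismatch, or -1 if all items match.
 */
int check_strings(const char **got, const char **want, size_t n, FILE *out)
{
	for (size_t nr = 0; nr < n; nr++) {
		if (!strcmp(got[nr], want[nr]))
			continue;
		fprintf(out, "#    left: \"%s\"\n", got[nr]);
		fprintf(out, "#   right: \"%s\"\n", want[nr]);
		/* Report the index under its variable name, aligned as above. */
		fprintf(out, "#      nr: %zu\n", nr);
		return (int)nr;
	}
	return -1;
}
```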
Elijah Newren
fcf59ac136 merge-ort: fix missing early return
One of the conversions in f19b9165 (merge-ort: convert more error()
cases to path_msg(), 2024-06-19) accidentally lost the early return.

Restore it.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-06 10:47:00 -07:00
Ghanshyam Thakkar
28c1c07700 t: migrate helper/test-oidmap.c to unit-tests/t-oidmap.c
helper/test-oidmap.c along with t0016-oidmap.sh test the oidmap.h
library which is built on top of hashmap.h.

Migrate them to the unit testing framework for better performance,
concise code and better debugging. Along with the migration also plug
memory leaks and make the test logic independent for all the tests.
The migration removes 'put' tests from t0016, because it is used as
setup to all the other tests, so testing it separately does not yield
any benefit.

Mentored-by: Christian Couder <chriscool@tuxfamily.org>
Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com>
Reviewed-by: Josh Steadmon <steadmon@google.com>
Helped-by: Phillip Wood <phillip.wood123@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ghanshyam Thakkar <shyamthakkar001@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-03 09:12:14 -07:00
Junio C Hamano
4d8ee0317f push: avoid showing false negotiation errors
When "git push" is configured to use the push negotiation, a push of
deletion of a branch (without pushing anything else) may end up not
having anything to negotiate for the common ancestor discovery.

In such a case, we end up making an internal invocation of "git
fetch --negotiate-only" without any "--negotiation-tip" parameters.
That stops the negotiate-only fetch from being run, which by itself
is not a bad thing (one fewer round-trip), but the end-user sees a
"fatal: --negotiate-only needs one or more --negotiation-tip=*"
message that the user cannot act upon.

Teach "git push" to notice the situation and omit performing the
negotiate-only fetch to begin with.  One fewer process spawned, one
fewer "alarming" message given to the user.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-02 15:06:13 -07:00
Junio C Hamano
d1e6c61272 checkout: special case error messages during noop switching
"git checkout" run with no branch and no pathspec behaves like
switching the branch to the current branch (in other words, a
no-op, except that it gives a side-effect "here are the modified
paths" report).  But unlike "git checkout HEAD" or "git checkout
main" (when you are on the 'main' branch), the user is much less
conscious that they are "switching" to the current branch.

This twists end-user expectation in a strange way.  There are
options (like "--ours") that make sense only when we are checking
out paths out of either the tree-ish or out of the index.  So the
error message the command below gives

    $ git checkout --ours
    fatal: '--ours/theirs' cannot be used with switching branches

is technically correct, but because the end-user may not even be
aware of the fact that the command they are issuing is about no-op
branch switching [*], they may find the error confusing.

Let's refactor the code to make it easier to special case the "no-op
branch switching" situation, and then customize the exact error
message for "--ours/--theirs".  Since it is more likely that the
end-user forgot to give pathspec that is required by the option,
let's make it say

    $ git checkout --ours
    fatal: '--ours/theirs' needs the paths to check out

instead.

Among the other options that are incompatible with branch switching,
there may be some that benefit by having messages tweaked when a
no-op branch switching is done, but I'll leave them as #leftoverbits
material.

[Footnote]

 * Yes, the end-users are irrational.  When they did not give
   "--ours", they take it for granted that "git checkout" gives a
   short status, e.g.

    $ git checkout
    M	builtin/checkout.c
    M	t/t7201-co.sh

   exactly as a branch switching command.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-02 13:53:56 -07:00
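
The special-casing described in the commit above boils down to choosing a message based on whether the user named a branch. A minimal sketch, assuming a hypothetical helper `ours_theirs_error()` that is consulted only when no pathspec was given (the real logic lives in builtin/checkout.c and is structured differently):

```c
/*
 * Hypothetical helper: picks the error text for "--ours/--theirs"
 * used without a pathspec. branch_given is nonzero when the user
 * named a branch or tree-ish on the command line.
 */
const char *ours_theirs_error(int branch_given)
{
	if (!branch_given)
		/* No-op switching: the user most likely forgot the paths. */
		return "'--ours/theirs' needs the paths to check out";
	return "'--ours/theirs' cannot be used with switching branches";
}
```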
Junio C Hamano
06e570c0df Sync with 'maint'
2024-07-02 10:01:10 -07:00
Junio C Hamano
c2ad9d68d6 The eighteenth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-02 09:59:02 -07:00
Junio C Hamano
2d97b4e235 Merge branch 'rs/diff-color-moved-w-no-ext-diff-fix'
"git diff --no-ext-diff" when diff.external is configured ignored
the "--color-moved" option.

* rs/diff-color-moved-w-no-ext-diff-fix:
  diff: allow --color-moved with --no-ext-diff
2024-07-02 09:59:02 -07:00
Junio C Hamano
ca349c387b Merge branch 'ew/object-convert-leakfix'
Leakfix.

* ew/object-convert-leakfix:
  object-file: fix leak on conversion failure
2024-07-02 09:59:01 -07:00
Junio C Hamano
ca463101c8 Merge branch 'jk/remote-wo-url'
Memory ownership rules for the in-core representation of
remote.*.url configuration values have been straightened out, which
resulted in a few leak fixes and code clarification.

* jk/remote-wo-url:
  remote: drop checks for zero-url case
  remote: always require at least one url in a remote
  t5801: test remote.*.vcs config
  t5801: make remote-testgit GIT_DIR setup more robust
  remote: allow resetting url list
  config: document remote.*.url/pushurl interaction
  remote: simplify url/pushurl selection
  remote: use strvecs to store remote url/pushurl
  remote: transfer ownership of memory in add_url(), etc
  remote: refactor alias_url() memory ownership
  archive: fix check for missing url
2024-07-02 09:59:01 -07:00
Junio C Hamano
24cbd29164 Merge branch 'jc/fuzz-sans-curl'
The CI job to build minimum fuzzers learned to pass NO_CURL=NoThanks
to the build procedure, as its build environment does not offer cURL
and the rest of the build does not need it.

* jc/fuzz-sans-curl:
  fuzz: minimum fuzzers environment lacks libcURL
2024-07-02 09:59:01 -07:00
Junio C Hamano
43fab448cf Merge branch 'rb/build-options-w-lib-versions'
"git version --build-options" reports the version information of
OpenSSL and other libraries (if used) in the build.

* rb/build-options-w-lib-versions:
  version: teach --build-options to reports zlib version information
  version: teach --build-options to reports libcurl version information
  version: --build-options reports OpenSSL version information
2024-07-02 09:59:00 -07:00
Junio C Hamano
7b472da915 Merge branch 'ps/use-the-repository'
A CPP macro USE_THE_REPOSITORY_VARIABLE is introduced to help
transition the codebase to rely less on the availability of the
singleton the_repository instance.

* ps/use-the-repository:
  hex: guard declarations with `USE_THE_REPOSITORY_VARIABLE`
  t/helper: remove dependency on `the_repository` in "proc-receive"
  t/helper: fix segfault in "oid-array" command without repository
  t/helper: use correct object hash in partial-clone helper
  compat/fsmonitor: fix socket path in networked SHA256 repos
  replace-object: use hash algorithm from passed-in repository
  protocol-caps: use hash algorithm from passed-in repository
  oidset: pass hash algorithm when parsing file
  http-fetch: don't crash when parsing packfile without a repo
  hash-ll: merge with "hash.h"
  refs: avoid include cycle with "repository.h"
  global: introduce `USE_THE_REPOSITORY_VARIABLE` macro
  hash: require hash algorithm in `empty_tree_oid_hex()`
  hash: require hash algorithm in `is_empty_{blob,tree}_oid()`
  hash: make `is_null_oid()` independent of `the_repository`
  hash: convert `oidcmp()` and `oideq()` to compare whole hash
  global: ensure that object IDs are always padded
  hash: require hash algorithm in `oidread()` and `oidclr()`
  hash: require hash algorithm in `hasheq()`, `hashcmp()` and `hashclr()`
  hash: drop (mostly) unused `is_empty_{blob,tree}_sha1()` functions
2024-07-02 09:59:00 -07:00
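
The opt-in idea behind USE_THE_REPOSITORY_VARIABLE can be sketched in a single file as follows. Everything except the macro name is an assumption made for illustration: the simplified `struct repository`, the `hash_len` field, and both function names are hypothetical, and in Git the guarded declarations live in headers rather than next to the code that uses them.

```c
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for Git's repository struct. */
struct repository {
	size_t hash_len; /* e.g. 20 for SHA-1, 32 for SHA-256 */
};

/* Always available: callers pass the repository explicitly. */
size_t repo_hash_len(const struct repository *r)
{
	return r->hash_len;
}

/* A file that still relies on the singleton opts in explicitly. */
#define USE_THE_REPOSITORY_VARIABLE

#ifdef USE_THE_REPOSITORY_VARIABLE
/* Visible only to files that defined the macro. */
static struct repository the_repository_storage = { 20 };
struct repository *the_repository = &the_repository_storage;

/* Convenience wrapper that implicitly depends on the singleton. */
size_t hash_len(void)
{
	return repo_hash_len(the_repository);
}
#endif
```

Files that do not define the macro never see the wrapper, so any remaining implicit dependency on the singleton shows up as a compile error rather than a runtime surprise.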
Junio C Hamano
ae447ed130 Merge branch 'ew/cat-file-unbuffered-tests'
The output from "git cat-file --batch-check" and "--batch-command
(info)" should not be unbuffered, for which some tests have been
added.

* ew/cat-file-unbuffered-tests:
  t1006: ensure cat-file info isn't buffered by default
  Git.pm: use array in command_bidi_pipe example
2024-07-02 09:58:59 -07:00