2015-11-09 21:34:01 +08:00
|
|
|
/*
|
|
|
|
* The backend-independent part of the reference module.
|
|
|
|
*/
|
|
|
|
|
2023-04-11 15:41:50 +08:00
|
|
|
#include "git-compat-util.h"
|
2023-04-11 11:00:39 +08:00
|
|
|
#include "advice.h"
|
2017-06-15 02:07:36 +08:00
|
|
|
#include "config.h"
|
2023-03-21 14:26:03 +08:00
|
|
|
#include "environment.h"
|
2017-02-10 19:16:15 +08:00
|
|
|
#include "hashmap.h"
|
2023-03-21 14:25:54 +08:00
|
|
|
#include "gettext.h"
|
2023-02-24 08:09:27 +08:00
|
|
|
#include "hex.h"
|
2014-10-01 18:28:42 +08:00
|
|
|
#include "lockfile.h"
|
2017-04-16 14:41:26 +08:00
|
|
|
#include "iterator.h"
|
2006-12-20 06:34:12 +08:00
|
|
|
#include "refs.h"
|
2015-11-10 19:42:36 +08:00
|
|
|
#include "refs/refs-internal.h"
|
refs: implement reference transaction hook
The low-level reference transactions used to update references are
currently completely opaque to the user. While that is desirable in
most use cases, some callers may want to hook into the transaction to
observe all queued reference updates, as well as the abort or commit
of a prepared transaction.
One such use case is a set of replicas of a given Git repository,
where we perform Git operations on all of the repositories at once and
expect the outcome to be the same in all of them. While hooks already
exist for a certain subset of Git commands that could be used to
implement a voting mechanism for this, many other commands currently
have no such mechanism.
The above scenario is the motivation for the new "reference-transaction"
hook that reaches directly into Git's reference transaction mechanism.
The hook receives as parameter the current state the transaction was
moved to ("prepared", "committed" or "aborted") and gets via its
standard input all queued reference updates. While the exit code gets
ignored in the "committed" and "aborted" states, a non-zero exit code in
the "prepared" state will cause the transaction to be aborted
prematurely.
Given the use case described above, a voting mechanism can now be
implemented via this hook: as soon as it gets called, it will take all
of stdin and use it to cast a vote to a central service. When all
replicas of the repository agree, the hook will exit with zero,
otherwise it will abort the transaction by returning non-zero. The most
important upside is that this will catch _all_ commands writing
references at once, which allows us to implement strong consistency for
reference updates via a single mechanism.
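A minimal sketch of how such a hook could process its input, written in C because a hook can be any executable. It assumes each line on standard input has the form `<old-oid> SP <new-oid> SP <ref-name> LF`, as documented in githooks(5); the commit message above does not spell out that format. The names `queued_update`, `parse_update_line` and `hook_exit_code` are hypothetical, and the vote itself is left to the caller.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one queued update; this is not a Git structure. */
struct queued_update {
	char old_oid[65];  /* room for a SHA-256 hex string plus NUL */
	char new_oid[65];
	char refname[256];
};

/*
 * Parse one line of hook input, assumed to look like
 * "<old-oid> SP <new-oid> SP <ref-name>" with an optional trailing LF.
 * Returns 0 on success and -1 if the line does not have three fields.
 */
static int parse_update_line(const char *line, struct queued_update *out)
{
	if (sscanf(line, "%64s %64s %255s",
		   out->old_oid, out->new_oid, out->refname) != 3)
		return -1;
	return 0;
}

/*
 * Pick the hook's exit code. Only the "prepared" state can veto the
 * transaction; in "committed" and "aborted" the exit code is ignored,
 * so this sketch simply returns 0 there.
 */
static int hook_exit_code(const char *state, int vote_ok)
{
	if (!strcmp(state, "prepared") && !vote_ok)
		return 1;
	return 0;
}
```

A real hook would read standard input line by line, cast its vote with the parsed updates, and return `hook_exit_code(argv[1], vote_ok)` from `main`.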
In order to test the impact on the case where we don't have any
"reference-transaction" hook installed in the repository, this commit
introduces two new performance tests for git-update-ref(1). Run against
an empty repository, they produce the following results:
Test                         origin/master     HEAD
--------------------------------------------------------------------
1400.2: update-ref           2.70(2.10+0.71)   2.71(2.10+0.73) +0.4%
1400.3: update-ref --stdin   0.21(0.09+0.11)   0.21(0.07+0.14) +0.0%
The performance test p1400.2 creates, updates and deletes a branch a
thousand times, thus averaging the runtime of git-update-ref over 3000
invocations. p1400.3 instead calls `git update-ref --stdin` three times
and queues a thousand creations, updates and deletes respectively.
As expected, p1400.3 consistently shows no noticeable impact, as for
each batch of updates there's a single call to access(3P) for the
negative hook lookup. On the other hand, for p1400.2, one can see an
impact caused by this patchset. But doing five runs of the performance
tests where each one was run with GIT_PERF_REPEAT_COUNT=10, the overhead
ranged from -1.5% to +1.1%. These inconsistent performance numbers can
be explained by the overhead of spawning 3000 processes. This shows that
the overhead of assembling the hook path and executing access(3P) once
to check if it's there is mostly outweighed by the operating system's
overhead.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-06-19 14:56:14 +08:00
|
|
|
#include "run-command.h"
|
2021-09-27 03:03:26 +08:00
|
|
|
#include "hook.h"
|
2023-04-11 15:41:49 +08:00
|
|
|
#include "object-name.h"
|
2023-05-16 14:34:06 +08:00
|
|
|
#include "object-store-ll.h"
|
2006-11-20 05:22:44 +08:00
|
|
|
#include "object.h"
|
2023-05-16 14:33:59 +08:00
|
|
|
#include "path.h"
|
2006-11-20 05:22:44 +08:00
|
|
|
#include "tag.h"
|
2017-03-26 10:42:31 +08:00
|
|
|
#include "submodule.h"
|
2017-04-24 18:01:22 +08:00
|
|
|
#include "worktree.h"
|
2020-07-29 04:23:39 +08:00
|
|
|
#include "strvec.h"
|
2018-04-12 08:21:09 +08:00
|
|
|
#include "repository.h"
|
2023-03-21 14:26:05 +08:00
|
|
|
#include "setup.h"
|
2020-06-19 14:56:14 +08:00
|
|
|
#include "sigchain.h"
|
date API: create a date.h, split from cache.h
Move the declaration of the date.c functions from cache.h, and adjust
the relevant users to include the new date.h header.
The show_ident_date() function belonged in pretty.h (it's defined in
pretty.c), its two users outside of pretty.c didn't strictly need to
include pretty.h, as they get it indirectly, but let's add it to them
anyway.
Similarly, the change to "builtin/{fast-import,show-branch,tag}.c"
isn't needed as far as the compiler is concerned, but since they all
use the "DATE_MODE()" macro we now define in date.h, let's have them
include it.
We could simply include this new header in "cache.h", but as this
change shows these functions weren't common enough to warrant
including in it in the first place. By moving them out of cache.h
changes to this API will no longer cause a (mostly) full re-build of
the project when "make" is run.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-16 16:14:02 +08:00
|
|
|
#include "date.h"
|
2022-08-06 01:58:36 +08:00
|
|
|
#include "commit.h"
|
2023-05-16 14:34:03 +08:00
|
|
|
#include "wildmatch.h"
|
2014-12-12 16:57:02 +08:00
|
|
|
|
2016-09-05 00:08:10 +08:00
|
|
|
/*
|
|
|
|
* List of all available backends
|
|
|
|
*/
|
2023-12-29 15:26:34 +08:00
|
|
|
static const struct ref_storage_be *refs_backends[] = {
|
|
|
|
[REF_STORAGE_FORMAT_FILES] = &refs_be_files,
|
refs: introduce reftable backend
Due to scalability issues, Shawn Pearce originally proposed a new
"reftable" format more than six years ago [1]. Initially, this new
format was implemented in JGit with promising results. Around two years
ago, we then added the "reftable" library to the Git codebase via
a4bbd13be3 (Merge branch 'hn/reftable', 2021-12-15). With this we
landed all the low-level code to read and write reftables. Notably
missing though was the integration of this low-level code into the Git
codebase in the form of a new ref backend that ties all of this
together.
This gap is now finally closed by introducing a new "reftable" backend
into the Git codebase. This new backend promises to bring some notable
improvements to Git repositories:
- It becomes possible to do truly atomic writes where either all refs
are committed to disk or none are. This was not possible with the
"files" backend because ref updates were split across multiple loose
files.
- The disk space required to store many refs is reduced, both compared
to loose refs and packed-refs. This is enabled both by the reftable
format being a binary format, which is more compact, and by prefix
compression.
- We can ignore filesystem-specific behaviour as ref names are not
encoded via paths anymore. This means there is no need to handle
case sensitivity on Windows systems or Unicode precomposition on
macOS.
- There is no need to rewrite the complete refdb anymore every time a
ref is being deleted like it was the case for packed-refs. This
means that ref deletions are now constant time instead of scaling
linearly with the number of refs.
- We can ignore file/directory conflicts so that it becomes possible
to store both "refs/heads/foo" and "refs/heads/foo/bar".
- Due to this property we can retain reflogs for deleted refs. We have
previously been deleting reflogs together with their refs to avoid
file/directory conflicts, which is not necessary anymore.
- We can properly enumerate all refs. With the "files" backend it is
not easily possible to distinguish between refs and non-refs because
they may live side by side in the gitdir.
Not all of these improvements are realized with the current "reftable"
backend implementation. At this point, the new backend is supposed to be
a drop-in replacement for the "files" backend that is used by basically
all Git repositories nowadays. It strives for 1:1 compatibility, which
means that, for the most part, a user can expect the same behaviour
regardless of whether they use the "reftable" backend or the "files"
backend.
Most notably, this means we artificially limit the capabilities of the
"reftable" backend to match the limits of the "files" backend. It is not
possible to create refs that would end up with file/directory conflicts,
we do not retain reflogs for deleted refs, and we perform
stricter-than-necessary checks. This is done intentionally for two main
reasons:
- It makes it significantly easier to land the "reftable" backend as
tests behave the same. It would be tough to argue for each and every
single test that doesn't pass with the "reftable" backend.
- It ensures compatibility between repositories that use the "files"
backend and repositories that use the "reftable" backend. Like this,
hosters can migrate their repositories to use the "reftable" backend
without causing issues for clients that use the "files" backend in
their clones.
It is expected that these artificial limitations may eventually go away
in the long term.
Performance-wise things very much depend on the actual workload. The
following benchmarks compare the "files" and "reftable" backends in the
current version:
- Creating N refs in separate transactions shows that the "files"
backend is ~50% faster. This is not surprising given that creating a
ref only requires us to create a single loose ref. The "reftable"
backend will also perform auto compaction on updates. In real-world
workloads we would likely also want to pack loose refs,
which would likely change the picture.
Benchmark 1: update-ref: create refs sequentially (refformat = files, refcount = 1)
Time (mean ± σ): 2.1 ms ± 0.3 ms [User: 0.6 ms, System: 1.7 ms]
Range (min … max): 1.8 ms … 4.3 ms 133 runs
Benchmark 2: update-ref: create refs sequentially (refformat = reftable, refcount = 1)
Time (mean ± σ): 2.7 ms ± 0.1 ms [User: 0.6 ms, System: 2.2 ms]
Range (min … max): 2.4 ms … 2.9 ms 132 runs
Benchmark 3: update-ref: create refs sequentially (refformat = files, refcount = 1000)
Time (mean ± σ): 1.975 s ± 0.006 s [User: 0.437 s, System: 1.535 s]
Range (min … max): 1.969 s … 1.980 s 3 runs
Benchmark 4: update-ref: create refs sequentially (refformat = reftable, refcount = 1000)
Time (mean ± σ): 2.611 s ± 0.013 s [User: 0.782 s, System: 1.825 s]
Range (min … max): 2.597 s … 2.622 s 3 runs
Benchmark 5: update-ref: create refs sequentially (refformat = files, refcount = 100000)
Time (mean ± σ): 198.442 s ± 0.241 s [User: 43.051 s, System: 155.250 s]
Range (min … max): 198.189 s … 198.670 s 3 runs
Benchmark 6: update-ref: create refs sequentially (refformat = reftable, refcount = 100000)
Time (mean ± σ): 294.509 s ± 4.269 s [User: 104.046 s, System: 190.326 s]
Range (min … max): 290.223 s … 298.761 s 3 runs
- Creating N refs in a single transaction shows that the "files"
backend is significantly slower once we start to write many refs.
The "reftable" backend only needs to update two files, whereas the
"files" backend needs to write one file per ref.
Benchmark 1: update-ref: create many refs (refformat = files, refcount = 1)
Time (mean ± σ): 1.9 ms ± 0.1 ms [User: 0.4 ms, System: 1.4 ms]
Range (min … max): 1.8 ms … 2.6 ms 151 runs
Benchmark 2: update-ref: create many refs (refformat = reftable, refcount = 1)
Time (mean ± σ): 2.5 ms ± 0.1 ms [User: 0.7 ms, System: 1.7 ms]
Range (min … max): 2.4 ms … 3.4 ms 148 runs
Benchmark 3: update-ref: create many refs (refformat = files, refcount = 1000)
Time (mean ± σ): 152.5 ms ± 5.2 ms [User: 19.1 ms, System: 133.1 ms]
Range (min … max): 148.5 ms … 167.8 ms 15 runs
Benchmark 4: update-ref: create many refs (refformat = reftable, refcount = 1000)
Time (mean ± σ): 58.0 ms ± 2.5 ms [User: 28.4 ms, System: 29.4 ms]
Range (min … max): 56.3 ms … 72.9 ms 40 runs
Benchmark 5: update-ref: create many refs (refformat = files, refcount = 1000000)
Time (mean ± σ): 152.752 s ± 0.710 s [User: 20.315 s, System: 131.310 s]
Range (min … max): 152.165 s … 153.542 s 3 runs
Benchmark 6: update-ref: create many refs (refformat = reftable, refcount = 1000000)
Time (mean ± σ): 51.912 s ± 0.127 s [User: 26.483 s, System: 25.424 s]
Range (min … max): 51.769 s … 52.012 s 3 runs
- Deleting a ref in a fully-packed repository shows that the "files"
backend scales with the number of refs. The "reftable" backend has
constant-time deletions.
Benchmark 1: update-ref: delete ref (refformat = files, refcount = 1)
Time (mean ± σ): 1.7 ms ± 0.1 ms [User: 0.4 ms, System: 1.2 ms]
Range (min … max): 1.6 ms … 2.1 ms 316 runs
Benchmark 2: update-ref: delete ref (refformat = reftable, refcount = 1)
Time (mean ± σ): 1.8 ms ± 0.1 ms [User: 0.4 ms, System: 1.3 ms]
Range (min … max): 1.7 ms … 2.1 ms 294 runs
Benchmark 3: update-ref: delete ref (refformat = files, refcount = 1000)
Time (mean ± σ): 2.0 ms ± 0.1 ms [User: 0.5 ms, System: 1.4 ms]
Range (min … max): 1.9 ms … 2.5 ms 287 runs
Benchmark 4: update-ref: delete ref (refformat = reftable, refcount = 1000)
Time (mean ± σ): 1.9 ms ± 0.1 ms [User: 0.5 ms, System: 1.3 ms]
Range (min … max): 1.8 ms … 2.1 ms 217 runs
Benchmark 5: update-ref: delete ref (refformat = files, refcount = 1000000)
Time (mean ± σ): 229.8 ms ± 7.9 ms [User: 182.6 ms, System: 46.8 ms]
Range (min … max): 224.6 ms … 245.2 ms 6 runs
Benchmark 6: update-ref: delete ref (refformat = reftable, refcount = 1000000)
Time (mean ± σ): 2.0 ms ± 0.0 ms [User: 0.6 ms, System: 1.3 ms]
Range (min … max): 2.0 ms … 2.1 ms 3 runs
- Listing all refs shows no significant advantage for either of the
backends. The "files" backend is a bit faster, but not by a
significant margin. When repositories are not packed the "reftable"
backend outperforms the "files" backend because the "reftable"
backend performs auto-compaction.
Benchmark 1: show-ref: print all refs (refformat = files, refcount = 1, packed = true)
Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms]
Range (min … max): 1.5 ms … 2.0 ms 1729 runs
Benchmark 2: show-ref: print all refs (refformat = reftable, refcount = 1, packed = true)
Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms]
Range (min … max): 1.5 ms … 1.8 ms 1816 runs
Benchmark 3: show-ref: print all refs (refformat = files, refcount = 1000, packed = true)
Time (mean ± σ): 4.3 ms ± 0.1 ms [User: 0.9 ms, System: 3.3 ms]
Range (min … max): 4.1 ms … 4.6 ms 645 runs
Benchmark 4: show-ref: print all refs (refformat = reftable, refcount = 1000, packed = true)
Time (mean ± σ): 4.5 ms ± 0.2 ms [User: 1.0 ms, System: 3.3 ms]
Range (min … max): 4.2 ms … 5.9 ms 643 runs
Benchmark 5: show-ref: print all refs (refformat = files, refcount = 1000000, packed = true)
Time (mean ± σ): 2.537 s ± 0.034 s [User: 0.488 s, System: 2.048 s]
Range (min … max): 2.511 s … 2.627 s 10 runs
Benchmark 6: show-ref: print all refs (refformat = reftable, refcount = 1000000, packed = true)
Time (mean ± σ): 2.712 s ± 0.017 s [User: 0.653 s, System: 2.059 s]
Range (min … max): 2.692 s … 2.752 s 10 runs
Benchmark 7: show-ref: print all refs (refformat = files, refcount = 1, packed = false)
Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms]
Range (min … max): 1.5 ms … 1.9 ms 1834 runs
Benchmark 8: show-ref: print all refs (refformat = reftable, refcount = 1, packed = false)
Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms]
Range (min … max): 1.4 ms … 2.0 ms 1840 runs
Benchmark 9: show-ref: print all refs (refformat = files, refcount = 1000, packed = false)
Time (mean ± σ): 13.8 ms ± 0.2 ms [User: 2.8 ms, System: 10.8 ms]
Range (min … max): 13.3 ms … 14.5 ms 208 runs
Benchmark 10: show-ref: print all refs (refformat = reftable, refcount = 1000, packed = false)
Time (mean ± σ): 4.5 ms ± 0.2 ms [User: 1.2 ms, System: 3.3 ms]
Range (min … max): 4.3 ms … 6.2 ms 624 runs
Benchmark 11: show-ref: print all refs (refformat = files, refcount = 1000000, packed = false)
Time (mean ± σ): 12.127 s ± 0.129 s [User: 2.675 s, System: 9.451 s]
Range (min … max): 11.965 s … 12.370 s 10 runs
Benchmark 12: show-ref: print all refs (refformat = reftable, refcount = 1000000, packed = false)
Time (mean ± σ): 2.799 s ± 0.022 s [User: 0.735 s, System: 2.063 s]
Range (min … max): 2.769 s … 2.836 s 10 runs
- Printing a single ref shows no real difference between the "files"
and "reftable" backends.
Benchmark 1: show-ref: print single ref (refformat = files, refcount = 1)
Time (mean ± σ): 1.5 ms ± 0.1 ms [User: 0.4 ms, System: 1.0 ms]
Range (min … max): 1.4 ms … 1.8 ms 1779 runs
Benchmark 2: show-ref: print single ref (refformat = reftable, refcount = 1)
Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms]
Range (min … max): 1.4 ms … 2.5 ms 1753 runs
Benchmark 3: show-ref: print single ref (refformat = files, refcount = 1000)
Time (mean ± σ): 1.5 ms ± 0.1 ms [User: 0.3 ms, System: 1.1 ms]
Range (min … max): 1.4 ms … 1.9 ms 1840 runs
Benchmark 4: show-ref: print single ref (refformat = reftable, refcount = 1000)
Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms]
Range (min … max): 1.5 ms … 2.0 ms 1831 runs
Benchmark 5: show-ref: print single ref (refformat = files, refcount = 1000000)
Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms]
Range (min … max): 1.5 ms … 2.1 ms 1848 runs
Benchmark 6: show-ref: print single ref (refformat = reftable, refcount = 1000000)
Time (mean ± σ): 1.6 ms ± 0.1 ms [User: 0.4 ms, System: 1.1 ms]
Range (min … max): 1.5 ms … 2.1 ms 1762 runs
So overall, performance depends on the use case. Except for many
sequential writes, the "reftable" backend is roughly on par with or
significantly faster than the "files" backend though. Given that the
"files" backend has received 18 years of optimizations by now this can
be seen as a win. Furthermore, we can expect that the "reftable" backend
will become faster over time when attention turns more towards
optimizations.
The complete test suite passes, except for those tests explicitly marked
to require the REFFILES prerequisite. Some tests in t0610 are marked as
failing because they depend on still-in-flight bug fixes. Tests can be
run with the new backend by setting the GIT_TEST_DEFAULT_REF_FORMAT
environment variable to "reftable".
There is a single known conceptual incompatibility with the dumb HTTP
transport. As "info/refs" SHOULD NOT contain the HEAD reference, and
because the "HEAD" file is not valid anymore, it is impossible for the
remote client to figure out the default branch without changing the
protocol. This shortcoming needs to be handled in a subsequent patch
series.
As the reftable library has already been introduced a while ago, this
commit message will not go into the details of how exactly the on-disk
format works. Please refer to our preexisting technical documentation at
Documentation/technical/reftable for this.
[1]: https://public-inbox.org/git/CAJo=hJtyof=HRy=2sLP0ng0uZ4=S-DpZ5dR1aF+VHVETKG20OQ@mail.gmail.com/
Original-idea-by: Shawn Pearce <spearce@spearce.org>
Based-on-patch-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-07 15:20:31 +08:00
|
|
|
[REF_STORAGE_FORMAT_REFTABLE] = &refs_be_reftable,
|
2023-12-29 15:26:34 +08:00
|
|
|
};
|
2016-09-05 00:08:10 +08:00
|
|
|
|
2023-12-29 15:26:34 +08:00
|
|
|
static const struct ref_storage_be *find_ref_storage_backend(unsigned int ref_storage_format)
|
2016-09-05 00:08:10 +08:00
|
|
|
{
|
2023-12-29 15:26:34 +08:00
|
|
|
if (ref_storage_format < ARRAY_SIZE(refs_backends))
|
|
|
|
return refs_backends[ref_storage_format];
|
2016-09-05 00:08:10 +08:00
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
2023-12-29 15:26:34 +08:00
|
|
|
unsigned int ref_storage_format_by_name(const char *name)
|
|
|
|
{
|
|
|
|
for (unsigned int i = 0; i < ARRAY_SIZE(refs_backends); i++)
|
|
|
|
if (refs_backends[i] && !strcmp(refs_backends[i]->name, name))
|
|
|
|
return i;
|
|
|
|
return REF_STORAGE_FORMAT_UNKNOWN;
|
|
|
|
}
|
|
|
|
|
|
|
|
const char *ref_storage_format_to_name(unsigned int ref_storage_format)
|
|
|
|
{
|
|
|
|
const struct ref_storage_be *be = find_ref_storage_backend(ref_storage_format);
|
|
|
|
if (!be)
|
|
|
|
return "unknown";
|
|
|
|
return be->name;
|
|
|
|
}
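The three lookup helpers above rely on an array that is indexed by the format constant and filled with designated initializers, so any unlisted slot is NULL. A minimal standalone sketch of that pattern follows. All `demo_*` and `DEMO_*` names are hypothetical stand-ins, and the numeric enum values are an assumption chosen so that slot 0 stays empty.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical format constants; the values are an assumption. */
enum demo_format {
	DEMO_FORMAT_UNKNOWN = 0,
	DEMO_FORMAT_FILES = 1,
	DEMO_FORMAT_REFTABLE = 2,
};

/* Stand-in for a backend descriptor; only the name matters here. */
struct demo_backend {
	const char *name;
};

static const struct demo_backend demo_files = { "files" };
static const struct demo_backend demo_reftable = { "reftable" };

/* Slots without a designated initializer (here: index 0) are NULL. */
static const struct demo_backend *demo_backends[] = {
	[DEMO_FORMAT_FILES] = &demo_files,
	[DEMO_FORMAT_REFTABLE] = &demo_reftable,
};

#define DEMO_ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))

/* Name to index: skip empty slots, fall back to the "unknown" value. */
static unsigned int demo_format_by_name(const char *name)
{
	for (unsigned int i = 0; i < DEMO_ARRAY_SIZE(demo_backends); i++)
		if (demo_backends[i] && !strcmp(demo_backends[i]->name, name))
			return i;
	return DEMO_FORMAT_UNKNOWN;
}

/* Index to name: bounds-check first, then guard against empty slots. */
static const char *demo_format_to_name(unsigned int format)
{
	if (format < DEMO_ARRAY_SIZE(demo_backends) && demo_backends[format])
		return demo_backends[format]->name;
	return "unknown";
}
```

The NULL check in both directions is what makes a sparse table safe: an out-of-range or unassigned index yields "unknown" rather than a dereference of an empty slot.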
|
|
|
|
|
2012-04-10 13:30:13 +08:00
|
|
|
/*
|
2014-06-04 11:38:10 +08:00
|
|
|
* How to handle various characters in refnames:
|
|
|
|
* 0: An acceptable character for refs
|
2014-07-29 01:41:53 +08:00
|
|
|
* 1: End-of-component
|
|
|
|
* 2: ., look for a preceding . to reject .. in refs
|
|
|
|
* 3: {, look for a preceding @ to reject @{ in refs
|
2015-07-23 05:05:32 +08:00
|
|
|
* 4: A bad character: ASCII control characters, and
|
2015-07-23 05:05:33 +08:00
|
|
|
* ":", "?", "[", "\", "^", "~", SP, or TAB
|
|
|
|
* 5: *, reject unless REFNAME_REFSPEC_PATTERN is set
|
2014-06-04 11:38:10 +08:00
|
|
|
*/
|
|
|
|
static unsigned char refname_disposition[256] = {
|
2014-07-29 01:41:53 +08:00
|
|
|
1, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
|
|
|
|
4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4, 4,
|
2015-07-23 05:05:33 +08:00
|
|
|
4, 0, 0, 0, 0, 0, 0, 0, 0, 0, 5, 0, 0, 0, 2, 1,
|
2014-07-29 01:41:53 +08:00
|
|
|
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 4,
|
2014-06-04 11:38:10 +08:00
|
|
|
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
|
2014-07-29 01:41:53 +08:00
|
|
|
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 4, 4, 0, 4, 0,
|
|
|
|
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
|
|
|
|
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0, 4, 4
|
2014-06-04 11:38:10 +08:00
|
|
|
};
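To make the numeric table above easier to read, here is a hedged sketch that produces the same classification for ASCII input but names each special character explicitly. `demo_disposition` and `demo_classify` are hypothetical names. Unlike the real table, this sketch handles ASCII control characters with a range check in code instead of listing them one by one.

```c
/*
 * Dispositions for the printable special characters, keyed by character.
 * Everything not listed defaults to 0 (acceptable).
 */
static const unsigned char demo_disposition[256] = {
	['\0'] = 1, ['/'] = 1,            /* end of component */
	['.'] = 2,                         /* check for a preceding '.' */
	['{'] = 3,                         /* check for a preceding '@' */
	[' '] = 4, [':'] = 4, ['?'] = 4,  /* always forbidden */
	['['] = 4, ['\\'] = 4, ['^'] = 4,
	['~'] = 4,
	['*'] = 5,                         /* only allowed in refspec patterns */
};

/* Classify one byte; control characters (including TAB and DEL) are bad. */
static unsigned char demo_classify(unsigned char ch)
{
	if (ch != '\0' && (ch < 0x20 || ch == 0x7f))
		return 4;
	return demo_disposition[ch];
}
```

A single table lookup per byte keeps the scanning loop free of long comparison chains, which is presumably why the original encodes every byte value directly.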
|
|
|
|
|
2022-08-06 01:58:36 +08:00
|
|
|
struct ref_namespace_info ref_namespace[] = {
|
|
|
|
[NAMESPACE_HEAD] = {
|
|
|
|
.ref = "HEAD",
|
|
|
|
.decoration = DECORATION_REF_HEAD,
|
|
|
|
.exact = 1,
|
|
|
|
},
|
|
|
|
[NAMESPACE_BRANCHES] = {
|
|
|
|
.ref = "refs/heads/",
|
|
|
|
.decoration = DECORATION_REF_LOCAL,
|
|
|
|
},
|
|
|
|
[NAMESPACE_TAGS] = {
|
|
|
|
.ref = "refs/tags/",
|
|
|
|
.decoration = DECORATION_REF_TAG,
|
|
|
|
},
|
|
|
|
[NAMESPACE_REMOTE_REFS] = {
|
|
|
|
/*
|
|
|
|
* The default refspec for new remotes copies refs from
|
|
|
|
* refs/heads/ on the remote into refs/remotes/<remote>/.
|
|
|
|
* As such, "refs/remotes/" has special handling.
|
|
|
|
*/
|
|
|
|
.ref = "refs/remotes/",
|
|
|
|
.decoration = DECORATION_REF_REMOTE,
|
|
|
|
},
|
|
|
|
[NAMESPACE_STASH] = {
|
|
|
|
/*
|
|
|
|
* The single ref "refs/stash" stores the latest stash.
|
|
|
|
* Older stashes can be found in the reflog.
|
|
|
|
*/
|
|
|
|
.ref = "refs/stash",
|
|
|
|
.exact = 1,
|
|
|
|
.decoration = DECORATION_REF_STASH,
|
|
|
|
},
|
|
|
|
[NAMESPACE_REPLACE] = {
|
|
|
|
/*
|
|
|
|
* This namespace allows Git to act as if one object ID
|
|
|
|
* points to the content of another. Unlike the other
|
|
|
|
* ref namespaces, this one can be changed by the
|
|
|
|
* GIT_REPLACE_REF_BASE environment variable. This
|
|
|
|
* .ref value will be overwritten in setup_git_env().
|
|
|
|
*/
|
|
|
|
.ref = "refs/replace/",
|
|
|
|
.decoration = DECORATION_GRAFTED,
|
|
|
|
},
|
|
|
|
[NAMESPACE_NOTES] = {
|
|
|
|
/*
|
|
|
|
* The refs/notes/commit ref points to the tip of a
|
|
|
|
* parallel commit history that adds metadata to commits
|
|
|
|
* in the normal history. This ref can be overwritten
|
|
|
|
* by the core.notesRef config variable or the
|
|
|
|
* GIT_NOTES_REF environment variable.
|
|
|
|
*/
|
|
|
|
.ref = "refs/notes/commit",
|
|
|
|
.exact = 1,
|
|
|
|
},
|
|
|
|
[NAMESPACE_PREFETCH] = {
|
|
|
|
/*
|
|
|
|
* Prefetch refs are written by the background 'fetch'
|
|
|
|
* maintenance task. It allows faster foreground fetches
|
|
|
|
* by advertising these previously-downloaded tips, and never
|
|
|
|
* updates refs/remotes/ without user intervention.
|
|
|
|
*/
|
|
|
|
.ref = "refs/prefetch/",
|
|
|
|
},
|
|
|
|
[NAMESPACE_REWRITTEN] = {
|
|
|
|
/*
|
|
|
|
* Rewritten refs are used by the 'label' command in the
|
|
|
|
* sequencer. These are particularly useful during an
|
|
|
|
* interactive rebase that uses the 'merge' command.
|
|
|
|
*/
|
|
|
|
.ref = "refs/rewritten/",
|
|
|
|
},
|
|
|
|
};
|
|
|
|
|
|
|
|
void update_ref_namespace(enum ref_namespace namespace, char *ref)
|
|
|
|
{
|
|
|
|
struct ref_namespace_info *info = &ref_namespace[namespace];
|
|
|
|
if (info->ref_updated)
|
2024-06-07 14:37:39 +08:00
|
|
|
free((char *)info->ref);
|
2022-08-06 01:58:36 +08:00
|
|
|
info->ref = ref;
|
|
|
|
info->ref_updated = 1;
|
|
|
|
}
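The `ref_updated` flag in `update_ref_namespace()` records whether the current string is heap-allocated: the initial values are string literals and must not be freed, while later replacements are owned by the table. A standalone sketch of that ownership pattern, with hypothetical names (`demo_info`, `demo_update`, `demo_copy`) and plain `malloc` in place of Git's allocation helpers:

```c
#include <stdlib.h>
#include <string.h>

struct demo_info {
	const char *ref;  /* string literal at first, heap string after update */
	int ref_updated;  /* nonzero once ref points at memory we own */
};

/* Take ownership of a heap string, freeing only a previously owned one. */
static void demo_update(struct demo_info *info, char *ref)
{
	if (info->ref_updated)
		free((char *)info->ref);
	info->ref = ref;
	info->ref_updated = 1;
}

/* Small helper so the sketch does not depend on the non-ISO strdup(). */
static char *demo_copy(const char *s)
{
	size_t n = strlen(s) + 1;
	char *p = malloc(n);
	if (p)
		memcpy(p, s, n);
	return p;
}
```

The caller hands over a heap pointer and must not free it afterwards; the first update skips `free()` because the literal was never allocated.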
|
|
|
|
|
2014-06-04 11:38:10 +08:00
|
|
|
/*
|
|
|
|
* Try to read one refname component from the front of refname.
|
|
|
|
* Return the length of the component found, or -1 if the component is
|
|
|
|
* not legal. It is legal if it is something reasonable to have under
|
|
|
|
* ".git/refs/"; We do not like it if:
|
2012-04-10 13:30:13 +08:00
|
|
|
*
|
2019-03-08 17:28:34 +08:00
|
|
|
* - it begins with ".", or
|
2012-04-10 13:30:13 +08:00
|
|
|
* - it has double dots "..", or
|
2015-07-23 05:05:32 +08:00
|
|
|
* - it has ASCII control characters, or
|
2015-07-23 05:05:33 +08:00
|
|
|
* - it has ":", "?", "[", "\", "^", "~", SP, or TAB anywhere, or
|
|
|
|
* - it has "*" anywhere unless REFNAME_REFSPEC_PATTERN is set, or
|
2015-07-23 05:05:32 +08:00
|
|
|
* - it ends with a "/", or
|
|
|
|
* - it ends with ".lock", or
|
|
|
|
* - it contains a "@{" portion
|
2019-03-08 17:28:34 +08:00
|
|
|
*
|
|
|
|
* When sanitized is not NULL, instead of rejecting the input refname
|
|
|
|
* as an error, try to come up with a usable replacement for the input
|
|
|
|
* refname in it.
|
2012-04-10 13:30:13 +08:00
|
|
|
*/
|
2019-03-08 17:28:34 +08:00
|
|
|
static int check_refname_component(const char *refname, int *flags,
|
|
|
|
struct strbuf *sanitized)
|
2012-04-10 13:30:13 +08:00
|
|
|
{
|
|
|
|
const char *cp;
|
|
|
|
char last = '\0';
|
2019-03-08 17:28:34 +08:00
|
|
|
size_t component_start = 0; /* garbage - not a reasonable initial value */
|
|
|
|
|
|
|
|
if (sanitized)
|
|
|
|
component_start = sanitized->len;
|
2012-04-10 13:30:13 +08:00
|
|
|
|
|
|
|
for (cp = refname; ; cp++) {
|
2014-06-04 11:38:10 +08:00
|
|
|
int ch = *cp & 255;
|
|
|
|
unsigned char disp = refname_disposition[ch];
|
2019-03-08 17:28:34 +08:00
|
|
|
|
|
|
|
if (sanitized && disp != 1)
|
|
|
|
strbuf_addch(sanitized, ch);
|
|
|
|
|
2014-06-04 11:38:10 +08:00
|
|
|
switch (disp) {
|
2014-07-29 01:41:53 +08:00
|
|
|
case 1:
|
2014-06-04 11:38:10 +08:00
|
|
|
goto out;
|
2014-07-29 01:41:53 +08:00
|
|
|
case 2:
|
2019-03-08 17:28:34 +08:00
|
|
|
if (last == '.') { /* Refname contains "..". */
|
|
|
|
if (sanitized)
|
|
|
|
/* collapse ".." to single "." */
|
|
|
|
strbuf_setlen(sanitized, sanitized->len - 1);
|
|
|
|
else
|
|
|
|
return -1;
|
|
|
|
}
|
2014-06-04 11:38:10 +08:00
|
|
|
break;
|
2014-07-29 01:41:53 +08:00
|
|
|
case 3:
|
2019-03-08 17:28:34 +08:00
|
|
|
if (last == '@') { /* Refname contains "@{". */
|
|
|
|
if (sanitized)
|
|
|
|
sanitized->buf[sanitized->len-1] = '-';
|
|
|
|
else
|
|
|
|
return -1;
|
|
|
|
}
|
2012-04-10 13:30:13 +08:00
|
|
|
break;
|
2014-07-29 01:41:53 +08:00
|
|
|
case 4:
|
2019-03-08 17:28:34 +08:00
|
|
|
/* forbidden char */
|
|
|
|
if (sanitized)
|
|
|
|
sanitized->buf[sanitized->len-1] = '-';
|
|
|
|
else
|
|
|
|
return -1;
|
|
|
|
break;
|
2015-07-23 05:05:33 +08:00
|
|
|
case 5:
|
2019-03-08 17:28:34 +08:00
|
|
|
if (!(*flags & REFNAME_REFSPEC_PATTERN)) {
|
|
|
|
/* refspec can't be a pattern */
|
|
|
|
if (sanitized)
|
|
|
|
sanitized->buf[sanitized->len-1] = '-';
|
|
|
|
else
|
|
|
|
return -1;
|
|
|
|
}
|
2015-07-23 05:05:33 +08:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Unset the pattern flag so that we only accept
|
|
|
|
* a single asterisk for one side of refspec.
|
|
|
|
*/
|
|
|
|
*flags &= ~ REFNAME_REFSPEC_PATTERN;
|
|
|
|
break;
|
2014-06-04 11:38:10 +08:00
|
|
|
}
|
2012-04-10 13:30:13 +08:00
|
|
|
last = ch;
|
|
|
|
}
|
2014-06-04 11:38:10 +08:00
|
|
|
out:
|
2012-04-10 13:30:13 +08:00
|
|
|
if (cp == refname)
|
2012-04-10 13:30:22 +08:00
|
|
|
return 0; /* Component has zero length. */
|
2019-03-08 17:28:34 +08:00
|
|
|
|
|
|
|
if (refname[0] == '.') { /* Component starts with '.'. */
|
|
|
|
if (sanitized)
|
|
|
|
sanitized->buf[component_start] = '-';
|
|
|
|
else
|
|
|
|
return -1;
|
|
|
|
}
|
2014-10-01 18:28:15 +08:00
|
|
|
if (cp - refname >= LOCK_SUFFIX_LEN &&
|
2019-03-08 17:28:34 +08:00
|
|
|
!memcmp(cp - LOCK_SUFFIX_LEN, LOCK_SUFFIX, LOCK_SUFFIX_LEN)) {
|
|
|
|
if (!sanitized)
|
|
|
|
return -1;
|
|
|
|
/* Refname ends with ".lock". */
|
|
|
|
while (strbuf_strip_suffix(sanitized, LOCK_SUFFIX)) {
|
|
|
|
/* try again in case we have .lock.lock */
|
|
|
|
}
|
|
|
|
}
|
2012-04-10 13:30:13 +08:00
|
|
|
return cp - refname;
|
|
|
|
}
|
|
|
|
|
2019-03-08 17:28:34 +08:00
|
|
|
static int check_or_sanitize_refname(const char *refname, int flags,
|
|
|
|
struct strbuf *sanitized)
|
2012-04-10 13:30:13 +08:00
|
|
|
{
|
|
|
|
int component_len, component_count = 0;
|
|
|
|
|
2019-03-08 17:28:34 +08:00
|
|
|
if (!strcmp(refname, "@")) {
|
Add new @ shortcut for HEAD
Typing 'HEAD' is tedious, especially when we can use '@' instead.
The reason for choosing '@' is that it follows naturally from the
ref@op syntax (e.g. HEAD@{u}), except we have no ref, and no
operation, and when we don't have those, it makes sense to assume
'HEAD'.
So now we can use 'git show @~1', and all that goody goodness.
Until now '@' was a valid name, but it conflicts with this idea, so
let's make it invalid. Probably very few people, if any, used this name.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-09-02 14:34:30 +08:00
|
|
|
/* Refname is a single character '@'. */
|
2019-03-08 17:28:34 +08:00
|
|
|
if (sanitized)
|
|
|
|
strbuf_addch(sanitized, '-');
|
|
|
|
else
|
|
|
|
return -1;
|
|
|
|
}
|
2013-09-02 14:34:30 +08:00
|
|
|
|
2012-04-10 13:30:13 +08:00
|
|
|
while (1) {
|
2019-03-08 17:28:34 +08:00
|
|
|
if (sanitized && sanitized->len)
|
|
|
|
strbuf_complete(sanitized, '/');
|
|
|
|
|
2012-04-10 13:30:13 +08:00
|
|
|
/* We are at the start of a path component. */
|
2019-03-08 17:28:34 +08:00
|
|
|
component_len = check_refname_component(refname, &flags,
|
|
|
|
sanitized);
|
|
|
|
if (sanitized && component_len == 0)
|
|
|
|
; /* OK, omit empty component */
|
|
|
|
else if (component_len <= 0)
|
2015-07-23 05:05:33 +08:00
|
|
|
return -1;
|
|
|
|
|
2012-04-10 13:30:13 +08:00
|
|
|
component_count++;
|
|
|
|
if (refname[component_len] == '\0')
|
|
|
|
break;
|
|
|
|
/* Skip to next component. */
|
|
|
|
refname += component_len + 1;
|
|
|
|
}
|
|
|
|
|
2019-03-08 17:28:34 +08:00
|
|
|
if (refname[component_len - 1] == '.') {
|
|
|
|
/* Refname ends with '.'. */
|
|
|
|
if (sanitized)
|
|
|
|
; /* omit ending dot */
|
|
|
|
else
|
|
|
|
return -1;
|
|
|
|
}
|
2012-04-10 13:30:13 +08:00
|
|
|
if (!(flags & REFNAME_ALLOW_ONELEVEL) && component_count < 2)
|
|
|
|
return -1; /* Refname has only one component. */
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2019-03-08 17:28:34 +08:00
|
|
|
int check_refname_format(const char *refname, int flags)
|
|
|
|
{
|
|
|
|
return check_or_sanitize_refname(refname, flags, NULL);
|
|
|
|
}
|
|
|
|
|
|
|
|
void sanitize_refname_component(const char *refname, struct strbuf *out)
|
|
|
|
{
|
|
|
|
if (check_or_sanitize_refname(refname, REFNAME_ALLOW_ONELEVEL, out))
|
|
|
|
BUG("sanitizing refname '%s' check returned error", refname);
|
|
|
|
}
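The sanitizing rules applied per component above can be sketched in isolation. This is a minimal sketch, not git's implementation: `sketch_sanitize_component` and `sketch_is_forbidden` are hypothetical names, a fixed `char` buffer stands in for `struct strbuf`, and the forbidden-character set is an assumption taken from the rules documented in git-check-ref-format(1), with `*` treated as forbidden because no refspec-pattern flag is modelled.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: characters this sketch replaces with '-'. */
static int sketch_is_forbidden(unsigned char ch)
{
	return ch < 0x20 || ch == 0x7f || ch == ' ' ||
	       strchr("~^:?[\\*", ch) != NULL;
}

/*
 * Sanitize one path component of `in` (up to '/' or end of string)
 * into `out`. Returns the length of the sanitized component.
 */
static size_t sketch_sanitize_component(const char *in, char *out, size_t outsz)
{
	size_t len = 0;
	char last = '\0';
	const char *cp;

	if (!outsz)
		return 0;
	for (cp = in; *cp && *cp != '/'; cp++) {
		char ch = *cp;

		if (len + 1 >= outsz)
			break; /* keep room for the terminating NUL */
		if (ch == '.' && last == '.')
			; /* collapse ".." to a single "." */
		else if (ch == '{' && last == '@')
			out[len++] = '-'; /* break up "@{" */
		else if (sketch_is_forbidden((unsigned char)ch))
			out[len++] = '-';
		else
			out[len++] = ch;
		last = ch;
	}
	if (len && in[0] == '.')
		out[0] = '-'; /* component must not start with '.' */
	while (len >= 5 && !memcmp(out + len - 5, ".lock", 5))
		len -= 5; /* strip ".lock", repeatedly for ".lock.lock" */
	out[len] = '\0';
	return len;
}
```

Under these assumptions, `"a..b"` becomes `"a.b"`, `".hidden"` becomes `"-hidden"`, and `"topic.lock.lock"` becomes `"topic"`. Unlike the real function, the sketch checks the `.lock` suffix on the sanitized output rather than on the input.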
|
|
|
|
|
2015-11-10 19:42:36 +08:00
|
|
|
int refname_is_safe(const char *refname)
|
refs.c: allow listing and deleting badly named refs
We currently do not handle badly named refs well:
$ cp .git/refs/heads/master .git/refs/heads/master.....@\*@\\.
$ git branch
fatal: Reference has invalid format: 'refs/heads/master.....@*@\.'
$ git branch -D master.....@\*@\\.
error: branch 'master.....@*@\.' not found.
Users cannot recover from a badly named ref without manually finding
and deleting the loose ref file or appropriate line in packed-refs.
Making that easier will make it easier to tweak the ref naming rules
in the future, for example to forbid shell metacharacters like '`'
and '"', without putting people in a state that is hard to get out of.
So allow "branch --list" to show these refs and allow "branch -d/-D"
and "update-ref -d" to delete them. Other commands (for example to
rename refs) will continue to not handle these refs but can be changed
in later patches.
Details:
In resolving functions, refuse to resolve refs that don't pass the
git-check-ref-format(1) check unless the new RESOLVE_REF_ALLOW_BAD_NAME
flag is passed. Even with RESOLVE_REF_ALLOW_BAD_NAME, refuse to
resolve refs that escape the refs/ directory and do not match the
pattern [A-Z_]* (think "HEAD" and "MERGE_HEAD").
In locking functions, refuse to act on badly named refs unless they
are being deleted and either are in the refs/ directory or match [A-Z_]*.
Just like other invalid refs, flag resolved, badly named refs with the
REF_ISBROKEN flag, treat them as resolving to null_sha1, and skip them
in all iteration functions except for for_each_rawref.
Flag badly named refs (but not symrefs pointing to badly named refs)
with a REF_BAD_NAME flag to make it easier for future callers to
notice and handle them specially. For example, in a later patch
for-each-ref will use this flag to detect refs whose names can confuse
callers parsing for-each-ref output.
In the transaction API, refuse to create or update badly named refs,
but allow deleting them (unless they try to escape refs/ and don't match
[A-Z_]*).
Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-04 02:45:43 +08:00
|
|
|
{
|
2016-04-27 18:39:11 +08:00
|
|
|
const char *rest;
|
|
|
|
|
|
|
|
if (skip_prefix(refname, "refs/", &rest)) {
|
2014-09-04 02:45:43 +08:00
|
|
|
char *buf;
|
|
|
|
int result;
|
2016-04-27 18:40:39 +08:00
|
|
|
size_t restlen = strlen(rest);
|
|
|
|
|
|
|
|
/* rest must not be empty, or start or end with "/" */
|
|
|
|
if (!restlen || *rest == '/' || rest[restlen - 1] == '/')
|
|
|
|
return 0;
|
2014-09-04 02:45:43 +08:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Does the refname try to escape refs/?
|
|
|
|
* For example: refs/foo/bar is safe but refs/foo/../../bar
|
|
|
|
* is not. The rest must also already be normalized, so
* refs/foo/../bar is rejected as well.
|
|
|
|
*/
|
2016-04-27 18:40:39 +08:00
|
|
|
buf = xmallocz(restlen);
|
|
|
|
result = !normalize_path_copy(buf, rest) && !strcmp(buf, rest);
|
2014-09-04 02:45:43 +08:00
|
|
|
free(buf);
|
|
|
|
return result;
|
|
|
|
}
|
2016-04-27 18:42:27 +08:00
|
|
|
|
|
|
|
do {
|
2014-09-04 02:45:43 +08:00
|
|
|
if (!isupper(*refname) && *refname != '_')
|
|
|
|
return 0;
|
|
|
|
refname++;
|
2016-04-27 18:42:27 +08:00
|
|
|
} while (*refname);
|
2014-09-04 02:45:43 +08:00
|
|
|
return 1;
|
|
|
|
}
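The two branches of `refname_is_safe()` can be approximated without git's internals. This is a minimal sketch under stated assumptions: all `sketch_*` names are hypothetical, and the `normalize_path_copy()` plus `strcmp()` comparison is replaced by a simpler check that rejects any empty, `.` or `..` component, which is assumed to accept the same names for the cases shown.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: non-empty and made only of A-Z and '_'. */
static int sketch_is_all_caps_name(const char *name)
{
	if (!*name)
		return 0;
	for (; *name; name++)
		if (!isupper((unsigned char)*name) && *name != '_')
			return 0;
	return 1;
}

/*
 * Hypothetical stand-in for "normalizing the path changes nothing":
 * reject empty, "." and ".." components (this also rejects empty
 * paths and leading, trailing or doubled slashes).
 */
static int sketch_path_is_normalized(const char *path)
{
	const char *p = path;

	for (;;) {
		const char *slash = strchr(p, '/');
		size_t len = slash ? (size_t)(slash - p) : strlen(p);

		if (len == 0)
			return 0;
		if (len == 1 && p[0] == '.')
			return 0;
		if (len == 2 && p[0] == '.' && p[1] == '.')
			return 0;
		if (!slash)
			return 1;
		p = slash + 1;
	}
}

static int sketch_refname_is_safe(const char *refname)
{
	if (!strncmp(refname, "refs/", 5))
		return sketch_path_is_normalized(refname + 5);
	return sketch_is_all_caps_name(refname);
}
```

With this sketch, `"refs/heads/main"`, `"HEAD"` and `"MERGE_HEAD"` are accepted, while `"refs/foo/../../bar"`, `"refs/"` and `"head"` are rejected.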
|
|
|
|
|
2017-06-23 15:01:37 +08:00
|
|
|
/*
|
|
|
|
* Return true if refname, which has the specified oid and flags, can
|
|
|
|
* be resolved to an object in the database. If the referred-to object
|
|
|
|
* does not exist, emit a warning and return false.
|
|
|
|
*/
|
|
|
|
int ref_resolves_to_object(const char *refname,
|
2021-10-09 05:08:15 +08:00
|
|
|
struct repository *repo,
|
2017-06-23 15:01:37 +08:00
|
|
|
const struct object_id *oid,
|
|
|
|
unsigned int flags)
|
|
|
|
{
|
|
|
|
if (flags & REF_ISBROKEN)
|
|
|
|
return 0;
|
2021-10-09 05:08:15 +08:00
|
|
|
if (!repo_has_object_file(repo, oid)) {
|
2018-07-21 15:49:35 +08:00
|
|
|
error(_("%s does not point to a valid object!"), refname);
|
2017-06-23 15:01:37 +08:00
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
return 1;
|
|
|
|
}
|
|
|
|
|
2017-03-26 10:42:34 +08:00
|
|
|
char *refs_resolve_refdup(struct ref_store *refs,
|
|
|
|
const char *refname, int resolve_flags,
|
refs: convert resolve_refdup and refs_resolve_refdup to struct object_id
All of the callers already pass the hash member of struct object_id, so
update them to pass a pointer to the struct directly.
This transformation was done with an update to declaration and
definition and the following semantic patch:
@@
expression E1, E2, E3, E4;
@@
- resolve_refdup(E1, E2, E3.hash, E4)
+ resolve_refdup(E1, E2, &E3, E4)
@@
expression E1, E2, E3, E4;
@@
- resolve_refdup(E1, E2, E3->hash, E4)
+ resolve_refdup(E1, E2, E3, E4)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-16 06:06:55 +08:00
|
|
|
struct object_id *oid, int *flags)
|
2017-03-26 10:42:34 +08:00
|
|
|
{
|
|
|
|
const char *result;
|
|
|
|
|
|
|
|
result = refs_resolve_ref_unsafe(refs, refname, resolve_flags,
|
2022-01-26 22:37:01 +08:00
|
|
|
oid, flags);
|
2017-03-26 10:42:34 +08:00
|
|
|
return xstrdup_or_null(result);
|
|
|
|
}
|
|
|
|
|
2023-07-11 05:12:05 +08:00
|
|
|
/* The argument to for_each_filter_refs */
|
|
|
|
struct for_each_ref_filter {
|
2015-11-09 21:34:01 +08:00
|
|
|
const char *pattern;
|
2018-11-12 21:25:44 +08:00
|
|
|
const char *prefix;
|
2015-11-09 21:34:01 +08:00
|
|
|
each_ref_fn *fn;
|
|
|
|
void *cb_data;
|
|
|
|
};
|
2012-04-10 13:30:26 +08:00
|
|
|
|
2024-05-07 15:11:39 +08:00
|
|
|
int refs_read_ref_full(struct ref_store *refs, const char *refname,
|
|
|
|
int resolve_flags, struct object_id *oid, int *flags)
|
2012-04-10 13:30:21 +08:00
|
|
|
{
|
2021-10-16 17:39:27 +08:00
|
|
|
if (refs_resolve_ref_unsafe(refs, refname, resolve_flags,
|
2022-01-26 22:37:01 +08:00
|
|
|
oid, flags))
|
2015-11-09 21:34:01 +08:00
|
|
|
return 0;
|
|
|
|
return -1;
|
2012-04-10 13:30:21 +08:00
|
|
|
}
|
|
|
|
|
2024-05-07 15:11:39 +08:00
|
|
|
int refs_read_ref(struct ref_store *refs, const char *refname, struct object_id *oid)
|
|
|
|
{
|
|
|
|
return refs_read_ref_full(refs, refname, RESOLVE_REF_READING, oid, NULL);
|
|
|
|
}
|
|
|
|
|
2020-08-22 00:59:34 +08:00
|
|
|
int refs_ref_exists(struct ref_store *refs, const char *refname)
|
2019-04-06 19:34:24 +08:00
|
|
|
{
|
2021-10-16 17:39:27 +08:00
|
|
|
return !!refs_resolve_ref_unsafe(refs, refname, RESOLVE_REF_READING,
|
2022-01-26 22:37:01 +08:00
|
|
|
NULL, NULL);
|
2019-04-06 19:34:24 +08:00
|
|
|
}
|
|
|
|
|
2023-07-11 05:12:05 +08:00
|
|
|
static int for_each_filter_refs(const char *refname,
|
|
|
|
const struct object_id *oid,
|
|
|
|
int flags, void *data)
|
2012-04-10 13:30:26 +08:00
|
|
|
{
|
2023-07-11 05:12:05 +08:00
|
|
|
struct for_each_ref_filter *filter = data;
|
2015-11-09 21:34:01 +08:00
|
|
|
|
2017-06-23 05:38:08 +08:00
|
|
|
if (wildmatch(filter->pattern, refname, 0))
|
2015-11-09 21:34:01 +08:00
|
|
|
return 0;
|
2018-11-12 21:25:44 +08:00
|
|
|
if (filter->prefix)
|
|
|
|
skip_prefix(refname, filter->prefix, &refname);
|
2015-11-09 21:34:01 +08:00
|
|
|
return filter->fn(refname, oid, flags, filter->cb_data);
|
2012-04-10 13:30:26 +08:00
|
|
|
}
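The filtering callback above wraps a user callback: it skips names that do not match the pattern, optionally strips a prefix, and then forwards the call. A minimal sketch of the same wrapping pattern follows, assuming POSIX `fnmatch()` as a stand-in for git's `wildmatch()` and a simplified callback signature without the object ID and flags; every `sketch_*` name is hypothetical.

```c
#include <fnmatch.h>
#include <string.h>

/* Simplified callback type (assumption: no oid or flags arguments). */
typedef int (*sketch_each_fn)(const char *name, void *cb_data);

struct sketch_filter {
	const char *pattern; /* glob the full name must match */
	const char *prefix;  /* optional prefix to strip, or NULL */
	sketch_each_fn fn;   /* wrapped callback */
	void *cb_data;       /* passed through to fn */
};

static int sketch_filter_cb(const char *name, void *data)
{
	struct sketch_filter *filter = data;

	if (fnmatch(filter->pattern, name, 0))
		return 0; /* no match: skip this name, keep iterating */
	if (filter->prefix) {
		size_t n = strlen(filter->prefix);

		if (!strncmp(name, filter->prefix, n))
			name += n;
	}
	return filter->fn(name, filter->cb_data);
}

/* Example wrapped callback: remember the last name it was given. */
static int sketch_record(const char *name, void *cb_data)
{
	*(const char **)cb_data = name;
	return 0;
}
```

Returning 0 for a non-matching name matters here: in this callback style a non-zero return stops the iteration, so "skip" has to look like success.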
|
|
|
|
|
2017-10-16 06:07:10 +08:00
|
|
|
enum peel_status peel_object(const struct object_id *name, struct object_id *oid)
|
2007-04-17 09:42:50 +08:00
|
|
|
{
|
2021-04-13 15:16:36 +08:00
|
|
|
struct object *o = lookup_unknown_object(the_repository, name);
|
2007-04-17 09:42:50 +08:00
|
|
|
|
2015-11-09 21:34:01 +08:00
|
|
|
if (o->type == OBJ_NONE) {
|
2018-04-26 02:20:59 +08:00
|
|
|
int type = oid_object_info(the_repository, name, NULL);
|
2020-06-17 17:14:08 +08:00
|
|
|
if (type < 0 || !object_as_type(o, type, 0))
|
2015-11-09 21:34:01 +08:00
|
|
|
return PEEL_INVALID;
|
|
|
|
}
|
2012-04-10 13:30:13 +08:00
|
|
|
|
2015-11-09 21:34:01 +08:00
|
|
|
if (o->type != OBJ_TAG)
|
|
|
|
return PEEL_NON_TAG;
|
2012-05-23 05:03:29 +08:00
|
|
|
|
2015-11-09 21:34:01 +08:00
|
|
|
o = deref_tag_noverify(o);
|
|
|
|
if (!o)
|
|
|
|
return PEEL_INVALID;
|
|
|
|
|
2017-10-16 06:07:10 +08:00
|
|
|
oidcpy(oid, &o->oid);
|
2015-11-09 21:34:01 +08:00
|
|
|
return PEEL_PEELED;
|
2012-05-23 05:03:29 +08:00
|
|
|
}
|
|
|
|
|
2015-11-09 21:34:01 +08:00
|
|
|
struct warn_if_dangling_data {
|
|
|
|
FILE *fp;
|
|
|
|
const char *refname;
|
|
|
|
const struct string_list *refnames;
|
|
|
|
const char *msg_fmt;
|
|
|
|
};
|
2012-04-10 13:30:13 +08:00
|
|
|
|
2022-08-19 18:08:32 +08:00
|
|
|
static int warn_if_dangling_symref(const char *refname,
|
2022-08-26 01:09:48 +08:00
|
|
|
const struct object_id *oid UNUSED,
|
2015-11-09 21:34:01 +08:00
|
|
|
int flags, void *cb_data)
|
|
|
|
{
|
|
|
|
struct warn_if_dangling_data *d = cb_data;
|
|
|
|
const char *resolves_to;
|
2012-04-10 13:30:13 +08:00
|
|
|
|
2015-11-09 21:34:01 +08:00
|
|
|
if (!(flags & REF_ISSYMREF))
|
|
|
|
return 0;
|
2012-04-10 13:30:13 +08:00
|
|
|
|
2024-05-07 15:11:53 +08:00
|
|
|
resolves_to = refs_resolve_ref_unsafe(get_main_ref_store(the_repository),
|
|
|
|
refname, 0, NULL, NULL);
|
2015-11-09 21:34:01 +08:00
|
|
|
if (!resolves_to
|
|
|
|
|| (d->refname
|
|
|
|
? strcmp(resolves_to, d->refname)
|
|
|
|
: !string_list_has_string(d->refnames, resolves_to))) {
|
|
|
|
return 0;
|
|
|
|
}
|
2012-04-10 13:30:13 +08:00
|
|
|
|
2015-11-09 21:34:01 +08:00
|
|
|
fprintf(d->fp, d->msg_fmt, refname);
|
|
|
|
fputc('\n', d->fp);
|
|
|
|
return 0;
|
2012-04-10 13:30:13 +08:00
|
|
|
}
|
|
|
|
|
2015-11-09 21:34:01 +08:00
|
|
|
void warn_dangling_symref(FILE *fp, const char *msg_fmt, const char *refname)
|
2012-04-25 06:45:11 +08:00
|
|
|
{
|
2015-11-09 21:34:01 +08:00
|
|
|
struct warn_if_dangling_data data;
|
|
|
|
|
|
|
|
data.fp = fp;
|
|
|
|
data.refname = refname;
|
|
|
|
data.refnames = NULL;
|
|
|
|
data.msg_fmt = msg_fmt;
|
2024-05-07 15:11:53 +08:00
|
|
|
refs_for_each_rawref(get_main_ref_store(the_repository),
|
|
|
|
warn_if_dangling_symref, &data);
|
2012-04-25 06:45:11 +08:00
|
|
|
}
|
|
|
|
|
2015-11-09 21:34:01 +08:00
|
|
|
void warn_dangling_symrefs(FILE *fp, const char *msg_fmt, const struct string_list *refnames)
|
2012-04-10 13:30:26 +08:00
|
|
|
{
|
2015-11-09 21:34:01 +08:00
|
|
|
struct warn_if_dangling_data data;
|
2012-04-10 13:30:26 +08:00
|
|
|
|
2015-11-09 21:34:01 +08:00
|
|
|
data.fp = fp;
|
|
|
|
data.refname = NULL;
|
|
|
|
data.refnames = refnames;
|
|
|
|
data.msg_fmt = msg_fmt;
|
2024-05-07 15:11:53 +08:00
|
|
|
refs_for_each_rawref(get_main_ref_store(the_repository),
|
|
|
|
warn_if_dangling_symref, &data);
|
2012-04-10 13:30:26 +08:00
|
|
|
}
|
|
|
|
|
2017-03-26 10:42:34 +08:00
|
|
|
int refs_for_each_tag_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
|
|
|
|
{
|
|
|
|
return refs_for_each_ref_in(refs, "refs/tags/", fn, cb_data);
|
|
|
|
}
|
|
|
|
|
|
|
|
int refs_for_each_branch_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
|
|
|
|
{
|
|
|
|
return refs_for_each_ref_in(refs, "refs/heads/", fn, cb_data);
|
2012-04-10 13:30:26 +08:00
|
|
|
}
|
|
|
|
|
2017-03-26 10:42:34 +08:00
|
|
|
int refs_for_each_remote_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
|
|
|
|
{
|
|
|
|
return refs_for_each_ref_in(refs, "refs/remotes/", fn, cb_data);
|
2011-12-12 13:38:15 +08:00
|
|
|
}
|
2012-04-10 13:30:26 +08:00
|
|
|
|
2024-05-07 15:11:39 +08:00
|
|
|
int refs_head_ref_namespaced(struct ref_store *refs, each_ref_fn fn, void *cb_data)
|
2015-11-09 21:34:01 +08:00
|
|
|
{
|
|
|
|
struct strbuf buf = STRBUF_INIT;
|
|
|
|
int ret = 0;
|
|
|
|
struct object_id oid;
|
|
|
|
int flag;
|
2007-04-17 09:42:50 +08:00
|
|
|
|
2015-11-09 21:34:01 +08:00
|
|
|
strbuf_addf(&buf, "%sHEAD", get_git_namespace());
|
2024-05-07 15:11:39 +08:00
|
|
|
if (!refs_read_ref_full(refs, buf.buf, RESOLVE_REF_READING, &oid, &flag))
|
2015-11-09 21:34:01 +08:00
|
|
|
ret = fn(buf.buf, &oid, flag, cb_data);
|
|
|
|
strbuf_release(&buf);
|
2007-04-17 09:42:50 +08:00
|
|
|
|
2015-11-09 21:34:01 +08:00
|
|
|
return ret;
|
2011-09-30 06:11:42 +08:00
|
|
|
}
|
2007-04-17 09:42:50 +08:00
|
|
|
|
log: add option to choose which refs to decorate
When `log --decorate` is used, git will decorate commits with all
available refs. While in most cases this may give the desired effect,
under some conditions it can lead to excessively verbose output.
Introduce two command line options, `--decorate-refs=<pattern>` and
`--decorate-refs-exclude=<pattern>` to allow the user to select which
refs are used in decoration.
When "--decorate-refs=<pattern>" is given, only the refs that match the
pattern are used in decoration. The refs that match the pattern when
"--decorate-refs-exclude=<pattern>" is given, are never used in
decoration.
These options follow the same convention for mixing negative and
positive patterns across the system, assuming that the inclusive default
is to match all refs available.
(1) if there is no positive pattern given, pretend as if an
inclusive default positive pattern was given;
(2) for each candidate, reject it if it matches no positive
pattern, or if it matches any one of the negative patterns.
The rules for what is considered a match are slightly different from the
rules used elsewhere.
Commands like `log --glob` assume a trailing '/*' when glob chars are
not present in the pattern. This makes it difficult to specify a single
ref. On the other hand, commands like `describe --match --all` allow
specifying exact refs, but do not have the convenience of allowing
"shorthand refs" like 'refs/heads' or 'heads' to refer to
'refs/heads/*'.
The commands introduced in this patch consider a match if:
(a) the pattern contains glob chars,
and regular pattern matching returns a match.
(b) the pattern does not contain glob chars,
and ref '<pattern>' exists, or if ref exists under '<pattern>/'
This allows both behaviours (single refs and shorthand refs) while
remaining compatible with existing commands.
Helped-by: Kevin Daudt <me@ikke.info>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Rafael Ascensão <rafa.almas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-22 05:33:41 +08:00
|
|
|
void normalize_glob_ref(struct string_list_item *item, const char *prefix,
|
|
|
|
const char *pattern)
|
|
|
|
{
|
|
|
|
struct strbuf normalized_pattern = STRBUF_INIT;
|
|
|
|
|
|
|
|
if (*pattern == '/')
|
|
|
|
BUG("pattern must not start with '/'");
|
|
|
|
|
2022-08-06 01:58:33 +08:00
|
|
|
if (prefix)
|
2017-11-22 05:33:41 +08:00
|
|
|
strbuf_addstr(&normalized_pattern, prefix);
|
2022-08-06 01:58:33 +08:00
|
|
|
else if (!starts_with(pattern, "refs/") &&
|
|
|
|
strcmp(pattern, "HEAD"))
|
log: add option to choose which refs to decorate
When `log --decorate` is used, git will decorate commits with all
available refs. While in most cases this may give the desired effect,
under some conditions it can lead to excessively verbose output.
Introduce two command line options, `--decorate-refs=<pattern>` and
`--decorate-refs-exclude=<pattern>` to allow the user to select which
refs are used in decoration.
When "--decorate-refs=<pattern>" is given, only the refs that match the
pattern are used in decoration. The refs that match the pattern when
"--decorate-refs-exclude=<pattern>" is given, are never used in
decoration.
These options follow the same convention for mixing negative and
positive patterns across the system, assuming that the inclusive default
is to match all refs available.
(1) if there is no positive pattern given, pretend as if an
inclusive default positive pattern was given;
(2) for each candidate, reject it if it matches no positive
pattern, or if it matches any one of the negative patterns.
The rules for what is considered a match are slightly different from the
rules used elsewhere.
Commands like `log --glob` assume a trailing '/*' when glob chars are
not present in the pattern. This makes it difficult to specify a single
ref. On the other hand, commands like `describe --match --all` allow
specifying exact refs, but do not have the convenience of allowing
"shorthand refs" like 'refs/heads' or 'heads' to refer to
'refs/heads/*'.
The commands introduced in this patch consider a match if:
(a) the pattern contains glob chars,
and regular pattern matching returns a match.
(b) the pattern does not contain glob chars,
and ref '<pattern>' exists, or if ref exists under '<pattern>/'
This allows both behaviours (allowing single refs and shorthand refs)
while remaining compatible with existing commands.
Helped-by: Kevin Daudt <me@ikke.info>
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Rafael Ascensão <rafa.almas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-22 05:33:41 +08:00
|
|
|
strbuf_addstr(&normalized_pattern, "refs/");
|
2022-08-06 01:58:33 +08:00
|
|
|
/*
|
|
|
|
* NEEDSWORK: Special case other symrefs such as REBASE_HEAD,
|
|
|
|
* MERGE_HEAD, etc.
|
|
|
|
*/
|
|
|
|
|
2017-11-22 05:33:41 +08:00
|
|
|
strbuf_addstr(&normalized_pattern, pattern);
|
|
|
|
strbuf_strip_suffix(&normalized_pattern, "/");
|
|
|
|
|
|
|
|
item->string = strbuf_detach(&normalized_pattern, NULL);
|
|
|
|
item->util = has_glob_specials(pattern) ? NULL : item->string;
|
|
|
|
strbuf_release(&normalized_pattern);
|
|
|
|
}
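The normalization steps visible in the function above can be sketched in isolation. This is a minimal sketch using plain C strings instead of strbuf; the name `normalize_pattern_sketch` and its buffer-based signature are hypothetical, and the special cases noted in the NEEDSWORK comment are not handled.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Git): writes the normalized form of
 * `pattern` into `out`. It follows the order of operations shown above:
 * use the caller's prefix if there is one, otherwise imply "refs/" unless
 * the pattern already starts with it or is exactly "HEAD"; then append
 * the pattern and strip a single trailing '/'.
 */
void normalize_pattern_sketch(char *out, size_t outlen,
			      const char *prefix, const char *pattern)
{
	const char *lead = "";
	size_t len;

	if (prefix)
		lead = prefix;
	else if (strncmp(pattern, "refs/", 5) != 0 &&
		 strcmp(pattern, "HEAD") != 0)
		lead = "refs/";

	snprintf(out, outlen, "%s%s", lead, pattern);

	/* Equivalent of strbuf_strip_suffix(..., "/"): at most one slash. */
	len = strlen(out);
	if (len > 0 && out[len - 1] == '/')
		out[len - 1] = '\0';
}
```

With a 64-byte buffer, "heads" would become "refs/heads", "refs/tags/" would become "refs/tags", and "HEAD" would be left alone.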
|
|
|
|
|
2024-05-07 15:11:39 +08:00
|
|
|
int refs_for_each_glob_ref_in(struct ref_store *refs, each_ref_fn fn,
|
|
|
|
const char *pattern, const char *prefix, void *cb_data)
|
2013-04-23 03:52:18 +08:00
|
|
|
{
|
2015-11-09 21:34:01 +08:00
|
|
|
struct strbuf real_pattern = STRBUF_INIT;
|
2023-07-11 05:12:05 +08:00
|
|
|
struct for_each_ref_filter filter;
|
2015-11-09 21:34:01 +08:00
|
|
|
int ret;
|
2010-01-20 17:48:25 +08:00
|
|
|
|
2013-12-01 04:55:40 +08:00
|
|
|
if (!prefix && !starts_with(pattern, "refs/"))
|
2010-01-20 17:48:25 +08:00
|
|
|
strbuf_addstr(&real_pattern, "refs/");
|
2010-01-20 17:48:26 +08:00
|
|
|
else if (prefix)
|
|
|
|
strbuf_addstr(&real_pattern, prefix);
|
2010-01-20 17:48:25 +08:00
|
|
|
strbuf_addstr(&real_pattern, pattern);
|
|
|
|
|
2010-03-13 01:04:26 +08:00
|
|
|
if (!has_glob_specials(pattern)) {
|
2010-02-04 13:23:18 +08:00
|
|
|
/* Append implied '/' '*' if not present. */
|
use strbuf_complete to conditionally append slash
When working with paths in strbufs, we frequently want to
ensure that a directory contains a trailing slash before
appending to it. We can shorten this code (and make the
intent more obvious) by calling strbuf_complete.
Most of these cases are trivially identical conversions, but
there are two things to note:
- in a few cases we did not check that the strbuf is
non-empty (which would lead to an out-of-bounds memory
access). These were generally not triggerable in
practice, either from earlier assertions, or typically
because we would have just fed the strbuf to opendir(),
which would choke on an empty path.
- in a few cases we indexed the buffer with "original_len"
or similar, rather than the current sb->len, and it is
not immediately obvious from the diff that they are the
same. In all of these cases, I manually verified that
the strbuf does not change between the assignment and
the strbuf_complete call.
This does not convert cases which look like:
if (sb->len && !is_dir_sep(sb->buf[sb->len - 1]))
strbuf_addch(sb, '/');
as those are obviously semantically different. Some of these
cases arguably should be doing that, but that is out of
scope for this change, which aims purely for cleanup with no
behavior change (and at least it will make such sites easier
to find and examine in the future, as we can grep for
strbuf_complete).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-25 05:08:35 +08:00
|
|
|
strbuf_complete(&real_pattern, '/');
|
2010-01-20 17:48:25 +08:00
|
|
|
/* No need to check for '*', there is none. */
|
|
|
|
strbuf_addch(&real_pattern, '*');
|
|
|
|
}
|
|
|
|
|
|
|
|
filter.pattern = real_pattern.buf;
|
2018-11-12 21:25:44 +08:00
|
|
|
filter.prefix = prefix;
|
2010-01-20 17:48:25 +08:00
|
|
|
filter.fn = fn;
|
|
|
|
filter.cb_data = cb_data;
|
2024-05-07 15:11:39 +08:00
|
|
|
ret = refs_for_each_ref(refs, for_each_filter_refs, &filter);
|
2010-01-20 17:48:25 +08:00
|
|
|
|
|
|
|
strbuf_release(&real_pattern);
|
|
|
|
return ret;
|
|
|
|
}
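How the effective glob is assembled in refs_for_each_glob_ref_in() can be illustrated without strbuf. The following is a sketch under stated assumptions: `build_glob_pattern_sketch` and `has_glob_chars_sketch` are hypothetical names, and the glob test assumes that has_glob_specials() looks for '?', '*' or '['.

```c
#include <stdio.h>
#include <string.h>

/* Assumption: a pattern is a glob if it contains '?', '*' or '['. */
int has_glob_chars_sketch(const char *pattern)
{
	return strpbrk(pattern, "?*[") != NULL;
}

/*
 * Hypothetical helper (not part of Git): builds the pattern that would be
 * handed to the filter. Without a prefix, "refs/" is implied unless the
 * pattern already starts with it. A pattern without glob characters gets
 * an implied trailing "/" followed by "*".
 */
void build_glob_pattern_sketch(char *out, size_t outlen,
			       const char *prefix, const char *pattern)
{
	const char *lead = "";
	size_t len;

	if (!prefix && strncmp(pattern, "refs/", 5) != 0)
		lead = "refs/";
	else if (prefix)
		lead = prefix;

	snprintf(out, outlen, "%s%s", lead, pattern);

	if (!has_glob_chars_sketch(pattern)) {
		len = strlen(out);
		/* Like strbuf_complete(): add '/' only if it is missing. */
		if (len > 0 && out[len - 1] != '/' && len + 1 < outlen) {
			out[len++] = '/';
			out[len] = '\0';
		}
		if (len + 1 < outlen) {
			out[len++] = '*';
			out[len] = '\0';
		}
	}
}
```

This shows why a bare "heads" ends up matching everything under refs/heads/, while an explicit glob such as "refs/tags/v*" is passed through unchanged.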
|
|
|
|
|
2024-05-07 15:11:39 +08:00
|
|
|
int refs_for_each_glob_ref(struct ref_store *refs, each_ref_fn fn,
|
|
|
|
const char *pattern, void *cb_data)
|
|
|
|
{
|
|
|
|
return refs_for_each_glob_ref_in(refs, fn, pattern, NULL, cb_data);
|
|
|
|
}
|
|
|
|
|
2009-05-14 05:22:04 +08:00
|
|
|
const char *prettify_refname(const char *name)
|
2009-03-09 09:06:05 +08:00
|
|
|
{
|
2017-03-23 23:50:12 +08:00
|
|
|
if (skip_prefix(name, "refs/heads/", &name) ||
|
|
|
|
skip_prefix(name, "refs/tags/", &name) ||
|
|
|
|
skip_prefix(name, "refs/remotes/", &name))
|
|
|
|
; /* nothing */
|
|
|
|
return name;
|
2009-03-09 09:06:05 +08:00
|
|
|
}
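The effect of prettify_refname() can be reproduced with standard library calls only. This is a sketch, not Git's implementation: `prettify_refname_sketch` is a hypothetical name, and strncmp() stands in for skip_prefix().

```c
#include <string.h>

/*
 * Hypothetical stand-in for prettify_refname(): returns a pointer into
 * `name` just past the first matching well-known prefix, or `name`
 * itself when none of the prefixes applies. No memory is allocated.
 */
const char *prettify_refname_sketch(const char *name)
{
	static const char *const prefixes[] = {
		"refs/heads/", "refs/tags/", "refs/remotes/",
	};
	size_t i;

	for (i = 0; i < sizeof(prefixes) / sizeof(prefixes[0]); i++) {
		size_t n = strlen(prefixes[i]);

		if (strncmp(name, prefixes[i], n) == 0)
			return name + n;
	}
	return name;
}
```

Because the result points into the caller's string, it stays valid only as long as the original name does.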
|
|
|
|
|
2014-01-14 11:16:07 +08:00
|
|
|
static const char *ref_rev_parse_rules[] = {
|
add refname_match()
We use at least two rulesets for matching abbreviated refnames with
full refnames (starting with 'refs/'). git-rev-parse and git-fetch
use slightly different rules.
This commit introduces a new function refname_match
(const char *abbrev_name, const char *full_name, const char **rules).
abbrev_name is expanded using the rules and matched against full_name.
If a match is found the function returns true. rules is a NULL-terminated
list of format patterns with "%.*s", for example:
const char *ref_rev_parse_rules[] = {
"%.*s",
"refs/%.*s",
"refs/tags/%.*s",
"refs/heads/%.*s",
"refs/remotes/%.*s",
"refs/remotes/%.*s/HEAD",
NULL
};
Asterisks are included in the format strings because this is the form
required in sha1_name.c. Sharing the list with the functions there is
a good idea to avoid duplicating the rules. Hopefully this
facilitates unified matching rules in the future.
This commit makes the rules used by rev-parse for resolving refs to
sha1s available for string comparison. Before this change, the rules
were buried in get_sha1*() and dwim_ref().
A follow-up commit will refactor the rules used by fetch.
refname_match() will be used for matching refspecs in git-send-pack.
Thanks to Daniel Barkalow <barkalow@iabervon.org> for pointing
out that ref_matches_abbrev in remote.c solves a similar problem
and care should be taken to avoid confusion.
Signed-off-by: Steffen Prohaska <prohaska@zib.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2007-11-11 22:01:46 +08:00
|
|
|
"%.*s",
|
|
|
|
"refs/%.*s",
|
|
|
|
"refs/tags/%.*s",
|
|
|
|
"refs/heads/%.*s",
|
|
|
|
"refs/remotes/%.*s",
|
|
|
|
"refs/remotes/%.*s/HEAD",
|
|
|
|
NULL
|
|
|
|
};
|
|
|
|
|
remote: make refspec follow the same disambiguation rule as local refs
When matching a non-wildcard LHS of a refspec against a list of
refs, find_ref_by_name_abbrev() returns the first ref that matches
using any DWIM rules used by refname_match() in refs.c, even if a
better match occurs later in the list of refs.
This causes unexpected behavior when (for example) fetching using
the refspec "refs/heads/s:<something>" from a remote with both
"refs/heads/refs/heads/s" and "refs/heads/s"; even if the former was
inadvertently created, one would still expect the latter to be
fetched. Similarly, when both a tag T and a branch T exist,
fetching T should favor the tag, just like how local refname
disambiguation rule works. But because the code walks over
ls-remote output from the remote, which happens to be sorted in
alphabetical order and has refs/heads/T before refs/tags/T, a
request to fetch T is (mis)interpreted as fetching refs/heads/T.
Update refname_match(), all of whose current callers care only if it
returns non-zero (i.e. matches) to see if an abbreviated name can
mean the full name being tested, so that it returns a positive
integer whose magnitude can be used to tell the precedence, and fix
the find_ref_by_name_abbrev() function not to stop at the first
match but find the match with the highest precedence.
This is based on an earlier work, which special cased only the exact
matches, by Jonathan Tan.
Helped-by: Jonathan Tan <jonathantanmy@google.com>
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-02 00:22:37 +08:00
|
|
|
#define NUM_REV_PARSE_RULES (ARRAY_SIZE(ref_rev_parse_rules) - 1)
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Is it possible that the caller meant full_name with abbrev_name?
|
|
|
|
* If so return a non-zero value to signal "yes"; the magnitude of
|
|
|
|
* the returned value gives the precedence used for disambiguation.
|
|
|
|
*
|
|
|
|
* If abbrev_name cannot mean full_name, return 0.
|
|
|
|
*/
|
2014-01-14 11:16:07 +08:00
|
|
|
int refname_match(const char *abbrev_name, const char *full_name)
|
2007-11-11 22:01:46 +08:00
|
|
|
{
|
|
|
|
const char **p;
|
|
|
|
const int abbrev_name_len = strlen(abbrev_name);
|
2018-08-02 00:22:37 +08:00
|
|
|
const int num_rules = NUM_REV_PARSE_RULES;
|
2007-11-11 22:01:46 +08:00
|
|
|
|
2018-08-02 00:22:37 +08:00
|
|
|
for (p = ref_rev_parse_rules; *p; p++)
|
|
|
|
if (!strcmp(full_name, mkpath(*p, abbrev_name_len, abbrev_name)))
|
|
|
|
return &ref_rev_parse_rules[num_rules] - p;
|
2007-11-11 22:01:46 +08:00
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
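The precedence value returned by refname_match() can be illustrated with a standalone sketch. To avoid passing non-literal format strings to printf-style functions, this version swaps the "%.*s" rules for prefix/suffix pairs; the names `rule_sketch`, `rules_sketch` and `refname_match_sketch` are hypothetical, and the rule order is copied from ref_rev_parse_rules above.

```c
#include <string.h>

/* One expansion rule: full name == prefix + abbreviated name + suffix. */
struct rule_sketch {
	const char *prefix;
	const char *suffix;
};

/* Same order as ref_rev_parse_rules; earlier rules rank higher. */
static const struct rule_sketch rules_sketch[] = {
	{ "", "" },
	{ "refs/", "" },
	{ "refs/tags/", "" },
	{ "refs/heads/", "" },
	{ "refs/remotes/", "" },
	{ "refs/remotes/", "/HEAD" },
};

#define NUM_RULES_SKETCH (sizeof(rules_sketch) / sizeof(rules_sketch[0]))

/*
 * Returns 0 if `abbrev` cannot mean `full`; otherwise a positive value
 * that is larger for rules listed earlier, mirroring
 * "&ref_rev_parse_rules[num_rules] - p" in the code above.
 */
int refname_match_sketch(const char *abbrev, const char *full)
{
	size_t alen = strlen(abbrev);
	size_t i;

	for (i = 0; i < NUM_RULES_SKETCH; i++) {
		size_t plen = strlen(rules_sketch[i].prefix);

		if (strncmp(full, rules_sketch[i].prefix, plen) == 0 &&
		    strncmp(full + plen, abbrev, alen) == 0 &&
		    strcmp(full + plen + alen, rules_sketch[i].suffix) == 0)
			return (int)(NUM_RULES_SKETCH - i);
	}
	return 0;
}
```

Under these rules a tag named T scores higher than a branch named T, so a caller that keeps the highest-scoring candidate favors the tag, as the commit message describes.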
|
|
|
|
|
2018-03-16 01:31:24 +08:00
|
|
|
/*
|
|
|
|
* Given a 'prefix' expand it by the rules in 'ref_rev_parse_rules' and add
|
|
|
|
* the results to 'prefixes'
|
|
|
|
*/
|
2020-07-29 04:25:12 +08:00
|
|
|
void expand_ref_prefix(struct strvec *prefixes, const char *prefix)
|
2018-03-16 01:31:24 +08:00
|
|
|
{
|
|
|
|
const char **p;
|
|
|
|
int len = strlen(prefix);
|
|
|
|
|
|
|
|
for (p = ref_rev_parse_rules; *p; p++)
|
2020-07-29 04:25:12 +08:00
|
|
|
strvec_pushf(prefixes, *p, len, prefix);
|
2018-03-16 01:31:24 +08:00
|
|
|
}
|
|
|
|
|
2020-12-11 19:36:57 +08:00
|
|
|
static const char default_branch_name_advice[] = N_(
|
|
|
|
"Using '%s' as the name for the initial branch. This default branch name\n"
|
|
|
|
"is subject to change. To configure the initial branch name to use in all\n"
|
|
|
|
"of your new repositories, which will suppress this warning, call:\n"
|
|
|
|
"\n"
|
|
|
|
"\tgit config --global init.defaultBranch <name>\n"
|
|
|
|
"\n"
|
|
|
|
"Names commonly chosen instead of 'master' are 'main', 'trunk' and\n"
|
|
|
|
"'development'. The just-created branch can be renamed via this command:\n"
|
|
|
|
"\n"
|
|
|
|
"\tgit branch -m <name>\n"
|
|
|
|
);
|
|
|
|
|
2020-12-11 19:36:56 +08:00
|
|
|
char *repo_default_branch_name(struct repository *r, int quiet)
|
2020-06-24 22:46:33 +08:00
|
|
|
{
|
|
|
|
const char *config_key = "init.defaultbranch";
|
|
|
|
const char *config_display_key = "init.defaultBranch";
|
|
|
|
char *ret = NULL, *full_ref;
|
2020-10-23 22:00:00 +08:00
|
|
|
const char *env = getenv("GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME");
|
2020-06-24 22:46:33 +08:00
|
|
|
|
2020-10-23 22:00:00 +08:00
|
|
|
if (env && *env)
|
|
|
|
ret = xstrdup(env);
|
|
|
|
else if (repo_config_get_string(r, config_key, &ret) < 0)
|
2020-06-24 22:46:33 +08:00
|
|
|
die(_("could not retrieve `%s`"), config_display_key);
|
|
|
|
|
2020-12-11 19:36:57 +08:00
|
|
|
if (!ret) {
|
2020-06-24 22:46:33 +08:00
|
|
|
ret = xstrdup("master");
|
2020-12-11 19:36:57 +08:00
|
|
|
if (!quiet)
|
|
|
|
advise(_(default_branch_name_advice), ret);
|
|
|
|
}
|
2020-06-24 22:46:33 +08:00
|
|
|
|
|
|
|
full_ref = xstrfmt("refs/heads/%s", ret);
|
|
|
|
if (check_refname_format(full_ref, 0))
|
|
|
|
die(_("invalid branch name: %s = %s"), config_display_key, ret);
|
|
|
|
free(full_ref);
|
|
|
|
|
|
|
|
return ret;
|
|
|
|
}
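The order in which repo_default_branch_name() consults its sources can be summarized in a small sketch. The function `pick_initial_branch_sketch` and its parameters are hypothetical; it takes the environment and config values as arguments instead of reading them, and it leaves out the refname validation and the advice message.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of Git): picks the initial branch name.
 * A non-empty environment override wins, then the configured value, and
 * finally the built-in fallback "master". `*used_fallback` is set to 1
 * only in the last case, which is where the advice would be printed
 * unless the caller asked for quiet operation.
 */
const char *pick_initial_branch_sketch(const char *env_value,
				       const char *config_value,
				       int *used_fallback)
{
	*used_fallback = 0;

	if (env_value && *env_value)
		return env_value;
	if (config_value)
		return config_value;

	*used_fallback = 1;
	return "master";
}
```

Note that an empty environment value is treated as unset, matching the `env && *env` test in the code above.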
|
|
|
|
|
2020-12-11 19:36:56 +08:00
|
|
|
const char *git_default_branch_name(int quiet)
|
2020-06-24 22:46:33 +08:00
|
|
|
{
|
|
|
|
static char *ret;
|
|
|
|
|
|
|
|
if (!ret)
|
2020-12-11 19:36:56 +08:00
|
|
|
ret = repo_default_branch_name(the_repository, quiet);
|
2020-06-24 22:46:33 +08:00
|
|
|
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
2011-10-13 01:35:38 +08:00
|
|
|
/*
|
|
|
|
* *string and *len will only be substituted, and *string returned (for
|
|
|
|
* later free()ing) if the string passed in is a magic short-hand form
|
|
|
|
* to name a branch.
|
|
|
|
*/
|
2019-04-06 19:34:26 +08:00
|
|
|
static char *substitute_branch_name(struct repository *r,
|
2020-09-02 06:28:09 +08:00
|
|
|
const char **string, int *len,
|
|
|
|
int nonfatal_dangling_mark)
|
2011-10-13 01:35:38 +08:00
|
|
|
{
|
|
|
|
struct strbuf buf = STRBUF_INIT;
|
2020-09-02 06:28:09 +08:00
|
|
|
struct interpret_branch_name_options options = {
|
|
|
|
.nonfatal_dangling_mark = nonfatal_dangling_mark
|
|
|
|
};
|
2020-09-02 06:28:07 +08:00
|
|
|
int ret = repo_interpret_branch_name(r, *string, *len, &buf, &options);
|
2011-10-13 01:35:38 +08:00
|
|
|
|
|
|
|
if (ret == *len) {
|
|
|
|
size_t size;
|
|
|
|
*string = strbuf_detach(&buf, &size);
|
|
|
|
*len = size;
|
|
|
|
return (char *)*string;
|
|
|
|
}
|
|
|
|
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
2019-04-06 19:34:28 +08:00
|
|
|
int repo_dwim_ref(struct repository *r, const char *str, int len,
|
2020-09-02 06:28:09 +08:00
|
|
|
struct object_id *oid, char **ref, int nonfatal_dangling_mark)
|
2011-10-13 01:35:38 +08:00
|
|
|
{
|
2020-09-02 06:28:09 +08:00
|
|
|
char *last_branch = substitute_branch_name(r, &str, &len,
|
|
|
|
nonfatal_dangling_mark);
|
2019-04-06 19:34:28 +08:00
|
|
|
int refs_found = expand_ref(r, str, len, oid, ref);
|
2016-06-12 18:54:02 +08:00
|
|
|
free(last_branch);
|
|
|
|
return refs_found;
|
|
|
|
}
|
|
|
|
|
2019-04-06 19:34:27 +08:00
|
|
|
int expand_ref(struct repository *repo, const char *str, int len,
|
|
|
|
struct object_id *oid, char **ref)
|
2016-06-12 18:54:02 +08:00
|
|
|
{
|
2011-10-13 01:35:38 +08:00
|
|
|
const char **p, *r;
|
|
|
|
int refs_found = 0;
|
2017-03-29 03:46:33 +08:00
|
|
|
struct strbuf fullref = STRBUF_INIT;
|
2011-10-13 01:35:38 +08:00
|
|
|
|
|
|
|
*ref = NULL;
|
|
|
|
for (p = ref_rev_parse_rules; *p; p++) {
|
2017-10-16 06:06:57 +08:00
|
|
|
struct object_id oid_from_ref;
|
|
|
|
struct object_id *this_result;
|
2011-10-13 01:35:38 +08:00
|
|
|
int flag;
|
2021-10-16 17:39:24 +08:00
|
|
|
struct ref_store *refs = get_main_ref_store(repo);
|
2011-10-13 01:35:38 +08:00
|
|
|
|
2017-10-16 06:06:57 +08:00
|
|
|
this_result = refs_found ? &oid_from_ref : oid;
|
2017-03-29 03:46:33 +08:00
|
|
|
strbuf_reset(&fullref);
|
|
|
|
strbuf_addf(&fullref, *p, len, str);
|
2021-10-16 17:39:27 +08:00
|
|
|
r = refs_resolve_ref_unsafe(refs, fullref.buf,
|
2021-10-16 17:39:24 +08:00
|
|
|
RESOLVE_REF_READING,
|
2022-01-26 22:37:01 +08:00
|
|
|
this_result, &flag);
|
2011-10-13 01:35:38 +08:00
|
|
|
if (r) {
|
|
|
|
if (!refs_found++)
|
|
|
|
*ref = xstrdup(r);
|
|
|
|
if (!warn_ambiguous_refs)
|
|
|
|
break;
|
2017-03-29 03:46:33 +08:00
|
|
|
} else if ((flag & REF_ISSYMREF) && strcmp(fullref.buf, "HEAD")) {
|
2018-07-21 15:49:35 +08:00
|
|
|
warning(_("ignoring dangling symref %s"), fullref.buf);
|
2017-03-29 03:46:33 +08:00
|
|
|
} else if ((flag & REF_ISBROKEN) && strchr(fullref.buf, '/')) {
|
2018-07-21 15:49:35 +08:00
|
|
|
warning(_("ignoring broken ref %s"), fullref.buf);
|
2011-10-20 04:55:49 +08:00
|
|
|
}
|
2011-10-13 01:35:38 +08:00
|
|
|
}
|
2017-03-29 03:46:33 +08:00
|
|
|
strbuf_release(&fullref);
|
2011-10-13 01:35:38 +08:00
|
|
|
return refs_found;
|
|
|
|
}
|
|
|
|
|
2019-04-06 19:34:29 +08:00
|
|
|
int repo_dwim_log(struct repository *r, const char *str, int len,
|
|
|
|
struct object_id *oid, char **log)
|
2011-10-13 01:35:38 +08:00
|
|
|
{
|
2019-04-06 19:34:29 +08:00
|
|
|
struct ref_store *refs = get_main_ref_store(r);
|
2020-09-02 06:28:09 +08:00
|
|
|
char *last_branch = substitute_branch_name(r, &str, &len, 0);
|
2011-10-13 01:35:38 +08:00
|
|
|
const char **p;
|
|
|
|
int logs_found = 0;
|
2017-03-29 03:46:33 +08:00
|
|
|
struct strbuf path = STRBUF_INIT;
|
2011-10-13 01:35:38 +08:00
|
|
|
|
|
|
|
*log = NULL;
|
|
|
|
for (p = ref_rev_parse_rules; *p; p++) {
|
2017-10-16 06:06:59 +08:00
|
|
|
struct object_id hash;
|
2011-10-13 01:35:38 +08:00
|
|
|
const char *ref, *it;
|
|
|
|
|
2017-03-29 03:46:33 +08:00
|
|
|
strbuf_reset(&path);
|
|
|
|
strbuf_addf(&path, *p, len, str);
|
2019-04-06 19:34:29 +08:00
|
|
|
ref = refs_resolve_ref_unsafe(refs, path.buf,
|
|
|
|
RESOLVE_REF_READING,
|
2022-01-26 22:37:01 +08:00
|
|
|
oid ? &hash : NULL, NULL);
|
2011-10-13 01:35:38 +08:00
|
|
|
if (!ref)
|
|
|
|
continue;
|
2019-04-06 19:34:29 +08:00
|
|
|
if (refs_reflog_exists(refs, path.buf))
|
2017-03-29 03:46:33 +08:00
|
|
|
it = path.buf;
|
2019-04-06 19:34:29 +08:00
|
|
|
else if (strcmp(ref, path.buf) &&
|
|
|
|
refs_reflog_exists(refs, ref))
|
2011-10-13 01:35:38 +08:00
|
|
|
it = ref;
|
|
|
|
else
|
|
|
|
continue;
|
|
|
|
if (!logs_found++) {
|
|
|
|
*log = xstrdup(it);
|
2021-08-23 19:36:08 +08:00
|
|
|
if (oid)
|
|
|
|
oidcpy(oid, &hash);
|
2011-10-13 01:35:38 +08:00
|
|
|
}
|
2015-11-09 21:34:01 +08:00
|
|
|
if (!warn_ambiguous_refs)
|
|
|
|
break;
|
2006-10-01 06:02:00 +08:00
|
|
|
}
|
2017-03-29 03:46:33 +08:00
|
|
|
strbuf_release(&path);
|
2015-11-09 21:34:01 +08:00
|
|
|
free(last_branch);
|
|
|
|
return logs_found;
|
2013-09-04 23:22:41 +08:00
|
|
|
}
|
|
|
|
|
2019-04-06 19:34:29 +08:00
|
|
|
int dwim_log(const char *str, int len, struct object_id *oid, char **log)
|
|
|
|
{
|
|
|
|
return repo_dwim_log(the_repository, str, len, oid, log);
|
|
|
|
}
|
|
|
|
|
2022-09-20 00:34:50 +08:00
|
|
|
int is_per_worktree_ref(const char *refname)
|
2015-07-31 14:06:18 +08:00
|
|
|
{
|
2020-07-28 00:25:47 +08:00
|
|
|
return starts_with(refname, "refs/worktree/") ||
|
|
|
|
starts_with(refname, "refs/bisect/") ||
|
|
|
|
starts_with(refname, "refs/rewritten/");
|
2015-07-31 14:06:18 +08:00
|
|
|
}
|
|
|
|
|
2024-05-15 14:51:05 +08:00
|
|
|
int is_pseudo_ref(const char *refname)
|
2024-05-15 14:51:01 +08:00
|
|
|
{
|
|
|
|
static const char * const pseudo_refs[] = {
|
|
|
|
"FETCH_HEAD",
|
|
|
|
"MERGE_HEAD",
|
|
|
|
};
|
|
|
|
size_t i;
|
|
|
|
|
|
|
|
for (i = 0; i < ARRAY_SIZE(pseudo_refs); i++)
|
|
|
|
if (!strcmp(refname, pseudo_refs[i]))
|
|
|
|
return 1;
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2024-05-15 14:50:42 +08:00
|
|
|
static int is_root_ref_syntax(const char *refname)
|
2015-07-31 14:06:18 +08:00
|
|
|
{
|
|
|
|
const char *c;
|
|
|
|
|
|
|
|
for (c = refname; *c; c++) {
|
|
|
|
if (!isupper(*c) && *c != '-' && *c != '_')
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
return 1;
|
|
|
|
}
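The two name-only checks above can be tried out in isolation. This is a sketch with hypothetical names (`is_pseudo_ref_sketch`, `is_root_ref_syntax_sketch`); the only deliberate difference from the code above is the cast to unsigned char before calling the standard isupper(), which is needed outside Git's own ctype wrappers.

```c
#include <ctype.h>
#include <string.h>

/* Pseudorefs are exactly "FETCH_HEAD" and "MERGE_HEAD". */
int is_pseudo_ref_sketch(const char *refname)
{
	return !strcmp(refname, "FETCH_HEAD") ||
	       !strcmp(refname, "MERGE_HEAD");
}

/*
 * A name has root-ref syntax if every character is an uppercase ASCII
 * letter, '-' or '_'. Like the loop above, an empty string passes
 * because there is no character to reject.
 */
int is_root_ref_syntax_sketch(const char *refname)
{
	const char *c;

	for (c = refname; *c; c++)
		if (!isupper((unsigned char)*c) && *c != '-' && *c != '_')
			return 0;
	return 1;
}
```

Both checks look only at the name; neither one reads the ref store, so they also accept names of refs that do not exist.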
|
|
|
|
|
refs: do not check ref existence in `is_root_ref()`
Before this patch series, root refs except for "HEAD" and our special
refs were classified as pseudorefs. Furthermore, our terminology
clarified that pseudorefs must not be symbolic refs. This restriction
is enforced in `is_root_ref()`, which explicitly checks that a supposed
root ref resolves to an object ID without recursing.
This has been extremely confusing right from the start because (in old
terminology) a ref name may sometimes be a pseudoref and sometimes not
depending on whether it is a symbolic or regular ref. This behaviour
does not seem reasonable at all and I very much doubt that it results in
anything sane.
Last but not least, the current behaviour can actually lead to a
segfault when calling `is_root_ref()` with a reference that either does
not exist or that is a symbolic ref because we never initialized `oid`,
but then read it via `is_null_oid()`.
We have now changed terminology to clarify that pseudorefs are really
only "MERGE_HEAD" and "FETCH_HEAD", whereas all the other refs that live
in the root of the ref hierarchy are just plain refs. Thus, we do not
need to check whether the ref is symbolic or not. In fact, we can now
avoid looking up the ref completely as the name is sufficient for us to
figure out whether something would be a root ref or not.
This change of course changes semantics for our callers. As there are
only three of them we can assess each of them individually:
- "ref-filter.c:ref_kind_from_refname()" uses it to classify refs.
It's clear that the intent is to classify based on the ref name,
only.
- "refs/reftable_backend.c:reftable_ref_iterator_advance()" uses it to
filter root refs. Again, using existence checks is pointless here as
the iterator has just surfaced the ref, so we know it does exist.
- "refs/files_backend.c:add_pseudoref_and_head_entries()" uses it to
determine whether it should add a ref to the root directory of its
iterator. This had the effect that we skipped over any files that
are either a symbolic ref, or which are not a ref at all.
The new behaviour is to include symbolic refs now, which aligns us
with the adapted terminology. Furthermore, files which look like
root refs but aren't are now marked as "broken". As broken refs
are not surfaced by our tooling, this should not lead to a change in
user-visible behaviour, but may cause us to emit warnings. This
feels like the right thing to do as we would otherwise just silently
ignore corrupted root refs completely.
So in all cases the existence check was either superfluous, not in line
with the adapted terminology or masked potential issues. This commit
thus changes the behaviour as proposed and drops the existence check
altogether.
Add a test that verifies that this does not change user-visible
behaviour. Namely, we still don't want to show broken refs to the user
by default in git-for-each-ref(1). What this does allow though is for
internal callers to surface dangling root refs when they pass in the
`DO_FOR_EACH_INCLUDE_BROKEN` flag.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-05-15 14:50:51 +08:00
|
|
|
int is_root_ref(const char *refname)
|
2024-02-23 18:01:08 +08:00
|
|
|
{
|
2024-05-15 14:50:42 +08:00
|
|
|
static const char *const irregular_root_refs[] = {
|
2024-05-15 14:50:56 +08:00
|
|
|
"HEAD",
|
2024-02-23 18:01:08 +08:00
|
|
|
"AUTO_MERGE",
|
|
|
|
"BISECT_EXPECTED_REV",
|
|
|
|
"NOTES_MERGE_PARTIAL",
|
|
|
|
"NOTES_MERGE_REF",
|
|
|
|
"MERGE_AUTOSTASH",
|
|
|
|
};
|
|
|
|
size_t i;
|
|
|
|
|
2024-05-15 14:51:01 +08:00
|
|
|
if (!is_root_ref_syntax(refname) ||
|
|
|
|
is_pseudo_ref(refname))
|
2024-02-23 18:01:08 +08:00
|
|
|
return 0;
|
|
|
|
|
2024-05-15 14:50:51 +08:00
|
|
|
if (ends_with(refname, "_HEAD"))
|
|
|
|
return 1;
|
2024-02-23 18:01:08 +08:00
|
|
|
|
2024-05-15 14:50:42 +08:00
|
|
|
for (i = 0; i < ARRAY_SIZE(irregular_root_refs); i++)
|
2024-05-15 14:50:51 +08:00
|
|
|
if (!strcmp(refname, irregular_root_refs[i]))
|
|
|
|
return 1;
|
2024-02-23 18:01:08 +08:00
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
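The name-only classification performed by is_root_ref() above can be tried in isolation with the sketch below. It is a minimal, standalone approximation rather than the code in refs.c: every `sketch_` helper is a hypothetical stand-in, and the syntax rule (only uppercase letters, '-' and '_') is an assumption about what is_root_ref_syntax() accepts. The pseudoref list ("FETCH_HEAD" and "MERGE_HEAD") follows the commit message.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Stand-in (assumption): root ref names use only uppercase letters, '-' and '_'. */
static int sketch_is_root_ref_syntax(const char *refname)
{
	const char *c;

	for (c = refname; *c; c++)
		if (!isupper((unsigned char)*c) && *c != '-' && *c != '_')
			return 0;
	return 1;
}

/* Per the commit message, only these two names are pseudorefs. */
static int sketch_is_pseudo_ref(const char *refname)
{
	return !strcmp(refname, "FETCH_HEAD") || !strcmp(refname, "MERGE_HEAD");
}

/* Stand-in for a suffix check on NUL-terminated strings. */
static int sketch_ends_with(const char *s, const char *suffix)
{
	size_t len = strlen(s), slen = strlen(suffix);

	return len >= slen && !strcmp(s + len - slen, suffix);
}

/* Classify by name only: no ref store lookup and no existence check. */
static int sketch_is_root_ref(const char *refname)
{
	static const char *const irregular[] = {
		"HEAD", "AUTO_MERGE", "BISECT_EXPECTED_REV",
		"NOTES_MERGE_PARTIAL", "NOTES_MERGE_REF", "MERGE_AUTOSTASH",
	};
	size_t i;

	assert(refname);
	if (!sketch_is_root_ref_syntax(refname) || sketch_is_pseudo_ref(refname))
		return 0;
	if (sketch_ends_with(refname, "_HEAD"))
		return 1;
	for (i = 0; i < sizeof(irregular) / sizeof(irregular[0]); i++)
		if (!strcmp(refname, irregular[i]))
			return 1;
	return 0;
}
```

Because the result depends only on the string, a name such as "ORIG_HEAD" is classified as a root ref whether or not such a ref currently exists, while "MERGE_HEAD" is rejected as a pseudoref.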
|
|
|
|
|
2022-09-20 00:34:50 +08:00
|
|
|
static int is_current_worktree_ref(const char *ref) {
|
2024-05-15 14:50:42 +08:00
|
|
|
return is_root_ref_syntax(ref) || is_per_worktree_ref(ref);
|
2018-10-21 16:08:54 +08:00
|
|
|
}
|
|
|
|
|
2022-09-20 00:34:50 +08:00
|
|
|
enum ref_worktree_type parse_worktree_ref(const char *maybe_worktree_ref,
|
|
|
|
const char **worktree_name, int *worktree_name_length,
|
|
|
|
const char **bare_refname)
|
2018-10-21 16:08:54 +08:00
|
|
|
{
|
2022-09-20 00:34:50 +08:00
|
|
|
const char *name_dummy;
|
|
|
|
int name_length_dummy;
|
|
|
|
const char *ref_dummy;
|
2018-10-21 16:08:54 +08:00
|
|
|
|
2022-09-20 00:34:50 +08:00
|
|
|
if (!worktree_name)
|
|
|
|
worktree_name = &name_dummy;
|
|
|
|
if (!worktree_name_length)
|
|
|
|
worktree_name_length = &name_length_dummy;
|
|
|
|
if (!bare_refname)
|
|
|
|
bare_refname = &ref_dummy;
|
|
|
|
|
|
|
|
if (skip_prefix(maybe_worktree_ref, "worktrees/", bare_refname)) {
|
|
|
|
const char *slash = strchr(*bare_refname, '/');
|
|
|
|
|
|
|
|
*worktree_name = *bare_refname;
|
|
|
|
if (!slash) {
|
|
|
|
*worktree_name_length = strlen(*worktree_name);
|
|
|
|
|
|
|
|
/* This is an error condition, and the caller can tell because the bare_refname is "" */
|
|
|
|
*bare_refname = *worktree_name + *worktree_name_length;
|
|
|
|
return REF_WORKTREE_OTHER;
|
|
|
|
}
|
|
|
|
|
|
|
|
*worktree_name_length = slash - *bare_refname;
|
|
|
|
*bare_refname = slash + 1;
|
|
|
|
|
|
|
|
if (is_current_worktree_ref(*bare_refname))
|
|
|
|
return REF_WORKTREE_OTHER;
|
|
|
|
}
|
|
|
|
|
|
|
|
*worktree_name = NULL;
|
|
|
|
*worktree_name_length = 0;
|
|
|
|
|
|
|
|
if (skip_prefix(maybe_worktree_ref, "main-worktree/", bare_refname)
|
|
|
|
&& is_current_worktree_ref(*bare_refname))
|
|
|
|
return REF_WORKTREE_MAIN;
|
|
|
|
|
|
|
|
*bare_refname = maybe_worktree_ref;
|
|
|
|
if (is_current_worktree_ref(maybe_worktree_ref))
|
|
|
|
return REF_WORKTREE_CURRENT;
|
|
|
|
|
|
|
|
return REF_WORKTREE_SHARED;
|
2015-07-31 14:06:18 +08:00
|
|
|
}
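The prefix handling in parse_worktree_ref() above can be summarised with a reduced sketch that only returns the classification and skips the out-parameters. All `sk_` names are hypothetical. The root ref syntax rule and the list of per-worktree namespaces ("refs/worktree/", "refs/bisect/", "refs/rewritten/") are assumptions about helpers defined elsewhere, not something shown in this section.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

enum sk_worktree_type { SK_CURRENT, SK_MAIN, SK_OTHER, SK_SHARED };

static int sk_starts_with(const char *s, const char *prefix)
{
	return !strncmp(s, prefix, strlen(prefix));
}

/* Assumption: root ref names use only uppercase letters, '-' and '_'. */
static int sk_is_root_ref_syntax(const char *ref)
{
	for (; *ref; ref++)
		if (!isupper((unsigned char)*ref) && *ref != '-' && *ref != '_')
			return 0;
	return 1;
}

/* Assumption: these namespaces are private to each worktree. */
static int sk_is_per_worktree_ref(const char *ref)
{
	return sk_starts_with(ref, "refs/worktree/") ||
	       sk_starts_with(ref, "refs/bisect/") ||
	       sk_starts_with(ref, "refs/rewritten/");
}

static int sk_is_current_worktree_ref(const char *ref)
{
	return sk_is_root_ref_syntax(ref) || sk_is_per_worktree_ref(ref);
}

/* Mirrors the order of checks in the function above, minus out-parameters. */
static enum sk_worktree_type sk_classify(const char *ref)
{
	assert(ref);
	if (sk_starts_with(ref, "worktrees/")) {
		const char *slash = strchr(ref + strlen("worktrees/"), '/');

		if (!slash)
			return SK_OTHER; /* error case: nothing follows the worktree name */
		if (sk_is_current_worktree_ref(slash + 1))
			return SK_OTHER;
	}
	if (sk_starts_with(ref, "main-worktree/") &&
	    sk_is_current_worktree_ref(ref + strlen("main-worktree/")))
		return SK_MAIN;
	if (sk_is_current_worktree_ref(ref))
		return SK_CURRENT;
	return SK_SHARED;
}
```

Note the fall-through: a name like "worktrees/foo/refs/heads/main" carries a worktree prefix but names a shared ref, so it ends up classified as shared, as in the original control flow.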
|
|
|
|
|
2017-08-21 19:51:34 +08:00
|
|
|
long get_files_ref_lock_timeout_ms(void)
|
|
|
|
{
|
|
|
|
static int configured = 0;
|
|
|
|
|
|
|
|
/* The default timeout is 100 ms: */
|
|
|
|
static int timeout_ms = 100;
|
|
|
|
|
|
|
|
if (!configured) {
|
|
|
|
git_config_get_int("core.filesreflocktimeout", &timeout_ms);
|
|
|
|
configured = 1;
|
|
|
|
}
|
|
|
|
|
|
|
|
return timeout_ms;
|
|
|
|
}
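The function above reads its setting once and then serves the cached value. A minimal sketch of that read-once pattern follows. `sk_lookup_timeout()` is a hypothetical stand-in for the configuration lookup, and the value 250 is made up for illustration.

```c
#include <assert.h>

/* Counts how often the (hypothetical) configuration source is consulted. */
static int sk_lookup_calls;

/* Hypothetical stand-in for a config lookup; returns 0 when a value was found. */
static int sk_lookup_timeout(int *out)
{
	assert(out);
	sk_lookup_calls++;
	*out = 250; /* made-up configured value */
	return 0;
}

static long sk_get_timeout_ms(void)
{
	static int configured;
	static int timeout_ms = 100; /* default used when nothing is configured */

	if (!configured) {
		/* On lookup failure timeout_ms keeps its default. */
		sk_lookup_timeout(&timeout_ms);
		configured = 1;
	}
	return timeout_ms;
}
```

The static flag means later changes to the setting are not picked up within the same process, which is the trade-off this caching pattern accepts.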
|
|
|
|
|
2017-03-26 10:42:35 +08:00
|
|
|
int refs_delete_ref(struct ref_store *refs, const char *msg,
|
|
|
|
const char *refname,
|
2017-10-16 06:06:50 +08:00
|
|
|
const struct object_id *old_oid,
|
2017-03-26 10:42:35 +08:00
|
|
|
unsigned int flags)
|
2007-01-27 06:26:09 +08:00
|
|
|
{
|
2014-05-01 00:22:45 +08:00
|
|
|
struct ref_transaction *transaction;
|
2015-07-22 05:04:50 +08:00
|
|
|
struct strbuf err = STRBUF_INIT;
|
2007-01-27 06:26:10 +08:00
|
|
|
|
2022-04-14 06:51:33 +08:00
|
|
|
transaction = ref_store_transaction_begin(refs, &err);
|
2014-05-01 00:22:45 +08:00
|
|
|
if (!transaction ||
|
2017-10-16 06:06:53 +08:00
|
|
|
ref_transaction_delete(transaction, refname, old_oid,
|
2017-02-21 09:10:32 +08:00
|
|
|
flags, msg, &err) ||
|
2014-05-01 03:22:42 +08:00
|
|
|
ref_transaction_commit(transaction, &err)) {
|
2014-05-01 00:22:45 +08:00
|
|
|
error("%s", err.buf);
|
|
|
|
ref_transaction_free(transaction);
|
|
|
|
strbuf_release(&err);
|
2006-10-01 06:02:00 +08:00
|
|
|
return 1;
|
2007-01-27 06:26:09 +08:00
|
|
|
}
|
2015-11-09 21:34:01 +08:00
|
|
|
ref_transaction_free(transaction);
|
|
|
|
strbuf_release(&err);
|
2008-01-17 03:14:30 +08:00
|
|
|
return 0;
|
|
|
|
}
|
2007-01-27 06:26:09 +08:00
|
|
|
|
reflog: cleanse messages in the refs.c layer
Regarding reflog messages:
- We expect that a reflog message consists of a single line. The
file format used by the files backend may add a LF after the
message as a delimiter, and output by commands like "git log -g"
may complete such an incomplete line by adding a LF at the end,
but philosophically, the terminating LF is not a part of the
message.
- We however allow callers of refs API to supply a random sequence
of NUL terminated bytes. We cleanse caller-supplied message by
squashing a run of whitespaces into a SP, and by trimming trailing
whitespace, before storing the message. This is how we tolerate,
instead of erring out, a message with LF in it (be it at the end,
in the middle, or both).
Currently, the cleansing of the reflog message is done by the files
backend, before the log is written out. This is sufficient with the
current code, as that is the only backend that writes reflogs. But
new backends can be added that write reflogs, and we'd want the
resulting log message we would read out of "log -g" the same no
matter what backend is used, and moving the code to do so to the
generic layer is a way to do so.
An added benefit is that the "cleansing" function could be updated
later, independent from individual backends, to e.g. allow
multi-line log messages if we wanted to, and when that happens, it
would help a lot to ensure we covered all bases if the cleansing
function (which would be updated) is called from the generic layer.
Side note: I am not interested in supporting multi-line reflog
messages right at the moment (nobody is asking for it), but I
envision that instead of the "squash a run of whitespaces into a SP
and rtrim" cleansing, we can %urlencode problematic bytes in the
message *AND* append a SP at the end, when a new version of Git that
supports multi-line and/or verbatim reflog messages writes a reflog
record. The reading side can detect the presence of SP at the end
(which should have been rtrimmed out if it were written by existing
versions of Git) as a signal that decoding %urlencode recovers the
original reflog message.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-11 01:19:53 +08:00
|
|
|
static void copy_reflog_msg(struct strbuf *sb, const char *msg)
|
2007-07-29 08:17:17 +08:00
|
|
|
{
|
|
|
|
char c;
|
|
|
|
int wasspace = 1;
|
2007-01-27 06:26:10 +08:00
|
|
|
|
2007-07-29 08:17:17 +08:00
|
|
|
while ((c = *msg++)) {
|
|
|
|
if (wasspace && isspace(c))
|
|
|
|
continue;
|
|
|
|
wasspace = isspace(c);
|
|
|
|
if (wasspace)
|
|
|
|
c = ' ';
|
2018-07-11 05:08:22 +08:00
|
|
|
strbuf_addch(sb, c);
|
2015-07-22 05:04:50 +08:00
|
|
|
}
|
2018-07-11 05:08:22 +08:00
|
|
|
strbuf_rtrim(sb);
|
2007-07-29 08:17:17 +08:00
|
|
|
}
|
2007-01-27 06:26:10 +08:00
|
|
|
|
2020-07-11 01:19:53 +08:00
|
|
|
static char *normalize_reflog_message(const char *msg)
|
|
|
|
{
|
|
|
|
struct strbuf sb = STRBUF_INIT;
|
|
|
|
|
|
|
|
if (msg && *msg)
|
|
|
|
copy_reflog_msg(&sb, msg);
|
|
|
|
return strbuf_detach(&sb, NULL);
|
|
|
|
}
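The cleansing done by copy_reflog_msg() and normalize_reflog_message() above (squash each whitespace run into one space, drop leading and trailing whitespace) can be sketched without strbuf. The function name `sk_normalize_msg()` is hypothetical, and it uses plain malloc() where the original uses Git's strbuf API.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Returns a newly allocated, cleansed copy of msg (caller frees).
 * A NULL or empty msg yields an empty string.
 */
static char *sk_normalize_msg(const char *msg)
{
	size_t len = msg ? strlen(msg) : 0;
	char *out = malloc(len + 1);
	size_t n = 0;
	int wasspace = 1; /* start "in whitespace" so a leading run is dropped */

	if (!out)
		return NULL;
	for (; msg && *msg; msg++) {
		int sp = isspace((unsigned char)*msg);

		if (wasspace && sp)
			continue; /* skip the rest of a whitespace run */
		wasspace = sp;
		out[n++] = sp ? ' ' : *msg;
	}
	while (n && out[n - 1] == ' ')
		n--; /* trim the single trailing space, if any */
	assert(n <= len);
	out[n] = '\0';
	return out;
}
```

With this sketch, a message containing tabs and embedded newlines comes out as a single line, which is the property the commit message asks every backend to share.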
|
|
|
|
|
2015-11-10 19:42:36 +08:00
|
|
|
int should_autocreate_reflog(const char *refname)
|
2015-07-22 05:04:51 +08:00
|
|
|
{
|
2017-01-27 18:09:47 +08:00
|
|
|
switch (log_all_ref_updates) {
|
|
|
|
case LOG_REFS_ALWAYS:
|
|
|
|
return 1;
|
|
|
|
case LOG_REFS_NORMAL:
|
|
|
|
return starts_with(refname, "refs/heads/") ||
|
|
|
|
starts_with(refname, "refs/remotes/") ||
|
|
|
|
starts_with(refname, "refs/notes/") ||
|
|
|
|
!strcmp(refname, "HEAD");
|
|
|
|
default:
|
2015-07-22 05:04:51 +08:00
|
|
|
return 0;
|
2017-01-27 18:09:47 +08:00
|
|
|
}
|
2015-07-22 05:04:51 +08:00
|
|
|
}
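The policy in should_autocreate_reflog() above can be exercised with a variant that takes the policy as an argument instead of reading the global `log_all_ref_updates`. The enum and function names below are hypothetical and only mirror the three cases visible in the switch.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical mirror of the three policies handled by the switch above. */
enum sk_log_refs { SK_LOG_REFS_NONE, SK_LOG_REFS_NORMAL, SK_LOG_REFS_ALWAYS };

static int sk_starts_with(const char *s, const char *prefix)
{
	return !strncmp(s, prefix, strlen(prefix));
}

static int sk_should_autocreate_reflog(enum sk_log_refs policy,
				       const char *refname)
{
	assert(refname);
	switch (policy) {
	case SK_LOG_REFS_ALWAYS:
		return 1;
	case SK_LOG_REFS_NORMAL:
		/* Only branches, remote-tracking refs, notes and HEAD. */
		return sk_starts_with(refname, "refs/heads/") ||
		       sk_starts_with(refname, "refs/remotes/") ||
		       sk_starts_with(refname, "refs/notes/") ||
		       !strcmp(refname, "HEAD");
	default:
		return 0;
	}
}
```

Passing the policy explicitly makes the three behaviours easy to compare: a tag only gets a reflog under the "always" policy, while HEAD gets one under both "normal" and "always".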
|
|
|
|
|
2014-07-16 07:02:38 +08:00
|
|
|
int is_branch(const char *refname)
|
2008-01-16 07:50:17 +08:00
|
|
|
{
|
2013-12-01 04:55:40 +08:00
|
|
|
return !strcmp(refname, "HEAD") || starts_with(refname, "refs/heads/");
|
2007-01-27 06:26:09 +08:00
|
|
|
}
|
|
|
|
|
2014-06-04 00:09:59 +08:00
|
|
|
struct read_ref_at_cb {
|
|
|
|
const char *refname;
|
2017-04-27 03:29:31 +08:00
|
|
|
timestamp_t at_time;
|
2014-06-04 00:09:59 +08:00
|
|
|
int cnt;
|
|
|
|
int reccnt;
|
2017-10-16 06:07:03 +08:00
|
|
|
struct object_id *oid;
|
2014-06-04 00:09:59 +08:00
|
|
|
int found_it;
|
|
|
|
|
2017-10-16 06:07:03 +08:00
|
|
|
struct object_id ooid;
|
|
|
|
struct object_id noid;
|
2014-06-04 00:09:59 +08:00
|
|
|
int tz;
|
2017-04-27 03:29:31 +08:00
|
|
|
timestamp_t date;
|
2014-06-04 00:09:59 +08:00
|
|
|
char **msg;
|
2017-04-27 03:29:31 +08:00
|
|
|
timestamp_t *cutoff_time;
|
2014-06-04 00:09:59 +08:00
|
|
|
int *cutoff_tz;
|
|
|
|
int *cutoff_cnt;
|
|
|
|
};
|
|
|
|
|
2021-01-06 17:01:53 +08:00
|
|
|
static void set_read_ref_cutoffs(struct read_ref_at_cb *cb,
|
|
|
|
timestamp_t timestamp, int tz, const char *message)
|
|
|
|
{
|
|
|
|
if (cb->msg)
|
|
|
|
*cb->msg = xstrdup(message);
|
|
|
|
if (cb->cutoff_time)
|
|
|
|
*cb->cutoff_time = timestamp;
|
|
|
|
if (cb->cutoff_tz)
|
|
|
|
*cb->cutoff_tz = tz;
|
|
|
|
if (cb->cutoff_cnt)
|
|
|
|
*cb->cutoff_cnt = cb->reccnt;
|
|
|
|
}
|
|
|
|
|
2017-02-22 07:47:32 +08:00
|
|
|
static int read_ref_at_ent(struct object_id *ooid, struct object_id *noid,
|
2022-08-26 01:09:48 +08:00
|
|
|
const char *email UNUSED,
|
2022-08-19 18:08:35 +08:00
|
|
|
timestamp_t timestamp, int tz,
|
|
|
|
const char *message, void *cb_data)
|
2014-06-04 00:09:59 +08:00
|
|
|
{
|
|
|
|
struct read_ref_at_cb *cb = cb_data;
|
|
|
|
|
|
|
|
cb->tz = tz;
|
|
|
|
cb->date = timestamp;
|
|
|
|
|
Revert "refs: allow @{n} to work with n-sized reflog"
This reverts commit 6436a20284f33d42103cac93bd82e65bebb31526.
The idea of that commit is that if read_ref_at() is counting back to the
Nth reflog but the reflog is short by one entry (e.g., because it was
pruned), we can find the oid of the missing entry by looking at the
"before" oid value of the entry that comes after it (whereas before, we
looked at the "after" value of each entry and complained that we
couldn't find the one from before the truncation).
This works fine for resolving the oid of ref@{n}, as it is used by
get_oid_basic(), which does not look at any other aspect of the reflog
we found (e.g., its timestamp or message). But there's another caller of
read_ref_at(): in show-branch we use it to walk over the reflog, and we
do care about the reflog entry. And so that commit broke "show-branch
--reflog"; it shows the reflog message for ref@{0} as ref@{1}, ref@{1}
as ref@{2}, and so on.
For example, in the new test in t3202 we produce:
! [branch@{0}] (0 seconds ago) commit: three
! [branch@{1}] (0 seconds ago) commit: three
! [branch@{2}] (60 seconds ago) commit: two
! [branch@{3}] (2 minutes ago) reset: moving to HEAD^
instead of the correct:
! [branch@{0}] (0 seconds ago) commit: three
! [branch@{1}] (60 seconds ago) commit: two
! [branch@{2}] (2 minutes ago) reset: moving to HEAD^
! [branch@{3}] (2 minutes ago) commit: one
But there's another bug, too: because it is looking at the "old" value
of the reflog after the one we're interested in, it has to special-case
ref@{0} (since there isn't anything after it). That's why it doesn't
show the offset bug in the output above. But this special-case code
fails to handle the situation where the reflog is empty or missing; it
returns success even though the reflog message out-parameter has been
left uninitialized. You can't trigger this through get_oid_basic(), but
"show-branch --reflog" will pretty reliably segfault as it tries to
access the garbage pointer.
Fixing the segfault would be pretty easy. But the off-by-one problem is
inherent in this approach. So let's start by reverting the commit to
give us a clean slate to work with.
This isn't a pure revert; all of the code changes are reverted, but for
the tests:
1. We'll flip the cases in t1508 to expect_failure; making these work
was the goal of 6436a2028, and we'll want to use them for our
replacement approach.
2. There's a test in t3202 for "show-branch --reflog", but it expects
the broken output! It was added by f2463490c4 (show-branch: show
reflog message, 2021-12-02) which was fixing another bug, and I
think the author simply didn't notice that the second line showed
the wrong reflog.
Rather than fixing that test, let's replace it with one that is
more thorough (while still covering the reflog message fix from
that commit). We'll use a longer reflog, which lets us see more
entries (thus making the "off by one" pattern much more clear). And
we'll use a more recent timestamp for "now" so that our relative
dates have more resolution. That lets us see that the reflog dates
are correct (whereas when you are 4 years away, two entries that
are 60 seconds apart will have the same "4 years ago" relative
date). Because we're adjusting the repository state, I've moved
this new test to the end of the script, leaving the other tests
undisturbed.
We'll also add a new test which covers the missing reflog case;
previously it segfaulted, but now it reports the empty reflog.
Reported-by: Yasushi SHOJI <yasushi.shoji@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 18:02:26 +08:00
|
|
|
if (timestamp <= cb->at_time || cb->cnt == 0) {
|
2021-01-06 17:01:53 +08:00
|
|
|
set_read_ref_cutoffs(cb, timestamp, tz, message);
|
2014-06-04 00:09:59 +08:00
|
|
|
/*
|
2017-11-05 16:42:09 +08:00
|
|
|
* we have not yet updated cb->[n|o]oid so they still
|
2014-06-04 00:09:59 +08:00
|
|
|
* hold the values for the previous record.
|
|
|
|
*/
|
2024-02-26 18:02:26 +08:00
|
|
|
if (!is_null_oid(&cb->ooid)) {
|
|
|
|
oidcpy(cb->oid, noid);
|
|
|
|
if (!oideq(&cb->ooid, noid))
|
|
|
|
warning(_("log for ref %s has gap after %s"),
|
convert "enum date_mode" into a struct
In preparation for adding date modes that may carry extra
information beyond the mode itself, this patch converts the
date_mode enum into a struct.
Most of the conversion is fairly straightforward; we pass
the struct as a pointer and dereference the type field where
necessary. Locations that declare a date_mode can use a "{}"
constructor. However, the tricky case is where we use the
enum labels as constants, like:
show_date(t, tz, DATE_NORMAL);
Ideally we could say:
show_date(t, tz, &{ DATE_NORMAL });
but of course C does not allow that. Likewise, we cannot
cast the constant to a struct, because we need to pass an
actual address. Our options are basically:
1. Manually add a "struct date_mode d = { DATE_NORMAL }"
definition to each caller, and pass "&d". This makes
the callers uglier, because they sometimes do not even
have their own scope (e.g., they are inside a switch
statement).
2. Provide a pre-made global "date_normal" struct that can
be passed by address. We'd also need "date_rfc2822",
"date_iso8601", and so forth. But at least the ugliness
is defined in one place.
3. Provide a wrapper that generates the correct struct on
the fly. The big downside is that we end up pointing to
a single global, which makes our wrapper non-reentrant.
But show_date is already not reentrant, so it does not
matter.
This patch implements 3, along with a minor macro to keep
the size of the callers sane.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
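Option 3 from the commit message above (a wrapper that fills in one shared struct, plus a small macro for callers) can be sketched as follows. The `sk_`/`SK_` names are hypothetical and the struct is reduced to a single field, so this illustrates the approach and is not the declaration used in Git.

```c
#include <assert.h>

enum sk_date_mode_type { SK_DATE_NORMAL, SK_DATE_RFC2822, SK_DATE_ISO8601 };

/* Reduced to one field; a real struct could carry extra per-mode data. */
struct sk_date_mode {
	enum sk_date_mode_type type;
};

/*
 * Returns a pointer to a single static instance, so every call
 * overwrites the previous result: convenient, but not reentrant.
 */
static struct sk_date_mode *sk_date_mode_from_type(enum sk_date_mode_type type)
{
	static struct sk_date_mode mode;

	assert(type >= SK_DATE_NORMAL && type <= SK_DATE_ISO8601);
	mode.type = type;
	return &mode;
}

/* Lets callers keep writing a short constant-like expression. */
#define SK_DATE_MODE(t) sk_date_mode_from_type(SK_DATE_##t)
```

A caller can pass `SK_DATE_MODE(RFC2822)` wherever a pointer to the struct is expected. Two calls return the same address, which is the non-reentrancy the commit message accepts because show_date is already not reentrant.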
2015-06-26 00:55:02 +08:00
|
|
|
cb->refname, show_date(cb->date, cb->tz, DATE_MODE(RFC2822)));
|
2024-02-26 18:02:26 +08:00
|
|
|
}
|
|
|
|
else if (cb->date == cb->at_time)
|
2017-10-16 06:07:03 +08:00
|
|
|
oidcpy(cb->oid, noid);
|
2018-08-29 05:22:48 +08:00
|
|
|
else if (!oideq(noid, cb->oid))
|
2018-07-21 15:49:35 +08:00
|
|
|
warning(_("log for ref %s unexpectedly ended on %s"),
|
2014-06-04 00:09:59 +08:00
|
|
|
cb->refname, show_date(cb->date, cb->tz,
|
convert "enum date_mode" into a struct
In preparation for adding date modes that may carry extra
information beyond the mode itself, this patch converts the
date_mode enum into a struct.
Most of the conversion is fairly straightforward; we pass
the struct as a pointer and dereference the type field where
necessary. Locations that declare a date_mode can use a "{}"
constructor. However, the tricky case is where we use the
enum labels as constants, like:
show_date(t, tz, DATE_NORMAL);
Ideally we could say:
show_date(t, tz, &{ DATE_NORMAL });
but of course C does not allow that. Likewise, we cannot
cast the constant to a struct, because we need to pass an
actual address. Our options are basically:
1. Manually add a "struct date_mode d = { DATE_NORMAL }"
definition to each caller, and pass "&d". This makes
the callers uglier, because they sometimes do not even
have their own scope (e.g., they are inside a switch
statement).
2. Provide a pre-made global "date_normal" struct that can
be passed by address. We'd also need "date_rfc2822",
"date_iso8601", and so forth. But at least the ugliness
is defined in one place.
3. Provide a wrapper that generates the correct struct on
the fly. The big downside is that we end up pointing to
a single global, which makes our wrapper non-reentrant.
But show_date is already not reentrant, so it does not
matter.
This patch implements 3, along with a minor macro to keep
the size of the callers sane.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
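Option 3 above can be sketched roughly as follows. This is a simplified illustration, not the code from the patch: the enum members and the helper name date_mode_from_type() are stand-ins chosen for this sketch, and the real struct may carry more fields.

```c
/* Stand-in for the real enum; only three labels are modeled here. */
enum date_mode_type { DATE_NORMAL, DATE_RFC2822, DATE_ISO8601 };

struct date_mode {
	enum date_mode_type type;
};

/*
 * Build the struct "on the fly" by filling in a single static
 * instance and returning its address. Every call hands back the
 * same pointer, which is why the wrapper is not reentrant.
 */
static struct date_mode *date_mode_from_type(enum date_mode_type type)
{
	static struct date_mode mode;
	mode.type = type;
	return &mode;
}

/* Keeps call sites short: show_date(t, tz, DATE_MODE(NORMAL)) */
#define DATE_MODE(t) date_mode_from_type(DATE_##t)
```

Because all callers share one static struct, a pointer obtained from DATE_MODE() is only meaningful until the next call to the wrapper.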
2015-06-26 00:55:02 +08:00
|
|
|
DATE_MODE(RFC2822)));
|
Revert "refs: allow @{n} to work with n-sized reflog"
This reverts commit 6436a20284f33d42103cac93bd82e65bebb31526.
The idea of that commit is that if read_ref_at() is counting back to the
Nth reflog but the reflog is short by one entry (e.g., because it was
pruned), we can find the oid of the missing entry by looking at the
"before" oid value of the entry that comes after it (whereas before, we
looked at the "after" value of each entry and complained that we
couldn't find the one from before the truncation).
This works fine for resolving the oid of ref@{n}, as it is used by
get_oid_basic(), which does not look at any other aspect of the reflog
we found (e.g., its timestamp or message). But there's another caller of
read_ref_at(): in show-branch we use it to walk over the reflog, and we
do care about the reflog entry. And so that commit broke "show-branch
--reflog"; it shows the reflog message for ref@{0} as ref@{1}, ref@{1}
as ref@{2}, and so on.
For example, in the new test in t3202 we produce:
! [branch@{0}] (0 seconds ago) commit: three
! [branch@{1}] (0 seconds ago) commit: three
! [branch@{2}] (60 seconds ago) commit: two
! [branch@{3}] (2 minutes ago) reset: moving to HEAD^
instead of the correct:
! [branch@{0}] (0 seconds ago) commit: three
! [branch@{1}] (60 seconds ago) commit: two
! [branch@{2}] (2 minutes ago) reset: moving to HEAD^
! [branch@{3}] (2 minutes ago) commit: one
But there's another bug, too: because it is looking at the "old" value
of the reflog after the one we're interested in, it has to special-case
ref@{0} (since there isn't anything after it). That's why it doesn't
show the offset bug in the output above. But this special-case code
fails to handle the situation where the reflog is empty or missing; it
returns success even though the reflog message out-parameter has been
left uninitialized. You can't trigger this through get_oid_basic(), but
"show-branch --reflog" will pretty reliably segfault as it tries to
access the garbage pointer.
Fixing the segfault would be pretty easy. But the off-by-one problem is
inherent in this approach. So let's start by reverting the commit to
give us a clean slate to work with.
This isn't a pure revert; all of the code changes are reverted, but for
the tests:
1. We'll flip the cases in t1508 to expect_failure; making these work
was the goal of 6436a2028, and we'll want to use them for our
replacement approach.
2. There's a test in t3202 for "show-branch --reflog", but it expects
the broken output! It was added by f2463490c4 (show-branch: show
reflog message, 2021-12-02) which was fixing another bug, and I
think the author simply didn't notice that the second line showed
the wrong reflog.
Rather than fixing that test, let's replace it with one that is
more thorough (while still covering the reflog message fix from
that commit). We'll use a longer reflog, which lets us see more
entries (thus making the "off by one" pattern much more clear). And
we'll use a more recent timestamp for "now" so that our relative
dates have more resolution. That lets us see that the reflog dates
are correct (whereas when you are 4 years away, two entries that
are 60 seconds apart will have the same "4 years ago" relative
date). Because we're adjusting the repository state, I've moved
this new test to the end of the script, leaving the other tests
undisturbed.
We'll also add a new test which covers the missing reflog case;
previously it segfaulted, but now it reports the empty reflog.
Reported-by: Yasushi SHOJI <yasushi.shoji@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 18:02:26 +08:00
|
|
|
cb->reccnt++;
|
|
|
|
oidcpy(&cb->ooid, ooid);
|
|
|
|
oidcpy(&cb->noid, noid);
|
2014-06-04 00:09:59 +08:00
|
|
|
cb->found_it = 1;
|
2024-02-26 18:02:26 +08:00
|
|
|
return 1;
|
2014-06-04 00:09:59 +08:00
|
|
|
}
|
2021-01-06 17:01:53 +08:00
|
|
|
cb->reccnt++;
|
2017-10-16 06:07:03 +08:00
|
|
|
oidcpy(&cb->ooid, ooid);
|
|
|
|
oidcpy(&cb->noid, noid);
|
2024-02-26 18:02:26 +08:00
|
|
|
if (cb->cnt > 0)
|
|
|
|
cb->cnt--;
|
|
|
|
return 0;
|
2014-06-04 00:09:59 +08:00
|
|
|
}
|
|
|
|
|
2017-02-22 07:47:32 +08:00
|
|
|
static int read_ref_at_ent_oldest(struct object_id *ooid, struct object_id *noid,
|
2022-08-26 01:09:48 +08:00
|
|
|
const char *email UNUSED,
|
2022-08-19 18:08:35 +08:00
|
|
|
timestamp_t timestamp, int tz,
|
|
|
|
const char *message, void *cb_data)
|
2014-06-04 00:09:59 +08:00
|
|
|
{
|
|
|
|
struct read_ref_at_cb *cb = cb_data;
|
|
|
|
|
2021-01-06 17:01:53 +08:00
|
|
|
set_read_ref_cutoffs(cb, timestamp, tz, message);
|
2017-10-16 06:07:03 +08:00
|
|
|
oidcpy(cb->oid, ooid);
|
get_oid_basic(): special-case ref@{n} for oldest reflog entry
The goal of 6436a20284 (refs: allow @{n} to work with n-sized reflog,
2021-01-07) was that if we have "n" entries in a reflog, we should still
be able to resolve ref@{n} by looking at the "old" value of the oldest
entry.
Commit 6436a20284 tried to put the logic into read_ref_at() by shifting
its idea of "n" by one. But we reverted that in the previous commit,
since it led to bugs in other callers which cared about the details of
the reflog entry we found. Instead, let's put the special case into the
caller that resolves @{n}, as it cares only about the oid.
read_ref_at() is even kind enough to return the "old" value from the
final reflog; it just returns "1" to signal to us that we ran off the
end of the reflog. But we can notice in the caller that we read just
enough records for that "old" value to be the one we're looking for, and
use it.
Note that read_ref_at() could notice this case, too, and just return 0.
But we don't want to do that, because the caller must be made aware that
we only found the oid, not an actual reflog entry (and the call sites in
show-branch do care about this).
There is one complication, though. When read_ref_at() hits a truncated
reflog, it will return the "old" value of the oldest entry only if it is
not the null oid. Otherwise, it actually returns the "new" value from
that entry! This bit of fudging is due to d1a4489a56 (avoid null SHA1 in
oldest reflog, 2008-07-08), where asking for "ref@{20.years.ago}" for a
ref created recently will produce the initial value as a convenience
(even though technically it did not exist 20 years ago).
But this convenience is only useful for time-based cutoffs. For
count-based cutoffs, get_oid_basic() has always simply complained about
going too far back:
$ git rev-parse HEAD@{20}
fatal: log for 'HEAD' only has 16 entries
and we should continue to do so, rather than returning a nonsense value
(there's even a test in t1508 already which covers this). So let's have
the d1a4489a56 code kick in only when doing timestamp-based cutoffs.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
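The caller-side special case described above can be modeled with a toy reflog. Everything in this sketch is hypothetical (struct toy_entry, toy_read_at(), toy_resolve_nth(), and plain ints standing in for object ids, with 0 playing the null oid); it only illustrates the "ran off the end by exactly one entry" check and is not the code in get_oid_basic().

```c
#include <stddef.h>

/* Toy reflog entry: ints stand in for oids, 0 is the "null oid". */
struct toy_entry {
	int old_id;
	int new_id;
};

/*
 * Toy stand-in for read_ref_at(). Entries are stored oldest first and
 * n counts back from the newest. Returns 0 and the "new" value when
 * entry n exists. Otherwise returns 1, reports the "old" value of the
 * oldest entry (if there is one), and stores the entry count in *seen.
 */
static int toy_read_at(const struct toy_entry *log, size_t nr, size_t n,
		       int *id, size_t *seen)
{
	*seen = nr;
	if (n < nr) {
		*id = log[nr - 1 - n].new_id;
		return 0;
	}
	if (nr)
		*id = log[0].old_id;
	return 1;
}

/* Returns 0 on success, -1 when n reaches too far back. */
static int toy_resolve_nth(const struct toy_entry *log, size_t nr, size_t n,
			   int *id)
{
	size_t seen;

	if (!toy_read_at(log, nr, n, id, &seen))
		return 0;
	/* Asked for entry n with exactly n entries: use the "old" value. */
	if (n == seen && *id != 0)
		return 0;
	return -1;
}
```

Keeping the check in the caller means the lookup helper still reports "not found", so a caller that needs the full reflog entry is not misled.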
2024-02-26 18:04:07 +08:00
|
|
|
if (cb->at_time && is_null_oid(cb->oid))
|
2017-10-16 06:07:03 +08:00
|
|
|
oidcpy(cb->oid, noid);
|
2014-06-04 00:09:59 +08:00
|
|
|
/* We just want the first entry */
|
|
|
|
return 1;
|
2007-01-19 17:19:05 +08:00
|
|
|
}
|
|
|
|
|
2019-04-06 19:34:30 +08:00
|
|
|
int read_ref_at(struct ref_store *refs, const char *refname,
|
|
|
|
unsigned int flags, timestamp_t at_time, int cnt,
|
2017-10-16 06:07:03 +08:00
|
|
|
struct object_id *oid, char **msg,
|
2017-04-27 03:29:31 +08:00
|
|
|
timestamp_t *cutoff_time, int *cutoff_tz, int *cutoff_cnt)
|
2006-05-17 17:56:09 +08:00
|
|
|
{
|
2014-06-04 00:09:59 +08:00
|
|
|
struct read_ref_at_cb cb;
|
2006-05-17 17:56:09 +08:00
|
|
|
|
2014-06-04 00:09:59 +08:00
|
|
|
memset(&cb, 0, sizeof(cb));
|
|
|
|
cb.refname = refname;
|
|
|
|
cb.at_time = at_time;
|
|
|
|
cb.cnt = cnt;
|
|
|
|
cb.msg = msg;
|
|
|
|
cb.cutoff_time = cutoff_time;
|
|
|
|
cb.cutoff_tz = cutoff_tz;
|
|
|
|
cb.cutoff_cnt = cutoff_cnt;
|
2017-10-16 06:07:03 +08:00
|
|
|
cb.oid = oid;
|
2014-06-04 00:09:59 +08:00
|
|
|
|
2019-04-06 19:34:30 +08:00
|
|
|
refs_for_each_reflog_ent_reverse(refs, refname, read_ref_at_ent, &cb);
|
2014-06-04 00:09:59 +08:00
|
|
|
|
2014-09-19 11:45:37 +08:00
|
|
|
if (!cb.reccnt) {
|
read_ref_at(): special-case ref@{0} for an empty reflog
The previous commit special-cased get_oid_basic()'s handling of ref@{n}
for a reflog with n entries. But its special case doesn't work for
ref@{0} in an empty reflog, because read_ref_at() dies when it notices
the empty reflog!
We can make this work by special-casing this in read_ref_at(). It's
somewhat gross, for two reasons:
1. We have no reflog entry to describe in the "msg" out-parameter. So
we have to leave it uninitialized or make something up.
2. Likewise, we have no oid to put in the "oid" out-parameter. Leaving
it untouched is actually the best thing here, as all of the callers
will have initialized it with the current ref value via
repo_dwim_log(). This is rather subtle, but it is how things worked
in 6436a20284 (refs: allow @{n} to work with n-sized reflog,
2021-01-07) before we reverted it.
The key difference from 6436a20284 here is that we'll return "1" to
indicate that we _didn't_ find the requested reflog entry. Coupled with
the special-casing in get_oid_basic() in the previous commit, that's
enough to make looking up ref@{0} work, and we can flip 6436a20284's
test back to expect_success.
It also means that the call in show-branch which segfaulted with
6436a20284 (and which is now tested in t3202) remains OK. The caller
notices that we could not find any reflog entry, and so it breaks out of
its loop, showing nothing. This is different from the current behavior
of producing an error, but it's just as reasonable (and is exactly what
we'd do if you asked it to walk starting at ref@{1} but there was only 1
entry).
Thus nobody should actually look at the reflog entry info we return. But
we'll still put in some fake values just to be on the safe side, since
this is such a subtle and confusing interface. Likewise, we'll document
what's going on in a comment above the function declaration. If this
were a function with a lot of callers, the footgun would probably not be
worth it. But it has only ever had two callers in its 18-year existence,
and it seems unlikely to grow more. So let's hold our noses and let
users enjoy the convenience of a simulated ref@{0}.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-02-26 18:08:03 +08:00
|
|
|
if (cnt == 0) {
|
|
|
|
/*
|
|
|
|
* The caller asked for ref@{0}, and we had no entries.
|
|
|
|
* It's a bit subtle, but in practice all callers have
|
|
|
|
* prepped the "oid" field with the current value of
|
|
|
|
* the ref, which is the most reasonable fallback.
|
|
|
|
*
|
|
|
|
* We'll put dummy values into the out-parameters (so
|
|
|
|
* they're not just uninitialized garbage), and the
|
|
|
|
* caller can take our return value as a hint that
|
|
|
|
* we did not find any such reflog.
|
|
|
|
*/
|
|
|
|
set_read_ref_cutoffs(&cb, 0, 0, "empty reflog");
|
|
|
|
return 1;
|
|
|
|
}
|
2017-07-14 07:49:29 +08:00
|
|
|
if (flags & GET_OID_QUIETLY)
|
2014-09-19 11:45:37 +08:00
|
|
|
exit(128);
|
|
|
|
else
|
2018-07-21 15:49:35 +08:00
|
|
|
die(_("log for %s is empty"), refname);
|
2014-09-19 11:45:37 +08:00
|
|
|
}
|
2014-06-04 00:09:59 +08:00
|
|
|
if (cb.found_it)
|
|
|
|
return 0;
|
|
|
|
|
2019-04-06 19:34:30 +08:00
|
|
|
refs_for_each_reflog_ent(refs, refname, read_ref_at_ent_oldest, &cb);
|
2006-05-17 17:56:09 +08:00
|
|
|
|
2007-01-19 17:19:05 +08:00
|
|
|
return 1;
|
2006-05-17 17:56:09 +08:00
|
|
|
}
|
2006-12-18 17:18:16 +08:00
|
|
|
|
2017-03-26 10:42:35 +08:00
|
|
|
struct ref_transaction *ref_store_transaction_begin(struct ref_store *refs,
|
|
|
|
struct strbuf *err)
|
2014-04-07 21:48:10 +08:00
|
|
|
{
|
2017-03-26 10:42:35 +08:00
|
|
|
struct ref_transaction *tr;
|
2014-08-29 07:42:37 +08:00
|
|
|
assert(err);
|
|
|
|
|
2021-03-14 00:17:22 +08:00
|
|
|
CALLOC_ARRAY(tr, 1);
|
2017-03-26 10:42:35 +08:00
|
|
|
tr->ref_store = refs;
|
|
|
|
return tr;
|
|
|
|
}
|
|
|
|
|
2014-06-20 22:42:42 +08:00
|
|
|
void ref_transaction_free(struct ref_transaction *transaction)
|
2014-04-07 21:48:10 +08:00
|
|
|
{
|
2017-05-22 22:17:37 +08:00
|
|
|
size_t i;
|
2014-04-07 21:48:10 +08:00
|
|
|
|
2014-06-20 22:42:45 +08:00
|
|
|
if (!transaction)
|
|
|
|
return;
|
|
|
|
|
ref_transaction_prepare(): new optional step for reference updates
In the future, compound reference stores will sometimes need to modify
references in two different reference stores at the same time, meaning
that a single logical reference transaction might have to be
implemented as two internal sub-transactions. They won't want to call
`ref_transaction_commit()` for the two sub-transactions one after the
other, because that wouldn't be atomic (the first commit could succeed
and the second one fail). Instead, they will want to prepare both
sub-transactions (i.e., obtain any necessary locks and do any
pre-checks), and only if both prepare steps succeed, then commit both
sub-transactions.
Start preparing for that day by adding a new, optional
`ref_transaction_prepare()` step to the reference transaction
sequence, which obtains the locks and does any prechecks, reporting
any errors that occur. Also add a `ref_transaction_abort()` function
that can be used to abort a sub-transaction even if it has already
been prepared.
That is on the side of the public-facing API. On the side of the
`ref_store` VTABLE, get rid of `transaction_commit` and instead add
methods `transaction_prepare`, `transaction_finish`, and
`transaction_abort`. A `ref_transaction_commit()` now basically calls
methods `transaction_prepare` then `transaction_finish`.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
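The prepare-then-finish sequence motivated above can be sketched with two toy sub-transactions. All names here (struct toy_txn, toy_prepare(), toy_finish(), toy_abort(), toy_commit_both()) are hypothetical and the "lock" is just a flag; the sketch shows only the ordering, not the real ref_store methods.

```c
enum toy_state { TOY_OPEN, TOY_PREPARED, TOY_CLOSED };

struct toy_txn {
	enum toy_state state;
	int lock_available;	/* whether the prepare step can succeed */
	int committed;
};

/* Take locks and run pre-checks; nothing is written yet. */
static int toy_prepare(struct toy_txn *t)
{
	if (t->state != TOY_OPEN || !t->lock_available)
		return -1;
	t->state = TOY_PREPARED;
	return 0;
}

/* Abort works whether or not the transaction was prepared. */
static void toy_abort(struct toy_txn *t)
{
	t->state = TOY_CLOSED;
}

/* Make the prepared changes permanent. */
static void toy_finish(struct toy_txn *t)
{
	t->committed = 1;
	t->state = TOY_CLOSED;
}

/* Commit both sub-transactions, or neither of them. */
static int toy_commit_both(struct toy_txn *a, struct toy_txn *b)
{
	if (toy_prepare(a) || toy_prepare(b)) {
		toy_abort(a);
		toy_abort(b);
		return -1;
	}
	toy_finish(a);
	toy_finish(b);
	return 0;
}
```

If either prepare step fails, both sub-transactions are aborted before anything has been finished, which is the property that committing them one after the other cannot provide.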
2017-05-22 22:17:44 +08:00
|
|
|
switch (transaction->state) {
|
|
|
|
case REF_TRANSACTION_OPEN:
|
|
|
|
case REF_TRANSACTION_CLOSED:
|
|
|
|
/* OK */
|
|
|
|
break;
|
|
|
|
case REF_TRANSACTION_PREPARED:
|
2018-05-02 17:38:39 +08:00
|
|
|
BUG("free called on a prepared reference transaction");
|
2017-05-22 22:17:44 +08:00
|
|
|
break;
|
|
|
|
default:
|
2018-05-02 17:38:39 +08:00
|
|
|
BUG("unexpected reference transaction state");
|
2017-05-22 22:17:44 +08:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
2014-05-01 03:22:42 +08:00
|
|
|
for (i = 0; i < transaction->nr; i++) {
|
|
|
|
free(transaction->updates[i]->msg);
|
2024-05-07 20:58:56 +08:00
|
|
|
free((char *)transaction->updates[i]->new_target);
|
|
|
|
free((char *)transaction->updates[i]->old_target);
|
2014-04-07 21:48:14 +08:00
|
|
|
free(transaction->updates[i]);
|
2014-05-01 03:22:42 +08:00
|
|
|
}
|
2014-04-07 21:48:10 +08:00
|
|
|
free(transaction->updates);
|
|
|
|
free(transaction);
|
|
|
|
}
|
|
|
|
|
2016-04-25 17:39:54 +08:00
|
|
|
struct ref_update *ref_transaction_add_update(
|
|
|
|
struct ref_transaction *transaction,
|
|
|
|
const char *refname, unsigned int flags,
|
2017-10-16 06:06:53 +08:00
|
|
|
const struct object_id *new_oid,
|
|
|
|
const struct object_id *old_oid,
|
2024-05-07 20:58:52 +08:00
|
|
|
const char *new_target, const char *old_target,
|
2016-04-25 17:39:54 +08:00
|
|
|
const char *msg)
|
2014-04-07 21:48:10 +08:00
|
|
|
{
|
2016-02-23 06:44:32 +08:00
|
|
|
struct ref_update *update;
|
2016-04-25 17:39:54 +08:00
|
|
|
|
|
|
|
if (transaction->state != REF_TRANSACTION_OPEN)
|
2018-05-02 17:38:39 +08:00
|
|
|
BUG("update called for transaction that is not open");
|
2016-04-25 17:39:54 +08:00
|
|
|
|
2024-05-07 20:58:52 +08:00
|
|
|
if (old_oid && old_target)
|
|
|
|
BUG("only one of old_oid and old_target should be non NULL");
|
|
|
|
if (new_oid && new_target)
|
|
|
|
BUG("only one of new_oid and new_target should be non NULL");
|
|
|
|
|
2016-02-23 06:44:32 +08:00
|
|
|
FLEX_ALLOC_STR(update, refname, refname);
|
2014-04-07 21:48:10 +08:00
|
|
|
ALLOC_GROW(transaction->updates, transaction->nr + 1, transaction->alloc);
|
|
|
|
transaction->updates[transaction->nr++] = update;
|
2016-04-25 17:39:54 +08:00
|
|
|
|
|
|
|
update->flags = flags;
|
|
|
|
|
2024-05-07 20:58:56 +08:00
|
|
|
update->new_target = xstrdup_or_null(new_target);
|
|
|
|
update->old_target = xstrdup_or_null(old_target);
|
|
|
|
if ((flags & REF_HAVE_NEW) && new_oid)
|
2017-10-16 06:06:53 +08:00
|
|
|
oidcpy(&update->new_oid, new_oid);
|
2024-05-07 20:58:56 +08:00
|
|
|
if ((flags & REF_HAVE_OLD) && old_oid)
|
2017-10-16 06:06:53 +08:00
|
|
|
oidcpy(&update->old_oid, old_oid);
|
2024-05-07 20:58:56 +08:00
|
|
|
|
reflog: cleanse messages in the refs.c layer
Regarding reflog messages:
- We expect that a reflog message consists of a single line. The
file format used by the files backend may add a LF after the
message as a delimiter, and output by commands like "git log -g"
may complete such an incomplete line by adding a LF at the end,
but philosophically, the terminating LF is not a part of the
message.
- We do, however, allow callers of the refs API to supply an arbitrary
NUL-terminated sequence of bytes. We cleanse the caller-supplied
message by squashing each run of whitespace into a single SP, and by
trimming trailing whitespace, before storing the message. This is how
we tolerate, instead of erroring out, a message with LF in it (be it
at the end, in the middle, or both).
Currently, the cleansing of the reflog message is done by the files
backend, before the log is written out. This is sufficient with the
current code, as that is the only backend that writes reflogs. But
new backends that write reflogs can be added, and we'd want the
log message read back by "log -g" to be the same no matter which
backend is used. Moving the cleansing code to the generic layer
is a way to ensure that.
An added benefit is that the "cleansing" function could be updated
later, independent from individual backends, to e.g. allow
multi-line log messages if we wanted to, and when that happens, it
would help a lot to ensure we covered all bases if the cleansing
function (which would be updated) is called from the generic layer.
Side note: I am not interested in supporting multi-line reflog
messages right at the moment (nobody is asking for it), but I
envision that instead of the "squash a run of whitespaces into a SP
and rtrim" cleansing, we can %urlencode problematic bytes in the
message *AND* append a SP at the end, when a new version of Git that
supports multi-line and/or verbatim reflog messages writes a reflog
record. The reading side can detect the presence of SP at the end
(which should have been rtrimmed out if it were written by existing
versions of Git) as a signal that decoding %urlencode recovers the
original reflog message.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
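The cleansing rule described above (squash each run of whitespace into one SP, then trim trailing whitespace) can be sketched as a standalone function. The name toy_cleanse_msg() and the caller-provided output buffer are assumptions of this sketch; it is not the normalize_reflog_message() implementation.

```c
#include <ctype.h>
#include <stddef.h>

/*
 * Copy msg into out, turning each run of whitespace (including LF)
 * into a single SP and dropping trailing whitespace. The caller must
 * provide room for strlen(msg) + 1 bytes in out.
 */
static void toy_cleanse_msg(char *out, const char *msg)
{
	size_t len = 0;
	int wasspace = 0;

	for (; *msg; msg++) {
		int space = isspace((unsigned char)*msg);

		if (space && wasspace)
			continue;	/* still inside the same run */
		out[len++] = space ? ' ' : *msg;
		wasspace = space;
	}
	/* Runs are already squashed, so at most one SP can trail. */
	if (len && out[len - 1] == ' ')
		len--;
	out[len] = '\0';
}
```

With this rule a message containing embedded or trailing LFs still ends up as a single line, so every backend would store the same text.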
2020-07-11 01:19:53 +08:00
|
|
|
update->msg = normalize_reflog_message(msg);
|
2014-04-07 21:48:10 +08:00
|
|
|
return update;
|
|
|
|
}
|
|
|
|
|
2014-06-20 22:43:00 +08:00
|
|
|
int ref_transaction_update(struct ref_transaction *transaction,
|
|
|
|
const char *refname,
|
2017-10-16 06:06:53 +08:00
|
|
|
const struct object_id *new_oid,
|
|
|
|
const struct object_id *old_oid,
|
2024-05-07 20:58:52 +08:00
|
|
|
const char *new_target,
|
|
|
|
const char *old_target,
|
2015-02-18 01:00:15 +08:00
|
|
|
unsigned int flags, const char *msg,
|
2014-06-20 22:43:00 +08:00
|
|
|
struct strbuf *err)
|
2014-04-07 21:48:10 +08:00
|
|
|
{
|
2014-08-29 07:42:37 +08:00
|
|
|
assert(err);
|
|
|
|
|
2021-12-07 21:38:18 +08:00
|
|
|
if (!(flags & REF_SKIP_REFNAME_VERIFICATION) &&
|
|
|
|
((new_oid && !is_null_oid(new_oid)) ?
|
|
|
|
check_refname_format(refname, REFNAME_ALLOW_ONELEVEL) :
|
|
|
|
!refname_is_safe(refname))) {
|
2018-07-21 15:49:35 +08:00
|
|
|
strbuf_addf(err, _("refusing to update ref with bad name '%s'"),
|
refs.c: allow listing and deleting badly named refs
We currently do not handle badly named refs well:
$ cp .git/refs/heads/master .git/refs/heads/master.....@\*@\\.
$ git branch
fatal: Reference has invalid format: 'refs/heads/master.....@*@\.'
$ git branch -D master.....@\*@\\.
error: branch 'master.....@*@\.' not found.
Users cannot recover from a badly named ref without manually finding
and deleting the loose ref file or appropriate line in packed-refs.
Making that easier will make it easier to tweak the ref naming rules
in the future, for example to forbid shell metacharacters like '`'
and '"', without putting people in a state that is hard to get out of.
So allow "branch --list" to show these refs and allow "branch -d/-D"
and "update-ref -d" to delete them. Other commands (for example to
rename refs) will continue to not handle these refs but can be changed
in later patches.
Details:
In resolving functions, refuse to resolve refs that don't pass the
git-check-ref-format(1) check unless the new RESOLVE_REF_ALLOW_BAD_NAME
flag is passed. Even with RESOLVE_REF_ALLOW_BAD_NAME, refuse to
resolve refs that escape the refs/ directory and do not match the
pattern [A-Z_]* (think "HEAD" and "MERGE_HEAD").
In locking functions, refuse to act on badly named refs unless they
are being deleted and either are in the refs/ directory or match [A-Z_]*.
Just like other invalid refs, flag resolved, badly named refs with the
REF_ISBROKEN flag, treat them as resolving to null_sha1, and skip them
in all iteration functions except for for_each_rawref.
Flag badly named refs (but not symrefs pointing to badly named refs)
with a REF_BAD_NAME flag to make it easier for future callers to
notice and handle them specially. For example, in a later patch
for-each-ref will use this flag to detect refs whose names can confuse
callers parsing for-each-ref output.
In the transaction API, refuse to create or update badly named refs,
but allow deleting them (unless they try to escape refs/ and don't match
[A-Z_]*).
Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
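As a rough illustration of the "either inside refs/ or matching [A-Z_]*" rule from the commit message above, here is a minimal sketch. It is a simplified stand-in, not Git's refname_is_safe(): the function names are hypothetical, and "does not escape refs/" is approximated by rejecting empty, "." and ".." path components.

```c
#include <string.h>

/*
 * Hypothetical approximation of "stays inside refs/": the part after
 * "refs/" must be non-empty and must not contain empty, "." or ".."
 * components (so no leading, trailing or doubled slashes either).
 */
static int stays_inside_refs_sketch(const char *rest)
{
	if (!*rest)
		return 0;
	while (*rest) {
		size_t n = strcspn(rest, "/");

		if (n == 0)
			return 0; /* leading or doubled slash */
		if (n == 1 && rest[0] == '.')
			return 0;
		if (n == 2 && rest[0] == '.' && rest[1] == '.')
			return 0;
		rest += n;
		if (*rest == '/') {
			rest++;
			if (!*rest)
				return 0; /* trailing slash */
		}
	}
	return 1;
}

/*
 * A badly named ref may still be deleted if it lives under refs/ or
 * looks like a top-level name such as "HEAD" or "MERGE_HEAD".
 */
static int name_ok_for_delete_sketch(const char *refname)
{
	if (!strncmp(refname, "refs/", 5))
		return stays_inside_refs_sketch(refname + 5);
	if (!*refname)
		return 0;
	for (; *refname; refname++)
		if (!((*refname >= 'A' && *refname <= 'Z') || *refname == '_'))
			return 0;
	return 1;
}
```

With this sketch, "refs/heads/master.....@*@\." is accepted for deletion even though it fails the format check, while "refs/../etc/passwd" and "../escape" are rejected.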
2014-09-04 02:45:43 +08:00
|
|
|
refname);
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
|
2024-05-15 14:51:10 +08:00
|
|
|
if (!(flags & REF_SKIP_REFNAME_VERIFICATION) &&
|
|
|
|
is_pseudo_ref(refname)) {
|
|
|
|
strbuf_addf(err, _("refusing to update pseudoref '%s'"),
|
|
|
|
refname);
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
|
2017-11-05 16:42:03 +08:00
|
|
|
if (flags & ~REF_TRANSACTION_UPDATE_ALLOWED_FLAGS)
|
|
|
|
BUG("illegal flags 0x%x passed to ref_transaction_update()", flags);
|
refs: strip out not allowed flags from ref_transaction_update
Callers are only allowed to pass certain flags into
ref_transaction_update, other flags are internal to it. To prevent
mistakes from the callers, strip the internal only flags out before
continuing.
This was noticed because of a compiler warning gcc 7.1.1 issued about
passing a NULL parameter as second parameter to memcpy (through
hashcpy):
In file included from refs.c:5:0:
refs.c: In function ‘ref_transaction_verify’:
cache.h:948:2: error: argument 2 null where non-null expected [-Werror=nonnull]
memcpy(sha_dst, sha_src, GIT_SHA1_RAWSZ);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from git-compat-util.h:165:0,
from cache.h:4,
from refs.c:5:
/usr/include/string.h:43:14: note: in a call to function ‘memcpy’ declared here
extern void *memcpy (void *__restrict __dest, const void *__restrict __src,
^~~~~~
The call to hashcpy in ref_transaction_add_update is protected by the
passed in flags, but as we only add flags there, gcc notices
REF_HAVE_NEW or REF_HAVE_OLD flags could be passed in from the outside,
which would potentially result in passing in NULL as second parameter to
memcpy.
Fix both the compiler warning, and make the interface safer for its
users by stripping the internal flags out.
Suggested-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-13 06:59:21 +08:00
|
|
|
|
refs: work around gcc-11 warning with REF_HAVE_NEW
Using gcc-11 (or 12) to compile refs.o with -O3 results in:
In file included from hashmap.h:4,
from cache.h:6,
from refs.c:5:
In function ‘oidcpy’,
inlined from ‘ref_transaction_add_update’ at refs.c:1065:3,
inlined from ‘ref_transaction_update’ at refs.c:1094:2,
inlined from ‘ref_transaction_verify’ at refs.c:1132:9:
hash.h:262:9: warning: argument 2 null where non-null expected [-Wnonnull]
262 | memcpy(dst->hash, src->hash, GIT_MAX_RAWSZ);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from git-compat-util.h:177,
from cache.h:4,
from refs.c:5:
refs.c: In function ‘ref_transaction_verify’:
/usr/include/string.h:43:14: note: in a call to function ‘memcpy’ declared ‘nonnull’
43 | extern void *memcpy (void *__restrict __dest, const void *__restrict __src,
| ^~~~~~
That call to memcpy() is in a conditional block that requires
REF_HAVE_NEW to be set. But in ref_transaction_update(), we make sure it
isn't set coming in:
if (flags & ~REF_TRANSACTION_UPDATE_ALLOWED_FLAGS)
BUG("illegal flags 0x%x passed to ref_transaction_update()", flags);
and then only set it if the variable isn't NULL:
flags |= (new_oid ? REF_HAVE_NEW : 0) | (old_oid ? REF_HAVE_OLD : 0);
So it should be impossible to reach that memcpy() with a NULL oid. But
for whatever reason, gcc doesn't accept that hitting the BUG() means we
won't go any further, even though it's marked with the noreturn
attribute. And the conditional is correct; ALLOWED_FLAGS doesn't contain
HAVE_NEW or HAVE_OLD, and you can even simplify it to check for those
flags explicitly and the compiler still complains.
We can work around this by just clearing the disallowed flags
explicitly. This should be a noop because of the BUG() check, but it
makes the compiler happy.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
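The check-then-mask-then-set sequence described in the two commit messages above can be sketched in isolation. The flag names and values below are made up for the illustration and are not Git's real constants; abort() stands in for BUG().

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical flag values, for illustration only. */
#define SKETCH_NO_DEREF  (1u << 0) /* caller-visible */
#define SKETCH_FORCE_LOG (1u << 1) /* caller-visible */
#define SKETCH_HAVE_NEW  (1u << 2) /* internal: set only by the callee */
#define SKETCH_HAVE_OLD  (1u << 3) /* internal: set only by the callee */
#define SKETCH_ALLOWED   (SKETCH_NO_DEREF | SKETCH_FORCE_LOG)

static unsigned int prepare_flags_sketch(unsigned int flags,
					 const void *new_val,
					 const void *old_val)
{
	/* 1. Reject callers that pass internal-only bits. */
	if (flags & ~SKETCH_ALLOWED) {
		fprintf(stderr, "BUG: illegal flags 0x%x\n", flags);
		abort();
	}

	/*
	 * 2. Mask anyway. Logically a no-op after the check, but it gives
	 *    the compiler a local proof that no internal bit survives.
	 */
	flags &= SKETCH_ALLOWED;

	/* 3. Set the internal bits only when the matching pointer exists. */
	flags |= (new_val ? SKETCH_HAVE_NEW : 0) |
		 (old_val ? SKETCH_HAVE_OLD : 0);
	return flags;
}
```

The point of step 2 is that a later `if (flags & SKETCH_HAVE_NEW)` can only be reached with a non-NULL new_val, without relying on the compiler to understand that the abort path never returns.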
2021-11-20 05:28:30 +08:00
|
|
|
/*
|
|
|
|
* Clear flags outside the allowed set; this should be a noop because
|
|
|
|
* of the BUG() check above, but it works around a -Wnonnull warning
|
|
|
|
* with some versions of "gcc -O3".
|
|
|
|
*/
|
|
|
|
flags &= REF_TRANSACTION_UPDATE_ALLOWED_FLAGS;
|
|
|
|
|
2017-10-16 06:06:53 +08:00
|
|
|
flags |= (new_oid ? REF_HAVE_NEW : 0) | (old_oid ? REF_HAVE_OLD : 0);
|
2024-05-07 20:58:56 +08:00
|
|
|
flags |= (new_target ? REF_HAVE_NEW : 0) | (old_target ? REF_HAVE_OLD : 0);
|
2016-04-25 17:39:54 +08:00
|
|
|
|
|
|
|
ref_transaction_add_update(transaction, refname, flags,
|
2024-05-07 20:58:52 +08:00
|
|
|
new_oid, old_oid, new_target,
|
|
|
|
old_target, msg);
|
2014-06-20 22:43:00 +08:00
|
|
|
return 0;
|
2014-04-07 21:48:10 +08:00
|
|
|
}
|
|
|
|
|
2014-04-17 06:26:44 +08:00
|
|
|
int ref_transaction_create(struct ref_transaction *transaction,
|
|
|
|
const char *refname,
|
2017-10-16 06:06:53 +08:00
|
|
|
const struct object_id *new_oid,
|
2015-02-18 01:00:13 +08:00
|
|
|
unsigned int flags, const char *msg,
|
2014-04-17 06:26:44 +08:00
|
|
|
struct strbuf *err)
|
2014-04-07 21:48:10 +08:00
|
|
|
{
|
clone: die() instead of BUG() on bad refs
When cloning directly from a local repository, we load a list of refs
based on scanning the $GIT_DIR/refs/ directory of the "server"
repository. If files exist in that directory that do not parse as
hexadecimal hashes, then the ref array used by write_remote_refs()
ends up with some entries with null OIDs. This causes us to hit a BUG()
statement in ref_transaction_create():
BUG: create called without valid new_oid
This BUG() call used to be a die() until 033abf97f (Replace all
die("BUG: ...") calls by BUG() ones, 2018-05-02). Before that, the die()
was added by f04c5b552 (ref_transaction_create(): check that new_sha1 is
valid, 2015-02-17).
The original report for this bug [1] mentioned that this problem did not
exist in Git 2.27.0. The failure bisects unsurprisingly to 968f12fda
(refs: turn on GIT_REF_PARANOIA by default, 2021-09-24). When
GIT_REF_PARANOIA is enabled, this case always fails as far back as I am
able to successfully compile and test the Git codebase.
[1] https://github.com/git-for-windows/git/issues/3781
There are two approaches to consider here. One would be to remove this
BUG() statement in favor of returning with an error. There are only two
callers to ref_transaction_create(), so this would have a limited
impact.
The other approach would be to add special casing in 'git clone' to
avoid this faulty input to the method.
While I originally started with changing 'git clone', I decided that
modifying ref_transaction_create() was a more complete solution. This
prevents failing with a BUG() statement when we already have a good way
to report an error (including a reason for that error) within the
method. Both callers properly check the return value and die() with the
error message, so this is an appropriate direction.
The added test helps check against a regression, but does check that our
intended error message is handled correctly.
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-04-25 21:47:30 +08:00
|
|
|
if (!new_oid || is_null_oid(new_oid)) {
|
|
|
|
strbuf_addf(err, "'%s' has a null OID", refname);
|
|
|
|
return 1;
|
|
|
|
}
|
2017-10-16 06:06:53 +08:00
|
|
|
return ref_transaction_update(transaction, refname, new_oid,
|
2024-05-07 20:58:52 +08:00
|
|
|
null_oid(), NULL, NULL, flags,
|
|
|
|
msg, err);
|
2014-04-07 21:48:10 +08:00
|
|
|
}
|
|
|
|
|
2014-04-17 06:27:45 +08:00
|
|
|
int ref_transaction_delete(struct ref_transaction *transaction,
|
|
|
|
const char *refname,
|
2017-10-16 06:06:53 +08:00
|
|
|
const struct object_id *old_oid,
|
2015-02-18 01:00:16 +08:00
|
|
|
unsigned int flags, const char *msg,
|
2014-04-17 06:27:45 +08:00
|
|
|
struct strbuf *err)
|
2014-04-07 21:48:10 +08:00
|
|
|
{
|
2017-10-16 06:06:53 +08:00
|
|
|
if (old_oid && is_null_oid(old_oid))
|
2018-05-02 17:38:39 +08:00
|
|
|
BUG("delete called with old_oid set to zeros");
|
2015-02-18 01:00:15 +08:00
|
|
|
return ref_transaction_update(transaction, refname,
|
2021-04-26 09:02:56 +08:00
|
|
|
null_oid(), old_oid,
|
2024-05-07 20:58:52 +08:00
|
|
|
NULL, NULL, flags,
|
|
|
|
msg, err);
|
2014-04-07 21:48:10 +08:00
|
|
|
}
|
|
|
|
|
2015-02-18 01:00:21 +08:00
|
|
|
int ref_transaction_verify(struct ref_transaction *transaction,
|
|
|
|
const char *refname,
|
2017-10-16 06:06:53 +08:00
|
|
|
const struct object_id *old_oid,
|
2015-02-18 01:00:21 +08:00
|
|
|
unsigned int flags,
|
|
|
|
struct strbuf *err)
|
|
|
|
{
|
2017-10-16 06:06:53 +08:00
|
|
|
if (!old_oid)
|
2018-05-02 17:38:39 +08:00
|
|
|
BUG("verify called with old_oid set to NULL");
|
2015-02-18 01:00:21 +08:00
|
|
|
return ref_transaction_update(transaction, refname,
|
2017-10-16 06:06:53 +08:00
|
|
|
NULL, old_oid,
|
2024-05-07 20:58:52 +08:00
|
|
|
NULL, NULL,
|
2015-02-18 01:00:21 +08:00
|
|
|
flags, NULL, err);
|
|
|
|
}
|
|
|
|
|
2017-03-26 10:42:35 +08:00
|
|
|
int refs_update_ref(struct ref_store *refs, const char *msg,
|
2017-10-16 06:06:51 +08:00
|
|
|
const char *refname, const struct object_id *new_oid,
|
|
|
|
const struct object_id *old_oid, unsigned int flags,
|
2017-03-26 10:42:35 +08:00
|
|
|
enum action_on_err onerr)
|
2013-09-04 23:22:40 +08:00
|
|
|
{
|
2015-07-31 14:06:19 +08:00
|
|
|
struct ref_transaction *t = NULL;
|
2014-04-25 07:36:55 +08:00
|
|
|
struct strbuf err = STRBUF_INIT;
|
2015-07-31 14:06:19 +08:00
|
|
|
int ret = 0;
|
2014-04-25 07:36:55 +08:00
|
|
|
|
2022-04-14 06:51:33 +08:00
|
|
|
t = ref_store_transaction_begin(refs, &err);
|
2020-07-28 00:25:46 +08:00
|
|
|
if (!t ||
|
2024-05-07 20:58:52 +08:00
|
|
|
ref_transaction_update(t, refname, new_oid, old_oid, NULL, NULL,
|
|
|
|
flags, msg, &err) ||
|
2020-07-28 00:25:46 +08:00
|
|
|
ref_transaction_commit(t, &err)) {
|
|
|
|
ret = 1;
|
|
|
|
ref_transaction_free(t);
|
2015-07-31 14:06:19 +08:00
|
|
|
}
|
|
|
|
if (ret) {
|
2018-07-21 15:49:35 +08:00
|
|
|
const char *str = _("update_ref failed for ref '%s': %s");
|
2014-04-25 07:36:55 +08:00
|
|
|
|
|
|
|
switch (onerr) {
|
|
|
|
case UPDATE_REFS_MSG_ON_ERR:
|
|
|
|
error(str, refname, err.buf);
|
|
|
|
break;
|
|
|
|
case UPDATE_REFS_DIE_ON_ERR:
|
|
|
|
die(str, refname, err.buf);
|
|
|
|
break;
|
|
|
|
case UPDATE_REFS_QUIET_ON_ERR:
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
strbuf_release(&err);
|
2013-09-04 23:22:40 +08:00
|
|
|
return 1;
|
2014-04-25 07:36:55 +08:00
|
|
|
}
|
|
|
|
strbuf_release(&err);
|
2015-07-31 14:06:19 +08:00
|
|
|
if (t)
|
|
|
|
ref_transaction_free(t);
|
2014-04-25 07:36:55 +08:00
|
|
|
return 0;
|
2013-09-04 23:22:40 +08:00
|
|
|
}
|
|
|
|
|
shorten_unambiguous_ref(): avoid sscanf()
To shorten a fully qualified ref (e.g., taking "refs/heads/foo" to just
"foo"), we munge the usual lookup rules ("refs/heads/%.*s", etc) to drop
the ".*" modifier (so "refs/heads/%s"), and then use sscanf() to match
that against the refname, pulling the "%s" content into a separate
buffer.
This has a few downsides:
- sscanf("%s") reportedly misbehaves on macOS with some input and
locale combinations, returning a partial or garbled string. See
this thread:
https://lore.kernel.org/git/CAGF3oAcCi+fG12j-1U0hcrWwkF5K_9WhOi6ZPHBzUUzfkrZDxA@mail.gmail.com/
- scanf's matching of "%s" is greedy. So the "refs/remotes/%s/HEAD"
rule would never pull "origin" out of "refs/remotes/origin/HEAD".
Instead it always produced "origin/HEAD", which is redundant with
the "refs/remotes/%s" rule.
- scanf in general is an error-prone interface. For example, scanning
for "%s" will copy bytes into a destination string, which must have
been correctly sized ahead of time to avoid a buffer overflow. In
this case, the code is OK (the buffer is pessimistically sized to
match the original string, which should give us a maximum). But in
general, we do not want to encourage people to use scanf at all.
So instead, let's note that our lookup rules are not arbitrary format
strings, but all contain exactly one "%.*s" placeholder. We already rely
on this, both for lookup (we feed the lookup format along with exactly
one int/ptr combo to snprintf, etc) and for shortening (we munge "%.*s"
to "%s", and then insist that sscanf() finds exactly one result).
We can parse this manually by just matching the bytes that occur before
and after the "%.*s" placeholder. While we have a few extra lines of
parsing code, the result is arguably simpler, as we can skip the
preprocessing step and its tricky memory management entirely.
The in-code comments should explain the parsing strategy, but there's
one subtle change here. The original code allocated a single buffer, and
then overwrote it in each loop iteration, since that's the only option
sscanf() gives us. But our parser can actually return a ptr/len combo
for the matched string, which is all we need (since we just feed it back
to the lookup rules with "%.*s"), and then copy it only when returning
to the caller.
There are a few new tests here, all using symbolic-ref (the code can be
triggered in many ways, but symrefs are convenient in that we don't need
to create a real ref, which avoids any complications from the filesystem
munging the name):
- the first covers the real-world case which misbehaved on macOS.
Setting LC_ALL is required to trigger the problem there (since
otherwise our tests use LC_ALL=C), and hopefully is at worst simply
ignored on other systems (and doesn't cause libc to complain, etc,
on systems without that locale).
- the second covers the "origin/HEAD" case as discussed above, which
is now fixed
- the remainder are for "weird" cases that work both before and after
this patch, but would be easy to get wrong with off-by-one problems
in the parsing (and came out of discussions and earlier iterations
of the patch that did get them wrong).
- absent here are tests of boring, expected-to-work cases like
"refs/heads/foo", etc. Those are covered all over the test suite
both explicitly (for-each-ref's refname:short) and implicitly (in
the output of git-status, etc).
Reported-by: 孟子易 <mengziyi540841@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-15 23:16:21 +08:00
|
|
|
/*
|
|
|
|
* Check that the string refname matches a rule of the form
|
|
|
|
* "{prefix}%.*s{suffix}". So "foo/bar/baz" would match the rule
|
|
|
|
* "foo/%.*s/baz", and return the string "bar".
|
|
|
|
*/
|
|
|
|
static const char *match_parse_rule(const char *refname, const char *rule,
|
|
|
|
size_t *len)
|
2009-04-07 15:14:20 +08:00
|
|
|
{
|
2023-02-15 23:16:21 +08:00
|
|
|
/*
|
|
|
|
* Check that rule matches refname up to the first percent in the rule.
|
|
|
|
* We can bail immediately if not, but otherwise we leave "rule" at the
|
|
|
|
* %-placeholder, and "refname" at the start of the potential matched
|
|
|
|
* name.
|
|
|
|
*/
|
|
|
|
while (*rule != '%') {
|
|
|
|
if (!*rule)
|
|
|
|
BUG("rev-parse rule did not have percent");
|
|
|
|
if (*refname++ != *rule++)
|
|
|
|
return NULL;
|
|
|
|
}
|
2009-04-07 15:14:20 +08:00
|
|
|
|
2023-02-15 23:16:21 +08:00
|
|
|
/*
|
|
|
|
* Check that our "%" is the expected placeholder. This assumes there
|
|
|
|
* are no other percents (placeholder or quoted) in the string, but
|
|
|
|
* that is sufficient for our rev-parse rules.
|
|
|
|
*/
|
|
|
|
if (!skip_prefix(rule, "%.*s", &rule))
|
|
|
|
return NULL;
|
2009-04-07 15:14:20 +08:00
|
|
|
|
2023-02-15 23:16:21 +08:00
|
|
|
/*
|
|
|
|
* And now check that our suffix (if any) matches.
|
|
|
|
*/
|
|
|
|
if (!strip_suffix(refname, rule, len))
|
|
|
|
return NULL;
|
2009-04-07 15:14:20 +08:00
|
|
|
|
2023-02-15 23:16:21 +08:00
|
|
|
return refname; /* len set by strip_suffix() */
|
|
|
|
}
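For experimenting with the matching strategy outside of Git, the prefix / placeholder / suffix parse above can be approximated with only the standard library. This is a sketch, not the function above: match_rule_sketch is a hypothetical name, it locates the placeholder with strstr() instead of walking to the first '%', and it returns NULL rather than calling BUG() when the rule has no placeholder.

```c
#include <stddef.h>
#include <string.h>

/*
 * Match refname against a rule of the form "{prefix}%.*s{suffix}".
 * On success, return a pointer into refname at the start of the matched
 * part and store its length in *len; the result is NOT NUL-terminated
 * at that length. On failure, return NULL.
 */
static const char *match_rule_sketch(const char *refname, const char *rule,
				     size_t *len)
{
	const char *placeholder = strstr(rule, "%.*s");
	const char *suffix;
	size_t prefix_len, suffix_len, rest_len;

	if (!placeholder)
		return NULL;
	prefix_len = (size_t)(placeholder - rule);
	suffix = placeholder + strlen("%.*s");
	suffix_len = strlen(suffix);

	/* The bytes before the placeholder must match exactly. */
	if (strncmp(refname, rule, prefix_len))
		return NULL;
	refname += prefix_len;

	/* The bytes after the placeholder must match the tail of refname. */
	rest_len = strlen(refname);
	if (rest_len < suffix_len ||
	    strcmp(refname + rest_len - suffix_len, suffix))
		return NULL;

	*len = rest_len - suffix_len;
	return refname;
}
```

Matching "foo/bar/baz" against "foo/%.*s/baz" yields a pointer to "bar/baz" with a length of 3, and "refs/remotes/origin/HEAD" against "refs/remotes/%.*s/HEAD" yields "origin" (length 6), which is the case the greedy sscanf("%s") approach could not handle. Returning a pointer and length avoids any allocation until the caller actually needs a copy.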
|
2009-04-07 15:14:20 +08:00
|
|
|
|
2019-04-06 19:34:25 +08:00
|
|
|
char *refs_shorten_unambiguous_ref(struct ref_store *refs,
|
|
|
|
const char *refname, int strict)
|
2009-04-07 15:14:20 +08:00
|
|
|
{
|
|
|
|
int i;
|
2017-03-29 03:46:33 +08:00
|
|
|
struct strbuf resolved_buf = STRBUF_INIT;
|
2009-04-07 15:14:20 +08:00
|
|
|
|
|
|
|
/* skip first rule, it will always match */
|
shorten_unambiguous_ref(): use NUM_REV_PARSE_RULES constant
The ref_rev_parse_rules[] array is terminated with a NULL entry, and we
count it and store the result in the local nr_rules variable. But we
don't need to do so; since the array is a constant, we can compute its
size directly. The original code probably didn't do that because it was
written as part of for-each-ref, and saw the array only as a pointer. It
was migrated in 7c2b3029df (make get_short_ref a public function,
2009-04-07) and could have been updated then, but that subtlety was not
noticed.
We even have a constant that represents this value already, courtesy of
60650a48c0 (remote: make refspec follow the same disambiguation rule as
local refs, 2018-08-01), though again, nobody noticed at the time that
it could be used here, too.
The current count-up isn't a big deal, as we need to preprocess that
array anyway. But it will become more cumbersome as we refactor the
shortening code. So let's get rid of it and just use the constant
everywhere.
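
The difference between the run-time count-up and a compile-time constant can be sketched as follows. This is only an illustration: the table contents and the names example_rules, EXAMPLE_NUM_RULES and count_rules() are assumptions made for the sketch, not the definitions in Git's headers.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative NULL-terminated rule table (hypothetical contents). */
static const char *const example_rules[] = {
	"%.*s",
	"refs/%.*s",
	"refs/tags/%.*s",
	"refs/heads/%.*s",
	NULL
};

/*
 * Because the array itself (not just a pointer to it) is visible here,
 * its size is known at compile time. Subtract one for the terminator.
 */
#define EXAMPLE_NUM_RULES \
	(sizeof(example_rules) / sizeof(example_rules[0]) - 1)

/* The run-time count-up that such a constant makes unnecessary. */
static size_t count_rules(const char *const *rules)
{
	size_t n = 0;

	assert(rules);
	while (rules[n])
		n++;
	return n;
}
```

Code that only receives a pointer to the table, as count_rules() does, cannot use sizeof this way, which is presumably why the original code counted.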
Note that there are two things here that aren't just simple text
replacements:
1. We also use nr_rules to see if a previous call has initialized the
static pre-processing variables. We can just use the scanf_fmts
pointer to do the same thing, as it is non-NULL only after we've
done that initialization.
2. If nr_rules is zero after we've counted it up, we bail from the
function. This code is unreachable, though, as the set of rules is
hard-coded and non-empty. And that becomes even more apparent now
that we are using the constant. So we can drop this conditional
completely (and ironically, the code would have the same output if
it _did_ trigger, as we'd simply skip the loop entirely and return
the whole refname).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-15 23:16:18 +08:00
|
|
|
for (i = NUM_REV_PARSE_RULES - 1; i > 0 ; --i) {
|
2009-04-07 15:14:20 +08:00
|
|
|
int j;
|
2009-04-13 18:25:46 +08:00
|
|
|
int rules_to_fail = i;
|
shorten_unambiguous_ref(): avoid sscanf()
To shorten a fully qualified ref (e.g., taking "refs/heads/foo" to just
"foo"), we munge the usual lookup rules ("refs/heads/%.*s", etc) to drop
the ".*" modifier (so "refs/heads/%s"), and then use sscanf() to match
that against the refname, pulling the "%s" content into a separate
buffer.
This has a few downsides:
- sscanf("%s") reportedly misbehaves on macOS with some input and
locale combinations, returning a partial or garbled string. See
this thread:
https://lore.kernel.org/git/CAGF3oAcCi+fG12j-1U0hcrWwkF5K_9WhOi6ZPHBzUUzfkrZDxA@mail.gmail.com/
- scanf's matching of "%s" is greedy. So the "refs/remotes/%s/HEAD"
rule would never pull "origin" out of "refs/remotes/origin/HEAD".
Instead it always produced "origin/HEAD", which is redundant with
the "refs/remotes/%s" rule.
- scanf in general is an error-prone interface. For example, scanning
for "%s" will copy bytes into a destination string, which must have
been correctly sized ahead of time to avoid a buffer overflow. In
this case, the code is OK (the buffer is pessimistically sized to
match the original string, which should give us a maximum). But in
general, we do not want to encourage people to use scanf at all.
So instead, let's note that our lookup rules are not arbitrary format
strings, but all contain exactly one "%.*s" placeholder. We already rely
on this, both for lookup (we feed the lookup format along with exactly
one int/ptr combo to snprintf, etc) and for shortening (we munge "%.*s"
to "%s", and then insist that sscanf() finds exactly one result).
We can parse this manually by just matching the bytes that occur before
and after the "%.*s" placeholder. While we have a few extra lines of
parsing code, the result is arguably simpler, as we can skip the
preprocessing step and its tricky memory management entirely.
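
A minimal sketch of that idea, assuming each rule contains exactly one "%.*s" placeholder. The helper name match_rule_sketch() is hypothetical and this is not the code added by the patch; it only illustrates matching the literal bytes before and after the placeholder and returning a ptr/len pair that points into the original refname.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: match `refname` against `rule`. On success,
 * return a pointer into `refname` where the placeholder content starts
 * and store its length in `*len`. Return NULL if the rule does not match.
 */
static const char *match_rule_sketch(const char *refname, const char *rule,
				     size_t *len)
{
	const char *placeholder, *suffix;
	size_t prefix_len, suffix_len, ref_len;

	assert(refname && rule && len);

	placeholder = strstr(rule, "%.*s");
	if (!placeholder)
		return NULL; /* not a rule of the expected shape */

	prefix_len = (size_t)(placeholder - rule);
	suffix = placeholder + strlen("%.*s");
	suffix_len = strlen(suffix);
	ref_len = strlen(refname);

	/* The refname must be long enough to hold both literal parts. */
	if (ref_len < prefix_len + suffix_len)
		return NULL;
	/* Bytes before the placeholder must match the start ... */
	if (strncmp(refname, rule, prefix_len))
		return NULL;
	/* ... and bytes after it must match the end. */
	if (strcmp(refname + ref_len - suffix_len, suffix))
		return NULL;

	*len = ref_len - prefix_len - suffix_len;
	return refname + prefix_len;
}
```

With this approach, matching "refs/remotes/origin/HEAD" against "refs/remotes/%.*s/HEAD" yields a pointer to "origin/HEAD" with a length of 6, so no copy is needed until the caller wants an owned string.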
The in-code comments should explain the parsing strategy, but there's
one subtle change here. The original code allocated a single buffer, and
then overwrote it in each loop iteration, since that's the only option
sscanf() gives us. But our parser can actually return a ptr/len combo
for the matched string, which is all we need (since we just feed it back
to the lookup rules with "%.*s"), and then copy it only when returning
to the caller.
There are a few new tests here, all using symbolic-ref (the code can be
triggered in many ways, but symrefs are convenient in that we don't need
to create a real ref, which avoids any complications from the filesystem
munging the name):
- the first covers the real-world case which misbehaved on macOS.
Setting LC_ALL is required to trigger the problem there (since
otherwise our tests use LC_ALL=C), and hopefully is at worst simply
ignored on other systems (and doesn't cause libc to complain, etc,
on systems without that locale).
- the second covers the "origin/HEAD" case as discussed above, which
is now fixed
- the remainder are for "weird" cases that work both before and after
this patch, but would be easy to get wrong with off-by-one problems
in the parsing (and came out of discussions and earlier iterations
of the patch that did get them wrong).
- absent here are tests of boring, expected-to-work cases like
"refs/heads/foo", etc. Those are covered all over the test suite
both explicitly (for-each-ref's refname:short) and implicitly (in
the output of git-status, etc).
Reported-by: 孟子易 <mengziyi540841@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-15 23:16:21 +08:00
|
|
|
const char *short_name;
|
shorten_unambiguous_ref(): avoid integer truncation
We parse the shortened name "foo" out of the full refname
"refs/heads/foo", and then assign the result of strlen(short_name) to an
int, which may truncate or wrap to negative.
In practice, this should never happen, as it requires a 2GB refname. And
even somebody trying to do something malicious should at worst end up
with a confused answer (we use the size only to feed back as a
placeholder length to strbuf_addf() to see if there are any collisions
in the lookup rules).
And it may even be impossible to trigger this, as we parse the string
with sscanf(), and stdio formatting functions are not known for handling
large strings well. I didn't test, but I wouldn't be surprised if
sscanf() on many platforms simply reports no match here.
But even if it is not a problem in practice so far, it is worth fixing
for two reasons:
1. We'll shortly be replacing the sscanf() call with a real parser
which will handle arbitrary-sized strings.
2. Assigning strlen() to an int is an anti-pattern that requires
people to look twice when auditing for real overflow problems.
So we'll make this a size_t. Unfortunately we still have to cast to int
eventually for the strbuf_addf() call, but at least we can localize the
cast there, and check that it will be valid. I used our new cast helper
here, which will just bail completely. That should be OK, as anybody
with a 2GB refname is up to no good, but if we really wanted to, we
could detect it manually and just refuse to shorten the refname.
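
For illustration, the "detect it manually" alternative could look roughly like the sketch below. The function name size_t_to_int_checked() is hypothetical; it is not the cast helper used by this patch, which bails out instead of returning an error.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical checked cast: report failure instead of dying.
 * Returns 0 and stores the value in *out if it fits in an int,
 * otherwise returns -1 and leaves *out untouched.
 */
static int size_t_to_int_checked(size_t value, int *out)
{
	assert(out);
	if (value > (size_t)INT_MAX)
		return -1;
	*out = (int)value;
	return 0;
}
```

A caller that got -1 back could then skip shortening and return the full refname.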
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-15 23:16:14 +08:00
|
|
|
size_t short_name_len;
|
2009-04-07 15:14:20 +08:00
|
|
|
|
2023-02-15 23:16:21 +08:00
|
|
|
short_name = match_parse_rule(refname, ref_rev_parse_rules[i],
|
|
|
|
&short_name_len);
|
|
|
|
if (!short_name)
|
2009-04-07 15:14:20 +08:00
|
|
|
continue;
|
|
|
|
|
2009-04-13 18:25:46 +08:00
|
|
|
/*
|
|
|
|
* in strict mode, all (except the matched one) rules
|
|
|
|
* must fail to resolve to a valid non-ambiguous ref
|
|
|
|
*/
|
|
|
|
if (strict)
|
2023-02-15 23:16:18 +08:00
|
|
|
rules_to_fail = NUM_REV_PARSE_RULES;
|
2009-04-13 18:25:46 +08:00
|
|
|
|
2009-04-07 15:14:20 +08:00
|
|
|
/*
|
|
|
|
* check if the short name resolves to a valid ref,
|
|
|
|
* but use only rules prior to the matched one
|
|
|
|
*/
|
2009-04-13 18:25:46 +08:00
|
|
|
for (j = 0; j < rules_to_fail; j++) {
|
2009-04-07 15:14:20 +08:00
|
|
|
const char *rule = ref_rev_parse_rules[j];
|
|
|
|
|
2009-04-13 18:25:46 +08:00
|
|
|
/* skip matched rule */
|
|
|
|
if (i == j)
|
|
|
|
continue;
|
|
|
|
|
2009-04-07 15:14:20 +08:00
|
|
|
/*
|
|
|
|
* the short name is ambiguous, if it resolves
|
|
|
|
* (with this previous rule) to a valid ref
|
|
|
|
* read_ref() returns 0 on success
|
|
|
|
*/
|
2017-03-29 03:46:33 +08:00
|
|
|
strbuf_reset(&resolved_buf);
|
|
|
|
strbuf_addf(&resolved_buf, rule,
|
2023-02-15 23:16:14 +08:00
|
|
|
cast_size_t_to_int(short_name_len),
|
|
|
|
short_name);
|
2019-04-06 19:34:25 +08:00
|
|
|
if (refs_ref_exists(refs, resolved_buf.buf))
|
2009-04-07 15:14:20 +08:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* short name is non-ambiguous if all previous rules
|
|
|
|
* haven't resolved to a valid ref
|
|
|
|
*/
|
2017-03-29 03:46:33 +08:00
|
|
|
if (j == rules_to_fail) {
|
|
|
|
strbuf_release(&resolved_buf);
|
2023-02-15 23:16:21 +08:00
|
|
|
return xmemdupz(short_name, short_name_len);
|
2017-03-29 03:46:33 +08:00
|
|
|
}
|
2009-04-07 15:14:20 +08:00
|
|
|
}
|
|
|
|
|
2017-03-29 03:46:33 +08:00
|
|
|
strbuf_release(&resolved_buf);
|
2011-12-12 13:38:09 +08:00
|
|
|
return xstrdup(refname);
|
2009-04-07 15:14:20 +08:00
|
|
|
}
|
upload/receive-pack: allow hiding ref hierarchies
A repository may have refs that are only used for its internal
bookkeeping purposes that should not be exposed to the others that
come over the network.
Teach upload-pack to omit some refs from its initial advertisement
by paying attention to the uploadpack.hiderefs multi-valued
configuration variable. Do the same to receive-pack via the
receive.hiderefs variable. As a convenient short-hand, allow using
transfer.hiderefs to set the value to both of these variables.
Any ref that is under the hierarchies listed in the value of these
variables is excluded from responses to requests made by "ls-remote",
"fetch", etc. (for upload-pack) and "push" (for receive-pack).
Because these hidden refs do not count as OUR_REF, an attempt to
fetch objects at the tip of them will be rejected, and because these
refs do not get advertised, "git push :" will not see local branches
that have the same name as them as "matching" ones to be sent.
An attempt to update/delete these hidden refs with an explicit
refspec, e.g. "git push origin :refs/hidden/22", is rejected. This
is not a new restriction. To the pusher, it would appear that there
is no such ref, so its push request will conclude with "Now that I
sent you all the data, it is time for you to update the refs. I saw
that the ref did not exist when I started pushing, and I want the
result to point at this commit". The receiving end will apply the
compare-and-swap rule to this request and reject the push with
"Well, your update request conflicts with somebody else; I see there
is such a ref.", which is the right thing to do. Otherwise a push to
a hidden ref will always be "the last one wins", which is not a good
default.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-19 08:08:30 +08:00
|
|
|
|
2022-11-17 13:46:43 +08:00
|
|
|
int parse_hide_refs_config(const char *var, const char *value, const char *section,
|
2023-07-11 05:12:33 +08:00
|
|
|
struct strvec *hide_refs)
|
2013-01-19 08:08:30 +08:00
|
|
|
{
|
2017-02-25 05:08:16 +08:00
|
|
|
const char *key;
|
2013-01-19 08:08:30 +08:00
|
|
|
if (!strcmp("transfer.hiderefs", var) ||
|
2017-02-25 05:08:16 +08:00
|
|
|
(!parse_config_key(var, section, NULL, NULL, &key) &&
|
|
|
|
!strcmp(key, "hiderefs"))) {
|
2013-01-19 08:08:30 +08:00
|
|
|
char *ref;
|
|
|
|
int len;
|
|
|
|
|
|
|
|
if (!value)
|
|
|
|
return config_error_nonbool(var);
|
2023-07-11 05:12:33 +08:00
|
|
|
|
|
|
|
/* drop const to remove trailing '/' characters */
|
|
|
|
ref = (char *)strvec_push(hide_refs, value);
|
2013-01-19 08:08:30 +08:00
|
|
|
len = strlen(ref);
|
|
|
|
while (len && ref[len - 1] == '/')
|
|
|
|
ref[--len] = '\0';
|
|
|
|
}
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2022-11-17 13:46:43 +08:00
|
|
|
int ref_is_hidden(const char *refname, const char *refname_full,
|
2023-07-11 05:12:33 +08:00
|
|
|
const struct strvec *hide_refs)
|
2013-01-19 08:08:30 +08:00
|
|
|
{
|
refs: support negative transfer.hideRefs
If you hide a hierarchy of refs using the transfer.hideRefs
config, there is no way to later override that config to
"unhide" it. This patch implements a "negative" hide which
causes matches to immediately be marked as unhidden, even if
another match would hide it. We take care to apply the
matches in reverse order from how they are fed to us by the
config machinery, as that lets our usual "last one wins"
config precedence work (and entries in .git/config, for
example, will override /etc/gitconfig).
So you can now do:
    $ git config --system transfer.hideRefs refs/secret
    $ git config transfer.hideRefs '!refs/secret/not-so-secret'
to hide refs/secret in all repos, except for one public bit
in one specific repo. Or you can even do:
    $ git clone \
        -u 'git -c transfer.hiderefs="!refs/foo" upload-pack' \
        remote:repo.git
to clone remote:repo.git, overriding any hiding it has
configured.
There are two alternatives that were considered and
rejected:
  1. A generic config mechanism for removing an item from a
     list (e.g., "[transfer] hideRefs -= refs/foo").
     This is nice because it could apply to other
     multi-valued config, as well. But it is not nearly as
     flexible. There is no way to say:
         [transfer]
         hideRefs = refs/secret
         hideRefs = refs/secret/not-so-secret
     Having explicit negative specifications means we can
     override previous entries, even if they are not the
     same literal string.
  2. Adding another variable to override some parts of
     hideRefs (e.g., "exposeRefs").
     This solves the problem from alternative (1), but it
     cannot easily obey the normal config precedence,
     because it would use two separate lists. For example:
         [transfer]
         hideRefs = refs/secret
         exposeRefs = refs/secret/not-so-secret
         hideRefs = refs/secret/not-so-secret/no-really-its-secret
     With two lists, we have to apply the "expose" rules
     first, and only then apply the "hide" rules. But that
     does not match what the above config intends.
     Of course we could internally parse that to a single
     list, respecting the ordering, which saves us having to
     invent the new "!" syntax. But using a single name
     communicates to the user that the ordering _is_
     important. And "!" is well-known for negation, and
     should not appear at the beginning of a ref (it is
     actually valid in a ref-name, but all entries here
     should be fully-qualified, starting with "refs/").
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-29 04:23:26 +08:00
|
|
|
int i;
|
2013-01-19 08:08:30 +08:00
|
|
|
|
2015-07-29 04:23:26 +08:00
|
|
|
for (i = hide_refs->nr - 1; i >= 0; i--) {
|
2023-07-11 05:12:33 +08:00
|
|
|
const char *match = hide_refs->v[i];
|
2015-11-03 15:58:16 +08:00
|
|
|
const char *subject;
|
2015-07-29 04:23:26 +08:00
|
|
|
int neg = 0;
|
2017-07-22 12:39:12 +08:00
|
|
|
const char *p;
|
2015-07-29 04:23:26 +08:00
|
|
|
|
|
|
|
if (*match == '!') {
|
|
|
|
neg = 1;
|
|
|
|
match++;
|
|
|
|
}
|
|
|
|
|
2015-11-03 15:58:16 +08:00
|
|
|
if (*match == '^') {
|
|
|
|
subject = refname_full;
|
|
|
|
match++;
|
|
|
|
} else {
|
|
|
|
subject = refname;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* refname can be NULL when namespaces are used. */
|
2017-07-22 12:39:12 +08:00
|
|
|
if (subject &&
|
|
|
|
skip_prefix(subject, match, &p) &&
|
|
|
|
(!*p || *p == '/'))
|
2015-07-29 04:23:26 +08:00
|
|
|
return !neg;
|
2013-01-19 08:08:30 +08:00
|
|
|
}
|
|
|
|
return 0;
|
|
|
|
}
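The matching rules in the function above can be tried in isolation with a minimal standalone sketch. This is not Git's code: `sketch_ref_is_hidden()`, its plain array-plus-count arguments, and the local `sketch_skip_prefix()` helper are hypothetical stand-ins for `ref_is_hidden()`, `struct strvec`, and Git's `skip_prefix()`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for Git's skip_prefix(): on a match, point *out
 * at the text following the prefix and return 1; otherwise return 0.
 */
static int sketch_skip_prefix(const char *str, const char *prefix,
			      const char **out)
{
	size_t len = strlen(prefix);

	if (strncmp(str, prefix, len))
		return 0;
	*out = str + len;
	return 1;
}

/*
 * Return 1 if the ref is hidden by the patterns, 0 otherwise.
 * Patterns are scanned from last to first, so the last configured
 * match wins. refname is the namespace-stripped name (may be NULL);
 * refname_full is the complete name.
 */
int sketch_ref_is_hidden(const char *refname, const char *refname_full,
			 const char **patterns, size_t nr)
{
	size_t i;

	assert(patterns || !nr);
	for (i = nr; i-- > 0; ) {
		const char *match = patterns[i];
		const char *subject = refname;
		const char *rest;
		int neg = 0;

		if (*match == '!') {	/* negative ("unhide") pattern */
			neg = 1;
			match++;
		}
		if (*match == '^') {	/* match against the full name */
			subject = refname_full;
			match++;
		}
		/* Match whole path components only: "refs/foo" must not hit "refs/foobar". */
		if (subject && sketch_skip_prefix(subject, match, &rest) &&
		    (!*rest || *rest == '/'))
			return !neg;
	}
	return 0;
}
```

With the patterns "refs/secret" followed by "!refs/secret/not-so-secret", the sketch hides refs/secret/a but not refs/secret/not-so-secret/x, because the later negative entry is examined first.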
|
2014-12-12 16:56:59 +08:00
|
|
|
|
2023-07-11 05:12:39 +08:00
|
|
|
const char **hidden_refs_to_excludes(const struct strvec *hide_refs)
|
|
|
|
{
|
|
|
|
const char **pattern;
|
|
|
|
for (pattern = hide_refs->v; *pattern; pattern++) {
|
|
|
|
/*
|
|
|
|
* We can't feed any excludes from hidden refs config
|
|
|
|
* sections, since later rules may override previous
|
|
|
|
* ones. For example, with rules "refs/foo" and
|
|
|
|
* "!refs/foo/bar", we should show "refs/foo/bar" (and
|
|
|
|
* everything underneath it), but the earlier exclusion
|
|
|
|
* would cause us to skip all of "refs/foo". We
|
|
|
|
* likewise don't implement the namespace stripping
|
|
|
|
* required for '^' rules.
|
|
|
|
*
|
|
|
|
* Both are possible to do, but complicated, so avoid
|
|
|
|
* populating the jump list at all if we see either of
|
|
|
|
* these patterns.
|
|
|
|
*/
|
|
|
|
if (**pattern == '!' || **pattern == '^')
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
return hide_refs->v;
|
|
|
|
}
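The all-or-nothing check above can be illustrated with a small standalone sketch. The function name `sketch_excludes_from_hidden()` is hypothetical, and a plain NULL-terminated array is assumed in place of the `v` member of `struct strvec`.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the NULL-terminated pattern list unchanged if every entry is
 * a plain hide rule. Return NULL as soon as one entry starts with '!'
 * or '^', because such rules cannot be expressed as simple excludes.
 */
const char **sketch_excludes_from_hidden(const char **patterns)
{
	const char **p;

	assert(patterns);
	for (p = patterns; *p; p++)
		if (**p == '!' || **p == '^')
			return NULL;
	return patterns;
}
```

A caller that gets NULL back would presumably iterate without excludes and filter each ref afterwards; that fallback is an assumption about usage, not something shown in this file.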
|
|
|
|
|
2015-11-10 19:42:40 +08:00
|
|
|
const char *find_descendant_ref(const char *dirname,
|
|
|
|
const struct string_list *extras,
|
|
|
|
const struct string_list *skip)
|
2014-12-12 16:56:59 +08:00
|
|
|
{
|
2015-11-10 19:42:40 +08:00
|
|
|
int pos;
|
2014-12-12 16:56:59 +08:00
|
|
|
|
2015-11-10 19:42:40 +08:00
|
|
|
if (!extras)
|
|
|
|
return NULL;
|
2014-12-12 16:56:59 +08:00
|
|
|
|
|
|
|
/*
|
2015-11-10 19:42:40 +08:00
|
|
|
* Look at the place where dirname would be inserted into
|
|
|
|
* extras. If there is an entry at that position that starts
|
|
|
|
* with dirname (remember, dirname includes the trailing
|
|
|
|
* slash) and is not in skip, then we have a conflict.
|
2014-12-12 16:56:59 +08:00
|
|
|
*/
|
2015-11-10 19:42:40 +08:00
|
|
|
for (pos = string_list_find_insert_index(extras, dirname, 0);
|
|
|
|
pos < extras->nr; pos++) {
|
|
|
|
const char *extra_refname = extras->items[pos].string;
|
2014-12-12 16:56:59 +08:00
|
|
|
|
2015-11-10 19:42:40 +08:00
|
|
|
if (!starts_with(extra_refname, dirname))
|
|
|
|
break;
|
|
|
|
|
|
|
|
if (!skip || !string_list_has_string(skip, extra_refname))
|
|
|
|
return extra_refname;
|
2014-12-12 16:56:59 +08:00
|
|
|
}
|
2015-11-10 19:42:40 +08:00
|
|
|
return NULL;
|
|
|
|
}
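The lookup above relies on the list being sorted: every entry that starts with dirname forms one contiguous run beginning at dirname's insertion point. A minimal standalone sketch of that idea follows. It assumes byte-wise sorted plain arrays instead of `struct string_list`; `sketch_lower_bound()`, `sketch_in_list()`, and `sketch_find_descendant()` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Index of the first entry in sorted[] that is >= key (binary search). */
static size_t sketch_lower_bound(const char **sorted, size_t nr,
				 const char *key)
{
	size_t lo = 0, hi = nr;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (strcmp(sorted[mid], key) < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return lo;
}

/* Linear membership test; a simple stand-in for a sorted-list lookup. */
static int sketch_in_list(const char **list, size_t nr, const char *s)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (!strcmp(list[i], s))
			return 1;
	return 0;
}

/*
 * Return the first entry of the sorted extras[] that lies under
 * dirname (which must end in '/') and is not listed in skip[], or
 * NULL if there is none.
 */
const char *sketch_find_descendant(const char *dirname,
				   const char **extras, size_t extras_nr,
				   const char **skip, size_t skip_nr)
{
	size_t len, pos;

	assert(dirname);
	len = strlen(dirname);
	for (pos = sketch_lower_bound(extras, extras_nr, dirname);
	     pos < extras_nr; pos++) {
		if (strncmp(extras[pos], dirname, len))
			break;	/* left the run of entries under dirname */
		if (!sketch_in_list(skip, skip_nr, extras[pos]))
			return extras[pos];
	}
	return NULL;
}
```

Because the scan stops at the first entry that no longer starts with dirname, the cost is one binary search plus the length of the matching run, not a pass over the whole list.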
|
2014-12-12 16:56:59 +08:00
|
|
|
|
2017-08-23 20:36:55 +08:00
|
|
|
int refs_head_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
|
2016-04-08 03:02:48 +08:00
|
|
|
{
|
|
|
|
struct object_id oid;
|
|
|
|
int flag;
|
|
|
|
|
2021-10-16 17:39:27 +08:00
|
|
|
if (refs_resolve_ref_unsafe(refs, "HEAD", RESOLVE_REF_READING,
|
2022-01-26 22:37:01 +08:00
|
|
|
&oid, &flag))
|
2016-04-08 03:02:48 +08:00
|
|
|
return fn("HEAD", &oid, flag, cb_data);
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2017-03-21 00:33:08 +08:00
|
|
|
struct ref_iterator *refs_ref_iterator_begin(
|
|
|
|
struct ref_store *refs,
|
2023-07-11 05:12:22 +08:00
|
|
|
const char *prefix,
|
|
|
|
const char **exclude_patterns,
|
|
|
|
int trim,
|
2021-09-25 02:39:44 +08:00
|
|
|
enum do_for_each_ref_flags flags)
|
2017-03-21 00:33:08 +08:00
|
|
|
{
|
|
|
|
struct ref_iterator *iter;
|
|
|
|
|
2021-09-25 02:42:38 +08:00
|
|
|
if (!(flags & DO_FOR_EACH_INCLUDE_BROKEN)) {
|
2021-09-25 02:46:37 +08:00
|
|
|
static int ref_paranoia = -1;
|
|
|
|
|
2021-09-25 02:42:38 +08:00
|
|
|
if (ref_paranoia < 0)
|
refs: turn on GIT_REF_PARANOIA by default
The original point of the GIT_REF_PARANOIA flag was to include broken
refs in iterations, so that possibly-destructive operations would not
silently ignore them (and would generally instead try to operate on the
oids and fail when the objects could not be accessed).
We already turned this on by default for some dangerous operations, like
"repack -ad" (where missing a reachability tip would mean dropping the
associated history). But it was not on for general use, even though it
could easily result in the spreading of corruption (e.g., imagine
cloning a repository which simply omits some of its refs because
their objects are missing; the result quietly succeeds even though you
did not clone everything!).
This patch turns on GIT_REF_PARANOIA by default. So a clone as mentioned
above would actually fail (upload-pack tells us about the broken ref,
and when we ask for the objects, pack-objects fails to deliver them).
This may be inconvenient when working with a corrupted repository, but:
- we are better off to err on the side of complaining about
corruption, and then provide mechanisms for explicitly loosening
safety.
- this is only one type of corruption anyway. If we are missing any
other objects in the history that _aren't_ ref tips, then we'd
behave similarly (happily show the ref, but then barf when we
started traversing).
We retain the GIT_REF_PARANOIA variable, but simply default it to "1"
instead of "0". That gives the user an escape hatch for loosening this
when working with a corrupt repository. It won't work across a remote
connection to upload-pack (because we can't necessarily set environment
variables on the remote), but there the client has other options (e.g.,
choosing which refs to fetch).
As a bonus, this also makes ref iteration faster in general (because we
don't have to call has_object_file() for each ref), though probably not
noticeably so in the general case. In a repo with a million refs, it
shaved a few hundred milliseconds off of upload-pack's advertisement;
that's noticeable, but most repos are not nearly that large.
The possible downside here is that any operation which iterates refs but
doesn't ever open their objects may now quietly claim to have X when the
object is corrupted (e.g., "git rev-list new-branch --not --all" will
treat a broken ref as uninteresting). But again, that's not really any
different than corruption below the ref level. We might have
refs/heads/old-branch as non-corrupt, but we are not actively checking
that we have the entire reachable history. Or the pointed-to object
could even be corrupted on-disk (but our "do we have it" check would
still succeed). In that sense, this is merely bringing ref-corruption in
line with general object corruption.
One alternative implementation would be to actually check for broken
refs, and then _immediately die_ if we see any. That would cause the
"rev-list --not --all" case above to abort immediately. But in many ways
that's the worst of all worlds:
- it still spends time looking up the objects an extra time
- it still doesn't catch corruption below the ref level
- it's even more inconvenient; with the current implementation of
GIT_REF_PARANOIA for something like upload-pack, we can make
the advertisement and let the client choose a non-broken piece of
history. If we bail as soon as we see a broken ref, they cannot even
see the advertisement.
The test changes here show some of the fallout. A non-destructive "git
repack -adk" now fails by default (but we can override it). Deleting a
broken ref now actually tells the hooks the correct "before" state,
rather than a confusing null oid.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-25 02:46:13 +08:00
|
|
|
ref_paranoia = git_env_bool("GIT_REF_PARANOIA", 1);
|
2021-09-25 02:42:38 +08:00
|
|
|
if (ref_paranoia) {
|
|
|
|
flags |= DO_FOR_EACH_INCLUDE_BROKEN;
|
|
|
|
flags |= DO_FOR_EACH_OMIT_DANGLING_SYMREFS;
|
|
|
|
}
|
|
|
|
}
|
2017-05-22 22:17:52 +08:00
|
|
|
|
2023-07-11 05:12:22 +08:00
|
|
|
iter = refs->be->iterator_begin(refs, prefix, exclude_patterns, flags);
|
2017-05-22 22:17:36 +08:00
|
|
|
/*
|
|
|
|
* `iterator_begin()` already takes care of prefix, but we
|
|
|
|
* might need to do some trimming:
|
|
|
|
*/
|
|
|
|
if (trim)
|
|
|
|
iter = prefix_ref_iterator_begin(iter, "", trim);
|
2017-03-21 00:33:08 +08:00
|
|
|
|
|
|
|
return iter;
|
|
|
|
}
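The flag handling above reads an environment variable once, caches the result in a function-local static, and defaults to "on" when the variable is unset. A minimal standalone sketch of that pattern follows. Everything in it is hypothetical: the `SKETCH_*` flag names, the `SKETCH_REF_PARANOIA` variable, and `sketch_parse_bool()`, which is a much simplified substitute for `git_env_bool()` (case-sensitive, no error reporting).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum sketch_flags {
	SKETCH_INCLUDE_BROKEN = 1 << 0,
	SKETCH_OMIT_DANGLING_SYMREFS = 1 << 1
};

/*
 * Simplified boolean parsing (an assumption, not Git's rules):
 * unset -> def; "", "0", "false", "no", "off" -> 0; anything else -> 1.
 */
int sketch_parse_bool(const char *value, int def)
{
	if (!value)
		return def;
	if (!*value || !strcmp(value, "0") || !strcmp(value, "false") ||
	    !strcmp(value, "no") || !strcmp(value, "off"))
		return 0;
	return 1;
}

/*
 * If the caller did not ask for broken refs explicitly, consult the
 * environment (once per process) and, when paranoia is enabled, add
 * both flags.
 */
unsigned sketch_apply_paranoia(unsigned flags)
{
	static int paranoia = -1;	/* -1: environment not consulted yet */

	if (flags & SKETCH_INCLUDE_BROKEN)
		return flags;
	if (paranoia < 0)
		paranoia = sketch_parse_bool(getenv("SKETCH_REF_PARANOIA"), 1);
	if (paranoia)
		flags |= SKETCH_INCLUDE_BROKEN | SKETCH_OMIT_DANGLING_SYMREFS;
	return flags;
}
```

The static cache means a later change to the environment within the same process has no effect, which matches the structure of the code above.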
|
|
|
|
|
do_for_each_ref(): reimplement using reference iteration
Use the reference iterator interface to implement do_for_each_ref().
Delete a bunch of code supporting the old for_each_ref() implementation.
And now that do_for_each_ref() is generic code (it is no longer tied to
the files backend), move it to refs.c.
The implementation is via a new function, do_for_each_ref_iterator(),
which takes a reference iterator as argument and calls a callback
function for each of the references in the iterator.
This change requires the current_ref performance hack for peel_ref() to
be implemented via ref_iterator_peel() rather than peel_entry() because
we don't have a ref_entry handy (it is hidden under three layers:
file_ref_iterator, merge_ref_iterator, and cache_ref_iterator). So:
* do_for_each_ref_iterator() records the active iterator in
current_ref_iter while it is running.
* peel_ref() checks whether current_ref_iter is pointing at the
requested reference. If so, it asks the iterator to peel the
reference (which it can do efficiently via its "peel" virtual
function). For extra safety, we do the optimization only if the
refname *addresses* are the same, not only if the refname *strings*
are the same, to forestall possible mixups between refnames that come
from different ref_iterators.
Please note that this optimization of peel_ref() is only available when
iterating via do_for_each_ref_iterator() (including all of the
for_each_ref() functions, which call it indirectly). It would be
complicated to implement a similar optimization when iterating directly
using a reference iterator, because multiple reference iterators can be
in use at the same time, with interleaved calls to
ref_iterator_advance(). (In fact we do exactly that in
merge_ref_iterator.)
But that is not necessary. peel_ref() is only called while iterating
over references. Callers who iterate using the for_each_ref() functions
benefit from the optimization described above. Callers who iterate using
reference iterators directly have access to the ref_iterator, so they
can call ref_iterator_peel() themselves to get an analogous optimization
in a more straightforward manner.
If we rewrite all callers to use the reference iteration API, then we
can remove the current_ref_iter hack permanently.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-18 12:15:16 +08:00
|
|
|
/*
|
|
|
|
 * Call fn for each reference in the main ref store of r for which the
|
|
|
|
* refname begins with prefix. If trim is non-zero, then trim that
|
|
|
|
* many characters off the beginning of each refname before passing
|
|
|
|
* the refname to fn. flags can be DO_FOR_EACH_INCLUDE_BROKEN to
|
|
|
|
* include broken references in the iteration. If fn ever returns a
|
|
|
|
* non-zero value, stop the iteration and return that value;
|
|
|
|
* otherwise, return 0.
|
|
|
|
*/
|
2018-08-21 02:24:16 +08:00
|
|
|
static int do_for_each_repo_ref(struct repository *r, const char *prefix,
|
|
|
|
each_repo_ref_fn fn, int trim, int flags,
|
|
|
|
void *cb_data)
|
|
|
|
{
|
|
|
|
struct ref_iterator *iter;
|
|
|
|
struct ref_store *refs = get_main_ref_store(r);
|
|
|
|
|
|
|
|
if (!refs)
|
|
|
|
return 0;
|
|
|
|
|
2023-07-11 05:12:22 +08:00
|
|
|
iter = refs_ref_iterator_begin(refs, prefix, NULL, trim, flags);
|
2018-08-21 02:24:16 +08:00
|
|
|
|
|
|
|
return do_for_each_repo_ref_iterator(r, iter, fn, cb_data);
|
|
|
|
}
|
|
|
|
|
|
|
|
struct do_for_each_ref_help {
|
|
|
|
each_ref_fn *fn;
|
|
|
|
void *cb_data;
|
|
|
|
};
|
|
|
|
|
2023-07-03 14:44:02 +08:00
|
|
|
static int do_for_each_ref_helper(struct repository *r UNUSED,
|
2018-08-21 02:24:16 +08:00
|
|
|
const char *refname,
|
|
|
|
const struct object_id *oid,
|
|
|
|
int flags,
|
|
|
|
void *cb_data)
|
|
|
|
{
|
|
|
|
struct do_for_each_ref_help *hp = cb_data;
|
|
|
|
|
|
|
|
return hp->fn(refname, oid, flags, hp->cb_data);
|
|
|
|
}
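The helper above is an adapter: it lets a callback with a narrow signature be driven by an iterator that expects a wider one, by packing the narrow callback and its data into a small struct. A minimal standalone sketch of the same pattern follows, with hypothetical names throughout (`sketch_name_fn`, `sketch_ctx_name_fn`, `struct sketch_adapter`, and so on) and a plain string array in place of a ref iterator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Narrow callback: sees only the name. */
typedef int (*sketch_name_fn)(const char *name, void *cb_data);
/* Wide callback: additionally receives a context pointer. */
typedef int (*sketch_ctx_name_fn)(void *ctx, const char *name, void *cb_data);

/* Bundles a narrow callback with its own callback data. */
struct sketch_adapter {
	sketch_name_fn fn;
	void *cb_data;
};

/* Wide-signature trampoline that forwards to the narrow callback. */
static int sketch_adapter_helper(void *ctx, const char *name, void *cb_data)
{
	struct sketch_adapter *a = cb_data;

	(void)ctx;	/* the narrow callback has no use for the context */
	return a->fn(name, a->cb_data);
}

/* Call fn for each name; stop at the first non-zero return value. */
static int sketch_for_each_ctx(void *ctx, const char **names, size_t nr,
			       sketch_ctx_name_fn fn, void *cb_data)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		int ret = fn(ctx, names[i], cb_data);

		if (ret)
			return ret;
	}
	return 0;
}

/* Public entry point for narrow callbacks, built on the wide iterator. */
int sketch_for_each(const char **names, size_t nr,
		    sketch_name_fn fn, void *cb_data)
{
	struct sketch_adapter a = { fn, cb_data };

	return sketch_for_each_ctx(NULL, names, nr, sketch_adapter_helper, &a);
}

/* Example narrow callback: count names, abort with 7 on "stop". */
int sketch_count_until_stop(const char *name, void *cb_data)
{
	int *count = cb_data;

	if (!strcmp(name, "stop"))
		return 7;
	(*count)++;
	return 0;
}
```

The adapter struct lives on the caller's stack for the duration of the iteration, so no allocation is needed and the narrow callback's own cb_data is passed through untouched.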
|
|
|
|
|
2017-03-26 10:42:34 +08:00
|
|
|
static int do_for_each_ref(struct ref_store *refs, const char *prefix,
|
2023-07-11 05:12:22 +08:00
|
|
|
const char **exclude_patterns,
|
2021-09-25 02:39:44 +08:00
|
|
|
each_ref_fn fn, int trim,
|
|
|
|
enum do_for_each_ref_flags flags, void *cb_data)
|
2016-06-18 12:15:16 +08:00
|
|
|
{
|
|
|
|
struct ref_iterator *iter;
|
2018-08-21 02:24:16 +08:00
|
|
|
struct do_for_each_ref_help hp = { fn, cb_data };
|
2016-06-18 12:15:16 +08:00

2016-09-05 00:08:11 +08:00
	if (!refs)
		return 0;

2023-07-11 05:12:22 +08:00
	iter = refs_ref_iterator_begin(refs, prefix, exclude_patterns, trim,
				       flags);
2016-06-18 12:15:16 +08:00

2018-08-21 02:24:16 +08:00
	return do_for_each_repo_ref_iterator(the_repository, iter,
					do_for_each_ref_helper, &hp);
2016-06-18 12:15:16 +08:00
}
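The commit message above says that peel_ref() takes its fast path only when the refname addresses match, not merely when the refname strings are equal. A minimal sketch of that check follows. The names `toy_iter`, `toy_current_iter`, and `toy_peel_fast()` are hypothetical stand-ins rather than Git's own types, and the integer `peeled` field stands in for a cached peeled object ID.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a ref iterator positioned on one reference. */
struct toy_iter {
	const char *refname; /* buffer owned by the iterator */
	int peeled;          /* stand-in for a cached peeled value */
};

/* Set by the iteration driver while a callback runs; NULL otherwise. */
const struct toy_iter *toy_current_iter;

/*
 * Return 1 and fill *peeled if the fast path applies, 0 if the caller
 * has to fall back to a slow lookup by name.
 */
int toy_peel_fast(const char *refname, int *peeled)
{
	assert(refname && peeled);

	/* Pointer comparison on purpose: equal strings are not enough. */
	if (toy_current_iter && toy_current_iter->refname == refname) {
		*peeled = toy_current_iter->peeled;
		return 1;
	}
	return 0;
}
```

A caller that passes a copy of the name, or a name obtained from a different iterator, simply misses the fast path and pays for the slow lookup; it never receives a value cached for some other reference.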

2017-03-26 10:42:34 +08:00
int refs_for_each_ref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
{
2023-07-11 05:12:22 +08:00
	return do_for_each_ref(refs, "", NULL, fn, 0, 0, cb_data);
2017-03-26 10:42:34 +08:00
}

int refs_for_each_ref_in(struct ref_store *refs, const char *prefix,
			 each_ref_fn fn, void *cb_data)
{
2023-07-11 05:12:22 +08:00
	return do_for_each_ref(refs, prefix, NULL, fn, strlen(prefix), 0, cb_data);
2016-04-08 03:02:49 +08:00
}

2017-08-23 20:36:56 +08:00
int refs_for_each_fullref_in(struct ref_store *refs, const char *prefix,
2023-07-11 05:12:22 +08:00
			     const char **exclude_patterns,
2021-09-25 02:48:48 +08:00
			     each_ref_fn fn, void *cb_data)
2017-06-18 21:39:41 +08:00
{
2023-07-11 05:12:22 +08:00
	return do_for_each_ref(refs, prefix, exclude_patterns, fn, 0, 0, cb_data);
2017-06-18 21:39:41 +08:00
}

2018-08-21 02:24:19 +08:00
int for_each_replace_ref(struct repository *r, each_repo_ref_fn fn, void *cb_data)
2016-04-08 03:02:49 +08:00
{
2022-08-06 01:58:37 +08:00
	const char *git_replace_ref_base = ref_namespace[NAMESPACE_REPLACE].ref;
2018-08-21 02:24:19 +08:00
	return do_for_each_repo_ref(r, git_replace_ref_base, fn,
				    strlen(git_replace_ref_base),
				    DO_FOR_EACH_INCLUDE_BROKEN, cb_data);
2016-04-08 03:02:49 +08:00
}

2024-05-07 15:11:39 +08:00
int refs_for_each_namespaced_ref(struct ref_store *refs,
				 const char **exclude_patterns,
				 each_ref_fn fn, void *cb_data)
2016-04-08 03:02:49 +08:00
{
	struct strbuf buf = STRBUF_INIT;
	int ret;
	strbuf_addf(&buf, "%srefs/", get_git_namespace());
2024-05-07 15:11:39 +08:00
	ret = do_for_each_ref(refs, buf.buf, exclude_patterns, fn, 0, 0, cb_data);
2016-04-08 03:02:49 +08:00
	strbuf_release(&buf);
	return ret;
}

2017-03-26 10:42:34 +08:00
int refs_for_each_rawref(struct ref_store *refs, each_ref_fn fn, void *cb_data)
2016-04-08 03:02:49 +08:00
{
2023-07-11 05:12:22 +08:00
	return do_for_each_ref(refs, "", NULL, fn, 0,
2016-04-08 03:02:49 +08:00
			       DO_FOR_EACH_INCLUDE_BROKEN, cb_data);
}
2016-04-08 03:03:10 +08:00

2024-02-23 18:01:10 +08:00
int refs_for_each_include_root_refs(struct ref_store *refs, each_ref_fn fn,
				    void *cb_data)
{
	return do_for_each_ref(refs, "", NULL, fn, 0,
			       DO_FOR_EACH_INCLUDE_ROOT_REFS, cb_data);
2017-03-26 10:42:34 +08:00
}

2021-01-21 00:04:21 +08:00
static int qsort_strcmp(const void *va, const void *vb)
{
	const char *a = *(const char **)va;
	const char *b = *(const char **)vb;

	return strcmp(a, b);
}

static void find_longest_prefixes_1(struct string_list *out,
				  struct strbuf *prefix,
				  const char **patterns, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		char c = patterns[i][prefix->len];
		if (!c || is_glob_special(c)) {
			string_list_append(out, prefix->buf);
			return;
		}
	}

	i = 0;
	while (i < nr) {
		size_t end;

		/*
		 * Set "end" to the index of the element _after_ the last one
		 * in our group.
		 */
		for (end = i + 1; end < nr; end++) {
			if (patterns[i][prefix->len] != patterns[end][prefix->len])
				break;
		}

		strbuf_addch(prefix, patterns[i][prefix->len]);
		find_longest_prefixes_1(out, prefix, patterns + i, end - i);
		strbuf_setlen(prefix, prefix->len - 1);

		i = end;
	}
}

static void find_longest_prefixes(struct string_list *out,
				  const char **patterns)
{
	struct strvec sorted = STRVEC_INIT;
	struct strbuf prefix = STRBUF_INIT;

	strvec_pushv(&sorted, patterns);
	QSORT(sorted.v, sorted.nr, qsort_strcmp);

	find_longest_prefixes_1(out, &prefix, sorted.v, sorted.nr);

	strvec_clear(&sorted);
	strbuf_release(&prefix);
}
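The prefix search above sorts the patterns, then recursively splits each run of patterns by its next byte until some pattern ends or reaches a glob character. The sketch below re-creates that idea with only the C standard library so it can be tried outside Git. The fixed-size `prefix_list`, the `toy_` names, and the glob character set in `toy_is_glob_special()` are assumptions made for illustration; patterns are assumed to be shorter than `MAX_PREFIX_LEN`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_PREFIXES 16
#define MAX_PREFIX_LEN 256

struct prefix_list {
	char items[MAX_PREFIXES][MAX_PREFIX_LEN];
	size_t nr;
};

/* Approximation of Git's is_glob_special(); the exact set is an assumption. */
static int toy_is_glob_special(char c)
{
	return c == '*' || c == '?' || c == '[' || c == '\\';
}

static int cmp_strings(const void *va, const void *vb)
{
	return strcmp(*(const char *const *)va, *(const char *const *)vb);
}

static void toy_prefixes_1(struct prefix_list *out, char *prefix, size_t len,
			   const char **patterns, size_t nr)
{
	size_t i;

	/* Stop as soon as any pattern in the group ends or starts globbing. */
	for (i = 0; i < nr; i++) {
		char c = patterns[i][len];
		if (!c || toy_is_glob_special(c)) {
			assert(out->nr < MAX_PREFIXES);
			memcpy(out->items[out->nr], prefix, len);
			out->items[out->nr][len] = '\0';
			out->nr++;
			return;
		}
	}

	/* Split the sorted group by the next byte and recurse into each part. */
	i = 0;
	while (i < nr) {
		size_t end = i + 1;

		while (end < nr && patterns[end][len] == patterns[i][len])
			end++;
		assert(len + 1 < MAX_PREFIX_LEN);
		prefix[len] = patterns[i][len];
		toy_prefixes_1(out, prefix, len + 1, patterns + i, end - i);
		i = end;
	}
}

/* Collect the longest literal prefixes of the patterns into "out". */
void toy_longest_prefixes(struct prefix_list *out, const char **patterns,
			  size_t nr)
{
	char prefix[MAX_PREFIX_LEN];
	const char **sorted;

	out->nr = 0;
	if (!nr)
		return;
	sorted = malloc(nr * sizeof(*sorted));
	assert(sorted);
	memcpy(sorted, patterns, nr * sizeof(*sorted));
	qsort(sorted, nr, sizeof(*sorted), cmp_strings);
	toy_prefixes_1(out, prefix, 0, sorted, nr);
	free(sorted);
}
```

Given the patterns `refs/tags/v*`, `refs/heads/maint`, and `refs/heads/main`, the sketch yields two prefixes: `refs/heads/main` (which also covers `refs/heads/maint`) and `refs/tags/v`. Sorting first is what makes patterns with a common prefix adjacent, so each group can be handled as a contiguous slice.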

2022-12-13 19:11:10 +08:00
int refs_for_each_fullref_in_prefixes(struct ref_store *ref_store,
				      const char *namespace,
				      const char **patterns,
2023-07-11 05:12:22 +08:00
				      const char **exclude_patterns,
2022-12-13 19:11:10 +08:00
				      each_ref_fn fn, void *cb_data)
2021-01-21 00:04:21 +08:00
{
	struct string_list prefixes = STRING_LIST_INIT_DUP;
	struct string_list_item *prefix;
	struct strbuf buf = STRBUF_INIT;
	int ret = 0, namespace_len;

	find_longest_prefixes(&prefixes, patterns);

	if (namespace)
		strbuf_addstr(&buf, namespace);
	namespace_len = buf.len;

	for_each_string_list_item(prefix, &prefixes) {
		strbuf_addstr(&buf, prefix->string);
2023-07-11 05:12:22 +08:00
		ret = refs_for_each_fullref_in(ref_store, buf.buf,
					       exclude_patterns, fn, cb_data);
2021-01-21 00:04:21 +08:00
		if (ret)
			break;
		strbuf_setlen(&buf, namespace_len);
	}

	string_list_clear(&prefixes, 0);
	strbuf_release(&buf);
	return ret;
}

2020-08-19 22:27:58 +08:00
static int refs_read_special_head(struct ref_store *ref_store,
				  const char *refname, struct object_id *oid,
2021-10-16 17:39:10 +08:00
				  struct strbuf *referent, unsigned int *type,
				  int *failure_errno)
2020-08-19 22:27:58 +08:00
{
	struct strbuf full_path = STRBUF_INIT;
	struct strbuf content = STRBUF_INIT;
	int result = -1;
	strbuf_addf(&full_path, "%s/%s", ref_store->gitdir, refname);

2023-12-14 21:37:02 +08:00
	if (strbuf_read_file(&content, full_path.buf, 0) < 0) {
		*failure_errno = errno;
2020-08-19 22:27:58 +08:00
		goto done;
2023-12-14 21:37:02 +08:00
	}
2020-08-19 22:27:58 +08:00

2021-10-16 17:39:10 +08:00
	result = parse_loose_ref_contents(content.buf, oid, referent, type,
					  failure_errno);
2020-08-19 22:27:58 +08:00

done:
	strbuf_release(&full_path);
	strbuf_release(&content);
	return result;
}

2021-10-16 17:39:09 +08:00
int refs_read_raw_ref(struct ref_store *ref_store, const char *refname,
		      struct object_id *oid, struct strbuf *referent,
		      unsigned int *type, int *failure_errno)
2017-03-21 00:33:07 +08:00
{
2021-10-16 17:39:09 +08:00
	assert(failure_errno);
2024-05-15 14:50:47 +08:00
	if (is_pseudo_ref(refname))
2020-08-19 22:27:58 +08:00
		return refs_read_special_head(ref_store, refname, oid, referent,
2021-10-16 17:39:10 +08:00
					      type, failure_errno);
2020-08-19 22:27:58 +08:00

	return ref_store->be->read_raw_ref(ref_store, refname, oid, referent,
2021-10-16 17:39:09 +08:00
					   type, failure_errno);
2017-03-21 00:33:07 +08:00
}

refs: add ability for backends to special-case reading of symbolic refs
Reading of symbolic and non-symbolic references is currently treated the
same in reference backends: we always call `refs_read_raw_ref()` and
then decide based on the returned flags what type it is. This has one
downside though: a backend may treat symbolic references differently
from normal references. The packed-refs
backend for example doesn't even know about symbolic references, and as
a result it is pointless to even ask it for one.
There are cases where we really only care about whether a reference is
symbolic or not, but don't care about whether it exists at all or may be
a non-symbolic reference. But it is not possible to optimize for this
case right now, and as a consequence we will always first check for a
loose reference to exist, and if it doesn't, we'll query the packed-refs
backend for a known-to-not-be-symbolic reference. This is inefficient
and requires us to search all packed references even though we know to
not care for the result at all.
Introduce a new function `refs_read_symbolic_ref()` which allows us to
fix this case. This function will only ever return symbolic references
and can thus optimize for the scenario laid out above. By default, if
the backend doesn't provide an implementation for it, we just use the
old code path and fall back to `read_raw_ref()`. But in case the backend
provides its own, more efficient implementation, we will use that one
instead.
Note that this function is explicitly designed to not distinguish
between missing references and non-symbolic references. If it did, we'd
be forced to always search the packed-refs backend to see whether the
symbolic reference the user asked for really doesn't exist, or if it
exists as a non-symbolic reference.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-01 17:33:46 +08:00
int refs_read_symbolic_ref(struct ref_store *ref_store, const char *refname,
			   struct strbuf *referent)
{
2022-03-18 01:27:19 +08:00
	return ref_store->be->read_symbolic_ref(ref_store, refname, referent);
2022-03-01 17:33:46 +08:00
}
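The commit message for refs_read_symbolic_ref() describes an optional backend method with a fallback to the generic raw read, while the function as it appears here dispatches straight to the backend. The sketch below illustrates the fallback pattern from the message only. `toy_backend`, `toy_read_symbolic_ref()`, `toy_read_raw()`, and `TOY_ISSYMREF` are hypothetical names and do not mirror Git's `struct ref_storage_be`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define TOY_ISSYMREF 0x01

struct toy_backend {
	/* Required: read any ref, reporting its type in *type. */
	int (*read_raw)(const char *refname, char *target, size_t len,
			unsigned int *type);
	/* Optional: may be NULL when there is no cheaper way. */
	int (*read_symbolic)(const char *refname, char *target, size_t len);
};

int toy_read_symbolic_ref(const struct toy_backend *be, const char *refname,
			  char *target, size_t len)
{
	unsigned int type = 0;

	assert(be && be->read_raw);
	if (be->read_symbolic)
		return be->read_symbolic(refname, target, len);

	/* Fallback: a generic read, then keep only symbolic results. */
	if (be->read_raw(refname, target, len, &type))
		return -1;
	return (type & TOY_ISSYMREF) ? 0 : -1;
}

/* Toy backend: HEAD is a symref, refs/heads/main is a regular ref. */
int toy_read_raw(const char *refname, char *target, size_t len,
		 unsigned int *type)
{
	if (!strcmp(refname, "HEAD")) {
		snprintf(target, len, "%s", "refs/heads/main");
		*type = TOY_ISSYMREF;
		return 0;
	}
	if (!strcmp(refname, "refs/heads/main")) {
		target[0] = '\0';
		*type = 0;
		return 0;
	}
	return -1;
}

/* No specialised reader, so the fallback path is taken. */
const struct toy_backend toy_files_like = { toy_read_raw, NULL };
```

As in the commit message, a missing reference and a non-symbolic reference both produce the same failure value, so a specialised backend never has to search for a regular ref just to tell the two cases apart.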

2017-03-26 10:42:34 +08:00
const char *refs_resolve_ref_unsafe(struct ref_store *refs,
2017-02-10 04:53:52 +08:00
				    const char *refname,
				    int resolve_flags,
2021-10-16 17:39:08 +08:00
				    struct object_id *oid,
2022-01-26 22:37:01 +08:00
				    int *flags)
2016-04-08 03:03:10 +08:00
{
	static struct strbuf sb_refname = STRBUF_INIT;
2017-09-23 17:41:45 +08:00
	struct object_id unused_oid;
2016-04-08 03:03:10 +08:00
	int unused_flags;
	int symref_count;

refs: convert resolve_ref_unsafe to struct object_id
Convert resolve_ref_unsafe to take a pointer to struct object_id by
converting one remaining caller to use struct object_id, removing the
temporary NULL pointer check in expand_ref, converting the declaration
and definition, and applying the following semantic patch:
@@
expression E1, E2, E3, E4;
@@
- resolve_ref_unsafe(E1, E2, E3.hash, E4)
+ resolve_ref_unsafe(E1, E2, &E3, E4)
@@
expression E1, E2, E3, E4;
@@
- resolve_ref_unsafe(E1, E2, E3->hash, E4)
+ resolve_ref_unsafe(E1, E2, E3, E4)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-16 06:07:09 +08:00
	if (!oid)
		oid = &unused_oid;
2016-04-08 03:03:10 +08:00
	if (!flags)
		flags = &unused_flags;

	*flags = 0;

	if (check_refname_format(refname, REFNAME_ALLOW_ONELEVEL)) {
		if (!(resolve_flags & RESOLVE_REF_ALLOW_BAD_NAME) ||
2022-01-26 22:37:01 +08:00
		    !refname_is_safe(refname))
2016-04-08 03:03:10 +08:00
			return NULL;

		/*
2023-03-28 21:58:57 +08:00
		 * repo_dwim_ref() uses REF_ISBROKEN to distinguish between
2016-04-08 03:03:10 +08:00
		 * missing refs and refs that were present but invalid,
		 * to complain about the latter to stderr.
		 *
		 * We don't know whether the ref exists, so don't set
		 * REF_ISBROKEN yet.
		 */
		*flags |= REF_BAD_NAME;
	}

	for (symref_count = 0; symref_count < SYMREF_MAXDEPTH; symref_count++) {
		unsigned int read_flags = 0;
2022-01-26 22:37:01 +08:00
		int failure_errno;
2016-04-08 03:03:10 +08:00

2021-10-16 17:39:09 +08:00
		if (refs_read_raw_ref(refs, refname, oid, &sb_refname,
2022-01-26 22:37:01 +08:00
				      &read_flags, &failure_errno)) {
2016-04-08 03:03:10 +08:00
			*flags |= read_flags;
refs_resolve_ref_unsafe: handle d/f conflicts for writes
If our call to refs_read_raw_ref() fails, we check errno to
see if the ref is simply missing, or if we encountered a
more serious error. If it's just missing, then in "write"
mode (i.e., when RESOLVE_REF_READING is not set), this is
perfectly fine.
However, checking for ENOENT isn't sufficient to catch all
missing-ref cases. In the filesystem backend, we may also
see EISDIR when we try to resolve "a" and "a/b" exists.
Likewise, we may see ENOTDIR if we try to resolve "a/b" and
"a" exists. In both of those cases, we know that our
resolved ref doesn't exist, but we return an error (rather
than reporting the refname and returning a null sha1).
This has been broken for a long time, but nobody really
noticed because the next step after resolving without the
READING flag is usually to lock the ref and write it. But in
both of those cases, the write will fail with the same
errno due to the directory/file conflict.
There are two cases where we can notice this, though:
1. If we try to write "a" and there's a leftover directory
already at "a", even though there is no ref "a/b". The
actual write is smart enough to move the empty "a" out
of the way.
This is reasonably rare, if only because the writing
code has to do an independent resolution before trying
its write (because the actual update_ref() code handles
this case fine). The notes-merge code does this, and
before the fix in the prior commit t3308 erroneously
expected this case to fail.
2. When resolving symbolic refs, we typically do not use
the READING flag because we want to resolve even
symrefs that point to unborn refs, even if those unborn
refs could not actually be written because of d/f
conflicts with existing refs.
You can see this by asking "git symbolic-ref" to report
the target of a symref pointing past a d/f conflict.
We can fix the problem by recognizing the other "missing"
errnos and treating them like ENOENT. This should be safe to
do even for callers who are then going to actually write the
ref, because the actual writing process will fail if the d/f
conflict is a real one (and t1404 checks these cases).
Arguably this should be the responsibility of the
files-backend to normalize all "missing ref" errors into
ENOENT (since something like EISDIR may not be meaningful at
all to a database backend). However other callers of
refs_read_raw_ref() may actually care about the distinction;
putting this into resolve_ref() is the minimal fix for now.
The new tests in t1401 use git-symbolic-ref, which is the
most direct way to check the resolution by itself.
Interestingly we actually had a test that set up this case
already, but we only used it to verify that the funny state
could be overwritten, not that it could be resolved.
We also add a new test in t3200, as "branch -m" was the
original motivation for looking into this. What happens is
this:
0. HEAD is pointing to branch "a"
1. The user asks to rename "a" to "a/b".
2. We create "a/b" and delete "a".
3. We then try to update any worktree HEADs that point to
the renamed ref (including the main repo HEAD). To do
that, we have to resolve each HEAD. But now our HEAD is
pointing at "a", and we get EISDIR due to the loose
"a/b". As a result, we think there is no HEAD, and we
do not update it. It now points to the bogus "a".
Interestingly this case used to work, but only accidentally.
Before 31824d180d (branch: fix branch renaming not updating
HEADs correctly, 2017-08-24), we'd update any HEAD which we
couldn't resolve. That was wrong, but it papered over the
fact that we were incorrectly failing to resolve HEAD.
So while the bug demonstrated by the git-symbolic-ref is
quite old, the regression to "branch -m" is recent.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
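The fix described in this commit message boils down to treating three errno values as "the ref is missing" when resolving outside reading mode. A minimal sketch of that decision follows. `toy_errno_is_missing_ref()` and `toy_after_failed_read()` are hypothetical helpers written for illustration; in refs.c the check is written inline in the resolution loop.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical helper: does this errno just mean "no such ref"? */
int toy_errno_is_missing_ref(int err)
{
	switch (err) {
	case ENOENT:  /* nothing at that path */
	case EISDIR:  /* resolving "a" while "a/b" exists */
	case ENOTDIR: /* resolving "a/b" while "a" exists */
		return 1;
	default:
		return 0;
	}
}

/*
 * What to hand back after a failed raw read: NULL for a hard failure,
 * or the refname itself when the ref is merely unborn.
 */
const char *toy_after_failed_read(const char *refname, int err, int reading)
{
	assert(refname);
	if (reading)
		return NULL; /* in reading mode the ref must exist */
	if (!toy_errno_is_missing_ref(err))
		return NULL; /* e.g. EACCES: a real error */
	return refname;
}
```

Keeping the errno classification in the caller, rather than normalising everything to ENOENT inside the backend, matches the message's point that other callers of refs_read_raw_ref() may still care about the distinction.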
2017-10-06 22:42:17 +08:00

			/* In reading mode, refs must eventually resolve */
			if (resolve_flags & RESOLVE_REF_READING)
				return NULL;

			/*
			 * Otherwise a missing ref is OK. But the files backend
			 * may show errors besides ENOENT if there are
			 * similarly-named refs.
			 */
2022-01-26 22:37:01 +08:00
			if (failure_errno != ENOENT &&
			    failure_errno != EISDIR &&
			    failure_errno != ENOTDIR)
2016-04-08 03:03:10 +08:00
				return NULL;
2017-10-06 22:42:17 +08:00

2017-10-16 06:07:09 +08:00
			oidclr(oid);
2016-04-08 03:03:10 +08:00
			if (*flags & REF_BAD_NAME)
				*flags |= REF_ISBROKEN;
			return refname;
		}

		*flags |= read_flags;

		if (!(read_flags & REF_ISSYMREF)) {
			if (*flags & REF_BAD_NAME) {
2017-10-16 06:07:09 +08:00
				oidclr(oid);
2016-04-08 03:03:10 +08:00
				*flags |= REF_ISBROKEN;
			}
			return refname;
		}

		refname = sb_refname.buf;
		if (resolve_flags & RESOLVE_REF_NO_RECURSE) {
2017-10-16 06:07:09 +08:00
			oidclr(oid);
2016-04-08 03:03:10 +08:00
			return refname;
		}
		if (check_refname_format(refname, REFNAME_ALLOW_ONELEVEL)) {
			if (!(resolve_flags & RESOLVE_REF_ALLOW_BAD_NAME) ||
2022-01-26 22:37:01 +08:00
			    !refname_is_safe(refname))
2016-04-08 03:03:10 +08:00
				return NULL;

			*flags |= REF_ISBROKEN | REF_BAD_NAME;
		}
	}

	return NULL;
}
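The loop in refs_resolve_ref_unsafe() follows symbolic refs for a bounded number of steps and gives up by returning NULL once the bound is reached, which also stops cycles. The sketch below shows that control flow over a small in-memory table. `toy_ref`, `toy_resolve()`, and the value of `TOY_MAXDEPTH` are assumptions for illustration; refname validation, object IDs, and flags are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAXDEPTH 5 /* assumed bound, playing the role of SYMREF_MAXDEPTH */

struct toy_ref {
	const char *name;
	const char *symref_target; /* NULL for a regular ref */
};

/*
 * Follow symrefs starting at refname. Returns the final refname, or NULL
 * if the chain is too deep or, in reading mode, ends at a missing ref.
 */
const char *toy_resolve(const struct toy_ref *refs, size_t nr,
			const char *refname, int reading)
{
	int depth;

	assert(refname);
	for (depth = 0; depth < TOY_MAXDEPTH; depth++) {
		const struct toy_ref *found = NULL;
		size_t i;

		for (i = 0; i < nr; i++) {
			if (!strcmp(refs[i].name, refname)) {
				found = &refs[i];
				break;
			}
		}
		if (!found)
			return reading ? NULL : refname; /* unborn ref */
		if (!found->symref_target)
			return refname; /* reached a regular ref */
		refname = found->symref_target;
	}
	return NULL; /* too many levels, possibly a cycle */
}
```

With `reading` set to 0, a symref that points at a ref that does not exist yet still resolves to the target name, which is the behaviour the non-reading branch in the real function preserves for unborn branches.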
2016-09-05 00:08:11 +08:00

2016-09-05 00:08:41 +08:00
/* backend functions */
2024-01-08 18:05:26 +08:00
int refs_init_db(struct ref_store *refs, int flags, struct strbuf *err)
2016-09-05 00:08:41 +08:00
{
2024-01-08 18:05:26 +08:00
	return refs->be->init_db(refs, flags, err);
2016-09-05 00:08:41 +08:00
}

2016-09-05 00:08:24 +08:00
int resolve_gitlink_ref(const char *submodule, const char *refname,
refs: convert resolve_gitlink_ref to struct object_id
Convert the declaration and definition of resolve_gitlink_ref to use
struct object_id and apply the following semantic patch:
@@
expression E1, E2, E3;
@@
- resolve_gitlink_ref(E1, E2, E3.hash)
+ resolve_gitlink_ref(E1, E2, &E3)
@@
expression E1, E2, E3;
@@
- resolve_gitlink_ref(E1, E2, E3->hash)
+ resolve_gitlink_ref(E1, E2, E3)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-16 06:07:07 +08:00
|
|
|
struct object_id *oid)
|
2016-09-05 00:08:22 +08:00
|
|
|
{
|
|
|
|
struct ref_store *refs;
|
|
|
|
int flags;
|
|
|
|
|
2017-08-23 20:36:54 +08:00
|
|
|
refs = get_submodule_ref_store(submodule);
|
2016-09-05 00:08:23 +08:00
|
|
|
|
2016-09-05 00:08:22 +08:00
|
|
|
if (!refs)
|
|
|
|
return -1;
|
|
|
|
|
2022-01-26 22:37:01 +08:00
|
|
|
if (!refs_resolve_ref_unsafe(refs, refname, 0, oid, &flags) ||
|
|
|
|
is_null_oid(oid))
|
2016-09-05 00:08:22 +08:00
|
|
|
return -1;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2017-04-04 18:21:20 +08:00
|
|
|
struct ref_store_hash_entry
|
2017-02-10 19:16:15 +08:00
|
|
|
{
|
2019-10-07 07:30:43 +08:00
|
|
|
struct hashmap_entry ent;
|
2017-02-10 19:16:15 +08:00
|
|
|
|
|
|
|
struct ref_store *refs;
|
|
|
|
|
2017-04-04 18:21:20 +08:00
|
|
|
/* NUL-terminated identifier of the ref store: */
|
|
|
|
char name[FLEX_ARRAY];
|
2017-02-10 19:16:15 +08:00
|
|
|
};
|
|
|
|
|
2022-08-26 01:09:48 +08:00
|
|
|
static int ref_store_hash_cmp(const void *cmp_data UNUSED,
|
2019-10-07 07:30:37 +08:00
|
|
|
const struct hashmap_entry *eptr,
|
|
|
|
const struct hashmap_entry *entry_or_key,
|
2017-02-10 19:16:15 +08:00
|
|
|
const void *keydata)
|
|
|
|
{
|
2019-10-07 07:30:37 +08:00
|
|
|
const struct ref_store_hash_entry *e1, *e2;
|
|
|
|
const char *name;
|
|
|
|
|
|
|
|
e1 = container_of(eptr, const struct ref_store_hash_entry, ent);
|
|
|
|
e2 = container_of(entry_or_key, const struct ref_store_hash_entry, ent);
|
|
|
|
name = keydata ? keydata : e2->name;
|
2017-02-10 19:16:15 +08:00
|
|
|
|
2017-04-04 18:21:20 +08:00
|
|
|
return strcmp(e1->name, name);
|
2017-02-10 19:16:15 +08:00
|
|
|
}
|
|
|
|
|
2017-04-04 18:21:20 +08:00
|
|
|
static struct ref_store_hash_entry *alloc_ref_store_hash_entry(
|
|
|
|
const char *name, struct ref_store *refs)
|
2017-02-10 19:16:15 +08:00
|
|
|
{
|
2017-04-04 18:21:20 +08:00
|
|
|
struct ref_store_hash_entry *entry;
|
2017-02-10 19:16:15 +08:00
|
|
|
|
2017-04-04 18:21:20 +08:00
|
|
|
FLEX_ALLOC_STR(entry, name, name);
|
2019-10-07 07:30:27 +08:00
|
|
|
hashmap_entry_init(&entry->ent, strhash(name));
|
2017-02-10 19:16:15 +08:00
|
|
|
entry->refs = refs;
|
|
|
|
return entry;
|
|
|
|
}
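The allocation pattern used by alloc_ref_store_hash_entry() above (one block holding both the fixed fields and the trailing NUL-terminated name) can be sketched outside of Git roughly as follows. This is a minimal illustration, not Git's FLEX_ALLOC_STR macro: `struct named_entry`, `alloc_named_entry()` and the `payload` field are hypothetical names, and returning NULL on allocation failure is an assumption of this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct named_entry {
	void *payload; /* stands in for "struct ref_store *refs" */
	char name[];   /* C99 flexible array member, NUL-terminated */
};

/*
 * Allocate the struct and a copy of `name` in a single block.
 * Returns NULL if malloc() fails (an assumption of this sketch).
 */
struct named_entry *alloc_named_entry(const char *name, void *payload)
{
	size_t len = strlen(name);
	struct named_entry *e = malloc(sizeof(*e) + len + 1);

	if (!e)
		return NULL;
	e->payload = payload;
	memcpy(e->name, name, len + 1); /* copies the trailing NUL too */
	return e;
}
```

A single free() on the returned pointer releases both the struct and the name, which is the main convenience of keeping them in one allocation.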
|
|
|
|
|
|
|
|
/* A hashmap of ref_stores, stored by submodule name: */
|
|
|
|
static struct hashmap submodule_ref_stores;
|
2016-09-05 00:08:11 +08:00
|
|
|
|
2017-04-24 18:01:22 +08:00
|
|
|
/* A hashmap of ref_stores, stored by worktree id: */
|
|
|
|
static struct hashmap worktree_ref_stores;
|
|
|
|
|
2017-02-10 19:16:12 +08:00
|
|
|
/*
|
2017-04-04 18:21:20 +08:00
|
|
|
* Look up a ref store by name. If that ref_store hasn't been
|
|
|
|
* registered yet, return NULL.
|
2017-02-10 19:16:12 +08:00
|
|
|
*/
|
2017-04-04 18:21:20 +08:00
|
|
|
static struct ref_store *lookup_ref_store_map(struct hashmap *map,
|
|
|
|
const char *name)
|
2016-09-05 00:08:11 +08:00
|
|
|
{
|
2017-04-04 18:21:20 +08:00
|
|
|
struct ref_store_hash_entry *entry;
|
2019-10-07 07:30:36 +08:00
|
|
|
unsigned int hash;
|
2016-09-05 00:08:11 +08:00
|
|
|
|
2017-04-04 18:21:20 +08:00
|
|
|
if (!map->tablesize)
|
2017-02-10 19:16:15 +08:00
|
|
|
/* It's initialized on demand in register_ref_store(). */
|
|
|
|
return NULL;
|
2017-02-10 19:16:11 +08:00
|
|
|
|
2019-10-07 07:30:36 +08:00
|
|
|
hash = strhash(name);
|
|
|
|
entry = hashmap_get_entry_from_hash(map, hash, name,
|
|
|
|
struct ref_store_hash_entry, ent);
|
2017-02-10 19:16:15 +08:00
|
|
|
return entry ? entry->refs : NULL;
|
2016-09-05 00:08:11 +08:00
|
|
|
}
|
|
|
|
|
2017-02-10 19:16:12 +08:00
|
|
|
/*
|
|
|
|
* Create, record, and return a ref_store instance for the specified
|
2017-03-26 10:42:31 +08:00
|
|
|
* gitdir.
|
2017-02-10 19:16:12 +08:00
|
|
|
*/
|
2021-10-09 05:08:14 +08:00
|
|
|
static struct ref_store *ref_store_init(struct repository *repo,
|
|
|
|
const char *gitdir,
|
2017-03-26 10:42:32 +08:00
|
|
|
unsigned int flags)
|
2016-09-05 00:08:11 +08:00
|
|
|
{
|
2023-12-29 15:26:39 +08:00
|
|
|
const struct ref_storage_be *be;
|
2017-02-10 19:16:14 +08:00
|
|
|
struct ref_store *refs;
|
2016-09-05 00:08:11 +08:00
|
|
|
|
2023-12-29 15:26:39 +08:00
|
|
|
be = find_ref_storage_backend(repo->ref_storage_format);
|
2016-09-05 00:08:11 +08:00
|
|
|
if (!be)
|
2023-12-29 15:26:34 +08:00
|
|
|
BUG("reference backend is unknown");
|
2016-09-05 00:08:11 +08:00
|
|
|
|
2021-10-09 05:08:14 +08:00
|
|
|
refs = be->init(repo, gitdir, flags);
|
2017-02-10 19:16:14 +08:00
|
|
|
return refs;
|
2016-09-05 00:08:11 +08:00
|
|
|
}
|
|
|
|
|
2018-04-12 08:21:14 +08:00
|
|
|
struct ref_store *get_main_ref_store(struct repository *r)
|
2017-03-26 10:42:25 +08:00
|
|
|
{
|
repository: mark the "refs" pointer as private
The "refs" pointer in a struct repository starts life as NULL, but then
is lazily initialized when it is accessed via get_main_ref_store().
However, it's easy for calling code to forget this and access it
directly, leading to code which works _some_ of the time, but fails if
it is called before anybody else accesses the refs.
This was the cause of the bug fixed by 5ff4b920eb (sha1-name: do not
assume that the ref store is initialized, 2020-04-09). In order to
prevent similar bugs, let's more clearly mark the "refs" field as
private.
In addition to helping future code, the name change will help us audit
any existing direct uses. Besides get_main_ref_store() itself, it turns
out there is only one. But we know it's OK as it is on the line directly
after the fix from 5ff4b920eb, which will have initialized the pointer.
However it's still a good idea for it to model the proper use of the
accessing function, so we'll convert it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-04-10 11:04:11 +08:00
|
|
|
if (r->refs_private)
|
|
|
|
return r->refs_private;
|
2017-03-26 10:42:25 +08:00
|
|
|
|
2018-05-19 06:25:53 +08:00
|
|
|
if (!r->gitdir)
|
|
|
|
BUG("attempting to get main_ref_store outside of repository");
|
|
|
|
|
2021-10-09 05:08:14 +08:00
|
|
|
r->refs_private = ref_store_init(r, r->gitdir, REF_STORE_ALL_CAPS);
|
2020-09-09 18:15:08 +08:00
|
|
|
r->refs_private = maybe_debug_wrap_ref_store(r->gitdir, r->refs_private);
|
2020-04-10 11:04:11 +08:00
|
|
|
return r->refs_private;
|
2017-03-26 10:42:28 +08:00
|
|
|
}
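The lazy initialization behind an accessor, as in get_main_ref_store() above, might look like this when reduced to its core. Everything here is a hypothetical stand-in (`struct repo`, `struct store`, `get_store()`, the `stores_created` counter); the sketch returns NULL where the original calls BUG().

```c
#include <assert.h>
#include <stdlib.h>

struct store {
	int id;
};

struct repo {
	const char *gitdir;
	struct store *store_private; /* access via get_store() only */
};

int stores_created; /* counts initializations, for demonstration */

/* Return the repo's store, creating it on first access. */
struct store *get_store(struct repo *r)
{
	if (r->store_private)
		return r->store_private;
	if (!r->gitdir)
		return NULL; /* the original treats this as a BUG() */

	r->store_private = malloc(sizeof(*r->store_private));
	if (r->store_private)
		r->store_private->id = ++stores_created;
	return r->store_private;
}
```

Callers that always go through the accessor never observe the NULL starting state, which is the point of marking the field as private.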
|
|
|
|
|
|
|
|
/*
|
2017-04-04 18:21:20 +08:00
|
|
|
* Associate a ref store with a name. It is a fatal error to call this
|
|
|
|
* function twice for the same name.
|
2017-03-26 10:42:28 +08:00
|
|
|
*/
|
2017-04-04 18:21:20 +08:00
|
|
|
static void register_ref_store_map(struct hashmap *map,
|
|
|
|
const char *type,
|
|
|
|
struct ref_store *refs,
|
|
|
|
const char *name)
|
2017-03-26 10:42:28 +08:00
|
|
|
{
|
2019-10-07 07:30:32 +08:00
|
|
|
struct ref_store_hash_entry *entry;
|
|
|
|
|
2017-04-04 18:21:20 +08:00
|
|
|
if (!map->tablesize)
|
2017-07-01 03:14:05 +08:00
|
|
|
hashmap_init(map, ref_store_hash_cmp, NULL, 0);
|
2017-03-26 10:42:28 +08:00
|
|
|
|
2019-10-07 07:30:32 +08:00
|
|
|
entry = alloc_ref_store_hash_entry(name, refs);
|
|
|
|
if (hashmap_put(map, &entry->ent))
|
2018-05-02 17:38:39 +08:00
|
|
|
BUG("%s ref_store '%s' initialized twice", type, name);
|
2017-03-26 10:42:25 +08:00
|
|
|
}
|
|
|
|
|
2017-03-26 10:42:33 +08:00
|
|
|
struct ref_store *get_submodule_ref_store(const char *submodule)
|
2016-09-05 00:08:11 +08:00
|
|
|
{
|
2017-03-26 10:42:27 +08:00
|
|
|
struct strbuf submodule_sb = STRBUF_INIT;
|
2016-09-05 00:08:11 +08:00
|
|
|
struct ref_store *refs;
|
2017-08-23 20:36:54 +08:00
|
|
|
char *to_free = NULL;
|
|
|
|
size_t len;
|
2021-10-09 05:08:14 +08:00
|
|
|
struct repository *subrepo;
|
2016-09-05 00:08:11 +08:00
|
|
|
|
2017-08-23 20:37:03 +08:00
|
|
|
if (!submodule)
|
|
|
|
return NULL;
|
|
|
|
|
2017-08-23 20:37:04 +08:00
|
|
|
len = strlen(submodule);
|
|
|
|
while (len && is_dir_sep(submodule[len - 1]))
|
|
|
|
len--;
|
|
|
|
if (!len)
|
|
|
|
return NULL;
|
2016-09-05 00:08:11 +08:00
|
|
|
|
2017-08-23 20:36:54 +08:00
|
|
|
if (submodule[len])
|
|
|
|
/* We need to strip off one or more trailing slashes */
|
|
|
|
submodule = to_free = xmemdupz(submodule, len);
|
2016-09-05 00:08:11 +08:00
|
|
|
|
2017-04-04 18:21:20 +08:00
|
|
|
refs = lookup_ref_store_map(&submodule_ref_stores, submodule);
|
2017-03-26 10:42:27 +08:00
|
|
|
if (refs)
|
2017-08-23 20:36:53 +08:00
|
|
|
goto done;
|
2016-09-05 00:08:11 +08:00
|
|
|
|
2017-03-26 10:42:27 +08:00
|
|
|
strbuf_addstr(&submodule_sb, submodule);
|
2017-08-23 20:36:53 +08:00
|
|
|
if (!is_nonbare_repository_dir(&submodule_sb))
|
|
|
|
goto done;
|
2016-09-05 00:08:11 +08:00
|
|
|
|
2017-08-23 20:36:53 +08:00
|
|
|
if (submodule_to_gitdir(&submodule_sb, submodule))
|
|
|
|
goto done;
|
2016-09-05 00:08:11 +08:00
|
|
|
|
2021-10-09 05:08:14 +08:00
|
|
|
subrepo = xmalloc(sizeof(*subrepo));
|
|
|
|
/*
|
|
|
|
* NEEDSWORK: Make get_submodule_ref_store() work with arbitrary
|
|
|
|
* superprojects other than the_repository. This probably should be
|
|
|
|
* done by making it take a struct repository * parameter instead of a
|
|
|
|
* submodule path.
|
|
|
|
*/
|
|
|
|
if (repo_submodule_init(subrepo, the_repository, submodule,
|
|
|
|
null_oid())) {
|
|
|
|
free(subrepo);
|
|
|
|
goto done;
|
|
|
|
}
|
|
|
|
refs = ref_store_init(subrepo, submodule_sb.buf,
|
2017-03-26 10:42:32 +08:00
|
|
|
REF_STORE_READ | REF_STORE_ODB);
|
2017-04-04 18:21:20 +08:00
|
|
|
register_ref_store_map(&submodule_ref_stores, "submodule",
|
|
|
|
refs, submodule);
|
2017-03-26 10:42:31 +08:00
|
|
|
|
2017-08-23 20:36:53 +08:00
|
|
|
done:
|
2017-03-26 10:42:31 +08:00
|
|
|
strbuf_release(&submodule_sb);
|
2017-08-23 20:36:54 +08:00
|
|
|
free(to_free);
|
|
|
|
|
2016-09-05 00:08:11 +08:00
|
|
|
return refs;
|
|
|
|
}
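The trailing-slash handling at the top of get_submodule_ref_store() above can be sketched in isolation. This is an illustration under assumptions: `strip_trailing_slashes()` is a hypothetical helper, it treats only '/' as a directory separator (the original uses is_dir_sep()), and it returns NULL on allocation failure rather than dying.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Return `path` without trailing '/' characters, or NULL if nothing
 * is left. If a copy had to be made, it is stored in *to_free and the
 * caller frees it; otherwise *to_free is set to NULL.
 */
const char *strip_trailing_slashes(const char *path, char **to_free)
{
	size_t len;
	char *copy;

	*to_free = NULL;
	if (!path)
		return NULL;

	len = strlen(path);
	while (len && path[len - 1] == '/')
		len--;
	if (!len)
		return NULL;
	if (!path[len])
		return path; /* nothing stripped, reuse the caller's string */

	copy = malloc(len + 1);
	if (!copy)
		return NULL;
	memcpy(copy, path, len);
	copy[len] = '\0';
	*to_free = copy;
	return copy;
}
```

Copying only when something was stripped keeps the common case allocation-free, at the cost of the caller having to track the `to_free` pointer.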
|
|
|
|
|
2017-04-24 18:01:22 +08:00
|
|
|
struct ref_store *get_worktree_ref_store(const struct worktree *wt)
|
|
|
|
{
|
|
|
|
struct ref_store *refs;
|
|
|
|
const char *id;
|
|
|
|
|
|
|
|
if (wt->is_current)
|
2018-04-12 08:21:09 +08:00
|
|
|
return get_main_ref_store(the_repository);
|
2017-04-24 18:01:22 +08:00
|
|
|
|
|
|
|
id = wt->id ? wt->id : "/";
|
|
|
|
refs = lookup_ref_store_map(&worktree_ref_stores, id);
|
|
|
|
if (refs)
|
|
|
|
return refs;
|
|
|
|
|
|
|
|
if (wt->id)
|
2021-10-09 05:08:14 +08:00
|
|
|
refs = ref_store_init(the_repository,
|
|
|
|
git_common_path("worktrees/%s", wt->id),
|
2017-04-24 18:01:22 +08:00
|
|
|
REF_STORE_ALL_CAPS);
|
|
|
|
else
|
2021-10-09 05:08:14 +08:00
|
|
|
refs = ref_store_init(the_repository,
|
|
|
|
get_git_common_dir(),
|
2017-04-24 18:01:22 +08:00
|
|
|
REF_STORE_ALL_CAPS);
|
|
|
|
|
|
|
|
if (refs)
|
|
|
|
register_ref_store_map(&worktree_ref_stores, "worktree",
|
|
|
|
refs, id);
|
|
|
|
return refs;
|
|
|
|
}
|
|
|
|
|
2021-12-23 02:11:54 +08:00
|
|
|
void base_ref_store_init(struct ref_store *refs, struct repository *repo,
|
|
|
|
const char *path, const struct ref_storage_be *be)
|
2016-09-05 00:08:11 +08:00
|
|
|
{
|
2017-02-10 19:16:11 +08:00
|
|
|
refs->be = be;
|
2021-12-23 02:11:54 +08:00
|
|
|
refs->repo = repo;
|
|
|
|
refs->gitdir = xstrdup(path);
|
2016-09-05 00:08:11 +08:00
|
|
|
}
|
2016-09-05 00:08:16 +08:00
|
|
|
|
|
|
|
/* backend functions */
|
2023-05-13 05:34:41 +08:00
|
|
|
int refs_pack_refs(struct ref_store *refs, struct pack_refs_opts *opts)
|
2016-09-05 00:08:27 +08:00
|
|
|
{
|
2023-05-13 05:34:41 +08:00
|
|
|
return refs->be->pack_refs(refs, opts);
|
2016-09-05 00:08:27 +08:00
|
|
|
}
|
|
|
|
|
refs: switch peel_ref() to peel_iterated_oid()
The peel_ref() interface is confusing and error-prone:
- it's typically used by ref iteration callbacks that have both a
refname and oid. But since they pass only the refname, we may load
the ref value from the filesystem again. This is inefficient, but
also means we are open to a race if somebody simultaneously updates
the ref. E.g., this:
int some_ref_cb(const char *refname, const struct object_id *oid, ...)
{
if (!peel_ref(refname, &peeled))
printf("%s peels to %s",
oid_to_hex(oid), oid_to_hex(&peeled));
}
could print nonsense. It is correct to say "refname peels to..."
(you may see the "before" value or the "after" value, either of
which is consistent), but mentioning both oids may be mixing
before/after values.
Worse, whether this is possible depends on whether the optimization
to read from the current iterator value kicks in. So it is actually
not possible with:
for_each_ref(some_ref_cb);
but it _is_ possible with:
head_ref(some_ref_cb);
which does not use the iterator mechanism (though in practice, HEAD
should never peel to anything, so this may not be triggerable).
- it must take a fully-qualified refname for the read_ref_full() code
path to work. Yet we routinely pass it partial refnames from
callbacks to for_each_tag_ref(), etc. This happens to work when
iterating because there we do not call read_ref_full() at all, and
only use the passed refname to check if it is the same as the
iterator. But the requirements for the function parameters are quite
unclear.
Instead of taking a refname, let's instead take an oid. That fixes both
problems. It's a little funny for a "ref" function not to involve refs
at all. The key thing is that it's optimizing under the hood based on
having access to the ref iterator. So let's change the name to make it
clear why you'd want this function versus just peel_object().
There are two other directions I considered but rejected:
- we could pass the peel information into the each_ref_fn callback.
However, we don't know if the caller actually wants it or not. For
packed-refs, providing it is essentially free. But for loose refs,
we actually have to peel the object, which would be wasteful in most
cases. We could likewise pass in a flag to the callback indicating
whether the peeled information is known, but that complicates those
callbacks, as they then have to decide whether to manually peel
themselves. Plus it requires changing the interface of every
callback, whether they care about peeling or not, and there are many
of them.
- we could make a function to return the peeled value of the current
iterated ref (computing it if necessary), and BUG() otherwise. I.e.:
int peel_current_iterated_ref(struct object_id *out);
Each of the current callers is an each_ref_fn callback, so they'd
mostly be happy. But:
- we use those callbacks with functions like head_ref(), which do
not use the iteration code. So we'd need to handle the fallback
case there, anyway.
- it's possible that a caller would want to call into generic code
that sometimes is used during iteration and sometimes not. This
encapsulates the logic to do the fast thing when possible, and
fall back when necessary.
The implementation is mostly obvious, but I want to call out a few
things in the patch:
- the test-tool coverage for peel_ref() is now meaningless, as it all
collapses to a single peel_object() call (arguably they were pretty
uninteresting before; the tricky part of that function is the
fast-path we see during iteration, but these calls didn't trigger
that). I've just dropped it entirely, though note that some other
tests relied on the tags we created; I've moved that creation to the
tests where it matters.
- we no longer need to take a ref_store parameter, since we'd never
look up a ref now. We do still rely on a global "current iterator"
variable which _could_ be kept per-ref-store. But in practice this
is only useful if there are multiple recursive iterations, at which
point the more appropriate solution is probably a stack of
iterators. No caller used the actual ref-store parameter anyway
(they all call the wrapper that passes the_repository).
- the original only kicked in the optimization when the "refname"
pointer matched (i.e., not string comparison). We do likewise with
the "oid" parameter here, but fall back to doing an actual oideq()
call. This in theory lets us kick in the optimization more often,
though in practice no current caller cares. It should never be
wrong, though (peeling is a property of an object, so two refs
pointing to the same object would peel identically).
- the original took care not to touch the peeled out-parameter unless
we found something to put in it. But no caller cares about this, and
anyway, it is enforced by peel_object() itself (and even in the
optimized iterator case, that's where we eventually end up). We can
shorten the code and avoid an extra copy by just passing the
out-parameter through the stack.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-21 03:44:43 +08:00
|
|
|
int peel_iterated_oid(const struct object_id *base, struct object_id *peeled)
|
2017-03-26 10:42:34 +08:00
|
|
|
{
|
2021-01-21 03:44:43 +08:00
|
|
|
if (current_ref_iter &&
|
|
|
|
(current_ref_iter->oid == base ||
|
|
|
|
oideq(current_ref_iter->oid, base)))
|
|
|
|
return ref_iterator_peel(current_ref_iter, peeled);
|
2017-09-25 16:00:14 +08:00
|
|
|
|
2021-05-19 23:31:28 +08:00
|
|
|
return peel_object(base, peeled) ? -1 : 0;
|
2017-03-26 10:42:34 +08:00
|
|
|
}
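The fast path in peel_iterated_oid() above (pointer identity first, then comparison by value, then the slow fallback) can be illustrated with a self-contained sketch. All names here are hypothetical (`struct oid`, `struct iter_state`, `peel_sketch()`, `slow_peel()`), the object ID is assumed to be 20 bytes, and the "slow" path is a placeholder that pretends every object peels to itself.

```c
#include <assert.h>
#include <string.h>

struct oid {
	unsigned char hash[20];
};

struct iter_state {
	const struct oid *oid; /* value the iterator currently points at */
	struct oid peeled;     /* peeled value, assumed already known */
};

struct iter_state *current_iter; /* NULL when not iterating */
int slow_peel_calls;             /* counts fallbacks, for demonstration */

/* Placeholder for the expensive lookup. */
int slow_peel(const struct oid *base, struct oid *peeled)
{
	slow_peel_calls++;
	*peeled = *base;
	return 0;
}

int peel_sketch(const struct oid *base, struct oid *peeled)
{
	if (current_iter &&
	    (current_iter->oid == base ||
	     !memcmp(current_iter->oid->hash, base->hash,
		     sizeof(base->hash)))) {
		*peeled = current_iter->peeled; /* fast path */
		return 0;
	}
	return slow_peel(base, peeled);
}
```

The pointer comparison is only a shortcut; the value comparison is what makes the fast path apply to any caller holding an equal object ID, not just the iterator's own pointer.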
|
2016-09-05 00:08:29 +08:00
|
|
|
|
2024-05-07 20:58:58 +08:00
|
|
|
int refs_update_symref(struct ref_store *refs, const char *ref,
|
|
|
|
const char *target, const char *logmsg)
|
2017-03-26 10:42:34 +08:00
|
|
|
{
|
2024-05-07 20:58:57 +08:00
|
|
|
struct ref_transaction *transaction;
|
|
|
|
struct strbuf err = STRBUF_INIT;
|
|
|
|
int ret = 0;
|
reflog: cleanse messages in the refs.c layer
Regarding reflog messages:
- We expect that a reflog message consists of a single line. The
file format used by the files backend may add a LF after the
message as a delimiter, and output by commands like "git log -g"
may complete such an incomplete line by adding a LF at the end,
but philosophically, the terminating LF is not a part of the
message.
- We however allow callers of refs API to supply a random sequence
of NUL terminated bytes. We cleanse caller-supplied message by
squashing a run of whitespaces into a SP, and by trimming trailing
whitespace, before storing the message. This is how we tolerate,
instead of erring out, a message with LF in it (be it at the end,
in the middle, or both).
Currently, the cleansing of the reflog message is done by the files
backend, before the log is written out. This is sufficient with the
current code, as that is the only backend that writes reflogs. But
new backends can be added that write reflogs, and we'd want the
resulting log message we would read out of "log -g" the same no
matter what backend is used, and moving the code to do so to the
generic layer is a way to do so.
An added benefit is that the "cleansing" function could be updated
later, independent from individual backends, to e.g. allow
multi-line log messages if we wanted to, and when that happens, it
would help a lot to ensure we covered all bases if the cleansing
function (which would be updated) is called from the generic layer.
Side note: I am not interested in supporting multi-line reflog
messages right at the moment (nobody is asking for it), but I
envision that instead of the "squash a run of whitespaces into a SP
and rtrim" cleansing, we can %urlencode problematic bytes in the
message *AND* append a SP at the end, when a new version of Git that
supports multi-line and/or verbatim reflog messages writes a reflog
record. The reading side can detect the presence of SP at the end
(which should have been rtrimmed out if it were written by existing
versions of Git) as a signal that decoding %urlencode recovers the
original reflog message.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-11 01:19:53 +08:00
|
|
|
|
2024-05-07 20:58:57 +08:00
|
|
|
transaction = ref_store_transaction_begin(refs, &err);
|
|
|
|
if (!transaction ||
|
2024-05-07 20:58:58 +08:00
|
|
|
ref_transaction_update(transaction, ref, NULL, NULL,
|
|
|
|
target, NULL, REF_NO_DEREF,
|
2024-05-07 20:58:57 +08:00
|
|
|
logmsg, &err) ||
|
|
|
|
ref_transaction_commit(transaction, &err)) {
|
|
|
|
ret = error("%s", err.buf);
|
|
|
|
}
|
|
|
|
|
|
|
|
strbuf_release(&err);
|
|
|
|
if (transaction)
|
|
|
|
ref_transaction_free(transaction);
|
|
|
|
|
|
|
|
return ret;
|
2016-09-05 00:08:29 +08:00
|
|
|
}
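The reflog message cleansing described in the commit message above (squash each run of whitespace into one SP and trim trailing whitespace) could be sketched as below. This is not the function Git uses: `cleanse_message()` and its fixed-size output buffer are assumptions made for illustration, and a leading run of whitespace is squashed to a single space, following the description literally.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy msg into out (capacity outsz, including the NUL). Each run of
 * whitespace becomes one space; trailing whitespace is dropped. Output
 * that does not fit is truncated at a character boundary.
 */
void cleanse_message(const char *msg, char *out, size_t outsz)
{
	size_t n = 0;
	int pending_space = 0;

	assert(outsz > 0);
	for (; *msg; msg++) {
		size_t need;

		if (isspace((unsigned char)*msg)) {
			pending_space = 1; /* emit lazily, so trailing runs vanish */
			continue;
		}
		need = pending_space ? 2 : 1;
		if (n + need > outsz - 1)
			break; /* truncate rather than overflow */
		if (pending_space)
			out[n++] = ' ';
		pending_space = 0;
		out[n++] = *msg;
	}
	out[n] = '\0';
}
```

Deferring the space until the next non-whitespace byte arrives handles squashing and right-trimming in one pass, so an embedded or trailing LF never reaches the output.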
|
|
|
|
|
2017-05-22 22:17:45 +08:00
|
|
|
int ref_update_reject_duplicates(struct string_list *refnames,
|
|
|
|
struct strbuf *err)
|
|
|
|
{
|
2017-05-22 22:17:46 +08:00
|
|
|
size_t i, n = refnames->nr;
|
2017-05-22 22:17:45 +08:00
|
|
|
|
|
|
|
assert(err);
|
|
|
|
|
2017-05-22 22:17:47 +08:00
|
|
|
for (i = 1; i < n; i++) {
|
|
|
|
int cmp = strcmp(refnames->items[i - 1].string,
|
|
|
|
refnames->items[i].string);
|
|
|
|
|
|
|
|
if (!cmp) {
|
2017-05-22 22:17:45 +08:00
|
|
|
strbuf_addf(err,
|
2018-07-21 15:49:35 +08:00
|
|
|
_("multiple updates for ref '%s' not allowed"),
|
2017-05-22 22:17:45 +08:00
|
|
|
refnames->items[i].string);
|
|
|
|
return 1;
|
2017-05-22 22:17:47 +08:00
|
|
|
} else if (cmp > 0) {
|
2018-05-02 17:38:39 +08:00
|
|
|
BUG("ref_update_reject_duplicates() received unsorted list");
|
2017-05-22 22:17:45 +08:00
|
|
|
}
|
2017-05-22 22:17:47 +08:00
|
|
|
}
|
2017-05-22 22:17:45 +08:00
|
|
|
return 0;
|
|
|
|
}
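The adjacent-comparison check in ref_update_reject_duplicates() above relies on the list being sorted, so duplicates must sit next to each other. A standalone sketch, assuming a plain array of C strings instead of struct string_list, and with the hypothetical name `find_adjacent_duplicate()`:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Scan a sorted array of names and report the first adjacent
 * duplicate. Returns its index, -1 if there is none, or -2 if the
 * input turns out to be unsorted (the original calls BUG() instead).
 */
long find_adjacent_duplicate(const char *const *names, size_t n)
{
	size_t i;

	assert(names || !n);
	for (i = 1; i < n; i++) {
		int cmp = strcmp(names[i - 1], names[i]);

		if (!cmp)
			return (long)i;
		if (cmp > 0)
			return -2; /* input was not sorted */
	}
	return -1;
}
```

The check is a single linear pass; the sorting cost is paid once by the caller, which also gets the unsorted-input sanity check for free.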
|
|
|
|
|
refs: implement reference transaction hook
The low-level reference transactions used to update references are
currently completely opaque to the user. While certainly desirable in
most usecases, there are some which might want to hook into the
transaction to observe all queued reference updates as well as observing
the abortion or commit of a prepared transaction.
One such usecase would be to have a set of replicas of a given Git
repository, where we perform Git operations on all of the repositories
at once and expect the outcome to be the same in all of them. While
there exist hooks already for a certain subset of Git commands that
could be used to implement a voting mechanism for this, many others
currently don't have any mechanism for this.
The above scenario is the motivation for the new "reference-transaction"
hook that reaches directly into Git's reference transaction mechanism.
The hook receives as parameter the current state the transaction was
moved to ("prepared", "committed" or "aborted") and gets via its
standard input all queued reference updates. While the exit code gets
ignored in the "committed" and "aborted" states, a non-zero exit code in
the "prepared" state will cause the transaction to be aborted
prematurely.
Given the usecase described above, a voting mechanism can now be
implemented via this hook: as soon as it gets called, it will take all
of stdin and use it to cast a vote to a central service. When all
replicas of the repository agree, the hook will exit with zero,
otherwise it will abort the transaction by returning non-zero. The most
important upside is that this will catch _all_ commands writing
references at once, allowing to implement strong consistency for
reference updates via a single mechanism.
In order to test the impact on the case where we don't have any
"reference-transaction" hook installed in the repository, this commit
introduces two new performance tests for git-update-ref(1). Run against
an empty repository, it produces the following results:
Test                         origin/master     HEAD
--------------------------------------------------------------------
1400.2: update-ref           2.70(2.10+0.71)   2.71(2.10+0.73) +0.4%
1400.3: update-ref --stdin   0.21(0.09+0.11)   0.21(0.07+0.14) +0.0%
The performance test p1400.2 creates, updates and deletes a branch a
thousand times, thus averaging runtime of git-update-ref over 3000
invocations. p1400.3 instead calls `git update-ref --stdin` three times
and queues a thousand creations, updates and deletes respectively.
As expected, p1400.3 consistently shows no noticeable impact, as for
each batch of updates there's a single call to access(3P) for the
negative hook lookup. On the other hand, for p1400.2, one can see an
impact caused by this patchset. But doing five runs of the performance
tests where each one was run with GIT_PERF_REPEAT_COUNT=10, the overhead
ranged from -1.5% to +1.1%. These inconsistent performance numbers can
be explained by the overhead of spawning 3000 processes. This shows that
the overhead of assembling the hook path and executing access(3P) once
to check if it's there is mostly outweighed by the operating system's
overhead.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
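The decision logic of a hook as described in the commit message above could be sketched as follows. This is an illustration under stated assumptions: the input line format `<old-value> SP <new-value> SP <ref-name> LF` is taken from githooks(5) rather than from this commit message, and `parse_update_line()`, `hook_exit_code()` and the "refs/heads/protected/" policy are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct queued_update {
	char old_value[65]; /* room for a 64-hex-digit object name */
	char new_value[65];
	char refname[256];
};

/* Parse one "<old-value> SP <new-value> SP <ref-name>" line; 0 on success. */
int parse_update_line(const char *line, struct queued_update *u)
{
	if (sscanf(line, "%64s %64s %255s",
		   u->old_value, u->new_value, u->refname) != 3)
		return -1;
	return 0;
}

/*
 * Hypothetical policy: veto updates below refs/heads/protected/, but
 * only in the "prepared" state, the one state whose exit code matters.
 * Returns the exit code the hook would use.
 */
int hook_exit_code(const char *state, const char *line)
{
	struct queued_update u;

	if (strcmp(state, "prepared"))
		return 0; /* exit code is ignored in other states */
	if (parse_update_line(line, &u))
		return 1; /* malformed input: refuse to vote yes */
	return !strncmp(u.refname, "refs/heads/protected/", 21) ? 1 : 0;
}
```

A real hook would read the state from its first argument and loop over every line on standard input; the voting scenario from the commit message would replace the prefix check with a call to the central service.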
2020-06-19 14:56:14 +08:00
|
|
|
static int run_transaction_hook(struct ref_transaction *transaction,
|
|
|
|
const char *state)
|
|
|
|
{
|
|
|
|
struct child_process proc = CHILD_PROCESS_INIT;
|
|
|
|
struct strbuf buf = STRBUF_INIT;
|
refs: remove lookup cache for reference-transaction hook
When adding the reference-transaction hook, there were concerns about
the performance impact it may have on setups which do not make use of
the new hook at all. After all, it gets executed every time a reftx is
prepared, committed or aborted, which linearly scales with the number of
reference-transactions created per session. And as there are code paths
like `git push` which create a new transaction for each reference to be
updated, this may translate to calling `find_hook()` quite a lot.
To address this concern, a cache was added with the intention to not
repeatedly do negative hook lookups. Turns out this cache caused a
regression, which was fixed via e5256c82e5 (refs: fix interleaving hook
calls with reference-transaction hook, 2020-08-07). In the process of
discussing the fix, we realized that the cache doesn't really help even
in the negative-lookup case. While performance tests added to benchmark
this did show a slight improvement in the 1% range, this really doesn't
warrant having a cache. Furthermore, it's quite flaky, too. E.g. running
it twice in succession produces the following results:
Test                         master            pks-reftx-hook-remove-cache
--------------------------------------------------------------------------
1400.2: update-ref           2.79(2.16+0.74)   2.73(2.12+0.71) -2.2%
1400.3: update-ref --stdin   0.22(0.08+0.14)   0.21(0.08+0.12) -4.5%

Test                         master            pks-reftx-hook-remove-cache
--------------------------------------------------------------------------
1400.2: update-ref           2.70(2.09+0.72)   2.74(2.13+0.71) +1.5%
1400.3: update-ref --stdin   0.21(0.10+0.10)   0.21(0.08+0.13) +0.0%
One case notably absent from those benchmarks is a single executable
searching for the hook hundreds of times, which is exactly the case for
which the negative cache was added. p1400.2 will spawn a new update-ref
for each transaction and p1400.3 only has a single reference-transaction
for all reference updates. So this commit adds a third benchmark, which
performs a non-atomic push of a thousand references. This will create a
new reference transaction per reference. But even for this case, the
negative cache doesn't consistently improve performance:
Test                         master            pks-reftx-hook-remove-cache
--------------------------------------------------------------------------
1400.4: nonatomic push       6.63(6.50+0.13)   6.81(6.67+0.14) +2.7%
1400.4: nonatomic push       6.35(6.21+0.14)   6.39(6.23+0.16) +0.6%
1400.4: nonatomic push       6.43(6.31+0.13)   6.42(6.28+0.15) -0.2%
So let's just remove the cache altogether to simplify the code.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-25 18:35:24 +08:00
	const char *hook;
2020-06-19 14:56:14 +08:00
	int ret = 0, i;
2020-08-25 18:35:24 +08:00
	hook = find_hook("reference-transaction");
2020-06-19 14:56:14 +08:00
	if (!hook)
		return ret;
2020-07-29 04:25:12 +08:00
	strvec_pushl(&proc.args, hook, state, NULL);
2020-06-19 14:56:14 +08:00
	proc.in = -1;
	proc.stdout_to_stderr = 1;
	proc.trace2_hook_name = "reference-transaction";

	ret = start_command(&proc);
	if (ret)
		return ret;

	sigchain_push(SIGPIPE, SIG_IGN);

	for (i = 0; i < transaction->nr; i++) {
		struct ref_update *update = transaction->updates[i];

		strbuf_reset(&buf);
2024-05-07 20:58:54 +08:00
		if (!(update->flags & REF_HAVE_OLD))
			strbuf_addf(&buf, "%s ", oid_to_hex(null_oid()));
		else if (update->old_target)
			strbuf_addf(&buf, "ref:%s ", update->old_target);
		else
			strbuf_addf(&buf, "%s ", oid_to_hex(&update->old_oid));

		if (!(update->flags & REF_HAVE_NEW))
			strbuf_addf(&buf, "%s ", oid_to_hex(null_oid()));
		else if (update->new_target)
			strbuf_addf(&buf, "ref:%s ", update->new_target);
		else
			strbuf_addf(&buf, "%s ", oid_to_hex(&update->new_oid));

		strbuf_addf(&buf, "%s\n", update->refname);
2020-06-19 14:56:14 +08:00
		if (write_in_full(proc.in, buf.buf, buf.len) < 0) {
2021-10-16 17:39:25 +08:00
			if (errno != EPIPE) {
				/* Don't leak errno outside this API */
				errno = 0;
2020-06-19 14:56:14 +08:00
				ret = -1;
2021-10-16 17:39:25 +08:00
			}
2020-06-19 14:56:14 +08:00
			break;
		}
	}

	close(proc.in);
	sigchain_pop(SIGPIPE);
	strbuf_release(&buf);

	ret |= finish_command(&proc);
	return ret;
}
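A minimal sketch of how a hook program might parse one line of the input written by `run_transaction_hook()`, assuming the `<old-value> SP <new-value> SP <refname> LF` layout visible in the `strbuf_addf()` calls above. The `hook_update` struct, the function names, and the buffer sizes are illustrative assumptions and are not part of Git.

```c
#include <string.h>

/* One queued update as a hook would see it (hypothetical type). */
struct hook_update {
	char old_value[256]; /* object ID in hex, or "ref:<target>" */
	char new_value[256]; /* object ID in hex, or "ref:<target>" */
	char refname[1024];
};

/* Copy `len` bytes into `dst` and NUL-terminate; reject empty or oversized fields. */
static int copy_field(char *dst, size_t dst_size, const char *src, size_t len)
{
	if (len == 0 || len >= dst_size)
		return -1;
	memcpy(dst, src, len);
	dst[len] = '\0';
	return 0;
}

/*
 * Split "<old> <new> <refname>" (with an optional trailing LF) into its
 * three fields. Returns 0 on success and -1 if the line is malformed.
 */
static int parse_hook_line(const char *line, struct hook_update *out)
{
	const char *sp1 = strchr(line, ' ');
	const char *sp2 = sp1 ? strchr(sp1 + 1, ' ') : NULL;

	if (!sp1 || !sp2)
		return -1;

	if (copy_field(out->old_value, sizeof(out->old_value),
		       line, (size_t)(sp1 - line)) < 0 ||
	    copy_field(out->new_value, sizeof(out->new_value),
		       sp1 + 1, (size_t)(sp2 - sp1 - 1)) < 0 ||
	    copy_field(out->refname, sizeof(out->refname),
		       sp2 + 1, strcspn(sp2 + 1, "\n")) < 0)
		return -1;

	return 0;
}
```

A hook built on this would typically read standard input line by line (for example with `fgets()`), call `parse_hook_line()` on each line, and take the transaction state ("prepared", "committed" or "aborted") from its first command-line argument.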
ref_transaction_prepare(): new optional step for reference updates
In the future, compound reference stores will sometimes need to modify
references in two different reference stores at the same time, meaning
that a single logical reference transaction might have to be
implemented as two internal sub-transactions. They won't want to call
`ref_transaction_commit()` for the two sub-transactions one after the
other, because that wouldn't be atomic (the first commit could succeed
and the second one fail). Instead, they will want to prepare both
sub-transactions (i.e., obtain any necessary locks and do any
pre-checks), and only if both prepare steps succeed, then commit both
sub-transactions.
Start preparing for that day by adding a new, optional
`ref_transaction_prepare()` step to the reference transaction
sequence, which obtains the locks and does any prechecks, reporting
any errors that occur. Also add a `ref_transaction_abort()` function
that can be used to abort a sub-transaction even if it has already
been prepared.
That is on the side of the public-facing API. On the side of the
`ref_store` VTABLE, get rid of `transaction_commit` and instead add
methods `transaction_prepare`, `transaction_finish`, and
`transaction_abort`. A `ref_transaction_commit()` now basically calls
methods `transaction_prepare` then `transaction_finish`.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
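The prepare-then-commit sequence described in this message can be illustrated with a toy model. This is only a sketch under stated assumptions: `toy_tx`, `toy_prepare()`, `toy_finish()`, `toy_abort()` and `toy_commit_both()` are hypothetical names invented for illustration, not Git's API, and the `fail_prepare` flag merely stands in for a lock or precheck failure.

```c
/* Lifecycle of a toy sub-transaction (hypothetical, not Git's enum). */
enum toy_state { TOY_OPEN, TOY_PREPARED, TOY_CLOSED };

struct toy_tx {
	enum toy_state state;
	int fail_prepare; /* set to simulate a lock or precheck failure */
	int applied;      /* becomes 1 once the changes are made permanent */
};

/* Prepare step: take locks and run prechecks; nothing is applied yet. */
static int toy_prepare(struct toy_tx *tx)
{
	if (tx->state != TOY_OPEN || tx->fail_prepare)
		return -1;
	tx->state = TOY_PREPARED;
	return 0;
}

/* Abort is allowed whether or not the transaction was prepared. */
static void toy_abort(struct toy_tx *tx)
{
	tx->state = TOY_CLOSED;
}

/* Finish step: only meaningful after a successful prepare. */
static void toy_finish(struct toy_tx *tx)
{
	if (tx->state != TOY_PREPARED)
		return;
	tx->applied = 1;
	tx->state = TOY_CLOSED;
}

/*
 * Commit two sub-transactions as one logical unit: finish both only if
 * both prepare steps succeed, otherwise abort both.
 */
static int toy_commit_both(struct toy_tx *a, struct toy_tx *b)
{
	if (toy_prepare(a) < 0 || toy_prepare(b) < 0) {
		toy_abort(a);
		toy_abort(b);
		return -1;
	}
	toy_finish(a);
	toy_finish(b);
	return 0;
}
```

Committing each sub-transaction in a single step, one after the other, could leave the first one applied when the second fails. Splitting prepare from finish moves the failure-prone work ahead of any permanent change.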
2017-05-22 22:17:44 +08:00
int ref_transaction_prepare(struct ref_transaction *transaction,
			    struct strbuf *err)
2016-09-05 00:08:16 +08:00
{
2017-03-26 10:42:35 +08:00
	struct ref_store *refs = transaction->ref_store;
2020-06-19 14:56:14 +08:00
	int ret;
2016-09-05 00:08:16 +08:00
2017-05-22 22:17:43 +08:00
	switch (transaction->state) {
	case REF_TRANSACTION_OPEN:
		/* Good. */
		break;
ref_transaction_prepare(): new optional step for reference updates
In the future, compound reference stores will sometimes need to modify
references in two different reference stores at the same time, meaning
that a single logical reference transaction might have to be
implemented as two internal sub-transactions. They won't want to call
`ref_transaction_commit()` for the two sub-transactions one after the
other, because that wouldn't be atomic (the first commit could succeed
and the second one fail). Instead, they will want to prepare both
sub-transactions (i.e., obtain any necessary locks and do any
pre-checks), and only if both prepare steps succeed, then commit both
sub-transactions.
Start preparing for that day by adding a new, optional
`ref_transaction_prepare()` step to the reference transaction
sequence, which obtains the locks and does any prechecks, reporting
any errors that occur. Also add a `ref_transaction_abort()` function
that can be used to abort a sub-transaction even if it has already
been prepared.
That is on the side of the public-facing API. On the side of the
`ref_store` VTABLE, get rid of `transaction_commit` and instead add
methods `transaction_prepare`, `transaction_finish`, and
`transaction_abort`. A `ref_transaction_commit()` now basically calls
methods `transaction_prepare` then `transaction_finish`.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
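The prepare-then-finish sequence described in this commit message can be illustrated with a small standalone model. This is only a sketch: `toy_transaction`, `toy_prepare()`, `toy_finish()`, `toy_abort()` and `toy_commit_both()` are hypothetical names that do not exist in refs.c, and the model assumes that the finish step cannot fail, which a real backend does not guarantee.

```c
#include <assert.h>

/* Hypothetical stand-ins; none of these names come from refs.c. */
enum toy_state { TOY_OPEN, TOY_PREPARED, TOY_CLOSED };

struct toy_transaction {
	enum toy_state state;
	int fail_prepare; /* simulate a lock or pre-check failure */
	int committed;    /* set once the update became visible */
};

/* Take locks and run pre-checks; nothing is visible yet. */
static int toy_prepare(struct toy_transaction *t)
{
	assert(t->state == TOY_OPEN);
	if (t->fail_prepare)
		return -1;
	t->state = TOY_PREPARED;
	return 0;
}

/* Drop a transaction whether or not it has been prepared. */
static void toy_abort(struct toy_transaction *t)
{
	assert(t->state != TOY_CLOSED);
	t->state = TOY_CLOSED;
}

/* Make a prepared transaction visible. */
static void toy_finish(struct toy_transaction *t)
{
	assert(t->state == TOY_PREPARED);
	t->committed = 1;
	t->state = TOY_CLOSED;
}

/* Commit both sub-transactions or neither. Returns 0 on success. */
static int toy_commit_both(struct toy_transaction *a,
			   struct toy_transaction *b)
{
	if (toy_prepare(a) || toy_prepare(b)) {
		toy_abort(a);
		toy_abort(b);
		return -1;
	}
	toy_finish(a);
	toy_finish(b);
	return 0;
}
```

Committing the two sub-transactions one after the other would leave the first one visible if the second one failed; preparing both first moves every check that can fail ahead of the point where anything becomes visible.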
2017-05-22 22:17:44 +08:00
|
|
|
case REF_TRANSACTION_PREPARED:
|
2018-05-02 17:38:39 +08:00
|
|
|
BUG("prepare called twice on reference transaction");
|
2017-05-22 22:17:44 +08:00
|
|
|
break;
|
2017-05-22 22:17:43 +08:00
|
|
|
case REF_TRANSACTION_CLOSED:
|
2018-05-02 17:38:39 +08:00
|
|
|
BUG("prepare called on a closed reference transaction");
|
2017-05-22 22:17:43 +08:00
|
|
|
break;
|
|
|
|
default:
|
2018-05-02 17:38:39 +08:00
|
|
|
BUG("unexpected reference transaction state");
|
2017-05-22 22:17:43 +08:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
2021-12-07 06:05:05 +08:00
|
|
|
if (refs->repo->objects->odb->disable_ref_updates) {
|
2017-04-11 06:14:12 +08:00
|
|
|
strbuf_addstr(err,
|
|
|
|
_("ref updates forbidden inside quarantine environment"));
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
|
refs: implement reference transaction hook
The low-level reference transactions used to update references are
currently completely opaque to the user. While certainly desirable in
most use cases, there are some which might want to hook into the
transaction to observe all queued reference updates as well as observing
the abortion or commit of a prepared transaction.
One such use case would be to have a set of replicas of a given Git
repository, where we perform Git operations on all of the repositories
at once and expect the outcome to be the same in all of them. While
there exist hooks already for a certain subset of Git commands that
could be used to implement a voting mechanism for this, many others
currently don't have any mechanism for this.
The above scenario is the motivation for the new "reference-transaction"
hook that reaches directly into Git's reference transaction mechanism.
The hook receives as parameter the current state the transaction was
moved to ("prepared", "committed" or "aborted") and gets via its
standard input all queued reference updates. While the exit code gets
ignored in the "committed" and "aborted" states, a non-zero exit code in
the "prepared" state will cause the transaction to be aborted
prematurely.
Given the use case described above, a voting mechanism can now be
implemented via this hook: as soon as it gets called, it will take all
of stdin and use it to cast a vote to a central service. When all
replicas of the repository agree, the hook will exit with zero,
otherwise it will abort the transaction by returning non-zero. The most
important upside is that this will catch _all_ commands writing
references at once, allowing strong consistency for reference updates
to be implemented via a single mechanism.
In order to test the impact on the case where we don't have any
"reference-transaction" hook installed in the repository, this commit
introduces two new performance tests for git-update-ref(1). Run against
an empty repository, they produce the following results:
Test                         origin/master     HEAD
--------------------------------------------------------------------
1400.2: update-ref           2.70(2.10+0.71)   2.71(2.10+0.73) +0.4%
1400.3: update-ref --stdin   0.21(0.09+0.11)   0.21(0.07+0.14) +0.0%
The performance test p1400.2 creates, updates and deletes a branch a
thousand times, thus averaging the runtime of git-update-ref over 3000
invocations. p1400.3 instead calls `git update-ref --stdin` three times
and queues a thousand creations, updates and deletes respectively.
As expected, p1400.3 consistently shows no noticeable impact, as for
each batch of updates there's a single call to access(3P) for the
negative hook lookup. For p1400.2, on the other hand, the table shows a
small impact from this patchset. But across five runs of the performance
tests, each with GIT_PERF_REPEAT_COUNT=10, the overhead ranged from
-1.5% to +1.1%. These inconsistent numbers can be explained by the
overhead of spawning 3000 processes: the cost of assembling the hook
path and calling access(3P) once to check whether the hook exists is
mostly outweighed by the operating system's overhead.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
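A hook of the kind described in this commit message has to read the queued updates from its standard input. The sketch below parses one such line, assuming the `<old-value> SP <new-value> SP <ref-name> LF` layout documented in githooks(5); `struct queued_update`, `parse_update_line()` and `exit_code_matters()` are hypothetical helper names, not part of Git, and the buffer sizes are arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical container for one queued update read by a hook. */
struct queued_update {
	char old_value[128];
	char new_value[128];
	char refname[256];
};

/* Copy len bytes of src into dst and NUL-terminate the result. */
static void copy_field(char *dst, const char *src, size_t len)
{
	memcpy(dst, src, len);
	dst[len] = '\0';
}

/*
 * Parse one "<old-value> SP <new-value> SP <ref-name> LF" line.
 * Returns 0 on success and -1 if the line is malformed or a field
 * does not fit into its buffer.
 */
static int parse_update_line(const char *line, struct queued_update *out)
{
	const char *sp1 = strchr(line, ' ');
	const char *sp2 = sp1 ? strchr(sp1 + 1, ' ') : NULL;
	size_t old_len, new_len, ref_len;

	if (!sp1 || !sp2)
		return -1;
	old_len = (size_t)(sp1 - line);
	new_len = (size_t)(sp2 - sp1 - 1);
	ref_len = strcspn(sp2 + 1, "\n");
	if (!old_len || !new_len || !ref_len ||
	    old_len >= sizeof(out->old_value) ||
	    new_len >= sizeof(out->new_value) ||
	    ref_len >= sizeof(out->refname))
		return -1;
	copy_field(out->old_value, line, old_len);
	copy_field(out->new_value, sp1 + 1, new_len);
	copy_field(out->refname, sp2 + 1, ref_len);
	return 0;
}

/* Only the "prepared" state lets the hook veto the transaction. */
static int exit_code_matters(const char *state)
{
	return !strcmp(state, "prepared");
}
```

A voting hook built on this would read every line with `fgets()`, feed the parsed updates to its voting service, and exit non-zero only when `exit_code_matters()` holds for the state it was given as its argument.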
2020-06-19 14:56:14 +08:00
|
|
|
ret = refs->be->transaction_prepare(refs, transaction, err);
|
|
|
|
if (ret)
|
|
|
|
return ret;
|
|
|
|
|
|
|
|
ret = run_transaction_hook(transaction, "prepared");
|
|
|
|
if (ret) {
|
|
|
|
ref_transaction_abort(transaction, err);
|
|
|
|
die(_("ref updates aborted by hook"));
|
|
|
|
}
|
|
|
|
|
|
|
|
return 0;
|
2017-05-22 22:17:44 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
int ref_transaction_abort(struct ref_transaction *transaction,
|
|
|
|
struct strbuf *err)
|
|
|
|
{
|
|
|
|
struct ref_store *refs = transaction->ref_store;
|
|
|
|
int ret = 0;
|
|
|
|
|
|
|
|
switch (transaction->state) {
|
|
|
|
case REF_TRANSACTION_OPEN:
|
|
|
|
/* No need to abort explicitly. */
|
|
|
|
break;
|
|
|
|
case REF_TRANSACTION_PREPARED:
|
|
|
|
ret = refs->be->transaction_abort(refs, transaction, err);
|
|
|
|
break;
|
|
|
|
case REF_TRANSACTION_CLOSED:
|
2018-05-02 17:38:39 +08:00
|
|
|
BUG("abort called on a closed reference transaction");
|
2017-05-22 22:17:44 +08:00
|
|
|
break;
|
|
|
|
default:
|
2018-05-02 17:38:39 +08:00
|
|
|
BUG("unexpected reference transaction state");
|
2017-05-22 22:17:44 +08:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
2020-06-19 14:56:14 +08:00
|
|
|
run_transaction_hook(transaction, "aborted");
|
|
|
|
|
2017-05-22 22:17:44 +08:00
|
|
|
ref_transaction_free(transaction);
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
|
|
|
int ref_transaction_commit(struct ref_transaction *transaction,
|
|
|
|
struct strbuf *err)
|
|
|
|
{
|
|
|
|
struct ref_store *refs = transaction->ref_store;
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
switch (transaction->state) {
|
|
|
|
case REF_TRANSACTION_OPEN:
|
|
|
|
/* Need to prepare first. */
|
|
|
|
ret = ref_transaction_prepare(transaction, err);
|
|
|
|
if (ret)
|
|
|
|
return ret;
|
|
|
|
break;
|
|
|
|
case REF_TRANSACTION_PREPARED:
|
|
|
|
/* Fall through to finish. */
|
|
|
|
break;
|
|
|
|
case REF_TRANSACTION_CLOSED:
|
2018-05-02 17:38:39 +08:00
|
|
|
BUG("commit called on a closed reference transaction");
|
2017-05-22 22:17:44 +08:00
|
|
|
break;
|
|
|
|
default:
|
2018-05-02 17:38:39 +08:00
|
|
|
BUG("unexpected reference transaction state");
|
2017-05-22 22:17:44 +08:00
|
|
|
break;
|
|
|
|
}
|
|
|
|
|
2020-06-19 14:56:14 +08:00
|
|
|
ret = refs->be->transaction_finish(refs, transaction, err);
|
|
|
|
if (!ret)
|
|
|
|
run_transaction_hook(transaction, "committed");
|
|
|
|
return ret;
|
2016-09-05 00:08:16 +08:00
|
|
|
}
|
2016-09-05 00:08:26 +08:00
|
|
|
|
2017-03-26 10:42:34 +08:00
|
|
|
int refs_verify_refname_available(struct ref_store *refs,
|
|
|
|
const char *refname,
|
2017-04-16 14:41:26 +08:00
|
|
|
const struct string_list *extras,
|
2017-03-26 10:42:34 +08:00
|
|
|
const struct string_list *skip,
|
|
|
|
struct strbuf *err)
|
2016-09-05 00:08:26 +08:00
|
|
|
{
|
2017-04-16 14:41:26 +08:00
|
|
|
const char *slash;
|
|
|
|
const char *extra_refname;
|
|
|
|
struct strbuf dirname = STRBUF_INIT;
|
|
|
|
struct strbuf referent = STRBUF_INIT;
|
|
|
|
struct object_id oid;
|
|
|
|
unsigned int type;
|
|
|
|
struct ref_iterator *iter;
|
|
|
|
int ok;
|
|
|
|
int ret = -1;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* For the sake of comments in this function, suppose that
|
|
|
|
* refname is "refs/foo/bar".
|
|
|
|
*/
|
|
|
|
|
|
|
|
assert(err);
|
|
|
|
|
|
|
|
strbuf_grow(&dirname, strlen(refname) + 1);
|
|
|
|
for (slash = strchr(refname, '/'); slash; slash = strchr(slash + 1, '/')) {
|
2021-10-16 17:39:09 +08:00
|
|
|
/*
|
|
|
|
* Just saying "Is a directory" when we e.g. can't
|
|
|
|
* lock some multi-level ref isn't very informative,
|
|
|
|
* the user won't be told *what* is a directory, so
|
|
|
|
* let's not use strerror() below.
|
|
|
|
*/
|
|
|
|
int ignore_errno;
|
2017-04-16 14:41:26 +08:00
|
|
|
/* Expand dirname to the new prefix, not including the trailing slash: */
|
|
|
|
strbuf_add(&dirname, refname + dirname.len, slash - refname - dirname.len);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* We are still at a leading dir of the refname (e.g.,
|
|
|
|
* "refs/foo"; if there is a reference with that name,
|
|
|
|
* it is a conflict, *unless* it is in skip.
|
|
|
|
*/
|
|
|
|
if (skip && string_list_has_string(skip, dirname.buf))
|
|
|
|
continue;
|
|
|
|
|
2021-10-16 17:39:09 +08:00
|
|
|
if (!refs_read_raw_ref(refs, dirname.buf, &oid, &referent,
|
|
|
|
&type, &ignore_errno)) {
|
2018-07-21 15:49:35 +08:00
|
|
|
strbuf_addf(err, _("'%s' exists; cannot create '%s'"),
|
2017-04-16 14:41:26 +08:00
|
|
|
dirname.buf, refname);
|
|
|
|
goto cleanup;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (extras && string_list_has_string(extras, dirname.buf)) {
|
2018-07-21 15:49:35 +08:00
|
|
|
strbuf_addf(err, _("cannot process '%s' and '%s' at the same time"),
|
2017-04-16 14:41:26 +08:00
|
|
|
refname, dirname.buf);
|
|
|
|
goto cleanup;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* We are at the leaf of our refname (e.g., "refs/foo/bar").
|
|
|
|
* There is no point in searching for a reference with that
|
|
|
|
* name, because a refname isn't considered to conflict with
|
|
|
|
* itself. But we still need to check for references whose
|
|
|
|
* names are in the "refs/foo/bar/" namespace, because they
|
|
|
|
* *do* conflict.
|
|
|
|
*/
|
|
|
|
strbuf_addstr(&dirname, refname + dirname.len);
|
|
|
|
strbuf_addch(&dirname, '/');
|
|
|
|
|
2023-07-11 05:12:22 +08:00
|
|
|
iter = refs_ref_iterator_begin(refs, dirname.buf, NULL, 0,
|
2017-04-16 14:41:26 +08:00
|
|
|
DO_FOR_EACH_INCLUDE_BROKEN);
|
|
|
|
while ((ok = ref_iterator_advance(iter)) == ITER_OK) {
|
|
|
|
if (skip &&
|
|
|
|
string_list_has_string(skip, iter->refname))
|
|
|
|
continue;
|
|
|
|
|
2018-07-21 15:49:35 +08:00
|
|
|
strbuf_addf(err, _("'%s' exists; cannot create '%s'"),
|
2017-04-16 14:41:26 +08:00
|
|
|
iter->refname, refname);
|
|
|
|
ref_iterator_abort(iter);
|
|
|
|
goto cleanup;
|
|
|
|
}
|
|
|
|
|
|
|
|
if (ok != ITER_DONE)
|
2018-05-02 17:38:39 +08:00
|
|
|
BUG("error while iterating over references");
|
2017-04-16 14:41:26 +08:00
|
|
|
|
|
|
|
extra_refname = find_descendant_ref(dirname.buf, extras, skip);
|
|
|
|
if (extra_refname)
|
2018-07-21 15:49:35 +08:00
|
|
|
strbuf_addf(err, _("cannot process '%s' and '%s' at the same time"),
|
2017-04-16 14:41:26 +08:00
|
|
|
refname, extra_refname);
|
|
|
|
else
|
|
|
|
ret = 0;
|
|
|
|
|
|
|
|
cleanup:
|
|
|
|
strbuf_release(&referent);
|
|
|
|
strbuf_release(&dirname);
|
|
|
|
return ret;
|
2016-09-05 00:08:26 +08:00
|
|
|
}
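The conflict rule that refs_verify_refname_available() enforces can be restated in isolation: a name conflicts with any existing reference that is one of its leading directories, and with any existing reference that lives below it, but never with itself. The following sketch checks that rule against a plain array of names. `find_df_conflict()` is a hypothetical helper, and it deliberately ignores the `extras` and `skip` lists that the real function honours.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the first name in existing[] that conflicts with refname,
 * or NULL if refname is available. Hypothetical helper that works on
 * a flat array instead of a ref store.
 */
static const char *find_df_conflict(const char *refname,
				    const char *const *existing, size_t nr)
{
	size_t len = strlen(refname);
	size_t i;

	for (i = 0; i < nr; i++) {
		const char *other = existing[i];
		size_t other_len = strlen(other);

		/* "refs/foo" blocks "refs/foo/bar": a leading directory. */
		if (other_len < len && !strncmp(refname, other, other_len) &&
		    refname[other_len] == '/')
			return other;
		/* "refs/foo/bar/baz" blocks "refs/foo/bar": a descendant. */
		if (other_len > len && !strncmp(other, refname, len) &&
		    other[len] == '/')
			return other;
	}
	return NULL;
}
```

Comparing the byte after the shared prefix against '/' is what keeps "refs/foobar" from being treated as a conflict with "refs/foo".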
|
2016-09-05 00:08:38 +08:00
|
|
|
|
2024-02-21 20:37:39 +08:00
|
|
|
struct do_for_each_reflog_help {
|
|
|
|
each_reflog_fn *fn;
|
|
|
|
void *cb_data;
|
|
|
|
};
|
|
|
|
|
|
|
|
static int do_for_each_reflog_helper(struct repository *r UNUSED,
|
|
|
|
const char *refname,
|
|
|
|
const struct object_id *oid UNUSED,
|
|
|
|
int flags,
|
|
|
|
void *cb_data)
|
|
|
|
{
|
|
|
|
struct do_for_each_reflog_help *hp = cb_data;
|
|
|
|
return hp->fn(refname, hp->cb_data);
|
|
|
|
}
|
|
|
|
|
|
|
|
int refs_for_each_reflog(struct ref_store *refs, each_reflog_fn fn, void *cb_data)
|
2016-09-05 00:08:38 +08:00
|
|
|
{
|
|
|
|
struct ref_iterator *iter;
|
2024-02-21 20:37:39 +08:00
|
|
|
struct do_for_each_reflog_help hp = { fn, cb_data };
|
2016-09-05 00:08:38 +08:00
|
|
|
|
|
|
|
iter = refs->be->reflog_iterator_begin(refs);
|
|
|
|
|
2018-08-21 02:24:16 +08:00
|
|
|
return do_for_each_repo_ref_iterator(the_repository, iter,
|
2024-02-21 20:37:39 +08:00
|
|
|
do_for_each_reflog_helper, &hp);
|
2016-09-05 00:08:38 +08:00
|
|
|
}
|
|
|
|
|
2017-03-26 10:42:34 +08:00
|
|
|
int refs_for_each_reflog_ent_reverse(struct ref_store *refs,
|
|
|
|
const char *refname,
|
|
|
|
each_reflog_ent_fn fn,
|
|
|
|
void *cb_data)
|
|
|
|
{
|
2016-09-05 00:08:38 +08:00
|
|
|
return refs->be->for_each_reflog_ent_reverse(refs, refname,
|
|
|
|
fn, cb_data);
|
|
|
|
}
|
|
|
|
|
2017-03-26 10:42:34 +08:00
|
|
|
int refs_for_each_reflog_ent(struct ref_store *refs, const char *refname,
|
|
|
|
each_reflog_ent_fn fn, void *cb_data)
|
|
|
|
{
|
|
|
|
return refs->be->for_each_reflog_ent(refs, refname, fn, cb_data);
|
|
|
|
}
|
|
|
|
|
|
|
|
int refs_reflog_exists(struct ref_store *refs, const char *refname)
|
|
|
|
{
|
|
|
|
return refs->be->reflog_exists(refs, refname);
|
2016-09-05 00:08:38 +08:00
|
|
|
}
|
|
|
|
|
2017-03-26 10:42:34 +08:00
|
|
|
int refs_create_reflog(struct ref_store *refs, const char *refname,
|
2021-11-22 22:19:08 +08:00
|
|
|
struct strbuf *err)
|
2017-03-26 10:42:34 +08:00
|
|
|
{
|
2021-11-22 22:19:08 +08:00
|
|
|
return refs->be->create_reflog(refs, refname, err);
|
2016-09-05 00:08:38 +08:00
|
|
|
}
|
|
|
|
|
2017-03-26 10:42:34 +08:00
|
|
|
int refs_delete_reflog(struct ref_store *refs, const char *refname)
|
|
|
|
{
|
|
|
|
return refs->be->delete_reflog(refs, refname);
|
2016-09-05 00:08:38 +08:00
|
|
|
}
|
|
|
|
|
2017-03-26 10:42:34 +08:00
|
|
|
int refs_reflog_expire(struct ref_store *refs,
|
2021-08-23 19:36:11 +08:00
|
|
|
const char *refname,
|
2017-03-26 10:42:34 +08:00
|
|
|
unsigned int flags,
|
|
|
|
reflog_expiry_prepare_fn prepare_fn,
|
|
|
|
reflog_expiry_should_prune_fn should_prune_fn,
|
|
|
|
reflog_expiry_cleanup_fn cleanup_fn,
|
|
|
|
void *policy_cb_data)
|
|
|
|
{
|
2021-08-23 19:36:11 +08:00
|
|
|
return refs->be->reflog_expire(refs, refname, flags,
|
2017-03-26 10:42:34 +08:00
|
|
|
prepare_fn, should_prune_fn,
|
|
|
|
cleanup_fn, policy_cb_data);
|
2016-09-05 00:08:38 +08:00
|
|
|
}
|
|
|
|
|
2016-09-05 00:08:39 +08:00
|
|
|
int initial_ref_transaction_commit(struct ref_transaction *transaction,
|
|
|
|
struct strbuf *err)
|
|
|
|
{
|
2017-03-26 10:42:35 +08:00
|
|
|
struct ref_store *refs = transaction->ref_store;
|
2016-09-05 00:08:39 +08:00
|
|
|
|
|
|
|
return refs->be->initial_transaction_commit(refs, transaction, err);
|
|
|
|
}
|
2016-09-05 00:08:40 +08:00
|
|
|
|
2022-02-17 21:04:32 +08:00
|
|
|
void ref_transaction_for_each_queued_update(struct ref_transaction *transaction,
|
|
|
|
ref_transaction_for_each_queued_update_fn cb,
|
|
|
|
void *cb_data)
|
|
|
|
{
|
|
|
|
int i;
|
|
|
|
|
|
|
|
for (i = 0; i < transaction->nr; i++) {
|
|
|
|
struct ref_update *update = transaction->updates[i];
|
|
|
|
|
|
|
|
cb(update->refname,
|
|
|
|
(update->flags & REF_HAVE_OLD) ? &update->old_oid : NULL,
|
|
|
|
(update->flags & REF_HAVE_NEW) ? &update->new_oid : NULL,
|
|
|
|
cb_data);
|
|
|
|
}
|
|
|
|
}
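The callback convention used by ref_transaction_for_each_queued_update(), where a value is passed as NULL when the corresponding flag is not set, can be tried out on a toy model. Everything below is a hypothetical stand-in: `toy_update`, the `TOY_HAVE_*` flags and the `int` values take the place of `struct ref_update`, `REF_HAVE_OLD`/`REF_HAVE_NEW` and object IDs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flags mirroring the role of REF_HAVE_NEW/REF_HAVE_OLD. */
#define TOY_HAVE_NEW (1u << 0)
#define TOY_HAVE_OLD (1u << 1)

struct toy_update {
	const char *refname;
	unsigned int flags;
	int old_value; /* only meaningful with TOY_HAVE_OLD */
	int new_value; /* only meaningful with TOY_HAVE_NEW */
};

typedef void toy_update_fn(const char *refname, const int *old_value,
			   const int *new_value, void *cb_data);

/* Call cb once per update, passing NULL for values that are not set. */
static void toy_for_each_update(const struct toy_update *updates, size_t nr,
				toy_update_fn *cb, void *cb_data)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		const struct toy_update *u = &updates[i];

		cb(u->refname,
		   (u->flags & TOY_HAVE_OLD) ? &u->old_value : NULL,
		   (u->flags & TOY_HAVE_NEW) ? &u->new_value : NULL,
		   cb_data);
	}
}

/* Example callback: count the updates that carry no old value. */
static void count_without_old(const char *refname, const int *old_value,
			      const int *new_value, void *cb_data)
{
	(void)refname;
	(void)new_value;
	if (!old_value)
		++*(size_t *)cb_data;
}
```

Passing NULL instead of a placeholder value forces a callback to distinguish "no value was given" from "the value happens to be zero".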
|
|
|
|
|
reflog: cleanse messages in the refs.c layer
Regarding reflog messages:
- We expect that a reflog message consists of a single line. The
file format used by the files backend may add a LF after the
message as a delimiter, and output by commands like "git log -g"
may complete such an incomplete line by adding a LF at the end,
but philosophically, the terminating LF is not a part of the
message.
- We however allow callers of refs API to supply a random sequence
of NUL terminated bytes. We cleanse caller-supplied message by
squashing a run of whitespaces into a SP, and by trimming trailing
whitespace, before storing the message. This is how we tolerate,
instead of erring out, a message with LF in it (be it at the end,
in the middle, or both).
Currently, the cleansing of the reflog message is done by the files
backend, before the log is written out. This is sufficient with the
current code, as that is the only backend that writes reflogs. But
new backends can be added that write reflogs, and we'd want the
resulting log message we would read out of "log -g" the same no
matter what backend is used, and moving the code to do so to the
generic layer is a way to do so.
An added benefit is that the "cleansing" function could be updated
later, independent from individual backends, to e.g. allow
multi-line log messages if we wanted to, and when that happens, it
would help a lot to ensure we covered all bases if the cleansing
function (which would be updated) is called from the generic layer.
Side note: I am not interested in supporting multi-line reflog
messages right at the moment (nobody is asking for it), but I
envision that instead of the "squash a run of whitespaces into a SP
and rtrim" cleansing, we can %urlencode problematic bytes in the
message *AND* append a SP at the end, when a new version of Git that
supports multi-line and/or verbatim reflog messages writes a reflog
record. The reading side can detect the presence of SP at the end
(which should have been rtrimmed out if it were written by existing
versions of Git) as a signal that decoding %urlencode recovers the
original reflog message.
Signed-off-by: Han-Wen Nienhuys <hanwen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-11 01:19:53 +08:00
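The cleansing rule described in the message above (squash each run of whitespace into a single SP, then trim trailing whitespace) can be sketched roughly as follows. This is a minimal illustration, not the code in refs.c: the function name `cleanse_reflog_message_sketch` is hypothetical, and the handling of leading whitespace (collapsed to one SP like any other run) is an assumption, since the message does not say what happens to it.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, NOT Git's normalize_reflog_message(): return a
 * newly allocated copy of msg in which every run of whitespace is
 * replaced by a single SP and trailing whitespace is dropped.
 * The caller owns the result and must free() it. Returns NULL on
 * allocation failure; a NULL msg is treated as an empty message.
 */
char *cleanse_reflog_message_sketch(const char *msg)
{
	size_t len = msg ? strlen(msg) : 0;
	char *out = malloc(len + 1);
	size_t n = 0;
	size_t i;

	if (!out)
		return NULL;

	for (i = 0; i < len; i++) {
		unsigned char c = (unsigned char)msg[i];

		if (isspace(c)) {
			/* Collapse the run: emit one SP for its first byte only. */
			if (n == 0 || out[n - 1] != ' ')
				out[n++] = ' ';
		} else {
			out[n++] = (char)c;
		}
	}

	/* rtrim: after squashing, at most one trailing SP can remain. */
	if (n > 0 && out[n - 1] == ' ')
		n--;

	out[n] = '\0';
	return out;
}
```

With this sketch, a message containing LF at the end, in the middle, or both comes out as a single line, which is the tolerance the commit message asks for. Doing this once in the generic layer means every backend receives the same already-cleansed string.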
|
|
|
int refs_delete_refs(struct ref_store *refs, const char *logmsg,
|
2017-05-22 22:17:38 +08:00
|
|
|
struct string_list *refnames, unsigned int flags)
|
2016-09-05 00:08:40 +08:00
|
|
|
{
|
2023-11-14 16:58:46 +08:00
|
|
|
struct ref_transaction *transaction;
|
|
|
|
struct strbuf err = STRBUF_INIT;
|
|
|
|
struct string_list_item *item;
|
|
|
|
int ret = 0, failures = 0;
|
2020-07-11 01:19:53 +08:00
|
|
|
char *msg;
|
2023-11-14 16:58:46 +08:00
|
|
|
|
|
|
|
if (!refnames->nr)
|
|
|
|
return 0;
|
2020-07-11 01:19:53 +08:00
|
|
|
|
|
|
|
msg = normalize_reflog_message(logmsg);
|
2023-11-14 16:58:46 +08:00
|
|
|
|
|
|
|
/*
|
|
|
|
* Since we don't check the references' old_oids, the
|
|
|
|
* individual updates can't fail, so we can pack all of the
|
|
|
|
* updates into a single transaction.
|
|
|
|
*/
|
|
|
|
transaction = ref_store_transaction_begin(refs, &err);
|
|
|
|
if (!transaction) {
|
|
|
|
ret = error("%s", err.buf);
|
|
|
|
goto out;
|
|
|
|
}
|
|
|
|
|
|
|
|
for_each_string_list_item(item, refnames) {
|
|
|
|
ret = ref_transaction_delete(transaction, item->string,
|
|
|
|
NULL, flags, msg, &err);
|
|
|
|
if (ret) {
|
|
|
|
warning(_("could not delete reference %s: %s"),
|
|
|
|
item->string, err.buf);
|
|
|
|
strbuf_reset(&err);
|
|
|
|
failures = 1;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
ret = ref_transaction_commit(transaction, &err);
|
|
|
|
if (ret) {
|
|
|
|
if (refnames->nr == 1)
|
|
|
|
error(_("could not delete reference %s: %s"),
|
|
|
|
refnames->items[0].string, err.buf);
|
|
|
|
else
|
|
|
|
error(_("could not delete references: %s"), err.buf);
|
|
|
|
}
|
|
|
|
|
|
|
|
out:
|
|
|
|
if (!ret && failures)
|
|
|
|
ret = -1;
|
|
|
|
ref_transaction_free(transaction);
|
|
|
|
strbuf_release(&err);
|
2020-07-11 01:19:53 +08:00
|
|
|
free(msg);
|
2023-11-14 16:58:46 +08:00
|
|
|
return ret;
|
2016-09-05 00:08:40 +08:00
|
|
|
}
|
2016-09-05 00:08:42 +08:00
|
|
|
|
2017-03-26 10:42:34 +08:00
|
|
|
int refs_rename_ref(struct ref_store *refs, const char *oldref,
|
|
|
|
const char *newref, const char *logmsg)
|
|
|
|
{
|
2020-07-11 01:19:53 +08:00
|
|
|
char *msg;
|
|
|
|
int retval;
|
|
|
|
|
|
|
|
msg = normalize_reflog_message(logmsg);
|
|
|
|
retval = refs->be->rename_ref(refs, oldref, newref, msg);
|
|
|
|
free(msg);
|
|
|
|
return retval;
|
2016-09-05 00:08:42 +08:00
|
|
|
}
|
2017-03-26 10:42:34 +08:00
|
|
|
|
branch: add a --copy (-c) option to go with --move (-m)
Add the ability to --copy a branch and its reflog and configuration.
This uses the same underlying machinery as the --move (-m) option,
except that the reflog and configuration are copied instead of being moved.
This is useful for e.g. copying a topic branch to a new version,
e.g. work to work-2 after submitting the work topic to the list, while
preserving all the tracking info and other configuration that goes
with the branch, and unlike --move keeping the other already-submitted
branch around for reference.
Like --move, when the source branch is the currently checked out
branch the HEAD is moved to the destination branch. In the case of
--move we don't really have a choice (other than remaining on a
detached HEAD) and in order to keep the functionality consistent, we
are doing it in a similar way for --copy too.
The most common usage of this feature is expected to be moving to a
new topic branch which is a copy of the current one. In that case
moving to the target branch is what the user wants, and doesn't
unexpectedly behave differently than --move would.
One outstanding caveat of this implementation is that:
git checkout maint &&
git checkout master &&
git branch -c topic &&
git checkout -
will check out 'maint' instead of 'master'. This is because the @{-N}
feature (or its -1 shorthand "-") relies on HEAD reflogs created by
the checkout command, so in this case we'll check out maint rather than
master, which is what the user might expect. What to do about that is
left to a future change.
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Sahil Dua <sahildua2305@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-19 05:19:16 +08:00
|
|
|
int refs_copy_existing_ref(struct ref_store *refs, const char *oldref,
|
|
|
|
const char *newref, const char *logmsg)
|
|
|
|
{
|
2020-07-11 01:19:53 +08:00
|
|
|
char *msg;
|
|
|
|
int retval;
|
|
|
|
|
|
|
|
msg = normalize_reflog_message(logmsg);
|
|
|
|
retval = refs->be->copy_ref(refs, oldref, newref, msg);
|
|
|
|
free(msg);
|
|
|
|
return retval;
|
2017-06-19 05:19:16 +08:00
|
|
|
}
|
|
|
|
|
2024-05-07 20:58:55 +08:00
|
|
|
const char *ref_update_original_update_refname(struct ref_update *update)
|
|
|
|
{
|
|
|
|
while (update->parent_update)
|
|
|
|
update = update->parent_update;
|
|
|
|
|
|
|
|
return update->refname;
|
|
|
|
}
|
2024-05-07 20:58:56 +08:00
|
|
|
|
|
|
|
int ref_update_has_null_new_value(struct ref_update *update)
|
|
|
|
{
|
|
|
|
return !update->new_target && is_null_oid(&update->new_oid);
|
|
|
|
}
|
|
|
|
|
|
|
|
int ref_update_check_old_target(const char *referent, struct ref_update *update,
|
|
|
|
struct strbuf *err)
|
|
|
|
{
|
|
|
|
if (!update->old_target)
|
|
|
|
BUG("called without old_target set");
|
|
|
|
|
|
|
|
if (!strcmp(referent, update->old_target))
|
|
|
|
return 0;
|
|
|
|
|
|
|
|
if (!strcmp(referent, ""))
|
|
|
|
strbuf_addf(err, "verifying symref target: '%s': "
|
|
|
|
"reference is missing but expected %s",
|
|
|
|
ref_update_original_update_refname(update),
|
|
|
|
update->old_target);
|
|
|
|
else
|
|
|
|
strbuf_addf(err, "verifying symref target: '%s': "
|
|
|
|
"is at %s but expected %s",
|
|
|
|
ref_update_original_update_refname(update),
|
|
|
|
referent, update->old_target);
|
|
|
|
return -1;
|
|
|
|
}
|