In a repository with many refs, check_refname_component can be a major
contributor to the runtime of some git commands. One such command is
git rev-parse HEAD
Timings for one particular repo, with about 60k refs, almost all
packed, are:
Old: 35 ms
New: 29 ms
Many other commands which read refs are also sped up.
Signed-off-by: David Turner <dturner@twitter.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we free the transaction when we are done, there is no need to
make a copy of transaction->updates before working with it.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It used to be that ref_transaction_commit() allocated a temporary
array to hold the types of references while it is working. Instead,
add a type field to ref_update that ref_transaction_commit() can use
as its scratch space.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Now that we manage ref_update objects internally, we can use them to
hold some of the scratch space we need when actually carrying out the
updates. Store the (struct ref_lock *) there.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use temporary variables in the for-loop blocks to simplify expressions
in the rest of the loop.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is consistent with the usual nomenclature.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It has been superseded by reference transactions. This also means
that struct ref_update can become private.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Build out the API for dealing with a bunch of reference checks and
changes within a transaction. Define an opaque ref_transaction type
that is managed entirely within refs.c. Introduce functions for
beginning a transaction, adding updates to a transaction, and
committing/rolling back a transaction.
This API will soon replace update_refs().
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The old signature of update_refs() required a
(const struct ref_update **) for its updates_orig argument. The
"const" is presumably there to promise that the function will not
modify the contents of the structures.
But this declaration does not permit the function to be called with a
(struct ref_update **), which is perfectly legitimate. C's type
system is not powerful enough to express what we'd like. So remove
the first "const" from the declaration.
On the other hand, the function *can* promise not to modify the
pointers within the array that is passed to it without inconveniencing
its callers. So add a "const" that has that effect, making the final
declaration
(struct ref_update * const *).
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Given that these constants are only being used when updating
references, it is inappropriate to give them such generic names as
"DIE_ON_ERR". So prefix their names with "UPDATE_REFS_".
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We invented hashcpy() to keep the abstraction of "object name"
behind it. Use it instead of calling memcpy() with hard-coded
20-byte length when moving object names between pieces of memory.
Leave ppc/sha1.c as-is, because the function is about the SHA-1 hash
algorithm whose output is and will always be 20 bytes.
Helped-by: Michael Haggerty <mhagger@alum.mit.edu>
Helped-by: Duy Nguyen <pclouds@gmail.com>
Signed-off-by: Sun He <sunheehnus@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make it clear that we don't use fnmatch() anymore.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up and protection against concurrent write access to the
ref namespace.
* mh/safe-create-leading-directories:
rename_tmp_log(): on SCLD_VANISHED, retry
rename_tmp_log(): limit the number of remote_empty_directories() attempts
rename_tmp_log(): handle a possible mkdir/rmdir race
rename_ref(): extract function rename_tmp_log()
remove_dir_recurse(): handle disappearing files and directories
remove_dir_recurse(): tighten condition for removing unreadable dir
lock_ref_sha1_basic(): if locking fails with ENOENT, retry
lock_ref_sha1_basic(): on SCLD_VANISHED, retry
safe_create_leading_directories(): add new error value SCLD_VANISHED
cmd_init_db(): when creating directories, handle errors conservatively
safe_create_leading_directories(): introduce enum for return values
safe_create_leading_directories(): always restore slash at end of loop
safe_create_leading_directories(): split on first of multiple slashes
safe_create_leading_directories(): rename local variable
safe_create_leading_directories(): add explicit "slash" pointer
safe_create_leading_directories(): reduce scope of local variable
safe_create_leading_directories(): fix format of "if" chaining
If safe_create_leading_directories() fails because a file along the
path unexpectedly vanished, try again from the beginning. Try at most
4 times.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This doesn't seem to be a likely error, but we've got the counter
anyway, so we might as well use it for an added bit of safety.
Please note that the first call to rename() is optimistic, and it is
normal for it to fail if there is a directory in the way. So bump the
total number of allowed attempts to 4, to be sure that we can still
have at least 3 retries in the case of a race.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a directory vanishes while renaming the temporary reflog file,
retry (up to 3 times). This could happen if another process deletes
the directory created by safe_create_leading_directories() just before
we rename the file into the directory.
As far as I can tell, this race could not occur internal to git. The
only time that a directory under $GIT_DIR/logs is deleted is if room
has to be made for a log file for a reference with the same name;
for example, in the following sequence:
git branch foo/bar # Creates file .git/logs/refs/heads/foo/bar
git branch -d foo/bar # Deletes file but leaves .git/logs/refs/heads/foo/
git branch foo # Deletes .git/logs/refs/heads/foo/
But the only reason the last command deletes the directory is because
it wants to create a file with the same name. So if another process
(e.g.,
git branch foo/baz
) wants to create that directory, one of the two is doomed to failure
anyway because of a D/F conflict.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If hold_lock_file_for_update() fails with errno==ENOENT, it might be
because somebody else (for example, a pack-refs process) has just
deleted one of the lockfile's ancestor directories. So if this
condition is detected, try again (up to 3 times).
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If safe_create_leading_directories() fails because a file along the
path unexpectedly vanished, try again (up to 3 times).
This can occur if another process is deleting directories at the same
time as we are trying to make them. For example, "git pack-refs
--all" tries to delete the loose refs and any empty directories that
are left behind. If a pack-refs process is running, then it might
delete a directory that we need to put a new loose reference in.
If safe_create_leading_directories() thinks this might have happened,
then take its advice and try again (maximum three attempts).
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We used to use two separate rules for the normal ref resolution
dwimming and dwimming done to decide which remote ref to grab. The
third parameter to refname_match() selected which rules to use.
When these two rules were harmonized in
2011-11-04 dd621df9cd refs DWIMmery: use the same rule for both "git fetch" and others
, ref_fetch_rules was #defined to avoid potential breakages for
in-flight topics.
It is now safe to remove the backwards-compatibility code, so remove
refname_match()'s third parameter, make ref_rev_parse_rules private to
refs.c, and remove ref_fetch_rules entirely.
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* mh/shorten-unambigous-ref:
shorten_unambiguous_ref(): tighten up pointer arithmetic
gen_scanf_fmt(): delete function and use snprintf() instead
shorten_unambiguous_ref(): introduce a new local variable
As long as we're being pathologically stingy with mallocs, we might as
well do the math right and save 6 (!) bytes.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To replace "%.*s" with "%s", all we have to do is use snprintf()
to interpolate "%s" into the pattern.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When filling the scanf_fmts array, use a separate variable to keep
track of the offset to avoid clobbering total_len (which we will need
in the next commit).
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Leaving only the function definitions and declarations so that any
new topic in flight can still make use of the old functions, replace
existing uses of the prefixcmp() and suffixcmp() with new API
functions.
The change can be recreated by mechanically applying this:
$ git grep -l -e prefixcmp -e suffixcmp -- \*.c |
grep -v strbuf\\.c |
xargs perl -pi -e '
s|!prefixcmp\(|starts_with\(|g;
s|prefixcmp\(|!starts_with\(|g;
s|!suffixcmp\(|ends_with\(|g;
s|suffixcmp\(|!ends_with\(|g;
'
on the result of preparatory changes in this series.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function 'invalidate_ref_cache' was introduced in 79c7ca5 (2011-10-17,
invalidate_ref_cache(): rename function from invalidate_cached_refs())
by a rename and elevated to be publicly usable in 8be8bde (2011-10-17,
invalidate_ref_cache(): expose this function in the refs API)
However it is not used anymore, as 8bf90dc (2011-10-17, write_ref_sha1():
only invalidate the loose ref cache) and (much) later 506a760 (2013-04-22,
refs: change how packed refs are deleted) removed any calls to this
function. So it seems as if we don't need that function any more,
good bye!
Signed-off-by: Stefan Beller <stefanbeller@googlemail.com>
Acked-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In shorten_unambiguous_ref, we build and cache a reverse-map of the
rev-parse rules like this:
static char **scanf_fmts;
static int nr_rules;
if (!nr_rules) {
for (; ref_rev_parse_rules[nr_rules]; nr_rules++)
... generate scanf_fmts ...
}
where ref_rev_parse_rules is terminated with a NULL pointer.
Compiling with "gcc -O2 -Wall" does not cause any problems, but
compiling with "-O3 -Wall" generates:
$ make CFLAGS='-O3 -Wall' refs.o
refs.c: In function ‘shorten_unambiguous_ref’:
refs.c:3379:29: warning: array subscript is above array bounds [-Warray-bounds]
for (; ref_rev_parse_rules[nr_rules]; nr_rules++)
Curiously, we can silence this by explicitly nr_rules to 0
in the beginning of the loop, even though the compiler
should be able to tell that we follow this code path only
when nr_rules is already 0.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A call to update_ref_lock() passes '0' to the 'int *type_p' parameter.
Noticed by sparse. ("Using plain integer as NULL pointer")
Signed-off-by: Ramsay Jones <ramsay@ramsay1.demon.co.uk>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Instead of typing four capital letters "HEAD", you can say "@" now,
e.g. "git log @".
* fc/at-head:
Add new @ shortcut for HEAD
sha1-name: pass len argument to interpret_branch_name()
Give "update-refs" a "--stdin" option to read multiple update
requests and perform them in an all-or-none fashion.
* bk/refs-multi-update:
update-ref: add test cases covering --stdin signature
update-ref: support multiple simultaneous updates
refs: add update_refs for multiple simultaneous updates
refs: add function to repack without multiple refs
refs: factor delete_ref loose ref step into a helper
refs: factor update_ref steps into helpers
refs: report ref type from lock_any_ref_for_update
reset: rename update_refs to reset_refs
Typing 'HEAD' is tedious, especially when we can use '@' instead.
The reason for choosing '@' is that it follows naturally from the
ref@op syntax (e.g. HEAD@{u}), except we have no ref, and no
operation, and when we don't have those, it makes sens to assume
'HEAD'.
So now we can use 'git show @~1', and all that goody goodness.
Until now '@' was a valid name, but it conflicts with this idea, so
let's make it invalid. Probably very few people, if any, used this name.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow a safer "rewind of the remote tip" push than blind "--force",
by requiring that the overwritten remote ref to be unchanged since
the new history to replace it was prepared.
The machinery is more or less ready. The "--force" option is again
the big red button to override any safety, thanks to J6t's sanity
(the original round allowed --lockref to defeat --force).
The logic to choose the default implemented here is fragile
(e.g. "git fetch" after seeing a failure will update the
remote-tracking branch and will make the next "push" pass,
defeating the safety pretty easily). It is suitable only for the
simplest workflows, and it may hurt users more than it helps them.
* jc/push-cas:
push: teach --force-with-lease to smart-http transport
send-pack: fix parsing of --force-with-lease option
t5540/5541: smart-http does not support "--force-with-lease"
t5533: test "push --force-with-lease"
push --force-with-lease: tie it all together
push --force-with-lease: implement logic to populate old_sha1_expect[]
remote.c: add command line option parser for "--force-with-lease"
builtin/push.c: use OPT_BOOL, not OPT_BOOLEAN
cache.h: move remote/connect API out of it
Add 'struct ref_update' to encode the information needed to update or
delete a ref (name, new sha1, optional old sha1, no-deref flag). Add
function 'update_refs' accepting an array of updates to perform. First
sort the input array to order locks consistently everywhere and reject
multiple updates to the same ref. Then acquire locks on all refs with
verified old values. Then update or delete all refs accordingly. Fail
if any one lock cannot be obtained or any one old value does not match.
Though the refs themselves cannot be modified together in a single
atomic transaction, this function does enable some useful semantics.
For example, a caller may create a new branch starting from the head of
another branch and rewind the original branch at the same time. This
transfers ownership of commits between branches without risk of losing
commits added to the original branch by a concurrent process, or risk of
a concurrent process creating the new branch first.
Signed-off-by: Brad King <brad.king@kitware.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Generalize repack_without_ref as repack_without_refs to support a list
of refs and implement the former in terms of the latter.
Signed-off-by: Brad King <brad.king@kitware.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Factor loose ref deletion into helper function delete_ref_loose to allow
later use elsewhere.
Signed-off-by: Brad King <brad.king@kitware.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Factor the lock and write steps and error handling into helper functions
update_ref_lock and update_ref_write to allow later use elsewhere.
Expose lock_any_ref_for_update's type_p to update_ref_lock callers.
While at it, drop "static" from the local "lock" variable as it is not
necessary to keep across invocations.
Signed-off-by: Brad King <brad.king@kitware.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is useful to make sure we don't step outside the boundaries of what
we are interpreting at the moment. For example while interpreting
foobar@{u}~1, the job of interpret_branch_name() ends right before ~1,
but there's no way to figure that out inside the function, unless the
len argument is passed.
So let's do that.
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Expose lock_ref_sha1_basic's type_p argument to callers of
lock_any_ref_for_update. Update all call sites to ignore it by passing
NULL for now.
Signed-off-by: Brad King <brad.king@kitware.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit cdfd94837b, as it
does not just apply to "@" (and forms with modifiers like @{u}
applied to it), but also affects e.g. "refs/heads/@/foo", which it
shouldn't.
The basic idea of giving a short-hand might be good, and the topic
can be retried later, but let's revert to avoid affecting existing
use cases for now for the upcoming release.
Fix a NULL-pointer dereference during nested iterations over
references (for example, when replace references are being used).
* mh/packed-refs-do-one-ref-recursion:
do_one_ref(): save and restore value of current_ref
If do_one_ref() is called recursively, then the inner call should not
permanently overwrite the value stored in current_ref by the outer
call. Aside from the tiny optimization loss, peel_ref() expects the
value of current_ref not to change across a call to peel_entry(). But
in the presence of replace references that assumption could be
violated by a recursive call to do_one_ref:
do_for_each_entry()
do_one_ref()
builtin/describe.c:get_name()
peel_ref()
peel_entry()
peel_object ()
deref_tag_noverify()
parse_object()
lookup_replace_object()
do_lookup_replace_object()
prepare_replace_object()
do_for_each_ref()
do_for_each_entry()
do_for_each_entry_in_dir()
do_one_ref()
The inner call to do_one_ref() was unconditionally setting current_ref
to NULL when it was done, causing peel_ref() to perform an invalid
memory access.
So change do_one_ref() to save the old value of current_ref before
overwriting it, and restore the old value afterward rather than
setting it to NULL.
Reported-by: Mantas Mikulėnas <grawity@gmail.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The definition of "struct ref" in "cache.h", a header file so
central to the system, always confused me. This structure is not
about the local ref used by sha1-name API to name local objects.
It is what refspecs are expanded into, after finding out what refs
the other side has, to define what refs are updated after object
transfer succeeds to what values. It belongs to "remote.h" together
with "struct refspec".
While we are at it, also move the types and functions related to the
Git transport connection to a new header file connect.h
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git pack-refs" that races with new ref creation or deletion have
been susceptible to lossage of refs under right conditions, which
has been tightened up.
* mh/ref-races:
for_each_ref: load all loose refs before packed refs
get_packed_ref_cache: reload packed-refs file when it changes
add a stat_validity struct
Extract a struct stat_data from cache_entry
packed_ref_cache: increment refcount when locked
do_for_each_entry(): increment the packed refs cache refcount
refs: manage lifetime of packed refs cache via reference counting
refs: implement simple transactions for the packed-refs file
refs: wrap the packed refs cache in a level of indirection
pack_refs(): split creation of packed refs and entry writing
repack_without_ref(): split list curation and entry writing
Now that we keep track of the packed-refs file metadata, we can detect
when the packed-refs file has been modified since we last read it, and
we do so automatically every time that get_packed_ref_cache() is
called. So there is no need to invalidate the cache automatically
when lock_packed_refs() is called; usually the old copy will still be
valid.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>