global: introduce `USE_THE_REPOSITORY_VARIABLE` macro
Use of the `the_repository` variable is deprecated nowadays, and we
slowly but steadily convert the codebase to not use it anymore. Instead,
callers should be passing down the repository to work on via parameters.
It is hard though to prove that a given code unit does not use this
variable anymore. The most trivial case, merely demonstrating that there
is no direct use of `the_repository`, is already a bit of a pain during
code reviews as the reviewer needs to manually verify claims made by the
patch author. The bigger problem though is that we have many interfaces
that implicitly rely on `the_repository`.
Introduce a new `USE_THE_REPOSITORY_VARIABLE` macro that allows code
units to opt into usage of `the_repository`. The intent of this macro is
to demonstrate that a certain code unit does not use this variable
anymore, and to keep it from gaining new dependencies on it in future
changes, be they explicit or implicit.
For now, the macro only guards `the_repository` itself as well as
`the_hash_algo`. There are many more known interfaces where we have an
implicit dependency on `the_repository`, but those are not guarded at
the current point in time. Over time though, we should start to add
guards as required (or even better, just remove them).
Define the macro as required in our code units. As expected, most of our
code still relies on the global variable. Nearly all of our builtins
rely on the variable as there is no way yet to pass `the_repository` to
their entry point. For now, declare the macro in "builtin.h" to keep the
required changes at least a little bit more contained.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 14:50:23 +08:00
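The opt-in mechanism described in the commit message above can be illustrated with a minimal sketch. It assumes the guard is a plain `#ifdef` around the declaration of the global. `struct repository` here is a one-field stand-in, and `demo_repo`, `demo_gitdir()` and `demo_gitdir_global()` are hypothetical names that do not exist in Git.

```c
/* A code unit that still needs the global opts in before any header. */
#define USE_THE_REPOSITORY_VARIABLE

/* Stand-in for the real struct; the single field is hypothetical. */
struct repository {
	const char *gitdir;
};

/* Assumed shape of a guarded declaration inside a shared header. */
#ifdef USE_THE_REPOSITORY_VARIABLE
extern struct repository *the_repository;
#endif

/* Definition of the global; it would live in exactly one .c file. */
static struct repository demo_repo = { ".git" };
struct repository *the_repository = &demo_repo;

/* Preferred style: the repository arrives as a parameter. */
static const char *demo_gitdir(const struct repository *r)
{
	return r->gitdir;
}

/* Legacy style: compiles only because the macro above is defined. */
static const char *demo_gitdir_global(void)
{
	return demo_gitdir(the_repository);
}
```

If the `#define` at the top is dropped, `demo_gitdir_global()` no longer sees a declaration of `the_repository` and the compiler rejects it. That compile-time failure is what makes the absence of the global easy to verify in review.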
|
|
|
#define USE_THE_REPOSITORY_VARIABLE
|
|
|
|
|
2023-04-23 04:17:23 +08:00
|
|
|
#include "git-compat-util.h"
|
2023-02-24 08:09:27 +08:00
|
|
|
#include "hex.h"
|
2005-04-19 02:39:48 +08:00
|
|
|
#include "tree.h"
|
2023-04-11 15:41:49 +08:00
|
|
|
#include "object-name.h"
|
2023-05-16 14:34:06 +08:00
|
|
|
#include "object-store-ll.h"
|
2005-09-05 14:03:51 +08:00
|
|
|
#include "commit.h"
|
2018-05-16 05:48:42 +08:00
|
|
|
#include "alloc.h"
|
2006-05-30 03:16:12 +08:00
|
|
|
#include "tree-walk.h"
|
2018-06-29 09:21:51 +08:00
|
|
|
#include "repository.h"
|
2023-08-31 14:21:55 +08:00
|
|
|
#include "environment.h"
|
2005-04-19 02:39:48 +08:00
|
|
|
|
|
|
|
const char *tree_type = "tree";
|
|
|
|
|
2021-03-21 06:37:50 +08:00
|
|
|
int read_tree_at(struct repository *r,
|
|
|
|
struct tree *tree, struct strbuf *base,
|
2023-08-31 14:21:55 +08:00
|
|
|
int depth,
|
2021-03-21 06:37:50 +08:00
|
|
|
const struct pathspec *pathspec,
|
|
|
|
read_tree_fn_t fn, void *context)
|
2005-04-23 07:42:37 +08:00
|
|
|
{
|
2006-05-30 03:17:28 +08:00
|
|
|
struct tree_desc desc;
|
tree_entry(): new tree-walking helper function
This adds a "tree_entry()" function that combines the common operation of
doing a "tree_entry_extract()" + "update_tree_entry()".
It also has a simplified calling convention, designed for simple loops
that traverse over a whole tree: the arguments are pointers to the tree
descriptor and a name_entry structure to fill in, and it returns a boolean
"true" if there was an entry left to be gotten in the tree.
This allows tree traversal with
	struct tree_desc desc;
	struct name_entry entry;
	desc.buf = tree->buffer;
	desc.size = tree->size;
	while (tree_entry(&desc, &entry)) {
		... use "entry.{path, sha1, mode, pathlen}" ...
	}
which is not only shorter than writing it out in full, it's hopefully less
error prone too.
[ It's actually a tad faster too - we don't need to recalculate the entry
pathlength in both extract and update, but need to do it only once.
Also, some callers can avoid doing a "strlen()" on the result, since
it's returned as part of the name_entry structure.
However, by now we're talking just 1% speedup on "git-rev-list --objects
--all", and we're definitely at the point where tree walking is no
longer the issue any more. ]
NOTE! Not everybody wants to use this new helper function, since some of
the tree walkers very much on purpose do the descriptor update separately
from the entry extraction. So the "extract + update" sequence still
remains as the core sequence, this is just a simplified interface.
We should probably add a silly two-line inline helper function for
initializing the descriptor from the "struct tree" too, just to cut down
on the noise from that common "desc" initializer.
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2006-05-31 00:45:45 +08:00
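The "extract + update" pair and the combined helper described in the commit message above can be sketched on a toy buffer. This is an illustration under assumptions, not Git's implementation. Every `toy_` name is hypothetical, the ID length is shortened to four bytes, and the buffer is assumed to be well formed, with each entry laid out as octal mode, space, path, NUL, ID.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define TOY_ID_LEN 4 /* shortened for the sketch; real object IDs are longer */

struct toy_desc {
	const char *buf;  /* start of the next undecoded entry */
	size_t size;      /* bytes left in the buffer */
};

struct toy_entry {
	unsigned mode;
	const char *path;
	size_t pathlen;   /* computed once, reused by the update step */
	const unsigned char *id;
};

/* Decode the entry at the front of the buffer without advancing. */
static void toy_extract(const struct toy_desc *d, struct toy_entry *e)
{
	char *end;

	e->mode = (unsigned)strtoul(d->buf, &end, 8);
	e->path = end + 1; /* skip the space after the mode */
	e->pathlen = strlen(e->path);
	e->id = (const unsigned char *)e->path + e->pathlen + 1;
}

/* Advance the descriptor past an entry that was already extracted. */
static void toy_update(struct toy_desc *d, const struct toy_entry *e)
{
	size_t used = (size_t)((const char *)e->id + TOY_ID_LEN - d->buf);

	d->buf += used;
	d->size -= used;
}

/* Combined helper: returns 1 and fills *e while entries remain. */
static int toy_tree_entry(struct toy_desc *d, struct toy_entry *e)
{
	if (!d->size)
		return 0;
	toy_extract(d, e);
	toy_update(d, e);
	return 1;
}

/* The simple whole-tree loop the combined helper is meant for. */
static size_t toy_count_entries(const char *buf, size_t size)
{
	struct toy_desc d = { buf, size };
	struct toy_entry e;
	size_t n = 0;

	while (toy_tree_entry(&d, &e))
		n++;
	return n;
}
```

Callers that need to inspect an entry before deciding whether to advance can still call `toy_extract()` and `toy_update()` separately, which mirrors the note in the commit message that the two-step sequence remains the core interface.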
|
|
|
struct name_entry entry;
|
2017-05-07 06:10:15 +08:00
|
|
|
struct object_id oid;
|
2011-10-24 14:36:10 +08:00
|
|
|
int len, oldlen = base->len;
|
|
|
|
enum interesting retval = entry_not_interesting;
|
2006-05-30 03:17:28 +08:00
|
|
|
|
2023-08-31 14:21:55 +08:00
|
|
|
if (depth > max_allowed_tree_depth)
|
|
|
|
return error("exceeded maximum allowed tree depth");
|
|
|
|
|
2006-01-26 14:13:36 +08:00
|
|
|
if (parse_tree(tree))
|
|
|
|
return -1;
|
2006-05-30 03:17:28 +08:00
|
|
|
|
2023-10-02 10:40:28 +08:00
|
|
|
init_tree_desc(&desc, &tree->object.oid, tree->buffer, tree->size);
|
2006-05-30 03:17:28 +08:00
|
|
|
|
2006-05-31 00:45:45 +08:00
|
|
|
while (tree_entry(&desc, &entry)) {
|
2011-10-24 14:36:10 +08:00
|
|
|
if (retval != all_entries_interesting) {
|
2018-11-19 00:47:57 +08:00
|
|
|
retval = tree_entry_interesting(r->index, &entry,
|
2023-07-08 06:21:15 +08:00
|
|
|
base, pathspec);
|
2011-10-24 14:36:10 +08:00
|
|
|
if (retval == all_entries_not_interesting)
|
2011-03-25 17:34:18 +08:00
|
|
|
break;
|
2011-10-24 14:36:10 +08:00
|
|
|
if (retval == entry_not_interesting)
|
2011-03-25 17:34:18 +08:00
|
|
|
continue;
|
|
|
|
}
|
2005-07-15 02:26:31 +08:00
|
|
|
|
2019-01-15 08:39:44 +08:00
|
|
|
switch (fn(&entry.oid, base,
|
2021-03-21 06:37:51 +08:00
|
|
|
entry.path, entry.mode, context)) {
|
2005-11-27 01:38:20 +08:00
|
|
|
case 0:
|
|
|
|
continue;
|
|
|
|
case READ_TREE_RECURSIVE:
|
2009-02-11 09:42:04 +08:00
|
|
|
break;
|
2005-11-27 01:38:20 +08:00
|
|
|
default:
|
|
|
|
return -1;
|
|
|
|
}
|
2009-01-25 08:52:05 +08:00
|
|
|
|
2011-03-25 17:34:18 +08:00
|
|
|
if (S_ISDIR(entry.mode))
|
2019-01-15 08:39:44 +08:00
|
|
|
oidcpy(&oid, &entry.oid);
|
2011-03-25 17:34:18 +08:00
|
|
|
else if (S_ISGITLINK(entry.mode)) {
|
|
|
|
struct commit *commit;
|
2009-01-25 08:52:05 +08:00
|
|
|
|
2019-01-30 04:47:56 +08:00
|
|
|
commit = lookup_commit(r, &entry.oid);
|
2009-01-25 08:52:05 +08:00
|
|
|
if (!commit)
|
2011-03-25 17:34:18 +08:00
|
|
|
die("Commit %s in submodule path %s%s not found",
|
2019-01-15 08:39:44 +08:00
|
|
|
oid_to_hex(&entry.oid),
|
2011-03-25 17:34:18 +08:00
|
|
|
base->buf, entry.path);
|
2009-01-25 08:52:05 +08:00
|
|
|
|
libs: use "struct repository *" argument, not "the_repository"
As can easily be seen from grepping in our sources, we had these uses
of "the_repository" in various library code in cases where the
function in question was already getting a "struct repository *"
argument. Let's use that argument instead.
Out of these changes only the changes to "cache-tree.c",
"commit-reach.c", "shallow.c" and "upload-pack.c" would have cleanly
applied before the migration away from the "repo_*()" wrapper macros
in the preceding commits.
The rest aren't new, as we'd previously implicitly refer to
"the_repository", but it's now more obvious that we were doing the
wrong thing all along, and should have used the parameter instead.
Changing "get_index_format_default(the_repository)" in "read-cache.c"
to use the "r" variable instead should arguably have been part of [1],
or of the subsequent cleanup in [2]. Let's do it here. As can be seen
from the initial code in [3], it's not important that we use
"the_repository" there; we would prefer to always use the current
repository.
This change excludes the "the_repository" use in "upload-pack.c"'s
upload_pack_advertise(), as the in-flight [4] makes that change.
1. ee1f0c242ef (read-cache: add index.skipHash config option,
2023-01-06)
2. 6269f8eaad0 (treewide: always have a valid "index_state.repo"
member, 2023-01-17)
3. 7211b9e7534 (repo-settings: consolidate some config settings,
2019-08-13)
4. <Y/hbUsGPVNAxTdmS@coredump.intra.peff.net>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-28 21:58:58 +08:00
|
|
|
if (repo_parse_commit(r, commit))
|
2011-03-25 17:34:18 +08:00
|
|
|
die("Invalid commit %s in submodule path %s%s",
|
2019-01-15 08:39:44 +08:00
|
|
|
oid_to_hex(&entry.oid),
|
2011-03-25 17:34:18 +08:00
|
|
|
base->buf, entry.path);
|
|
|
|
|
2018-04-07 03:09:38 +08:00
|
|
|
oidcpy(&oid, get_commit_tree_oid(commit));
|
2005-04-23 07:42:37 +08:00
|
|
|
}
|
2011-03-25 17:34:18 +08:00
|
|
|
else
|
|
|
|
continue;
|
|
|
|
|
2011-10-24 14:36:09 +08:00
|
|
|
len = tree_entry_len(&entry);
|
2011-03-25 17:34:18 +08:00
|
|
|
strbuf_add(base, entry.path, len);
|
|
|
|
strbuf_addch(base, '/');
|
2021-03-21 06:37:50 +08:00
|
|
|
retval = read_tree_at(r, lookup_tree(r, &oid),
|
2023-08-31 14:21:55 +08:00
|
|
|
base, depth + 1, pathspec,
|
2021-03-21 06:37:50 +08:00
|
|
|
fn, context);
|
2011-03-25 17:34:18 +08:00
|
|
|
strbuf_setlen(base, oldlen);
|
|
|
|
if (retval)
|
|
|
|
return -1;
|
2005-04-23 07:42:37 +08:00
|
|
|
}
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2021-03-21 06:37:51 +08:00
|
|
|
int read_tree(struct repository *r,
|
|
|
|
struct tree *tree,
|
|
|
|
const struct pathspec *pathspec,
|
|
|
|
read_tree_fn_t fn, void *context)
|
2011-03-25 17:34:18 +08:00
|
|
|
{
|
|
|
|
struct strbuf sb = STRBUF_INIT;
|
2023-08-31 14:21:55 +08:00
|
|
|
int ret = read_tree_at(r, tree, &sb, 0, pathspec, fn, context);
|
2011-03-25 17:34:18 +08:00
|
|
|
strbuf_release(&sb);
|
|
|
|
return ret;
|
|
|
|
}
|
|
|
|
|
2023-04-23 04:17:22 +08:00
|
|
|
int base_name_compare(const char *name1, size_t len1, int mode1,
|
|
|
|
const char *name2, size_t len2, int mode2)
|
|
|
|
{
|
|
|
|
unsigned char c1, c2;
|
|
|
|
size_t len = len1 < len2 ? len1 : len2;
|
|
|
|
int cmp;
|
|
|
|
|
|
|
|
cmp = memcmp(name1, name2, len);
|
|
|
|
if (cmp)
|
|
|
|
return cmp;
|
|
|
|
c1 = name1[len];
|
|
|
|
c2 = name2[len];
|
|
|
|
if (!c1 && S_ISDIR(mode1))
|
|
|
|
c1 = '/';
|
|
|
|
if (!c2 && S_ISDIR(mode2))
|
|
|
|
c2 = '/';
|
|
|
|
return (c1 < c2) ? -1 : (c1 > c2) ? 1 : 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* df_name_compare() is identical to base_name_compare(), except it
|
|
|
|
* compares conflicting directory/file entries as equal. Note that
|
|
|
|
* while a directory name compares as equal to a regular file, they
|
|
|
|
* then individually compare _differently_ to a filename that has
|
|
|
|
* a dot after the basename (because '\0' < '.' < '/').
|
|
|
|
*
|
|
|
|
* This is used by routines that want to traverse the git namespace
|
|
|
|
* but then handle conflicting entries together when possible.
|
|
|
|
*/
|
|
|
|
int df_name_compare(const char *name1, size_t len1, int mode1,
|
|
|
|
const char *name2, size_t len2, int mode2)
|
|
|
|
{
|
|
|
|
unsigned char c1, c2;
|
|
|
|
size_t len = len1 < len2 ? len1 : len2;
|
|
|
|
int cmp;
|
|
|
|
|
|
|
|
cmp = memcmp(name1, name2, len);
|
|
|
|
if (cmp)
|
|
|
|
return cmp;
|
|
|
|
/* Directories and files compare equal (same length, same name) */
|
|
|
|
if (len1 == len2)
|
|
|
|
return 0;
|
|
|
|
c1 = name1[len];
|
|
|
|
if (!c1 && S_ISDIR(mode1))
|
|
|
|
c1 = '/';
|
|
|
|
c2 = name2[len];
|
|
|
|
if (!c2 && S_ISDIR(mode2))
|
|
|
|
c2 = '/';
|
|
|
|
if (c1 == '/' && !c2)
|
|
|
|
return 0;
|
|
|
|
if (c2 == '/' && !c1)
|
|
|
|
return 0;
|
|
|
|
return c1 - c2;
|
|
|
|
}
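The ordering rules described in the comment above `df_name_compare()` can be checked with a small sketch. It assumes NUL-terminated names and a boolean "is directory" flag instead of a mode; `toy_base_cmp()` and `toy_df_cmp()` are hypothetical helpers that only reproduce the sign of the result, not the exact return values of the functions in this file.

```c
#include <string.h>

/* Sign-compatible sketch of base_name_compare() for C strings. */
static int toy_base_cmp(const char *a, int a_is_dir,
			const char *b, int b_is_dir)
{
	size_t la = strlen(a), lb = strlen(b);
	size_t len = la < lb ? la : lb;
	int cmp = memcmp(a, b, len);
	unsigned char ca, cb;

	if (cmp)
		return cmp;
	ca = (unsigned char)a[len];
	cb = (unsigned char)b[len];
	/* A directory sorts as if its name ended in '/'. */
	if (!ca && a_is_dir)
		ca = '/';
	if (!cb && b_is_dir)
		cb = '/';
	return (ca < cb) ? -1 : (ca > cb) ? 1 : 0;
}

/* Same ordering, except a file and a directory of the same name tie. */
static int toy_df_cmp(const char *a, int a_is_dir,
		      const char *b, int b_is_dir)
{
	if (!strcmp(a, b))
		return 0;
	return toy_base_cmp(a, a_is_dir, b, b_is_dir);
}
```

With these helpers, a file `foo` sorts before `foo.c` while a directory `foo` sorts after it, because `'\0' < '.' < '/'`. The directory/file variant reports the two `foo` entries as equal so that a traversal can handle the conflicting pair together.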
|
|
|
|
|
|
|
|
int name_compare(const char *name1, size_t len1, const char *name2, size_t len2)
|
|
|
|
{
|
|
|
|
size_t min_len = (len1 < len2) ? len1 : len2;
|
|
|
|
int cmp = memcmp(name1, name2, min_len);
|
|
|
|
if (cmp)
|
|
|
|
return cmp;
|
|
|
|
if (len1 < len2)
|
|
|
|
return -1;
|
|
|
|
if (len1 > len2)
|
|
|
|
return 1;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2018-06-29 09:22:09 +08:00
|
|
|
struct tree *lookup_tree(struct repository *r, const struct object_id *oid)
|
2005-04-19 02:39:48 +08:00
|
|
|
{
|
2019-06-20 15:41:14 +08:00
|
|
|
struct object *obj = lookup_object(r, oid);
|
2007-04-17 13:11:43 +08:00
|
|
|
if (!obj)
|
2019-06-20 15:41:21 +08:00
|
|
|
return create_object(r, oid, alloc_tree_node(r));
|
2020-06-17 17:14:08 +08:00
|
|
|
return object_as_type(obj, OBJ_TREE, 0);
|
2005-04-19 02:39:48 +08:00
|
|
|
}
|
|
|
|
|
2006-05-30 03:18:33 +08:00
|
|
|
int parse_tree_buffer(struct tree *item, void *buffer, unsigned long size)
|
|
|
|
{
|
2005-04-19 02:39:48 +08:00
|
|
|
if (item->object.parsed)
|
|
|
|
return 0;
|
|
|
|
item->object.parsed = 1;
|
2006-05-30 03:16:12 +08:00
|
|
|
item->buffer = buffer;
|
|
|
|
item->size = size;
|
|
|
|
|
2006-05-30 03:18:33 +08:00
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
add quieter versions of parse_{tree,commit}
When we call parse_commit, it will complain to stderr if the
object does not exist or cannot be read. This means that we
may produce useless error messages if this situation is
expected (e.g., because the object is marked UNINTERESTING,
or because revs->ignore_missing_links is set).
We can fix this by adding a new "parse_X_gently" form that
takes a flag to suppress the messages. The existing
"parse_X" form is already gentle in the sense that it
returns an error rather than dying, and we could in theory
just add a "quiet" flag to it (with existing callers passing
"0"). But doing it this way means we do not have to disturb
existing callers.
Note also that the new flag is "quiet_on_missing", and not
just "quiet". We could add a flag to suppress _all_ errors,
but besides being a more invasive change (we would have to
pass the flag down to sub-functions, too), there is a good
reason not to: we would never want to use it. Missing a
linked object is expected in some circumstances, but it is
never expected to have a malformed commit, or to get a tree
when we wanted a commit. We should always complain about
these corruptions.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-01 17:56:26 +08:00
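The "quiet_on_missing" design described in the commit message above can be sketched as follows. This is a toy model under assumptions: the object state is passed in directly as an enum rather than read from an object store, and `toy_status`, `toy_error()`, `toy_parse_gently()` and `toy_parse()` are hypothetical names.

```c
#include <stdio.h>

enum toy_status { TOY_OK, TOY_MISSING, TOY_CORRUPT };

/* Counts reported errors so the behavior is observable in a test. */
static int toy_error_count;

/* Report a problem on stderr and return -1, like an error() helper. */
static int toy_error(const char *msg)
{
	toy_error_count++;
	fprintf(stderr, "error: %s\n", msg);
	return -1;
}

/*
 * Only the "missing" case can be silenced. Corruption is always
 * reported, whatever the flag says.
 */
static int toy_parse_gently(enum toy_status st, int quiet_on_missing)
{
	switch (st) {
	case TOY_OK:
		return 0;
	case TOY_MISSING:
		return quiet_on_missing ? -1 :
			toy_error("could not read object");
	default:
		return toy_error("object is malformed");
	}
}

/* Existing callers keep the noisy behavior without any change. */
static int toy_parse(enum toy_status st)
{
	return toy_parse_gently(st, 0);
}
```

Keeping `toy_parse()` as a thin wrapper reflects the trade-off named in the commit message: the new flag is available to callers that expect missing objects, while existing call sites stay untouched.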
|
|
|
int parse_tree_gently(struct tree *item, int quiet_on_missing)
|
2005-05-07 01:48:34 +08:00
|
|
|
{
|
2007-02-27 03:55:59 +08:00
|
|
|
enum object_type type;
|
2005-05-07 01:48:34 +08:00
|
|
|
void *buffer;
|
|
|
|
unsigned long size;
|
|
|
|
|
|
|
|
if (item->object.parsed)
|
|
|
|
return 0;
|
2023-03-28 21:58:50 +08:00
|
|
|
buffer = repo_read_object_file(the_repository, &item->object.oid,
|
|
|
|
&type, &size);
|
2005-05-07 01:48:34 +08:00
|
|
|
if (!buffer)
|
2015-06-01 17:56:26 +08:00
|
|
|
return quiet_on_missing ? -1 :
|
|
|
|
error("Could not read %s",
|
2015-11-10 10:22:28 +08:00
|
|
|
oid_to_hex(&item->object.oid));
|
2007-02-27 03:55:59 +08:00
|
|
|
if (type != OBJ_TREE) {
|
2005-05-07 01:48:34 +08:00
|
|
|
free(buffer);
|
|
|
|
return error("Object %s not a tree",
|
2015-11-10 10:22:28 +08:00
|
|
|
oid_to_hex(&item->object.oid));
|
2005-05-07 01:48:34 +08:00
|
|
|
}
|
2006-05-30 03:16:12 +08:00
|
|
|
return parse_tree_buffer(item, buffer, size);
|
2005-05-07 01:48:34 +08:00
|
|
|
}
|
2005-09-05 14:03:51 +08:00
|
|
|
|
2013-06-06 06:37:39 +08:00
|
|
|
void free_tree_buffer(struct tree *tree)
|
|
|
|
{
|
2017-06-16 07:15:46 +08:00
|
|
|
FREE_AND_NULL(tree->buffer);
|
2013-06-06 06:37:39 +08:00
|
|
|
tree->size = 0;
|
|
|
|
tree->object.parsed = 0;
|
|
|
|
}
|
|
|
|
|
2017-05-07 06:10:37 +08:00
|
|
|
struct tree *parse_tree_indirect(const struct object_id *oid)
|
2005-09-05 14:03:51 +08:00
|
|
|
{
|
2019-08-30 03:06:22 +08:00
|
|
|
struct repository *r = the_repository;
|
|
|
|
struct object *obj = parse_object(r, oid);
|
|
|
|
return (struct tree *)repo_peel_to_type(r, NULL, 0, obj, OBJ_TREE);
|
2005-09-05 14:03:51 +08:00
|
|
|
}
|