global: introduce `USE_THE_REPOSITORY_VARIABLE` macro

Use of the `the_repository` variable is deprecated nowadays, and we
slowly but steadily convert the codebase to not use it anymore. Instead,
callers should be passing down the repository to work on via parameters.

It is hard though to prove that a given code unit does not use this
variable anymore. The most trivial case, merely demonstrating that there
is no direct use of `the_repository`, is already a bit of a pain during
code reviews as the reviewer needs to manually verify claims made by the
patch author. The bigger problem though is that we have many interfaces
that implicitly rely on `the_repository`.

Introduce a new `USE_THE_REPOSITORY_VARIABLE` macro that allows code
units to opt into usage of `the_repository`. The intent of this macro is
to demonstrate that a certain code unit does not use this variable
anymore, and to keep it from gaining new dependencies on it in future
changes, be they explicit or implicit.

For now, the macro only guards `the_repository` itself as well as
`the_hash_algo`. There are many more known interfaces where we have an
implicit dependency on `the_repository`, but those are not guarded at
the current point in time. Over time though, we should start to add
guards as required (or even better, just remove them).

Define the macro as required in our code units. As expected, most of our
code still relies on the global variable. Nearly all of our builtins
rely on the variable as there is no way yet to pass `the_repository` to
their entry point. For now, declare the macro in "builtin.h" to keep the
required changes at least a little bit more contained.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
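
As a rough sketch of the mechanism (not the literal patch), the guard can
be thought of as only declaring the global for code units that explicitly
opt in, so any new use in a unit that has not opted in fails to compile;
`the_hash_algo` is guarded the same way:

/* Sketch of the guard in a header such as "repository.h" (not this file). */
#ifdef USE_THE_REPOSITORY_VARIABLE
extern struct repository *the_repository;
#endif

A code unit that still needs the global opts in by defining the macro
before its first include, exactly as promisor-remote.c does below.
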
#define USE_THE_REPOSITORY_VARIABLE

#include "git-compat-util.h"
#include "gettext.h"
#include "hex.h"
#include "object-store-ll.h"
#include "promisor-remote.h"
#include "config.h"
#include "trace2.h"
#include "transport.h"
#include "strvec.h"
promisor-remote: die upon failing fetch

In a partial clone, an attempt to read a missing object results in an
attempt to fetch that single object. In order to avoid multiple
sequential fetches, which would occur when multiple objects are missing
(which is the typical case), some commands have been taught to prefetch
in a batch: such a command would, in a partial clone, notice that
several objects that it will eventually need are missing, and call
promisor_remote_get_direct() with all such objects at once.

When this batch prefetch fails, these commands fall back to the
sequential fetches. But at $DAYJOB we have noticed that this results in
a bad user experience: a command would take unexpectedly long to finish
(and possibly use up a lot of bandwidth) if the batch prefetch were to
fail for some intermittent reason, but all subsequent fetches would
work. It would be a better user experience for such a command to just
fail.

Therefore, make it a fatal error if the prefetch fails and at least one
object being fetched is known to be a promisor object. (The latter
criterion is to make sure that we are not misleading the user that such
an object would be present from the promisor remote. For example, a
missing object may be a result of repository corruption and not because
it is expectedly missing due to the repository being a partial clone.)

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
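
As an illustration of that batch pattern, a caller-side prefetch might
look roughly like the sketch below. The helper name is hypothetical and
the sketch assumes "oid-array.h" for the list type; the existence check
mirrors the one remove_fetched_oids() uses further down, and the final
call is the batch entry point defined near the bottom of this file:

/* Hypothetical batch-prefetch helper, not part of this file. */
static void prefetch_missing(struct repository *repo,
                             const struct object_id *oids, int nr)
{
        struct oid_array missing = OID_ARRAY_INIT;
        int i;

        /* Collect what is missing locally, without lazy-fetching. */
        for (i = 0; i < nr; i++)
                if (oid_object_info_extended(repo, &oids[i], NULL,
                                             OBJECT_INFO_SKIP_FETCH_OBJECT))
                        oid_array_append(&missing, &oids[i]);

        /* One batched fetch; dies if it fails for a known promisor object. */
        if (missing.nr)
                promisor_remote_get_direct(repo, missing.oid, missing.nr);

        oid_array_clear(&missing);
}
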
#include "packfile.h"
#include "environment.h"

struct promisor_remote_config {
        struct promisor_remote *promisors;
        struct promisor_remote **promisors_tail;
};

static int fetch_objects(struct repository *repo,
                         const char *remote_name,
                         const struct object_id *oids,
                         int oid_nr)
{
        struct child_process child = CHILD_PROCESS_INIT;
        int i;
        FILE *child_in;
        int quiet;

upload-pack: disable lazy-fetching by default

The upload-pack command tries to avoid trusting the repository in which
it's run (e.g., by not running any hooks and not using any config that
contains arbitrary commands). But if the server side of a fetch or a
clone is a partial clone, then either upload-pack or its child
pack-objects may run a lazy "git fetch" under the hood. And it is very
easy to convince fetch to run arbitrary commands.

The "server" side can be a local repository owned by someone else, who
would be able to configure commands that are run during a clone with the
current user's permissions. This issue has been designated
CVE-2024-32004.

The fix in this commit's parent helps in this scenario, as well as in
related scenarios using SSH to clone, where the untrusted .git directory
is owned by a different user id. But if you received one as a zip file,
on a USB stick, etc, it may be owned by your user but still untrusted.
This has been designated CVE-2024-32465.

To mitigate the issue more completely, let's disable lazy fetching
entirely during `upload-pack`. While fetching from a partial repository
should be relatively rare, it is certainly not an unreasonable workflow.
And thus we need to provide an escape hatch.

This commit works by respecting a GIT_NO_LAZY_FETCH environment variable
(to skip the lazy-fetch), and setting it in upload-pack, but only when
the user has not already done so (which gives us the escape hatch).

The name of the variable is specifically chosen to match what has
already been added in 'master' via e6d5479e7a (git: extend
--no-lazy-fetch to work across subprocesses, 2024-02-27). Since we're
building this fix as a backport for older versions, we could cherry-pick
that patch and its earlier steps. However, we don't really need the
niceties (like a "--no-lazy-fetch" option) that it offers. By using the
same name, everything should just work when the two are eventually
merged, but here are a few notes:

- the blocking of the fetch in e6d5479e7a is incomplete! It sets
  fetch_if_missing to 0 when we set up the repository variable, but
  that isn't enough. pack-objects in particular will call
  prefetch_to_pack() even if that variable is 0. This patch by
  contrast checks the environment variable at the lowest level before
  we call the lazy fetch, where we can be sure to catch all code
  paths.

  Possibly the setting of fetch_if_missing from e6d5479e7a can be
  reverted, but it may be useful to have. For example, some code may
  want to use that flag to change behavior before it gets to the point
  of trying to start the fetch. At any rate, that's all outside the
  scope of this patch.

- there's documentation for GIT_NO_LAZY_FETCH in e6d5479e7a. We can
  live without that here, because for the most part the user shouldn't
  need to set it themselves. The exception is if they do want to
  override upload-pack's default, and that requires a separate
  documentation section (which is added here).

- it would be nice to use the NO_LAZY_FETCH_ENVIRONMENT macro added by
  e6d5479e7a, but those definitions have moved from cache.h to
  environment.h between 2.39.3 and master. I just used the raw string
  literals, and we can replace them with the macro once this topic is
  merged to master.

At least with respect to CVE-2024-32004, this does render this commit's
parent commit somewhat redundant. However, it is worth retaining that
commit as defense in depth, and because it may help other issues (e.g.,
symlink/hardlink TOCTOU races, where zip files are not really an
interesting attack vector).

The tests in t0411 still pass, but now we have _two_ mechanisms ensuring
that the evil command is not run. Let's beef up the existing ones to
check that they failed for the expected reason, that we refused to run
upload-pack at all with an alternate user id. And add two new ones for
the same-user case that test both the restriction and its escape hatch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
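
A hedged sketch of the producer side described above (hypothetical helper
name; the real upload-pack change would use Git's own wrappers rather
than bare setenv): export the variable, but only when the user has not
already set it, so an explicit setting of their own wins. The consumer
side is the git_env_bool() check in the guard right below.

/* Sketch only: NO_LAZY_FETCH_ENVIRONMENT expands to "GIT_NO_LAZY_FETCH". */
static void disable_lazy_fetch_unless_overridden(void)
{
        if (!getenv(NO_LAZY_FETCH_ENVIRONMENT))
                setenv(NO_LAZY_FETCH_ENVIRONMENT, "1", 1);
}
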
        if (git_env_bool(NO_LAZY_FETCH_ENVIRONMENT, 0)) {
                static int warning_shown;

                if (!warning_shown) {
                        warning_shown = 1;
                        warning(_("lazy fetching disabled; some objects may not be available"));
                }
                return -1;
        }

        child.git_cmd = 1;
        child.in = -1;
        if (repo != the_repository)
                prepare_other_repo_env(&child.env, repo->gitdir);
        strvec_pushl(&child.args, "-c", "fetch.negotiationAlgorithm=noop",
                     "fetch", remote_name, "--no-tags",
                     "--no-write-fetch-head", "--recurse-submodules=no",
                     "--filter=blob:none", "--stdin", NULL);
        if (!git_config_get_bool("promisor.quiet", &quiet) && quiet)
                strvec_push(&child.args, "--quiet");
        if (start_command(&child))
                die(_("promisor-remote: unable to fork off fetch subprocess"));
        child_in = xfdopen(child.in, "w");

        trace2_data_intmax("promisor", repo, "fetch_count", oid_nr);

        for (i = 0; i < oid_nr; i++) {
                if (fputs(oid_to_hex(&oids[i]), child_in) < 0)
                        die_errno(_("promisor-remote: could not write to fetch subprocess"));
                if (fputc('\n', child_in) < 0)
                        die_errno(_("promisor-remote: could not write to fetch subprocess"));
        }

        if (fclose(child_in) < 0)
                die_errno(_("promisor-remote: could not close stdin to fetch subprocess"));
        return finish_command(&child) ? -1 : 0;
}

static struct promisor_remote *promisor_remote_new(struct promisor_remote_config *config,
                                                   const char *remote_name)
{
        struct promisor_remote *r;

        if (*remote_name == '/') {
                warning(_("promisor remote name cannot begin with '/': %s"),
                        remote_name);
                return NULL;
        }

        FLEX_ALLOC_STR(r, name, remote_name);

        *config->promisors_tail = r;
        config->promisors_tail = &r->next;

        return r;
}

static struct promisor_remote *promisor_remote_lookup(struct promisor_remote_config *config,
                                                      const char *remote_name,
                                                      struct promisor_remote **previous)
{
        struct promisor_remote *r, *p;

        for (p = NULL, r = config->promisors; r; p = r, r = r->next)
                if (!strcmp(r->name, remote_name)) {
                        if (previous)
                                *previous = p;
                        return r;
                }

        return NULL;
}

static void promisor_remote_move_to_tail(struct promisor_remote_config *config,
                                         struct promisor_remote *r,
                                         struct promisor_remote *previous)
{
        if (!r->next)
                return;

        if (previous)
                previous->next = r->next;
        else
                config->promisors = r->next ? r->next : r;
        r->next = NULL;
        *config->promisors_tail = r;
        config->promisors_tail = &r->next;
}

config: add ctx arg to config_fn_t
Add a new "const struct config_context *ctx" arg to config_fn_t to hold
additional information about the config iteration operation.
config_context has a "struct key_value_info kvi" member that holds
metadata about the config source being read (e.g. what kind of config
source it is, the filename, etc). In this series, we're only interested
in .kvi, so we could have just used "struct key_value_info" as an arg,
but config_context makes it possible to add/adjust members in the future
without changing the config_fn_t signature. We could also consider other
ways of organizing the args (e.g. moving the config name and value into
config_context or key_value_info), but in my experiments, the
incremental benefit doesn't justify the added complexity (e.g. a
config_fn_t will sometimes invoke another config_fn_t but with a
different config value).
In subsequent commits, the .kvi member will replace the global "struct
config_reader" in config.c, making config iteration a global-free
operation. It requires much more work for the machinery to provide
meaningful values of .kvi, so for now, merely change the signature and
call sites, pass NULL as a placeholder value, and don't rely on the arg
in any meaningful way.
Most of the changes are performed by
contrib/coccinelle/config_fn_ctx.pending.cocci, which, for every
config_fn_t:
- Modifies the signature to accept "const struct config_context *ctx"
- Passes "ctx" to any inner config_fn_t, if needed
- Adds UNUSED attributes to "ctx", if needed
Most config_fn_t instances are easily identified by seeing if they are
called by the various config functions. Most of the remaining ones are
manually named in the .cocci patch. Manual cleanups are still needed,
but the majority of it is trivial; it's either adjusting config_fn_t
that the .cocci patch didn't catch, or adding forward declarations of
"struct config_context ctx" to make the signatures make sense.
The non-trivial changes are in cases where we are invoking a config_fn_t
outside of config machinery, and we now need to decide what value of
"ctx" to pass. These cases are:
- trace2/tr2_cfg.c:tr2_cfg_set_fl()
This is indirectly called by git_config_set() so that the trace2
machinery can notice the new config values and update its settings
using the tr2 config parsing function, i.e. tr2_cfg_cb().
- builtin/checkout.c:checkout_main()
This calls git_xmerge_config() as a shorthand for parsing a CLI arg.
This might be worth refactoring away in the future, since
git_xmerge_config() can call git_default_config(), which can do much
more than just parsing.
Handle them by creating a KVI_INIT macro that initializes "struct
key_value_info" to a reasonable default, and use that to construct the
"ctx" arg.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
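
As a minimal illustration of the new shape (the callback below is
hypothetical, not part of this patch), a config_fn_t now receives the
context pointer alongside the usual name/value pair and can mark it
UNUSED until it needs the metadata; promisor_remote_config() right below
follows the same pattern:

/* Hypothetical callback showing the updated config_fn_t signature. */
static int example_config_cb(const char *var, const char *value,
                             const struct config_context *ctx UNUSED,
                             void *data)
{
        int *seen = data;

        if (!strcmp(var, "example.key") && value)
                (*seen)++;
        return 0;
}

It would be registered exactly as before, e.g. via
repo_config(repo, example_config_cb, &seen).
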
static int promisor_remote_config(const char *var, const char *value,
                                  const struct config_context *ctx UNUSED,
                                  void *data)
{
        struct promisor_remote_config *config = data;
        const char *name;
        size_t namelen;
        const char *subkey;

        if (parse_config_key(var, "remote", &name, &namelen, &subkey) < 0)
                return 0;

        if (!strcmp(subkey, "promisor")) {
                char *remote_name;

                if (!git_config_bool(var, value))
                        return 0;

                remote_name = xmemdupz(name, namelen);

                if (!promisor_remote_lookup(config, remote_name, NULL))
                        promisor_remote_new(config, remote_name);

                free(remote_name);
                return 0;
        }
        if (!strcmp(subkey, "partialclonefilter")) {
                struct promisor_remote *r;
                char *remote_name = xmemdupz(name, namelen);

                r = promisor_remote_lookup(config, remote_name, NULL);
                if (!r)
                        r = promisor_remote_new(config, remote_name);

                free(remote_name);

                if (!r)
                        return 0;

                FREE_AND_NULL(r->partial_clone_filter);
                return git_config_string(&r->partial_clone_filter, var, value);
        }

        return 0;
}

static void promisor_remote_init(struct repository *r)
{
        struct promisor_remote_config *config;

        if (r->promisor_remote_config)
                return;
        config = r->promisor_remote_config =
                xcalloc(1, sizeof(*r->promisor_remote_config));
        config->promisors_tail = &config->promisors;

        repo_config(r, promisor_remote_config, config);

        if (r->repository_format_partial_clone) {
                struct promisor_remote *o, *previous;

                o = promisor_remote_lookup(config,
                                           r->repository_format_partial_clone,
                                           &previous);
                if (o)
                        promisor_remote_move_to_tail(config, o, previous);
                else
                        promisor_remote_new(config, r->repository_format_partial_clone);
        }
}

void promisor_remote_clear(struct promisor_remote_config *config)
{
        while (config->promisors) {
                struct promisor_remote *r = config->promisors;
                free(r->partial_clone_filter);
                config->promisors = config->promisors->next;
                free(r);
        }

        config->promisors_tail = &config->promisors;
}

void repo_promisor_remote_reinit(struct repository *r)
{
        promisor_remote_clear(r->promisor_remote_config);
        FREE_AND_NULL(r->promisor_remote_config);
        promisor_remote_init(r);
}

struct promisor_remote *repo_promisor_remote_find(struct repository *r,
                                                  const char *remote_name)
{
        promisor_remote_init(r);

        if (!remote_name)
                return r->promisor_remote_config->promisors;

        return promisor_remote_lookup(r->promisor_remote_config, remote_name, NULL);
}

int repo_has_promisor_remote(struct repository *r)
{
        return !!repo_promisor_remote_find(r, NULL);
}

static int remove_fetched_oids(struct repository *repo,
                               struct object_id **oids,
                               int oid_nr, int to_free)
{
        int i, remaining_nr = 0;
        int *remaining = xcalloc(oid_nr, sizeof(*remaining));
        struct object_id *old_oids = *oids;
        struct object_id *new_oids;

        for (i = 0; i < oid_nr; i++)
                if (oid_object_info_extended(repo, &old_oids[i], NULL,
                                             OBJECT_INFO_SKIP_FETCH_OBJECT)) {
                        remaining[i] = 1;
                        remaining_nr++;
                }

        if (remaining_nr) {
                int j = 0;
                CALLOC_ARRAY(new_oids, remaining_nr);
                for (i = 0; i < oid_nr; i++)
                        if (remaining[i])
                                oidcpy(&new_oids[j++], &old_oids[i]);
                *oids = new_oids;
                if (to_free)
                        free(old_oids);
        }

        free(remaining);

        return remaining_nr;
}

void promisor_remote_get_direct(struct repository *repo,
                                const struct object_id *oids,
                                int oid_nr)
{
        struct promisor_remote *r;
        struct object_id *remaining_oids = (struct object_id *)oids;
        int remaining_nr = oid_nr;
        int to_free = 0;
        int i;

        if (oid_nr == 0)
                return;

        promisor_remote_init(repo);

        for (r = repo->promisor_remote_config->promisors; r; r = r->next) {
                if (fetch_objects(repo, r->name, remaining_oids, remaining_nr) < 0) {
                        if (remaining_nr == 1)
                                continue;
                        remaining_nr = remove_fetched_oids(repo, &remaining_oids,
                                                           remaining_nr, to_free);
                        if (remaining_nr) {
                                to_free = 1;
                                continue;
                        }
                }
                goto all_fetched;
        }

        for (i = 0; i < remaining_nr; i++) {
                if (is_promisor_object(&remaining_oids[i]))
                        die(_("could not fetch %s from promisor remote"),
                            oid_to_hex(&remaining_oids[i]));
        }

all_fetched:
        if (to_free)
                free(remaining_oids);
}