submodule: lazily add submodule ODBs as alternates

Teach Git to add submodule ODBs as alternates to the object store of
the_repository only upon the first access of an object not in
the_repository, and not when add_submodule_odb() is called.

This provides a means of gradually migrating from accessing a
submodule's object through alternates to accessing a submodule's object
by explicitly passing its repository object. Any Git command can declare
that it might access submodule objects by calling add_submodule_odb()
(as they do now), but the submodule ODBs themselves will not be added
until needed, so individual commands and/or combinations of arguments
can be migrated one by one.

[The advantage of explicit repository-object passing is code clarity (it
is clear which repository an object read is from), performance (there is
no need to linearly search through all submodule ODBs whenever an object
is accessed from any repository, whether superproject or submodule), and
the possibility of future features like partial clone submodules (which
right now is not possible because if an object is missing, we do not
know which repository to lazy-fetch into).]

This commit also introduces an environment variable that a test may set
to make the actual registration of alternates fatal, in order to
demonstrate that its codepaths do not need this registration.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Emily Shaffer <emilyshaffer@google.com>
Reviewed-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit is contained in:
Jonathan Tan 2021-08-16 14:09:51 -07:00 committed by Junio C Hamano
parent 2d755dfac9
commit a35e03dee0
4 changed files with 41 additions and 1 deletions

View File

@ -32,6 +32,7 @@
#include "packfile.h"
#include "object-store.h"
#include "promisor-remote.h"
#include "submodule.h"
/* The maximum size for an object header. */
#define MAX_HEADER_LEN 32
@ -1592,6 +1593,10 @@ static int do_oid_object_info_extended(struct repository *r,
break;
}
if (register_all_submodule_odb_as_alternates())
/* We added some alternates; retry */
continue;
/* Check if it is a missing object */
if (fetch_if_missing && repo_has_promisor_remote(r) &&
!already_retried &&

View File

@ -165,6 +165,8 @@ void stage_updated_gitmodules(struct index_state *istate)
die(_("staging updated .gitmodules failed"));
}
static struct string_list added_submodule_odb_paths = STRING_LIST_INIT_NODUP;
/* TODO: remove this function, use repo_submodule_init instead. */
int add_submodule_odb(const char *path)
{
@ -178,12 +180,28 @@ int add_submodule_odb(const char *path)
ret = -1;
goto done;
}
add_to_alternates_memory(objects_directory.buf);
string_list_insert(&added_submodule_odb_paths,
strbuf_detach(&objects_directory, NULL));
done:
strbuf_release(&objects_directory);
return ret;
}
int register_all_submodule_odb_as_alternates(void)
{
int i;
int ret = added_submodule_odb_paths.nr;
for (i = 0; i < added_submodule_odb_paths.nr; i++)
add_to_alternates_memory(added_submodule_odb_paths.items[i].string);
if (ret) {
string_list_clear(&added_submodule_odb_paths, 0);
if (git_env_bool("GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB", 0))
BUG("register_all_submodule_odb_as_alternates() called");
}
return ret;
}
void set_diffopt_flags_from_submodule_config(struct diff_options *diffopt,
const char *path)
{

View File

@ -97,7 +97,14 @@ int submodule_uses_gitfile(const char *path);
#define SUBMODULE_REMOVAL_IGNORE_IGNORED_UNTRACKED (1<<2)
int bad_to_remove_submodule(const char *path, unsigned flags);
/*
* Call add_submodule_odb() to add the submodule at the given path to a list.
* When register_all_submodule_odb_as_alternates() is called, the object stores
* of all submodules in that list will be added as alternates in
* the_repository.
*/
int add_submodule_odb(const char *path);
int register_all_submodule_odb_as_alternates(void);
/*
* Checks if there are submodule changes in a..b. If a is the null OID,

View File

@ -448,6 +448,16 @@ GIT_TEST_CHECKOUT_WORKERS=<n> overrides the 'checkout.workers' setting
to <n> and 'checkout.thresholdForParallelism' to 0, forcing the
execution of the parallel-checkout code.
GIT_TEST_FATAL_REGISTER_SUBMODULE_ODB=<boolean>, when true, makes
registering submodule ODBs as alternates a fatal action. Support for
this environment variable can be removed once the migration to
explicitly providing repositories when accessing submodule objects is
complete (in which case we might want to replace this with a trace2
call so that users can make it visible if accessing submodule objects
without an explicit repository still happens) or needs to be abandoned
for whatever reason (in which case the migrated codepaths still retain
their performance benefits).
Naming Tests
------------