git/ls-refs.c

216 lines
5.4 KiB
C
Raw Normal View History

global: introduce `USE_THE_REPOSITORY_VARIABLE` macro Use of the `the_repository` variable is deprecated nowadays, and we slowly but steadily convert the codebase to not use it anymore. Instead, callers should be passing down the repository to work on via parameters. It is hard though to prove that a given code unit does not use this variable anymore. The most trivial case, merely demonstrating that there is no direct use of `the_repository`, is already a bit of a pain during code reviews as the reviewer needs to manually verify claims made by the patch author. The bigger problem though is that we have many interfaces that implicitly rely on `the_repository`. Introduce a new `USE_THE_REPOSITORY_VARIABLE` macro that allows code units to opt into usage of `the_repository`. The intent of this macro is to demonstrate that a certain code unit does not use this variable anymore, and to keep it from new dependencies on it in future changes, be it explicit or implicit For now, the macro only guards `the_repository` itself as well as `the_hash_algo`. There are many more known interfaces where we have an implicit dependency on `the_repository`, but those are not guarded at the current point in time. Over time though, we should start to add guards as required (or even better, just remove them). Define the macro as required in our code units. As expected, most of our code still relies on the global variable. Nearly all of our builtins rely on the variable as there is no way yet to pass `the_repository` to their entry point. For now, declare the macro in "biultin.h" to keep the required changes at least a little bit more contained. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 14:50:23 +08:00
#define USE_THE_REPOSITORY_VARIABLE
#include "git-compat-util.h"
#include "environment.h"
#include "gettext.h"
#include "hash.h"
#include "hex.h"
#include "repository.h"
#include "refs.h"
#include "strvec.h"
#include "ls-refs.h"
#include "pkt-line.h"
#include "config.h"
#include "string-list.h"
static enum {
UNBORN_IGNORE = 0,
UNBORN_ALLOW,
UNBORN_ADVERTISE /* implies ALLOW */
} unborn_config(struct repository *r)
ls-refs: report unborn targets of symrefs When cloning, we choose the default branch based on the remote HEAD. But if there is no remote HEAD reported (which could happen if the target of the remote HEAD is unborn), we'll fall back to using our local init.defaultBranch. Traditionally this hasn't been a big deal, because most repos used "master" as the default. But these days it is likely to cause confusion if the server and client implementations choose different values (e.g., if the remote started with "main", we may choose "master" locally, create commits there, and then the user is surprised when they push to "master" and not "main"). To solve this, the remote needs to communicate the target of the HEAD symref, even if it is unborn, and "git clone" needs to use this information. Currently, symrefs that have unborn targets (such as in this case) are not communicated by the protocol. Teach Git to advertise and support the "unborn" feature in "ls-refs" (by default, this is advertised, but server administrators may turn this off through the lsrefs.unborn config). This feature indicates that "ls-refs" supports the "unborn" argument; when it is specified, "ls-refs" will send the HEAD symref with the name of its unborn target. This change is only for protocol v2. A similar change for protocol v0 would require independent protocol design (there being no analogous position to signal support for "unborn") and client-side plumbing of the data required, so the scope of this patch set is limited to protocol v2. The client side will be updated to use this in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-06 04:48:47 +08:00
{
const char *str = NULL;
if (repo_config_get_string_tmp(r, "lsrefs.unborn", &str)) {
ls-refs: report unborn targets of symrefs When cloning, we choose the default branch based on the remote HEAD. But if there is no remote HEAD reported (which could happen if the target of the remote HEAD is unborn), we'll fall back to using our local init.defaultBranch. Traditionally this hasn't been a big deal, because most repos used "master" as the default. But these days it is likely to cause confusion if the server and client implementations choose different values (e.g., if the remote started with "main", we may choose "master" locally, create commits there, and then the user is surprised when they push to "master" and not "main"). To solve this, the remote needs to communicate the target of the HEAD symref, even if it is unborn, and "git clone" needs to use this information. Currently, symrefs that have unborn targets (such as in this case) are not communicated by the protocol. Teach Git to advertise and support the "unborn" feature in "ls-refs" (by default, this is advertised, but server administrators may turn this off through the lsrefs.unborn config). This feature indicates that "ls-refs" supports the "unborn" argument; when it is specified, "ls-refs" will send the HEAD symref with the name of its unborn target. This change is only for protocol v2. A similar change for protocol v0 would require independent protocol design (there being no analogous position to signal support for "unborn") and client-side plumbing of the data required, so the scope of this patch set is limited to protocol v2. The client side will be updated to use this in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-06 04:48:47 +08:00
/*
* If there is no such config, advertise and allow it by
* default.
*/
return UNBORN_ADVERTISE;
ls-refs: report unborn targets of symrefs When cloning, we choose the default branch based on the remote HEAD. But if there is no remote HEAD reported (which could happen if the target of the remote HEAD is unborn), we'll fall back to using our local init.defaultBranch. Traditionally this hasn't been a big deal, because most repos used "master" as the default. But these days it is likely to cause confusion if the server and client implementations choose different values (e.g., if the remote started with "main", we may choose "master" locally, create commits there, and then the user is surprised when they push to "master" and not "main"). To solve this, the remote needs to communicate the target of the HEAD symref, even if it is unborn, and "git clone" needs to use this information. Currently, symrefs that have unborn targets (such as in this case) are not communicated by the protocol. Teach Git to advertise and support the "unborn" feature in "ls-refs" (by default, this is advertised, but server administrators may turn this off through the lsrefs.unborn config). This feature indicates that "ls-refs" supports the "unborn" argument; when it is specified, "ls-refs" will send the HEAD symref with the name of its unborn target. This change is only for protocol v2. A similar change for protocol v0 would require independent protocol design (there being no analogous position to signal support for "unborn") and client-side plumbing of the data required, so the scope of this patch set is limited to protocol v2. The client side will be updated to use this in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-06 04:48:47 +08:00
} else {
if (!strcmp(str, "advertise")) {
return UNBORN_ADVERTISE;
ls-refs: report unborn targets of symrefs When cloning, we choose the default branch based on the remote HEAD. But if there is no remote HEAD reported (which could happen if the target of the remote HEAD is unborn), we'll fall back to using our local init.defaultBranch. Traditionally this hasn't been a big deal, because most repos used "master" as the default. But these days it is likely to cause confusion if the server and client implementations choose different values (e.g., if the remote started with "main", we may choose "master" locally, create commits there, and then the user is surprised when they push to "master" and not "main"). To solve this, the remote needs to communicate the target of the HEAD symref, even if it is unborn, and "git clone" needs to use this information. Currently, symrefs that have unborn targets (such as in this case) are not communicated by the protocol. Teach Git to advertise and support the "unborn" feature in "ls-refs" (by default, this is advertised, but server administrators may turn this off through the lsrefs.unborn config). This feature indicates that "ls-refs" supports the "unborn" argument; when it is specified, "ls-refs" will send the HEAD symref with the name of its unborn target. This change is only for protocol v2. A similar change for protocol v0 would require independent protocol design (there being no analogous position to signal support for "unborn") and client-side plumbing of the data required, so the scope of this patch set is limited to protocol v2. The client side will be updated to use this in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-06 04:48:47 +08:00
} else if (!strcmp(str, "allow")) {
return UNBORN_ALLOW;
ls-refs: report unborn targets of symrefs When cloning, we choose the default branch based on the remote HEAD. But if there is no remote HEAD reported (which could happen if the target of the remote HEAD is unborn), we'll fall back to using our local init.defaultBranch. Traditionally this hasn't been a big deal, because most repos used "master" as the default. But these days it is likely to cause confusion if the server and client implementations choose different values (e.g., if the remote started with "main", we may choose "master" locally, create commits there, and then the user is surprised when they push to "master" and not "main"). To solve this, the remote needs to communicate the target of the HEAD symref, even if it is unborn, and "git clone" needs to use this information. Currently, symrefs that have unborn targets (such as in this case) are not communicated by the protocol. Teach Git to advertise and support the "unborn" feature in "ls-refs" (by default, this is advertised, but server administrators may turn this off through the lsrefs.unborn config). This feature indicates that "ls-refs" supports the "unborn" argument; when it is specified, "ls-refs" will send the HEAD symref with the name of its unborn target. This change is only for protocol v2. A similar change for protocol v0 would require independent protocol design (there being no analogous position to signal support for "unborn") and client-side plumbing of the data required, so the scope of this patch set is limited to protocol v2. The client side will be updated to use this in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-06 04:48:47 +08:00
} else if (!strcmp(str, "ignore")) {
return UNBORN_IGNORE;
ls-refs: report unborn targets of symrefs When cloning, we choose the default branch based on the remote HEAD. But if there is no remote HEAD reported (which could happen if the target of the remote HEAD is unborn), we'll fall back to using our local init.defaultBranch. Traditionally this hasn't been a big deal, because most repos used "master" as the default. But these days it is likely to cause confusion if the server and client implementations choose different values (e.g., if the remote started with "main", we may choose "master" locally, create commits there, and then the user is surprised when they push to "master" and not "main"). To solve this, the remote needs to communicate the target of the HEAD symref, even if it is unborn, and "git clone" needs to use this information. Currently, symrefs that have unborn targets (such as in this case) are not communicated by the protocol. Teach Git to advertise and support the "unborn" feature in "ls-refs" (by default, this is advertised, but server administrators may turn this off through the lsrefs.unborn config). This feature indicates that "ls-refs" supports the "unborn" argument; when it is specified, "ls-refs" will send the HEAD symref with the name of its unborn target. This change is only for protocol v2. A similar change for protocol v0 would require independent protocol design (there being no analogous position to signal support for "unborn") and client-side plumbing of the data required, so the scope of this patch set is limited to protocol v2. The client side will be updated to use this in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-06 04:48:47 +08:00
} else {
die(_("invalid value for '%s': '%s'"),
"lsrefs.unborn", str);
ls-refs: report unborn targets of symrefs When cloning, we choose the default branch based on the remote HEAD. But if there is no remote HEAD reported (which could happen if the target of the remote HEAD is unborn), we'll fall back to using our local init.defaultBranch. Traditionally this hasn't been a big deal, because most repos used "master" as the default. But these days it is likely to cause confusion if the server and client implementations choose different values (e.g., if the remote started with "main", we may choose "master" locally, create commits there, and then the user is surprised when they push to "master" and not "main"). To solve this, the remote needs to communicate the target of the HEAD symref, even if it is unborn, and "git clone" needs to use this information. Currently, symrefs that have unborn targets (such as in this case) are not communicated by the protocol. Teach Git to advertise and support the "unborn" feature in "ls-refs" (by default, this is advertised, but server administrators may turn this off through the lsrefs.unborn config). This feature indicates that "ls-refs" supports the "unborn" argument; when it is specified, "ls-refs" will send the HEAD symref with the name of its unborn target. This change is only for protocol v2. A similar change for protocol v0 would require independent protocol design (there being no analogous position to signal support for "unborn") and client-side plumbing of the data required, so the scope of this patch set is limited to protocol v2. The client side will be updated to use this in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-06 04:48:47 +08:00
}
}
}
ls-refs: ignore very long ref-prefix counts Because each "ref-prefix" capability from the client comes in its own pkt-line, there's no limit to the number of them that a misbehaving client may send. We read them all into a strvec, which means the client can waste arbitrary amounts of our memory by just sending us "ref-prefix foo" over and over. One possible solution is to just drop the connection when the limit is reached. If we set it high enough, then only misbehaving or malicious clients would hit it. But "high enough" is vague, and it's unfriendly if we guess wrong and a legitimate client hits this. But we can do better. Since supporting the ref-prefix capability is optional anyway, the client has to further cull the response based on their own patterns. So we can simply ignore the patterns once we cross a certain threshold. Note that we have to ignore _all_ patterns, not just the ones past our limit (since otherwise we'd send too little data). The limit here is fairly arbitrary, and probably much higher than anyone would need in practice. It might be worth limiting it further, if only because we check it linearly (so with "m" local refs and "n" patterns, we do "m * n" string comparisons). But if we care about optimizing this, an even better solution may be a more advanced data structure anyway. I didn't bother making the limit configurable, since it's so high and since Git should behave correctly in either case. It wouldn't be too hard to do, but it makes both the code and documentation more complex. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-16 02:35:31 +08:00
/*
* If we see this many or more "ref-prefix" lines from the client, we consider
* it "too many" and will avoid using the prefix feature entirely.
*/
#define TOO_MANY_PREFIXES 65536
/*
* Check if one of the prefixes is a prefix of the ref.
* If no prefixes were provided, all refs match.
*/
static int ref_match(const struct strvec *prefixes, const char *refname)
{
int i;
if (!prefixes->nr)
return 1; /* no restriction */
for (i = 0; i < prefixes->nr; i++) {
const char *prefix = prefixes->v[i];
if (starts_with(refname, prefix))
return 1;
}
return 0;
}
struct ls_refs_data {
unsigned peel;
unsigned symrefs;
struct strvec prefixes;
struct strbuf buf;
struct strvec hidden_refs;
ls-refs: report unborn targets of symrefs When cloning, we choose the default branch based on the remote HEAD. But if there is no remote HEAD reported (which could happen if the target of the remote HEAD is unborn), we'll fall back to using our local init.defaultBranch. Traditionally this hasn't been a big deal, because most repos used "master" as the default. But these days it is likely to cause confusion if the server and client implementations choose different values (e.g., if the remote started with "main", we may choose "master" locally, create commits there, and then the user is surprised when they push to "master" and not "main"). To solve this, the remote needs to communicate the target of the HEAD symref, even if it is unborn, and "git clone" needs to use this information. Currently, symrefs that have unborn targets (such as in this case) are not communicated by the protocol. Teach Git to advertise and support the "unborn" feature in "ls-refs" (by default, this is advertised, but server administrators may turn this off through the lsrefs.unborn config). This feature indicates that "ls-refs" supports the "unborn" argument; when it is specified, "ls-refs" will send the HEAD symref with the name of its unborn target. This change is only for protocol v2. A similar change for protocol v0 would require independent protocol design (there being no analogous position to signal support for "unborn") and client-side plumbing of the data required, so the scope of this patch set is limited to protocol v2. The client side will be updated to use this in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-06 04:48:47 +08:00
unsigned unborn : 1;
};
static int send_ref(const char *refname, const struct object_id *oid,
int flag, void *cb_data)
{
struct ls_refs_data *data = cb_data;
const char *refname_nons = strip_namespace(refname);
strbuf_reset(&data->buf);
if (ref_is_hidden(refname_nons, refname, &data->hidden_refs))
return 0;
if (!ref_match(&data->prefixes, refname_nons))
return 0;
ls-refs: report unborn targets of symrefs When cloning, we choose the default branch based on the remote HEAD. But if there is no remote HEAD reported (which could happen if the target of the remote HEAD is unborn), we'll fall back to using our local init.defaultBranch. Traditionally this hasn't been a big deal, because most repos used "master" as the default. But these days it is likely to cause confusion if the server and client implementations choose different values (e.g., if the remote started with "main", we may choose "master" locally, create commits there, and then the user is surprised when they push to "master" and not "main"). To solve this, the remote needs to communicate the target of the HEAD symref, even if it is unborn, and "git clone" needs to use this information. Currently, symrefs that have unborn targets (such as in this case) are not communicated by the protocol. Teach Git to advertise and support the "unborn" feature in "ls-refs" (by default, this is advertised, but server administrators may turn this off through the lsrefs.unborn config). This feature indicates that "ls-refs" supports the "unborn" argument; when it is specified, "ls-refs" will send the HEAD symref with the name of its unborn target. This change is only for protocol v2. A similar change for protocol v0 would require independent protocol design (there being no analogous position to signal support for "unborn") and client-side plumbing of the data required, so the scope of this patch set is limited to protocol v2. The client side will be updated to use this in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-06 04:48:47 +08:00
if (oid)
strbuf_addf(&data->buf, "%s %s", oid_to_hex(oid), refname_nons);
ls-refs: report unborn targets of symrefs When cloning, we choose the default branch based on the remote HEAD. But if there is no remote HEAD reported (which could happen if the target of the remote HEAD is unborn), we'll fall back to using our local init.defaultBranch. Traditionally this hasn't been a big deal, because most repos used "master" as the default. But these days it is likely to cause confusion if the server and client implementations choose different values (e.g., if the remote started with "main", we may choose "master" locally, create commits there, and then the user is surprised when they push to "master" and not "main"). To solve this, the remote needs to communicate the target of the HEAD symref, even if it is unborn, and "git clone" needs to use this information. Currently, symrefs that have unborn targets (such as in this case) are not communicated by the protocol. Teach Git to advertise and support the "unborn" feature in "ls-refs" (by default, this is advertised, but server administrators may turn this off through the lsrefs.unborn config). This feature indicates that "ls-refs" supports the "unborn" argument; when it is specified, "ls-refs" will send the HEAD symref with the name of its unborn target. This change is only for protocol v2. A similar change for protocol v0 would require independent protocol design (there being no analogous position to signal support for "unborn") and client-side plumbing of the data required, so the scope of this patch set is limited to protocol v2. The client side will be updated to use this in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-06 04:48:47 +08:00
else
strbuf_addf(&data->buf, "unborn %s", refname_nons);
if (data->symrefs && flag & REF_ISSYMREF) {
struct object_id unused;
const char *symref_target = refs_resolve_ref_unsafe(get_main_ref_store(the_repository),
refname,
0,
&unused,
&flag);
if (!symref_target)
die("'%s' is a symref but it is not?", refname);
strbuf_addf(&data->buf, " symref-target:%s",
upload-pack: strip namespace from symref data Since 7171d8c15f (upload-pack: send symbolic ref information as capability, 2013-09-17), we've sent cloning and fetching clients special information about which branch HEAD is pointing to, so that they don't have to guess based on matching up commit ids. However, this feature has never worked properly with the GIT_NAMESPACE feature. Because upload-pack uses head_ref_namespaced(find_symref), we do find and report on refs/namespaces/foo/HEAD instead of the actual HEAD of the repo. This makes sense, since the branch pointed to by the top-level HEAD may not be advertised at all. But we do two things wrong: 1. We report the full name refs/namespaces/foo/HEAD, instead of just HEAD. Meaning no client is going to bother doing anything with that symref, since we're not otherwise advertising it. 2. We report the symref destination using its full name (e.g., refs/namespaces/foo/refs/heads/master). That's similarly useless to the client, who only saw "refs/heads/master" in the advertisement. We should be stripping the namespace prefix off of both places (which this patch fixes). Likely nobody noticed because we tend to do the right thing anyway. Bug (1) means that we said nothing about HEAD (just refs/namespace/foo/HEAD). And so the client half of the code, from a45b5f0552 (connect: annotate refs with their symref information in get_remote_head(), 2013-09-17), does not annotate HEAD, and we use the fallback in guess_remote_head(), matching refs by object id. Which is usually right. It only falls down in ambiguous cases, like the one laid out in the included test. This also means that we don't have to worry about breaking anybody who was putting pre-stripped names into their namespace symrefs when we fix bug (2). Because of bug (1), nobody would have been using the symref we advertised in the first place (not to mention that those symrefs would have appeared broken for any non-namespaced access). Note that we have separate fixes here for the v0 and v2 protocols. The symref advertisement moved in v2 to be a part of the ls-refs command. This actually gets part (1) right, since the symref annotation piggy-backs on the existing ref advertisement, which is properly stripped. But it still needs a fix for part (2). The included tests cover both protocols. Reported-by: Bryan Turner <bturner@atlassian.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-05-23 14:11:21 +08:00
strip_namespace(symref_target));
}
ls-refs: report unborn targets of symrefs When cloning, we choose the default branch based on the remote HEAD. But if there is no remote HEAD reported (which could happen if the target of the remote HEAD is unborn), we'll fall back to using our local init.defaultBranch. Traditionally this hasn't been a big deal, because most repos used "master" as the default. But these days it is likely to cause confusion if the server and client implementations choose different values (e.g., if the remote started with "main", we may choose "master" locally, create commits there, and then the user is surprised when they push to "master" and not "main"). To solve this, the remote needs to communicate the target of the HEAD symref, even if it is unborn, and "git clone" needs to use this information. Currently, symrefs that have unborn targets (such as in this case) are not communicated by the protocol. Teach Git to advertise and support the "unborn" feature in "ls-refs" (by default, this is advertised, but server administrators may turn this off through the lsrefs.unborn config). This feature indicates that "ls-refs" supports the "unborn" argument; when it is specified, "ls-refs" will send the HEAD symref with the name of its unborn target. This change is only for protocol v2. A similar change for protocol v0 would require independent protocol design (there being no analogous position to signal support for "unborn") and client-side plumbing of the data required, so the scope of this patch set is limited to protocol v2. The client side will be updated to use this in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-06 04:48:47 +08:00
if (data->peel && oid) {
struct object_id peeled;
if (!peel_iterated_oid(the_repository, oid, &peeled))
strbuf_addf(&data->buf, " peeled:%s", oid_to_hex(&peeled));
}
strbuf_addch(&data->buf, '\n');
packet_fwrite(stdout, data->buf.buf, data->buf.len);
return 0;
}
ls-refs: report unborn targets of symrefs When cloning, we choose the default branch based on the remote HEAD. But if there is no remote HEAD reported (which could happen if the target of the remote HEAD is unborn), we'll fall back to using our local init.defaultBranch. Traditionally this hasn't been a big deal, because most repos used "master" as the default. But these days it is likely to cause confusion if the server and client implementations choose different values (e.g., if the remote started with "main", we may choose "master" locally, create commits there, and then the user is surprised when they push to "master" and not "main"). To solve this, the remote needs to communicate the target of the HEAD symref, even if it is unborn, and "git clone" needs to use this information. Currently, symrefs that have unborn targets (such as in this case) are not communicated by the protocol. Teach Git to advertise and support the "unborn" feature in "ls-refs" (by default, this is advertised, but server administrators may turn this off through the lsrefs.unborn config). This feature indicates that "ls-refs" supports the "unborn" argument; when it is specified, "ls-refs" will send the HEAD symref with the name of its unborn target. This change is only for protocol v2. A similar change for protocol v0 would require independent protocol design (there being no analogous position to signal support for "unborn") and client-side plumbing of the data required, so the scope of this patch set is limited to protocol v2. The client side will be updated to use this in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-06 04:48:47 +08:00
static void send_possibly_unborn_head(struct ls_refs_data *data)
{
struct strbuf namespaced = STRBUF_INIT;
struct object_id oid;
int flag;
int oid_is_null;
strbuf_addf(&namespaced, "%sHEAD", get_git_namespace());
if (!refs_resolve_ref_unsafe(get_main_ref_store(the_repository), namespaced.buf, 0, &oid, &flag))
ls-refs: report unborn targets of symrefs When cloning, we choose the default branch based on the remote HEAD. But if there is no remote HEAD reported (which could happen if the target of the remote HEAD is unborn), we'll fall back to using our local init.defaultBranch. Traditionally this hasn't been a big deal, because most repos used "master" as the default. But these days it is likely to cause confusion if the server and client implementations choose different values (e.g., if the remote started with "main", we may choose "master" locally, create commits there, and then the user is surprised when they push to "master" and not "main"). To solve this, the remote needs to communicate the target of the HEAD symref, even if it is unborn, and "git clone" needs to use this information. Currently, symrefs that have unborn targets (such as in this case) are not communicated by the protocol. Teach Git to advertise and support the "unborn" feature in "ls-refs" (by default, this is advertised, but server administrators may turn this off through the lsrefs.unborn config). This feature indicates that "ls-refs" supports the "unborn" argument; when it is specified, "ls-refs" will send the HEAD symref with the name of its unborn target. This change is only for protocol v2. A similar change for protocol v0 would require independent protocol design (there being no analogous position to signal support for "unborn") and client-side plumbing of the data required, so the scope of this patch set is limited to protocol v2. The client side will be updated to use this in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-06 04:48:47 +08:00
return; /* bad ref */
oid_is_null = is_null_oid(&oid);
if (!oid_is_null ||
(data->unborn && data->symrefs && (flag & REF_ISSYMREF)))
send_ref(namespaced.buf, oid_is_null ? NULL : &oid, flag, data);
strbuf_release(&namespaced);
}
static int ls_refs_config(const char *var, const char *value,
config: add ctx arg to config_fn_t Add a new "const struct config_context *ctx" arg to config_fn_t to hold additional information about the config iteration operation. config_context has a "struct key_value_info kvi" member that holds metadata about the config source being read (e.g. what kind of config source it is, the filename, etc). In this series, we're only interested in .kvi, so we could have just used "struct key_value_info" as an arg, but config_context makes it possible to add/adjust members in the future without changing the config_fn_t signature. We could also consider other ways of organizing the args (e.g. moving the config name and value into config_context or key_value_info), but in my experiments, the incremental benefit doesn't justify the added complexity (e.g. a config_fn_t will sometimes invoke another config_fn_t but with a different config value). In subsequent commits, the .kvi member will replace the global "struct config_reader" in config.c, making config iteration a global-free operation. It requires much more work for the machinery to provide meaningful values of .kvi, so for now, merely change the signature and call sites, pass NULL as a placeholder value, and don't rely on the arg in any meaningful way. Most of the changes are performed by contrib/coccinelle/config_fn_ctx.pending.cocci, which, for every config_fn_t: - Modifies the signature to accept "const struct config_context *ctx" - Passes "ctx" to any inner config_fn_t, if needed - Adds UNUSED attributes to "ctx", if needed Most config_fn_t instances are easily identified by seeing if they are called by the various config functions. Most of the remaining ones are manually named in the .cocci patch. Manual cleanups are still needed, but the majority of it is trivial; it's either adjusting config_fn_t that the .cocci patch didn't catch, or adding forward declarations of "struct config_context ctx" to make the signatures make sense. The non-trivial changes are in cases where we are invoking a config_fn_t outside of config machinery, and we now need to decide what value of "ctx" to pass. These cases are: - trace2/tr2_cfg.c:tr2_cfg_set_fl() This is indirectly called by git_config_set() so that the trace2 machinery can notice the new config values and update its settings using the tr2 config parsing function, i.e. tr2_cfg_cb(). - builtin/checkout.c:checkout_main() This calls git_xmerge_config() as a shorthand for parsing a CLI arg. This might be worth refactoring away in the future, since git_xmerge_config() can call git_default_config(), which can do much more than just parsing. Handle them by creating a KVI_INIT macro that initializes "struct key_value_info" to a reasonable default, and use that to construct the "ctx" arg. Signed-off-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-29 03:26:22 +08:00
const struct config_context *ctx UNUSED,
void *cb_data)
{
struct ls_refs_data *data = cb_data;
/*
* We only serve fetches over v2 for now, so respect only "uploadpack"
* config. This may need to eventually be expanded to "receive", but we
* don't yet know how that information will be passed to ls-refs.
*/
return parse_hide_refs_config(var, value, "uploadpack", &data->hidden_refs);
}
int ls_refs(struct repository *r, struct packet_reader *request)
{
struct ls_refs_data data;
memset(&data, 0, sizeof(data));
strvec_init(&data.prefixes);
strbuf_init(&data.buf, 0);
strvec_init(&data.hidden_refs);
git_config(ls_refs_config, &data);
while (packet_reader_read(request) == PACKET_READ_NORMAL) {
const char *arg = request->line;
const char *out;
if (!strcmp("peel", arg))
data.peel = 1;
else if (!strcmp("symrefs", arg))
data.symrefs = 1;
ls-refs: ignore very long ref-prefix counts Because each "ref-prefix" capability from the client comes in its own pkt-line, there's no limit to the number of them that a misbehaving client may send. We read them all into a strvec, which means the client can waste arbitrary amounts of our memory by just sending us "ref-prefix foo" over and over. One possible solution is to just drop the connection when the limit is reached. If we set it high enough, then only misbehaving or malicious clients would hit it. But "high enough" is vague, and it's unfriendly if we guess wrong and a legitimate client hits this. But we can do better. Since supporting the ref-prefix capability is optional anyway, the client has to further cull the response based on their own patterns. So we can simply ignore the patterns once we cross a certain threshold. Note that we have to ignore _all_ patterns, not just the ones past our limit (since otherwise we'd send too little data). The limit here is fairly arbitrary, and probably much higher than anyone would need in practice. It might be worth limiting it further, if only because we check it linearly (so with "m" local refs and "n" patterns, we do "m * n" string comparisons). But if we care about optimizing this, an even better solution may be a more advanced data structure anyway. I didn't bother making the limit configurable, since it's so high and since Git should behave correctly in either case. It wouldn't be too hard to do, but it makes both the code and documentation more complex. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-16 02:35:31 +08:00
else if (skip_prefix(arg, "ref-prefix ", &out)) {
if (data.prefixes.nr < TOO_MANY_PREFIXES)
strvec_push(&data.prefixes, out);
}
ls-refs: report unborn targets of symrefs When cloning, we choose the default branch based on the remote HEAD. But if there is no remote HEAD reported (which could happen if the target of the remote HEAD is unborn), we'll fall back to using our local init.defaultBranch. Traditionally this hasn't been a big deal, because most repos used "master" as the default. But these days it is likely to cause confusion if the server and client implementations choose different values (e.g., if the remote started with "main", we may choose "master" locally, create commits there, and then the user is surprised when they push to "master" and not "main"). To solve this, the remote needs to communicate the target of the HEAD symref, even if it is unborn, and "git clone" needs to use this information. Currently, symrefs that have unborn targets (such as in this case) are not communicated by the protocol. Teach Git to advertise and support the "unborn" feature in "ls-refs" (by default, this is advertised, but server administrators may turn this off through the lsrefs.unborn config). This feature indicates that "ls-refs" supports the "unborn" argument; when it is specified, "ls-refs" will send the HEAD symref with the name of its unborn target. This change is only for protocol v2. A similar change for protocol v0 would require independent protocol design (there being no analogous position to signal support for "unborn") and client-side plumbing of the data required, so the scope of this patch set is limited to protocol v2. The client side will be updated to use this in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-06 04:48:47 +08:00
else if (!strcmp("unborn", arg))
data.unborn = !!unborn_config(r);
else
die(_("unexpected line: '%s'"), arg);
}
if (request->status != PACKET_READ_FLUSH)
die(_("expected flush after ls-refs arguments"));
ls-refs: ignore very long ref-prefix counts Because each "ref-prefix" capability from the client comes in its own pkt-line, there's no limit to the number of them that a misbehaving client may send. We read them all into a strvec, which means the client can waste arbitrary amounts of our memory by just sending us "ref-prefix foo" over and over. One possible solution is to just drop the connection when the limit is reached. If we set it high enough, then only misbehaving or malicious clients would hit it. But "high enough" is vague, and it's unfriendly if we guess wrong and a legitimate client hits this. But we can do better. Since supporting the ref-prefix capability is optional anyway, the client has to further cull the response based on their own patterns. So we can simply ignore the patterns once we cross a certain threshold. Note that we have to ignore _all_ patterns, not just the ones past our limit (since otherwise we'd send too little data). The limit here is fairly arbitrary, and probably much higher than anyone would need in practice. It might be worth limiting it further, if only because we check it linearly (so with "m" local refs and "n" patterns, we do "m * n" string comparisons). But if we care about optimizing this, an even better solution may be a more advanced data structure anyway. I didn't bother making the limit configurable, since it's so high and since Git should behave correctly in either case. It wouldn't be too hard to do, but it makes both the code and documentation more complex. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-16 02:35:31 +08:00
/*
* If we saw too many prefixes, we must avoid using them at all; as
* soon as we have any prefix, they are meant to form a comprehensive
* list.
*/
if (data.prefixes.nr >= TOO_MANY_PREFIXES)
strvec_clear(&data.prefixes);
ls-refs: report unborn targets of symrefs When cloning, we choose the default branch based on the remote HEAD. But if there is no remote HEAD reported (which could happen if the target of the remote HEAD is unborn), we'll fall back to using our local init.defaultBranch. Traditionally this hasn't been a big deal, because most repos used "master" as the default. But these days it is likely to cause confusion if the server and client implementations choose different values (e.g., if the remote started with "main", we may choose "master" locally, create commits there, and then the user is surprised when they push to "master" and not "main"). To solve this, the remote needs to communicate the target of the HEAD symref, even if it is unborn, and "git clone" needs to use this information. Currently, symrefs that have unborn targets (such as in this case) are not communicated by the protocol. Teach Git to advertise and support the "unborn" feature in "ls-refs" (by default, this is advertised, but server administrators may turn this off through the lsrefs.unborn config). This feature indicates that "ls-refs" supports the "unborn" argument; when it is specified, "ls-refs" will send the HEAD symref with the name of its unborn target. This change is only for protocol v2. A similar change for protocol v0 would require independent protocol design (there being no analogous position to signal support for "unborn") and client-side plumbing of the data required, so the scope of this patch set is limited to protocol v2. The client side will be updated to use this in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-06 04:48:47 +08:00
send_possibly_unborn_head(&data);
ls-refs.c: traverse prefixes of disjoint "ref-prefix" sets ls-refs performs a single revision walk over the whole ref namespace, and sends ones that match with one of the given ref prefixes down to the user. This can be expensive if there are many refs overall, but the portion of them covered by the given prefixes is small by comparison. To attempt to reduce the difference between the number of refs traversed, and the number of refs sent, only traverse references which are in the longest common prefix of the given prefixes. This is very reminiscent of the approach taken in b31e2680c4 (ref-filter.c: find disjoint pattern prefixes, 2019-06-26) which does an analogous thing for multi-patterned 'git for-each-ref' invocations. The callback 'send_ref' is resilient to ignore extra patterns by discarding any arguments which do not begin with at least one of the specified prefixes. Similarly, the code introduced in b31e2680c4 is resilient to stop early at metacharacters, but we only pass strict prefixes here. At worst we would return too many results, but the double checking done by send_ref will throw away anything that doesn't start with something in the prefix list. Finally, if no prefixes were provided, then implicitly add the empty string (which will match all references) since this matches the existing behavior (see the "no restrictions" comment in "ls-refs.c:ref_match()"). Original-patch-by: Jacob Vosmaer <jacob@gitlab.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-01-21 00:04:30 +08:00
if (!data.prefixes.nr)
strvec_push(&data.prefixes, "");
refs_for_each_fullref_in_prefixes(get_main_ref_store(r),
get_git_namespace(), data.prefixes.v,
hidden_refs_to_excludes(&data.hidden_refs),
send_ref, &data);
packet_fflush(stdout);
strvec_clear(&data.prefixes);
strbuf_release(&data.buf);
strvec_clear(&data.hidden_refs);
return 0;
}
ls-refs: report unborn targets of symrefs When cloning, we choose the default branch based on the remote HEAD. But if there is no remote HEAD reported (which could happen if the target of the remote HEAD is unborn), we'll fall back to using our local init.defaultBranch. Traditionally this hasn't been a big deal, because most repos used "master" as the default. But these days it is likely to cause confusion if the server and client implementations choose different values (e.g., if the remote started with "main", we may choose "master" locally, create commits there, and then the user is surprised when they push to "master" and not "main"). To solve this, the remote needs to communicate the target of the HEAD symref, even if it is unborn, and "git clone" needs to use this information. Currently, symrefs that have unborn targets (such as in this case) are not communicated by the protocol. Teach Git to advertise and support the "unborn" feature in "ls-refs" (by default, this is advertised, but server administrators may turn this off through the lsrefs.unborn config). This feature indicates that "ls-refs" supports the "unborn" argument; when it is specified, "ls-refs" will send the HEAD symref with the name of its unborn target. This change is only for protocol v2. A similar change for protocol v0 would require independent protocol design (there being no analogous position to signal support for "unborn") and client-side plumbing of the data required, so the scope of this patch set is limited to protocol v2. The client side will be updated to use this in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-06 04:48:47 +08:00
int ls_refs_advertise(struct repository *r, struct strbuf *value)
{
if (value && unborn_config(r) == UNBORN_ADVERTISE)
strbuf_addstr(value, "unborn");
ls-refs: report unborn targets of symrefs When cloning, we choose the default branch based on the remote HEAD. But if there is no remote HEAD reported (which could happen if the target of the remote HEAD is unborn), we'll fall back to using our local init.defaultBranch. Traditionally this hasn't been a big deal, because most repos used "master" as the default. But these days it is likely to cause confusion if the server and client implementations choose different values (e.g., if the remote started with "main", we may choose "master" locally, create commits there, and then the user is surprised when they push to "master" and not "main"). To solve this, the remote needs to communicate the target of the HEAD symref, even if it is unborn, and "git clone" needs to use this information. Currently, symrefs that have unborn targets (such as in this case) are not communicated by the protocol. Teach Git to advertise and support the "unborn" feature in "ls-refs" (by default, this is advertised, but server administrators may turn this off through the lsrefs.unborn config). This feature indicates that "ls-refs" supports the "unborn" argument; when it is specified, "ls-refs" will send the HEAD symref with the name of its unborn target. This change is only for protocol v2. A similar change for protocol v0 would require independent protocol design (there being no analogous position to signal support for "unborn") and client-side plumbing of the data required, so the scope of this patch set is limited to protocol v2. The client side will be updated to use this in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-06 04:48:47 +08:00
return 1;
}