mirrors/git

mirror of https://github.com/git/git.git synced 2024-11-30 21:44:02 +08:00

Author	SHA1	Message	Date
Jeff King	108c265f27	serve: reject bogus v2 "command=ls-refs=foo" When we see a line from the client like "command=ls-refs", we parse everything after the equals sign as a capability, which we check against our capabilities table. If we don't recognize the command (e.g., "command=foo"), we'll reject it. But in parse_command(), we use the same get_capability() parser for parsing non-command lines. So if we see "command=ls-refs=foo", we will feed "ls-refs=foo" to get_capability(), which will say "OK, that's ls-refs, with value 'foo'". But then we simply ignore the value entirely. The client is violating the spec here, which says: command = PKT-LINE("command=" key LF) key = 1*(ALPHA \| DIGIT \| "-_") I.e., the key is not even allowed to have an equals sign in it. Whereas a real non-command capability does allow a value: capability = PKT-LINE(key[=value] LF) So by reusing the same get_capability() parser, we are mixing up the "key" and "capability" tokens. However, since that parser tells us whether it saw an "=", we can still use it; we just need to reject any input that produces a non-NULL value field. The current behavior isn't really hurting anything (the client should never send such a request, and if it does, we just ignore the "value" part). But since it does violate the spec, let's tighten it up to prevent any surprising behavior. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-15 12:25:19 -07:00
Jeff King	f0a35c9ce5	serve: drop "keys" strvec We collect the set of capabilities the client sends us in a strvec. While this is usually small, there's no limit to the number of capabilities the client can send us (e.g., they could just send us "agent" pkt-lines over and over, and we'd keep adding them to the list). Since all code has been converted away from using this list, let's get rid of it. This avoids a potential attack where clients waste our memory. Note that we do have to replace it with a flag, because some of the flush-packet logic checks whether we've seen any valid commands or keys. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-15 12:25:19 -07:00
Jeff King	ab539c9094	serve: provide "receive" function for session-id capability Rather than pulling the session-id string from the list of collected capabilities, we can handle it as soon as we receive it. This gets us closer to dropping the collected list entirely. The behavior should be the same, with one exception. Previously if the client sent us multiple session-id lines, we'd report only the first. Now we'll pass each one along to trace2. This shouldn't matter in practice, since clients shouldn't do that (and if they do, it's probably sensible to log them all). As this removes the last caller of the static has_capability(), we can remove it, as well (and in fact we must to avoid -Wunused-function complaining). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-14 17:12:05 -07:00
Jeff King	c7d3aabd27	serve: provide "receive" function for object-format capability We get any "object-format" specified by the client by searching for it in the collected list of capabilities the client sent. We can instead just handle it as soon as they send it. This is slightly more efficient, and gets us one step closer to dropping that collected list. Note that we do still have to do our final hash check after receiving all capabilities (because they might not have sent an object-format line at all, and we still have to check that the default matches our repository algorithm). Since the check_algorithm() function would now be down to a single if() statement, I've just inlined it in its only caller. There should be no change of behavior here, except for two broken-protocol cases: - if the client sends multiple conflicting object-format capabilities (which they should not), we'll now choose the last one rather than the first. We could also detect and complain about the duplicates quite easily now, which we could not before, but I didn't do so here. - if the client sends a bogus "object-format" with no equals sign, we'll now say so, rather than "unknown object format: ''" Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-14 17:12:05 -07:00
Jeff King	e56e53067f	serve: add "receive" method for v2 capabilities table We have a capabilities table that tells us what we should tell the client we are capable of, and what to do when a client gives us a particular command (e.g., "command=ls-refs"). But it doesn't tell us what to do when the client sends us back a capability (e.g., "object-format=sha256"). We just collect them all in a strvec and hope somebody can use them later. Instead, let's provide a function pointer in the table to act on these. This will eventually help us avoid collecting the strings, which will be more efficient and less prone to mischief. Using the new method is optional, which helps in two ways: - we can move existing capabilities over to this new system gradually in individual commits - some capabilities we don't actually do anything with anyway. For example, the client is free to say "agent=git/1.2.3" to us, but we do not act on the information in any way. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-14 10:56:19 -07:00
Jeff King	5ef260d2d1	serve: return capability "value" from get_capability() When the client sends v2 capabilities, they may be simple, like: foo or have a value like: foo=bar (all of the current capabilities actually expect a value, but the protocol allows for boolean ones). We use get_capability() to make sure the client's pktline matches a capability. In doing so, we parse enough to see the "=" and the value (if any), but we immediately forget it. Nobody cares for now, because they end up parsing the values out later using has_capability(). But in preparation for changing that, let's pass back a pointer so the callers know what we found. Note that unlike has_capability(), we'll return NULL for a "simple" capability. Distinguishing these will be useful for some future patches. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-14 10:56:19 -07:00
Jeff King	76804526f9	serve: rename is_command() to parse_command() The is_command() function not only tells us whether the pktline is a valid command string, but it also parses out the command (and complains if we see a duplicate). Let's rename it to make those extra functions a bit more obvious. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-14 10:56:19 -07:00
Ævar Arnfjörð Bjarmason	f234da8019	serve.[ch]: remove "serve_options", split up --advertise-refs code The "advertise capabilities" mode of serve.c added in `ed10cb952d` (serve: introduce git-serve, 2018-03-15) is only used by the http-backend.c to call {upload,receive}-pack with the --advertise-refs parameter. See `42526b478e` (Add stateless RPC options to upload-pack, receive-pack, 2009-10-30). Let's just make cmd_upload_pack() take the two (v2) or three (v2) parameters the the v2/v1 servicing functions need directly, and pass those in via the function signature. The logic of whether daemon mode is implied by the timeout belongs in the v1 function (only used there). Once we split up the "advertise v2 refs" from "serve v2 request" it becomes clear that v2 never cared about those in combination. The only time it mattered was for v1 to emit its ref advertisement, in that case we wanted to emit the smart-http-only "no-done" capability. Since we only do that in the --advertise-refs codepath let's just have it set "do_done" itself in v1's upload_pack() just before send_ref(), at that point --advertise-refs and --stateless-rpc in combination are redundant (the only user is get_info_refs() in http-backend.c), so we can just pass in --advertise-refs only. Since we need to touch all the serve() and advertise_capabilities() codepaths let's rename them to less clever and obvious names, it's been suggested numerous times, the latest of which is [1]'s suggestion for protocol_v2_serve_loop(). Let's go with that. 1. https://lore.kernel.org/git/CAFQ2z_NyGb8rju5CKzmo6KhZXD0Dp21u-BbyCb2aNxLEoSPRJw@mail.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-08-05 08:59:37 -07:00
Ævar Arnfjörð Bjarmason	5befe8a1f1	serve.c: move version line to advertise_capabilities() The advertise_capabilities() is only called from serve() and we always emit this version line before it. In a subsequent commit I'll make builtin/upload-pack.c sometimes call advertise_capabilities() directly, so it'll make sense to have this line emitted by advertise_capabilities(), not serve() itself. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-08-05 08:59:37 -07:00
Ævar Arnfjörð Bjarmason	eea7f7a977	serve: move transfer.advertiseSID check into session_id_advertise() In `6b5b6e422e` (serve: advertise session ID in v2 capabilities, 2020-11-11) the check for transfer.advertiseSID was added to the beginning of the main serve() loop. Thus on startup of the server we'd populate it. Let's instead use an explicit lazy initialization pattern in session_id_advertise() itself, we'll still look the config up only once per-process, but by moving it out of serve() itself the further changing of that routine becomes easier. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-08-05 08:59:37 -07:00
Ævar Arnfjörð Bjarmason	28a592e4f4	serve.[ch]: don't pass "struct strvec *keys" to commands The serve.c API added in `ed10cb952d` (serve: introduce git-serve, 2018-03-15) was passing in the raw capabilities "keys", but nothing downstream of it ever used them. Let's remove that code because it's not needed. If we do end up needing to pass information about the advertisement in the future it'll make more sense to have serve.c parse the capabilities keys and pass the result of its parsing, rather than expecting expecting its API users to parse the same keys again. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-08-05 08:59:37 -07:00
Ævar Arnfjörð Bjarmason	85baaed475	serve: use designated initializers Change the declaration of the protocol_capability struct to use designated initializers, this makes this more verbose now, but a follow-up commit will add a new field. At that point these lines would be too dense to be on one line comfortably. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-08-05 08:59:37 -07:00
Ævar Arnfjörð Bjarmason	eacf36a4d1	serve: mark has_capability() as static The has_capability() function introduced in `ed10cb952d` (serve: introduce git-serve, 2018-03-15) has never been used anywhere except serve.c, so let's mark it as static. It was later changed from "extern" in `554544276a` (*.[ch]: remove extern from function declarations using spatch, 2019-04-29), but we could have simply marked it as "static" instead. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-08-05 08:59:36 -07:00
Junio C Hamano	1e893a1216	Merge branch 'dl/packet-read-response-end-fix' Error message update. * dl/packet-read-response-end-fix: pkt-line: replace "stateless separator" with "response end"	2021-07-28 13:17:56 -07:00
Denton Liu	8232a0ff48	pkt-line: replace "stateless separator" with "response end" In `0181b600a6` (pkt-line: define PACKET_READ_RESPONSE_END, 2020-05-19), the Response End packet was defined for Git's network protocol. When the patch was sent, it included an oversight where the error messages referenced "stateless separator", the work-in-progress name, over "response end", the final name chosen. Correct these error messages by having them correctly reference a "response end" packet. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-09 11:51:17 -07:00
Bruno Albuquerque	a2ba162cda	object-info: support for retrieving object info Sometimes it is useful to get information of an object without having to download it completely. Add the "object-info" capability that lets the client ask for object-related information with their full hexadecimal object names. Only sizes are returned for now. Signed-off-by: Bruno Albuquerque <bga@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-20 17:41:13 -07:00
Jonathan Tan	59e1205d16	ls-refs: report unborn targets of symrefs When cloning, we choose the default branch based on the remote HEAD. But if there is no remote HEAD reported (which could happen if the target of the remote HEAD is unborn), we'll fall back to using our local init.defaultBranch. Traditionally this hasn't been a big deal, because most repos used "master" as the default. But these days it is likely to cause confusion if the server and client implementations choose different values (e.g., if the remote started with "main", we may choose "master" locally, create commits there, and then the user is surprised when they push to "master" and not "main"). To solve this, the remote needs to communicate the target of the HEAD symref, even if it is unborn, and "git clone" needs to use this information. Currently, symrefs that have unborn targets (such as in this case) are not communicated by the protocol. Teach Git to advertise and support the "unborn" feature in "ls-refs" (by default, this is advertised, but server administrators may turn this off through the lsrefs.unborn config). This feature indicates that "ls-refs" supports the "unborn" argument; when it is specified, "ls-refs" will send the HEAD symref with the name of its unborn target. This change is only for protocol v2. A similar change for protocol v0 would require independent protocol design (there being no analogous position to signal support for "unborn") and client-side plumbing of the data required, so the scope of this patch set is limited to protocol v2. The client side will be updated to use this in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-05 13:49:53 -08:00
Josh Steadmon	829594677c	upload-pack, serve: log received client session ID When upload-pack (protocol v0/v1) or a protocol v2 server receives a session-id capability from a client, log the received session ID via a trace2 data event. Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-11 18:26:53 -08:00
Josh Steadmon	6b5b6e422e	serve: advertise session ID in v2 capabilities When transfer.advertiseSID is true, advertise the server's session ID for all protocol v2 connections via the new session-id capability. Signed-off-by: Josh Steadmon <steadmon@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-11 18:26:53 -08:00
Jeff King	d70a9eb611	strvec: rename struct fields The "argc" and "argv" names made sense when the struct was argv_array, but now they're just confusing. Let's rename them to "nr" (which we use for counts elsewhere) and "v" (which is rather terse, but reads well when combined with typical variable names like "args.v"). Note that we have to update all of the callers immediately. Playing tricks with the preprocessor is hard here, because we wouldn't want to rewrite unrelated tokens. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-30 19:18:06 -07:00
Jeff King	c972bf4cf5	strvec: convert remaining callers away from argv_array name We eventually want to drop the argv_array name and just use strvec consistently. There's no particular reason we have to do it all at once, or care about interactions between converted and unconverted bits. Because of our preprocessor compat layer, the names are interchangeable to the compiler (so even a definition and declaration using different names is OK). This patch converts all of the remaining files, as the resulting diff is reasonably sized. The conversion was done purely mechanically with: git ls-files '.c' '.h' \| xargs perl -i -pe ' s/ARGV_ARRAY/STRVEC/g; s/argv_array/strvec/g; ' We'll deal with any indentation/style fallouts separately. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-28 15:02:18 -07:00
Jeff King	dbbcd44fb4	strvec: rename files from argv-array to strvec This requires updating #include lines across the code-base, but that's all fairly mechanical, and was done with: git ls-files '.c' '.h' \| xargs perl -i -pe 's/argv-array.h/strvec.h/' Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-28 15:02:17 -07:00
Junio C Hamano	12210859da	Merge branch 'bc/sha-256-part-2' SHA-256 migration work continues. * bc/sha-256-part-2: (44 commits) remote-testgit: adapt for object-format bundle: detect hash algorithm when reading refs t5300: pass --object-format to git index-pack t5704: send object-format capability with SHA-256 t5703: use object-format serve option t5702: offer an object-format capability in the test t/helper: initialize the repository for test-sha1-array remote-curl: avoid truncating refs with ls-remote t1050: pass algorithm to index-pack when outside repo builtin/index-pack: add option to specify hash algorithm remote-curl: detect algorithm for dumb HTTP by size builtin/ls-remote: initialize repository based on fetch t5500: make hash independent serve: advertise object-format capability for protocol v2 connect: parse v2 refs with correct hash algorithm connect: pass full packet reader when parsing v2 refs Documentation/technical: document object-format for protocol v2 t1302: expect repo format version 1 for SHA-256 builtin/show-index: provide options to determine hash algo t5302: modernize test formatting ...	2020-07-06 22:09:13 -07:00
brian m. carlson	9de0dd361c	serve: advertise object-format capability for protocol v2 In order to communicate the protocol supported by the server side, add support for advertising the object-format capability. We check that the client side sends us an identical algorithm if it sends us its own object-format capability, and assume it speaks SHA-1 if not. In the test, when we're using an algorithm other than SHA-1, we need to specify the algorithm in use so we don't get a failure with an "unknown format" message. Add a test that we handle a mismatched algorithm. Remove the test_oid_init call since it's no longer necessary. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-27 10:07:07 -07:00
Denton Liu	0181b600a6	pkt-line: define PACKET_READ_RESPONSE_END In a future commit, we will use PACKET_READ_RESPONSE_END to separate messages proxied by remote-curl. To prepare for this, add the PACKET_READ_RESPONSE_END enum value. In switch statements that need a case added, die() or BUG() when a PACKET_READ_RESPONSE_END is unexpected. Otherwise, mirror how PACKET_READ_DELIM is implemented (especially in cases where packets are being forwarded). Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-24 16:26:00 -07:00
Masaya Suzuki	2d103c31c2	pack-protocol.txt: accept error packets in any context In the Git pack protocol definition, an error packet may appear only in a certain context. However, servers can face a runtime error (e.g. I/O error) at an arbitrary timing. This patch changes the protocol to allow an error packet to be sent instead of any packet. Without this protocol spec change, when a server cannot process a request, there's no way to tell that to a client. Since the server cannot produce a valid response, it would be forced to cut a connection without telling why. With this protocol spec change, the server can be more gentle in this situation. An old client may see these error packets as an unexpected packet, but this is not worse than having an unexpected EOF. Following this protocol spec change, the error packet handling code is moved to pkt-line.c. Implementation wise, this implementation uses pkt-line to communicate with a subprocess. Since this is not a part of Git protocol, it's possible that a packet that is not supposed to be an error packet is mistakenly parsed as an error packet. This error packet handling is enabled only for the Git pack protocol parsing code considering this. Signed-off-by: Masaya Suzuki <masayasuzuki@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-02 13:05:30 -08:00
Brandon Williams	ecc3e5342d	serve: introduce the server-option capability Introduce the "server-option" capability to protocol version 2. This enables future clients the ability to send server specific options in command requests when using protocol version 2. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-24 11:24:40 +09:00
Brandon Williams	685fbd3291	fetch-pack: perform a fetch using v2 When communicating with a v2 server, perform a fetch by requesting the 'fetch' command. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-15 12:01:08 -07:00
Brandon Williams	3145ea957d	upload-pack: introduce fetch server command Introduce the 'fetch' server command. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-15 12:01:08 -07:00
Brandon Williams	72d0ea0056	ls-refs: introduce ls-refs server command Introduce the ls-refs server command. In protocol v2, the ls-refs command is used to request the ref advertisement from the server. Since it is a command which can be requested (as opposed to mandatory in v1), a client can sent a number of parameters in its request to limit the ref advertisement based on provided ref-prefixes. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-15 12:01:08 -07:00
Brandon Williams	ed10cb952d	serve: introduce git-serve Introduce git-serve, the base server for protocol version 2. Protocol version 2 is intended to be a replacement for Git's current wire protocol. The intention is that it will be a simpler, less wasteful protocol which can evolve over time. Protocol version 2 improves upon version 1 by eliminating the initial ref advertisement. In its place a server will export a list of capabilities and commands which it supports in a capability advertisement. A client can then request that a particular command be executed by providing a number of capabilities and command specific parameters. At the completion of a command, a client can request that another command be executed or can terminate the connection by sending a flush packet. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-15 12:01:08 -07:00

31 Commits