mirrors/git

mirror of https://github.com/git/git.git synced 2024-12-02 14:34:03 +08:00

Author	SHA1	Message	Date
Jeff King	a1bc3c88de	http: fix leak of http_object_request struct The new_http_object_request() function allocates a struct on the heap, along with some fields inside the struct. But the matching function to clean it up, release_http_object_request(), only frees the interior fields without freeing the struct itself, causing a leak. The related http_pack_request new/release pair gets this right, and at first glance we should be able to do the same thing and just add a single free() call. But there's a catch. These http_object_request structs are typically embedded in the object_request struct of http-walker.c. And when we clean up that parent struct, it sanity-checks the embedded struct to make sure we are not leaking descriptors. Which means a use-after-free if we simply free() the embedded struct. I have no idea how valuable that sanity-check is, or whether it can simply be deleted. This all goes back to `5424bc557f` (http*: add helper methods for fetching objects (loose), 2009-06-06). But the obvious way to make it all work is to be sure we set the pointer to NULL after freeing it (and our freeing process closes the descriptor, so we know there is no leak). To make sure we do that consistently, we'll switch the pointer we take in release_http_object_request() to a pointer-to-pointer, and we'll set it to NULL ourselves. And then the compiler can help us find each caller which needs to be updated. Most cases will just pass "&obj_req->req", which will obviously do the right thing. In a few cases, like http-push's finish_request(), we are working with a copy of the pointer, so we don't NULL the original. But it's OK because the next step is to free the struct containing the original pointer anyway. This lets us mark t5551 as leak-free. Ironically this is the "smart" http test, and the leak here only affects dumb http. But there's a single dumb-http invocation in there. The full dumb tests are in t5550, which still has some more leaks. This also makes t5559 leak-free, as it's just an HTTP/2 variant of t5551. But we don't need to mark it as such, since it inherits the flag from t5551. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-09-25 10:24:55 -07:00
brian m. carlson	ad9bb6dfe6	http: add support for authtype and credential Now that we have the credential helper code set up to handle arbitrary authentications schemes, let's add support for this in the HTTP code, where we really want to use it. If we're using this new functionality, don't set a username and password, and instead set a header wherever we'd normally do so, including for proxy authentication. Since we can now handle this case, ask the credential helper to enable the appropriate capabilities. Finally, if we're using the authtype value, set "Expect: 100-continue". Any type of authentication that requires multiple rounds (such as NTLM or Kerberos) requires a 100 Continue (if we're larger than http.postBuffer) because otherwise we send the pack data before we're authenticated, the push gets a 401 response, and we can't rewind the stream. We don't know for certain what other custom schemes might require this, the HTTP/1.1 standard has required handling this since 1999, the broken HTTP server for which we disabled this (Google's) is now fixed and has been for some time, and libcurl has a 1-second fallback in case the HTTP server is still broken. In addition, it is not unreasonable to require compliance with a 25-year old standard to use new Git features. For all of these reasons, do so here. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-04-16 22:39:07 -07:00
brian m. carlson	d01c76f1cf	http: use new headers for each object request Currently we create one set of headers for all object requests and reuse it. However, we'll need to adjust the headers for authentication purposes in the future, so let's create a new set for each request so that we can adjust them if the authentication changes. Note that the cost of allocation here is tiny compared to the fact that we're making a network call, not to mention probably a full TLS connection, so this shouldn't have a significant impact on performance. Moreover, nobody who cares about performance is using the dumb HTTP protocol anyway, since it often makes huge numbers of requests compared to the smart protocol. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-04-16 22:39:06 -07:00
Elijah Newren	f25e65e0fe	http.h: remove unnecessary include The unnecessary include in the header transitively pulled in some other headers actually needed by source files, though. Have those source files explicitly include the headers they need. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-26 12:04:32 -08:00
Elijah Newren	b6fdc44c84	treewide: remove cache.h inclusion due to object-file.h changes Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-11 08:52:10 -07:00
Elijah Newren	d88dbaa718	git-zlib: move declarations for git-zlib functions from cache.h Move functions from cache.h for zlib.c into a new header file. Since adding a "zlib.h" would cause issues with the real zlib, rename zlib.c to git-zlib.c while we are at it. Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-11 08:52:10 -07:00
Jeff King	fe7e44e1ab	http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION The IOCTLFUNCTION option has been deprecated, and generates a compiler warning in recent versions of curl. We can switch to using SEEKFUNCTION instead. It was added in 2008 via curl 7.18.0; our INSTALL file already indicates we require at least curl 7.19.4. But there's one catch: curl says we should use CURL_SEEKFUNC_{OK,FAIL}, and those didn't arrive until 7.19.5. One workaround would be to use a bare 0/1 here (or define our own macros). But let's just bump the minimum required version to 7.19.5. That version is only a minor version bump from our existing requirement, and is only a 2 month time bump for versions that are almost 13 years old. So it's not likely that anybody cares about the distinction. Switching means we have to rewrite the ioctl functions into seek functions. In some ways they are simpler (seeking is the only operation), but in some ways more complex (the ioctl allowed only a full rewind, but now we can seek to arbitrary offsets). Curl will only ever use SEEK_SET (per their documentation), so I didn't bother implementing anything else, since it would naturally be completely untested. This seems unlikely to change, but I added an assertion just in case. Likewise, I doubt curl will ever try to seek outside of the buffer sizes we've told it, but I erred on the defensive side here, rather than do an out-of-bounds read. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-01-17 08:03:08 -08:00
Li Linchao	b0c4adcdd7	remote-curl: send Accept-Language header to server Git server end's ability to accept Accept-Language header was introduced in `f18604bbf2` (http: add Accept-Language header if possible, 2015-01-28), but this is only used by very early phase of the transfer, which is HTTP GET request to discover references. For other phases, like POST request in the smart HTTP, the server does not know what language the client speaks. Teach git client to learn end-user's preferred language and throw accept-language header to the server side. Once the server gets this header, it has the ability to talk to end-user with language they understand. This would be very helpful for many non-English speakers. Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Li Linchao <lilinchao@oschina.cn> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-07-11 12:24:28 -07:00
Derrick Stolee	c1d024b843	http: make http_get_file() external This method will be used in an upcoming extension of git-remote-curl to download a single file over HTTP(S) by request. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-16 15:02:09 -07:00
Ævar Arnfjörð Bjarmason	3e8084f188	http: check CURLE_SSL_PINNEDPUBKEYNOTMATCH when emitting errors Change the error shown when a http.pinnedPubKey doesn't match to point the http.pinnedPubKey variable added in `aeff8a6121` (http: implement public key pinning, 2016-02-15), e.g.: git -c http.pinnedPubKey=sha256/someNonMatchingKey ls-remote https://github.com/git/git.git fatal: unable to access 'https://github.com/git/git.git/' with http.pinnedPubkey configuration: SSL: public key does not match pinned public key! Before this we'd emit the exact same thing without the " with http.pinnedPubkey configuration". The advantage of doing this is that we're going to get a translated message (everything after the ":" is hardcoded in English in libcurl), and we've got a reference to the git-specific configuration variable that's causing the error. Unfortunately we can't test this easily, as there are no tests that require https:// in the test suite, and t/lib-httpd.sh doesn't know how to set up such tests. See [1] for the start of a discussion about what it would take to have divergent "t/lib-httpd/apache.conf" test setups. #leftoverbits 1. https://lore.kernel.org/git/YUonS1uoZlZEt+Yd@coredump.intra.peff.net/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 10:58:07 -07:00
Ævar Arnfjörð Bjarmason	5db9d38359	http: drop support for curl < 7.19.3 and < 7.17.0 (again) Remove the conditional use of CURLAUTH_DIGEST_IE and CURLOPT_USE_SSL. These two have been split from earlier simpler checks against LIBCURL_VERSION_NUM for ease of review. According to https://github.com/curl/curl/blob/master/docs/libcurl/symbols-in-versions the CURLAUTH_DIGEST_IE flag became available in 7.19.3, and CURLOPT_USE_SSL in 7.17.0. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-30 16:00:10 -07:00
Jeff King	644de29e22	http: drop support for curl < 7.19.4 In the last commit we dropped support for curl < 7.16.0, let's continue that and drop support for versions older than 7.19.3. This allows us to simplify the code by getting rid of some "#ifdef"'s. Git was broken with vanilla curl < 7.19.4 from v2.12.0 until v2.15.0. Compiling with it was broken by using CURLPROTO_* outside any "#ifdef" in `aeae4db174` (http: create function to get curl allowed protocols, 2016-12-14), and fixed in v2.15.0 in `f18777ba6e` (http: fix handling of missing CURLPROTO_, 2017-08-11). It's unclear how much anyone was impacted by that in practice, since as noted in [1] RHEL versions using curl older than that still compiled, because RedHat backported some features. Perhaps other vendors did the same. Still, it's one datapoint indicating that it wasn't in active use at the time. That (the v2.12.0 release) was in Feb 24, 2017, with v2.15.0 on Oct 30, 2017, it's now mid-2021. 1. http://lore.kernel.org/git/c8a2716d-76ac-735c-57f9-175ca3acbcb0@jupiterrise.com; followed-up by `f18777ba6e` (http: fix handling of missing CURLPROTO_, 2017-08-11) Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-30 12:04:41 -07:00
Jeff King	013c7e2b07	http: drop support for curl < 7.16.0 In the last commit we dropped support for curl < 7.11.1, let's continue that and drop support for versions older than 7.16.0. This allows us to get rid of some now-obsolete #ifdefs. Choosing 7.16.0 is a somewhat arbitrary cutoff: 1. It came out in October of 2006, almost 15 years ago. Besides being a nice round number, around 10 years is a common end-of-life support period, even for conservative distributions. 2. That version introduced the curl_multi interface, which gives us a lot of bang for the buck in removing #ifdefs RHEL 5 came with curl 7.15.5[1] (released in August 2006). RHEL 5's extended life cycle program ended on 2020-11-30[1]. RHEL 6 comes with curl 7.19.7 (released in November 2009), and RHEL 7 comes with 7.29.0 (released in February 2013). 1. http://lore.kernel.org/git/873e1f31-2a96-5b72-2f20-a5816cad1b51@jupiterrise.com Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-30 09:11:15 -07:00
Jeff King	1119a15b5c	http: drop support for curl < 7.11.1 Drop support for this ancient version of curl and simplify the code by allowing us get rid of some "#ifdef"'s. Git will not build with vanilla curl older than 7.11.1 due our use of CURLOPT_POSTFIELDSIZE in `37ee680d9b` (http.postbuffer: allow full range of ssize_t values, 2017-04-11). This field was introduced in curl 7.11.1. We could solve these compilation problems with more #ifdefs, but it's not worth the trouble. Version 7.11.1 came out in March of 2004, over 17 years ago. Let's declare that too old and drop any existing ifdefs that go further back. One obvious benefit is that we'll have fewer conditional bits cluttering the code. This patch drops all #ifdefs that reference older versions (note that curl's preprocessor macros are in hex, so we're looking for 070b01, not 071101). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-30 09:11:15 -07:00
Jonathan Tan	726b25a91b	http: allow custom index-pack args Currently, when fetching, packfiles referenced by URIs are run through index-pack without any arguments other than --stdin and --keep, no matter what arguments are used for the packfile that is inline in the fetch response. As a preparation for ensuring that all packs (whether inline or not) use the same index-pack arguments, teach the http subsystem to allow custom index-pack arguments. http-fetch has been updated to use the new API. For now, it passes --keep alone instead of --keep with a process ID, but this is only temporary because http-fetch itself will be taught to accept index-pack parameters (instead of using a hardcoded constant) in a subsequent commit. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-22 12:07:40 -08:00
Junio C Hamano	34e849b05a	Merge branch 'jt/cdn-offload' The "fetch/clone" protocol has been updated to allow the server to instruct the clients to grab pre-packaged packfile(s) in addition to the packed object data coming over the wire. * jt/cdn-offload: upload-pack: fix a sparse '0 as NULL pointer' warning upload-pack: send part of packfile response as uri fetch-pack: support more than one pack lockfile upload-pack: refactor reading of pack-objects out Documentation: add Packfile URIs design doc Documentation: order protocol v2 sections http-fetch: support fetching packfiles by URL http-fetch: refactor into function http: refactor finish_http_pack_request() http: use --stdin when indexing dumb HTTP pack	2020-06-25 12:27:47 -07:00
Jonathan Tan	8d5d2a34df	http-fetch: support fetching packfiles by URL Teach http-fetch the ability to download packfiles directly, given a URL, and to verify them. The http_pack_request suite has been augmented with a function that takes a URL directly. With this function, the hash is only used to determine the name of the temporary file. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-10 18:06:34 -07:00
Jonathan Tan	eb05349247	http: refactor finish_http_pack_request() finish_http_pack_request() does multiple tasks, including some housekeeping on a struct packed_git - (1) closing its index, (2) removing it from a list, and (3) installing it. These concerns are independent of fetching a pack through HTTP: they are there only because (1) the calling code opens the pack's index before deciding to fetch it, (2) the calling code maintains a list of packfiles that can be fetched, and (3) the calling code fetches it in order to make use of its objects in the same process. In preparation for a subsequent commit, which adds a feature that does not need any of this housekeeping, remove (1), (2), and (3) from finish_http_pack_request(). (2) and (3) are now done by a helper function, and (1) is the responsibility of the caller (in this patch, done closer to the point where the pack index is opened). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-10 18:06:34 -07:00
Jonathan Tan	7167a62b9e	http, imap-send: stop using CURLOPT_VERBOSE Whenever GIT_CURL_VERBOSE is set, teach Git to behave as if GIT_TRACE_CURL=1 and GIT_TRACE_CURL_NO_DATA=1 is set, instead of setting CURLOPT_VERBOSE. This is to prevent inadvertent revelation of sensitive data. In particular, GIT_CURL_VERBOSE redacts neither the "Authorization" header nor any cookies specified by GIT_REDACT_COOKIES. Unifying the tracing mechanism also has the future benefit that any improvements to the tracing mechanism will benefit both users of GIT_CURL_VERBOSE and GIT_TRACE_CURL, and we do not need to remember to implement any improvement twice. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-11 11:18:01 -07:00
Carlo Marcelo Arenas Belón	93b980e58f	http: use xmalloc with cURL `f0ed8226c9` (Add custom memory allocator to MinGW and MacOS builds, 2009-05-31) never told cURL about it. Correct that by using the cURL initializer available since version 7.12 to point to xmalloc and friends for consistency which then will pass the allocation requests along when USE_NED_ALLOCATOR=YesPlease is used (most likely in Windows) Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-08-15 12:37:10 -07:00
Junio C Hamano	4aeeef3773	Merge branch 'dl/no-extern-in-func-decl' Mechanically and systematically drop "extern" from function declarlation. * dl/no-extern-in-func-decl: .[ch]: manually align parameter lists .[ch]: remove extern from function declarations using sed *.[ch]: remove extern from function declarations using spatch	2019-05-13 23:50:32 +09:00
Denton Liu	ad6dad0996	*.[ch]: manually align parameter lists In previous patches, extern was mechanically removed from function declarations without care to formatting, causing parameter lists to be misaligned. Manually format changed sections such that the parameter lists should be realigned. Viewing this patch with 'git diff -w' should produce no output. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-05-05 15:20:10 +09:00
Denton Liu	554544276a	.[ch]: remove extern from function declarations using spatch There has been a push to remove extern from function declarations. Remove some instances of "extern" for function declarations which are caught by Coccinelle. Note that Coccinelle has some difficulty with processing functions with `__attribute__` or varargs so some `extern` declarations are left behind to be dealt with in a future patch. This was the Coccinelle patch used: @@ type T; identifier f; @@ - extern T f(...); and it was run with: $ git ls-files \.{c,h} \| grep -v ^compat/ \| xargs spatch --sp-file contrib/coccinelle/noextern.cocci --in-place Files under `compat/` are intentionally excluded as some are directly copied from external sources and we should avoid churning them as much as possible. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-05-05 15:20:06 +09:00
Junio C Hamano	d4e568b2a3	Merge branch 'bc/hash-transition-16' Conversion from unsigned char[20] to struct object_id continues. * bc/hash-transition-16: (35 commits) gitweb: make hash size independent Git.pm: make hash size independent read-cache: read data in a hash-independent way dir: make untracked cache extension hash size independent builtin/difftool: use parse_oid_hex refspec: make hash size independent archive: convert struct archiver_args to object_id builtin/get-tar-commit-id: make hash size independent get-tar-commit-id: parse comment record hash: add a function to lookup hash algorithm by length remote-curl: make hash size independent http: replace sha1_to_hex http: compute hash of downloaded objects using the_hash_algo http: replace hard-coded constant with the_hash_algo http-walker: replace sha1_to_hex http-push: remove remaining uses of sha1_to_hex http-backend: allow 64-character hex names http-push: convert to use the_hash_algo builtin/pull: make hash-size independent builtin/am: make hash size independent ...	2019-04-25 16:41:17 +09:00
brian m. carlson	eed0e60f02	http: compute hash of downloaded objects using the_hash_algo Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-04-01 11:57:38 +09:00
Jeff King	a3722bcbbd	http: factor out curl result code normalization We make some requests with CURLOPT_FAILONERROR and some without, and then handle_curl_result() normalizes any failures to a uniform CURLcode. There are some other code paths in the dumb-http walker which don't use handle_curl_result(); let's pull the normalization into its own function so it can be reused. Arguably those code paths would benefit from the rest of handle_curl_result(), notably the auth handling. But retro-fitting it now would be a lot of work, and in practice it doesn't matter too much (whatever authentication we needed to make the initial contact with the server is generally sufficient for the rest of the dumb-http requests). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-03-24 21:22:40 +09:00
Junio C Hamano	cba595ab1a	Merge branch 'jk/loose-object-cache-oid' Code clean-up. * jk/loose-object-cache-oid: prefer "hash mismatch" to "sha1 mismatch" sha1-file: avoid "sha1 file" for generic use in messages sha1-file: prefer "loose object file" to "sha1 file" in messages sha1-file: drop has_sha1_file() convert has_sha1_file() callers to has_object_file() sha1-file: convert pass-through functions to object_id sha1-file: modernize loose header/stream functions sha1-file: modernize loose object file functions http: use struct object_id instead of bare sha1 update comment references to sha1_object_info() sha1-file: fix outdated sha1 comment references	2019-02-06 22:05:27 -08:00
Masaya Suzuki	e6cf87b12d	http: enable keep_error for HTTP requests curl stops parsing a response when it sees a bad HTTP status code and it has CURLOPT_FAILONERROR set. This prevents GIT_CURL_VERBOSE to show HTTP headers on error. keep_error is an option to receive the HTTP response body for those error responses. By enabling this option, curl will process the HTTP response headers, and they're shown if GIT_CURL_VERBOSE is set. Signed-off-by: Masaya Suzuki <masayasuzuki@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-10 15:00:56 -08:00
Jeff King	f0be0db13d	http: use struct object_id instead of bare sha1 The dumb-http walker code still passes around and stores object ids as "unsigned char *sha1". Let's modernize it. There's probably still more work to be done to handle dumb-http fetches with a new, larger hash. But that can wait; this is enough that we can now convert some of the low-level object routines that we call into from here (and in fact, some of the "oid.hash" references added here will be further improved in the next patch). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-08 09:40:19 -08:00
Junio C Hamano	7c3d15fe31	Merge branch 'jk/snprintf-truncation' Avoid unchecked snprintf() to make future code auditing easier. * jk/snprintf-truncation: fmt_with_err: add a comment that truncation is OK shorten_unambiguous_ref: use xsnprintf fsmonitor: use internal argv_array of struct child_process log_write_email_headers: use strbufs http: use strbufs instead of fixed buffers	2018-05-30 21:51:28 +09:00
Jeff King	390c6cbc5e	http: use strbufs instead of fixed buffers We keep the names of incoming packs and objects in fixed PATH_MAX-size buffers, and snprintf() into them. This is unlikely to end up with truncated filenames, but it is possible (especially on systems where PATH_MAX is shorter than actual paths can be). Let's switch to using strbufs, which makes the question go away entirely. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-21 09:54:30 +09:00
Brandon Williams	8ff14ed412	http: allow providing extra headers for http requests Add a way for callers to request that extra headers be included when making http requests. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-15 12:01:09 -07:00
David Turner	37ee680d9b	http.postbuffer: allow full range of ssize_t values Unfortunately, in order to push some large repos where a server does not support chunked encoding, the http postbuffer must sometimes exceed two gigabytes. On a 64-bit system, this is OK: we just malloc a larger buffer. This means that we need to use CURLOPT_POSTFIELDSIZE_LARGE to set the buffer size. Signed-off-by: David Turner <dturner@twosigma.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-04-13 18:24:32 -07:00
Junio C Hamano	8a2882f23e	Merge branch 'jk/http-walker-limit-redirect-2.9' Transport with dumb http can be fooled into following foreign URLs that the end user does not intend to, especially with the server side redirects and http-alternates mechanism, which can lead to security issues. Tighten the redirection and make it more obvious to the end user when it happens. * jk/http-walker-limit-redirect-2.9: http: treat http-alternates like redirects http: make redirects more obvious remote-curl: rename shadowed options variable http: always update the base URL for redirects http: simplify update_url_from_redirect	2016-12-19 14:45:32 -08:00
Jeff King	50d3413740	http: make redirects more obvious We instruct curl to always follow HTTP redirects. This is convenient, but it creates opportunities for malicious servers to create confusing situations. For instance, imagine Alice is a git user with access to a private repository on Bob's server. Mallory runs her own server and wants to access objects from Bob's repository. Mallory may try a few tricks that involve asking Alice to clone from her, build on top, and then push the result: 1. Mallory may simply redirect all fetch requests to Bob's server. Git will transparently follow those redirects and fetch Bob's history, which Alice may believe she got from Mallory. The subsequent push seems like it is just feeding Mallory back her own objects, but is actually leaking Bob's objects. There is nothing in git's output to indicate that Bob's repository was involved at all. The downside (for Mallory) of this attack is that Alice will have received Bob's entire repository, and is likely to notice that when building on top of it. 2. If Mallory happens to know the sha1 of some object X in Bob's repository, she can instead build her own history that references that object. She then runs a dumb http server, and Alice's client will fetch each object individually. When it asks for X, Mallory redirects her to Bob's server. The end result is that Alice obtains objects from Bob, but they may be buried deep in history. Alice is less likely to notice. Both of these attacks are fairly hard to pull off. There's a social component in getting Mallory to convince Alice to work with her. Alice may be prompted for credentials in accessing Bob's repository (but not always, if she is using a credential helper that caches). Attack (1) requires a certain amount of obliviousness on Alice's part while making a new commit. Attack (2) requires that Mallory knows a sha1 in Bob's repository, that Bob's server supports dumb http, and that the object in question is loose on Bob's server. But we can probably make things a bit more obvious without any loss of functionality. This patch does two things to that end. First, when we encounter a whole-repo redirect during the initial ref discovery, we now inform the user on stderr, making attack (1) much more obvious. Second, the decision to follow redirects is now configurable. The truly paranoid can set the new http.followRedirects to false to avoid any redirection entirely. But for a more practical default, we will disallow redirects only after the initial ref discovery. This is enough to thwart attacks similar to (2), while still allowing the common use of redirects at the repository level. Since `c93c92f30` (http: update base URLs when we see redirects, 2013-09-28) we re-root all further requests from the redirect destination, which should generally mean that no further redirection is necessary. As an escape hatch, in case there really is a server that needs to redirect individual requests, the user can set http.followRedirects to "true" (and this can be done on a per-server basis via http.*.followRedirects config). Reported-by: Jann Horn <jannh@google.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-12-06 12:32:48 -08:00
Junio C Hamano	2f84df2ca0	Merge branch 'ep/http-curl-trace' HTTP transport gained an option to produce more detailed debugging trace. * ep/http-curl-trace: imap-send.c: introduce the GIT_TRACE_CURL enviroment variable http.c: implement the GIT_TRACE_CURL environment variable	2016-07-06 13:38:06 -07:00
Elia Pinto	74c682d3c6	http.c: implement the GIT_TRACE_CURL environment variable Implement the GIT_TRACE_CURL environment variable to allow a greater degree of detail of GIT_CURL_VERBOSE, in particular the complete transport header and all the data payload exchanged. It might be useful if a particular situation could require a more thorough debugging analysis. Document the new GIT_TRACE_CURL environment variable. Helped-by: Torsten Bögershausen <tboegi@web.de> Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Elia Pinto <gitter.spiros@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-05-24 15:48:18 -07:00
Johannes Schindelin	8cb01e2fd3	http: support sending custom HTTP headers We introduce a way to send custom HTTP headers with all requests. This allows us, for example, to send an extra token from build agents for temporary access to private repositories. (This is the use case that triggered this patch.) This feature can be used like this: git -c http.extraheader='Secret: sssh!' fetch $URL $REF Note that `curl_easy_setopt(..., CURLOPT_HTTPHEADER, ...)` takes only a single list, overriding any previous call. This means we have to collect _all_ of the headers we want to use into a single list, and feed it to cURL in one shot. Since we already unconditionally set a "pragma" header when initializing the curl handles, we can add our new headers to that list. For callers which override the default header list (like probe_rpc), we provide `http_copy_default_headers()` so they can do the same trick. Big thanks to Jeff King and Junio Hamano for their outstanding help and patient reviews. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-04-27 14:02:33 -07:00
Junio C Hamano	e84d5e9fa1	Merge branch 'ew/force-ipv4' "git fetch" and friends that make network connections can now be told to only use ipv4 (or ipv6). * ew/force-ipv4: connect & http: support -4 and -6 switches for remote operations	2016-02-24 13:25:54 -08:00
Eric Wong	c915f11eb4	connect & http: support -4 and -6 switches for remote operations Sometimes it is necessary to force IPv4-only or IPv6-only operation on networks where name lookups may return a non-routable address and stall remote operations. The ssh(1) command has an equivalent switches which we may pass when we run them. There may be old ssh(1) implementations out there which do not support these switches; they should report the appropriate error in that case. rsync support is untouched for now since it is deprecated and scheduled to be removed. Signed-off-by: Eric Wong <normalperson@yhbt.net> Reviewed-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-02-12 11:34:14 -08:00
Knut Franke	372370f167	http: use credential API to handle proxy authentication Currently, the only way to pass proxy credentials to curl is by including them in the proxy URL. Usually, this means they will end up on disk unencrypted, one way or another (by inclusion in ~/.gitconfig, shell profile or history). Since proxy authentication often uses a domain user, credentials can be security sensitive; therefore, a safer way of passing credentials is desirable. If the configured proxy contains a username but not a password, query the credential API for one. Also, make sure we approve/reject proxy credentials properly. For consistency reasons, add parsing of http_proxy/https_proxy/all_proxy environment variables, which would otherwise be evaluated as a fallback by curl. Without this, we would have different semantics for git configuration and environment variables. Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Helped-by: Elia Pinto <gitter.spiros@gmail.com> Signed-off-by: Knut Franke <k.franke@science-computing.de> Signed-off-by: Elia Pinto <gitter.spiros@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-01-26 10:53:25 -08:00
David Turner	835c4d3689	http.c: use CURLOPT_RANGE for range requests A HTTP server is permitted to return a non-range response to a HTTP range request (and Apache httpd in fact does this in some cases). While libcurl knows how to correctly handle this (by skipping bytes before and after the requested range), it only turns on this handling if it is aware that a range request is being made. By manually setting the range header instead of using CURLOPT_RANGE, we were hiding the fact that this was a range request from libcurl. This could cause corruption. Signed-off-by: David Turner <dturner@twopensource.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-11-02 14:18:06 -08:00
Junio C Hamano	b90a3d7b32	http.c: make finish_active_slot() and handle_curl_result() static They used to be used directly by remote-curl.c for the smart-http protocol. But they got wrapped by run_one_slot() in `beed336` (http: never use curl_easy_perform, 2014-02-18). Any future users are expected to follow that route. Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-01-15 11:00:52 -08:00
Jeff King	e31316263a	http: optionally extract charset parameter from content-type Since the previous commit, we now give a sanitized, shortened version of the content-type header to any callers who ask for it. This patch adds back a way for them to cleanly access specific parameters to the type. We could easily extract all parameters and make them available via a string_list, but: 1. That complicates the interface and memory management. 2. In practice, no planned callers care about anything except the charset. This patch therefore goes with the simplest thing, and we can expand or change the interface later if it becomes necessary. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-05-27 09:59:19 -07:00
Junio C Hamano	d59c12d7ad	Merge branch 'jl/nor-or-nand-and' Eradicate mistaken use of "nor" (that is, essentially "nor" used not in "neither A nor B" ;-)) from in-code comments, command output strings, and documentations. * jl/nor-or-nand-and: code and test: fix misuses of "nor" comments: fix misuses of "nor" contrib: fix misuses of "nor" Documentation: fix misuses of "nor"	2014-04-08 12:00:28 -07:00
Justin Lebar	01689909eb	comments: fix misuses of "nor" Signed-off-by: Justin Lebar <jlebar@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-03-31 15:29:27 -07:00
Jeff King	beed336c3e	http: never use curl_easy_perform We currently don't reuse http connections when fetching via the smart-http protocol. This is bad because the TCP handshake introduces latency, and especially because SSL connection setup may be non-trivial. We can fix it by consistently using curl's "multi" interface. The reason is rather complicated: Our http code has two ways of being used: queuing many "slots" to be fetched in parallel, or fetching a single request in a blocking manner. The parallel code is built on curl's "multi" interface. Most of the single-request code uses http_request, which is built on top of the parallel code (we just feed it one slot, and wait until it finishes). However, one could also accomplish the single-request scheme by avoiding curl's multi interface entirely and just using curl_easy_perform. This is simpler, and is used by post_rpc in the smart-http protocol. It does work to use the same curl handle in both contexts, as long as it is not at the same time. However, internally curl may not share all of the cached resources between both contexts. In particular, a connection formed using the "multi" code will go into a reuse pool connected to the "multi" object. Further requests using the "easy" interface will not be able to reuse that connection. The smart http protocol does ref discovery via http_request, which uses the "multi" interface, and then follows up with the "easy" interface for its rpc calls. As a result, we make two HTTP connections rather than reusing a single one. We could teach the ref discovery to use the "easy" interface. But it is only once we have done this discovery that we know whether the protocol will be smart or dumb. If it is dumb, then our further requests, which want to fetch objects in parallel, will not be able to reuse the same connection. Instead, this patch switches post_rpc to build on the parallel interface, which means that we use it consistently everywhere. It's a little more complicated to use, but since we have the infrastructure already, it doesn't add any code; we can just factor out the relevant bits from http_request. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-02-18 15:50:57 -08:00
Junio C Hamano	c5a77e8f92	Merge branch 'bc/http-100-continue' Issue "100 Continue" responses to help use of GSS-Negotiate authentication scheme over HTTP transport when needed. * bc/http-100-continue: remote-curl: fix large pushes with GSSAPI remote-curl: pass curl slot_results back through run_slot http: return curl's AUTHAVAIL via slot_results	2013-12-05 12:58:59 -08:00
Jeff King	0972ccd97c	http: return curl's AUTHAVAIL via slot_results Callers of the http code may want to know which auth types were available for the previous request. But after finishing with the curl slot, they are not supposed to look at the curl handle again. We already handle returning other information via the slot_results struct; let's add a flag to check the available auth. Note that older versions of curl did not support this, so we simply return 0 (something like "-1" would be worse, as the value is a bitflag and we might accidentally set a flag). This is sufficient for the callers planned in this series, who only trigger some optional behavior if particular bits are set, and can live with a fake "no bits" answer. Signed-off-by: Jeff King <peff@peff.net>	2013-10-31 10:05:55 -07:00
Jeff King	c93c92f309	http: update base URLs when we see redirects If a caller asks the http_get_* functions to go to a particular URL and we end up elsewhere due to a redirect, the effective_url field can tell us where we went. It would be nice to remember this redirect and short-cut further requests for two reasons: 1. It's more efficient. Otherwise we spend an extra http round-trip to the server for each subsequent request, just to get redirected. 2. If we end up with an http 401 and are going to ask for credentials, it is to feed them to the redirect target. If the redirect is an http->https upgrade, this means our credentials may be provided on the http leg, just to end up redirected to https. And if the redirect crosses server boundaries, then curl will drop the credentials entirely as it follows the redirect. However, it, it is not enough to simply record the effective URL we saw and use that for subsequent requests. We were originally fed a "base" url like: http://example.com/foo.git and we want to figure out what the new base is, even though the URLs we see may be: original: http://example.com/foo.git/info/refs effective: http://example.com/bar.git/info/refs Subsequent requests will not be for "info/refs", but for other paths relative to the base. We must ask the caller to pass in the original base, and we must pass the redirected base back to the caller (so that it can generate more URLs from it). Furthermore, we need to feed the new base to the credential code, so that requests to credential helpers (or to the user) match the URL we will be requesting. This patch teaches http_request_reauth to do this munging. Since it is the caller who cares about making more URLs, it seems at first glance that callers could simply check effective_url themselves and handle it. However, since we need to update the credential struct before the second re-auth request, we have to do it inside http_request_reauth. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>	2013-10-14 16:56:47 -07:00

1 2 3

105 Commits