global: introduce `USE_THE_REPOSITORY_VARIABLE` macro
Use of the `the_repository` variable is deprecated nowadays, and we
slowly but steadily convert the codebase to not use it anymore. Instead,
callers should be passing down the repository to work on via parameters.

It is hard though to prove that a given code unit does not use this
variable anymore. The most trivial case, merely demonstrating that there
is no direct use of `the_repository`, is already a bit of a pain during
code reviews as the reviewer needs to manually verify claims made by the
patch author. The bigger problem though is that we have many interfaces
that implicitly rely on `the_repository`.

Introduce a new `USE_THE_REPOSITORY_VARIABLE` macro that allows code
units to opt into usage of `the_repository`. The intent of this macro is
to demonstrate that a certain code unit does not use this variable
anymore, and to keep it free of new dependencies on it in future
changes, be they explicit or implicit.

For now, the macro only guards `the_repository` itself as well as
`the_hash_algo`. There are many more known interfaces where we have an
implicit dependency on `the_repository`, but those are not guarded at
the current point in time. Over time though, we should start to add
guards as required (or even better, just remove them).

Define the macro as required in our code units. As expected, most of our
code still relies on the global variable. Nearly all of our builtins
rely on the variable as there is no way yet to pass `the_repository` to
their entry point. For now, declare the macro in "builtin.h" to keep the
required changes at least a little bit more contained.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-06-14 14:50:23 +08:00
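Roughly, the guard can be pictured as repository.h only exposing the
global to code units that opt in first (a sketch; the actual header
layout is not reproduced here):

/* Sketch: without the opt-in define, any use of the_repository fails
 * to compile, which is what makes the "does not use it" claim
 * verifiable during review. */
#ifdef USE_THE_REPOSITORY_VARIABLE
extern struct repository *the_repository;
#endif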
#define USE_THE_REPOSITORY_VARIABLE

#include "git-compat-util.h"
http: centralize the accounting of libcurl dependencies
As discussed in 644de29e220 (http: drop support for curl < 7.19.4,
2021-07-30), checking against LIBCURL_VERSION_NUM isn't as reliable as
checking for specific symbols present in curl, as some distros have been
known to backport features.

However, while some of the curl_easy_setopt() arguments we rely on are
macros, others are enums, and we can't assume that those that are
macros won't change into enums in the future.

So we're still going to have to check LIBCURL_VERSION_NUM, but by
doing that in one central place and using a macro definition of our
own, anyone who's backporting features can define it themselves, and
thus have access to more modern curl features that they backported,
even if they didn't bump the LIBCURL_VERSION_NUM.

More importantly, as shown in a preceding commit, doing these version
checks makes for hard-to-read and possibly buggy code, as shown by the
bug fixed there where we were confusing base 10 with base 16 when
comparing the version.

By doing them all in one place we'll hopefully reduce the chances of
such future mistakes; furthermore, it now becomes easier to see at a
glance what the oldest supported version is, which makes it easier to
reason about any future deprecation similar to the recent
e48a623dea0 (Merge branch 'ab/http-drop-old-curl', 2021-08-24).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-13 22:51:28 +08:00
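The shape of such a check is a one-time mapping from a curl version to
a feature macro of our own; a backporting distro can then define the
macro itself. A minimal sketch (the version cutoff shown is
illustrative, not the exact one used):

/* git-curl-compat.h pattern: one central version check per feature. */
#if LIBCURL_VERSION_NUM >= 0x072c00 /* version number illustrative */
#define GIT_CURL_HAVE_CURLOPT_PINNEDPUBLICKEY 1
#endif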
#include "git-curl-compat.h"
#include "hex.h"
#include "http.h"
#include "config.h"
#include "pack.h"
#include "run-command.h"
#include "url.h"
#include "urlmatch.h"
http: use credential API to get passwords
This patch converts the http code to use the new credential
API, both for http authentication as well as for getting
certificate passwords.

Most of the code change is simply variable naming (the
passwords are now contained inside the credential struct)
or deletion of obsolete code (the credential code handles
URL parsing and prompting for us).

The behavior should be the same, with one exception: the
credential code will prompt with a description based on the
credential components. Therefore, the old prompt of:

    Username for 'example.com':
    Password for 'example.com':

now looks like:

    Username for 'https://example.com/repo.git':
    Password for 'https://user@example.com/repo.git':

Note that we include more information in each line,
specifically:

  1. We now include the protocol. While more noisy, this is
     an important part of knowing what you are accessing
     (especially if you care about http vs https).

  2. We include the username in the password prompt. This is
     not a big deal when you have just been prompted for it,
     but the username may also come from the remote's URL
     (and after future patches, from configuration or
     credential helpers). In that case, it's a nice
     reminder of the user for which you're giving the
     password.

  3. We include the path component of the URL. In many
     cases, the user won't care about this and it's simply
     noise (i.e., they'll use the same credential for a
     whole site). However, that is part of a larger
     question, which is whether path components should be
     part of credential context, both for prompting and for
     lookup by storage helpers. That issue will be addressed
     as a whole in a future patch.

Similarly, for unlocking certificates, we used to say:

    Certificate Password for 'example.com':

and we now say:

    Password for 'cert:///path/to/certificate':

Showing the path to the client certificate makes more sense,
as that is what you are unlocking, not "example.com".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-12-10 18:31:21 +08:00
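The resulting flow looks roughly like this (a sketch of the credential
API as of this patch; the real call sites thread the result into curl
options and handle errors):

struct credential c = CREDENTIAL_INIT;
c.protocol = xstrdup("https");
c.host = xstrdup("example.com");
credential_fill(&c);      /* consults helpers, prompts if needed */
/* ... hand c.username / c.password to curl ... */
credential_approve(&c);   /* or credential_reject(&c) after a 401 */
credential_clear(&c);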
#include "credential.h"
#include "version.h"
#include "pkt-line.h"
#include "gettext.h"
#include "trace.h"
http: limit redirection to protocol-whitelist
Previously, libcurl would follow redirection to any protocol
it was compiled to support. This is desirable to allow
redirection from HTTP to HTTPS. However, it would even
successfully allow redirection from HTTP to SFTP, a protocol
that git does not otherwise support at all. Furthermore,
git's new protocol-whitelisting could be bypassed by
following a redirect within the remote helper, as it was
only enforced at transport selection time.

This patch limits redirects within libcurl to HTTP, HTTPS,
FTP and FTPS. If there is a protocol whitelist present, this
list is limited to those also allowed by the whitelist. As
redirection happens from within libcurl, it is impossible
for an HTTP redirect to reach a protocol implemented within
another remote helper.

When the curl version git was compiled with is too old to
support restrictions on protocol redirection, we warn the
user if GIT_ALLOW_PROTOCOL restrictions were requested. This
is a little inaccurate, as even without that variable in the
environment, we would still restrict SFTP, etc., and we do
not warn in that case. But anything else means we would
literally warn every time git accesses an http remote.

This commit includes a test, but it is not as robust as we
would hope. It redirects an http request to ftp, and checks
that curl complained about the protocol, which means that we
are relying on curl's specific error message to know what
happened. Ideally we would redirect to a working ftp server
and confirm that we can clone without protocol restrictions,
and not with them. But we do not have a portable way of
providing an ftp server, nor any other protocol that curl
supports (https is the closest, but we would have to deal
with certificates).

[jk: added test and version warning]
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-23 06:06:04 +08:00
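In curl terms, the restriction boils down to a redirect-protocol mask;
a sketch:

/* Sketch: allow redirects only to these protocols; a configured
 * whitelist would further reduce this mask. */
curl_easy_setopt(curl, CURLOPT_REDIR_PROTOCOLS,
                 CURLPROTO_HTTP | CURLPROTO_HTTPS |
                 CURLPROTO_FTP | CURLPROTO_FTPS);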
#include "transport.h"
#include "packfile.h"
#include "string-list.h"
#include "object-file.h"
#include "object-store-ll.h"
dumb-http: store downloaded pack idx as tempfile
This patch fixes a regression in b1b8dfde69 (finalize_object_file():
implement collision check, 2024-09-26) where fetching a v1 pack idx file
over the dumb-http protocol would cause the fetch to fail.

The core of the issue is that dumb-http stores the idx we fetch from the
remote at the same path that will eventually hold the idx we generate
from "index-pack --stdin". The sequence is something like this:

  0. We realize we need some object X, which we don't have locally, and
     nor does the other side have it as a loose object.

  1. We download the list of remote packs from objects/info/packs.

  2. For each entry in that file, we download each pack index and store
     it locally in .git/objects/pack/pack-$hash.idx (the $hash is not
     something we can verify yet and is given to us by the remote).

  3. We check each pack index we got to see if it has object X. When we
     find a match, we download the matching .pack file from the remote
     to a tempfile. We feed that to "index-pack --stdin", which
     reindexes the pack, rather than trusting that it has what the other
     side claims it does. In most cases, this will end up generating the
     exact same (byte-for-byte) pack index which we'll store at the same
     pack-$hash.idx path, because the index generation and $hash id are
     computed based on what's in the packfile. But:

       a. The other side might have used other options to generate the
          index. For instance we use index v2 by default, but long ago
          it was v1 (and you can still ask for v1 explicitly).

       b. The other side might even use a different mechanism to
          determine $hash. E.g., long ago it was based on the sorted
          list of objects in the packfile, but we switched to using the
          pack checksum in 1190a1acf8 (pack-objects: name pack files
          after trailer hash, 2013-12-05).

The regression we saw in the real world was (3a). A recent client
fetching from a server with a v1 index downloaded that index, then
complained about trying to overwrite it with its own v2 index. This
collision is otherwise harmless; we know we want to replace the remote
version with our local one, but the collision check doesn't realize
that.

There are a few options to fix it:

  - we could teach index-pack a command-line option to ignore only pack
    idx collisions, and use it when the dumb-http code invokes
    index-pack. This would be an awkward thing to expose users to and
    would involve a lot of boilerplate to get the option down to the
    collision code.

  - we could delete the remote .idx file right before running
    index-pack. It should be redundant at that point (since we've just
    downloaded the matching pack). But it feels risky to delete
    something from our own .git/objects based on what the other side
    has said. I'm not entirely positive that a malicious server couldn't
    lie about which pack-$hash.idx it has and get us to delete something
    precious.

  - we can stop co-mingling the downloaded idx files in our local
    objects directory. This is a slightly bigger change but I think
    fixes the root of the problem more directly.

This patch implements the third option. The big design questions are:
where do we store the downloaded files, and how do we manage their
lifetimes?

There are some additional quirks to the dumb-http system we should
consider. Remember that in step 2 we downloaded every pack index, but in
step 3 we may only download some of the matching packs. What happens to
those other idx files now? They sit in the .git/objects/pack directory,
possibly waiting to be used at a later date. That may save bandwidth for
a subsequent fetch, but it also creates a lot of weird corner cases:

  - our local object directory now has semi-untrusted .idx files sitting
    around, without their matching .pack

  - in case 3b, we noted that we might not generate the same hash as the
    other side. In that case even if we download the matching pack,
    our index-pack invocation will store it in a different
    pack-$hash.idx file. And the unmatched .idx will sit there forever.

  - if the server repacks, it may delete the old packs. Now we have
    these orphaned .idx files sitting around locally that will never be
    used (nor deleted).

  - if we repack locally we may delete our local version of the server's
    pack index and not realize we have it. So we'll download it again,
    even though we have all of the objects it mentions.

I think the right solution here is probably some more complex cache
management system: download the remote .idx files to their own storage
directory, mark them as "seen" when we get their matching pack (to avoid
re-downloading even if we repack), and then delete them when the
server's objects/info/refs no longer mentions them.

But since the dumb http protocol is so ancient and so inferior to the
smart http protocol, I don't think it's worth spending a lot of time
creating such a system. For this patch I'm just downloading the idx
files to .git/objects/tmp_pack_*, and marking them as tempfiles to be
deleted when we exit (and due to the name, any we miss due to a crash,
etc, should eventually be removed by "git gc" runs based on timestamps).

That is slightly worse for one case: if we download an idx but not the
matching pack, we won't retain that idx for subsequent runs. But the
flip side is that we're making other cases better (we never hold on to
useless idx files forever). I suspect that worse case does not even come
up often, since it implies that the packs are generated to match
distinct parts of history (i.e., in practice even in a repo with many
packs you're going to end up grabbing all of those packs to do a clone).
If somebody really cares about that, I think the right path forward is a
managed cache directory as above, and this patch is providing the first
step in that direction anyway (by moving things out of the objects/pack/
directory).

There are two test changes. One demonstrates the broken v1 index case
(it double-checks the resulting clone with fsck to be careful, but prior
to this patch it actually fails at the clone step). The other tweaks the
expectation for a test that covers the "slightly worse" case to
accommodate the extra index download.

The code changes are fairly simple. We stop using finalize_object_file()
to copy the remote's index file into place, and leave it as a tempfile.
We give the tempfile a real ".idx" name, since the packfile code expects
that, and thus we make sure it is out of the usual packs/ directory (so
we'd never mistake it for a real local .idx).

We also have to change parse_pack_index(), which creates a temporary
packed_git to access our index (we need this because all of the pack idx
code assumes we have that struct). It reads the index data from the
tempfile, but prior to this patch would speculatively write the
finalized name into the packed_git struct using the pack-$hash we expect
to use.

I was mildly surprised that this worked at all, since we call
verify_pack_index() on the packed_git which mentions the final name
before moving the file into place! But it works because
parse_pack_index() leaves the mmap-ed data in the struct, so the
lazy-open in verify_pack_index() never triggers, and we read from the
tempfile, ignoring the filename in the struct completely. Hacky, but it
works.

After this patch, parse_pack_index() now uses the index filename we pass
in to derive a matching .pack name. This is OK to change because there
are only two callers, both in the dumb http code (and the other passes
in an existing pack-$hash.idx name, so the derived name is going to be
pack-$hash.pack, which is what we were using anyway).

I'll follow up with some more cleanups in that area, but this patch is
sufficient to fix the regression.
Reported-by: fox <fox.gbr@townlong-yak.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25 14:58:06 +08:00
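With git's tempfile API, the approach sketches out as follows (the
template string is illustrative; the real code also has to pick a name
ending in ".idx" so the packfile code accepts it):

/* Sketch: the downloaded idx lives as a tempfile, deleted at exit;
 * anything left behind by a crash is swept up later by "git gc". */
struct tempfile *idx_tmp =
        mks_tempfile_ts("objects/tmp_pack_XXXXXX.idx", strlen(".idx"));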
#include "tempfile.h"

static struct trace_key trace_curl = TRACE_KEY_INIT(CURL);
static int trace_curl_data = 1;
static int trace_curl_redact = 1;
long int git_curl_ipresolve = CURL_IPRESOLVE_WHATEVER;
int active_requests;
int http_is_verbose;
ssize_t http_post_buffer = 16 * LARGE_PACKET_MAX;

static int min_curl_sessions = 1;
static int curl_session_count;
static int max_requests = -1;
static CURLM *curlm;
static CURL *curl_default;
http*: add helper methods for fetching objects (loose)
The code handling the fetching of loose objects in http-push.c and
http-walker.c has been refactored into new methods and a new struct
(http_object_request) in http.c. They are not meant to be invoked
elsewhere.

The new methods in http.c are:

  - new_http_object_request
  - process_http_object_request
  - finish_http_object_request
  - abort_http_object_request
  - release_http_object_request

and the new struct is http_object_request.

RANGE_HEADER_SIZE and no_pragma_header are no longer made available
outside of http.c, since after the above changes, there are no other
instances of usage outside of http.c.

Remove members of the transfer_request struct in http-push.c and
http-walker.c, including filename, real_sha1 and zret, as they are no
longer used.

Move the methods append_remote_object_url() and get_remote_object_url()
from http-push.c to http.c. Additionally, get_remote_object_url() is no
longer defined only when USE_CURL_MULTI is defined, since
non-USE_CURL_MULTI code in http.c uses it (namely, in
new_http_object_request()).

Refactor code from http-push.c::start_fetch_loose() and
http-walker.c::start_object_fetch_request() that deals with the details
of coming up with the filename to store the retrieved object, resuming
a previously aborted request, and making a new curl request, into a new
function, new_http_object_request().

Refactor code from http-walker.c::process_object_request() into the
function, process_http_object_request().

Refactor code from http-push.c::finish_request() and
http-walker.c::finish_object_request() into a new function,
finish_http_object_request(). It returns the result of the
move_temp_to_file() invocation.

Add a function, release_http_object_request(), which cleans up object
request data. http-push.c and http-walker.c invoke this function
separately; http-push.c::release_request() and
http-walker.c::release_object_request() do not invoke this function.

Add a function, abort_http_object_request(), which unlink()s the object
file and invokes release_http_object_request(). Update
http-walker.c::abort_object_request() to use this.
Signed-off-by: Tay Ray Chuan <rctay89@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-06 16:44:02 +08:00
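A sketch of the intended call sequence for these helpers (argument
names abbreviated, error handling elided; see http.h for the real
signatures):

struct http_object_request *freq = new_http_object_request(url, sha1);
if (freq) {
        process_http_object_request(freq);      /* run/queue the transfer */
        if (finish_http_object_request(freq))
                abort_http_object_request(freq); /* unlink + release */
        else
                release_http_object_request(freq);
}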
#define PREV_BUF_SIZE 4096

char curl_errorstr[CURL_ERROR_SIZE];

static int curl_ssl_verify = -1;
static int curl_ssl_try;
static char *curl_http_version;
static char *ssl_cert;
static char *ssl_cert_type;
static char *ssl_cipherlist;
static char *ssl_version;
static struct {
        const char *name;
        long ssl_version;
} sslversions[] = {
        { "sslv2", CURL_SSLVERSION_SSLv2 },
        { "sslv3", CURL_SSLVERSION_SSLv3 },
        { "tlsv1", CURL_SSLVERSION_TLSv1 },
#ifdef GIT_CURL_HAVE_CURL_SSLVERSION_TLSv1_0
        { "tlsv1.0", CURL_SSLVERSION_TLSv1_0 },
        { "tlsv1.1", CURL_SSLVERSION_TLSv1_1 },
        { "tlsv1.2", CURL_SSLVERSION_TLSv1_2 },
#endif
#ifdef GIT_CURL_HAVE_CURL_SSLVERSION_TLSv1_3
        { "tlsv1.3", CURL_SSLVERSION_TLSv1_3 },
#endif
};
static char *ssl_key;
static char *ssl_key_type;
static char *ssl_capath;
static char *curl_no_proxy;
#ifdef GIT_CURL_HAVE_CURLOPT_PINNEDPUBLICKEY
static char *ssl_pinnedkey;
#endif
static char *ssl_cainfo;
static long curl_low_speed_limit = -1;
static long curl_low_speed_time = -1;
static int curl_ftp_no_epsv;
static char *curl_http_proxy;
static char *http_proxy_authmethod;

static char *http_proxy_ssl_cert;
static char *http_proxy_ssl_key;
static char *http_proxy_ssl_ca_info;
static struct credential proxy_cert_auth = CREDENTIAL_INIT;
static int proxy_ssl_cert_password_required;

static struct {
        const char *name;
        long curlauth_param;
} proxy_authmethods[] = {
        { "basic", CURLAUTH_BASIC },
        { "digest", CURLAUTH_DIGEST },
        { "negotiate", CURLAUTH_GSSNEGOTIATE },
        { "ntlm", CURLAUTH_NTLM },
        { "anyauth", CURLAUTH_ANY },
        /*
         * CURLAUTH_DIGEST_IE has no corresponding command-line option in
         * curl(1) and is not included in CURLAUTH_ANY, so we leave it out
         * here, too
         */
};

#ifdef CURLGSSAPI_DELEGATION_FLAG
static char *curl_deleg;
static struct {
        const char *name;
        long curl_deleg_param;
} curl_deleg_levels[] = {
        { "none", CURLGSSAPI_DELEGATION_NONE },
        { "policy", CURLGSSAPI_DELEGATION_POLICY_FLAG },
        { "always", CURLGSSAPI_DELEGATION_FLAG },
};
#endif
http: allow authenticating proactively
When making a request over HTTP(S), Git only sends authentication if it
receives a 401 response. Thus, if a repository is open to the public
for reading, Git will typically never ask for authentication for fetches
and clones.

However, there may be times when a user would like to authenticate
nevertheless. For example, a forge may give higher rate limits to users
who authenticate because they are easier to contact in case of excessive
use. Or it may be useful for a known heavy user, such as an internal
service, to proactively authenticate so its use can be monitored and, if
necessary, throttled.

Let's make this possible with a new option, "http.proactiveAuth". This
option specifies a type of authentication which can be used to
authenticate against the host in question. This is necessary because we
lack the WWW-Authenticate header to provide us details; similarly, we
cannot accept certain types of authentication because we require
information from the server, such as a nonce or challenge, to
successfully authenticate.

If we're in auto mode and we got a username and password, set the
authentication scheme to Basic. libcurl will not send authentication
proactively unless there's a single choice of allowed authentication,
and we know in this case we didn't get an authtype entry telling us what
scheme to use, or we would have taken a different codepath and written
the header ourselves. In any event, of the other schemes that libcurl
supports, Digest and NTLM require a nonce or challenge, which means that
they cannot work with proactive auth, and GSSAPI does not use a username
and password at all, so Basic is the only logical choice among the
built-in options.

Note that the existing http_proactive_auth variable signifies proactive
auth if there are already credentials, which is different from the
functionality we're adding, which always seeks credentials even if none
are provided. Nonetheless, t5540 tests the existing behavior for
WebDAV-based pushes to an open repository without credentials, so we
preserve it. While at first this may seem an insecure and bizarre
decision, it may be that authentication is done with TLS certificates,
in which case it might actually provide a quite high level of security.

Expand the variable to use an enum to handle the additional cases and a
helper function to distinguish our new cases from the old ones.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-10 08:01:55 +08:00
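The auto-mode fallback described above amounts to forcing Basic once a
username and password are in hand; roughly (handle name illustrative):

/* Sketch: with no WWW-Authenticate header to guide us, Basic is the
 * only scheme curl can usefully send proactively. */
if (http_proactive_auth == PROACTIVE_AUTH_AUTO &&
    http_auth.username && http_auth.password)
        curl_easy_setopt(curl, CURLOPT_HTTPAUTH, CURLAUTH_BASIC);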
enum proactive_auth {
        PROACTIVE_AUTH_NONE = 0,
        PROACTIVE_AUTH_IF_CREDENTIALS,
        PROACTIVE_AUTH_AUTO,
        PROACTIVE_AUTH_BASIC,
};

http: use credential API to handle proxy authentication
Currently, the only way to pass proxy credentials to curl is by
including them in the proxy URL. Usually, this means they will end up on
disk unencrypted, one way or another (by inclusion in ~/.gitconfig,
shell profile or history). Since proxy authentication often uses a
domain user, credentials can be security sensitive; therefore, a safer
way of passing credentials is desirable.

If the configured proxy contains a username but not a password, query
the credential API for one. Also, make sure we approve/reject proxy
credentials properly.

For consistency reasons, add parsing of http_proxy/https_proxy/all_proxy
environment variables, which would otherwise be evaluated as a fallback
by curl. Without this, we would have different semantics for git
configuration and environment variables.
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Knut Franke <k.franke@science-computing.de>
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-26 21:02:48 +08:00
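Sketched against the credential API, the proxy handling looks something
like this (call sites simplified; handle name illustrative):

if (proxy_auth.username && !proxy_auth.password)
        credential_fill(&proxy_auth);   /* ask helpers or the user */
curl_easy_setopt(curl, CURLOPT_PROXYUSERNAME, proxy_auth.username);
curl_easy_setopt(curl, CURLOPT_PROXYPASSWORD, proxy_auth.password);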
static struct credential proxy_auth = CREDENTIAL_INIT;
static const char *curl_proxyuserpwd;
static char *curl_cookie_file;
static int curl_save_cookies;
http: hoist credential request out of handle_curl_result
When we are handling a curl response code in http_request or
in the remote-curl RPC code, we use the handle_curl_result
helper to translate curl's response into an easy-to-use
code. When we see an HTTP 401, we do one of two things:

  1. If we already had a filled-in credential, we mark it as
     rejected, and then return HTTP_NOAUTH to indicate to
     the caller that we failed.

  2. If we didn't, then we ask for a new credential and tell
     the caller HTTP_REAUTH to indicate that they may want
     to try again.

Rejecting in the first case makes sense; it is the natural
result of the request we just made. However, prompting for
more credentials in the second step does not always make
sense. We do not know for sure that the caller is going to
make a second request, and nor are we sure that it will be
to the same URL. Logically, the prompt belongs not to the
request we just finished, but to the request we are (maybe)
about to make.

In practice, it is very hard to trigger any bad behavior.
Currently, if we make a second request, it will always be to
the same URL (even in the face of redirects, because curl
handles the redirects internally). And we almost always
retry on HTTP_REAUTH these days. The one exception is if we
are streaming a large RPC request to the server (e.g., a
pushed packfile), in which case we cannot restart. It's
extremely unlikely to see a 401 response at this stage,
though, as we would typically have seen it when we sent a
probe request, before streaming the data.

This patch drops the automatic prompt out of case 2, and
instead requires the caller to do it. This is a few extra
lines of code, and the bug it fixes is unlikely to come up
in practice. But it is conceptually cleaner, and paves the
way for better handling of credentials across redirects.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2013-09-28 16:31:45 +08:00
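After this change, a caller that wants the old behavior does the
prompt-and-retry itself; schematically (function name illustrative):

int ret = do_http_request(url);
if (ret == HTTP_REAUTH) {
        credential_fill(&http_auth);    /* the prompt belongs to the retry */
        ret = do_http_request(url);
}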
struct credential http_auth = CREDENTIAL_INIT;
static enum proactive_auth http_proactive_auth;
static char *user_agent;
http: add an "auto" mode for http.emptyauth
This variable needs to be specified to make some types of
non-basic authentication work, but ideally this would just
work out of the box for everyone.

However, simply setting it to "1" by default introduces an
extra round-trip for cases where it _isn't_ useful. We end
up sending a bogus empty credential that the server rejects.

Instead, let's introduce an automatic mode, that works like
this:

  1. We won't try to send the bogus credential on the first
     request. We'll wait to get an HTTP 401, as usual.

  2. After seeing an HTTP 401, the empty-auth hack will kick
     in only when we know there is an auth method available
     that might make use of it (i.e., something besides
     "Basic" or "Digest").

That should make it work out of the box, without incurring
any extra round-trips for people hitting Basic-only servers.

This _does_ incur an extra round-trip if you really want to
use "Basic" but your server advertises other methods (the
emptyauth hack will kick in but fail, and then Git will
actually ask for a password).

The auto mode may incur an extra round-trip over setting
http.emptyauth=true, because part of the emptyauth hack is
to feed this blank password to curl even before we've made a
single request.
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-26 03:18:31 +08:00
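The decision reduces to a small predicate over the advertised methods;
a sketch close to what this file ends up doing, using the
http_auth_methods bookkeeping declared further down:

static int curl_empty_auth_enabled(void)
{
        if (curl_empty_auth >= 0)
                return curl_empty_auth; /* explicit config wins */
        /* "auto": only if something beyond Basic/Digest is on offer */
        if (http_auth_methods_restricted &&
            (http_auth_methods & ~empty_auth_useless))
                return 1;
        return 0;
}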
static int curl_empty_auth = -1;

http: make redirects more obvious
We instruct curl to always follow HTTP redirects. This is
convenient, but it creates opportunities for malicious
servers to create confusing situations. For instance,
imagine Alice is a git user with access to a private
repository on Bob's server. Mallory runs her own server and
wants to access objects from Bob's repository.

Mallory may try a few tricks that involve asking Alice to
clone from her, build on top, and then push the result:

  1. Mallory may simply redirect all fetch requests to Bob's
     server. Git will transparently follow those redirects
     and fetch Bob's history, which Alice may believe she
     got from Mallory. The subsequent push seems like it is
     just feeding Mallory back her own objects, but is
     actually leaking Bob's objects. There is nothing in
     git's output to indicate that Bob's repository was
     involved at all.

     The downside (for Mallory) of this attack is that Alice
     will have received Bob's entire repository, and is
     likely to notice that when building on top of it.

  2. If Mallory happens to know the sha1 of some object X in
     Bob's repository, she can instead build her own history
     that references that object. She then runs a dumb http
     server, and Alice's client will fetch each object
     individually. When it asks for X, Mallory redirects her
     to Bob's server. The end result is that Alice obtains
     objects from Bob, but they may be buried deep in
     history. Alice is less likely to notice.

Both of these attacks are fairly hard to pull off. There's a
social component in getting Mallory to convince Alice to
work with her. Alice may be prompted for credentials in
accessing Bob's repository (but not always, if she is using
a credential helper that caches). Attack (1) requires a
certain amount of obliviousness on Alice's part while making
a new commit. Attack (2) requires that Mallory knows a sha1
in Bob's repository, that Bob's server supports dumb http,
and that the object in question is loose on Bob's server.

But we can probably make things a bit more obvious without
any loss of functionality. This patch does two things to
that end.

First, when we encounter a whole-repo redirect during the
initial ref discovery, we now inform the user on stderr,
making attack (1) much more obvious.

Second, the decision to follow redirects is now
configurable. The truly paranoid can set the new
http.followRedirects to false to avoid any redirection
entirely. But for a more practical default, we will disallow
redirects only after the initial ref discovery. This is
enough to thwart attacks similar to (2), while still
allowing the common use of redirects at the repository
level. Since c93c92f30 (http: update base URLs when we see
redirects, 2013-09-28) we re-root all further requests from
the redirect destination, which should generally mean that
no further redirection is necessary.

As an escape hatch, in case there really is a server that
needs to redirect individual requests, the user can set
http.followRedirects to "true" (and this can be done on a
per-server basis via http.*.followRedirects config).
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-07 02:24:41 +08:00
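Per request, the new policy reduces to whether FOLLOWLOCATION is
enabled; a sketch (the is_initial_request flag is illustrative):

if (http_follow_config == HTTP_FOLLOW_ALWAYS ||
    (http_follow_config == HTTP_FOLLOW_INITIAL && is_initial_request))
        curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 1);
else
        curl_easy_setopt(curl, CURLOPT_FOLLOWLOCATION, 0);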
enum http_follow_config http_follow_config = HTTP_FOLLOW_INITIAL;

static struct credential cert_auth = CREDENTIAL_INIT;
static int ssl_cert_password_required;
static unsigned long http_auth_methods = CURLAUTH_ANY;
static int http_auth_methods_restricted;
/* Modes for which empty_auth cannot actually help us. */
static unsigned long empty_auth_useless =
        CURLAUTH_BASIC
        | CURLAUTH_DIGEST_IE
        | CURLAUTH_DIGEST;

static struct curl_slist *pragma_header;
remote-curl: unbreak http.extraHeader with custom allocators
In 93b980e58f5 (http: use xmalloc with cURL, 2019-08-15), we started to
ask cURL to use `xmalloc()`, and if compiled with nedmalloc, that means
implicitly a different allocator than the system one.

Which means that all of cURL's allocations and releases now _need_ to
use that allocator.

However, the `http_options()` function used `slist_append()` to add any
configured extra HTTP header(s) _before_ asking cURL to use `xmalloc()`,
and `http_cleanup()` would release them _afterwards_, i.e. in the
presence of custom allocators, cURL would attempt to use the wrong
allocator to release the memory.

A naïve attempt at fixing this would move the call to
`curl_global_init()` _before_ the config is parsed (i.e. before that
call to `slist_append()`).

However, that does not work, as we _also_ parse the config setting
`http.sslbackend` and, if found, call `curl_global_sslset()`, which
*must* be called before `curl_global_init()`; for details see:

  https://curl.haxx.se/libcurl/c/curl_global_sslset.html

So let's instead make the config parsing entirely independent from
cURL's data structures. Incidentally, this deletes two more lines than
it introduces, which is nice.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-06 18:04:55 +08:00
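The shape of the fix: parse config into git-owned memory (the
string_list declared below) and only build the curl_slist once
curl_global_init() has installed our allocator; roughly:

struct curl_slist *headers = NULL;
for (size_t i = 0; i < extra_http_headers.nr; i++)
        headers = curl_slist_append(headers,
                                    extra_http_headers.items[i].string);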
static struct string_list extra_http_headers = STRING_LIST_INIT_DUP;

static struct curl_slist *host_resolutions;

static struct active_request_slot *active_queue_head;

static char *cached_accept_language;

static char *http_ssl_backend;

static int http_schannel_check_revoke = 1;
/*
 * With the backend being set to `schannel`, setting sslCAinfo would override
 * the Certificate Store in cURL v7.60.0 and later, which is not what we want
 * by default.
 */
static int http_schannel_use_ssl_cainfo;

static int always_auth_proactively(void)
{
        return http_proactive_auth != PROACTIVE_AUTH_NONE &&
               http_proactive_auth != PROACTIVE_AUTH_IF_CREDENTIALS;
}

size_t fread_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_)
{
        size_t size = eltsize * nmemb;
        struct buffer *buffer = buffer_;

        if (size > buffer->buf.len - buffer->posn)
                size = buffer->buf.len - buffer->posn;
        memcpy(ptr, buffer->buf.buf + buffer->posn, size);
        buffer->posn += size;
Make fread/fwrite-like functions in http.c more like fread/fwrite.
The fread/fwrite-like functions in http.c, namely fread_buffer,
fwrite_buffer, fwrite_null, fwrite_sha1_file all return the
multiplication of the size and number of items they are being given.

Practically speaking, it doesn't matter, because in all contexts where
those functions are used, size is 1.

But those functions being similar to fread and fwrite (the curl API is
designed around being able to use fread and fwrite directly), it might
be preferable to make them behave like fread and fwrite, which, from
the fread/fwrite manual page, is:

    On success, fread() and fwrite() return the number of items read
    or written. This number equals the number of bytes transferred
    only when size is 1. If an error occurs, or the end of the file
    is reached, the return value is a short item count (or zero).
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-05-08 07:03:54 +08:00
        return size / eltsize;
}

http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
The IOCTLFUNCTION option has been deprecated, and generates a compiler
warning in recent versions of curl. We can switch to using SEEKFUNCTION
instead. It was added in 2008 via curl 7.18.0; our INSTALL file already
indicates we require at least curl 7.19.4.

But there's one catch: curl says we should use CURL_SEEKFUNC_{OK,FAIL},
and those didn't arrive until 7.19.5. One workaround would be to use a
bare 0/1 here (or define our own macros). But let's just bump the
minimum required version to 7.19.5. That version is only a minor version
bump from our existing requirement, and is only a 2 month time bump for
versions that are almost 13 years old. So it's not likely that anybody
cares about the distinction.

Switching means we have to rewrite the ioctl functions into seek
functions. In some ways they are simpler (seeking is the only
operation), but in some ways more complex (the ioctl allowed only a full
rewind, but now we can seek to arbitrary offsets).

Curl will only ever use SEEK_SET (per their documentation), so I didn't
bother implementing anything else, since it would naturally be
completely untested. This seems unlikely to change, but I added an
assertion just in case.

Likewise, I doubt curl will ever try to seek outside of the buffer sizes
we've told it, but I erred on the defensive side here, rather than do an
out-of-bounds read.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-17 11:04:44 +08:00
int seek_buffer(void *clientp, curl_off_t offset, int origin)
{
        struct buffer *buffer = clientp;

        if (origin != SEEK_SET)
                BUG("seek_buffer only handles SEEK_SET");
        if (offset < 0 || offset >= buffer->buf.len) {
                error("curl seek would be outside of buffer");
                return CURL_SEEKFUNC_FAIL;
        }

        buffer->posn = offset;
        return CURL_SEEKFUNC_OK;
}
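For context, a seek callback like the one above is registered on a
handle roughly as follows:

/* Sketch: hook up the seek callback and its userdata. */
curl_easy_setopt(curl, CURLOPT_SEEKFUNCTION, seek_buffer);
curl_easy_setopt(curl, CURLOPT_SEEKDATA, buffer);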
size_t fwrite_buffer(char *ptr, size_t eltsize, size_t nmemb, void *buffer_)
{
        size_t size = eltsize * nmemb;
        struct strbuf *buffer = buffer_;

        strbuf_add(buffer, ptr, size);
|
|
|
return nmemb;
|
2005-11-19 03:02:58 +08:00
|
|
|
}
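The contract described in the commit message above is what libcurl keys on: because `eltsize` is always 1 in practice, returning `nmemb` reports every byte as consumed, while any short count makes curl abort the transfer. A hypothetical callback that deliberately fails, as a minimal sketch of the failure side of that contract (not part of http.c):

	static size_t fwrite_abort(char *ptr UNUSED, size_t eltsize UNUSED,
				   size_t nmemb UNUSED, void *data UNUSED)
	{
		/* Returning a short item count (here zero) makes curl stop
		 * the transfer and report CURLE_WRITE_ERROR to the caller. */
		return 0;
	}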
|
|
|
|
|
2023-02-28 01:20:19 +08:00
|
|
|
/*
|
|
|
|
* A folded header continuation line starts with one or more spaces or
|
|
|
|
* horizontal tab characters (SP or HTAB) as per RFC 7230 section 3.2.
|
|
|
|
* It is not a continuation line if the line starts with any other character.
|
|
|
|
*/
|
|
|
|
static inline int is_hdr_continuation(const char *ptr, const size_t size)
|
|
|
|
{
|
|
|
|
return size && (*ptr == ' ' || *ptr == '\t');
|
|
|
|
}
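A few illustrative calls, with assumed inputs, to pin down the rule above:

	/*
	 * is_hdr_continuation("  error=...", 11) -> 1 (leading SP)
	 * is_hdr_continuation("\trealm=...", 10) -> 1 (leading HTAB)
	 * is_hdr_continuation("Date: ...", 9)    -> 0 (new header field)
	 * is_hdr_continuation(ptr, 0)            -> 0 (empty, size check)
	 */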
|
|
|
|
|
2023-07-03 14:44:05 +08:00
|
|
|
static size_t fwrite_wwwauth(char *ptr, size_t eltsize, size_t nmemb, void *p UNUSED)
|
2023-02-28 01:20:19 +08:00
|
|
|
{
|
|
|
|
size_t size = eltsize * nmemb;
|
|
|
|
struct strvec *values = &http_auth.wwwauth_headers;
|
|
|
|
struct strbuf buf = STRBUF_INIT;
|
|
|
|
const char *val;
|
|
|
|
size_t val_len;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Header lines may not come NUL-terminated from libcurl, so we must
|
|
|
|
* limit all scans to the maximum length of the header line, or leverage
|
|
|
|
* strbufs for all operations.
|
|
|
|
*
|
|
|
|
* In addition, it is possible that header values can be split over
|
|
|
|
* multiple lines as per RFC 7230. 'Line folding' has been deprecated
|
|
|
|
* but older servers may still emit folded lines. A continuation header field
|
|
|
|
* value is identified as starting with a space or horizontal tab.
|
|
|
|
*
|
|
|
|
* The formal definition of a header field as given in RFC 7230 is:
|
|
|
|
*
|
|
|
|
* header-field = field-name ":" OWS field-value OWS
|
|
|
|
*
|
|
|
|
* field-name = token
|
|
|
|
* field-value = *( field-content / obs-fold )
|
|
|
|
* field-content = field-vchar [ 1*( SP / HTAB ) field-vchar ]
|
|
|
|
* field-vchar = VCHAR / obs-text
|
|
|
|
*
|
|
|
|
* obs-fold = CRLF 1*( SP / HTAB )
|
|
|
|
* ; obsolete line folding
|
|
|
|
* ; see Section 3.2.4
|
|
|
|
*/
|
|
|
|
|
|
|
|
/* Start of a new WWW-Authenticate header */
|
|
|
|
if (skip_iprefix_mem(ptr, size, "www-authenticate:", &val, &val_len)) {
|
|
|
|
strbuf_add(&buf, val, val_len);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* Strip the CRLF that should be present at the end of each
|
|
|
|
* field as well as any trailing or leading whitespace from the
|
|
|
|
* value.
|
|
|
|
*/
|
|
|
|
strbuf_trim(&buf);
|
|
|
|
|
|
|
|
strvec_push(values, buf.buf);
|
|
|
|
http_auth.header_is_last_match = 1;
|
|
|
|
goto exit;
|
|
|
|
}
|
|
|
|
|
|
|
|
/*
|
|
|
|
* This line could be a continuation of the previously matched header
|
|
|
|
* field. If this is the case then we should append this value to the
|
|
|
|
* end of the previously consumed value.
|
|
|
|
*/
|
|
|
|
if (http_auth.header_is_last_match && is_hdr_continuation(ptr, size)) {
|
|
|
|
/*
|
|
|
|
* Trim the CRLF and any leading or trailing whitespace from this line.
|
|
|
|
*/
|
|
|
|
strbuf_add(&buf, ptr, size);
|
|
|
|
strbuf_trim(&buf);
|
|
|
|
|
|
|
|
/*
|
|
|
|
* At this point we should always have at least one existing
|
|
|
|
* value, even if it is empty. Do not bother appending the new
|
|
|
|
* value if this continuation header is itself empty.
|
|
|
|
*/
|
|
|
|
if (!values->nr) {
|
|
|
|
BUG("should have at least one existing header value");
|
|
|
|
} else if (buf.len) {
|
|
|
|
char *prev = xstrdup(values->v[values->nr - 1]);
|
|
|
|
|
|
|
|
/* Join two non-empty values with a single space. */
|
|
|
|
const char *const sp = *prev ? " " : "";
|
|
|
|
|
|
|
|
strvec_pop(values);
|
|
|
|
strvec_pushf(values, "%s%s%s", prev, sp, buf.buf);
|
|
|
|
free(prev);
|
|
|
|
}
|
|
|
|
|
|
|
|
goto exit;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* Not a continuation of a previously matched auth header line. */
|
|
|
|
http_auth.header_is_last_match = 0;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* If this is an HTTP status line and not a header field, it signals
|
|
|
|
* a different HTTP response. libcurl hands us the headers of every
|
|
|
|
* response it processes, including redirects.
|
|
|
|
* We only care about the headers of the last response, so clear
|
|
|
|
* the existing array.
|
|
|
|
*/
|
|
|
|
if (skip_iprefix_mem(ptr, size, "http/", &val, &val_len))
|
|
|
|
strvec_clear(values);
|
|
|
|
|
|
|
|
exit:
|
|
|
|
strbuf_release(&buf);
|
|
|
|
return size;
|
|
|
|
}
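To make the joining behaviour concrete, here is an assumed example (illustrative only, not taken from a real exchange). Given a folded header

	WWW-Authenticate: Bearer realm="example",
	  error="invalid_token"

the first line is trimmed and pushed onto `http_auth.wwwauth_headers` as `Bearer realm="example",`; the continuation line is then trimmed, the previous value popped, and the two joined with a single space, leaving `Bearer realm="example", error="invalid_token"` as the stored value.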
|
|
|
|
|
2023-07-03 14:44:05 +08:00
|
|
|
size_t fwrite_null(char *ptr UNUSED, size_t eltsize UNUSED, size_t nmemb,
|
|
|
|
void *data UNUSED)
|
2005-11-19 03:02:58 +08:00
|
|
|
{
|
Make fread/fwrite-like functions in http.c more like fread/fwrite.
The fread/fwrite-like functions in http.c, namely fread_buffer,
fwrite_buffer, fwrite_null, fwrite_sha1_file all return the
multiplication of the size and number of items they are being given.
Practically speaking, it doesn't matter, because in all contexts where
those functions are used, size is 1.
But those functions being similar to fread and fwrite (the curl API is
designed around being able to use fread and fwrite directly), it might
be preferable to make them behave like fread and fwrite, which, from
the fread/fwrite manual page, is:
On success, fread() and fwrite() return the number of items read
or written. This number equals the number of bytes transferred
only when size is 1. If an error occurs, or the end of the file
is reached, the return value is a short item count (or zero).
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-05-08 07:03:54 +08:00
|
|
|
return nmemb;
|
2005-11-19 03:02:58 +08:00
|
|
|
}
|
|
|
|
|
2024-04-17 08:02:27 +08:00
|
|
|
static struct curl_slist *object_request_headers(void)
|
|
|
|
{
|
|
|
|
return curl_slist_append(http_copy_default_headers(), "Pragma:");
|
|
|
|
}
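(Appending a header name with a bare colon and no value, as done here, is curl's idiom for suppressing an internally generated header of the same name; older libcurl versions sent `Pragma: no-cache` on every request by default.)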
|
|
|
|
|
2015-01-15 07:40:46 +08:00
|
|
|
static void closedown_active_slot(struct active_request_slot *slot)
|
|
|
|
{
|
|
|
|
active_requests--;
|
|
|
|
slot->in_use = 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
static void finish_active_slot(struct active_request_slot *slot)
|
|
|
|
{
|
|
|
|
closedown_active_slot(slot);
|
|
|
|
curl_easy_getinfo(slot->curl, CURLINFO_HTTP_CODE, &slot->http_code);
|
|
|
|
|
2022-05-03 00:50:37 +08:00
|
|
|
if (slot->finished)
|
2015-01-15 07:40:46 +08:00
|
|
|
(*slot->finished) = 1;
|
|
|
|
|
|
|
|
/* Store slot results so they can be read after the slot is reused */
|
2022-05-03 00:50:37 +08:00
|
|
|
if (slot->results) {
|
2015-01-15 07:40:46 +08:00
|
|
|
slot->results->curl_result = slot->curl_result;
|
|
|
|
slot->results->http_code = slot->http_code;
|
|
|
|
curl_easy_getinfo(slot->curl, CURLINFO_HTTPAUTH_AVAIL,
|
|
|
|
&slot->results->auth_avail);
|
http: use credential API to handle proxy authentication
Currently, the only way to pass proxy credentials to curl is by including them
in the proxy URL. Usually, this means they will end up on disk unencrypted, one
way or another (by inclusion in ~/.gitconfig, shell profile or history). Since
proxy authentication often uses a domain user, credentials can be security
sensitive; therefore, a safer way of passing credentials is desirable.
If the configured proxy contains a username but not a password, query the
credential API for one. Also, make sure we approve/reject proxy credentials
properly.
For consistency reasons, add parsing of http_proxy/https_proxy/all_proxy
environment variables, which would otherwise be evaluated as a fallback by curl.
Without this, we would have different semantics for git configuration and
environment variables.
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Knut Franke <k.franke@science-computing.de>
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-26 21:02:48 +08:00
|
|
|
|
|
|
|
curl_easy_getinfo(slot->curl, CURLINFO_HTTP_CONNECTCODE,
|
|
|
|
&slot->results->http_connectcode);
|
2015-01-15 07:40:46 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
/* Run callback if appropriate */
|
2022-05-03 00:50:37 +08:00
|
|
|
if (slot->callback_func)
|
2015-01-15 07:40:46 +08:00
|
|
|
slot->callback_func(slot->callback_data);
|
|
|
|
}
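Because slots are recycled, a caller that needs the outcome must point the slot at storage it owns before starting the request; that is what the copy in finish_active_slot() serves. A sketch of the typical calling pattern, assuming the slot API declared alongside this code (`get_active_slot()`, `start_active_slot()`, `run_active_slot()` and `struct slot_results`):

	struct active_request_slot *slot = get_active_slot();
	struct slot_results results;

	slot->results = &results;
	/* ... set request options on slot->curl ... */
	if (start_active_slot(slot)) {
		run_active_slot(slot);
		/* the slot may be reused now, but "results" is still ours */
		if (results.curl_result != CURLE_OK)
			error("request failed with HTTP code %ld",
			      results.http_code);
	}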
|
|
|
|
|
2016-09-13 08:25:56 +08:00
|
|
|
static void xmulti_remove_handle(struct active_request_slot *slot)
|
|
|
|
{
|
|
|
|
curl_multi_remove_handle(curlm, slot->curl);
|
|
|
|
}
|
|
|
|
|
2005-11-19 03:02:58 +08:00
|
|
|
static void process_curl_messages(void)
|
|
|
|
{
|
|
|
|
int num_messages;
|
|
|
|
struct active_request_slot *slot;
|
|
|
|
CURLMsg *curl_message = curl_multi_info_read(curlm, &num_messages);
|
|
|
|
|
|
|
|
while (curl_message != NULL) {
|
|
|
|
if (curl_message->msg == CURLMSG_DONE) {
|
|
|
|
int curl_result = curl_message->data.result;
|
|
|
|
slot = active_queue_head;
|
|
|
|
while (slot != NULL &&
|
|
|
|
slot->curl != curl_message->easy_handle)
|
|
|
|
slot = slot->next;
|
2022-05-03 00:50:37 +08:00
|
|
|
if (slot) {
|
2016-09-13 08:25:56 +08:00
|
|
|
xmulti_remove_handle(slot);
|
2005-11-19 03:02:58 +08:00
|
|
|
slot->curl_result = curl_result;
|
|
|
|
finish_active_slot(slot);
|
|
|
|
} else {
|
|
|
|
fprintf(stderr, "Received DONE message for unknown request!\n");
|
|
|
|
}
|
|
|
|
} else {
|
|
|
|
fprintf(stderr, "Unknown CURL message received: %d\n",
|
|
|
|
(int)curl_message->msg);
|
|
|
|
}
|
|
|
|
curl_message = curl_multi_info_read(curlm, &num_messages);
|
|
|
|
}
|
|
|
|
}
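This function only sees completed transfers after the multi handle has been driven; a sketch of the kind of pump loop that feeds it (an illustration of the pattern, not a quote of the actual driver in this file):

	int num_transfers;

	while (curl_multi_perform(curlm, &num_transfers) ==
	       CURLM_CALL_MULTI_PERFORM)
		; /* keep pumping until curl has no immediate work left */
	process_curl_messages();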
|
|
|
|
|
config: add ctx arg to config_fn_t
Add a new "const struct config_context *ctx" arg to config_fn_t to hold
additional information about the config iteration operation.
config_context has a "struct key_value_info kvi" member that holds
metadata about the config source being read (e.g. what kind of config
source it is, the filename, etc). In this series, we're only interested
in .kvi, so we could have just used "struct key_value_info" as an arg,
but config_context makes it possible to add/adjust members in the future
without changing the config_fn_t signature. We could also consider other
ways of organizing the args (e.g. moving the config name and value into
config_context or key_value_info), but in my experiments, the
incremental benefit doesn't justify the added complexity (e.g. a
config_fn_t will sometimes invoke another config_fn_t but with a
different config value).
In subsequent commits, the .kvi member will replace the global "struct
config_reader" in config.c, making config iteration a global-free
operation. It requires much more work for the machinery to provide
meaningful values of .kvi, so for now, merely change the signature and
call sites, pass NULL as a placeholder value, and don't rely on the arg
in any meaningful way.
Most of the changes are performed by
contrib/coccinelle/config_fn_ctx.pending.cocci, which, for every
config_fn_t:
- Modifies the signature to accept "const struct config_context *ctx"
- Passes "ctx" to any inner config_fn_t, if needed
- Adds UNUSED attributes to "ctx", if needed
Most config_fn_t instances are easily identified by seeing if they are
called by the various config functions. Most of the remaining ones are
manually named in the .cocci patch. Manual cleanups are still needed,
but the majority of it is trivial; it's either adjusting config_fn_t
that the .cocci patch didn't catch, or adding forward declarations of
"struct config_context ctx" to make the signatures make sense.
The non-trivial changes are in cases where we are invoking a config_fn_t
outside of config machinery, and we now need to decide what value of
"ctx" to pass. These cases are:
- trace2/tr2_cfg.c:tr2_cfg_set_fl()
This is indirectly called by git_config_set() so that the trace2
machinery can notice the new config values and update its settings
using the tr2 config parsing function, i.e. tr2_cfg_cb().
- builtin/checkout.c:checkout_main()
This calls git_xmerge_config() as a shorthand for parsing a CLI arg.
This might be worth refactoring away in the future, since
git_xmerge_config() can call git_default_config(), which can do much
more than just parsing.
Handle them by creating a KVI_INIT macro that initializes "struct
key_value_info" to a reasonable default, and use that to construct the
"ctx" arg.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-29 03:26:22 +08:00
|
|
|
static int http_options(const char *var, const char *value,
|
|
|
|
const struct config_context *ctx, void *data)
|
2005-11-19 03:02:58 +08:00
|
|
|
{
|
2018-11-09 11:44:14 +08:00
|
|
|
if (!strcmp("http.version", var)) {
|
|
|
|
return git_config_string(&curl_http_version, var, value);
|
|
|
|
}
|
2005-11-19 03:02:58 +08:00
|
|
|
if (!strcmp("http.sslverify", var)) {
|
http_init(): Fix config file parsing
We honor the command line options, environment variables, variables in
repository configuration file, variables in user's global configuration
file, variables in the system configuration file, and then finally use
built-in default. To implement this semantics, the code should:
- start from built-in default values;
- call git_config() with the configuration parser callback, which
implements "later definition overrides earlier ones" logic
(git_config() reads the system's, user's and then repository's
configuration file in this order);
- override the result from the above with environment variables if set;
- override the result from the above with command line options.
The initialization code http_init() for http transfer got this wrong, and
implemented a "first one wins, ignoring the later ones" in http_options(),
to compensate this mistake, read environment variables before calling
git_config(). This is all wrong.
As a second class citizen, the http codepath hasn't been audited as
closely as other parts of the system, but we should try to bring sanity to
it, before inviting contributors to improve on it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-10 10:00:30 +08:00
|
|
|
curl_ssl_verify = git_config_bool(var, value);
|
2005-11-19 03:02:58 +08:00
|
|
|
return 0;
|
|
|
|
}
|
2015-05-08 21:22:15 +08:00
|
|
|
if (!strcmp("http.sslcipherlist", var))
|
|
|
|
return git_config_string(&ssl_cipherlist, var, value);
|
2015-08-15 03:37:43 +08:00
|
|
|
if (!strcmp("http.sslversion", var))
|
|
|
|
return git_config_string(&ssl_version, var, value);
|
http_init(): Fix config file parsing
We honor the command line options, environment variables, variables in
repository configuration file, variables in user's global configuration
file, variables in the system configuration file, and then finally use
built-in default. To implement this semantics, the code should:
- start from built-in default values;
- call git_config() with the configuration parser callback, which
implements "later definition overrides earlier ones" logic
(git_config() reads the system's, user's and then repository's
configuration file in this order);
- override the result from the above with environment variables if set;
- override the result from the above with command line options.
The initialization code http_init() for http transfer got this wrong, and
implemented a "first one wins, ignoring the later ones" in http_options(),
to compensate this mistake, read environment variables before calling
git_config(). This is all wrong.
As a second class citizen, the http codepath hasn't been audited as
closely as other parts of the system, but we should try to bring sanity to
it, before inviting contributors to improve on it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-10 10:00:30 +08:00
|
|
|
if (!strcmp("http.sslcert", var))
|
2024-05-27 19:46:15 +08:00
|
|
|
return git_config_pathname(&ssl_cert, var, value);
|
2023-03-20 23:48:49 +08:00
|
|
|
if (!strcmp("http.sslcerttype", var))
|
2024-05-27 19:46:39 +08:00
|
|
|
return git_config_string(&ssl_cert_type, var, value);
|
http_init(): Fix config file parsing
We honor the command line options, environment variables, variables in
repository configuration file, variables in user's global configuration
file, variables in the system configuration file, and then finally use
built-in default. To implement this semantics, the code should:
- start from built-in default values;
- call git_config() with the configuration parser callback, which
implements "later definition overrides earlier ones" logic
(git_config() reads the system's, user's and then repository's
configuration file in this order);
- override the result from the above with environment variables if set;
- override the result from the above with command line options.
The initialization code http_init() for http transfer got this wrong, and
implemented a "first one wins, ignoring the later ones" in http_options(),
to compensate this mistake, read environment variables before calling
git_config(). This is all wrong.
As a second class citizen, the http codepath hasn't been audited as
closely as other parts of the system, but we should try to bring sanity to
it, before inviting contributors to improve on it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-10 10:00:30 +08:00
|
|
|
if (!strcmp("http.sslkey", var))
|
2024-05-27 19:46:15 +08:00
|
|
|
return git_config_pathname(&ssl_key, var, value);
|
2023-03-20 23:48:49 +08:00
|
|
|
if (!strcmp("http.sslkeytype", var))
|
2024-05-27 19:46:39 +08:00
|
|
|
return git_config_string(&ssl_key_type, var, value);
|
http_init(): Fix config file parsing
We honor the command line options, environment variables, variables in
repository configuration file, variables in user's global configuration
file, variables in the system configuration file, and then finally use
built-in default. To implement this semantics, the code should:
- start from built-in default values;
- call git_config() with the configuration parser callback, which
implements "later definition overrides earlier ones" logic
(git_config() reads the system's, user's and then repository's
configuration file in this order);
- override the result from the above with environment variables if set;
- override the result from the above with command line options.
The initialization code http_init() for http transfer got this wrong, and
implemented a "first one wins, ignoring the later ones" in http_options(),
to compensate this mistake, read environment variables before calling
git_config(). This is all wrong.
As a second class citizen, the http codepath hasn't been audited as
closely as other parts of the system, but we should try to bring sanity to
it, before inviting contributors to improve on it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-10 10:00:30 +08:00
|
|
|
if (!strcmp("http.sslcapath", var))
|
2024-05-27 19:46:15 +08:00
|
|
|
return git_config_pathname(&ssl_capath, var, value);
|
http_init(): Fix config file parsing
We honor the command line options, environment variables, variables in
repository configuration file, variables in user's global configuration
file, variables in the system configuration file, and then finally use
built-in default. To implement this semantics, the code should:
- start from built-in default values;
- call git_config() with the configuration parser callback, which
implements "later definition overrides earlier ones" logic
(git_config() reads the system's, user's and then repository's
configuration file in this order);
- override the result from the above with environment variables if set;
- override the result from the above with command line options.
The initialization code http_init() for http transfer got this wrong, and
implemented a "first one wins, ignoring the later ones" in http_options(),
to compensate this mistake, read environment variables before calling
git_config(). This is all wrong.
As a second class citizen, the http codepath hasn't been audited as
closely as other parts of the system, but we should try to bring sanity to
it, before inviting contributors to improve on it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-10 10:00:30 +08:00
|
|
|
if (!strcmp("http.sslcainfo", var))
|
2024-05-27 19:46:15 +08:00
|
|
|
return git_config_pathname(&ssl_cainfo, var, value);
|
2009-05-28 11:16:03 +08:00
|
|
|
if (!strcmp("http.sslcertpasswordprotected", var)) {
|
2013-07-13 02:52:47 +08:00
|
|
|
ssl_cert_password_required = git_config_bool(var, value);
|
2009-05-28 11:16:03 +08:00
|
|
|
return 0;
|
|
|
|
}
|
2013-04-08 03:10:39 +08:00
|
|
|
if (!strcmp("http.ssltry", var)) {
|
|
|
|
curl_ssl_try = git_config_bool(var, value);
|
|
|
|
return 0;
|
|
|
|
}
|
2018-10-15 18:14:43 +08:00
|
|
|
if (!strcmp("http.sslbackend", var)) {
|
|
|
|
free(http_ssl_backend);
|
|
|
|
http_ssl_backend = xstrdup_or_null(value);
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2018-10-26 02:53:55 +08:00
|
|
|
if (!strcmp("http.schannelcheckrevoke", var)) {
|
|
|
|
http_schannel_check_revoke = git_config_bool(var, value);
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2018-10-26 02:53:56 +08:00
|
|
|
if (!strcmp("http.schannelusesslcainfo", var)) {
|
|
|
|
http_schannel_use_ssl_cainfo = git_config_bool(var, value);
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2009-11-27 23:42:26 +08:00
|
|
|
if (!strcmp("http.minsessions", var)) {
|
config: pass kvi to die_bad_number()
Plumb "struct key_value_info" through all code paths that end in
die_bad_number(), which lets us remove the helper functions that read
analogous values from "struct config_reader". As a result, nothing reads
config_reader.config_kvi any more, so remove that too.
In config.c, this requires changing the signature of
git_configset_get_value() to 'return' "kvi" in an out parameter so that
git_configset_get_<type>() can pass it to git_config_<type>(). Only
numeric types will use "kvi", so for non-numeric types (e.g.
git_configset_get_string()), pass NULL to indicate that the out
parameter isn't needed.
Outside of config.c, config callbacks now need to pass "ctx->kvi" to any
of the git_config_<type>() functions that parse a config string into a
number type. Included is a .cocci patch to make that refactor.
The only exceptional case is builtin/config.c, where git_config_<type>()
is called outside of a config callback (namely, on user-provided input),
so config source information has never been available. In this case,
die_bad_number() defaults to a generic, but perfectly descriptive
message. Let's provide a safe, non-NULL value for "kvi" anyway, but make sure
not to change the message.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-29 03:26:27 +08:00
|
|
|
min_curl_sessions = git_config_int(var, value, ctx->kvi);
|
2009-11-27 23:42:26 +08:00
|
|
|
if (min_curl_sessions > 1)
|
|
|
|
min_curl_sessions = 1;
|
|
|
|
return 0;
|
|
|
|
}
|
2005-11-19 03:02:58 +08:00
|
|
|
if (!strcmp("http.maxrequests", var)) {
|
config: pass kvi to die_bad_number()
Plumb "struct key_value_info" through all code paths that end in
die_bad_number(), which lets us remove the helper functions that read
analogous values from "struct config_reader". As a result, nothing reads
config_reader.config_kvi any more, so remove that too.
In config.c, this requires changing the signature of
git_configset_get_value() to 'return' "kvi" in an out parameter so that
git_configset_get_<type>() can pass it to git_config_<type>(). Only
numeric types will use "kvi", so for non-numeric types (e.g.
git_configset_get_string()), pass NULL to indicate that the out
parameter isn't needed.
Outside of config.c, config callbacks now need to pass "ctx->kvi" to any
of the git_config_<type>() functions that parse a config string into a
number type. Included is a .cocci patch to make that refactor.
The only exceptional case is builtin/config.c, where git_config_<type>()
is called outside of a config callback (namely, on user-provided input),
so config source information has never been available. In this case,
die_bad_number() defaults to a generic, but perfectly descriptive
message. Let's provide a safe, non-NULL value for "kvi" anyway, but make sure
not to change the message.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-29 03:26:27 +08:00
|
|
|
max_requests = git_config_int(var, value, ctx->kvi);
|
2005-11-19 03:02:58 +08:00
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
if (!strcmp("http.lowspeedlimit", var)) {
|
config: pass kvi to die_bad_number()
Plumb "struct key_value_info" through all code paths that end in
die_bad_number(), which lets us remove the helper functions that read
analogous values from "struct config_reader". As a result, nothing reads
config_reader.config_kvi any more, so remove that too.
In config.c, this requires changing the signature of
git_configset_get_value() to 'return' "kvi" in an out parameter so that
git_configset_get_<type>() can pass it to git_config_<type>(). Only
numeric types will use "kvi", so for non-numeric types (e.g.
git_configset_get_string()), pass NULL to indicate that the out
parameter isn't needed.
Outside of config.c, config callbacks now need to pass "ctx->kvi" to any
of the git_config_<type>() functions that parse a config string into a
number type. Included is a .cocci patch to make that refactor.
The only exceptional case is builtin/config.c, where git_config_<type>()
is called outside of a config callback (namely, on user-provided input),
so config source information has never been available. In this case,
die_bad_number() defaults to a generic, but perfectly descriptive
message. Let's provide a safe, non-NULL for "kvi" anyway, but make sure
not to change the message.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-29 03:26:27 +08:00
|
|
|
curl_low_speed_limit = (long)git_config_int(var, value, ctx->kvi);
|
2005-11-19 03:02:58 +08:00
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
if (!strcmp("http.lowspeedtime", var)) {
|
config: pass kvi to die_bad_number()
Plumb "struct key_value_info" through all code paths that end in
die_bad_number(), which lets us remove the helper functions that read
analogous values from "struct config_reader". As a result, nothing reads
config_reader.config_kvi any more, so remove that too.
In config.c, this requires changing the signature of
git_configset_get_value() to 'return' "kvi" in an out parameter so that
git_configset_get_<type>() can pass it to git_config_<type>(). Only
numeric types will use "kvi", so for non-numeric types (e.g.
git_configset_get_string()), pass NULL to indicate that the out
parameter isn't needed.
Outside of config.c, config callbacks now need to pass "ctx->kvi" to any
of the git_config_<type>() functions that parse a config string into a
number type. Included is a .cocci patch to make that refactor.
The only exceptional case is builtin/config.c, where git_config_<type>()
is called outside of a config callback (namely, on user-provided input),
so config source information has never been available. In this case,
die_bad_number() defaults to a generic, but perfectly descriptive
message. Let's provide a safe, non-NULL value for "kvi" anyway, but make sure
not to change the message.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-29 03:26:27 +08:00
|
|
|
curl_low_speed_time = (long)git_config_int(var, value, ctx->kvi);
|
2005-11-19 03:02:58 +08:00
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2006-09-29 08:10:44 +08:00
|
|
|
if (!strcmp("http.noepsv", var)) {
|
|
|
|
curl_ftp_no_epsv = git_config_bool(var, value);
|
|
|
|
return 0;
|
|
|
|
}
|
http_init(): Fix config file parsing
We honor the command line options, environment variables, variables in
repository configuration file, variables in user's global configuration
file, variables in the system configuration file, and then finally use
built-in default. To implement this semantics, the code should:
- start from built-in default values;
- call git_config() with the configuration parser callback, which
implements "later definition overrides earlier ones" logic
(git_config() reads the system's, user's and then repository's
configuration file in this order);
- override the result from the above with environment variables if set;
- override the result from the above with command line options.
The initialization code http_init() for http transfer got this wrong, and
implemented a "first one wins, ignoring the later ones" in http_options(),
to compensate this mistake, read environment variables before calling
git_config(). This is all wrong.
As a second class citizen, the http codepath hasn't been audited as
closely as other parts of the system, but we should try to bring sanity to
it, before inviting contributors to improve on it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-10 10:00:30 +08:00
|
|
|
if (!strcmp("http.proxy", var))
|
2024-05-27 19:46:39 +08:00
|
|
|
return git_config_string(&curl_http_proxy, var, value);
|
2006-09-29 08:10:44 +08:00
|
|
|
|
2016-01-26 21:02:47 +08:00
|
|
|
if (!strcmp("http.proxyauthmethod", var))
|
2024-05-27 19:46:39 +08:00
|
|
|
return git_config_string(&http_proxy_authmethod, var, value);
|
2016-01-26 21:02:47 +08:00
|
|
|
|
2020-03-05 02:40:05 +08:00
|
|
|
if (!strcmp("http.proxysslcert", var))
|
2024-05-27 19:46:39 +08:00
|
|
|
return git_config_string(&http_proxy_ssl_cert, var, value);
|
2020-03-05 02:40:05 +08:00
|
|
|
|
|
|
|
if (!strcmp("http.proxysslkey", var))
|
2024-05-27 19:46:39 +08:00
|
|
|
return git_config_string(&http_proxy_ssl_key, var, value);
|
2020-03-05 02:40:05 +08:00
|
|
|
|
|
|
|
if (!strcmp("http.proxysslcainfo", var))
|
2024-05-27 19:46:39 +08:00
|
|
|
return git_config_string(&http_proxy_ssl_ca_info, var, value);
|
2020-03-05 02:40:05 +08:00
|
|
|
|
|
|
|
if (!strcmp("http.proxysslcertpasswordprotected", var)) {
|
|
|
|
proxy_ssl_cert_password_required = git_config_bool(var, value);
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2011-06-03 04:31:25 +08:00
|
|
|
if (!strcmp("http.cookiefile", var))
|
2016-05-05 02:42:15 +08:00
|
|
|
return git_config_pathname(&curl_cookie_file, var, value);
|
2013-07-24 06:40:17 +08:00
|
|
|
if (!strcmp("http.savecookies", var)) {
|
|
|
|
curl_save_cookies = git_config_bool(var, value);
|
|
|
|
return 0;
|
|
|
|
}
|
2011-06-03 04:31:25 +08:00
|
|
|
|
2009-10-31 08:47:41 +08:00
|
|
|
if (!strcmp("http.postbuffer", var)) {
|
config: pass kvi to die_bad_number()
Plumb "struct key_value_info" through all code paths that end in
die_bad_number(), which lets us remove the helper functions that read
analogous values from "struct config_reader". As a result, nothing reads
config_reader.config_kvi any more, so remove that too.
In config.c, this requires changing the signature of
git_configset_get_value() to 'return' "kvi" in an out parameter so that
git_configset_get_<type>() can pass it to git_config_<type>(). Only
numeric types will use "kvi", so for non-numeric types (e.g.
git_configset_get_string()), pass NULL to indicate that the out
parameter isn't needed.
Outside of config.c, config callbacks now need to pass "ctx->kvi" to any
of the git_config_<type>() functions that parse a config string into a
number type. Included is a .cocci patch to make that refactor.
The only exceptional case is builtin/config.c, where git_config_<type>()
is called outside of a config callback (namely, on user-provided input),
so config source information has never been available. In this case,
die_bad_number() defaults to a generic, but perfectly descriptive
message. Let's provide a safe, non-NULL value for "kvi" anyway, but make sure
not to change the message.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-29 03:26:27 +08:00
|
|
|
http_post_buffer = git_config_ssize_t(var, value, ctx->kvi);
|
2017-04-12 02:13:57 +08:00
|
|
|
if (http_post_buffer < 0)
|
2022-06-17 18:03:09 +08:00
|
|
|
warning(_("negative value for http.postBuffer; defaulting to %d"), LARGE_PACKET_MAX);
|
2009-10-31 08:47:41 +08:00
|
|
|
if (http_post_buffer < LARGE_PACKET_MAX)
|
|
|
|
http_post_buffer = LARGE_PACKET_MAX;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2010-08-12 04:40:38 +08:00
|
|
|
if (!strcmp("http.useragent", var))
|
2024-05-27 19:46:39 +08:00
|
|
|
return git_config_string(&user_agent, var, value);
|
2010-08-12 04:40:38 +08:00
|
|
|
|
2016-02-16 02:44:46 +08:00
|
|
|
if (!strcmp("http.emptyauth", var)) {
|
http: add an "auto" mode for http.emptyauth
This variable needs to be specified to make some types of
non-basic authentication work, but ideally this would just
work out of the box for everyone.
However, simply setting it to "1" by default introduces an
extra round-trip for cases where it _isn't_ useful. We end
up sending a bogus empty credential that the server rejects.
Instead, let's introduce an automatic mode, that works like
this:
1. We won't try to send the bogus credential on the first
request. We'll wait to get an HTTP 401, as usual.
2. After seeing an HTTP 401, the empty-auth hack will kick
in only when we know there is an auth method available
that might make use of it (i.e., something besides
"Basic" or "Digest").
That should make it work out of the box, without incurring
any extra round-trips for people hitting Basic-only servers.
This _does_ incur an extra round-trip if you really want to
use "Basic" but your server advertises other methods (the
emptyauth hack will kick in but fail, and then Git will
actually ask for a password).
The auto mode may incur an extra round-trip over setting
http.emptyauth=true, because part of the emptyauth hack is
to feed this blank password to curl even before we've made a
single request.
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-26 03:18:31 +08:00
|
|
|
if (value && !strcmp("auto", value))
|
|
|
|
curl_empty_auth = -1;
|
|
|
|
else
|
|
|
|
curl_empty_auth = git_config_bool(var, value);
|
2016-02-16 02:44:46 +08:00
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2016-09-29 02:01:34 +08:00
|
|
|
if (!strcmp("http.delegation", var)) {
|
2017-08-12 00:37:34 +08:00
|
|
|
#ifdef CURLGSSAPI_DELEGATION_FLAG
|
2016-09-29 02:01:34 +08:00
|
|
|
return git_config_string(&curl_deleg, var, value);
|
|
|
|
#else
|
|
|
|
warning(_("Delegation control is not supported with cURL < 7.22.0"));
|
|
|
|
return 0;
|
|
|
|
#endif
|
|
|
|
}
|
|
|
|
|
2016-02-15 22:04:22 +08:00
|
|
|
if (!strcmp("http.pinnedpubkey", var)) {
|
http: centralize the accounting of libcurl dependencies
As discussed in 644de29e220 (http: drop support for curl < 7.19.4,
2021-07-30) checking against LIBCURL_VERSION_NUM isn't as reliable as
checking specific symbols present in curl, as some distros have been
known to backport features.
However, while some of the curl_easy_setopt() arguments we rely on are
macros, others are enum, and we can't assume that those that are
macros won't change into enums in the future.
So we're still going to have to check LIBCURL_VERSION_NUM, but by
doing that in one central place and using a macro definition of our
own, anyone who's backporting features can define it themselves, and
thus have access to more modern curl features that they backported,
even if they didn't bump the LIBCURL_VERSION_NUM.
More importantly, as shown in a preceding commit doing these version
checks makes for hard to read and possibly buggy code, as shown by the
bug fixed there where we were conflating base 10 with base 16 when
comparing the version.
By doing them all in one place we'll hopefully reduce the chances of
such future mistakes, furthermore it now becomes easier to see at a
glance what the oldest supported version is, which makes it easier to
reason about any future deprecation similar to the recent
e48a623dea0 (Merge branch 'ab/http-drop-old-curl', 2021-08-24).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-13 22:51:28 +08:00
|
|
|
#ifdef GIT_CURL_HAVE_CURLOPT_PINNEDPUBLICKEY
|
2016-02-15 22:04:22 +08:00
|
|
|
return git_config_pathname(&ssl_pinnedkey, var, value);
|
|
|
|
#else
|
2021-09-13 22:51:27 +08:00
|
|
|
warning(_("Public key pinning not supported with cURL < 7.39.0"));
|
2016-02-15 22:04:22 +08:00
|
|
|
return 0;
|
|
|
|
#endif
|
|
|
|
}
|
2016-02-25 05:25:58 +08:00
|
|
|
|
2016-04-27 20:20:37 +08:00
|
|
|
if (!strcmp("http.extraheader", var)) {
|
|
|
|
if (!value) {
|
|
|
|
return config_error_nonbool(var);
|
|
|
|
} else if (!*value) {
|
remote-curl: unbreak http.extraHeader with custom allocators
In 93b980e58f5 (http: use xmalloc with cURL, 2019-08-15), we started to
ask cURL to use `xmalloc()`, and if compiled with nedmalloc, that means
implicitly a different allocator than the system one.
Which means that all of cURL's allocations and releases now _need_ to
use that allocator.
However, the `http_options()` function used `slist_append()` to add any
configured extra HTTP header(s) _before_ asking cURL to use `xmalloc()`,
and `http_cleanup()` would release them _afterwards_, i.e. in the
presence of custom allocators, cURL would attempt to use the wrong
allocator to release the memory.
A naïve attempt at fixing this would move the call to
`curl_global_init()` _before_ the config is parsed (i.e. before that
call to `slist_append()`).
However, that does not work, as we _also_ parse the config setting
`http.sslbackend` and if found, call `curl_global_sslset()` which *must*
be called before `curl_global_init()`, for details see:
https://curl.haxx.se/libcurl/c/curl_global_sslset.html
So let's instead make the config parsing entirely independent from
cURL's data structures. Incidentally, this deletes two more lines than
it introduces, which is nice.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-06 18:04:55 +08:00
|
|
|
string_list_clear(&extra_http_headers, 0);
|
2016-04-27 20:20:37 +08:00
|
|
|
} else {
|
remote-curl: unbreak http.extraHeader with custom allocators
In 93b980e58f5 (http: use xmalloc with cURL, 2019-08-15), we started to
ask cURL to use `xmalloc()`, and if compiled with nedmalloc, that means
implicitly a different allocator than the system one.
Which means that all of cURL's allocations and releases now _need_ to
use that allocator.
However, the `http_options()` function used `slist_append()` to add any
configured extra HTTP header(s) _before_ asking cURL to use `xmalloc()`,
and `http_cleanup()` would release them _afterwards_, i.e. in the
presence of custom allocators, cURL would attempt to use the wrong
allocator to release the memory.
A naïve attempt at fixing this would move the call to
`curl_global_init()` _before_ the config is parsed (i.e. before that
call to `slist_append()`).
However, that does not work, as we _also_ parse the config setting
`http.sslbackend` and if found, call `curl_global_sslset()` which *must*
be called before `curl_global_init()`, for details see:
https://curl.haxx.se/libcurl/c/curl_global_sslset.html
So let's instead make the config parsing entirely independent from
cURL's data structures. Incidentally, this deletes two more lines than
it introduces, which is nice.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-06 18:04:55 +08:00
|
|
|
string_list_append(&extra_http_headers, value);
|
2016-04-27 20:20:37 +08:00
|
|
|
}
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2022-05-16 16:38:51 +08:00
|
|
|
if (!strcmp("http.curloptresolve", var)) {
|
|
|
|
if (!value) {
|
|
|
|
return config_error_nonbool(var);
|
|
|
|
} else if (!*value) {
|
|
|
|
curl_slist_free_all(host_resolutions);
|
|
|
|
host_resolutions = NULL;
|
|
|
|
} else {
|
|
|
|
host_resolutions = curl_slist_append(host_resolutions, value);
|
|
|
|
}
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
http: make redirects more obvious
We instruct curl to always follow HTTP redirects. This is
convenient, but it creates opportunities for malicious
servers to create confusing situations. For instance,
imagine Alice is a git user with access to a private
repository on Bob's server. Mallory runs her own server and
wants to access objects from Bob's repository.
Mallory may try a few tricks that involve asking Alice to
clone from her, build on top, and then push the result:
1. Mallory may simply redirect all fetch requests to Bob's
server. Git will transparently follow those redirects
and fetch Bob's history, which Alice may believe she
got from Mallory. The subsequent push seems like it is
just feeding Mallory back her own objects, but is
actually leaking Bob's objects. There is nothing in
git's output to indicate that Bob's repository was
involved at all.
The downside (for Mallory) of this attack is that Alice
will have received Bob's entire repository, and is
likely to notice that when building on top of it.
2. If Mallory happens to know the sha1 of some object X in
Bob's repository, she can instead build her own history
that references that object. She then runs a dumb http
server, and Alice's client will fetch each object
individually. When it asks for X, Mallory redirects her
to Bob's server. The end result is that Alice obtains
objects from Bob, but they may be buried deep in
history. Alice is less likely to notice.
Both of these attacks are fairly hard to pull off. There's a
social component in getting Mallory to convince Alice to
work with her. Alice may be prompted for credentials in
accessing Bob's repository (but not always, if she is using
a credential helper that caches). Attack (1) requires a
certain amount of obliviousness on Alice's part while making
a new commit. Attack (2) requires that Mallory knows a sha1
in Bob's repository, that Bob's server supports dumb http,
and that the object in question is loose on Bob's server.
But we can probably make things a bit more obvious without
any loss of functionality. This patch does two things to
that end.
First, when we encounter a whole-repo redirect during the
initial ref discovery, we now inform the user on stderr,
making attack (1) much more obvious.
Second, the decision to follow redirects is now
configurable. The truly paranoid can set the new
http.followRedirects to false to avoid any redirection
entirely. But for a more practical default, we will disallow
redirects only after the initial ref discovery. This is
enough to thwart attacks similar to (2), while still
allowing the common use of redirects at the repository
level. Since c93c92f30 (http: update base URLs when we see
redirects, 2013-09-28) we re-root all further requests from
the redirect destination, which should generally mean that
no further redirection is necessary.
As an escape hatch, in case there really is a server that
needs to redirect individual requests, the user can set
http.followRedirects to "true" (and this can be done on a
per-server basis via http.*.followRedirects config).
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-07 02:24:41 +08:00
|
|
|
if (!strcmp("http.followredirects", var)) {
|
|
|
|
if (value && !strcmp(value, "initial"))
|
|
|
|
http_follow_config = HTTP_FOLLOW_INITIAL;
|
|
|
|
else if (git_config_bool(var, value))
|
|
|
|
http_follow_config = HTTP_FOLLOW_ALWAYS;
|
|
|
|
else
|
|
|
|
http_follow_config = HTTP_FOLLOW_NONE;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
http: allow authenticating proactively
When making a request over HTTP(S), Git only sends authentication if it
receives a 401 response. Thus, if a repository is open to the public
for reading, Git will typically never ask for authentication for fetches
and clones.
However, there may be times when a user would like to authenticate
nevertheless. For example, a forge may give higher rate limits to users
who authenticate because they are easier to contact in case of excessive
use. Or it may be useful for a known heavy user, such as an internal
service, to proactively authenticate so its use can be monitored and, if
necessary, throttled.
Let's make this possible with a new option, "http.proactiveAuth". This
option specifies a type of authentication which can be used to
authenticate against the host in question. This is necessary because we
lack the WWW-Authenticate header to provide us details; similarly, we
cannot accept certain types of authentication because we require
information from the server, such as a nonce or challenge, to
successfully authenticate.
If we're in auto mode and we got a username and password, set the
authentication scheme to Basic. libcurl will not send authentication
proactively unless there's a single choice of allowed authentication,
and we know in this case we didn't get an authtype entry telling us what
scheme to use, or we would have taken a different codepath and written
the header ourselves. In any event, of the other schemes that libcurl
supports, Digest and NTLM require a nonce or challenge, which means that
they cannot work with proactive auth, and GSSAPI does not use a username
and password at all, so Basic is the only logical choice among the
built-in options.
Note that the existing http_proactive_auth variable signifies proactive
auth if there are already credentials, which is different from the
functionality we're adding, which always seeks credentials even if none
are provided. Nonetheless, t5540 tests the existing behavior for
WebDAV-based pushes to an open repository without credentials, so we
preserve it. While at first this may seem an insecure and bizarre
decision, it may be that authentication is done with TLS certificates,
in which case it might actually provide quite a high level of security.
Expand the variable to use an enum to handle the additional cases and a
helper function to distinguish our new cases from the old ones.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-10 08:01:55 +08:00
|
|
|
if (!strcmp("http.proactiveauth", var)) {
|
|
|
|
if (!value)
|
|
|
|
return config_error_nonbool(var);
|
|
|
|
if (!strcmp(value, "auto"))
|
|
|
|
http_proactive_auth = PROACTIVE_AUTH_AUTO;
|
|
|
|
else if (!strcmp(value, "basic"))
|
|
|
|
http_proactive_auth = PROACTIVE_AUTH_BASIC;
|
|
|
|
else if (!strcmp(value, "none"))
|
|
|
|
http_proactive_auth = PROACTIVE_AUTH_NONE;
|
|
|
|
else
|
|
|
|
warning(_("Unknown value for http.proactiveauth"));
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2005-11-19 03:02:58 +08:00
|
|
|
/* Fall back on the default ones */
|
config: add ctx arg to config_fn_t
Add a new "const struct config_context *ctx" arg to config_fn_t to hold
additional information about the config iteration operation.
config_context has a "struct key_value_info kvi" member that holds
metadata about the config source being read (e.g. what kind of config
source it is, the filename, etc). In this series, we're only interested
in .kvi, so we could have just used "struct key_value_info" as an arg,
but config_context makes it possible to add/adjust members in the future
without changing the config_fn_t signature. We could also consider other
ways of organizing the args (e.g. moving the config name and value into
config_context or key_value_info), but in my experiments, the
incremental benefit doesn't justify the added complexity (e.g. a
config_fn_t will sometimes invoke another config_fn_t but with a
different config value).
In subsequent commits, the .kvi member will replace the global "struct
config_reader" in config.c, making config iteration a global-free
operation. It requires much more work for the machinery to provide
meaningful values of .kvi, so for now, merely change the signature and
call sites, pass NULL as a placeholder value, and don't rely on the arg
in any meaningful way.
Most of the changes are performed by
contrib/coccinelle/config_fn_ctx.pending.cocci, which, for every
config_fn_t:
- Modifies the signature to accept "const struct config_context *ctx"
- Passes "ctx" to any inner config_fn_t, if needed
- Adds UNUSED attributes to "ctx", if needed
Most config_fn_t instances are easily identified by seeing if they are
called by the various config functions. Most of the remaining ones are
manually named in the .cocci patch. Manual cleanups are still needed,
but the majority of it is trivial; it's either adjusting config_fn_t
that the .cocci patch didn't catch, or adding forward declarations of
"struct config_context ctx" to make the signatures make sense.
The non-trivial changes are in cases where we are invoking a config_fn_t
outside of config machinery, and we now need to decide what value of
"ctx" to pass. These cases are:
- trace2/tr2_cfg.c:tr2_cfg_set_fl()
This is indirectly called by git_config_set() so that the trace2
machinery can notice the new config values and update its settings
using the tr2 config parsing function, i.e. tr2_cfg_cb().
- builtin/checkout.c:checkout_main()
This calls git_xmerge_config() as a shorthand for parsing a CLI arg.
This might be worth refactoring away in the future, since
git_xmerge_config() can call git_default_config(), which can do much
more than just parsing.
Handle them by creating a KVI_INIT macro that initializes "struct
key_value_info" to a reasonable default, and use that to construct the
"ctx" arg.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-29 03:26:22 +08:00
|
|
|
return git_default_config(var, value, ctx, data);
|
2005-11-19 03:02:58 +08:00
|
|
|
}
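The `ctx` plumbing described above gives every callback the same shape as http_options(). A minimal sketch of a config_fn_t under the new signature, using hypothetical names (`demo.retries`, `demo_retries`) purely to illustrate the pattern:

	static int demo_retries;

	static int demo_options(const char *var, const char *value,
				const struct config_context *ctx, void *data)
	{
		if (!strcmp("demo.retries", var)) {
			/* numeric parsers take ctx->kvi so die_bad_number()
			 * can say where the bad value came from */
			demo_retries = git_config_int(var, value, ctx->kvi);
			return 0;
		}
		return git_default_config(var, value, ctx, data);
	}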
|
|
|
|
|
http: add an "auto" mode for http.emptyauth
This variable needs to be specified to make some types of
non-basic authentication work, but ideally this would just
work out of the box for everyone.
However, simply setting it to "1" by default introduces an
extra round-trip for cases where it _isn't_ useful. We end
up sending a bogus empty credential that the server rejects.
Instead, let's introduce an automatic mode, that works like
this:
1. We won't try to send the bogus credential on the first
request. We'll wait to get an HTTP 401, as usual.
2. After seeing an HTTP 401, the empty-auth hack will kick
in only when we know there is an auth method available
that might make use of it (i.e., something besides
"Basic" or "Digest").
That should make it work out of the box, without incurring
any extra round-trips for people hitting Basic-only servers.
This _does_ incur an extra round-trip if you really want to
use "Basic" but your server advertises other methods (the
emptyauth hack will kick in but fail, and then Git will
actually ask for a password).
The auto mode may incur an extra round-trip over setting
http.emptyauth=true, because part of the emptyauth hack is
to feed this blank password to curl even before we've made a
single request.
Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-26 03:18:31 +08:00
|
|
|
static int curl_empty_auth_enabled(void)
|
|
|
|
{
|
|
|
|
if (curl_empty_auth >= 0)
|
|
|
|
return curl_empty_auth;
|
|
|
|
|
|
|
|
/*
|
|
|
|
* In the automatic case, kick in the empty-auth
|
|
|
|
* hack as long as we would potentially try some
|
|
|
|
* method more exotic than "Basic" or "Digest".
|
|
|
|
*
|
|
|
|
* But only do this when this is our second or
|
|
|
|
* subsequent request, as by then we know what
|
|
|
|
* methods are available.
|
|
|
|
*/
|
|
|
|
if (http_auth_methods_restricted &&
|
|
|
|
(http_auth_methods & ~empty_auth_useless))
|
|
|
|
return 1;
|
|
|
|
return 0;
|
|
|
|
}
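A recap of the tri-state that the config parsing above establishes and this helper consumes:

	/*
	 * http.emptyauth = auto  -> curl_empty_auth = -1 (decide per request)
	 * http.emptyauth = true  -> curl_empty_auth =  1 (always enabled)
	 * http.emptyauth = false -> curl_empty_auth =  0 (never enabled)
	 */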
|
|
|
|
|
http: add support for authtype and credential
Now that we have the credential helper code set up to handle arbitrary
authentication schemes, let's add support for this in the HTTP code,
where we really want to use it. If we're using this new functionality,
don't set a username and password, and instead set a header wherever
we'd normally do so, including for proxy authentication.
Since we can now handle this case, ask the credential helper to enable
the appropriate capabilities.
Finally, if we're using the authtype value, set "Expect: 100-continue".
Any type of authentication that requires multiple rounds (such as NTLM
or Kerberos) requires a 100 Continue (if we're larger than
http.postBuffer) because otherwise we send the pack data before we're
authenticated, the push gets a 401 response, and we can't rewind the
stream. We don't know for certain what other custom schemes might
require this, the HTTP/1.1 standard has required handling this since
1999, the broken HTTP server for which we disabled this (Google's) is
now fixed and has been for some time, and libcurl has a 1-second
fallback in case the HTTP server is still broken. In addition, it is
not unreasonable to require compliance with a 25-year old standard to
use new Git features. For all of these reasons, do so here.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-17 08:02:32 +08:00
|
|
|
struct curl_slist *http_append_auth_header(const struct credential *c,
|
|
|
|
struct curl_slist *headers)
|
|
|
|
{
|
|
|
|
if (c->authtype && c->credential) {
|
|
|
|
struct strbuf auth = STRBUF_INIT;
|
|
|
|
strbuf_addf(&auth, "Authorization: %s %s",
|
|
|
|
c->authtype, c->credential);
|
|
|
|
headers = curl_slist_append(headers, auth.buf);
|
|
|
|
strbuf_release(&auth);
|
|
|
|
}
|
|
|
|
return headers;
|
|
|
|
}
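A sketch of a call site (assumed for illustration, not quoted from this file): the function returns the possibly-extended list, so the caller reassigns before handing it to curl:

	struct curl_slist *headers = http_copy_default_headers();

	headers = http_append_auth_header(&http_auth, headers);
	curl_easy_setopt(curl, CURLOPT_HTTPHEADER, headers);
	/* ... perform the request ... */
	curl_slist_free_all(headers);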
|
|
|
|
|
2009-03-10 14:34:25 +08:00
|
|
|
static void init_curl_http_auth(CURL *result)
|
|
|
|
{
|
http: add support for authtype and credential
Now that we have the credential helper code set up to handle arbitrary
authentication schemes, let's add support for this in the HTTP code,
where we really want to use it. If we're using this new functionality,
don't set a username and password, and instead set a header wherever
we'd normally do so, including for proxy authentication.
Since we can now handle this case, ask the credential helper to enable
the appropriate capabilities.
Finally, if we're using the authtype value, set "Expect: 100-continue".
Any type of authentication that requires multiple rounds (such as NTLM
or Kerberos) requires a 100 Continue (if we're larger than
http.postBuffer) because otherwise we send the pack data before we're
authenticated, the push gets a 401 response, and we can't rewind the
stream. We don't know for certain what other custom schemes might
require this, the HTTP/1.1 standard has required handling this since
1999, the broken HTTP server for which we disabled this (Google's) is
now fixed and has been for some time, and libcurl has a 1-second
fallback in case the HTTP server is still broken. In addition, it is
not unreasonable to require compliance with a 25-year old standard to
use new Git features. For all of these reasons, do so here.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-17 08:02:32 +08:00
|
|
|
if ((!http_auth.username || !*http_auth.username) &&
|
|
|
|
(!http_auth.credential || !*http_auth.credential)) {
|
http: allow authenticating proactively
When making a request over HTTP(S), Git only sends authentication if it
receives a 401 response. Thus, if a repository is open to the public
for reading, Git will typically never ask for authentication for fetches
and clones.
However, there may be times when a user would like to authenticate
nevertheless. For example, a forge may give higher rate limits to users
who authenticate because they are easier to contact in case of excessive
use. Or it may be useful for a known heavy user, such as an internal
service, to proactively authenticate so its use can be monitored and, if
necessary, throttled.
Let's make this possible with a new option, "http.proactiveAuth". This
option specifies a type of authentication which can be used to
authenticate against the host in question. This is necessary because we
lack the WWW-Authenticate header to provide us details; similarly, we
cannot accept certain types of authentication because we require
information from the server, such as a nonce or challenge, to
successfully authenticate.
If we're in auto mode and we got a username and password, set the
authentication scheme to Basic. libcurl will not send authentication
proactively unless there's a single choice of allowed authentication,
and we know in this case we didn't get an authtype entry telling us what
scheme to use, or we would have taken a different codepath and written
the header ourselves. In any event, of the other schemes that libcurl
supports, Digest and NTLM require a nonce or challenge, which means that
they cannot work with proactive auth, and GSSAPI does not use a username
and password at all, so Basic is the only logical choice among the
built-in options.
Note that the existing http_proactive_auth variable signifies proactive
auth if there are already credentials, which is different from the
functionality we're adding, which always seeks credentials even if none
are provided. Nonetheless, t5540 tests the existing behavior for
WebDAV-based pushes to an open repository without credentials, so we
preserve it. While at first this may seem an insecure and bizarre
decision, it may be that authentication is done with TLS certificates,
in which case it might actually provide a quite high level of security.
Expand the variable to use an enum to handle the additional cases and a
helper function to distinguish our new cases from the old ones.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-10 08:01:55 +08:00
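
/*
 * Example (sketch): opting in to the behavior described above. The
 * value names assumed here follow the commit message ("auto", which
 * defaults to basic, and "basic"):
 *
 *	git config http.proactiveAuth basic
 */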
		int empty_auth = curl_empty_auth_enabled();

		if ((empty_auth != -1 && !always_auth_proactively()) || empty_auth == 1) {
			curl_easy_setopt(result, CURLOPT_USERPWD, ":");
			return;
		} else if (!always_auth_proactively()) {
			return;
		} else if (http_proactive_auth == PROACTIVE_AUTH_BASIC) {
			strvec_push(&http_auth.wwwauth_headers, "Basic");
		}
	}

	credential_fill(&http_auth, 1);

	if (http_auth.password) {
		if (always_auth_proactively()) {
			/*
			 * We got a credential without an authtype and we don't
			 * know what's available. Since our only two options at
			 * the moment are auto (which defaults to basic) and
			 * basic, use basic for now.
			 */
			curl_easy_setopt(result, CURLOPT_HTTPAUTH, CURLAUTH_BASIC);
		}
		curl_easy_setopt(result, CURLOPT_USERNAME, http_auth.username);
		curl_easy_setopt(result, CURLOPT_PASSWORD, http_auth.password);
	}
}

/* *var must be free-able */
static void var_override(char **var, char *value)
{
	if (value) {
		free(*var);
		*var = xstrdup(value);
	}
}
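
/*
 * Example (taken from later in this file): let an environment variable
 * override a config-derived value:
 *
 *	var_override(&http_proxy_authmethod, getenv("GIT_HTTP_PROXY_AUTHMETHOD"));
 */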
http: use credential API to handle proxy authentication
Currently, the only way to pass proxy credentials to curl is by including them
in the proxy URL. Usually, this means they will end up on disk unencrypted, one
way or another (by inclusion in ~/.gitconfig, shell profile or history). Since
proxy authentication often uses a domain user, credentials can be security
sensitive; therefore, a safer way of passing credentials is desirable.
If the configured proxy contains a username but not a password, query the
credential API for one. Also, make sure we approve/reject proxy credentials
properly.
For consistency reasons, add parsing of http_proxy/https_proxy/all_proxy
environment variables, which would otherwise be evaluated as a fallback by curl.
Without this, we would have different semantics for git configuration and
environment variables.
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Knut Franke <k.franke@science-computing.de>
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-26 21:02:48 +08:00
static void set_proxyauth_name_password(CURL *result)
{
	if (proxy_auth.password) {
		curl_easy_setopt(result, CURLOPT_PROXYUSERNAME,
				 proxy_auth.username);
		curl_easy_setopt(result, CURLOPT_PROXYPASSWORD,
				 proxy_auth.password);
	} else if (proxy_auth.authtype && proxy_auth.credential) {
		curl_easy_setopt(result, CURLOPT_PROXYHEADER,
				 http_append_auth_header(&proxy_auth, NULL));
	}
}

static void init_curl_proxy_auth(CURL *result)
{
	if (proxy_auth.username) {
		if (!proxy_auth.password && !proxy_auth.credential)
			credential_fill(&proxy_auth, 1);
		set_proxyauth_name_password(result);
	}

	var_override(&http_proxy_authmethod, getenv("GIT_HTTP_PROXY_AUTHMETHOD"));

	if (http_proxy_authmethod) {
		int i;
		for (i = 0; i < ARRAY_SIZE(proxy_authmethods); i++) {
			if (!strcmp(http_proxy_authmethod, proxy_authmethods[i].name)) {
				curl_easy_setopt(result, CURLOPT_PROXYAUTH,
						 proxy_authmethods[i].curlauth_param);
				break;
			}
		}
		if (i == ARRAY_SIZE(proxy_authmethods)) {
			warning("unsupported proxy authentication method %s: using anyauth",
				http_proxy_authmethod);
			curl_easy_setopt(result, CURLOPT_PROXYAUTH, CURLAUTH_ANY);
		}
	}
	else
		curl_easy_setopt(result, CURLOPT_PROXYAUTH, CURLAUTH_ANY);
}
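
/*
 * Example (sketch): selecting a proxy and a specific authentication
 * method; proxy.example.com is a placeholder, and http.proxyAuthMethod
 * is assumed to be the config knob backing http_proxy_authmethod:
 *
 *	git config http.proxy http://user@proxy.example.com:8080
 *	git config http.proxyAuthMethod basic
 *	# or, per invocation:
 *	GIT_HTTP_PROXY_AUTHMETHOD=basic git fetch
 */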

static int has_cert_password(void)
{
	if (ssl_cert == NULL || ssl_cert_password_required != 1)
		return 0;
http: use credential API to get passwords
This patch converts the http code to use the new credential
API, both for http authentication as well as for getting
certificate passwords.
Most of the code change is simply variable naming (the
passwords are now contained inside the credential struct)
or deletion of obsolete code (the credential code handles
URL parsing and prompting for us).
The behavior should be the same, with one exception: the
credential code will prompt with a description based on the
credential components. Therefore, the old prompt of:
Username for 'example.com':
Password for 'example.com':
now looks like:
Username for 'https://example.com/repo.git':
Password for 'https://user@example.com/repo.git':
Note that we include more information in each line,
specifically:
1. We now include the protocol. While more noisy, this is
an important part of knowing what you are accessing
(especially if you care about http vs https).
2. We include the username in the password prompt. This is
not a big deal when you have just been prompted for it,
but the username may also come from the remote's URL
(and after future patches, from configuration or
credential helpers). In that case, it's a nice
reminder of the user for which you're giving the
password.
3. We include the path component of the URL. In many
cases, the user won't care about this and it's simply
noise (i.e., they'll use the same credential for a
whole site). However, that is part of a larger
question, which is whether path components should be
part of credential context, both for prompting and for
lookup by storage helpers. That issue will be addressed
as a whole in a future patch.
Similarly, for unlocking certificates, we used to say:
Certificate Password for 'example.com':
and we now say:
Password for 'cert:///path/to/certificate':
Showing the path to the client certificate makes more sense,
as that is what you are unlocking, not "example.com".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-12-10 18:31:21 +08:00
	if (!cert_auth.password) {
		cert_auth.protocol = xstrdup("cert");
		cert_auth.host = xstrdup("");
		cert_auth.username = xstrdup("");
		cert_auth.path = xstrdup(ssl_cert);
credential: gate new fields on capability
We support the new credential and authtype fields, but we lack a way to
indicate to a credential helper that we'd like them to be used. Without
some sort of indication, the credential helper doesn't know if it should
try to provide us a username and password, or a pre-encoded credential.
For example, the helper might prefer a more restricted Bearer token if
pre-encoded credentials are possible, but might have to fall back to
more general username and password if not.
Let's provide a simple way to indicate whether Git (or, for that matter,
the helper) is capable of understanding the authtype and credential
fields. We send this capability when we generate a request, and the
other side may reply to indicate to us that it does, too.
For now, don't enable sending capabilities for the HTTP code. In a
future commit, we'll introduce appropriate handling for that code,
which requires more in-depth work.
The logic for determining whether a capability is supported may seem
complex, but it is not. At each stage, we emit the capability to the
following stage if all preceding stages have declared it. Thus, if the
caller to git credential fill didn't declare it, then we won't send it
to the helper, and if fill's caller did send but the helper doesn't
understand it, then we won't send it on in the response. If we're an
internal user, then we know about all capabilities and will request
them.
For "git credential approve" and "git credential reject", we set the
helper capability before calling the helper, since we assume that the
input we're getting from the external program comes from a previous call
to "git credential fill", and thus we'll invoke send a capability to the
helper if and only if we got one from the standard input, which is the
correct behavior.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-17 08:02:29 +08:00
		credential_fill(&cert_auth, 0);
	}
	return 1;
}
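
/*
 * Example (sketch): has_cert_password() only kicks in when a client
 * certificate is configured and marked password-protected, which is
 * assumed to correspond to:
 *
 *	git config http.sslCert ~/.ssh/client.pem
 *	git config http.sslCertPasswordProtected true
 */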

#ifdef GIT_CURL_HAVE_CURLOPT_PROXY_KEYPASSWD
static int has_proxy_cert_password(void)
{
	if (http_proxy_ssl_cert == NULL || proxy_ssl_cert_password_required != 1)
		return 0;
	if (!proxy_cert_auth.password) {
		proxy_cert_auth.protocol = xstrdup("cert");
		proxy_cert_auth.host = xstrdup("");
		proxy_cert_auth.username = xstrdup("");
		proxy_cert_auth.path = xstrdup(http_proxy_ssl_cert);
		credential_fill(&proxy_cert_auth, 0);
	}
	return 1;
}
#endif

#ifdef GIT_CURL_HAVE_CURLOPT_TCP_KEEPALIVE
static void set_curl_keepalive(CURL *c)
{
	curl_easy_setopt(c, CURLOPT_TCP_KEEPALIVE, 1);
}

#else
static int sockopt_callback(void *client, curl_socket_t fd, curlsocktype type)
{
	int ka = 1;
	int rc;
	socklen_t len = (socklen_t)sizeof(ka);

	if (type != CURLSOCKTYPE_IPCXN)
		return 0;

	rc = setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, (void *)&ka, len);
	if (rc < 0)
		warning_errno("unable to set SO_KEEPALIVE on socket");

	return CURL_SOCKOPT_OK;
}

static void set_curl_keepalive(CURL *c)
{
	curl_easy_setopt(c, CURLOPT_SOCKOPTFUNCTION, sockopt_callback);
}
#endif
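
/*
 * Example (sketch): either variant above is wired up the same way when
 * a handle is being configured:
 *
 *	CURL *c = curl_easy_init();
 *	if (c)
 *		set_curl_keepalive(c);
 */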

/* Return 1 if redactions have been made, 0 otherwise. */
static int redact_sensitive_header(struct strbuf *header, size_t offset)
{
	int ret = 0;
	const char *sensitive_header;

	if (trace_curl_redact &&
	    (skip_iprefix(header->buf + offset, "Authorization:", &sensitive_header) ||
	     skip_iprefix(header->buf + offset, "Proxy-Authorization:", &sensitive_header))) {
		/* The first token is the type, which is OK to log */
		while (isspace(*sensitive_header))
			sensitive_header++;
		while (*sensitive_header && !isspace(*sensitive_header))
			sensitive_header++;
		/* Everything else is opaque and possibly sensitive */
		strbuf_setlen(header, sensitive_header - header->buf);
		strbuf_addstr(header, " <redacted>");
		ret = 1;
	} else if (trace_curl_redact &&
		   skip_iprefix(header->buf + offset, "Cookie:", &sensitive_header)) {
		struct strbuf redacted_header = STRBUF_INIT;
		const char *cookie;

		while (isspace(*sensitive_header))
			sensitive_header++;

		cookie = sensitive_header;

		while (cookie) {
			char *equals;
			char *semicolon = strstr(cookie, "; ");
			if (semicolon)
				*semicolon = 0;
			equals = strchrnul(cookie, '=');
			if (!equals) {
				/* invalid cookie, just append and continue */
				strbuf_addstr(&redacted_header, cookie);
				continue;
			}
			strbuf_add(&redacted_header, cookie, equals - cookie);
			strbuf_addstr(&redacted_header, "=<redacted>");
			if (semicolon) {
				/*
				 * There are more cookies. (Or, for some
				 * reason, the input string ends in "; ".)
				 */
				strbuf_addstr(&redacted_header, "; ");
				cookie = semicolon + strlen("; ");
			} else {
				cookie = NULL;
			}
		}

		strbuf_setlen(header, sensitive_header - header->buf);
		strbuf_addbuf(header, &redacted_header);
		strbuf_release(&redacted_header);
		ret = 1;
	}

	return ret;
}
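
/*
 * Example (illustrative): the function rewrites a traced header line in
 * place, e.g.:
 *
 *	"Authorization: Basic dXNlcjpwYXNz"
 *		-> "Authorization: Basic <redacted>"
 *	"Cookie: session=abc123; theme=dark"
 *		-> "Cookie: session=<redacted>; theme=<redacted>"
 */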

static int match_curl_h2_trace(const char *line, const char **out)
{
http: update curl http/2 info matching for curl 8.3.0
To redact header lines in http/2 curl traces, we have to parse past some
prefix bytes that curl sticks in the info lines it passes to us. That
changed once already, and we adapted in db30130165 (http: handle both
"h2" and "h2h3" in curl info lines, 2023-06-17).
Now it has changed again, in curl's fbacb14c4 (http2: cleanup trace
messages, 2023-08-04), which was released in curl 8.3.0. Running a build
of git linked against that version will fail to redact the trace (and as
before, t5559 notices and complains).
The format here is a little more complicated than the other ones, as it
now includes a "stream id". This is not constant but is always numeric,
so we can easily parse past it.
We'll continue to match the old versions, of course, since we want to
work with many different versions of curl. We can't even select one
format at compile time, because the behavior depends on the runtime
version of curl we use, not the version we build against.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-09-15 19:34:43 +08:00
	const char *p;

	/*
	 * curl prior to 8.1.0 gives us:
	 *
	 *   h2h3 [<header-name>: <header-val>]
	 *
	 * Starting in 8.1.0, the first token became just "h2".
	 */
	if (skip_iprefix(line, "h2h3 [", out) ||
	    skip_iprefix(line, "h2 [", out))
		return 1;

	/*
	 * curl 8.3.0 uses:
	 *
	 *   [HTTP/2] [<stream-id>] [<header-name>: <header-val>]
	 *
	 * where <stream-id> is numeric.
	 */
	if (skip_iprefix(line, "[HTTP/2] [", &p)) {
		while (isdigit(*p))
			p++;
		if (skip_prefix(p, "] [", out))
			return 1;
	}

	return 0;
}
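
/*
 * Example (illustrative) info lines this matcher accepts; on a match,
 * *out points at the "<header-name>: <header-val>" payload:
 *
 *	h2h3 [authorization: basic ...]			(curl < 8.1.0)
 *	h2 [authorization: basic ...]			(curl 8.1.0 - 8.2.x)
 *	[HTTP/2] [1] [authorization: basic ...]		(curl 8.3.0+)
 */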

/* Redact headers in info */
static void redact_sensitive_info_header(struct strbuf *header)
{
	const char *sensitive_header;

	if (trace_curl_redact &&
	    match_curl_h2_trace(header->buf, &sensitive_header)) {
		if (redact_sensitive_header(header, sensitive_header - header->buf)) {
			/* redaction ate our closing bracket */
			strbuf_addch(header, ']');
		}
	}
}

static void curl_dump_header(const char *text, unsigned char *ptr, size_t size, int hide_sensitive_header)
{
	struct strbuf out = STRBUF_INIT;
	struct strbuf **headers, **header;

	strbuf_addf(&out, "%s, %10.10ld bytes (0x%8.8lx)\n",
		    text, (long)size, (long)size);
	trace_strbuf(&trace_curl, &out);
	strbuf_reset(&out);
	strbuf_add(&out, ptr, size);
	headers = strbuf_split_max(&out, '\n', 0);

	for (header = headers; *header; header++) {
		if (hide_sensitive_header)
			redact_sensitive_header(*header, 0);
		strbuf_insertstr((*header), 0, text);
		strbuf_insertstr((*header), strlen(text), ": ");
		strbuf_rtrim((*header));
		strbuf_addch((*header), '\n');
		trace_strbuf(&trace_curl, (*header));
	}
	strbuf_list_free(headers);
	strbuf_release(&out);
}

static void curl_dump_data(const char *text, unsigned char *ptr, size_t size)
{
	size_t i;
	struct strbuf out = STRBUF_INIT;
	unsigned int width = 60;

	strbuf_addf(&out, "%s, %10.10ld bytes (0x%8.8lx)\n",
		    text, (long)size, (long)size);
	trace_strbuf(&trace_curl, &out);

	for (i = 0; i < size; i += width) {
		size_t w;

		strbuf_reset(&out);
		strbuf_addf(&out, "%s: ", text);
		for (w = 0; (w < width) && (i + w < size); w++) {
			unsigned char ch = ptr[i + w];

			strbuf_addch(&out,
				     (ch >= 0x20) && (ch < 0x80)
				     ? ch : '.');
		}
		strbuf_addch(&out, '\n');
		trace_strbuf(&trace_curl, &out);
	}
	strbuf_release(&out);
}
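
/*
 * Example (illustrative; the byte counts and payloads are made up): a
 * header dump emits a summary line followed by one line per header,
 * while a data dump prints printable bytes and replaces the rest with
 * '.':
 *
 *	=> Send header, 0000000045 bytes (0x0000002d)
 *	=> Send header: GET /info/refs HTTP/1.1
 *	<= Recv data: 001e# service=git-upload-pack.
 */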

static void curl_dump_info(char *data, size_t size)
{
	struct strbuf buf = STRBUF_INIT;

	strbuf_add(&buf, data, size);

	redact_sensitive_info_header(&buf);
	trace_printf_key(&trace_curl, "== Info: %s", buf.buf);

	strbuf_release(&buf);
}

static int curl_trace(CURL *handle UNUSED, curl_infotype type,
		      char *data, size_t size,
		      void *userp UNUSED)
{
	const char *text;
	enum { NO_FILTER = 0, DO_FILTER = 1 };

	switch (type) {
	case CURLINFO_TEXT:
		curl_dump_info(data, size);
		break;
	case CURLINFO_HEADER_OUT:
		text = "=> Send header";
		curl_dump_header(text, (unsigned char *)data, size, DO_FILTER);
		break;
	case CURLINFO_DATA_OUT:
		if (trace_curl_data) {
			text = "=> Send data";
			curl_dump_data(text, (unsigned char *)data, size);
		}
		break;
	case CURLINFO_SSL_DATA_OUT:
		if (trace_curl_data) {
			text = "=> Send SSL data";
			curl_dump_data(text, (unsigned char *)data, size);
		}
		break;
	case CURLINFO_HEADER_IN:
		text = "<= Recv header";
		curl_dump_header(text, (unsigned char *)data, size, NO_FILTER);
		break;
	case CURLINFO_DATA_IN:
		if (trace_curl_data) {
			text = "<= Recv data";
			curl_dump_data(text, (unsigned char *)data, size);
		}
		break;
	case CURLINFO_SSL_DATA_IN:
		if (trace_curl_data) {
			text = "<= Recv SSL data";
			curl_dump_data(text, (unsigned char *)data, size);
		}
		break;

	default: /* we ignore unknown types by default */
		return 0;
	}
	return 0;
}

void http_trace_curl_no_data(void)
{
	trace_override_envvar(&trace_curl, "1");
	trace_curl_data = 0;
}

void setup_curl_trace(CURL *handle)
{
	if (!trace_want(&trace_curl))
		return;
	curl_easy_setopt(handle, CURLOPT_VERBOSE, 1L);
	curl_easy_setopt(handle, CURLOPT_DEBUGFUNCTION, curl_trace);
	curl_easy_setopt(handle, CURLOPT_DEBUGDATA, NULL);
}
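
/*
 * Example (sketch): the tracing above is typically driven from the
 * environment:
 *
 *	GIT_TRACE_CURL=1 git fetch		# full trace, headers redacted
 *	GIT_TRACE_CURL_NO_DATA=1 git fetch	# omit the data dumps
 *	GIT_TRACE_REDACT=0 git fetch		# keep auth headers visible
 */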
http: support CURLOPT_PROTOCOLS_STR
The CURLOPT_PROTOCOLS (and matching CURLOPT_REDIR_PROTOCOLS) flag was
deprecated in curl 7.85.0, and using it generates compiler warnings as of
curl 7.87.0. The path forward is to use CURLOPT_PROTOCOLS_STR, but we
can't just do so unilaterally, as it was only introduced less than a
year ago in 7.85.0.
Until that version becomes ubiquitous, we have to either disable the
deprecation warning or conditionally use the "STR" variant on newer
versions of libcurl. This patch switches to the new variant, which is
nice for two reasons:
- we don't have to worry that silencing curl's deprecation warnings
might cause us to miss other more useful ones
- we'd eventually want to move to the new variant anyway, so this gets
us set up (albeit with some extra ugly boilerplate for the
conditional)
There are a lot of ways to split up the two cases. One way would be to
abstract the storage type (strbuf versus a long), how to append
(strbuf_addstr vs bitwise OR), how to initialize, which CURLOPT to use,
and so on. But the resulting code looks pretty magical:
GIT_CURL_PROTOCOL_TYPE allowed = GIT_CURL_PROTOCOL_TYPE_INIT;
if (...http is allowed...)
GIT_CURL_PROTOCOL_APPEND(&allowed, "http", CURLOPT_HTTP);
and you end up with more "#define GIT_CURL_PROTOCOL_TYPE" macros than
actual code.
On the other end of the spectrum, we could just implement two separate
functions, one that handles a string list and one that handles bits. But
then we end up repeating our list of protocols (http, https, ftp, ftps).
This patch takes the middle ground. The run-time code is always there to
handle both types, and we just choose which one to feed to curl.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-17 11:04:48 +08:00
static void proto_list_append(struct strbuf *list, const char *proto)
{
	if (!list)
		return;
	if (list->len)
		strbuf_addch(list, ',');
	strbuf_addstr(list, proto);
}

static long get_curl_allowed_protocols(int from_user, struct strbuf *list)
{
	long bits = 0;

	if (is_transport_allowed("http", from_user)) {
		bits |= CURLPROTO_HTTP;
		proto_list_append(list, "http");
	}
	if (is_transport_allowed("https", from_user)) {
		bits |= CURLPROTO_HTTPS;
		proto_list_append(list, "https");
	}
	if (is_transport_allowed("ftp", from_user)) {
		bits |= CURLPROTO_FTP;
		proto_list_append(list, "ftp");
	}
	if (is_transport_allowed("ftps", from_user)) {
		bits |= CURLPROTO_FTPS;
		proto_list_append(list, "ftps");
	}

	return bits;
}
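
/*
 * Example (sketch): with all four transports allowed, a caller gets
 * both representations and feeds whichever its runtime curl supports:
 *
 *	struct strbuf str = STRBUF_INIT;
 *	long bits = get_curl_allowed_protocols(0, &str);
 *	// bits == (CURLPROTO_HTTP | CURLPROTO_HTTPS |
 *	//          CURLPROTO_FTP | CURLPROTO_FTPS)
 *	// str.buf is "http,https,ftp,ftps"
 *	strbuf_release(&str);
 */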

#ifdef GIT_CURL_HAVE_CURL_HTTP_VERSION_2
static int get_curl_http_version_opt(const char *version_string, long *opt)
{
	int i;
	static struct {
		const char *name;
		long opt_token;
	} choice[] = {
		{ "HTTP/1.1", CURL_HTTP_VERSION_1_1 },
		{ "HTTP/2", CURL_HTTP_VERSION_2 }
	};

	for (i = 0; i < ARRAY_SIZE(choice); i++) {
		if (!strcmp(version_string, choice[i].name)) {
			*opt = choice[i].opt_token;
			return 0;
		}
	}

	warning("unknown value given to http.version: '%s'", version_string);
	return -1; /* not found */
}

#endif
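
/*
 * Example (sketch): the accepted strings map directly to the
 * http.version config named in the warning above:
 *
 *	git config http.version HTTP/2
 */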
|
|
|
|
|
2009-03-10 09:47:29 +08:00
|
|
|
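Users opt into a specific HTTP version via configuration; the value must match
a name in choice[] verbatim, e.g.:

	git config http.version HTTP/2

An unrecognized value triggers the warning above and leaves libcurl's default
version selection in effect.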
static CURL *get_curl_handle(void)
{
	CURL *result = curl_easy_init();

	if (!result)
		die("curl_easy_init failed");

	if (!curl_ssl_verify) {
		curl_easy_setopt(result, CURLOPT_SSL_VERIFYPEER, 0);
		curl_easy_setopt(result, CURLOPT_SSL_VERIFYHOST, 0);
	} else {
		/* Verify authenticity of the peer's certificate */
		curl_easy_setopt(result, CURLOPT_SSL_VERIFYPEER, 1);
		/* The name in the cert must match the host we tried to connect to */
		curl_easy_setopt(result, CURLOPT_SSL_VERIFYHOST, 2);
	}
#ifdef GIT_CURL_HAVE_CURL_HTTP_VERSION_2
	if (curl_http_version) {
		long opt;
		if (!get_curl_http_version_opt(curl_http_version, &opt)) {
			/* Set the HTTP version to use for the request */
			curl_easy_setopt(result, CURLOPT_HTTP_VERSION, opt);
		}
	}
#endif

	curl_easy_setopt(result, CURLOPT_NETRC, CURL_NETRC_OPTIONAL);
	curl_easy_setopt(result, CURLOPT_HTTPAUTH, CURLAUTH_ANY);

#ifdef CURLGSSAPI_DELEGATION_FLAG
	if (curl_deleg) {
		int i;
		for (i = 0; i < ARRAY_SIZE(curl_deleg_levels); i++) {
			if (!strcmp(curl_deleg, curl_deleg_levels[i].name)) {
				curl_easy_setopt(result, CURLOPT_GSSAPI_DELEGATION,
						 curl_deleg_levels[i].curl_deleg_param);
				break;
			}
		}
		if (i == ARRAY_SIZE(curl_deleg_levels))
			warning("Unknown delegation method '%s': using default",
				curl_deleg);
	}
#endif

	if (http_ssl_backend && !strcmp("schannel", http_ssl_backend) &&
	    !http_schannel_check_revoke) {
#ifdef GIT_CURL_HAVE_CURLSSLOPT_NO_REVOKE
		curl_easy_setopt(result, CURLOPT_SSL_OPTIONS, CURLSSLOPT_NO_REVOKE);
#else
		warning(_("CURLSSLOPT_NO_REVOKE not supported with cURL < 7.44.0"));
#endif
	}
http: allow authenticating proactively
When making a request over HTTP(S), Git only sends authentication if it
receives a 401 response. Thus, if a repository is open to the public
for reading, Git will typically never ask for authentication for fetches
and clones.
However, there may be times when a user would like to authenticate
nevertheless. For example, a forge may give higher rate limits to users
who authenticate because they are easier to contact in case of excessive
use. Or it may be useful for a known heavy user, such as an internal
service, to proactively authenticate so its use can be monitored and, if
necessary, throttled.
Let's make this possible with a new option, "http.proactiveAuth". This
option specifies a type of authentication which can be used to
authenticate against the host in question. This is necessary because we
lack the WWW-Authenticate header to provide us details; similarly, we
cannot accept certain types of authentication because we require
information from the server, such as a nonce or challenge, to
successfully authenticate.
If we're in auto mode and we got a username and password, set the
authentication scheme to Basic. libcurl will not send authentication
proactively unless there's a single choice of allowed authentication,
and we know in this case we didn't get an authtype entry telling us what
scheme to use, or we would have taken a different codepath and written
the header ourselves. In any event, of the other schemes that libcurl
supports, Digest and NTLM require a nonce or challenge, which means that
they cannot work with proactive auth, and GSSAPI does not use a username
and password at all, so Basic is the only logical choice among the
built-in options.
Note that the existing http_proactive_auth variable signifies proactive
auth if there are already credentials, which is different from the
functionality we're adding, which always seeks credentials even if none
are provided. Nonetheless, t5540 tests the existing behavior for
WebDAV-based pushes to an open repository without credentials, so we
preserve it. While at first this may seem an insecure and bizarre
decision, it may be that authentication is done with TLS certificates,
in which case it might actually provide a quite high level of security.
Expand the variable to use an enum to handle the additional cases and a
helper function to distinguish our new cases from the old ones.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-10 08:01:55 +08:00
	if (http_proactive_auth != PROACTIVE_AUTH_NONE)
		init_curl_http_auth(result);
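	/*
	 * Illustrative sketch only (not the body of init_curl_http_auth()):
	 * per the message above, proactive Basic auth boils down to forcing
	 * a single allowed scheme so that curl sends credentials without
	 * waiting for a 401, roughly:
	 *
	 *	curl_easy_setopt(result, CURLOPT_HTTPAUTH, CURLAUTH_BASIC);
	 *	curl_easy_setopt(result, CURLOPT_USERNAME, username);
	 *	curl_easy_setopt(result, CURLOPT_PASSWORD, password);
	 */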
	if (getenv("GIT_SSL_VERSION"))
		ssl_version = getenv("GIT_SSL_VERSION");
	if (ssl_version && *ssl_version) {
		int i;
		for (i = 0; i < ARRAY_SIZE(sslversions); i++) {
			if (!strcmp(ssl_version, sslversions[i].name)) {
				curl_easy_setopt(result, CURLOPT_SSLVERSION,
						 sslversions[i].ssl_version);
				break;
			}
		}
		if (i == ARRAY_SIZE(sslversions))
			warning("unsupported ssl version %s: using default",
				ssl_version);
	}

	if (getenv("GIT_SSL_CIPHER_LIST"))
		ssl_cipherlist = getenv("GIT_SSL_CIPHER_LIST");
	if (ssl_cipherlist != NULL && *ssl_cipherlist)
		curl_easy_setopt(result, CURLOPT_SSL_CIPHER_LIST,
				ssl_cipherlist);

	if (ssl_cert)
		curl_easy_setopt(result, CURLOPT_SSLCERT, ssl_cert);
	if (ssl_cert_type)
		curl_easy_setopt(result, CURLOPT_SSLCERTTYPE, ssl_cert_type);
http: use credential API to get passwords
This patch converts the http code to use the new credential
API, both for http authentication as well as for getting
certificate passwords.
Most of the code change is simply variable naming (the
passwords are now contained inside the credential struct)
or deletion of obsolete code (the credential code handles
URL parsing and prompting for us).
The behavior should be the same, with one exception: the
credential code will prompt with a description based on the
credential components. Therefore, the old prompt of:
Username for 'example.com':
Password for 'example.com':
now looks like:
Username for 'https://example.com/repo.git':
Password for 'https://user@example.com/repo.git':
Note that we include more information in each line,
specifically:
1. We now include the protocol. While more noisy, this is
an important part of knowing what you are accessing
(especially if you care about http vs https).
2. We include the username in the password prompt. This is
not a big deal when you have just been prompted for it,
but the username may also come from the remote's URL
(and after future patches, from configuration or
credential helpers). In that case, it's a nice
reminder of the user for which you're giving the
password.
3. We include the path component of the URL. In many
cases, the user won't care about this and it's simply
noise (i.e., they'll use the same credential for a
whole site). However, that is part of a larger
question, which is whether path components should be
part of credential context, both for prompting and for
lookup by storage helpers. That issue will be addressed
as a whole in a future patch.
Similarly, for unlocking certificates, we used to say:
Certificate Password for 'example.com':
and we now say:
Password for 'cert:///path/to/certificate':
Showing the path to the client certificate makes more sense,
as that is what you are unlocking, not "example.com".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-12-10 18:31:21 +08:00
	if (has_cert_password())
		curl_easy_setopt(result, CURLOPT_KEYPASSWD, cert_auth.password);
	if (ssl_key)
		curl_easy_setopt(result, CURLOPT_SSLKEY, ssl_key);
	if (ssl_key_type)
		curl_easy_setopt(result, CURLOPT_SSLKEYTYPE, ssl_key_type);
	if (ssl_capath)
		curl_easy_setopt(result, CURLOPT_CAPATH, ssl_capath);
#ifdef GIT_CURL_HAVE_CURLOPT_PINNEDPUBLICKEY
	if (ssl_pinnedkey)
		curl_easy_setopt(result, CURLOPT_PINNEDPUBLICKEY, ssl_pinnedkey);
#endif

	if (http_ssl_backend && !strcmp("schannel", http_ssl_backend) &&
	    !http_schannel_use_ssl_cainfo) {
		curl_easy_setopt(result, CURLOPT_CAINFO, NULL);
#ifdef GIT_CURL_HAVE_CURLOPT_PROXY_CAINFO
		curl_easy_setopt(result, CURLOPT_PROXY_CAINFO, NULL);
#endif
	} else if (ssl_cainfo != NULL || http_proxy_ssl_ca_info != NULL) {
		if (ssl_cainfo)
			curl_easy_setopt(result, CURLOPT_CAINFO, ssl_cainfo);
#ifdef GIT_CURL_HAVE_CURLOPT_PROXY_CAINFO
		if (http_proxy_ssl_ca_info)
			curl_easy_setopt(result, CURLOPT_PROXY_CAINFO, http_proxy_ssl_ca_info);
#endif
	}

	if (curl_low_speed_limit > 0 && curl_low_speed_time > 0) {
		curl_easy_setopt(result, CURLOPT_LOW_SPEED_LIMIT,
				 curl_low_speed_limit);
		curl_easy_setopt(result, CURLOPT_LOW_SPEED_TIME,
				 curl_low_speed_time);
	}

	curl_easy_setopt(result, CURLOPT_MAXREDIRS, 20);
	curl_easy_setopt(result, CURLOPT_POSTREDIR, CURL_REDIR_POST_ALL);
#ifdef GIT_CURL_HAVE_CURLOPT_PROTOCOLS_STR
	{
		struct strbuf buf = STRBUF_INIT;

		get_curl_allowed_protocols(0, &buf);
		curl_easy_setopt(result, CURLOPT_REDIR_PROTOCOLS_STR, buf.buf);
		strbuf_reset(&buf);

		get_curl_allowed_protocols(-1, &buf);
		curl_easy_setopt(result, CURLOPT_PROTOCOLS_STR, buf.buf);
		strbuf_release(&buf);
	}
#else
	curl_easy_setopt(result, CURLOPT_REDIR_PROTOCOLS,
			 get_curl_allowed_protocols(0, NULL));
	curl_easy_setopt(result, CURLOPT_PROTOCOLS,
			 get_curl_allowed_protocols(-1, NULL));
#endif
	if (getenv("GIT_CURL_VERBOSE"))
		http_trace_curl_no_data();
	setup_curl_trace(result);
	if (getenv("GIT_TRACE_CURL_NO_DATA"))
		trace_curl_data = 0;
	if (!git_env_bool("GIT_TRACE_REDACT", 1))
		trace_curl_redact = 0;

	curl_easy_setopt(result, CURLOPT_USERAGENT,
		user_agent ? user_agent : git_user_agent());

	if (curl_ftp_no_epsv)
		curl_easy_setopt(result, CURLOPT_FTP_USE_EPSV, 0);

	if (curl_ssl_try)
		curl_easy_setopt(result, CURLOPT_USE_SSL, CURLUSESSL_TRY);
http: use credential API to handle proxy authentication
Currently, the only way to pass proxy credentials to curl is by including them
in the proxy URL. Usually, this means they will end up on disk unencrypted, one
way or another (by inclusion in ~/.gitconfig, shell profile or history). Since
proxy authentication often uses a domain user, credentials can be security
sensitive; therefore, a safer way of passing credentials is desirable.
If the configured proxy contains a username but not a password, query the
credential API for one. Also, make sure we approve/reject proxy credentials
properly.
For consistency reasons, add parsing of http_proxy/https_proxy/all_proxy
environment variables, which would otherwise be evaluated as a fallback by curl.
Without this, we would have different semantics for git configuration and
environment variables.
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Knut Franke <k.franke@science-computing.de>
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-26 21:02:48 +08:00
	/*
	 * CURL also examines these variables as a fallback; but we need to query
	 * them here in order to decide whether to prompt for missing password (cf.
	 * init_curl_proxy_auth()).
	 *
	 * Unlike many other common environment variables, these are historically
	 * lowercase only. It appears that CURL did not know this and implemented
	 * only uppercase variants, which was later corrected to take both - with
	 * the exception of http_proxy, which is lowercase only also in CURL. As
	 * the lowercase versions are the historical quasi-standard, they take
	 * precedence here, as in CURL.
	 */
	if (!curl_http_proxy) {
		if (http_auth.protocol && !strcmp(http_auth.protocol, "https")) {
			var_override(&curl_http_proxy, getenv("HTTPS_PROXY"));
			var_override(&curl_http_proxy, getenv("https_proxy"));
		} else {
			var_override(&curl_http_proxy, getenv("http_proxy"));
		}
		if (!curl_http_proxy) {
			var_override(&curl_http_proxy, getenv("ALL_PROXY"));
			var_override(&curl_http_proxy, getenv("all_proxy"));
		}
	}
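	/*
	 * Example of the resulting precedence for an https:// remote, with
	 * illustrative host names: if both
	 *	HTTPS_PROXY=http://proxy-a.example.com:8080
	 *	https_proxy=http://proxy-b.example.com:8080
	 * are exported, the second var_override() above wins and git
	 * connects through proxy-b, matching curl's own lowercase-first
	 * behavior.
	 */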
http: honor empty http.proxy option to bypass proxy
Curl distinguishes between an empty proxy address and a NULL proxy
address. In the first case it completely disables proxy usage, but if
the proxy address option is NULL then curl attempts to determine the
proxy address from the http_proxy environment variable.
According to the documentation, if the http.proxy option is set to an
empty string, git should bypass proxy and connect to the server
directly:
export http_proxy=http://network-proxy/
cd ~/foobar-project
git config remote.origin.proxy ""
git fetch
Previously, proxy host was configured by one line:
curl_easy_setopt(result, CURLOPT_PROXY, curl_http_proxy);
Commit 372370f167 ("http: use credential API to handle proxy
authentication", 2016-01-26) parses the proxy option, then extracts the
proxy host address and updates the curl configuration, making the
previous call a noop:
credential_from_url(&proxy_auth, curl_http_proxy);
curl_easy_setopt(result, CURLOPT_PROXY, proxy_auth.host);
But if the proxy option is empty then the proxy host field becomes NULL.
This forces curl to fall back to detecting the proxy configuration from
the environment, causing the http.proxy option to not work anymore.
Fix this issue by explicitly handling http.proxy being set to the empty
string. This also makes the code a bit clearer and should help us
avoid such regressions in the future.
Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-12 04:22:18 +08:00
	if (curl_http_proxy && curl_http_proxy[0] == '\0') {
		/*
		 * Handle the case of an empty http.proxy value here to keep
		 * the common code below clean.
		 * NB: an empty option disables proxying altogether.
		 */
		curl_easy_setopt(result, CURLOPT_PROXY, "");
	} else if (curl_http_proxy) {
		struct strbuf proxy = STRBUF_INIT;

		if (starts_with(curl_http_proxy, "socks5h"))
			curl_easy_setopt(result,
				CURLOPT_PROXYTYPE, CURLPROXY_SOCKS5_HOSTNAME);
		else if (starts_with(curl_http_proxy, "socks5"))
			curl_easy_setopt(result,
				CURLOPT_PROXYTYPE, CURLPROXY_SOCKS5);
		else if (starts_with(curl_http_proxy, "socks4a"))
			curl_easy_setopt(result,
				CURLOPT_PROXYTYPE, CURLPROXY_SOCKS4A);
		else if (starts_with(curl_http_proxy, "socks"))
			curl_easy_setopt(result,
				CURLOPT_PROXYTYPE, CURLPROXY_SOCKS4);
#ifdef GIT_CURL_HAVE_CURLOPT_PROXY_KEYPASSWD
		else if (starts_with(curl_http_proxy, "https")) {
			curl_easy_setopt(result, CURLOPT_PROXYTYPE, CURLPROXY_HTTPS);

			if (http_proxy_ssl_cert)
				curl_easy_setopt(result, CURLOPT_PROXY_SSLCERT, http_proxy_ssl_cert);

			if (http_proxy_ssl_key)
				curl_easy_setopt(result, CURLOPT_PROXY_SSLKEY, http_proxy_ssl_key);

			if (has_proxy_cert_password())
				curl_easy_setopt(result, CURLOPT_PROXY_KEYPASSWD, proxy_cert_auth.password);
		}
#endif
		if (strstr(curl_http_proxy, "://"))
			credential_from_url(&proxy_auth, curl_http_proxy);
		else {
			struct strbuf url = STRBUF_INIT;
			strbuf_addf(&url, "http://%s", curl_http_proxy);
			credential_from_url(&proxy_auth, url.buf);
			strbuf_release(&url);
		}

		if (!proxy_auth.host)
			die("Invalid proxy URL '%s'", curl_http_proxy);

		strbuf_addstr(&proxy, proxy_auth.host);
		if (proxy_auth.path) {
			curl_version_info_data *ver = curl_version_info(CURLVERSION_NOW);

			if (ver->version_num < 0x075400)
				die("libcurl 7.84 or later is required to support paths in proxy URLs");

			if (!starts_with(proxy_auth.protocol, "socks"))
				die("Invalid proxy URL '%s': only SOCKS proxies support paths",
				    curl_http_proxy);

			if (strcasecmp(proxy_auth.host, "localhost"))
				die("Invalid proxy URL '%s': host must be localhost if a path is present",
				    curl_http_proxy);

			strbuf_addch(&proxy, '/');
			strbuf_add_percentencode(&proxy, proxy_auth.path, 0);
		}
		curl_easy_setopt(result, CURLOPT_PROXY, proxy.buf);
		strbuf_release(&proxy);

		var_override(&curl_no_proxy, getenv("NO_PROXY"));
		var_override(&curl_no_proxy, getenv("no_proxy"));
		curl_easy_setopt(result, CURLOPT_NOPROXY, curl_no_proxy);
	}
	init_curl_proxy_auth(result);

	set_curl_keepalive(result);

	return result;
}
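Putting the proxy handling together, typical configurations might look like
this (addresses and socket paths illustrative):

	git config http.proxy socks5h://localhost:1080
	git config http.proxy socks5h://localhost/run/user/1000/proxy.sock

The first form selects a SOCKS5 proxy that also resolves host names remotely;
the second, path-carrying form appears to target proxies listening on local
Unix domain sockets and, per the checks above, requires libcurl 7.84 or newer,
a SOCKS scheme, and a localhost host.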
static void set_from_env(char **var, const char *envname)
{
	const char *val = getenv(envname);
	if (val) {
		FREE_AND_NULL(*var);
		*var = xstrdup(val);
	}
}

http_init(): Fix config file parsing
We honor the command line options, environment variables, variables in
the repository configuration file, variables in the user's global
configuration file, variables in the system configuration file, and then
finally use built-in defaults. To implement this semantics, the code
should:
- start from built-in default values;
- call git_config() with the configuration parser callback, which
implements "later definition overrides earlier ones" logic
(git_config() reads the system's, user's and then repository's
configuration file in this order);
- override the result from the above with environment variables if set;
- override the result from the above with command line options.
The initialization code http_init() for http transfer got this wrong and
implemented a "first one wins, ignoring the later ones" logic in
http_options(); to compensate for this mistake, it read environment
variables before calling git_config(). This is all wrong.
As a second-class citizen, the http codepath hasn't been audited as
closely as other parts of the system, but we should try to bring sanity
to it before inviting contributors to improve on it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-10 10:00:30 +08:00
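Schematically, the ordering that the message above prescribes looks like this
for any one setting, using ssl_cert as the example:

	/* 1. built-in default (static initializer) */
	/* 2. configuration files, later definitions overriding earlier ones */
	git_config(urlmatch_config_entry, &config);
	/* 3. environment overrides whatever the config produced */
	set_from_env(&ssl_cert, "GIT_SSL_CERT");
	/* 4. command line options, applied last by the callers */

which is exactly the shape http_init() below follows.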
void http_init(struct remote *remote, const char *url, int proactive_auth)
{
	char *low_speed_limit;
	char *low_speed_time;
	char *normalized_url;
	struct urlmatch_config config = URLMATCH_CONFIG_INIT;

	config.section = "http";
	config.key = NULL;
	config.collect_fn = http_options;
	config.cascade_fn = git_default_config;
	config.cb = NULL;

	http_is_verbose = 0;
	normalized_url = url_normalize(url, &config.url);

	git_config(urlmatch_config_entry, &config);
	free(normalized_url);
	string_list_clear(&config.vars, 1);
#ifdef GIT_CURL_HAVE_CURLSSLSET_NO_BACKENDS
	if (http_ssl_backend) {
		const curl_ssl_backend **backends;
		struct strbuf buf = STRBUF_INIT;
		int i;

		switch (curl_global_sslset(-1, http_ssl_backend, &backends)) {
		case CURLSSLSET_UNKNOWN_BACKEND:
			strbuf_addf(&buf, _("Unsupported SSL backend '%s'. "
					    "Supported SSL backends:"),
				    http_ssl_backend);
			for (i = 0; backends[i]; i++)
				strbuf_addf(&buf, "\n\t%s", backends[i]->name);
			die("%s", buf.buf);
		case CURLSSLSET_NO_BACKENDS:
			die(_("Could not set SSL backend to '%s': "
			      "cURL was built without SSL backends"),
			    http_ssl_backend);
		case CURLSSLSET_TOO_LATE:
			die(_("Could not set SSL backend to '%s': already set"),
			    http_ssl_backend);
		case CURLSSLSET_OK:
			break; /* Okay! */
		}
	}
#endif

	if (curl_global_init(CURL_GLOBAL_ALL) != CURLE_OK)
		die("curl_global_init failed");
	if (proactive_auth && http_proactive_auth == PROACTIVE_AUTH_NONE)
		http_proactive_auth = PROACTIVE_AUTH_IF_CREDENTIALS;

	if (remote && remote->http_proxy)
		curl_http_proxy = xstrdup(remote->http_proxy);

	if (remote)
		var_override(&http_proxy_authmethod, remote->http_proxy_authmethod);

	pragma_header = curl_slist_append(http_copy_default_headers(),
		"Pragma: no-cache");

	{
		char *http_max_requests = getenv("GIT_HTTP_MAX_REQUESTS");
		if (http_max_requests)
			max_requests = atoi(http_max_requests);
	}

	curlm = curl_multi_init();
	if (!curlm)
		die("curl_multi_init failed");

	if (getenv("GIT_SSL_NO_VERIFY"))
		curl_ssl_verify = 0;

	set_from_env(&ssl_cert, "GIT_SSL_CERT");
	set_from_env(&ssl_cert_type, "GIT_SSL_CERT_TYPE");
	set_from_env(&ssl_key, "GIT_SSL_KEY");
	set_from_env(&ssl_key_type, "GIT_SSL_KEY_TYPE");
	set_from_env(&ssl_capath, "GIT_SSL_CAPATH");
	set_from_env(&ssl_cainfo, "GIT_SSL_CAINFO");

	set_from_env(&user_agent, "GIT_HTTP_USER_AGENT");

	low_speed_limit = getenv("GIT_HTTP_LOW_SPEED_LIMIT");
	if (low_speed_limit)
		curl_low_speed_limit = strtol(low_speed_limit, NULL, 10);
	low_speed_time = getenv("GIT_HTTP_LOW_SPEED_TIME");
	if (low_speed_time)
		curl_low_speed_time = strtol(low_speed_time, NULL, 10);

	if (curl_ssl_verify == -1)
		curl_ssl_verify = 1;

	curl_session_count = 0;
	if (max_requests < 1)
		max_requests = DEFAULT_MAX_REQUESTS;

	set_from_env(&http_proxy_ssl_cert, "GIT_PROXY_SSL_CERT");
	set_from_env(&http_proxy_ssl_key, "GIT_PROXY_SSL_KEY");
	set_from_env(&http_proxy_ssl_ca_info, "GIT_PROXY_SSL_CAINFO");

	if (getenv("GIT_PROXY_SSL_CERT_PASSWORD_PROTECTED"))
		proxy_ssl_cert_password_required = 1;

	if (getenv("GIT_CURL_FTP_NO_EPSV"))
		curl_ftp_no_epsv = 1;

	if (url) {
		credential_from_url(&http_auth, url);
		if (!ssl_cert_password_required &&
		    getenv("GIT_SSL_CERT_PASSWORD_PROTECTED") &&
		    starts_with(url, "https://"))
			ssl_cert_password_required = 1;
	}

	curl_default = get_curl_handle();
}
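The GIT_HTTP_LOW_SPEED_* knobs read above map directly onto curl's stall
detection: a transfer that stays below the limit in bytes per second for the
given number of seconds is aborted. For example (values illustrative):

	GIT_HTTP_LOW_SPEED_LIMIT=1000 GIT_HTTP_LOW_SPEED_TIME=30 git fetch

would give up on a fetch that sustains less than 1000 bytes/s for 30 seconds.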
void http_cleanup(void)
{
	struct active_request_slot *slot = active_queue_head;

	while (slot != NULL) {
		struct active_request_slot *next = slot->next;
		if (slot->curl) {
			xmulti_remove_handle(slot);
			curl_easy_cleanup(slot->curl);
		}
		free(slot);
		slot = next;
	}
	active_queue_head = NULL;

	curl_easy_cleanup(curl_default);

	curl_multi_cleanup(curlm);
	curl_global_cleanup();
remote-curl: unbreak http.extraHeader with custom allocators
In 93b980e58f5 (http: use xmalloc with cURL, 2019-08-15), we started to
ask cURL to use `xmalloc()`, and if compiled with nedmalloc, that means
implicitly a different allocator than the system one.
Which means that all of cURL's allocations and releases now _need_ to
use that allocator.
However, the `http_options()` function used `slist_append()` to add any
configured extra HTTP header(s) _before_ asking cURL to use `xmalloc()`,
and `http_cleanup()` would release them _afterwards_, i.e. in the
presence of custom allocators, cURL would attempt to use the wrong
allocator to release the memory.
A naïve attempt at fixing this would move the call to
`curl_global_init()` _before_ the config is parsed (i.e. before that
call to `slist_append()`).
However, that does not work, as we _also_ parse the config setting
`http.sslbackend` and if found, call `curl_global_sslset()` which *must*
be called before `curl_global_init()`, for details see:
https://curl.haxx.se/libcurl/c/curl_global_sslset.html
So let's instead make the config parsing entirely independent from
cURL's data structures. Incidentally, this deletes two more lines than
it introduces, which is nice.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-06 18:04:55 +08:00
|
|
|
string_list_clear(&extra_http_headers, 0);
|
2016-04-27 20:20:37 +08:00
|
|
|
|
2006-06-07 00:41:32 +08:00
|
|
|
curl_slist_free_all(pragma_header);
|
2007-09-15 15:23:00 +08:00
|
|
|
pragma_header = NULL;
|
2008-02-28 04:35:50 +08:00
|
|
|
|
2022-05-16 16:38:51 +08:00
|
|
|
curl_slist_free_all(host_resolutions);
|
|
|
|
host_resolutions = NULL;
|
|
|
|
|
2008-02-28 04:35:50 +08:00
|
|
|
if (curl_http_proxy) {
|
2008-12-07 08:45:37 +08:00
|
|
|
free((void *)curl_http_proxy);
|
2008-02-28 04:35:50 +08:00
|
|
|
curl_http_proxy = NULL;
|
|
|
|
}
|
2009-05-28 11:16:02 +08:00
|
|
|
|
http: use credential API to handle proxy authentication
Currently, the only way to pass proxy credentials to curl is by including them
in the proxy URL. Usually, this means they will end up on disk unencrypted, one
way or another (by inclusion in ~/.gitconfig, shell profile or history). Since
proxy authentication often uses a domain user, credentials can be security
sensitive; therefore, a safer way of passing credentials is desirable.
If the configured proxy contains a username but not a password, query the
credential API for one. Also, make sure we approve/reject proxy credentials
properly.
For consistency reasons, add parsing of http_proxy/https_proxy/all_proxy
environment variables, which would otherwise be evaluated as a fallback by curl.
Without this, we would have different semantics for git configuration and
environment variables.
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Knut Franke <k.franke@science-computing.de>
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-26 21:02:48 +08:00
|
|
|
if (proxy_auth.password) {
|
|
|
|
memset(proxy_auth.password, 0, strlen(proxy_auth.password));
|
2017-06-16 07:15:46 +08:00
|
|
|
FREE_AND_NULL(proxy_auth.password);
|
2016-01-26 21:02:48 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
free((void *)curl_proxyuserpwd);
|
|
|
|
curl_proxyuserpwd = NULL;
|
|
|
|
|
2016-01-26 21:02:47 +08:00
|
|
|
free((void *)http_proxy_authmethod);
|
|
|
|
http_proxy_authmethod = NULL;
|
|
|
|
|
2022-05-03 00:50:37 +08:00
|
|
|
if (cert_auth.password) {
|
2011-12-10 18:31:21 +08:00
|
|
|
memset(cert_auth.password, 0, strlen(cert_auth.password));
|
2017-06-16 07:15:46 +08:00
|
|
|
FREE_AND_NULL(cert_auth.password);
|
2009-05-28 11:16:02 +08:00
|
|
|
}
|
|
|
|
ssl_cert_password_required = 0;
|
2015-01-28 20:04:37 +08:00
|
|
|
|
2022-05-03 00:50:37 +08:00
|
|
|
if (proxy_cert_auth.password) {
|
2020-03-05 02:40:05 +08:00
|
|
|
memset(proxy_cert_auth.password, 0, strlen(proxy_cert_auth.password));
|
|
|
|
FREE_AND_NULL(proxy_cert_auth.password);
|
|
|
|
}
|
|
|
|
proxy_ssl_cert_password_required = 0;
|
|
|
|
|
2017-06-16 07:15:46 +08:00
|
|
|
FREE_AND_NULL(cached_accept_language);
|
2005-11-19 03:02:58 +08:00
|
|
|
}
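The cleanup above follows a consistent wipe-then-free idiom for every cached
secret: overwrite the plaintext with memset(), then release the buffer and
clear the pointer so nothing can reuse it. A minimal sketch of that idiom,
with a hypothetical wipe_and_free_secret() helper that is not part of http.c:

static void wipe_and_free_secret(char **secret)
{
	if (!*secret)
		return;
	memset(*secret, 0, strlen(*secret));	/* overwrite the plaintext */
	FREE_AND_NULL(*secret);			/* release and clear the pointer */
}

One caveat: a plain memset() immediately before free() can in principle be
optimized away; the pattern above accepts that risk, and stricter
explicit_bzero()-style helpers are the alternative where available.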
|
|
|
|
|
|
|
|
struct active_request_slot *get_active_slot(void)
|
|
|
|
{
|
|
|
|
struct active_request_slot *slot = active_queue_head;
|
|
|
|
struct active_request_slot *newslot;
|
|
|
|
|
|
|
|
int num_transfers;
|
|
|
|
|
|
|
|
/* Wait for a slot to open up if the queue is full */
|
|
|
|
while (active_requests >= max_requests) {
|
|
|
|
curl_multi_perform(curlm, &num_transfers);
|
2009-03-10 09:47:29 +08:00
|
|
|
if (num_transfers < active_requests)
|
2005-11-19 03:02:58 +08:00
|
|
|
process_curl_messages();
|
|
|
|
}
|
|
|
|
|
2009-03-10 09:47:29 +08:00
|
|
|
while (slot != NULL && slot->in_use)
|
2005-11-19 03:02:58 +08:00
|
|
|
slot = slot->next;
|
2009-03-10 09:47:29 +08:00
|
|
|
|
2022-05-03 00:50:37 +08:00
|
|
|
if (!slot) {
|
2005-11-19 03:02:58 +08:00
|
|
|
newslot = xmalloc(sizeof(*newslot));
|
|
|
|
newslot->curl = NULL;
|
|
|
|
newslot->in_use = 0;
|
|
|
|
newslot->next = NULL;
|
|
|
|
|
|
|
|
slot = active_queue_head;
|
2022-05-03 00:50:37 +08:00
|
|
|
if (!slot) {
|
2005-11-19 03:02:58 +08:00
|
|
|
active_queue_head = newslot;
|
|
|
|
} else {
|
2009-03-10 09:47:29 +08:00
|
|
|
while (slot->next != NULL)
|
2005-11-19 03:02:58 +08:00
|
|
|
slot = slot->next;
|
|
|
|
slot->next = newslot;
|
|
|
|
}
|
|
|
|
slot = newslot;
|
|
|
|
}
|
|
|
|
|
2022-05-03 00:50:37 +08:00
|
|
|
if (!slot->curl) {
|
2005-11-19 03:02:58 +08:00
|
|
|
slot->curl = curl_easy_duphandle(curl_default);
|
2009-11-27 23:42:26 +08:00
|
|
|
curl_session_count++;
|
2005-11-19 03:02:58 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
active_requests++;
|
|
|
|
slot->in_use = 1;
|
2006-02-01 03:06:55 +08:00
|
|
|
slot->results = NULL;
|
2006-03-11 12:18:01 +08:00
|
|
|
slot->finished = NULL;
|
2005-11-19 03:02:58 +08:00
|
|
|
slot->callback_data = NULL;
|
|
|
|
slot->callback_func = NULL;
|
http.c: cookie file tightening
The http.cookiefile configuration variable is used to call
curl_easy_setopt() to set CURLOPT_COOKIEFILE and if http.savecookies
is set, the same value is used for CURLOPT_COOKIEJAR. The former is
used only to read cookies at startup, the latter is used to write
cookies at the end.
The manual pages https://curl.se/libcurl/c/CURLOPT_COOKIEFILE.html
and https://curl.se/libcurl/c/CURLOPT_COOKIEJAR.html talk about two
interesting special values.
* "" (an empty string) given to CURLOPT_COOKIEFILE means not to
read cookies from any file upon startup.
* It is not specified what "" (an empty string) given to
CURLOPT_COOKIEJAR does; presumably open a file whose name is an
empty string and write cookies to it? In any case, that is not
what we want to see happen, ever.
* "-" (a dash) given to CURLOPT_COOKIEFILE makes cURL read cookies
from the standard input, and given to CURLOPT_COOKIEJAR makes
cURL write cookies to the standard output. Neither of which we
want ever to happen.
So, let's make sure we avoid these nonsense cases. Specifically,
when http.cookiefile is set to "-", ignore it with a warning, and when
it is set to "" and http.savecookies is set, ignore http.savecookies
with a warning.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-10 07:03:48 +08:00
|
|
|
|
|
|
|
if (curl_cookie_file && !strcmp(curl_cookie_file, "-")) {
|
|
|
|
warning(_("refusing to read cookies from http.cookiefile '-'"));
|
|
|
|
FREE_AND_NULL(curl_cookie_file);
|
|
|
|
}
|
2011-06-03 04:31:25 +08:00
|
|
|
curl_easy_setopt(slot->curl, CURLOPT_COOKIEFILE, curl_cookie_file);
|
2024-07-10 07:03:48 +08:00
|
|
|
if (curl_save_cookies && (!curl_cookie_file || !curl_cookie_file[0])) {
|
|
|
|
curl_save_cookies = 0;
|
|
|
|
warning(_("ignoring http.savecookies for empty http.cookiefile"));
|
|
|
|
}
|
2013-07-24 06:40:17 +08:00
|
|
|
if (curl_save_cookies)
|
|
|
|
curl_easy_setopt(slot->curl, CURLOPT_COOKIEJAR, curl_cookie_file);
|
2005-11-19 03:02:58 +08:00
|
|
|
curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, pragma_header);
|
2022-05-16 16:38:51 +08:00
|
|
|
curl_easy_setopt(slot->curl, CURLOPT_RESOLVE, host_resolutions);
|
2005-11-19 03:02:58 +08:00
|
|
|
curl_easy_setopt(slot->curl, CURLOPT_ERRORBUFFER, curl_errorstr);
|
2006-06-01 07:25:03 +08:00
|
|
|
curl_easy_setopt(slot->curl, CURLOPT_CUSTOMREQUEST, NULL);
|
|
|
|
curl_easy_setopt(slot->curl, CURLOPT_READFUNCTION, NULL);
|
|
|
|
curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, NULL);
|
2011-04-26 23:04:49 +08:00
|
|
|
curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDS, NULL);
|
http: reset POSTFIELDSIZE when clearing curl handle
In get_active_slot(), we return a CURL handle that may have been used
before (reusing them is good because it lets curl reuse the same
connection across many requests). We set a few curl options back to
defaults that may have been modified by previous requests.
We reset POSTFIELDS to NULL, but do not reset POSTFIELDSIZE (which
defaults to "-1"). This usually doesn't matter because most POSTs will
set both fields together anyway. But there is one exception: when
handling a large request in remote-curl's post_rpc(), we don't set
_either_, and instead set a READFUNCTION to stream data into libcurl.
This can interact weirdly with a stale POSTFIELDSIZE setting, because
curl will assume it should read only some set number of bytes from our
READFUNCTION. However, it has worked in practice because we also
manually set a "Transfer-Encoding: chunked" header, which libcurl uses
as a clue to set the POSTFIELDSIZE to -1 itself.
So everything works, but we're better off resetting the size manually
for a few reasons:
- there was a regression in curl 8.7.0 where the chunked header
detection didn't kick in, causing any large HTTP requests made by
Git to fail. This has since been fixed (but not yet released). In
the issue, curl folks recommended setting it explicitly to -1:
https://github.com/curl/curl/issues/13229#issuecomment-2029826058
and it indeed works around the regression. So even though it won't
be strictly necessary after the fix there, this will help folks who
end up using the affected libcurl versions.
- it's consistent with what a new curl handle would look like. Since
get_active_slot() may or may not return a used handle, this reduces
the possibility of heisenbugs that only appear with certain request
patterns.
Note that the recommendation in the curl issue is to actually drop the
manual Transfer-Encoding header. Modern libcurl will add the header
itself when streaming from a READFUNCTION. However, that code wasn't
added until 802aa5ae2 (HTTP: use chunked Transfer-Encoding for HTTP_POST
if size unknown, 2019-07-22), which is in curl 7.66.0. We claim to
support back to 7.19.5, so those older versions still need the manual
header.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-03 04:05:17 +08:00
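/*
 * -1L restores curl's default of "size unknown", so a stale
 * POSTFIELDSIZE from an earlier POST cannot constrain a later
 * streamed upload; see the curl 8.7.0 regression described above.
 */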
|
|
|
curl_easy_setopt(slot->curl, CURLOPT_POSTFIELDSIZE, -1L);
|
2006-06-01 07:25:03 +08:00
|
|
|
curl_easy_setopt(slot->curl, CURLOPT_UPLOAD, 0);
|
|
|
|
curl_easy_setopt(slot->curl, CURLOPT_HTTPGET, 1);
|
http: set curl FAILONERROR each time we select a handle
Because we reuse curl handles for multiple requests, the
setup of a handle happens in two stages: stable, global
setup and per-request setup. The lifecycle of a handle is
something like:
1. get_curl_handle; do basic global setup that will last
through the whole program (e.g., setting the user
agent, ssl options, etc)
2. get_active_slot; set up a per-request baseline (e.g.,
clearing the read/write functions, making it a GET
request, etc)
3. perform the request with curl_*_perform functions
4. goto step 2 to perform another request
Breaking it down this way means we can avoid doing global
setup from step (1) repeatedly, but we still finish step (2)
with a predictable baseline setup that callers can rely on.
Until commit 6d052d7 (http: add HTTP_KEEP_ERROR option,
2013-04-05), setting curl's FAILONERROR option was a global
setup; we never changed it. However, 6d052d7 introduced an
option where some requests might turn off FAILONERROR. Later
requests using the same handle would have the option
unexpectedly turned off, which meant they would not notice
http failures at all.
This could easily be seen in the test-suite for the
"half-auth" cases of t5541 and t5551. The initial requests
turned off FAILONERROR, which meant it was erroneously off
for the rpc POST. That worked fine for a successful request,
but meant that we failed to react properly to the HTTP 401
(instead, we treated whatever the server handed us as a
successful message body).
The solution is simple: now that FAILONERROR is a
per-request setting, we move it to get_active_slot to make
sure it is reset for each request.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-16 08:30:38 +08:00
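/*
 * FAILONERROR is per-request state (see the lifecycle above): a prior
 * request may have turned it off to keep the server's custom error
 * body, so it is restored on every slot handed out.
 */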
|
|
|
curl_easy_setopt(slot->curl, CURLOPT_FAILONERROR, 1);
|
2015-11-03 05:39:58 +08:00
|
|
|
curl_easy_setopt(slot->curl, CURLOPT_RANGE, NULL);
|
2016-02-03 12:09:14 +08:00
|
|
|
|
http: make redirects more obvious
We instruct curl to always follow HTTP redirects. This is
convenient, but it creates opportunities for malicious
servers to create confusing situations. For instance,
imagine Alice is a git user with access to a private
repository on Bob's server. Mallory runs her own server and
wants to access objects from Bob's repository.
Mallory may try a few tricks that involve asking Alice to
clone from her, build on top, and then push the result:
1. Mallory may simply redirect all fetch requests to Bob's
server. Git will transparently follow those redirects
and fetch Bob's history, which Alice may believe she
got from Mallory. The subsequent push seems like it is
just feeding Mallory back her own objects, but is
actually leaking Bob's objects. There is nothing in
git's output to indicate that Bob's repository was
involved at all.
The downside (for Mallory) of this attack is that Alice
will have received Bob's entire repository, and is
likely to notice that when building on top of it.
2. If Mallory happens to know the sha1 of some object X in
Bob's repository, she can instead build her own history
that references that object. She then runs a dumb http
server, and Alice's client will fetch each object
individually. When it asks for X, Mallory redirects her
to Bob's server. The end result is that Alice obtains
objects from Bob, but they may be buried deep in
history. Alice is less likely to notice.
Both of these attacks are fairly hard to pull off. There's a
social component in getting Mallory to convince Alice to
work with her. Alice may be prompted for credentials in
accessing Bob's repository (but not always, if she is using
a credential helper that caches). Attack (1) requires a
certain amount of obliviousness on Alice's part while making
a new commit. Attack (2) requires that Mallory knows a sha1
in Bob's repository, that Bob's server supports dumb http,
and that the object in question is loose on Bob's server.
But we can probably make things a bit more obvious without
any loss of functionality. This patch does two things to
that end.
First, when we encounter a whole-repo redirect during the
initial ref discovery, we now inform the user on stderr,
making attack (1) much more obvious.
Second, the decision to follow redirects is now
configurable. The truly paranoid can set the new
http.followRedirects to false to avoid any redirection
entirely. But for a more practical default, we will disallow
redirects only after the initial ref discovery. This is
enough to thwart attacks similar to (2), while still
allowing the common use of redirects at the repository
level. Since c93c92f30 (http: update base URLs when we see
redirects, 2013-09-28) we re-root all further requests from
the redirect destination, which should generally mean that
no further redirection is necessary.
As an escape hatch, in case there really is a server that
needs to redirect individual requests, the user can set
http.followRedirects to "true" (and this can be done on a
per-server basis via http.*.followRedirects config).
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-07 02:24:41 +08:00
|
|
|
/*
|
|
|
|
* Default following to off unless "ALWAYS" is configured; this gives
|
|
|
|
* callers a sane starting point, and they can tweak for individual
|
|
|
|
* HTTP_FOLLOW_* cases themselves.
|
|
|
|
*/
|
|
|
|
if (http_follow_config == HTTP_FOLLOW_ALWAYS)
|
|
|
|
curl_easy_setopt(slot->curl, CURLOPT_FOLLOWLOCATION, 1);
|
|
|
|
else
|
|
|
|
curl_easy_setopt(slot->curl, CURLOPT_FOLLOWLOCATION, 0);
|
|
|
|
|
2016-02-03 12:09:14 +08:00
|
|
|
curl_easy_setopt(slot->curl, CURLOPT_IPRESOLVE, git_curl_ipresolve);
|
2015-01-08 08:29:20 +08:00
|
|
|
curl_easy_setopt(slot->curl, CURLOPT_HTTPAUTH, http_auth_methods);
|
http: add support for authtype and credential
Now that we have the credential helper code set up to handle arbitrary
authentications schemes, let's add support for this in the HTTP code,
where we really want to use it. If we're using this new functionality,
don't set a username and password, and instead set a header wherever
we'd normally do so, including for proxy authentication.
Since we can now handle this case, ask the credential helper to enable
the appropriate capabilities.
Finally, if we're using the authtype value, set "Expect: 100-continue".
Any type of authentication that requires multiple rounds (such as NTLM
or Kerberos) requires a 100 Continue (if we're larger than
http.postBuffer) because otherwise we send the pack data before we're
authenticated, the push gets a 401 response, and we can't rewind the
stream. We don't know for certain what other custom schemes might
require this; the HTTP/1.1 standard has required handling it since
1999; the broken HTTP server for which we disabled this (Google's) is
now fixed and has been for some time; and libcurl has a 1-second
fallback in case the HTTP server is still broken. In addition, it is
not unreasonable to require compliance with a 25-year-old standard to
use new Git features. For all of these reasons, do so here.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-17 08:02:32 +08:00
|
|
|
if (http_auth.password || http_auth.credential || curl_empty_auth_enabled())
|
2012-04-10 17:53:40 +08:00
|
|
|
init_curl_http_auth(slot->curl);
|
2005-11-19 03:02:58 +08:00
|
|
|
|
|
|
|
return slot;
|
|
|
|
}
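With the baseline reset above, a caller only layers its per-request options on
top of the slot and runs it. A hedged sketch of the expected pattern (the URL
is illustrative; fwrite_buffer is this file's strbuf write callback, and
run_one_slot() is defined further below):

struct strbuf buffer = STRBUF_INIT;
struct slot_results results;
struct active_request_slot *slot = get_active_slot();

curl_easy_setopt(slot->curl, CURLOPT_URL, "https://example.com/info/refs");
curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION, fwrite_buffer);
curl_easy_setopt(slot->curl, CURLOPT_FILE, &buffer);

if (run_one_slot(slot, &results) != HTTP_OK)
	error("request failed: %s", curl_errorstr);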
|
|
|
|
|
|
|
|
int start_active_slot(struct active_request_slot *slot)
|
|
|
|
{
|
|
|
|
CURLMcode curlm_result = curl_multi_add_handle(curlm, slot->curl);
|
2007-09-11 11:02:28 +08:00
|
|
|
int num_transfers;
|
2005-11-19 03:02:58 +08:00
|
|
|
|
|
|
|
if (curlm_result != CURLM_OK &&
|
|
|
|
curlm_result != CURLM_CALL_MULTI_PERFORM) {
|
2016-09-13 08:25:55 +08:00
|
|
|
warning("curl_multi_add_handle failed: %s",
|
|
|
|
curl_multi_strerror(curlm_result));
|
2005-11-19 03:02:58 +08:00
|
|
|
active_requests--;
|
|
|
|
slot->in_use = 0;
|
|
|
|
return 0;
|
|
|
|
}
|
2007-09-11 11:02:28 +08:00
|
|
|
|
|
|
|
/*
|
|
|
|
* We know there must be something to do, since we just added
|
|
|
|
* something.
|
|
|
|
*/
|
|
|
|
curl_multi_perform(curlm, &num_transfers);
|
2005-11-19 03:02:58 +08:00
|
|
|
return 1;
|
|
|
|
}
|
|
|
|
|
2007-09-11 11:02:34 +08:00
|
|
|
struct fill_chain {
|
|
|
|
void *data;
|
|
|
|
int (*fill)(void *);
|
|
|
|
struct fill_chain *next;
|
|
|
|
};
|
|
|
|
|
2009-03-10 09:47:29 +08:00
|
|
|
static struct fill_chain *fill_cfg;
|
2007-09-11 11:02:34 +08:00
|
|
|
|
|
|
|
void add_fill_function(void *data, int (*fill)(void *))
|
|
|
|
{
|
2018-02-15 02:59:42 +08:00
|
|
|
struct fill_chain *new_fill = xmalloc(sizeof(*new_fill));
|
2007-09-11 11:02:34 +08:00
|
|
|
struct fill_chain **linkp = &fill_cfg;
|
2018-02-15 02:59:42 +08:00
|
|
|
new_fill->data = data;
|
|
|
|
new_fill->fill = fill;
|
|
|
|
new_fill->next = NULL;
|
2007-09-11 11:02:34 +08:00
|
|
|
while (*linkp)
|
|
|
|
linkp = &(*linkp)->next;
|
2018-02-15 02:59:42 +08:00
|
|
|
*linkp = new_fill;
|
2007-09-11 11:02:34 +08:00
|
|
|
}
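A fill function is a producer: it receives its opaque data pointer back and
returns non-zero if it queued more work, zero once it has nothing left. A
sketch of registering one, where fetch_state and start_fetch() are
hypothetical stand-ins for a walker-style caller:

struct fetch_state {
	struct object_list *pending;	/* hypothetical work list */
};

static int fill_one_fetch(void *data)
{
	struct fetch_state *state = data;
	if (!state->pending)
		return 0;			/* nothing left; stop filling */
	start_fetch(state->pending->item);	/* hypothetical: occupies one slot */
	state->pending = state->pending->next;
	return 1;				/* queued work; keep calling me */
}

/* at setup time: */
add_fill_function(&state, fill_one_fetch);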
|
|
|
|
|
2007-09-11 11:02:28 +08:00
|
|
|
void fill_active_slots(void)
|
|
|
|
{
|
|
|
|
struct active_request_slot *slot = active_queue_head;
|
|
|
|
|
2007-09-11 11:02:34 +08:00
|
|
|
while (active_requests < max_requests) {
|
|
|
|
struct fill_chain *fill;
|
|
|
|
for (fill = fill_cfg; fill; fill = fill->next)
|
|
|
|
if (fill->fill(fill->data))
|
|
|
|
break;
|
|
|
|
|
|
|
|
if (!fill)
|
2007-09-11 11:02:28 +08:00
|
|
|
break;
|
2007-09-11 11:02:34 +08:00
|
|
|
}
|
2007-09-11 11:02:28 +08:00
|
|
|
|
|
|
|
while (slot != NULL) {
|
2009-11-27 23:42:26 +08:00
|
|
|
if (!slot->in_use && slot->curl != NULL
|
|
|
|
&& curl_session_count > min_curl_sessions) {
|
2007-09-11 11:02:28 +08:00
|
|
|
curl_easy_cleanup(slot->curl);
|
|
|
|
slot->curl = NULL;
|
2009-11-27 23:42:26 +08:00
|
|
|
curl_session_count--;
|
2007-09-11 11:02:28 +08:00
|
|
|
}
|
|
|
|
slot = slot->next;
|
|
|
|
}
|
|
|
|
}
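fill_active_slots() and step_active_slots() together form the engine's pump:
the fill side tops the queue up to max_requests, while the step side lets curl
make progress, reaps completed transfers, and refills freed slots. A condensed
sketch of driving the engine to completion, assuming a hypothetical all_done()
predicate over the caller's state:

add_fill_function(&state, fill_one_fetch);	/* producer from the sketch above */
fill_active_slots();
while (!all_done(&state))
	step_active_slots();			/* perform, reap, refill */

Real callers normally block between steps rather than spinning;
run_active_slot() below shows the select()-based wait.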
|
|
|
|
|
2005-11-19 03:02:58 +08:00
|
|
|
void step_active_slots(void)
|
|
|
|
{
|
|
|
|
int num_transfers;
|
|
|
|
CURLMcode curlm_result;
|
|
|
|
|
|
|
|
do {
|
|
|
|
curlm_result = curl_multi_perform(curlm, &num_transfers);
|
|
|
|
} while (curlm_result == CURLM_CALL_MULTI_PERFORM);
|
|
|
|
if (num_transfers < active_requests) {
|
|
|
|
process_curl_messages();
|
|
|
|
fill_active_slots();
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
void run_active_slot(struct active_request_slot *slot)
|
|
|
|
{
|
|
|
|
fd_set readfds;
|
|
|
|
fd_set writefds;
|
|
|
|
fd_set excfds;
|
|
|
|
int max_fd;
|
|
|
|
struct timeval select_timeout;
|
2006-03-11 12:18:01 +08:00
|
|
|
int finished = 0;
|
2005-11-19 03:02:58 +08:00
|
|
|
|
2006-03-11 12:18:01 +08:00
|
|
|
slot->finished = &finished;
|
|
|
|
while (!finished) {
|
2005-11-19 03:02:58 +08:00
|
|
|
step_active_slots();
|
|
|
|
|
2011-11-04 22:19:27 +08:00
|
|
|
if (slot->in_use) {
|
2011-11-04 22:19:26 +08:00
|
|
|
long curl_timeout;
|
|
|
|
curl_multi_timeout(curlm, &curl_timeout);
|
|
|
|
if (curl_timeout == 0) {
|
|
|
|
continue;
|
|
|
|
} else if (curl_timeout == -1) {
|
|
|
|
select_timeout.tv_sec = 0;
|
|
|
|
select_timeout.tv_usec = 50000;
|
|
|
|
} else {
|
|
|
|
select_timeout.tv_sec = curl_timeout / 1000;
|
|
|
|
select_timeout.tv_usec = (curl_timeout % 1000) * 1000;
|
|
|
|
}
|
2005-11-19 03:02:58 +08:00
|
|
|
|
2011-11-04 22:19:25 +08:00
|
|
|
max_fd = -1;
|
2005-11-19 03:02:58 +08:00
|
|
|
FD_ZERO(&readfds);
|
|
|
|
FD_ZERO(&writefds);
|
|
|
|
FD_ZERO(&excfds);
|
2011-11-04 22:19:25 +08:00
|
|
|
curl_multi_fdset(curlm, &readfds, &writefds, &excfds, &max_fd);
|
2011-11-04 22:19:26 +08:00
|
|
|
|
2012-10-20 05:04:20 +08:00
|
|
|
/*
|
|
|
|
* It can happen that curl_multi_timeout returns a pathologically
|
|
|
|
* long timeout when curl_multi_fdset returns no file descriptors
|
|
|
|
* to read. See commit message for more details.
|
|
|
|
*/
|
|
|
|
if (max_fd < 0 &&
|
|
|
|
(select_timeout.tv_sec > 0 ||
|
|
|
|
select_timeout.tv_usec > 50000)) {
|
|
|
|
select_timeout.tv_sec = 0;
|
|
|
|
select_timeout.tv_usec = 50000;
|
|
|
|
}
|
|
|
|
|
2011-11-04 22:19:25 +08:00
|
|
|
select(max_fd+1, &readfds, &writefds, &excfds, &select_timeout);
|
2005-11-19 03:02:58 +08:00
|
|
|
}
|
|
|
|
}
|
http.c: clear the 'finished' member once we are done with it
In http.c, the run_active_slot() function allows the given "slot" to
make progress by calling step_active_slots() in a loop repeatedly,
and the loop is not left until the request held in the slot
completes.
Ages ago, we used to use the slot->in_use member to get out of the
loop, which misbehaved when the request in "slot" completes (at
which time, the result of the request is copied away from the slot,
and the in_use member is cleared, making the slot ready to be
reused), and the "slot" gets reused to service a different request
(at which time, the "slot" becomes in_use again, even though it is
for a different request). The loop terminating condition mistakenly
thought that the original request has yet to be completed.
Today's code, after baa7b67d (HTTP slot reuse fixes, 2006-03-10)
fixed this issue, uses a separate "slot->finished" member that is
set in run_active_slot() to point to an on-stack variable, and the
code that completes the request in finish_active_slot() clears the
on-stack variable via the pointer to signal that the particular
request held by the slot has completed. It also clears the in_use
member (as before that fix), so that the slot itself can safely be
reused for an unrelated request.
One thing that is not quite clean in this arrangement is that,
unless the slot gets reused, at which point the finished member is
reset to NULL, the member keeps the value of &finished, which
becomes a dangling pointer into the stack when run_active_slot()
returns. Clear the finished member before the control leaves the
function, which has a side effect of unconfusing compilers like
recent GCC 12 that is over-eager to warn against such an assignment.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-27 03:37:31 +08:00
|
|
|
|
|
|
|
/*
|
|
|
|
* The value of slot->finished we set before the loop was used
|
|
|
|
* to set our "finished" variable when our request completed.
|
|
|
|
*
|
2024-09-20 02:34:27 +08:00
|
|
|
* 1. The slot may not have been reused for another request
|
2022-05-27 03:37:31 +08:00
|
|
|
* yet, in which case it still has &finished.
|
|
|
|
*
|
|
|
|
* 2. The slot may already be in-use to serve another request,
|
|
|
|
* which can further be divided into two cases:
|
|
|
|
*
|
|
|
|
* (a) If run_active_slot() hasn't been called for that
|
|
|
|
* other request, slot->finished would have been cleared
|
|
|
|
* by get_active_slot() and is NULL.
|
|
|
|
*
|
|
|
|
* (b) If the request did call run_active_slot(), then the
|
|
|
|
* call would have updated slot->finished at the beginning
|
|
|
|
* of this function, and with the clearing of the member
|
|
|
|
* below, we would find that slot->finished is now NULL.
|
|
|
|
*
|
|
|
|
* In all cases, slot->finished has no useful information to
|
|
|
|
* anybody at this point. Some compilers warn us about
|
|
|
|
* attempting to smuggle a pointer that is about to become
|
|
|
|
* invalid, i.e. &finished. We clear it here to assure them.
|
|
|
|
*/
|
|
|
|
slot->finished = NULL;
|
2005-11-19 03:02:58 +08:00
|
|
|
}
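The final clearing matters because `finished` lives on run_active_slot()'s own
stack frame. A sketch of the hazard being defended against, for illustration
only:

static void broken_run(struct active_request_slot *slot)
{
	int finished = 0;
	slot->finished = &finished;	/* slot outlives this frame */
	/* ... drive the request ... */
}					/* 'finished' dies here; without the
					 * clearing above, slot->finished would
					 * keep pointing into the dead frame,
					 * and any later write through it is
					 * undefined behavior. */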
|
|
|
|
|
2010-01-12 14:26:08 +08:00
|
|
|
static void release_active_slot(struct active_request_slot *slot)
|
2006-02-07 18:07:39 +08:00
|
|
|
{
|
|
|
|
closedown_active_slot(slot);
|
2016-09-13 08:25:57 +08:00
|
|
|
if (slot->curl) {
|
2016-09-13 08:25:56 +08:00
|
|
|
xmulti_remove_handle(slot);
|
2016-09-13 08:25:57 +08:00
|
|
|
if (curl_session_count > min_curl_sessions) {
|
|
|
|
curl_easy_cleanup(slot->curl);
|
|
|
|
slot->curl = NULL;
|
|
|
|
curl_session_count--;
|
|
|
|
}
|
2006-02-07 18:07:39 +08:00
|
|
|
}
|
|
|
|
fill_active_slots();
|
|
|
|
}
|
|
|
|
|
2005-11-19 03:02:58 +08:00
|
|
|
void finish_all_active_slots(void)
|
|
|
|
{
|
|
|
|
struct active_request_slot *slot = active_queue_head;
|
|
|
|
|
|
|
|
while (slot != NULL)
|
|
|
|
if (slot->in_use) {
|
|
|
|
run_active_slot(slot);
|
|
|
|
slot = active_queue_head;
|
|
|
|
} else {
|
|
|
|
slot = slot->next;
|
|
|
|
}
|
|
|
|
}
|
2007-12-11 07:08:25 +08:00
|
|
|
|
2009-06-06 16:43:43 +08:00
|
|
|
/* Helpers for modifying and creating URLs */
|
2007-12-11 07:08:25 +08:00
|
|
|
static inline int needs_quote(int ch)
|
|
|
|
{
|
|
|
|
if (((ch >= 'A') && (ch <= 'Z'))
|
|
|
|
|| ((ch >= 'a') && (ch <= 'z'))
|
|
|
|
|| ((ch >= '0') && (ch <= '9'))
|
|
|
|
|| (ch == '/')
|
|
|
|
|| (ch == '-')
|
|
|
|
|| (ch == '.'))
|
|
|
|
return 0;
|
|
|
|
return 1;
|
|
|
|
}
|
|
|
|
|
|
|
|
static char *quote_ref_url(const char *base, const char *ref)
|
|
|
|
{
|
2009-03-08 00:47:21 +08:00
|
|
|
struct strbuf buf = STRBUF_INIT;
|
2007-12-11 07:08:25 +08:00
|
|
|
const char *cp;
|
2009-03-08 00:47:21 +08:00
|
|
|
int ch;
|
2007-12-11 07:08:25 +08:00
|
|
|
|
2009-06-06 16:43:43 +08:00
|
|
|
end_url_with_slash(&buf, base);
|
2009-03-08 00:47:21 +08:00
|
|
|
|
|
|
|
for (cp = ref; (ch = *cp) != 0; cp++)
|
2007-12-11 07:08:25 +08:00
|
|
|
if (needs_quote(ch))
|
2009-03-08 00:47:21 +08:00
|
|
|
strbuf_addf(&buf, "%%%02x", ch);
|
2007-12-11 07:08:25 +08:00
|
|
|
else
|
2009-03-08 00:47:21 +08:00
|
|
|
strbuf_addch(&buf, *cp);
|
2007-12-11 07:08:25 +08:00
|
|
|
|
2009-03-08 00:47:21 +08:00
|
|
|
return strbuf_detach(&buf, NULL);
|
2007-12-11 07:08:25 +08:00
|
|
|
}
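needs_quote() admits only [A-Za-z0-9], '/', '-' and '.', so every other byte
of the ref name comes out as a lowercase two-digit percent escape. For
example (the base URL is hypothetical):

/*
 * quote_ref_url("https://example.com/repo.git", "refs/heads/a topic")
 *	-> "https://example.com/repo.git/refs/heads/a%20topic"
 * The caller owns the strbuf_detach()ed result and must free it.
 */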
|
|
|
|
|
http*: add helper methods for fetching objects (loose)
The code handling the fetching of loose objects in http-push.c and
http-walker.c has been refactored into new methods and a new struct
(http_object_request) in http.c. They are not meant to be invoked
elsewhere.
The new methods in http.c are
- new_http_object_request
- process_http_object_request
- finish_http_object_request
- abort_http_object_request
- release_http_object_request
and the new struct is http_object_request.
RANGE_HEADER_SIZE and no_pragma_header are no longer made available
outside of http.c, since after the above changes, there are no other
instances of usage outside of http.c.
Remove members of the transfer_request struct in http-push.c and
http-walker.c, including filename, real_sha1 and zret, as they are
no longer used.
Move the methods append_remote_object_url() and get_remote_object_url()
from http-push.c to http.c. Additionally, get_remote_object_url() is no
longer defined only when USE_CURL_MULTI is defined, since
non-USE_CURL_MULTI code in http.c uses it (namely, in
new_http_object_request()).
Refactor code from http-push.c::start_fetch_loose() and
http-walker.c::start_object_fetch_request() that deals with the details
of coming up with the filename to store the retrieved object, resuming
a previously aborted request, and making a new curl request, into a new
function, new_http_object_request().
Refactor code from http-walker.c::process_object_request() into the
function, process_http_object_request().
Refactor code from http-push.c::finish_request() and
http-walker.c::finish_object_request() into a new function,
finish_http_object_request(). It returns the result of the
move_temp_to_file() invocation.
Add a function, release_http_object_request(), which cleans up object
request data. http-push.c and http-walker.c invoke this function
separately; http-push.c::release_request() and
http-walker.c::release_object_request() do not invoke this function.
Add a function, abort_http_object_request(), which unlink()s the object
file and invokes release_http_object_request(). Update
http-walker.c::abort_object_request() to use this.
Signed-off-by: Tay Ray Chuan <rctay89@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-06 16:44:02 +08:00
|
|
|
void append_remote_object_url(struct strbuf *buf, const char *url,
|
|
|
|
const char *hex,
|
|
|
|
int only_two_digit_prefix)
|
|
|
|
{
|
2009-08-17 17:09:43 +08:00
|
|
|
end_url_with_slash(buf, url);
|
|
|
|
|
|
|
|
strbuf_addf(buf, "objects/%.*s/", 2, hex);
|
2009-06-06 16:44:02 +08:00
|
|
|
if (!only_two_digit_prefix)
|
2016-08-06 04:37:11 +08:00
|
|
|
strbuf_addstr(buf, hex + 2);
|
2009-06-06 16:44:02 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
char *get_remote_object_url(const char *url, const char *hex,
|
|
|
|
int only_two_digit_prefix)
|
|
|
|
{
|
|
|
|
struct strbuf buf = STRBUF_INIT;
|
|
|
|
append_remote_object_url(&buf, url, hex, only_two_digit_prefix);
|
|
|
|
return strbuf_detach(&buf, NULL);
|
|
|
|
}
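These helpers build the loose-object layout used by dumb HTTP, where the
first two hex digits of the object name form a fan-out directory. For
example (URL and object name hypothetical):

/*
 * get_remote_object_url("https://example.com/repo.git", "0a1b2c3d...", 0)
 *	-> "https://example.com/repo.git/objects/0a/1b2c3d..."
 * With only_two_digit_prefix set, the result stops at ".../objects/0a/",
 * which is useful when only the fan-out directory itself is wanted.
 */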
|
|
|
|
|
2019-03-24 20:08:38 +08:00
|
|
|
void normalize_curl_result(CURLcode *result, long http_code,
|
|
|
|
char *errorstr, size_t errorlen)
|
2012-08-27 21:26:04 +08:00
|
|
|
{
|
2013-04-06 06:14:06 +08:00
|
|
|
/*
|
|
|
|
* If we see a failing http code with CURLE_OK, we have turned off
|
|
|
|
* FAILONERROR (to keep the server's custom error response), and should
|
|
|
|
* translate the code into failure here.
|
2016-12-07 02:24:41 +08:00
|
|
|
*
|
|
|
|
* Likewise, if we see a redirect (30x code), that means we turned off
|
|
|
|
* redirect-following, and we should treat the result as an error.
|
2013-04-06 06:14:06 +08:00
|
|
|
*/
|
2019-03-24 20:08:38 +08:00
|
|
|
if (*result == CURLE_OK && http_code >= 300) {
|
|
|
|
*result = CURLE_HTTP_RETURNED_ERROR;
|
2013-04-06 06:14:06 +08:00
|
|
|
/*
|
|
|
|
* Normally curl will already have put the "reason phrase"
|
|
|
|
* from the server into curl_errorstr; unfortunately without
|
|
|
|
* FAILONERROR it is lost, so we can give only the numeric
|
|
|
|
* status code.
|
|
|
|
*/
|
2019-03-24 20:08:38 +08:00
|
|
|
xsnprintf(errorstr, errorlen,
|
2017-03-29 03:46:56 +08:00
|
|
|
"The requested URL returned error: %ld",
|
2019-03-24 20:08:38 +08:00
|
|
|
http_code);
|
2013-04-06 06:14:06 +08:00
|
|
|
}
|
2019-03-24 20:08:38 +08:00
|
|
|
}
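A caller that deliberately disabled FAILONERROR or redirect-following uses
this helper to fold the HTTP status back into the CURLcode before reporting.
A minimal sketch (the buffer and reporting are illustrative):

char errbuf[CURL_ERROR_SIZE] = "";
CURLcode result = results.curl_result;

/* Folds a 3xx/4xx/5xx status into CURLE_HTTP_RETURNED_ERROR and writes
 * a numeric message into errbuf, but only when it converts the code. */
normalize_curl_result(&result, results.http_code, errbuf, sizeof(errbuf));
if (result != CURLE_OK)
	error("HTTP request failed: %s", errbuf);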
|
|
|
|
|
|
|
|
static int handle_curl_result(struct slot_results *results)
|
|
|
|
{
|
|
|
|
normalize_curl_result(&results->curl_result, results->http_code,
|
|
|
|
curl_errorstr, sizeof(curl_errorstr));
|
2013-04-06 06:14:06 +08:00
|
|
|
|
2012-08-27 21:26:04 +08:00
|
|
|
if (results->curl_result == CURLE_OK) {
|
|
|
|
credential_approve(&http_auth);
|
2021-03-12 10:40:27 +08:00
|
|
|
credential_approve(&proxy_auth);
|
2021-03-12 10:40:26 +08:00
|
|
|
credential_approve(&cert_auth);
|
2012-08-27 21:26:04 +08:00
|
|
|
return HTTP_OK;
|
2021-03-12 10:40:26 +08:00
|
|
|
} else if (results->curl_result == CURLE_SSL_CERTPROBLEM) {
|
|
|
|
/*
|
|
|
|
* We can't tell from here whether it's a bad path, bad
|
|
|
|
* certificate, bad password, or something else wrong
|
|
|
|
* with the certificate. So we reject the credential to
|
|
|
|
* avoid caching or saving a bad password.
|
|
|
|
*/
|
|
|
|
credential_reject(&cert_auth);
|
|
|
|
return HTTP_NOAUTH;
|
2021-09-24 18:08:20 +08:00
|
|
|
#ifdef GIT_CURL_HAVE_CURLE_SSL_PINNEDPUBKEYNOTMATCH
|
|
|
|
} else if (results->curl_result == CURLE_SSL_PINNEDPUBKEYNOTMATCH) {
|
|
|
|
return HTTP_NOMATCHPUBLICKEY;
|
|
|
|
#endif
|
2012-08-27 21:26:04 +08:00
|
|
|
} else if (missing_target(results))
|
|
|
|
return HTTP_MISSING_TARGET;
|
|
|
|
else if (results->http_code == 401) {
|
2024-04-17 08:02:32 +08:00
|
|
|
if ((http_auth.username && http_auth.password) ||
|
|
|
|
(http_auth.authtype && http_auth.credential)) {
|
credential: add support for multistage credential rounds
Over HTTP, NTLM and Kerberos require two rounds of authentication on the
client side. It's possible that there are custom authentication schemes
that also implement this same approach. Since these are tricky schemes
to implement and the HTTP library in use may not always handle them
gracefully on all systems, it would be helpful to allow the credential
helper to implement them instead for increased portability and
robustness.
To allow this to happen, add a boolean flag, continue, that indicates
that instead of failing when we get a 401, we should retry another round
of authentication. However, this necessitates some changes in our
current credential code so that we can make this work.
Keep the state[] headers between iterations, but only use them to send
to the helper and only consider the new ones we read from the credential
helper to be valid on subsequent iterations. That avoids us passing
stale data when we finally approve or reject the credential. Similarly,
clear the multistage and wwwauth[] values appropriately so that we
don't pass stale data or think we're trying a multiround response when
we're not. Remove the credential values so that we can actually fill a
second time with new responses.
Limit the number of iterations of reauthentication we do to 3. This
means that if there's a problem, we'll terminate with an error message
instead of retrying indefinitely and not informing the user (and
possibly conducting a DoS on the server).
In our tests, handle creating multiple response output files from our
helper so we can verify that each of the messages sent is correct.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-17 08:02:37 +08:00
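/*
 * Multistage schemes (e.g. NTLM, Kerberos) treat this 401 as "send
 * the next round", not as failure: drop only the secrets, keep the
 * accumulated state[] for the helper, and have the caller retry
 * (callers cap this at three rounds).
 */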
|
|
|
if (http_auth.multistage) {
|
|
|
|
credential_clear_secrets(&http_auth);
|
|
|
|
return HTTP_REAUTH;
|
|
|
|
}
|
2012-08-27 21:26:04 +08:00
|
|
|
credential_reject(&http_auth);
|
http: allow authenticating proactively
When making a request over HTTP(S), Git only sends authentication if it
receives a 401 response. Thus, if a repository is open to the public
for reading, Git will typically never ask for authentication for fetches
and clones.
However, there may be times when a user would like to authenticate
nevertheless. For example, a forge may give higher rate limits to users
who authenticate because they are easier to contact in case of excessive
use. Or it may be useful for a known heavy user, such as an internal
service, to proactively authenticate so its use can be monitored and, if
necessary, throttled.
Let's make this possible with a new option, "http.proactiveAuth". This
option specifies a type of authentication which can be used to
authenticate against the host in question. This is necessary because we
lack the WWW-Authenticate header to provide us details; similarly, we
cannot accept certain types of authentication because we require
information from the server, such as a nonce or challenge, to
successfully authenticate.
If we're in auto mode and we got a username and password, set the
authentication scheme to Basic. libcurl will not send authentication
proactively unless there's a single choice of allowed authentication,
and we know in this case we didn't get an authtype entry telling us what
scheme to use, or we would have taken a different codepath and written
the header ourselves. In any event, of the other schemes that libcurl
supports, Digest and NTLM require a nonce or challenge, which means that
they cannot work with proactive auth, and GSSAPI does not use a username
and password at all, so Basic is the only logical choice among the
built-in options.
Note that the existing http_proactive_auth variable signifies proactive
auth if there are already credentials, which is different from the
functionality we're adding, which always seeks credentials even if none
are provided. Nonetheless, t5540 tests the existing behavior for
WebDAV-based pushes to an open repository without credentials, so we
preserve it. While at first this may seem an insecure and bizarre
decision, it may be that authentication is done with TLS certificates,
in which case it might actually provide a quite high level of security.
Expand the variable to use an enum to handle the additional cases and a
helper function to distinguish our new cases from the old ones.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-10 08:01:55 +08:00
|
|
|
if (always_auth_proactively())
|
|
|
|
http_proactive_auth = PROACTIVE_AUTH_NONE;
|
2012-08-27 21:26:04 +08:00
|
|
|
return HTTP_NOAUTH;
|
|
|
|
} else {
|
2021-05-18 14:27:42 +08:00
|
|
|
http_auth_methods &= ~CURLAUTH_GSSNEGOTIATE;
|
|
|
|
if (results->auth_avail) {
|
|
|
|
http_auth_methods &= results->auth_avail;
|
|
|
|
http_auth_methods_restricted = 1;
|
|
|
|
}
|
2012-08-27 21:26:04 +08:00
|
|
|
return HTTP_REAUTH;
|
|
|
|
}
|
|
|
|
} else {
|
2016-01-26 21:02:48 +08:00
|
|
|
if (results->http_connectcode == 407)
|
|
|
|
credential_reject(&proxy_auth);
|
2012-08-27 21:26:04 +08:00
|
|
|
if (!curl_errorstr[0])
|
|
|
|
strlcpy(curl_errorstr,
|
|
|
|
curl_easy_strerror(results->curl_result),
|
|
|
|
sizeof(curl_errorstr));
|
|
|
|
return HTTP_ERROR;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
http: never use curl_easy_perform
We currently don't reuse http connections when fetching via
the smart-http protocol. This is bad because the TCP
handshake introduces latency, and especially because SSL
connection setup may be non-trivial.
We can fix it by consistently using curl's "multi"
interface. The reason is rather complicated:
Our http code has two ways of being used: queuing many
"slots" to be fetched in parallel, or fetching a single
request in a blocking manner. The parallel code is built on
curl's "multi" interface. Most of the single-request code
uses http_request, which is built on top of the parallel
code (we just feed it one slot, and wait until it finishes).
However, one could also accomplish the single-request scheme
by avoiding curl's multi interface entirely and just using
curl_easy_perform. This is simpler, and is used by post_rpc
in the smart-http protocol.
It does work to use the same curl handle in both contexts,
as long as it is not at the same time. However, internally
curl may not share all of the cached resources between both
contexts. In particular, a connection formed using the
"multi" code will go into a reuse pool connected to the
"multi" object. Further requests using the "easy" interface
will not be able to reuse that connection.
The smart http protocol does ref discovery via http_request,
which uses the "multi" interface, and then follows up with
the "easy" interface for its rpc calls. As a result, we make
two HTTP connections rather than reusing a single one.
We could teach the ref discovery to use the "easy"
interface. But it is only once we have done this discovery
that we know whether the protocol will be smart or dumb. If
it is dumb, then our further requests, which want to fetch
objects in parallel, will not be able to reuse the same
connection.
Instead, this patch switches post_rpc to build on the
parallel interface, which means that we use it consistently
everywhere. It's a little more complicated to use, but since
we have the infrastructure already, it doesn't add any code;
we can just factor out the relevant bits from http_request.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-18 18:34:20 +08:00
int run_one_slot(struct active_request_slot *slot,
		 struct slot_results *results)
{
	slot->results = results;
	if (!start_active_slot(slot)) {
		xsnprintf(curl_errorstr, sizeof(curl_errorstr),
			  "failed to start HTTP request");
		return HTTP_START_FAILED;
	}

	run_active_slot(slot);
	return handle_curl_result(results);
}
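The pattern run_one_slot() builds on — one blocking request driven through curl's multi interface — looks roughly like this in plain libcurl (a sketch, not this file's slot machinery):

/* Sketch: perform one blocking request via the "multi" interface,
 * so the resulting connection lands in the multi handle's reuse pool. */
CURLM *multi = curl_multi_init();
int still_running = 1;

curl_multi_add_handle(multi, easy); /* "easy" is a configured handle */
while (still_running) {
	curl_multi_perform(multi, &still_running);
	if (still_running)
		curl_multi_wait(multi, NULL, 0, 100, NULL);
}
curl_multi_remove_handle(multi, easy);
curl_multi_cleanup(multi);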
struct curl_slist *http_copy_default_headers(void)
{
remote-curl: unbreak http.extraHeader with custom allocators
In 93b980e58f5 (http: use xmalloc with cURL, 2019-08-15), we started to
ask cURL to use `xmalloc()`, and if compiled with nedmalloc, that means
implicitly a different allocator than the system one.
Which means that all of cURL's allocations and releases now _need_ to
use that allocator.
However, the `http_options()` function used `slist_append()` to add any
configured extra HTTP header(s) _before_ asking cURL to use `xmalloc()`,
and `http_cleanup()` would release them _afterwards_, i.e. in the
presence of custom allocators, cURL would attempt to use the wrong
allocator to release the memory.
A naïve attempt at fixing this would move the call to
`curl_global_init()` _before_ the config is parsed (i.e. before that
call to `slist_append()`).
However, that does not work, as we _also_ parse the config setting
`http.sslbackend` and if found, call `curl_global_sslset()` which *must*
be called before `curl_global_init()`, for details see:
https://curl.haxx.se/libcurl/c/curl_global_sslset.html
So let's instead make the config parsing entirely independent from
cURL's data structures. Incidentally, this deletes two more lines than
it introduces, which is nice.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-11-06 18:04:55 +08:00
	struct curl_slist *headers = NULL;
	const struct string_list_item *item;

	for_each_string_list_item(item, &extra_http_headers)
		headers = curl_slist_append(headers, item->string);

	return headers;
}

static CURLcode curlinfo_strbuf(CURL *curl, CURLINFO info, struct strbuf *buf)
{
	char *ptr;
	CURLcode ret;

	strbuf_reset(buf);
	ret = curl_easy_getinfo(curl, info, &ptr);
	if (!ret && ptr)
		strbuf_addstr(buf, ptr);
	return ret;
}

/*
 * Check for and extract a content-type parameter. "raw"
 * should be positioned at the start of the potential
 * parameter, with any whitespace already removed.
 *
 * "name" is the name of the parameter. The value is appended
 * to "out".
 */
static int extract_param(const char *raw, const char *name,
			 struct strbuf *out)
{
	size_t len = strlen(name);

	if (strncasecmp(raw, name, len))
		return -1;
	raw += len;

	if (*raw != '=')
		return -1;
	raw++;

	while (*raw && !isspace(*raw) && *raw != ';')
		strbuf_addch(out, *raw++);
	return 0;
}

/*
 * Extract a normalized version of the content type, with any
 * spaces suppressed, all letters lowercased, and no trailing ";"
 * or parameters.
 *
 * Note that we will silently remove even invalid whitespace. For
 * example, "text / plain" is specifically forbidden by RFC 2616,
 * but "text/plain" is the only reasonable output, and this keeps
 * our code simple.
 *
 * If the "charset" argument is not NULL, store the value of any
 * charset parameter there.
 *
 * Example:
 *   "TEXT/PLAIN; charset=utf-8" -> "text/plain", "utf-8"
 *   "text / plain" -> "text/plain"
 */
static void extract_content_type(struct strbuf *raw, struct strbuf *type,
				 struct strbuf *charset)
{
	const char *p;

	strbuf_reset(type);
	strbuf_grow(type, raw->len);
	for (p = raw->buf; *p; p++) {
		if (isspace(*p))
			continue;
		if (*p == ';') {
			p++;
			break;
		}
		strbuf_addch(type, tolower(*p));
	}

	if (!charset)
		return;

	strbuf_reset(charset);
	while (*p) {
		while (isspace(*p) || *p == ';')
			p++;
		if (!extract_param(p, "charset", charset))
			return;
		while (*p && !isspace(*p))
			p++;
	}

	if (!charset->len && starts_with(type->buf, "text/"))
		strbuf_addstr(charset, "ISO-8859-1");
}
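For instance, an illustrative call matching the example in the comment above:

/* Illustrative: parse a Content-Type header value. */
struct strbuf raw = STRBUF_INIT, type = STRBUF_INIT, charset = STRBUF_INIT;

strbuf_addstr(&raw, "TEXT/PLAIN; charset=utf-8");
extract_content_type(&raw, &type, &charset);
/* type.buf is now "text/plain" and charset.buf is "utf-8" */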
static void write_accept_language(struct strbuf *buf)
{
	/*
	 * MAX_DECIMAL_PLACES must not be larger than 3. If it is larger than
	 * that, q-value will be smaller than 0.001, the minimum q-value the
	 * HTTP specification allows. See
	 * https://datatracker.ietf.org/doc/html/rfc7231#section-5.3.1 for q-value.
	 */
	const int MAX_DECIMAL_PLACES = 3;
	const int MAX_LANGUAGE_TAGS = 1000;
	const int MAX_ACCEPT_LANGUAGE_HEADER_SIZE = 4000;
	char **language_tags = NULL;
	int num_langs = 0;
	const char *s = get_preferred_languages();
	int i;
	struct strbuf tag = STRBUF_INIT;

	/* Don't add Accept-Language header if no language is preferred. */
	if (!s)
		return;

	/*
	 * Split the colon-separated string of preferred languages into
	 * language_tags array.
	 */
	do {
		/* collect language tag */
		for (; *s && (isalnum(*s) || *s == '_'); s++)
			strbuf_addch(&tag, *s == '_' ? '-' : *s);

		/* skip .codeset, @modifier and any other unnecessary parts */
		while (*s && *s != ':')
			s++;

		if (tag.len) {
			num_langs++;
			REALLOC_ARRAY(language_tags, num_langs);
			language_tags[num_langs - 1] = strbuf_detach(&tag, NULL);
			if (num_langs >= MAX_LANGUAGE_TAGS - 1) /* -1 for '*' */
				break;
		}
	} while (*s++);

	/* write Accept-Language header into buf */
	if (num_langs) {
		int last_buf_len = 0;
		int max_q;
		int decimal_places;
		char q_format[32];

		/* add '*' */
		REALLOC_ARRAY(language_tags, num_langs + 1);
		language_tags[num_langs++] = xstrdup("*");

		/* compute decimal_places */
		for (max_q = 1, decimal_places = 0;
		     max_q < num_langs && decimal_places <= MAX_DECIMAL_PLACES;
		     decimal_places++, max_q *= 10)
			;

		xsnprintf(q_format, sizeof(q_format), ";q=0.%%0%dd", decimal_places);

		strbuf_addstr(buf, "Accept-Language: ");

		for (i = 0; i < num_langs; i++) {
			if (i > 0)
				strbuf_addstr(buf, ", ");

			strbuf_addstr(buf, language_tags[i]);

			if (i > 0)
				strbuf_addf(buf, q_format, max_q - i);

			if (buf->len > MAX_ACCEPT_LANGUAGE_HEADER_SIZE) {
				strbuf_remove(buf, last_buf_len, buf->len - last_buf_len);
				break;
			}

			last_buf_len = buf->len;
		}
	}

	for (i = 0; i < num_langs; i++)
		free(language_tags[i]);
	free(language_tags);
}

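An illustrative trace of the splitting loop above:

/*
 * LANGUAGE="ko_KR.UTF-8:sr@latin"
 *   -> language_tags = { "ko-KR", "sr" }
 * ('_' becomes '-'; ".UTF-8" and "@latin" are skipped up to the
 * next ':', per the loop above.)
 */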
/*
 * Get an Accept-Language header which indicates user's preferred languages.
 *
 * Examples:
 *   LANGUAGE= -> ""
 *   LANGUAGE=ko:en -> "Accept-Language: ko, en; q=0.9, *; q=0.1"
 *   LANGUAGE=ko_KR.UTF-8:sr@latin -> "Accept-Language: ko-KR, sr; q=0.9, *; q=0.1"
 *   LANGUAGE=ko LANG=en_US.UTF-8 -> "Accept-Language: ko, *; q=0.1"
 *   LANGUAGE= LANG=en_US.UTF-8 -> "Accept-Language: en-US, *; q=0.1"
 *   LANGUAGE= LANG=C -> ""
 */
const char *http_get_accept_language_header(void)
{
	if (!cached_accept_language) {
		struct strbuf buf = STRBUF_INIT;
		write_accept_language(&buf);
		if (buf.len > 0)
			cached_accept_language = strbuf_detach(&buf, NULL);
	}

	return cached_accept_language;
}

static void http_opt_request_remainder(CURL *curl, off_t pos)
{
	char buf[128];
	xsnprintf(buf, sizeof(buf), "%"PRIuMAX"-", (uintmax_t)pos);
	curl_easy_setopt(curl, CURLOPT_RANGE, buf);
}
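A sketch of the resume pattern this helper enables (the file name is hypothetical; the real call site is http_request() below):

/* Illustrative: ask only for the bytes we do not have yet. */
FILE *out = fopen("partial.tmp", "ab"); /* hypothetical download target */
off_t pos = ftello(out);

if (pos > 0)
	http_opt_request_remainder(curl, pos); /* curl sends "Range: bytes=<pos>-" */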
/* http_request() targets */
#define HTTP_REQUEST_STRBUF 0
#define HTTP_REQUEST_FILE 1

static int http_request(const char *url,
			void *result, int target,
			const struct http_get_options *options)
{
	struct active_request_slot *slot;
	struct slot_results results;
	struct curl_slist *headers = http_copy_default_headers();
	struct strbuf buf = STRBUF_INIT;
	const char *accept_language;
	int ret;

	slot = get_active_slot();
	curl_easy_setopt(slot->curl, CURLOPT_HTTPGET, 1);

	if (!result) {
		curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 1);
	} else {
		curl_easy_setopt(slot->curl, CURLOPT_NOBODY, 0);
		curl_easy_setopt(slot->curl, CURLOPT_WRITEDATA, result);

		if (target == HTTP_REQUEST_FILE) {
			off_t posn = ftello(result);
			curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION,
					 fwrite);
			if (posn > 0)
				http_opt_request_remainder(slot->curl, posn);
		} else
			curl_easy_setopt(slot->curl, CURLOPT_WRITEFUNCTION,
					 fwrite_buffer);
	}

	curl_easy_setopt(slot->curl, CURLOPT_HEADERFUNCTION, fwrite_wwwauth);

	accept_language = http_get_accept_language_header();
	if (accept_language)
		headers = curl_slist_append(headers, accept_language);

	strbuf_addstr(&buf, "Pragma:");
	if (options && options->no_cache)
		strbuf_addstr(&buf, " no-cache");
http: make redirects more obvious
We instruct curl to always follow HTTP redirects. This is
convenient, but it creates opportunities for malicious
servers to create confusing situations. For instance,
imagine Alice is a git user with access to a private
repository on Bob's server. Mallory runs her own server and
wants to access objects from Bob's repository.
Mallory may try a few tricks that involve asking Alice to
clone from her, build on top, and then push the result:
1. Mallory may simply redirect all fetch requests to Bob's
server. Git will transparently follow those redirects
and fetch Bob's history, which Alice may believe she
got from Mallory. The subsequent push seems like it is
just feeding Mallory back her own objects, but is
actually leaking Bob's objects. There is nothing in
git's output to indicate that Bob's repository was
involved at all.
The downside (for Mallory) of this attack is that Alice
will have received Bob's entire repository, and is
likely to notice that when building on top of it.
2. If Mallory happens to know the sha1 of some object X in
Bob's repository, she can instead build her own history
that references that object. She then runs a dumb http
server, and Alice's client will fetch each object
individually. When it asks for X, Mallory redirects her
to Bob's server. The end result is that Alice obtains
objects from Bob, but they may be buried deep in
history. Alice is less likely to notice.
Both of these attacks are fairly hard to pull off. There's a
social component in getting Mallory to convince Alice to
work with her. Alice may be prompted for credentials in
accessing Bob's repository (but not always, if she is using
a credential helper that caches). Attack (1) requires a
certain amount of obliviousness on Alice's part while making
a new commit. Attack (2) requires that Mallory knows a sha1
in Bob's repository, that Bob's server supports dumb http,
and that the object in question is loose on Bob's server.
But we can probably make things a bit more obvious without
any loss of functionality. This patch does two things to
that end.
First, when we encounter a whole-repo redirect during the
initial ref discovery, we now inform the user on stderr,
making attack (1) much more obvious.
Second, the decision to follow redirects is now
configurable. The truly paranoid can set the new
http.followRedirects to false to avoid any redirection
entirely. But for a more practical default, we will disallow
redirects only after the initial ref discovery. This is
enough to thwart attacks similar to (2), while still
allowing the common use of redirects at the repository
level. Since c93c92f30 (http: update base URLs when we see
redirects, 2013-09-28) we re-root all further requests from
the redirect destination, which should generally mean that
no further redirection is necessary.
As an escape hatch, in case there really is a server that
needs to redirect individual requests, the user can set
http.followRedirects to "true" (and this can be done on a
per-server basis via http.*.followRedirects config).
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-07 02:24:41 +08:00
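The configuration side might be parsed roughly like this (an assumed sketch of the http.followRedirects handling; the real option parsing lives elsewhere in this file):

/* Assumed sketch: mapping http.followRedirects onto http_follow_config. */
if (value && !strcmp(value, "initial"))
	http_follow_config = HTTP_FOLLOW_INITIAL;
else if (git_config_bool(var, value))
	http_follow_config = HTTP_FOLLOW_ALWAYS;
else
	http_follow_config = HTTP_FOLLOW_NONE;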
	if (options && options->initial_request &&
	    http_follow_config == HTTP_FOLLOW_INITIAL)
		curl_easy_setopt(slot->curl, CURLOPT_FOLLOWLOCATION, 1);

	headers = curl_slist_append(headers, buf.buf);

	/* Add additional headers here */
	if (options && options->extra_headers) {
		const struct string_list_item *item;
http: add support for authtype and credential
Now that we have the credential helper code set up to handle arbitrary
authentication schemes, let's add support for this in the HTTP code,
where we really want to use it. If we're using this new functionality,
don't set a username and password, and instead set a header wherever
we'd normally do so, including for proxy authentication.
Since we can now handle this case, ask the credential helper to enable
the appropriate capabilities.
Finally, if we're using the authtype value, set "Expect: 100-continue".
Any type of authentication that requires multiple rounds (such as NTLM
or Kerberos) requires a 100 Continue (if we're larger than
http.postBuffer) because otherwise we send the pack data before we're
authenticated, the push gets a 401 response, and we can't rewind the
stream. We don't know for certain what other custom schemes might
require this, the HTTP/1.1 standard has required handling this since
1999, the broken HTTP server for which we disabled this (Google's) is
now fixed and has been for some time, and libcurl has a 1-second
fallback in case the HTTP server is still broken. In addition, it is
not unreasonable to require compliance with a 25-year old standard to
use new Git features. For all of these reasons, do so here.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-17 08:02:32 +08:00
		for_each_string_list_item(item, options->extra_headers) {
			headers = curl_slist_append(headers, item->string);
		}
	}

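The helper called just below might take roughly this shape (an assumed sketch, not the actual http_append_auth_header() implementation):

/* Assumed sketch: emit a ready-made Authorization header when the
 * credential helper supplied an authtype/credential pair. */
if (cred->authtype && cred->credential) {
	struct strbuf auth = STRBUF_INIT;

	strbuf_addf(&auth, "Authorization: %s %s",
		    cred->authtype, cred->credential);
	headers = curl_slist_append(headers, auth.buf);
	strbuf_release(&auth);
}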
	headers = http_append_auth_header(&http_auth, headers);

	curl_easy_setopt(slot->curl, CURLOPT_URL, url);
	curl_easy_setopt(slot->curl, CURLOPT_HTTPHEADER, headers);
	curl_easy_setopt(slot->curl, CURLOPT_ENCODING, "");
	curl_easy_setopt(slot->curl, CURLOPT_FAILONERROR, 0);

	ret = run_one_slot(slot, &results);

	if (options && options->content_type) {
		struct strbuf raw = STRBUF_INIT;
		curlinfo_strbuf(slot->curl, CURLINFO_CONTENT_TYPE, &raw);
		extract_content_type(&raw, options->content_type,
				     options->charset);
		strbuf_release(&raw);
	}

	if (options && options->effective_url)
		curlinfo_strbuf(slot->curl, CURLINFO_EFFECTIVE_URL,
				options->effective_url);

	curl_slist_free_all(headers);
	strbuf_release(&buf);

	return ret;
}

http: update base URLs when we see redirects
If a caller asks the http_get_* functions to go to a
particular URL and we end up elsewhere due to a redirect,
the effective_url field can tell us where we went.
It would be nice to remember this redirect and short-cut
further requests for two reasons:
1. It's more efficient. Otherwise we spend an extra http
round-trip to the server for each subsequent request,
just to get redirected.
2. If we end up with an http 401 and are going to ask for
credentials, it is to feed them to the redirect target.
If the redirect is an http->https upgrade, this means
our credentials may be provided on the http leg, just
to end up redirected to https. And if the redirect
crosses server boundaries, then curl will drop the
credentials entirely as it follows the redirect.
However, it is not enough to simply record the effective
URL we saw and use that for subsequent requests. We were
originally fed a "base" url like:
http://example.com/foo.git
and we want to figure out what the new base is, even though
the URLs we see may be:
original: http://example.com/foo.git/info/refs
effective: http://example.com/bar.git/info/refs
Subsequent requests will not be for "info/refs", but for
other paths relative to the base. We must ask the caller to
pass in the original base, and we must pass the redirected
base back to the caller (so that it can generate more URLs
from it). Furthermore, we need to feed the new base to the
credential code, so that requests to credential helpers (or
to the user) match the URL we will be requesting.
This patch teaches http_request_reauth to do this munging.
Since it is the caller who cares about making more URLs, it
seems at first glance that callers could simply check
effective_url themselves and handle it. However, since we
need to update the credential struct before the second
re-auth request, we have to do it inside http_request_reauth.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2013-09-28 16:34:05 +08:00
http: always update the base URL for redirects
If a malicious server redirects the initial ref
advertisement, it may be able to leak sha1s from other,
unrelated servers that the client has access to. For
example, imagine that Alice is a git user, she has access to
a private repository on a server hosted by Bob, and Mallory
runs a malicious server and wants to find out about Bob's
private repository.
Mallory asks Alice to clone an unrelated repository from her
over HTTP. When Alice's client contacts Mallory's server for
the initial ref advertisement, the server issues an HTTP
redirect for Bob's server. Alice contacts Bob's server and
gets the ref advertisement for the private repository. If
there is anything to fetch, she then follows up by asking
the server for one or more sha1 objects. But who is the
server?
If it is still Mallory's server, then Alice will leak the
existence of those sha1s to her.
Since commit c93c92f30 (http: update base URLs when we see
redirects, 2013-09-28), the client usually rewrites the base
URL such that all further requests will go to Bob's server.
But this is done by textually matching the URL. If we were
originally looking for "http://mallory/repo.git/info/refs",
and we got pointed at "http://bob/other.git/info/refs", then
we know that the right root is "http://bob/other.git".
If the redirect appears to change more than just the root,
we punt and continue to use the original server. E.g.,
imagine the redirect adds a URL component that Bob's server
will ignore, like "http://bob/other.git/info/refs?dummy=1".
We can solve this by aborting in this case rather than
silently continuing to use Mallory's server. In addition to
protecting from sha1 leakage, it's arguably safer and more
sane to refuse a confusing redirect like that in general.
For example, part of the motivation in c93c92f30 is
avoiding accidentally sending credentials over clear http,
just to get a response that says "try again over https". So
even in a non-malicious case, we'd prefer to err on the side
of caution.
The downside is that it's possible this will break a
legitimate but complicated server-side redirection scheme.
The setup given in the newly added test does work, but it's
convoluted enough that we don't need to care about it. A
more plausible case would be a server which redirects a
request for "info/refs?service=git-upload-pack" to just
"info/refs" (because it does not do smart HTTP, and for some
reason really dislikes query parameters). Right now we
would transparently downgrade to dumb-http, but with this
patch, we'd complain (and the user would have to set
GIT_SMART_HTTP=0 to fetch).
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-07 02:24:35 +08:00

/*
 * Update the "base" url to a more appropriate value, as deduced by
 * redirects seen when requesting a URL starting with "url".
 *
 * The "asked" parameter is a URL that we asked curl to access, and must begin
 * with "base".
 *
 * The "got" parameter is the URL that curl reported to us as where we ended
 * up.
 *
 * Returns 1 if we updated the base url, 0 otherwise.
 *
 * Our basic strategy is to compare "base" and "asked" to find the bits
 * specific to our request. We then strip those bits off of "got" to yield the
 * new base. So for example, if our base is "http://example.com/foo.git",
 * and we ask for "http://example.com/foo.git/info/refs", we might end up
 * with "https://other.example.com/foo.git/info/refs". We would want the
 * new URL to become "https://other.example.com/foo.git".
 *
 * Note that this assumes a sane redirect scheme. It's entirely possible
 * in the example above to end up at a URL that does not even end in
* "info/refs". In such a case we die. There's not much we can do, such a
|
|
|
|
* scheme is unlikely to represent a real git repository, and failing to
|
|
|
|
* rewrite the base opens options for malicious redirects to do funny things.
|
 */
static int update_url_from_redirect(struct strbuf *base,
				    const char *asked,
				    const struct strbuf *got)
{
	const char *tail;
	size_t new_len;

	if (!strcmp(asked, got->buf))
		return 0;

	if (!skip_prefix(asked, base->buf, &tail))
		BUG("update_url_from_redirect: %s is not a superset of %s",
		    asked, base->buf);

	new_len = got->len;
	if (!strip_suffix_mem(got->buf, &new_len, tail))
		die(_("unable to update url base from redirection:\n"
		      "  asked for: %s\n"
		      "   redirect: %s"),
		    asked, got->buf);

	strbuf_reset(base);
	strbuf_add(base, got->buf, new_len);

	return 1;
}
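A worked example, following the scenario in the comment above:

/* Illustrative: what the rewrite does for the comment's example. */
struct strbuf base = STRBUF_INIT, got = STRBUF_INIT;

strbuf_addstr(&base, "http://example.com/foo.git");
strbuf_addstr(&got, "https://other.example.com/foo.git/info/refs");
update_url_from_redirect(&base,
			 "http://example.com/foo.git/info/refs", &got);
/* base.buf is now "https://other.example.com/foo.git"; return value is 1 */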
static int http_request_reauth(const char *url,
			       void *result, int target,
			       struct http_get_options *options)
{
credential: add support for multistage credential rounds
Over HTTP, NTLM and Kerberos require two rounds of authentication on the
client side. It's possible that there are custom authentication schemes
that also implement this same approach. Since these are tricky schemes
to implement and the HTTP library in use may not always handle them
gracefully on all systems, it would be helpful to allow the credential
helper to implement them instead for increased portability and
robustness.
To allow this to happen, add a boolean flag, continue, that indicates
that instead of failing when we get a 401, we should retry another round
of authentication. However, this necessitates some changes in our
current credential code so that we can make this work.
Keep the state[] headers between iterations, but only use them to send
to the helper and only consider the new ones we read from the credential
helper to be valid on subsequent iterations. That avoids us passing
stale data when we finally approve or reject the credential. Similarly,
clear the multistage and wwwauth[] values appropriately so that we
don't pass stale data or think we're trying a multiround response when
we're not. Remove the credential values so that we can actually fill a
second time with new responses.
Limit the number of iterations of reauthentication we do to 3. This
means that if there's a problem, we'll terminate with an error message
instead of retrying indefinitely and not informing the user (and
possibly conducting a DoS on the server).
In our tests, handle creating multiple response output files from our
helper so we can verify that each of the messages sent is correct.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-17 08:02:37 +08:00
	int i = 3;
http: allow authenticating proactively
When making a request over HTTP(S), Git only sends authentication if it
receives a 401 response. Thus, if a repository is open to the public
for reading, Git will typically never ask for authentication for fetches
and clones.
However, there may be times when a user would like to authenticate
nevertheless. For example, a forge may give higher rate limits to users
who authenticate because they are easier to contact in case of excessive
use. Or it may be useful for a known heavy user, such as an internal
service, to proactively authenticate so its use can be monitored and, if
necessary, throttled.
Let's make this possible with a new option, "http.proactiveAuth". This
option specifies a type of authentication which can be used to
authenticate against the host in question. This is necessary because we
lack the WWW-Authenticate header to provide us details; similarly, we
cannot accept certain types of authentication because we require
information from the server, such as a nonce or challenge, to
successfully authenticate.
If we're in auto mode and we got a username and password, set the
authentication scheme to Basic. libcurl will not send authentication
proactively unless there's a single choice of allowed authentication,
and we know in this case we didn't get an authtype entry telling us what
scheme to use, or we would have taken a different codepath and written
the header ourselves. In any event, of the other schemes that libcurl
supports, Digest and NTLM require a nonce or challenge, which means that
they cannot work with proactive auth, and GSSAPI does not use a username
and password at all, so Basic is the only logical choice among the
built-in options.
Note that the existing http_proactive_auth variable signifies proactive
auth if there are already credentials, which is different from the
functionality we're adding, which always seeks credentials even if none
are provided. Nonetheless, t5540 tests the existing behavior for
WebDAV-based pushes to an open repository without credentials, so we
preserve it. While at first this may seem an insecure and bizarre
decision, it may be that authentication is done with TLS certificates,
in which case it might actually provide a quite high level of security.
Expand the variable to use an enum to handle the additional cases and a
helper function to distinguish our new cases from the old ones.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-07-10 08:01:55 +08:00
	int ret;

	if (always_auth_proactively())
		credential_fill(&http_auth, 1);

	ret = http_request(url, result, target, options);
http: attempt updating base URL only if no error
http.c supports HTTP redirects of the form
http://foo/info/refs?service=git-upload-pack
-> http://anything
-> http://bar/info/refs?service=git-upload-pack
(that is to say, as long as the Git part of the path and the query
string is preserved in the final redirect destination, the intermediate
steps can have any URL). However, if one of the intermediate steps
results in an HTTP exception, a confusing "unable to update url base
from redirection" message is printed instead of a Curl error message
with the HTTP exception code.
This was introduced by 2 commits. Commit c93c92f ("http: update base
URLs when we see redirects", 2013-09-28) introduced a best-effort
optimization that required checking if only the "base" part of the URL
differed between the initial request and the final redirect destination,
but it performed the check before any HTTP status checking was done. If
something went wrong, the normal code path was still followed, so this
did not cause any confusing error messages until commit 6628eb4 ("http:
always update the base URL for redirects", 2016-12-06), which taught
http to die if the non-"base" part of the URL differed.
Therefore, teach http to check the HTTP status before attempting to
check if only the "base" part of the URL differed. This commit teaches
http_request_reauth to return early without updating options->base_url
upon an error; the only invoker of this function that passes a non-NULL
"options" is remote-curl.c (through "http_get_strbuf"), which only uses
options->base_url for an informational message in the situations that
this commit cares about (that is, when the return value is not HTTP_OK).
The included test checks that the redirect scheme at the beginning of
this commit message works, and that returning a 502 in the middle of the
redirect scheme produces the correct result. Note that this is different
from the test in commit 6628eb4 ("http: always update the base URL for
redirects", 2016-12-06) in that this commit tests that a Git-shaped URL
(http://.../info/refs?service=git-upload-pack) works, whereas commit
6628eb4 tests that a non-Git-shaped URL
(http://.../info/refs/foo?service=git-upload-pack) does not work (even
though Git is processing that URL) and is an error that is fatal, not
silently swallowed.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-28 10:53:11 +08:00
|
|
|
if (ret != HTTP_OK && ret != HTTP_REAUTH)
|
|
|
|
return ret;
|
|
|
|
|
http: update base URLs when we see redirects
2013-09-28 16:34:05 +08:00
|
|
|
if (options && options->effective_url && options->base_url) {
|
|
|
|
if (update_url_from_redirect(options->base_url,
|
|
|
|
url, options->effective_url)) {
|
|
|
|
credential_from_url(&http_auth, options->base_url->buf);
|
|
|
|
url = options->effective_url->buf;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
credential: add support for multistage credential rounds
Over HTTP, NTLM and Kerberos require two rounds of authentication on the
client side. It's possible that there are custom authentication schemes
that also implement this same approach. Since these are tricky schemes
to implement and the HTTP library in use may not always handle them
gracefully on all systems, it would be helpful to allow the credential
helper to implement them instead for increased portability and
robustness.
To allow this to happen, add a boolean flag, continue, that indicates
that instead of failing when we get a 401, we should retry another round
of authentication. However, this necessitates some changes in our
current credential code so that we can make this work.
Keep the state[] headers between iterations, but only use them to send
to the helper and only consider the new ones we read from the credential
helper to be valid on subsequent iterations. That avoids us passing
stale data when we finally approve or reject the credential. Similarly,
clear the multistage and wwwauth[] values appropriately so that we
don't pass stale data or think we're trying a multiround response when
we're not. Remove the credential values so that we can actually fill a
second time with new responses.
Limit the number of iterations of reauthentication we do to 3. This
means that if there's a problem, we'll terminate with an error message
instead of retrying indefinitely and not informing the user (and
possibly conducting a DoS on the server).
In our tests, handle creating multiple response output files from our
helper so we can verify that each of the messages sent is correct.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-17 08:02:37 +08:00
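As an illustration of the new flag, one plausible shape of a helper's first-round reply is shown below (the attribute values are hypothetical placeholders); the trailing continue=1 tells git to expect another round rather than treating the next 401 as a final failure:

	protocol=https
	host=example.com
	authtype=NTLM
	credential=<opaque round-1 token>
	continue=1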
|
|
|
while (ret == HTTP_REAUTH && --i) {
|
|
|
|
/*
|
|
|
|
* The previous request may have put cruft into our output stream; we
|
|
|
|
* should clear it out before making our next request.
|
|
|
|
*/
|
|
|
|
switch (target) {
|
|
|
|
case HTTP_REQUEST_STRBUF:
|
|
|
|
strbuf_reset(result);
|
|
|
|
break;
|
2024-10-16 16:13:18 +08:00
|
|
|
case HTTP_REQUEST_FILE: {
|
|
|
|
FILE *f = result;
|
|
|
|
if (fflush(f)) {
|
credential: add support for multistage credential rounds
2024-04-17 08:02:37 +08:00
|
|
|
error_errno("unable to flush a file");
|
|
|
|
return HTTP_START_FAILED;
|
|
|
|
}
|
2024-10-16 16:13:18 +08:00
|
|
|
rewind(f);
|
|
|
|
if (ftruncate(fileno(f), 0) < 0) {
|
credential: add support for multistage credential rounds
2024-04-17 08:02:37 +08:00
|
|
|
error_errno("unable to truncate a file");
|
|
|
|
return HTTP_START_FAILED;
|
|
|
|
}
|
|
|
|
break;
|
2024-10-16 16:13:18 +08:00
|
|
|
}
|
credential: add support for multistage credential rounds
2024-04-17 08:02:37 +08:00
|
|
|
default:
|
|
|
|
BUG("Unknown http_request target");
|
2013-04-06 06:14:06 +08:00
|
|
|
}
|
http: hoist credential request out of handle_curl_result
When we are handling a curl response code in http_request or
in the remote-curl RPC code, we use the handle_curl_result
helper to translate curl's response into an easy-to-use
code. When we see an HTTP 401, we do one of two things:
1. If we already had a filled-in credential, we mark it as
rejected, and then return HTTP_NOAUTH to indicate to
the caller that we failed.
2. If we didn't, then we ask for a new credential and tell
the caller HTTP_REAUTH to indicate that they may want
to try again.
Rejecting in the first case makes sense; it is the natural
result of the request we just made. However, prompting for
more credentials in the second step does not always make
sense. We do not know for sure that the caller is going to
make a second request, nor are we sure that it will be
to the same URL. Logically, the prompt belongs not to the
request we just finished, but to the request we are (maybe)
about to make.
In practice, it is very hard to trigger any bad behavior.
Currently, if we make a second request, it will always be to
the same URL (even in the face of redirects, because curl
handles the redirects internally). And we almost always
retry on HTTP_REAUTH these days. The one exception is if we
are streaming a large RPC request to the server (e.g., a
pushed packfile), in which case we cannot restart. It's
extremely unlikely to see a 401 response at this stage,
though, as we would typically have seen it when we sent a
probe request, before streaming the data.
This patch drops the automatic prompt out of case 2, and
instead requires the caller to do it. This is a few extra
lines of code, and the bug it fixes is unlikely to come up
in practice. But it is conceptually cleaner, and paves the
way for better handling of credentials across redirects.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
2013-09-28 16:31:45 +08:00
|
|
|
|
credential: add support for multistage credential rounds
2024-04-17 08:02:37 +08:00
|
|
|
credential_fill(&http_auth, 1);
|
http: hoist credential request out of handle_curl_result
2013-09-28 16:31:45 +08:00
|
|
|
|
credential: add support for multistage credential rounds
2024-04-17 08:02:37 +08:00
|
|
|
ret = http_request(url, result, target, options);
|
|
|
|
}
|
|
|
|
return ret;
|
2011-07-18 15:50:14 +08:00
|
|
|
}
|
|
|
|
|
2013-02-01 05:02:07 +08:00
|
|
|
int http_get_strbuf(const char *url,
|
2013-09-28 16:31:23 +08:00
|
|
|
struct strbuf *result,
|
|
|
|
struct http_get_options *options)
|
2009-06-06 16:43:53 +08:00
|
|
|
{
|
2013-09-28 16:31:23 +08:00
|
|
|
return http_request_reauth(url, result, HTTP_REQUEST_STRBUF, options);
|
2009-06-06 16:43:53 +08:00
|
|
|
}
|
|
|
|
|
2010-01-12 14:26:08 +08:00
|
|
|
/*
|
2012-03-28 16:41:54 +08:00
|
|
|
* Downloads a URL and stores the result in the given file.
|
2010-01-12 14:26:08 +08:00
|
|
|
*
|
|
|
|
* If a previously interrupted download is detected (i.e. a previous temporary
|
|
|
|
* file is still around) the download is resumed.
|
|
|
|
*/
|
2022-05-17 04:11:02 +08:00
|
|
|
int http_get_file(const char *url, const char *filename,
|
|
|
|
struct http_get_options *options)
|
2009-06-06 16:43:53 +08:00
|
|
|
{
|
|
|
|
int ret;
|
|
|
|
struct strbuf tmpfile = STRBUF_INIT;
|
|
|
|
FILE *result;
|
|
|
|
|
|
|
|
strbuf_addf(&tmpfile, "%s.temp", filename);
|
|
|
|
result = fopen(tmpfile.buf, "a");
|
2013-09-28 16:31:00 +08:00
|
|
|
if (!result) {
|
2009-06-06 16:43:53 +08:00
|
|
|
error("Unable to open local file %s", tmpfile.buf);
|
|
|
|
ret = HTTP_ERROR;
|
|
|
|
goto cleanup;
|
|
|
|
}
|
|
|
|
|
2013-09-28 16:31:23 +08:00
|
|
|
ret = http_request_reauth(url, result, HTTP_REQUEST_FILE, options);
|
2009-06-06 16:43:53 +08:00
|
|
|
fclose(result);
|
|
|
|
|
2015-08-08 05:40:24 +08:00
|
|
|
if (ret == HTTP_OK && finalize_object_file(tmpfile.buf, filename))
|
2009-06-06 16:43:53 +08:00
|
|
|
ret = HTTP_ERROR;
|
|
|
|
cleanup:
|
|
|
|
strbuf_release(&tmpfile);
|
|
|
|
return ret;
|
|
|
|
}
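A hypothetical caller, for flavor: fetching a remote file with caching disabled, relying on the resume behavior described above (this is a sketch; url is assumed to hold the remote address and "local-copy" is a made-up destination):

	struct http_get_options options = {0};

	options.no_cache = 1;
	if (http_get_file(url, "local-copy", &options) != HTTP_OK)
		error("failed to download %s", url);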
|
|
|
|
|
Make walker.fetch_ref() take a struct ref.
This simplifies a few things, makes a few things slightly more
complicated, but, more importantly, allows that, when struct ref can
represent a symref, http_fetch_ref() can return one.
Incidentally makes the string that http_fetch_ref() gets include "refs/"
(if appropriate), because that's how the name field of struct ref works.
As far as I can tell, the usage in walker:interpret_target() wouldn't have
worked previously, if it ever would have been used, which it wouldn't
(since the fetch process uses the hash instead of the name of the ref
there).
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-04-27 03:53:09 +08:00
|
|
|
int http_fetch_ref(const char *base, struct ref *ref)
|
2007-12-11 07:08:25 +08:00
|
|
|
{
|
2013-09-28 16:31:23 +08:00
|
|
|
struct http_get_options options = {0};
|
2007-12-11 07:08:25 +08:00
|
|
|
char *url;
|
|
|
|
struct strbuf buffer = STRBUF_INIT;
|
2009-06-06 16:43:55 +08:00
|
|
|
int ret = -1;
|
2007-12-11 07:08:25 +08:00
|
|
|
|
2013-09-28 16:31:23 +08:00
|
|
|
options.no_cache = 1;
|
|
|
|
|
Make walker.fetch_ref() take a struct ref.
2008-04-27 03:53:09 +08:00
|
|
|
url = quote_ref_url(base, ref->name);
|
2013-09-28 16:31:23 +08:00
|
|
|
if (http_get_strbuf(url, &buffer, &options) == HTTP_OK) {
|
2009-06-06 16:43:55 +08:00
|
|
|
strbuf_rtrim(&buffer);
|
2019-02-19 08:05:13 +08:00
|
|
|
if (buffer.len == the_hash_algo->hexsz)
|
2015-11-10 10:22:20 +08:00
|
|
|
ret = get_oid_hex(buffer.buf, &ref->old_oid);
|
2013-12-01 04:55:40 +08:00
|
|
|
else if (starts_with(buffer.buf, "ref: ")) {
|
2009-06-06 16:43:55 +08:00
|
|
|
ref->symref = xstrdup(buffer.buf + 5);
|
|
|
|
ret = 0;
|
2007-12-11 07:08:25 +08:00
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
strbuf_release(&buffer);
|
|
|
|
free(url);
|
|
|
|
return ret;
|
|
|
|
}
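For example, a caller resolving HEAD over dumb http could look roughly like this (a sketch; alloc_ref() and free_one_ref() come from remote.h, and the URL is made up):

	struct ref *ref = alloc_ref("HEAD");

	if (!http_fetch_ref("https://example.com/repo.git/", ref)) {
		if (ref->symref)	/* got "ref: refs/heads/..." */
			fprintf(stderr, "symref: %s\n", ref->symref);
		else			/* got a raw object id */
			fprintf(stderr, "oid: %s\n", oid_to_hex(&ref->old_oid));
	}
	free_one_ref(ref);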
|
2009-06-06 16:43:59 +08:00
|
|
|
|
|
|
|
/* Helpers for fetching packs */
|
2019-02-19 08:05:15 +08:00
|
|
|
static char *fetch_pack_index(unsigned char *hash, const char *base_url)
|
2009-06-06 16:43:59 +08:00
|
|
|
{
|
2010-04-19 22:23:10 +08:00
|
|
|
char *url, *tmp;
|
2009-06-06 16:43:59 +08:00
|
|
|
struct strbuf buf = STRBUF_INIT;
|
|
|
|
|
|
|
|
if (http_is_verbose)
|
2019-02-19 08:05:15 +08:00
|
|
|
fprintf(stderr, "Getting index for pack %s\n", hash_to_hex(hash));
|
2009-06-06 16:43:59 +08:00
|
|
|
|
|
|
|
end_url_with_slash(&buf, base_url);
|
2019-02-19 08:05:15 +08:00
|
|
|
strbuf_addf(&buf, "objects/pack/pack-%s.idx", hash_to_hex(hash));
|
2009-06-06 16:43:59 +08:00
|
|
|
url = strbuf_detach(&buf, NULL);
|
|
|
|
|
dumb-http: store downloaded pack idx as tempfile
This patch fixes a regression in b1b8dfde69 (finalize_object_file():
implement collision check, 2024-09-26) where fetching a v1 pack idx file
over the dumb-http protocol would cause the fetch to fail.
The core of the issue is that dumb-http stores the idx we fetch from the
remote at the same path that will eventually hold the idx we generate
from "index-pack --stdin". The sequence is something like this:
0. We realize we need some object X, which we don't have locally, and
nor does the other side have it as a loose object.
1. We download the list of remote packs from objects/info/packs.
2. For each entry in that file, we download each pack index and store
it locally in .git/objects/pack/pack-$hash.idx (the $hash is not
something we can verify yet and is given to us by the remote).
3. We check each pack index we got to see if it has object X. When we
find a match, we download the matching .pack file from the remote
to a tempfile. We feed that to "index-pack --stdin", which
reindexes the pack, rather than trusting that it has what the other
side claims it does. In most cases, this will end up generating the
exact same (byte-for-byte) pack index which we'll store at the same
pack-$hash.idx path, because the index generation and $hash id are
computed based on what's in the packfile. But:
a. The other side might have used other options to generate the
index. For instance we use index v2 by default, but long ago
it was v1 (and you can still ask for v1 explicitly).
b. The other side might even use a different mechanism to
determine $hash. E.g., long ago it was based on the sorted
list of objects in the packfile, but we switched to using the
pack checksum in 1190a1acf8 (pack-objects: name pack files
after trailer hash, 2013-12-05).
The regression we saw in the real world was (3a). A recent client
fetching from a server with a v1 index downloaded that index, then
complained about trying to overwrite it with its own v2 index. This
collision is otherwise harmless; we know we want to replace the remote
version with our local one, but the collision check doesn't realize
that.
There are a few options to fix it:
- we could teach index-pack a command-line option to ignore only pack
idx collisions, and use it when the dumb-http code invokes
index-pack. This would be an awkward thing to expose users to and
would involve a lot of boilerplate to get the option down to the
collision code.
- we could delete the remote .idx file right before running
index-pack. It should be redundant at that point (since we've just
downloaded the matching pack). But it feels risky to delete
something from our own .git/objects based on what the other side has
said. I'm not entirely positive that a malicious server couldn't lie
about which pack-$hash.idx it has and get us to delete something
precious.
- we can stop co-mingling the downloaded idx files in our local
objects directory. This is a slightly bigger change but I think
fixes the root of the problem more directly.
This patch implements the third option. The big design questions are:
where do we store the downloaded files, and how do we manage their
lifetimes?
There are some additional quirks to the dumb-http system we should
consider. Remember that in step 2 we downloaded every pack index, but in
step 3 we may only download some of the matching packs. What happens to
those other idx files now? They sit in the .git/objects/pack directory,
possibly waiting to be used at a later date. That may save bandwidth for
a subsequent fetch, but it also creates a lot of weird corner cases:
- our local object directory now has semi-untrusted .idx files sitting
around, without their matching .pack
- in case 3b, we noted that we might not generate the same hash as the
other side. In that case even if we download the matching pack,
our index-pack invocation will store it in a different
pack-$hash.idx file. And the unmatched .idx will sit there forever.
- if the server repacks, it may delete the old packs. Now we have
these orphaned .idx files sitting around locally that will never be
used (nor deleted).
- if we repack locally we may delete our local version of the server's
pack index and not realize we have it. So we'll download it again,
even though we have all of the objects it mentions.
I think the right solution here is probably some more complex cache
management system: download the remote .idx files to their own storage
directory, mark them as "seen" when we get their matching pack (to avoid
re-downloading even if we repack), and then delete them when the
server's objects/info/refs no longer mentions them.
But since the dumb http protocol is so ancient and so inferior to the
smart http protocol, I don't think it's worth spending a lot of time
creating such a system. For this patch I'm just downloading the idx
files to .git/objects/tmp_pack_*, and marking them as tempfiles to be
deleted when we exit (and due to the name, any we miss due to a crash,
etc, should eventually be removed by "git gc" runs based on timestamps).
That is slightly worse for one case: if we download an idx but not the
matching pack, we won't retain that idx for subsequent runs. But the
flip side is that we're making other cases better (we never hold on to
useless idx files forever). I suspect that worse case does not even come
up often, since it implies that the packs are generated to match
distinct parts of history (i.e., in practice even in a repo with many
packs you're going to end up grabbing all of those packs to do a clone).
If somebody really cares about that, I think the right path forward is a
managed cache directory as above, and this patch is providing the first
step in that direction anyway (by moving things out of the objects/pack/
directory).
There are two test changes. One demonstrates the broken v1 index case
(it double-checks the resulting clone with fsck to be careful, but prior
to this patch it actually fails at the clone step). The other tweaks the
expectation for a test that covers the "slightly worse" case to
accommodate the extra index download.
The code changes are fairly simple. We stop using finalize_object_file()
to copy the remote's index file into place, and leave it as a tempfile.
We give the tempfile a real ".idx" name, since the packfile code expects
that, and thus we make sure it is out of the usual packs/ directory (so
we'd never mistake it for a real local .idx).
We also have to change parse_pack_index(), which creates a temporary
packed_git to access our index (we need this because all of the pack idx
code assumes we have that struct). It reads the index data from the
tempfile, but prior to this patch would speculatively write the
finalized name into the packed_git struct using the pack-$hash we expect
to use.
I was mildly surprised that this worked at all, since we call
verify_pack_index() on the packed_git which mentions the final name
before moving the file into place! But it works because
parse_pack_index() leaves the mmap-ed data in the struct, so the
lazy-open in verify_pack_index() never triggers, and we read from the
tempfile, ignoring the filename in the struct completely. Hacky, but it
works.
After this patch, parse_pack_index() now uses the index filename we pass
in to derive a matching .pack name. This is OK to change because there
are only two callers, both in the dumb http code (and the other passes
in an existing pack-$hash.idx name, so the derived name is going to be
pack-$hash.pack, which is what we were using anyway).
I'll follow up with some more cleanups in that area, but this patch is
sufficient to fix the regression.
Reported-by: fox <fox.gbr@townlong-yak.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
2024-10-25 14:58:06 +08:00
|
|
|
/*
|
|
|
|
* Don't put this into packs/, since it's just temporary and we don't
|
|
|
|
* want to confuse it with our local .idx files. We'll generate our
|
|
|
|
* own index if we choose to download the matching packfile.
|
|
|
|
*
|
|
|
|
* It's tempting to use xmks_tempfile() here, but it's important that
|
|
|
|
* the file not exist, otherwise http_get_file() complains. So we
|
|
|
|
* create a filename that should be unique, and then just register it
|
|
|
|
* as a tempfile so that it will get cleaned up on exit.
|
|
|
|
*
|
|
|
|
* In theory we could hold on to the tempfile and delete these as soon
|
|
|
|
* as we download the matching pack, but it would take a bit of
|
|
|
|
* refactoring. Leaving them until the process ends is probably OK.
|
|
|
|
*/
|
|
|
|
tmp = xstrfmt("%s/tmp_pack_%s.idx",
|
|
|
|
repo_get_object_directory(the_repository),
|
|
|
|
hash_to_hex(hash));
|
|
|
|
register_tempfile(tmp);
|
2010-04-19 22:23:10 +08:00
|
|
|
|
2013-10-25 04:17:19 +08:00
|
|
|
if (http_get_file(url, tmp, NULL) != HTTP_OK) {
|
2012-04-30 08:28:45 +08:00
|
|
|
error("Unable to get pack index %s", url);
|
2017-06-16 07:15:46 +08:00
|
|
|
FREE_AND_NULL(tmp);
|
2010-04-19 22:23:10 +08:00
|
|
|
}
|
2009-06-06 16:43:59 +08:00
|
|
|
|
|
|
|
free(url);
|
2010-04-19 22:23:10 +08:00
|
|
|
return tmp;
|
2009-06-06 16:43:59 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
static int fetch_and_setup_pack_index(struct packed_git **packs_head,
|
|
|
|
unsigned char *sha1, const char *base_url)
|
|
|
|
{
|
2024-10-25 15:00:09 +08:00
|
|
|
struct packed_git *new_pack, *p;
|
2010-04-19 22:23:10 +08:00
|
|
|
char *tmp_idx = NULL;
|
|
|
|
int ret;
|
2009-06-06 16:43:59 +08:00
|
|
|
|
2024-10-25 15:00:09 +08:00
|
|
|
/*
|
|
|
|
* If we already have the pack locally, no need to fetch its index or
|
|
|
|
* even add it to list; we already have all of its objects.
|
|
|
|
*/
|
|
|
|
for (p = get_all_packs(the_repository); p; p = p->next) {
|
|
|
|
if (hasheq(p->hash, sha1, the_repository->hash_algo))
|
|
|
|
return 0;
|
2010-04-19 22:23:10 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
tmp_idx = fetch_pack_index(sha1, base_url);
|
|
|
|
if (!tmp_idx)
|
2009-06-06 16:43:59 +08:00
|
|
|
return -1;
|
|
|
|
|
2010-04-19 22:23:10 +08:00
|
|
|
new_pack = parse_pack_index(sha1, tmp_idx);
|
|
|
|
if (!new_pack) {
|
|
|
|
unlink(tmp_idx);
|
|
|
|
free(tmp_idx);
|
|
|
|
|
2009-06-06 16:43:59 +08:00
|
|
|
return -1; /* parse_pack_index() already issued error message */
|
2010-04-19 22:23:10 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
ret = verify_pack_index(new_pack);
|
dumb-http: store downloaded pack idx as tempfile
2024-10-25 14:58:06 +08:00
|
|
|
if (!ret)
|
2010-04-19 22:23:10 +08:00
|
|
|
close_pack_index(new_pack);
|
|
|
|
free(tmp_idx);
|
|
|
|
if (ret)
|
|
|
|
return -1;
|
|
|
|
|
2009-06-06 16:43:59 +08:00
|
|
|
new_pack->next = *packs_head;
|
|
|
|
*packs_head = new_pack;
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
|
|
|
int http_get_info_packs(const char *base_url, struct packed_git **packs_head)
|
|
|
|
{
|
2013-09-28 16:31:23 +08:00
|
|
|
struct http_get_options options = {0};
|
2019-04-06 02:12:55 +08:00
|
|
|
int ret = 0;
|
|
|
|
char *url;
|
|
|
|
const char *data;
|
2009-06-06 16:43:59 +08:00
|
|
|
struct strbuf buf = STRBUF_INIT;
|
2019-04-06 02:12:55 +08:00
|
|
|
struct object_id oid;
|
2009-06-06 16:43:59 +08:00
|
|
|
|
|
|
|
end_url_with_slash(&buf, base_url);
|
|
|
|
strbuf_addstr(&buf, "objects/info/packs");
|
|
|
|
url = strbuf_detach(&buf, NULL);
|
|
|
|
|
2013-09-28 16:31:23 +08:00
|
|
|
options.no_cache = 1;
|
|
|
|
ret = http_get_strbuf(url, &buf, &options);
|
2009-06-06 16:43:59 +08:00
|
|
|
if (ret != HTTP_OK)
|
|
|
|
goto cleanup;
|
|
|
|
|
|
|
|
data = buf.buf;
|
2019-04-06 02:12:55 +08:00
|
|
|
while (*data) {
|
|
|
|
if (skip_prefix(data, "P pack-", &data) &&
|
|
|
|
!parse_oid_hex(data, &oid, &data) &&
|
|
|
|
skip_prefix(data, ".pack", &data) &&
|
|
|
|
(*data == '\n' || *data == '\0')) {
|
|
|
|
fetch_and_setup_pack_index(packs_head, oid.hash, base_url);
|
|
|
|
} else {
|
|
|
|
data = strchrnul(data, '\n');
|
2009-06-06 16:43:59 +08:00
|
|
|
}
|
2019-04-06 02:12:55 +08:00
|
|
|
if (*data)
|
|
|
|
data++; /* skip past newline */
|
2009-06-06 16:43:59 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
cleanup:
|
|
|
|
free(url);
|
2024-09-25 06:02:27 +08:00
|
|
|
strbuf_release(&buf);
|
2009-06-06 16:43:59 +08:00
|
|
|
return ret;
|
|
|
|
}
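For reference, the objects/info/packs payload parsed by the loop above is a series of "P" lines, one per pack the server advertises (hashes below are hypothetical):

	P pack-6b6d2b6dbb7c65d82c4f25a4372054d06bd9fa07.pack
	P pack-1190a1acf800acf23b162b33f20e0e4a24e5d036.pack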
|
2009-06-06 16:44:01 +08:00
|
|
|
|
|
|
|
void release_http_pack_request(struct http_pack_request *preq)
|
|
|
|
{
|
2022-05-03 00:50:37 +08:00
|
|
|
if (preq->packfile) {
|
2009-06-06 16:44:01 +08:00
|
|
|
fclose(preq->packfile);
|
|
|
|
preq->packfile = NULL;
|
|
|
|
}
|
|
|
|
preq->slot = NULL;
|
2018-05-19 09:56:37 +08:00
|
|
|
strbuf_release(&preq->tmpfile);
|
2024-04-17 08:02:27 +08:00
|
|
|
curl_slist_free_all(preq->headers);
|
2009-06-06 16:44:01 +08:00
|
|
|
free(preq->url);
|
2015-03-21 08:28:06 +08:00
|
|
|
free(preq);
|
2009-06-06 16:44:01 +08:00
|
|
|
}
|
|
|
|
|
2021-02-23 03:20:06 +08:00
|
|
|
static const char *default_index_pack_args[] =
|
|
|
|
{"index-pack", "--stdin", NULL};
|
|
|
|
|
2009-06-06 16:44:01 +08:00
|
|
|
int finish_http_pack_request(struct http_pack_request *preq)
|
|
|
|
{
|
2014-08-20 03:09:35 +08:00
|
|
|
struct child_process ip = CHILD_PROCESS_INIT;
|
2020-06-11 04:57:15 +08:00
|
|
|
int tmpfile_fd;
|
|
|
|
int ret = 0;
|
2009-06-06 16:44:01 +08:00
|
|
|
|
2010-04-18 04:07:37 +08:00
|
|
|
fclose(preq->packfile);
|
|
|
|
preq->packfile = NULL;
|
2009-06-06 16:44:01 +08:00
|
|
|
|
2020-06-11 04:57:15 +08:00
|
|
|
tmpfile_fd = xopen(preq->tmpfile.buf, O_RDONLY);
|
2010-04-19 22:23:09 +08:00
|
|
|
|
|
|
|
ip.git_cmd = 1;
|
2020-06-11 04:57:15 +08:00
|
|
|
ip.in = tmpfile_fd;
|
2021-11-26 06:52:18 +08:00
|
|
|
strvec_pushv(&ip.args, preq->index_pack_args ?
|
|
|
|
preq->index_pack_args :
|
|
|
|
default_index_pack_args);
|
2021-02-23 03:20:06 +08:00
|
|
|
|
|
|
|
if (preq->preserve_index_pack_stdout)
|
2020-06-11 04:57:18 +08:00
|
|
|
ip.out = 0;
|
2021-02-23 03:20:06 +08:00
|
|
|
else
|
2020-06-11 04:57:18 +08:00
|
|
|
ip.no_stdout = 1;
|
2010-04-19 22:23:09 +08:00
|
|
|
|
|
|
|
if (run_command(&ip)) {
|
2020-06-11 04:57:15 +08:00
|
|
|
ret = -1;
|
|
|
|
goto cleanup;
|
2010-04-19 22:23:09 +08:00
|
|
|
}
|
|
|
|
|
2020-06-11 04:57:15 +08:00
|
|
|
cleanup:
|
|
|
|
close(tmpfile_fd);
|
|
|
|
unlink(preq->tmpfile.buf);
|
|
|
|
return ret;
|
2009-06-06 16:44:01 +08:00
|
|
|
}
|
|
|
|
|
http: refactor finish_http_pack_request()
finish_http_pack_request() does multiple tasks, including some
housekeeping on a struct packed_git - (1) closing its index, (2)
removing it from a list, and (3) installing it. These concerns are
independent of fetching a pack through HTTP: they are there only because
(1) the calling code opens the pack's index before deciding to fetch it,
(2) the calling code maintains a list of packfiles that can be fetched,
and (3) the calling code fetches it in order to make use of its objects
in the same process.
In preparation for a subsequent commit, which adds a feature that does
not need any of this housekeeping, remove (1), (2), and (3) from
finish_http_pack_request(). (2) and (3) are now done by a helper
function, and (1) is the responsibility of the caller (in this patch,
done closer to the point where the pack index is opened).
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-06-11 04:57:16 +08:00
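After this split, a fetching caller drives the pieces itself. A sketch modeled loosely on the http-walker code, with hypothetical names:

	/* The caller, not finish_http_pack_request(), does the housekeeping. */
	static int fetch_and_install(struct http_pack_request *preq,
				     struct packed_git *target,
				     struct packed_git **packs)
	{
		int err;

		close_pack_index(target);		/* (1) close the index */
		err = finish_http_pack_request(preq);
		release_http_pack_request(preq);
		if (err)
			return -1;
		http_install_packfile(target, packs);	/* (2)+(3) via helper */
		return 0;
	}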
|
|
|
void http_install_packfile(struct packed_git *p,
|
|
|
|
struct packed_git **list_to_remove_from)
|
|
|
|
{
|
|
|
|
struct packed_git **lst = list_to_remove_from;
|
|
|
|
|
|
|
|
while (*lst != p)
|
|
|
|
lst = &((*lst)->next);
|
|
|
|
*lst = (*lst)->next;
|
2009-06-06 16:44:01 +08:00
|
|
|
|
2018-03-24 01:45:18 +08:00
|
|
|
install_packed_git(the_repository, p);
|
2009-06-06 16:44:01 +08:00
|
|
|
}
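The two-line removal above is the pointer-to-pointer list idiom; spelled out as a stand-alone helper (hypothetical), it reads:

	static void unlink_pack(struct packed_git **head, struct packed_git *p)
	{
		struct packed_git **lst = head;

		/* Walk the links until lst is the pointer that refers to p... */
		while (*lst != p)
			lst = &((*lst)->next);
		/* ...then splice p out; the list head needs no special case. */
		*lst = (*lst)->next;
	}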
|
|
|
|
|
|
|
|
struct http_pack_request *new_http_pack_request(
|
2020-06-11 04:57:18 +08:00
|
|
|
const unsigned char *packed_git_hash, const char *base_url) {
|
|
|
|
|
|
|
|
struct strbuf buf = STRBUF_INIT;
|
|
|
|
|
|
|
|
end_url_with_slash(&buf, base_url);
|
|
|
|
strbuf_addf(&buf, "objects/pack/pack-%s.pack",
|
|
|
|
hash_to_hex(packed_git_hash));
|
|
|
|
return new_direct_http_pack_request(packed_git_hash,
|
|
|
|
strbuf_detach(&buf, NULL));
|
|
|
|
}
|
|
|
|
|
|
|
|
struct http_pack_request *new_direct_http_pack_request(
|
|
|
|
const unsigned char *packed_git_hash, char *url)
|
2009-06-06 16:44:01 +08:00
|
|
|
{
|
2015-11-03 06:10:27 +08:00
|
|
|
off_t prev_posn = 0;
|
2009-06-06 16:44:01 +08:00
|
|
|
struct http_pack_request *preq;
|
|
|
|
|
2021-03-14 00:17:22 +08:00
|
|
|
CALLOC_ARRAY(preq, 1);
|
2018-05-19 09:56:37 +08:00
|
|
|
strbuf_init(&preq->tmpfile, 0);
|
2009-06-06 16:44:01 +08:00
|
|
|
|
2020-06-11 04:57:18 +08:00
|
|
|
preq->url = url;
|
2009-06-06 16:44:01 +08:00
|
|
|
|
2024-10-25 15:00:55 +08:00
|
|
|
odb_pack_name(&preq->tmpfile, packed_git_hash, "pack");
|
|
|
|
strbuf_addstr(&preq->tmpfile, ".temp");
|
2018-05-19 09:56:37 +08:00
|
|
|
preq->packfile = fopen(preq->tmpfile.buf, "a");
|
2009-06-06 16:44:01 +08:00
|
|
|
if (!preq->packfile) {
|
|
|
|
error("Unable to open local file %s for pack",
|
2018-05-19 09:56:37 +08:00
|
|
|
preq->tmpfile.buf);
|
2009-06-06 16:44:01 +08:00
|
|
|
goto abort;
|
|
|
|
}
|
|
|
|
|
|
|
|
preq->slot = get_active_slot();
|
2024-04-17 08:02:27 +08:00
|
|
|
preq->headers = object_request_headers();
|
2021-07-31 01:59:46 +08:00
|
|
|
curl_easy_setopt(preq->slot->curl, CURLOPT_WRITEDATA, preq->packfile);
|
2009-06-06 16:44:01 +08:00
|
|
|
curl_easy_setopt(preq->slot->curl, CURLOPT_WRITEFUNCTION, fwrite);
|
2009-08-10 23:59:55 +08:00
|
|
|
curl_easy_setopt(preq->slot->curl, CURLOPT_URL, preq->url);
|
2024-04-17 08:02:27 +08:00
|
|
|
curl_easy_setopt(preq->slot->curl, CURLOPT_HTTPHEADER, preq->headers);
|
2009-06-06 16:44:01 +08:00
|
|
|
|
|
|
|
/*
|
|
|
|
* If there is data present from a previous transfer attempt,
|
|
|
|
* resume where it left off
|
|
|
|
*/
|
2015-11-03 06:10:27 +08:00
|
|
|
prev_posn = ftello(preq->packfile);
|
2009-06-06 16:44:01 +08:00
|
|
|
if (prev_posn>0) {
|
|
|
|
if (http_is_verbose)
|
|
|
|
fprintf(stderr,
|
2015-11-12 08:07:42 +08:00
|
|
|
"Resuming fetch of pack %s at byte %"PRIuMAX"\n",
|
http: refactor finish_http_pack_request()
2020-06-11 04:57:16 +08:00
|
|
|
hash_to_hex(packed_git_hash),
|
2019-02-19 08:05:03 +08:00
|
|
|
(uintmax_t)prev_posn);
|
2015-11-03 05:39:58 +08:00
|
|
|
http_opt_request_remainder(preq->slot->curl, prev_posn);
|
2009-06-06 16:44:01 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
return preq;
|
|
|
|
|
|
|
|
abort:
|
2018-05-19 09:56:37 +08:00
|
|
|
strbuf_release(&preq->tmpfile);
|
2009-08-10 23:59:55 +08:00
|
|
|
free(preq->url);
|
2009-08-10 23:55:48 +08:00
|
|
|
free(preq);
|
2009-06-06 16:44:01 +08:00
|
|
|
return NULL;
|
|
|
|
}
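The resume path above ultimately boils down to a curl range request. A minimal sketch of the idea behind http_opt_request_remainder() (the exact buffer handling in http.c may differ):

	static void request_remainder(CURL *curl, off_t pos)
	{
		char buf[64];

		/* "N-" asks the server for everything from byte N onward. */
		xsnprintf(buf, sizeof(buf), "%"PRIuMAX"-", (uintmax_t)pos);
		curl_easy_setopt(curl, CURLOPT_RANGE, buf);
	}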
|
http*: add helper methods for fetching objects (loose)
The code handling the fetching of loose objects in http-push.c and
http-walker.c have been refactored into new methods and a new struct
(object_http_request) in http.c. They are not meant to be invoked
elsewhere.
The new methods in http.c are
- new_http_object_request
- process_http_object_request
- finish_http_object_request
- abort_http_object_request
- release_http_object_request
and the new struct is http_object_request.
RANGE_HEADER_SIZE and no_pragma_header are no longer made available
outside of http.c, since after the above changes, there are no other
instances of usage outside of http.c.
Remove members of the transfer_request struct in http-push.c and
http-walker.c, including filename, real_sha1 and zret, as they are
no longer used.
Move the methods append_remote_object_url() and get_remote_object_url()
from http-push.c to http.c. Additionally, get_remote_object_url() is no
longer defined only when USE_CURL_MULTI is defined, since
non-USE_CURL_MULTI code in http.c uses it (namely, in
new_http_object_request()).
Refactor code from http-push.c::start_fetch_loose() and
http-walker.c::start_object_fetch_request() that deals with the details
of coming up with the filename to store the retrieved object, resuming
a previously aborted request, and making a new curl request, into a new
function, new_http_object_request().
Refactor code from http-walker.c::process_object_request() into the
function, process_http_object_request().
Refactor code from http-push.c::finish_request() and
http-walker.c::finish_object_request() into a new function,
finish_http_object_request(). It returns the result of the
move_temp_to_file() invocation.
Add a function, release_http_object_request(), which cleans up object
request data. http-push.c and http-walker.c invoke this function
separately; http-push.c::release_request() and
http-walker.c::release_object_request() do not invoke this function.
Add a function, abort_http_object_request(), which unlink()s the object
file and invokes release_http_object_request(). Update
http-walker.c::abort_object_request() to use this.
Signed-off-by: Tay Ray Chuan <rctay89@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-06 16:44:02 +08:00
|
|
|
|
|
|
|
/* Helpers for fetching objects (loose) */
|
2011-05-03 23:47:27 +08:00
|
|
|
static size_t fwrite_sha1_file(char *ptr, size_t eltsize, size_t nmemb,
|
http*: add helper methods for fetching objects (loose)
2009-06-06 16:44:02 +08:00
|
|
|
			       void *data)
{
	unsigned char expn[4096];
	size_t size = eltsize * nmemb;
	int posn = 0;
	struct http_object_request *freq = data;
	struct active_request_slot *slot = freq->slot;

	if (slot) {
		CURLcode c = curl_easy_getinfo(slot->curl, CURLINFO_HTTP_CODE,
					       &slot->http_code);
		if (c != CURLE_OK)
			BUG("curl_easy_getinfo for HTTP code failed: %s",
			    curl_easy_strerror(c));

http-walker: complain about non-404 loose object errors
Since commit 17966c0a6 (http: avoid disconnecting on 404s
for loose objects, 2016-07-11), we turn off curl's
FAILONERROR option and instead manually deal with failing
HTTP codes.
However, the logic to do so only recognizes HTTP 404 as a
failure. This is probably the most common result, but if we
were to get another code, the curl result remains CURLE_OK,
and we treat it as success. We still end up detecting the
failure when we try to zlib-inflate the object (which will
fail), but instead of reporting the HTTP error, we just
claim that the object is corrupt.
Instead, let's catch anything in the 300's or above as an
error (300's are redirects which are not an error at the
HTTP level, but are an indication that we've explicitly
disabled redirects, so we should treat them as such; we
certainly don't have the resulting object content).
Note that we also fill in req->errorstr, which we didn't do
before. Without FAILONERROR, curl will not have filled this
in, and it will remain a blank string. This never mattered
for the 404 case, because in the logic below we hit the
"missing_target()" branch and print nothing. But for other
errors, we'd want to say _something_, if only to fill in the
blank slot in the error message.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-07 02:25:39 +08:00
		if (slot->http_code >= 300)
Make fread/fwrite-like functions in http.c more like fread/fwrite.
The fread/fwrite-like functions in http.c, namely fread_buffer,
fwrite_buffer, fwrite_null and fwrite_sha1_file, all return the
product of the size and the number of items they are given.
Practically speaking, it doesn't matter, because in all contexts where
those functions are used, size is 1.
But since those functions are similar to fread and fwrite (the curl API
is designed around being able to use fread and fwrite directly), it
might be preferable to make them behave like fread and fwrite. From the
fread/fwrite manual page:
    On success, fread() and fwrite() return the number of items read
    or written. This number equals the number of bytes transferred
    only when size is 1. If an error occurs, or the end of the file
    is reached, the return value is a short item count (or zero).
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-05-08 07:03:54 +08:00
			return nmemb;
	}

	do {
		ssize_t retval = xwrite(freq->localfile,
					(char *) ptr + posn, size - posn);
		if (retval < 0)
			return posn / eltsize;
		posn += retval;
	} while (posn < size);

	freq->stream.avail_in = size;
	freq->stream.next_in = (void *)ptr;
	do {
		freq->stream.next_out = expn;
		freq->stream.avail_out = sizeof(expn);
		freq->zret = git_inflate(&freq->stream, Z_SYNC_FLUSH);
		the_hash_algo->update_fn(&freq->c, expn,
					 sizeof(expn) - freq->stream.avail_out);
	} while (freq->stream.avail_in && freq->zret == Z_OK);
	return nmemb;
}

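To make the return-value convention from the "Make fread/fwrite-like functions..." message above concrete, here is a standalone sketch of a curl write callback, not taken from http.c: curl treats any return value other than the full item count as a write error and aborts the transfer.

/*
 * Sketch only: a CURLOPT_WRITEFUNCTION callback that accepts data until
 * a hypothetical byte budget (passed via CURLOPT_WRITEDATA) runs out.
 */
static size_t fwrite_limited(char *ptr, size_t eltsize, size_t nmemb,
			     void *data)
{
	size_t *bytes_left = data;
	size_t size = eltsize * nmemb;

	if (*bytes_left < size)
		return 0;	/* short item count: curl reports CURLE_WRITE_ERROR */
	*bytes_left -= size;
	return nmemb;		/* full item count: the transfer continues */
}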
struct http_object_request *new_http_object_request(const char *base_url,
						    const struct object_id *oid)
{
	char *hex = oid_to_hex(oid);
	struct strbuf filename = STRBUF_INIT;
	struct strbuf prevfile = STRBUF_INIT;
	int prevlocal;
	char prev_buf[PREV_BUF_SIZE];
	ssize_t prev_read = 0;
	off_t prev_posn = 0;
	struct http_object_request *freq;

	CALLOC_ARRAY(freq, 1);
	strbuf_init(&freq->tmpfile, 0);
	oidcpy(&freq->oid, oid);
	freq->localfile = -1;

sha1-file: modernize loose object file functions
The loose object access code in sha1-file.c is some of the oldest in
Git, and could use some modernizing. It mostly uses "unsigned char *"
for object ids, which these days should be "struct object_id".
It also uses the term "sha1_file" in many functions, which is confusing.
The term "loose_objects" is much better. It clearly distinguishes
them from packed objects (which didn't even exist back when the name
"sha1_file" came into being), and from the checksummed-file concept in
csum-file.c (which until recently was actually called "struct
sha1file"!).
This patch converts the functions {open,close,map,stat}_sha1_file() into
open_loose_object(), etc, and switches their sha1 arguments to
object_id structs. Similarly, path functions like fill_sha1_path()
become fill_loose_path() and use object_ids.
The function sha1_loose_object_info() already says "loose", so we can
just drop the "sha1" (and teach it to use object_id).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-07 16:35:42 +08:00
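The path helper referenced above uses Git's two-level fan-out for loose objects. A minimal sketch of the layout loose_object_path() (used just below) produces, assuming a plain object directory string; loose_path_sketch() is hypothetical, and the real helper resolves the directory from the repository it is given:

/*
 * Sketch only: the fan-out layout for a loose object, e.g.
 *   <objdir>/3b/18e512dba79e4c8300dd08aeb37f8e728b8dad
 * (first two hex digits of the oid, then the rest).
 */
static void loose_path_sketch(struct strbuf *buf, const char *objdir,
			      const struct object_id *oid)
{
	const char *hex = oid_to_hex(oid);

	strbuf_addf(buf, "%s/%.2s/%s", objdir, hex, hex + 2);
}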
	loose_object_path(the_repository, &filename, oid);
	strbuf_addf(&freq->tmpfile, "%s.temp", filename.buf);
	strbuf_addf(&prevfile, "%s.prev", filename.buf);
	unlink_or_warn(prevfile.buf);
	rename(freq->tmpfile.buf, prevfile.buf);
	unlink_or_warn(freq->tmpfile.buf);
	strbuf_release(&filename);

	if (freq->localfile != -1)
		error("fd leakage in start: %d", freq->localfile);
	freq->localfile = open(freq->tmpfile.buf,
			       O_WRONLY | O_CREAT | O_EXCL, 0666);
	/*
	 * This could have failed due to the "lazy directory creation";
	 * try to mkdir the last path component.
	 */
	if (freq->localfile < 0 && errno == ENOENT) {
		char *dir = strrchr(freq->tmpfile.buf, '/');
		if (dir) {
			*dir = 0;
			mkdir(freq->tmpfile.buf, 0777);
			*dir = '/';
		}
		freq->localfile = open(freq->tmpfile.buf,
				       O_WRONLY | O_CREAT | O_EXCL, 0666);
	}

	if (freq->localfile < 0) {
		error_errno("Couldn't create temporary file %s",
			    freq->tmpfile.buf);
		goto abort;
	}

	git_inflate_init(&freq->stream);

	the_hash_algo->init_fn(&freq->c);

	freq->url = get_remote_object_url(base_url, hex, 0);

	/*
	 * If a previous temp file is present, process what was already
	 * fetched.
	 */
	prevlocal = open(prevfile.buf, O_RDONLY);
	if (prevlocal != -1) {
		do {
			prev_read = xread(prevlocal, prev_buf, PREV_BUF_SIZE);
			if (prev_read > 0) {
				if (fwrite_sha1_file(prev_buf,
						     1,
						     prev_read,
						     freq) == prev_read) {
					prev_posn += prev_read;
				} else {
					prev_read = -1;
				}
			}
		} while (prev_read > 0);
		close(prevlocal);
	}
	unlink_or_warn(prevfile.buf);
	strbuf_release(&prevfile);
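	/*
	 * Not shown in this excerpt: when prev_posn ends up greater than
	 * zero, the curl request is set up to resume the download from that
	 * offset (an HTTP range request), so the server only sends the part
	 * of the object that is still missing.
	 */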

	/*
	 * Reset inflate/SHA1 if there was an error reading the previous temp
	 * file; also rewind to the beginning of the local file.
	 */
	if (prev_read == -1) {
http: call git_inflate_end() when releasing http_object_request
In new_http_object_request(), we initialize the zlib stream with
git_inflate_init(). We must have a matching git_inflate_end() to avoid
leaking any memory allocated by zlib.
In most cases this happens in finish_http_object_request(), but we don't
always get there. If we abort a request mid-stream, then we may clean it
up without hitting that function.
We can't just add a git_inflate_end() call to the release function,
though. That would double-free the cases that did actually finish.
Instead, we'll move the call from the finish function to the release
function. This does delay it for the cases that do finish, but I don't
think it matters. We should have already reached Z_STREAM_END (and
complain if we didn't), and we do not record any status code from
git_inflate_end().
This leak is triggered by t5550 at least (and probably other dumb-http
tests).
I did find one other related spot of interest. If we try to read a
previously downloaded file and fail, we reset the stream by calling
memset() followed by a fresh git_inflate_init(). I don't think this case
is triggered in the test suite, but it seemed like an obvious leak, so I
added the appropriate git_inflate_end() before the memset() there.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 06:02:13 +08:00
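
In other words, the git_inflate_init() in the constructor must be balanced by exactly one git_inflate_end() on every exit path, finished or aborted. A minimal sketch of a release function after this change (field layout abbreviated; this is not the verbatim http.c code):

/* Sketch: terminate zlib state in the one function every path reaches. */
static void release_request(struct http_object_request *freq)
{
	git_inflate_end(&freq->stream); /* moved here from the finish path */
	if (freq->localfile != -1)
		close(freq->localfile);
	free(freq->url);
	/* freeing the struct itself is handled by the caller-facing API */
}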
		git_inflate_end(&freq->stream);
		memset(&freq->stream, 0, sizeof(freq->stream));
		git_inflate_init(&freq->stream);
		the_hash_algo->init_fn(&freq->c);
		if (prev_posn > 0) {
			prev_posn = 0;
			lseek(freq->localfile, 0, SEEK_SET);
			if (ftruncate(freq->localfile, 0) < 0) {
				error_errno("Couldn't truncate temporary file %s",
					    freq->tmpfile.buf);
				goto abort;
			}
		}
	}

	freq->slot = get_active_slot();
	freq->headers = object_request_headers();

	curl_easy_setopt(freq->slot->curl, CURLOPT_WRITEDATA, freq);
	curl_easy_setopt(freq->slot->curl, CURLOPT_FAILONERROR, 0);
	curl_easy_setopt(freq->slot->curl, CURLOPT_WRITEFUNCTION, fwrite_sha1_file);
	curl_easy_setopt(freq->slot->curl, CURLOPT_ERRORBUFFER, freq->errorstr);
	curl_easy_setopt(freq->slot->curl, CURLOPT_URL, freq->url);
	curl_easy_setopt(freq->slot->curl, CURLOPT_HTTPHEADER, freq->headers);

	/*
	 * If we have successfully processed data from a previous fetch
	 * attempt, only fetch the data we don't already have.
	 */
	if (prev_posn > 0) {
		if (http_is_verbose)
			fprintf(stderr,
				"Resuming fetch of object %s at byte %"PRIuMAX"\n",
				hex, (uintmax_t)prev_posn);
		http_opt_request_remainder(freq->slot->curl, prev_posn);
	}

	return freq;

abort:
	strbuf_release(&prevfile);
	free(freq->url);
	free(freq);
	return NULL;
}
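
For reference, the resume branch above boils down to an HTTP Range request. A standalone sketch of the same idea with plain libcurl (git wraps this in http_opt_request_remainder(); the exact formatting there may differ):

#include <curl/curl.h>
#include <inttypes.h>
#include <stdio.h>

/* Ask the server for bytes [prev_posn, EOF) instead of the whole object. */
static void request_remainder(CURL *curl, uintmax_t prev_posn)
{
	char range[64];

	snprintf(range, sizeof(range), "%" PRIuMAX "-", prev_posn);
	curl_easy_setopt(curl, CURLOPT_RANGE, range); /* libcurl copies the string */
}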

void process_http_object_request(struct http_object_request *freq)
{
	if (!freq->slot)
		return;
	freq->curl_result = freq->slot->curl_result;
	freq->http_code = freq->slot->http_code;
	freq->slot = NULL;
}

int finish_http_object_request(struct http_object_request *freq)
{
	struct stat st;
	struct strbuf filename = STRBUF_INIT;

	close(freq->localfile);
	freq->localfile = -1;

	process_http_object_request(freq);

	if (freq->http_code == 416) {
		warning("requested range invalid; we may already have all the data.");
	} else if (freq->curl_result != CURLE_OK) {
		if (stat(freq->tmpfile.buf, &st) == 0)
			if (st.st_size == 0)
				unlink_or_warn(freq->tmpfile.buf);
		return -1;
	}

	the_hash_algo->final_oid_fn(&freq->real_oid, &freq->c);
	if (freq->zret != Z_STREAM_END) {
		unlink_or_warn(freq->tmpfile.buf);
		return -1;
	}
	if (!oideq(&freq->oid, &freq->real_oid)) {
		unlink_or_warn(freq->tmpfile.buf);
		return -1;
	}
sha1-file: modernize loose object file functions
The loose object access code in sha1-file.c is some of the oldest in
Git, and could use some modernizing. It mostly uses "unsigned char *"
for object ids, which these days should be "struct object_id".
It also uses the term "sha1_file" in many functions, which is confusing.
The term "loose_objects" is much better. It clearly distinguishes
them from packed objects (which didn't even exist back when the name
"sha1_file" came into being). And it also distinguishes it from the
checksummed-file concept in csum-file.c (which until recently was
actually called "struct sha1file"!).
This patch converts the functions {open,close,map,stat}_sha1_file() into
open_loose_object(), etc, and switches their sha1 arguments for
object_id structs. Similarly, path functions like fill_sha1_path()
become fill_loose_path() and use object_ids.
The function sha1_loose_object_info() already says "loose", so we can
just drop the "sha1" (and teach it to use object_id).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-07 16:35:42 +08:00
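
As a paraphrased before/after of the kind of conversion that patch performs (these prototypes are illustrative, not copied from the tree):

struct object_id;

/* Before: raw hash buffers and "sha1_file" naming. */
int open_sha1_file(const unsigned char *sha1, const char **path);

/* After: object_id structs and explicit "loose object" naming. */
int open_loose_object(const struct object_id *oid, const char **path);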
	loose_object_path(the_repository, &filename, &freq->oid);
	freq->rename = finalize_object_file(freq->tmpfile.buf, filename.buf);
	strbuf_release(&filename);

	return freq->rename;
}

http: fix leak of http_object_request struct
The new_http_object_request() function allocates a struct on the heap,
along with some fields inside the struct. But the matching function to
clean it up, release_http_object_request(), only frees the interior
fields without freeing the struct itself, causing a leak.
The related http_pack_request new/release pair gets this right, and at
first glance we should be able to do the same thing and just add a
single free() call. But there's a catch.
These http_object_request structs are typically embedded in the
object_request struct of http-walker.c. And when we clean up that parent
struct, it sanity-checks the embedded struct to make sure we are not
leaking descriptors. Which means a use-after-free if we simply free()
the embedded struct.
I have no idea how valuable that sanity-check is, or whether it can
simply be deleted. This all goes back to 5424bc557f (http*: add helper
methods for fetching objects (loose), 2009-06-06). But the obvious way
to make it all work is to be sure we set the pointer to NULL after
freeing it (and our freeing process closes the descriptor, so we know
there is no leak).
To make sure we do that consistently, we'll switch the pointer we take
in release_http_object_request() to a pointer-to-pointer, and we'll set
it to NULL ourselves. And then the compiler can help us find each caller
which needs to be updated.
Most cases will just pass "&obj_req->req", which will obviously do the
right thing. In a few cases, like http-push's finish_request(), we are
working with a copy of the pointer, so we don't NULL the original. But
it's OK because the next step is to free the struct containing the
original pointer anyway.
This lets us mark t5551 as leak-free. Ironically this is the "smart"
http test, and the leak here only affects dumb http. But there's a
single dumb-http invocation in there. The full dumb tests are in t5550,
which still has some more leaks.
This also makes t5559 leak-free, as it's just an HTTP/2 variant of
t5551. But we don't need to mark it as such, since it inherits the flag
from t5551.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 06:01:09 +08:00
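
The pattern that commit describes, reduced to a generic, self-contained sketch (the struct and function names here are made up):

#include <stdlib.h>

struct request { char *url; };

/*
 * Free the object and clear the caller's pointer in one place, so any
 * later sanity check sees NULL instead of a dangling pointer.
 */
static void drop_request(struct request **req_p)
{
	struct request *req = *req_p;

	free(req->url);
	free(req);
	*req_p = NULL; /* every caller is forced to pass &ptr */
}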
void abort_http_object_request(struct http_object_request **freq_p)
{
	struct http_object_request *freq = *freq_p;

	unlink_or_warn(freq->tmpfile.buf);
	release_http_object_request(freq_p);
|
|
|
}
|
|
|
|
|
|
|
|
void release_http_object_request(struct http_object_request **freq_p)
|
|
|
|
{
|
|
|
|
struct http_object_request *freq = *freq_p;
|
|
|
|
if (freq->localfile != -1) {
|
|
|
|
close(freq->localfile);
|
|
|
|
freq->localfile = -1;
|
|
|
|
}
|
2018-08-17 21:02:50 +08:00
|
|
|
FREE_AND_NULL(freq->url);
|
2022-05-03 00:50:37 +08:00
|
|
|
if (freq->slot) {
|
2009-08-26 20:20:53 +08:00
|
|
|
freq->slot->callback_func = NULL;
|
|
|
|
freq->slot->callback_data = NULL;
|
|
|
|
release_active_slot(freq->slot);
|
|
|
|
freq->slot = NULL;
|
|
|
|
}
|
2024-04-17 08:02:27 +08:00
|
|
|
curl_slist_free_all(freq->headers);
|
2018-05-19 09:56:37 +08:00
|
|
|
strbuf_release(&freq->tmpfile);
|
http: call git_inflate_end() when releasing http_object_request
In new_http_object_request(), we initialize the zlib stream with
git_inflate_init(). We must have a matching git_inflate_end() to avoid
leaking any memory allocated by zlib.
In most cases this happens in finish_http_object_request(), but we don't
always get there. If we abort a request mid-stream, then we may clean it
up without hitting that function.
We can't just add a git_inflate_end() call to the release function,
though. That would double-free the cases that did actually finish.
Instead, we'll move the call from the finish function to the release
function. This does delay it for the cases that do finish, but I don't
think it matters. We should have already reached Z_STREAM_END (and
complain if we didn't), and we do not record any status code from
git_inflate_end().
This leak is triggered by t5550 at least (and probably other dumb-http
tests).
I did find one other related spot of interest. If we try to read a
previously downloaded file and fail, we reset the stream by calling
memset() followed by a fresh git_inflate_init(). I don't think this case
is triggered in the test suite, but it seemed like an obvious leak, so I
added the appropriate git_inflate_end() before the memset() there.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-09-25 06:02:13 +08:00
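Both spots reduce to a single invariant: every git_inflate_init() must be balanced by exactly one git_inflate_end(), whichever path tears the request down. A condensed sketch of the stream-reset path described above, assuming the git_zstream wrappers from git's zlib compat layer (this is the shape of the fix, not the verbatim http.c code):

/*
 * Restarting after a failed read of a previously downloaded file:
 * end the old stream before wiping and re-initializing it, so that
 * zlib's internal allocations are not leaked.
 */
static void restart_object_stream(struct http_object_request *freq)
{
	git_inflate_end(&freq->stream);		/* balance the earlier init */
	memset(&freq->stream, 0, sizeof(freq->stream));
	git_inflate_init(&freq->stream);	/* fresh stream, nothing leaked */
}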
|
|
|
git_inflate_end(&freq->stream);
|
|
|
|
|
|
|
|
free(freq);
|
|
|
|
*freq_p = NULL;
|
|
|
|
}
|