2009-10-31 08:47:32 +08:00
|
|
|
#include "cache.h"
|
2017-06-15 02:07:36 +08:00
|
|
|
#include "config.h"
|
2009-10-31 08:47:32 +08:00
|
|
|
#include "refs.h"
|
|
|
|
#include "pkt-line.h"
|
|
|
|
#include "object.h"
|
|
|
|
#include "tag.h"
|
|
|
|
#include "exec_cmd.h"
|
2009-10-31 08:47:34 +08:00
|
|
|
#include "run-command.h"
|
|
|
|
#include "string-list.h"
|
2010-05-23 17:17:55 +08:00
|
|
|
#include "url.h"
|
2012-03-30 15:01:30 +08:00
|
|
|
#include "argv-array.h"
|
2009-10-31 08:47:32 +08:00
|
|
|
|
|
|
|
static const char content_type[] = "Content-Type";
|
|
|
|
static const char content_length[] = "Content-Length";
|
|
|
|
static const char last_modified[] = "Last-Modified";
|
2009-11-05 09:16:37 +08:00
|
|
|
static int getanyfile = 1;
|
http-backend: spool ref negotiation requests to buffer
When http-backend spawns "upload-pack" to do ref
negotiation, it streams the http request body to
upload-pack, who then streams the http response back to the
client as it reads. In theory, git can go full-duplex; the
client can consume our response while it is still sending
the request. In practice, however, HTTP is a half-duplex
protocol. Even if our client is ready to read and write
simultaneously, we may have other HTTP infrastructure in the
way, including the webserver that spawns our CGI, or any
intermediate proxies.
In at least one documented case[1], this leads to deadlock
when trying a fetch over http. What happens is basically:
1. Apache proxies the request to the CGI, http-backend.
2. http-backend gzip-inflates the data and sends
the result to upload-pack.
3. upload-pack acts on the data and generates output over
the pipe back to Apache. Apache isn't reading because
it's busy writing (step 1).
This works fine most of the time, because the upload-pack
output ends up in a system pipe buffer, and Apache reads
it as soon as it finishes writing. But if both the request
and the response exceed the system pipe buffer size, then we
deadlock (Apache blocks writing to http-backend,
http-backend blocks writing to upload-pack, and upload-pack
blocks writing to Apache).
We need to break the deadlock by spooling either the input
or the output. In this case, it's ideal to spool the input,
because Apache does not start reading either stdout _or_
stderr until we have consumed all of the input. So until we
do so, we cannot even get an error message out to the
client.
The solution is fairly straight-forward: we read the request
body into an in-memory buffer in http-backend, freeing up
Apache, and then feed the data ourselves to upload-pack. But
there are a few important things to note:
1. We limit the in-memory buffer to prevent an obvious
denial-of-service attack. This is a new hard limit on
requests, but it's unlikely to come into play. The
default value is 10MB, which covers even the ridiculous
100,000-ref negotation in the included test (that
actually caps out just over 5MB). But it's configurable
on the off chance that you don't mind spending some
extra memory to make even ridiculous requests work.
2. We must take care only to buffer when we have to. For
pushes, the incoming packfile may be of arbitrary
size, and we should connect the input directly to
receive-pack. There's no deadlock problem here, though,
because we do not produce any output until the whole
packfile has been read.
For upload-pack's initial ref advertisement, we
similarly do not need to buffer. Even though we may
generate a lot of output, there is no request body at
all (i.e., it is a GET, not a POST).
[1] http://article.gmane.org/gmane.comp.version-control.git/269020
Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-20 15:37:09 +08:00
|
|
|
static unsigned long max_request_buffer = 10 * 1024 * 1024;
|
2009-10-31 08:47:32 +08:00
|
|
|
|
2009-10-31 08:47:34 +08:00
|
|
|
static struct string_list *query_params;
|
|
|
|
|
|
|
|
struct rpc_service {
|
|
|
|
const char *name;
|
|
|
|
const char *config_name;
|
http-backend: spool ref negotiation requests to buffer
When http-backend spawns "upload-pack" to do ref
negotiation, it streams the http request body to
upload-pack, who then streams the http response back to the
client as it reads. In theory, git can go full-duplex; the
client can consume our response while it is still sending
the request. In practice, however, HTTP is a half-duplex
protocol. Even if our client is ready to read and write
simultaneously, we may have other HTTP infrastructure in the
way, including the webserver that spawns our CGI, or any
intermediate proxies.
In at least one documented case[1], this leads to deadlock
when trying a fetch over http. What happens is basically:
1. Apache proxies the request to the CGI, http-backend.
2. http-backend gzip-inflates the data and sends
the result to upload-pack.
3. upload-pack acts on the data and generates output over
the pipe back to Apache. Apache isn't reading because
it's busy writing (step 1).
This works fine most of the time, because the upload-pack
output ends up in a system pipe buffer, and Apache reads
it as soon as it finishes writing. But if both the request
and the response exceed the system pipe buffer size, then we
deadlock (Apache blocks writing to http-backend,
http-backend blocks writing to upload-pack, and upload-pack
blocks writing to Apache).
We need to break the deadlock by spooling either the input
or the output. In this case, it's ideal to spool the input,
because Apache does not start reading either stdout _or_
stderr until we have consumed all of the input. So until we
do so, we cannot even get an error message out to the
client.
The solution is fairly straight-forward: we read the request
body into an in-memory buffer in http-backend, freeing up
Apache, and then feed the data ourselves to upload-pack. But
there are a few important things to note:
1. We limit the in-memory buffer to prevent an obvious
denial-of-service attack. This is a new hard limit on
requests, but it's unlikely to come into play. The
default value is 10MB, which covers even the ridiculous
100,000-ref negotation in the included test (that
actually caps out just over 5MB). But it's configurable
on the off chance that you don't mind spending some
extra memory to make even ridiculous requests work.
2. We must take care only to buffer when we have to. For
pushes, the incoming packfile may be of arbitrary
size, and we should connect the input directly to
receive-pack. There's no deadlock problem here, though,
because we do not produce any output until the whole
packfile has been read.
For upload-pack's initial ref advertisement, we
similarly do not need to buffer. Even though we may
generate a lot of output, there is no request body at
all (i.e., it is a GET, not a POST).
[1] http://article.gmane.org/gmane.comp.version-control.git/269020
Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-20 15:37:09 +08:00
|
|
|
unsigned buffer_input : 1;
|
2009-10-31 08:47:34 +08:00
|
|
|
signed enabled : 2;
|
|
|
|
};
|
|
|
|
|
|
|
|
static struct rpc_service rpc_service[] = {
|
http-backend: spool ref negotiation requests to buffer
When http-backend spawns "upload-pack" to do ref
negotiation, it streams the http request body to
upload-pack, who then streams the http response back to the
client as it reads. In theory, git can go full-duplex; the
client can consume our response while it is still sending
the request. In practice, however, HTTP is a half-duplex
protocol. Even if our client is ready to read and write
simultaneously, we may have other HTTP infrastructure in the
way, including the webserver that spawns our CGI, or any
intermediate proxies.
In at least one documented case[1], this leads to deadlock
when trying a fetch over http. What happens is basically:
1. Apache proxies the request to the CGI, http-backend.
2. http-backend gzip-inflates the data and sends
the result to upload-pack.
3. upload-pack acts on the data and generates output over
the pipe back to Apache. Apache isn't reading because
it's busy writing (step 1).
This works fine most of the time, because the upload-pack
output ends up in a system pipe buffer, and Apache reads
it as soon as it finishes writing. But if both the request
and the response exceed the system pipe buffer size, then we
deadlock (Apache blocks writing to http-backend,
http-backend blocks writing to upload-pack, and upload-pack
blocks writing to Apache).
We need to break the deadlock by spooling either the input
or the output. In this case, it's ideal to spool the input,
because Apache does not start reading either stdout _or_
stderr until we have consumed all of the input. So until we
do so, we cannot even get an error message out to the
client.
The solution is fairly straight-forward: we read the request
body into an in-memory buffer in http-backend, freeing up
Apache, and then feed the data ourselves to upload-pack. But
there are a few important things to note:
1. We limit the in-memory buffer to prevent an obvious
denial-of-service attack. This is a new hard limit on
requests, but it's unlikely to come into play. The
default value is 10MB, which covers even the ridiculous
100,000-ref negotation in the included test (that
actually caps out just over 5MB). But it's configurable
on the off chance that you don't mind spending some
extra memory to make even ridiculous requests work.
2. We must take care only to buffer when we have to. For
pushes, the incoming packfile may be of arbitrary
size, and we should connect the input directly to
receive-pack. There's no deadlock problem here, though,
because we do not produce any output until the whole
packfile has been read.
For upload-pack's initial ref advertisement, we
similarly do not need to buffer. Even though we may
generate a lot of output, there is no request body at
all (i.e., it is a GET, not a POST).
[1] http://article.gmane.org/gmane.comp.version-control.git/269020
Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-20 15:37:09 +08:00
|
|
|
{ "upload-pack", "uploadpack", 1, 1 },
|
|
|
|
{ "receive-pack", "receivepack", 0, -1 },
|
2009-10-31 08:47:34 +08:00
|
|
|
};
|
|
|
|
|
|
|
|
static struct string_list *get_parameters(void)
|
|
|
|
{
|
|
|
|
if (!query_params) {
|
|
|
|
const char *query = getenv("QUERY_STRING");
|
|
|
|
|
|
|
|
query_params = xcalloc(1, sizeof(*query_params));
|
|
|
|
while (query && *query) {
|
2010-05-23 17:17:55 +08:00
|
|
|
char *name = url_decode_parameter_name(&query);
|
|
|
|
char *value = url_decode_parameter_value(&query);
|
2009-10-31 08:47:34 +08:00
|
|
|
struct string_list_item *i;
|
|
|
|
|
2010-06-26 07:41:37 +08:00
|
|
|
i = string_list_lookup(query_params, name);
|
2009-10-31 08:47:34 +08:00
|
|
|
if (!i)
|
2010-06-26 07:41:35 +08:00
|
|
|
i = string_list_insert(query_params, name);
|
2009-10-31 08:47:34 +08:00
|
|
|
else
|
|
|
|
free(i->util);
|
|
|
|
i->util = value;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
return query_params;
|
|
|
|
}
|
|
|
|
|
|
|
|
static const char *get_parameter(const char *name)
|
|
|
|
{
|
|
|
|
struct string_list_item *i;
|
2010-06-26 07:41:37 +08:00
|
|
|
i = string_list_lookup(get_parameters(), name);
|
2009-10-31 08:47:34 +08:00
|
|
|
return i ? i->util : NULL;
|
|
|
|
}
|
|
|
|
|
2009-11-15 05:10:58 +08:00
|
|
|
__attribute__((format (printf, 2, 3)))
|
2009-10-31 08:47:32 +08:00
|
|
|
static void format_write(int fd, const char *fmt, ...)
|
|
|
|
{
|
|
|
|
static char buffer[1024];
|
|
|
|
|
|
|
|
va_list args;
|
|
|
|
unsigned n;
|
|
|
|
|
|
|
|
va_start(args, fmt);
|
|
|
|
n = vsnprintf(buffer, sizeof(buffer), fmt, args);
|
|
|
|
va_end(args);
|
|
|
|
if (n >= sizeof(buffer))
|
|
|
|
die("protocol error: impossibly long line");
|
|
|
|
|
2013-02-21 04:01:56 +08:00
|
|
|
write_or_die(fd, buffer, n);
|
2009-10-31 08:47:32 +08:00
|
|
|
}
|
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
static void http_status(struct strbuf *hdr, unsigned code, const char *msg)
|
2009-10-31 08:47:32 +08:00
|
|
|
{
|
2016-08-10 07:47:31 +08:00
|
|
|
strbuf_addf(hdr, "Status: %u %s\r\n", code, msg);
|
2009-10-31 08:47:32 +08:00
|
|
|
}
|
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
static void hdr_str(struct strbuf *hdr, const char *name, const char *value)
|
2009-10-31 08:47:32 +08:00
|
|
|
{
|
2016-08-10 07:47:31 +08:00
|
|
|
strbuf_addf(hdr, "%s: %s\r\n", name, value);
|
2009-10-31 08:47:32 +08:00
|
|
|
}
|
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
static void hdr_int(struct strbuf *hdr, const char *name, uintmax_t value)
|
2009-10-31 08:47:32 +08:00
|
|
|
{
|
2016-08-10 07:47:31 +08:00
|
|
|
strbuf_addf(hdr, "%s: %" PRIuMAX "\r\n", name, value);
|
2009-10-31 08:47:32 +08:00
|
|
|
}
|
|
|
|
|
2017-04-27 03:29:31 +08:00
|
|
|
static void hdr_date(struct strbuf *hdr, const char *name, timestamp_t when)
|
2009-10-31 08:47:32 +08:00
|
|
|
{
|
convert "enum date_mode" into a struct
In preparation for adding date modes that may carry extra
information beyond the mode itself, this patch converts the
date_mode enum into a struct.
Most of the conversion is fairly straightforward; we pass
the struct as a pointer and dereference the type field where
necessary. Locations that declare a date_mode can use a "{}"
constructor. However, the tricky case is where we use the
enum labels as constants, like:
show_date(t, tz, DATE_NORMAL);
Ideally we could say:
show_date(t, tz, &{ DATE_NORMAL });
but of course C does not allow that. Likewise, we cannot
cast the constant to a struct, because we need to pass an
actual address. Our options are basically:
1. Manually add a "struct date_mode d = { DATE_NORMAL }"
definition to each caller, and pass "&d". This makes
the callers uglier, because they sometimes do not even
have their own scope (e.g., they are inside a switch
statement).
2. Provide a pre-made global "date_normal" struct that can
be passed by address. We'd also need "date_rfc2822",
"date_iso8601", and so forth. But at least the ugliness
is defined in one place.
3. Provide a wrapper that generates the correct struct on
the fly. The big downside is that we end up pointing to
a single global, which makes our wrapper non-reentrant.
But show_date is already not reentrant, so it does not
matter.
This patch implements 3, along with a minor macro to keep
the size of the callers sane.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-26 00:55:02 +08:00
|
|
|
const char *value = show_date(when, 0, DATE_MODE(RFC2822));
|
2016-08-10 07:47:31 +08:00
|
|
|
hdr_str(hdr, name, value);
|
2009-10-31 08:47:32 +08:00
|
|
|
}
|
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
static void hdr_nocache(struct strbuf *hdr)
|
2009-10-31 08:47:32 +08:00
|
|
|
{
|
2016-08-10 07:47:31 +08:00
|
|
|
hdr_str(hdr, "Expires", "Fri, 01 Jan 1980 00:00:00 GMT");
|
|
|
|
hdr_str(hdr, "Pragma", "no-cache");
|
|
|
|
hdr_str(hdr, "Cache-Control", "no-cache, max-age=0, must-revalidate");
|
2009-10-31 08:47:32 +08:00
|
|
|
}
|
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
static void hdr_cache_forever(struct strbuf *hdr)
|
2009-10-31 08:47:32 +08:00
|
|
|
{
|
2017-04-27 03:29:31 +08:00
|
|
|
timestamp_t now = time(NULL);
|
2016-08-10 07:47:31 +08:00
|
|
|
hdr_date(hdr, "Date", now);
|
|
|
|
hdr_date(hdr, "Expires", now + 31536000);
|
|
|
|
hdr_str(hdr, "Cache-Control", "public, max-age=31536000");
|
2009-10-31 08:47:32 +08:00
|
|
|
}
|
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
static void end_headers(struct strbuf *hdr)
|
2009-10-31 08:47:32 +08:00
|
|
|
{
|
2016-08-10 07:47:31 +08:00
|
|
|
strbuf_add(hdr, "\r\n", 2);
|
|
|
|
write_or_die(1, hdr->buf, hdr->len);
|
|
|
|
strbuf_release(hdr);
|
2009-10-31 08:47:32 +08:00
|
|
|
}
|
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
__attribute__((format (printf, 2, 3)))
|
|
|
|
static NORETURN void not_found(struct strbuf *hdr, const char *err, ...)
|
2009-10-31 08:47:32 +08:00
|
|
|
{
|
|
|
|
va_list params;
|
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
http_status(hdr, 404, "Not Found");
|
|
|
|
hdr_nocache(hdr);
|
|
|
|
end_headers(hdr);
|
2009-10-31 08:47:32 +08:00
|
|
|
|
|
|
|
va_start(params, err);
|
|
|
|
if (err && *err)
|
|
|
|
vfprintf(stderr, err, params);
|
|
|
|
va_end(params);
|
|
|
|
exit(0);
|
|
|
|
}
|
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
__attribute__((format (printf, 2, 3)))
|
|
|
|
static NORETURN void forbidden(struct strbuf *hdr, const char *err, ...)
|
2009-10-31 08:47:34 +08:00
|
|
|
{
|
|
|
|
va_list params;
|
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
http_status(hdr, 403, "Forbidden");
|
|
|
|
hdr_nocache(hdr);
|
|
|
|
end_headers(hdr);
|
2009-10-31 08:47:34 +08:00
|
|
|
|
|
|
|
va_start(params, err);
|
|
|
|
if (err && *err)
|
|
|
|
vfprintf(stderr, err, params);
|
|
|
|
va_end(params);
|
|
|
|
exit(0);
|
|
|
|
}
|
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
static void select_getanyfile(struct strbuf *hdr)
|
2009-11-05 09:16:37 +08:00
|
|
|
{
|
|
|
|
if (!getanyfile)
|
2016-08-10 07:47:31 +08:00
|
|
|
forbidden(hdr, "Unsupported service: getanyfile");
|
2009-11-05 09:16:37 +08:00
|
|
|
}
|
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
static void send_strbuf(struct strbuf *hdr,
|
|
|
|
const char *type, struct strbuf *buf)
|
2009-10-31 08:47:32 +08:00
|
|
|
{
|
2016-08-10 07:47:31 +08:00
|
|
|
hdr_int(hdr, content_length, buf->len);
|
|
|
|
hdr_str(hdr, content_type, type);
|
|
|
|
end_headers(hdr);
|
2013-02-21 04:01:56 +08:00
|
|
|
write_or_die(1, buf->buf, buf->len);
|
2009-10-31 08:47:32 +08:00
|
|
|
}
|
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
static void send_local_file(struct strbuf *hdr, const char *the_type,
|
|
|
|
const char *name)
|
2009-10-31 08:47:32 +08:00
|
|
|
{
|
2015-08-10 17:35:31 +08:00
|
|
|
char *p = git_pathdup("%s", name);
|
2009-10-31 08:47:32 +08:00
|
|
|
size_t buf_alloc = 8192;
|
|
|
|
char *buf = xmalloc(buf_alloc);
|
|
|
|
int fd;
|
|
|
|
struct stat sb;
|
|
|
|
|
|
|
|
fd = open(p, O_RDONLY);
|
|
|
|
if (fd < 0)
|
2016-08-10 07:47:31 +08:00
|
|
|
not_found(hdr, "Cannot open '%s': %s", p, strerror(errno));
|
2009-10-31 08:47:32 +08:00
|
|
|
if (fstat(fd, &sb) < 0)
|
|
|
|
die_errno("Cannot stat '%s'", p);
|
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
hdr_int(hdr, content_length, sb.st_size);
|
|
|
|
hdr_str(hdr, content_type, the_type);
|
|
|
|
hdr_date(hdr, last_modified, sb.st_mtime);
|
|
|
|
end_headers(hdr);
|
2009-10-31 08:47:32 +08:00
|
|
|
|
2009-11-12 12:42:41 +08:00
|
|
|
for (;;) {
|
2009-10-31 08:47:32 +08:00
|
|
|
ssize_t n = xread(fd, buf, buf_alloc);
|
|
|
|
if (n < 0)
|
|
|
|
die_errno("Cannot read '%s'", p);
|
|
|
|
if (!n)
|
|
|
|
break;
|
2013-02-21 04:01:56 +08:00
|
|
|
write_or_die(1, buf, n);
|
2009-10-31 08:47:32 +08:00
|
|
|
}
|
|
|
|
close(fd);
|
|
|
|
free(buf);
|
2015-08-10 17:35:31 +08:00
|
|
|
free(p);
|
2009-10-31 08:47:32 +08:00
|
|
|
}
|
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
static void get_text_file(struct strbuf *hdr, char *name)
|
2009-10-31 08:47:32 +08:00
|
|
|
{
|
2016-08-10 07:47:31 +08:00
|
|
|
select_getanyfile(hdr);
|
|
|
|
hdr_nocache(hdr);
|
|
|
|
send_local_file(hdr, "text/plain", name);
|
2009-10-31 08:47:32 +08:00
|
|
|
}
|
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
static void get_loose_object(struct strbuf *hdr, char *name)
|
2009-10-31 08:47:32 +08:00
|
|
|
{
|
2016-08-10 07:47:31 +08:00
|
|
|
select_getanyfile(hdr);
|
|
|
|
hdr_cache_forever(hdr);
|
|
|
|
send_local_file(hdr, "application/x-git-loose-object", name);
|
2009-10-31 08:47:32 +08:00
|
|
|
}
|
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
static void get_pack_file(struct strbuf *hdr, char *name)
|
2009-10-31 08:47:32 +08:00
|
|
|
{
|
2016-08-10 07:47:31 +08:00
|
|
|
select_getanyfile(hdr);
|
|
|
|
hdr_cache_forever(hdr);
|
|
|
|
send_local_file(hdr, "application/x-git-packed-objects", name);
|
2009-10-31 08:47:32 +08:00
|
|
|
}
|
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
static void get_idx_file(struct strbuf *hdr, char *name)
|
2009-10-31 08:47:32 +08:00
|
|
|
{
|
2016-08-10 07:47:31 +08:00
|
|
|
select_getanyfile(hdr);
|
|
|
|
hdr_cache_forever(hdr);
|
|
|
|
send_local_file(hdr, "application/x-git-packed-objects-toc", name);
|
2009-10-31 08:47:32 +08:00
|
|
|
}
|
|
|
|
|
2014-08-08 00:21:17 +08:00
|
|
|
static void http_config(void)
|
2009-10-31 08:47:34 +08:00
|
|
|
{
|
2014-08-08 00:21:17 +08:00
|
|
|
int i, value = 0;
|
|
|
|
struct strbuf var = STRBUF_INIT;
|
use skip_prefix to avoid magic numbers
It's a common idiom to match a prefix and then skip past it
with a magic number, like:
if (starts_with(foo, "bar"))
foo += 3;
This is easy to get wrong, since you have to count the
prefix string yourself, and there's no compiler check if the
string changes. We can use skip_prefix to avoid the magic
numbers here.
Note that some of these conversions could be much shorter.
For example:
if (starts_with(arg, "--foo=")) {
bar = arg + 6;
continue;
}
could become:
if (skip_prefix(arg, "--foo=", &bar))
continue;
However, I have left it as:
if (skip_prefix(arg, "--foo=", &v)) {
bar = v;
continue;
}
to visually match nearby cases which need to actually
process the string. Like:
if (skip_prefix(arg, "--foo=", &v)) {
bar = atoi(v);
continue;
}
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-19 03:47:50 +08:00
|
|
|
|
2014-08-08 00:21:17 +08:00
|
|
|
git_config_get_bool("http.getanyfile", &getanyfile);
|
http-backend: spool ref negotiation requests to buffer
When http-backend spawns "upload-pack" to do ref
negotiation, it streams the http request body to
upload-pack, who then streams the http response back to the
client as it reads. In theory, git can go full-duplex; the
client can consume our response while it is still sending
the request. In practice, however, HTTP is a half-duplex
protocol. Even if our client is ready to read and write
simultaneously, we may have other HTTP infrastructure in the
way, including the webserver that spawns our CGI, or any
intermediate proxies.
In at least one documented case[1], this leads to deadlock
when trying a fetch over http. What happens is basically:
1. Apache proxies the request to the CGI, http-backend.
2. http-backend gzip-inflates the data and sends
the result to upload-pack.
3. upload-pack acts on the data and generates output over
the pipe back to Apache. Apache isn't reading because
it's busy writing (step 1).
This works fine most of the time, because the upload-pack
output ends up in a system pipe buffer, and Apache reads
it as soon as it finishes writing. But if both the request
and the response exceed the system pipe buffer size, then we
deadlock (Apache blocks writing to http-backend,
http-backend blocks writing to upload-pack, and upload-pack
blocks writing to Apache).
We need to break the deadlock by spooling either the input
or the output. In this case, it's ideal to spool the input,
because Apache does not start reading either stdout _or_
stderr until we have consumed all of the input. So until we
do so, we cannot even get an error message out to the
client.
The solution is fairly straight-forward: we read the request
body into an in-memory buffer in http-backend, freeing up
Apache, and then feed the data ourselves to upload-pack. But
there are a few important things to note:
1. We limit the in-memory buffer to prevent an obvious
denial-of-service attack. This is a new hard limit on
requests, but it's unlikely to come into play. The
default value is 10MB, which covers even the ridiculous
100,000-ref negotation in the included test (that
actually caps out just over 5MB). But it's configurable
on the off chance that you don't mind spending some
extra memory to make even ridiculous requests work.
2. We must take care only to buffer when we have to. For
pushes, the incoming packfile may be of arbitrary
size, and we should connect the input directly to
receive-pack. There's no deadlock problem here, though,
because we do not produce any output until the whole
packfile has been read.
For upload-pack's initial ref advertisement, we
similarly do not need to buffer. Even though we may
generate a lot of output, there is no request body at
all (i.e., it is a GET, not a POST).
[1] http://article.gmane.org/gmane.comp.version-control.git/269020
Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-20 15:37:09 +08:00
|
|
|
git_config_get_ulong("http.maxrequestbuffer", &max_request_buffer);
|
2009-10-31 08:47:34 +08:00
|
|
|
|
2014-08-08 00:21:17 +08:00
|
|
|
for (i = 0; i < ARRAY_SIZE(rpc_service); i++) {
|
|
|
|
struct rpc_service *svc = &rpc_service[i];
|
|
|
|
strbuf_addf(&var, "http.%s", svc->config_name);
|
|
|
|
if (!git_config_get_bool(var.buf, &value))
|
|
|
|
svc->enabled = value;
|
|
|
|
strbuf_reset(&var);
|
2009-11-05 09:16:37 +08:00
|
|
|
}
|
|
|
|
|
2014-08-08 00:21:17 +08:00
|
|
|
strbuf_release(&var);
|
2009-10-31 08:47:34 +08:00
|
|
|
}
|
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
static struct rpc_service *select_service(struct strbuf *hdr, const char *name)
|
2009-10-31 08:47:34 +08:00
|
|
|
{
|
use skip_prefix to avoid magic numbers
It's a common idiom to match a prefix and then skip past it
with a magic number, like:
if (starts_with(foo, "bar"))
foo += 3;
This is easy to get wrong, since you have to count the
prefix string yourself, and there's no compiler check if the
string changes. We can use skip_prefix to avoid the magic
numbers here.
Note that some of these conversions could be much shorter.
For example:
if (starts_with(arg, "--foo=")) {
bar = arg + 6;
continue;
}
could become:
if (skip_prefix(arg, "--foo=", &bar))
continue;
However, I have left it as:
if (skip_prefix(arg, "--foo=", &v)) {
bar = v;
continue;
}
to visually match nearby cases which need to actually
process the string. Like:
if (skip_prefix(arg, "--foo=", &v)) {
bar = atoi(v);
continue;
}
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-19 03:47:50 +08:00
|
|
|
const char *svc_name;
|
2009-10-31 08:47:34 +08:00
|
|
|
struct rpc_service *svc = NULL;
|
|
|
|
int i;
|
|
|
|
|
use skip_prefix to avoid magic numbers
It's a common idiom to match a prefix and then skip past it
with a magic number, like:
if (starts_with(foo, "bar"))
foo += 3;
This is easy to get wrong, since you have to count the
prefix string yourself, and there's no compiler check if the
string changes. We can use skip_prefix to avoid the magic
numbers here.
Note that some of these conversions could be much shorter.
For example:
if (starts_with(arg, "--foo=")) {
bar = arg + 6;
continue;
}
could become:
if (skip_prefix(arg, "--foo=", &bar))
continue;
However, I have left it as:
if (skip_prefix(arg, "--foo=", &v)) {
bar = v;
continue;
}
to visually match nearby cases which need to actually
process the string. Like:
if (skip_prefix(arg, "--foo=", &v)) {
bar = atoi(v);
continue;
}
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-19 03:47:50 +08:00
|
|
|
if (!skip_prefix(name, "git-", &svc_name))
|
2016-08-10 07:47:31 +08:00
|
|
|
forbidden(hdr, "Unsupported service: '%s'", name);
|
2009-10-31 08:47:34 +08:00
|
|
|
|
|
|
|
for (i = 0; i < ARRAY_SIZE(rpc_service); i++) {
|
|
|
|
struct rpc_service *s = &rpc_service[i];
|
use skip_prefix to avoid magic numbers
It's a common idiom to match a prefix and then skip past it
with a magic number, like:
if (starts_with(foo, "bar"))
foo += 3;
This is easy to get wrong, since you have to count the
prefix string yourself, and there's no compiler check if the
string changes. We can use skip_prefix to avoid the magic
numbers here.
Note that some of these conversions could be much shorter.
For example:
if (starts_with(arg, "--foo=")) {
bar = arg + 6;
continue;
}
could become:
if (skip_prefix(arg, "--foo=", &bar))
continue;
However, I have left it as:
if (skip_prefix(arg, "--foo=", &v)) {
bar = v;
continue;
}
to visually match nearby cases which need to actually
process the string. Like:
if (skip_prefix(arg, "--foo=", &v)) {
bar = atoi(v);
continue;
}
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-19 03:47:50 +08:00
|
|
|
if (!strcmp(s->name, svc_name)) {
|
2009-10-31 08:47:34 +08:00
|
|
|
svc = s;
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
if (!svc)
|
2016-08-10 07:47:31 +08:00
|
|
|
forbidden(hdr, "Unsupported service: '%s'", name);
|
2009-10-31 08:47:34 +08:00
|
|
|
|
|
|
|
if (svc->enabled < 0) {
|
|
|
|
const char *user = getenv("REMOTE_USER");
|
|
|
|
svc->enabled = (user && *user) ? 1 : 0;
|
|
|
|
}
|
|
|
|
if (!svc->enabled)
|
2016-08-10 07:47:31 +08:00
|
|
|
forbidden(hdr, "Service not enabled: '%s'", svc->name);
|
2009-10-31 08:47:34 +08:00
|
|
|
return svc;
|
|
|
|
}
|
|
|
|
|
http-backend: spool ref negotiation requests to buffer
When http-backend spawns "upload-pack" to do ref
negotiation, it streams the http request body to
upload-pack, who then streams the http response back to the
client as it reads. In theory, git can go full-duplex; the
client can consume our response while it is still sending
the request. In practice, however, HTTP is a half-duplex
protocol. Even if our client is ready to read and write
simultaneously, we may have other HTTP infrastructure in the
way, including the webserver that spawns our CGI, or any
intermediate proxies.
In at least one documented case[1], this leads to deadlock
when trying a fetch over http. What happens is basically:
1. Apache proxies the request to the CGI, http-backend.
2. http-backend gzip-inflates the data and sends
the result to upload-pack.
3. upload-pack acts on the data and generates output over
the pipe back to Apache. Apache isn't reading because
it's busy writing (step 1).
This works fine most of the time, because the upload-pack
output ends up in a system pipe buffer, and Apache reads
it as soon as it finishes writing. But if both the request
and the response exceed the system pipe buffer size, then we
deadlock (Apache blocks writing to http-backend,
http-backend blocks writing to upload-pack, and upload-pack
blocks writing to Apache).
We need to break the deadlock by spooling either the input
or the output. In this case, it's ideal to spool the input,
because Apache does not start reading either stdout _or_
stderr until we have consumed all of the input. So until we
do so, we cannot even get an error message out to the
client.
The solution is fairly straight-forward: we read the request
body into an in-memory buffer in http-backend, freeing up
Apache, and then feed the data ourselves to upload-pack. But
there are a few important things to note:
1. We limit the in-memory buffer to prevent an obvious
denial-of-service attack. This is a new hard limit on
requests, but it's unlikely to come into play. The
default value is 10MB, which covers even the ridiculous
100,000-ref negotation in the included test (that
actually caps out just over 5MB). But it's configurable
on the off chance that you don't mind spending some
extra memory to make even ridiculous requests work.
2. We must take care only to buffer when we have to. For
pushes, the incoming packfile may be of arbitrary
size, and we should connect the input directly to
receive-pack. There's no deadlock problem here, though,
because we do not produce any output until the whole
packfile has been read.
For upload-pack's initial ref advertisement, we
similarly do not need to buffer. Even though we may
generate a lot of output, there is no request body at
all (i.e., it is a GET, not a POST).
[1] http://article.gmane.org/gmane.comp.version-control.git/269020
Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-20 15:37:09 +08:00
|
|
|
/*
|
|
|
|
* This is basically strbuf_read(), except that if we
|
|
|
|
* hit max_request_buffer we die (we'd rather reject a
|
|
|
|
* maliciously large request than chew up infinite memory).
|
|
|
|
*/
|
|
|
|
static ssize_t read_request(int fd, unsigned char **out)
|
|
|
|
{
|
|
|
|
size_t len = 0, alloc = 8192;
|
|
|
|
unsigned char *buf = xmalloc(alloc);
|
|
|
|
|
|
|
|
if (max_request_buffer < alloc)
|
|
|
|
max_request_buffer = alloc;
|
|
|
|
|
|
|
|
while (1) {
|
|
|
|
ssize_t cnt;
|
|
|
|
|
|
|
|
cnt = read_in_full(fd, buf + len, alloc - len);
|
|
|
|
if (cnt < 0) {
|
|
|
|
free(buf);
|
|
|
|
return -1;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* partial read from read_in_full means we hit EOF */
|
|
|
|
len += cnt;
|
|
|
|
if (len < alloc) {
|
|
|
|
*out = buf;
|
|
|
|
return len;
|
|
|
|
}
|
|
|
|
|
|
|
|
/* otherwise, grow and try again (if we can) */
|
|
|
|
if (alloc == max_request_buffer)
|
|
|
|
die("request was larger than our maximum size (%lu);"
|
|
|
|
" try setting GIT_HTTP_MAX_REQUEST_BUFFER",
|
|
|
|
max_request_buffer);
|
|
|
|
|
|
|
|
alloc = alloc_nr(alloc);
|
|
|
|
if (alloc > max_request_buffer)
|
|
|
|
alloc = max_request_buffer;
|
|
|
|
REALLOC_ARRAY(buf, alloc);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
static void inflate_request(const char *prog_name, int out, int buffer_input)
|
2009-10-31 08:47:34 +08:00
|
|
|
{
|
2011-06-11 02:52:15 +08:00
|
|
|
git_zstream stream;
|
http-backend: spool ref negotiation requests to buffer
When http-backend spawns "upload-pack" to do ref
negotiation, it streams the http request body to
upload-pack, who then streams the http response back to the
client as it reads. In theory, git can go full-duplex; the
client can consume our response while it is still sending
the request. In practice, however, HTTP is a half-duplex
protocol. Even if our client is ready to read and write
simultaneously, we may have other HTTP infrastructure in the
way, including the webserver that spawns our CGI, or any
intermediate proxies.
In at least one documented case[1], this leads to deadlock
when trying a fetch over http. What happens is basically:
1. Apache proxies the request to the CGI, http-backend.
2. http-backend gzip-inflates the data and sends
the result to upload-pack.
3. upload-pack acts on the data and generates output over
the pipe back to Apache. Apache isn't reading because
it's busy writing (step 1).
This works fine most of the time, because the upload-pack
output ends up in a system pipe buffer, and Apache reads
it as soon as it finishes writing. But if both the request
and the response exceed the system pipe buffer size, then we
deadlock (Apache blocks writing to http-backend,
http-backend blocks writing to upload-pack, and upload-pack
blocks writing to Apache).
We need to break the deadlock by spooling either the input
or the output. In this case, it's ideal to spool the input,
because Apache does not start reading either stdout _or_
stderr until we have consumed all of the input. So until we
do so, we cannot even get an error message out to the
client.
The solution is fairly straight-forward: we read the request
body into an in-memory buffer in http-backend, freeing up
Apache, and then feed the data ourselves to upload-pack. But
there are a few important things to note:
1. We limit the in-memory buffer to prevent an obvious
denial-of-service attack. This is a new hard limit on
requests, but it's unlikely to come into play. The
default value is 10MB, which covers even the ridiculous
100,000-ref negotation in the included test (that
actually caps out just over 5MB). But it's configurable
on the off chance that you don't mind spending some
extra memory to make even ridiculous requests work.
2. We must take care only to buffer when we have to. For
pushes, the incoming packfile may be of arbitrary
size, and we should connect the input directly to
receive-pack. There's no deadlock problem here, though,
because we do not produce any output until the whole
packfile has been read.
For upload-pack's initial ref advertisement, we
similarly do not need to buffer. Even though we may
generate a lot of output, there is no request body at
all (i.e., it is a GET, not a POST).
[1] http://article.gmane.org/gmane.comp.version-control.git/269020
Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-20 15:37:09 +08:00
|
|
|
unsigned char *full_request = NULL;
|
2009-10-31 08:47:34 +08:00
|
|
|
unsigned char in_buf[8192];
|
|
|
|
unsigned char out_buf[8192];
|
|
|
|
unsigned long cnt = 0;
|
|
|
|
|
|
|
|
memset(&stream, 0, sizeof(stream));
|
2011-06-11 01:45:29 +08:00
|
|
|
git_inflate_init_gzip_only(&stream);
|
2009-10-31 08:47:34 +08:00
|
|
|
|
|
|
|
while (1) {
|
http-backend: spool ref negotiation requests to buffer
When http-backend spawns "upload-pack" to do ref
negotiation, it streams the http request body to
upload-pack, who then streams the http response back to the
client as it reads. In theory, git can go full-duplex; the
client can consume our response while it is still sending
the request. In practice, however, HTTP is a half-duplex
protocol. Even if our client is ready to read and write
simultaneously, we may have other HTTP infrastructure in the
way, including the webserver that spawns our CGI, or any
intermediate proxies.
In at least one documented case[1], this leads to deadlock
when trying a fetch over http. What happens is basically:
1. Apache proxies the request to the CGI, http-backend.
2. http-backend gzip-inflates the data and sends
the result to upload-pack.
3. upload-pack acts on the data and generates output over
the pipe back to Apache. Apache isn't reading because
it's busy writing (step 1).
This works fine most of the time, because the upload-pack
output ends up in a system pipe buffer, and Apache reads
it as soon as it finishes writing. But if both the request
and the response exceed the system pipe buffer size, then we
deadlock (Apache blocks writing to http-backend,
http-backend blocks writing to upload-pack, and upload-pack
blocks writing to Apache).
We need to break the deadlock by spooling either the input
or the output. In this case, it's ideal to spool the input,
because Apache does not start reading either stdout _or_
stderr until we have consumed all of the input. So until we
do so, we cannot even get an error message out to the
client.
The solution is fairly straight-forward: we read the request
body into an in-memory buffer in http-backend, freeing up
Apache, and then feed the data ourselves to upload-pack. But
there are a few important things to note:
1. We limit the in-memory buffer to prevent an obvious
denial-of-service attack. This is a new hard limit on
requests, but it's unlikely to come into play. The
default value is 10MB, which covers even the ridiculous
100,000-ref negotation in the included test (that
actually caps out just over 5MB). But it's configurable
on the off chance that you don't mind spending some
extra memory to make even ridiculous requests work.
2. We must take care only to buffer when we have to. For
pushes, the incoming packfile may be of arbitrary
size, and we should connect the input directly to
receive-pack. There's no deadlock problem here, though,
because we do not produce any output until the whole
packfile has been read.
For upload-pack's initial ref advertisement, we
similarly do not need to buffer. Even though we may
generate a lot of output, there is no request body at
all (i.e., it is a GET, not a POST).
[1] http://article.gmane.org/gmane.comp.version-control.git/269020
Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-20 15:37:09 +08:00
|
|
|
ssize_t n;
|
|
|
|
|
|
|
|
if (buffer_input) {
|
|
|
|
if (full_request)
|
|
|
|
n = 0; /* nothing left to read */
|
|
|
|
else
|
|
|
|
n = read_request(0, &full_request);
|
|
|
|
stream.next_in = full_request;
|
|
|
|
} else {
|
|
|
|
n = xread(0, in_buf, sizeof(in_buf));
|
|
|
|
stream.next_in = in_buf;
|
|
|
|
}
|
|
|
|
|
2009-10-31 08:47:34 +08:00
|
|
|
if (n <= 0)
|
|
|
|
die("request ended in the middle of the gzip stream");
|
|
|
|
stream.avail_in = n;
|
|
|
|
|
|
|
|
while (0 < stream.avail_in) {
|
|
|
|
int ret;
|
|
|
|
|
|
|
|
stream.next_out = out_buf;
|
|
|
|
stream.avail_out = sizeof(out_buf);
|
|
|
|
|
2011-06-11 01:39:27 +08:00
|
|
|
ret = git_inflate(&stream, Z_NO_FLUSH);
|
2009-10-31 08:47:34 +08:00
|
|
|
if (ret != Z_OK && ret != Z_STREAM_END)
|
|
|
|
die("zlib error inflating request, result %d", ret);
|
|
|
|
|
|
|
|
n = stream.total_out - cnt;
|
|
|
|
if (write_in_full(out, out_buf, n) != n)
|
|
|
|
die("%s aborted reading request", prog_name);
|
|
|
|
cnt += n;
|
|
|
|
|
|
|
|
if (ret == Z_STREAM_END)
|
|
|
|
goto done;
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
|
|
|
done:
|
2011-06-11 01:39:27 +08:00
|
|
|
git_inflate_end(&stream);
|
2009-10-31 08:47:34 +08:00
|
|
|
close(out);
|
http-backend: spool ref negotiation requests to buffer
When http-backend spawns "upload-pack" to do ref
negotiation, it streams the http request body to
upload-pack, who then streams the http response back to the
client as it reads. In theory, git can go full-duplex; the
client can consume our response while it is still sending
the request. In practice, however, HTTP is a half-duplex
protocol. Even if our client is ready to read and write
simultaneously, we may have other HTTP infrastructure in the
way, including the webserver that spawns our CGI, or any
intermediate proxies.
In at least one documented case[1], this leads to deadlock
when trying a fetch over http. What happens is basically:
1. Apache proxies the request to the CGI, http-backend.
2. http-backend gzip-inflates the data and sends
the result to upload-pack.
3. upload-pack acts on the data and generates output over
the pipe back to Apache. Apache isn't reading because
it's busy writing (step 1).
This works fine most of the time, because the upload-pack
output ends up in a system pipe buffer, and Apache reads
it as soon as it finishes writing. But if both the request
and the response exceed the system pipe buffer size, then we
deadlock (Apache blocks writing to http-backend,
http-backend blocks writing to upload-pack, and upload-pack
blocks writing to Apache).
We need to break the deadlock by spooling either the input
or the output. In this case, it's ideal to spool the input,
because Apache does not start reading either stdout _or_
stderr until we have consumed all of the input. So until we
do so, we cannot even get an error message out to the
client.
The solution is fairly straight-forward: we read the request
body into an in-memory buffer in http-backend, freeing up
Apache, and then feed the data ourselves to upload-pack. But
there are a few important things to note:
1. We limit the in-memory buffer to prevent an obvious
denial-of-service attack. This is a new hard limit on
requests, but it's unlikely to come into play. The
default value is 10MB, which covers even the ridiculous
100,000-ref negotation in the included test (that
actually caps out just over 5MB). But it's configurable
on the off chance that you don't mind spending some
extra memory to make even ridiculous requests work.
2. We must take care only to buffer when we have to. For
pushes, the incoming packfile may be of arbitrary
size, and we should connect the input directly to
receive-pack. There's no deadlock problem here, though,
because we do not produce any output until the whole
packfile has been read.
For upload-pack's initial ref advertisement, we
similarly do not need to buffer. Even though we may
generate a lot of output, there is no request body at
all (i.e., it is a GET, not a POST).
[1] http://article.gmane.org/gmane.comp.version-control.git/269020
Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-20 15:37:09 +08:00
|
|
|
free(full_request);
|
|
|
|
}
|
|
|
|
|
|
|
|
static void copy_request(const char *prog_name, int out)
|
|
|
|
{
|
|
|
|
unsigned char *buf;
|
|
|
|
ssize_t n = read_request(0, &buf);
|
|
|
|
if (n < 0)
|
|
|
|
die_errno("error reading request body");
|
|
|
|
if (write_in_full(out, buf, n) != n)
|
|
|
|
die("%s aborted reading request", prog_name);
|
|
|
|
close(out);
|
|
|
|
free(buf);
|
2009-10-31 08:47:34 +08:00
|
|
|
}
|
|
|
|
|
http-backend: spool ref negotiation requests to buffer
When http-backend spawns "upload-pack" to do ref
negotiation, it streams the http request body to
upload-pack, who then streams the http response back to the
client as it reads. In theory, git can go full-duplex; the
client can consume our response while it is still sending
the request. In practice, however, HTTP is a half-duplex
protocol. Even if our client is ready to read and write
simultaneously, we may have other HTTP infrastructure in the
way, including the webserver that spawns our CGI, or any
intermediate proxies.
In at least one documented case[1], this leads to deadlock
when trying a fetch over http. What happens is basically:
1. Apache proxies the request to the CGI, http-backend.
2. http-backend gzip-inflates the data and sends
the result to upload-pack.
3. upload-pack acts on the data and generates output over
the pipe back to Apache. Apache isn't reading because
it's busy writing (step 1).
This works fine most of the time, because the upload-pack
output ends up in a system pipe buffer, and Apache reads
it as soon as it finishes writing. But if both the request
and the response exceed the system pipe buffer size, then we
deadlock (Apache blocks writing to http-backend,
http-backend blocks writing to upload-pack, and upload-pack
blocks writing to Apache).
We need to break the deadlock by spooling either the input
or the output. In this case, it's ideal to spool the input,
because Apache does not start reading either stdout _or_
stderr until we have consumed all of the input. So until we
do so, we cannot even get an error message out to the
client.
The solution is fairly straight-forward: we read the request
body into an in-memory buffer in http-backend, freeing up
Apache, and then feed the data ourselves to upload-pack. But
there are a few important things to note:
1. We limit the in-memory buffer to prevent an obvious
denial-of-service attack. This is a new hard limit on
requests, but it's unlikely to come into play. The
default value is 10MB, which covers even the ridiculous
100,000-ref negotation in the included test (that
actually caps out just over 5MB). But it's configurable
on the off chance that you don't mind spending some
extra memory to make even ridiculous requests work.
2. We must take care only to buffer when we have to. For
pushes, the incoming packfile may be of arbitrary
size, and we should connect the input directly to
receive-pack. There's no deadlock problem here, though,
because we do not produce any output until the whole
packfile has been read.
For upload-pack's initial ref advertisement, we
similarly do not need to buffer. Even though we may
generate a lot of output, there is no request body at
all (i.e., it is a GET, not a POST).
[1] http://article.gmane.org/gmane.comp.version-control.git/269020
Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-20 15:37:09 +08:00
|
|
|
static void run_service(const char **argv, int buffer_input)
|
2009-10-31 08:47:34 +08:00
|
|
|
{
|
|
|
|
const char *encoding = getenv("HTTP_CONTENT_ENCODING");
|
|
|
|
const char *user = getenv("REMOTE_USER");
|
|
|
|
const char *host = getenv("REMOTE_ADDR");
|
|
|
|
int gzipped_request = 0;
|
2014-08-20 03:09:35 +08:00
|
|
|
struct child_process cld = CHILD_PROCESS_INIT;
|
2009-10-31 08:47:34 +08:00
|
|
|
|
|
|
|
if (encoding && !strcmp(encoding, "gzip"))
|
|
|
|
gzipped_request = 1;
|
|
|
|
else if (encoding && !strcmp(encoding, "x-gzip"))
|
|
|
|
gzipped_request = 1;
|
|
|
|
|
|
|
|
if (!user || !*user)
|
|
|
|
user = "anonymous";
|
|
|
|
if (!host || !*host)
|
|
|
|
host = "(none)";
|
|
|
|
|
2012-03-30 15:01:30 +08:00
|
|
|
if (!getenv("GIT_COMMITTER_NAME"))
|
2014-10-19 19:14:20 +08:00
|
|
|
argv_array_pushf(&cld.env_array, "GIT_COMMITTER_NAME=%s", user);
|
2012-03-30 15:01:30 +08:00
|
|
|
if (!getenv("GIT_COMMITTER_EMAIL"))
|
2014-10-19 19:14:20 +08:00
|
|
|
argv_array_pushf(&cld.env_array,
|
|
|
|
"GIT_COMMITTER_EMAIL=%s@http.%s", user, host);
|
2009-10-31 08:47:34 +08:00
|
|
|
|
|
|
|
cld.argv = argv;
|
http-backend: spool ref negotiation requests to buffer
When http-backend spawns "upload-pack" to do ref
negotiation, it streams the http request body to
upload-pack, who then streams the http response back to the
client as it reads. In theory, git can go full-duplex; the
client can consume our response while it is still sending
the request. In practice, however, HTTP is a half-duplex
protocol. Even if our client is ready to read and write
simultaneously, we may have other HTTP infrastructure in the
way, including the webserver that spawns our CGI, or any
intermediate proxies.
In at least one documented case[1], this leads to deadlock
when trying a fetch over http. What happens is basically:
1. Apache proxies the request to the CGI, http-backend.
2. http-backend gzip-inflates the data and sends
the result to upload-pack.
3. upload-pack acts on the data and generates output over
the pipe back to Apache. Apache isn't reading because
it's busy writing (step 1).
This works fine most of the time, because the upload-pack
output ends up in a system pipe buffer, and Apache reads
it as soon as it finishes writing. But if both the request
and the response exceed the system pipe buffer size, then we
deadlock (Apache blocks writing to http-backend,
http-backend blocks writing to upload-pack, and upload-pack
blocks writing to Apache).
We need to break the deadlock by spooling either the input
or the output. In this case, it's ideal to spool the input,
because Apache does not start reading either stdout _or_
stderr until we have consumed all of the input. So until we
do so, we cannot even get an error message out to the
client.
The solution is fairly straight-forward: we read the request
body into an in-memory buffer in http-backend, freeing up
Apache, and then feed the data ourselves to upload-pack. But
there are a few important things to note:
1. We limit the in-memory buffer to prevent an obvious
denial-of-service attack. This is a new hard limit on
requests, but it's unlikely to come into play. The
default value is 10MB, which covers even the ridiculous
100,000-ref negotation in the included test (that
actually caps out just over 5MB). But it's configurable
on the off chance that you don't mind spending some
extra memory to make even ridiculous requests work.
2. We must take care only to buffer when we have to. For
pushes, the incoming packfile may be of arbitrary
size, and we should connect the input directly to
receive-pack. There's no deadlock problem here, though,
because we do not produce any output until the whole
packfile has been read.
For upload-pack's initial ref advertisement, we
similarly do not need to buffer. Even though we may
generate a lot of output, there is no request body at
all (i.e., it is a GET, not a POST).
[1] http://article.gmane.org/gmane.comp.version-control.git/269020
Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-20 15:37:09 +08:00
|
|
|
if (buffer_input || gzipped_request)
|
2009-10-31 08:47:34 +08:00
|
|
|
cld.in = -1;
|
|
|
|
cld.git_cmd = 1;
|
|
|
|
if (start_command(&cld))
|
|
|
|
exit(1);
|
|
|
|
|
|
|
|
close(1);
|
|
|
|
if (gzipped_request)
|
http-backend: spool ref negotiation requests to buffer
When http-backend spawns "upload-pack" to do ref
negotiation, it streams the http request body to
upload-pack, who then streams the http response back to the
client as it reads. In theory, git can go full-duplex; the
client can consume our response while it is still sending
the request. In practice, however, HTTP is a half-duplex
protocol. Even if our client is ready to read and write
simultaneously, we may have other HTTP infrastructure in the
way, including the webserver that spawns our CGI, or any
intermediate proxies.
In at least one documented case[1], this leads to deadlock
when trying a fetch over http. What happens is basically:
1. Apache proxies the request to the CGI, http-backend.
2. http-backend gzip-inflates the data and sends
the result to upload-pack.
3. upload-pack acts on the data and generates output over
the pipe back to Apache. Apache isn't reading because
it's busy writing (step 1).
This works fine most of the time, because the upload-pack
output ends up in a system pipe buffer, and Apache reads
it as soon as it finishes writing. But if both the request
and the response exceed the system pipe buffer size, then we
deadlock (Apache blocks writing to http-backend,
http-backend blocks writing to upload-pack, and upload-pack
blocks writing to Apache).
We need to break the deadlock by spooling either the input
or the output. In this case, it's ideal to spool the input,
because Apache does not start reading either stdout _or_
stderr until we have consumed all of the input. So until we
do so, we cannot even get an error message out to the
client.
The solution is fairly straight-forward: we read the request
body into an in-memory buffer in http-backend, freeing up
Apache, and then feed the data ourselves to upload-pack. But
there are a few important things to note:
1. We limit the in-memory buffer to prevent an obvious
denial-of-service attack. This is a new hard limit on
requests, but it's unlikely to come into play. The
default value is 10MB, which covers even the ridiculous
100,000-ref negotation in the included test (that
actually caps out just over 5MB). But it's configurable
on the off chance that you don't mind spending some
extra memory to make even ridiculous requests work.
2. We must take care only to buffer when we have to. For
pushes, the incoming packfile may be of arbitrary
size, and we should connect the input directly to
receive-pack. There's no deadlock problem here, though,
because we do not produce any output until the whole
packfile has been read.
For upload-pack's initial ref advertisement, we
similarly do not need to buffer. Even though we may
generate a lot of output, there is no request body at
all (i.e., it is a GET, not a POST).
[1] http://article.gmane.org/gmane.comp.version-control.git/269020
Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-20 15:37:09 +08:00
|
|
|
inflate_request(argv[0], cld.in, buffer_input);
|
|
|
|
else if (buffer_input)
|
|
|
|
copy_request(argv[0], cld.in);
|
2009-10-31 08:47:34 +08:00
|
|
|
else
|
|
|
|
close(0);
|
|
|
|
|
|
|
|
if (finish_command(&cld))
|
|
|
|
exit(1);
|
|
|
|
}
|
|
|
|
|
2015-05-26 02:38:55 +08:00
|
|
|
static int show_text_ref(const char *name, const struct object_id *oid,
|
|
|
|
int flag, void *cb_data)
|
2009-10-31 08:47:32 +08:00
|
|
|
{
|
2013-04-10 08:55:08 +08:00
|
|
|
const char *name_nons = strip_namespace(name);
|
2009-10-31 08:47:32 +08:00
|
|
|
struct strbuf *buf = cb_data;
|
object: convert parse_object* to take struct object_id
Make parse_object, parse_object_or_die, and parse_object_buffer take a
pointer to struct object_id. Remove the temporary variables inserted
earlier, since they are no longer necessary. Transform all of the
callers using the following semantic patch:
@@
expression E1;
@@
- parse_object(E1.hash)
+ parse_object(&E1)
@@
expression E1;
@@
- parse_object(E1->hash)
+ parse_object(E1)
@@
expression E1, E2;
@@
- parse_object_or_die(E1.hash, E2)
+ parse_object_or_die(&E1, E2)
@@
expression E1, E2;
@@
- parse_object_or_die(E1->hash, E2)
+ parse_object_or_die(E1, E2)
@@
expression E1, E2, E3, E4, E5;
@@
- parse_object_buffer(E1.hash, E2, E3, E4, E5)
+ parse_object_buffer(&E1, E2, E3, E4, E5)
@@
expression E1, E2, E3, E4, E5;
@@
- parse_object_buffer(E1->hash, E2, E3, E4, E5)
+ parse_object_buffer(E1, E2, E3, E4, E5)
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-07 06:10:38 +08:00
|
|
|
struct object *o = parse_object(oid);
|
2009-10-31 08:47:32 +08:00
|
|
|
if (!o)
|
|
|
|
return 0;
|
|
|
|
|
2015-05-26 02:38:55 +08:00
|
|
|
strbuf_addf(buf, "%s\t%s\n", oid_to_hex(oid), name_nons);
|
2009-10-31 08:47:32 +08:00
|
|
|
if (o->type == OBJ_TAG) {
|
|
|
|
o = deref_tag(o, name, 0);
|
|
|
|
if (!o)
|
|
|
|
return 0;
|
2015-11-10 10:22:28 +08:00
|
|
|
strbuf_addf(buf, "%s\t%s^{}\n", oid_to_hex(&o->oid),
|
2013-04-10 08:55:08 +08:00
|
|
|
name_nons);
|
2009-10-31 08:47:32 +08:00
|
|
|
}
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
static void get_info_refs(struct strbuf *hdr, char *arg)
|
2009-10-31 08:47:32 +08:00
|
|
|
{
|
2009-10-31 08:47:34 +08:00
|
|
|
const char *service_name = get_parameter("service");
|
2009-10-31 08:47:32 +08:00
|
|
|
struct strbuf buf = STRBUF_INIT;
|
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
hdr_nocache(hdr);
|
2009-10-31 08:47:34 +08:00
|
|
|
|
|
|
|
if (service_name) {
|
|
|
|
const char *argv[] = {NULL /* service name */,
|
|
|
|
"--stateless-rpc", "--advertise-refs",
|
|
|
|
".", NULL};
|
2016-08-10 07:47:31 +08:00
|
|
|
struct rpc_service *svc = select_service(hdr, service_name);
|
2009-10-31 08:47:34 +08:00
|
|
|
|
|
|
|
strbuf_addf(&buf, "application/x-git-%s-advertisement",
|
|
|
|
svc->name);
|
2016-08-10 07:47:31 +08:00
|
|
|
hdr_str(hdr, content_type, buf.buf);
|
|
|
|
end_headers(hdr);
|
2009-10-31 08:47:34 +08:00
|
|
|
|
2016-10-17 07:20:29 +08:00
|
|
|
packet_write_fmt(1, "# service=git-%s\n", svc->name);
|
2009-10-31 08:47:34 +08:00
|
|
|
packet_flush(1);
|
|
|
|
|
|
|
|
argv[0] = svc->name;
|
http-backend: spool ref negotiation requests to buffer
When http-backend spawns "upload-pack" to do ref
negotiation, it streams the http request body to
upload-pack, who then streams the http response back to the
client as it reads. In theory, git can go full-duplex; the
client can consume our response while it is still sending
the request. In practice, however, HTTP is a half-duplex
protocol. Even if our client is ready to read and write
simultaneously, we may have other HTTP infrastructure in the
way, including the webserver that spawns our CGI, or any
intermediate proxies.
In at least one documented case[1], this leads to deadlock
when trying a fetch over http. What happens is basically:
1. Apache proxies the request to the CGI, http-backend.
2. http-backend gzip-inflates the data and sends
the result to upload-pack.
3. upload-pack acts on the data and generates output over
the pipe back to Apache. Apache isn't reading because
it's busy writing (step 1).
This works fine most of the time, because the upload-pack
output ends up in a system pipe buffer, and Apache reads
it as soon as it finishes writing. But if both the request
and the response exceed the system pipe buffer size, then we
deadlock (Apache blocks writing to http-backend,
http-backend blocks writing to upload-pack, and upload-pack
blocks writing to Apache).
We need to break the deadlock by spooling either the input
or the output. In this case, it's ideal to spool the input,
because Apache does not start reading either stdout _or_
stderr until we have consumed all of the input. So until we
do so, we cannot even get an error message out to the
client.
The solution is fairly straight-forward: we read the request
body into an in-memory buffer in http-backend, freeing up
Apache, and then feed the data ourselves to upload-pack. But
there are a few important things to note:
1. We limit the in-memory buffer to prevent an obvious
denial-of-service attack. This is a new hard limit on
requests, but it's unlikely to come into play. The
default value is 10MB, which covers even the ridiculous
100,000-ref negotation in the included test (that
actually caps out just over 5MB). But it's configurable
on the off chance that you don't mind spending some
extra memory to make even ridiculous requests work.
2. We must take care only to buffer when we have to. For
pushes, the incoming packfile may be of arbitrary
size, and we should connect the input directly to
receive-pack. There's no deadlock problem here, though,
because we do not produce any output until the whole
packfile has been read.
For upload-pack's initial ref advertisement, we
similarly do not need to buffer. Even though we may
generate a lot of output, there is no request body at
all (i.e., it is a GET, not a POST).
[1] http://article.gmane.org/gmane.comp.version-control.git/269020
Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-20 15:37:09 +08:00
|
|
|
run_service(argv, 0);
|
2009-10-31 08:47:34 +08:00
|
|
|
|
|
|
|
} else {
|
2016-08-10 07:47:31 +08:00
|
|
|
select_getanyfile(hdr);
|
2013-04-10 08:55:08 +08:00
|
|
|
for_each_namespaced_ref(show_text_ref, &buf);
|
2016-08-10 07:47:31 +08:00
|
|
|
send_strbuf(hdr, "text/plain", &buf);
|
2009-10-31 08:47:34 +08:00
|
|
|
}
|
2009-10-31 08:47:32 +08:00
|
|
|
strbuf_release(&buf);
|
|
|
|
}
|
|
|
|
|
2015-05-26 02:38:55 +08:00
|
|
|
static int show_head_ref(const char *refname, const struct object_id *oid,
|
|
|
|
int flag, void *cb_data)
|
2013-04-10 08:55:08 +08:00
|
|
|
{
|
|
|
|
struct strbuf *buf = cb_data;
|
|
|
|
|
|
|
|
if (flag & REF_ISSYMREF) {
|
2015-05-26 02:38:56 +08:00
|
|
|
struct object_id unused;
|
2014-07-16 03:59:36 +08:00
|
|
|
const char *target = resolve_ref_unsafe(refname,
|
|
|
|
RESOLVE_REF_READING,
|
2015-05-26 02:38:56 +08:00
|
|
|
unused.hash, NULL);
|
2013-04-10 08:55:08 +08:00
|
|
|
|
2016-04-08 03:03:09 +08:00
|
|
|
if (target)
|
|
|
|
strbuf_addf(buf, "ref: %s\n", strip_namespace(target));
|
2013-04-10 08:55:08 +08:00
|
|
|
} else {
|
2015-05-26 02:38:55 +08:00
|
|
|
strbuf_addf(buf, "%s\n", oid_to_hex(oid));
|
2013-04-10 08:55:08 +08:00
|
|
|
}
|
|
|
|
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
static void get_head(struct strbuf *hdr, char *arg)
|
2013-04-10 08:55:08 +08:00
|
|
|
{
|
|
|
|
struct strbuf buf = STRBUF_INIT;
|
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
select_getanyfile(hdr);
|
2013-04-10 08:55:08 +08:00
|
|
|
head_ref_namespaced(show_head_ref, &buf);
|
2016-08-10 07:47:31 +08:00
|
|
|
send_strbuf(hdr, "text/plain", &buf);
|
2013-04-10 08:55:08 +08:00
|
|
|
strbuf_release(&buf);
|
|
|
|
}
|
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
static void get_info_packs(struct strbuf *hdr, char *arg)
|
2009-10-31 08:47:32 +08:00
|
|
|
{
|
|
|
|
size_t objdirlen = strlen(get_object_directory());
|
|
|
|
struct strbuf buf = STRBUF_INIT;
|
|
|
|
struct packed_git *p;
|
|
|
|
size_t cnt = 0;
|
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
select_getanyfile(hdr);
|
2009-10-31 08:47:32 +08:00
|
|
|
prepare_packed_git();
|
|
|
|
for (p = packed_git; p; p = p->next) {
|
|
|
|
if (p->pack_local)
|
|
|
|
cnt++;
|
|
|
|
}
|
|
|
|
|
|
|
|
strbuf_grow(&buf, cnt * 53 + 2);
|
|
|
|
for (p = packed_git; p; p = p->next) {
|
|
|
|
if (p->pack_local)
|
|
|
|
strbuf_addf(&buf, "P %s\n", p->pack_name + objdirlen + 6);
|
|
|
|
}
|
|
|
|
strbuf_addch(&buf, '\n');
|
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
hdr_nocache(hdr);
|
|
|
|
send_strbuf(hdr, "text/plain; charset=utf-8", &buf);
|
2009-10-31 08:47:32 +08:00
|
|
|
strbuf_release(&buf);
|
|
|
|
}
|
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
static void check_content_type(struct strbuf *hdr, const char *accepted_type)
|
2009-10-31 08:47:34 +08:00
|
|
|
{
|
|
|
|
const char *actual_type = getenv("CONTENT_TYPE");
|
|
|
|
|
|
|
|
if (!actual_type)
|
|
|
|
actual_type = "";
|
|
|
|
|
|
|
|
if (strcmp(actual_type, accepted_type)) {
|
2016-08-10 07:47:31 +08:00
|
|
|
http_status(hdr, 415, "Unsupported Media Type");
|
|
|
|
hdr_nocache(hdr);
|
|
|
|
end_headers(hdr);
|
2009-10-31 08:47:34 +08:00
|
|
|
format_write(1,
|
|
|
|
"Expected POST with Content-Type '%s',"
|
|
|
|
" but received '%s' instead.\n",
|
|
|
|
accepted_type, actual_type);
|
|
|
|
exit(0);
|
|
|
|
}
|
|
|
|
}
|
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
static void service_rpc(struct strbuf *hdr, char *service_name)
|
2009-10-31 08:47:34 +08:00
|
|
|
{
|
|
|
|
const char *argv[] = {NULL, "--stateless-rpc", ".", NULL};
|
2016-08-10 07:47:31 +08:00
|
|
|
struct rpc_service *svc = select_service(hdr, service_name);
|
2009-10-31 08:47:34 +08:00
|
|
|
struct strbuf buf = STRBUF_INIT;
|
|
|
|
|
|
|
|
strbuf_reset(&buf);
|
|
|
|
strbuf_addf(&buf, "application/x-git-%s-request", svc->name);
|
2016-08-10 07:47:31 +08:00
|
|
|
check_content_type(hdr, buf.buf);
|
2009-10-31 08:47:34 +08:00
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
hdr_nocache(hdr);
|
2009-10-31 08:47:34 +08:00
|
|
|
|
|
|
|
strbuf_reset(&buf);
|
|
|
|
strbuf_addf(&buf, "application/x-git-%s-result", svc->name);
|
2016-08-10 07:47:31 +08:00
|
|
|
hdr_str(hdr, content_type, buf.buf);
|
2009-10-31 08:47:34 +08:00
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
end_headers(hdr);
|
2009-10-31 08:47:34 +08:00
|
|
|
|
|
|
|
argv[0] = svc->name;
|
http-backend: spool ref negotiation requests to buffer
When http-backend spawns "upload-pack" to do ref
negotiation, it streams the http request body to
upload-pack, who then streams the http response back to the
client as it reads. In theory, git can go full-duplex; the
client can consume our response while it is still sending
the request. In practice, however, HTTP is a half-duplex
protocol. Even if our client is ready to read and write
simultaneously, we may have other HTTP infrastructure in the
way, including the webserver that spawns our CGI, or any
intermediate proxies.
In at least one documented case[1], this leads to deadlock
when trying a fetch over http. What happens is basically:
1. Apache proxies the request to the CGI, http-backend.
2. http-backend gzip-inflates the data and sends
the result to upload-pack.
3. upload-pack acts on the data and generates output over
the pipe back to Apache. Apache isn't reading because
it's busy writing (step 1).
This works fine most of the time, because the upload-pack
output ends up in a system pipe buffer, and Apache reads
it as soon as it finishes writing. But if both the request
and the response exceed the system pipe buffer size, then we
deadlock (Apache blocks writing to http-backend,
http-backend blocks writing to upload-pack, and upload-pack
blocks writing to Apache).
We need to break the deadlock by spooling either the input
or the output. In this case, it's ideal to spool the input,
because Apache does not start reading either stdout _or_
stderr until we have consumed all of the input. So until we
do so, we cannot even get an error message out to the
client.
The solution is fairly straight-forward: we read the request
body into an in-memory buffer in http-backend, freeing up
Apache, and then feed the data ourselves to upload-pack. But
there are a few important things to note:
1. We limit the in-memory buffer to prevent an obvious
denial-of-service attack. This is a new hard limit on
requests, but it's unlikely to come into play. The
default value is 10MB, which covers even the ridiculous
100,000-ref negotation in the included test (that
actually caps out just over 5MB). But it's configurable
on the off chance that you don't mind spending some
extra memory to make even ridiculous requests work.
2. We must take care only to buffer when we have to. For
pushes, the incoming packfile may be of arbitrary
size, and we should connect the input directly to
receive-pack. There's no deadlock problem here, though,
because we do not produce any output until the whole
packfile has been read.
For upload-pack's initial ref advertisement, we
similarly do not need to buffer. Even though we may
generate a lot of output, there is no request body at
all (i.e., it is a GET, not a POST).
[1] http://article.gmane.org/gmane.comp.version-control.git/269020
Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-20 15:37:09 +08:00
|
|
|
run_service(argv, svc->buffer_input);
|
2009-10-31 08:47:34 +08:00
|
|
|
strbuf_release(&buf);
|
|
|
|
}
|
|
|
|
|
http-backend: fix die recursion with custom handler
When we die() in http-backend, we call a custom handler that
writes an HTTP 500 response to stdout, then reports the
error to stderr. Our routines for writing out the HTTP
response may themselves die, leading to us entering die()
again.
When it was originally written, that was OK; our custom
handler keeps a variable to notice this and does not
recurse. However, since cd163d4 (usage.c: detect recursion
in die routines and bail out immediately, 2012-11-14), the
main die() implementation detects recursion before we even
get to our custom handler, and bails without printing
anything useful.
We can handle this case by doing two things:
1. Installing a custom die_is_recursing handler that
allows us to enter up to one level of recursion. Only
the first call to our custom handler will try to write
out the error response. So if we die again, that is OK.
If we end up dying more than that, it is a sign that we
are in an infinite recursion.
2. Reporting the error to stderr before trying to write
out the HTTP response. In the current code, if we do
die() trying to write out the response, we'll exit
immediately from this second die(), and never get a
chance to output the original error (which is almost
certainly the more interesting one; the second die is
just going to be along the lines of "I tried to write
to stdout but it was closed").
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-15 14:29:27 +08:00
|
|
|
static int dead;
|
2009-10-31 08:47:32 +08:00
|
|
|
static NORETURN void die_webcgi(const char *err, va_list params)
|
|
|
|
{
|
http-backend: fix die recursion with custom handler
When we die() in http-backend, we call a custom handler that
writes an HTTP 500 response to stdout, then reports the
error to stderr. Our routines for writing out the HTTP
response may themselves die, leading to us entering die()
again.
When it was originally written, that was OK; our custom
handler keeps a variable to notice this and does not
recurse. However, since cd163d4 (usage.c: detect recursion
in die routines and bail out immediately, 2012-11-14), the
main die() implementation detects recursion before we even
get to our custom handler, and bails without printing
anything useful.
We can handle this case by doing two things:
1. Installing a custom die_is_recursing handler that
allows us to enter up to one level of recursion. Only
the first call to our custom handler will try to write
out the error response. So if we die again, that is OK.
If we end up dying more than that, it is a sign that we
are in an infinite recursion.
2. Reporting the error to stderr before trying to write
out the HTTP response. In the current code, if we do
die() trying to write out the response, we'll exit
immediately from this second die(), and never get a
chance to output the original error (which is almost
certainly the more interesting one; the second die is
just going to be along the lines of "I tried to write
to stdout but it was closed").
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-15 14:29:27 +08:00
|
|
|
if (dead <= 1) {
|
2016-08-10 07:47:31 +08:00
|
|
|
struct strbuf hdr = STRBUF_INIT;
|
|
|
|
|
http-backend: fix die recursion with custom handler
When we die() in http-backend, we call a custom handler that
writes an HTTP 500 response to stdout, then reports the
error to stderr. Our routines for writing out the HTTP
response may themselves die, leading to us entering die()
again.
When it was originally written, that was OK; our custom
handler keeps a variable to notice this and does not
recurse. However, since cd163d4 (usage.c: detect recursion
in die routines and bail out immediately, 2012-11-14), the
main die() implementation detects recursion before we even
get to our custom handler, and bails without printing
anything useful.
We can handle this case by doing two things:
1. Installing a custom die_is_recursing handler that
allows us to enter up to one level of recursion. Only
the first call to our custom handler will try to write
out the error response. So if we die again, that is OK.
If we end up dying more than that, it is a sign that we
are in an infinite recursion.
2. Reporting the error to stderr before trying to write
out the HTTP response. In the current code, if we do
die() trying to write out the response, we'll exit
immediately from this second die(), and never get a
chance to output the original error (which is almost
certainly the more interesting one; the second die is
just going to be along the lines of "I tried to write
to stdout but it was closed").
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-15 14:29:27 +08:00
|
|
|
vreportf("fatal: ", err, params);
|
2009-10-31 08:47:32 +08:00
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
http_status(&hdr, 500, "Internal Server Error");
|
|
|
|
hdr_nocache(&hdr);
|
|
|
|
end_headers(&hdr);
|
2010-03-22 22:22:04 +08:00
|
|
|
}
|
|
|
|
exit(0); /* we successfully reported a failure ;-) */
|
2009-10-31 08:47:32 +08:00
|
|
|
}
|
|
|
|
|
http-backend: fix die recursion with custom handler
When we die() in http-backend, we call a custom handler that
writes an HTTP 500 response to stdout, then reports the
error to stderr. Our routines for writing out the HTTP
response may themselves die, leading to us entering die()
again.
When it was originally written, that was OK; our custom
handler keeps a variable to notice this and does not
recurse. However, since cd163d4 (usage.c: detect recursion
in die routines and bail out immediately, 2012-11-14), the
main die() implementation detects recursion before we even
get to our custom handler, and bails without printing
anything useful.
We can handle this case by doing two things:
1. Installing a custom die_is_recursing handler that
allows us to enter up to one level of recursion. Only
the first call to our custom handler will try to write
out the error response. So if we die again, that is OK.
If we end up dying more than that, it is a sign that we
are in an infinite recursion.
2. Reporting the error to stderr before trying to write
out the HTTP response. In the current code, if we do
die() trying to write out the response, we'll exit
immediately from this second die(), and never get a
chance to output the original error (which is almost
certainly the more interesting one; the second die is
just going to be along the lines of "I tried to write
to stdout but it was closed").
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-15 14:29:27 +08:00
|
|
|
static int die_webcgi_recursing(void)
|
|
|
|
{
|
|
|
|
return dead++ > 1;
|
|
|
|
}
|
|
|
|
|
2009-10-31 08:47:35 +08:00
|
|
|
static char* getdir(void)
|
|
|
|
{
|
|
|
|
struct strbuf buf = STRBUF_INIT;
|
|
|
|
char *pathinfo = getenv("PATH_INFO");
|
|
|
|
char *root = getenv("GIT_PROJECT_ROOT");
|
|
|
|
char *path = getenv("PATH_TRANSLATED");
|
|
|
|
|
|
|
|
if (root && *root) {
|
|
|
|
if (!pathinfo || !*pathinfo)
|
|
|
|
die("GIT_PROJECT_ROOT is set but PATH_INFO is not");
|
2009-11-10 03:26:43 +08:00
|
|
|
if (daemon_avoid_alias(pathinfo))
|
|
|
|
die("'%s': aliased", pathinfo);
|
2010-11-25 16:21:06 +08:00
|
|
|
end_url_with_slash(&buf, root);
|
2009-11-10 03:26:43 +08:00
|
|
|
if (pathinfo[0] == '/')
|
|
|
|
pathinfo++;
|
2009-10-31 08:47:35 +08:00
|
|
|
strbuf_addstr(&buf, pathinfo);
|
|
|
|
return strbuf_detach(&buf, NULL);
|
|
|
|
} else if (path && *path) {
|
|
|
|
return xstrdup(path);
|
|
|
|
} else
|
|
|
|
die("No GIT_PROJECT_ROOT or PATH_TRANSLATED from server");
|
|
|
|
return NULL;
|
|
|
|
}
|
|
|
|
|
2009-10-31 08:47:32 +08:00
|
|
|
static struct service_cmd {
|
|
|
|
const char *method;
|
|
|
|
const char *pattern;
|
2016-08-10 07:47:31 +08:00
|
|
|
void (*imp)(struct strbuf *, char *);
|
2009-10-31 08:47:32 +08:00
|
|
|
} services[] = {
|
2013-04-10 08:55:08 +08:00
|
|
|
{"GET", "/HEAD$", get_head},
|
2009-10-31 08:47:32 +08:00
|
|
|
{"GET", "/info/refs$", get_info_refs},
|
|
|
|
{"GET", "/objects/info/alternates$", get_text_file},
|
|
|
|
{"GET", "/objects/info/http-alternates$", get_text_file},
|
|
|
|
{"GET", "/objects/info/packs$", get_info_packs},
|
|
|
|
{"GET", "/objects/[0-9a-f]{2}/[0-9a-f]{38}$", get_loose_object},
|
|
|
|
{"GET", "/objects/pack/pack-[0-9a-f]{40}\\.pack$", get_pack_file},
|
2009-10-31 08:47:34 +08:00
|
|
|
{"GET", "/objects/pack/pack-[0-9a-f]{40}\\.idx$", get_idx_file},
|
|
|
|
|
|
|
|
{"POST", "/git-upload-pack$", service_rpc},
|
|
|
|
{"POST", "/git-receive-pack$", service_rpc}
|
2009-10-31 08:47:32 +08:00
|
|
|
};
|
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
static int bad_request(struct strbuf *hdr, const struct service_cmd *c)
|
|
|
|
{
|
|
|
|
const char *proto = getenv("SERVER_PROTOCOL");
|
|
|
|
|
|
|
|
if (proto && !strcmp(proto, "HTTP/1.1")) {
|
|
|
|
http_status(hdr, 405, "Method Not Allowed");
|
|
|
|
hdr_str(hdr, "Allow",
|
|
|
|
!strcmp(c->method, "GET") ? "GET, HEAD" : c->method);
|
|
|
|
} else
|
|
|
|
http_status(hdr, 400, "Bad Request");
|
|
|
|
hdr_nocache(hdr);
|
|
|
|
end_headers(hdr);
|
|
|
|
return 0;
|
|
|
|
}
|
|
|
|
|
add an extra level of indirection to main()
There are certain startup tasks that we expect every git
process to do. In some cases this is just to improve the
quality of the program (e.g., setting up gettext()). In
others it is a requirement for using certain functions in
libgit.a (e.g., system_path() expects that you have called
git_extract_argv0_path()).
Most commands are builtins and are covered by the git.c
version of main(). However, there are still a few external
commands that use their own main(). Each of these has to
remember to include the correct startup sequence, and we are
not always consistent.
Rather than just fix the inconsistencies, let's make this
harder to get wrong by providing a common main() that can
run this standard startup.
We basically have two options to do this:
- the compat/mingw.h file already does something like this by
adding a #define that replaces the definition of main with a
wrapper that calls mingw_startup().
The upside is that the code in each program doesn't need
to be changed at all; it's rewritten on the fly by the
preprocessor.
The downside is that it may make debugging of the startup
sequence a bit more confusing, as the preprocessor is
quietly inserting new code.
- the builtin functions are all of the form cmd_foo(),
and git.c's main() calls them.
This is much more explicit, which may make things more
obvious to somebody reading the code. It's also more
flexible (because of course we have to figure out _which_
cmd_foo() to call).
The downside is that each of the builtins must define
cmd_foo(), instead of just main().
This patch chooses the latter option, preferring the more
explicit approach, even though it is more invasive. We
introduce a new file common-main.c, with the "real" main. It
expects to call cmd_main() from whatever other objects it is
linked against.
We link common-main.o against anything that links against
libgit.a, since we know that such programs will need to do
this setup. Note that common-main.o can't actually go inside
libgit.a, as the linker would not pick up its main()
function automatically (it has no callers).
The rest of the patch is just adjusting all of the various
external programs (mostly in t/helper) to use cmd_main().
I've provided a global declaration for cmd_main(), which
means that all of the programs also need to match its
signature. In particular, many functions need to switch to
"const char **" instead of "char **" for argv. This effect
ripples out to a few other variables and functions, as well.
This makes the patch even more invasive, but the end result
is much better. We should be treating argv strings as const
anyway, and now all programs conform to the same signature
(which also matches the way builtins are defined).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-01 13:58:58 +08:00
|
|
|
int cmd_main(int argc, const char **argv)
|
2009-10-31 08:47:32 +08:00
|
|
|
{
|
|
|
|
char *method = getenv("REQUEST_METHOD");
|
2009-10-31 08:47:35 +08:00
|
|
|
char *dir;
|
2009-10-31 08:47:32 +08:00
|
|
|
struct service_cmd *cmd = NULL;
|
|
|
|
char *cmd_arg = NULL;
|
|
|
|
int i;
|
2016-08-10 07:47:31 +08:00
|
|
|
struct strbuf hdr = STRBUF_INIT;
|
2009-10-31 08:47:32 +08:00
|
|
|
|
|
|
|
set_die_routine(die_webcgi);
|
http-backend: fix die recursion with custom handler
When we die() in http-backend, we call a custom handler that
writes an HTTP 500 response to stdout, then reports the
error to stderr. Our routines for writing out the HTTP
response may themselves die, leading to us entering die()
again.
When it was originally written, that was OK; our custom
handler keeps a variable to notice this and does not
recurse. However, since cd163d4 (usage.c: detect recursion
in die routines and bail out immediately, 2012-11-14), the
main die() implementation detects recursion before we even
get to our custom handler, and bails without printing
anything useful.
We can handle this case by doing two things:
1. Installing a custom die_is_recursing handler that
allows us to enter up to one level of recursion. Only
the first call to our custom handler will try to write
out the error response. So if we die again, that is OK.
If we end up dying more than that, it is a sign that we
are in an infinite recursion.
2. Reporting the error to stderr before trying to write
out the HTTP response. In the current code, if we do
die() trying to write out the response, we'll exit
immediately from this second die(), and never get a
chance to output the original error (which is almost
certainly the more interesting one; the second die is
just going to be along the lines of "I tried to write
to stdout but it was closed").
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-15 14:29:27 +08:00
|
|
|
set_die_is_recursing_routine(die_webcgi_recursing);
|
2009-10-31 08:47:32 +08:00
|
|
|
|
|
|
|
if (!method)
|
|
|
|
die("No REQUEST_METHOD from server");
|
|
|
|
if (!strcmp(method, "HEAD"))
|
|
|
|
method = "GET";
|
2009-10-31 08:47:35 +08:00
|
|
|
dir = getdir();
|
2009-10-31 08:47:32 +08:00
|
|
|
|
|
|
|
for (i = 0; i < ARRAY_SIZE(services); i++) {
|
|
|
|
struct service_cmd *c = &services[i];
|
|
|
|
regex_t re;
|
|
|
|
regmatch_t out[1];
|
|
|
|
|
|
|
|
if (regcomp(&re, c->pattern, REG_EXTENDED))
|
|
|
|
die("Bogus regex in service table: %s", c->pattern);
|
|
|
|
if (!regexec(&re, dir, 1, out, 0)) {
|
2009-11-15 05:10:57 +08:00
|
|
|
size_t n;
|
2009-10-31 08:47:32 +08:00
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
if (strcmp(method, c->method))
|
|
|
|
return bad_request(&hdr, c);
|
2009-10-31 08:47:32 +08:00
|
|
|
|
|
|
|
cmd = c;
|
2009-11-15 05:10:57 +08:00
|
|
|
n = out[0].rm_eo - out[0].rm_so;
|
2014-07-19 23:35:34 +08:00
|
|
|
cmd_arg = xmemdupz(dir + out[0].rm_so + 1, n - 1);
|
2009-10-31 08:47:32 +08:00
|
|
|
dir[out[0].rm_so] = 0;
|
|
|
|
break;
|
|
|
|
}
|
|
|
|
regfree(&re);
|
|
|
|
}
|
|
|
|
|
|
|
|
if (!cmd)
|
2016-08-10 07:47:31 +08:00
|
|
|
not_found(&hdr, "Request not supported: '%s'", dir);
|
2009-10-31 08:47:32 +08:00
|
|
|
|
|
|
|
setup_path();
|
|
|
|
if (!enter_repo(dir, 0))
|
2016-08-10 07:47:31 +08:00
|
|
|
not_found(&hdr, "Not a git repository: '%s'", dir);
|
2009-12-29 05:49:00 +08:00
|
|
|
if (!getenv("GIT_HTTP_EXPORT_ALL") &&
|
|
|
|
access("git-daemon-export-ok", F_OK) )
|
2016-08-10 07:47:31 +08:00
|
|
|
not_found(&hdr, "Repository not exported: '%s'", dir);
|
2009-10-31 08:47:32 +08:00
|
|
|
|
2014-08-08 00:21:17 +08:00
|
|
|
http_config();
|
http-backend: spool ref negotiation requests to buffer
When http-backend spawns "upload-pack" to do ref
negotiation, it streams the http request body to
upload-pack, who then streams the http response back to the
client as it reads. In theory, git can go full-duplex; the
client can consume our response while it is still sending
the request. In practice, however, HTTP is a half-duplex
protocol. Even if our client is ready to read and write
simultaneously, we may have other HTTP infrastructure in the
way, including the webserver that spawns our CGI, or any
intermediate proxies.
In at least one documented case[1], this leads to deadlock
when trying a fetch over http. What happens is basically:
1. Apache proxies the request to the CGI, http-backend.
2. http-backend gzip-inflates the data and sends
the result to upload-pack.
3. upload-pack acts on the data and generates output over
the pipe back to Apache. Apache isn't reading because
it's busy writing (step 1).
This works fine most of the time, because the upload-pack
output ends up in a system pipe buffer, and Apache reads
it as soon as it finishes writing. But if both the request
and the response exceed the system pipe buffer size, then we
deadlock (Apache blocks writing to http-backend,
http-backend blocks writing to upload-pack, and upload-pack
blocks writing to Apache).
We need to break the deadlock by spooling either the input
or the output. In this case, it's ideal to spool the input,
because Apache does not start reading either stdout _or_
stderr until we have consumed all of the input. So until we
do so, we cannot even get an error message out to the
client.
The solution is fairly straight-forward: we read the request
body into an in-memory buffer in http-backend, freeing up
Apache, and then feed the data ourselves to upload-pack. But
there are a few important things to note:
1. We limit the in-memory buffer to prevent an obvious
denial-of-service attack. This is a new hard limit on
requests, but it's unlikely to come into play. The
default value is 10MB, which covers even the ridiculous
100,000-ref negotation in the included test (that
actually caps out just over 5MB). But it's configurable
on the off chance that you don't mind spending some
extra memory to make even ridiculous requests work.
2. We must take care only to buffer when we have to. For
pushes, the incoming packfile may be of arbitrary
size, and we should connect the input directly to
receive-pack. There's no deadlock problem here, though,
because we do not produce any output until the whole
packfile has been read.
For upload-pack's initial ref advertisement, we
similarly do not need to buffer. Even though we may
generate a lot of output, there is no request body at
all (i.e., it is a GET, not a POST).
[1] http://article.gmane.org/gmane.comp.version-control.git/269020
Test-adapted-from: Dennis Kaarsemaker <dennis@kaarsemaker.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-20 15:37:09 +08:00
|
|
|
max_request_buffer = git_env_ulong("GIT_HTTP_MAX_REQUEST_BUFFER",
|
|
|
|
max_request_buffer);
|
|
|
|
|
2016-08-10 07:47:31 +08:00
|
|
|
cmd->imp(&hdr, cmd_arg);
|
2009-10-31 08:47:32 +08:00
|
|
|
return 0;
|
|
|
|
}
|