git/tempfile.c
Jeff King 77a42b3b84 tempfile: drop active flag
Our tempfile struct contains an "active" flag. Long ago, this flag was
important: tempfile structs were always allocated for the lifetime of
the program and added to a global linked list, and the active flag was
what told us whether a struct's tempfile needed to be cleaned up on
exit.

But since 422a21c6a0 (tempfile: remove deactivated list entries,
2017-09-05) and 076aa2cbda (tempfile: auto-allocate tempfiles on heap,
2017-09-05), we actually remove items from the list, and the active flag
is set to true for essentially every allocated struct. We set it to
true in all of the creation functions, and in the normal code flow it
becomes false only in deactivate_tempfile(), which then immediately
frees the struct.

So the flag isn't performing that role anymore, and in fact makes things
more confusing. Dscho noted that delete_tempfile() is a no-op for an
inactive struct. Since 076aa2cbda taught it to free the struct when
deactivating, we'd leak any struct whose active flag is unset. But in
practice it's not a leak, because again, we'll free when we unset the
flag, and never see the allocated-but-inactive state.

Can we just get rid of the flag? The answer is yes, but it requires
looking at a few other spots:

  1. I said above that the flag only becomes false before we deallocate,
     but there's one exception: when we call remove_tempfiles() from a
     signal or atexit handler, we unset the active flag as we remove
     each file. This isn't important for delete_tempfile(), as nobody
     would call it anymore, since we're exiting.

     It does in theory provide us some protection against racily
     double-removing a tempfile. If we receive a second signal while we
     are already in the cleanup routines, we'll start the cleanup loop
     again, and may visit the same tempfile. But this race already
     exists, because calling unlink() and unsetting the active flag
     aren't atomic! And it's OK in practice, because unlink() is
     idempotent (barring the unlikely event that some other process
     chooses our exact temp filename in that instant).

     So dropping the active flag widens the race a bit, but it was
     already there, and is fairly harmless in practice. If we really
     care about addressing it, the right thing is probably to block
     further signals while we're doing our cleanup (which we could
     actually do atomically).

  2. The active flag is declared as "volatile sig_atomic_t". The idea is
     that it's the final bit that gets set to tell the cleanup routines
     that the tempfile is ready to be used (or not used), and it's safe
     to receive a signal racing with regular code which adds or removes
     a tempfile from the list.

     In practice, I don't think this is buying us anything. The presence
     on the linked list is really what tells the cleanup routines to
     look at the struct. That is already marked as "volatile". It's not
     a sig_atomic_t, so it's possible that we could see a sheared write
     there as an entry is added or removed. But that is true of the
     current code, too! Before we can even look at the "active" flag,
     we'd have to follow a link to the struct itself. If we see a
     sheared write in the pointer to the struct, then we'll look at
     garbage memory anyway, and there's not much we can do.

This patch removes the active flag entirely, using presence on the
global linked list as an indicator that a tempfile ought to be cleaned
up. We are already careful to add to the list as the final step in
activating. On deactivation, we'll make sure to remove from the list as
the first step, before freeing any fields. The use of the volatile
keyword should mean that those things happen in the expected order.
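
The ordering described above can be sketched with a small standalone
list. This is a simplified model, not git's volatile list
implementation: struct node, struct sketch_tempfile and the sketch_*()
functions are hypothetical names invented for the example.

```c
#include <stdlib.h>
#include <string.h>

/* Minimal circular doubly-linked list with volatile links. */
struct node {
	struct node *volatile next;
	struct node *volatile prev;
};

struct sketch_tempfile {
	struct node list;
	char *filename;
};

static struct node sketch_list = { &sketch_list, &sketch_list };

/* Allocate and fill in every field; the object is not yet on the list. */
static struct sketch_tempfile *sketch_new(const char *name)
{
	struct sketch_tempfile *t = malloc(sizeof(*t));
	size_t len = strlen(name) + 1;

	if (!t)
		return NULL;
	t->filename = malloc(len);
	if (!t->filename) {
		free(t);
		return NULL;
	}
	memcpy(t->filename, name, len);
	t->list.next = &t->list;
	t->list.prev = &t->list;
	return t;
}

/* Linking is the final step: only now can a cleanup loop reach t. */
static void sketch_activate(struct sketch_tempfile *t)
{
	t->list.next = sketch_list.next;
	t->list.prev = &sketch_list;
	sketch_list.next->prev = &t->list;
	sketch_list.next = &t->list;
}

/* Unlinking is the first step, before any field is freed. */
static void sketch_deactivate(struct sketch_tempfile *t)
{
	t->list.prev->next = t->list.next;
	t->list.next->prev = t->list.prev;
	free(t->filename);
	free(t);
}

/* "Active" simply means "reachable from the list head". */
static int sketch_count_active(void)
{
	int n = 0;
	struct node *pos;

	for (pos = sketch_list.next; pos != &sketch_list; pos = pos->next)
		n++;
	return n;
}
```

With this shape there is no separate flag to keep in sync: an object
that is not linked is invisible to the cleanup walk, and an object that
is linked has all of its fields set.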

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-30 14:16:49 -07:00

/*
 * State diagram and cleanup
 * -------------------------
 *
 * If the program exits while a temporary file is active, we want to
 * make sure that we remove it. This is done by remembering the active
 * temporary files in a linked list, `tempfile_list`. An `atexit(3)`
 * handler and a signal handler are registered, to clean up any active
 * temporary files.
 *
 * Because the signal handler can run at any time, `tempfile_list` and
 * the `tempfile` objects that comprise it must be kept in
 * self-consistent states at all times.
 *
 * The possible states of a `tempfile` object are as follows:
 *
 * - Allocated but not yet active (inside the creation functions, after
 *   `new_tempfile()`). In this state the object is not on
 *   `tempfile_list`, `fd` is -1, `fp` is `NULL`, and `owner` is 0.
 *   The creation functions either activate the object or free it
 *   before returning, so callers never see this state.
 *
 * - Active, file open (after `create_tempfile()` or
 *   `reopen_tempfile()`). In this state:
 *
 *   - the temporary file exists
 *   - the object is on `tempfile_list`
 *   - `filename` holds the filename of the temporary file
 *   - `fd` holds a file descriptor open for writing to it
 *   - `fp` holds a pointer to an open `FILE` object if and only if
 *     `fdopen_tempfile()` has been called on the object
 *   - `owner` holds the PID of the process that created the file
 *
 * - Active, file closed (after `close_tempfile_gently()`). Same
 *   as the previous state, except that the temporary file is closed,
 *   `fd` is -1, and `fp` is `NULL`.
 *
 * - Inactive (after `delete_tempfile()`, `rename_tempfile()`, or a
 *   failed attempt to create a temporary file). In this state:
 *
 *   - the object has been removed from `tempfile_list`
 *   - its `filename` and `directory` have been released and the
 *     object itself has been freed
 *   - `delete_tempfile()` and `rename_tempfile()` set the caller's
 *     pointer to `NULL`, and a failed creation returns `NULL`, so
 *     client code is left with no pointer to the freed object
 *
 * A temporary file is owned by the process that created it. The
 * `tempfile` has an `owner` field that records the owner's PID. This
 * field is used to prevent a forked process from deleting a temporary
 * file created by its parent.
 */

#include "cache.h"
#include "tempfile.h"
#include "sigchain.h"

static VOLATILE_LIST_HEAD(tempfile_list);

static void remove_template_directory(struct tempfile *tempfile,
				      int in_signal_handler)
{
	if (tempfile->directory) {
		if (in_signal_handler)
			rmdir(tempfile->directory);
		else
			rmdir_or_warn(tempfile->directory);
	}
}

static void remove_tempfiles(int in_signal_handler)
{
	pid_t me = getpid();
	volatile struct volatile_list_head *pos;

	list_for_each(pos, &tempfile_list) {
		struct tempfile *p = list_entry(pos, struct tempfile, list);

		if (!is_tempfile_active(p) || p->owner != me)
			continue;

		if (p->fd >= 0)
			close(p->fd);

		if (in_signal_handler)
			unlink(p->filename.buf);
		else
			unlink_or_warn(p->filename.buf);
		remove_template_directory(p, in_signal_handler);
	}
}

static void remove_tempfiles_on_exit(void)
{
	remove_tempfiles(0);
}

static void remove_tempfiles_on_signal(int signo)
{
	remove_tempfiles(1);
	sigchain_pop(signo);
	raise(signo);
}

static struct tempfile *new_tempfile(void)
{
	struct tempfile *tempfile = xmalloc(sizeof(*tempfile));

	tempfile->fd = -1;
	tempfile->fp = NULL;
	tempfile->owner = 0;
	INIT_LIST_HEAD(&tempfile->list);
	strbuf_init(&tempfile->filename, 0);
	tempfile->directory = NULL;
	return tempfile;
}

static void activate_tempfile(struct tempfile *tempfile)
{
	static int initialized;

	if (!initialized) {
		sigchain_push_common(remove_tempfiles_on_signal);
		atexit(remove_tempfiles_on_exit);
		initialized = 1;
	}

	volatile_list_add(&tempfile->list, &tempfile_list);
	tempfile->owner = getpid();
}

static void deactivate_tempfile(struct tempfile *tempfile)
{
	volatile_list_del(&tempfile->list);
	strbuf_release(&tempfile->filename);
	free(tempfile->directory);
	free(tempfile);
}

/* Make sure errno contains a meaningful value on error */
struct tempfile *create_tempfile_mode(const char *path, int mode)
{
	struct tempfile *tempfile = new_tempfile();

	strbuf_add_absolute_path(&tempfile->filename, path);
	tempfile->fd = open(tempfile->filename.buf,
			    O_RDWR | O_CREAT | O_EXCL | O_CLOEXEC, mode);
	if (O_CLOEXEC && tempfile->fd < 0 && errno == EINVAL)
		/* Try again w/o O_CLOEXEC: the kernel might not support it */
		tempfile->fd = open(tempfile->filename.buf,
				    O_RDWR | O_CREAT | O_EXCL, mode);
	if (tempfile->fd < 0) {
		deactivate_tempfile(tempfile);
		return NULL;
	}
	activate_tempfile(tempfile);
	if (adjust_shared_perm(tempfile->filename.buf)) {
		int save_errno = errno;
		error("cannot fix permission bits on %s", tempfile->filename.buf);
		delete_tempfile(&tempfile);
		errno = save_errno;
		return NULL;
	}

	return tempfile;
}

struct tempfile *register_tempfile(const char *path)
{
	struct tempfile *tempfile = new_tempfile();

	strbuf_add_absolute_path(&tempfile->filename, path);
	activate_tempfile(tempfile);
	return tempfile;
}

struct tempfile *mks_tempfile_sm(const char *filename_template, int suffixlen, int mode)
{
	struct tempfile *tempfile = new_tempfile();

	strbuf_add_absolute_path(&tempfile->filename, filename_template);
	tempfile->fd = git_mkstemps_mode(tempfile->filename.buf, suffixlen, mode);
	if (tempfile->fd < 0) {
		deactivate_tempfile(tempfile);
		return NULL;
	}
	activate_tempfile(tempfile);
	return tempfile;
}

struct tempfile *mks_tempfile_tsm(const char *filename_template, int suffixlen, int mode)
{
	struct tempfile *tempfile = new_tempfile();
	const char *tmpdir;

	tmpdir = getenv("TMPDIR");
	if (!tmpdir)
		tmpdir = "/tmp";

	strbuf_addf(&tempfile->filename, "%s/%s", tmpdir, filename_template);
	tempfile->fd = git_mkstemps_mode(tempfile->filename.buf, suffixlen, mode);
	if (tempfile->fd < 0) {
		deactivate_tempfile(tempfile);
		return NULL;
	}
	activate_tempfile(tempfile);
	return tempfile;
}

struct tempfile *mks_tempfile_dt(const char *directory_template,
				 const char *filename)
{
	struct tempfile *tempfile;
	const char *tmpdir;
	struct strbuf sb = STRBUF_INIT;
	int fd;
	size_t directorylen;

	if (!ends_with(directory_template, "XXXXXX")) {
		errno = EINVAL;
		return NULL;
	}

	tmpdir = getenv("TMPDIR");
	if (!tmpdir)
		tmpdir = "/tmp";

	strbuf_addf(&sb, "%s/%s", tmpdir, directory_template);
	directorylen = sb.len;
	if (!mkdtemp(sb.buf)) {
		int orig_errno = errno;
		strbuf_release(&sb);
		errno = orig_errno;
		return NULL;
	}

	strbuf_addf(&sb, "/%s", filename);
	fd = open(sb.buf, O_CREAT | O_EXCL | O_RDWR, 0600);
	if (fd < 0) {
		int orig_errno = errno;
		strbuf_setlen(&sb, directorylen);
		rmdir(sb.buf);
		strbuf_release(&sb);
		errno = orig_errno;
		return NULL;
	}

	tempfile = new_tempfile();
	strbuf_swap(&tempfile->filename, &sb);
	tempfile->directory = xmemdupz(tempfile->filename.buf, directorylen);
	tempfile->fd = fd;
	activate_tempfile(tempfile);
	return tempfile;
}

struct tempfile *xmks_tempfile_m(const char *filename_template, int mode)
{
	struct tempfile *tempfile;
	struct strbuf full_template = STRBUF_INIT;

	strbuf_add_absolute_path(&full_template, filename_template);
	tempfile = mks_tempfile_m(full_template.buf, mode);
	if (!tempfile)
		die_errno("Unable to create temporary file '%s'",
			  full_template.buf);

	strbuf_release(&full_template);
	return tempfile;
}

FILE *fdopen_tempfile(struct tempfile *tempfile, const char *mode)
{
	if (!is_tempfile_active(tempfile))
		BUG("fdopen_tempfile() called for inactive object");
	if (tempfile->fp)
		BUG("fdopen_tempfile() called for open object");

	tempfile->fp = fdopen(tempfile->fd, mode);
	return tempfile->fp;
}

const char *get_tempfile_path(struct tempfile *tempfile)
{
	if (!is_tempfile_active(tempfile))
		BUG("get_tempfile_path() called for inactive object");
	return tempfile->filename.buf;
}

int get_tempfile_fd(struct tempfile *tempfile)
{
	if (!is_tempfile_active(tempfile))
		BUG("get_tempfile_fd() called for inactive object");
	return tempfile->fd;
}

FILE *get_tempfile_fp(struct tempfile *tempfile)
{
	if (!is_tempfile_active(tempfile))
		BUG("get_tempfile_fp() called for inactive object");
	return tempfile->fp;
}

int close_tempfile_gently(struct tempfile *tempfile)
{
	int fd;
	FILE *fp;
	int err;

	if (!is_tempfile_active(tempfile) || tempfile->fd < 0)
		return 0;

	fd = tempfile->fd;
	fp = tempfile->fp;
	tempfile->fd = -1;
	if (fp) {
		tempfile->fp = NULL;
		if (ferror(fp)) {
			err = -1;
			if (!fclose(fp))
				errno = EIO;
		} else {
			err = fclose(fp);
		}
	} else {
		err = close(fd);
	}

	return err ? -1 : 0;
}

int reopen_tempfile(struct tempfile *tempfile)
{
	if (!is_tempfile_active(tempfile))
		BUG("reopen_tempfile called for an inactive object");
	if (0 <= tempfile->fd)
		BUG("reopen_tempfile called for an open object");
	tempfile->fd = open(tempfile->filename.buf, O_WRONLY|O_TRUNC);
	return tempfile->fd;
}

int rename_tempfile(struct tempfile **tempfile_p, const char *path)
{
	struct tempfile *tempfile = *tempfile_p;

	if (!is_tempfile_active(tempfile))
		BUG("rename_tempfile called for inactive object");

	if (close_tempfile_gently(tempfile)) {
		delete_tempfile(tempfile_p);
		return -1;
	}

	if (rename(tempfile->filename.buf, path)) {
		int save_errno = errno;
		delete_tempfile(tempfile_p);
		errno = save_errno;
		return -1;
	}

	deactivate_tempfile(tempfile);
	*tempfile_p = NULL;
	return 0;
}

void delete_tempfile(struct tempfile **tempfile_p)
{
	struct tempfile *tempfile = *tempfile_p;

	if (!is_tempfile_active(tempfile))
		return;

	close_tempfile_gently(tempfile);
	unlink_or_warn(tempfile->filename.buf);
	remove_template_directory(tempfile, 0);
	deactivate_tempfile(tempfile);
	*tempfile_p = NULL;
}