Commit Graph

175 Commits

Author SHA1 Message Date
Ævar Arnfjörð Bjarmason
7367d88261 archive: stop passing "stage" through read_tree_recursive()
The "stage" variable being passed around in the archive code has only
ever been an elaborate way to hardcode the value "0".

This code was added in its original form in e4fbbfe9ec (Add
git-zip-tree, 2006-08-26), at which point a hardcoded "0" would be
passed down through read_tree_recursive() to write_zip_entry().

It was then diligently added to the "struct directory" in
ed22b4173b (archive: support filtering paths with glob, 2014-09-21),
but we were still not doing anything except passing it around as-is.

Let's stop doing that in the code internal to archive.c, we'll still
feed "0" to read_tree_recursive() itself, but won't use it. That we're
providing it at all to read_tree_recursive() will be changed in a
follow-up commit.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-20 16:09:26 -07:00
René Scharfe
96099726dd archive: expand only a single %(describe) per archive
Every %(describe) placeholder in $Format:...$ strings in files with the
attribute export-subst is expanded by calling git describe.  This can
potentially result in a lot of such calls per archive.  That's OK for
local repositories under control of the user of git archive, but could
be a problem for hosted repositories.

Expand only a single %(describe) placeholder per archive for now to
avoid denial-of-service attacks.  We can make this limit configurable
later if needed, but let's start out simple.

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-03-11 13:22:44 -08:00
Junio C Hamano
3eebb3e044 Merge branch 'rs/archive-plug-leak-refname'
Memleak fix.

* rs/archive-plug-leak-refname:
  archive: release refname after use
2020-11-25 15:24:53 -08:00
René Scharfe
1c3e412916 archive: release refname after use
parse_treeish_arg() uses dwim_ref() to set refname to a strdup'd string.
Release it after use.  Also remove the const qualifier from the refname
member to signify that ownership of the string is handed to the struct,
leaving cleanup duty with the caller of parse_treeish_arg(), thus
avoiding a cast.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-16 14:21:43 -08:00
René Scharfe
cde8ea9c66 archive: support compression levels beyond 9
Compression programs like zip, gzip, bzip2 and xz allow to adjust the
trade-off between CPU cost and size gain with numerical options from -1
for fast compression and -9 for high compression ratio.  zip also
accepts -0 for storing files verbatim.  git archive directly support
these single-digit compression levels for ZIP output and passes them to
filters like gzip.

Zstandard additionally supports compression level options -10 to -19, or
up to -22 with --ultra.  This *seems* to work with git archive in most
cases, e.g. it will produce an archive with -19 without complaining, but
since it only supports single-digit compression level options this is
the same as -1 -9 and thus -9.

Allow git archive to accept multi-digit compression levels to support
the full range supported by zstd.  Explicitly reject them for the ZIP
format, as otherwise deflateInit2() would just fail with a somewhat
cryptic "stream consistency error".

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-11-09 11:25:45 -08:00
René Scharfe
2947a7930d archive: add --add-file
Allow users to append non-tracked files.  This simplifies the generation
of source packages with a few extra files, e.g. containing version
information.  They get the same access times and user information as
tracked files.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-19 15:56:06 -07:00
René Scharfe
200589abcb archive: read short blobs in archive.c::write_archive_entry()
Centralize reading of symlink destinations and the contents of regular
files that are too small to be streamed.  This reduces code duplication
and allows future patches to add support for adding non-tracked files to
archives.  The backends are expected to stream blobs if buffer is NULL.

object_file_to_archive() is only called from archive.c and thus no
longer exported.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-19 15:56:05 -07:00
Jonathan Tan
f24c30e0b6 wt-status: tolerate dangling marks
When a user checks out the upstream branch of HEAD, the upstream branch
not being a local branch, and then runs "git status", like this:

  git clone $URL client
  cd client
  git checkout @{u}
  git status

no status is printed, but instead an error message:

  fatal: HEAD does not point to a branch

(This error message when running "git branch" persists even after
checking out other things - it only stops after checking out a branch.)

This is because "git status" reads the reflog when determining the "HEAD
detached" message, and thus attempts to DWIM "@{u}", but that doesn't
work because HEAD no longer points to a branch.

Therefore, when calculating the status of a worktree, tolerate dangling
marks. This is done by adding an additional parameter to
dwim_ref() and repo_dwim_ref().

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-09-02 14:39:25 -07:00
brian m. carlson
c397aac02f convert: provide additional metadata to filters
Now that we have the codebase wired up to pass any additional metadata
to filters, let's collect the additional metadata that we'd like to
pass.

The two main places we pass this metadata are checkouts and archives.
In these two situations, reading HEAD isn't a valid option, since HEAD
isn't updated for checkouts until after the working tree is written and
archives can accept an arbitrary tree.  In other situations, HEAD will
usually reflect the refname of the branch in current use.

We pass a smaller amount of data in other cases, such as git cat-file,
where we can really only logically know about the blob.

This commit updates only the parts of the checkout code where we don't
use unpack_trees.  That function and callers of it will be handled in a
future commit.

In the archive code, we leak a small amount of memory, since nothing we
pass in the archiver argument structure is freed.

Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-16 11:37:02 -07:00
brian m. carlson
ab90ecae99 convert: permit passing additional metadata to filter processes
There are a variety of situations where a filter process can make use of
some additional metadata.  For example, some people find the ident
filter too limiting and would like to include the commit or the branch
in their smudged files.  This information isn't available during
checkout as HEAD hasn't been updated at that point, and it wouldn't be
available in archives either.

Let's add a way to pass this metadata down to the filter.  We pass the
blob we're operating on, the treeish (preferring the commit over the
tree if one exists), and the ref we're operating on.  Note that we won't
pass this information in all cases, such as when renormalizing or when
we're performing diffs, since it doesn't make sense in those cases.

The data we currently get from the filter process looks like the
following:

  command=smudge
  pathname=git.c
  0000

With this change, we'll get data more like this:

  command=smudge
  pathname=git.c
  refname=refs/tags/v2.25.1
  treeish=c522f061d551c9bb8684a7c3859b2ece4499b56b
  blob=7be7ad34bd053884ec48923706e70c81719a8660
  0000

There are a couple things to note about this approach.  For operations
like checkout, treeish will always be a commit, since we cannot check
out individual trees, but for other operations, like archive, we can end
up operating on only a particular tree, so we'll provide only a tree as
the treeish.  Similar comments apply for refname, since there are a
variety of cases in which we won't have a ref.

This commit wires up the code to print this information, but doesn't
pass any of it at this point.  In a future commit, we'll have various
code paths pass the actual useful data down.

Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-16 11:37:02 -07:00
Nguyễn Thái Ngọc Duy
50ddb089ff tree-walk.c: remove the_repo from get_tree_entry()
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-06-27 12:45:17 -07:00
Junio C Hamano
96379f043f Merge branch 'en/merge-directory-renames'
"git merge-recursive" backend recently learned a new heuristics to
infer file movement based on how other files in the same directory
moved.  As this is inherently less robust heuristics than the one
based on the content similarity of the file itself (rather than
based on what its neighbours are doing), it sometimes gives an
outcome unexpected by the end users.  This has been toned down to
leave the renamed paths in higher/conflicted stages in the index so
that the user can examine and confirm the result.

* en/merge-directory-renames:
  merge-recursive: switch directory rename detection default
  merge-recursive: give callers of handle_content_merge() access to contents
  merge-recursive: track information associated with directory renames
  t6043: fix copied test description to match its purpose
  merge-recursive: switch from (oid,mode) pairs to a diff_filespec
  merge-recursive: cleanup handle_rename_* function signatures
  merge-recursive: track branch where rename occurred in rename struct
  merge-recursive: remove ren[12]_other fields from rename_conflict_info
  merge-recursive: shrink rename_conflict_info
  merge-recursive: move some struct declarations together
  merge-recursive: use 'ci' for rename_conflict_info variable name
  merge-recursive: rename locals 'o' and 'a' to 'obuf' and 'abuf'
  merge-recursive: rename diff_filespec 'one' to 'o'
  merge-recursive: rename merge_options argument from 'o' to 'opt'
  Use 'unsigned short' for mode, like diff_filespec does
2019-05-09 00:37:22 +09:00
Elijah Newren
5ec1e72823 Use 'unsigned short' for mode, like diff_filespec does
struct diff_filespec defines mode to be an 'unsigned short'.  Several
other places in the API which we'd like to interact with using a
diff_filespec used a plain unsigned (or unsigned int).  This caused
problems when taking addresses, so switch to unsigned short.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-08 16:02:07 +09:00
brian m. carlson
bbf05cf70e archive: convert struct archiver_args to object_id
Change the commit_sha1 member to be called "commit_oid" and change it to
be a pointer to struct object_id.  Additionally, update some uses of
GIT_SHA1_HEXSZ and hard-coded values to use the_hash_algo instead.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-04-01 11:57:39 +09:00
Junio C Hamano
f2b6aa98be Merge branch 'nd/indentation-fix'
Code cleanup.

* nd/indentation-fix:
  Indent code with TABs
2019-01-14 15:29:32 -08:00
Junio C Hamano
d6f05a435f Merge branch 'nd/attr-pathspec-in-tree-walk'
The traversal over tree objects has learned to honor
":(attr:label)" pathspec match, which has been implemented only for
enumerating paths on the filesystem.

* nd/attr-pathspec-in-tree-walk:
  tree-walk: support :(attr) matching
  dir.c: move, rename and export match_attrs()
  pathspec.h: clean up "extern" in function declarations
  tree-walk.c: make tree_entry_interesting() take an index
  tree.c: make read_tree*() take 'struct repository *'
2019-01-14 15:29:28 -08:00
Junio C Hamano
3813a89fae Merge branch 'nd/i18n'
More _("i18n") markings.

* nd/i18n:
  fsck: mark strings for translation
  fsck: reduce word legos to help i18n
  parse-options.c: mark more strings for translation
  parse-options.c: turn some die() to BUG()
  parse-options: replace opterror() with optname()
  repack: mark more strings for translation
  remote.c: mark messages for translation
  remote.c: turn some error() or die() to BUG()
  reflog: mark strings for translation
  read-cache.c: add missing colon separators
  read-cache.c: mark more strings for translation
  read-cache.c: turn die("internal error") to BUG()
  attr.c: mark more string for translation
  archive.c: mark more strings for translation
  alias.c: mark split_cmdline_strerror() strings for translation
  git.c: mark more strings for translation
2019-01-04 13:33:31 -08:00
Nguyễn Thái Ngọc Duy
ec36c42a63 Indent code with TABs
We indent with TABs and sometimes for fine alignment, TABs followed by
spaces, but never all spaces (unless the indentation is less than 8
columns). Indenting with spaces slips through in some places. Fix
them.

Imported code and compat/ are left alone on purpose. The former should
remain as close as upstream as possible. The latter pretty much has
separate maintainers, it's up to them to decide.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-12-09 12:37:32 +09:00
Nguyễn Thái Ngọc Duy
e092073d64 tree.c: make read_tree*() take 'struct repository *'
These functions call tree_entry_interesting() which will soon require
a 'struct index_state *' to be passed in. Instead of just changing the
function signature to take an index, update to take a repo instead
because these functions do need object database access.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-19 10:50:33 +09:00
Nguyễn Thái Ngọc Duy
c6e7965ddf archive.c: mark more strings for translation
Two messages also print extra information to be more useful

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-12 14:47:09 +09:00
Josh Steadmon
00436bf1b1 archive: initialize archivers earlier
Initialize archivers as soon as possible when running git-archive.
Various non-obvious behavior depends on having the archivers
initialized, such as determining the desired archival format from the
provided filename.

Since 08716b3c11 ("archive: refactor file extension format-guessing",
2011-06-21), archive_format_from_filename() has used the registered
archivers to match filenames (provided via --output) to archival
formats. However, when git-archive is executed with --remote, format
detection happens before the archivers have been registered. This causes
archives from remotes to always be generated as TAR files, regardless of
the actual filename (unless an explicit --format is provided).

This patch fixes that behavior; archival format is determined properly
from the output filename, even when --remote is used.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-26 10:17:59 +09:00
Junio C Hamano
11877b9ebe Merge branch 'nd/the-index'
Various codepaths in the core-ish part learn to work on an
arbitrary in-core index structure, not necessarily the default
instance "the_index".

* nd/the-index: (23 commits)
  revision.c: reduce implicit dependency the_repository
  revision.c: remove implicit dependency on the_index
  ws.c: remove implicit dependency on the_index
  tree-diff.c: remove implicit dependency on the_index
  submodule.c: remove implicit dependency on the_index
  line-range.c: remove implicit dependency on the_index
  userdiff.c: remove implicit dependency on the_index
  rerere.c: remove implicit dependency on the_index
  sha1-file.c: remove implicit dependency on the_index
  patch-ids.c: remove implicit dependency on the_index
  merge.c: remove implicit dependency on the_index
  merge-blobs.c: remove implicit dependency on the_index
  ll-merge.c: remove implicit dependency on the_index
  diff-lib.c: remove implicit dependency on the_index
  read-cache.c: remove implicit dependency on the_index
  diff.c: remove implicit dependency on the_index
  grep.c: remove implicit dependency on the_index
  diff.c: remove the_index dependency in textconv() functions
  blame.c: rename "repo" argument to "r"
  combine-diff.c: remove implicit dependency on the_index
  ...
2018-10-19 13:34:02 +09:00
Nguyễn Thái Ngọc Duy
ce3a7ec8bd archive.c: remove implicit dependency the_repository
The new "repo" field in archive_args has been added since b612ee202a
(archive.c: avoid access to the_index - 2018-08-13). Use it instead of
hard coding the_repository.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-09-17 09:20:06 -07:00
Torsten Bögershausen
d64324cb60 Make git_check_attr() a void function
git_check_attr() returns always 0.
Remove all the error handling code of the callers, which is never executed.
Change git_check_attr() to be a void function.

Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-09-12 15:15:34 -07:00
Nguyễn Thái Ngọc Duy
b612ee202a archive.c: avoid access to the_index
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 14:14:43 -07:00
Nguyễn Thái Ngọc Duy
c4500e251f attr: remove index from git_attr_set_direction()
Since attr checking API now take the index, there's no need to set an
index in advance with this call. Most call sites are straightforward
because they either pass the_index or NULL (which defaults back to
the_index previously). There's only one suspicious call site in
unpack-trees.c where it sets a different index.

This code in unpack-trees is about to check out entries from the
new/temporary index after merging is done in it. The attributes will
be used by entry.c code to do crlf conversion if needed. entry.c now
respects struct checkout's istate field, and this field is correctly
set in unpack-trees.c, there should be no regression from this change.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 14:14:43 -07:00
Nguyễn Thái Ngọc Duy
6d2df284e7 dir.c: remove an implicit dependency on the_index in pathspec code
Make the match_patchspec API and friends take an index_state instead
of assuming the_index in dir.c. All external call sites are converted
blindly to keep the patch simple and retain current behavior.
Individual call sites may receive further updates to use the right
index instead of the_index.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 14:14:42 -07:00
Nguyễn Thái Ngọc Duy
7f944e264e convert.c: remove an implicit dependency on the_index
Make the convert API take an index_state instead of assuming the_index
in convert.c. All external call sites are converted blindly to keep
the patch simple and retain current behavior. Individual call sites
may receive further updates to use the right index instead of
the_index.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 14:14:42 -07:00
Nguyễn Thái Ngọc Duy
7a400a2c02 attr: remove an implicit dependency on the_index
Make the attr API take an index_state instead of assuming the_index in
attr code. All call sites are converted blindly to keep the patch
simple and retain current behavior. Individual call sites may receive
further updates to use the right index instead of the_index.

There is one ugly temporary workaround added in attr.c that needs some
more explanation.

Commit c24f3abace (apply: file commited with CRLF should roundtrip
diff and apply - 2017-08-19) forces one convert_to_git() call to NOT
read the index at all. But what do you know, we read it anyway by
falling back to the_index. When "istate" from convert_to_git is now
propagated down to read_attr_from_array() we will hit segfault
somewhere inside read_blob_data_from_index.

The right way of dealing with this is to kill "use_index" variable and
only follow "istate" but at this stage we are not ready for that:
while most git_attr_set_direction() calls just passes the_index to be
assigned to use_index, unpack-trees passes a different one which is
used by entry.c code, which has no way to know what index to use if we
delete use_index. So this has to be done later.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-13 14:14:42 -07:00
Stefan Beller
21e1ee8f4f commit: add repository argument to lookup_commit_reference_gently
Add a repository argument to allow callers of
lookup_commit_reference_gently to be more specific about which
repository to handle. This is a small mechanical change; it doesn't
change the implementation to handle repositories other than
the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-29 10:43:39 -07:00
Junio C Hamano
b16b60f71b Merge branch 'sb/object-store-grafts' into sb/object-store-lookup
* sb/object-store-grafts:
  commit: allow lookup_commit_graft to handle arbitrary repositories
  commit: allow prepare_commit_graft to handle arbitrary repositories
  shallow: migrate shallow information into the object parser
  path.c: migrate global git_path_* to take a repository argument
  cache: convert get_graft_file to handle arbitrary repositories
  commit: convert read_graft_file to handle arbitrary repositories
  commit: convert register_commit_graft to handle arbitrary repositories
  commit: convert commit_graft_pos() to handle arbitrary repositories
  shallow: add repository argument to is_repository_shallow
  shallow: add repository argument to check_shallow_file_for_update
  shallow: add repository argument to register_shallow
  shallow: add repository argument to set_alternate_shallow_file
  commit: add repository argument to lookup_commit_graft
  commit: add repository argument to prepare_commit_graft
  commit: add repository argument to read_graft_file
  commit: add repository argument to register_commit_graft
  commit: add repository argument to commit_graft_pos
  object: move grafts to object parser
  object-store: move object access functions to object-store.h
2018-06-29 10:43:28 -07:00
Nguyễn Thái Ngọc Duy
3e4a67b47d Use OPT_SET_INT_F() for cmdline option specification
The only thing these commands need is extra parseopt flag which can be
passed in by OPT_SET_INT_F() and it is a bit more compact than full
struct initialization.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-24 16:12:29 +09:00
Stefan Beller
cbd53a2193 object-store: move object access functions to object-store.h
This should make these functions easier to find and cache.h less
overwhelming to read.

In particular, this moves:
- read_object_file
- oid_object_info
- write_object_file

As a result, most of the codebase needs to #include object-store.h.
In this patch the #include is only added to files that would fail to
compile otherwise.  It would be better to #include wherever
identifiers from the header are used.  That can happen later
when we have better tooling for it.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-16 11:42:03 +09:00
brian m. carlson
b4f5aca40e sha1_file: convert read_sha1_file to struct object_id
Convert read_sha1_file to take a pointer to struct object_id and rename
it read_object_file.  Do the same for read_sha1_file_extended.

Convert one use in grep.c to use the new function without any other code
change, since the pointer being passed is a void pointer that is already
initialized with a pointer to struct object_id.  Update the declaration
and definitions of the modified functions, and apply the following
semantic patch to convert the remaining callers:

@@
expression E1, E2, E3;
@@
- read_sha1_file(E1.hash, E2, E3)
+ read_object_file(&E1, E2, E3)

@@
expression E1, E2, E3;
@@
- read_sha1_file(E1->hash, E2, E3)
+ read_object_file(E1, E2, E3)

@@
expression E1, E2, E3, E4;
@@
- read_sha1_file_extended(E1.hash, E2, E3, E4)
+ read_object_file_extended(&E1, E2, E3, E4)

@@
expression E1, E2, E3, E4;
@@
- read_sha1_file_extended(E1->hash, E2, E3, E4)
+ read_object_file_extended(E1, E2, E3, E4)

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14 09:23:50 -07:00
brian m. carlson
916bc35b29 tree-walk: convert tree entry functions to object_id
Convert get_tree_entry and find_tree_entry to take pointers to struct
object_id.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14 09:23:50 -07:00
brian m. carlson
e5ec981a4b archive: convert sha1_file_to_archive to struct object_id
Convert this function to take a pointer to struct object_id and rename
it object_file_to_archive.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14 09:23:48 -07:00
brian m. carlson
015ff4f822 archive: convert write_archive_entry_fn_t to object_id
Convert the write_archive_entry_fn_t type to use a pointer to struct
object_id.  Convert various static functions in the tar and zip
archivers also.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14 09:23:48 -07:00
brian m. carlson
df46d77e00 tree: convert read_tree_recursive to struct object_id
Convert the callback functions for read_tree_recursive to take a pointer
to struct object_id.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14 09:23:47 -07:00
brian m. carlson
cca5fa6406 refs: convert dwim_ref and expand_ref to struct object_id
All of the callers of these functions just pass the hash member of a
struct object_id, so convert them to use a pointer to struct object_id
directly.  Insert a check for NULL in expand_ref on a temporary basis;
this check can be removed when resolve_ref_unsafe is converted as well.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-16 11:05:51 +09:00
Junio C Hamano
62b1cb7b13 Merge branch 'rs/archive-excluded-directory'
"git archive", especially when used with pathspec, stored an empty
directory in its output, even though Git itself never does so.
This has been fixed.

* rs/archive-excluded-directory:
  archive: don't add empty directories to archives
2017-09-25 15:24:07 +09:00
René Scharfe
4318094047 archive: don't add empty directories to archives
While git doesn't track empty directories, git archive can be tricked
into putting some into archives.  One way is to construct an empty tree
object, as t5004 does.  While that is supported by the object database,
it can't be represented in the index and thus it's unlikely to occur in
the wild.

Another way is using the literal name of a directory in an exclude
pathspec -- its contents are are excluded, but the directory stub is
included.  That's inconsistent: exclude pathspecs containing wildcards
don't leave empty directories in the archive.

Yet another way is have a few levels of nested subdirectories (e.g.
d1/d2/d3/file1) and ignoring the entries at the leaves (e.g. file1).
The directories with the ignored content are ignored as well (e.g. d3),
but their empty parents are included (e.g. d2).

As empty directories are not supported by git, they should also not be
written into archives.  If an empty directory is really needed then it
can be tracked and archived by placing an empty .gitignore file in it.

There already is a mechanism in place for suppressing empty directories.
When read_tree_recursive() encounters a directory excluded by a pathspec
then it enters it anyway because it might contain included entries.  It
calls the callback function before it is able to decide if the directory
is actually needed.  For that reason git archive adds directories to a
queue and writes entries for them only when it encounters the first
child item -- but currently only if pathspecs with wildcards are used.

Queue *all* directories, no matter if there even are pathspecs present.
This prevents git archive from writing entries for empty directories in
all cases.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:08:22 +09:00
Junio C Hamano
a7d7f125e3 Merge branch 'rs/archive-excluded-directory'
"git archive" did not work well with pathspecs and the
export-ignore attribute.

* rs/archive-excluded-directory:
  archive: don't queue excluded directories
  archive: factor out helper functions for handling attributes
  t5001: add tests for export-ignore attributes and exclude pathspecs
2017-09-06 13:11:25 +09:00
René Scharfe
5ff247ac0c archive: don't queue excluded directories
Reject directories with the attribute export-ignore already while
queuing them.  This prevents read_tree_recursive() from descending into
them and this avoids write_archive_entry() rejecting them later on,
which queue_or_write_archive_entry() is not prepared for.

Borrow the existing strbuf to build the full path to avoid string
copies and extra allocations; just make sure we restore the original
value before moving on.

Keep checking any other attributes in write_archive_entry() as before,
but avoid checking them twice.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-19 00:40:25 -07:00
René Scharfe
c6c08f7e9a archive: factor out helper functions for handling attributes
Add helpers for accessing attributes that encapsulate the details of how
to retrieve their values.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-19 00:40:23 -07:00
brian m. carlson
e82caf384b sha1_name: convert get_sha1* to get_oid*
Now that all the callers of get_sha1 directly or indirectly use struct
object_id, rename the functions starting with get_sha1 to start with
get_oid.  Convert the internals in sha1_name.c to use struct object_id
as well, and eliminate explicit length checks where possible.  Convert a
use of 40 in get_oid_basic to GIT_SHA1_HEXSZ.

Outside of sha1_name.c and cache.h, this transition was made with the
following semantic patch:

@@
expression E1, E2;
@@
- get_sha1(E1, E2.hash)
+ get_oid(E1, &E2)

@@
expression E1, E2;
@@
- get_sha1(E1, E2->hash)
+ get_oid(E1, E2)

@@
expression E1, E2;
@@
- get_sha1_committish(E1, E2.hash)
+ get_oid_committish(E1, &E2)

@@
expression E1, E2;
@@
- get_sha1_committish(E1, E2->hash)
+ get_oid_committish(E1, E2)

@@
expression E1, E2;
@@
- get_sha1_treeish(E1, E2.hash)
+ get_oid_treeish(E1, &E2)

@@
expression E1, E2;
@@
- get_sha1_treeish(E1, E2->hash)
+ get_oid_treeish(E1, E2)

@@
expression E1, E2;
@@
- get_sha1_commit(E1, E2.hash)
+ get_oid_commit(E1, &E2)

@@
expression E1, E2;
@@
- get_sha1_commit(E1, E2->hash)
+ get_oid_commit(E1, E2)

@@
expression E1, E2;
@@
- get_sha1_tree(E1, E2.hash)
+ get_oid_tree(E1, &E2)

@@
expression E1, E2;
@@
- get_sha1_tree(E1, E2->hash)
+ get_oid_tree(E1, E2)

@@
expression E1, E2;
@@
- get_sha1_blob(E1, E2.hash)
+ get_oid_blob(E1, &E2)

@@
expression E1, E2;
@@
- get_sha1_blob(E1, E2->hash)
+ get_oid_blob(E1, E2)

@@
expression E1, E2, E3, E4;
@@
- get_sha1_with_context(E1, E2, E3.hash, E4)
+ get_oid_with_context(E1, E2, &E3, E4)

@@
expression E1, E2, E3, E4;
@@
- get_sha1_with_context(E1, E2, E3->hash, E4)
+ get_oid_with_context(E1, E2, E3, E4)

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-17 13:54:51 -07:00
Junio C Hamano
f31d23a399 Merge branch 'bw/config-h'
Fix configuration codepath to pay proper attention to commondir
that is used in multi-worktree situation, and isolate config API
into its own header file.

* bw/config-h:
  config: don't implicitly use gitdir or commondir
  config: respect commondir
  setup: teach discover_git_directory to respect the commondir
  config: don't include config.h by default
  config: remove git_config_iter
  config: create config.h
2017-06-24 14:28:41 -07:00
Brandon Williams
b2141fc1d2 config: don't include config.h by default
Stop including config.h by default in cache.h.  Instead only include
config.h in those files which require use of the config system.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-15 12:56:22 -07:00
brian m. carlson
a9dbc17910 tree: convert parse_tree_indirect to struct object_id
Convert parse_tree_indirect to take a pointer to struct object_id.
Update all the callers.  This transformation was achieved using the
following semantic patch and manual updates to the declaration and
definition.  Update builtin/checkout.c manually as well, since it uses a
ternary expression not handled by the semantic patch.

@@
expression E1;
@@
- parse_tree_indirect(E1.hash)
+ parse_tree_indirect(&E1)

@@
expression E1;
@@
- parse_tree_indirect(E1->hash)
+ parse_tree_indirect(E1)

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-08 15:12:58 +09:00
brian m. carlson
bc83266abe Convert lookup_commit* to struct object_id
Convert lookup_commit, lookup_commit_or_die,
lookup_commit_reference, and lookup_commit_reference_gently to take
struct object_id arguments.

Introduce a temporary in parse_object buffer in order to convert this
function.  This is required since in order to convert parse_object and
parse_object_buffer, lookup_commit_reference_gently and
lookup_commit_or_die would need to be converted.  Not introducing a
temporary would therefore require that lookup_commit_or_die take a
struct object_id *, but lookup_commit would take unsigned char *,
leaving a confusing and hard-to-use interface.

parse_object_buffer will lose this temporary in a later patch.

This commit was created with manual changes to commit.c, commit.h, and
object.c, plus the following semantic patch:

@@
expression E1, E2;
@@
- lookup_commit_reference_gently(E1.hash, E2)
+ lookup_commit_reference_gently(&E1, E2)

@@
expression E1, E2;
@@
- lookup_commit_reference_gently(E1->hash, E2)
+ lookup_commit_reference_gently(E1, E2)

@@
expression E1;
@@
- lookup_commit_reference(E1.hash)
+ lookup_commit_reference(&E1)

@@
expression E1;
@@
- lookup_commit_reference(E1->hash)
+ lookup_commit_reference(E1)

@@
expression E1;
@@
- lookup_commit(E1.hash)
+ lookup_commit(&E1)

@@
expression E1;
@@
- lookup_commit(E1->hash)
+ lookup_commit(E1)

@@
expression E1, E2;
@@
- lookup_commit_or_die(E1.hash, E2)
+ lookup_commit_or_die(&E1, E2)

@@
expression E1, E2;
@@
- lookup_commit_or_die(E1->hash, E2)
+ lookup_commit_or_die(E1, E2)

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-08 15:12:57 +09:00
Junio C Hamano
2aef63d31c attr: convert git_check_attrs() callers to use the new API
The remaining callers are all simple "I have N attributes I am
interested in.  I'll ask about them with various paths one by one".

After this step, no caller to git_check_attrs() remains.  After
removing it, we can extend "struct attr_check" struct with data
that can be used in optimizing the query for the specific N
attributes it contains.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-01 13:46:52 -08:00
Junio C Hamano
7bd18054d2 attr: rename function and struct related to checking attributes
The traditional API to check attributes is to prepare an N-element
array of "struct git_attr_check" and pass N and the array to the
function "git_check_attr()" as arguments.

In preparation to revamp the API to pass a single structure, in
which these N elements are held, rename the type used for these
individual array elements to "struct attr_check_item" and rename
the function to "git_check_attrs()".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-01 13:46:52 -08:00
Junio C Hamano
eb0224c617 archive: read local configuration
Since b9605bc4f2 ("config: only read .git/config from configured
repos", 2016-09-12), we do not read from ".git/config" unless we
know we are in a repository.  "git archive" however didn't do the
repository discovery and instead relied on the old behaviour.

Teach the command to run a "gentle" version of repository discovery
so that local configuration variables are honoured.

[jc: stole tests from peff]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-11-22 13:55:20 -08:00
Vasco Almeida
5a36d00cf2 i18n: archive: mark errors for translation
Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-09 12:44:59 -07:00
Junio C Hamano
ed6e8038f9 pathspec: rename free_pathspec() to clear_pathspec()
The function takes a pointer to a pathspec structure, and releases
the resources held by it, but does not free() the structure itself.
Such a function should be called "clear", not "free".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-02 14:09:22 -07:00
Jeff King
50a6c8efa2 use st_add and st_mult for allocation size computation
If our size computation overflows size_t, we may allocate a
much smaller buffer than we expected and overflow it. It's
probably impossible to trigger an overflow in most of these
sites in practice, but it is easy enough convert their
additions and multiplications into overflow-checking
variants. This may be fixing real bugs, and it makes
auditing the code easier.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-22 14:51:09 -08:00
brian m. carlson
ed1c9977cb Remove get_object_hash.
Convert all instances of get_object_hash to use an appropriate reference
to the hash member of the oid member of struct object.  This provides no
functional change, as it is essentially a macro substitution.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-20 08:02:05 -05:00
brian m. carlson
7999b2cf77 Add several uses of get_object_hash.
Convert most instances where the sha1 member of struct object is
dereferenced to use get_object_hash.  Most instances that are passed to
functions that have versions taking struct object_id, such as
get_sha1_hex/get_oid_hex, or instances that can be trivially converted
to use struct object_id instead, are not converted.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-20 08:02:05 -05:00
Jeff King
c7ab0ba340 avoid sprintf and strcpy with flex arrays
When we are allocating a struct with a FLEX_ARRAY member, we
generally compute the size of the array and then sprintf or
strcpy into it. Normally we could improve a dynamic allocation
like this by using xstrfmt, but it doesn't work here; we
have to account for the size of the rest of the struct.

But we can improve things a bit by storing the length that
we use for the allocation, and then feeding it to xsnprintf
or memcpy, which makes it more obvious that we are not
writing more than the allocated number of bytes.

It would be nice if we had some kind of helper for
allocating generic flex arrays, but it doesn't work that
well:

 - the call signature is a little bit unwieldy:

      d = flex_struct(sizeof(*d), offsetof(d, path), fmt, ...);

   You need offsetof here instead of just writing to the
   end of the base size, because we don't know how the
   struct is packed (partially this is because FLEX_ARRAY
   might not be zero, though we can account for that; but
   the size of the struct may actually be rounded up for
   alignment, and we can't know that).

 - some sites do clever things, like over-allocating because
   they know they will write larger things into the buffer
   later (e.g., struct packed_git here).

So we're better off to just write out each allocation (or
add type-specific helpers, though many of these are one-off
allocations anyway).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-05 11:08:05 -07:00
Junio C Hamano
d939af12bd Merge branch 'jk/date-mode-format'
Teach "git log" and friends a new "--date=format:..." option to
format timestamps using system's strftime(3).

* jk/date-mode-format:
  strbuf: make strbuf_addftime more robust
  introduce "format" date-mode
  convert "enum date_mode" into a struct
  show-branch: use DATE_RELATIVE instead of magic number
2015-08-03 11:01:27 -07:00
Jeff King
a5481a6c94 convert "enum date_mode" into a struct
In preparation for adding date modes that may carry extra
information beyond the mode itself, this patch converts the
date_mode enum into a struct.

Most of the conversion is fairly straightforward; we pass
the struct as a pointer and dereference the type field where
necessary. Locations that declare a date_mode can use a "{}"
constructor.  However, the tricky case is where we use the
enum labels as constants, like:

  show_date(t, tz, DATE_NORMAL);

Ideally we could say:

  show_date(t, tz, &{ DATE_NORMAL });

but of course C does not allow that. Likewise, we cannot
cast the constant to a struct, because we need to pass an
actual address. Our options are basically:

  1. Manually add a "struct date_mode d = { DATE_NORMAL }"
     definition to each caller, and pass "&d". This makes
     the callers uglier, because they sometimes do not even
     have their own scope (e.g., they are inside a switch
     statement).

  2. Provide a pre-made global "date_normal" struct that can
     be passed by address. We'd also need "date_rfc2822",
     "date_iso8601", and so forth. But at least the ugliness
     is defined in one place.

  3. Provide a wrapper that generates the correct struct on
     the fly. The big downside is that we end up pointing to
     a single global, which makes our wrapper non-reentrant.
     But show_date is already not reentrant, so it does not
     matter.

This patch implements 3, along with a minor macro to keep
the size of the callers sane.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-29 11:39:07 -07:00
Michael Haggerty
fb58c8d507 refs: move the remaining ref module declarations to refs.h
Some functions from the refs module were still declared in cache.h.
Move them to refs.h.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-22 13:17:12 -07:00
Junio C Hamano
a916cb5fb4 Merge branch 'bc/object-id'
Identify parts of the code that knows that we use SHA-1 hash to
name our objects too much, and use (1) symbolic constants instead
of hardcoded 20 as byte count and/or (2) use struct object_id
instead of unsigned char [20] for object names.

* bc/object-id:
  apply: convert threeway_stage to object_id
  patch-id: convert to use struct object_id
  commit: convert parts to struct object_id
  diff: convert struct combine_diff_path to object_id
  bulk-checkin.c: convert to use struct object_id
  zip: use GIT_SHA1_HEXSZ for trailers
  archive.c: convert to use struct object_id
  bisect.c: convert leaf functions to use struct object_id
  define utility functions for object IDs
  define a structure for object IDs
2015-05-05 21:00:23 -07:00
brian m. carlson
13609673c4 archive.c: convert to use struct object_id
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-13 22:43:12 -07:00
Alex Henrie
9c9b4f2f8b standardize usage info string format
This patch puts the usage info strings that were not already in docopt-
like format into docopt-like format, which will be a litle easier for
end users and a lot easier for translators. Changes include:

- Placing angle brackets around fill-in-the-blank parameters
- Putting dashes in multiword parameter names
- Adding spaces to [-f|--foobar] to make [-f | --foobar]
- Replacing <foobar>* with [<foobar>...]

Signed-off-by: Alex Henrie <alexhenrie24@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-01-14 09:32:04 -08:00
Nguyễn Thái Ngọc Duy
6a0b0b6de9 tree.c: update read_tree_recursive callback to pass strbuf as base
This allows the callback to use 'base' as a temporary buffer to
quickly assemble full path "without" extra allocation. The callback
has to restore it afterwards of course.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-01 11:32:29 -08:00
Junio C Hamano
b2c45f5b96 Merge branch 'nd/archive-pathspec'
"git archive" learned to filter what gets archived with pathspec.

* nd/archive-pathspec:
  archive: support filtering paths with glob
2014-10-08 13:05:26 -07:00
Nguyễn Thái Ngọc Duy
ed22b4173b archive: support filtering paths with glob
This patch fixes two problems with using :(glob) (or even "*.c"
without ":(glob)").

The first one is we forgot to turn on the 'recursive' flag in struct
pathspec. Without that, tree_entry_interesting() will not mark
potential directories "interesting" so that it can confirm whether
those directories have anything matching the pathspec.

The marking directories interesting has a side effect that we need to
walk inside a directory to realize that there's nothing interested in
there. By that time, 'archive' code has already written the (empty)
directory down. That means lots of empty directories in the result
archive.

This problem is fixed by lazily writing directories down when we know
they are actually needed. There is a theoretical bug in this
implementation: we can't write empty trees/directories that match that
pathspec.

path_exists() is also made stricter in order to detect non-matching
pathspec because when this 'recursive' flag is on, we most likely
match some directories. The easiest way is not consider any
directories "matched".

Noticed-by: Peter Wu <peter@lekensteyn.nl>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-22 12:04:29 -07:00
Tanay Abhra
95790ff60d archive.c: replace git_config() with git_config_get_bool() family
Use `git_config_get_bool()` family instead of `git_config()` to take advantage of
the config-set API which provides a cleaner control flow.

Signed-off-by: Tanay Abhra <tanayabh@gmail.com>
Reviewed-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-07 13:33:27 -07:00
Junio C Hamano
6f75e48323 Merge branch 'rm/strchrnul-not-strlen'
* rm/strchrnul-not-strlen:
  use strchrnul() in place of strchr() and strlen()
2014-03-18 13:51:18 -07:00
Rohit Mani
2c5495f7b6 use strchrnul() in place of strchr() and strlen()
Avoid scanning strings twice, once with strchr() and then with
strlen(), by using strchrnul().

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Rohit Mani <rohit.mani@outlook.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-10 08:35:30 -07:00
Scott J. Goldman
7671b63211 add uploadarchive.allowUnreachable option
In commit ee27ca4, we started restricting remote git-archive
invocations to only accessing reachable commits. This
matches what upload-pack allows, but does restrict some
useful cases (e.g., HEAD:foo). We loosened this in 0f544ee,
which allows `foo:bar` as long as `foo` is a ref tip.
However, that still doesn't allow many useful things, like:

  1. Commits accessible from a ref, like `foo^:bar`, which
     are reachable

  2. Arbitrary sha1s, even if they are reachable.

We can do a full object-reachability check for these cases,
but it can be quite expensive if the client has sent us the
sha1 of a tree; we have to visit every sub-tree of every
commit in the worst case.

Let's instead give site admins an escape hatch, in case they
prefer the more liberal behavior.  For many sites, the full
object database is public anyway (e.g., if you allow dumb
walker access), or the site admin may simply decide the
security/convenience tradeoff is not worth it.

This patch adds a new config option to disable the
restrictions added in ee27ca4. It defaults to off, meaning
there is no change in behavior by default.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-28 09:55:37 -08:00
Junio C Hamano
b1cdfb54f1 archive.c: have SP around arithmetic operators
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-10-16 10:27:26 -07:00
Nguyễn Thái Ngọc Duy
f3e743a0d9 archive: convert to use parse_pathspec
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15 10:56:07 -07:00
Nguyễn Thái Ngọc Duy
64acde94ef move struct pathspec and related functions to pathspec.[ch]
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15 10:56:06 -07:00
Jeff King
bd54cf17a4 archive: handle commits with an empty tree
git-archive relies on get_pathspec to convert its argv into
a list of pathspecs. When get_pathspec is given an empty
argv list, it returns a single pathspec, the empty string,
to indicate that everything matches. When we feed this to
our path_exists function, we typically see that the pathspec
turns up at least one item in the tree, and we are happy.

But when our tree is empty, we erroneously think it is
because the pathspec is too limited, when in fact it is
simply that there is nothing to be found in the tree. This
is a weird corner case, but the correct behavior is almost
certainly to produce an empty archive, not to exit with an
error.

This patch teaches git-archive to create empty archives when
there is no pathspec given (we continue to complain if a
pathspec is given, since it obviously is not matched). It
also confirms that the tar and zip writers produce sane
output in this instance.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-03-10 22:25:22 -07:00
Jean-Noël AVILA
94bc671a1f Add directory pattern matching to attributes
The manpage of gitattributes says: "The rules how the pattern
matches paths are the same as in .gitignore files" and the gitignore
pattern matching has a pattern ending with / for directory matching.

This rule is specifically relevant for the 'export-ignore' rule used
for git archive.

Signed-off-by: Jean-Noel Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-12-17 22:07:23 -08:00
Nguyễn Thái Ngọc Duy
b0ff96547e Reduce translations by using same terminologies
Somewhere in help usage, we use both "message" and "msg", "command"
and "cmd", "key id" and "key-id". This patch makes all help text from
parseopt use the first form. Clearer and 3 fewer strings for
translators.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-22 12:02:28 -07:00
Nguyễn Thái Ngọc Duy
0012a3873b i18n: archive: mark parseopt strings for translation
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-20 12:23:15 -07:00
Junio C Hamano
b83cfa5949 Merge branch 'rs/archive-tree-in-tip-simplify'
By René Scharfe
* rs/archive-tree-in-tip-simplify:
  archive-tar: keep const in checksum calculation
  archive: simplify refname handling
2012-05-23 13:35:22 -07:00
René Scharfe
c51a351a6b archive: simplify refname handling
There is no need to build a copy of the relevant part of the string just
to make sure we have a NUL-terminated string.  We can simply pass the
length of the interesting part to dwim_ref() instead.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-18 11:26:10 -07:00
Nguyễn Thái Ngọc Duy
9cb513b798 archive: delegate blob reading to backend
archive-tar.c and archive-zip.c now perform conversion check, with
help of sha1_file_to_archive() from archive.c

This gives backends more freedom in dealing with (streaming) large
blobs.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-05-03 10:22:56 -07:00
Junio C Hamano
bdb8cb5296 Merge branch 'jk/maint-upload-archive'
* jk/maint-upload-archive:
  archive: re-allow HEAD:Documentation on a remote invocation
2012-01-12 23:34:17 -08:00
Carlos Martín Nieto
0f544ee897 archive: re-allow HEAD:Documentation on a remote invocation
The tightening done in (ee27ca4a: archive: don't let remote clients
get unreachable commits, 2011-11-17) went too far and disallowed
HEAD:Documentation as it would try to find "HEAD:Documentation" as a
ref.

Only DWIM the "HEAD" part to see if it exists as a ref. Once we're
sure that we've been given a valid ref, we follow the normal code
path. This still disallows attempts to access commits which are not
branch tips.

Signed-off-by: Carlos Martín Nieto <cmn@elego.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-01-11 19:21:22 -08:00
Junio C Hamano
3c4b5ad5a5 Merge branch 'jk/maint-upload-archive'
* jk/maint-upload-archive:
  archive: don't let remote clients get unreachable commits
2011-12-13 22:47:38 -08:00
Junio C Hamano
7b51c33b37 Merge branch 'jk/maint-1.6.2-upload-archive' into jk/maint-upload-archive
* jk/maint-1.6.2-upload-archive:
  archive: don't let remote clients get unreachable commits

Conflicts:
	archive.c
	archive.h
	builtin-archive.c
	builtin/upload-archive.c
	t/t5000-tar-tree.sh
2011-11-21 15:04:11 -08:00
Jeff King
ee27ca4a78 archive: don't let remote clients get unreachable commits
Usually git is careful not to allow clients to fetch
arbitrary objects from the database; for example, objects
received via upload-pack must be reachable from a ref.
Upload-archive breaks this by feeding the client's tree-ish
directly to get_sha1, which will accept arbitrary hex sha1s,
reflogs, etc.

This is not a problem if all of your objects are publicly
reachable anyway (or at least public to anybody who can run
upload-archive). Or if you are making the repo available by
dumb protocols like http or rsync (in which case the client
can read your whole object db directly).

But for sites which allow access only through smart
protocols, clients may be able to fetch trees from commits
that exist in the server's object database but are not
referenced (e.g., because history was rewound).

This patch tightens upload-archive's lookup to use dwim_ref
rather than get_sha1. This means a remote client can only
fetch the tip of a named ref, not an arbitrary sha1 or
reflog entry.

This also restricts some legitimate requests, too:

  1. Reachable non-tip commits, like:

        git archive --remote=$url v1.0~5

  2. Sub-trees of reachable commits, like:

        git archive --remote=$url v1.7.7:Documentation

Local requests continue to use get_sha1, and are not
restricted at all.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-11-21 14:42:25 -08:00
Junio C Hamano
f858c646b5 archive.c: use OPT_BOOL()
The list variable (which is OPT_BOOLEAN) is initialized to 0 and only
checked against 0 in the code, so it is safe to use OPT_BOOL().

The worktree_attributes variable (which is OPT_BOOLEAN) is initialized to
0 and later assigned to a field with the same name in struct archive_args,
which is a bitfield of width 1. It is safe and even more correct to use
OPT_BOOL() here; the new test in 5001 demonstrates why using OPT_COUNTUP
is wrong.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-09-27 17:00:06 -07:00
Michael Haggerty
d932f4eb9f Rename git_checkattr() to git_check_attr()
Suggested by: Junio Hamano <gitster@pobox.com>

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-04 15:53:21 -07:00
Jeff King
7b97730b76 upload-archive: allow user to turn off filters
Some tar filters may be very expensive to run, so sites do
not want to expose them via upload-archive. This patch lets
users configure tar.<filter>.remote to turn them off.

By default, gzip filters are left on, as they are about as
expensive as creating zip archives.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-22 11:12:35 -07:00
Jeff King
08716b3c11 archive: refactor file extension format-guessing
Git-archive will guess a format from the output filename if
no format is explicitly given.  The current function just
hardcodes "zip" to the zip format, and leaves everything
else NULL (which will default to tar). Since we are about
to add user-specified formats, we need to be more flexible.
The new rule is "if a filename ends with a dot and the name
of a format, it matches that format". For the existing "tar"
and "zip" formats, this is identical to the current
behavior. For new user-specified formats, this will do what
the user expects if they name their formats appropriately.

Because we will eventually start matching arbitrary
user-specified extensions that may include dots, the strrchr
search for the final dot is not sufficient. We need to do an
actual suffix match with each extension.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-22 11:12:35 -07:00
Jeff King
56baa61d01 archive: move file extension format-guessing lower
The process for guessing an archive output format based on
the filename is something like this:

  a. parse --output in cmd_archive; check the filename
     against a static set of mapping heuristics (right now
     it just matches ".zip" for zip files).

  b. if found, stick a fake "--format=zip" at the beginning
     of the arguments list (if the user did specify a
     --format manually, the later option will override our
     fake one)

  c. if it's a remote call, ship the arguments to the remote
     (including the fake), which will call write_archive on
     their end

  d. if it's local, ship the arguments to write_archive
     locally

There are two problems:

  1. The set of mappings is static and at too high a level.
     The write_archive level is going to check config for
     user-defined formats, some of which will specify
     extensions. We need to delay lookup until those are
     parsed, so we can match against them.

  2. For a remote archive call, our set of mappings (or
     formats) may not match the remote side's. This is OK in
     practice right now, because all versions of git
     understand "zip" and "tar". But as new formats are
     added, there is going to be a mismatch between what the
     client can do and what the remote server can do.

To fix (1), this patch refactors the location guessing to
happen at the write_archive level, instead of the
cmd_archive level. So instead of sticking a fake --format
field in the argv list, we actually pass a "name hint" down
the callchain; this hint is used at the appropriate time to
guess the format (if one hasn't been given already).

This patch leaves (2) unfixed. The name_hint is converted to
a "--format" option as before, and passed to the remote.
This means the local side's idea of how extensions map to
formats will take precedence.

Another option would be to pass the name hint to the remote
side and let the remote choose. This isn't a good idea for
two reasons:

  1. There's no room in the protocol for passing that
     information. We can pass a new argument, but older
     versions of git on the server will choke on it.

  2. Letting the remote side decide creates a silent
     inconsistency in user experience. Consider the case
     that the locally installed git knows about the "tar.gz"
     format, but a remote server doesn't.

     Running "git archive -o foo.tar.gz" will use the tar.gz
     format. If we use --remote, and the local side chooses
     the format, then we send "--format=tar.gz" to the
     remote, which will complain about the unknown format.
     But if we let the remote side choose the format, then
     it will realize that it doesn't know about "tar.gz" and
     output uncompressed tar without even issuing a warning.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-22 11:12:35 -07:00
Jeff King
4d7c989863 archive: pass archiver struct to write_archive callback
The current archivers are very static; when you are in the
write_tar_archive function, you know you are writing a tar.
However, to facilitate runtime-configurable archivers
that will share a common write function we need to tell the
function which archiver was used.

As a convenience, we also provide an opaque data pointer in
the archiver struct so that individual archivers can put
something useful there when they register themselves.
Technically they could just use the "name" field to look in
an internal map of names to data, but this is much simpler.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-22 11:12:35 -07:00
Jeff King
13e0f88d4a archive: refactor list of archive formats
Most of the tar and zip code was nicely split out into two
abstracted files which knew only about their specific
formats. The entry point to this code was a single "write
archive" function.

However, as these basic formats grow more complex (e.g., by
handling multiple file extensions and format names), a
static list of the entry point functions won't be enough.
Instead, let's provide a way for the tar and zip code to
tell the main archive code what they support by registering
archiver names and functions.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-22 11:12:35 -07:00
Jeff King
2321286298 archive: reorder option parsing and config reading
The archive command does three things during its
initialization phase:

  1. parse command-line options

  2. setup the git directory

  3. read config

During phase (1), if we see any options that do not require
a git directory (like "--list"), we handle them immediately
and exit, making it safe to abort step (2) if we are not in
a git directory.

Step (3) must come after step (2), since the git directory
may influence configuration.  However, this leaves no
possibility of configuration from step (3) impacting the
command-line options in step (1) (which is useful, for
example, for supporting user-configurable output formats).

Instead, let's reorder this to:

  1. setup the git directory, if it exists

  2. read config

  3. parse command-line options

  4. if we are not in a git repository, die

This should have the same external behavior, but puts
configuration before command-line parsing.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-15 15:56:28 -07:00
Nguyễn Thái Ngọc Duy
f0096c06bc Convert read_tree{,_recursive} to support struct pathspec
This patch changes behavior of the two functions. Previously it does
prefix matching only. Now it can also do wildcard matching.

All callers are updated. Some gain wildcard matching (archive,
checkout), others reset pathspec_item.has_wildcard to retain old
behavior (ls-files, ls-tree as they are plumbing).

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-25 09:20:33 -07:00
René Scharfe
4ad5c8044d archive: improve --verbose description
Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-15 10:05:38 -08:00
René Scharfe
fd03881a48 add description parameter to OPT__VERBOSE
Allows better help text to be defined than "be verbose".  Also make use
of the macro in places that already had a different description.  No
object code changes intended.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-15 09:56:51 -08:00
Štěpán Němec
62b4698e55 Use angles for placeholders consistently
Signed-off-by: Štěpán Němec <stepnem@gmail.com>
Acked-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-10-08 12:29:52 -07:00
Jonathan Nieder
35039ced92 archive: abbreviate substituted commit ids again
Given a file with:

  (define archive-id "$Format:%ct|%h|a$")

and an export-subst attribute, the "%h" results in an full 40-digit
object name instead of the expected 7-digit one.

The export-subst feature requests unabbreviated object names because
that is the low-level default.  The effect was not observable until
v1.7.1.1~17^2~3 (2010-05-03), which taught log --format=%h to respect
the --abbrev option.

Reported-by: Eli Barzilay <eli@barzilay.org>
Tested-by: Eli Barzilay <eli@barzilay.org>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-07-27 13:28:54 -07:00
Junio C Hamano
06dbc1ea57 Merge branch 'jc/conflict-marker-size'
* jc/conflict-marker-size:
  rerere: honor conflict-marker-size attribute
  rerere: prepare for customizable conflict marker length
  conflict-marker-size: new attribute
  rerere: use ll_merge() instead of using xdl_merge()
  merge-tree: use ll_merge() not xdl_merge()
  xdl_merge(): allow passing down marker_size in xmparam_t
  xdl_merge(): introduce xmparam_t for merge specific parameters
  git_attr(): fix function signature

Conflicts:
	builtin-merge-file.c
	ll-merge.c
	xdiff/xdiff.h
	xdiff/xmerge.c
2010-01-20 20:28:51 -08:00