git-index-pack(1)
=================
NAME
----
git-index-pack - Build pack index file for an existing packed archive
SYNOPSIS
--------
[verse]
'git index-pack' [-v] [-o <index-file>] [--[no-]rev-index] <pack-file>
'git index-pack' --stdin [--fix-thin] [--keep] [-v] [-o <index-file>]
[--[no-]rev-index] [<pack-file>]
DESCRIPTION
-----------
Reads a packed archive (.pack) from the specified file, and
builds a pack index file (.idx) for it. Optionally writes a
reverse-index (.rev) for the specified pack. The packed
archive together with the pack index can then be placed in
the objects/pack/ directory of a Git repository.
OPTIONS
-------
-v::
Be verbose about what is going on, including progress status.
-o <index-file>::
Write the generated pack index into the specified
file. Without this option, the name of the pack index
file is constructed from the name of the packed archive
file by replacing .pack with .idx (and the program
fails if the name of the packed archive does not end
with .pack).
--[no-]rev-index::
When this flag is provided, generate a reverse index
(a `.rev` file) corresponding to the given pack. If
`--verify` is given, ensure that the existing
reverse index is correct. Takes precedence over
`pack.writeReverseIndex`.
--stdin::
When this flag is provided, the pack is read from stdin
instead and a copy is then written to <pack-file>. If
<pack-file> is not specified, the pack is written to the
objects/pack/ directory of the current Git repository with
a default name determined from the pack content. If
<pack-file> is not specified, consider using --keep to
prevent a race condition between this process and
'git repack'.
--fix-thin::
Fix a "thin" pack produced by `git pack-objects --thin` (see
linkgit:git-pack-objects[1] for details) by adding to the
pack the excluded objects that the deltified objects are
based on. This option only makes sense in conjunction with --stdin.
--keep::
Before moving the index into its final destination,
create an empty .keep file for the associated pack file.
This option is usually necessary with --stdin to prevent a
simultaneous 'git repack' process from deleting
the newly constructed pack and index before refs can be
updated to use objects contained in the pack.
--keep=<msg>::
Like --keep, create a .keep file before moving the index into
its final destination, but rather than creating an empty file,
place '<msg>' followed by an LF into the .keep file. The '<msg>'
message can later be searched for within all .keep files to
locate any which have outlived their usefulness.
--index-version=<version>[,<offset>]::
This is intended to be used by the test suite only. It allows
forcing the version of the generated pack index, and forcing
64-bit index entries on objects located above the given offset.
--strict::
Die if the pack contains broken objects or links.
--progress-title::
For internal use only.
+
Set the title of the progress bar. The title is "Indexing objects" by
default and "Receiving objects" when `--stdin` is specified.
--check-self-contained-and-connected::
Die if the pack contains broken links. For internal use only.
--fsck-objects::
For internal use only.
+
Die if the pack contains broken objects. If the pack contains a tree
pointing to a .gitmodules blob that does not exist, prints the hash of
that blob (for the caller to check) after the hash that goes into the
name of the pack/idx file (see "Notes").
--threads=<n>::
Specifies the number of threads to spawn when resolving
deltas. This requires that index-pack be compiled with
pthreads; otherwise this option is ignored with a warning.
This is meant to reduce packing time on multiprocessor
machines. The required amount of memory for the delta search
window is, however, multiplied by the number of threads.
Specifying 0 will cause Git to auto-detect the number of CPUs
and use at most 3 threads.
--max-input-size=<size>::
Die if the pack is larger than <size>.
--object-format=<hash-algorithm>::
Specify the given object format (hash algorithm) for the pack. The valid
values are 'sha1' and (if enabled) 'sha256'. The default is the algorithm for
the current repository (set by `extensions.objectFormat`), or 'sha1' if no
value is set or outside a repository.
+
This option cannot be used with --stdin.
+
include::object-format-disclaimer.txt[]
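
The default index name described under `-o` can be sketched as a small
helper. This is only an illustration of the documented rule, not the
code index-pack itself uses; the function name `default_index_name` is
hypothetical.

```python
def default_index_name(pack_path: str) -> str:
    """Derive the default .idx name from a .pack name.

    Mirrors the rule documented for `-o`: a trailing ".pack" is replaced
    with ".idx", and any other name is rejected. Hypothetical helper for
    illustration only.
    """
    suffix = ".pack"
    if not pack_path.endswith(suffix):
        # index-pack is documented to fail in this case; raise here.
        raise ValueError(f"packed archive name {pack_path!r} does not end with .pack")
    return pack_path[: -len(suffix)] + ".idx"


print(default_index_name("objects/pack/pack-1234.pack"))
```

A caller that needs a different location would pass `-o <index-file>`
explicitly instead of relying on this rule.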
NOTES
-----
Once the index has been created, the hash that goes into the name of
the pack/idx file is printed to stdout. If --stdin was
also used, this is prefixed by "pack\t", or by "keep\t" if a
new .keep file was successfully created. This is useful to remove a
.keep file used as a lock to prevent the race with 'git repack'
mentioned above.
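
A caller that captures this output might split it as in the following
minimal sketch. It assumes only the format described in these notes (an
optional "pack" or "keep" prefix, a tab, then the hash) and looks at the
first line only; the names `IndexPackResult` and
`parse_index_pack_output` are hypothetical and not part of Git.

```python
from typing import NamedTuple, Optional


class IndexPackResult(NamedTuple):
    kind: Optional[str]  # "pack", "keep", or None when there is no prefix
    pack_hash: str       # hash that goes into the pack/idx file name


def parse_index_pack_output(line: str) -> IndexPackResult:
    """Split the first stdout line of index-pack into prefix and hash."""
    text = line.rstrip("\n")
    prefix, tab, rest = text.partition("\t")
    if not tab:
        # No tab: the line is just the hash (--stdin was not used).
        return IndexPackResult(None, text)
    if prefix not in ("pack", "keep"):
        raise ValueError(f"unexpected prefix {prefix!r}")
    return IndexPackResult(prefix, rest)


result = parse_index_pack_output("keep\t0123abcd\n")
print(result.kind, result.pack_hash)
```

When `kind` is "keep", the caller knows that a new .keep file was
created and can remove it once the refs have been updated.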
GIT
---
Part of the linkgit:git[1] suite