git/t/t1600-index.sh
Derrick Stolee ee1f0c242e read-cache: add index.skipHash config option
The previous change allowed skipping the hashing portion of the
hashwrite API, using it instead as a buffered write API. Disabling the
hashwrite can be particularly helpful when the write operation is in a
critical path.

One such critical path is the writing of the index. This operation is so
critical that the sparse index was created specifically to reduce the
size of the index to make these writes (and reads) faster.

This trade-off between file stability at rest and write-time performance
is not easy to balance. The index is an interesting case for a couple
reasons:

1. Writes block users. Writing the index takes place in many user-
   blocking foreground operations. The speed improvement directly
   impacts their use. Other file formats are typically written in the
   background (commit-graph, multi-pack-index) or are super-critical to
   correctness (pack-files).

2. Index files are short lived. It is rare that a user leaves an index
   for a long time with many staged changes. Outside of staged changes,
   the index can be completely destroyed and rewritten with minimal
   impact to the user.

Following a similar approach to one used in the microsoft/git fork [1],
add a new config option (index.skipHash) that allows disabling this
hashing during the index write. The cost is that we can no longer
validate the contents for corruption-at-rest using the trailing hash.

[1] 21fed2d914

We load this config from the repository config given by istate->repo,
with a fallback to the_repository if it is not set.

While older Git versions will not recognize the null hash as a special
case, the file format itself is still being met in terms of its
structure. Using this null hash will still allow Git operations to
function across older versions.

The one exception is 'git fsck' which checks the hash of the index file.
This used to be a check on every index read, but was split out to just
the index in a33fc72fe9 (read-cache: force_verify_index_checksum,
2017-04-14) and released first in Git 2.13.0. Document the versions that
relaxed these restrictions, with the optimistic expectation that this
change will be included in Git 2.40.0.

Here, we disable this check if the trailing hash is all zeroes. We add a
warning to the config option that this may cause undesirable behavior
with older Git versions.

As a quick comparison, I tested 'git update-index --force-write' with
and without index.skipHash=true on a copy of the Linux kernel
repository.

Benchmark 1: with hash
  Time (mean ± σ):      46.3 ms ±  13.8 ms    [User: 34.3 ms, System: 11.9 ms]
  Range (min … max):    34.3 ms …  79.1 ms    82 runs

Benchmark 2: without hash
  Time (mean ± σ):      26.0 ms ±   7.9 ms    [User: 11.8 ms, System: 14.2 ms]
  Range (min … max):    16.3 ms …  42.0 ms    69 runs

Summary
  'without hash' ran
    1.78 ± 0.76 times faster than 'with hash'

These performance benefits are substantial enough to allow users the
ability to opt-in to this feature, even with the potential confusion
with older 'git fsck' versions.

Test this new config option, both at a command-line level and within a
submodule. The confirmation is currently limited to confirm that 'git
fsck' does not complain about the index. Future updates will make this
test more robust.

It is critical that this test is placed before the test_index_version
tests, since those tests obliterate the .git/config file and hence lose
the setting from GIT_TEST_DEFAULT_HASH, if set.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-07 07:46:14 +09:00

121 lines
2.8 KiB
Bash
Executable File

#!/bin/sh
test_description='index file specific tests'
TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
sane_unset GIT_TEST_SPLIT_INDEX
test_expect_success 'setup' '
echo 1 >a
'
test_expect_success 'bogus GIT_INDEX_VERSION issues warning' '
(
rm -f .git/index &&
GIT_INDEX_VERSION=2bogus &&
export GIT_INDEX_VERSION &&
git add a 2>err &&
sed "s/[0-9]//" err >actual.err &&
sed -e "s/ Z$/ /" <<-\EOF >expect.err &&
warning: GIT_INDEX_VERSION set, but the value is invalid.
Using version Z
EOF
test_cmp expect.err actual.err
)
'
test_expect_success 'out of bounds GIT_INDEX_VERSION issues warning' '
(
rm -f .git/index &&
GIT_INDEX_VERSION=1 &&
export GIT_INDEX_VERSION &&
git add a 2>err &&
sed "s/[0-9]//" err >actual.err &&
sed -e "s/ Z$/ /" <<-\EOF >expect.err &&
warning: GIT_INDEX_VERSION set, but the value is invalid.
Using version Z
EOF
test_cmp expect.err actual.err
)
'
test_expect_success 'no warning with bogus GIT_INDEX_VERSION and existing index' '
(
GIT_INDEX_VERSION=1 &&
export GIT_INDEX_VERSION &&
git add a 2>actual.err &&
test_must_be_empty actual.err
)
'
test_expect_success 'out of bounds index.version issues warning' '
(
sane_unset GIT_INDEX_VERSION &&
rm -f .git/index &&
git config --add index.version 1 &&
git add a 2>err &&
sed "s/[0-9]//" err >actual.err &&
sed -e "s/ Z$/ /" <<-\EOF >expect.err &&
warning: index.version set, but the value is invalid.
Using version Z
EOF
test_cmp expect.err actual.err
)
'
test_expect_success 'index.skipHash config option' '
rm -f .git/index &&
git -c index.skipHash=true add a &&
git fsck &&
test_commit start &&
git -c protocol.file.allow=always submodule add ./ sub &&
git config index.skipHash false &&
git -C sub config index.skipHash true &&
>sub/file &&
git -C sub add a &&
git -C sub fsck
'
test_index_version () {
INDEX_VERSION_CONFIG=$1 &&
FEATURE_MANY_FILES=$2 &&
ENV_VAR_VERSION=$3
EXPECTED_OUTPUT_VERSION=$4 &&
(
rm -f .git/index &&
rm -f .git/config &&
if test "$INDEX_VERSION_CONFIG" -ne 0
then
git config --add index.version $INDEX_VERSION_CONFIG
fi &&
git config --add feature.manyFiles $FEATURE_MANY_FILES
if test "$ENV_VAR_VERSION" -ne 0
then
GIT_INDEX_VERSION=$ENV_VAR_VERSION &&
export GIT_INDEX_VERSION
else
unset GIT_INDEX_VERSION
fi &&
git add a &&
echo $EXPECTED_OUTPUT_VERSION >expect &&
test-tool index-version <.git/index >actual &&
test_cmp expect actual
)
}
test_expect_success 'index version config precedence' '
test_index_version 0 false 0 2 &&
test_index_version 2 false 0 2 &&
test_index_version 3 false 0 2 &&
test_index_version 4 false 0 4 &&
test_index_version 2 false 4 4 &&
test_index_version 2 true 0 2 &&
test_index_version 0 true 0 4 &&
test_index_version 0 true 2 2
'
test_done