packfile: use oidread() instead of hashcpy() to fill object_id

When chasing a REF_DELTA, we need to pull the raw hash bytes out of the
mmap'd packfile into an object_id struct. We do that with a raw
hashcpy() of the appropriate length (that happens directly now, though
before the previous commit it happened inside find_pack_entry_one(),
also using a hashcpy).

But I think this creates a potentially dangerous situation due to
d4d364b2c7 (hash: convert `oidcmp()` and `oideq()` to compare whole
hash, 2024-06-14). When using sha1, we'll have uninitialized bytes in
the latter part of the object_id.hash buffer, which could fool oideq(),
etc.

We should use oidread() instead, which correctly zero-pads the extra
bytes, as of c98d762ed9 (global: ensure that object IDs are always
padded, 2024-06-14).

As far as I can see, this has not been a problem in practice because the
object_id we feed to find_pack_entry_one() is never used with oideq(),
etc. It is being compared to the bytes mmap'd from a pack idx file,
which of course do not have the extra padding bytes themselves. So
there's no bug here, but this just puzzled me while looking at the code.
We should do the more obviously safe thing, both for future-proofing and
to avoid confusing readers.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
This commit is contained in:
Jeff King 2024-10-25 03:08:10 -04:00 committed by Taylor Blau
parent 479ab76c9f
commit 863f2459a2

View File

@ -1240,7 +1240,7 @@ off_t get_delta_base(struct packed_git *p,
} else if (type == OBJ_REF_DELTA) {
/* The base entry _must_ be in the same pack */
struct object_id oid;
hashcpy(oid.hash, base_info, the_repository->hash_algo);
oidread(&oid, base_info, the_repository->hash_algo);
base_offset = find_pack_entry_one(&oid, p);
*curpos += the_hash_algo->rawsz;
} else