mirrors/git

mirror of https://github.com/git/git.git synced 2024-11-30 21:44:02 +08:00

Author	SHA1	Message	Date
Patrick Steinhardt	f2c32a66f5	oidset: pass hash algorithm when parsing file The `oidset_parse_file_carefully()` function implicitly depends on `the_repository` when parsing object IDs. Fix this by having callers pass in the hash algorithm to use. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-06-14 10:26:34 -07:00
Christian Couder	eaf07b7d15	oidset: refactor oidset_insert_from_set() In a following commit, we will need to add all the oids from a set into another set. In "list-objects-filter.c", there is already a static function called add_all() to do that. Let's rename this function oidset_insert_from_set() and move it into oidset.{c,h} to make it generally available. While at it, let's remove a useless `!= NULL`. Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2024-02-14 09:39:14 -08:00
René Scharfe	325006f2db	oidset: make oidset_size() an inline function oidset_size() just reads a single word from memory and returns it. Avoid the function call overhead for this trivial operation by turning it into an inline function. While we're at it, declare its parameter const to allow it to be used on read-only oidsets. Suggested-by: Jeff King <peff@peff.net> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-12 16:14:32 -07:00
Junio C Hamano	610e2b9240	blame: validate and peel the object names on the ignore list The command reads list of object names to place on the ignore list either from the command line or from a file, but they are not checked with their object type (those read from the file are not even checked for object existence). Extend the oidset_parse_file() API and allow it to take a callback that can be used to die (e.g. when an inappropriate input is read) or modify the object name read (e.g. when a tag pointing at a commit is read, and the caller wants a commit object name), and use it in the code that handles ignore list. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-24 22:20:58 -07:00
Junio C Hamano	6a1c17d05b	Merge branch 'tb/commit-graph-split-strategy' "git commit-graph write" learned different ways to write out split files. * tb/commit-graph-split-strategy: Revert "commit-graph.c: introduce '--[no-]check-oids'" commit-graph.c: introduce '--[no-]check-oids' commit-graph.h: replace 'commit_hex' with 'commits' oidset: introduce 'oidset_size' builtin/commit-graph.c: introduce split strategy 'replace' builtin/commit-graph.c: introduce split strategy 'no-merge' builtin/commit-graph.c: support for '--split[=<strategy>]' t/helper/test-read-graph.c: support commit-graph chains	2020-05-01 13:39:52 -07:00
Junio C Hamano	a768f866e9	Merge branch 'jk/oid-array-cleanups' Code cleanup. * jk/oid-array-cleanups: oidset: stop referring to sha1-array ref-filter: stop referring to "sha1 array" bisect: stop referring to sha1_array test-tool: rename sha1-array to oid-array oid_array: rename source file from sha1-array oid_array: use size_t for iteration oid_array: use size_t for count and allocation	2020-04-22 13:42:49 -07:00
Taylor Blau	f4781068fa	oidset: introduce 'oidset_size' Occasionally, it may be useful for callers to know the number of object IDs in an oidset. Right now, the only way to compute this is to call 'kh_size' on the internal 'kh_set_oid_t'. Similar to how we wrap other 'kh_*' functions over the 'oidset' type, let's allow callers to compute this value by introducing 'oidset_size'. We will add its first caller in the subsequent commit. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-15 09:20:29 -07:00
Jeff King	0740d0a5d3	oidset: stop referring to sha1-array Ths has been oid_array for some time, though the source only recently moved from sha1-array.[ch] to oid-array.[ch]. In either case, we should say "oid-array" here. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-30 10:59:08 -07:00
Junio C Hamano	883e23820e	Merge branch 'en/oidset-uninclude-hashmap' Code clean-up. * en/oidset-uninclude-hashmap: oidset: remove unnecessary include	2020-03-25 13:57:44 -07:00
Elijah Newren	757c2ba3e2	oidset: remove unnecessary include When commit `8b2f8cbcb1` ("oidset: use khash", 2018-10-04) moved from using oidmap to khash, it replaced the oidmap.h include with both one for hashmap.h and khash.h. Since the hashmap.h header is unnecessary, and the point of the patch was to switch from hashmap (used by oidmap) to khash.h, remove the unneccessary include. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-15 15:44:25 -07:00
Junio C Hamano	209f075593	Merge branch 'br/blame-ignore' "git blame" learned to "ignore" commits in the history, whose effects (as well as their presence) get ignored. * br/blame-ignore: t8014: remove unnecessary braces blame: drop some unused function parameters blame: add a test to cover blame_coalesce() blame: use the fingerprint heuristic to match ignored lines blame: add a fingerprint heuristic to match ignored lines blame: optionally track line fingerprints during fill_blame_origin() blame: add config options for the output of ignored or unblamable lines blame: add the ability to ignore commits and their changes blame: use a helper function in blame_chunk() Move oidset_parse_file() to oidset.c fsck: rename and touch up init_skiplist()	2019-07-19 11:30:20 -07:00
Jeff King	8fbb558af4	khash: rename kh_oid_t to kh_oid_set khash lets us define a hash as either a map or a set (i.e., with no "value" type). For the oid maps we define, "oid" is the set and "oid_map" is the map. As the bug in the previous commit shows, it's easy to pick the wrong one. So let's make the names more distinct: "oid_set" and "oid_map". An alternative naming scheme would be to actually name the type after the key/value types. So e.g., "oid" _would_ be the set, since it has no value type. And "oid_map" would become "oid_void" or similar (and "oid_pos" becomes "oid_int"). That's better in some ways: it's more regular, and a given map type can be more reasily reused in multiple contexts (e.g., something storing an "int" that isn't a "pos"). But it's also slightly less descriptive. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-06-20 10:27:48 -07:00
Barret Rhoden	f93895f8fc	Move oidset_parse_file() to oidset.c Signed-off-by: Barret Rhoden <brho@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-05-16 11:36:23 +09:00
brian m. carlson	5a8643eff1	khash: move oid hash table definition Move the oid khash table definition to khash.h and define a typedef for it, similar to the one we have for unsigned char pointers. Define variants that are maps as well. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-04-01 11:57:37 +09:00
René Scharfe	8c84ae659e	oidset: uninline oidset_init() There is no need to inline oidset_init(), as it's typically only called twice in the lifetime of an oidset (once at the beginning and at the end by oidset_clear()) and kh_resize_* is quite big, so move its definition to oidset.c. Document it while we're at it. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-04 11:12:14 -07:00
René Scharfe	8b2f8cbcb1	oidset: use khash Reimplement oidset using khash.h in order to reduce its memory footprint and make it faster. Performance of a command that mainly checks for duplicate objects using an oidset, with master and Clang 6.0.1: $ cmd="./git-cat-file --batch-all-objects --unordered --buffer --batch-check='%(objectname)'" $ /usr/bin/time $cmd >/dev/null 0.22user 0.03system 0:00.25elapsed 99%CPU (0avgtext+0avgdata 48484maxresident)k 0inputs+0outputs (0major+11204minor)pagefaults 0swaps $ hyperfine "$cmd" Benchmark #1: ./git-cat-file --batch-all-objects --unordered --buffer --batch-check='%(objectname)' Time (mean ± σ): 250.0 ms ± 6.0 ms [User: 225.9 ms, System: 23.6 ms] Range (min … max): 242.0 ms … 261.1 ms And with this patch: $ /usr/bin/time $cmd >/dev/null 0.14user 0.00system 0:00.15elapsed 100%CPU (0avgtext+0avgdata 41396maxresident)k 0inputs+0outputs (0major+8318minor)pagefaults 0swaps $ hyperfine "$cmd" Benchmark #1: ./git-cat-file --batch-all-objects --unordered --buffer --batch-check='%(objectname)' Time (mean ± σ): 151.9 ms ± 4.9 ms [User: 130.5 ms, System: 21.2 ms] Range (min … max): 148.2 ms … 170.4 ms Initial-patch-by: Jeff King <peff@peff.net> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-04 11:12:13 -07:00
Thomas Gummerer	03e7833f3a	oidset: don't return value from oidset_init `c3a9ad3117` ("oidset: add iterator methods to oidset", 2017-11-21) introduced a 'oidset_init()' function in oidset.h, which has void as return type, but returns an expression. This makes the solaris compiler fail with: "oidset.h", line 30: void function cannot return value As the return type is void, and even the return type of the expression we're trying to return (oidmap_init) is void just remove the return statement to fix the compiler error. Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-08 15:24:35 -08:00
Jeff Hostetler	c3a9ad3117	oidset: add iterator methods to oidset Add the usual iterator methods to oidset. Add oidset_remove(). Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-22 14:11:56 +09:00
Jonathan Tan	9e6fabde82	oidmap: map with OID as key This is similar to using the hashmap in hashmap.c, but with an easier-to-use API. In particular, custom entry comparisons no longer need to be written, and lookups can be done without constructing a temporary entry structure. This is implemented as a thin wrapper over the hashmap API. In particular, this means that there is an additional 4-byte overhead due to the fact that the first 4 bytes of the hash is redundantly stored. For now, I'm taking the simpler approach, but if need be, we can reimplement oidmap without affecting the callers significantly. oidset has been updated to use oidmap. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-01 17:18:03 +09:00
Jeff King	29c2bd5fa8	add oidset API This is similar to many of our uses of sha1-array, but it overcomes one limitation of a sha1-array: when you are de-duplicating a large input with relatively few unique entries, sha1-array uses 20 bytes per non-unique entry. Whereas this set will use memory linear in the number of unique entries (albeit a few more than 20 bytes due to hashmap overhead). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-02-08 15:39:55 -08:00

20 Commits