Small test-helper programs have been consolidated into a single
binary.
* nd/combined-test-helper: (36 commits)
t/helper: merge test-write-cache into test-tool
t/helper: merge test-wildmatch into test-tool
t/helper: merge test-urlmatch-normalization into test-tool
t/helper: merge test-subprocess into test-tool
t/helper: merge test-submodule-config into test-tool
t/helper: merge test-string-list into test-tool
t/helper: merge test-strcmp-offset into test-tool
t/helper: merge test-sigchain into test-tool
t/helper: merge test-sha1-array into test-tool
t/helper: merge test-scrap-cache-tree into test-tool
t/helper: merge test-run-command into test-tool
t/helper: merge test-revision-walking into test-tool
t/helper: merge test-regex into test-tool
t/helper: merge test-ref-store into test-tool
t/helper: merge test-read-cache into test-tool
t/helper: merge test-prio-queue into test-tool
t/helper: merge test-path-utils into test-tool
t/helper: merge test-online-cpus into test-tool
t/helper: merge test-mktemp into test-tool
t/helper: merge (unused) test-mergesort into test-tool
...
One of the most interesting things to look for in performance test
results is possible performance regressions.
This new option makes it easy to spot such possible regressions.
This new option is named '--sort-by=regression' to make it easy to
later add other ways to sort the results, for example
'--sort-by=utime'.
If we wanted to sort according to how much the stime regressed, we
could also add a new option called '--sort-by=regression:stime'.
Then '--sort-by=regression' could become a synonym for
'--sort-by=regression:rtime'.
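For example (illustrative invocation):
$ ./aggregate.perl --sort-by=regression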
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This new helper function will be reused in a subsequent
commit.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
9ba95ed23c (perf/run: update get_var_from_env_or_config() for
subsections) stopped setting a default value for GIT_PERF_REPEAT_COUNT
if no perf config file is present, because get_var_from_env_or_config
returns early in that case.
Fix it by setting the default value after calling this function. Its
fifth parameter is not used for any other variable, so remove the
associated code.
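The fix then amounts to something like this (a sketch of the
resulting shape, not the exact patch):
get_var_from_env_or_config "GIT_PERF_REPEAT_COUNT" "perf" "repeatCount"
: ${GIT_PERF_REPEAT_COUNT:=3}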
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The build procedure for perl/ part has been greatly simplified by
weaning ourselves off of MakeMaker.
* ab/simplify-perl-makefile:
perl: treat PERLLIB_EXTRA as an extra path again
perl: avoid *.pmc and fix Error.pm further
Makefile: replace perl/Makefile.PL with simple make rules
It is much easier to diff the output against a previous
one when the fields are sorted.
Helped-by: Philip Oakley <philipoakley@iee.org>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This makes it easier to use the aggregate script
on the command line when one wants to get the
"environment" fields set in the codespeed output.
Previously setting GIT_PERF_REPO_NAME was needed
for this purpose.
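For example (the option name is an assumption based on this
description; illustrative):
$ ./aggregate.perl --reponame git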
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This makes it easier to use the aggregate script
on the command line, to get results from
subsections.
Previously setting GIT_PERF_SUBSECTION was needed
for this purpose.
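For example (option name assumed, mirroring the environment
variable; the subsection name is illustrative):
$ ./aggregate.perl --subsection with-bitmaps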
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"perf" test output can be sent to codespeed server.
* cc/codespeed:
perf/run: read GIT_PERF_REPO_NAME from perf.repoName
perf/run: learn to send output to codespeed server
perf/run: learn about perf.codespeedOutput
perf/run: add conf_opts argument to get_var_from_env_or_config()
perf/aggregate: implement codespeed JSON output
perf/aggregate: refactor printing results
perf/aggregate: fix checking ENV{GIT_PERF_SUBSECTION}
The GIT_PERF_REPO_NAME env variable is used in
the `aggregate.perl` script to set the 'environment'
field in the JSON Codespeed output.
Let's make it easy to set this variable by setting it
in a config file.
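A minimal sketch of such a config file (the option name perf.repoName
comes from the topic summary above):
[perf]
	repoName = git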
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's make it possible to set in a config file the URL of
a codespeed server. And then let's make the `run` script
send the perf test results to this URL at the end of the
tests.
This should make it possible to easily automate the process
of running perf tests and having their results available in
Codespeed.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's make it possible to set in a config file the output
format (regular or codespeed) of the perf tests.
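A sketch of the corresponding config (perf.codespeedOutput per the
topic summary; a boolean value):
[perf]
	codespeedOutput = true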
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's make it possible to use `git config` type specifiers like
`--int` or `--bool`, so that config values are converted to the
canonical form and easier to use.
This additional argument is now the fourth argument of
get_var_from_env_or_config() instead of the fifth because we
want the default value argument to be unset if it is not
passed, and this is simpler if it is the last argument.
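A call then looks something like this (a sketch, mirroring the
repeat-count example used elsewhere in this series):
get_var_from_env_or_config "GIT_PERF_REPEAT_COUNT" "perf" "repeatCount" "--int" 3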
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Codespeed (https://github.com/tobami/codespeed/) is an open source
project that can be used to track how some software performs over
time. It stores performance test results in a database and can show
nice graphs and charts on a web interface.
As it can be interesting to use Codespeed to see how Git performance
evolves over time and releases, let's implement a Codespeed output
in "perf/aggregate.perl".
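Each aggregated result then becomes one entry in a JSON array,
roughly of this shape (field names follow Codespeed's JSON API;
values are illustrative):
[
  {
    "commitid": "7dd1e7a157",
    "project": "Git",
    "branch": "master",
    "executable": "Linux x86_64",
    "benchmark": "3400.2 rebase on top of a lot of changes",
    "environment": "mymachine",
    "result_value": 2.05
  }
]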
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we want to implement another kind of output for the perf test
results, let's refactor the existing code that outputs the results
into its own print_default_results() function.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The way we check ENV{GIT_PERF_SUBSECTION} could trigger a
comparison between undef and "" that may be flagged under
'use strict' and 'use warnings'. Let's fix that.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Ever since 5b594f457a ("Threaded grep", 2010-01-25) the number of
threads git-grep uses under PTHREADS has been hardcoded to 8, but
there's no performance test to check whether this is an optimal
setting.
Amend the existing tests for the grep engines to support a mode where
this can be tested, e.g.:
GIT_PERF_GREP_THREADS='1 8 16' GIT_PERF_LARGE_REPO=~/g/linux ./run p782*
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The return code of command -v with a non-existing command is 1 in bash
and 127 in dash. Use that return code directly to allow the script to
work with dash and without watchman (e.g. on Debian).
While at it stop redirecting the output. stderr is redirected to
/dev/null by test_lazy_prereq already, and stdout can actually be
useful -- the path of the found watchman executable is sent there, but
it's shown only if the script was run with --verbose.
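The check then boils down to something like this (a sketch):
test_lazy_prereq WATCHMAN '
	command -v watchman
'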
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Acked-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Replace the perl/Makefile.PL and the fallback perl/Makefile used under
NO_PERL_MAKEMAKER=NoThanks with a much simpler implementation heavily
inspired by how the i18n infrastructure's build process works[1].
The reason for having the Makefile.PL in the first place is that it
was initially[2] building a perl C binding to interface with libgit,
functionality that was removed[3] before Git.pm ever made it to the
master branch.
We've since started maintaining a fallback perl/Makefile, as
MakeMaker wouldn't work on some platforms[4]. That's just the tip of
the iceberg. We have the PM.stamp hack in the top-level Makefile[5] to
detect whether we need to regenerate the perl/perl.mak, which I fixed
just recently to deal with issues like the perl version changing from
under us[6].
There is absolutely no reason why this needs to be so complex
anymore. All we were getting out of this elaborate Rube Goldberg
machine was copying perl/* to perl/blib/*, doing a string-replacement
on the *.pm files to hardcode @@LOCALEDIR@@ in the source, as well as
pod2man-ing Git.pm & friends.
So replace the whole thing with something that's pretty much a copy of
how we generate po/build/**.mo from po/*.po, just with a small sed(1)
command instead of msgfmt. As that's being done rename the files
from *.pm to *.pmc just to indicate that they're generated (see
"perldoc -f require").
While I'm at it, change the fallback for Error.pm from being something
where we'll ship our own Error.pm if one doesn't exist at build time
to one where we just use a Git::Error wrapper that'll always prefer
the system-wide Error.pm, only falling back to our own copy if it
really doesn't exist at runtime. It's now shipped as
Git::FromCPAN::Error, making it easy to add other modules to
Git::FromCPAN::* in the future if that's needed.
Functional changes:
* This will not always install into perl's idea of its global
"installsitelib". This only potentially matters for packagers that
need to expose Git.pm for non-git use, and as explained in the
INSTALL file there's a trivial workaround.
* The scripts themselves will 'use lib' the target directory, but if
INSTLIBDIR is set it overrides it. It doesn't have to be this way,
it could be set in addition to INSTLIBDIR, but my reading of [7] is
that this is the desired behavior.
* We don't build man pages for all of the perl modules as we used to,
only Git(3pm). As discussed on-list[8] that we were building
installed manpages for purely internal APIs like Git::I18N or
private-Error.pm was always a bug anyway, and all the Git::SVN::*
ones say they're internal APIs.
There are apparently external users of Git.pm, but I don't expect
there to be any of the others.
As a side-effect of these general changes the perl documentation
is now only installed by install-{doc,man}, not by a mere "install"
as before.
1. 5e9637c629 ("i18n: add infrastructure for translating Git with
gettext", 2011-11-18)
2. b1edc53d06 ("Introduce Git.pm (v4)", 2006-06-24)
3. 18b0fc1ce1 ("Git.pm: Kill Git.xs for now", 2006-09-23)
4. f848718a69 ("Make perl/ build procedure ActiveState friendly.",
2006-12-04)
5. ee9be06770 ("perl: detect new files in MakeMaker builds",
2012-07-27)
6. c59c4939c2 ("perl: regenerate perl.mak if perl -V changes",
2017-03-29)
7. 0386dd37b1 ("Makefile: add PERLLIB_EXTRA variable that adds to
default perl path", 2013-11-15)
8. 87bmjjv1pu.fsf@evledraar.booking.com ("Re: [PATCH] Makefile:
replace perl/Makefile.PL with simple make rules")
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Internally we use 0{40} as a placeholder object name to signal the
codepath that there is no such object (e.g. the fast-forward check
while "git fetch" stores a new remote-tracking ref says "we know
there is no 'old' thing pointed at by the ref, as we are creating
it anew" by passing 0{40} for the 'old' side), and expect that a
codepath to locate an in-core object to return NULL as a sign that
the object does not exist. A look-up for an object that does not
exist, however, is quite costly in a repository with a large number
of packfiles. This access pattern has been optimized.
* jk/fewer-pack-rescan:
sha1_file: fast-path null sha1 as a missing object
everything_local: use "quick" object existence check
p5551: add a script to test fetch pack-dir rescans
t/perf/lib-pack: use fast-import checkpoint to create packs
p5550: factor out nonsense-pack creation
* cc/perf-run-config:
perf: store subsection results in "test-results/$GIT_PERF_SUBSECTION/"
perf/run: show name of rev being built
perf/run: add run_subsection()
perf/run: update get_var_from_env_or_config() for subsections
perf/run: add get_subsections()
perf/run: add calls to get_var_from_env_or_config()
perf/run: add GIT_PERF_DIRS_OR_REVS
perf/run: add get_var_from_env_or_config()
perf/run: add '--config' option to the 'run' script
Replace use of strbuf_addf() with strbuf_add() when enumerating
loose objects in for_each_file_in_obj_subdir(). Since we already
check the length and hex-values of the string before consuming
the path, we can prevent extra computation by using the lower-
level method.
One consumer of for_each_file_in_obj_subdir() is the abbreviation
code. OID abbreviations use a cached list of loose objects (per
object subdirectory) to make repeated queries fast, but there is
significant cache load time when there are many loose objects.
Most repositories do not have many loose objects before repacking,
but in the GVFS case the repos can grow to have millions of loose
objects. Profiling 'git log' performance in GitForWindows on a
GVFS-enabled repo with ~2.5 million loose objects revealed 12% of
the CPU time was spent in strbuf_addf().
Add a new performance test to p4211-line-log.sh that is more
sensitive to this cache-loading. By limiting to 1000 commits, we
more closely resemble user wait time when reading history into a
pager.
For a copy of the Linux repo with two ~512 MB packfiles and ~572K
loose objects, running 'git log --oneline --parents --raw -1000'
had the following performance:
HEAD~1            HEAD
----------------------------------------
7.70(7.15+0.54)   7.44(7.09+0.29) -3.4%
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We learned to talk to watchman to speed up "git status" and other
operations that need to see which paths have been modified.
* bp/fsmonitor:
fsmonitor: preserve utf8 filenames in fsmonitor-watchman log
fsmonitor: read entirety of watchman output
fsmonitor: MINGW support for watchman integration
fsmonitor: add a performance test
fsmonitor: add a sample integration script for Watchman
fsmonitor: add test cases for fsmonitor extension
split-index: disable the fsmonitor extension when running the split index test
fsmonitor: add a test tool to dump the index extension
update-index: add fsmonitor support to update-index
ls-files: Add support in ls-files to display the fsmonitor valid bit
fsmonitor: add documentation for the fsmonitor extension.
fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
update-index: add a new --force-write-index option
preload-index: add override to enable testing preload-index
bswap: add 64 bit endianness helper get_be64
Since fetch often deals with object-ids we don't have (yet),
it's an easy mistake for it to use a function like
parse_object() that gives the correct result (e.g., NULL)
but does so very slowly (because after failing to find the
object, we re-scan the pack directory looking for new
packs).
The regular test suite won't catch this because the end
result is correct, but we would want to know about
performance regressions, too. Let's add a test to the
regression suite.
Note that this uses a synthetic repository that has a large
number of packs. That's not ideal, as it means we're not
testing what "normal" users see (in fact, some of these
problems have existed for ages without anybody noticing
simply because a rescan on a normal repository just isn't
that expensive).
So what we're really looking for here is the spike you'd
notice in a pathological case (a lot of unknown objects
coming into a repo with a lot of packs). If that's fast,
then the normal cases should be, too.
Note that the test also makes liberal use of $MODERN_GIT for
setup; some of these regressions go back a ways, and we
should be able to use it to find the problems there.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We currently use fast-import only to create a large number
of objects, and then run O(n) invocations of pack-objects to
turn them into packs.
We can do this faster by just asking fast-import to
checkpoint and create a pack for each (after telling it
not to turn loose tiny packs).
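The stream then looks conceptually like this (a sketch; test_seq is
the test-lib helper, and fastimport.unpackLimit=0 prevents
fast-import from exploding small packs into loose objects):
for i in $(test_seq 1 "$nr_packs")
do
	printf 'blob\ndata <<EOF\n%s\nEOF\ncheckpoint\n' "$i"
done |
git -c fastimport.unpackLimit=0 fast-import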
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have a function to create a bunch of irrelevant packs to
measure the expense of reprepare_packed_git(). Let's make
that available to other perf scripts.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a new perf test for testing the performance of log while computing
OID abbreviations. Using --oneline --raw and --parents options maximizes
the number of OIDs to abbreviate while still spending some time computing
diffs.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a test utility (test-drop-caches) that flushes all changes to disk
then drops file system cache on Windows, Linux, and OSX.
Add a perf test (p7519-fsmonitor.sh) for fsmonitor.
By default, the performance test will utilize the Watchman file system
monitor if it is installed. If Watchman is not installed, it will use a
dummy integration script that does not report any new or modified files.
The dummy script has very little overhead which provides optimistic results.
The performance test will also use the untracked cache feature if it is
available as fsmonitor uses it to speed up scanning for untracked files.
There are 4 environment variables that can be used to alter the default
behavior of the performance test:
GIT_PERF_7519_UNTRACKED_CACHE: used to configure core.untrackedCache
GIT_PERF_7519_SPLIT_INDEX: used to configure core.splitIndex
GIT_PERF_7519_FSMONITOR: used to configure core.fsmonitor
GIT_PERF_7519_DROP_CACHE: if set, the OS caches are dropped between tests
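For instance, one might run (illustrative; the fsmonitor value is the
path of an integration hook such as the Watchman sample):
$ GIT_PERF_7519_FSMONITOR=.git/hooks/query-watchman \
  GIT_PERF_7519_DROP_CACHE=1 ./run p7519-fsmonitor.sh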
The big win for using fsmonitor is the elimination of the need to scan the
working directory looking for changed and untracked files. If the file
information is all cached in RAM, the benefits are reduced.
Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When tests are run for a subsection defined in a config file, it is
better if the results for the current subsection do not overwrite
the results of a previous subsection.
So let's store the results for a subsection in a subdirectory of
"test-results/" with the subsection name.
The aggregate.perl script, when run for a subsection, should then
aggregate the results found in "test-results/$GIT_PERF_SUBSECTION/".
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is nice to show the user not just the sha1 of the current
revision being built, but also the actual name of this revision.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Let's actually use the subsections we find in the config file
to run the perf tests separately for each subsection.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As we will set some config options in subsections, let's
teach get_var_from_env_or_config() to get the config options
from the subsections if they are set there.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function makes it possible to find subsections, so that
we will be able to run different tests for different subsections
in a later commit.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These calls make it possible to have the make command or the
make options in a config file, instead of in environment
variables.
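For example, with the config file support from this series one could
have (a sketch; option names assumed to follow the perf.* scheme):
[perf]
	makeOpts = -j8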
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This environment variable can be set to some revisions or
directories whose Git versions should be tested, in addition
to the revisions or directories passed as arguments to the
'run' script.
This enables a "perf.dirsOrRevs" configuration variable to
be used to set revisions or directories whose Git versions
should be tested.
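For example (illustrative):
$ GIT_PERF_DIRS_OR_REVS='v2.12.0 /path/to/other/git' ./run p0001-rev-list.sh
or, via the corresponding config option:
[perf]
	dirsOrRevs = v2.12.0 /path/to/other/git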
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add get_var_from_env_or_config() to easily set variables
from a config file if they are defined there and not already set.
This can also set them to a default value if one is provided.
As an example, use this function to set GIT_PERF_REPEAT_COUNT
from the perf.repeatCount config option or from the default
value.
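At this point the helper is conceptually something like this (a
sketch of the version this commit introduces, not the exact code):
get_var_from_env_or_config () {
	env_var="$1"
	conf_sec="$2"
	conf_var="$3"
	default="$4" # optional
	# an existing environment value always wins
	eval "test -n \"\${$env_var}\"" && return
	if test -n "$GIT_PERF_CONFIG_FILE" &&
	   value=$(git config -f "$GIT_PERF_CONFIG_FILE" "$conf_sec.$conf_var")
	then
		eval "$env_var=\"\$value\""
	elif test -n "$default"
	then
		eval "$env_var=\"\$default\""
	fi
}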
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is error prone and tiring to use many long environment
variables to give parameters to the 'run' script.
Let's make it easy to store some parameters in a config
file and to pass them to the run script.
The GIT_PERF_CONFIG_FILE variable will be set to the
argument of the '--config' option. This variable is not
used yet. It will be used in a following commit.
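For example (illustrative):
$ ./run --config perf.conf v2.13.0 v2.14.0 -- p0001-rev-list.sh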
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A performance test for writing the index, to be able to determine
whether changes to how the ondisk structure is allocated help.
Signed-off-by: Kevin Willford <kewillf@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Optimize "what are the object names already taken in an alternate
object database?" query that is used to derive the length of prefix
an object name is uniquely abbreviated to.
* rs/sha1-name-readdir-optim:
sha1_file: guard against invalid loose subdirectory numbers
sha1_file: let for_each_file_in_obj_subdir() handle subdir names
p4205: add perf test script for pretty log formats
sha1_name: cache readdir(3) results in find_short_object_filename()
Add simple performance tests for expanded log format placeholders.
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
perf-test update.
* jh/memihash-opt:
p0004: don't error out if test repo is too small
p0004: don't abort if multi-threaded is too slow
p0004: use test_perf
p0004: avoid using pipes
p0004: simplify calls of test-lazy-init-name-hash
When the tested repo has an index.lock file it should be removed. This
file may be present if e.g. git-status previously crashed in that
repo, and it will make a lot of git commands fail. Let's try harder
and remove the lock.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The internal implementation of "git grep" has seen some clean-up.
* ab/grep-preparatory-cleanup: (31 commits)
grep: assert that threading is enabled when calling grep_{lock,unlock}
grep: given --threads with NO_PTHREADS=YesPlease, warn
pack-objects: fix buggy warning about threads
pack-objects & index-pack: add test for --threads warning
test-lib: add a PTHREADS prerequisite
grep: move is_fixed() earlier to avoid forward declaration
grep: change internal *pcre* variable & function names to be *pcre1*
grep: change the internal PCRE macro names to be PCRE1
grep: factor test for \0 in grep patterns into a function
grep: remove redundant regflags assignments
grep: catch a missing enum in switch statement
perf: add a comparison test of log --grep regex engines with -F
perf: add a comparison test of log --grep regex engines
perf: add a comparison test of grep regex engines with -F
perf: add a comparison test of grep regex engines
perf: emit progress output when unpacking & building
perf: add a GIT_PERF_MAKE_COMMAND for when *_MAKE_OPTS won't do
grep: add tests to fix blind spots with \0 patterns
grep: prepare for testing binary regexes containing rx metacharacters
grep: add a test helper function for less verbose -f \0 tests
...
Add perf-test for wildmatch.
* ab/perf-wildmatch:
perf: add test showing exponential growth in path globbing
perf: add function to setup a fresh test repo
Add a performance comparison test of grep regex engines given fixed
strings.
The current logic in compile_regexp() ignores the engine parameter and
uses kwset() to search for these, so this test shows no difference
between engines right now:
$ GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux ./run p7821-grep-engines-fixed.sh
[...]
Test                                            this tree
------------------------------------------------
7821.1: fixed grep int                          0.56(1.67+0.68)
7821.2: basic grep int                          0.57(1.70+0.57)
7821.3: extended grep int                       0.59(1.76+0.51)
7821.4: perl grep int                           1.08(1.71+0.55)
7821.6: fixed grep uncommon                     0.23(0.55+0.50)
7821.7: basic grep uncommon                     0.24(0.55+0.50)
7821.8: extended grep uncommon                  0.26(0.55+0.52)
7821.9: perl grep uncommon                      0.24(0.58+0.47)
7821.11: fixed grep æ                           0.36(1.30+0.42)
7821.12: basic grep æ                           0.36(1.32+0.40)
7821.13: extended grep æ                        0.38(1.30+0.42)
7821.14: perl grep æ                            0.35(1.24+0.48)
Only when run with -i via GIT_PERF_7821_GREP_OPTS=' -i' do we avoid
going through the same kwset.[ch] codepath, see the "Even when
-F..." comment in grep.c. This only kicks in for the non-ASCII case:
$ GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux GIT_PERF_7821_GREP_OPTS=' -i' ./run p7821-grep-engines-fixed.sh
[...]
Test                                               this tree
---------------------------------------------------
7821.1: fixed grep -i int                          0.62(2.10+0.57)
7821.2: basic grep -i int                          0.68(1.90+0.61)
7821.3: extended grep -i int                       0.78(1.94+0.57)
7821.4: perl grep -i int                           0.98(1.78+0.74)
7821.6: fixed grep -i uncommon                     0.24(0.44+0.64)
7821.7: basic grep -i uncommon                     0.25(0.56+0.54)
7821.8: extended grep -i uncommon                  0.27(0.62+0.45)
7821.9: perl grep -i uncommon                      0.24(0.59+0.49)
7821.11: fixed grep -i æ                           0.30(0.96+0.39)
7821.12: basic grep -i æ                           0.27(0.92+0.44)
7821.13: extended grep -i æ                        0.28(0.90+0.46)
7821.14: perl grep -i æ                            0.28(0.74+0.49)
I'm planning to change how fixed-string searching happens. This test
gives a baseline for comparing performance before & after any such
change.
See commit ("perf: add a comparison test of grep regex engines",
2017-04-19) for details on the machine the above test run was executed
on.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a very basic performance comparison test comparing the POSIX
basic, extended and perl engines.
In theory the "basic" and "extended" engines should be implemented
using the same underlying code with a slightly different pattern
parser, but some implementations may not do this. Jump through some
slight hoops to test both, which is worthwhile since "basic" is the
default.
Running this on an i7 3.4GHz Linux 4.9.0-2 Debian testing against a
checkout of linux.git & latest upstream PCRE, both PCRE and git
compiled with -O3 using gcc 7.1.1:
$ GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux ./run p7820-grep-engines.sh
[...]
Test                                            this tree
---------------------------------------------------------------
7820.1: basic grep 'how.to'                     0.34(1.24+0.53)
7820.2: extended grep 'how.to'                  0.33(1.23+0.45)
7820.3: perl grep 'how.to'                      0.31(1.05+0.56)
7820.5: basic grep '^how to'                    0.32(1.24+0.42)
7820.6: extended grep '^how to'                 0.33(1.20+0.44)
7820.7: perl grep '^how to'                     0.57(2.67+0.42)
7820.9: basic grep '[how] to'                   0.51(2.16+0.45)
7820.10: extended grep '[how] to'               0.49(2.20+0.43)
7820.11: perl grep '[how] to'                   0.56(2.60+0.43)
7820.13: basic grep '\(e.t[^ ]*\|v.ry\) rare'   0.66(3.25+0.40)
7820.14: extended grep '(e.t[^ ]*|v.ry) rare'   0.65(3.19+0.46)
7820.15: perl grep '(e.t[^ ]*|v.ry) rare'       1.05(5.74+0.34)
7820.17: basic grep 'm\(ú\|u\)lt.b\(æ\|y\)te'   0.34(1.28+0.47)
7820.18: extended grep 'm(ú|u)lt.b(æ|y)te'      0.34(1.38+0.38)
7820.19: perl grep 'm(ú|u)lt.b(æ|y)te'          0.39(1.56+0.44)
Options can also be passed to git-grep via the GIT_PERF_7820_GREP_OPTS
environment variable. There are various modes such as "-v" that have
very different performance profiles, but handling the combinatorial
explosion of testing all those options would make this script much
more complex and harder to maintain. Instead just add the ability to
do one-shot runs with arbitrary options, e.g.:
$ GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux GIT_PERF_7820_GREP_OPTS=" -i" ./run p7820-grep-engines.sh
[...]
Test                                               this tree
------------------------------------------------------------------
7820.1: basic grep -i 'how.to'                     0.49(1.72+0.38)
7820.2: extended grep -i 'how.to'                  0.46(1.64+0.42)
7820.3: perl grep -i 'how.to'                      0.44(1.45+0.45)
7820.5: basic grep -i '^how to'                    0.47(1.76+0.38)
7820.6: extended grep -i '^how to'                 0.47(1.70+0.42)
7820.7: perl grep -i '^how to'                     0.65(2.72+0.37)
7820.9: basic grep -i '[how] to'                   0.86(3.64+0.42)
7820.10: extended grep -i '[how] to'               0.84(3.62+0.46)
7820.11: perl grep -i '[how] to'                   0.73(3.06+0.39)
7820.13: basic grep -i '\(e.t[^ ]*\|v.ry\) rare'   1.63(8.13+0.36)
7820.14: extended grep -i '(e.t[^ ]*|v.ry) rare'   1.64(8.01+0.44)
7820.15: perl grep -i '(e.t[^ ]*|v.ry) rare'       1.44(6.88+0.44)
7820.17: basic grep -i 'm\(ú\|u\)lt.b\(æ\|y\)te'   0.66(2.67+0.44)
7820.18: extended grep -i 'm(ú|u)lt.b(æ|y)te'      0.66(2.67+0.43)
7820.19: perl grep -i 'm(ú|u)lt.b(æ|y)te'          0.59(2.31+0.37)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Amend the t/perf/run output so that in addition to the "Running N
tests" heading currently being emitted, it also emits "Unpacking $rev"
and "Building $rev" when setting up the build/$rev directory & when
building it, respectively.
This makes it easier to see what's going on and what revision is being
tested as the output scrolls by.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a GIT_PERF_MAKE_COMMAND variable to complement the existing
GIT_PERF_MAKE_OPTS facility. This allows specifying an arbitrary shell
command to execute instead of 'make'.
This is useful e.g. in cases where the name, semantics or defaults of
a Makefile flag have changed over time. It can even be used to change
the contents of the tree, useful for monkeypatching ancient versions
of git to get them to build.
This opens Pandora's box in some ways: it's now possible to
"jailbreak" the perf environment and e.g. modify the source tree via
this arbitrary command instead of just issuing a custom "make"
command. Such a command has to be re-entrant in the sense that
subsequent perf runs will re-use the possibly modified tree.
It would be pointless to try to mitigate or work around that caveat in
a tool purely aimed at Git developers, so this change makes no attempt
to do so.
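For example, to force a particular build setup across all tested
revisions (illustrative):
$ GIT_PERF_MAKE_COMMAND='make -j8 CFLAGS=-O0' ./run v2.0.0 HEAD -- p0001-rev-list.sh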
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Repositories with fewer than 4000 entries are always handled using a
single thread, causing test-lazy-init-name-hash --multi to error out.
Don't abort the whole test script in that case, but simply skip the
multi-threaded performance check. We can still use it to compare the
single-threaded speed of different versions in that case.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Acked-by: Jeff Hostetler <git@jeffhostetler.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the single-threaded variant beats the multi-threaded one then we may
have a performance bug, but that doesn't justify aborting the test.
Drop that check; we can compare the results for --single and --multi
using the actual performance tests.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Acked-by: Jeff Hostetler <git@jeffhostetler.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The perf test suite (more specifically: t/perf/aggregate.perl) requires
each test script to write test results into a file, otherwise it aborts
when aggregating. Add actual performance tests with test_perf to allow
p0004 to be run together with other perf scripts.
Calibrate the value for the parameter --count based on the size of the
test repository, in order to get meaningful results with smaller repos
yet still be able to finish the script against huge ones without having
to wait for hours.
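The measurement itself then becomes a regular test_perf block, along
these lines (a sketch; the exact options differ):
test_perf "multi-threaded, $count iterations" '
	test-lazy-init-name-hash --multi --count=$count
'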
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Acked-by: Jeff Hostetler <git@jeffhostetler.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The return code of commands on the producing end of a pipe is ignored.
Evaluate the outcome of test-lazy-init-name-hash by calling sort
separately.
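Schematically (a sketch; the exact helper invocation differs):
# before: a failure of the helper is ignored
test-lazy-init-name-hash --dump | sort
# after: the helper's exit code is checked
test-lazy-init-name-hash --dump >out &&
sort <out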
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Acked-by: Jeff Hostetler <git@jeffhostetler.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test library puts helpers into $PATH, so we can simply call them
without specifying their location.
The suffix $X is also not necessary because .exe files on Windows can be
started without specifying their extension, and on other platforms it's
empty anyway.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Acked-by: Jeff Hostetler <git@jeffhostetler.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a test showing that runtimes of the wildmatch() function used for
globbing in git grow exponentially in the face of some pathological
globs.
This issue affects both globs matching filenames via e.g. ls-files,
and globs matching refnames via e.g. for-each-ref.
As noted in the test description this is a test to see whether Git
suffers from the issue noted in an article Russ Cox posted today about
common bugs in various glob implementations:
https://research.swtch.com/glob
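For example, a pattern of the shape the article discusses can trigger
the blowup against a long run of 'a's (illustrative; not necessarily
the exact pattern the test uses):
$ git ls-files 'a*a*a*a*a*a*a*a*a*a*b'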
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a function to set up a fresh test repo via 'git init' to complement
the existing functions that copy over a normal & large repo.
Some performance tests don't need any existing repository data at all
to be significant, e.g. tests which stress glob matches against single
pathological revisions or files, which I'm about to add in a
subsequent commit.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rebasing onto many changes is interesting, but it's also
interesting to see what happens when rebasing many changes.
And while at it, let's also look at the impact of using a
split index.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git checkout" that handles a lot of paths has been optimized by
reducing the number of unnecessary checks of paths in the
has_dir_name() function.
* jh/add-index-entry-optim:
read-cache: speed up has_dir_name (part 2)
read-cache: speed up has_dir_name (part 1)
read-cache: speed up add_index_entry during checkout
p0006-read-tree-checkout: perf test to time read-tree
read-cache: add strcmp_offset function
The string-list API used a custom reallocation strategy that was
very inefficient, instead of using the usual ALLOC_GROW() macro,
which has been fixed.
* jh/string-list-micro-optim:
string-list: use ALLOC_GROW macro when reallocing string_list
Change the test descriptions from being treated as binary blobs by
perl to being treated as UTF-8. This ensures that e.g. a test
description like "æ" is counted as 1 character, not 2.
I have WIP performance tests for non-ASCII grep patterns on another
topic that are affected by this.
Now instead of:
$ ./run p0000-perf-lib-sanity.sh
[...]
0000.4: export a weird var                            0.00(0.00+0.00)
0000.5: éḿíẗ ńöń-ÁŚĆÍÍ ćḧáŕáćẗéŕś          0.00(0.00+0.00)
0000.7: important variables available in subshells    0.00(0.00+0.00)
[...]
We emit:
[...]
0000.4: export a weird var                            0.00(0.00+0.00)
0000.5: éḿíẗ ńöń-ÁŚĆÍÍ ćḧáŕáćẗéŕś                     0.00(0.00+0.00)
0000.7: important variables available in subshells    0.00(0.00+0.00)
[...]
Fixes code originally added in 342e9ef2d9 ("Introduce a performance
testing framework", 2012-02-17).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Hotfix for a topic that is already in 'master'.
* jh/memihash-opt:
p0004: make perf test executable
t3008: skip lazy-init test on a single-core box
test-online-cpus: helper to return cpu count
name-hash: fix buffer overrun
Created t/perf/repos/many-files.sh to generate large but
artificial repositories.
Created t/perf/inflate-repo.sh to alter an EXISTING repo
to have a set of large commits. This can be used to create
a branch with 1M+ files in repositories like git.git or
linux.git, but with more realistic content. It does this
by making multiple copies of the entire worktree in a series
of sub-directories.
The branch name and ballast structure created by both scripts
match, so either script can be used to generate very large
test repositories for the following perf test.
Created t/perf/p0006-read-tree-checkout.sh to measure
performance on various read-tree, checkout, and update-index
operations. This test can run using either normal repos or
ones from the above scripts.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It looks like in 89c3b0ad43 (name-hash: add perf test for lazy_init_name_hash,
2017-03-23) p0004 was not created with executable permissions.
Let's fix that.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use ALLOC_GROW() macro when reallocing a string_list array
rather than simply increasing it by 32. This is a performance
optimization.
During status on a very large repo with many changes, a
significant percentage of the total run time is spent
reallocing the wt_status.changes array.
This change decreases the time in wt_status_collect_changes_worktree()
from 125 seconds to 45 seconds on my very large repository.
This produced a modest gain on my 1M file artificial repo, but
broke even on linux.git.
Test                                            HEAD^^             HEAD
---------------------------------------------------------------------------------------
0005.2: read-tree status br_ballast (1000001)   8.29(5.62+2.62)    8.22(5.57+2.63) -0.8%
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The name-hash used for detecting paths that are different only in
case (which matters on case insensitive filesystems) has been
optimized to take advantage of multi-threading when it makes sense.
* jh/memihash-opt:
name-hash: add test-lazy-init-name-hash to .gitignore
name-hash: add perf test for lazy_init_name_hash
name-hash: add test-lazy-init-name-hash
name-hash: perf improvement for lazy_init_name_hash
hashmap: document memihash_cont, hashmap_disallow_rehash api
hashmap: add disallow_rehash setting
hashmap: allow memihash computation to be continued
name-hash: specify initial size for istate.dir_hash table
Created t/perf/p0004-lazy-init-name-hash.sh test
to demonstrate correctness and performance gains
with the multithreaded version of lazy_init_name_hash().
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git filter-branch --prune-empty" drops a single-parent commit that
becomes a no-op, but did not drop a root commit whose tree is empty.
* dp/filter-branch-prune-empty:
p7000: add test for filter-branch with --prune-empty
filter-branch: fix --prune-empty on parentless commits
t7003: ensure --prune-empty removes entire branch when applicable
t7003: ensure --prune-empty can prune root commit
It's tempting to say:
./run v1.0.0 HEAD
to see how we've sped up Git over the years. Unfortunately,
this doesn't quite work because versions of Git prior to
v1.7.0 lack bin-wrappers, so our "run" script doesn't
correctly put them in the PATH.
Worse, it means we silently find whatever other "git" is in
the PATH, and produce test results that have no bearing on
what we asked for.
Let's fall back to the main git directory when bin-wrappers
isn't present. Many modern perf scripts won't run with such
an antique version of Git, of course, but at least those
failures are detected and reported (and you're free to write
a limited perf script that works across many versions).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since 1a0962dee (t/perf: fix regression in testing older
versions of git, 2016-06-22), we point "$MODERN_GIT" to a
copy of git that matches the t/perf script itself, and which
can be used for tasks outside of the actual timings. This is
needed because the setup done by perf scripts keeps moving
forward in time, and may use features that the older
versions of git we are testing do not have.
That commit used $MODERN_GIT to fix a case where we relied
on the relatively recent --git-path option. But if you go
back further still, there are more problems.
Since 7501b5921 (perf: make the tests work in worktrees,
2016-05-13), we use "git -C", but versions of git older than
44e1e4d67 (git: run in a directory given with -C option,
2013-09-09) don't know about "-C". So testing an old version
of git with a new version of t/perf will fail the setup
step.
We can fix this by using $MODERN_GIT during the setup;
there's no need to use the antique version, since it doesn't
affect the timings. Likewise, we'll adjust the "init"
invocation; antique versions of git called this "init-db".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In p0001, a variable was created in a test_expect_success block to be
used in later test_perf blocks, but was not exported. This caused the
variable to not appear in those blocks (this can be verified by writing
'test -n "$commit"' in those blocks), resulting in a slightly different
invocation than what was intended. Export that variable.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Adjust a perf test to new world order where commands that do
require a repository are really strict about having a repository.
* rs/p5302-create-repositories-before-tests:
p5302: create repositories for index-pack results explicitly
Before 7176a314 (index-pack: complain when --stdin is used outside of a
repo) index-pack silently created a non-existing target directory; now
the command refuses to work unless it's used against a valid repository.
That causes p5302 to fail, which relies on the former behavior. Fix it
by setting up the destinations for its performance tests using git init.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a sort command to test-string-list that reads lines from stdin,
stores them in a string_list and then sorts it. Use it in a simple
perf test script to measure the performance of string_list_sort().
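Usage is along these lines (illustrative):
$ git ls-files | test-string-list sort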
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fetching from a remote that has many tags that are irrelevant
to the branches we are following, we used to waste way too many cycles
by checking too carefully whether the object pointed at by a tag (that
we are not going to fetch!) exists in our repository.
* jk/fetch-quick-tag-following:
fetch: use "quick" has_sha1_file for tag following
When we auto-follow tags in a fetch, we look at all of the
tags advertised by the remote and fetch ones where we don't
already have the tag, but we do have the object it peels to.
This involves a lot of calls to has_sha1_file(), some of
which we can reasonably expect to fail. Since 45e8a74
(has_sha1_file: re-check pack directory before giving up,
2013-08-30), this may cause many calls to
reprepare_packed_git(), which is potentially expensive.
This has gone unnoticed for several years because it
requires a fairly unique setup to matter:
1. You need to have a lot of packs on the client side to
make reprepare_packed_git() expensive (the most
expensive part is finding duplicates in an unsorted
list, which is currently quadratic).
2. You need a large number of tag refs on the server side
that are candidates for auto-following (i.e., that the
client doesn't have). Each one triggers a re-read of
the pack directory.
3. Under normal circumstances, the client would
auto-follow those tags and after one large fetch, (2)
would no longer be true. But if those tags point to
history which is disconnected from what the client
otherwise fetches, then it will never auto-follow, and
those candidates will impact it on every fetch.
So when all three are true, each fetch pays an extra
O(nr_tags * nr_packs^2) cost, mostly in string comparisons
on the pack names. This was exacerbated by 47bf4b0
(prepare_packed_git_one: refactor duplicate-pack check,
2014-06-30) which uses a slightly more expensive string
check, under the assumption that the duplicate check doesn't
happen very often (and it shouldn't; the real problem here
is how often we are calling reprepare_packed_git()).
This patch teaches fetch to use HAS_SHA1_QUICK to sacrifice
accuracy for speed, in cases where we might be racy with a
simultaneous repack. This is similar to the fix in 0eeb077
(index-pack: avoid excessive re-reading of pack directory,
2015-06-09). As with that case, it's OK for has_sha1_file() to
occasionally say "no I don't have it" when we do, because
the worst case is not a corruption, but simply that we may
fail to auto-follow a tag that points to it.
Here are results from the included perf script, which sets
up a situation similar to the one described above:
Test            HEAD^               HEAD
----------------------------------------------------------
5550.4: fetch   11.21(10.42+0.78)   0.08(0.04+0.02) -99.3%
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Performance tests done via "t/perf" did not use the same build
configuration if the user relied on an autoconf-generated
configuration.
* ks/perf-build-with-autoconf:
t/perf/run: copy config.mak.autogen & friends to build area
Some codepaths in "git pack-objects" were not ready to use an
existing pack bitmap; now they are and as the result they have
become faster.
* ks/pack-objects-bitmap:
pack-objects: use reachability bitmap index when generating non-stdout pack
pack-objects: respect --local/--honor-pack-keep/--incremental when bitmap is in use
Otherwise, for people who use an autotools-based configure in the main
worktree, the performance testing results will be inconsistent, as the
work and build trees could be using e.g. different optimization levels.
See e.g.
http://public-inbox.org/git/20160818175222.bmm3ivjheokf2qzl@sigill.intra.peff.net/
for an example.
NOTE: config.status has to be copied because otherwise the build would
want to rerun configure, losing the just-copied config.mak.autogen.
Signed-off-by: Kirill Smelkov <kirr@nexedi.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Starting from 6b8fda2d (pack-objects: use bitmaps when packing objects),
if a repository has a bitmap index, pack-objects can nicely speed up the
"Counting objects" graph traversal phase. That however was done only for
the case when the resultant pack is sent to stdout, not written into a file.
The reason is that for an on-disk repack we by default want:
- to produce a good pack (with a bitmap index, not-yet-packed objects are
emitted into the pack in suboptimal order).
- to use a more robust pack-generation codepath (avoiding possible
bugs in bitmap code and possible bitmap index corruption).
Jeff King further explains:
The reason for this split is that pack-objects tries to determine how
"careful" it should be based on whether we are packing to disk or to
stdout. Packing to disk implies "git repack", and that we will likely
delete the old packs after finishing. We want to be more careful (so
as not to carry forward a corruption, and to generate a more optimal
pack), and we presumably run less frequently and can afford extra CPU.
Whereas packing to stdout implies serving a remote via "git fetch" or
"git push". This happens more frequently (e.g., a server handling many
fetching clients), and we assume the receiving end takes more
responsibility for verifying the data.
But this isn't always the case. One might want to generate on-disk
packfiles for a specialized object transfer. Just using "--stdout" and
writing to a file is not optimal, as it will not generate the matching
pack index.
So it would be useful to have some way of overriding this heuristic:
to tell pack-objects that even though it should generate on-disk
files, it is still OK to use the reachability bitmaps to do the
traversal.
So we can teach pack-objects to use the bitmap index for the initial
object counting phase when generating the resultant pack file too:
- if we take care to not let it be activated under git-repack:
See above about repack robustness and not forward-carrying corruption.
- if we know bitmap index generation is not enabled for resultant pack:
The current code has singleton bitmap_git, so it cannot work
simultaneously with two bitmap indices.
We also want to avoid (at least with current implementation)
generating bitmaps off of bitmaps. The reason here is: when generating
a pack, not-yet-packed objects will be emitted into pack in
suboptimal order and added to tail of the bitmap as "extended entries".
When the resultant pack + some new objects in associated repository
are in turn used to generate another pack with bitmap, the situation
repeats: new objects are again not emitted optimally and just added to
bitmap tail - not in recency order.
So the pack badness can grow over time when at each step we have
bitmapped pack + some other objects. That's why we want to avoid
generating bitmaps off of bitmaps, not to let pack badness grow.
- if we keep pack reuse enabled still only for the "send-to-stdout" case:
Because pack-to-file needs to generate an index for the destination pack,
and currently on pack reuse raw entries are directly written out to the
destination pack by write_reused_pack(), bypassing the bookkeeping needed
for pack index generation that is done by the regular codepath in
write_one() and friends.
( In the future we might teach pack-reuse code about cases when index
also needs to be generated for resultant pack and remove
pack-reuse-only-for-stdout limitation )
This way for pack-objects -> file we get a nice speedup:
erp5.git[1] (~230MB) extracted from ~ 5GB lab.nexedi.com backup
repository managed by git-backup[2] via
time echo 0186ac99 | git pack-objects --revs erp5pack
before: 37.2s
after: 26.2s
And for `git repack -adb` packed git.git
time echo 5c589a73 | git pack-objects --revs gitpack
before: 7.1s
after: 3.6s
i.e. it can be 30% - 50% speedup for pack extraction.
git-backup extracts many packs on repositories restoration. That was my
initial motivation for the patch.
[1] https://lab.nexedi.com/nexedi/erp5
[2] https://lab.nexedi.com/kirr/git-backup
NOTE
Jeff also suggests that pack.useBitmaps was probably a mistake to
introduce originally. This way we are not adding another config point,
but instead just always default to-file pack-objects to not use the
bitmap index: tools which need to generate on-disk packs using a bitmap
can pass --use-bitmap-index explicitly. And git-repack never passes
--use-bitmap-index, so this way we can be sure regular on-disk repacking
remains robust.
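A tool that wants the fast path can then opt in explicitly, e.g.
(reusing the invocation shown above; illustrative):
$ echo 0186ac99 | git pack-objects --revs --use-bitmap-index erp5pack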
NOTE2
`git pack-objects --stdout >file.pack` + `git index-pack file.pack` is much slower
than `git pack-objects file.pack`. Extracting erp5.git pack from
lab.nexedi.com backup repository:
$ time echo 0186ac99 | git pack-objects --stdout --revs >erp5pack-stdout.pack
real 0m22.309s
user 0m21.148s
sys 0m0.932s
$ time git index-pack erp5pack-stdout.pack
real 0m50.873s <-- more than 2 times slower than time to generate pack itself!
user 0m49.300s
sys 0m1.360s
So the time for
`pack-object --stdout >file.pack` + `index-pack file.pack` is 72s,
while
`pack-objects file.pack` which does both pack and index is 27s.
And even
`pack-objects --no-use-bitmap-index file.pack` is 37s.
Jeff explains:
The packfile does not carry the sha1 of the objects. A receiving
index-pack has to compute them itself, including inflating and applying
all of the deltas.
that's why for `git-backup restore` we want to teach `git pack-objects
file.pack` to use bitmaps instead of using `git pack-objects --stdout
>file.pack` + `git index-pack file.pack`.
NOTE3
The speedup is now tracked via t/perf/p5310-pack-bitmaps.sh
Test                                    56dfeb62            this tree
--------------------------------------------------------------------------------
5310.2: repack to disk                  8.98(8.05+0.29)     9.05(8.08+0.33) +0.8%
5310.3: simulated clone                 2.02(2.27+0.09)     2.01(2.25+0.08) -0.5%
5310.4: simulated fetch                 0.81(1.07+0.02)     0.81(1.05+0.04) +0.0%
5310.5: pack to file                    7.58(7.04+0.28)     7.60(7.04+0.30) +0.3%
5310.6: pack to file (bitmap)           7.55(7.02+0.28)     3.25(2.82+0.18) -57.0%
5310.8: clone (partial bitmap)          1.83(2.26+0.12)     1.82(2.22+0.14) -0.5%
5310.9: pack to file (partial bitmap)   6.86(6.58+0.30)     2.87(2.74+0.20) -58.2%
More context:
http://marc.info/?t=146792101400001&r=1&w=2
http://public-inbox.org/git/20160707190917.20011-1-kirr@nexedi.com/T/#t
Cc: Vicent Marti <tanoku@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Kirill Smelkov <kirr@nexedi.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The delta-base-cache mechanism has been a key to the performance in
a repository with a tightly packed packfile, but it did not scale
well even with a larger value of core.deltaBaseCacheLimit.
* jk/delta-base-cache:
t/perf: add basic perf tests for delta base cache
delta_base_cache: use hashmap.h
delta_base_cache: drop special treatment of blobs
delta_base_cache: use list.h for LRU
release_delta_base_cache: reuse existing detach function
clear_delta_base_cache_entry: use a more descriptive name
cache_or_unpack_entry: drop keep_cache parameter
This just shows off the improvements done by the last few
patches, and gives us a baseline for noticing regressions in
the future. Here are the results with linux.git as the perf
"large repo":
Test                origin                HEAD
-------------------------------------------------------------------
0003.1: log --raw   43.41(40.36+2.69)     33.86(30.96+2.41) -22.0%
0003.2: log -S      313.61(309.74+3.78)   298.75(295.58+3.00) -4.7%
(for a large repo, the "log -S" improvements are greater if
you bump the delta base cache limit, but I think it makes
sense to test the "stock" behavior, since that is what most
people will see).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When "git rebase" tries to compare set of changes on the updated
upstream and our own branch, it computes patch-id for all of these
changes and attempts to find matches. This has been optimized by
lazily computing the full patch-id (which is expensive) to be
compared only for changes that touch the same set of paths.
* kw/patch-ids-optim:
rebase: avoid computing unnecessary patch IDs
patch-ids: add flag to create the diff patch id using header only data
patch-ids: replace the seen indicator with a commit pointer
patch-ids: stop using a hand-rolled hashmap implementation
The `rebase` family of Git commands avoid applying patches that were
already integrated upstream. They do that by using the revision walking
option that computes the patch IDs of the two sides of the rebase
(local-only patches vs upstream-only ones) and skipping those local
patches whose patch ID matches one of the upstream ones.
In many cases, this causes unnecessary churn, as already the set of
paths touched by a given commit would suffice to determine that an
upstream patch has no local equivalent.
This hurts performance in particular when there are a lot of upstream
patches, and/or large ones.
Therefore, let's introduce the concept of a "diff-header-only" patch ID,
compare those first, and only evaluate the "full" patch ID lazily.
Please note that in contrast to the "full" patch IDs, those
"diff-header-only" patch IDs are prone to collide with one another, as
adjacent commits frequently touch the very same files. Hence we now
have to be careful to allow multiple hash entries with the same hash.
We accomplish that by using the hashmap_add() function that does not even
test for hash collisions. This also allows us to evaluate the full patch ID
lazily, i.e. only when we find commits with matching diff-header-only
patch IDs.
We add a performance test that demonstrates ~1-6% improvement. In
practice this will depend on various factors such as how many upstream
changes and how big those changes are along with whether file system
caches are cold or warm. As Git's test suite has no way of catching
performance regressions, we also add a regression test that verifies
that the full patch ID computation is skipped when the diff-header-only
computation suffices.
Signed-off-by: Kevin Willford <kcwillford@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git's pack storage does efficient (log n) lookups in a
single packfile's index, but if we have multiple packfiles,
we have to linearly search each for a given object. This
patch introduces some timing tests for cases where we have a
large number of packs, so that we can measure any
improvements we make in the following patches.
The main thing we want to time is object lookup. To do this,
we measure "git rev-list --objects --all", which does a
fairly large number of object lookups (essentially one per
object in the repository).
However, we also measure the time to do a full repack, which
is interesting for two reasons. One is that in addition to
the usual pack lookup, it has its own linear iteration over
the list of packs. And two is that because it is the tool
one uses to go from an inefficient many-pack situation back
to a single pack, we care about its performance not only at
marginal numbers of packs, but at the extreme cases (e.g.,
if you somehow end up with 5,000 packs, it is the only way
to get back to 1 pack, so we need to make sure it performs
well).
We measure the performance of each command in three
scenarios: 1 pack, 50 packs, and 1,000 packs.
The 1-pack case is a baseline; any optimizations we do to
handle multiple packs cannot possibly perform better than
this.
The 50-pack case is as far as Git should generally allow
your repository to go, if you have auto-gc enabled with the
default settings. So this represents the maximum performance
improvement we would expect under normal circumstances.
The 1,000-pack case is hopefully rare, though I have seen it
in the wild where automatic maintenance was broken for some
time (and the repository continued to receive pushes). This
represents cases where we care less about general
performance, but want to make sure that a full repack
command does not take excessively long.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Allow t/perf framework to use the features from the most recent
version of Git even when testing an older installed version.
* jk/perf-any-version:
p4211: explicitly disable renames in no-rename test
t/perf: fix regression in testing older versions of git
p4211 tests line-log performance both with and without "-M".
In v2.9.0, the case without "-M" appears to have regressed
badly, but that is only because we flipped on renames by
default.
Let's have the test explicitly disable renames to get
consistent timings (and to match the presumed intent of the
test, which is to see the effects with and without renames).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>