mirrors/git

mirror of https://github.com/git/git.git synced 2024-11-28 04:23:30 +08:00

Author	SHA1	Message	Date
Junio C Hamano	658624209a	Merge branch 'ss/faq-ignore' Random bits of FAQ. * ss/faq-ignore: gitfaq: files in .gitignore are tracked	2020-05-13 12:19:19 -07:00
Junio C Hamano	3af459e48d	Merge branch 'jc/auto-gc-quiet' Teach "am", "commit", "merge" and "rebase", when they are run with the "--quiet" option, to pass "--quiet" down to "gc --auto". * jc/auto-gc-quiet: auto-gc: pass --quiet down from am, commit, merge and rebase auto-gc: extract a reusable helper from "git fetch"	2020-05-13 12:19:19 -07:00
Junio C Hamano	aa28171c27	Merge branch 'cb/credential-doc-fixes' Minor in-code comments and documentation updates around credential API. * cb/credential-doc-fixes: credential: document protocol updates credential: update gitcredentials documentation credential: correct order of parameters for credential_match credential: update description for credential_from_url_gently	2020-05-13 12:19:19 -07:00
Junio C Hamano	69ae8ffa2a	Merge branch 'tb/bitmap-walk-with-tree-zero-filter' The object walk with object filter "--filter=tree:0" can now take advantage of the pack bitmap when available. * tb/bitmap-walk-with-tree-zero-filter: pack-bitmap: pass object filter to fill-in traversal pack-bitmap.c: support 'tree:0' filtering pack-bitmap.c: make object filtering functions generic list-objects-filter: treat NULL filter_options as "disabled"	2020-05-13 12:19:18 -07:00
Junio C Hamano	896833b268	Merge branch 'tb/shallow-cleanup' Code cleanup. * tb/shallow-cleanup: shallow: use struct 'shallow_lock' for additional safety shallow.h: document '{commit,rollback}_shallow_file' shallow: extract a header file for shallow-related functions commit: make 'commit_graft_pos' non-static	2020-05-13 12:19:18 -07:00
Đoàn Trần Công Danh	27e29f859d	t1509: correct i18n test git-init(1)'s messages is subjected to i18n. They should be tested by test_i18n* family. Fix them. Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-13 09:59:00 -07:00
Emily Shaffer	4a4804edf4	bugreport: include user interactive shell It's possible a user may complain about the way that Git interacts with their interactive shell, e.g. autocompletion or shell prompt. In that case, it's useful for us to know which shell they're using interactively. Signed-off-by: Emily Shaffer <emilyshaffer@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-12 22:02:20 -07:00
Emily Shaffer	39f4919dc5	help: add shell-path to --build-options It may be useful to know which shell Git was built to try to point to, in the event that shell-based Git commands are failing. $SHELL_PATH is set during the build and used to launch the manpage viewer, as well as by git-compat-util.h, and it's used during tests. 'git version --build-options' is encouraged for use in bug reports, so it makes sense to include this information there. Signed-off-by: Emily Shaffer <emilyshaffer@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-12 22:02:17 -07:00
Emily Shaffer	98a1364740	trace2: log progress time and throughput Rather than teaching only one operation, like 'git fetch', how to write down throughput to traces, we can learn about a wide range of user operations that may seem slow by adding tooling to the progress library itself. Operations which display progress are likely to be slow-running and the kind of thing we want to monitor for performance anyways. By showing object counts and data transfer size, we should be able to make some derived measurements to ensure operations are scaling the way we expect. Signed-off-by: Emily Shaffer <emilyshaffer@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-12 15:30:39 -07:00
Ben Keene	2dfdd705ff	git-p4.py: fix --prepare-p4-only error with multiple commits When using git p4 submit with the --prepare-p4-only option, the program should prepare a single p4 changelist and notify the user that more commits are pending and then stop processing. A bug has been introduced by the p4-changelist hook feature that causes the program to continue to try and process all pending changelists at the same time. The function applyCommit returns True when applying the commit was successful and the program should continue. However, when the optional flag --prepare-p4-only is set, the program should stop after the first application. Change the logic in the run method for P4Submit to check for the flag --prepare-p4-only after successfully completing the applyCommit method. Be aware - this change will fix the existing test error in t9807.23 for --prepare-p4-only. However there is insufficent coverage for this flag. If more than 1 commit is pending submission to P4, the method will properly prepare the P4 changelist, however it will still exit the application with an exitcode of 1. The current documentation does not define what the exit code should be in this condition. (See: https://git-scm.com/docs/git-p4#Documentation/git-p4.txt---prepare-p4-only) Signed-off-by: Ben Keene <seraphire@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-12 12:42:32 -07:00
Ismael Luceno	e5894146b0	git-gui: Handle Ctrl + BS/Del in the commit msg - Control+BackSpace: Delete word to the left of the cursor. - Control+Delete : Delete word to the right of the cursor. Originally introduced by BRIEF and Turbo Vision between 1985 and 1992, they were adopted by most CUA-Compliant UIs, including those of: OS/2, Windows, Mac OS, Qt, GTK, Open/Libre Office, Gecko, and GNU Emacs. In both cases Tk already implements the functionality bound to other key combination, so we use that. Graphical examples: Deleting to the left: v------ pointer X_WORD____X ^-----^------ selection Deleting to the right: v--------- pointer X_WORD_X ^--^------ selection Signed-off-by: Ismael Luceno <ismael.luceno@tttech-auto.com> Signed-off-by: Pratyush Yadav <me@yadavpratyush.com>	2020-05-12 18:23:49 +05:30
Derrick Stolee	f32dde8c12	line-log: integrate with changed-path Bloom filters The previous changes to the line-log machinery focused on making the first result appear faster. This was achieved by no longer walking the entire commit history before returning the early results. There is still another way to improve the performance: walk most commits much faster. Let's use the changed-path Bloom filters to reduce time spent computing diffs. Since the line-log computation requires opening blobs and checking the content-diff, there is still a lot of necessary computation that cannot be replaced with changed-path Bloom filters. The part that we can reduce is most effective when checking the history of a file that is deep in several directories and those directories are modified frequently. In this case, the computation to check if a commit is TREESAME to its first parent takes a large fraction of the time. That is ripe for improvement with changed-path Bloom filters. We must ensure that prepare_to_use_bloom_filters() is called in revision.c so that the bloom_filter_settings are loaded into the struct rev_info from the commit-graph. Of course, some cases are still forbidden, but in the line-log case the pathspec is provided in a different way than normal. Since multiple paths and segments could be requested, we compute the struct bloom_key data dynamically during the commit walk. This could likely be improved, but adds code complexity that is not valuable at this time. There are two cases to care about: merge commits and "ordinary" commits. Merge commits have multiple parents, but if we are TREESAME to our first parent in every range, then pass the blame for all ranges to the first parent. Ordinary commits have the same condition, but each is done slightly differently in the process_ranges_[merge\|ordinary]_commit() methods. By checking if the changed-path Bloom filter can guarantee TREESAME, we can avoid that tree-diff cost. If the filter says "probably changed", then we need to run the tree-diff and then the blob-diff if there was a real edit. The Linux kernel repository is a good testing ground for the performance improvements claimed here. There are two different cases to test. The first is the "entire history" case, where we output the entire history to /dev/null to see how long it would take to compute the full line-log history. The second is the "first result" case, where we find how long it takes to show the first value, which is an indicator of how quickly a user would see responses when waiting at a terminal. To test, I selected the paths that were changed most frequently in the top 10,000 commits using this command (stolen from StackOverflow [1]): git log --pretty=format: --name-only -n 10000 \| sort \| \ uniq -c \| sort -rg \| head -10 which results in 121 MAINTAINERS 63 fs/namei.c 60 arch/x86/kvm/cpuid.c 59 fs/io_uring.c 58 arch/x86/kvm/vmx/vmx.c 51 arch/x86/kvm/x86.c 45 arch/x86/kvm/svm.c 42 fs/btrfs/disk-io.c 42 Documentation/scsi/index.rst (along with a bogus first result). It appears that the path arch/x86/kvm/svm.c was renamed, so we ignore that entry. This leaves the following results for the real command time: \| \| Entire History \| First Result \| \| Path \| Before \| After \| Before \| After \| \|------------------------------\|--------\|--------\|--------\|--------\| \| MAINTAINERS \| 4.26 s \| 3.87 s \| 0.41 s \| 0.39 s \| \| fs/namei.c \| 1.99 s \| 0.99 s \| 0.42 s \| 0.21 s \| \| arch/x86/kvm/cpuid.c \| 5.28 s \| 1.12 s \| 0.16 s \| 0.09 s \| \| fs/io_uring.c \| 4.34 s \| 0.99 s \| 0.94 s \| 0.27 s \| \| arch/x86/kvm/vmx/vmx.c \| 5.01 s \| 1.34 s \| 0.21 s \| 0.12 s \| \| arch/x86/kvm/x86.c \| 2.24 s \| 1.18 s \| 0.21 s \| 0.14 s \| \| fs/btrfs/disk-io.c \| 1.82 s \| 1.01 s \| 0.06 s \| 0.05 s \| \| Documentation/scsi/index.rst \| 3.30 s \| 0.89 s \| 1.46 s \| 0.03 s \| It is worth noting that the least speedup comes for the MAINTAINERS file which is * edited frequently, * low in the directory heirarchy, and * quite a large file. All of those points lead to spending more time doing the blob diff and less time doing the tree diff. Still, we see some improvement in that case and significant improvement in other cases. A 2-4x speedup is likely the more typical case as opposed to the small 5% change for that file. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-11 09:33:56 -07:00
SZEDER Gábor	b928e488bd	completion: offer '--(no-)patch' among 'git log' options Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-11 09:33:56 -07:00
SZEDER Gábor	002933f3fe	line-log: try to use generation number-based topo-ordering The previous patch made it possible to perform line-level filtering during history traversal instead of in an expensive preprocessing step, but it still requires some simpler preprocessing steps, notably topo-ordering. However, nowadays we have commit-graphs storing generation numbers, which make it possible to incrementally traverse the history in topological order, without the preparatory limit_list() and sort_in_topological_order() steps; see `b45424181e` (revision.c: generation-based topo-order algorithm, 2018-11-01). This patch combines the two, so we can do both the topo-ordering and the line-level filtering during history traversal, eliminating even those simpler preprocessing steps, and thus further reducing the delay before showing the first commit modifying the given line range. The 'revs->limited' flag plays the central role in this, because, due to limitations of the current implementation, the generation number-based topo-ordering is only enabled when this flag remains unset. Line-level log, however, always sets this flag in setup_revisions() ever since the feature was introduced in `12da1d1f6f` (Implement line-history search (git log -L), 2013-03-28). The reason for setting 'limited' is unclear, though, because the line-level log itself doesn't directly depend on it, and it doesn't affect how the limit_list() function limits the revision range. However, there is an indirect dependency: the line-level log requires topo-ordering, and the "traditional" sort_in_topological_order() requires an already limited commit list since `e6c3505b44` (Make sure we generate the whole commit list before trying to sort it topologically, 2005-07-06). The new, generation numbers-based topo-ordering doesn't require a limited commit list anymore. So don't set 'revs->limited' for line-level log, unless it is really necessary, namely: - The user explicitly requested parent rewriting, because that is still done in the line_log_filter() preprocessing step (see previous patch), which requires sort_in_topological_order() and in turn limit_list() as well. - A commit-graph file is not available or it doesn't yet contain generation numbers. In these cases we had to fall back on sort_in_topological_order() and in turn limit_list(). The existing condition with generation_numbers_enabled() has already ensured that the 'limited' flag is set in these cases; this patch just makes sure that the line-level log sets 'revs->topo_order' before that condition. While the reduced delay before showing the first commit is measurable in git.git, it takes a bigger repository to make it clearly noticable. In both cases below the line ranges were chosen so that they were modified rather close to the starting revisions, so the effect of this change is most noticable. # git.git $ time git --no-pager log -L:read_alternate_refs:sha1-file.c -1 v2.23.0 Before: real 0m0.107s user 0m0.091s sys 0m0.013s After: real 0m0.058s user 0m0.050s sys 0m0.005s # linux.git $ time git --no-pager log \ -L:build_restore_work_registers:arch/mips/mm/tlbex.c -1 v5.2 Before: real 0m1.129s user 0m1.061s sys 0m0.069s After: real 0m0.096s user 0m0.087s sys 0m0.009s Additional testing by Derrick Stolee: Since this patch improves the performance for the first result, I repeated the experiment from the previous patch on the Linux kernel repository, reporting real time here: Command: git log -L 100,200:MAINTAINERS -n 1 >/dev/null Before: 0.71 s After: 0.05 s Now, we have dropped the full topo-order of all ~910,000 commits before reporting the first result. The remaining performance improvements then are: 1. Update the parent-rewriting logic to be incremental similar to how "git log --graph" behaves. 2. Use changed-path Bloom filters to reduce the time spend in the tree-diff to see if the path(s) changed. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-11 09:33:56 -07:00
Derrick Stolee	2f6775f00c	bloom: use num_changes not nr for limit detection As diff_tree_oid() computes a diff, it will terminate early if the total number of changed paths is strictly larger than max_changes. This includes the directories that changed, not just the file paths. However, only the file paths are reflected in the resulting diff queue's "nr" value. Use the "num_changes" from diffopt to check if the diff terminated early. This is incredibly important, as it can result in incorrect filters! For example, the first commit in the Linux kernel repo reports only 471 changes, but since these are nested inside several directories they expand to 513 "real" changes, and in fact the total list of changes is not reported. Thus, the computed filter for this commit is incorrect. Demonstrate the subtle difference by using one fewer file change in the 'get bloom filter for commit with 513 changes' test. Before, this edited 513 files inside "bigDir" which hit this inequality. However, dropping the file count by one demonstrates how the previous inequality was incorrect but the new one is correct. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-11 09:33:56 -07:00
SZEDER Gábor	3cb9d2b6f9	line-log: more responsive, incremental 'git log -L' The current line-level log implementation performs a preprocessing step in prepare_revision_walk(), during which the line_log_filter() function filters and rewrites history to keep only commits modifying the given line range. This preprocessing affects both responsiveness and correctness: - Git doesn't produce any output during this preprocessing step. Checking whether a commit modified the given line range is somewhat expensive, so depending on the size of the given revision range this preprocessing can result in a significant delay before the first commit is shown. - Limiting the number of displayed commits (e.g. 'git log -3 -L...') doesn't limit the amount of work during preprocessing, because that limit is applied during history traversal. Alas, by that point this expensive preprocessing step has already churned through the whole revision range to find all commits modifying the revision range, even though only a few of them need to be shown. - It rewrites parents, with no way to turn it off. Without the user explicitly requesting parent rewriting any parent object ID shown should be that of the immediate parent, just like in case of a pathspec-limited history traversal without parent rewriting. However, after that preprocessing step rewrote history, the subsequent "regular" history traversal (i.e. get_revision() in a loop) only sees commits modifying the given line range. Consequently, it can only show the object ID of the last ancestor that modified the given line range (which might happen to be the immediate parent, but many-many times it isn't). This patch addresses both the correctness and, at least for the common case, the responsiveness issues by integrating line-level log filtering into the regular revision walking machinery: - Make process_ranges_arbitrary_commit(), the static function in 'line-log.c' deciding whether a commit modifies the given line range, public by removing the static keyword and adding the 'line_log_' prefix, so it can be called from other parts of the revision walking machinery. - If the user didn't explicitly ask for parent rewriting (which, I believe, is the most common case): - Call this now-public function during regular history traversal, namely from get_commit_action() to ignore any commits not modifying the given line range. Note that while this check is relatively expensive, it must be performed before other, much cheaper conditions, because the tracked line range must be adjusted even when the commit will end up being ignored by other conditions. - Skip the line_log_filter() call, i.e. the expensive preprocessing step, in prepare_revision_walk(), because, thanks to the above points, the revision walking machinery is now able to filter out commits not modifying the given line range while traversing history. This way the regular history traversal sees the unmodified history, and is therefore able to print the object ids of the immediate parents of the listed commits. The eliminated preprocessing step can greatly reduce the delay before the first commit is shown, see the numbers below. - However, if the user did explicitly ask for parent rewriting via '--parents' or a similar option, then stick with the current implementation for now, i.e. perform that expensive filtering and history rewriting in the preprocessing step just like we did before, leaving the initial delay as long as it was. I tried to integrate line-level log filtering with parent rewriting into the regular history traversal, but, unfortunately, several subtleties resisted... :) Maybe someday we'll figure out how to do that, but until then at least the simple and common (i.e. without parent rewriting) 'git log -L:func:file' commands can benefit from the reduced delay. This change makes the failing 'parent oids without parent rewriting' test in 't4211-line-log.sh' succeed. The reduced delay is most noticable when there's a commit modifying the line range near the tip of a large-ish revision range: # no parent rewriting requested, no commit-graph present $ time git --no-pager log -L:read_alternate_refs:sha1-file.c -1 v2.23.0 Before: real 0m9.570s user 0m9.494s sys 0m0.076s After: real 0m0.718s user 0m0.674s sys 0m0.044s A significant part of the remaining delay is spent reading and parsing commit objects in limit_list(). With the help of the commit-graph we can eliminate most of that reading and parsing overhead, so here are the timing results of the same command as above, but this time using the commit-graph: Before: real 0m8.874s user 0m8.816s sys 0m0.057s After: real 0m0.107s user 0m0.091s sys 0m0.013s The next patch will further reduce the remaining delay. To be clear: this patch doesn't actually optimize the line-level log, but merely moves most of the work from the preprocessing step to the history traversal, so the commits modifying the line range can be shown as soon as they are processed, and the traversal can be terminated as soon as the given number of commits are shown. Consequently, listing the full history of a line range, potentially all the way to the root commit, will take the same time as before (but at least the user might start reading the output earlier). Furthermore, if the most recent commit modifying the line range is far away from the starting revision, then that initial delay will still be significant. Additional testing by Derrick Stolee: In the Linux kernel repository, the MAINTAINERS file was changed ~3,500 times across the ~915,000 commits. In addition to that edit frequency, the file itself is quite large (~18,700 lines). This means that a significant portion of the computation is taken up by computing the patch-diff of the file. This patch improves the real time it takes to output the first result quite a bit: Command: git log -L 100,200:MAINTAINERS -n 1 >/dev/null Before: 3.88 s After: 0.71 s If we drop the "-n 1" in the command, then there is no change in end-to-end process time. This is because the command still needs to walk the entire commit history, which negates the point of this patch. This is expected. As a note for future reference, the ~4.3 seconds in the old code spends ~2.6 seconds computing the patch-diffs, and the rest of the time is spent walking commits and computing diffs for which paths changed at each commit. The changed-path Bloom filters could improve the end-to-end computation time (i.e. no "-n 1" in the command). Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-11 09:33:56 -07:00
Derrick Stolee	65c1a28bb6	bloom: de-duplicate directory entries When computing a changed-path Bloom filter, we need to take the files that changed from the diff computation and extract the parent directories. That way, a directory pathspec such as "Documentation" could match commits that change "Documentation/git.txt". However, the current code does a poor job of this process. The paths are added to a hashmap, but we do not check if an entry already exists with that path. This can create many duplicate entries and cause the filter to have a much larger length than it should. This means that the filter is more sparse than intended, which helps the false positive rate, but wastes a lot of space. Properly use hashmap_get() before hashmap_add(). Also be sure to include a comparison function so these can be matched correctly. This has an effect on a test in t0095-bloom.sh. This makes sense, there are ten changes inside "smallDir" so the total number of paths in the filter should be 11. This would result in 11 * 10 bits required, and with 8 bits per byte, this results in 14 bytes. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-11 09:33:56 -07:00
SZEDER Gábor	48da94ba37	t4211-line-log: add tests for parent oids None of the tests in 't4211-line-log.sh' really check which parent object IDs are shown in the output, either implicitly as part of "Merge: ..." lines [1] or explicitly via the '%p' or '%P' format specifiers in a custom pretty format. Add two tests to 't4211-line-log.sh' to check which parent object IDs are shown, one without and one with explicitly requested parent rewriting, IOW without and with the '--parents' option. The test without '--parents' is marked as failing, because without that option parent rewriting should not be performed, and thus the parent object ID should be that of the immediate parent, just like in case of a pathspec-limited history traversal without parent rewriting. The current line-level log implementation, however, performs parent rewriting unconditionally and without a possibility to turn it off, and, consequently, it shows the object ID of the most recent ancestor that modified the given line range. In both of these new tests we only really care about the object IDs of the listed commits and their parents, but not the diffs of the line ranges; the diffs have already been thoroughly checked in the previous tests. [1] While one of the tests ('-M -L ':f:b.c' parallel-change') does list a merge commit, both of its parents happen to modify the given line range and are listed as well, so the implications of parent rewriting remained hidden and untested. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-11 09:33:56 -07:00
Derrick Stolee	88093289cd	Documentation: changed-path Bloom filters use byte words In Documentation/technical/commit-graph-format.txt, the definition of the BIDX chunk specifies the length is a number of 8-byte words. During development we discovered that using 8-byte words in the Murmur3 hash algorithm causes issues with big-endian versus little- endian machines. Thus, the hash algorithm was adapted to work on a byte-by-byte basis. However, this caused a change in the definition of a "word" in bloom.h. Now, a "word" is a single byte, which allows filters to be as small as two bytes. These length-two filters are demonstrated in t0095-bloom.sh, and a larger filter of length 25 is demonstrated as well. The original point of using 8-byte words was for alignment reasons. It also presented opportunities for extremely sparse Bloom filters when there were a small number of changes at a commit, creating a very low false-positive rate. However, modifying the format at this point is unlikely to be a valuable exercise. Also, this use of single-byte granularity does present opportunities to save space. It is unclear if 8-byte alignment of the filters would present any meaningful performance benefits. Modify the format document to reflect reality. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-11 09:33:56 -07:00
SZEDER Gábor	d5546726fb	line-log: remove unused fields from 'struct line_log_data' Remove the unused fields 'status', 'arg_alloc', 'arg_nr' and 'args' from 'struct line_log_data'. They were already part of the struct when it was introduced in commit `12da1d1f6` (Implement line-history search (git log -L), 2013-03-28), but as far as I can tell none of them have ever been actually used. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-11 09:33:56 -07:00
Derrick Stolee	891c17c954	bloom: parse commit before computing filters When computing changed-path Bloom filters for a commit, we need to know if the commit has a parent or not. If the commit is not parsed, then its parent pointer will be NULL. As far as I can tell, the only opportunity to reach this code without parsing the commit is inside "test-tool bloom get_filter_for_commit" but it is best to be safe. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-11 09:33:56 -07:00
René Scharfe	9068cfb20f	fsck: report non-consecutive duplicate names in trees Tree entries are sorted in path order, meaning that directory names get a slash ('/') appended implicitly. Git fsck checks if trees contains consecutive duplicates, but due to that ordering there can be non-consecutive duplicates as well if one of them is a directory and the other one isn't. Such a tree cannot be fully checked out. Find these duplicates by recording candidate file names on a stack and check candidate directory names against that stack to find matches. Suggested-by: Brandon Williams <bwilliamseng@gmail.com> Original-test-by: Brandon Williams <bwilliamseng@gmail.com> Signed-off-by: René Scharfe <l.s.r@web.de> Reviewed-by: Luke Diamand <luke@diamand.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-11 08:40:28 -07:00
Andrew Oakley	82e46d6b83	git-p4: recover from inconsistent perforce history Perforce allows you commit files and directories with the same name, so you could have files //depot/foo and //depot/foo/bar both checked in. A p4 sync of a repository in this state fails. Deleting one of the files recovers the repository. When this happens we want git-p4 to recover in the same way as perforce. Note that Perforce has this change in their 2017.1 version: Bugs fixed in 2017.1 #1489051 (Job #2170) ** Submitting a file with the same name as an existing depot directory path (or vice versa) will now be rejected. so people hopefully will not creating damaged Perforce repos anymore, but "git p4" needs to be able to interact with already corrupt ones. Signed-off-by: Andrew Oakley <andrew@adoakley.name> Reviewed-by: Luke Diamand <luke@diamand.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-10 09:58:50 -07:00
Derrick Stolee	3ce4ca0a56	multi-pack-index: respect repack.packKeptObjects=false When selecting a batch of pack-files to repack in the "git multi-pack-index repack" command, Git should respect the repack.packKeptObjects config option. When false, this option says that the pack-files with an associated ".keep" file should not be repacked. This config value is "false" by default. There are two cases for selecting a batch of objects. The first is the case where the input batch-size is zero, which specifies "repack everything". The second is with a non-zero batch size, which selects pack-files using a greedy selection criteria. Both of these cases are updated and tested. Reported-by: Son Luong Ngoc <sluongng@gmail.com> Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-10 09:50:55 -07:00
Son Luong Ngoc	e11d86de13	midx: teach "git multi-pack-index repack" honor "git repack" configurations When the "repack" subcommand of "git multi-pack-index" command creates new packfile(s), it does not call the "git repack" command but instead directly calls the "git pack-objects" command, and the configuration variables meant for the "git repack" command, like "repack.usedaeltabaseoffset", are ignored. Check the configuration variables used by "git repack" ourselves in "git multi-index-pack" and pass the corresponding options to underlying "git pack-objects". Note that `repack.writeBitmaps` configuration is ignored, as the pack bitmap facility is useful only with a single packfile. Signed-off-by: Son Luong Ngoc <sluongng@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-10 09:50:53 -07:00
Jordi Mas	db7bfba9ad	l10n: Update Catalan translation Signed-off-by: Jordi Mas <jmas@softcatala.org>	2020-05-10 10:52:58 +02:00
Johannes Schindelin	02471e7e20	rebase --autosquash: fix a potential segfault When rearranging the todo list so that the fixups/squashes are reordered just after the commits they intend to fix up, we use two arrays to maintain that list: `next` and `tail`. The idea is that `next[i]`, if set to a non-negative value, contains the index of the item that should be rearranged just after the `i`th item. To avoid having to walk the entire `next` chain when appending another fixup/squash, we also store the end of the `next` chain in `tail[i]`. The logic we currently use to update these array items is based on the assumption that given a fixup/squash item at index `i`, we just found the index `i2` indicating the first item in that fixup chain. However, as reported by Paul Ganssle, that need not be true: the special form `fixup! <commit-hash>` is allowed to point to _another_ fixup commit in the middle of the fixup chain. Example: * 0192a To fixup * 02f12 fixup! To fixup * 03763 fixup! To fixup * 04ecb fixup! 02f12 Note how the fourth commit targets the second commit, which is already a fixup that targets the first commit. Previously, we would update `next` and `tail` under our assumption that every `fixup!` commit would find the start of the `fixup!`/`squash!` chain. This would lead to a segmentation fault because we would actually end up with a `next[i]` pointing to a `fixup!` but the corresponding `tail[i]` pointing nowhere, which would the lead to a segmentation fault. Let's fix this by _inserting_, rather than _appending_, the item. In other words, if we make a given line successor of another line, we do not simply forget any previously set successor of the latter, but make it a successor of the former. In the above example, at the point when we insert 04ecb just after 02f12, 03763 would already be recorded as a successor of 04ecb, and we now "squeeze in" 04ecb. To complete the idea, we now no longer assume that `next[i]` pointing to a line means that `last[i]` points to a line, too. Instead, we extend the concept of `last` to cover also partial `fixup!`/`squash!` chains, i.e. chains starting in the middle of a larger such chain. In the above example, after processing all lines, `last[0]` (corresponding to 0192a) would point to 03763, which indeed is the end of the overall `fixup!` chain, and `last[1]` (corresponding to 02f12) would point to 04ecb (which is the last `fixup!` targeting 02f12, but it has 03763 as successor, i.e. it is not the end of overall `fixup!` chain). Reported-by: Paul Ganssle <paul@ganssle.io> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-09 13:59:55 -07:00
Junio C Hamano	b994622632	The eighth batch Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-08 14:25:12 -07:00
Junio C Hamano	282ce92448	Merge branch 'cb/test-bash-lineno-fix' Recent change to show files and line numbers of a breakage during test (only available when running the tests with bash) were hurting other shells with syntax errors, which has been corrected. * cb/test-bash-lineno-fix: t/test_lib: avoid naked bash arrays in file_lineno	2020-05-08 14:25:12 -07:00
Junio C Hamano	41eae3eaa8	Merge branch 'cb/t0000-use-the-configured-shell' The basic test did not honor $TEST_SHELL_PATH setting, which has been corrected. * cb/t0000-use-the-configured-shell: t/t0000-basic: make sure subtests also use TEST_SHELL_PATH	2020-05-08 14:25:12 -07:00
Junio C Hamano	37b48f0efc	Merge branch 'bc/doc-credential-helper-value' Doc update. * bc/doc-credential-helper-value: docs: document credential.helper allowed values	2020-05-08 14:25:11 -07:00
Junio C Hamano	6381c301ff	Merge branch 'dl/doc-stash-remove-mention-of-reflog' Doc update. * dl/doc-stash-remove-mention-of-reflog: Doc: reference the "stash list" in autostash docs	2020-05-08 14:25:09 -07:00
Junio C Hamano	b9bcd76a9a	Merge branch 'cb/avoid-colliding-with-netbsd-hmac' The <stdlib.h> header on NetBSD brings in its own definition of hmac() function (eek), which conflicts with our own and unrelated function with the same name. Our function has been renamed to work around the issue. * cb/avoid-colliding-with-netbsd-hmac: builtin/receive-pack: avoid generic function name hmac()	2020-05-08 14:25:09 -07:00
Junio C Hamano	4c2941a5fa	Merge branch 'es/restore-staged-from-head-by-default' "git restore --staged --worktree" now defaults to take the contents out of "HEAD", instead of erring out. * es/restore-staged-from-head-by-default: restore: default to HEAD when combining --staged and --worktree	2020-05-08 14:25:08 -07:00
Junio C Hamano	6d4bf5813c	Merge branch 'jk/arith-expansion-coding-guidelines' The coding guideline for shell scripts instructed to refer to a variable with dollar-sign inside arithmetic expansion to work around a bug in old versions of dash, which is a thing of the past. Now we are not forbidden from writing $((var+1)). * jk/arith-expansion-coding-guidelines: CodingGuidelines: drop arithmetic expansion advice to use "$x"	2020-05-08 14:25:07 -07:00
Junio C Hamano	e9acbd6836	Merge branch 'ds/sparse-allow-empty-working-tree' The sparse-checkout patterns have been forbidden from excluding all paths, leaving an empty working tree, for a long time. This limitation has been lifted. * ds/sparse-allow-empty-working-tree: sparse-checkout: stop blocking empty workdirs	2020-05-08 14:25:06 -07:00
Junio C Hamano	95875e0356	Merge branch 'jt/commit-graph-plug-memleak' Fix a leak noticed by fuzzer. * jt/commit-graph-plug-memleak: commit-graph: avoid memory leaks	2020-05-08 14:25:05 -07:00
Junio C Hamano	6de1630898	Merge branch 'jk/for-each-ref-multi-key-sort-fix' "git branch" and other "for-each-ref" variants accepted multiple --sort=<key> options in the increasing order of precedence, but it had a few breakages around "--ignore-case" handling, and tie-breaking with the refname, which have been fixed. * jk/for-each-ref-multi-key-sort-fix: ref-filter: apply fallback refname sort only after all user sorts ref-filter: apply --ignore-case to all sorting keys	2020-05-08 14:25:04 -07:00
Junio C Hamano	1260f819aa	Merge branch 'jk/credential-sample-update' The samples in the credential documentation has been updated to make it clear that we depict what would appear in the .git/config file, by adding appropriate quotes as needed.. * jk/credential-sample-update: gitcredentials(7): make shell-snippet example more realistic gitcredentials(7): clarify quoting of helper examples	2020-05-08 14:25:03 -07:00
Junio C Hamano	dc4c3933b1	Merge branch 'ah/userdiff-markdown' The userdiff patterns for Markdown documents have been added. * ah/userdiff-markdown: userdiff: support Markdown	2020-05-08 14:25:01 -07:00
Junio C Hamano	933fdf8784	Merge branch 'cb/credential-store-ignore-bogus-lines' With the recent tightening of the code that is used to parse various parts of a URL for use in the credential subsystem, a hand-edited credential-store file causes the credential helper to die, which is a bit too harsh to the users. Demote the error behaviour to just ignore and keep using well-formed lines instead. * cb/credential-store-ignore-bogus-lines: credential-store: ignore bogus lines from store file credential-store: document the file format a bit more	2020-05-08 14:25:01 -07:00
Junio C Hamano	f4675f3d47	Merge branch 'dl/switch-c-option-in-error-message' In error messages that "git switch" mentions its option to create a new branch, "-b/-B" options were shown, where "-c/-C" options should be, which has been corrected. * dl/switch-c-option-in-error-message: switch: fix errors and comments related to -c and -C	2020-05-08 14:25:00 -07:00
Junio C Hamano	5c7bb0146e	CodingGuidelines: do not ==/!= compare with 0 or '\0' or NULL Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-08 11:25:12 -07:00
Christian Couder	08450ef791	upload-pack: clear filter_options for each v2 fetch command Because of the request/response model of protocol v2, the upload_pack_v2() function is sometimes called twice in the same process, while 'struct list_objects_filter_options filter_options' was declared as static at the beginning of 'upload-pack.c'. This made the check in list_objects_filter_die_if_populated(), which is called by process_args(), fail the second time upload_pack_v2() is called, as filter_options had already been populated the first time. To fix that, filter_options is not static any more. It's now owned directly by upload_pack(). It's now also part of 'struct upload_pack_data', so that it's owned indirectly by upload_pack_v2(). In the long term, the goal is to also have upload_pack() use 'struct upload_pack_data', so adding filter_options to this struct makes more sense than to have it owned directly by upload_pack_v2(). This fixes the first of the 2 bugs documented by `d0badf8797` (partial-clone: demonstrate bugs in partial fetch, 2020-02-21). Helped-by: Derrick Stolee <dstolee@microsoft.com> Helped-by: Jeff King <peff@peff.net> Helped-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-08 11:07:27 -07:00
Derrick Stolee	0eeb3be4c4	unpack-trees: avoid array out-of-bounds error The loop in warn_conflicted_path() that checks for the count of entries with the same path uses "i+count" for the array entry. However, the loop only verifies that the value of count is below the array size. Fix this by adding i to the condition. I hit this condition during a test of the in-tree sparse-checkout feature, so it is exercised by the end of the series. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> [jc: readability fix] Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-08 11:01:27 -07:00
Christopher Warrington	6c722cbe5a	bisect: allow CRLF line endings in "git bisect replay" input We advertise that the bisect log can be corrected in your editor before being fed to "git bisect replay", but some editors may turn the line endings to CRLF. Update the parser of the input lines so that the CR at the end of the line gets ignored. Were anyone to intentionally be using terms/revs with embedded CRs, replaying such bisects will no longer work with this change. I suspect that this is incredibly rare. Signed-off-by: Christopher Warrington <chwarr@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-08 10:54:27 -07:00
Shourya Shukla	6417cf9c21	submodule: port subcommand 'set-url' from shell to C Convert submodule subcommand 'set-url' to a builtin. Port 'set-url' to 'submodule--helper.c' and call the latter via 'git-submodule.sh'. Signed-off-by: Shourya Shukla <shouryashukla.oo@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-08 09:17:55 -07:00
Emily Shaffer	788a776069	bugreport: collect list of populated hooks Occasionally a failure a user is seeing may be related to a specific hook which is being run, perhaps without the user realizing. While the contents of hooks can be sensitive - containing user data or process information specific to the user's organization - simply knowing that a hook is being run at a certain stage can help us to understand whether something is going wrong. Without a definitive list of hook names within the code, we compile our own list from the documentation. This is likely prone to bitrot, but designing a single source of truth for acceptable hooks is too much overhead for this small change to the bugreport tool. Signed-off-by: Emily Shaffer <emilyshaffer@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-07 18:25:04 -07:00
Đoàn Trần Công Danh	066b70ae97	bloom: fix `make sparse` warning * We need a `final_new_line` to make our source code as text file, per POSIX and C specification. * `bloom_filters` should be limited to interal linkage only Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-07 17:08:21 -07:00
Carlo Marcelo Arenas Belón	1aed817f99	credential: document protocol updates Document protocol changes after CVE-2020-11008, including the removal of references to the override of attributes which is no longer recommended after CVE-2020-5260 and that might be removed in the future. While at it do some improvements for clarity and consistency. Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-07 14:01:56 -07:00

... 2 3 4 5 6 ...

59569 Commits