mirrors/git

mirror of https://github.com/git/git.git synced 2024-12-11 19:03:50 +08:00

Author	SHA1	Message	Date
Junio C Hamano	1eb4136ac2	diff: --{rotate,skip}-to=<path> In the implementation of "git difftool", there is a case where the user wants to start viewing the diffs at a specific path and continue on to the rest, optionally wrapping around to the beginning. Since it is somewhat cumbersome to implement such a feature as a post-processing step of "git diff" output, let's support it internally with two new options. - "git diff --rotate-to=C", when the resulting patch would show paths A B C D E without the option, would "rotate" the paths to shows patch to C D E A B instead. It is an error when there is no patch for C is shown. - "git diff --skip-to=C" would instead "skip" the paths before C, and shows patch to C D E. Again, it is an error when there is no patch for C is shown. - "git log [-p]" also accepts these two options, but it is not an error if there is no change to the specified path. Instead, the set of output paths are rotated or skipped to the specified path or the first path that sorts after the specified path. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-16 09:30:42 -08:00
Jonathan Tan	95acf11a3d	diff: restrict when prefetching occurs Commit `7fbbcb21b1` ("diff: batch fetching of missing blobs", 2019-04-08) optimized "diff" by prefetching blobs in a partial clone, but there are some cases wherein blobs do not need to be prefetched. In these cases, any command that uses the diff machinery will unnecessarily fetch blobs. diffcore_std() may read blobs when it calls the following functions: (1) diffcore_skip_stat_unmatch() (controlled by the config variable diff.autorefreshindex) (2) diffcore_break() and diffcore_merge_broken() (for break-rewrite detection) (3) diffcore_rename() (for rename detection) (4) diffcore_pickaxe() (for detecting addition/deletion of specified string) Instead of always prefetching blobs, teach diffcore_skip_stat_unmatch(), diffcore_break(), and diffcore_rename() to prefetch blobs upon the first read of a missing object. This covers (1), (2), and (3): to cover the rest, teach diffcore_std() to prefetch if the output type is one that includes blob data (and hence blob data will be required later anyway), or if it knows that (4) will be run. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-07 16:09:29 -07:00
Jonathan Tan	1c37e86ab2	diff: make diff_populate_filespec_options struct The behavior of diff_populate_filespec() currently can be customized through a bitflag, but a subsequent patch requires it to support a non-boolean option. Replace the bitflag with an options struct. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-07 16:09:29 -07:00
Heba Waly	13c4d7eb22	diff: move doc to diff.h and diffcore.h Move the documentation from Documentation/technical/api-diff.txt to both diff.h and diffcore.h as it's easier for the developers to find the usage information beside the code instead of looking for it in another doc file. Also documentation/technical/api-diff.txt is removed because the information it has is now redundant and it'll be hard to keep it up to date and synchronized with the documentation in the header files. There are three members documented in the doc file that weren't found in the header files, assuming the doc wasn't up to date and the members no longer exist: touched_flags, COLOR_DIFF_WORDS and QUIET. Signed-off-by: Heba Waly <heba.waly@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-11-18 15:21:28 +09:00
Nguyễn Thái Ngọc Duy	b78ea5fc35	diff.c: reduce implicit dependency on the_index diff and textconv code has so widespread use that it's hard to simply update their api and all call sites at once because it would result in a big patch. For now reduce the_index references to two places: diff_setup() and fill_textconv(). Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-21 09:48:10 -07:00
Junio C Hamano	5ade034464	Merge branch 'en/incl-forward-decl' Code hygiene improvement for the header files. * en/incl-forward-decl: Remove forward declaration of an enum compat/precompose_utf8.h: use more common include guard style urlmatch.h: fix include guard Move definition of enum branch_track from cache.h to branch.h alloc: make allocate_alloc_state and clear_alloc_state more consistent Add missing includes and forward declarations	2018-08-20 12:41:32 -07:00
Elijah Newren	ef3ca95475	Add missing includes and forward declarations I looped over the toplevel header files, creating a temporary two-line C program for each consisting of #include "git-compat-util.h" #include $HEADER This patch is the result of manually fixing errors in compiling those tiny programs. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-15 11:52:09 -07:00
Nguyễn Thái Ngọc Duy	78d70d9b10	diffcore.h: drop extern from function declaration Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-03 10:42:55 -07:00
Brandon Williams	f9704c2d82	diff: convert fill_filespec to struct object_id Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-02 09:36:07 +09:00
Junio C Hamano	6d40812e4b	Merge branch 'tk/diffcore-delta-remove-unused' Code cleanup. * tk/diffcore-delta-remove-unused: diffcore-delta: remove unused parameter to diffcore_count_changes()	2016-11-17 13:45:22 -08:00
Tobias Klauser	974e0044d6	diffcore-delta: remove unused parameter to diffcore_count_changes() The delta_limit parameter to diffcore_count_changes() has been unused since commit `ba23bbc8e` ("diffcore-delta: make change counter to byte oriented again.", 2006-03-04). Remove the parameter and adjust all callers. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-11-14 09:24:04 -08:00
brian m. carlson	41c9560ee5	diff: rename struct diff_filespec's sha1_valid member Now that this struct's sha1 member is called "oid", update the comment and the sha1_valid member to be called "oid_valid" instead. The following Coccinelle semantic patch was used to implement this, followed by the transformations in object_id.cocci: @@ struct diff_filespec o; @@ - o.sha1_valid + o.oid_valid @@ struct diff_filespec *p; @@ - p->sha1_valid + p->oid_valid Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-28 11:39:02 -07:00
brian m. carlson	a0d12c4433	diff: convert struct diff_filespec to struct object_id Convert struct diff_filespec's sha1 member to use a struct object_id called "oid" instead. The following Coccinelle semantic patch was used to implement this, followed by the transformations in object_id.cocci: @@ struct diff_filespec o; @@ - o.sha1 + o.oid.hash @@ struct diff_filespec *p; @@ - p->sha1 + p->oid.hash Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-06-28 11:39:02 -07:00
Nguyễn Thái Ngọc Duy	6bf3b81348	diff --stat: mark any file larger than core.bigfilethreshold binary Too large files may lead to failure to allocate memory. If it happens here, it could impact quite a few commands that involve diff. Moreover, too large files are inefficient to compare anyway (and most likely non-text), so mark them binary and skip looking at their content. Noticed-by: Dale R. Worley <worley@alum.mit.edu> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-08-18 10:16:45 -07:00
Nguyễn Thái Ngọc Duy	8e5dd3d654	diff.c: allow to pass more flags to diff_populate_filespec Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-08-18 10:16:35 -07:00
Junio C Hamano	9b347673a1	Merge branch 'jk/diff-filespec-cleanup' Portability fix to a topic already in v1.9 * jk/diff-filespec-cleanup: diffcore.h: be explicit about the signedness of is_binary	2014-03-18 13:48:50 -07:00
Junio C Hamano	6376463c37	Merge branch 'ks/combine-diff' Teach combine-diff to honour the path-output-order imposed by diffcore-order, and optimize how matching paths are found in the N-way diffs made with parents. * ks/combine-diff: tests: add checking that combine-diff emits only correct paths combine-diff: simplify intersect_paths() further combine-diff: combine_diff_path.len is not needed anymore combine-diff: optimize combine_diff_path sets intersection diff test: add tests for combine-diff with orderfile diffcore-order: export generic ordering interface	2014-03-05 15:06:26 -08:00
Junio C Hamano	1e745453fe	Merge branch 'nd/diff-quiet-stat-dirty' "git diff --quiet -- pathspec1 pathspec2" sometimes did not return correct status value. * nd/diff-quiet-stat-dirty: diff: do not quit early on stat-dirty files diff.c: move diffcore_skip_stat_unmatch core logic out for reuse later	2014-02-27 14:01:21 -08:00
Nguyễn Thái Ngọc Duy	f34b205f6c	diff: do not quit early on stat-dirty files When QUICK is set (i.e. with --quiet) we try to do as little work as possible, stopping after seeing the first change. stat-dirty is considered a "change" but it may turn out not, if no actual content is changed. The actual content test is performed too late in the process and the shortcut may be taken prematurely, leading to incorrect return code. Assume we do "git diff --quiet". If we have a stat-dirty file "a" and a really dirty file "b". We break the loop in run_diff_files() and stop after "a" because we have got a "change". Later in diffcore_skip_stat_unmatch() we find out "a" is actually not changed. But there's nothing else in the diff queue, we incorrectly declare "no change", ignoring the fact that "b" is changed. This also happens to "git diff --quiet HEAD" when it hits diff_can_quit_early() in oneway_diff(). This patch does the content test earlier in order to keep going if "a" is unchanged. The test result is cached so that when diffcore_skip_stat_unmatch() is done in the end, we spend no cycles on re-testing "a". Reported-by: IWAMOTO Toshihiro <iwamoto@valinux.co.jp> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-02-24 14:50:14 -08:00
Kirill Smelkov	1df4320fa2	diffcore-order: export generic ordering interface diffcore_order() interface only accepts a queue of `struct diff_filepair`. In the next patches, we'll want to order `struct combine_diff_path` by path, so let's first rework diffcore-order to also provide generic low-level interface for ordering arbitrary objects, provided they have path accessors. The new interface is: - `struct obj_order` for describing objects to ordering routine, and - order_objects() for actually doing the ordering work. Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-02-24 14:44:57 -08:00
Richard Lowe	7d0a9a752b	diffcore.h: be explicit about the signedness of is_binary Bitfields need to specify their signedness explicitly or the compiler is free to default as it sees fit. With compilers that default 'unsigned' (SUNWspro 12 seems to do this) the tri-state nature of is_binary vanishes and all files are treated as binary. Signed-off-by: Richard Lowe <richlowe@richlowe.net> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-02-24 09:52:44 -08:00
Jeff King	cbfe47b67f	diff_filespec: use only 2 bits for is_binary flag The is_binary flag needs only three values: -1, 0, and 1. However, we use a whole 32-bit int for it on most systems (both 32- and 64- bit). Instead, we can mark it to use only 2 bits. On 32-bit systems, this lets it end up as part of the bitfield above (saving 4 bytes). On 64-bit systems, we don't see any change (because the savings end up as padding), but it does leave room for another "free" 32-bit value to be added later. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-01-17 10:50:14 -08:00
Jeff King	b38f70a82b	diff_filespec: reorder is_binary field The middle of the diff_filespec struct contains a mixture of ints, shorts, and bit-fields, followed by a pointer. On an x86-64 system with an LP64 or LLP64 data model (i.e., most of them), the integers and flags end up being padded out by 41 bits to put the pointer at an 8-byte boundary. After the pointer, we have the "int is_binary" field, which is only 32 bits. We end up wasting another 32 bits to pad the struct size up to a multiple of 64 bits. We can move the is_binary field before the pointer, which lets the compiler store it where we used to have padding. This shrinks the top padding to only 9 bits (from the bit-fields), and eliminates the bottom padding entirely, dropping the struct size from 88 to 80 bytes. On a 32-bit system, there is no benefit, but nor should there be any harm (we only need 4-byte alignment there, so we were already using only 9 bits of padding). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-01-17 10:50:13 -08:00
Jeff King	428d52a5a5	diff_filespec: drop xfrm_flags field The only mention of this field in the code is by some debugging code which prints it out (and it will always be zero, since we never touch it otherwise). It was obsoleted very early on by `25d5ea4` ([PATCH] Redo rename/copy detection logic., 2005-05-24). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-01-17 10:50:11 -08:00
Jeff King	5b711b207f	diff_filespec: drop funcname_pattern_ident field This struct field was obsoleted by `be58e70` (diff: unify external diff and funcname parsing code, 2008-10-05), but we forgot to remove it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-01-17 10:50:10 -08:00
Jeff King	b837f5d68d	diff_filespec: reorder dirty_submodule macro definitions diff_filespec has a 2-bit "dirty_submodule" field and defines two flags as macros. Originally these were right next to each other, but a new field was accidentally added in between in commit `4682d85`. This patch puts the field and its flags back together. Using an enum like: enum { DIRTY_SUBMODULE_UNTRACKED = 1, DIRTY_SUBMODULE_MODIFIED = 2 } dirty_submodule; would be more obvious, but it bloats the structure. Limiting the enum size like: } dirty_submodule : 2; might work, but it is not portable. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2014-01-17 10:50:03 -08:00
Junio C Hamano	3b753148b6	Merge branch 'jk/maint-null-in-trees' We do not want a link to 0{40} object stored anywhere in our objects. * jk/maint-null-in-trees: fsck: detect null sha1 in tree entries do not write null sha1s to on-disk index diff: do not use null sha1 as a sentinel value	2012-08-27 11:54:28 -07:00
Jeff King	e54501004a	diff: do not use null sha1 as a sentinel value The diff code represents paths using the diff_filespec struct. This struct has a sha1 to represent the sha1 of the content at that path, as well as a sha1_valid member which indicates whether its sha1 field is actually useful. If sha1_valid is not true, then the filespec represents a working tree file (e.g., for the no-index case, or for when the index is not up-to-date). The diff_filespec is only used internally, though. At the interfaces to the diff subsystem, callers feed the sha1 directly, and we create a diff_filespec from it. It's at that point that we look at the sha1 and decide whether it is valid or not; callers may pass the null sha1 as a sentinel value to indicate that it is not. We should not typically see the null sha1 coming from any other source (e.g., in the index itself, or from a tree). However, a corrupt tree might have a null sha1, which would cause "diff --patch" to accidentally diff the working tree version of a file instead of treating it as a blob. This patch extends the edges of the diff interface to accept a "sha1_valid" flag whenever we accept a sha1, and to use that flag when creating a filespec. In some cases, this means passing the flag through several layers, making the code change larger than would be desirable. One alternative would be to simply die() upon seeing corrupted trees with null sha1s. However, this fix more directly addresses the problem (while bogus sha1s in a tree are probably a bad thing, it is really the sentinel confusion sending us down the wrong code path that is what makes it devastating). And it means that git is more capable of examining and debugging these corrupted trees. For example, you can still "diff --raw" such a tree to find out when the bogus entry was introduced; you just cannot do a "--patch" diff (just as you could not with any other corrupted tree, as we do not have any content to diff). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-07-29 15:04:32 -07:00
Junio C Hamano	d7afe648dc	Merge branch 'jc/refactor-diff-stdin' Due to the way "git diff --no-index" is bolted onto by touching the low level code that is shared with the rest of the "git diff" code, even though it has to work in a very different way, any comparison that involves a file "-" at the root level incorrectly tried to read from the standard input. This cleans up the no-index codepath further to remove code that reads from the standard input from the core side, which is never necessary when git is running its usual diff operation. * jc/refactor-diff-stdin: diff-index.c: "git diff" has no need to read blob from the standard input diff-index.c: unify handling of command line paths diff-index.c: do not pretend paths are pathspecs	2012-07-13 15:38:05 -07:00
Junio C Hamano	4682d8521c	diff-index.c: "git diff" has no need to read blob from the standard input Only "diff --no-index -" does. Bolting the logic into the low-level function diff_populate_filespec() was a layering violation from day one. Move populate_from_stdin() function out of the generic diff.c to its only user, diff-index.c. Also make sure "-" from the command line stays a special token "read from the standard input", even if we later decide to sanitize the result from prefix_filename() function in a few obvious ways, e.g. removing unnecessary "./" prefix, duplicated slashes "//" in the middle, etc. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2012-06-28 16:18:19 -07:00
Junio C Hamano	25e5e2bf85	combine-diff: support format_callback This teaches combine-diff machinery to feed a combined merge to a callback function when DIFF_FORMAT_CALLBACK is specified. So far, format callback functions are not used for anything but 2-way diffs. A callback is given a diff_queue_struct, which is an array of diff_filepair. As its name suggests, a diff_filepair is a _pair_ of diff_filespec that represents a single preimage and a single postimage. Since "diff -c" is to compare N parents with a single merge result and filter out any paths whose result match one (or more) of the parent(s), its output has to be able to represent N preimages and 1 postimage. For this reason, a callback function that inspects a diff_filepair that results from this new infrastructure can and is expected to view the preimage side (i.e. pair->one) as an array of diff_filespec. Each element in the array, except for the last one, is marked with "has_more_entries" bit, so that the same callback function can be used for 2-way diffs and combined diffs. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2011-08-20 23:03:06 -07:00
Junio C Hamano	382f013bc4	diff: pass the entire diff-options to diffcore_pickaxe() That would make it easier to give enhanced feature to the pickaxe transformation. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-08-31 14:30:28 -07:00
Junio C Hamano	44c48a909a	diff --follow: do call diffcore_std() as necessary Usually, diff frontends populate the output queue with filepairs without any rename information and call diffcore_std() to sort the renames out. When --follow is in effect, however, diff-tree family of frontend has a hack that looks like this: diff-tree frontend -> diff_tree_sha1() . populate diff_queued_diff . if --follow is in effect and there is only one change that creates the target path, then -> try_to_follow_renames() -> diff_tree_sha1() with no pathspec but with -C -> diffcore_std() to find renames . if rename is found, tweak diff_queued_diff and put a single filepair that records the found rename there -> diffcore_std() . tweak elements on diff_queued_diff by - rename detection - path ordering - pickaxe filtering We need to skip parts of the second call to diffcore_std() that is related to rename detection, and do so only when try_to_follow_renames() did find a rename. Earlier `1da6175` (Make diffcore_std only can run once before a diff_flush, 2010-05-06) tried to deal with this issue incorrectly; it unconditionally disabled any second call to diffcore_std(). This hopefully fixes the breakage. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-08-13 12:17:45 -07:00
Jonathan Nieder	987460611a	Standardize do { ... } while (0) style Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-08-12 15:44:51 -07:00
Matthieu Moy	cf958afd83	Document -B<n>[/<m>], -M<n> and -C<n> variants of -B, -M and -C These options take an optional argument, but this optional argument was not documented. Original patch by Matthieu Moy, but documentation for -B mostly copied from the explanations of Junio C Hamano. While we're there, fix a typo in a comment in diffcore.h. Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-08-09 09:16:11 -07:00
Bo Yang	1da6175d43	Make diffcore_std only can run once before a diff_flush When file renames/copies detection is turned on, the second diffcore_std will degrade a 'C' pair to a 'R' pair. And this may happen when we run 'git log --follow' with hard copies finding. That is, the try_to_follow_renames() will run diffcore_std to find the copies, and then 'git log' will issue another diffcore_std, which will reduce 'src->rename_used' and recognize this copy as a rename. This is not what we want. So, I think we really don't need to run diffcore_std more than one time. Signed-off-by: Bo Yang <struggleyb.nku@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-05-07 09:34:28 -07:00
Bo Yang	9ca5df9061	Add a macro DIFF_QUEUE_CLEAR. Refactor the diff_queue_struct code, this macro help to reset the structure. Signed-off-by: Bo Yang <struggleyb.nku@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-05-07 09:34:27 -07:00
Jens Lehmann	c7e1a73641	git diff --submodule: Show detailed dirty status of submodules When encountering a dirty submodule while doing "git diff --submodule" print an extra line for new untracked content and another for modified but already tracked content. And if the HEAD of the submodule is equal to the ref diffed against in the superproject, drop the output which would just show the same SHA1s and no commit message headlines. To achieve that, the dirty_submodule bitfield is expanded to two bits. The output of "git status" inside the submodule is parsed to set the according bits. Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-03-04 22:16:33 -08:00
Jens Lehmann	e3d42c4773	Performance optimization for detection of modified submodules In the worst case is_submodule_modified() got called three times for each submodule. The information we got from scanning the whole submodule tree the first time can be reused instead. New parameters have been added to diff_change() and diff_addremove(), the information is stored in a new member of struct diff_filespec. Its value is then reused instead of calling is_submodule_modified() again. When no explicit "-dirty" is needed in the output the call to is_submodule_modified() is not necessary when the submodules HEAD already disagrees with the ref of the superproject, as this alone marks it as modified. To achieve that, get_stat_data() got an extra argument. Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2010-01-18 17:28:21 -08:00
Junio C Hamano	84cdd3c635	Merge branch 'maint' * maint: Add reference for status letters in documentation. Document that git-log takes --all-match. Update draft 1.6.0.4 release notes	2008-11-02 16:36:40 -08:00
Yann Dirson	a5a323f33c	Add reference for status letters in documentation. Also fix error in diff_filepair::status documentation, and point to the in-code reference as well as the doc. Signed-off-by: Yann Dirson <ydirson@altern.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-11-02 15:57:10 -08:00
Jeff King	122aa6f9c0	diff: introduce diff.<driver>.binary The "diff" gitattribute is somewhat overloaded right now. It can say one of three things: 1. this file is definitely binary, or definitely not (i.e., diff or !diff) 2. this file should use an external diff engine (i.e., diff=foo, diff.foo.command = custom-script) 3. this file should use particular funcname patterns (i.e., diff=foo, diff.foo.(x?)funcname = some-regex) Most of the time, there is no conflict between these uses, since using one implies that the other is irrelevant (e.g., an external diff engine will decide for itself whether the file is binary). However, there is at least one conflicting situation: there is no way to say "use the regular rules to determine whether this file is binary, but if we do diff it textually, use this funcname pattern." That is, currently setting diff=foo indicates that the file is definitely text. This patch introduces a "binary" config option for a diff driver, so that one can explicitly set diff.foo.binary. We default this value to "don't know". That is, setting a diff attribute to "foo" and using "diff.foo.funcname" will have no effect on the binaryness of a file. To get the current behavior, one can set diff.foo.binary to true. This patch also has one additional advantage: it cleans up the interface to the userdiff code a bit. Before, calling code had to know more about whether attributes were false, true, or unset to determine binaryness. Now that binaryness is a property of a driver, we can represent these situations just by passing back a driver struct. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>	2008-10-18 08:02:26 -07:00
Yann Dirson	264e0b9a3c	Bust the ghost of long-defunct diffcore-pathspec. This concept was retired by `77882f6` (Retire diffcore-pathspec., 2006-04-10), more than 2 years ago. Signed-off-by: Yann Dirson <ydirson@altern.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2008-09-19 19:48:30 -07:00
Linus Torvalds	644797119d	copy vs rename detection: avoid unnecessary O(n*m) loops The core rename detection had some rather stupid code to check if a pathname was used by a later modification or rename, which basically walked the whole pathname space for all renames for each rename, in order to tell whether it was a pure rename (no remaining users) or should be considered a copy (other users of the source file remaining). That's really silly, since we can just keep a count of users around, and replace all those complex and expensive loops with just testing that simple counter (but this all depends on the previous commit that shared the diff_filespec data structure by using a separate reference count). Note that the reference count is not the same as the rename count: they behave otherwise rather similarly, but the reference count is tied to the allocation (and decremented at de-allocation, so that when it turns zero we can get rid of the memory), while the rename count is tied to the renames and is decremented when we find a rename (so that when it turns zero we know that it was a rename, not a copy). Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-10-26 23:18:06 -07:00
Linus Torvalds	9fb88419ba	Ref-count the filespecs used by diffcore Rather than copy the filespecs when introducing new versions of them (for rename or copy detection), use a refcount and increment the count when reusing the diff_filespec. This avoids unnecessary allocations, but the real reason behind this is a future enhancement: we will want to track shared data across the copy/rename detection. In order to efficiently notice when a filespec is used by a rename, the rename machinery wants to keep track of a rename usage count which is shared across all different users of the filespec. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-10-26 23:18:05 -07:00
Junio C Hamano	8ae92e6389	rename diff_free_filespec_data_large() to diff_free_filespec_blob() Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-10-02 21:02:09 -07:00
Jeff King	eede7b7d11	diffcore-rename: cache file deltas We find rename candidates by computing a fingerprint hash of each file, and then comparing those fingerprints. There are inherently O(n^2) comparisons, so it pays in CPU time to hoist the (rather expensive) computation of the fingerprint out of that loop (or to cache it once we have computed it once). Previously, we didn't keep the filespec information around because then we had the potential to consume a great deal of memory. However, instead of keeping all of the filespec data, we can instead just keep the fingerprint. This patch implements and uses diff_free_filespec_data_large to accomplish that goal. We also have to change estimate_similarity not to needlessly repopulate the filespec data when we already have the hash. Practical tests showed 4.5x speedup for a 10% memory usage increase. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-10-02 21:02:03 -07:00
Junio C Hamano	e0e324a4dc	Fix configuration syntax to specify customized hunk header patterns. This updates the hunk header customization syntax. The special case 'funcname' attribute is gone. You assign the name of the type of contents to path's "diff" attribute as a string value in .gitattributes like this: .java diff=java .perl diff=perl *.doc diff=doc If you supply "diff.<name>.funcname" variable via the configuration mechanism (e.g. in $HOME/.gitconfig), the value is used as the regexp set to find the line to use for the hunk header (the variable is called "funcname" because such a line typically is the one that has the name of the function in programming language source text). If there is no such configuration, built-in default is used, if any. Currently there are two default patterns: default and java. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-07-07 01:49:58 -07:00
Junio C Hamano	f258475a6e	Per-path attribute based hunk header selection. This makes"diff -p" hunk headers customizable via gitattributes mechanism. It is based on Johannes's earlier patch that allowed to define a single regexp to be used for everything. The mechanism to arrive at the regexp that is used to define hunk header is the same as other use of gitattributes. You assign an attribute, funcname (because "diff -p" typically uses the name of the function the patch is about as the hunk header), a simple string value. This can be one of the names of built-in pattern (currently, "java" is defined) or a custom pattern name, to be looked up from the configuration file. (in .gitattributes) .java funcname=java .perl funcname=perl (in .git/config) [funcname] java = ... # ugly and complicated regexp to override the built-in one. perl = ... # another ugly and complicated regexp to define a new one. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-07-06 01:20:47 -07:00
Junio C Hamano	29a3eefde1	Introduce diff_filespec_is_binary() This replaces an explicit initialization of filespec->is_binary field used for rename/break followed by direct access to that field with a wrapper function that lazily iniaitlizes and accesses the field. We would add more attribute accesses for the use of diff routines, and it would be better to make this abstraction earlier. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2007-07-06 00:21:41 -07:00

1 2

86 Commits