Commit Graph

1044 Commits

Author SHA1 Message Date
Junio C Hamano
eee947fb95 Merge branch 'jc/maint-diffstat-numstat-context' into maint
* jc/maint-diffstat-numstat-context:
  diff: teach --stat/--numstat to honor -U$num
2011-11-01 16:10:56 -07:00
Junio C Hamano
713b85c758 Merge branch 'rs/diff-cleanup-records-fix' into maint
* rs/diff-cleanup-records-fix:
  diff: resurrect XDF_NEED_MINIMAL with --minimal
  Revert removal of multi-match discard heuristic in 27af01
2011-10-21 10:49:25 -07:00
Junio C Hamano
9b55aa03da Merge branch 'rs/diff-whole-function'
* rs/diff-whole-function:
  diff: add option to show whole functions as context
  xdiff: factor out get_func_line()
2011-10-19 10:49:13 -07:00
Junio C Hamano
7a63a920fd Merge branch 'rs/diff-cleanup-records-fix'
* rs/diff-cleanup-records-fix:
  diff: resurrect XDF_NEED_MINIMAL with --minimal
  Revert removal of multi-match discard heuristic in 27af01
2011-10-13 19:03:22 -07:00
Junio C Hamano
7ddd582402 Merge branch 'jc/maint-diffstat-numstat-context'
* jc/maint-diffstat-numstat-context:
  diff: teach --stat/--numstat to honor -U$num
2011-10-10 15:56:18 -07:00
René Scharfe
14937c2c06 diff: add option to show whole functions as context
Add the option -W/--function-context to git diff.  It is similar to
the same option of git grep and expands the context of change hunks
so that the whole surrounding function is shown.  This "natural"
context can allow changes to be understood better.

Note: GNU patch doesn't like diffs generated with the new option;
it seems to expect context lines to be the same before and after
changes.  git apply doesn't complain.

This implementation has the same shortcoming as the one in grep,
namely that there is no way to explicitly find the end of a
function.  That means that a few lines of extra context are shown,
right up to the next recognized function begins.  It's already
useful in its current form, though.

The function get_func_line() in xdiff/xemit.c is extended to work
forward as well as backward to find post-context as well as
pre-context.  It returns the position of the first found matching
line.  The func_line parameter is made optional, as we don't need
it for -W.

The enhanced function is then used in xdl_emit_diff() to extend
the context as needed.  If the added context overlaps with the
next change, it is merged into the current hunk.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-10-10 12:05:07 -07:00
Junio C Hamano
81b568c839 diff: resurrect XDF_NEED_MINIMAL with --minimal
Earlier, 582aa00 (git diff too slow for a file, 2010-05-02)
unconditionally dropped XDF_NEED_MINIMAL option from the internal xdiff
invocation to help performance on pathological cases, while hinting that a
follow-up patch could reintroduce it with "--minimal" option from the
command line.

Make it so.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-10-03 11:58:18 -07:00
Junio C Hamano
f01cae918f diff: teach --stat/--numstat to honor -U$num
"git diff -p" piped to external diffstat and "git diff --stat" may see
different patch text (both are valid and describe the same change
correctly) when counting the number of added and deleted lines, arriving
at different results to confuse the users, as --stat/--numstat codepath
always uses the hardcoded -U0 as the context length.

Make --stat/--numstat codepath to honor the context length the same way
as the textual patch codepath does to avoid this problem.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-09-22 10:54:47 -07:00
Junio C Hamano
f946b465d7 Merge branch 'jk/color-and-pager'
* jk/color-and-pager:
  want_color: automatically fallback to color.ui
  diff: don't load color config in plumbing
  config: refactor get_colorbool function
  color: delay auto-color decision until point of use
  git_config_colorbool: refactor stdout_is_tty handling
  diff: refactor COLOR_DIFF from a flag into an int
  setup_pager: set GIT_PAGER_IN_USE
  t7006: use test_config helpers
  test-lib: add helper functions for config
  t7006: modernize calls to unset

Conflicts:
	builtin/commit.c
	parse-options.c
2011-08-28 21:19:16 -07:00
Jeff King
3e1dd17a89 diff: don't load color config in plumbing
The diff config callback is split into two functions: one
which loads "ui" config, and one which loads "basic" config.
The former chains to the latter, as the diff UI config is a
superset of the plumbing config.

The color.diff variable is only loaded in the UI config.
However, the basic config actually chains to
git_color_default_config, which loads color.ui. This doesn't
actually cause any bugs, because the plumbing diff code does
not actually look at the value of color.ui.

However, it is somewhat nonsensical, and it makes it
difficult to refactor the color code. It probably came about
because there is no git_color_config to load only color
config, but rather just git_color_default_config, which
loads color config and chains to git_default_config.

This patch splits out the color-specific portion of
git_color_default_config so that the diff UI config can call
it directly. This is perhaps better explained by the
chaining of callbacks. Before we had:

  git_diff_ui_config
    -> git_diff_basic_config
      -> git_color_default_config
        -> git_default_config

Now we have:

  git_diff_ui_config
    -> git_color_config
    -> git_diff_basic_config
      -> git_default_config

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-19 15:51:38 -07:00
Jeff King
daa0c3d971 color: delay auto-color decision until point of use
When we read a color value either from a config file or from
the command line, we use git_config_colorbool to convert it
from the tristate always/never/auto into a single yes/no
boolean value.

This has some timing implications with respect to starting
a pager.

If we start (or decide not to start) the pager before
checking the colorbool, everything is fine. Either isatty(1)
will give us the right information, or we will properly
check for pager_in_use().

However, if we decide to start a pager after we have checked
the colorbool, things are not so simple. If stdout is a tty,
then we will have already decided to use color. However, the
user may also have configured color.pager not to use color
with the pager. In this case, we need to actually turn off
color. Unfortunately, the pager code has no idea which color
variables were turned on (and there are many of them
throughout the code, and they may even have been manipulated
after the colorbool selection by something like "--color" on
the command line).

This bug can be seen any time a pager is started after
config and command line options are checked. This has
affected "git diff" since 89d07f7 (diff: don't run pager if
user asked for a diff style exit code, 2007-08-12). It has
also affect the log family since 1fda91b (Fix 'git log'
early pager startup error case, 2010-08-24).

This patch splits the notion of parsing a colorbool and
actually checking the configuration. The "use_color"
variables now have an additional possible value,
GIT_COLOR_AUTO. Users of the variable should use the new
"want_color()" wrapper, which will lazily determine and
cache the auto-color decision.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-19 15:51:34 -07:00
Jeff King
e269eb7946 git_config_colorbool: refactor stdout_is_tty handling
Usually this function figures out for itself whether stdout
is a tty. However, it has an extra parameter just to allow
git-config to override the auto-detection for its
--get-colorbool option.

Instead of an extra parameter, let's just use a global
variable. This makes calling easier in the common case, and
will make refactoring the colorbool code much simpler.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-18 14:48:29 -07:00
Jeff King
f1c9626105 diff: refactor COLOR_DIFF from a flag into an int
This lets us store more than just a bit flag for whether we
want color; we can also store whether we want automatic
colors. This can be useful for making the automatic-color
decision closer to the point of use.

This mostly just involves replacing DIFF_OPT_* calls with
manipulations of the flag. The biggest exception is that
calls to DIFF_OPT_TST must check for "o->use_color > 0",
which lets an "unknown" value (i.e., the default) stay at
"no color". In the previous code, a value of "-1" was not
propagated at all.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-08-18 14:35:53 -07:00
Junio C Hamano
ca01600306 Merge branch 'rc/histogram-diff'
* rc/histogram-diff:
  xdiff/xhistogram: drop need for additional variable
  xdiff/xhistogram: rely on xdl_trim_ends()
  xdiff/xhistogram: rework handling of recursed results
  xdiff: do away with xdl_mmfile_next()
  Make test number unique
  xdiff/xprepare: use a smaller sample size for histogram diff
  xdiff/xprepare: skip classification
  teach --histogram to diff
  t4033-diff-patience: factor out tests
  xdiff/xpatience: factor out fall-back-diff function
  xdiff/xprepare: refactor abort cleanups
  xdiff/xprepare: use memset()
2011-08-17 17:36:06 -07:00
Junio C Hamano
a35d78c0f4 Merge branch 'jc/zlib-wrap' into maint
* jc/zlib-wrap:
  zlib: allow feeding more than 4GB in one go
  zlib: zlib can only process 4GB at a time
  zlib: wrap deflateBound() too
  zlib: wrap deflate side of the API
  zlib: wrap inflateInit2 used to accept only for gzip format
  zlib: wrap remaining calls to direct inflate/inflateEnd
  zlib wrapper: refactor error message formatter
2011-08-16 11:23:26 -07:00
Junio C Hamano
e10e476fb1 Merge branch 'jk/combine-diff-binary-etc' into maint
* jk/combine-diff-binary-etc:
  combine-diff: respect textconv attributes
  refactor get_textconv to not require diff_filespec
  combine-diff: handle binary files as binary
  combine-diff: calculate mode_differs earlier
  combine-diff: split header printing into its own function
2011-08-16 11:23:24 -07:00
Junio C Hamano
eb4f4076aa Merge branch 'jc/zlib-wrap'
* jc/zlib-wrap:
  zlib: allow feeding more than 4GB in one go
  zlib: zlib can only process 4GB at a time
  zlib: wrap deflateBound() too
  zlib: wrap deflate side of the API
  zlib: wrap inflateInit2 used to accept only for gzip format
  zlib: wrap remaining calls to direct inflate/inflateEnd
  zlib wrapper: refactor error message formatter

Conflicts:
	sha1_file.c
2011-07-19 09:33:04 -07:00
Tay Ray Chuan
8c912eea94 teach --histogram to diff
Port JGit's HistogramDiff algorithm over to C. Rough numbers (TODO) show
that it is faster than its --patience cousin, as well as the default
Meyers algorithm.

The implementation has been reworked to use structs and pointers,
instead of bitmasks, thus doing away with JGit's 2^28 line limit.

We also use xdiff's default hash table implementation (xdl_hash_bits()
with XDL_HASHLONG()) for convenience.

Signed-off-by: Tay Ray Chuan <rctay89@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-07-12 09:29:20 -07:00
Junio C Hamano
a852aac48d Merge branch 'mg/diff-stat-count'
* mg/diff-stat-count:
  diff --stat-count: finishing touches
  diff-options.txt: describe --stat-{width,name-width,count}
  diff: introduce --stat-lines to limit the stat lines
  diff.c: omit hidden entries from namelen calculation with --stat
2011-06-29 17:03:10 -07:00
Junio C Hamano
dbae1a1336 Merge branch 'jk/combine-diff-binary-etc'
* jk/combine-diff-binary-etc:
  combine-diff: respect textconv attributes
  refactor get_textconv to not require diff_filespec
  combine-diff: handle binary files as binary
  combine-diff: calculate mode_differs earlier
  combine-diff: split header printing into its own function
2011-06-29 17:03:10 -07:00
Junio C Hamano
ef49a7a012 zlib: zlib can only process 4GB at a time
The size of objects we read from the repository and data we try to put
into the repository are represented in "unsigned long", so that on larger
architectures we can handle objects that weigh more than 4GB.

But the interface defined in zlib.h to communicate with inflate/deflate
limits avail_in (how many bytes of input are we calling zlib with) and
avail_out (how many bytes of output from zlib are we ready to accept)
fields effectively to 4GB by defining their type to be uInt.

In many places in our code, we allocate a large buffer (e.g. mmap'ing a
large loose object file) and tell zlib its size by assigning the size to
avail_in field of the stream, but that will truncate the high octets of
the real size. The worst part of this story is that we often pass around
z_stream (the state object used by zlib) to keep track of the number of
used bytes in input/output buffer by inspecting these two fields, which
practically limits our callchain to the same 4GB limit.

Wrap z_stream in another structure git_zstream that can express avail_in
and avail_out in unsigned long. For now, just die() when the caller gives
a size that cannot be given to a single zlib call. In later patches in the
series, we would make git_inflate() and git_deflate() internally loop to
give callers an illusion that our "improved" version of zlib interface can
operate on a buffer larger than 4GB in one go.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-10 11:52:15 -07:00
Junio C Hamano
225a6f1068 zlib: wrap deflateBound() too
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-10 11:18:17 -07:00
Junio C Hamano
55bb5c9147 zlib: wrap deflate side of the API
Wrap deflateInit, deflate, and deflateEnd for everybody, and the sole use
of deflateInit2 in remote-curl.c to tell the library to use gzip header
and trailer in git_deflate_init_gzip().

There is only one caller that cares about the status from deflateEnd().
Introduce git_deflate_end_gently() to let that sole caller retrieve the
status and act on it (i.e. die) for now, but we would probably want to
make inflate_end/deflate_end die when they ran out of memory and get
rid of the _gently() kind.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-06-10 11:10:29 -07:00
Junio C Hamano
456a4c08b8 Merge branch 'jk/diff-not-so-quick'
* jk/diff-not-so-quick:
  diff: futureproof "stop feeding the backend early" logic
  diff_tree: disable QUICK optimization with diff filter

Conflicts:
	diff.c
2011-06-06 11:40:14 -07:00
Junio C Hamano
b3c89315a3 Merge branch 'jc/rename-degrade-cc-to-c' into maint
* jc/rename-degrade-cc-to-c:
  diffcore-rename: fall back to -C when -C -C busts the rename limit
  diffcore-rename: record filepair for rename src
  diffcore-rename: refactor "too many candidates" logic
  builtin/diff.c: remove duplicated call to diff_result_code()
2011-05-31 12:00:02 -07:00
Junio C Hamano
28b9264dd6 diff: futureproof "stop feeding the backend early" logic
Refactor the "do not stop feeding the backend early" logic into a small
helper function and use it in both run_diff_files() and diff_tree() that
has the stop-early optimization. We may later add other types of diffcore
transformation that require to look at the whole result like diff-filter
does, and having the logic in a single place is essential for longer term
maintainability.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-31 09:21:36 -07:00
Junio C Hamano
e5f85df87e diff --stat-count: finishing touches
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-27 21:50:39 -07:00
Michael J Gruber
808e1db231 diff: introduce --stat-lines to limit the stat lines
Often one is interested in the full --stat output only for commits which
change a few files, but not others, because larger restructuring gives a
--stat which fills a few screens.

Introduce a new option --stat-count=<count> which limits the --stat output
to the first <count> lines, followed by a "..." line. It can
also be given as the third parameter in
--stat=<width>,<name-width>,<count>.

Also, the unstuck form is supported analogous to the other two stat
parameters.

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-27 10:44:34 -07:00
Michael J Gruber
358e460eeb diff.c: omit hidden entries from namelen calculation with --stat
Currently, --stat calculates the longest name from all items but then
drops some (mode changes) from the output later on.

Instead, drop them from the namelen generation and calculation.

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-27 10:44:02 -07:00
Junio C Hamano
d9ac3e41c3 Merge branch 'jm/maint-diff-words-with-sbe' into maint
* jm/maint-diff-words-with-sbe:
  do not read beyond end of malloc'd buffer
2011-05-26 09:43:00 -07:00
Jeff King
3813e69031 refactor get_textconv to not require diff_filespec
This function actually does two things:

  1. Load the userdiff driver for the filespec.

  2. Decide whether the driver has a textconv component, and
     initialize the textconv cache if applicable.

Only part (1) requires the filespec object, and some callers
may not have a filespec at all. So let's split them it into
two functions, and put part (2) with the userdiff code,
which is a better fit.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-23 15:46:02 -07:00
Junio C Hamano
34ad5a52b4 Merge branch 'jm/maint-diff-words-with-sbe'
* jm/maint-diff-words-with-sbe:
  do not read beyond end of malloc'd buffer
2011-05-23 10:27:42 -07:00
Jim Meyering
42536dd9b9 do not read beyond end of malloc'd buffer
With diff.suppress-blank-empty=true, "git diff --word-diff" would
output data that had been read from uninitialized heap memory.
The problem was that fn_out_consume did not account for the
possibility of a line with length 1, i.e., the empty context line
that diff.suppress-blank-empty=true converts from " \n" to "\n".
Since it assumed there would always be a prefix character (the space),
it decremented "len" unconditionally, thus passing len=0 to emit_line,
which would then blindly call emit_line_0 with len=-1 which would
pass that value on to fwrite as SIZE_MAX.  Boom.

Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-05-20 11:39:49 -07:00
Junio C Hamano
df54e2bfd6 Merge branch 'jh/dirstat-lines'
* jh/dirstat-lines:
  Mark dirstat error messages for translation
  Improve error handling when parsing dirstat parameters
  New --dirstat=lines mode, doing dirstat analysis based on diffstat
  Allow specifying --dirstat cut-off percentage as a floating point number
  Add config variable for specifying default --dirstat behavior
  Refactor --dirstat parsing; deprecate --cumulative and --dirstat-by-file
  Make --dirstat=0 output directories that contribute < 0.1% of changes
  Add several testcases for --dirstat and friends
2011-05-13 11:01:32 -07:00
Junio C Hamano
a613b534bc Merge branch 'jc/fix-diff-files-unmerged' into maint
* jc/fix-diff-files-unmerged:
  diff-files: show unmerged entries correctly
  diff: remove often unused parameters from diff_unmerge()
  diff.c: return filepair from diff_unmerge()
  test: use $_z40 from test-lib
2011-05-13 10:41:54 -07:00
Junio C Hamano
22dbeee715 Merge branch 'jc/fix-diff-files-unmerged'
* jc/fix-diff-files-unmerged:
  diff-files: show unmerged entries correctly
  diff: remove often unused parameters from diff_unmerge()
  diff.c: return filepair from diff_unmerge()
  test: use $_z40 from test-lib
2011-05-06 10:52:58 -07:00
Junio C Hamano
f5bf1b5f6b Merge branch 'jh/dirstat' into maint
* jh/dirstat:
  --dirstat: In case of renames, use target filename instead of source filename
  Teach --dirstat not to completely ignore rearranged lines within a file
  --dirstat-by-file: Make it faster and more correct
  --dirstat: Describe non-obvious differences relative to --stat or regular diff
2011-05-04 14:59:07 -07:00
Johan Herland
7478ac57c4 Mark dirstat error messages for translation
Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29 11:22:56 -07:00
Johan Herland
51670fc87e Improve error handling when parsing dirstat parameters
When encountering errors or unknown tokens while parsing parameters to the
--dirstat option, it makes sense to die() with an error message informing
the user of which parameter did not make sense. However, when parsing the
diff.dirstat config variable, we cannot simply die(), but should instead
(after warning the user) ignore the erroneous or unrecognized parameter.
After all, future Git versions might add more dirstat parameters, and
using two different Git versions on the same repo should not cripple the
older Git version just because of a parameter that is only understood by
a more recent Git version.

This patch fixes the issue by refactoring the dirstat parameter parsing
so that parse_dirstat_params() keeps on parsing parameters, even if an
earlier parameter was not recognized. When parsing has finished, it returns
zero if all parameters were successfully parsed, and non-zero if one or
more parameters were not recognized (with appropriate error messages
appended to the 'errmsg' argument).

The parse_dirstat_params() callers then decide (based on the return value
from parse_dirstat_params()) whether to warn and ignore (in case of
diff.dirstat), or to warn and die (in case of --dirstat).

The patch also adds a couple of tests verifying the correct behavior of
--dirstat and diff.dirstat in the face of unknown (possibly future) dirstat
parameters.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29 11:22:56 -07:00
Johan Herland
1c57a627bf New --dirstat=lines mode, doing dirstat analysis based on diffstat
This patch adds an alternative implementation of show_dirstat(), called
show_dirstat_by_line(), which uses the more expensive diffstat analysis
(as opposed to show_dirstat()'s own (relatively inexpensive) analysis)
to derive the numbers from which the --dirstat output is computed.

The alternative implementation is controlled by the new "lines" parameter
to the --dirstat option (or the diff.dirstat config variable).

For binary files, the diffstat analysis counts bytes instead of lines,
so to prevent binary files from dominating the dirstat results, the
byte counts for binary files are divided by 64 before being compared to
their textual/line-based counterparts. This is a stupid and ugly - but
very cheap - heuristic.

In linux-2.6.git, running the three different --dirstat modes:

  time git diff v2.6.20..v2.6.30 --dirstat=changes > /dev/null
vs.
  time git diff v2.6.20..v2.6.30 --dirstat=lines > /dev/null
vs.
  time git diff v2.6.20..v2.6.30 --dirstat=files > /dev/null

yields the following average runtimes on my machine:

 - "changes" (default): ~6.0 s
 - "lines":             ~9.6 s
 - "files":             ~0.1 s

So, as expected, there's a considerable performance hit (~60%) by going
through the full diffstat analysis as compared to the default "changes"
analysis (obviously, "files" is much faster than both). As such, the
"lines" mode is probably only useful if you really need the --dirstat
numbers to be consistent with the numbers returned from the other
--*stat options.

The patch also includes documentation and tests for the new dirstat mode.

Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29 11:22:55 -07:00
Johan Herland
712d2c7dd8 Allow specifying --dirstat cut-off percentage as a floating point number
Only the first digit after the decimal point is kept, as the dirstat
calculations all happen in permille.

Selftests verifying floating-point percentage input has been added.

Improved-by: Junio C Hamano <gitster@pobox.com>
Improved-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29 11:20:11 -07:00
Johan Herland
2d17495196 Add config variable for specifying default --dirstat behavior
The new diff.dirstat config variable takes the same arguments as
'--dirstat=<args>', and specifies the default arguments for --dirstat.
The config is obviously overridden by --dirstat arguments passed on the
command line.

When not specified, the --dirstat defaults are 'changes,noncumulative,3'.

The patch also adds several tests verifying the interaction between the
diff.dirstat config variable, and the --dirstat command line option.

Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29 11:20:03 -07:00
Johan Herland
333f3fb0c5 Refactor --dirstat parsing; deprecate --cumulative and --dirstat-by-file
Instead of having multiple interconnected dirstat-related options, teach
the --dirstat option itself to accept all behavior modifiers as parameters.

 - Preserve the current --dirstat=<limit> (where <limit> is an integer
   specifying a cut-off percentage)
 - Add --dirstat=cumulative, replacing --cumulative
 - Add --dirstat=files, replacing --dirstat-by-file
 - Also add --dirstat=changes and --dirstat=noncumulative for specifying the
   current default behavior. These allow the user to reset other --dirstat
   parameters (e.g. 'cumulative' and 'files') occuring earlier on the
   command line.

The deprecated options (--cumulative and --dirstat-by-file) are still
functional, although they have been removed from the documentation.

Allow multiple parameters to be separated by commas, e.g.:
  --dirstat=files,10,cumulative

Update the documentation accordingly, and add testcases verifying the
behavior of the new syntax.

Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29 11:17:36 -07:00
Johan Herland
58a8756a98 Make --dirstat=0 output directories that contribute < 0.1% of changes
The expected output from --dirstat=0, is to include any directory with
changes, even if those changes contribute a minuscule portion of the total
changes. However, currently, directories that contribute less than 0.1% are
not included, since their 'permille' value is 0, and there is an
'if (permille)' check in gather_dirstat() that causes them to be ignored.

This test is obviously intended to exclude directories that contribute no
changes whatsoever, but in this case, it hits too broadly. The correct
check is against 'this_dir' from which the permille is calculated. Only if
this value is 0 does the directory truly contribute no changes, and should
be skipped from the output.

This patches fixes this issue, and updates corresponding testcases to
expect the new behvaior.

Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-29 11:17:36 -07:00
Junio C Hamano
50d3062ab2 Merge branch 'jc/diff-irreversible-delete'
* jc/diff-irreversible-delete:
  git diff -D: omit the preimage of deletes
2011-04-28 14:11:47 -07:00
Junio C Hamano
76a89d6d82 Merge branch 'jc/rename-degrade-cc-to-c'
* jc/rename-degrade-cc-to-c:
  diffcore-rename: fall back to -C when -C -C busts the rename limit
  diffcore-rename: record filepair for rename src
  diffcore-rename: refactor "too many candidates" logic
  builtin/diff.c: remove duplicated call to diff_result_code()
2011-04-28 14:11:43 -07:00
Junio C Hamano
d98a509ec3 Merge branch 'jh/dirstat'
* jh/dirstat:
  --dirstat: In case of renames, use target filename instead of source filename
  Teach --dirstat not to completely ignore rearranged lines within a file
  --dirstat-by-file: Make it faster and more correct
  --dirstat: Describe non-obvious differences relative to --stat or regular diff
2011-04-28 14:11:19 -07:00
Junio C Hamano
fa7b290895 diff: remove often unused parameters from diff_unmerge()
e9c8409 (diff-index --cached --raw: show tree entry on the LHS for
unmerged entries., 2007-01-05) added a <mode, object name> pair as
parameters to this function, to store them in the pre-image side of an
unmerged file pair.  Now the function is fixed to return the filepair it
queued, we can make the caller on the special case codepath to do so.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-23 22:34:43 -07:00
Junio C Hamano
76399c0195 diff.c: return filepair from diff_unmerge()
The underlying diff_queue() returns diff_filepair so that the caller can
further add information to it, and the helper function diff_unmerge()
utilizes the feature itself, but does not expose it to its callers, which
was kind of selfish.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-23 22:34:43 -07:00
Johan Herland
2ca8671470 --dirstat: In case of renames, use target filename instead of source filename
This changes --dirstat analysis to count "damage" toward the target filename,
rather than the source filename. For renames within a directory, this won't
matter to the final output, but when moving files between diretories, the
output now lists the target directory rather than the source directory.

Signed-off-by: Johan Herland <johan@herland.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-12 11:29:34 -07:00
Johan Herland
2ff3a80334 Teach --dirstat not to completely ignore rearranged lines within a file
Currently, the --dirstat analysis ignores when lines within a file are
rearranged, because the "damage" calculated by show_dirstat() is 0.
However, if the object name has changed, we already know that there is
some damage, and it is unintuitive to claim there is _no_ damage.

Teach show_dirstat() to assign a minimum amount of damage (== 1) to
entries for which the analysis otherwise yields zero damage, to still
represent that these files are changed, instead of saying that there
is no change.

Also, skip --dirstat analysis when the object names are the same (e.g. for
a pure file rename).

Signed-off-by: Johan Herland <johan@herland.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-11 11:16:15 -07:00
Johan Herland
0133dab75d --dirstat-by-file: Make it faster and more correct
Currently, when using --dirstat-by-file, it first does the full --dirstat
analysis (using diffcore_count_changes()), and then resets 'damage' to 1,
if any damage was found by diffcore_count_changes().

But --dirstat-by-file is not interested in the file damage per se. It only
cares if the file changed at all. In that sense it only cares if the blob
object for a file has changed. We therefore only need to compare the
object names of each file pair in the diff queue and we can skip the
entire --dirstat analysis and simply set 'damage' to 1 for each entry
where the object name has changed.

This makes --dirstat-by-file faster, and also bypasses --dirstat's practice
of ignoring rearranged lines within a file.

The patch also contains an added testcase verifying that --dirstat-by-file
now detects changes that only rearrange lines within a file.

Signed-off-by: Johan Herland <johan@herland.net>
Acked-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-11 10:12:24 -07:00
Junio C Hamano
467ddc14fe git diff -D: omit the preimage of deletes
When reviewing a patch while concentrating primarily on the text after
then change, wading through pages of deleted text involves a cognitive
burden.

Introduce the -D option that omits the preimage text from the patch output
for deleted files.  When used with -B (represent total rewrite as a single
wholesale deletion followed by a single wholesale addition), the preimage
text is also omitted.

To prevent such a patch from being applied by mistake, the output is
designed not to be usable by "git apply" (or GNU "patch"); it is strictly
for human consumption.

It of course is possible to "apply" such a patch by hand, as a human can
read the intention out of such a patch.  It however is impossible to apply
such a patch even manually in reverse, as the whole point of this option
is to omit the information necessary to do so from the output.

Initial request by Mart Sõmermaa, documentation and tests helped by
Michael J Gruber.

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-02 23:52:20 -07:00
Junio C Hamano
f31027c99c diffcore-rename: fall back to -C when -C -C busts the rename limit
When there are too many paths in the project, the number of rename source
candidates "git diff -C -C" finds will exceed the rename detection limit,
and no inexact rename detection is performed.  We however could fall back
to "git diff -C" if the number of modified paths is sufficiently small.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-22 14:29:07 -07:00
Johannes Schindelin
c0aa335c95 Remove unused variables
Noticed by gcc 4.6.0.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-22 11:43:27 -07:00
Stephen Boyd
c2e86addb8 Fix sparse warnings
Fix warnings from 'make check'.

 - These files don't include 'builtin.h' causing sparse to complain that
   cmd_* isn't declared:

   builtin/clone.c:364, builtin/fetch-pack.c:797,
   builtin/fmt-merge-msg.c:34, builtin/hash-object.c:78,
   builtin/merge-index.c:69, builtin/merge-recursive.c:22
   builtin/merge-tree.c:341, builtin/mktag.c:156, builtin/notes.c:426
   builtin/notes.c:822, builtin/pack-redundant.c:596,
   builtin/pack-refs.c:10, builtin/patch-id.c:60, builtin/patch-id.c:149,
   builtin/remote.c:1512, builtin/remote-ext.c:240,
   builtin/remote-fd.c:53, builtin/reset.c:236, builtin/send-pack.c:384,
   builtin/unpack-file.c:25, builtin/var.c:75

 - These files have symbols which should be marked static since they're
   only file scope:

   submodule.c:12, diff.c:631, replace_object.c:92, submodule.c:13,
   submodule.c:14, trace.c:78, transport.c:195, transport-helper.c:79,
   unpack-trees.c:19, url.c:3, url.c:18, url.c:104, url.c:117, url.c:123,
   url.c:129, url.c:136, thread-utils.c:21, thread-utils.c:48

 - These files redeclare symbols to be different types:

   builtin/index-pack.c:210, parse-options.c:564, parse-options.c:571,
   usage.c:49, usage.c:58, usage.c:63, usage.c:72

 - These files use a literal integer 0 when they really should use a NULL
   pointer:

   daemon.c:663, fast-import.c:2942, imap-send.c:1072, notes-merge.c:362

While we're in the area, clean up some unused #includes in builtin files
(mostly exec_cmd.h).

Signed-off-by: Stephen Boyd <bebarino@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-22 10:16:54 -07:00
Junio C Hamano
0ce6a51b43 Merge branch 'jk/merge-rename-ux'
* jk/merge-rename-ux:
  pull: propagate --progress to merge
  merge: enable progress reporting for rename detection
  add inexact rename detection progress infrastructure
  commit: stop setting rename limit
  bump rename limit defaults (again)
  merge: improve inexact rename limit warning
2011-03-19 23:23:56 -07:00
Junio C Hamano
4e530c5049 Merge branch 'jk/diffstat-binary' into maint
* jk/diffstat-binary:
  diff: don't retrieve binary blobs for diffstat
  diff: handle diffstat of rewritten binary files
2011-03-16 16:47:26 -07:00
Jonathan Nieder
9cba13ca5d standardize brace placement in struct definitions
In a struct definitions, unlike functions, the prevailing style is for
the opening brace to go on the same line as the struct name, like so:

 struct foo {
	int bar;
	char *baz;
 };

Indeed, grepping for 'struct [a-z_]* {$' yields about 5 times as many
matches as 'struct [a-z_]*$'.

Linus sayeth:

 Heretic people all over the world have claimed that this inconsistency
 is ...  well ...  inconsistent, but all right-thinking people know that
 (a) K&R are _right_ and (b) K&R are right.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-16 12:49:02 -07:00
Jeff King
abb371a1ef diff: don't retrieve binary blobs for diffstat
We only need the size, which is much cheaper to get,
especially if it is a big binary file.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-22 10:58:18 -08:00
Jeff King
ded0abc73c diff: handle diffstat of rewritten binary files
The logic in builtin_diffstat assumes that a
complete_rewrite pair should have its lines counted. This is
nonsensical for binary files and leads to confusing things
like:

  $ git diff --stat --summary HEAD^ HEAD
   foo.rand |  Bin 4096 -> 4096 bytes
   1 files changed, 0 insertions(+), 0 deletions(-)

  $ git diff --stat --summary -B HEAD^ HEAD
   foo.rand |   34 +++++++++++++++-------------------
   1 files changed, 15 insertions(+), 19 deletions(-)
   rewrite foo.rand (100%)

So let's reorder the function to handle binary files first
(which from diffstat's perspective look like complete
rewrites anyway), then rewrites, then actual diffstats.

There are two bonus prizes to this reorder:

  1. It gets rid of a now-superfluous goto.

  2. The binary case is at the top, which means we can
     further optimize it in the next patch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-22 10:57:58 -08:00
Jeff King
92c57e5c1d bump rename limit defaults (again)
We did this once before in 5070591 (bump rename limit
defaults, 2008-04-30). Back then, we were shooting for about
1 second for a diff/log calculation, and 5 seconds for a
merge.

There are a few new things to consider, though:

  1. Average processors are faster now.

  2. We've seen on the mailing list some ugly merges where
     not using inexact rename detection leads to many more
     conflicts. Merges of this size take a long time
     anyway, so users are probably happy to spend a little
     bit of time computing the renames.

Let's bump the diff/merge default limits from 200/500 to
400/1000. Those are 2 seconds and 10 seconds respectively on
my modern hardware.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-02-21 10:23:36 -08:00
Junio C Hamano
6ae7a51a2e Merge branch 'ks/blame-worktree-textconv-cached'
* ks/blame-worktree-textconv-cached:
  fill_textconv(): Don't get/put cache if sha1 is not valid
  t/t8006: Demonstrate blame is broken when cachetextconv is on
2010-12-21 14:30:52 -08:00
Kirill Smelkov
9ec09b0495 fill_textconv(): Don't get/put cache if sha1 is not valid
When blaming files in the working tree, the filespec is marked with
!sha1_valid, as we have not given the contents an object name yet.  The
function to cache textconv results (keyed on the object name), however,
didn't check this condition, and ended up on storing the cached result
under a random object name.

Cc: Axel Bonnet <axel.bonnet@ensimag.imag.fr>
Cc: Clément Poulain <clement.poulain@ensimag.imag.fr>
Cc: Diane Gasselin <diane.gasselin@ensimag.imag.fr>
Cc: Jeff King <peff@peff.net>
Signed-off-by: Kirill Smelkov <kirr@landau.phys.spbu.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-12-19 18:41:32 -08:00
Junio C Hamano
cf7a64b54a Merge branch 'kb/diff-C-M-synonym'
* kb/diff-C-M-synonym:
  diff: use "find" instead of "detect" as prefix for long forms of -M and -C
  diff: add --detect-copies-harder as a synonym for --find-copies-harder
2010-12-16 12:58:59 -08:00
Yann Dirson
f611ddc774 diff: use "find" instead of "detect" as prefix for long forms of -M and -C
It is more consistent with existing --find-copies-harder; luckily "detect"
variant has not appeared in any officially released version of git.

Signed-off-by: Yann Dirson <ydirson@altern.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-12-10 13:52:05 -08:00
Junio C Hamano
8577def6fc Merge branch 'np/diff-in-corrupt-repository' into maint
* np/diff-in-corrupt-repository:
  diff: don't presume empty file when corresponding object is missing
2010-12-09 10:36:39 -08:00
Junio C Hamano
ae0a37cd6b Merge branch 'cm/diff-check-at-eol' into maint
* cm/diff-check-at-eol:
  diff --check: correct line numbers of new blank lines at EOF
2010-12-09 10:36:10 -08:00
Junio C Hamano
f04aa35eb6 Merge branch 'jk/diff-CBM'
* jk/diff-CBM:
  diff: report bogus input to -C/-M/-B
2010-12-08 11:24:11 -08:00
Junio C Hamano
f5a5531e4e Merge branch 'np/diff-in-corrupt-repository'
* np/diff-in-corrupt-repository:
  diff: don't presume empty file when corresponding object is missing
2010-11-29 17:52:33 -08:00
Junio C Hamano
039e84e30d Merge branch 'cm/diff-check-at-eol'
* cm/diff-check-at-eol:
  diff --check: correct line numbers of new blank lines at EOF
2010-11-29 17:52:31 -08:00
Kevin Ballard
150a5daad0 diff: add --detect-copies-harder as a synonym for --find-copies-harder
Signed-off-by: Kevin Ballard <kevin@sb.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-11-29 16:58:27 -08:00
Junio C Hamano
9cffe2018a Merge branch 'cb/diff-fname-optim' into maint
* cb/diff-fname-optim:
  diff: avoid repeated scanning while looking for funcname
  do not search functions for patch ID
  add rebase patch id tests
2010-11-24 12:46:26 -08:00
Junio C Hamano
78bce6c7e9 Merge branch 'jk/no-textconv-symlink' into maint
* jk/no-textconv-symlink:
  diff: don't use pathname-based diff drivers for symlinks
2010-11-24 12:46:20 -08:00
Junio C Hamano
8cf666c9ee Merge branch 'cb/diff-fname-optim'
* cb/diff-fname-optim:
  diff: avoid repeated scanning while looking for funcname
  do not search functions for patch ID
  add rebase patch id tests
2010-11-17 14:59:16 -08:00
Junio C Hamano
6a2e93f107 Merge branch 'jk/no-textconv-symlink'
* jk/no-textconv-symlink:
  diff: don't use pathname-based diff drivers for symlinks
2010-11-17 14:59:10 -08:00
Junio C Hamano
329351feeb Merge branch 'kb/merge-recursive-rename-threshold'
* kb/merge-recursive-rename-threshold:
  diff: add synonyms for -M, -C, -B
  merge-recursive: option to specify rename threshold

Conflicts:
	Documentation/diff-options.txt
	Documentation/merge-strategies.txt
2010-10-26 21:54:04 -07:00
Junio C Hamano
d7806967bd Merge branch 'maint'
* maint:
  Fix copy-pasted comments related to tree diff handling.
2010-10-26 15:04:05 -07:00
Yann Dirson
c3fced6498 Fix copy-pasted comments related to tree diff handling.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-10-25 00:21:56 -07:00
Nicolas Pitre
c50c4316e1 diff: don't presume empty file when corresponding object is missing
The low-level diff code will happily produce totally bogus diff output
with a broken repository via format-patch and friends by treating missing
objects as empty files.  Let's prevent that from happening any longer.

Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-10-21 22:23:34 -07:00
Jeff King
07cd726527 diff: report bogus input to -C/-M/-B
We already detect invalid input to these functions, but we
simply exit with an error code, never saying anything as
simple as "your input was wrong". Let's fix that.

Before:

  $ git diff -CM
  $ echo $?
  128

After:

  $ git diff -CM
  error: invalid argument to -C: M
  $ echo $?
  128

There should be no problems with having diff_opt_parse print
to stderr, as there is already precedent in complaining
about bogus --color and --output arguments.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-10-21 15:44:53 -07:00
Christoph Mallon
8837d33595 diff --check: correct line numbers of new blank lines at EOF
The whitespace check printed the value of the wrong variable, i.e. the
beginning of the block of blank lines at the EOF (possibly absent) in the
old file.

As "git diff --check" is used by users to check their changes before
making a commit, we should point at the line number in the file after
the change.

Signed-off-by: Christoph Mallon <christoph.mallon@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-10-16 18:57:35 -07:00
Junio C Hamano
b3c16ee454 Merge branch 'jc/pickaxe-grep'
* jc/pickaxe-grep:
  diff/log -G<pattern>: tests
  git log/diff: add -G<regexp> that greps in the patch text
  diff: pass the entire diff-options to diffcore_pickaxe()
  gitdiffcore doc: update pickaxe description
2010-09-29 13:49:03 -07:00
Matthieu Moy
9ec26eb7cd diff: trivial fix for --output file error message
The option argument is either after the equal sign in --output=... or in
the next command-line argument. optarg is the reliable way to access it.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-29 13:25:17 -07:00
Kevin Ballard
37ab5156ae diff: add synonyms for -M, -C, -B
Add new long-form options --detect-renames[=<n>], --detect-copies[=<n>],
and --break-rewrites[=[<n>][/<m>]] as synonyms for the -M, -C, and -B
options (respectively).

Signed-off-by: Kevin Ballard <kevin@sb.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-29 13:18:04 -07:00
Kevin Ballard
10ae7526be merge-recursive: option to specify rename threshold
The recursive merge strategy turns on rename detection but leaves the
rename threshold at the default. Add a strategy option to allow the user
to specify a rename threshold to use.

Signed-off-by: Kevin Ballard <kevin@sb.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-29 13:15:56 -07:00
Clemens Buchacher
ad14b450c0 do not search functions for patch ID
Visual aids, such as the function name in the hunk
header, are not necessary for the purposes of
computing a patch ID.

This is a performance optimization.

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-23 18:35:07 -07:00
Jeff King
d391c0ff94 diff: don't use pathname-based diff drivers for symlinks
When we're diffing symlinks, we consider the contents to be
the pathname that the symlink points to. When a user sets up
a userdiff driver like "*.pdf diff=pdf", their "diff.pdf.*"
config generally tells us what to do with the content of
pdf files.

With the current code, we will actually process a symlink
like "link.pdf" using a configured pdf driver, meaning we
are using contents which consist of a pathname with
configuration that is expecting contents that consist of an
actual pdf file.

The most noticeable example of this would have been
textconv; however, it was already protected in its own
textconv-specific code path. We can still see the breakage
with something like "diff.*.binary", though. You could
also see it with diff.*.funcname, though it is a bit harder
to trigger accidentally there.

This patch adds a check for S_ISREG lower in the callstack
than the textconv-specific check, which should block use of
any userdiff config for non-regular files. We can drop the
check in the textconv code, which is now redundant.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-23 18:32:32 -07:00
Junio C Hamano
8ac8cf5bc1 Merge branch 'maint'
* maint:
  xdiff-interface.c: always trim trailing space from xfuncname matches
  diff.c: call regfree to free memory allocated by regcomp when necessary
2010-09-09 17:29:40 -07:00
Brandon Casey
ef5644ea6e diff.c: call regfree to free memory allocated by regcomp when necessary
Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-09-09 17:18:04 -07:00
Junio C Hamano
7cc1e385a0 Merge branch 'cb/binary-patch-id'
* cb/binary-patch-id:
  hash binary sha1 into patch id
2010-08-31 16:24:48 -07:00
Junio C Hamano
f506b8e8b5 git log/diff: add -G<regexp> that greps in the patch text
Teach "-G<regexp>" that is similar to "-S<regexp> --pickaxe-regexp" to the
"git diff" family of commands.  This limits the diff queue to filepairs
whose patch text actually has an added or a deleted line that matches the
given regexp.  Unlike "-S<regexp>", changing other parts of the line that
has a substring that matches the given regexp IS counted as a change, as
such a change would appear as one deletion followed by one addition in a
patch text.

Unlike -S (pickaxe) that is intended to be used to quickly detect a commit
that changes the number of occurrences of hits between the preimage and
the postimage to serve as a part of larger toolchain, this is meant to be
used as the top-level Porcelain feature.

The implementation unfortunately has to run "diff" twice if you are
running "log" family of commands to produce patches in the final output
(e.g. "git log -p" or "git format-patch").  I think we _could_ cache the
result in-core if we wanted to, but that would require larger surgery to
the diffcore machinery (i.e. adding an extra pointer in the filepair
structure to keep a pointer to a strbuf around, stuff the textual diff to
the strbuf inside diffgrep_consume(), and make use of it in later stages
when it is available) and it may not be worth it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-31 14:30:29 -07:00
Junio C Hamano
382f013bc4 diff: pass the entire diff-options to diffcore_pickaxe()
That would make it easier to give enhanced feature to the
pickaxe transformation.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-31 14:30:28 -07:00
Junio C Hamano
e40b34b1ec Merge branch 'mm/shortopt-detached'
* mm/shortopt-detached:
  log: parse separate option for --glob
  log: parse separate options like git log --grep foo
  diff: parse separate options --stat-width n, --stat-name-width n
  diff: split off a function for --stat-* option parsing
  diff: parse separate options like -S foo

Conflicts:
	revision.c
2010-08-21 23:28:31 -07:00
Junio C Hamano
bd3a97a27a Merge branch 'jc/maint-follow-rename-fix'
* jc/maint-follow-rename-fix:
  log: test for regression introduced in v1.7.2-rc0~103^2~2
  diff --follow: do call diffcore_std() as necessary
  diff --follow: do not waste cycles while recursing
2010-08-18 12:47:18 -07:00
Junio C Hamano
cc34bb0b02 Merge branch 'jl/submodule-ignore-diff'
* jl/submodule-ignore-diff:
  Add tests for the diff.ignoreSubmodules config option
  Add the 'diff.ignoreSubmodules' config setting
  Submodules: Use "ignore" settings from .gitmodules too for diff and status
  Submodules: Add the new "ignore" config option for diff and status

Conflicts:
	diff.c
2010-08-18 12:36:25 -07:00
Clemens Buchacher
34597c1f5a hash binary sha1 into patch id
Since commit 2f82f760 (Take binary diffs into
account for "git rebase"), binary files are
included in patch ID computation. Binary files are
diffed using the text diff algorithm, however,
which has a huge impact on performance. The
following tests performance for a 50000 line file
marked as binary in .gitattributes.

$ git format-patch --stdout --ignore-if-in-upstream master

real    0m0.367s
user    0m0.354s
sys     0m0.010s

Instead of diffing the binary files, hash the pre-
and post-image sha1, which is just as unique. As a
result, performance is much improved.

$ git format-patch --stdout --ignore-if-in-upstream master

real    0m0.016s
user    0m0.015s
sys     0m0.001s

Signed-off-by: Clemens Buchacher <drizzd@aon.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-16 18:31:37 -07:00
Junio C Hamano
44c48a909a diff --follow: do call diffcore_std() as necessary
Usually, diff frontends populate the output queue with filepairs without
any rename information and call diffcore_std() to sort the renames out.
When --follow is in effect, however, diff-tree family of frontend has a
hack that looks like this:

    diff-tree frontend
    -> diff_tree_sha1()
       . populate diff_queued_diff
       . if --follow is in effect and there is only one change that
         creates the target path, then
       -> try_to_follow_renames()
	  -> diff_tree_sha1() with no pathspec but with -C
	  -> diffcore_std() to find renames
	  . if rename is found, tweak diff_queued_diff and put a
	    single filepair that records the found rename there
    -> diffcore_std()
       . tweak elements on diff_queued_diff by
       - rename detection
       - path ordering
       - pickaxe filtering

We need to skip parts of the second call to diffcore_std() that is related
to rename detection, and do so only when try_to_follow_renames() did find
a rename.  Earlier 1da6175 (Make diffcore_std only can run once before a
diff_flush, 2010-05-06) tried to deal with this issue incorrectly; it
unconditionally disabled any second call to diffcore_std().

This hopefully fixes the breakage.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-13 12:17:45 -07:00
Jakub Narebski
d8faea9d18 diff: strip extra "/" when stripping prefix
There are two ways a user might want to use "diff --relative":

  1. For a file in a directory, like "subdir/file", the user
     can use "--relative=subdir/" to strip the directory.

  2. To strip part of a filename, like "foo-10", they can
     use "--relative=foo-".

We currently handle both of those situations. However, if the user passes
"--relative=subdir" (without the trailing slash), we produce inconsistent
results. For the unified diff format, we collapse the double-slash of
"a//file" correctly into "a/file". But for other formats (raw, stat,
name-status), we end up with "/file".

We can do what the user means here and strip the extra "/" (and only a
slash).  We are not hurting any existing users of (2) above with this
behavior change because the existing output for this case was nonsensical.

Patch by Jakub, tests and commit message by Jeff King.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-11 09:46:47 -07:00
Johannes Schindelin
be4f2b408e Add the 'diff.ignoreSubmodules' config setting
When you have a lot of submodules checked out, the time penalty to check
for dirty submodules can easily imply a multiplication of the total time
by the factor 20. This makes the difference between almost instantaneous
(< 2 seconds) and unbearably slow (> 50 seconds) here, since the disk
caches are constantly overloaded.

To this end, the submodule.*.ignore config option was introduced, but it
is per-submodule.

This commit introduces a global config setting to set a default
(porcelain) value for the --ignore-submodules option, keeping the
default at 'none'. It can be overridden by the submodule.*.ignore
setting and by the --ignore-submodules option.

Incidentally, this commit fixes an issue with the overriding logic:
multiple --ignore-submodules options would not clear the previously
set flags.

While at it, fix a typo in the documentation for submodule.*.ignore.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-09 09:11:50 -07:00
Jens Lehmann
aee9c7d654 Submodules: Add the new "ignore" config option for diff and status
The new "ignore" config option controls the default behavior for "git
status" and the diff family. It specifies under what circumstances they
consider submodules as modified and can be set separately for each
submodule.

The command line option "--ignore-submodules=" has been extended to accept
the new parameter "none" for both status and diff.

Users that chose submodules to get rid of long work tree scanning times
might want to set the "dirty" option for those submodules. This brings
back the pre 1.7.0 behavior, where submodule work trees were never
scanned for modifications. By using "--ignore-submodules=none" on the
command line the status and diff commands can be told to do a full scan.

This option can be set to the following values (which have the same name
and meaning as for the "--ignore-submodules" option of status and diff):

"all": All changes to the submodule will be ignored.

"dirty": Only differences of the commit recorded in the superproject and
	the submodules HEAD will be considered modifications, all changes
	to the work tree of the submodule will be ignored. When using this
	value, the submodule will not be scanned for work tree changes at
	all, leading to a performance benefit on large submodules.

"untracked": Only untracked files in the submodules work tree are ignored,
	a changed HEAD and/or modified files in the submodule will mark it
	as modified.

"none" (which is the default): Either untracked or modified files in a
	submodules work tree or a difference between the subdmodules HEAD
	and the commit recorded in the superproject will make it show up
	as changed. This value is added as a new parameter for the
	"--ignore-submodules" option of the diff family and "git status"
	so the user can override the settings in the configuration.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-09 09:01:52 -07:00
Matthieu Moy
1e57208ef0 diff: parse separate options --stat-width n, --stat-name-width n
Part of a campaign for unstuck forms of options.

[jn: with some refactoring]

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-06 09:14:36 -07:00
Jonathan Nieder
4d7f7a4ae7 diff: split off a function for --stat-* option parsing
As an optimization, the diff_opt_parse() switchboard has
a single case for all the --stat-* options.  Split it
off into a separate function so we can enhance it
without bringing code dangerously close to the right
margin.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-06 09:14:28 -07:00
Matthieu Moy
dea007fb4c diff: parse separate options like -S foo
Change the option parsing logic in revision.c to accept separate forms
like `-S foo' in addition to `-Sfoo'. The rest of git already accepted
this form, but revision.c still used its own option parsing.

Short options affected are -S<string>, -l<num> and -O<orderfile>, for
which an empty string wouldn't make sense, hence -<option> <arg> isn't
ambiguous.

This patch does not handle --stat-name-width and --stat-width, which are
special-cases where diff_long_opt do not apply. They are handled in a
separate patch to ease review.

Original patch by Matthieu Moy, plus refactoring by Jonathan Nieder.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-06 09:14:22 -07:00
Junio C Hamano
bb89e84f95 Merge branch 'sv/maint-diff-q-clear-fix' into maint
* sv/maint-diff-q-clear-fix:
  Fix DIFF_QUEUE_CLEAR refactoring
2010-08-03 15:17:34 -07:00
Junio C Hamano
ee38d823f7 Fix DIFF_QUEUE_CLEAR refactoring
It introduced a macro to reduce repeated assignments to three fields,
but an unrelated and incorrect change snuck in by mistake, which broke
commands like "git diff-files -p --submodule".

Noticed by Sven Verdoolaege.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-02 08:30:02 -07:00
Bo Yang
e13f38a33e diff.c: fix a graph output bug
When --graph is in effect, the line-prefix typically has colored graph
line segments and ends with reset.  The color sequence "set" given to
this function is for showing the metainfo part of the patch text and
(1) it should not be applied to the graph lines, and (2) it will be
reset at the end of line_prefix so it won't be in effect anyway.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-07-08 18:09:14 -07:00
Junio C Hamano
a76b2084fb Merge branch 'jl/status-ignore-submodules'
* jl/status-ignore-submodules:
  Add the option "--ignore-submodules" to "git status"
  git submodule: ignore dirty submodules for summary and status

Conflicts:
	builtin/commit.c
	t/t7508-status.sh
	wt-status.c
	wt-status.h
2010-06-30 11:55:39 -07:00
Junio C Hamano
e1165dd144 Merge branch 'jl/maint-diff-ignore-submodules'
* jl/maint-diff-ignore-submodules:
  t4027,4041: Use test -s to test for an empty file
  Add optional parameters to the diff option "--ignore-submodules"
  git diff: rename test that had a conflicting name
2010-06-30 11:55:37 -07:00
Junio C Hamano
4af574dbdc Merge branch 'ab/blame-textconv'
* ab/blame-textconv:
  t/t8006: test textconv support for blame
  textconv: support for blame
  textconv: make the API public

Conflicts:
	diff.h
2010-06-27 12:07:44 -07:00
Jens Lehmann
46a958b3da Add the option "--ignore-submodules" to "git status"
In some use cases it is not desirable that "git status" considers
submodules that only contain untracked content as dirty. This may happen
e.g. when the submodule is not under the developers control and not all
build generated files have been added to .gitignore by the upstream
developers. Using the "untracked" parameter for the "--ignore-submodules"
option disables checking for untracked content and lets git diff report
them as changed only when they have new commits or modified content.

Sometimes it is not wanted to have submodules show up as changed when they
just contain changes to their work tree (this was the behavior before
1.7.0). An example for that are scripts which just want to check for
submodule commits while ignoring any changes to the work tree. Also users
having large submodules known not to change might want to use this option,
as the - sometimes substantial - time it takes to scan the submodule work
tree(s) is saved when using the "dirty" parameter.

And if you want to ignore any changes to submodules, you can now do that
by using this option without parameters or with "all" (when the config
option status.submodulesummary is set, using "all" will also suppress the
output of the submodule summary).

A new function handle_ignore_submodules_arg() is introduced to parse this
option new to "git status" in a single location, as "git diff" already
knew it.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-06-25 11:30:25 -07:00
Junio C Hamano
262657dce6 Merge branch 'maint'
* maint:
  Update draft release notes to 1.7.1.1
  tests: remove unnecessary '^' from 'expr' regular expression

Conflicts:
	diff.c
2010-06-22 09:35:36 -07:00
Junio C Hamano
3c656899cd Merge branch 'cc/maint-diff-CC-binary' into maint
* cc/maint-diff-CC-binary:
  diff: fix "git show -C -C" output when renaming a binary file

Conflicts:
	diff.c
2010-06-22 09:27:07 -07:00
Junio C Hamano
cb2af93ac1 Merge branch 'bw/diff-metainfo-color' into maint
* bw/diff-metainfo-color:
  diff: fix coloring of extended diff headers
2010-06-21 05:40:10 -07:00
Junio C Hamano
60335534a6 Merge branch 'rs/diff-no-minimal' into maint
* rs/diff-no-minimal:
  git diff too slow for a file
2010-06-21 05:38:50 -07:00
Junio C Hamano
5977744d04 Merge branch 'cc/maint-diff-CC-binary'
* cc/maint-diff-CC-binary:
  diff: fix "git show -C -C" output when renaming a binary file

Conflicts:
	diff.c
2010-06-18 11:16:57 -07:00
Junio C Hamano
98ad90fbab Merge branch 'by/diff-graph'
* by/diff-graph:
  Make --color-words work well with --graph
  graph.c: register a callback for graph output
  Emit a whole line in one go
  diff.c: Output the text graph padding before each diff line
  Output the graph columns at the end of the commit message
  Add a prefix output callback to diff output

Conflicts:
	diff.c
2010-06-18 11:16:57 -07:00
Junio C Hamano
18fd805583 Merge branch 'jh/diff-index-line-abbrev'
* jh/diff-index-line-abbrev:
  diff.c: Ensure "index $from..$to" line contains unambiguous SHA1s

Conflicts:
	diff.c
2010-06-18 11:16:56 -07:00
Junio C Hamano
2621ac50cc Merge branch 'ec/diff-noprefix-config'
* ec/diff-noprefix-config:
  diff: add configuration option for disabling diff prefixes.
2010-06-18 11:16:55 -07:00
Junio C Hamano
448598b508 Merge branch 'bw/diff-metainfo-color'
* bw/diff-metainfo-color:
  diff: fix coloring of extended diff headers
2010-06-13 11:21:25 -07:00
Junio C Hamano
39b5977b13 Merge branch 'rs/diff-no-minimal'
* rs/diff-no-minimal:
  git diff too slow for a file
2010-06-13 11:20:46 -07:00
Jens Lehmann
dd44d419d3 Add optional parameters to the diff option "--ignore-submodules"
In some use cases it is not desirable that the diff family considers
submodules that only contain untracked content as dirty. This may happen
e.g. when the submodule is not under the developers control and not all
build generated files have been added to .gitignore by the upstream
developers. Using the "untracked" parameter for the "--ignore-submodules"
option disables checking for untracked content and lets git diff report
them as changed only when they have new commits or modified content.

Sometimes it is not wanted to have submodules show up as changed when they
just contain changes to their work tree. An example for that are scripts
which just want to check for submodule commits while ignoring any changes
to the work tree. Also users having large submodules known not to change
might want to use this option, as the - sometimes substantial - time it
takes to scan the submodule work tree(s) is saved.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-06-11 13:33:17 -07:00
Axel Bonnet
a788d7d58b textconv: make the API public
The textconv functionality allows one to convert a file into text before
running diff. But this functionality can be useful to other features
such as blame.

Signed-off-by: Axel Bonnet <axel.bonnet@ensimag.imag.fr>
Signed-off-by: Clément Poulain <clement.poulain@ensimag.imag.fr>
Signed-off-by: Diane Gasselin <diane.gasselin@ensimag.imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-06-11 13:17:57 -07:00
Christian Couder
296c6bb21a diff: fix "git show -C -C" output when renaming a binary file
A bug was introduced in 3e97c7c6af
(No diff -b/-w output for all-whitespace changes, Nov 19 2009)
that made the lines:

  diff --git a/bar b/sub/bar
  similarity index 100%
  rename from bar
  rename to sub/bar

disappear from "git show -C -C" output when file bar is a binary
file.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-06-06 15:14:27 -07:00
Bo Yang
4297c0aeb5 Make --color-words work well with --graph
'--color-words' algorithm can be described as:

  1. collect a the minus/plus lines of a diff hunk, divided into
     minus-lines and plus-lines;

  2. break both minus-lines and plus-lines into words and
     place them into two mmfile_t with one word for each line;

  3. use xdiff to run diff on the two mmfile_t to get the words level diff;

And for the common parts of the both file, we output the plus side text.
diff_words->current_plus is used to trace the current position of the plus file
which printed. diff_words->last_minus is used to trace the last minus word
printed.

For '--graph' to work with '--color-words', we need to output the graph prefix
on each line of color words output. Generally, there are two conditions on
which we should output the prefix.

  1. diff_words->last_minus == 0 &&
     diff_words->current_plus == diff_words->plus.text.ptr

     that is: the plus text must start as a new line, and if there is no minus
     word printed, a graph prefix must be printed.

  2. diff_words->current_plus > diff_words->plus.text.ptr &&
     *(diff_words->current_plus - 1) == '\n'

     that is: a graph prefix must be printed following a '\n'

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-31 18:02:20 -07:00
Bo Yang
2efcc97764 Emit a whole line in one go
Since the graph prefix will be printed when calling
emit_line, so the functions should be used to emit a
complete line out once a time. No one should call
emit_line to just output some strings instead of a
complete line.
Use a strbuf to compose the whole line, and then
call emit_line to output it once.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-31 18:02:04 -07:00
Bo Yang
7be5761073 diff.c: Output the text graph padding before each diff line
Change output from diff with -p/--dirstat/--binary/--numstat/--stat/
--shortstat/--check/--summary options to align with graph paddings.

Thanks Jeff King <peff@peff.net> for reporting the '--summary' bug and
his initial patch.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-31 18:00:21 -07:00
Bo Yang
a3c158d4a5 Add a prefix output callback to diff output
The callback can be used to add some prefix string to each line of
diff output.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-31 18:00:21 -07:00
Johan Herland
3e5a188f1d diff.c: Ensure "index $from..$to" line contains unambiguous SHA1s
In the metainfo section of git diffs there's an "index" line providing
abbreviated (unless --full-index is used) blob SHA1s from the
pre-/post-images used to generate the diff. These provide hints that
can be used to reconstruct a 3-way merge when applying the patch
(see the --3way option to 'git am' for more details).

In order for this to work, however, the blob SHA1s must not be
abbreviated into ambiguity.

This patch eliminates the possible ambiguity by using find_unique_abbrev()
to produce the abbreviated SHA1s (instead of blind abbreviation by way of
"%.*s").

A testcase verifying the fix is also included.

Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-31 17:44:01 -07:00
Junio C Hamano
82c531b3b6 Merge branch 'by/log-follow'
* by/log-follow:
  tests: rename duplicate t4205
  Make git log --follow find copies among unmodified files.
  Make diffcore_std only can run once before a diff_flush
  Add a macro DIFF_QUEUE_CLEAR.
2010-05-21 04:02:23 -07:00
Junio C Hamano
1bdd46cd3a Merge branch 'tr/word-diff'
* tr/word-diff:
  diff: add --word-diff option that generalizes --color-words

Conflicts:
	diff.c
2010-05-21 04:02:17 -07:00
Bert Wesarg
374664478f diff: fix coloring of extended diff headers
Coloring the extended headers where done as a whole not per line. less with
option -R (which is the default from git) does not support this coloring
mode because of performance reasons. The -r option would be an alternative
but has problems with lines that are longer than the screen. Therefore
stick to the idiom to color each line separately. The problem is, that the
result of ill_metainfo() will also be used as an parameter to an external
diff driver, so we need to disable coloring in this case.

Because coloring is now done inside fill_metainfo() we can simply add this
string to the diff header and therefore keep the last newline in the
extended header. This results also into the fact that the external diff
driver now gets this last newline too. Which is a change in behavior
but a good one.

Signed-off-by: Bert Wesarg <bert.wesarg@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-19 21:06:40 -07:00
Will Palmer
1c9eecff97 diff-options: make --patch a synonym for -p
Here we simply make --patch a synonym for -p, whose mnemonic was "patch"
all along.

Signed-off-by: Will Palmer <wmpalmer@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-18 21:50:03 -07:00
Eli Collins
f89504ddb9 diff: add configuration option for disabling diff prefixes.
With new configuration "diff.noprefix", "git diff" does not show a source or destination prefix ala "git diff --no-prefix".

Signed-off-by: Eli Collins <eli@cloudera.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-18 21:31:51 -07:00
Junio C Hamano
dd75d07899 Merge branch 'jk/cached-textconv'
* jk/cached-textconv:
  diff: avoid useless filespec population
  diff: cache textconv output
  textconv: refactor calls to run_textconv
  introduce notes-cache interface
  make commit_tree a library function
2010-05-08 22:33:08 -07:00
Bo Yang
1da6175d43 Make diffcore_std only can run once before a diff_flush
When file renames/copies detection is turned on, the
second diffcore_std will degrade a 'C' pair to a 'R' pair.

And this may happen when we run 'git log --follow' with
hard copies finding. That is, the try_to_follow_renames()
will run diffcore_std to find the copies, and then
'git log' will issue another diffcore_std, which will reduce
'src->rename_used' and recognize this copy as a rename.
This is not what we want.

So, I think we really don't need to run diffcore_std more
than one time.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-07 09:34:28 -07:00
Bo Yang
9ca5df9061 Add a macro DIFF_QUEUE_CLEAR.
Refactor the diff_queue_struct code, this macro help
to reset the structure.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-07 09:34:27 -07:00
Junio C Hamano
6b6f5d4664 Merge branch 'maint-1.7.0' into maint
* maint-1.7.0:
  remove ecb parameter from xdi_diff_outf()
2010-05-04 15:20:47 -07:00
René Scharfe
dfea79004c remove ecb parameter from xdi_diff_outf()
xdi_diff_outf() overrides the structure members of its last parameter,
ignoring any value that callers pass in.  It's no surprise then that all
callers pass a pointer to an uninitialized structure.  They also don't
read it after the call, so the parameter is neither used for input nor
for output.   Turn it into a local variable of xdi_diff_outf().

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-04 15:19:14 -07:00
René Scharfe
582aa00bdf git diff too slow for a file
Ever since the xdiff library had been introduced to git, all its callers
have used the flag XDF_NEED_MINIMAL.  It makes sure that the smallest
possible diff is produced, but that takes quite some time if there are
lots of differences that can be expressed in multiple ways.

This flag makes a difference for only 0.1% of the non-merge commits in
the git repo of Linux, both in terms of diff size and execution time.
The patches there are mostly nice and small.

SungHyun Nam however reported a case in a different repo where a diff
took more than 20 times longer to generate with XDF_NEED_MINIMAL than
without.  Rebasing became really slow.

This patch removes this flag from all callers.  The default of xdiff is
saner because it has minimal to no impact in the normal case of small
diffs and doesn't incur that much of a speed penalty for large ones.

A follow-up patch may introduce a command line option to set the flag if
the user needs it, similar to GNU diff's -d/--minimal.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-02 07:59:50 -07:00
Junio C Hamano
4fd8145c0c Merge branch 'jk/maint-diffstat-overflow' into maint
* jk/maint-diffstat-overflow:
  diff: use large integers for diffstat calculations
2010-04-22 22:29:13 -07:00
Junio C Hamano
bc32d342c2 Merge branch 'jk/maint-diffstat-overflow'
* jk/maint-diffstat-overflow:
  diff: use large integers for diffstat calculations
2010-04-18 21:31:50 -07:00
Jeff King
0974c117ff diff: use large integers for diffstat calculations
The diffstat "added" and "changed" fields generally store
line counts; however, for binary files, they store file
sizes. Since we store and print these values as ints, a
diffstat on a file larger than 2G can show a negative size.
Instead, let's use uintmax_t, which should be at least 64
bits on modern platforms.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-04-17 11:30:21 -07:00
Thomas Rast
882749a04f diff: add --word-diff option that generalizes --color-words
This teaches the --color-words engine a more general interface that
supports two new modes:

* --word-diff=plain, inspired by the 'wdiff' utility (most similar to
  'wdiff -n <old> <new>'): uses delimiters [-removed-] and {+added+}

* --word-diff=porcelain, which generates an ad-hoc machine readable
  format:
  - each diff unit is prefixed by [-+ ] and terminated by newline as
    in unified diff
  - newlines in the input are output as a line consisting only of a
    tilde '~'

Both of these formats still support color if it is enabled, using it
to highlight the differences.  --color-words becomes a synonym for
--word-diff=color, which is the color-only format.  Also adds some
compatibility/convenience options.

Thanks to Junio C Hamano and Miles Bader for good ideas.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-04-14 10:56:53 -07:00
Junio C Hamano
daaf2e8892 Merge branch 'jc/conflict-marker-size' into maint
* jc/conflict-marker-size:
  diff --check: honor conflict-marker-size attribute
2010-04-09 22:38:34 -07:00
Junio C Hamano
7ec1eb93f7 Merge early parts of jk/cached-textconv 2010-04-08 23:31:51 -07:00
Junio C Hamano
aed6ca52e7 diff.c: work around pointer constness warnings
The textconv leak fix introduced two invocations of free() to release
memory pointed by "const char *", which get annoying compiler warning.
2010-04-08 23:30:49 -07:00
Junio C Hamano
3f3f8d9d09 Merge branch 'jc/conflict-marker-size'
* jc/conflict-marker-size:
  diff --check: honor conflict-marker-size attribute
2010-04-06 14:50:46 -07:00
Jeff King
b337398266 diff: avoid useless filespec population
builtin_diff calls fill_mmfile fairly early, which in turn
calls diff_populate_filespec, which actually retrieves the
file's blob contents into a buffer. Long ago, this was
sensible as we would need to look at the blobs eventually.

These days, however, we may not ever want those blobs if we
end up using a textconv cache, and for large binary files
(exactly the sort for which you might have a textconv
cache), just retrieving the objects can be costly.

This patch just pushes the fill_mmfile call a bit later, so
we can avoid populating the filespec in some cases.  There
is one thing to note that looks like a bug but isn't. We
push the fill_mmfile down into the first branch of a
conditional. It seems like we would need it on the other
branch, too, but we don't; fill_textconv does it for us (in
fact, before this, we were just writing over the results of
the fill_mmfile on that branch).

Here's a timing sample on a commit with 45 changed jpgs and
avis. The result is fully textconv cached, but we still
wasted a lot of time just pulling the blobs from storage.
The total size of the blobs (source and dest) is about
180M.

  [before]
  $ time git show >/dev/null
  real    0m0.352s
  user    0m0.148s
  sys     0m0.200s

  [after]
  $ time git show >/dev/null
  real    0m0.009s
  user    0m0.004s
  sys     0m0.004s

And that's on a warm cache. On a cold cache, the "after"
case is not much worse, but the "before" case has to do an
extra 180M of I/O.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-04-02 00:11:20 -07:00
Jeff King
d9bae1a178 diff: cache textconv output
Running a textconv filter can take a long time. It's
particularly bad for a large file which needs to be spooled
to disk, but even for small files, the fork+exec overhead
can add up for something like "git log -p".

This patch uses the notes-cache mechanism to keep a fast
cache of textconv output. Caches are stored in
refs/notes/textconv/$x, where $x is the userdiff driver
defined in gitattributes.

Caching is enabled only if diff.$x.cachetextconv is true.

In my test repo, on a commit with 45 jpg and avi files
changed and a textconv to show their exif tags:

  [before]
  $ time git show >/dev/null
  real    0m13.724s
  user    0m12.057s
  sys     0m1.624s

  [after, first run]
  $ git config diff.mfo.cachetextconv true
  $ time git show >/dev/null
  real    0m14.252s
  user    0m12.197s
  sys     0m1.800s

  [after, subsequent runs]
  $ time git show >/dev/null
  real    0m0.352s
  user    0m0.148s
  sys     0m0.200s

So for a slight (3.8%) cost on the first run, we achieve an
almost 40x speed up on subsequent runs.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-04-02 00:05:31 -07:00
Jeff King
840383b2c2 textconv: refactor calls to run_textconv
This patch adds a fill_textconv wrapper, which centralizes
some minor logic like error checking and handling the case
of no-textconv.

In addition to dropping the number of lines, this will make
it easier in future patches to handle multiple types of
textconv.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-04-02 00:01:57 -07:00
Jeff King
b76c056b95 fix textconv leak in emit_rewrite_diff
We correctly free() for the normal diff case, but leak for
rewrite diffs.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-04-01 23:49:29 -07:00
Junio C Hamano
890a13a452 Sync with 1.7.0.4
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-31 15:14:27 -07:00
Johannes Sixt
da1fbed3ff diff: fix textconv error zombies
To make the code simpler, run_textconv lumps all of its
error checking into one conditional. However, the
short-circuit means that an error in reading will prevent us
from calling finish_command, leaving a zombie child.
Clean up properly after errors.

Based-on-work-by: Jeff King <peff@peff.net>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-30 14:46:33 -07:00
Junio C Hamano
a757c646ee diff --check: honor conflict-marker-size attribute
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-24 19:35:34 -07:00
Junio C Hamano
b6a7a06aa6 Merge branch 'jl/submodule-diff-dirtiness'
* jl/submodule-diff-dirtiness:
  git status: ignoring untracked files must apply to submodules too
  git status: Fix false positive "new commits" output for dirty submodules
  Refactor dirty submodule detection in diff-lib.c
  git status: Show detailed dirty status of submodules in long format
  git diff --submodule: Show detailed dirty status of submodules
2010-03-24 16:25:43 -07:00
Jens Lehmann
85adbf2f75 git status: Fix false positive "new commits" output for dirty submodules
Testing if the output "new commits" should appear in the long format of
"git status" is done by comparing the hashes of the diffpair. This always
resulted in printing "new commits" for submodules that contained untracked
or modified content, even if they did not contain new commits. The reason
was that match_stat_with_submodule() did set the "changed" flag for dirty
submodules, resulting in two->sha1 being set to the null_sha1 at the call
sites, which indicates that new commits are present. This is changed so
that when no new commits are present, the same object names are in the
sha1 field for both sides of the filepair, and the working tree side will
have the "dirty_submodule" flag set when appropriate. For a submodule to
be seen as modified even when it just has a dirty work tree, some
conditions had to be extended to also check for the "dirty_submodule"
flag.

Unfortunately the test case that should have found this bug had been
changed incorrectly too. It is fixed and extended to test for other
combinations too.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-12 22:17:24 -08:00
Jens Lehmann
ae6d5c1b6f Refactor dirty submodule detection in diff-lib.c
Moving duplicated code into the new function match_stat_with_submodule().
Replacing the implicit activation of detailed checks for the dirtiness of
submodules when DIFF_FORMAT_PATCH was selected with explicitly setting
the recently added DIFF_OPT_DIRTY_SUBMODULES option in diff_setup_done().

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-12 22:17:17 -08:00
Junio C Hamano
8cc3709df0 Merge branch 'ld/maint-diff-quiet-w' into maint
* ld/maint-diff-quiet-w:
  git-diff: add a test for git diff --quiet -w
  git diff --quiet -w: check and report the status
2010-03-04 22:26:39 -08:00
Junio C Hamano
36420805a7 Merge branch 'ld/maint-diff-quiet-w'
* ld/maint-diff-quiet-w:
  git-diff: add a test for git diff --quiet -w
  git diff --quiet -w: check and report the status
2010-03-02 12:44:10 -08:00
Junio C Hamano
c06951894a Merge branch 'ml/color-when'
* ml/color-when:
  Add an optional argument for --color options
2010-03-02 12:44:06 -08:00
Mark Lodato
73e9da0196 Add an optional argument for --color options
Make git-branch, git-show-branch, git-grep, and all the diff-based
programs accept an optional argument <when> for --color.  The argument
is a colorbool: "always", "never", or "auto".  If no argument is given,
"always" is used;  --no-color is an alias for --color=never.  This makes
the command-line interface consistent with other GNU tools, such as `ls'
and `grep', and with the git-config color options.  Note that, without
an argument, --color and --no-color work exactly as before.

To implement this, two internal changes were made:

1. Allow the first argument of git_config_colorbool() to be NULL,
   in which case it returns -1 if the argument isn't "always", "never",
   or "auto".

2. Add OPT_COLOR_FLAG(), OPT__COLOR(), and parse_opt_color_flag_cb()
   to the option parsing library.  The callback uses
   git_config_colorbool(), so color.h is now a dependency
   of parse-options.c.

Signed-off-by: Mark Lodato <lodatom@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-02-18 17:21:40 -08:00
Junio C Hamano
07cb9a369e Merge branch 'jc/typo' into maint
* jc/typo:
  Typofixes outside documentation area
2010-02-17 14:55:09 -08:00
Junio C Hamano
6d816301cd Merge branch 'jc/typo'
* jc/typo:
  Typofixes outside documentation area
2010-02-16 22:45:14 -08:00
Junio C Hamano
e7b3cea0f7 Merge branch 'maint-1.6.6' into maint
* maint-1.6.6:
  dwim_ref: fix dangling symref warning
  stash pop: remove 'apply' options during 'drop' invocation
  diff: make sure --output=/bad/path is caught
  Remove hyphen from "git-command" in two error messages
2010-02-16 15:05:02 -08:00
Junio C Hamano
eb0bcd0fbe Merge branch 'maint-1.6.5' into maint-1.6.6
* maint-1.6.5:
  dwim_ref: fix dangling symref warning
  stash pop: remove 'apply' options during 'drop' invocation
  diff: make sure --output=/bad/path is caught
2010-02-16 15:04:55 -08:00
Larry D'Anna
6977c250ac git diff --quiet -w: check and report the status
The option -w tells the diff machinery to inspect the contents to set the
exit status, instead of checking the blob object level difference alone.
However, --quiet tells the diff machinery not to look at the contents, which
means DIFF_FROM_CONTENTS has no chance to inspect the change.

Work it around by calling diff_flush_patch() with output sent to /dev/null.

Signed-off-by: Larry D'Anna <larry@elder-gods.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-02-15 23:04:34 -08:00
Larry D'Anna
8324b977ae diff: make sure --output=/bad/path is caught
The return value from fopen wasn't being checked.

Signed-off-by: Larry D'Anna <larry@elder-gods.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-02-15 21:46:01 -08:00
Junio C Hamano
9517e6b843 Typofixes outside documentation area
begining -> beginning
    canonicalizations -> canonicalization
    comand -> command
    dewrapping -> unwrapping
    dirtyness -> dirtiness
    DISCLAMER -> DISCLAIMER
    explicitely -> explicitly
    feeded -> fed
    impiled -> implied
    madatory -> mandatory
    mimick -> mimic
    preceeding -> preceding
    reqeuest -> request
    substition -> substitution

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-02-03 21:28:17 -08:00
Junio C Hamano
d539de9f25 Merge branch 'jl/diff-submodule-ignore'
* jl/diff-submodule-ignore:
  Teach diff --submodule that modified submodule directory is dirty
  git diff: Don't test submodule dirtiness with --ignore-submodules
  Make ce_uptodate() trustworthy again
2010-01-26 22:53:13 -08:00
Jens Lehmann
721ceec1ad Teach diff --submodule that modified submodule directory is dirty
Since commit 8e08b4 git diff does append "-dirty" to the work tree side
if the working directory of a submodule contains new or modified files.
Lets do the same when the --submodule option is used.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-24 21:04:31 -08:00
Junio C Hamano
026680f881 Merge branch 'jc/fix-tree-walk'
* jc/fix-tree-walk:
  read-tree --debug-unpack
  unpack-trees.c: look ahead in the index
  unpack-trees.c: prepare for looking ahead in the index
  Aggressive three-way merge: fix D/F case
  traverse_trees(): handle D/F conflict case sanely
  more D/F conflict tests
  tests: move convenience regexp to match object names to test-lib.sh

Conflicts:
	builtin-read-tree.c
	unpack-trees.c
	unpack-trees.h
2010-01-24 17:35:58 -08:00
Junio C Hamano
c6ec7efdd4 Merge branch 'jl/submodule-diff'
* jl/submodule-diff:
  Performance optimization for detection of modified submodules
  git status: Show uncommitted submodule changes too when enabled
  Teach diff that modified submodule directory is dirty
  Show submodules as modified when they contain a dirty work tree
2010-01-22 16:08:10 -08:00
Junio C Hamano
03c6e97f80 Merge branch 'maint-1.6.4' into maint-1.6.5
* maint-1.6.4:
  Fix mis-backport of t7002
  base85: Make the code more obvious instead of explaining the non-obvious
  base85: encode_85() does not use the decode table
  base85 debug code: Fix length byte calculation
  checkout -m: do not try to fall back to --merge from an unborn branch
  branch: die explicitly why when calling "git branch [-a|-r] branchname".
  textconv: stop leaking file descriptors
  commit: --cleanup is a message option
  git count-objects: handle packs bigger than 4G
  t7102: make the test fail if one of its check fails
2010-01-18 21:37:12 -08:00
Junio C Hamano
814035c12a Merge branch 'maint-1.6.3' into maint-1.6.4
* maint-1.6.3:
  base85: Make the code more obvious instead of explaining the non-obvious
  base85: encode_85() does not use the decode table
  base85 debug code: Fix length byte calculation
  checkout -m: do not try to fall back to --merge from an unborn branch
  branch: die explicitly why when calling "git branch [-a|-r] branchname".
  textconv: stop leaking file descriptors
  commit: --cleanup is a message option
  git count-objects: handle packs bigger than 4G
  t7102: make the test fail if one of its check fails

Conflicts:
	builtin-commit.c
2010-01-18 21:37:06 -08:00
Junio C Hamano
011c181cc6 Merge branch 'maint-1.6.2' into maint-1.6.3
* maint-1.6.2:
  base85: Make the code more obvious instead of explaining the non-obvious
  base85: encode_85() does not use the decode table
  base85 debug code: Fix length byte calculation
  checkout -m: do not try to fall back to --merge from an unborn branch
  branch: die explicitly why when calling "git branch [-a|-r] branchname".
  textconv: stop leaking file descriptors
  commit: --cleanup is a message option
  git count-objects: handle packs bigger than 4G
  t7102: make the test fail if one of its check fails

Conflicts:
	diff.c
2010-01-18 21:29:47 -08:00
Jens Lehmann
e3d42c4773 Performance optimization for detection of modified submodules
In the worst case is_submodule_modified() got called three times for
each submodule. The information we got from scanning the whole
submodule tree the first time can be reused instead.

New parameters have been added to diff_change() and diff_addremove(),
the information is stored in a new member of struct diff_filespec. Its
value is then reused instead of calling is_submodule_modified() again.

When no explicit "-dirty" is needed in the output the call to
is_submodule_modified() is not necessary when the submodules HEAD
already disagrees with the ref of the superproject, as this alone
marks it as modified. To achieve that, get_stat_data() got an extra
argument.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-18 17:28:21 -08:00
Junio C Hamano
4fa088209c Merge branch 'jk/run-command-use-shell'
* jk/run-command-use-shell:
  t4030, t4031: work around bogus MSYS bash path conversion
  diff: run external diff helper with shell
  textconv: use shell to run helper
  editor: use run_command's shell feature
  run-command: optimize out useless shell calls
  run-command: convert simple callsites to use_shell
  t0021: use $SHELL_PATH for the filter script
  run-command: add "use shell" option
2010-01-17 15:58:15 -08:00
Junio C Hamano
8e08b4198c Teach diff that modified submodule directory is dirty
A diff run in superproject only compares the name of the commit object
bound at the submodule paths.  When we compare with a work tree and the
checked out submodule directory is dirty (e.g. has either staged or
unstaged changes, or has new files the user forgot to add to the index),
show the work tree side as "dirty".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-16 16:40:56 -08:00
Junio C Hamano
73d66323ac Merge branch 'nd/sparse'
* nd/sparse: (25 commits)
  t7002: test for not using external grep on skip-worktree paths
  t7002: set test prerequisite "external-grep" if supported
  grep: do not do external grep on skip-worktree entries
  commit: correctly respect skip-worktree bit
  ie_match_stat(): do not ignore skip-worktree bit with CE_MATCH_IGNORE_VALID
  tests: rename duplicate t1009
  sparse checkout: inhibit empty worktree
  Add tests for sparse checkout
  read-tree: add --no-sparse-checkout to disable sparse checkout support
  unpack-trees(): ignore worktree check outside checkout area
  unpack_trees(): apply $GIT_DIR/info/sparse-checkout to the final index
  unpack-trees(): "enable" sparse checkout and load $GIT_DIR/info/sparse-checkout
  unpack-trees.c: generalize verify_* functions
  unpack-trees(): add CE_WT_REMOVE to remove on worktree alone
  Introduce "sparse checkout"
  dir.c: export excluded_1() and add_excludes_from_file_1()
  excluded_1(): support exclude files in index
  unpack-trees(): carry skip-worktree bit over in merged_entry()
  Read .gitignore from index if it is skip-worktree
  Avoid writing to buffer in add_excludes_from_file_1()
  ...

Conflicts:
	.gitignore
	Documentation/config.txt
	Documentation/git-update-index.txt
	Makefile
	entry.c
	t/t7002-grep.sh
2010-01-13 11:58:34 -08:00
Junio C Hamano
c5034673fd Merge branch 'maint-1.6.1' into maint-1.6.2
* maint-1.6.1:
  base85: Make the code more obvious instead of explaining the non-obvious
  base85: encode_85() does not use the decode table
  base85 debug code: Fix length byte calculation
  checkout -m: do not try to fall back to --merge from an unborn branch
  branch: die explicitly why when calling "git branch [-a|-r] branchname".
  textconv: stop leaking file descriptors
  commit: --cleanup is a message option
  git count-objects: handle packs bigger than 4G
  t7102: make the test fail if one of its check fails

Conflicts:
	diff.c
2010-01-10 00:49:47 -08:00
Junio C Hamano
730f72840c unpack-trees.c: look ahead in the index
This makes the traversal of index be in sync with the tree traversal.
When unpack_callback() is fed a set of tree entries from trees, it
inspects the name of the entry and checks if the an index entry with
the same name could be hiding behind the current index entry, and

 (1) if the name appears in the index as a leaf node, it is also
     fed to the n_way_merge() callback function;

 (2) if the name is a directory in the index, i.e. there are entries in
     that are underneath it, then nothing is fed to the n_way_merge()
     callback function;

 (3) otherwise, if the name comes before the first eligible entry in the
     index, the index entry is first unpacked alone.

When traverse_trees_recursive() descends into a subdirectory, the
cache_bottom pointer is moved to walk index entries within that directory.

All of these are omitted for diff-index, which does not even want to be
fed an index entry and a tree entry with D/F conflicts.

This fixes 3-way read-tree and exposes a bug in other parts of the system
in t6035, test #5.  The test prepares these three trees:

 O = HEAD^
    100644 blob e69de29bb2    a/b-2/c/d
    100644 blob e69de29bb2    a/b/c/d
    100644 blob e69de29bb2    a/x

 A = HEAD
    100644 blob e69de29bb2    a/b-2/c/d
    100644 blob e69de29bb2    a/b/c/d
    100644 blob 587be6b4c3f93f93c489c0111bba5596147a26cb    a/x

 B = master
    120000 blob a36b77384451ea1de7bd340ffca868249626bc52    a/b
    100644 blob e69de29bb2    a/b-2/c/d
    100644 blob e69de29bb2    a/x

With a clean index that matches HEAD, running

    git read-tree -m -u --aggressive $O $A $B

now yields

    120000 a36b77384451ea1de7bd340ffca868249626bc52 3       a/b
    100644 e69de29bb2 0       a/b-2/c/d
    100644 e69de29bb2 1       a/b/c/d
    100644 e69de29bb2 2       a/b/c/d
    100644 587be6b4c3f93f93c489c0111bba5596147a26cb 0       a/x

which is correct.  "master" created "a/b" symlink that did not exist,
and removed "a/b/c/d" while HEAD did not do touch either path.

Before this series, read-tree did not notice the situation and resolved
addition of "a/b" and removal of "a/b/c/d" independently.  If A = HEAD had
another path "a/b/c/e" added, this merge should conflict but instead it
silently resolved "a/b" and then immediately overwrote it to add
"a/b/c/e", which was quite bogus.

Tests in t1012 start to work with this.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-07 15:00:14 -08:00
Jeff King
7ed7fac45a diff: run external diff helper with shell
This is mostly to make it more consistent with the rest of
git, which uses the shell to exec helpers.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-05 23:41:51 -08:00
Jeff King
41a457e4f8 textconv: use shell to run helper
Currently textconv helpers are run directly. Running through
the shell is useful because the user can provide a program
with command line arguments, like "antiword -f".

It also makes textconv more consistent with other parts of
git, most of which run their helpers using the shell.

The downside is that textconv helpers with shell
metacharacters (like space) in the filename will be broken.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-05 23:41:51 -08:00
Junio C Hamano
9e7ad090fa Merge branch 'maint'
* maint:
  textconv: stop leaking file descriptors
  commit: --cleanup is a message option
  git count-objects: handle packs bigger than 4G
  t7102: make the test fail if one of its check fails
  Documentation: always respect core.worktree if set
2009-12-30 01:25:21 -08:00
Junio C Hamano
b0b3a241e2 Merge branch 'maint-1.6.1' into maint
* maint-1.6.1:
  textconv: stop leaking file descriptors
  commit: --cleanup is a message option
  git count-objects: handle packs bigger than 4G
  t7102: make the test fail if one of its check fails

Conflicts:
	builtin-commit.c
	diff.c
2009-12-30 01:24:12 -08:00
Jeff King
70d7099916 textconv: stop leaking file descriptors
We read the output from textconv helpers over a pipe, but we
never actually closed our end of the pipe after using it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-12-30 01:22:27 -08:00
Junio C Hamano
648f407017 Merge branch 'gb/1.7.0-diff-whitespace-only-output'
* gb/1.7.0-diff-whitespace-only-output:
  No diff -b/-w output for all-whitespace changes
2009-12-26 14:03:18 -08:00
Junio C Hamano
3cc3fb7df6 Merge branch 'jc/1.7.0-diff-whitespace-only-status'
* jc/1.7.0-diff-whitespace-only-status:
  diff.c: fix typoes in comments
  Make test case number unique
  diff: Rename QUIET internal option to QUICK
  diff: change semantics of "ignore whitespace" options

Conflicts:
	diff.h
2009-12-26 14:03:18 -08:00
Junio C Hamano
a1bb8f45f1 Merge branch 'maint' to sync with 1.6.5.7
* maint:
  Git 1.6.5.7
  worktree: don't segfault with an absolute pathspec without a work tree
  ignore unknown color configuration
  help.autocorrect: do not run a command if the command given is junk
  Illustrate "filter" attribute with an example
2009-12-16 12:47:15 -08:00
Jeff King
8b8e862490 ignore unknown color configuration
When parsing the config file, if there is a value that is
syntactically correct but unused, we generally ignore it.
This lets non-core porcelains store arbitrary information in
the config file, and it means that configuration files can
be shared between new and old versions of git (the old
versions might simply ignore certain configuration).

The one exception to this is color configuration; if we
encounter a color.{diff,branch,status}.$slot variable, we
die if it is not one of the recognized slots (presumably as
a safety valve for user misconfiguration). This behavior
has existed since 801235c (diff --color: use
$GIT_DIR/config, 2006-06-24), but hasn't yet caused a
problem. No porcelain has wanted to store extra colors, and
we once a color area (like color.diff) has been introduced,
we've never changed the set of color slots.

However, that changed recently with the addition of
color.diff.func. Now a user with color.diff.func in their
config can no longer freely switch between v1.6.6 and older
versions; the old versions will complain about the existence
of the variable.

This patch loosens the check to match the rest of
git-config; unknown color slots are simply ignored. This
doesn't fix this particular problem, as the older version
(without this patch) is the problem, but it at least
prevents it from happening again in the future.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-12-16 12:45:16 -08:00
Bert Wesarg
89cb73a19a Give the hunk comment its own color
Inspired by the coloring of quilt.

Introduce a separate color and paint the hunk comment part, i.e. the name
of the function, in a separate color "diff.func" (defaults to plain).

Whitespace between hunk header and hunk comment is printed in plain color.

Signed-off-by: Bert Wesarg <bert.wesarg@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-11-28 10:05:44 -08:00
Junio C Hamano
06a4755270 emit_line(): don't emit an empty <SET><RESET> followed by a newline
When emit_line() is called with an empty line (but non-zero length, as we
send line terminating LF or CRLF to the function), it used to emit
<SET><RESET> followed by a newline.  Stop the wastefulness.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-11-27 22:33:53 -08:00
Greg Bacon
3e97c7c6af No diff -b/-w output for all-whitespace changes
Change git-diff's whitespace-ignoring modes to generate
output only if a non-empty patch results, which git-apply
rejects.

Update the tests to look for the new behavior.

Signed-off-by: Greg Bacon <gbacon@dbresearch.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-11-20 22:00:36 -08:00
Junio C Hamano
d404a3e1a5 Merge branch 'js/maint-diff-color-words' into maint
* js/maint-diff-color-words:
  diff --color-words: bit of clean-up
  diff --color-words -U0: fix the location of hunk headers
  t4034-diff-words: add a test for word diff without context

Conflicts:
	diff.c
2009-11-16 00:01:56 -08:00
Junio C Hamano
6dbdba00ea Merge branch 'jc/maint-blank-at-eof' into maint
* jc/maint-blank-at-eof:
  diff -B: colour whitespace errors
  diff.c: emit_add_line() takes only the rest of the line
  diff.c: split emit_line() from the first char and the rest of the line
  diff.c: shuffling code around
  diff --whitespace: fix blank lines at end
  core.whitespace: split trailing-space into blank-at-{eol,eof}
  diff --color: color blank-at-eof
  diff --whitespace=warn/error: fix blank-at-eof check
  diff --whitespace=warn/error: obey blank-at-eof
  diff.c: the builtin_diff() deals with only two-file comparison
  apply --whitespace: warn blank but not necessarily empty lines at EOF
  apply --whitespace=warn/error: diagnose blank at EOF
  apply.c: split check_whitespace() into two
  apply --whitespace=fix: detect new blank lines at eof correctly
  apply --whitespace=fix: fix handling of blank lines at the eof
2009-11-15 23:06:34 -08:00
Junio C Hamano
002a9ec005 Merge branch 'js/maint-diff-color-words'
* js/maint-diff-color-words:
  diff --color-words: bit of clean-up
  diff --color-words -U0: fix the location of hunk headers
  t4034-diff-words: add a test for word diff without context

Conflicts:
	diff.c
2009-11-15 16:41:29 -08:00
Junio C Hamano
76fd28283f diff --color-words: bit of clean-up
When we introduced the "word diff" mode, we could have done one of three
things:

 * change fn_out_consume() to "this is called every time a line worth of
   diff becomes ready from the lower-level diff routine.  This function
   knows two sets of helpers (one for line-oriented diff, another for word
   diff), and each set has various functions to be called at certain
   places (e.g. hunk header, context, ...).  The function's role is to
   inspect the incoming line, and dispatch appropriate helpers to produce
   either line- or word- oriented diff output."

 * introduce fn_out_consume_word_diff() that is "this is called every time
   a line worth of diff becomes ready from the lower-level diff routine,
   and here is what we do to prepare word oriented diff using that line."
   without touching fn_out_consume() at all.

 * Do neither of the above, and keep fn_out_consume() to "this is called
   every time a line worth of diff becomes ready from the lower-level diff
   routine, and here is what we do to output line oriented diff using that
   line."  but sprinkle a handful of 'are we in word-diff mode?  if so do
   this totally different thing' at random places.

This patch is to at least abstract the details of "this totally different
thing" out from the main codepath, in order to improve readability.

We can later refactor it by introducing fn_out_consume_word_diff(), taking
the second route above, but that is a separate topic.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-31 14:25:15 -07:00
Junio C Hamano
d39d667169 Merge branch 'js/diff-verbose-submodule'
* js/diff-verbose-submodule:
  add tests for git diff --submodule
  Add the --submodule option to the diff option family
2009-10-30 20:16:26 -07:00
Johannes Schindelin
a4ca1465ec diff --color-words -U0: fix the location of hunk headers
Colored word diff without context lines firstly printed all the hunk
headers among each other and then printed the diff.

This was due to the code relying on getting at least one context line at
the end of each hunk, where the colored words would be flushed (it is
done that way to be able to ignore rewrapped lines).

Noticed by Markus Heidelberg.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-30 09:42:56 -07:00
Johannes Schindelin
752c0c2492 Add the --submodule option to the diff option family
When you use the option --submodule=log you can see the submodule
summaries inlined in the diff, instead of not-quite-helpful SHA-1 pairs.

The format imitates what "git submodule summary" shows.

To do that, <path>/.git/objects/ is added to the alternate object
databases (if that directory exists).

This option was requested by Jens Lehmann at the GitTogether in Berlin.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-19 22:31:00 -07:00
Junio C Hamano
9981c808b4 Merge branch 'jc/maint-blank-at-eof'
* jc/maint-blank-at-eof:
  diff -B: colour whitespace errors
  diff.c: emit_add_line() takes only the rest of the line
  diff.c: split emit_line() from the first char and the rest of the line
  diff.c: shuffling code around
  diff --whitespace: fix blank lines at end
  core.whitespace: split trailing-space into blank-at-{eol,eof}
  diff --color: color blank-at-eof
  diff --whitespace=warn/error: fix blank-at-eof check
  diff --whitespace=warn/error: obey blank-at-eof
  diff.c: the builtin_diff() deals with only two-file comparison
  apply --whitespace: warn blank but not necessarily empty lines at EOF
  apply --whitespace=warn/error: diagnose blank at EOF
  apply.c: split check_whitespace() into two
  apply --whitespace=fix: detect new blank lines at eof correctly
  apply --whitespace=fix: fix handling of blank lines at the eof
2009-10-17 00:11:03 -07:00
Felipe Contreras
2775d92c53 diff.c: stylefix
Essentially; s/type* /type */ as per the coding guidelines.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-10-11 21:54:44 -07:00
Junio C Hamano
d91ba8fa88 Merge branch 'jc/maint-1.6.0-blank-at-eof' into jc/maint-blank-at-eof
* jc/maint-1.6.0-blank-at-eof:
  diff -B: colour whitespace errors
2009-09-15 11:21:10 -07:00
Junio C Hamano
61c6457e89 Merge branch 'jc/maint-1.6.0-blank-at-eof' (early part) into jc/maint-blank-at-eof
* 'jc/maint-1.6.0-blank-at-eof' (early part):
  diff.c: emit_add_line() takes only the rest of the line
  diff.c: split emit_line() from the first char and the rest of the line
2009-09-15 11:20:46 -07:00
Junio C Hamano
bb35fefbc9 Merge branch 'jc/maint-1.6.0-blank-at-eof' (early part) into jc/maint-blank-at-eof
* 'jc/maint-1.6.0-blank-at-eof' (early part):
  diff.c: shuffling code around
2009-09-15 03:38:30 -07:00
Junio C Hamano
afd9db4173 Merge branch 'jc/maint-1.6.0-blank-at-eof' (early part) into jc/maint-blank-at-eof
* 'jc/maint-1.6.0-blank-at-eof' (early part):
  diff --whitespace: fix blank lines at end
  core.whitespace: split trailing-space into blank-at-{eol,eof}
  diff --color: color blank-at-eof
  diff --whitespace=warn/error: fix blank-at-eof check
  diff --whitespace=warn/error: obey blank-at-eof
  diff.c: the builtin_diff() deals with only two-file comparison
  apply --whitespace: warn blank but not necessarily empty lines at EOF
  apply --whitespace=warn/error: diagnose blank at EOF
  apply.c: split check_whitespace() into two
  apply --whitespace=fix: detect new blank lines at eof correctly
  apply --whitespace=fix: fix handling of blank lines at the eof
2009-09-15 03:28:08 -07:00
Junio C Hamano
7f7ee2ff2d diff -B: colour whitespace errors
We used to send the old and new contents more or less straight out to the
output with only the original "old is red, new is green" colouring.  Now
all the necessary support routines have been prepared, call them with a
line of data at a time from the output code and have them check and color
whitespace errors in exactly the same way as they are called from the low
level diff callback routines.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-15 02:41:03 -07:00
Junio C Hamano
018cff7046 diff.c: emit_add_line() takes only the rest of the line
As the first character on the line that is fed to this function is always
"+", it is pointless to send that along with the rest of the line.

This change will make it easier to reuse the logic when emitting the
rewrite diff, as we do not want to copy a line only to add "+"/"-"/" "
immediately before its first character when we produce rewrite diff
output.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-15 02:41:02 -07:00
Junio C Hamano
250f79930d diff.c: split emit_line() from the first char and the rest of the line
A new helper function emit_line_0() takes the first line of diff output
(typically "-", " ", or "+") separately from the remainder of the line.
No other functional changes.

This change will make it easier to reuse the logic when emitting the
rewrite diff, as we do not want to copy a line only to add "+"/"-"/" "
immediately before its first character when we produce rewrite diff
output.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-15 02:40:51 -07:00
Junio C Hamano
6957eb9a39 diff.c: shuffling code around
Move function, type, and structure definitions for fill_mmfile(),
count_trailing_blank(), check_blank_at_eof(), emit_line(),
new_blank_line_at_eof(), emit_add_line(), sane_truncate_fn, and
emit_callback up in the file, so that they can be refactored into helper
functions and reused by codepath for emitting rewrite patches.

This only moves the lines around to make the next two patches easier to
read.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-14 22:18:49 -07:00
Junio C Hamano
d68fe26f3e diff --whitespace: fix blank lines at end
The earlier logic tried to colour any and all blank lines that were added
beyond the last blank line in the original, but this was very wrong.  If
you added 96 blank lines, a non-blank line, and then 3 blank lines at the
end, only the last 3 lines should trigger the error, not the earlier 96
blank lines.

We need to also make sure that the lines are after the last non-blank line
in the postimage as well before deciding to paint them.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-14 22:18:36 -07:00
Junio C Hamano
690ed84363 diff --color: color blank-at-eof
Since the coloring logic processed the patch output one line at a time, we
couldn't easily color code the new blank lines at the end of file.

Reuse the adds_blank_at_eof() function to find where the runs of such
blank lines start, keep track of the line number in the preimage while
processing the patch output one line at a time, and paint the new blank
lines that appear after that line to implement this.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-04 11:50:27 -07:00
Junio C Hamano
467babf8d0 diff --whitespace=warn/error: fix blank-at-eof check
The "diff --check" logic used to share the same issue as the one fixed for
"git apply" earlier in this series, in that a patch that adds new blank
lines at end could appear as

    @@ -l,5 +m,7 @@$
    _context$
    _context$
    -deleted$
    +$
    +$
    +$
    _$
    _$

where _ stands for SP and $ shows a end-of-line.  Instead of looking at
each line in the patch in the callback, simply count the blank lines from
the end in two versions, and notice the presence of new ones.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-04 11:50:27 -07:00
Junio C Hamano
5b5061efd8 diff --whitespace=warn/error: obey blank-at-eof
The "diff --check" code used to conflate trailing-space whitespace error
class with this, but now we have a proper separate error class, we should
check it under blank-at-eof, not trailing-space.

The whitespace error is not about _having_ blank lines at end, but about
adding _new_ blank lines.  To keep the message consistent with what is
given by "git apply", call whitespace_error_string() to generate it,
instead of using a hardcoded custom message.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-04 11:50:26 -07:00
Junio C Hamano
b8d9c1a66b diff.c: the builtin_diff() deals with only two-file comparison
The combined diff is implemented in combine_diff() and fn_out_consume()
codepath never has to deal with anything but two-file comparision.

Drop nparents from the emit_callback structure and simplify the code.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-09-04 11:50:26 -07:00
Brian Gianforcaro
eeefa7c90e Style fixes, add a space after if/for/while.
The majority of code in core git appears to use a single
space after if/for/while. This is an attempt to bring more
code to this standard. These are entirely cosmetic changes.

Signed-off-by: Brian Gianforcaro <b.gianfo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-31 23:26:28 -07:00
Jim Meyering
97bf2a0809 diff.c: fix typoes in comments
Should be squashed when we reroll 'next' into the main commit.
2009-08-30 14:13:01 -07:00
Nguyễn Thái Ngọc Duy
b4d1690df1 Teach Git to respect skip-worktree bit (reading part)
grep: turn on --cached for files that is marked skip-worktree
ls-files: do not check for deleted file that is marked skip-worktree
update-index: ignore update request if it's skip-worktree, while still allows removing
diff*: skip worktree version

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-08-23 17:13:32 -07:00
Junio C Hamano
90b1994170 diff: Rename QUIET internal option to QUICK
The option "QUIET" primarily meant "find if we have _any_ difference as
quick as possible and report", which means we often do not even have to
look at blobs if we know the trees are different by looking at the higher
level (e.g. "diff-tree A B").  As a side effect, because there is no point
showing one change that we happened to have found first, it also enables
NO_OUTPUT and EXIT_WITH_STATUS options, making the end result look quiet.

Rename the internal option to QUICK to reflect this better; it also makes
grepping the source tree much easier, as there are other kinds of QUIET
option everywhere.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-29 10:22:39 -07:00
Junio C Hamano
f245194f9a diff: change semantics of "ignore whitespace" options
Traditionally, the --ignore-whitespace* options have merely meant to tell
the diff output routine that some class of differences are not worth
showing in the textual diff output, so that the end user has easier time
to review the remaining (presumably more meaningful) changes.  These
options never affected the outcome of the command, given as the exit
status when the --exit-code option was in effect (either directly or
indirectly).

When you have only whitespace changes, however, you might expect

	git diff -b --exit-code

to report that there is _no_ change with zero exit status.

Change the semantics of --ignore-whitespace* options to mean more than
"omit showing the difference in text".

The exit status, when --exit-code is in effect, is computed by checking if
we found any differences at the path level, while diff frontends feed
filepairs to the diffcore engine.  When "ignore whitespace" options are in
effect, we defer this determination until the very end of diffcore
transformation.  We simply do not know until the textual diff is
generated, which comes very late in the pipeline.

When --quiet is in effect, various diff frontends optimize by breaking out
early from the loop that enumerates the filepairs, when we find the first
path level difference; when --ignore-whitespace* is used the above change
automatically disables this optimization.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-29 10:22:39 -07:00
Junio C Hamano
128a9d86da Merge branch 'rs/grep-p'
* rs/grep-p:
  grep: simplify -p output
  grep -p: support user defined regular expressions
  grep: add option -p/--show-function
  grep: handle pre context lines on demand
  grep: print context hunk marks between files
  grep: move context hunk mark handling into show_line()
  userdiff: add xdiff_clear_find_func()
2009-07-09 00:59:58 -07:00
Junio C Hamano
dd787c19c4 Merge branch 'tr/die_errno'
* tr/die_errno:
  Use die_errno() instead of die() when checking syscalls
  Convert existing die(..., strerror(errno)) to die_errno()
  die_errno(): double % in strerror() output just in case
  Introduce die_errno() that appends strerror(errno) to die()
2009-07-06 09:39:46 -07:00
René Scharfe
8cfe5f1cd5 userdiff: add xdiff_clear_find_func()
xdiff_set_find_func() is used to set user defined regular expressions
for finding function signatures.  Add xdiff_clear_find_func(), which
frees the memory allocated by the former, making the API complete.

Also, use the new function in diff.c (the only call site of
xdiff_set_find_func()) to clean up after ourselves.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-01 19:16:37 -07:00
Thomas Rast
0721c314a5 Use die_errno() instead of die() when checking syscalls
Lots of die() calls did not actually report the kind of error, which
can leave the user confused as to the real problem.  Use die_errno()
where we check a system/library call that sets errno on failure, or
one of the following that wrap such calls:

  Function              Passes on error from
  --------              --------------------
  odb_pack_keep         open
  read_ancestry         fopen
  read_in_full          xread
  strbuf_read           xread
  strbuf_read_file      open or strbuf_read_file
  strbuf_readlink       readlink
  write_in_full         xwrite

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-27 11:14:53 -07:00
Thomas Rast
d824cbba02 Convert existing die(..., strerror(errno)) to die_errno()
Change calls to die(..., strerror(errno)) to use the new die_errno().

In the process, also make slight style adjustments: at least state
_something_ about the function that failed (instead of just printing
the pathname), and put paths in single quotes.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-27 11:14:53 -07:00
Junio C Hamano
f4f78e668d Merge branch 'maint'
* maint:
  diff.c: plug a memory leak in an error path
  fetch-pack: close output channel after sideband demultiplexer terminates
  builtin-remote: Make "remote show" display all urls
2009-06-09 00:29:36 -07:00
Johannes Sixt
802f9c9cb2 diff.c: plug a memory leak in an error path
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-06-08 21:19:26 -07:00
David Aguilar
003b33a8ad diff: generate pretty filenames in prep_temp_blob()
Naturally, prep_temp_blob() did not care about filenames.
As a result, GIT_EXTERNAL_DIFF and textconv generated
filenames such as ".diff_XXXXXX".

This modifies prep_temp_blob() to generate user-friendly
filenames when creating temporary files.

Diffing "name.ext" now generates "XXXXXX_name.ext".

Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-31 17:57:59 -07:00
Junio C Hamano
2c5942dbae Merge branch 'ar/unlink-err' into maint
* ar/unlink-err:
  print unlink(2) errno in copy_or_link_directory
  replace direct calls to unlink(2) with unlink_or_warn
  Introduce an unlink(2) wrapper which gives warning if unlink failed
2009-05-25 19:01:50 -07:00
Jeff King
3cd7388d57 convert bare readlink to strbuf_readlink
This particular readlink call never NUL-terminated its
result, making it a potential source of bugs (though there
is no bug now, as it currently always respects the length
field). Let's just switch it to strbuf_readlink which is
shorter and less error-prone.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-25 11:34:02 -07:00
Junio C Hamano
36587681b4 Merge branch 'ar/unlink-err'
* ar/unlink-err:
  print unlink(2) errno in copy_or_link_directory
  replace direct calls to unlink(2) with unlink_or_warn
  Introduce an unlink(2) wrapper which gives warning if unlink failed
2009-05-18 09:01:06 -07:00
Junio C Hamano
c16cea7345 Merge branch 'mh/diff-stat-color'
* mh/diff-stat-color:
  diff: do not color --stat output like patch context
2009-05-18 08:59:54 -07:00
Junio C Hamano
e89c6ea998 Merge branch 'jc/maint-1.6.0-keep-pack' into maint-1.6.1
* jc/maint-1.6.0-keep-pack:
  pack-objects: don't loosen objects available in alternate or kept packs
  t7700: demonstrate repack flaw which may loosen objects unnecessarily
  Remove --kept-pack-only option and associated infrastructure
  pack-objects: only repack or loosen objects residing in "local" packs
  git-repack.sh: don't use --kept-pack-only option to pack-objects
  t7700-repack: add two new tests demonstrating repacking flaws
  is_kept_pack(): final clean-up
  Simplify is_kept_pack()
  Consolidate ignore_packed logic more
  has_sha1_kept_pack(): take "struct rev_info"
  has_sha1_pack(): refactor "pretend these packs do not exist" interface
  git-repack: resist stray environment variable
2009-05-03 15:01:31 -07:00
Junio C Hamano
3f3e2c26fa Merge branch 'jc/maint-1.6.0-diff-borrow-carefully' into maint-1.6.1
* jc/maint-1.6.0-diff-borrow-carefully:
  diff --cached: do not borrow from a work tree when a path is marked as assume-unchanged
2009-05-03 15:01:26 -07:00
Felipe Contreras
4b25d091ba Fix a bunch of pointer declarations (codestyle)
Essentially; s/type* /type */ as per the coding guidelines.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-05-01 15:17:31 -07:00
Alex Riesen
691f1a28bf replace direct calls to unlink(2) with unlink_or_warn
This helps to notice when something's going wrong, especially on
systems which lock open files.

I used the following criteria when selecting the code for replacement:
- it was already printing a warning for the unlink failures
- it is in a function which already printing something or is
  called from such a function
- it is in a static function, returning void and the function is only
  called from a builtin main function (cmd_)
- it is in a function which handles emergency exit (signal handlers)
- it is in a function which is obvously cleaning up the lockfiles

Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-04-29 18:37:41 -07:00
Markus Heidelberg
a408e0e649 diff: do not color --stat output like patch context
The diffstat used the color.diff.plain slot (context text) for coloring
filenames and the whole summary line. This didn't look nice and the
affected text isn't patch context at all.

Signed-off-by: Markus Heidelberg <markus.heidelberg@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-04-26 01:37:47 -07:00
Linus Torvalds
cced5fbc24 Allow users to un-configure rename detection
I told people on the kernel mailing list to please use "-M" when sending
me rename patches, so that I can see what they do while reading email
rather than having to apply the patch and then look at the end result.

I also told them that if they want to make it the default, they can just
add

	[diff]
		renames

to their ~/.gitconfig file. And while I was thinking about that, I wanted
to also check whether you can then mark individual projects to _not_ have
that default in the per-repository .git/config file.

And you can't. Currently you cannot have a global "enable renames by
default" and then a local ".. but not for _this_ project". Why? Because if
somebody writes

	[diff]
		renames = no

we simply ignore it, rather than resetting "diff_detect_rename_default"
back to zero.

Fixed thusly.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-04-11 22:22:00 -07:00
Junio C Hamano
197cf8d59c Merge branch 'jc/maint-1.6.0-diff-borrow-carefully' into maint
* jc/maint-1.6.0-diff-borrow-carefully:
  diff --cached: do not borrow from a work tree when a path is marked as assume-unchanged
2009-04-08 23:23:17 -07:00
Junio C Hamano
c3067cbfb3 Merge branch 'jc/maint-1.6.0-keep-pack' into maint
* jc/maint-1.6.0-keep-pack:
  pack-objects: don't loosen objects available in alternate or kept packs
  t7700: demonstrate repack flaw which may loosen objects unnecessarily
  Remove --kept-pack-only option and associated infrastructure
  pack-objects: only repack or loosen objects residing in "local" packs
  git-repack.sh: don't use --kept-pack-only option to pack-objects
  t7700-repack: add two new tests demonstrating repacking flaws
  is_kept_pack(): final clean-up
  Simplify is_kept_pack()
  Consolidate ignore_packed logic more
  has_sha1_kept_pack(): take "struct rev_info"
  has_sha1_pack(): refactor "pretend these packs do not exist" interface
  git-repack: resist stray environment variable

Conflicts:
	t/t7700-repack.sh
2009-04-08 23:21:10 -07:00
Junio C Hamano
aa72a14a7f Merge branch 'jc/maint-1.6.0-diff-borrow-carefully'
* jc/maint-1.6.0-diff-borrow-carefully:
  diff --cached: do not borrow from a work tree when a path is marked as assume-unchanged
2009-03-28 00:42:31 -07:00
Junio C Hamano
b71fdc590d Merge branch 'js/maint-diff-temp-smudge'
* js/maint-diff-temp-smudge:
  Smudge the files fed to external diff and textconv
2009-03-26 00:27:30 -07:00
Junio C Hamano
150115aded diff --cached: do not borrow from a work tree when a path is marked as assume-unchanged
When the index says that the file in the work tree that corresponds to the
blob object that is used for comparison is known to be unchanged, "diff"
reads from the file and applies convert_to_git(), instead of inflating the
object, to feed the internal diff engine with, because an earlier
benchnark found that it tends to be faster to use this optimization.

However, the index can lie when the path is marked as assume-unchanged.
Disable the optimization for such paths.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-22 15:26:07 -07:00
Johannes Schindelin
4e218f54b3 Smudge the files fed to external diff and textconv
When preparing temporary files for an external diff or textconv, it is
easier on the external tools, especially when they are implemented using
platform tools, if they are fed the input after convert_to_working_tree().

This fixes msysGit issue 177.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-22 15:03:59 -07:00
Junio C Hamano
aec813062b Merge branch 'jc/maint-1.6.0-keep-pack'
* jc/maint-1.6.0-keep-pack:
  is_kept_pack(): final clean-up
  Simplify is_kept_pack()
  Consolidate ignore_packed logic more
  has_sha1_kept_pack(): take "struct rev_info"
  has_sha1_pack(): refactor "pretend these packs do not exist" interface
  git-repack: resist stray environment variable
2009-03-11 13:49:56 -07:00
Benjamin Kramer
eb3a9dd327 Remove unused function scope local variables
These variables were unused and can be removed safely:

  builtin-clone.c::cmd_clone(): use_local_hardlinks, use_separate_remote
  builtin-fetch-pack.c::find_common(): len
  builtin-remote.c::mv(): symref
  diff.c::show_stats():show_stats(): total
  diffcore-break.c::should_break(): base_size
  fast-import.c::validate_raw_date(): date, sign
  fsck.c::fsck_tree(): o_sha1, sha1
  xdiff-interface.c::parse_num(): read_some

Signed-off-by: Benjamin Kramer <benny.kra@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-07 20:52:17 -08:00
Junio C Hamano
4a2caf6912 Merge branch 'al/ansi-color'
* al/ansi-color:
  builtin-branch.c: Rename branch category color names
  Clean up use of ANSI color sequences
2009-03-05 15:41:19 -08:00
Keith Cascio
628d5c2b70 Use DIFF_XDL_SET/DIFF_OPT_SET instead of raw bit-masking
Signed-off-by: Keith Cascio <keith@cs.ucla.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-03-04 00:56:51 -08:00
Junio C Hamano
cd673c1f17 has_sha1_pack(): refactor "pretend these packs do not exist" interface
Most of the callers of this function except only one pass NULL to its last
parameter, ignore_packed.

Introduce has_sha1_kept_pack() function that has the function signature
and the semantics of this function, and convert the sole caller that does
not pass NULL to call this new function.

All other callers and has_sha1_pack() lose the ignore_packed parameter.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-28 01:06:06 -08:00
Keith Cascio
4b15b4ab5f Remove redundant bit clears from diff_setup()
All bits already clear after memset(0).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-13 18:19:37 -08:00
Arjen Laarhoven
dc6ebd4cc5 Clean up use of ANSI color sequences
Remove the literal ANSI escape sequences and replace them by readable
constants.

Signed-off-by: Arjen Laarhoven <arjen@yaph.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-13 17:27:58 -08:00
Nazri Ramliy
a8344abe0f Bugfix: GIT_EXTERNAL_DIFF with more than one changed files
When there is more than one file that are changed, running git diff with
GIT_EXTERNAL_DIFF incorrectly diagnoses an programming error and dies.
The check introduced in 479b0ae (diff: refactor tempfile cleanup handling,
2009-01-22) to detect a temporary file slot that forgot to remove its
temporary file was inconsistent with the way the codepath to remove the
temporary to mark the slot that it is done with it.

This patch fixes this problem and adds a test case for it.

Signed-off-by: Nazri Ramliy <ayiehere@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-02-12 12:31:52 -08:00
Junio C Hamano
bdf6442b48 Merge branch 'jc/maint-split-diff-metainfo'
* jc/maint-split-diff-metainfo:
  diff.c: output correct index lines for a split diff
2009-01-31 18:07:42 -08:00
Junio C Hamano
fa5bc8abb3 Merge branch 'jk/signal-cleanup'
* jk/signal-cleanup:
  t0005: use SIGTERM for sigchain test
  pager: do wait_for_pager on signal death
  refactor signal handling for cleanup functions
  chain kill signals for cleanup functions
  diff: refactor tempfile cleanup handling
  Windows: Fix signal numbers
2009-01-31 17:43:56 -08:00
Junio C Hamano
90b23e5f21 Merge branch 'jc/maint-1.6.0-split-diff-metainfo' into jc/maint-split-diff-metainfo
This is an evil merge, as a test added since 1.6.0 expects an incorrect
behaviour the merged commit fixes.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-27 01:08:02 -08:00
Junio C Hamano
b67b9612e1 diff.c: output correct index lines for a split diff
A patch that changes the filetype (e.g. regular file to symlink) of a path
must be split into a deletion event followed by a creation event, which
means that we need to have two independent metainfo lines for each.
However, the code reused the single set of metainfo lines.

As the blob object names recorded on the index lines are usually not used
nor validated on the receiving end, this is not an issue with normal use
of the resulting patch.  However, when accepting a binary patch to delete
a blob, git-apply verified that the postimage blob object name on the
index line is 0{40}, hence a patch that deletes a regular file blob that
records binary contents to create a blob with different filetype (e.g. a
symbolic link) failed to apply.  "git am -3" also uses the blob object
names recorded on the index line, so it would also misbehave when
synthesizing a preimage tree.

This moves the code to generate metainfo lines around, so that two
independent sets of metainfo lines are used for the split halves.

Additional tests by Jeff King.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-27 00:48:00 -08:00
Junio C Hamano
9847a52432 Merge branch 'js/diff-color-words'
* js/diff-color-words:
  Change the spelling of "wordregex".
  color-words: Support diff.wordregex config option
  color-words: make regex configurable via attributes
  color-words: expand docs with precise semantics
  color-words: enable REG_NEWLINE to help user
  color-words: take an optional regular expression describing words
  color-words: change algorithm to allow for 0-character word boundaries
  color-words: refactor word splitting and use ALLOC_GROW()
  Add color_fwrite_lines(), a function coloring each line individually
2009-01-25 17:13:29 -08:00
Junio C Hamano
5dc1308562 Merge branch 'js/patience-diff'
* js/patience-diff:
  bash completions: Add the --patience option
  Introduce the diff option '--patience'
  Implement the patience diff algorithm

Conflicts:
	contrib/completion/git-completion.bash
2009-01-23 21:51:38 -08:00
Jeff King
57b235a4bc refactor signal handling for cleanup functions
The current code is very inconsistent about which signals
are caught for doing cleanup of temporary files and lock
files. Some callsites checked only SIGINT, while others
checked a variety of death-dealing signals.

This patch factors out those signals to a single function,
and then calls it everywhere. For some sites, that means
this is a simple clean up. For others, it is an improvement
in that they will now properly clean themselves up after a
larger variety of signals.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-21 22:46:53 -08:00
Jeff King
4a16d07272 chain kill signals for cleanup functions
If a piece of code wanted to do some cleanup before exiting
(e.g., cleaning up a lockfile or a tempfile), our usual
strategy was to install a signal handler that did something
like this:

  do_cleanup(); /* actual work */
  signal(signo, SIG_DFL); /* restore previous behavior */
  raise(signo); /* deliver signal, killing ourselves */

For a single handler, this works fine. However, if we want
to clean up two _different_ things, we run into a problem.
The most recently installed handler will run, but when it
removes itself as a handler, it doesn't put back the first
handler.

This patch introduces sigchain, a tiny library for handling
a stack of signal handlers. You sigchain_push each handler,
and use sigchain_pop to restore whoever was before you in
the stack.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-21 22:46:52 -08:00
Jeff King
479b0ae81c diff: refactor tempfile cleanup handling
There are two pieces of code that create tempfiles for diff:
run_external_diff and run_textconv. The former cleans up its
tempfiles in the face of premature death (i.e., by die() or
by signal), but the latter does not. After this patch, they
will both use the same cleanup routines.

To make clear what the change is, let me first explain what
happens now:

  - run_external_diff uses a static global array of 2
    diff_tempfile structs (since it knows it will always
    need exactly 2 tempfiles). It calls prepare_temp_file
    (which doesn't know anything about the global array) on
    each of the structs, creating the tempfiles that need to
    be cleaned up. It then registers atexit and signal
    handlers to look through the global array and remove the
    tempfiles. If it succeeds, it calls the handler manually
    (which marks the tempfile structs as unused).

  - textconv has its own tempfile struct, which it allocates
    using prepare_temp_file and cleans up manually. No
    signal or atexit handlers.

The new code moves the installation of cleanup handlers into
the prepare_temp_file function. Which means that that
function now has to understand that there is static tempfile
storage. So what happens now is:

  - run_external_diff calls prepare_temp_file
  - prepare_temp_file calls claim_diff_tempfile, which
    allocates an unused slot from our global array
  - prepare_temp_file installs (if they have not already
    been installed) atexit and signal handlers for cleanup
  - prepare_temp_file sets up the tempfile as usual
  - prepare_temp_file returns a pointer to the allocated
    tempfile

The advantage being that run_external_diff no longer has to
care about setting up cleanup handlers. Now by virtue of
calling prepare_temp_file, run_textconv gets the same
benefit, as will any future users of prepare_temp_file.

There are also a few side benefits to the specific
implementation:

  - we now install cleanup handlers _before_ allocating the
    tempfile, closing a race which could leave temp cruft

  - when allocating a slot in the global array, we will now
    detect a situation where the old slots were not properly
    vacated (i.e., somebody forgot to call remove upon
    leaving the function). In the old code, such a situation
    would silently overwrite the tempfile names, meaning we
    would forget to clean them up. The new code dies with a
    bug warning.

  - we make sure only to install the signal handler once.
    This isn't a big deal, since we are just overwriting the
    old handler, but will become an issue when a later patch
    converts the code to use sigchain

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-21 22:46:52 -08:00
Junio C Hamano
f873dd5ac2 Merge branch 'maint'
* maint:
  Rename diff.suppress-blank-empty to diff.suppressBlankEmpty
2009-01-21 01:08:10 -08:00
Boyd Stephen Smith Jr
98a4d87b87 color-words: Support diff.wordregex config option
When diff is invoked with --color-words (w/o =regex), use the regular
expression the user has configured as diff.wordregex.

diff drivers configured via attributes take precedence over the
diff.wordregex-words setting.  If the user wants to change them, they have
their own configuration variables.

Signed-off-by: Boyd Stephen Smith Jr <bss@iguanasuicide.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-21 00:51:12 -08:00
Johannes Schindelin
950db8798d Rename diff.suppress-blank-empty to diff.suppressBlankEmpty
All the other config variables use CamelCase.  This config variable should
not be an exception.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-21 00:17:40 -08:00
Thomas Rast
80c49c3de2 color-words: make regex configurable via attributes
Make the --color-words splitting regular expression configurable via
the diff driver's 'wordregex' attribute.  The user can then set the
driver on a file in .gitattributes.  If a regex is given on the
command line, it overrides the driver's setting.

We also provide built-in regexes for the languages that already had
funcname patterns, and add an appropriate diff driver entry for C/++.
(The patterns are designed to run UTF-8 sequences into a single chunk
to make sure they remain readable.)

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-17 10:44:21 -08:00
Thomas Rast
bf82940dbf color-words: enable REG_NEWLINE to help user
We silently truncate a match at the newline, which may lead to
unexpected behaviour, e.g., when matching "<[^>]*>" against

  <foo
  bar>

since then "<foo" becomes a word (and "bar>" doesn't!) even though the
regex said only angle-bracket-delimited things can be words.

To alleviate the problem slightly, use REG_NEWLINE so that negated
classes can't match a newline.  Of course newlines can still be
matched explicitly.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-17 10:43:24 -08:00
Johannes Schindelin
2b6a5417d7 color-words: take an optional regular expression describing words
In some applications, words are not delimited by white space.  To
allow for that, you can specify a regular expression describing
what makes a word with

	git diff --color-words='[A-Za-z0-9]+'

Note that words cannot contain newline characters.

As suggested by Thomas Rast, the words are the exact matches of the
regular expression.

Note that a regular expression beginning with a '^' will match only
a word at the beginning of the hunk, not a word at the beginning of
a line, and is probably not what you want.

This commit contains a quoting fix by Thomas Rast.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-17 10:43:08 -08:00
Johannes Schindelin
2e5d2003b2 color-words: change algorithm to allow for 0-character word boundaries
Up until now, the color-words code assumed that word boundaries are
identical to white space characters.

Therefore, it could get away with a very simple scheme: it copied the
hunks, substituted newlines for each white space character, called
libxdiff with the processed text, and then identified the text to
output by the offsets (which agreed since the original text had the
same length).

This code was ugly, for a number of reasons:

- it was impossible to introduce 0-character word boundaries,

- we had to print everything word by word, and

- the code needed extra special handling of newlines in the removed part.

Fix all of these issues by processing the text such that

- we build word lists, separated by newlines,

- we remember the original offsets for every word, and

- after calling libxdiff on the wordlists, we parse the hunk headers, and
  find the corresponding offsets, and then

- we print the removed/added parts in one go.

The pre and post samples in the test were provided by Santi Béjar.

Note that there is some strange special handling of hunk headers where
one line range is 0 due to POSIX: in this case, the start is one too
low.  In other words a hunk header '@@ -1,0 +2 @@' actually means that
the line must be added after the _second_ line of the pre text, _not_
the first.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-17 10:42:41 -08:00
Johannes Schindelin
23c1575f74 color-words: refactor word splitting and use ALLOC_GROW()
Word splitting is now performed by the function diff_words_fill(),
avoiding having the same code twice.

In the same spirit, avoid duplicating the code of ALLOC_GROW().

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-17 10:42:19 -08:00
Johannes Schindelin
34292bddb8 Introduce the diff option '--patience'
This commit teaches Git to produce diff output using the patience diff
algorithm with the diff option '--patience'.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-07 13:37:07 -08:00
Junio C Hamano
7bb5321be0 Merge branch 'rs/diff-ihc'
* rs/diff-ihc:
  diff: add option to show context between close hunks

Conflicts:
	Documentation/diff-options.txt
2009-01-07 00:10:14 -08:00
Alexander Potashev
d75307084d remove trailing LF in die() messages
LF at the end of format strings given to die() is redundant because
die already adds one on its own.

Signed-off-by: Alexander Potashev <aspotashev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-01-05 13:01:01 -08:00
René Scharfe
6d0e674a57 diff: add option to show context between close hunks
Merge two hunks if there is only the specified number of otherwise unshown
context between them.  For --inter-hunk-context=1, the resulting patch has
the same number of lines but shows uninterrupted context instead of a
context header line in between.

Patches generated with this option are easier to read but are also more
likely to conflict if the file to be patched contains other changes.

This patch keeps the default for this option at 0.  It is intended to just
make the feature available in order to see its advantages and downsides.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-12-29 01:05:21 -08:00
René Scharfe
0956a6db7a Fix type-mismatch compiler warning from diff_populate_filespec()
The type of the size member of filespec is ulong, while strbuf_detach expects
a size_t pointer.  This patch should fix the warning:

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-12-18 09:58:40 -08:00
Linus Torvalds
dfab6aaecf Make 'prepare_temp_file()' ignore st_size for symlinks
The code was already set up to not really need it, so this just massages
it a bit to remove the use entirely.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-12-17 13:36:34 -08:00
Linus Torvalds
cf219d8c68 Make 'diff_populate_filespec()' use the new 'strbuf_readlink()'
This makes all tests pass on a system where 'lstat()' has been hacked to
return bogus data in st_size for symlinks.

Of course, the test coverage isn't complete, but it's a good baseline.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-12-17 13:36:34 -08:00
Jeff King
3aa1f7ca37 diff: respect textconv in rewrite diffs
Currently we just skip rewrite diffs for binary files; this
patch makes an exception for files which will be textconv'd,
and actually performs the textconv before generating the
diff.

Conceptually, rewrite diffs should be in the exact same
format as the a non-rewrite diff, except that we refuse to
share any context. Thus it makes very little sense for "git
diff" to show a textconv'd diff, but for "git diff -B" to
show "Binary files differ".

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-12-09 22:28:55 -08:00
Jeff King
0c01857df5 diff: fix handling of binary rewrite diffs
The current emit_rewrite_diff code always writes a text patch without
checking whether the content is binary. This means that if you end up with
a rewrite diff for a binary file, you get lots of raw binary goo in your
patch.

Instead, if we have binary files, then let's just skip emit_rewrite_diff
altogether. We will already have shown the "dissimilarity index" line, so
it is really about the diff contents. If binary diffs are turned off, the
"Binary files a/file and b/file differ" message should be the same in
either case. If we do have binary patches turned on, there isn't much
point in making a less-efficient binary patch that does a total rewrite;
no human is going to read it, and since binary patches don't apply with
any fuzz anyway, the result of application should be the same.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-12-09 22:14:25 -08:00
Jeff King
e10ea8126c diff: allow turning on textconv explicitly for plumbing
Some history viewers use the diff plumbing to generate diffs
rather than going through the "git diff" porcelain.
Currently, there is no way for them to specify that they
would like to see the text-converted version of the diff.

This patch adds a "--textconv" option to allow such a
plumbing user to allow text conversion.  The user can then
tell the viewer whether or not they would like text
conversion enabled.

While it may be tempting add a configuration option rather
than requiring each plumbing user to be configured to pass
--textconv, that is somewhat dangerous. Text-converted diffs
generally cannot be applied directly, so each plumbing user
should "opt in" to generating such a diff, either by
explicit request of the user or by confirming that their
output will not be fed to patch.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-12-07 19:59:32 -08:00
Junio C Hamano
72b6157aa8 Merge branch 'jk/diff-convfilter'
* jk/diff-convfilter:
  enable textconv for diff in verbose status/commit
  wt-status: load diff ui config
  only textconv regular files
  userdiff: require explicitly allowing textconv
  refactor userdiff textconv code

Conflicts:
	t/t4030-diff-textconv.sh
2008-11-12 21:50:58 -08:00
Junio C Hamano
459d60084f Merge branch 'jk/diff-convfilter-test-fix'
* jk/diff-convfilter-test-fix:
  Avoid using non-portable `echo -n` in tests.
  add userdiff textconv tests
  document the diff driver textconv feature
  diff: add missing static declaration

Conflicts:
	Documentation/gitattributes.txt
2008-11-12 21:50:41 -08:00
Junio C Hamano
1e2bba92d2 Merge branch 'rs/blame'
* rs/blame:
  blame: use xdi_diff_hunks(), get rid of struct patch
  add xdi_diff_hunks() for callers that only need hunk lengths
  Allow alternate "low-level" emit function from xdl_diff
  Always initialize xpparam_t to 0
  blame: inline get_patch()
2008-11-08 16:05:39 -08:00
Jeff King
2675773af8 only textconv regular files
We treat symlinks as text containing the results of the
symlink, so it doesn't make much sense to text-convert them.

Similarly gitlink components just end up as the text
"Subproject commit $sha1", which we should leave intact.

Note that a typechange may be broken into two parts: the
removal of the old part and the addition of the new. In that
case, we _do_ show the textconv for any part which is the
addition or removal of a file we would ordinarily textconv,
since it is purely acting on the file contents.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-10-26 14:09:48 -07:00
Jeff King
c7534ef4a1 userdiff: require explicitly allowing textconv
Diffs that have been produced with textconv almost certainly
cannot be applied, so we want to be careful not to generate
them in things like format-patch.

This introduces a new diff options, ALLOW_TEXTCONV, which
controls this behavior. It is off by default, but is
explicitly turned on for the "log" family of commands, as
well as the "diff" porcelain (but not diff-* plumbing).

Because both text conversion and external diffing are
controlled by these diff options, we can get rid of the
"plumbing versus porcelain" distinction when reading the
config. This was an attempt to control the same thing, but
suffered from being too coarse-grained.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-10-26 14:09:48 -07:00
Jeff King
04427ac848 refactor userdiff textconv code
The original implementation of textconv put the conversion
into fill_mmfile. This was a bad idea for a number of
reasons:

 - it made the semantics of fill_mmfile unclear. In some
   cases, it was allocating data (if a text conversion
   occurred), and in some cases not (if we could use the
   data directly from the filespec). But the caller had
   no idea which had happened, and so didn't know whether
   the memory should be freed

 - similarly, the caller had no idea if a text conversion
   had occurred, and so didn't know whether the contents
   should be treated as binary or not. This meant that we
   incorrectly guessed that text-converted content was
   binary and didn't actually show it (unless the user
   overrode us with "diff.foo.binary = false", which then
   created problems in plumbing where the text conversion
   did _not_ occur)

 - not all callers of fill_mmfile want the text contents. In
   particular, we don't really want diffstat, whitespace
   checks, patch id generation, etc, to look at the
   converted contents.

This patch pulls the conversion code directly into
builtin_diff, so that we only see the conversion when
generating an actual patch. We also then know whether we are
doing a conversion, so we can check the binary-ness and free
the data from the mmfile appropriately (the previous version
leaked quite badly when text conversion was used)

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-10-26 14:09:48 -07:00
Jeff King
72cf484140 diff: add missing static declaration
This function isn't used outside of diff.c; the 'static' was
simply overlooked in the original writing.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-10-26 14:09:47 -07:00
Brian Downing
9ccd0a88ac Always initialize xpparam_t to 0
We're going to be adding some parameters to this, so we can't have
any uninitialized data in it.

Signed-off-by: Brian Downing <bdowning@lavos.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-10-25 12:09:31 -07:00
Jeff King
9cb92c390c diff: add filter for converting binary to text
When diffing binary files, it is sometimes nice to see the
differences of a canonical text form rather than either a
binary patch or simply "binary files differ."

Until now, the only option for doing this was to define an
external diff command to perform the diff. This was a lot of
work, since the external command needed to take care of
doing the diff itself (including mode changes), and lost the
benefit of git's colorization and other options.

This patch adds a text conversion option, which converts a
file to its canonical format before performing the diff.
This is less flexible than an arbitrary external diff, but
is much less work to set up. For example:

  $ echo '*.jpg diff=exif' >>.gitattributes
  $ git config diff.exif.textconv exiftool
  $ git config diff.exif.binary false

allows one to see jpg diffs represented by the text output
of exiftool.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-10-18 08:02:55 -07:00
Jeff King
122aa6f9c0 diff: introduce diff.<driver>.binary
The "diff" gitattribute is somewhat overloaded right now. It
can say one of three things:

  1. this file is definitely binary, or definitely not
     (i.e., diff or !diff)
  2. this file should use an external diff engine (i.e.,
     diff=foo, diff.foo.command = custom-script)
  3. this file should use particular funcname patterns
     (i.e., diff=foo, diff.foo.(x?)funcname = some-regex)

Most of the time, there is no conflict between these uses,
since using one implies that the other is irrelevant (e.g.,
an external diff engine will decide for itself whether the
file is binary).

However, there is at least one conflicting situation: there
is no way to say "use the regular rules to determine whether
this file is binary, but if we do diff it textually, use
this funcname pattern." That is, currently setting diff=foo
indicates that the file is definitely text.

This patch introduces a "binary" config option for a diff
driver, so that one can explicitly set diff.foo.binary. We
default this value to "don't know". That is, setting a diff
attribute to "foo" and using "diff.foo.funcname" will have
no effect on the binaryness of a file. To get the current
behavior, one can set diff.foo.binary to true.

This patch also has one additional advantage: it cleans up
the interface to the userdiff code a bit. Before, calling
code had to know more about whether attributes were false,
true, or unset to determine binaryness. Now that binaryness
is a property of a driver, we can represent these situations
just by passing back a driver struct.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-10-18 08:02:26 -07:00
Jeff King
be58e70dba diff: unify external diff and funcname parsing code
Both sets of code assume that one specifies a diff profile
as a gitattribute via the "diff=foo" attribute. They then
pull information about that profile from the config as
diff.foo.*.

The code for each is currently completely separate from the
other, which has several disadvantages:

  - there is duplication as we maintain code to create and
    search the separate lists of external drivers and
    funcname patterns

  - it is difficult to add new profile options, since it is
    unclear where they should go

  - the code is difficult to follow, as we rely on the
    "check if this file is binary" code to find the funcname
    pattern as a side effect. This is the first step in
    refactoring the binary-checking code.

This patch factors out these diff profiles into "userdiff"
drivers. A file with "diff=foo" uses the "foo" driver, which
is specified by a single struct.

Note that one major difference between the two pieces of
code is that the funcname patterns are always loaded,
whereas external drivers are loaded only for the "git diff"
porcelain; the new code takes care to retain that situation.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-10-18 08:02:21 -07:00
Brandon Casey
f285a2d7ed Replace calls to strbuf_init(&foo, 0) with STRBUF_INIT initializer
Many call sites use strbuf_init(&foo, 0) to initialize local
strbuf variable "foo" which has not been accessed since its
declaration. These can be replaced with a static initialization
using the STRBUF_INIT macro which is just as readable, saves a
function call, and takes up fewer lines.

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-10-12 12:36:19 -07:00
Jonathan del Strother
5d1e958e24 Teach git diff about Objective-C syntax
Add support for recognition of Objective-C class & instance methods,
C functions, and class implementation/interfaces.

Signed-off-by: Jonathan del Strother <jon.delStrother@bestbefore.tv>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-10-06 09:02:47 -07:00
Shawn O. Pearce
276328ffb8 Merge branch 'maint'
* maint:
  Update release notes for 1.6.0.3
  Teach rebase -i to honor pre-rebase hook
  docs: describe pre-rebase hook
  do not segfault if make_cache_entry failed
  make prefix_path() never return NULL
  fix bogus "diff --git" header from "diff --no-index"
  Fix fetch/clone --quiet when stdout is connected
  builtin-blame: Fix blame -C -C with submodules.
  bash: remove fetch, push, pull dashed form leftovers

Conflicts:
	diff.c
2008-10-06 08:56:07 -07:00
Linus Torvalds
71b989e7dd fix bogus "diff --git" header from "diff --no-index"
When "git diff --no-index" is given an absolute pathname, it
would generate a diff header with the absolute path
prepended by the prefix, like:

  diff --git a/dev/null b/foo

Not only is this nonsensical, and not only does it violate
the description of diffs given in git-diff(1), but it would
produce broken binary diffs. Unlike text diffs, the binary
diffs don't contain the filenames anywhere else, and so "git
apply" relies on this header to figure out the filename.

This patch just refuses to use an invalid name for anything
visible in the diff.

Now, this fixes the "git diff --no-index --binary a
/dev/null" kind of case (and we'll end up using "a" as the
basename), but some other insane cases are impossible to
handle. If you do

	git diff --no-index --binary a /bin/echo

you'll still get a patch like

	diff --git a/a b/bin/echo
	old mode 100644
	new mode 100755
	index ...

and "git apply" will refuse to apply it for a couple of
reasons, and the diff is simply bogus.

And that, btw, is no longer a bug, I think. It's impossible
to know whethe the user meant for the patch to be a rename
or not. And as such, refusing to apply it because you don't
know what name you should use is probably _exactly_ the
right thing to do!

Original problem reported by Imre Deak. Test script and problem
description by Jeff King.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-10-06 00:29:28 -07:00
Nicolas Pitre
9126f0091f fix openssl headers conflicting with custom SHA1 implementations
On ARM I have the following compilation errors:

    CC fast-import.o
In file included from cache.h:8,
                 from builtin.h:6,
                 from fast-import.c:142:
arm/sha1.h:14: error: conflicting types for 'SHA_CTX'
/usr/include/openssl/sha.h:105: error: previous declaration of 'SHA_CTX' was here
arm/sha1.h:16: error: conflicting types for 'SHA1_Init'
/usr/include/openssl/sha.h:115: error: previous declaration of 'SHA1_Init' was here
arm/sha1.h:17: error: conflicting types for 'SHA1_Update'
/usr/include/openssl/sha.h:116: error: previous declaration of 'SHA1_Update' was here
arm/sha1.h:18: error: conflicting types for 'SHA1_Final'
/usr/include/openssl/sha.h:117: error: previous declaration of 'SHA1_Final' was here
make: *** [fast-import.o] Error 1

This is because openssl header files are always included in
git-compat-util.h since commit 684ec6c63c whenever NO_OPENSSL is not
set, which somehow brings in <openssl/sha1.h> clashing with the custom
ARM version.  Compilation of git is probably broken on PPC too for the
same reason.

Turns out that the only file requiring openssl/ssl.h and openssl/err.h
is imap-send.c.  But only moving those problematic includes there
doesn't solve the issue as it also includes cache.h which brings in the
conflicting local SHA1 header file.

As suggested by Jeff King, the best solution is to rename our references
to SHA1 functions and structure to something git specific, and define those
according to the implementation used.

Signed-off-by: Nicolas Pitre <nico@cam.org>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-10-02 18:06:56 -07:00
Brandon Casey
416f80a60b diff.c: remove duplicate bibtex pattern introduced by merge 92bb9785
Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
2008-09-30 13:49:07 -07:00
Shawn O. Pearce
9800c0df41 Merge branch 'bc/master-diff-hunk-header-fix'
* bc/master-diff-hunk-header-fix:
  Clarify commit error message for unmerged files
  Use strchrnul() instead of strchr() plus manual workaround
  Use remove_path from dir.c instead of own implementation
  Add remove_path: a function to remove as much as possible of a path
  git-submodule: Fix "Unable to checkout" for the initial 'update'
  Clarify how the user can satisfy stash's 'dirty state' check.
  t4018-diff-funcname: test syntax of builtin xfuncname patterns
  t4018-diff-funcname: test syntax of builtin xfuncname patterns
  make "git remote" report multiple URLs
  diff hunk pattern: fix misconverted "\{" tex macro introducers
  diff: fix "multiple regexp" semantics to find hunk header comment
  diff: use extended regexp to find hunk headers
  diff: use extended regexp to find hunk headers
  diff.*.xfuncname which uses "extended" regex's for hunk header selection
  diff.c: associate a flag with each pattern and use it for compiling regex
  diff.c: return pattern entry pointer rather than just the hunk header pattern

Conflicts:
	builtin-merge-recursive.c
	t/t7201-co.sh
	xdiff-interface.h
2008-09-29 11:04:20 -07:00
Shawn O. Pearce
5a139ba483 Merge branch 'maint' into bc/master-diff-hunk-header-fix
* maint: (41 commits)
  Clarify commit error message for unmerged files
  Use strchrnul() instead of strchr() plus manual workaround
  Use remove_path from dir.c instead of own implementation
  Add remove_path: a function to remove as much as possible of a path
  git-submodule: Fix "Unable to checkout" for the initial 'update'
  Clarify how the user can satisfy stash's 'dirty state' check.
  Remove empty directories in recursive merge
  Documentation: clarify the details of overriding LESS via core.pager
  Update release notes for 1.6.0.3
  checkout: Do not show local changes when in quiet mode
  for-each-ref: Fix --format=%(subject) for log message without newlines
  git-stash.sh: don't default to refs/stash if invalid ref supplied
  maint: check return of split_cmdline to avoid bad config strings
  builtin-prune.c: prune temporary packs in <object_dir>/pack directory
  Do not perform cross-directory renames when creating packs
  Use dashless git commands in setgitperms.perl
  git-remote: do not use user input in a printf format string
  make "git remote" report multiple URLs
  Start draft release notes for 1.6.0.3
  git-repack uses --no-repack-object, not --no-repack-delta.
  ...

Conflicts:
	RelNotes
2008-09-29 10:52:34 -07:00
Shawn O. Pearce
edb7e82f72 Merge branch 'bc/maint-diff-hunk-header-fix' into maint
* bc/maint-diff-hunk-header-fix:
  t4018-diff-funcname: test syntax of builtin xfuncname patterns
  diff hunk pattern: fix misconverted "\{" tex macro introducers
  diff: use extended regexp to find hunk headers
  diff.*.xfuncname which uses "extended" regex's for hunk header selection
  diff.c: associate a flag with each pattern and use it for compiling regex
  diff.c: return pattern entry pointer rather than just the hunk header pattern

Conflicts:
	Documentation/gitattributes.txt
2008-09-29 10:23:19 -07:00
Shawn O. Pearce
de81343562 Merge branch 'ho/dirstat-by-file'
* ho/dirstat-by-file:
  diff --dirstat-by-file: count changed files, not lines
2008-09-25 08:41:42 -07:00
Brandon Casey
fdac6692a0 t4018-diff-funcname: test syntax of builtin xfuncname patterns
[jc: fixes bibtex pattern breakage exposed by this test]

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-09-23 01:48:49 -07:00
Junio C Hamano
92bb978541 Merge branch 'bc/maint-diff-hunk-header-fix' into bc/master-diff-hunk-header-fix
* bc/maint-diff-hunk-header-fix:
  diff hunk pattern: fix misconverted "\{" tex macro introducers

Conflicts:
	diff.c
2008-09-20 18:37:47 -07:00
Junio C Hamano
96d1a8e9d4 diff hunk pattern: fix misconverted "\{" tex macro introducers
Pointed out by Brandon Casey.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-09-20 15:30:17 -07:00
Junio C Hamano
3d8dccd74a diff: fix "multiple regexp" semantics to find hunk header comment
When multiple regular expressions are concatenated with "\n", they were
traditionally AND'ed together, and only a line that matches _all_ of them
is taken as a match.  This however is unwieldy when multiple regexp
feature is used to specify alternatives.

This fixes the semantics to take the first match.  A nagative pattern, if
matches, makes the line to fail as before.  A match with a positive
pattern will be the final match, and what it captures in $1 is used as the
hunk header comment.

We could write alternatives using "|" in ERE, but the machinery can only
use captured $1 as the hunk header comment (or $0 if there is no match in
$1), so you cannot write:

    "junk ( A | B ) | garbage ( C | D )"

and expect both "junk" and "garbage" to get stripped with the existing
code.  With this fix, you can write it as:

    "junk ( A | B ) \n garbage ( C | D )"

and the way capture works would match the user expectation more
naturally.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-09-20 00:52:11 -07:00
Junio C Hamano
1883a0d3b7 diff: use extended regexp to find hunk headers
Using ERE elements such as "|" (alternation) by backquoting in BRE
is a GNU extension and should not be done in portable programs.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-09-19 23:52:49 -07:00
Junio C Hamano
cd0843198f Merge branch 'bc/maint-diff-hunk-header-fix' into bc/master-diff-hunk-header-fix
* bc/maint-diff-hunk-header-fix:
  diff: use extended regexp to find hunk headers

Conflicts:
	diff.c
2008-09-19 23:51:01 -07:00
Junio C Hamano
6a6baf9b4e diff: use extended regexp to find hunk headers
Using ERE elements such as "|" (alternation) by backquoting in BRE
is a GNU extension and should not be done in portable programs.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-09-19 23:45:04 -07:00
Junio C Hamano
dde4af4313 Merge branch 'bc/maint-diff-hunk-header-fix' into bc/master-diff-hunk-header-fix
* bc/maint-diff-hunk-header-fix:
  diff.*.xfuncname which uses "extended" regex's for hunk header selection
  diff.c: associate a flag with each pattern and use it for compiling regex
  diff.c: return pattern entry pointer rather than just the hunk header pattern
  Cosmetical command name fix
  Start conforming code to "git subcmd" style part 3
  t9700/test.pl: remove File::Temp requirement
  t9700/test.pl: avoid bareword 'STDERR' in 3-argument open()
  GIT 1.6.0.2
  Fix some manual typos.
  Use compatibility regex library also on FreeBSD
  Use compatibility regex library also on AIX
  Update draft release notes for 1.6.0.2
  Use compatibility regex library for OSX/Darwin
  git-svn: Fixes my() parameter list syntax error in pre-5.8 Perl
  Git.pm: Use File::Temp->tempfile instead of ->new
  t7501: always use test_cmp instead of diff
  Start conforming code to "git subcmd" style part 2
  diff: Help "less" hide ^M from the output
  checkout: do not check out unmerged higher stages randomly

Conflicts:
	Documentation/git.txt
	Documentation/gitattributes.txt
	Makefile
	diff.c
	t/t7201-co.sh
2008-09-18 20:32:50 -07:00
Junio C Hamano
e69a6f47c4 Merge branch 'jc/diff-prefix'
* jc/diff-prefix:
  diff: vary default prefix depending on what are compared
2008-09-18 20:30:07 -07:00
Brandon Casey
45d9414fa5 diff.*.xfuncname which uses "extended" regex's for hunk header selection
Currently, the hunk headers produced by 'diff -p' are customizable by
setting the diff.*.funcname option in the config file. The 'funcname' option
takes a basic regular expression. This functionality was designed using the
GNU regex library which, by default, allows using backslashed versions of
some extended regular expression operators, even in Basic Regular Expression
mode. For example, the following characters, when backslashed, are
interpreted according to the extended regular expression rules: ?, +, and |.
As such, the builtin funcname patterns were created using some extended
regular expression operators.

Other platforms which adhere more strictly to the POSIX spec do not
interpret the backslashed extended RE operators in Basic Regular Expression
mode. This causes the pattern matching for the builtin funcname patterns to
fail on those platforms.

Introduce a new option 'xfuncname' which uses extended regular expressions,
and advertise it _instead_ of funcname. Since most users are on GNU
platforms, the majority of funcname patterns are created and tested there.
Advertising only xfuncname should help to avoid the creation of non-portable
patterns which work with GNU regex but not elsewhere.

Additionally, the extended regular expressions may be less ugly and
complicated compared to the basic RE since many common special operators do
not need to be backslashed.

For example, the GNU Basic RE:

    ^[ 	]*\\(\\(public\\|static\\).*\\)$

becomes the following Extended RE:

    ^[ 	]*((public|static).*)$

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-09-18 20:06:31 -07:00
Brandon Casey
a013585b20 diff.c: associate a flag with each pattern and use it for compiling regex
This is in preparation for allowing extended regular expression patterns.

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-09-18 20:06:23 -07:00
Brandon Casey
45e7ca0f0e diff.c: return pattern entry pointer rather than just the hunk header pattern
This is in preparation for associating a flag with each pattern which will
control how the pattern is interpreted. For example, as a basic or extended
regular expression.

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-09-18 19:58:29 -07:00
Junio C Hamano
01409bbf75 Merge branch 'jc/maint-diff-quiet' into maint
* jc/maint-diff-quiet:
  diff --quiet: make it synonym to --exit-code >/dev/null
  diff Porcelain: do not disable auto index refreshing on -C -C
2008-09-18 19:53:12 -07:00
Junio C Hamano
8e909f80b4 Merge branch 'jc/maint-diff-quiet'
* jc/maint-diff-quiet:
  diff --quiet: make it synonym to --exit-code >/dev/null
  diff Porcelain: do not disable auto index refreshing on -C -C
2008-09-16 00:48:16 -07:00
Junio C Hamano
26c10c7ad3 Merge branch 'jc/maint-hide-cr-in-diff-from-less' into maint
* jc/maint-hide-cr-in-diff-from-less:
  diff: Help "less" hide ^M from the output
2008-09-10 02:14:18 -07:00
Junio C Hamano
fdfb4cfadc Merge branch 'jc/hide-cr-in-diff-from-less'
* jc/hide-cr-in-diff-from-less:
  diff: Help "less" hide ^M from the output
2008-09-07 23:45:40 -07:00
Andreas Ericsson
af9ce1ffc6 Teach "git diff -p" to locate PHP class methods
Otherwise it will always print the class-name rather
than the name of the function inside that class.

While we're at it, reorder the gitattributes manpage to
list the built-in funcname pattern names in alphabetical
order.

Signed-off-by: Andreas Ericsson <ae@op5.se>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-09-07 15:22:24 -07:00
Junio C Hamano
df58a8274d diff --quiet: make it synonym to --exit-code >/dev/null
The point of --quiet was to return the status as early as possible without
doing any extra processing.  Well behaved scripts, when they expect to run
many diff operations inside, are supposed to run "update-index --refresh"
upfront; we do not want them to pay the price of iterating over the index
and comparing the contents to fix the stat dirtiness, and we avoided most
of the processing in diffcore_std() when --quiet is in effect.

But scripts that adhere to the good practice won't have to pay any more
price than the necessary lstat(2) that will report stat cleanliness, as
long as only -q is given without any fancier diff options.

More importantly, users who do ask for "--quiet -M --filter=D" (in order
to notice only the deletion, not paths that disappeared only because they
have been renamed away) deserve to get the result they asked for, even it
means they have to pay the extra price; the alternative is to get a cheap
early return that gives a result they did not ask for, which is much
worse.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-09-06 19:15:04 -07:00
Junio C Hamano
9d865356ab diff Porcelain: do not disable auto index refreshing on -C -C
When we enabled the automatic refreshing of the index to "diff" Porcelain,
we disabled it when --find-copies-harder was asked, but there is no good
reason to do so.  In the following command sequence, the first "diff"
shows an "empty" diff exposing stat dirtyness, while the second one does
not.

    $ >foo
    $ git add foo
    $ touch foo
    $ git diff -C -C
    $ git diff -C

This fixes the inconsistency.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-09-06 19:09:16 -07:00
Heikki Orsila
fd33777b78 diff --dirstat-by-file: count changed files, not lines
This new option --dirstat-by-file is the same as --dirstat, but it
counts "impacted files" instead of "impacted lines" (lines that are
added or removed).

Signed-off-by: Heikki Orsila <heikki.orsila@iki.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-09-05 14:04:26 -07:00
Junio C Hamano
80d12c23de Merge branch 'jc/maint-log-grep'
* jc/maint-log-grep:
  log --author/--committer: really match only with name part
  diff --cumulative is a sub-option of --dirstat
  bash completion: Hide more plumbing commands
2008-09-04 22:30:44 -07:00
Junio C Hamano
f88d225feb diff --cumulative is a sub-option of --dirstat
The option used to be implemented as if it is a totally independent one,
but "git diff --cumulative" would not mean anything without "--dirstat".

This makes --cumulative imply --dirstat.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-09-03 22:37:03 -07:00
Junio C Hamano
a5a818ee48 diff: vary default prefix depending on what are compared
With a new configuration "diff.mnemonicprefix", "git diff" shows the
differences between various combinations of preimage and postimage trees
with prefixes different from the standard "a/" and "b/".  Hopefully this
will make the distinction stand out for some people.

    "git diff" compares the (i)ndex and the (w)ork tree;
    "git diff HEAD" compares a (c)ommit and the (w)ork tree;
    "git diff --cached" compares a (c)ommit and the (i)ndex;
    "git-diff HEAD:file1 file2" compares an (o)bject and a (w)ork tree entity;
    "git diff --no-index a b" compares two non-git things (1) and (2).

Because these mnemonics now have meanings, they are swapped when reverse
diff is in effect and this feature is enabled.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-30 20:53:24 -07:00
Junio C Hamano
3928097020 diff: Help "less" hide ^M from the output
When the tracked contents have CRLF line endings, colored diff output
shows "^M" at the end of output lines, which is distracting, even though
the pager we use by default ("less") knows to hide them.

The problem is that "less" hides a carriage-return only at the end of the
line, immediately before a line feed.  The colored diff output does not
take this into account, and emits four element sequence for each line:

   - force this color;
   - the line up to but not including the terminating line feed;
   - reset color
   - line feed.

By including the carriage return at the end of the line in the second
item, we are breaking the smart our pager has in order not to show "^M".
This can be fixed by changing the sequence to:

   - force this color;
   - the line up to but not including the terminating end-of-line;
   - reset color
   - end-of-line.

where end-of-line is either a single linefeed or a CRLF pair.  When the
output is not colored, "force this color" and "reset color" sequences are
both empty, so we won't have this problem with or without this patch.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-30 20:34:45 -07:00
Junio C Hamano
445cac18c0 Merge branch 'maint'
* maint:
  tutorial: gentler illustration of Alice/Bob workflow using gitk
  pretty=format: respect date format options
  make git-shell paranoid about closed stdin/stdout/stderr
  Document gitk --argscmd flag.
  Fix '--dirstat' with cross-directory renaming
  for-each-ref: Allow a trailing slash in the patterns
2008-08-29 00:16:39 -07:00
Linus Torvalds
441bca0bbc Fix '--dirstat' with cross-directory renaming
The dirstat code depends on the fact that we always generate diffs with
the names sorted, since it then just does a single-pass walk-over of the
sorted list of names and how many changes there were. The sorting means
that all files are nicely grouped by directory.

That all works fine.

Except when we have rename detection, and suddenly the nicely sorted list
of pathnames isn't all that sorted at all. And now the single-pass dirstat
walk gets all confused, and you can get results like this:

  [torvalds@nehalem linux]$ git diff --dirstat=2 -M v2.6.27-rc4..v2.6.27-rc5
     3.0% arch/powerpc/configs/
     6.8% arch/arm/configs/
     2.7% arch/powerpc/configs/
     4.2% arch/arm/configs/
     5.6% arch/powerpc/configs/
     8.4% arch/arm/configs/
     5.5% arch/powerpc/configs/
    23.3% arch/arm/configs/
     8.6% arch/powerpc/configs/
     4.0% arch/
     4.4% drivers/usb/musb/
     4.0% drivers/watchdog/
     7.6% drivers/
     3.5% fs/

The trivial fix is to add a sorting pass, fixing it to:

  [torvalds@nehalem linux]$ git diff --dirstat=2 -M v2.6.27-rc4..v2.6.27-rc5
    43.0% arch/arm/configs/
    25.5% arch/powerpc/configs/
     5.3% arch/
     4.4% drivers/usb/musb/
     4.0% drivers/watchdog/
     7.6% drivers/
     3.5% fs/

Spot the difference. In case anybody wonders: it's because of a ton of
renames from {include/asm-blackfin => arch/blackfin/include/asm} that just
totally messed up the file ordering in between arch/arm and arch/powerpc.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-29 00:14:29 -07:00
Johan Herland
1a1fcf4abe Teach "git diff -p" HTML funcname patterns
Find lines with <h1>..<h6> tags.

[jc: while at it, reordered entries to sort alphabetically.]

Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-20 23:56:31 -07:00
Kirill Smelkov
7c17205b64 Teach "git diff -p" Python funcname patterns
Find classes, functions, and methods definitions.

Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-20 23:53:33 -07:00
Junio C Hamano
e28a8670a6 Merge branch 'maint'
* maint:
  Update draft release notes for 1.6.0.1
  Add hints to revert documentation about other ways to undo changes
  Install templates with the user and group of the installing personality
  "git-merge": allow fast-forwarding in a stat-dirty tree
  completion: find out supported merge strategies correctly
  decorate: allow const objects to be decorated
  for-each-ref: cope with tags with incomplete lines
  diff --check: do not get confused by new blank lines in the middle
  remote.c: remove useless if-before-free test
  mailinfo: avoid violating strbuf assertion
  git format-patch: avoid underrun when format.headers is empty or all NLs
2008-08-20 16:18:16 -07:00
Junio C Hamano
c35539eb10 diff --check: do not get confused by new blank lines in the middle
The code remembered that the last diff output it saw was an empty line,
and tried to reset that state whenever it sees a context line, a non-blank
new line, or a new hunk.  However, this codepath asks the underlying diff
engine to feed diff without any context, and the "just saw an empty line"
state was not reset if you added a new blank line in the last hunk of your
patch, even if it is not the last line of the file.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-20 13:28:19 -07:00
Junio C Hamano
3814c07498 Merge branch 'bd/diff-strbuf'
* bd/diff-strbuf:
  xdiff-interface: hide the whole "xdiff_emit_state" business from the caller
  Use strbuf for struct xdiff_emit_state's remainder
  Make xdi_diff_outf interface for running xdiff_outf diffs
2008-08-19 21:43:40 -07:00
Jim Meyering
a624eaa782 add boolean diff.suppress-blank-empty config option
GNU diff's --suppress-blank-empty option makes it so that diff no
longer outputs trailing white space unless the input data has it.
With this option, empty context lines are now empty also in diff -u output.
Before, they would have a single trailing space.

 * diff.c (diff_suppress_blank_empty): New global.
   (git_diff_basic_config): Set it.
   (fn_out_consume): Honor it.
 * t/t4029-diff-trailing-space.sh: New file.
 * Documentation/config.txt: Document it.

Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-19 18:09:37 -07:00
Junio C Hamano
8a3f524bf2 xdiff-interface: hide the whole "xdiff_emit_state" business from the caller
This further enhances xdi_diff_outf() interface so that it takes two
common parameters: the callback function that processes one line at a
time, and a pointer to its application specific callback data structure.
xdi_diff_outf() creates its own "xdiff_emit_state" structure and stashes
these two away inside it, which is used by the lowest level output
function in the xdiff_outf() callchain, consume_one(), to call back to the
application layer.  With this restructuring, we lift the requirement that
the caller supplied callback data structure embeds xdiff_emit_state
structure as its first member.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-14 00:30:26 -07:00
Brian Downing
c99db9d292 Make xdi_diff_outf interface for running xdiff_outf diffs
To prepare for the need to initialize and release resources for an
xdi_diff with the xdiff_outf output function, make a new function to
wrap this usage.

Old:

	ecb.outf = xdiff_outf;
	ecb.priv = &state;
	...
	xdi_diff(file_p, file_o, &xpp, &xecfg, &ecb);

New:

	xdi_diff_outf(file_p, file_o, &state.xm, &xpp, &xecfg, &ecb);

Signed-off-by: Brian Downing <bdowning@lavos.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-13 23:10:23 -07:00
Gustaf Hendeby
23b5beb28f Teach git diff about BibTeX head hunk patterns
All BibTeX entries starts with an @ followed by an entry type.  Since
there are many entry types and own can be defined, the pattern matches
legal entry type names instead of just the default types (which would
be a long list).  The pattern also matches strings and comments since
they will also be useful to position oneself in a bib-file.

Signed-off-by: Gustaf Hendeby <hendeby@isy.liu.se>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-12 15:43:55 -07:00
Junio C Hamano
04c6e9e9ca diff --check: do not unconditionally complain about trailing empty lines
Recently "git diff --check" learned to detect new trailing blank lines
just like "git apply --whitespace" does.  However this check should not
trigger unconditionally.  This patch makes it honor the whitespace
settings from core.whitespace and gitattributes.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-11 22:15:28 -07:00
Junio C Hamano
04bb50f45d Merge branch 'maint'
* maint:
  RelNotes 1.5.6.5 updates
  diff.renamelimit is a basic diff configuration
  git-cvsimport.perl: Print "UNKNOWN LINE..." on stderr, not stdout.
  Documentation: typos / spelling fixes in older RelNotes
2008-08-05 21:21:08 -07:00
Linus Torvalds
2b6ca6df2d diff.renamelimit is a basic diff configuration
The configuration was added as a core option in 3299c6f (diff: make
default rename detection limit configurable., 2005-11-15), but 9ce392f
(Move diff.renamelimit out of default configuration., 2005-11-21)
separated diff-related stuff out of the core.

Up to that point it was Ok.

When we separated the Porcelain options out of the git_diff_config in
83ad63c (diff: do not use configuration magic at the core-level,
2006-07-08), we should have been more careful.

This mistake made diff-tree plumbing and git-show Porcelain to notice
different set of renames when the user explicitly asked for rename
detection.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-05 18:27:31 -07:00
Giuseppe Bilotta
807d869453 diff: chapter and part in funcname for tex
This patch enhances the tex funcname by adding support for
chapter and part sectioning commands. It also matches
the starred version of the sectioning commands.

Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-02 15:40:57 -07:00
Avery Pennarun
b50005b79f Teach "git diff -p" Pascal/Delphi funcname pattern
Finds classes, records, functions, procedures, and sections.  Most lines
need to start at the first column, or else there's no way to differentiate
a procedure's definition from its declaration.

Signed-off-by: Avery Pennarun <apenwarr@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-02 15:39:35 -07:00
Giuseppe Bilotta
ad8c1d9260 diff: add ruby funcname pattern
Provide a regexp that catches class, module and method definitions in
Ruby scripts, since the built-in default only finds classes.

Signed-off-by: Giuseppe Bilotta <giuseppe.bilotta@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-08-02 15:38:14 -07:00
Kevin Ballard
6b2fbaaffc format-patch: Produce better output with --inline or --attach
This patch makes two small changes to improve the output of --inline
and --attach.

The first is to write a newline preceding the boundary. This is needed because
MIME defines the encapsulation boundary as including the preceding CRLF (or in
this case, just LF), so we should be writing one. Without this, the last
newline in the pre-diff content is consumed instead.

The second change is to always write the line termination character
(default: newline) even when using --inline or --attach. This is simply to
improve the aesthetics of the resulting message. When using --inline an email
client should render the resulting message identically to the non-inline
version. And when using --attach this adds a blank line preceding the
attachment in the email, which is visually attractive.

Signed-off-by: Kevin Ballard <kevin@sb.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-29 23:18:15 -07:00
Junio C Hamano
88bbda08d7 Merge branch 'maint'
* maint:
  Start preparing 1.5.6.4 release notes
  git fetch-pack: do not complain about "no common commits" in an empty repo
  rebase-i: keep old parents when preserving merges
  t7600-merge: Use test_expect_failure to test option parsing
  Fix buffer overflow in prepare_attr_stack
  Fix buffer overflow in git diff
  Fix buffer overflow in git-grep
  git-cvsserver: fix call to nonexistant cleanupWorkDir()
  Documentation/git-cherry-pick.txt et al.: Fix misleading -n description

Conflicts:
	RelNotes
2008-07-16 17:10:28 -07:00
Dmitry Potapov
fd55a19eb1 Fix buffer overflow in git diff
If PATH_MAX on your system is smaller than a path stored, it may cause
buffer overflow and stack corruption in diff_addremove() and diff_change()
functions when running git-diff

Signed-off-by: Dmitry Potapov <dpotapov@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-16 14:03:24 -07:00
Junio C Hamano
08b51f51e6 Merge branch 'qq/maint'
* qq/maint:
  clone -q: honor "quiet" option over native transports.
  attribute documentation: keep EXAMPLE at end
  builtin-commit.c: Use 'git_config_string' to get 'commit.template'
  http.c: Use 'git_config_string' to clean up SSL config.
  diff.c: Use 'git_config_string' to get 'diff.external'
  convert.c: Use 'git_config_string' to get 'smudge' and 'clean'
  builtin-log.c: Use 'git_config_string' to get 'format.subjectprefix' and 'format.suffix'
  Documentation cvs: Clarify when a bare repository is needed
  Documentation: be precise about which date --pretty uses

Conflicts:

	Documentation/gitattributes.txt
2008-07-05 18:33:16 -07:00
Brian Hetro
daec808cc6 diff.c: Use 'git_config_string' to get 'diff.external'
Signed-off-by: Brian Hetro <whee@smaertness.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-05 17:42:34 -07:00
Junio C Hamano
d4b76e15ea Merge branch 'jc/checkdiff'
* jc/checkdiff:
  Fix t4017-diff-retval for white-space from wc
  Update sample pre-commit hook to use "diff --check"
  diff --check: detect leftover conflict markers
  Teach "diff --check" about new blank lines at end
  checkdiff: pass diff_options to the callback
  check_and_emit_line(): rename and refactor
  diff --check: explain why we do not care whether old side is binary
2008-07-01 16:22:35 -07:00
Olivier Marin
861d1af36a show_stats(): fix stats width calculation
Before this patch, name_width becomes negative or null for width values
less than 15 and name_width values greater than 25 (default: 50). This
leads to output random data.

This patch checks for minimal width and name_width values.

Signed-off-by: Olivier Marin <dkr@freesurf.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-06-28 20:55:26 -07:00
Junio C Hamano
049540435f diff --check: detect leftover conflict markers
This teaches "diff --check" to detect and complain if the change
adds lines that look like leftover conflict markers.

We should be able to remove the old Perl script used in the sample
pre-commit hook and modernize the script with this facility.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-06-26 22:07:54 -07:00