On fixup/squash-ing rebase, git will create new commit in
i18n.commitencoding, reencode the commit message to that said encode.
Signed-off-by: Doan Tran Cong Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On musl libc, ISO-2022-JP encoder is too eager to switch back to
1 byte encoding, musl's iconv always switch back after every combining
character. Comparing glibc and musl's output for this command
$ sed q t/t3900/ISO-2022-JP.txt| iconv -f ISO-2022-JP -t utf-8 |
iconv -f utf-8 -t ISO-2022-JP | xxd
glibc:
00000000: 1b24 4224 4f24 6c24 5224 5b24 551b 2842 .$B$O$l$R$[$U.(B
00000010: 0a .
musl:
00000000: 1b24 4224 4f1b 2842 1b24 4224 6c1b 2842 .$B$O.(B.$B$l.(B
00000010: 1b24 4224 521b 2842 1b24 4224 5b1b 2842 .$B$R.(B.$B$[.(B
00000020: 1b24 4224 551b 2842 0a .$B$U.(B.
Although musl iconv's output isn't optimal, it's still correct.
From commit 7d509878b8, ("pretty.c: format string with truncate respects
logOutputEncoding", 2014-05-21), we're encoding the message to utf-8
first, then format it and convert the message to the actual output
encoding on git commit --squash.
Thus, t3900::test_commit_autosquash_flags is failing on musl libc.
Reencode to utf-8 before arranging rebase's todo list.
By doing this, we also remove a breakage noticed by a test added in the
previous commit.
Signed-off-by: Doan Tran Cong Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We're using fixup!/squash! <subject> to mark if current commit will be
used to be fixed up or squashed to a previous commit.
However, if we're changing i18n.commitencoding after making the
original commit but before making the fixing up, we couldn't find the
original commit to do the fixup/squash.
Add a test to demonstrate that problem.
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Doan Tran Cong Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 89a70b80 ("t0302 & t3900: add forgotten quotes", 2018-01-03), quotes
were added to protect against spaces in $HOME. In the test_when_finished
command, two files are deleted which must be quoted individually.
[jc: with \$HOME in the test_when_finished command quoted, as
pointed out by j6t].
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When cleaning up files in the $HOME directory, it really makes sense to
quote the path, especially in Git's test suite, where the HOME directory
is *guaranteed* to contain spaces in its name.
It would appear that those two tests pass even without cleaning up the
files, but really more by pure chance than by design (the cleanup seems
not actually to be necessary).
However, if anybody would have a left-over `trash/` directory in Git's
`t/` directory, these tests would fail, because they would all of a
sudden try to delete that directory, but without the `-r` (recursive)
flag. That is how this issue was found.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark message commit_utf8_warn for translation.
Update tests to reflect changes.
Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is not like that our longer term desire is to someday start
accept log messages with NULs in them, so it is wrong to mark a test
that demonstrates "git commit" that correctly fails given such an
input as "expect-failure". "git commit" should fail today, and it
should fail the same way in the future given a message with NUL in it.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Unicode clause D14 defines all characters U+nFFFE and U+nFFFF (where
0 <= n <= 10h) as well as the range U+FDD0..U+FDEF as non-characters,
reserved for internal use only. Disallow these characters in commit
messages as they are normally not recommended for interchange.
Signed-off-by: Peter Krefting <peter@softwolves.pp.se>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit code accepts pseudo-UTF-8 sequences that encode a character with more
bytes than necessary. Reject such sequences, since they are not valid UTF-8.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit code already contains code for validating UTF-8, but it does not
check for invalid values, such as guaranteed non-characters and surrogates. Fix
this by explicitly checking for and rejecting such characters.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Prefer:
test_line_count <OP> COUNT FILE
over:
test $(wc -l <FILE) <OP> COUNT
(or similar usages) in several tests.
Signed-off-by: Stefano Lattarini <stefano.lattarini@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Current implementation sees NUL as terminator. If users give a message
with NUL byte in it (e.g. editor set to save as UTF-16), the new commit
message will have NULs. However following operations (displaying or
amending a commit for example) will not keep anything after the first NUL.
Stop user right when they do this. If NUL is added by mistake, they have
their chance to fix. Otherwise, log messages will no longer be text "git
log" and friends would grok.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The call to test_expect_success is nested inside a function, whose
arguments the test code wants to access. But it is not specified that any
unexpanded $1, $2, $3, etc in the test code will access the surrounding
function's arguments. Rather, they will access the arguments of the
function that happens to eval the test code.
In this case, the reference is intended to supply '-m message' to a call of
'git commit --squash'. Remove it because -m is optional and the test case
does not check for it. There are tests in t7500 that check combinations of
--squash and -m.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
t7500: test expected behavior of commit --squash
t3415: test interaction of commit --squash with rebase --autosquash
t3900: test commit --squash with i18n encodings
Signed-off-by: Pat Notz <patnotz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
t7500: test expected behavior of commit --fixup
t3415: test interaction of commit --fixup with rebase --autosquash
t3900: test commit --fixup with i18n encodings
Signed-off-by: Pat Notz <patnotz@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some old iconv implementations do not have many alternate names and/or
do not match character encoding names case insensitively. These
implementations can not tell that utf-8 and UTF-8 are the same encoding
and fail when trying to do the conversion. So use the old names, which
modern implementations still support.
The following conversions were performed:
utf-8 --> UTF-8
ISO-8859-1 --> ISO8859-1
EUCJP --> eucJP
Also update t9129 and t9500 which make use of the test files in t/t3900.
Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An old iconv (GNU libiconv 1.11) does not know about utf8, it does know
UTF-8 though, which is also understood by all newer iconv implementations.
Signed-off-by: Alex Riesen <raa.lkml@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When converting from other encodings (e.g. EUC-JP or UTF-8), there are
subtly different variants of ISO-2022-JP, all of which are valid. At the
end of line or when a run of string switches to 1-byte sequence, ESC ( B
can be used to switch to ASCII or ESC ( J can be used to switch to ISO
646:JP (JIS X 0201) but they essentially are the same character set and
are used interchangeably. Similarly the set ESC $ @ switches to (JIS X
0208-1978) and ESC $ B switches to (JIS X 0208-1983) are in practice used
interchangeably.
Depending on the iconv library and the locale definition on the system, a
program that converts from another encoding to ISO-2022-JP can produce
different byte sequence, and GIT_TEST_CMP (aka "diff -u") will report the
difference as a failure.
Fix this by converting the expected and the actual output to UTF-8 before
comparing when the end result is ISO-2022-JP. The test vector string in
t3900/ISO-2022-JP.txt is expressed with ASCII and JIS X 0208-1983, but it
can be expressed with any other possible variant, and when converted back
to UTF-8, these variants produce identical byte sequences.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many test scripts assumed that they will start in a 'trash' subdirectory
that is a single level down from the t/ directory, and referred to their
test vector files by asking for files like "../t9999/expect". This will
break if we move the 'trash' subdirectory elsewhere.
To solve this, we earlier introduced "$TEST_DIRECTORY" so that they can
refer to t/ directory reliably. This finally makes all the tests use
it to refer to the outside environment.
With this patch, and a one-liner not included here (because it would
contradict with what Dscho really wants to do):
| diff --git a/t/test-lib.sh b/t/test-lib.sh
| index 70ea7e0..60e69e4 100644
| --- a/t/test-lib.sh
| +++ b/t/test-lib.sh
| @@ -485,7 +485,7 @@ fi
| . ../GIT-BUILD-OPTIONS
|
| # Test repository
| -test="trash directory"
| +test="trash directory/another level/yet another"
| rm -fr "$test" || {
| trap - exit
| echo >&5 "FATAL: Cannot prepare test area"
all the tests still pass, but we would want extra sets of eyeballs on this
type of change to really make sure.
[jc: with help from Stephan Beyer on http-push tests I do not run myself;
credits for locating silly quoting errors go to Olivier Marin.]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As a general principle, we should not use "git diff" to validate the
results of what git command that is being tested has done. We would not
know if we are testing the command in question, or locating a bug in the
cute hack of "git diff --no-index".
Rather use test_cmp for that purpose.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This fixes an unnecessary empty line that we add to the log message when
we generate diffs, but don't actually end up printing any due to having
DIFF_FORMAT_NO_OUTPUT set.
This can happen with pickaxe or with rename following. The reason is that
we normally add an empty line between the commit and the diff, but we do
that even for the case where we've then suppressed the actual printing of
the diff.
This also updates a couple of tests that assumed the extraneous empty
line would exist at the end of output.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Now that "git diff" handles stdin and relative paths outside the
working tree correctly, we can convert all instances of "diff -u"
to "git diff".
This commit is really the result of
$ perl -pi.bak -e 's/diff -u/git diff/' $(git grep -l "diff -u" t/)
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <junkio@cox.net>
(cherry picked from commit c699a40d68215c7e44a5b26117a35c8a56fbd387)
It appears ISO-2022-JP is more widely accepted than ISO2022JP, so
rename it that way. We probably would need to have a way to skip
this test altogether in locale-challenged environments.
Signed-off-by: Junio C Hamano <junkio@cox.net>
It is plausible for somebody to want to view the commit log in a
different encoding from i18n.commitencoding -- the project's
policy may be UTF-8 and the user may be using a commit message
hook to run iconv to conform to that policy (and either not have
i18n.commitencoding to default to UTF-8 or have it explicitly
set to UTF-8). Even then, Latin-1 may be more convenient for
the usual pager and the terminal the user uses.
The new variable i18n.logoutputencoding is used in preference to
i18n.commitencoding to decide what encoding to recode the log
output in when git-log and friends formats the commit log message.
Signed-off-by: Junio C Hamano <junkio@cox.net>