mirrors/git

mirror of https://github.com/git/git.git synced 2024-12-13 20:06:31 +08:00

Author	SHA1	Message	Date
Ævar Arnfjörð Bjarmason	9fd512c8d6	dir API: add a generalized path_match_flags() function Add a path_match_flags() function and have the two sets of starts_with_dot_{,dot_}slash() functions added in `63e95beb08` (submodule: port resolve_relative_url from shell to C, 2016-04-15) and `a2b26ffb1a` (fsck: convert gitmodules url to URL passed to curl, 2020-04-18) be thin wrappers for it. As the latter of those notes, the fsck version was copied from the initial builtin/submodule--helper.c version. Since the code added in `a2b26ffb1a` was really doing the same as win32_is_dir_sep() added in `1cadad6f65` (git clone <url> C:\cygwin\home\USER\repo' is working (again), 2018-12-15) let's move the latter to git-compat-util.h as is_xplatform_dir_sep(). We can then call either it or the platform-specific is_dir_sep() from this new function. Let's likewise change code in various other places that was hardcoding checks for "'/' || '\\'" with the new is_xplatform_dir_sep(). As can be seen in those callers some of them still concern themselves with ':' (Mac OS classic?), but let's leave the question of whether that should be consolidated for some other time. As we expect to make wider use of the "native" case in the future, define and use two starts_with_dot_{,dot_}slash_native() convenience wrappers. This makes the diff in builtin/submodule--helper.c much smaller. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-16 15:02:09 -07:00
Jeff King	bb6832d552	fsck: warn about symlinked dotfiles we'll open with O_NOFOLLOW In the commits merged in via `204333b015` (Merge branch 'jk/open-dotgitx-with-nofollow', 2021-03-22), we stopped following symbolic links for .gitattributes, .gitignore, and .mailmap files. Let's teach fsck to warn that these symlinks are not going to do anything. Note that this is just a warning, and won't block the objects via transfer.fsckObjects, since there are reported to be cases of this in the wild (and even once fixed, they will continue to exist in the commit history of those projects, but are not particularly dangerous). Note that we won't add these to the existing gitmodules block in the fsck code. The logic for gitmodules is a bit more complicated, as we also check the content of non-symlink instances we find. But for these new files, there is no content check; we're just looking at the name and mode of the tree entry (and we can avoid even the complicated name checks in the common case that the mode doesn't indicate a symlink). We can reuse the test helper function we defined for .gitmodules, though (it needs some slight adjustments for the fsck error code, and because we don't block these symlinks via verify_path()). Note that I didn't explicitly test the transfer.fsckObjects case here (nor does the existing .gitmodules test that it blocks a push). The translation of fsck severities to outcomes is covered in general in t5504. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-04 11:52:02 +09:00
Jeff King	0282f6799f	fsck_tree(): wrap some long lines Many calls to report() in fsck_tree() are kept on a single line and are quite long. Most were pretty big to begin with, but have gotten even longer over the years as we've added more parameters. Let's accept the churn of wrapping them in order to conform to our usual line limits. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-03 14:41:08 +09:00
Jeff King	9e1947cb48	fsck_tree(): fix shadowed variable Commit `b2f2039c2b` (fsck: accept an oid instead of a "struct tree" for fsck_tree(), 2019-10-18) introduced a new "oid" parameter to fsck_tree(), and we pass it to the report() function when we find problems. However, that is shadowed within the tree-walking loop by the existing "oid" variable which we use to store the oid of each tree entry. As a result, we may report the wrong oid for some problems we detect within the loop (the entry oid, instead of the tree oid). Our tests didn't catch this because they checked only that we found the expected fsck problem, not that it was attached to the correct object. Let's rename both variables in the function to avoid confusion. This makes the diff a little noisy (e.g., all of the report() calls outside the loop were already correct but need to be touched), but makes sure we catch all cases and will avoid similar confusion in the future. And we can update the test to be a bit more specific and catch this problem. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-03 14:41:08 +09:00
Ævar Arnfjörð Bjarmason	3745e2693d	fetch-pack: use new fsck API to print dangling submodules Refactor the check added in `5476e1efde` (fetch-pack: print and use dangling .gitmodules, 2021-02-22) to make use of us now passing the "msg_id" to the user defined "error_func". We can now compare against the FSCK_MSG_GITMODULES_MISSING instead of parsing the generated message. Let's also replace register_found_gitmodules() with directly manipulating the "gitmodules_found" member. A recent commit moved it into "fsck_options" so we could do this here. I'm sticking this callback in fsck.c. Perhaps in the future we'd like to accumulate such callbacks into another file (maybe fsck-cb.c, similar to parse-options-cb.c?), but while we've got just the one let's just put it into fsck.c. A better alternative in this case would be some more obvious library shared by fetch-pack.c and builtin/index-pack.c, but there isn't such a thing. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason	c15087d17b	fsck.c: move gitmodules_{found,done} into fsck_options Move the gitmodules_{found,done} static variables added in `159e7b080b` (fsck: detect gitmodules files, 2018-05-02) into the fsck_options struct. It makes sense to keep all the context in the same place. This requires changing the recently added register_found_gitmodules() function added in `5476e1efde` (fetch-pack: print and use dangling .gitmodules, 2021-02-22) to take fsck_options. That function will be removed in a subsequent commit, but as it'll require the new gitmodules_found attribute of "fsck_options" we need this intermediate step first. An earlier version of this patch removed the small amount of duplication we now have between FSCK_OPTIONS_{DEFAULT,STRICT} with a FSCK_OPTIONS_COMMON macro. I don't think such de-duplication is worth it for this amount of copy/pasting. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason	53692df2b8	fsck.c: add an fsck_set_msg_type() API that takes enums Change code I added in `acf9de4c94` (mktag: use fsck instead of custom verify_tag(), 2021-01-05) to make use of a new API function that takes the fsck_msg_{id,type} types, instead of arbitrary strings that we'll (hopefully) parse into those types. At the time that the fsck_set_msg_type() API was introduced in `0282f4dced` (fsck: offer a function to demote fsck errors to warnings, 2015-06-22) it was only intended to be used to parse user-supplied data. For things that are purely internal to the C code it makes sense to have the compiler check these arguments, and to skip the sanity checking of the data in fsck_set_msg_type() which is redundant to checks we get from the compiler. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason	394d5d31b0	fsck.c: pass along the fsck_msg_id in the fsck_error callback Change the fsck_error callback to also pass along the fsck_msg_id. Before this change the only way to get the message id was to parse it back out of the "message". Let's pass it down explicitly for the benefit of callers that might want to use it, as discussed in [1]. Passing the msg_type is now redundant, as you can always get it back from the msg_id, but I'm not changing that convention. It's really common to need the msg_type, and the report() function itself (which calls "fsck_error") needs to call fsck_msg_type() to discover it. Let's not needlessly re-do that work in the user callback. 1. https://lore.kernel.org/git/87blcja2ha.fsf@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason	44e07da8bb	fsck.[ch]: move FOREACH_FSCK_MSG_ID & fsck_msg_id from .c to .h Move the FOREACH_FSCK_MSG_ID macro and the fsck_msg_id enum it helps define from fsck.c to fsck.h. This is in preparation for having non-static functions take the fsck_msg_id as an argument. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason	901f2f6742	fsck.c: give "FOREACH_MSG_ID" a more specific name Rename the FOREACH_MSG_ID macro to FOREACH_FSCK_MSG_ID in preparation for moving it over to fsck.h. It's good convention to name macros in *.h files in such a way as to clearly not clash with any other names in other files. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason	b5495024ec	fsck.c: undefine temporary STR macro after use In `f417eed8cd` (fsck: provide a function to parse fsck message IDs, 2015-06-22) the "STR" macro was introduced, but that short macro name was not undefined after use as was done earlier in the same series for the MSG_ID macro in `c99ba492f1` (fsck: introduce identifiers for fsck messages, 2015-06-22). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason	c72da1a22b	fsck.c: call parse_msg_type() early in fsck_set_msg_type() There's no reason to defer the calling of parse_msg_type() until after we've checked if the "id < 0". This is not a hot codepath, and parse_msg_type() itself may die on invalid input. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason	1b32b59f9b	fsck.h: move FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} into an enum Move the FSCK_{FATAL,INFO,ERROR,WARN,IGNORE} defines into a new fsck_msg_type enum. These defines were originally introduced in: - `ba002f3b28` (builtin-fsck: move common object checking code to fsck.c, 2008-02-25) - `f50c440730` (fsck: disallow demoting grave fsck errors to warnings, 2015-06-22) - `efaba7cc77` (fsck: optionally ignore specific fsck issues completely, 2015-06-22) - `f27d05b170` (fsck: allow upgrading fsck warnings to errors, 2015-06-22) The reason these were defined in two different places is because we use FSCK_{IGNORE,INFO,FATAL} only in fsck.c, but FSCK_{ERROR,WARN} are used by external callbacks. Untangling that would take some more work, since we expose the new "enum fsck_msg_type" to both. Similar to "enum object_type" it's not worth structuring the API in such a way that only those who need FSCK_{ERROR,WARN} pass around a different type. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason	e35d65a78a	fsck.c: refactor fsck_msg_type() to limit scope of "int msg_type" Refactor "if options->msg_type" and other code added in `0282f4dced` (fsck: offer a function to demote fsck errors to warnings, 2015-06-22) to reduce the scope of the "int msg_type" variable. This is in preparation for changing its type in a subsequent commit, only using it in the "!options->msg_type" scope makes that change This also brings the code in line with the fsck_set_msg_type() function (also added in `0282f4dced`), which does a similar check for "!options->msg_type". Another minor benefit is getting rid of the style violation of not having braces for the body of the "if". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason	35af754b06	fsck.c: rename remaining fsck_msg_id "id" to "msg_id" Rename the remaining variables of type fsck_msg_id from "id" to "msg_id". This change is relatively small, and is worth the churn for a later change where we have different id's in the "report" function. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason	034a7b7bcc	fsck.c: remove (mostly) redundant append_msg_id() function Remove the append_msg_id() function in favor of calling prepare_msg_ids(). We already have code to compute the camel-cased msg_id strings in msg_id_info, let's use it. When the append_msg_id() function was added in `71ab8fa840` (fsck: report the ID of the error/warning, 2015-06-22) the prepare_msg_ids() function didn't exist. When prepare_msg_ids() was added in `a46baac61e` (fsck: factor out msg_id_info[] lazy initialization code, 2018-05-26) this code wasn't moved over to lazy initialization. This changes the behavior of the code to initialize all the messages instead of just camel-casing the one we need on the fly. Since the common case is that we're printing just one message this is mostly redundant work. But that's OK in this case, reporting this fsck issue to the user isn't performance-sensitive. If we were somehow doing so in a tight loop (in a hopelessly broken repository?) this would help, since we'd save ourselves from re-doing this work for identical messages, we could just grab the prepared string from msg_id_info after the first invocation. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason	f1abc2d0e1	fsck.c: rename variables in fsck_set_msg_type() for less confusion Rename variables in a function added in `0282f4dced` (fsck: offer a function to demote fsck errors to warnings, 2015-06-22). It was needlessly confusing that it took a "msg_type" argument, but then later declared another "msg_type" of a different type. Let's rename that to "severity", and rename "id" to "msg_id" and "msg_id" to "msg_id_str" etc. This will make a follow-up change smaller. While I'm at it properly indent the fsck_set_msg_type() argument list. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-28 19:03:10 -07:00
Ævar Arnfjörð Bjarmason	fb79f5bff7	fsck.c: refactor and rename common config callback Refactor code I recently changed in `1f3299fda9` (fsck: make fsck_config() re-usable, 2021-01-05) so that I could use fsck's config callback in mktag in `1f3299fda9` (fsck: make fsck_config() re-usable, 2021-01-05). I don't know what I was thinking in structuring the code this way, but it clearly makes no sense to have an fsck_config_internal() at all just so it can get a fsck_options when git_config() already supports passing along some void* data. Let's just make use of that instead, which gets us rid of the two wrapper functions, and brings fsck's common config callback in line with other such reusable config callbacks. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-17 14:02:43 -07:00
Junio C Hamano	6ee353d42f	Merge branch 'jt/transfer-fsck-across-packs' The approach to "fsck" the incoming objects in "index-pack" is attractive for performance reasons (we have them already in core, inflated and ready to be inspected), but fundamentally cannot be applied fully when we receive more than one pack stream, as a tree object in one pack may refer to a blob object in another pack as ".gitmodules", when we want to inspect blobs that are used as ".gitmodules" file, for example. Teach "index-pack" to emit objects that must be inspected later and check them in the calling "fetch-pack" process. * jt/transfer-fsck-across-packs: fetch-pack: print and use dangling .gitmodules fetch-pack: with packfile URIs, use index-pack arg http-fetch: allow custom index-pack args http: allow custom index-pack args	2021-03-01 14:02:57 -08:00
Jonathan Tan	5476e1efde	fetch-pack: print and use dangling .gitmodules Teach index-pack to print dangling .gitmodules links after its "keep" or "pack" line instead of declaring an error, and teach fetch-pack to check such lines printed. This allows the tree side of the .gitmodules link to be in one packfile and the blob side to be in another without failing the fsck check, because it is now fetch-pack which checks such objects after all packfiles have been downloaded and indexed (and not index-pack on an individual packfile, as it is before this commit). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-22 12:07:40 -08:00
Junio C Hamano	9e634a91c8	Merge branch 'js/fsck-name-objects-fix' Fix "git fsck --name-objects" which apparently has not been used by anybody who is motivated enough to report breakage. * js/fsck-name-objects-fix: fsck --name-objects: be more careful parsing generation numbers t1450: robustify `remove_object()`	2021-02-17 17:21:42 -08:00
Johannes Schindelin	e89f89361c	fsck --name-objects: be more careful parsing generation numbers In `7b35efd734` (fsck_walk(): optionally name objects on the go, 2016-07-17), the `fsck` machinery learned to optionally name the objects, so that it is easier to see what part of the repository is in a bad shape, say, when objects are missing. To save on complexity, this machinery uses a parser to determine the name of a parent given a commit's name: any `~<n>` suffix is parsed and the parent's name is formed from the prefix together with `~<n+1>`. However, this parser has a bug: if it finds a suffix `<n>` that is _not_ `~<n>`, it will mistake the empty string for the prefix and `<n>` for the generation number. In other words, it will generate a name of the form `~<bogus-number>`. Let's fix this. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-10 12:38:05 -08:00
Junio C Hamano	635ff67590	Merge branch 'jk/forbid-lf-in-git-url' into maint Newline characters in the host and path part of git:// URL are now forbidden. * jk/forbid-lf-in-git-url: fsck: reject .gitmodules git:// urls with newlines git_connect_git(): forbid newlines in host and path	2021-02-05 16:31:27 -08:00
Junio C Hamano	c7b1aaf6d6	Merge branch 'jk/forbid-lf-in-git-url' Newline characters in the host and path part of git:// URL are now forbidden. * jk/forbid-lf-in-git-url: fsck: reject .gitmodules git:// urls with newlines git_connect_git(): forbid newlines in host and path	2021-01-25 14:19:17 -08:00
Jeff King	6aed56736b	fsck: reject .gitmodules git:// urls with newlines The previous commit taught the clone/fetch client side to reject a git:// URL with a newline in it. Let's also catch these when fscking a .gitmodules file, which will give an earlier warning. Note that it would be simpler to just complain about newline in _any_ URL, but an earlier tightening for http/ftp made sure we kept allowing newlines for unknown protocols (and this is covered in the tests). So we'll stick to that precedent. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-07 14:25:44 -08:00
Ævar Arnfjörð Bjarmason	9a1a3a4d4c	mktag: allow omitting the header/body \n separator Change mktag's acceptance rules to accept an empty body without an empty line after the header again. This fixes an ancient unintended regression in "mktag". When "mktag" was introduced in `ec4465adb3` (Add "tag" objects that can be used to sign other objects., 2005-04-25) the input checks were much looser. When it was documented in `6cfec03680` (mktag: minimally update the description., 2007-06-10) it was clearly intended for this \n to be optional: The message, when [it] exists, is separated by a blank line from the header. But then in `e0aaf781f6` (mktag.c: improve verification of tagger field and tests, 2008-03-27) this was made an error, seemingly by accident. It was just a result of the general header checks, and all the tests after that patch have a trailing empty line (but did not before). Let's allow this again, and tweak the test semantics changed in `e0aaf781f6` to remove the redundant empty line. New tests added in previous commits of mine already added an explicit test for allowing the empty line between header and body. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason	1f3299fda9	fsck: make fsck_config() re-usable Move the fsck_config() function from builtin/fsck.c to fsck.[ch]. This allows for re-using it in other tools that expose fsck logic and want to support its configuration variables. A logical continuation of this change would be to use a common function for all of {fetch,receive}.fsck.* and fsck.. See `5d477a334a` (fsck (receive-pack): allow demoting errors to warnings, 2015-06-22) and my own `1362df0d41` (fetch: implement fetch.fsck., 2018-07-27) for the relevant code. However, those routines want to not parse the fsck.skipList into OIDs, but rather pass them along with the --strict option to another process. It would be possible to refactor that whole thing so we support e.g. a "fetch." prefix, then just keep track of the skiplist as a filename instead of parsing it, and learn to spew that all out from our internal structures into something we can append to the --strict option. But instead I'm planning to re-use this in "mktag", which'll just re-use these "fsck.*" variables as-is. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:58:29 -08:00
Ævar Arnfjörð Bjarmason	acf9de4c94	mktag: use fsck instead of custom verify_tag() Change the validation logic in "mktag" to use fsck's fsck_tag() instead of its own custom parser. Curiously the logic for both dates back to the same commit[1]. Let's unify them so we're not maintaining two sets functions to verify that a tag is OK. The behavior of fsck_tag() and the old "mktag" code being removed here is different in few aspects. I think it makes sense to remove some of those checks, namely: A. fsck only cares that the timezone matches [-+][0-9]{4}. The mktag code disallowed values larger than 1400. Yes there's currently no timezone with a greater offset[2], but since we allow any number of non-offical timezones (e.g. +1234) passing this through seems fine. Git also won't break in the future if e.g. French Polynesia decides it needs to outdo the Line Islands when it comes to timezone extravagance. B. fsck allows missing author names such as "tagger <email>", mktag wouldn't, but would allow e.g. "tagger [2 spaces] <email>" (but not "tagger [1 space] <email>"). Now we allow all of these. C. Like B, but "mktag" disallowed spaces in the <email> part, fsck allows it. In some ways fsck_tag() is stricter than "mktag" was, namely: D. fsck disallows zero-padded dates, but mktag didn't care. So e.g. the timestamp "0000000000 +0000" produces an error now. A test in "t1006-cat-file.sh" relied on this, it's been changed to use "hash-object" (without fsck) instead. There was one check I deemed worth keeping by porting it over to fsck_tag(): E. "mktag" did not allow any custom headers, and by extension (as an empty commit is allowed) also forbade an extra stray trailing newline after the headers it knew about. Add a new check in the "ignore" category to fsck and use it. This somewhat abuses the facility added in `efaba7cc77` (fsck: optionally ignore specific fsck issues completely, 2015-06-22). This is somewhat of hack, but probably the least invasive change we can make here. 
The fsck command will shuffle these categories around, e.g. under --strict the "info" becomes a "warn" and "warn" becomes "error". Existing users of fsck's (and others, e.g. index-pack) --strict option rely on this. So we need to put something into a category that'll be ignored by all existing users of the API. Pretending that fsck.extraHeaderEntry=error ("ignore" by default) was set serves to do this for us. 1. `ec4465adb3` (Add "tag" objects that can be used to sign other objects., 2005-04-25) 2. https://en.wikipedia.org/wiki/List_of_UTC_time_offsets Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-05 14:58:29 -08:00
Junio C Hamano	7e75aeb290	Merge branch 'rs/fsck-duplicate-names-in-trees' The check in "git fsck" to ensure that the tree objects are sorted still had corner cases where it missed unsorted entries. * rs/fsck-duplicate-names-in-trees: fsck: detect more in-tree d/f conflicts t1450: demonstrate undetected in-tree d/f conflict t1450: increase test coverage of in-tree d/f detection fsck: fix a typo in a comment	2020-06-08 18:06:29 -07:00
René Scharfe	fe747043dc	fsck: detect more in-tree d/f conflicts If the conflict candidate file name from the top of the stack is not a prefix of the current candidate directory then we can discard it as no matching directory can come up later. But we are not done checking the candidate directory -- the stack might still hold a matching file name, so stay in the loop and check the next candidate file name. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-21 11:03:13 -07:00
René Scharfe	86715592fd	fsck: fix a typo in a comment Reported-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-21 11:02:28 -07:00
Junio C Hamano	0498840b35	Merge branch 'rs/fsck-duplicate-names-in-trees' "git fsck" ensures that the paths recorded in tree objects are sorted and without duplicates, but it failed to notice a case where a blob is followed by entries that sort before a tree with the same name. This has been corrected. * rs/fsck-duplicate-names-in-trees: fsck: report non-consecutive duplicate names in trees	2020-05-14 14:39:44 -07:00
René Scharfe	9068cfb20f	fsck: report non-consecutive duplicate names in trees Tree entries are sorted in path order, meaning that directory names get a slash ('/') appended implicitly. Git fsck checks if trees contains consecutive duplicates, but due to that ordering there can be non-consecutive duplicates as well if one of them is a directory and the other one isn't. Such a tree cannot be fully checked out. Find these duplicates by recording candidate file names on a stack and check candidate directory names against that stack to find matches. Suggested-by: Brandon Williams <bwilliamseng@gmail.com> Original-test-by: Brandon Williams <bwilliamseng@gmail.com> Signed-off-by: René Scharfe <l.s.r@web.de> Reviewed-by: Luke Diamand <luke@diamand.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-11 08:40:28 -07:00
Junio C Hamano	342bc9e29f	Merge branch 'jk/config-use-size-t' The config API made mixed uses of int and size_t types to represent length of various pieces of text it parsed, which has been updated to use the correct type (i.e. size_t) throughout. * jk/config-use-size-t: config: reject parsing of files over INT_MAX config: use size_t to store parsed variable baselen git_config_parse_key(): return baselen as size_t config: drop useless length variable in write_pair() parse_config_key(): return subsection len as size_t remote: drop auto-strlen behavior of make_branch() and make_rewrite()	2020-04-28 15:49:58 -07:00
Jonathan Nieder	7397ca3373	Git 2.25.4 This merges up the security fix from v2.17.5. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>	2020-04-19 16:31:07 -07:00
Jonathan Nieder	f2771efd07	Git 2.23.3 This merges up the security fix from v2.17.5. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>	2020-04-19 16:30:27 -07:00
Jonathan Nieder	c9808fa014	Git 2.22.4 This merges up the security fix from v2.17.5. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>	2020-04-19 16:30:19 -07:00
Jonathan Nieder	9206d27eb5	Git 2.21.3 This merges up the security fix from v2.17.5. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>	2020-04-19 16:30:08 -07:00
Jonathan Nieder	041bc65923	Git 2.20.4 This merges up the security fix from v2.17.5. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>	2020-04-19 16:28:57 -07:00
Jonathan Nieder	76b54ee9b9	Git 2.19.5 This merges up the security fix from v2.17.5. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>	2020-04-19 16:26:41 -07:00
Jonathan Nieder	ba6f0905fd	Git 2.18.4 This merges up the security fix from v2.17.5. Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>	2020-04-19 16:24:14 -07:00
Jonathan Nieder	1a3609e402	fsck: reject URL with empty host in .gitmodules Git's URL parser interprets https:///example.com/repo.git to have no host and a path of "example.com/repo.git". Curl, on the other hand, internally redirects it to https://example.com/repo.git. As a result, until "credential: parse URL without host as empty host, not unset", tricking a user into fetching from such a URL would cause Git to send credentials for another host to example.com. Teach fsck to block and detect .gitmodules files using such a URL to prevent sharing them with Git versions that are not yet protected. A relative URL in a .gitmodules file could also be used to trigger this. The relative URL resolver used for .gitmodules does not normalize sequences of slashes and can follow ".." components out of the path part and to the host part of a URL, meaning that such a relative URL can be used to traverse from a https://foo.example.com/innocent superproject to a https:///attacker.example.com/exploit submodule. Fortunately, redundant extra slashes in .gitmodules are rare, so we can catch this by detecting one after a leading sequence of "./" and "../" components. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Reviewed-by: Jeff King <peff@peff.net>	2020-04-19 16:10:58 -07:00
Jonathan Nieder	c44088ecc4	credential: treat URL without scheme as invalid libcurl permits making requests without a URL scheme specified. In this case, it guesses the URL from the hostname, so I can run git ls-remote http::ftp.example.com/path/to/repo and it would make an FTP request. Any user intentionally using such a URL is likely to have made a typo. Unfortunately, credential_from_url is not able to determine the host and protocol in order to determine appropriate credentials to send, and until "credential: refuse to operate when missing host or protocol", this resulted in another host's credentials being leaked to the named host. Teach credential_from_url_gently to consider such a URL to be invalid so that fsck can detect and block gitmodules files with such URLs, allowing server operators to avoid serving them to downstream users running older versions of Git. This also means that when such URLs are passed on the command line, Git will print a clearer error so affected users can switch to the simpler URL that explicitly specifies the host and protocol they intend. One subtlety: .gitmodules files can contain relative URLs, representing a URL relative to the URL they were cloned from. The relative URL resolver used for .gitmodules can follow ".." components out of the path part and past the host part of a URL, meaning that such a relative URL can be used to traverse from a https://foo.example.com/innocent superproject to a https::attacker.example.com/exploit submodule. Fortunately a leading ':' in the first path component after a series of leading './' and '../' components is unlikely to show up in other contexts, so we can catch this by detecting that pattern. Reported-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Reviewed-by: Jeff King <peff@peff.net>	2020-04-19 16:10:58 -07:00
Jonathan Nieder	a2b26ffb1a	fsck: convert gitmodules url to URL passed to curl In `07259e74ec` (fsck: detect gitmodules URLs with embedded newlines, 2020-03-11), git fsck learned to check whether URLs in .gitmodules could be understood by the credential machinery when they are handled by git-remote-curl. However, the check is overbroad: it checks all URLs instead of only URLs that would be passed to git-remote-curl. In principle a git:// or file:/// URL does not need to follow the same conventions as an http:// URL; in particular, git:// and file:// protocols are not susceptible to issues in the credential API because they do not support attaching credentials. In the HTTP case, the URL in .gitmodules does not always match the URL that would be passed to git-remote-curl and the credential machinery: Git's URL syntax allows specifying a remote helper followed by a "::" delimiter and a URL to be passed to it, so that git ls-remote http::https://example.com/repo.git invokes git-remote-http with https://example.com/repo.git as its URL argument. With today's checks, that distinction does not make a difference, but for a check we are about to introduce (for empty URL schemes) it will matter. .gitmodules files also support relative URLs. To ensure coverage for the https based embedded-newline attack, urldecode and check them directly for embedded newlines. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Reviewed-by: Jeff King <peff@peff.net>	2020-04-19 16:10:58 -07:00
Jeff King	f5914f4b6b	parse_config_key(): return subsection len as size_t We return the length to a subset of a string using an "int *" out-parameter. This is fine most of the time, as we'd expect config keys to be relatively short, but it could behave oddly if we had a gigantic config key. A more appropriate type is size_t. Let's switch over, which lets our callers use size_t as appropriate (they are bound by our type because they must pass the out-parameter as a pointer). This is mostly just a cleanup to make it clear this code handles long strings correctly. In practice, our config parser already chokes on long key names (because of a similar int/size_t mixup!). When doing an int/size_t conversion, we have to be careful that nobody was trying to assign a negative value to the variable. I manually confirmed that for each case here. They tend to just feed the result to xmemdupz() or similar; in a few cases I adjusted the parameter types for helper functions to make sure the size_t is preserved. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-10 14:44:29 -07:00
Junio C Hamano	67b0a24910	Git 2.25.3 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-17 18:12:01 -07:00
Junio C Hamano	17a02783d8	Git 2.23.2 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-17 14:33:34 -07:00
Junio C Hamano	69fab82147	Git 2.22.3 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-17 14:24:55 -07:00
Junio C Hamano	fe22686494	Git 2.21.2 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-17 14:16:08 -07:00
Junio C Hamano	d1259ce117	Git 2.20.3 Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-17 13:46:10 -07:00
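
Commit `9fd512c8d6` in the log above describes a cross-platform directory-separator check and the starts_with_dot_{,dot_}slash() helpers built on it. The idea might be sketched roughly as follows. This is a simplified illustration, not the code from that commit: the `sketch_` function names are hypothetical, and the real path_match_flags() interface is not reproduced here.

```c
/*
 * Hypothetical sketch: accept both '/' and '\\' as a directory
 * separator regardless of the platform we are compiled on.
 */
int sketch_is_xplatform_dir_sep(int c)
{
	return c == '/' || c == '\\';
}

/* True if path begins with "./" or ".\". */
int sketch_starts_with_dot_slash(const char *path)
{
	/* path[1] is only read when path[0] is '.', so we never read past the NUL. */
	return path[0] == '.' && sketch_is_xplatform_dir_sep(path[1]);
}

/* True if path begins with "../" or "..\". */
int sketch_starts_with_dot_dot_slash(const char *path)
{
	return path[0] == '.' && path[1] == '.' &&
	       sketch_is_xplatform_dir_sep(path[2]);
}
```

A "native" variant, as the commit message mentions, would presumably swap the separator test for the platform-specific one, so that a backslash only counts where the platform treats it as a separator.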

226 Commits
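
Commit `9068cfb20f` in the log above explains that tree entries sort as if directory names had a trailing '/', which is why a file and a directory with the same name can end up non-adjacent. A minimal sketch of such an ordering is shown below. The function name `sketch_tree_entry_cmp` and its parameters are hypothetical and chosen for illustration; this is not the comparison routine Git itself uses.

```c
#include <string.h>

/*
 * Hypothetical sketch: compare two tree entry names in "path order",
 * where a directory name behaves as if it ended in '/'.
 * Returns <0, 0 or >0 like strcmp().
 */
int sketch_tree_entry_cmp(const char *a, int a_is_dir,
			  const char *b, int b_is_dir)
{
	size_t la = strlen(a), lb = strlen(b);
	size_t n = la < lb ? la : lb;
	int cmp = memcmp(a, b, n);
	unsigned char ca, cb;

	if (cmp)
		return cmp;

	/* One name is a prefix of the other: compare the next byte. */
	ca = (unsigned char)a[n];
	cb = (unsigned char)b[n];
	if (!ca && a_is_dir)
		ca = '/';	/* implicit trailing slash */
	if (!cb && b_is_dir)
		cb = '/';
	return (ca > cb) - (ca < cb);
}
```

Under this ordering a file "foo" sorts before a file "foo.c", which in turn sorts before a directory "foo", because '.' (0x2e) is less than '/' (0x2f). The two "foo" entries are duplicates but not consecutive, which is the case the commit message says a purely adjacent-entry check would miss.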