Commit Graph

436 Commits

Author SHA1 Message Date
brian m. carlson
b4f5aca40e sha1_file: convert read_sha1_file to struct object_id
Convert read_sha1_file to take a pointer to struct object_id and rename
it read_object_file.  Do the same for read_sha1_file_extended.

Convert one use in grep.c to use the new function without any other code
change, since the pointer being passed is a void pointer that is already
initialized with a pointer to struct object_id.  Update the declaration
and definitions of the modified functions, and apply the following
semantic patch to convert the remaining callers:

@@
expression E1, E2, E3;
@@
- read_sha1_file(E1.hash, E2, E3)
+ read_object_file(&E1, E2, E3)

@@
expression E1, E2, E3;
@@
- read_sha1_file(E1->hash, E2, E3)
+ read_object_file(E1, E2, E3)

@@
expression E1, E2, E3, E4;
@@
- read_sha1_file_extended(E1.hash, E2, E3, E4)
+ read_object_file_extended(&E1, E2, E3, E4)

@@
expression E1, E2, E3, E4;
@@
- read_sha1_file_extended(E1->hash, E2, E3, E4)
+ read_object_file_extended(E1, E2, E3, E4)

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14 09:23:50 -07:00
Junio C Hamano
cdda65acae Merge branch 'bp/untracked-cache-noflush'
Writing out the index file when the only thing that changed in it
is the untracked cache information is often wasteful, and this has
been optimized out.

* bp/untracked-cache-noflush:
  untracked cache: use git_env_bool() not getenv() for customization
  dir.c: don't flag the index as dirty for changes to the untracked cache
2018-03-08 12:36:30 -08:00
Junio C Hamano
026336cb27 untracked cache: use git_env_bool() not getenv() for customization
GIT_DISABLE_UNTRACKED_CACHE and GIT_TEST_UNTRACKED_CACHE are only
sensed for their presense by using getenv(); use git_env_bool()
instead so that GIT_DISABLE_UNTRACKED_CACHE=false would work as
naïvely expected.

Also rename GIT_TEST_UNTRACKED_CACHE to GIT_FORCE_UNTRACKED_CACHE
to express what it does more honestly.  Forcing its use may be one
useful thing to do while testing the feature, but testing does not
have to be the only use of the knob.

While at it, avoid repeated calls to git_env_bool() by capturing the
return value from the first call in a static variable.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-28 13:27:10 -08:00
Junio C Hamano
cf44c1e0a2 Merge branch 'nd/fix-untracked-cache-invalidation'
Some bugs around "untracked cache" feature have been fixed.

* nd/fix-untracked-cache-invalidation:
  dir.c: ignore paths containing .git when invalidating untracked cache
  dir.c: stop ignoring opendir() error in open_cached_dir()
  dir.c: fix missing dir invalidation in untracked code
  dir.c: avoid stat() in valid_cached_dir()
  status: add a failing test showing a core.untrackedCache bug
2018-02-27 10:33:50 -08:00
Junio C Hamano
090dbea684 Merge branch 'nd/trace-index-ops'
* nd/trace-index-ops:
  trace: measure where the time is spent in the index-heavy operations
2018-02-15 14:55:44 -08:00
Junio C Hamano
8be8342b4c Merge branch 'po/object-id'
Conversion from uchar[20] to struct object_id continues.

* po/object-id:
  sha1_file: rename hash_sha1_file_literally
  sha1_file: convert write_loose_object to object_id
  sha1_file: convert force_object_loose to object_id
  sha1_file: convert write_sha1_file to object_id
  notes: convert write_notes_tree to object_id
  notes: convert combine_notes_* to object_id
  commit: convert commit_tree* to object_id
  match-trees: convert splice_tree to object_id
  cache: clear whole hash buffer with oidclr
  sha1_file: convert hash_sha1_file to object_id
  dir: convert struct sha1_stat to use object_id
  sha1_file: convert pretend_sha1_file to object_id
2018-02-15 14:55:43 -08:00
Nguyễn Thái Ngọc Duy
0cacebf099 dir.c: ignore paths containing .git when invalidating untracked cache
read_directory() code ignores all paths named ".git" even if it's not
a valid git repository. See treat_path() for details. Since ".git" is
basically invisible to read_directory(), when we are asked to
invalidate a path that contains ".git", we can safely ignore it
because the slow path would not consider it anyway.

This helps when fsmonitor is used and we have a real ".git" repo at
worktree top. Occasionally .git/index will be updated and if the
fsmonitor hook does not filter it, untracked cache is asked to
invalidate the path ".git/index".

Without this patch, we invalidate the root directory unncessarily,
which:

- makes read_directory() fall back to slow path for root directory
  (slower)

- makes the index dirty (because UNTR extension is updated). Depending
  on the index size, writing it down could also be slow.

A note about the new "safe_path" knob. Since this new check could be
relatively expensive, avoid it when we know it's not needed. If the
path comes from the index, it can't contain ".git". If it does
contain, we may be screwed up at many more levels, not just this one.

Noticed-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-07 12:27:02 -08:00
Ben Peart
fc9ecbeb93 dir.c: don't flag the index as dirty for changes to the untracked cache
The untracked cache saves its current state in the UNTR index extension.
Currently, _any_ change to that state causes the index to be flagged as dirty
and written out to disk.  Unfortunately, the cost to write out the index can
exceed the savings gained by using the untracked cache.  Since it is a cache
that can be updated from the current state of the working directory, there is
no functional requirement that the index be written out for every change to the
untracked cache.

Update the untracked cache logic so that it no longer forces the index to be
written to disk except in the case where the extension is being turned on or
off.  When some other git command requires the index to be written to disk, the
untracked cache will take advantage of that to save it's updated state as well.
This results in a performance win when looked at over common sequences of git
commands (ie such as a status followed by add, commit, etc).

After this patch, all the logic to track statistics for the untracked cache
could be removed as it is only used by debug tracing used to debug the untracked
cache.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-05 12:55:49 -08:00
Nguyễn Thái Ngọc Duy
ca54d9baa4 trace: measure where the time is spent in the index-heavy operations
All the known heavy code blocks are measured (except object database
access). This should help identify if an optimization is effective or
not. An unoptimized git-status would give something like below:

    0.001791141 s: read cache ...
    0.004011363 s: preload index
    0.000516161 s: refresh index
    0.003139257 s: git command: ... 'status' '--porcelain=2'
    0.006788129 s: diff-files
    0.002090267 s: diff-index
    0.001885735 s: initialize name hash
    0.032013138 s: read directory
    0.051781209 s: git command: './git' 'status'

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-02 11:20:16 -08:00
Nguyễn Thái Ngọc Duy
b673155074 dir.c: stop ignoring opendir() error in open_cached_dir()
A follow-up to the recently fixed bugs in the untracked
invalidation. If opendir() fails it should show a warning, perhaps
this should die, but if this ever happens the error is probably
recoverable for the user, and dying would just make things worse.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-02 10:16:23 -08:00
Patryk Obara
f070faccc1 sha1_file: convert hash_sha1_file to object_id
Convert the declaration and definition of hash_sha1_file to use
struct object_id and adjust all function calls.

Rename this function to hash_object_file.

Signed-off-by: Patryk Obara <patryk.obara@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-30 10:42:36 -08:00
Patryk Obara
4b33e60201 dir: convert struct sha1_stat to use object_id
Convert the declaration of struct sha1_stat. Adjust all usages of this
struct and replace hash{clr,cmp,cpy} with oid{clr,cmp,cpy} wherever
possible.  Rename it to struct oid_stat.

Rename static function load_sha1_stat to load_oid_stat.

Remove macro EMPTY_BLOB_SHA1_BIN, as it's no longer used.

Signed-off-by: Patryk Obara <patryk.obara@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-30 10:42:36 -08:00
Nguyễn Thái Ngọc Duy
b640313110 dir.c: fix missing dir invalidation in untracked code
Let's start with how create a new directory cache after the last one
becomes invalid (e.g. because its dir mtime has changed...). In
open_cached_dir():

1. We start out with valid_cached_dir() returning false, which should
   call invalidate_directory() to put a directory state back to
   initial state, no untracked entries (untracked_nr zero), no sub
   directory traversal (dirs[].recurse zero).

2. Since the cache cannot be used, we go the slow path opendir() and
   go through items one by one via readdir(). All the directories on
   disk will be added back to the cache (if not already exist in
   dirs[]) and its flag "recurse" gets changed to one to note that
   it's part of the cached dir travesal next time.

3. By the time we reach close_cached_dir() we should have a good
   subdir list in dirs[]. Those with "recurse" flag set are the ones
   present in the on-disk directory. The directory is now marked
   "valid".

Next time read_directory() is called, since the directory is marked
valid, it will skip readdir(), go fast path and traverse through
dirs[] array instead.

Steps one and two need some tight cooperation. If a subdir is removed,
readdir() will not find it and of course we cannot examine/invalidate
it. To make sure removed directories on disk are gone from the cache,
step one must make sure recurse flag of all subdirs are zero.

But that's not true. If "valid" flag is already false, there is a
chance we go straight to the end of valid_cached_dir() without calling
invalidate_directory(). Or we fail to meet the "if (untracked-valid)"
condition and skip over the invalidate_directory().

After step 3, we mark the cache valid. Any stale subdir with incorrect
recurse flag becomes a real subdir next time we traverse the directory
using dirs[] array.

We could avoid this by making sure invalidate_directory() is always
called (therefore dirs[].recurse cleared) at the beginning of
open_cached_dir(). Which is what this patch does.

As to how we get into this situation, the key in the test is this
command

    git checkout master

where "one/file" is replaced with "one" in the index. This index
update triggers untracked_cache_invalidate_path(), which clears valid
flag of the root directory while keeping "recurse" flag on the subdir
"one" on. On the next git-status, we go through steps 1-3 above and
save an incorrect cache on disk. The second git-status blindly follows
the bad cache data and shows the problem.

This is arguably because of a bad design where "recurse" flag plays
double roles: whether a directory should be saved on disk, and whether
it is part of a directory traversal.

We need to keep recurse flag set at "checkout master" because of the
first role: we need to keep subdir caches (dir "two" for example has
not been touched at all, no reason to throw its cache away).

As long as we make sure to ignore/reset "recurse" flag at the
beginning of a directory traversal, we're good. But maybe eventually
we should separate these two roles.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-24 12:40:14 -08:00
Nguyễn Thái Ngọc Duy
2523c4be85 dir.c: avoid stat() in valid_cached_dir()
stat() may follow a symlink and return stat data of the link's target
instead of the link itself. We are concerned about the link itself.

It's kind of hard to demonstrate the bug. I think when path->buf is a
symlink, we most likely find that its target's stat data does not
match our cached one, which means we ignore the cache and fall back to
slow path.

This is performance issue, not correctness (though we could still
catch it by verifying test-dump-untracked-cache. The less unlikely
case is, link target stat data matches the cached version and we
incorrectly go fast path, ignoring real data on disk. A test for this
may involve manipulating stat data, which may be not portable.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-24 12:40:13 -08:00
SZEDER Gábor
f919ffebed Use MOVE_ARRAY
Use the helper macro MOVE_ARRAY to move arrays.  This is shorter and
safer, as it automatically infers the size of elements.

Patch generated by Coccinelle and contrib/coccinelle/array.cocci in
Travis CI's static analysis build job.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-22 11:32:51 -08:00
Junio C Hamano
61061abba7 Merge branch 'jh/object-filtering'
In preparation for implementing narrow/partial clone, the object
walking machinery has been taught a way to tell it to "filter" some
objects from enumeration.

* jh/object-filtering:
  rev-list: support --no-filter argument
  list-objects-filter-options: support --no-filter
  list-objects-filter-options: fix 'keword' typo in comment
  pack-objects: add list-objects filtering
  rev-list: add list-objects filtering support
  list-objects: filter objects in traverse_commit_list
  oidset: add iterator methods to oidset
  oidmap: add oidmap iterator methods
  dir: allow exclusions from blob in addition to file
2017-12-27 11:16:21 -08:00
Jeff Hostetler
578d81d0c4 dir: allow exclusions from blob in addition to file
Refactor add_excludes() to separate the reading of the
exclude file into a buffer and the parsing of the buffer
into exclude_list items.

Add add_excludes_from_blob_to_list() to allow an exclude
file be specified with an OID without assuming a local
worktree or index exists.

Refactor read_skip_worktree_file_from_index() and add
do_read_blob() to eliminate duplication of preliminary
processing of blob contents.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-22 14:11:56 +09:00
Junio C Hamano
e05336bdda Merge branch 'bp/fsmonitor'
We learned to talk to watchman to speed up "git status" and other
operations that need to see which paths have been modified.

* bp/fsmonitor:
  fsmonitor: preserve utf8 filenames in fsmonitor-watchman log
  fsmonitor: read entirety of watchman output
  fsmonitor: MINGW support for watchman integration
  fsmonitor: add a performance test
  fsmonitor: add a sample integration script for Watchman
  fsmonitor: add test cases for fsmonitor extension
  split-index: disable the fsmonitor extension when running the split index test
  fsmonitor: add a test tool to dump the index extension
  update-index: add fsmonitor support to update-index
  ls-files: Add support in ls-files to display the fsmonitor valid bit
  fsmonitor: add documentation for the fsmonitor extension.
  fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
  update-index: add a new --force-write-index option
  preload-index: add override to enable testing preload-index
  bswap: add 64 bit endianness helper get_be64
2017-11-21 14:07:50 +09:00
Junio C Hamano
d8df70f273 Merge branch 'jm/status-ignored-files-list'
The set of paths output from "git status --ignored" was tied
closely with its "--untracked=<mode>" option, but now it can be
controlled more flexibly.  Most notably, a directory that is
ignored because it is listed to be ignored in the ignore/exclude
mechanism can be handled differently from a directory that ends up
to be ignored only because all files in it are ignored.

* jm/status-ignored-files-list:
  status: test ignored modes
  status: document options to show matching ignored files
  status: report matching ignored and normal untracked
  status: add option to show ignored files differently
2017-11-13 14:44:59 +09:00
Junio C Hamano
e7e456f500 Merge branch 'bc/object-id'
Conversion from uchar[20] to struct object_id continues.

* bc/object-id: (25 commits)
  refs/files-backend: convert static functions to object_id
  refs: convert read_raw_ref backends to struct object_id
  refs: convert peel_object to struct object_id
  refs: convert resolve_ref_unsafe to struct object_id
  worktree: convert struct worktree to object_id
  refs: convert resolve_gitlink_ref to struct object_id
  Convert remaining callers of resolve_gitlink_ref to object_id
  sha1_file: convert index_path and index_fd to struct object_id
  refs: convert reflog_expire parameter to struct object_id
  refs: convert read_ref_at to struct object_id
  refs: convert peel_ref to struct object_id
  builtin/pack-objects: convert to struct object_id
  pack-bitmap: convert traverse_bitmap_commit_list to object_id
  refs: convert dwim_log to struct object_id
  builtin/reflog: convert remaining unsigned char uses to object_id
  refs: convert dwim_ref and expand_ref to struct object_id
  refs: convert read_ref and read_ref_full to object_id
  refs: convert resolve_refdup and refs_resolve_refdup to struct object_id
  Convert check_connected to use struct object_id
  refs: update ref transactions to use struct object_id
  ...
2017-11-06 14:24:27 +09:00
Junio C Hamano
da7996aaf7 Merge branch 'js/submodule-in-excluded'
"git status --ignored -u" did not stop at a working tree of a
separate project that is embedded in an ignored directory and
listed files in that other project, instead of just showing the
directory itself as ignored.

* js/submodule-in-excluded:
  status: do not get confused by submodules in excluded directories
2017-11-06 13:11:26 +09:00
Jameson Miller
07966ed19e status: report matching ignored and normal untracked
Teach status command to handle `--ignored=matching` with
`--untracked-files=normal`

Signed-off-by: Jameson Miller <jamill@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-31 11:54:21 +09:00
Jameson Miller
eec0f7f2b7 status: add option to show ignored files differently
Teach the status command more flexibility in how ignored files are
reported. Currently, the reporting of ignored files and untracked
files are linked. You cannot control how ignored files are reported
independently of how untracked files are reported (i.e. `all` vs
`normal`). This makes it impossible to show untracked files with the
`all` option, but show ignored files with the `normal` option.

This work 1) adds the ability to control the reporting of ignored
files independently of untracked files and 2) introduces the concept
of status reporting ignored paths that explicitly match an ignored
pattern. There are 2 benefits to these changes: 1) if a consumer needs
all untracked files but not all ignored files, there is a performance
benefit to not scanning all contents of an ignored directory and 2)
returning ignored files that explicitly match a path allow a consumer
to make more informed decisions about when a status result might be
stale.

This commit implements --ignored=matching with --untracked-files=all.
The following commit will implement --ignored=matching with
--untracked=files=normal.

As an example of where this flexibility could be useful is that our
application (Visual Studio) runs the status command and presents the
output. It shows all untracked files individually (e.g. using the
'--untracked-files==all' option), and would like to know about which
paths are ignored. It uses information about ignored paths to make
decisions about when the status result might have changed.
Additionally, many projects place build output into directories inside
a repository's working directory (e.g. in "bin/" and "obj/"
directories). Normal usage is to explicitly ignore these 2 directory
names in the .gitignore file (rather than or in addition to the *.obj
pattern).If an application could know that these directories are
explicitly ignored, it could infer that all contents are ignored as
well and make better informed decisions about files in these
directories. It could infer that any changes under these paths would
not affect the output of status. Additionally, there can be a
significant performance benefit by avoiding scanning through ignored
directories.

When status is set to report matching ignored files, it has the
following behavior. Ignored files and directories that explicitly
match an exclude pattern are reported. If an ignored directory matches
an exclude pattern, then the path of the directory is returned. If a
directory does not match an exclude pattern, but all of its contents
are ignored, then the contained files are reported instead of the
directory.

Signed-off-by: Jameson Miller <jamill@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-31 11:54:21 +09:00
Johannes Schindelin
fadb4820c4 status: do not get confused by submodules in excluded directories
We meticulously pass the `exclude` flag to the `treat_directory()`
function so that we can indicate that files in it are excluded rather
than untracked when recursing.

But we did not yet treat submodules the same way.

Because of that, `git status --ignored --untracked` with a submodule
`submodule` in a gitignored `tracked/` would show the submodule in the
"Untracked files" section, e.g.

	On branch master
	Untracked files:
	  (use "git add <file>..." to include in what will be committed)

		tracked/submodule/

	Ignored files:
	  (use "git add -f <file>..." to include in what will be committed)

		tracked/submodule/initial.t

Instead, we would want it to show the submodule in the "Ignored files"
section:

	On branch master
	Ignored files:
	  (use "git add -f <file>..." to include in what will be committed)

		tracked/submodule/

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-26 11:29:06 +09:00
brian m. carlson
a98e6101f0 refs: convert resolve_gitlink_ref to struct object_id
Convert the declaration and definition of resolve_gitlink_ref to use
struct object_id and apply the following semantic patch:

@@
expression E1, E2, E3;
@@
- resolve_gitlink_ref(E1, E2, E3.hash)
+ resolve_gitlink_ref(E1, E2, &E3)

@@
expression E1, E2, E3;
@@
- resolve_gitlink_ref(E1, E2, E3->hash)
+ resolve_gitlink_ref(E1, E2, E3)

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-16 11:05:51 +09:00
brian m. carlson
1053fe829c Convert remaining callers of resolve_gitlink_ref to object_id
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-16 11:05:51 +09:00
Ben Peart
883e248b8a fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files.
When the index is read from disk, the fsmonitor index extension is used
to flag the last known potentially dirty index entries. The registered
core.fsmonitor command is called with the time the index was last
updated and returns the list of files changed since that time. This list
is used to flag any additional dirty cache entries and untracked cache
directories.

We can then use this valid state to speed up preload_index(),
ie_match_stat(), and refresh_cache_ent() as they do not need to lstat()
files to detect potential changes for those entries marked
CE_FSMONITOR_VALID.

In addition, if the untracked cache is turned on valid_cached_dir() can
skip checking directories for new or changed files as fsmonitor will
invalidate the cache only for those directories that have been
identified as having potential changes.

To keep the CE_FSMONITOR_VALID state accurate during git operations;
when git updates a cache entry to match the current state on disk,
it will now set the CE_FSMONITOR_VALID bit.

Inversely, anytime git changes a cache entry, the CE_FSMONITOR_VALID bit
is cleared and the corresponding untracked cache directory is marked
invalid.

Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-01 17:23:01 +09:00
Jameson Miller
5aaa7fd39a Improve performance of git status --ignored
Improve the performance of the directory listing logic when it wants to list
non-empty ignored directories. In order to show non-empty ignored directories,
the existing logic will recursively iterate through all contents of an ignored
directory. This change introduces the optimization to stop iterating through
the contents once it finds the first file. This can have a significant
improvement in 'git status --ignored' performance in repositories with a large
number of files in ignored directories.

For an example of the performance difference on an example repository with
196,000 files in 400 ignored directories:

| Command                    |  Time (s) |
| -------------------------- | --------- |
| git status                 |   1.2     |
| git status --ignored (old) |   3.9     |
| git status --ignored (new) |   1.4     |

Signed-off-by: Jameson Miller <jamill@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-19 12:28:06 +09:00
Junio C Hamano
0d824bc7f6 Merge branch 'rs/stat-data-unaligned-reads-fix' into maint
Code clean-up.

* rs/stat-data-unaligned-reads-fix:
  dir: support platforms that require aligned reads
2017-08-23 14:33:48 -07:00
Junio C Hamano
12deaf66d4 Merge branch 'rs/stat-data-unaligned-reads-fix'
Code clean-up.

* rs/stat-data-unaligned-reads-fix:
  dir: support platforms that require aligned reads
2017-08-11 13:26:58 -07:00
René Scharfe
268ba20110 dir: support platforms that require aligned reads
The untracked cache is stored on disk by concatenating its memory
structures without any padding.  Consequently some of the structs are
not aligned at a particular boundary when the whole extension is read
back in one go.  That's only OK on platforms without strict alignment
requirements, or for byte-aligned data like strings or hash values.

Decode struct ondisk_untracked_cache carefully from the extension
blob by using explicit pointer arithmetic with offsets, avoiding
alignment issues.  Use char pointers for passing stat_data objects to
stat_data_from_disk(), and use memcpy(3) in that function to  get the
contents into a properly aligned struct, then perform the byte-order
adjustment in place there.

Found with Clang's UBSan.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-17 14:56:16 -07:00
Junio C Hamano
0c6435a4d6 Merge branch 'ab/wildmatch'
Minor code cleanup.

* ab/wildmatch:
  wildmatch: remove unused wildopts parameter
2017-07-10 13:42:51 -07:00
Junio C Hamano
50f03c6676 Merge branch 'ab/free-and-null'
A common pattern to free a piece of memory and assign NULL to the
pointer that used to point at it has been replaced with a new
FREE_AND_NULL() macro.

* ab/free-and-null:
  *.[ch] refactoring: make use of the FREE_AND_NULL() macro
  coccinelle: make use of the "expression" FREE_AND_NULL() rule
  coccinelle: add a rule to make "expression" code use FREE_AND_NULL()
  coccinelle: make use of the "type" FREE_AND_NULL() rule
  coccinelle: add a rule to make "type" code use FREE_AND_NULL()
  git-compat-util: add a FREE_AND_NULL() wrapper around free(ptr); ptr = NULL
2017-06-24 14:28:41 -07:00
Junio C Hamano
f31d23a399 Merge branch 'bw/config-h'
Fix configuration codepath to pay proper attention to commondir
that is used in multi-worktree situation, and isolate config API
into its own header file.

* bw/config-h:
  config: don't implicitly use gitdir or commondir
  config: respect commondir
  setup: teach discover_git_directory to respect the commondir
  config: don't include config.h by default
  config: remove git_config_iter
  config: create config.h
2017-06-24 14:28:41 -07:00
Junio C Hamano
5812b3f73b Merge branch 'bw/ls-files-sans-the-index'
Code clean-up.

* bw/ls-files-sans-the-index:
  ls-files: factor out tag calculation
  ls-files: factor out debug info into a function
  ls-files: convert show_files to take an index
  ls-files: convert show_ce_entry to take an index
  ls-files: convert prune_cache to take an index
  ls-files: convert ce_excluded to take an index
  ls-files: convert show_ru_info to take an index
  ls-files: convert show_other_files to take an index
  ls-files: convert show_killed_files to take an index
  ls-files: convert write_eolinfo to take an index
  ls-files: convert overlay_tree_on_cache to take an index
  tree: convert read_tree to take an index parameter
  convert: convert renormalize_buffer to take an index
  convert: convert convert_to_git to take an index
  convert: convert convert_to_git_filter_fd to take an index
  convert: convert crlf_to_git to take an index
  convert: convert get_cached_convert_stats_ascii to take an index
2017-06-24 14:28:40 -07:00
Ævar Arnfjörð Bjarmason
55d3426929 wildmatch: remove unused wildopts parameter
Remove the unused wildopts placeholder struct from being passed to all
wildmatch() invocations, or rather remove all the boilerplate NULL
parameters.

This parameter was added back in commit 9b3497cab9 ("wildmatch: rename
constants and update prototype", 2013-01-01) as a placeholder for
future use. Over 4 years later nothing has made use of it, let's just
remove it. It can be added in the future if we find some reason to
start using such a parameter.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23 18:27:07 -07:00
Junio C Hamano
52ab95cfea Merge branch 'pc/dir-count-slashes'
Three instances of the same helper function have been consolidated
to one.

* pc/dir-count-slashes:
  dir: create function count_slashes()
2017-06-22 14:15:21 -07:00
Ævar Arnfjörð Bjarmason
6a83d90207 coccinelle: make use of the "type" FREE_AND_NULL() rule
Apply the result of the just-added coccinelle rule. This manually
excludes a few occurrences, mostly things that resulted in many
FREE_AND_NULL() on one line, that'll be manually fixed in a subsequent
change.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-16 12:44:03 -07:00
Brandon Williams
b2141fc1d2 config: don't include config.h by default
Stop including config.h by default in cache.h.  Instead only include
config.h in those files which require use of the config system.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-15 12:56:22 -07:00
Junio C Hamano
b9a7d55d93 Merge branch 'nd/fopen-errors'
We often try to open a file for reading whose existence is
optional, and silently ignore errors from open/fopen; report such
errors if they are not due to missing files.

* nd/fopen-errors:
  mingw_fopen: report ENOENT for invalid file names
  mingw: verify that paths are not mistaken for remote nicknames
  log: fix memory leak in open_next_file()
  rerere.c: move error_errno() closer to the source system call
  print errno when reporting a system call error
  wrapper.c: make warn_on_inaccessible() static
  wrapper.c: add and use fopen_or_warn()
  wrapper.c: add and use warn_on_fopen_errors()
  config.mak.uname: set FREAD_READS_DIRECTORIES for Darwin, too
  config.mak.uname: set FREAD_READS_DIRECTORIES for Linux and FreeBSD
  clone: use xfopen() instead of fopen()
  use xfopen() in more places
  git_fopen: fix a sparse 'not declared' warning
2017-06-13 13:47:09 -07:00
Junio C Hamano
93dd544f54 Merge branch 'jc/noent-notdir'
Our code often opens a path to an optional file, to work on its
contents when we can successfully open it.  We can ignore a failure
to open if such an optional file does not exist, but we do want to
report a failure in opening for other reasons (e.g. we got an I/O
error, or the file is there, but we lack the permission to open).

The exact errors we need to ignore are ENOENT (obviously) and
ENOTDIR (less obvious).  Instead of repeating comparison of errno
with these two constants, introduce a helper function to do so.

* jc/noent-notdir:
  treewide: use is_missing_file_error() where ENOENT and ENOTDIR are checked
  compat-util: is_missing_file_error()
2017-06-13 13:47:07 -07:00
Junio C Hamano
f4683b4e9c Merge branch 'sl/clean-d-ignored-fix' into maint
"git clean -d" used to clean directories that has ignored files,
even though the command should not lose ignored ones without "-x".
"git status --ignored"  did not list ignored and untracked files
without "-uall".  These have been corrected.

* sl/clean-d-ignored-fix:
  clean: teach clean -d to preserve ignored paths
  dir: expose cmp_name() and check_contains()
  dir: hide untracked contents of untracked dirs
  dir: recurse into untracked dirs for ignored files
  t7061: status --ignored should search untracked dirs
  t7300: clean -d should skip dirs with ignored files
2017-06-13 13:27:02 -07:00
Brandon Williams
82b474e025 convert: convert convert_to_git to take an index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-13 11:40:51 -07:00
Prathamesh Chavan
e0556a928f dir: create function count_slashes()
Similar functions exist in apply.c and builtin/show-branch.c for
counting the number of slashes in a string. Also in the later
patches, we introduce a third caller for the same. Hence, we unify
it now by cleaning the existing functions and declaring a common
function count_slashes in dir.h and implementing it in dir.c to
remove this code duplication.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-12 13:26:55 -07:00
Junio C Hamano
f4fd99bf6e Merge branch 'sl/clean-d-ignored-fix'
"git clean -d" used to clean directories that has ignored files,
even though the command should not lose ignored ones without "-x".
"git status --ignored"  did not list ignored and untracked files
without "-uall".  These have been corrected.

* sl/clean-d-ignored-fix:
  clean: teach clean -d to preserve ignored paths
  dir: expose cmp_name() and check_contains()
  dir: hide untracked contents of untracked dirs
  dir: recurse into untracked dirs for ignored files
  t7061: status --ignored should search untracked dirs
  t7300: clean -d should skip dirs with ignored files
2017-06-02 15:06:05 +09:00
Junio C Hamano
c7054209d6 treewide: use is_missing_file_error() where ENOENT and ENOTDIR are checked
Using the is_missing_file_error() helper introduced in the previous
step, update all hits from

  $ git grep -e ENOENT --and -e ENOTDIR

There are codepaths that only check ENOENT, and it is possible that
some of them should be checking both.  Updating them is kept out of
this step deliberately, as we do not want to change behaviour in this
step.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-30 09:29:00 +09:00
Nguyễn Thái Ngọc Duy
11dc1fcb3f wrapper.c: add and use warn_on_fopen_errors()
In many places, Git warns about an inaccessible file after a fopen()
failed. To discern these cases from other cases where we want to warn
about inaccessible files, introduce a new helper specifically to test
whether fopen() failed because the current user lacks the permission to
open file in question.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26 12:33:55 +09:00
Samuel Lijin
bbf504a995 dir: expose cmp_name() and check_contains()
We want to use cmp_name() and check_contains() (which both compare
`struct dir_entry`s, the former in terms of the sort order, the latter
in terms of whether one lexically contains another) outside of dir.c,
so we have to (1) change their linkage and (2) rename them as
appropriate for the global namespace. The second is achieved by
renaming cmp_name() to cmp_dir_entry() and check_contains() to
check_dir_entry_contains().

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-22 12:14:13 +09:00
Samuel Lijin
fb89888849 dir: hide untracked contents of untracked dirs
When we taught read_directory_recursive() to recurse into untracked
directories in search of ignored files given DIR_SHOW_IGNORED_TOO, that
had the side effect of teaching it to collect the untracked contents of
untracked directories. It doesn't always make sense to return these,
though (we do need them for `clean -d`), so we introduce a flag
(DIR_KEEP_UNTRACKED_CONTENTS) to control whether or not read_directory()
strips dir->entries of the untracked contents of untracked dirs.

We also introduce check_contains() to check if one dir_entry corresponds
to a path which contains the path corresponding to another dir_entry.

This also fixes known breakages in t7061, since status --ignored now
searches untracked directories for ignored files.

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-22 12:14:09 +09:00
Samuel Lijin
df5bcdf83a dir: recurse into untracked dirs for ignored files
We consider directories containing only untracked and ignored files to
be themselves untracked, which in the usual case means we don't have to
search these directories. This is problematic when we want to collect
ignored files with DIR_SHOW_IGNORED_TOO, though, so we teach
read_directory_recursive() to recurse into untracked directories to find
the ignored files they contain when DIR_SHOW_IGNORED_TOO is set. This
has the side effect of also collecting all untracked files in untracked
directories as well.

Signed-off-by: Samuel Lijin <sxlijin@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-22 12:06:52 +09:00