git/builtin
John Cai 440c705ea6 cat-file: add --batch-command mode
Add a new flag --batch-command that accepts commands and arguments
from stdin, similar to git-update-ref --stdin.

At GitLab, we use a pair of long running cat-file processes when
accessing object content. One for iterating over object metadata with
--batch-check, and the other to grab object contents with --batch.

However, if we had --batch-command, we wouldn't need to keep both
processes around, and instead just have one --batch-command process
where we can flip between getting object info, and getting object
contents. Since we have a pair of cat-file processes per repository,
this means we can get rid of roughly half of long lived git cat-file
processes. Given there are many repositories being accessed at any given
time, this can lead to huge savings.

git cat-file --batch-command

will enter an interactive command mode whereby the user can enter in
commands and their arguments that get queued in memory:

<command1> [arg1] [arg2] LF
<command2> [arg1] [arg2] LF

When --buffer mode is used, commands will be queued in memory until a
flush command is issued that execute them:

flush LF

The reason for a flush command is that when a consumer process (A)
talks to a git cat-file process (B) and interactively writes to and
reads from it in --buffer mode, (A) needs to be able to control when
the buffer is flushed to stdout.

Currently, from (A)'s perspective, the only way is to either

1. kill (B)'s process
2. send an invalid object to stdin.

1. is not ideal from a performance perspective as it will require
spawning a new cat-file process each time, and 2. is hacky and not a
good long term solution.

With this mechanism of queueing up commands and letting (A) issue a
flush command, process (A) can control when the buffer is flushed and
can guarantee it will receive all of the output when in --buffer mode.
--batch-command also will not allow (B) to flush to stdout until a flush
is received.

This patch adds the basic structure for adding command which can be
extended in the future to add more commands. It also adds the following
two commands (on top of the flush command):

contents <object> LF
info <object> LF

The contents command takes an <object> argument and prints out the object
contents.

The info command takes an <object> argument and prints out the object
metadata.

These can be used in the following way with --buffer:

info <object> LF
contents <object> LF
contents <object> LF
info <object> LF
flush LF
info <object> LF
flush LF

When used without --buffer:

info <object> LF
contents <object> LF
contents <object> LF
info <object> LF
info <object> LF

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-02-18 11:21:46 -08:00
..
add.c i18n: turn even more messages into "cannot be used together" ones 2022-01-05 13:31:00 -08:00
am.c Merge branch 'ja/i18n-similar-messages' 2022-01-10 11:52:56 -08:00
annotate.c
apply.c
archive.c use xopen() to handle fatal open(2) failures 2021-08-25 14:39:08 -07:00
bisect--helper.c bisect--helper: add space between colon and following sentence 2021-10-01 15:47:53 -07:00
blame.c Merge branch 'ld/sparse-diff-blame' 2021-12-21 15:03:17 -08:00
branch.c Merge branch 'js/branch-track-inherit' 2022-01-20 15:25:38 -08:00
bugreport.c hook-list.h: add a generated list of hooks, like config-list.h 2021-09-27 09:44:54 -07:00
bundle.c Merge branch 'ab/bundle-remove-verbose-option' 2021-10-03 21:49:20 -07:00
cat-file.c cat-file: add --batch-command mode 2022-02-18 11:21:46 -08:00
check-attr.c
check-ignore.c dir.[ch]: replace dir_init() with DIR_INIT 2021-07-01 12:32:22 -07:00
check-mailmap.c shortlog: remove unused(?) "repo-abbrev" feature 2021-01-12 14:04:42 -08:00
check-ref-format.c
checkout--worker.c pkt-line.[ch]: remove unused packet_read_line_buf() 2021-10-15 13:09:40 -07:00
checkout-index.c Merge branch 'mt/parallel-checkout-part-3' 2021-05-16 21:05:23 +09:00
checkout.c Merge branch 'ab/checkout-branch-info-leakfix' 2022-01-24 09:14:46 -08:00
clean.c clean: do not attempt to remove startup_info->original_cwd 2021-12-09 13:33:13 -08:00
clone.c Merge branch 'ps/lockfile-cleanup-fix' 2022-01-12 15:11:43 -08:00
column.c column: fix parsing of the '--nl' option 2021-08-26 14:36:27 -07:00
commit-graph.c Merge branch 'ab/ignore-replace-while-working-on-commit-graph' 2021-11-01 13:48:08 -07:00
commit-tree.c use xopen() to handle fatal open(2) failures 2021-08-25 14:39:08 -07:00
commit.c i18n: turn even more messages into "cannot be used together" ones 2022-01-05 13:31:00 -08:00
config.c urlmatch.[ch]: add and use URLMATCH_CONFIG_INIT 2021-10-01 14:22:51 -07:00
count-objects.c
credential-cache--daemon.c unix-socket: add backlog size option to unix_stream_listen() 2021-03-15 14:32:51 -07:00
credential-cache.c credential-cache: check for windows specific errors 2021-09-14 09:30:54 -07:00
credential-store.c Use a better name for the function interpolating paths 2021-07-26 12:17:16 -07:00
credential.c doc: fix git credential synopsis 2021-10-28 09:57:09 -07:00
describe.c i18n: turn even more messages into "cannot be used together" ones 2022-01-05 13:31:00 -08:00
diff-files.c Merge branch 'jc/diffcore-rotate' 2021-02-25 16:43:30 -08:00
diff-index.c diff-index: restore -c/--cc options handling 2021-09-07 11:11:35 -07:00
diff-tree.c i18n: refactor "foo and bar are mutually exclusive" 2022-01-05 13:29:23 -08:00
diff.c diff: enable and test the sparse index 2021-12-06 09:55:06 -08:00
difftool.c i18n: turn "options are incompatible" into "cannot be used together" 2022-01-05 13:29:23 -08:00
env--helper.c assert PARSE_OPT_NONEG in parse-options callbacks 2020-09-30 12:53:47 -07:00
fast-export.c Merge branch 'ja/i18n-similar-messages' 2022-01-10 11:52:56 -08:00
fast-import.c usage.c API users: use die_message() for "fatal :" + exit 128 2021-12-07 13:25:15 -08:00
fetch-pack.c connect, transport: encapsulate arg in struct 2021-02-05 13:49:54 -08:00
fetch.c Merge branch 'ps/lockfile-cleanup-fix' 2022-01-12 15:11:43 -08:00
fmt-merge-msg.c merge: allow to pretend a merge is made into a different branch 2021-12-20 14:55:02 -08:00
for-each-ref.c for-each-ref: delay parsing of --sort=<atom> options 2021-10-20 14:33:07 -07:00
for-each-repo.c builtin/for-each-repo: remove unnecessary argv copy to plug leak 2021-07-26 12:19:20 -07:00
fsck.c run-command API users: use strvec_pushl(), not argv construction 2021-11-25 22:15:07 -08:00
gc.c usage.c + gc: add and use a die_message_errno() 2021-12-07 13:25:16 -08:00
get-tar-commit-id.c
grep.c grep: fix a "path_list" memory leak 2021-10-23 10:45:25 -07:00
hash-object.c Merge branch 'jc/prefix-filename-allocates' into maint 2021-10-12 13:51:32 -07:00
help.c run-command API users: use strvec_pushl(), not argv construction 2021-11-25 22:15:07 -08:00
index-pack.c i18n: factorize "--foo requires --bar" and the like 2022-01-05 13:31:00 -08:00
init-db.c i18n: refactor "foo and bar are mutually exclusive" 2022-01-05 13:29:23 -08:00
interpret-trailers.c
log.c Merge branch 'ja/i18n-similar-messages' 2022-01-10 11:52:56 -08:00
ls-files.c Merge branch 'ja/i18n-similar-messages' 2022-01-10 11:52:56 -08:00
ls-remote.c for-each-ref: delay parsing of --sort=<atom> options 2021-10-20 14:33:07 -07:00
ls-tree.c tree.h API: simplify read_tree_recursive() signature 2021-03-20 16:09:26 -07:00
mailinfo.c mailinfo: allow squelching quoted CRLF warning 2021-05-10 15:06:22 +09:00
mailsplit.c use xopen() to handle fatal open(2) failures 2021-08-25 14:39:08 -07:00
merge-base.c
merge-file.c xdiff: implement a zealous diff3, or "zdiff3" 2021-12-01 14:45:58 -08:00
merge-index.c merge-index: ensure full index 2021-04-14 13:47:21 -07:00
merge-ours.c builtins + test helpers: use return instead of exit() in cmd_* 2021-06-09 09:15:58 +09:00
merge-recursive.c
merge-tree.c xdiff users: use designated initializers for out_line 2021-05-11 12:47:31 +09:00
merge.c Merge branch 'ja/i18n-similar-messages' 2022-01-10 11:52:56 -08:00
mktag.c fsck: report invalid object type-path combinations 2021-10-01 15:06:01 -07:00
mktree.c builtins + test helpers: use return instead of exit() in cmd_* 2021-06-09 09:15:58 +09:00
multi-pack-index.c builtin/multi-pack-index.c: don't leak concatenated options 2021-10-28 15:32:14 -07:00
mv.c mv: refuse to move sparse paths 2021-09-28 10:31:02 -07:00
name-rev.c name-rev: prefer shorter names over following merges 2021-12-04 23:39:34 -08:00
notes.c Merge branch 'ab/usage-die-message' 2022-01-10 11:52:53 -08:00
pack-objects.c i18n: turn "options are incompatible" into "cannot be used together" 2022-01-05 13:29:23 -08:00
pack-redundant.c builtin/pack-redundant: avoid casting buffers to struct object_id 2021-04-27 16:31:38 +09:00
pack-refs.c
patch-id.c
prune-packed.c
prune.c Merge branch 'ns/tmp-objdir' 2022-01-03 16:24:15 -08:00
pull.c Merge branch 'pb/pull-rebase-autostash-fix' 2022-02-05 09:42:28 -08:00
push.c i18n: turn "options are incompatible" into "cannot be used together" 2022-01-05 13:29:23 -08:00
range-diff.c column, range-diff: downcase option description 2021-03-29 14:06:08 -07:00
read-tree.c Change unpack_trees' 'reset' flag into an enum 2021-09-27 13:38:37 -07:00
rebase.c i18n: turn even more messages into "cannot be used together" ones 2022-01-05 13:31:00 -08:00
receive-pack.c Merge branch 'jc/find-header' 2022-02-05 09:42:29 -08:00
reflog.c builtin/reflog.c: use parse-options api for expire, delete subcommands 2022-01-10 14:13:06 -08:00
remote-ext.c
remote-fd.c
remote.c Merge branch 'ab/designated-initializers-more' 2021-10-18 15:47:57 -07:00
repack.c Merge branch 'ja/i18n-similar-messages' 2022-01-10 11:52:56 -08:00
replace.c run-command API users: use strvec_pushl(), not argv construction 2021-11-25 22:15:07 -08:00
rerere.c xdiff users: use designated initializers for out_line 2021-05-11 12:47:31 +09:00
reset.c i18n: turn even more messages into "cannot be used together" ones 2022-01-05 13:31:00 -08:00
rev-list.c i18n: turn even more messages into "cannot be used together" ones 2022-01-05 13:31:00 -08:00
rev-parse.c refs: drop "broken" flag from for_each_fullref_in() 2021-09-27 12:36:45 -07:00
revert.c Merge branch 'ds/mergies-with-sparse-index' 2021-09-20 15:20:45 -07:00
rm.c Merge branch 'ja/i18n-similar-messages' 2022-01-10 11:52:56 -08:00
send-pack.c Merge branch 'jk/http-push-status-fix' 2021-10-29 15:43:12 -07:00
shortlog.c parse-options.[ch]: consistently use "enum parse_opt_result" 2021-10-08 14:13:11 -07:00
show-branch.c i18n: turn "options are incompatible" into "cannot be used together" 2022-01-05 13:29:23 -08:00
show-index.c builtin/show-index: set the algorithm for object IDs 2021-04-27 16:31:39 +09:00
show-ref.c refs: switch peel_ref() to peel_iterated_oid() 2021-01-21 15:51:31 -08:00
sparse-checkout.c Merge branch 'ds/sparse-checkout-malformed-pattern-fix' 2022-01-10 11:52:49 -08:00
stash.c Merge branch 'ab/cat-file' 2022-02-05 09:42:31 -08:00
stripspace.c
submodule--helper.c i18n: refactor "foo and bar are mutually exclusive" 2022-01-05 13:29:23 -08:00
symbolic-ref.c symbolic-ref: don't leak shortened refname in check_symref() 2021-03-14 15:57:59 -07:00
tag.c i18n: tag.c factorize i18n strings 2022-01-05 13:31:00 -08:00
unpack-file.c
unpack-objects.c hash: provide per-algorithm null OIDs 2021-04-27 16:31:39 +09:00
update-index.c update-index: refresh should rewrite index in case of racy timestamps 2022-01-07 12:37:31 -08:00
update-ref.c update-ref: fix streaming of status updates 2021-09-03 11:35:15 -07:00
update-server-info.c
upload-archive.c upload-archive: use regular "struct child_process" pattern 2021-11-25 22:15:07 -08:00
upload-pack.c upload-pack: document and rename --advertise-refs 2021-08-05 08:59:37 -07:00
var.c var: add GIT_DEFAULT_BRANCH variable 2021-11-03 13:25:36 -07:00
verify-commit.c
verify-pack.c
verify-tag.c
worktree.c i18n: factorize "--foo requires --bar" and the like 2022-01-05 13:31:00 -08:00
write-tree.c