Commit Graph

107 Commits

Author SHA1 Message Date
Junio C Hamano
163b959194 Update 'crlf' attribute semantics.
This updates the semantics of 'crlf' so that .gitattributes file
can say "this is text, even though it may look funny".

Setting the `crlf` attribute on a path is meant to mark the path
as a "text" file.  'core.autocrlf' conversion takes place
without guessing the content type by inspection.

Unsetting the `crlf` attribute on a path is meant to mark the
path as a "binary" file.  The path never goes through line
endings conversion upon checkin/checkout.

Unspecified `crlf` attribute tells git to apply the
`core.autocrlf` conversion when the file content looks like
text.

Setting the `crlf` attribut to string value "input" is similar
to setting the attribute to `true`, but also forces git to act
as if `core.autocrlf` is set to `input` for the path.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-19 22:37:44 -07:00
Junio C Hamano
a5e92abde6 Fix funny types used in attribute value representation
It was bothering me a lot that I abused small integer values
casted to (void *) to represent non string values in
gitattributes.  This corrects it by making the type of attribute
values (const char *), and using the address of a few statically
allocated character buffer to denote true/false.  Unset attributes
are represented as having NULLs as their values.

Added in-header documentation to explain how git_checkattr()
routine should be called.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-18 16:17:13 -07:00
Junio C Hamano
515106fa13 Allow more than true/false to attributes.
This allows you to define three values (and possibly more) to
each attribute: true, false, and unset.

Typically the handlers that notice and act on attribute values
treat "unset" attribute to mean "do your default thing"
(e.g. crlf that is unset would trigger "guess from contents"),
so being able to override a setting to an unset state is
actually useful.

 - If you want to set the attribute value to true, have an entry
   in .gitattributes file that mentions the attribute name; e.g.

	*.o	binary

 - If you want to set the attribute value explicitly to false,
   use '-'; e.g.

	*.a	-diff

 - If you want to make the attribute value _unset_, perhaps to
   override an earlier entry, use '!'; e.g.

	*.a	-diff
	c.i.a	!diff

This also allows string values to attributes, with the natural
syntax:

	attrname=attrvalue

but you cannot use it, as nobody takes notice and acts on
it yet.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-17 01:04:59 -07:00
Junio C Hamano
201ac8efc7 Fix 'crlf' attribute semantics.
Earlier we said 'crlf lets the path go through core.autocrlf
process while !crlf disables it altogether'.  This fixes the
semantics to:

 - Lack of 'crlf' attribute makes core.autocrlf to apply
   (i.e. we guess based on the contents and if platform
   expresses its desire to have CRLF line endings via
   core.autocrlf, we do so).

 - Setting 'crlf' attribute to true forces CRLF line endings in
   working tree files, even if blob does not look like text
   (e.g. contains NUL or other bytes we consider binary).

 - Setting 'crlf' attribute to false disables conversion.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-15 13:35:45 -07:00
Junio C Hamano
35ebfd6a0c Define 'crlf' attribute.
This defines the semantics of 'crlf' attribute as an example.
When a path has this attribute unset (i.e. '!crlf'), autocrlf
line-end conversion is not applied.

Eventually we would want to let users to build a pipeline of
processing to munge blob data to filesystem format (and in the
other direction) based on combination of attributes, and at that
point the mechanism in convert_to_{git,working_tree}() that
looks at 'crlf' attribute needs to be enhanced.  Perhaps the
existing 'crlf' would become the first step in the input chain,
and the last step in the output chain.

Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-04-14 08:57:06 -07:00
Linus Torvalds
d7f4633405 Make AutoCRLF ternary variable.
This allows you to do:

	[core]
		AutoCRLF = input

and it should do only the CRLF->LF translation (ie it simplifies CRLF only
when reading working tree files, but when checking out files, it leaves
the LF alone, and doesn't turn it into a CRLF).

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-14 11:19:28 -08:00
Linus Torvalds
6c510bee20 Lazy man's auto-CRLF
It currently does NOT know about file attributes, so it does its
conversion purely based on content. Maybe that is more in the "git
philosophy" anyway, since content is king, but I think we should try to do
the file attributes to turn it off on demand.

Anyway, BY DEFAULT it is off regardless, because it requires a

	[core]
		AutoCRLF = true

in your config file to be enabled. We could make that the default for
Windows, of course, the same way we do some other things (filemode etc).

But you can actually enable it on UNIX, and it will cause:

 - "git update-index" will write blobs without CRLF
 - "git diff" will diff working tree files without CRLF
 - "git checkout" will write files to the working tree _with_ CRLF

and things work fine.

Funnily, it actually shows an odd file in git itself:

	git clone -n git test-crlf
	cd test-crlf
	git config core.autocrlf true
	git checkout
	git diff

shows a diff for "Documentation/docbook-xsl.css". Why? Because we have
actually checked in that file *with* CRLF! So when "core.autocrlf" is
true, we'll always generate a *different* hash for it in the index,
because the index hash will be for the content _without_ CRLF.

Is this complete? I dunno. It seems to work for me. It doesn't use the
filename at all right now, and that's probably a deficiency (we could
certainly make the "is_binary()" heuristics also take standard filename
heuristics into account).

I don't pass in the filename at all for the "index_fd()" case
(git-update-index), so that would need to be passed around, but this
actually works fine.

NOTE NOTE NOTE! The "is_binary()" heuristics are totally made-up by yours
truly. I will not guarantee that they work at all reasonable. Caveat
emptor. But it _is_ simple, and it _is_ safe, since it's all off by
default.

The patch is pretty simple - the biggest part is the new "convert.c" file,
but even that is really just basic stuff that anybody can write in
"Teaching C 101" as a final project for their first class in programming.
Not to say that it's bug-free, of course - but at least we're not talking
about rocket surgery here.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <junkio@cox.net>
2007-02-14 11:19:22 -08:00