39 Commits

Author SHA1 Message Date
Jean-Pierre André
17b56ccfa2 Allowed names with trailing dot or space on conditions
Windows places filenames with a trailing dot or space in the Win32
namespace and allows setting DOS names on such files.  This is true even
though on Windows such filenames can only be created and accessed using
WinNT-style paths and will confuse most Windows software.  Regardless,
because libntfs-3g did not allow setting DOS names on such files, in
some cases it was impossible to correctly restore, using libntfs-3g, a
directory structure that was created under Windows.

Update ntfs_set_ntfs_dos_name() to permit operating on a file that has a
long name with a trailing dot or space.  But continue to forbid creating
such names on a filesystem FUSE-mounted with the windows_names option.
Additionally, continue to forbid a trailing dot or space in DOS names;
this matches the Windows behavior.

(contributed by Eric Biggers)
2017-02-11 10:54:51 +01:00
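
The rules described in this commit can be sketched as three small predicates. This is a minimal illustration only, not the code of ntfs_set_ntfs_dos_name(); all function names and the `windows_names` parameter are hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if the name ends with '.' or ' '. */
bool has_trailing_dot_or_space(const char *name)
{
	size_t len = strlen(name);

	return len > 0 && (name[len - 1] == '.' || name[len - 1] == ' ');
}

/* A DOS name may never end with a dot or space (matches Windows). */
bool dos_name_allowed(const char *dos_name)
{
	return !has_trailing_dot_or_space(dos_name);
}

/*
 * Creating a long name with a trailing dot or space is refused only
 * when the (assumed) windows_names flag is in effect.
 */
bool long_name_creation_allowed(const char *name, bool windows_names)
{
	return !(windows_names && has_trailing_dot_or_space(name));
}
```

Under these assumptions, an existing file named "foo." can still receive a DOS name such as "FOO~1", while "FOO~1." is rejected as a DOS name.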
Jean-Pierre André
2052b46639 Fixed a possible buffer overrun in ntfs_utf16_to_utf8()
If an output buffer was provided, ntfs_utf16_to_utf8() limited the
output string length without the terminating null to 'outs_len'.  This
was incorrect because a terminating null was always added to the string,
causing a buffer overrun if the output string happened to have exactly
the maximum length.  This was a longstanding bug.  Fix it by leaving
space for a terminating null.

(contributed by Eric Biggers)
2017-02-11 09:51:17 +01:00
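
The off-by-one described above can be illustrated with a bounded copy that reserves one byte for the terminating null. This is a sketch under assumptions, not the libntfs-3g code: `copy_terminated` and its parameters are hypothetical, and `outs_len` is taken to be the full size of the output buffer in bytes.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copy 'len' bytes of already converted text into
 * a buffer of 'outs_len' bytes and terminate it.
 * Returns the string length, or -1 with errno set to ENAMETOOLONG.
 */
int copy_terminated(char *outs, size_t outs_len, const char *src, size_t len)
{
	/* One byte must stay free for the terminating null. */
	if (outs_len == 0 || len > outs_len - 1) {
		errno = ENAMETOOLONG;
		return -1;
	}
	memcpy(outs, src, len);
	outs[len] = '\0';
	return (int)len;
}
```

With a 4-byte buffer, a 3-byte string fits, while a 4-byte string is rejected instead of writing the null one byte past the end.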
Jean-Pierre André
b9624542e0 Made utf16_to_utf8_size() always honor @outs_len
utf16_to_utf8_size() was not guaranteed to fail with ENAMETOOLONG if the
computed length was greater than @outs_len.  This could cause a buffer
overrun in ntfs_utf16_to_utf8().

(contributed by Eric Biggers)
2017-02-11 09:49:03 +01:00
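
A size computation that always honors its limit might look like the sketch below. It is not the actual utf16_to_utf8_size(); the name `utf8_size_checked`, its signature, and the choice to count an unpaired surrogate as 3 bytes are assumptions made for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch: return the number of UTF-8 bytes (without the
 * terminating null) needed for 'ins', or -1 with errno set to
 * ENAMETOOLONG as soon as the count exceeds 'max_len'.
 */
int utf8_size_checked(const uint16_t *ins, size_t ins_len, size_t max_len)
{
	size_t count = 0;
	size_t i;

	for (i = 0; i < ins_len; i++) {
		uint16_t c = ins[i];

		if (c < 0x80)
			count += 1;
		else if (c < 0x800)
			count += 2;
		else if (c >= 0xd800 && c < 0xdc00 && i + 1 < ins_len
				&& ins[i + 1] >= 0xdc00 && ins[i + 1] < 0xe000) {
			count += 4;	/* valid surrogate pair */
			i++;
		} else
			count += 3;	/* BMP character or unpaired surrogate */
		/* Checked on every iteration so no path can skip the limit. */
		if (count > max_len) {
			errno = ENAMETOOLONG;
			return -1;
		}
	}
	return (int)count;
}
```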
Jean-Pierre André
4264f19acb Cleaned up file name collation code
- Update documentation for COLLATION_RULES
- Document how ntfs_names_full_collate() compares names
- Update comments and DEBUG code to reflect that ntfs_names_full_collate()
  always accesses 'upcase', even in CASE_SENSITIVE mode
- Remove unneeded assignments to 'c1' and 'c2' in IGNORE_CASE mode

Signed-off-by: Eric Biggers <ebiggers3@gmail.com>
2016-07-28 16:10:14 +02:00
Jean-Pierre André
e7c5950117 Silenced a truncation warning in upper case table
The upper case value for 0x1d79 is 0xa77d, so the difference is 0x8a04,
which overflows in the table which defines the computation of upper case
values. Rewriting this difference as -0x75fc leads to the same result
in an upper case table truncated to two bytes, and this avoids the
compiler warning.
2016-05-31 08:24:23 +02:00
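
The arithmetic can be checked with a small sketch. The structure layout below is hypothetical (the message only implies that the stored difference is narrower than 0x8a04 allows); it assumes a signed 16-bit difference added to the character with 16-bit wrap-around.

```c
#include <stdint.h>

/* Hypothetical table entry: a character and a signed 16-bit offset. */
struct upcase_entry {
	uint16_t first;
	int16_t diff;
};

/* The sum is truncated to two bytes, so -0x75fc acts like +0x8a04. */
uint16_t upcase_from_entry(const struct upcase_entry *e)
{
	return (uint16_t)(e->first + e->diff);
}
```

Since 0x8a04 + 0x75fc = 0x10000, both offsets give the same result modulo 2^16, but only -0x75fc fits in a signed 16-bit field without a truncation warning.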
Erik Larsson
f0370bfa9c unistr.c: Unify the two defines NOREVBOM and ALLOW_BROKEN_SURROGATES.
In the mailing list discussion we came to the conclusion that there
doesn't seem to be any reason to keep these declarations separate since
they address the same issue, namely libntfs-3g's tolerance for bad
Unicode data in filenames and other UTF-16 strings in the file system,
so merge the two defines into the new define ALLOW_BROKEN_UNICODE.
2016-04-12 17:02:40 +02:00
Erik Larsson
d9c61dd60e unistr.c: Enable encoding broken UTF-16 into broken UTF-8, A.K.A. WTF-8.
Windows filenames may contain invalid UTF-16 sequences (specifically
broken surrogate pairs), which cannot be converted to UTF-8 if we do
strict conversion.

This patch enables encoding broken UTF-16 into similarly broken UTF-8 by
encoding any surrogate character that doesn't have a match into a separate
3-byte UTF-8 sequence.

This is "sort of" valid UTF-8, but not valid Unicode since the code
points used for surrogate pair encoding are not supposed to occur in a
valid Unicode string... but on the other hand the source UTF-16 data is
also broken, so we aren't really making things any worse.

This format is sometimes referred to as WTF-8 (Wobbly Transformation
Format, 8-bit encoding) and is a common solution to represent broken
UTF-16 as UTF-8.

It is a lossless round-trip conversion, i.e. converting from broken
UTF-16 to "WTF-8" and back to UTF-16 yields the same broken UTF-16
sequence. Because of this property it enables accessing these files
by filename through ntfs-3g and the ntfsprogs (e.g. ls -la works as
expected).

To disable this behaviour you can pass the preprocessor/compiler flag
'-DALLOW_BROKEN_SURROGATES=0' when building ntfs-3g.
2016-04-08 05:39:48 +02:00
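
The encoding described in this commit can be sketched as follows. This is an illustrative encoder, not the unistr.c implementation; `wtf8_encode` and its signature are hypothetical, and no terminating null is written.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical sketch: encode UTF-16 as UTF-8, writing an unpaired
 * surrogate as its own 3-byte sequence.
 * Returns the number of bytes written, or -1 if 'out' is too small.
 */
int wtf8_encode(const uint16_t *in, size_t in_len, uint8_t *out, size_t out_len)
{
	size_t i, n = 0;

	for (i = 0; i < in_len; i++) {
		uint32_t c = in[i];

		/* Combine a high surrogate with a following low surrogate. */
		if (c >= 0xd800 && c < 0xdc00 && i + 1 < in_len
				&& in[i + 1] >= 0xdc00 && in[i + 1] < 0xe000) {
			c = 0x10000 + ((c - 0xd800) << 10) + (in[i + 1] - 0xdc00);
			i++;
		}
		if (c < 0x80) {
			if (n + 1 > out_len)
				return -1;
			out[n++] = (uint8_t)c;
		} else if (c < 0x800) {
			if (n + 2 > out_len)
				return -1;
			out[n++] = (uint8_t)(0xc0 | (c >> 6));
			out[n++] = (uint8_t)(0x80 | (c & 0x3f));
		} else if (c < 0x10000) {
			/* Also reached for an unpaired surrogate. */
			if (n + 3 > out_len)
				return -1;
			out[n++] = (uint8_t)(0xe0 | (c >> 12));
			out[n++] = (uint8_t)(0x80 | ((c >> 6) & 0x3f));
			out[n++] = (uint8_t)(0x80 | (c & 0x3f));
		} else {
			if (n + 4 > out_len)
				return -1;
			out[n++] = (uint8_t)(0xf0 | (c >> 18));
			out[n++] = (uint8_t)(0x80 | ((c >> 12) & 0x3f));
			out[n++] = (uint8_t)(0x80 | ((c >> 6) & 0x3f));
			out[n++] = (uint8_t)(0x80 | (c & 0x3f));
		}
	}
	return (int)n;
}
```

In this sketch a lone 0xd800 becomes the bytes ED A0 80, while the valid pair 0xd83d 0xde00 becomes the ordinary 4-byte sequence F0 9F 98 80.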
Erik Larsson
9893ea9ee6 Merge endianness fixes.
Conflicts:
	libntfs-3g/attrib.c
2016-01-28 09:22:42 +01:00
Erik Larsson
9cf04fd2cd Fix incorrect usage of native/little-endian types, signed types, etc.
This is harmless with regard to code generation but if we turn on strict
type checking these type mismatches will result in errors.
2015-12-21 23:55:31 +01:00
Erik Larsson
dfa4a6647f Fix code to use const_cpu_to_X/const_X_to_cpu macros for constants.
This enables the compiler to optimize this code in cases where compiler
support for endianness swapping is not present.
2015-12-21 23:21:00 +01:00
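
The idea behind a constant-only swap macro can be sketched as below. The macro and function names are hypothetical and are not the const_cpu_to_X/const_X_to_cpu definitions from NTFS-3G; the sketch only shows a byte swap written as a pure constant expression, which a compiler can evaluate at build time even without a byte-swap builtin.

```c
#include <stdint.h>

/* Hypothetical macro: 16-bit byte swap as a constant expression. */
#define SKETCH_CONST_BSWAP16(x) \
	((uint16_t)((((uint16_t)(x) & 0x00ffU) << 8) | \
		    (((uint16_t)(x) & 0xff00U) >> 8)))

/* Usable where a constant expression is required. */
static const uint16_t swapped_magic = SKETCH_CONST_BSWAP16(0x1234);

uint16_t get_swapped_magic(void)
{
	return swapped_magic;
}
```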
Erik Larsson
c9771d0509 unistr.c: Cleanup of OS X Unicode normalization code.
Normalize coding conventions to fit in with the rest of NTFS-3G,
including line breaks at column 80.
2015-06-23 06:43:17 +02:00
Jean-Pierre André
e40b86a86c Upgraded the upper-case table as defined by Windows 7
Newer versions of Windows use more recent definitions of upper-case
table defined by the Unicode consortium. Now using the same table as
Windows 7, Windows 8 and Windows 10. This only has an effect on file
systems newly created by mkntfs.
2015-04-17 11:03:58 +02:00
Jean-Pierre André
543b17b7ef Rejected reserved file names when option windows_names is set
Windows applies legacy restrictions to file names, so when the option
windows_names is applied, reject the same reserved names, which are
CON, PRN, AUX, NUL, COM1..COM9, and LPT1..LPT9
2014-03-11 10:56:31 +01:00
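
A check for the names listed above might look like this sketch. It is not the libntfs-3g code: `is_reserved_name` is a hypothetical function, it assumes a case-insensitive comparison, and it only tests the bare name (any handling of extensions is left out).

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical check for CON, PRN, AUX, NUL, COM1..COM9, LPT1..LPT9. */
bool is_reserved_name(const char *name)
{
	static const char *const fixed[] = { "CON", "PRN", "AUX", "NUL" };
	char up[5];
	size_t len = strlen(name);
	size_t i;

	if (len < 3 || len > 4)
		return false;
	/* Upper-case a private copy so the comparison ignores case. */
	for (i = 0; i < len; i++)
		up[i] = (char)toupper((unsigned char)name[i]);
	up[len] = '\0';
	if (len == 3) {
		for (i = 0; i < sizeof(fixed) / sizeof(fixed[0]); i++)
			if (!strcmp(up, fixed[i]))
				return true;
		return false;
	}
	/* Four characters: COM or LPT followed by a digit 1..9. */
	return (!strncmp(up, "COM", 3) || !strncmp(up, "LPT", 3))
		&& up[3] >= '1' && up[3] <= '9';
}
```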
Jean-Pierre André
4ce33daf6c Cosmetic : fixed an indentation in unistr 2012-01-23 17:09:19 +01:00
Jean-Pierre André
fa3d7a5728 minor : Fixed ntfs_upcase_build_default() returning garbage in error case (Fabian Keil) 2011-08-04 15:49:35 +02:00
Jean-Pierre André
82b00364a8 Fixed setting DOS names when defined with lower-case chars 2011-07-05 12:17:11 +02:00
Jean-Pierre André
a46a395006 Updated copyright notices 2011-02-08 13:52:12 +01:00
Jean-Pierre André
4c6cf9d977 Moved the knowledge of default upcase size to unistr.c 2011-02-08 13:52:12 +01:00
Jean-Pierre André
53599b1a98 Switched to the same Upcase table as Vista 2010-12-21 15:51:08 +01:00
Jean-Pierre André
8b910e9e80 Improved names comparing on big-endian computers 2010-10-26 08:59:51 +02:00
Jean-Pierre André
008d8c5df9 Fixed character translations when standard functions are not available 2010-08-28 13:59:43 +02:00
Jean-Pierre André
4d73c7c4f1 Fixed characters not allowed by Windows in names 2010-06-03 10:13:30 +02:00
Jean-Pierre André
693aa8780d enabled case insensitive file names in lowntfs-3g 2010-05-25 10:12:44 +02:00
jpandre
195945cdc0 Evaluated file names collations in a single parsing 2009-12-16 09:45:28 +00:00
jpandre
7a876eca36 Fixed possible memory leaks after char translation errors 2009-12-09 11:20:20 +00:00
jpandre
e23481624f Improved UTF8<-->UTF16 translations 2009-12-09 11:19:27 +00:00
jpandre
a75724fea8 Fixed a few misleading endianness types 2009-11-24 14:18:53 +00:00
jpandre
3af7bebe7b Mac OS X Unicode normalization form conversion (Erik Larsson) 2009-11-05 11:40:44 +00:00
jpandre
e4b3c59cb1 Accepted initial spaces in Win32/DOS names 2009-09-18 16:17:21 +00:00
jpandre
1d26eb2b97 Fixed checking spaces in Win32 names 2009-08-12 15:35:11 +00:00
jpandre
9a4672ca65 Developed getting and setting DOS names (short 8+3 names) 2009-07-01 19:45:59 +00:00
jpandre
fc78c03c39 Fixed an endianness error in default uppercase table 2009-04-20 15:27:03 +00:00
jpandre
11216c6942 Adapted to ntfs-3g-2009.1.1 2009-01-23 11:11:44 +00:00
jpandre
d3f3a19866 Adapted to ntfs-3g.1.5222-RC 2009-01-05 13:28:06 +00:00
jpandre
13552eba52 Integrated full utf-8 to utf-16le conversions, based on code by Bernhard Kaindl 2008-08-21 12:04:51 +00:00
szaka
1098244bbf copyright update 2008-06-29 23:13:32 +00:00
jpandre
53fa335624 Adapted to ntfs-3g.1.2310 2008-03-10 15:35:54 +00:00
jpandre
038156ba82 Reengineered LRU caches, made generic, and applied to finding inode numbers 2008-01-10 17:32:55 +00:00
szaka
ba63b7daca initial CVS import 2006-10-30 22:32:48 +00:00