There were multiple cases of little-endian fields being used as
CPU-endian without byte swapping. This would result in incorrect
behaviour on big-endian systems.
On big-endian systems the result of the '!=' operation would be
endian-swapped rather than the first argument (which must have been the
intended action).
This is harmless except when we do strict endianness checking, in which
case this results in a compile error. Fixed by converting values to
CPU endianness before comparing them.
In 'dump_resident_attr_val', 'i' was sometimes used as a native-endian
'int'-precision string length value and sometimes used as a little-
endian 16-bit flags value. This type of mixed usage is bad practice and
results in a hard error when strict endianness checking is used.
Fixed by introducing new variable 'flags' to hold the little-endian 16-
bit flags value.
If the attribute type is specified by the user, 'attr_type' was assigned
a CPU-endian value, however if the attribute type was not specified it
would be assigned the attribute type AT_DATA, which is a little-endian
value. The rest of the code seems to assume that 'attr_type' is
CPU-endian, so this is clearly a bug.
Resolved by fixing the endianness of the variable at little-endian,
converting the input value to little-endian when specified.
In 'dump_attr_record' the variable 'u' was first used to store a
CPU-endian 32-bit value, and then to store a 16-bit little-endian value.
This is bad practice and results in a hard error when strict endian type
checking is used.
Fixed by storing the 16-bit little-endian flags value in a new variable
'flags'.
When looking up the lowercase equivalent of a Unicode character in
ntfs_fix_file_name, no byte swapping was performed on the ntfschar used
as index into the 'locase' array. This would lead to very strange
results on big-endian systems.
This commit addresses issues where little-endian variables are emitted
raw to a log or output stream which is to be interpreted by the user.
Outputting data in non-native endianness can cause confusion for anybody
attempting to debug issues with a file system.
If start buffer is more recent than restart, we update committed LSN
with last record LSN of block (last_end_lsn) while applying action but
forget about it while printing records with -f for investigation
purpose.
Note that while applying actions we use start_buffer to calculate
latest page out of block 2 and block 3 and then from latest take
committed LSN. For -f we don't need buffers so we just compare
directly with committed LSN from restart.
(contributed by Rakesh Pandit)
The new compression formats used by Windows 10 uses reparse data, and
a new reparse tag which it is useful to define even though these formats
is not yet supported by ntfs-3g.
When the unreadable directory has an ATTRIBUTE_LIST attribute and an
INDEX_ALLOCATION attribute occupying split over several extents, the first
of which defines a single cluster, the first INDEX_ALLOCATION extent has
lowest_vcn=0 and highest_vcn=0, and the second one has lowest_vcn=1.
This unusual case, which can be created by the combination of a small
volume and near-full MFT records, triggers some special-case behavior in
ntfs_mapping_pairs_decompress_i(). That behavior is incorrect if the
attribute's first extent only contains a single cluster, since in that case
highest_vcn=0 as well.
This configuration has been tested on Windows and it *is* able to
successfully read the directory. This supports the hypothesis that the
volume is valid and NTFS-3g has a bug on the read side.
This bug could, in theory, occur with any non-resident attribute, not just
INDEX_ALLOCATION attributes.
(Contributed by Eric Biggers)
This fixes the case where the original bad cluster list requires extents.
The list is processed globally, no relocation is done, and the list is
truncated, possibly fitting into fewer extents.
When block 2 or block 3 points backward to block 4, it is not clear
whether the log file only consists of block 2 or block 3 or the log
file has just wrapped around. The latter is now assumed.
Some constraints put on reparse points of unknown type (e.g. they cannot
be deleted) are not acceptable to archivers. This patch removes some
constraints.
When the bad cluster list required extent, ntfsclone and ntfsresize
did not process the extents, leading to unexpected read errors and
unmatching bitmaps. This fix enables the full list to be taken into
account.
Windows requires non-Microsoft reparse points (identified by having bit
31 of the reparse tag clear) to have a 16-byte GUID following the regular
reparse point header. This GUID is not, and cannot, be included in the
"reparse data length" field.
(Contributed by Eric Biggers)
Under some rare condition there is no space in an MFT entry to make
an index non-resident, and the index root has to be moved to an extent.
This fix cares for the situation when the attribute list was inserted
beforehand.
When writing to compressed data, the function ntfs_attr_pwrite()
cannot cross a compression block border. This is a problem for archivers
which rely on libntfs-3g, so the function is now wrapped in another one
which restarts the writing as needed.
The MFT has two runlists which may be partially stored in extents.
When these runlists have to be relocated, the relocations must be done
after the old runlists are not needed any more to read the data of
standard files, but before the MFT may be needed to extend the runlists
of standard files. Before doing so the MFT runlists have to be refreshed
from device in order to collect the updates which cannot be done in
memory during the first stage.
ntfsrecover applies to the metadata the updates which were requested on
Windows but could not be completed because they were interrupted by
some event such as a power failure, a hardware crash, a software crash
or the device being unplugged. Doing so, the file system is restored
to the latest consistent state.
No update to libntfs-3g is required by this implementation.
On-disk struct definitions used native types (u16/u32/u64/s16/s32/s64),
which doesn't say anything about the intended interpretation of the
data. The intention of having little-endian-specific types and
big-endian-specific types must have been to clarify interpretation of
data and intentions in the code. Therefore it seems reasonable to use
these types in struct definitions to clarify what data represention is
used to encode field data.
Because some struct members in layout.h are big-endian, this change also
means moving the duplicated definitions for big-endian byteswapping
macros and big-endian types found in acls.h and security.h to the
appropriate locations in endians.h and types.h respectively in order to
make them available for the struct definitions in layout.h.
Under some condition (probably interference with another process), the
directory list gets reinitialized by releasedir() and opendir() at the
beginning of a partial buffer. So in readdir() skip to the requested
offset. This is a step towards implementing seekdir().
When used with the "no-action" option, the test for self-located MFT
still reported the partition to have been repaired. Adapt the report to
only tell repairing is possible.
When compiled autonomously without the automatically generated dynamic
link stubs, use a generic library name instead of a version dependent one.
(obsolete compile mode rarely used).