Currently UBI erases all corrupted eraseblocks, irrespectively of the nature
of corruption: corruption due to power cuts and non-power cut corruption.
The former case is OK, but the latter is not, because UBI may destroy
potentially important data.
With this patch, during scanning, when UBI hits a PEB with corrupted VID
header, it checks whether this PEB contains only 0xFF data. If yes, it is
safe to erase this PEB and it is put to the 'erase' list. If not, this may
be important data and it is better to avoid erasing this PEB. Instead,
UBI puts it to the corr list and moves out of the pool of available PEB.
IOW, UBI preserves this PEB.
Such corrupted PEB lessen the amount of available PEBs. So the more of them
we accumulate, the less PEBs are available. The maximum amount of non-power
cut corrupted PEBs is 8.
This patch is a response to UBIFS problem where reporter
(Matthew L. Creech <mlcreech@gmail.com>) observes that UBIFS index points
to an unmapped LEB. The theory is that corresponding PEB somehow got
corrupted and UBI wiped it. This patch (actually a series of patches)
tries to make sure such PEBs are preserved - this would make it is easier
to analyze the corruption.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
percpu.h is included by sched.h and module.h and thus ends up being
included when building most .c files. percpu.h includes slab.h which
in turn includes gfp.h making everything defined by the two files
universally available and complicating inclusion dependencies.
percpu.h -> slab.h dependency is about to be removed. Prepare for
this change by updating users of gfp and slab facilities include those
headers directly instead of assuming availability. As this conversion
needs to touch large number of source files, the following script is
used as the basis of conversion.
http://userweb.kernel.org/~tj/misc/slabh-sweep.py
The script does the followings.
* Scan files for gfp and slab usages and update includes such that
only the necessary includes are there. ie. if only gfp is used,
gfp.h, if slab is used, slab.h.
* When the script inserts a new include, it looks at the include
blocks and try to put the new include such that its order conforms
to its surrounding. It's put in the include block which contains
core kernel includes, in the same order that the rest are ordered -
alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
doesn't seem to be any matching order.
* If the script can't find a place to put a new include (mostly
because the file doesn't have fitting include block), it prints out
an error message indicating which .h file needs to be added to the
file.
The conversion was done in the following steps.
1. The initial automatic conversion of all .c files updated slightly
over 4000 files, deleting around 700 includes and adding ~480 gfp.h
and ~3000 slab.h inclusions. The script emitted errors for ~400
files.
2. Each error was manually checked. Some didn't need the inclusion,
some needed manual addition while adding it to implementation .h or
embedding .c file was more appropriate for others. This step added
inclusions to around 150 files.
3. The script was run again and the output was compared to the edits
from #2 to make sure no file was left behind.
4. Several build tests were done and a couple of problems were fixed.
e.g. lib/decompress_*.c used malloc/free() wrappers around slab
APIs requiring slab.h to be added manually.
5. The script was run on all .h files but without automatically
editing them as sprinkling gfp.h and slab.h inclusions around .h
files could easily lead to inclusion dependency hell. Most gfp.h
inclusion directives were ignored as stuff from gfp.h was usually
wildly available and often used in preprocessor macros. Each
slab.h inclusion directive was examined and added manually as
necessary.
6. percpu.h was updated not to include slab.h.
7. Build test were done on the following configurations and failures
were fixed. CONFIG_GCOV_KERNEL was turned off for all tests (as my
distributed build env didn't work with gcov compiles) and a few
more options had to be turned off depending on archs to make things
build (like ipr on powerpc/64 which failed due to missing writeq).
* x86 and x86_64 UP and SMP allmodconfig and a custom test config.
* powerpc and powerpc64 SMP allmodconfig
* sparc and sparc64 SMP allmodconfig
* ia64 SMP allmodconfig
* s390 SMP allmodconfig
* alpha SMP allmodconfig
* um on x86_64 SMP allmodconfig
8. percpu.h modifications were reverted so that it could be applied as
a separate patch and serve as bisection point.
Given the fact that I had only a couple of failures from tests on step
6, I'm fairly confident about the coverage of this conversion patch.
If there is a breakage, it's likely to be something in one of the arch
headers which should be easily discoverable easily on most builds of
the specific arch.
Signed-off-by: Tejun Heo <tj@kernel.org>
Guess-its-ok-by: Christoph Lameter <cl@linux-foundation.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
Remove built-in gluebi support. This is a preparation for a
standalone glubi module support
Signed-off-by: Dmitry Pervushin <dpervushin@embeddedalley.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
UBI volume notifications are intended to create the API to get clients
notified about volume creation/deletion, renaming and re-sizing. A
client can subscribe to these notifications using 'ubi_volume_register()'
and cancel the subscription using 'ubi_volume_unregister()'. When UBI
volumes change, a blocking notifier is called. Clients also can request
"added" events on all volumes that existed before client subscribed
to the notifications.
If we use notifications instead of calling functions like 'ubi_gluebi_xxx()',
we can make the MTD emulation layer to be more flexible: build it as a
separate module and load/unload it on demand.
[Artem: many cleanups, rework locking, add "updated" event, provide
device/volume info in notifiers]
Signed-off-by: Dmitry Pervushin <dpervushin@embeddedalley.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
If a volume paranoid check fails, do not return an error
code to the caller, but just print error messages and go
forward. The primary reason for this is that it is difficult
to recover and cancel the operation at that stage.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
I am experiencing an error in 'paranoid_check_volume()'. Add
dump_stack() there to make it easier to identify the reasons
of the error.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Various minor improvements to the debugging messages which
I found useful while hunting problems.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
The mutex essencially protects the entire UBI device, so the
old @volumes_mutex name is a little misleading.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Fix the following warning:
drivers/mtd/ubi/vmt.c: In function 'ubi_rename_volumes':
drivers/mtd/ubi/vmt.c:642: warning: statement with no effect
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
No functional changes, just tweak comments to make kernel-doc
work fine and stop complaining.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Just out or curiousity ran checkpatch.pl for whole UBI,
and discovered there are quite a few of stylistic issues.
Fix them.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Quite useful ioctl which allows to make atomic system upgrades.
The idea belongs to Richard Titmuss <richard_titmuss@logitech.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Check that volume name is not shorter than 'name_len'.
No need to copy the trailing zero byte because whole array
was zeroed earlier.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
ubi_free_volume() function sets ubi->volumes[] to NULL, so
ubi_eba_close() is useless, it does not free what has to be freed.
So zap it and free vol->eba_tbl at the volume release function.
Pointed-out-by: Adrian Hunter <ext-adrian.hunter@nokia.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
UBI already checks that @min io size is the power of 2 at io_init.
It is save to use bit operations then.
Signed-off-by: Kyungmin Park <kyungmin.park@samsung.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
drivers/mtd/ubi/vmt.c: In function `ubi_create_volume':
drivers/mtd/ubi/vmt.c:379: warning: statement with no effect
Signed-off-by: S.Çağlar Onur <caglar@pardus.org.tr>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
The problem: NAND flashes have different amount of initial bad physical
eraseblocks (marked as bad by the manufacturer). For example, for 256MiB
Samsung OneNAND flash there might be from 0 to 40 bad initial eraseblocks,
which is about 2%. When UBI is used as the base system, one needs to know
the exact amount of good physical eraseblocks, because this number is
needed to create the UBI image which is put to the devices during
production. But this number is not know, which forces us to use the
minimum number of good physical eraseblocks. And UBI additionally
reserves some percentage of physical eraseblocks for bad block handling
(default is 1%), so we have 1-3% of PEBs reserved at the end, depending
on the amount of initial bad PEBs. But it is desired to always have
1% (or more, depending on the configuration).
Solution: this patch adds an "auto-resize" flag to the volume table.
The volume which has the "auto-resize" flag will automatically be re-sized
(enlarged) on the first UBI initialization. UBI clears the flag when
the volume is re-sized. Only one volume may have the "auto-resize" flag.
So, the production UBI image may have one volume with "auto-resize"
flag set, and its size is automatically adjusted on the first boot
of the device.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
When creating a new volume, do not forget to increment the
vol_count variable.
Also, users are not interested in internal volumes, so do not show
them in the volumes_count sysfs file.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
This is one more step on the way to "removable" UBI devices. It
adds reference counting for UBI devices. Every time a volume on
this device is opened - the device's refcount is increased. It
is also increased if someone is reading any sysfs file of this
UBI device or of one of its volumes.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Make the code more consistent by requiring the caller to lock the
ubi->volume_mutex, because this is what we do for updates.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Add ref_count field to UBI volumes and remove weired "vol->removed"
field. This way things are better understandable and we do not have
to do whold show_attr operation under spinlock.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Error path in volume creation is bogus. First of, it ovverrides the
'err' variable and returns zero to the caller. Second, ubi_assert()
in the release function is wrong.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
When a volume is opened, get its kref via get_device() call.
And put the reference when closing the volume. With this, we
may have a bit saner volume delete.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Pass volume description object to the EBA function which makes
more sense, and EBA function do not have to find the volume
description object by volume ID.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
This patch silences the following warning :
drivers/mtd/ubi/vmt.c:73: warning: 'ret' may be used uninitialized in this function
gcc can't see that we always initialize ret in all situations where it is
actually used. The one case where it's not initialized is when we BUG(),
but gcc doesn't know that we won't then continue and use an uninitialized
'ret'.
This patch results in code that does exactely the same as before, but it
also makes gcc shut up, so we generate one less line of warning noise.
Signed-off-by: Jesper Juhl <jesper.juhl@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Fix "symbol shadows an earlier one" warnings. Although they are harmless
but it does not hurt to fix them and make sparse happy.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
I was experiencing overflows in multiplications for
volume->used_bytes in vmt.c & vtbl.c, while creating & resizing large volumes.
vol->used_bytes is long long however its 2 operands vol->used_ebs &
vol->usable_leb_size
are int. So their multiplication for larger values causes integer overflows.
Typecasting them solves the problem.
My machine & flash details:
64Bit dual-core AMD opteron, 1 GB RAM, linux 2.6.18.3.
mtd size = 6GB, volume size= 5GB, peb_size = 4MB.
heres patch which does the fix.
Signed-off-by: Vinit Agnihotri <vinit.agnihotri@gmail.com>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Do not check volumes which are currently in use because thay may be
in inconsistent state.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
When volume creation fails, we have to set ubi->volumes[vol_id]
back to NULL.
This patch also tweaks some debugging stuff.
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
Kill UBI's homegrown endianess handling and replace it with
the standard kernel endianess handling.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Artem Bityutskiy <Artem.Bityutskiy@nokia.com>
UBI (Latin: "where?") manages multiple logical volumes on a single
flash device, specifically supporting NAND flash devices. UBI provides
a flexible partitioning concept which still allows for wear-levelling
across the whole flash device.
In a sense, UBI may be compared to the Logical Volume Manager
(LVM). Whereas LVM maps logical sector numbers to physical HDD sector
numbers, UBI maps logical eraseblocks to physical eraseblocks.
More information may be found at
http://www.linux-mtd.infradead.org/doc/ubi.html
Partitioning/Re-partitioning
An UBI volume occupies a certain number of erase blocks. This is
limited by a configured maximum volume size, which could also be
viewed as the partition size. Each individual UBI volume's size can
be changed independently of the other UBI volumes, provided that the
sum of all volume sizes doesn't exceed a certain limit.
UBI supports dynamic volumes and static volumes. Static volumes are
read-only and their contents are protected by CRC check sums.
Bad eraseblocks handling
UBI transparently handles bad eraseblocks. When a physical
eraseblock becomes bad, it is substituted by a good physical
eraseblock, and the user does not even notice this.
Scrubbing
On a NAND flash bit flips can occur on any write operation,
sometimes also on read. If bit flips persist on the device, at first
they can still be corrected by ECC, but once they accumulate,
correction will become impossible. Thus it is best to actively scrub
the affected eraseblock, by first copying it to a free eraseblock
and then erasing the original. The UBI layer performs this type of
scrubbing under the covers, transparently to the UBI volume users.
Erase Counts
UBI maintains an erase count header per eraseblock. This frees
higher-level layers (like file systems) from doing this and allows
for centralized erase count management instead. The erase counts are
used by the wear-levelling algorithm in the UBI layer. The algorithm
itself is exchangeable.
Booting from NAND
For booting directly from NAND flash the hardware must at least be
capable of fetching and executing a small portion of the NAND
flash. Some NAND flash controllers have this kind of support. They
usually limit the window to a few kilobytes in erase block 0. This
"initial program loader" (IPL) must then contain sufficient logic to
load and execute the next boot phase.
Due to bad eraseblocks, which may be randomly scattered over the
flash device, it is problematic to store the "secondary program
loader" (SPL) statically. Also, due to bit-flips it may become
corrupted over time. UBI allows to solve this problem gracefully by
storing the SPL in a small static UBI volume.
UBI volumes vs. static partitions
UBI volumes are still very similar to static MTD partitions:
* both consist of eraseblocks (logical eraseblocks in case of UBI
volumes, and physical eraseblocks in case of static partitions;
* both support three basic operations - read, write, erase.
But UBI volumes have the following advantages over traditional
static MTD partitions:
* there are no eraseblock wear-leveling constraints in case of UBI
volumes, so the user should not care about this;
* there are no bit-flips and bad eraseblocks in case of UBI volumes.
So, UBI volumes may be considered as flash devices with relaxed
restrictions.
Where can it be found?
Documentation, kernel code and applications can be found in the MTD
gits.
What are the applications for?
The applications help to create binary flash images for two purposes: pfi
files (partial flash images) for in-system update of UBI volumes, and plain
binary images, with or without OOB data in case of NAND, for a manufacturing
step. Furthermore some tools are/and will be created that allow flash content
analysis after a system has crashed..
Who did UBI?
The original ideas, where UBI is based on, were developed by Andreas
Arnez, Frank Haverkamp and Thomas Gleixner. Josh W. Boyer and some others
were involved too. The implementation of the kernel layer was done by Artem
B. Bityutskiy. The user-space applications and tools were written by Oliver
Lohmann with contributions from Frank Haverkamp, Andreas Arnez, and Artem.
Joern Engel contributed a patch which modifies JFFS2 so that it can be run on
a UBI volume. Thomas Gleixner did modifications to the NAND layer. Alexander
Schmidt made some testing work as well as core functionality improvements.
Signed-off-by: Artem B. Bityutskiy <dedekind@linutronix.de>
Signed-off-by: Frank Haverkamp <haver@vnet.ibm.com>