Commit Graph

518787 Commits

Author SHA1 Message Date
Linus Torvalds
e6c81cce56 ARM: SoC platform updates for v4.1
Our SoC branch usually contains expanded support for new SoCs and other core
 platform code. In this case, that includes:
 
 - Support for the new Annapurna Labs "Alpine" platform
 - A rework greatly simplifying adding new platform support to the MCPM
   subsystem (Multi-cluster power management)
 - Cpuidle and PM improvements for Exynos3250
 - Misc updates for Renesas, OMAP, Meson, i.MX. Some of these could have
   gone in other branches but ended up here for various reasons.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVNzfWAAoJEIwa5zzehBx3idcP/Rt042tqb0bian/4M1Ud1aQ7
 AMRd4oU5MfWAlzaGPeMBS+b1eo/eENj6wyWsvBQIByZN76ImlUXtxsx0U0frLrVg
 mWVo9zOLRuoE6yyq329zZgg1IM1RtRIruS6zucKsHgKtq0DcjhYGGUH0ZVZk/rKI
 RLtRK8U6Jr0lnpu1TDE5mii7GCCZlEl5dG+J3w5ewC9y7RLRlM09xjK/Zsj0QOqY
 JvMOIaHuHMT6l7BQ6QajtVxTeGECOJ3YDqC6mDHCVD7f3v88+7H5C20xNGPK921w
 tLfB5qOojnj+kKZRPhi8EGnRzKwrBq6/mE5CvvigTCGlAEUOzy7PFSY9oNE80QeL
 6mUdPTuZuqz7ZEIF0kj8I0AkB6k8B+aYfqA9mqM5yGpa11HvZZGfP7CwI4izoe6+
 sT++0OeDPwbsMyRxZjqNQLs4QYaKGYMP4NCgA17zz5ToRCQZy7e5hd2GYzaRouyi
 kTpR9FbxwDcBIwTcA3F7oJ90BEMJ0tvGz/Al11UQpzPePhTwQt2yB5bRZyK/RYIU
 x8k8RHArG3fmS89D4aOViL3sy/zoUBedx4UfAo6jVbrvoZGALQL23KHdqBqDiPmP
 sMRj/sSr+0h9nJCVNM6I/OUD4/IrpFGaeX9V7rpEsHVe7j83eV7Q2wNRPyVTgxdn
 jS8TS0FNAXIv8FO9EoNH
 =tcGs
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC platform updates from Olof Johansson:
 "Our SoC branch usually contains expanded support for new SoCs and
  other core platform code.  In this case, that includes:

   - support for the new Annapurna Labs "Alpine" platform

   - a rework greatly simplifying adding new platform support to the
     MCPM subsystem (Multi-cluster power management)

   - cpuidle and PM improvements for Exynos3250

   - misc updates for Renesas, OMAP, Meson, i.MX.  Some of these could
     have gone in other branches but ended up here for various reasons"

* tag 'armsoc-soc' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (53 commits)
  ARM: alpine: add support for generic pci
  ARM: Exynos: migrate DCSCB to the new MCPM backend abstraction
  ARM: vexpress: migrate DCSCB to the new MCPM backend abstraction
  ARM: vexpress: DCSCB: tighten CPU validity assertion
  ARM: vexpress: migrate TC2 to the new MCPM backend abstraction
  ARM: MCPM: move the algorithmic complexity to the core code
  ARM: EXYNOS: allow cpuidle driver usage on Exynos3250 SoC
  ARM: EXYNOS: add AFTR mode support for Exynos3250
  ARM: EXYNOS: add code for setting/clearing boot flag
  ARM: EXYNOS: fix CPU1 hotplug on Exynos3250
  ARM: S3C64XX: Use fixed IRQ bases to avoid conflicts on Cragganmore
  ARM: cygnus: fix const declaration bcm_cygnus_dt_compat
  ARM: DRA7: hwmod: Fix the hwmod class for GPTimer4
  ARM: DRA7: hwmod: Add data for GPTimers 13 through 16
  ARM: EXYNOS: Remove left over 'extra_save'
  ARM: EXYNOS: Constify exynos_pm_data array
  ARM: EXYNOS: use static in suspend.c
  ARM: EXYNOS: Use platform device name as power domain name
  ARM: EXYNOS: add support for async-bridge clocks for pm_domains
  ARM: omap-device: add missed callback for suspend-to-disk
  ...
2015-04-22 09:08:39 -07:00
Linus Torvalds
d0440c59f5 ARM: SoC cleanups for v4.1
We've got a fairly large cleanup branch this time. The bulk of this is removal
 of non-DT platforms of several flavors:
 
 - Atmel at91 platforms go full-DT, with removal of remaining board-file based
   support
 - OMAP removes legacy board files for three more platforms
 - Removal of non-DT mach-msm, newer Qualcomm platforms now live in mach-qcom
 - Freescale i.MX25 also removes non-DT platform support
 
 Most of the rest of the changes here are fallout from the above, i.e. for
 example removal of drivers that now lack platforms, etc.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVNzI3AAoJEIwa5zzehBx3ePwQAJKb4Mf72/4iiKb4dbVcooQN
 EiZ84fwWiWD6Mww/3A76xVnz/b7JWmB3vwW0b4fcbvzubmOjnROBmZgCWeNy4ZTv
 dOZc3/9jK7OrlwvpFBeZykQwHcbz+550+m3WxmLft1oqH/7BA1k5aunwYtFB96ii
 5Owi4Cy9OmxEyALQAvzktFaJdI7J66LNb+i30r5zIZHlkVooeF3UyadndiswUP2o
 EBzCE8UPqRi5kV6FuwVyf4MZaV28FWoglTqdx9OxogcTnKNFT6RlHQ39q/iPu348
 Wkh4kOryVy7Rlab1K4wQRpBoOwkonKDV73u2H2ifRFj7V9ZAdjibK8pgKn3kjkba
 bJkwHIqlqtqqqjj2Hh93wl+8hKSypoLXO9tagPWYBiLtFXCH/+EVsihWYpAc/A5E
 pUS6hJrJyXKJouwwsXu6459zP0ieqhvpbQG72xs9PRimAfAdSTulSTzdI/dMh42Q
 pwYkmvh+ReY3Ll4MeCzu7+eCIY0qAKsor48W1ImuziwQhg2lZj16qWtA4YdPk3+O
 N8ckyaaFg663PAfsZgBx1qTgxw5v0ec2k68/iEVGS5mUJCgcWxFvR95chTDIxQXq
 ZmJ+SuMFyLB/2zVSiGU96L1PQTcUkxJJ8LVB3qNp6KlYT7qUSsgAU+qYveFlUh+p
 X8MVsSVh8n1MTNepsLij
 =BV8A
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC cleanups from Olof Johansson:
 "We've got a fairly large cleanup branch this time.  The bulk of this
  is removal of non-DT platforms of several flavors:

   - Atmel at91 platforms go full-DT, with removal of remaining
     board-file based support

   - OMAP removes legacy board files for three more platforms

   - removal of non-DT mach-msm, newer Qualcomm platforms now live in
     mach-qcom

   - Freescale i.MX25 also removes non-DT platform support"

Most of the rest of the changes here are fallout from the above, i.e. for
example removal of drivers that now lack platforms, etc.

* tag 'armsoc-cleanup' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (58 commits)
  mmc: Remove msm_sdcc driver
  gpio: Remove gpio-msm-v1 driver
  ARM: Remove mach-msm and associated ARM architecture code
  ARM: shmobile: cpuidle: Remove the pointless default driver
  ARM: davinci: dm646x: Add interrupt resource for McASPs
  ARM: davinci: irqs: Correct McASP1 TX interrupt definition for DM646x
  ARM: davinci: dm646x: Clean up the McASP DMA resources
  ARM: davinci: devices-da8xx: Add support for McASP2 on da830
  ARM: davinci: devices-da8xx: Clean up and correct the McASP device creation
  ARM: davinci: devices-da8xx: Add interrupt resource to McASP structs
  ARM: davinci: devices-da8xx: Add resource name for the McASP DMA request
  ARM: OMAP2+: Remove legacy support for omap3 TouchBook
  ARM: OMAP3: Remove legacy support for devkit8000
  ARM: OMAP3: Remove legacy support for EMA-Tech Stalker board
  ARM: shmobile: Consolidate the pm code for R-Car Gen2
  ARM: shmobile: r8a7791: Correct SYSCIER value
  ARM: shmobile: r8a7790: Correct SYSCIER value
  ARM: at91: remove old setup
  ARM: at91: sama5d4: remove useless map_io
  ARM: at91: sama5 use SoC detection infrastructure
  ...
2015-04-22 09:04:39 -07:00
Linus Torvalds
38eb1dbb0d ARM: SoC fixes for v4.1
Here's the usual "low-priority fixes that didn't make it into the last
 few -rcs, with a twist: We had a fixes pull request that I didn't send
 in time to get into 4.0, so we'll send some of them to Greg for -stable as
 well.
 
 Contents here is as usual not all that controversial:
 
 - A handful of randconfig fixes from Arnd, in particular for older Samsung
   platforms
 - Exynos fixes, !SMP building, DTS updates for MMC and lid switch
 - Kbuild fix to create output subdirectory for DTB files
 - Misc minor fixes for OMAP
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVNzItAAoJEIwa5zzehBx34T0QAID8eUwb0dAy23y2vlszQIhG
 5zxvkDgINGj2LOBvlO3afOf3ErKXfNBGJ4UJYyqWxBsQCvY0AcWWmTVXGERhDF3/
 ZY+Fn7txhSf63CKaFa0SVwMq4qM9V3oe1EiR2kaXwNb1sk2vniCFA9RLCznSEYsi
 wy2cwECfNKljnMVl6CiCihOXsY3ZBTq+7nbp4X6ewUoQkvqde2RDSAHKZG2Y43p6
 tKhNEPqSKZ9sYwkNrnQ5F92zlkrc7nUdttKel4Q21+3MR132taC/cqnSsVjjFRL6
 cb3xM50iy2sHZBdzqkHNSBTAgP5L+PCGOLqkByUrNronOpUYdmqsz4M0Tm1SwEX3
 NvmsLjEkeZTsVd9go0/uyrJlZ71UYkSE+YwE4o+mhg55tTW3ZPvb4i+3uXIJnVA7
 +nGT5AhmfgrR1w3sQvniQp/hEwY6+qgt2ygQrpn7zSqsWMrwC9QxlJUbikJECKYC
 SCGJBvbFBsLTiLZf7aJV2L0rtvYe27E+b1LYMos+DIj9llGSWAbWy+hHj1kY2AbP
 aa93gx3xT8TOwNYFnSa9CuiA1eiWOpS5HFY5Pp7er4+Mf5PibB/Xqdq/krXLBAo3
 nWMiRMWPF0Jmn2o8J1gR3Qrb2TLodHXitpHE2Beh0mG/Un148cG2MrjaUnjNo+NG
 Ox+2BuDNqtXuTL7Y0B9l
 =o/4N
 -----END PGP SIGNATURE-----

Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc

Pull ARM SoC fixes from Olof Johansson:
 "Here's the usual "low-priority fixes that didn't make it into the last
  few -rcs, with a twist: We had a fixes pull request that I didn't send
  in time to get into 4.0, so we'll send some of them to Greg for
  -stable as well.

  Contents here is as usual not all that controversial:

   - a handful of randconfig fixes from Arnd, in particular for older
     Samsung platforms

   - Exynos fixes, !SMP building, DTS updates for MMC and lid switch

   - Kbuild fix to create output subdirectory for DTB files

   - misc minor fixes for OMAP"

* tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: (23 commits)
  ARM: at91/dt: sama5d3 xplained: add phy address for macb1
  kbuild: Create directory for target DTB
  ARM: mvebu: Disable CPU Idle on Armada 38x
  ARM: DRA7: Enable Cortex A15 errata 798181
  ARM: dts: am57xx-beagle-x15: Add thermal map to include fan and tmp102
  ARM: dts: DRA7: Add bandgap and related thermal nodes
  bus: ocp2scp: SYNC2 value should be changed to 0x6
  ARM: dts: am4372: Add "ti,am437x-ocp2scp" as compatible string for OCP2SCP
  ARM: OMAP2+: remove superfluous NULL pointer check
  ARM: EXYNOS: Fix build breakage cpuidle on !SMP
  ARM: dts: fix lid and power pin-functions for exynos5250-spring
  ARM: dts: fix mmc node updates for exynos5250-spring
  ARM: OMAP4: remove dead kconfig option OMAP4_ERRATA_I688
  MAINTAINERS: add OMAP defconfigs under OMAP SUPPORT
  ARM: OMAP1: PM: fix some build warnings on 1510-only Kconfigs
  ARM: cns3xxx: don't export static symbol
  ARM: S3C24XX: avoid a Kconfig warning
  ARM: S3C24XX: fix header file inclusions
  ARM: S3C24XX: fix building without PM_SLEEP
  ARM: S3C24XX: use SAMSUNG_WAKEMASK for s3c2416
  ...
2015-04-22 09:03:30 -07:00
Ilya Dryomov
f77303bdda rbd: rbd_wq comment is obsolete
After the switch to blk-mq rbd_wq processes requests, not devices.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2015-04-22 18:33:49 +03:00
Ilya Dryomov
7c1c4747f2 libceph: announce support for straw2 buckets
Sync up feature bits and enable CEPH_FEATURE_CRUSH_V4.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2015-04-22 18:33:48 +03:00
Ilya Dryomov
958a27658d crush: straw2 bucket type with an efficient 64-bit crush_ln()
This is an improved straw bucket that correctly avoids any data movement
between items A and B when neither A nor B's weights are changed.  Said
differently, if we adjust the weight of item C (including adding it anew
or removing it completely), we will only see inputs move to or from C,
never between other items in the bucket.

Notably, there is not intermediate scaling factor that needs to be
calculated.  The mapping function is a simple function of the item weights.

The below commits were squashed together into this one (mostly to avoid
adding and then yanking a ~6000 lines worth of crush_ln_table):

- crush: add a straw2 bucket type
- crush: add crush_ln to calculate nature log efficently
- crush: improve straw2 adjustment slightly
- crush: change crush_ln to provide 32 more digits
- crush: fix crush_get_bucket_item_weight and bucket destroy for straw2
- crush/mapper: fix divide-by-0 in straw2
  (with div64_s64() for draw = ln / w and INT64_MIN -> S64_MIN - need
   to create a proper compat.h in ceph.git)

Reflects ceph.git commits 242293c908e923d474910f2b8203fa3b41eb5a53,
                          32a1ead92efcd351822d22a5fc37d159c65c1338,
                          6289912418c4a3597a11778bcf29ed5415117ad9,
                          35fcb04e2945717cf5cfe150b9fa89cb3d2303a1,
                          6445d9ee7290938de1e4ee9563912a6ab6d8ee5f,
                          b5921d55d16796e12d66ad2c4add7305f9ce2353.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2015-04-22 18:33:43 +03:00
Ilya Dryomov
45002267e8 crush: ensuring at most num-rep osds are selected
Crush temporary buffers are allocated as per replica size configured
by the user.  When there are more final osds (to be selected as per
rule) than the replicas, buffer overlaps and it causes crash.  Now, it
ensures that at most num-rep osds are selected even if more number of
osds are allowed by the rule.

Reflects ceph.git commits 6b4d1aa99718e3b367496326c1e64551330fabc0,
                          234b066ba04976783d15ff2abc3e81b6cc06fb10.

Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2015-04-22 18:33:42 +03:00
Ilya Dryomov
9be6df215a crush: drop unnecessary include from mapper.c
Signed-off-by: Ilya Dryomov <idryomov@gmail.com>
2015-04-22 18:33:42 +03:00
Yan, Zheng
ec137c10e7 ceph: fix uninline data function
For CEPH_OSD_CMPXATTR_MODE_U64, OSD expects the u64 to be encoded
as string in object's xattr.

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-04-22 18:33:41 +03:00
Yan, Zheng
0ea611a3bc ceph: rename snapshot support
Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-04-22 18:33:41 +03:00
Yan, Zheng
c0bd50e2ee ceph: fix null pointer dereference in send_mds_reconnect()
sb->s_root can be null when umounting

Signed-off-by: Yan, Zheng <zyan@redhat.com>
2015-04-22 18:33:31 +03:00
Paolo Bonzini
2fa462f826 KVM/ARM changes for v4.1, take #2:
Rather small this time:
 
 - a fix for a nasty bug with virtual IRQ injection
 - a fix for irqfd
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVN7ecAAoJECPQ0LrRPXpDmcEP/jTcP89kPm8r7xk0YggalpQT
 4An+Nx5dnumO7ZJMVZM5eqBuXT3C2tvgKdz1VpEBr9J7F31QBEANCXd9P6smEhXI
 /0GuJ82tz+VC/XaVFUXsmtSW0Kkt7dIEJ29OCHdSjchxsM3C4X175IMhtVvKDGOc
 qVvHJllS4WBddNRTrWuk5su62QmEra6TgQMdXPabtcVcUYgQ3DLHnp/rvlYFsCfD
 MvZKCrlqAHNftzxW4fcimPdaToUWVNmKADm0ejujIxDRfNBH/6w7vmcFkNICAyqa
 Q/UgVKpVxa/baRr6jKyXh32S0ewQSki2OViIijs45tHXlSQMgUnOQ9ZMKkgR215q
 J5QsQFDJ5QjVth42CX3jAN6+iZFO1QX7xMBCutjg/If8rL+Cv5zKEGPzv4mOjt2M
 slSvgXo0Rv2c/o+7XOz15rCIW8Io4+MA2y1M6htZ8UCVAmQCGwkvlIo/9ZU2b8Hl
 NGnA+C8thXtrsMxjLfNWFQBGlBHKWuptTpB0qypws0C/KSXG3LM/R41q+WXqmJKy
 UZJWgC75ITeRle9PAHSoIzdDiH+hNhhUUyRRruLXoses98lIgUyuTvHNFhPHjIfC
 V/S6Ai8lBAMk/jBw/iMH1G+ubw2i2ZzFVw132NQrpucPchZvZ1DYMPA1tZEQT7Tl
 QkfD5j64y5qiQWbbqfZL
 =Cfx3
 -----END PGP SIGNATURE-----

Merge tag 'kvm-arm-for-4.1-take2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into kvm-master

KVM/ARM changes for v4.1, take #2:

Rather small this time:

- a fix for a nasty bug with virtual IRQ injection
- a fix for irqfd
2015-04-22 17:08:12 +02:00
Andre Przywara
fd1d0ddf2a KVM: arm/arm64: check IRQ number on userland injection
When userland injects a SPI via the KVM_IRQ_LINE ioctl we currently
only check it against a fixed limit, which historically is set
to 127. With the new dynamic IRQ allocation the effective limit may
actually be smaller (64).
So when now a malicious or buggy userland injects a SPI in that
range, we spill over on our VGIC bitmaps and bytemaps memory.
I could trigger a host kernel NULL pointer dereference with current
mainline by injecting some bogus IRQ number from a hacked kvmtool:
-----------------
....
DEBUG: kvm_vgic_inject_irq(kvm, cpu=0, irq=114, level=1)
DEBUG: vgic_update_irq_pending(kvm, cpu=0, irq=114, level=1)
DEBUG: IRQ #114 still in the game, writing to bytemap now...
Unable to handle kernel NULL pointer dereference at virtual address 00000000
pgd = ffffffc07652e000
[00000000] *pgd=00000000f658b003, *pud=00000000f658b003, *pmd=0000000000000000
Internal error: Oops: 96000006 [#1] PREEMPT SMP
Modules linked in:
CPU: 1 PID: 1053 Comm: lkvm-msi-irqinj Not tainted 4.0.0-rc7+ #3027
Hardware name: FVP Base (DT)
task: ffffffc0774e9680 ti: ffffffc0765a8000 task.ti: ffffffc0765a8000
PC is at kvm_vgic_inject_irq+0x234/0x310
LR is at kvm_vgic_inject_irq+0x30c/0x310
pc : [<ffffffc0000ae0a8>] lr : [<ffffffc0000ae180>] pstate: 80000145
.....

So this patch fixes this by checking the SPI number against the
actual limit. Also we remove the former legacy hard limit of
127 in the ioctl code.

Signed-off-by: Andre Przywara <andre.przywara@arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
CC: <stable@vger.kernel.org> # 4.0, 3.19, 3.18
[maz: wrap KVM_ARM_IRQ_GIC_MAX with #ifndef __KERNEL__,
as suggested by Christopher Covington]
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-04-22 15:42:24 +01:00
Eric Auger
0b3289ebc2 KVM: arm: irqfd: fix value returned by kvm_irq_map_gsi
irqfd/arm curently does not support routing. kvm_irq_map_gsi is
supposed to return all the routing entries associated with the
provided gsi and return the number of those entries. We should
return 0 at this point.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Acked-by: Christoffer Dall <christoffer.dall@linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
2015-04-22 15:37:54 +01:00
Wolfram Sang
cf82f52d36 watchdog: stmp3xxx_rtc_wdt: fix broken email address
My Pengutronix address is not valid anymore, redirect people to the Pengutronix
kernel team.

Reported-by: Harald Geyer <harald@ccbib.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Acked-by: Robert Schwebel <r.schwebel@pengutronix.de>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2015-04-22 15:30:45 +02:00
Wolfram Sang
e8cc536657 watchdog: pnx4008_wdt: fix broken email address
My Pengutronix address is not valid anymore, redirect people to the Pengutronix
kernel team.

Reported-by: Harald Geyer <harald@ccbib.org>
Signed-off-by: Wolfram Sang <wsa@the-dreams.de>
Acked-by: Robert Schwebel <r.schwebel@pengutronix.de>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2015-04-22 15:30:41 +02:00
Aaro Koskinen
3a30c07e71 watchdog: octeon: use fixed length string for register names
Use fixed length string for register names. This saves 416 bytes
in text size.

Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2015-04-22 15:28:40 +02:00
Aaro Koskinen
8692cf0ad3 watchdog: octeon: fix some trivial coding style issues
Fix some trivial coding style issues to reduce noise from static analyzers.

Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2015-04-22 15:28:35 +02:00
Aaro Koskinen
3d588c93c0 watchdog: octeon: convert to WATCHDOG_CORE API
Convert OCTEON watchdog to WATCHDOG_CORE API. This enables support
for multiple watchdogs on OCTEON boards.

Signed-off-by: Aaro Koskinen <aaro.koskinen@iki.fi>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2015-04-22 15:28:31 +02:00
Michal Simek
6290d8c826 watchdog: cadence: Remove Kconfig dependency on ARCH
Remove Kconfig dependency and enable driver for
all ARCHs.

Signed-off-by: Michal Simek <michal.simek@xilinx.com>
Reviewed-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2015-04-22 15:28:21 +02:00
Mathieu Olivari
cf79fb14d0 ARM: msm: add watchdog entries to DT timer binding doc
The watchdog has been reworked to use the same DT node as the timer.
This change is updating the device tree doc accordingly.

Signed-off-by: Mathieu Olivari <mathieu@codeaurora.org>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2015-04-22 15:28:16 +02:00
Mathieu Olivari
4ba1c98b55 ARM: qcom: add description of KPSS WDT for IPQ8064
Add the watchdog related entries to the Krait Processor Sub-system
(KPSS) timer IPQ8064 devicetree section. Also, add a fixed-clock
description of SLEEP_CLK, which will do for now.

Signed-off-by: Josh Cartwright <joshc@codeaurora.org>
Signed-off-by: Mathieu Olivari <mathieu@codeaurora.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2015-04-22 15:28:11 +02:00
Mathieu Olivari
0dfd582e02 watchdog: qcom: use timer devicetree binding
MSM watchdog configuration happens in the same register block as the
timer, so we'll use the same binding as the existing timer.

The qcom-wdt will now be probed when devicetree has an entry compatible
with "qcom,kpss-timer" or "qcom-scss-timer".

Signed-off-by: Mathieu Olivari <mathieu@codeaurora.org>
Reviewed-by: Stephen Boyd <sboyd@codeaurora.org>
Acked-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2015-04-22 15:27:47 +02:00
Joe Perches
e1dbde2960 watchdog: bcm281xx: Remove use of seq_printf return value
The seq_printf return value, because it's frequently misused,
will eventually be converted to void.

See: commit 1f33c41c03 ("seq_file: Rename seq_overflow() to
     seq_has_overflowed() and make public")

Signed-off-by: Joe Perches <joe@perches.com>
Acked-by: Guenter Roeck <linux~roeck-us.net>
Signed-off-by: Wim Van Sebroeck <wim@iguana.be>
2015-04-22 15:27:39 +02:00
Paul Gortmaker
4a3893d069 modpost: don't emit section mismatch warnings for compiler optimizations
Currently an allyesconfig build [gcc-4.9.1] can generate the following:

   WARNING: vmlinux.o(.text.unlikely+0x3864): Section mismatch in
   reference from the function cpumask_empty.constprop.3() to the
   variable .init.data:nmi_ipi_mask

which comes from the cpumask_empty usage in arch/x86/kernel/nmi_selftest.c.

Normally we would not see a symbol entry for cpumask_empty since it is:

	static inline bool cpumask_empty(const struct cpumask *srcp)

however in this case, the variant of the symbol gets emitted when GCC does
constant propagation optimization.

Fix things up so that any locally optimized constprop variants don't warn
when accessing variables that live in the __init sections.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-04-22 17:31:34 +09:30
Paul Gortmaker
09c20c032b modpost: expand pattern matching to support substring matches
Currently the match() function supports a leading * to match any
prefix and a trailing * to match any suffix.  However there currently
is not a combination of both that can be used to target matches of
whole families of functions that share a common substring.

Here we expand the *foo and foo* match to also support *foo* with
the goal of targeting compiler generated symbol names that contain
strings like ".constprop." and ".isra."

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-04-22 17:31:33 +09:30
Quentin Casasnovas
c5c3439af0 modpost: do not try to match the SHT_NUL section.
Trying to match the SHT_NUL section isn't useful and causes build failures
on parisc and mn10300 since the addition of section strict white-listing
and __ex_table sanitizing.

Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
Fixes: 050e57fd59 ("modpost: add strict white-listing when referencing....")
Fixes: 52dc0595d5 ("modpost: handle relocations mismatch in __ex_table.")
Tested-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-04-22 17:31:33 +09:30
Quentin Casasnovas
e84048aa17 modpost: fix extable entry size calculation.
As Guenter pointed out, we were never really calculating the extable entry
size because the pointer arithmetic was simply wrong.  We want to check
we're handling the second relocation in __ex_table to infer an entry size,
but we were using (void*) pointers instead of Elf_Rel[a]* ones.

This fixes the problem by moving that check in the caller (since we can
deal with different types of relocations) and add is_second_extable_reloc()
to make the whole thing more readable.

Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
CC: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-04-22 17:31:32 +09:30
Quentin Casasnovas
d3df4de7eb modpost: fix inverted logic in is_extable_fault_address().
As Guenter pointed out, we want to assert that extable_entry_size has been
discovered and not the other way around.  Moreover, this sanity check is
only valid when we're not dealing with the first relocation in __ex_table,
since we have not discovered the extable entry size at that point.

This was leading to a divide-by-zero on some architectures and make the
build fail.

Signed-off-by: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Reported-by: Guenter Roeck <linux@roeck-us.net>
CC: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-04-22 17:31:31 +09:30
Rusty Russell
6c730bfc89 modpost: handle -ffunction-sections
52dc0595d5 introduced OTHER_TEXT_SECTIONS for identifying what
sections could validly have __ex_table entries.  Unfortunately, it
wasn't tested with -ffunction-sections, which some architectures
use.

Reported-by: kbuild test robot <fengguang.wu@intel.com>
Cc: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-04-22 17:31:31 +09:30
Thierry Reding
d7e0abcf4c modpost: Whitelist .text.fixup and .exception.text
32-bit and 64-bit ARM use these sections to store executable code, so
they must be whitelisted in modpost's table of valid text sections.

Signed-off-by: Thierry Reding <treding@nvidia.com>
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
2015-04-22 17:31:20 +09:30
Vinod Koul
cdde0e61cf dmaengine: dw: don't prompt for DW_DMAC_CORE
DW_DMAC_CORE is slected by PCI or Platform driver, so this symbol shouldn't
be user selectable, so remove the prompt

Signed-off-by: Vinod Koul <vinod.koul@intel.com>
2015-04-22 12:24:13 +05:30
Linus Torvalds
db4fd9c5d0 Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc
Pull sparc fixes from David Miller:

 1) ldc_alloc_exp_dring() can be called from softints, so use
    GFP_ATOMIC.  From Sowmini Varadhan.

 2) Some minor warning/build fixups for the new iommu-common code on
    certain archs and with certain debug options enabled.  Also from
    Sowmini Varadhan.

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc:
  sparc: Use GFP_ATOMIC in ldc_alloc_exp_dring() as it can be called in softirq context
  sparc64: Use M7 PMC write on all chips T4 and onward.
  iommu-common: rename iommu_pool_hash to iommu_hash_common
  iommu-common: fix x86_64 compiler warnings
2015-04-21 23:21:34 -07:00
Linus Torvalds
8aaa51b63c Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net
Pull networking fixes from David Miller:
 "Just a few fixes trickling in at this point.

  1) If we see an attached socket on an skb in the ipv4 forwarding path,
     bail.  This can happen due to races with FIB rule addition, and
     deletion, and we should just drop such frames.  From Sebastian
     Pöhn.

  2) pppoe receive should only accept packets destined for this hosts's
     MAC address.  From Joakim Tjernlund.

  3) Handle checksum unwrapping properly in ppp receive properly when
     it's encapsulated in UDP in some way, fix from Tom Herbert.

  4) Fix some bugs in mv88e6xxx DSA driver resulting from the conversion
     from register offset constants to mnenomic macros.  From Vivien
     Didelot.

  5) Fix handling of HCA max message size in mlx4 adapters, from Eran
     Ben ELisha"

* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net:
  net/mlx4_core: Fix reading HCA max message size in mlx4_QUERY_DEV_CAP
  tcp: add memory barriers to write space paths
  altera tse: Error-Bit on tx-avalon-stream always set.
  net: dsa: mv88e6xxx: use PORT_DEFAULT_VLAN
  net: dsa: mv88e6xxx: fix setup of port control 1
  ppp: call skb_checksum_complete_unset in ppp_receive_frame
  net: add skb_checksum_complete_unset
  pppoe: Lacks DST MAC address check
  ip_forward: Drop frames with attached skb->sk
2015-04-21 22:37:27 -07:00
Olof Johansson
48c1078509 Urgent pull request for v4.1 to booting for custom kernel
.config files that do not have MFD_SYSCON set.
 
 Omaps now have a dependency to MFD_SYSCON for system control
 module generic register area and some clocks with the changes
 done in omap-for-v4.1/prcm-dts branch.
 
 This can be pulled on top of omap-for-v4.1/prcm-dts, or into
 fixes for v4.1.
 
 We already do have a slight MFD_SYSCON dependency for
 REGULATOR_PBIAS for dual voltage MMC cards on the first MMC
 bus for many devices, so from that point of view this can
 also be merged separately from omap-for-v4.1/prcm-dts.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJVNTrLAAoJEBvUPslcq6VzfccQALcyghSZwVAz/Rr8xjqCpk+v
 /ADdRStXjAraT6naS2nyy0ZegOA8c87ZFsWN/SQZbYAIG5/2mn1Ak59q2YHb1UD3
 L+qGOrV2Nm1Wund5f9i7NrfkzVBim0tmyKWTzYYZ4/4AGkytILvFcv0oj0M0HJXs
 7vig1WgXgDLxSV6SVdkaDQDUw+Ab5l/9qzJMREFt8YjDqx7ZCSg60H/IRd3W+eyn
 wRPCZ0D0/697iG4NSiq4yJwerHhAmq/EXPFfYBlrPQu8IheHxGKNm2VLH0hBc7aM
 UDh/eJhFd7Cym7ZMYZq7Ev8tRffZPzZhNsxd/Lu7/GLqtYs+HFUp3ID7bfa1B+X+
 p/XXEt6GNyniShHcfJAp34OUhnfsxKD6fgQ5tPYY3ZVGfugiKZcqaGOJnPiDvqQh
 zc8+1oSel3+BRl1SXtavh4DBjZmbTN0NgaVemSSbVOthnE5DQ3baHTXnnWG+Nb/C
 C3fa6tV49xGdFDBSEIUWdiGcIiWMobR0RPYMATiU6BRV7FckrfUkwi2xZ7hKdc4E
 wNfObsclxC8zQxgyth+XGiytxrFU/AHzPCjj7He4bQLRDw4v4f7Z5nb4bvwpOhtI
 KnQmy1/83T167/JXnQxFb8a46/Eb9m/VM73AJepVER/QVPiQWeJWTv38POQkEhJg
 VmSC08CsOyPnafcZbQMB
 =4WLp
 -----END PGP SIGNATURE-----

Merge tag 'omap-for-v4.1/prcm-dts-mfd-syscon-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into next/late

Merge "urgent omap boot fix for v4.1 if MFD_SYSCON is not set" from Tony
Lindgren:

Urgent pull request for v4.1 to booting for custom kernel
.config files that do not have MFD_SYSCON set.

Omaps now have a dependency to MFD_SYSCON for system control
module generic register area and some clocks with the changes
done in omap-for-v4.1/prcm-dts branch.

This can be pulled on top of omap-for-v4.1/prcm-dts, or into
fixes for v4.1.

We already do have a slight MFD_SYSCON dependency for
REGULATOR_PBIAS for dual voltage MMC cards on the first MMC
bus for many devices, so from that point of view this can
also be merged separately from omap-for-v4.1/prcm-dts.

* tag 'omap-for-v4.1/prcm-dts-mfd-syscon-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap:
  ARM: OMAP2+: Fix booting with configs that don't have MFD_SYSCON

Signed-off-by: Olof Johansson <olof@lixom.net>
2015-04-21 21:45:15 -07:00
Chris Bainbridge
6b5eab5469 ACPI / EC: fix NULL pointer dereference in acpi_ec_remove_query_handler()
Use list_for_each_entry_safe for iterating because handler may be freed
in the loop.

BUG: unable to handle kernel NULL pointer dereference at 000000000000002c
IP: [<ffffffff814d69c8>] acpi_ec_put_query_handler+0x7/0x1a
Call Trace:
 acpi_ec_remove_query_handler+0x87/0x97
 acpi_smbus_hc_remove+0x2a/0x44 [sbshc]
 acpi_device_remove+0x7b/0x9a
 __device_release_driver+0x7e/0x110
 driver_detach+0xb0/0xc0
 bus_remove_driver+0x54/0xe0
 driver_unregister+0x2b/0x60
 acpi_bus_unregister_driver+0x10/0x12
 acpi_smb_hc_driver_exit+0x10/0x12 [sbshc]
 SyS_delete_module+0x1b8/0x210
 system_call_fastpath+0x12/0x6a

Signed-off-by: Chris Bainbridge <chris.bainbridge@gmail.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2015-04-22 04:12:35 +02:00
Linus Torvalds
f614c8178b Merge branch 'parisc-4.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc fixes from Helge Deller:
 "The patch by Guenter Roeck fixes the build on parisc which got broken
  because of commit f24ffde432 ("parisc: expose number of page table
  levels on Kconfig level") and the patch from Matthew Wilcox converts
  our code to use the generic scatterlist.h header file"

* 'parisc-4.1-1' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Replace PT_NLEVELS with CONFIG_PGTABLE_LEVELS
  parisc: Eliminate sg_virt_addr() and private scatterlist.h
2015-04-21 17:57:28 -07:00
Eric Mei
9ffc8f7cb9 md/raid5: don't do chunk aligned read on degraded array.
When array is degraded, read data landed on failed drives will result in
reading rest of data in a stripe. So a single sequential read would
result in same data being read twice.

This patch is to avoid chunk aligned read for degraded array. The
downside is to involve stripe cache which means associated CPU overhead
and extra memory copy.

Test Results:
Following test are done on a enterprise storage node with Seagate 6T SAS
drives and Xeon E5-2648L CPU (10 cores, 1.9Ghz), 10 disks MD RAID6 8+2,
chunk size 128 KiB.

I use FIO, using direct-io with various bs size, enough queue depth,
tested sequential and 100% random read against 3 array config:
 1) optimal, as baseline;
 2) degraded;
 3) degraded with this patch.
Kernel version is 4.0-rc3.

Each individual test I only did once so there might be some variations,
but we just focus on big trend.

Sequential Read:
  bs=(KiB)  optimal(MiB/s)  degraded(MiB/s)  degraded-with-patch (MiB/s)
   1024       1608            656              995
    512       1624            710              956
    256       1635            728              980
    128       1636            771              983
     64       1612           1119             1000
     32       1580           1420             1004
     16       1368            688              986
      8        768            647              953
      4        411            413              850

Random Read:
  bs=(KiB)  optimal(IOPS)  degraded(IOPS)  degraded-with-patch (IOPS)
   1024        163            160              156
    512        274            273              272
    256        426            428              424
    128        576            592              591
     64        726            724              726
     32        849            848              837
     16        900            970              971
      8        927            940              929
      4        948            940              955

Some notes:
  * In sequential + optimal, as bs size getting smaller, the FIO thread
become CPU bound.
  * In sequential + degraded, there's big increase when bs is 64K and
32K, I don't have explanation.
  * In sequential + degraded-with-patch, the MD thread mostly become CPU
bound.

If you want to we can discuss specific data point in those data. But in
general it seems with this patch, we have more predictable and in most
cases significant better sequential read performance when array is
degraded, and almost no noticeable impact on random read.

Performance is a complicated thing, the patch works well for this
particular configuration, but may not be universal. For example I
imagine testing on all SSD array may have very different result. But I
personally think in most cases IO bandwidth is more scarce resource than
CPU.


Signed-off-by: Eric Mei <eric.mei@seagate.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22 08:00:43 +10:00
NeilBrown
edbe83ab4c md/raid5: allow the stripe_cache to grow and shrink.
The default setting of 256 stripe_heads is probably
much too small for many configurations.  So it is best to make it
auto-configure.

Shrinking the cache under memory pressure is easy.  The only
interesting part here is that we put a fairly high cost
('seeks') on shrinking the cache as the cost is greater than
just having to read more data, it reduces parallelism.

Growing the cache on demand needs to be done carefully.  If we allow
fast growth, that can upset memory balance as lots of dirty memory can
quickly turn into lots of memory queued in the stripe_cache.
It is important for the raid5 block device to appear congested to
allow write-throttling to work.

So we only add stripes slowly. We set a flag when an allocation
fails because all stripes are in use, allocate at a convenient
time when that flag is set, and don't allow it to be set again
until at least one stripe_head has been released for re-use.

This means that a spurt of requests will only cause one stripe_head
to be allocated, but a steady stream of requests will slowly
increase the cache size - until memory pressure puts it back again.

It could take hours to reach a steady state.

The value written to, and displayed in, stripe_cache_size is
used as a minimum.  The cache can grow above this and shrink back
down to it.  The actual size is not directly visible, though it can
be deduced to some extent by watching stripe_cache_active.

Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22 08:00:43 +10:00
NeilBrown
5423399a84 md/raid5: change ->inactive_blocked to a bit-flag.
This allows us to easily add more (atomic) flags.

Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22 08:00:43 +10:00
NeilBrown
486f0644c3 md/raid5: move max_nr_stripes management into grow_one_stripe and drop_one_stripe
Rather than adjusting max_nr_stripes whenever {grow,drop}_one_stripe()
succeeds, do it inside the functions.

Also choose the correct hash to handle next inside the functions.

This removes duplication and will help with future new uses of
{grow,drop}_one_stripe.

This also fixes a minor bug where the "md/raid:%md: allocate XXkB"
message always said "0kB".

Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22 08:00:42 +10:00
NeilBrown
a9683a795b md/raid5: pass gfp_t arg to grow_one_stripe()
This is needed for future improvement to stripe cache management.

Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22 08:00:42 +10:00
Markus Stockhausen
d06f191f8e md/raid5: introduce configuration option rmw_level
Depending on the available coding we allow optimized rmw logic for write
operations. To support easier testing this patch allows manual control
of the rmw/rcw descision through the interface /sys/block/mdX/md/rmw_level.

The configuration can handle three levels of control.

rmw_level=0: Disable rmw for all RAID types. Hardware assisted P/Q
calculation has no implementation path yet to factor in/out chunks of
a syndrome. Enforcing this level can be benefical for slow CPUs with
hardware syndrome support and fast SSDs.

rmw_level=1: Estimate rmw IOs and rcw IOs. Execute rmw only if we will
save IOs. This equals the "old" unpatched behaviour and will be the
default.

rmw_level=2: Execute rmw even if calculated IOs for rmw and rcw are
equal. We might have higher CPU consumption because of calculating the
parity twice but it can be benefical otherwise. E.g. RAID4 with fast
dedicated parity disk/SSD. The option is implemented just to be
forward-looking and will ONLY work with this patch!

Signed-off-by: Markus Stockhausen <stockhausen@collogia.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22 08:00:42 +10:00
Markus Stockhausen
584acdd49c md/raid5: activate raid6 rmw feature
Glue it altogehter. The raid6 rmw path should work the same as the
already existing raid5 logic. So emulate the prexor handling/flags
and split functions as needed.

1) Enable xor_syndrome() in the async layer.

2) Split ops_run_prexor() into RAID4/5 and RAID6 logic. Xor the syndrome
at the start of a rmw run as we did it before for the single parity.

3) Take care of rmw run in ops_run_reconstruct6(). Again process only
the changed pages to get syndrome back into sync.

4) Enhance set_syndrome_sources() to fill NULL pages if we are in a rmw
run. The lower layers will calculate start & end pages from that and
call the xor_syndrome() correspondingly.

5) Adapt the several places where we ignored Q handling up to now.

Performance numbers for a single E5630 system with a mix of 10 7200k
desktop/server disks. 300 seconds random write with 8 threads onto a
3,2TB (10*400GB) RAID6 64K chunk without spare (group_thread_cnt=4)

bsize   rmw_level=1   rmw_level=0   rmw_level=1   rmw_level=0
        skip_copy=1   skip_copy=1   skip_copy=0   skip_copy=0
   4K      115 KB/s      141 KB/s      165 KB/s      140 KB/s
   8K      225 KB/s      275 KB/s      324 KB/s      274 KB/s
  16K      434 KB/s      536 KB/s      640 KB/s      534 KB/s
  32K      751 KB/s    1,051 KB/s    1,234 KB/s    1,045 KB/s
  64K    1,339 KB/s    1,958 KB/s    2,282 KB/s    1,962 KB/s
 128K    2,673 KB/s    3,862 KB/s    4,113 KB/s    3,898 KB/s
 256K    7,685 KB/s    7,539 KB/s    7,557 KB/s    7,638 KB/s
 512K   19,556 KB/s   19,558 KB/s   19,652 KB/s   19,688 Kb/s

Signed-off-by: Markus Stockhausen <stockhausen@collogia.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22 08:00:42 +10:00
Markus Stockhausen
a582564b24 md/raid6 algorithms: xor_syndrome() for SSE2
The second and (last) optimized XOR syndrome calculation. This version
supports right and left side optimization. All CPUs with architecture
older than Haswell will benefit from it.

It should be noted that SSE2 movntdq kills performance for memory areas
that are read and written simultaneously in chunks smaller than cache
line size. So use movdqa instead for P/Q writes in sse21 and sse22 XOR
functions.

Signed-off-by: Markus Stockhausen <stockhausen@collogia.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22 08:00:42 +10:00
Markus Stockhausen
9a5ce91d05 md/raid6 algorithms: xor_syndrome() for generic int
Start the algorithms with the very basic one. It is left and right
optimized. That means we can avoid all calculations for unneeded pages
above the right stop offset. For pages below the left start offset we
still need the syndrome multiplication but without reading data pages.

Signed-off-by: Markus Stockhausen <stockhausen@collogia.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22 08:00:42 +10:00
Markus Stockhausen
7e92e1d762 md/raid6 algorithms: improve test program
It is always helpful to have a test tool in place if we implement
new data critical algorithms. So add some test routines to the raid6
checker that can prove if the new xor_syndrome() works as expected.

Run through all permutations of start/stop pages per algorithm and
simulate a xor_syndrome() assisted rmw run. After each rmw check if
the recovery algorithm still confirms that the stripe is fine.

Signed-off-by: Markus Stockhausen <stockhausen@collogia.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22 08:00:42 +10:00
Markus Stockhausen
fe5cbc6e06 md/raid6 algorithms: delta syndrome functions
v3: s-o-b comment, explanation of performance and descision for
the start/stop implementation

Implementing rmw functionality for RAID6 requires optimized syndrome
calculation. Up to now we can only generate a complete syndrome. The
target P/Q pages are always overwritten. With this patch we provide
a framework for inplace P/Q modification. In the first place simply
fill those functions with NULL values.

xor_syndrome() has two additional parameters: start & stop. These
will indicate the first and last page that are changing during a
rmw run. That makes it possible to avoid several unneccessary loops
and speed up calculation. The caller needs to implement the following
logic to make the functions work.

1) xor_syndrome(disks, start, stop, ...): "Remove" all data of source
blocks inside P/Q between (and including) start and end.

2) modify any block with start <= block <= stop

3) xor_syndrome(disks, start, stop, ...): "Reinsert" all data of
source blocks into P/Q between (and including) start and end.

Pages between start and stop that won't be changed should be filled
with a pointer to the kernel zero page. The reasons for not taking NULL
pages are:

1) Algorithms cross the whole source data line by line. Thus avoid
additional branches.

2) Having a NULL page avoids calculating the XOR P parity but still
need calulation steps for the Q parity. Depending on the algorithm
unrolling that might be only a difference of 2 instructions per loop.

The benchmark numbers of the gen_syndrome() functions are displayed in
the kernel log. Do the same for the xor_syndrome() functions. This
will help to analyze performance problems and give an rough estimate
how well the algorithm works. The choice of the fastest algorithm will
still depend on the gen_syndrome() performance.

With the start/stop page implementation the speed can vary a lot in real
life. E.g. a change of page 0 & page 15 on a stripe will be harder to
compute than the case where page 0 & page 1 are XOR candidates. To be not
to enthusiatic about the expected speeds we will run a worse case test
that simulates a change on the upper half of the stripe. So we do:

1) calculation of P/Q for the upper pages

2) continuation of Q for the lower (empty) pages

Signed-off-by: Markus Stockhausen <stockhausen@collogia.de>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22 08:00:41 +10:00
shli@kernel.org
dabc4ec6ba raid5: handle expansion/resync case with stripe batching
expansion/resync can grab a stripe when the stripe is in batch list. Since all
stripes in batch list must be in the same state, we can't allow some stripes
run into expansion/resync. So we delay expansion/resync for stripe in batch
list.

Signed-off-by: Shaohua Li <shli@fusionio.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22 08:00:41 +10:00
shli@kernel.org
72ac733015 raid5: handle io error of batch list
If io error happens in any stripe of a batch list, the batch list will be
split, then normal process will run for the stripes in the list.

Signed-off-by: Shaohua Li <shli@fusionio.com>
Signed-off-by: NeilBrown <neilb@suse.de>
2015-04-22 08:00:41 +10:00