linux-next

mirror of https://github.com/edk2-porting/linux-next.git synced 2024-12-25 05:34:00 +08:00

Author	SHA1	Message	Date
Jia-Ju Bai	16fe10cf92	net: cadence: Fix a sleep-in-atomic-context bug in macb_halt_tx() The kernel module may sleep with holding a spinlock. The function call paths (from bottom to top) in Linux-4.16 are: [FUNC] usleep_range drivers/net/ethernet/cadence/macb_main.c, 648: usleep_range in macb_halt_tx drivers/net/ethernet/cadence/macb_main.c, 730: macb_halt_tx in macb_tx_error_task drivers/net/ethernet/cadence/macb_main.c, 721: _raw_spin_lock_irqsave in macb_tx_error_task To fix this bug, usleep_range() is replaced with udelay(). This bug is found by my static analysis tool DSAC. Signed-off-by: Jia-Ju Bai <baijiaju1990@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-02 16:05:25 -07:00
David S. Miller	a80afe89d8	Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf Daniel Borkmann says: ==================== pull-request: bpf 2018-09-02 The following pull-request contains BPF updates for your net tree. The main changes are: 1) Fix one remaining buggy offset override in sockmap's bpf_msg_pull_data() when linearizing multiple scatterlist elements, from Tushar. 2) Fix BPF sockmap's misuse of ULP when a collision with another ULP is found on map update where it would release existing ULP. syzbot found and triggered this couple of times now, fix from John. 3) Add missing xskmap type to bpftool so it will properly show the type on map dump, from Prashant. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-02 15:53:38 -07:00
Fabio Estevam	20fdcd760a	i2c: imx-lpi2c: Remove mx8dv compatible entry mx8dv never entered into production and there is no other place in the kernel referring to this SoC, so remove it from the driver's compatible entry. Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>	2018-09-02 23:50:43 +02:00
Fabio Estevam	f6eb893490	dt-bindings: imx-lpi2c: Remove mx8dv compatible entry mx8dv never entered into production and there is no other place in the kernel referring to this SoC, so remove it from the dt bindings documentation. Signed-off-by: Fabio Estevam <fabio.estevam@nxp.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>	2018-09-02 23:50:43 +02:00
Masahiro Yamada	4c85609b08	i2c: uniphier-f: issue STOP only for last message or I2C_M_STOP This driver currently emits a STOP if the next message is not I2C_MD_RD. It should not do it because it disturbs the I2C_RDWR ioctl, where read/write transactions are combined without STOP between. Issue STOP only when the message is the last one _or_ flagged with I2C_M_STOP. Fixes: `6a62974b66` ("i2c: uniphier_f: add UniPhier FIFO-builtin I2C driver") Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>	2018-09-02 23:50:43 +02:00
Masahiro Yamada	38f5d8d8cb	i2c: uniphier: issue STOP only for last message or I2C_M_STOP This driver currently emits a STOP if the next message is not I2C_MD_RD. It should not do it because it disturbs the I2C_RDWR ioctl, where read/write transactions are combined without STOP between. Issue STOP only when the message is the last one _or_ flagged with I2C_M_STOP. Fixes: `dd6fd4a327` ("i2c: uniphier: add UniPhier FIFO-less I2C driver") Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Signed-off-by: Wolfram Sang <wsa@the-dreams.de>	2018-09-02 23:50:43 +02:00
Linus Torvalds	57361846b5	Linux 4.19-rc2	2018-09-02 14:37:30 -07:00
David Ahern	15a81b418e	net/ipv6: Only update MTU metric if it set Jan reported a regression after an update to 4.18.5. In this case ipv6 default route is setup by systemd-networkd based on data from an RA. The RA contains an MTU of 1492 which is used when the route is first inserted but then systemd-networkd pushes down updates to the default route without the mtu set. Prior to the change to fib6_info, metrics such as MTU were held in the dst_entry and rt6i_pmtu in rt6_info contained an update to the mtu if any. ip6_mtu would look at rt6i_pmtu first and use it if set. If not, the value from the metrics is used if it is set and finally falling back to the idev value. After the fib6_info change metrics are contained in the fib6_info struct and there is no equivalent to rt6i_pmtu. To maintain consistency with the old behavior the new code should only reset the MTU in the metrics if the route update has it set. Fixes: `d4ead6b34b` ("net/ipv6: move metrics from dst to rt6_info") Reported-by: Jan Janssen <medhefgo@web.de> Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-02 14:03:54 -07:00
Tony Lindgren	18eb8aea7f	net: ethernet: cpsw-phy-sel: prefer phandle for phy sel The cpsw-phy-sel device is not a child of the cpsw interconnect target module. It lives in the system control module. Let's fix this issue by trying to use cpsw-phy-sel phandle first if it exists and if not fall back to current usage of trying to find the cpsw-phy-sel child. That way the phy sel driver can be a child of the system control module where it belongs in the device tree. Without this fix, we cannot have a proper interconnect target module hierarchy in device tree for things like genpd. Note that deferred probe is mostly not supported by cpsw and this patch does not attempt to fix that. In case deferred probe support is needed, this could be added to cpsw_slave_open() and phy_connect() so they start handling and returning errors. For documenting it, looks like the cpsw-phy-sel is used for all cpsw device tree nodes. It's missing the related binding documentation, so let's also update the binding documentation accordingly. Cc: devicetree@vger.kernel.org Cc: Andrew Lunn <andrew@lunn.ch> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Cc: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Murali Karicheri <m-karicheri2@ti.com> Cc: Rob Herring <robh+dt@kernel.org> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-02 13:52:13 -07:00
Tony Lindgren	10d7fac4c5	dt-bindings: net: cpsw: Document cpsw-phy-sel usage but prefer phandle The current cpsw usage for cpsw-phy-sel is undocumented but is used for all the boards using cpsw. And cpsw-phy-sel is not really a child of the cpsw device, it lives in the system control module instead. Let's document the existing usage, and improve it a bit where we prefer to use a phandle instead of a child device for it. That way we can properly describe the hardware in dts files for things like genpd. Cc: devicetree@vger.kernel.org Cc: Andrew Lunn <andrew@lunn.ch> Cc: Grygorii Strashko <grygorii.strashko@ti.com> Cc: Ivan Khoronzhuk <ivan.khoronzhuk@linaro.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Murali Karicheri <m-karicheri2@ti.com> Cc: Rob Herring <robh+dt@kernel.org> Signed-off-by: Tony Lindgren <tony@atomide.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-02 13:52:13 -07:00
David S. Miller	c60e06c3e0	Merge branch 'igmp-fix-two-incorrect-unsolicit-report-count-issues' Hangbin Liu says: ==================== igmp: fix two incorrect unsolicit report count issues Just like the subject, fix two minor igmp unsolicit report count issues. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-02 13:39:37 -07:00
Hangbin Liu	ff06525fcb	igmp: fix incorrect unsolicit report count after link down and up After link down and up, i.e. when call ip_mc_up(), we doesn't init im->unsolicit_count. So after igmp_timer_expire(), we will not start timer again and only send one unsolicit report at last. Fix it by initializing im->unsolicit_count in igmp_group_added(), so we can respect igmp robustness value. Fixes: `24803f38a5` ("igmp: do not remove igmp souce list info when set link down") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-02 13:39:37 -07:00
Hangbin Liu	4fb7253e4f	igmp: fix incorrect unsolicit report count when join group We should not start timer if im->unsolicit_count equal to 0 after decrease. Or we will send one more unsolicit report message. i.e. 3 instead of 2 by default. Fixes: `1da177e4c3` ("Linux-2.6.12-rc2") Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-02 13:39:37 -07:00
John Fastabend	597222f72a	bpf: avoid misuse of psock when TCP_ULP_BPF collides with another ULP Currently we check sk_user_data is non NULL to determine if the sk exists in a map. However, this is not sufficient to ensure the psock or the ULP ops are not in use by another user, such as kcm or TLS. To avoid this when adding a sock to a map also verify it is of the correct ULP type. Additionally, when releasing a psock verify that it is the TCP_ULP_BPF type before releasing the ULP. The error case where we abort an update due to ULP collision can cause this error path. For example, __sock_map_ctx_update_elem() [...] err = tcp_set_ulp_id(sock, TCP_ULP_BPF) <- collides with TLS if (err) <- so err out here goto out_free [...] out_free: smap_release_sock() <- calling tcp_cleanup_ulp releases the TLS ULP incorrectly. Fixes: `2f857d0460` ("bpf: sockmap, remove STRPARSER map_flags and add multi-map support") Signed-off-by: John Fastabend <john.fastabend@gmail.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-09-02 22:31:10 +02:00
Prashant Bhole	97911e0ccb	tools/bpf: bpftool, add xskmap in map types When listed all maps, bpftool currently shows (null) for xskmap. Added xskmap type in map_type_name[] to show correct type. Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp> Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-09-02 22:30:39 +02:00
Tushar Dave	9db39f4d4f	bpf: Fix bpf_msg_pull_data() Helper bpf_msg_pull_data() mistakenly reuses variable 'offset' while linearizing multiple scatterlist elements. Variable 'offset' is used to find first starting scatterlist element i.e. msg->data = sg_virt(&sg[first_sg]) + start - offset" Use different variable name while linearizing multiple scatterlist elements so that value contained in variable 'offset' won't get overwritten. Fixes: `015632bb30` ("bpf: sk_msg program helper bpf_sk_msg_pull_data") Signed-off-by: Tushar Dave <tushar.n.dave@oracle.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>	2018-09-02 22:29:53 +02:00
Linus Torvalds	fd6868d82b	Devicetree updates for 4.19-rc2: A couple of new helper functions in preparation for some tree wide clean-ups. -----BEGIN PGP SIGNATURE----- iQJEBAABCgAuFiEEktVUI4SxYhzZyEuo+vtdtY28YcMFAluL3YAQHHJvYmhAa2Vy bmVsLm9yZwAKCRD6+121jbxhwwqqEACsyq81TV/frKj+VJecGxCUwRbotg5aQu2N Dx3Y8JP4fw394Rw78Fh3Q+2gcy5lI3elup5z+LzYE45N0GjhoguF3kBVLacySdq3 VMDLj+AyFae4mCZZ+pBMpZOh+8de0xWSQKxAqj9ejZ1c+uAkdr++6bn7kBHC4yMH i7B3DHg9ZBtC1g3FZgpuZfUTaitje3+RK0NKZBIt9XTGbylwVQqLuIGatMpXyJvc ez4POzZ+miuFK90myIHklI0ARlgB3YR9A9vVeSs4PAo58GomffKbVsdcU6qCIZtw ImkC5bgirIPtGrc2lPRJ86D5bhxVKh3s4LgBEG/dlSSaeyenP8SF479DwC5IM2jt Pff0c5pRrwDkg/7SOq8Cfxn8IZ/N5yao8g7v2iCcSNqAUhSiCNxakks3fCB7yF7K 8X+aoBhwP+nAMWQifpTC9tgEyEJ3ECRK6qLdC9NfFdS52d2iws4rU6IKwumrL1nd qsJSqmsTInsBFx4gh7KYdqZYfAolMKp9P9H5Q4RHuqPg6y6f5zF6KbvKdfZ6+p9k 8rl1p/MFDSNYK/i/s6+kJtRtMMyHUYjabzmeRlkTfWlyBCwtJz6Y+1rtyozNPa11 Q2Wu4kh0tff1ujdsfGlZwj3IsXsJ5nl/X4getJj0hrxRVCueYoF1xbjfyno5kFQi Q9OifJQMag== =AaKv -----END PGP SIGNATURE----- Merge tag 'devicetree-fixes-for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux Pull devicetree updates from Rob Herring: "A couple of new helper functions in preparation for some tree wide clean-ups. I'm sending these new helpers now for rc2 in order to simplify the dependencies on subsequent cleanups across the tree in 4.20" * tag 'devicetree-fixes-for-4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/robh/linux: of: Add device_type access helper functions of: add node name compare helper functions of: add helper to lookup compatible child node	2018-09-02 10:56:01 -07:00
Linus Torvalds	a3ea9911e2	ARM: SoC fixes First batch of fixes post-merge window: - A handful of devicetree changes for i.MX2{3,8} to change over to new panel bindings. The platforms were moved from legacy framebuffers to DRM and some development board panels hadn't yet been converted. - OMAP fixes related to ti-sysc driver conversion fallout, fixing some register offsets, no_console_suspend fixes, etc. - Droid4 changes to fix flaky eMMC probing and vibrator DTS mismerge. - Fixed 0755->0644 permissions on a newly added file. - Defconfig changes to make ARM Versatile more useful with QEMU (helps testing). - Enable defconfig options for new TI SoC platform that was merged this window (AM6). -----BEGIN PGP SIGNATURE----- iQJDBAABCAAtFiEElf+HevZ4QCAJmMQ+jBrnPN6EHHcFAluLPKIPHG9sb2ZAbGl4 b20ubmV0AAoJEIwa5zzehBx3KGcQAIrOGMmbdIsKRm77X5ibPbFBTA9p9iZMP7a3 ymPrbpGxVUPHRXkWTf/ppxW6R5jh6CsoSPzsDExvF24JjLMrhd/HzXk2O+Z38Pr8 PTvNGGzw5GJEXvD23FVC9xBWQMDK14qmLS+S2YLSbEVFKPsDnr8wBmjnRy+vHnCR nXE/+4q5xjj8WcstPa70wn0O5eHUru42JM9docc4tmhxM3F/wMksJMKlJG9RXPTC gea99a89lxRApkmOGvZ2Xb7aFeB5YOJjOJIAXnF1BoD9u5+pdGPxL2tM6Yd3GndQ rjU6lH8i6dbJkIzsZjeYpOh48ofCfu8wGADcv+0b2jy8LLJC7mTjBdvP0ffXO3Iz qQEL4/gaxN2NnUkfXdbEPwWq8D/lOXbzefoWN0pDkZP0fmJcmXVOG1USIxqy1F0s pgXUfV0rI8Bog8v73afDPG1TI71QMciDRD7qEFo0Alea5y0Tlfzv+O1oQj+DhoJV UY4i9gvuANfmmuIsV8JH5sLZ3K1uP9aQ3eRBY9xbUdtWv2AjnqMm+a5YCq7iYaaP r98cjA3QTVEnzxK3lNbdQp5F4G6Qk7iCUY2Lg/E7ggVUAZrH9stZR+pnOHsm+tNU 62OZVNIe3bPEKWC4VoEZbDkVb0u3wyuQdJGA48VlfqDDy2PMvB1C4TCZcVbS5y27 +UxCEW4p =Pdac -----END PGP SIGNATURE----- Merge tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc Pull ARM SoC fixes from Olof Johansson: "First batch of fixes post-merge window: - A handful of devicetree changes for i.MX2{3,8} to change over to new panel bindings. The platforms were moved from legacy framebuffers to DRM and some development board panels hadn't yet been converted. - OMAP fixes related to ti-sysc driver conversion fallout, fixing some register offsets, no_console_suspend fixes, etc. - Droid4 changes to fix flaky eMMC probing and vibrator DTS mismerge. - Fixed 0755->0644 permissions on a newly added file. - Defconfig changes to make ARM Versatile more useful with QEMU (helps testing). - Enable defconfig options for new TI SoC platform that was merged this window (AM6)" * tag 'armsoc-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc: arm64: defconfig: Enable TI's AM6 SoC platform ARM: defconfig: Update the ARM Versatile defconfig ARM: dts: omap4-droid4: Fix emmc errors seen on some devices ARM: dts: Fix file permission for am335x-osd3358-sm-red.dts ARM: imx_v6_v7_defconfig: Select CONFIG_DRM_PANEL_SEIKO_43WVF1G ARM: mxs_defconfig: Select CONFIG_DRM_PANEL_SEIKO_43WVF1G ARM: dts: imx23-evk: Convert to the new display bindings ARM: dts: imx23-evk: Move regulators outside simple-bus ARM: dts: imx28-evk: Convert to the new display bindings ARM: dts: imx28-evk: Move regulators outside simple-bus Revert "ARM: dts: imx7d: Invert legacy PCI irq mapping" arm: dts: am4372: setup rtc as system-power-controller ARM: dts: omap4-droid4: fix vibrations on Droid 4 bus: ti-sysc: Fix no_console_suspend handling bus: ti-sysc: Fix module register ioremap for larger offsets ARM: OMAP2+: Fix module address for modules using mpu_rt_idx ARM: OMAP2+: Fix null hwmod for ti-sysc debug	2018-09-02 10:44:28 -07:00
Randy Dunlap	914b087ff9	kbuild: make missing $DEPMOD a Warning instead of an Error When $DEPMOD is not found, only print a warning instead of exiting with an error message and error status: Warning: 'make modules_install' requires /sbin/depmod. Please install it. This is probably in the kmod package. Change the Error to a Warning because "not all build hosts for cross compiling Linux are Linux systems and are able to provide a working port of depmod, especially at the file patch /sbin/depmod." I.e., "make modules_install" may be used to copy/install the loadable modules files to a target directory on a build system and then transferred to an embedded device where /sbin/depmod is run instead of it being run on the build system. Fixes: `934193a654` ("kbuild: verify that $DEPMOD is installed") Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Reported-by: H. Nikolaus Schaller <hns@goldelico.com> Cc: stable@vger.kernel.org Cc: Lucas De Marchi <lucas.demarchi@profusion.mobi> Cc: Lucas De Marchi <lucas.de.marchi@gmail.com> Cc: Michal Marek <michal.lkml@markovi.net> Cc: Jessica Yu <jeyu@kernel.org> Cc: Chih-Wei Huang <cwhuang@linux.org.tw> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com>	2018-09-03 02:16:25 +09:00
Masahiro Yamada	fd65465b70	kconfig: do not require pkg-config on make {menu,n}config Meelis Roos reported a {menu,n}config regression: "I have libncurses devel package installed in the default system location (as do 99%+ on actual developers probably) and in this case, pkg-config is useless. pkg-config is needed only when libraries and headers are installed in non-default locations but it is bad to require installation of pkg-config on all the machines where make menuconfig would be possibly run." For {menu,n}config, do not use pkg-config if it is not installed. For {g,x}config, keep checking pkg-config since we really rely on it for finding the installation paths of the required packages. Fixes: `4ab3b80159` ("kconfig: check for pkg-config on make {menu,n,g,x}config") Reported-by: Meelis Roos <mroos@linux.ee> Signed-off-by: Masahiro Yamada <yamada.masahiro@socionext.com> Tested-by: Meelis Roos <mroos@linux.ee> Tested-by: Randy Dunlap <rdunlap@infradead.org>	2018-09-03 02:13:48 +09:00
Linus Torvalds	899ba79553	Merge branch 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 fixes from Thomas Gleixner: "Speculation: - Make the microcode check more robust - Make the L1TF memory limit depend on the internal cache physical address space and not on the CPUID advertised physical address space, which might be significantly smaller. This avoids disabling L1TF on machines which utilize the full physical address space. - Fix the GDT mapping for EFI calls on 32bit PTI - Fix the MCE nospec implementation to prevent #GP Fixes and robustness: - Use the proper operand order for LSL in the VDSO - Prevent NMI uaccess race against CR3 switching - Add a lockdep check to verify that text_mutex is held in text_poke() functions - Repair the fallout of giving native_restore_fl() a prototype - Prevent kernel memory dumps based on usermode RIP - Wipe KASAN shadow stack before rewinding the stack to prevent false positives - Move the AMS GOTO enforcement to the actual build stage to allow user API header extraction without a compiler - Fix a section mismatch introduced by the on demand VDSO mapping change Miscellaneous: - Trivial typo, GCC quirk removal and CC_SET/OUT() cleanups" * 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/pti: Fix section mismatch warning/error x86/vdso: Fix lsl operand order x86/mce: Fix set_mce_nospec() to avoid #GP fault x86/efi: Load fixmap GDT in efi_call_phys_epilog() x86/nmi: Fix NMI uaccess race against CR3 switching x86: Allow generating user-space headers without a compiler x86/dumpstack: Don't dump kernel memory based on usermode RIP x86/asm: Use CC_SET()/CC_OUT() in __gen_sigismember() x86/alternatives: Lockdep-enforce text_mutex in text_poke*() x86/entry/64: Wipe KASAN stack shadow before rewind_stack_do_exit() x86/irqflags: Mark native_restore_fl extern inline x86/build: Remove jump label quirk for GCC older than 4.5.2 x86/Kconfig: Fix trivial typo x86/speculation/l1tf: Increase l1tf memory limit for Nehalem+ x86/spectre: Add missing family 6 check to microcode check	2018-09-02 10:11:30 -07:00
Linus Torvalds	1395d109cd	Merge branch 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull CPU hotplug fix from Thomas Gleixner: "Remove the stale skip_onerr member from the hotplug states" * 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: cpu/hotplug: Remove skip_onerr field from cpuhp_step structure	2018-09-02 10:09:35 -07:00
Linus Torvalds	501dacbc24	Merge branch 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull core fixes from Thomas Gleixner: "A small set of updates for core code: - Prevent tracing in functions which are called from trace patching via stop_machine() to prevent executing half patched function trace entries. - Remove old GCC workarounds - Remove pointless includes of notifier.h" * 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: objtool: Remove workaround for unreachable warnings from old GCC notifier: Remove notifier header file wherever not used watchdog: Mark watchdog touch functions as notrace	2018-09-02 09:41:45 -07:00
Filippo Sironi	8da38ebaad	x86/microcode: Update the new microcode revision unconditionally Handle the case where microcode gets loaded on the BSP's hyperthread sibling first and the boot_cpu_data's microcode revision doesn't get updated because of early exit due to the siblings sharing a microcode engine. For that, simply write the updated revision on all CPUs unconditionally. Signed-off-by: Filippo Sironi <sironi@amazon.de> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: prarit@redhat.com Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/1533050970-14385-1-git-send-email-sironi@amazon.de	2018-09-02 14:10:54 +02:00
Prarit Bhargava	370a132bb2	x86/microcode: Make sure boot_cpu_data.microcode is up-to-date When preparing an MCE record for logging, boot_cpu_data.microcode is used to read out the microcode revision on the box. However, on systems where late microcode update has happened, the microcode revision output in a MCE log record is wrong because boot_cpu_data.microcode is not updated when the microcode gets updated. But, the microcode revision saved in boot_cpu_data's microcode member should be kept up-to-date, regardless, for consistency. Make it so. Fixes: `fa94d0c6e0` ("x86/MCE: Save microcode revision in machine check records") Signed-off-by: Prarit Bhargava <prarit@redhat.com> Signed-off-by: Borislav Petkov <bp@suse.de> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Tony Luck <tony.luck@intel.com> Cc: sironi@amazon.de Cc: stable@vger.kernel.org Link: http://lkml.kernel.org/r/20180731112739.32338-1-prarit@redhat.com	2018-09-02 14:10:54 +02:00
Randy Dunlap	ff924c5a1e	x86/pti: Fix section mismatch warning/error Fix the section mismatch warning in arch/x86/mm/pti.c: WARNING: vmlinux.o(.text+0x6972a): Section mismatch in reference from the function pti_clone_pgtable() to the function .init.text:pti_user_pagetable_walk_pte() The function pti_clone_pgtable() references the function __init pti_user_pagetable_walk_pte(). This is often because pti_clone_pgtable lacks a __init annotation or the annotation of pti_user_pagetable_walk_pte is wrong. FATAL: modpost: Section mismatches detected. Fixes: `85900ea515` ("x86/pti: Map the vsyscall page if needed") Reported-by: kbuild test robot <lkp@intel.com> Signed-off-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Cc: Andy Lutomirski <luto@kernel.org> Link: https://lkml.kernel.org/r/43a6d6a3-d69d-5eda-da09-0b1c88215a2a@infradead.org	2018-09-02 11:24:41 +02:00
Linus Walleij	8c89ef7b6b	of/platform: initialise AMBA default DMA masks This addresses a v4.19-rc1 regression in the PL111 DRM driver in drivers/gpu/pl111/* The driver uses the CMA KMS helpers and will thus at some point call down to dma_alloc_attrs() to allocate a chunk of contigous DMA memory for the framebuffer. It appears that in v4.18, it was OK that this (and other DMA mastering AMBA devices) left dev->coherent_dma_mask blank (zero). In v4.19-rc1 the WARN_ON_ONCE(dev && !dev->coherent_dma_mask) in dma_alloc_attrs() in include/linux/dma-mapping.h is triggered. The allocation later fails when get_coherent_dma_mask() is called from __dma_alloc() and __dma_alloc() returns NULL: drm-clcd-pl111 dev:20: coherent DMA mask is unset drm-clcd-pl111 dev:20: [drm:drm_fb_helper_fbdev_setup] ERROR Failed to set fbdev configuration It turns out that in commit `4d8bde883b` ("OF: Don't set default coherent DMA mask") the OF core stops setting the default DMA mask on new devices, especially those lines of the patch: - if (!dev->coherent_dma_mask) - dev->coherent_dma_mask = DMA_BIT_MASK(32); Robin Murphy solved a similar problem in `a5516219b1` ("of/platform: Initialise default DMA masks") by simply assigning dev.coherent_dma_mask and the dev.dma_mask to point to the same when creating devices from the device tree, and introducing the same code into the code path creating AMBA/PrimeCell devices solved my problem, graphics now come up. The code simply assumes that the device can access all of the system memory by setting the coherent DMA mask to 0xffffffff when creating a device from the device tree, which is crude, but seems to be what kernel v4.18 assumed. The AMBA PrimeCells do not differ between coherent and streaming DMA so we can just assign the same to any DMA mask. Possibly drivers should augment their coherent DMA mask in accordance with "dma-ranges" from the device tree if more finegranular masking is needed. Reported-by: Russell King <linux@armlinux.org.uk> Fixes: `4d8bde883b` ("OF: Don't set default coherent DMA mask") Cc: Russell King <linux@armlinux.org.uk> Cc: Robin Murphy <robin.murphy@arm.com> Signed-off-by: Linus Walleij <linus.walleij@linaro.org> Signed-off-by: Christoph Hellwig <hch@lst.de>	2018-09-02 10:04:31 +02:00
Christoph Hellwig	5a7faef72e	sparc: set a default 32-bit dma mask for OF devices This keeps the historic default behavior for devices without a DMA mask, but removes the warning about a lacking DMA mask for doing DMA without a mask. Reported-by: Meelis Roos <mroos@linux.ee> Signed-off-by: Christoph Hellwig <hch@lst.de> Tested-by: Guenter Roeck <linux@roeck-us.net>	2018-09-02 10:02:04 +02:00
Olof Johansson	a72b44a871	Fixes for omap variants against v4.19-rc1 These are mostly fixes related to using ti-sysc interconnect target module driver for accessing right register offsets for sgx and cpsw and for no_console_suspend regression. There is also a droid4 emmc fix where emmc may not get detected for some models, and vibrator dts mismerge fix. And we have a file permission fix for am335x-osd3358-sm-red.dts that just got added. And we must tag RTC as system-power-controller for am437x for PMIC to shut down during poweroff. -----BEGIN PGP SIGNATURE----- iQJFBAABCAAvFiEEkgNvrZJU/QSQYIcQG9Q+yVyrpXMFAluHD+IRHHRvbnlAYXRv bWlkZS5jb20ACgkQG9Q+yVyrpXPolQ//Zj4/pBU5caVIerXtiVDEY492CNhNnMpZ JtxJkBHeSs9yAtQOgVgdNkn/zazM3xZfAxrZIZjXzJJqtuK0bC1mmy4PxJCxQzG5 ndccxU27MVR4tN1BE5gdD2t4mzn4N1Sos4f5Itl3bS7T0G1x/hvEjb3G0b/c5EQ/ 2QureLTCZUQcUHy/0tD6G6EoXBTj4IM9QDOCMpfCREIYC4r8ChT02H+xC9MNUY5s Y1F76NrjYi9oOSFI9sqYSEaqBYpG1WPVeQ0pAoN0x6nuyPgpif4+LkLqdkY/mIu0 GN8NeHdJmXwax0i172fmhjhnWnkYl2KjjoonE3anqoIpXIALp3WvhMITJdiNjnTX 0fxAhVLgRd8i1/bezBNAnCAlILAHr/MR5DFbUVMIT5113axbUg6+CTWb/89Z6PMo GThDngYVxpHBsxWUC2vKc8YhRqwfQCO5nolbcdQVDnktKIY51EFE8J1MEWibZ2O4 K4jyN3Bl0eOBrie0ZQ7azF7wBCq7xObVMuRcHkVIH+y2aurC5GgOGQ3SwEX/7Ro3 XCVZWFtFsha6TIABzam7CN7/1xL+RUIHM+umjJgLHS8q2OVXguBv+xtQOnVfQgyB C96GqPhch6SC9y7guxSenknO/e5cQ1bHhXV+uA11LMQWaEyEgRzlfm+4MyYClXEa tuM4mUcbY/A= =CHc2 -----END PGP SIGNATURE----- Merge tag 'omap-for-v4.19/fixes-v2-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap into fixes Fixes for omap variants against v4.19-rc1 These are mostly fixes related to using ti-sysc interconnect target module driver for accessing right register offsets for sgx and cpsw and for no_console_suspend regression. There is also a droid4 emmc fix where emmc may not get detected for some models, and vibrator dts mismerge fix. And we have a file permission fix for am335x-osd3358-sm-red.dts that just got added. And we must tag RTC as system-power-controller for am437x for PMIC to shut down during poweroff. * tag 'omap-for-v4.19/fixes-v2-signed' of git://git.kernel.org/pub/scm/linux/kernel/git/tmlind/linux-omap: ARM: dts: omap4-droid4: Fix emmc errors seen on some devices ARM: dts: Fix file permission for am335x-osd3358-sm-red.dts arm: dts: am4372: setup rtc as system-power-controller ARM: dts: omap4-droid4: fix vibrations on Droid 4 bus: ti-sysc: Fix no_console_suspend handling bus: ti-sysc: Fix module register ioremap for larger offsets ARM: OMAP2+: Fix module address for modules using mpu_rt_idx ARM: OMAP2+: Fix null hwmod for ti-sysc debug Signed-off-by: Olof Johansson <olof@lixom.net>	2018-09-01 18:22:19 -07:00
Alexey Kodanev	93bbadd6e0	ipv6: don't get lwtstate twice in ip6_rt_copy_init() Commit `80f1a0f4e0` ("net/ipv6: Put lwtstate when destroying fib6_info") partially fixed the kmemleak [1], lwtstate can be copied from fib6_info, with ip6_rt_copy_init(), and it should be done only once there. rt->dst.lwtstate is set by ip6_rt_init_dst(), at the start of the function ip6_rt_copy_init(), so there is no need to get it again at the end. With this patch, lwtstate also isn't copied from RTF_REJECT routes. [1]: unreferenced object 0xffff880b6aaa14e0 (size 64): comm "ip", pid 10577, jiffies 4295149341 (age 1273.903s) hex dump (first 32 bytes): 01 00 04 00 04 00 00 00 10 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<0000000018664623>] lwtunnel_build_state+0x1bc/0x420 [<00000000b73aa29a>] ip6_route_info_create+0x9f7/0x1fd0 [<00000000ee2c5d1f>] ip6_route_add+0x14/0x70 [<000000008537b55c>] inet6_rtm_newroute+0xd9/0xe0 [<000000002acc50f5>] rtnetlink_rcv_msg+0x66f/0x8e0 [<000000008d9cd381>] netlink_rcv_skb+0x268/0x3b0 [<000000004c893c76>] netlink_unicast+0x417/0x5a0 [<00000000f2ab1afb>] netlink_sendmsg+0x70b/0xc30 [<00000000890ff0aa>] sock_sendmsg+0xb1/0xf0 [<00000000a2e7b66f>] ___sys_sendmsg+0x659/0x950 [<000000001e7426c8>] __sys_sendmsg+0xde/0x170 [<00000000fe411443>] do_syscall_64+0x9f/0x4a0 [<000000001be7b28b>] entry_SYSCALL_64_after_hwframe+0x49/0xbe [<000000006d21f353>] 0xffffffffffffffff Fixes: `6edb3c96a5` ("net/ipv6: Defer initialization of dst to data path") Signed-off-by: Alexey Kodanev <alexey.kodanev@oracle.com> Reviewed-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-09-01 17:42:12 -07:00
Samuel Neves	e78e5a9145	x86/vdso: Fix lsl operand order In the __getcpu function, lsl is using the wrong target and destination registers. Luckily, the compiler tends to choose %eax for both variables, so it has been working so far. Fixes: `a582c540ac` ("x86/vdso: Use RDPID in preference to LSL when available") Signed-off-by: Samuel Neves <sneves@dei.uc.pt> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Andy Lutomirski <luto@kernel.org> Cc: stable@vger.kernel.org Link: https://lkml.kernel.org/r/20180901201452.27828-1-sneves@dei.uc.pt	2018-09-01 23:01:56 +02:00
Linus Torvalds	360bd62dc4	linux-watchdog 4.19-rc2 tag -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (GNU/Linux) iEYEABECAAYFAluKTH8ACgkQ+iyteGJfRspLOgCgybxs7ktaE4RFal8KM7X8g5sB wVEAoMQ1IlQFkuxhJmHt8YemhRL3JqWC =cMKd -----END PGP SIGNATURE----- Merge tag 'linux-watchdog-4.19-rc2' of git://www.linux-watchdog.org/linux-watchdog Pull watchdog fixlet from Wim Van Sebroeck: "Document support for r8a774a1" * tag 'linux-watchdog-4.19-rc2' of git://www.linux-watchdog.org/linux-watchdog: dt-bindings: watchdog: renesas-wdt: Document r8a774a1 support	2018-09-01 13:17:15 -07:00
Linus Torvalds	b18ed664c2	Two small fixes, one for the x86 Stoney SoC to get a more accurate clk frequency and the other to fix a bad allocation in the Nuvoton NPCM7XX driver. -----BEGIN PGP SIGNATURE----- iQJFBAABCAAvFiEE9L57QeeUxqYDyoaDrQKIl8bklSUFAluJ96QRHHNib3lkQGtl cm5lbC5vcmcACgkQrQKIl8bklSVxShAAwmk9fLKBRHQAwmmdO5zNL5hM0VAFaSkA xaKTRWEv38FGCiILg9HaXqlzzBA0QqZyXvNFq1ucTFZ7z1yqK3LBI8XLB/IPIh+2 dI6VInfi2IwQIQyNjplMaJu0V3R6qH7+xuwQNux17PMDR5Jb1VoiX4nwbqS/50Tv poqe4xNuedav2QTFm684HDN8PPZ/3wrior/xfzPEzOZjWlXDeKO2DUKoYALRzQcn 9+IZatKNjwvC0ZFV60EDtIVOYmE15vFtxJfgTDcSZKL7BAEnmS/s8N0J6NvlQlKQ PfRO5WozupYhC7FtYjRb8nHsgp+jkz5A1VBnUfdukL2MPZhMev41eYQMdY+q3ryv U9RzzzFSGy5SE7jMEq64S14u8ql7zun9ahpUST+SZoL3tq+N8iWoRkY4zD0QaPQf K8jw6gYecltYsuwDtJgGIE0K6QEpuCeHG42vQE+wwti9rU0NjSB7LTKMbDJVkRCD 7HiHqCbfwkj3fVGFDgxsU0d8vQs9Qd/OZRIEmPK3EtJY7+TwjsUrcrjR7+NyG3MQ VoEldFSVu8wpNua/CsfrnGbcjALna4mhHT4BsihwmyHJ3l8e2aDc7oCK5SdBKVYd bGq2ddueuGKUnCLSuMtQksYJ3EJ8QafVXGJnMAge1UHFV7BQI7XtrUzv0uslh6cL TUrILnKKsb0= =b74w -----END PGP SIGNATURE----- Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux Pull clk fixes from Stephen Boyd: "Two small fixes, one for the x86 Stoney SoC to get a more accurate clk frequency and the other to fix a bad allocation in the Nuvoton NPCM7XX driver" * tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux: clk: x86: Set default parent to 48Mhz clk: npcm7xx: fix memory allocation	2018-09-01 13:03:32 -07:00
Kees Cook	9b25436662	random: make CPU trust a boot parameter Instead of forcing a distro or other system builder to choose at build time whether the CPU is trusted for CRNG seeding via CONFIG_RANDOM_TRUST_CPU, provide a boot-time parameter for end users to control the choice. The CONFIG will set the default state instead. Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Theodore Ts'o <tytso@mit.edu>	2018-09-01 12:51:54 -04:00
Christoph Hellwig	c1d0af1a1d	kernel/dma/direct: take DMA offset into account in dma_direct_supported When a device has a DMA offset the dma capable result will change due to the difference between the physical and DMA address. Take that into account. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> Reviewed-by: Robin Murphy <robin.murphy@arm.com>	2018-09-01 15:42:28 +02:00
LuckTony	c7486104a5	x86/mce: Fix set_mce_nospec() to avoid #GP fault The trick with flipping bit 63 to avoid loading the address of the 1:1 mapping of the poisoned page while the 1:1 map is updated used to work when unmapping the page. But it falls down horribly when attempting to directly set the page as uncacheable. The problem is that when the cache mode is changed to uncachable, the pages needs to be flushed from the cache first. But the decoy address is non-canonical due to bit 63 flipped, and the CLFLUSH instruction throws a #GP fault. Add code to change_page_attr_set_clr() to fix the address before calling flush. Fixes: `284ce4011b` ("x86/memory_failure: Introduce {set, clear}_mce_nospec()") Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Anvin <hpa@zytor.com> Cc: Borislav Petkov <bp@alien8.de> Cc: linux-edac <linux-edac@vger.kernel.org> Cc: Dan Williams <dan.j.williams@intel.com> Cc: Dave Jiang <dave.jiang@intel.com> Link: https://lkml.kernel.org/r/20180831165506.GA9605@agluck-desk	2018-09-01 14:59:19 +02:00
Thomas Falcon	f611a5b4a5	ibmvnic: Include missing return code checks in reset function Check the return codes of these functions and halt reset in case of failure. The driver will remain in a dormant state until the next reset event, when device initialization will be re-attempted. Signed-off-by: Thomas Falcon <tlfalcon@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-31 23:15:50 -07:00
Sabrina Dubroca	c81c7012e0	selftests: pmtu: detect correct binary to ping ipv6 addresses Some systems don't have the ping6 binary anymore, and use ping for everything. Detect the absence of ping6 and try to use ping instead. Fixes: `d1f1b9cbf3` ("selftests: net: Introduce first PMTU test") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Stefano Brivio <sbrivio@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-31 23:14:20 -07:00
Sabrina Dubroca	902b5417f2	selftests: pmtu: maximum MTU for vti4 is 2^16-1-20 Since commit `82612de1c9` ("ip_tunnel: restore binding to ifaces with a large mtu"), the maximum MTU for vti4 is based on IP_MAX_MTU instead of the mysterious constant 0xFFF8. This makes this selftest fail. Fixes: `82612de1c9` ("ip_tunnel: restore binding to ifaces with a large mtu") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Acked-by: Stefano Brivio <sbrivio@redhat.com> Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-31 23:14:20 -07:00
Florian Westphal	63cc357f7b	tcp: do not restart timewait timer on rst reception RFC 1337 says: ''Ignore RST segments in TIME-WAIT state. If the 2 minute MSL is enforced, this fix avoids all three hazards.'' So with net.ipv4.tcp_rfc1337=1, expected behaviour is to have TIME-WAIT sk expire rather than removing it instantly when a reset is received. However, Linux will also re-start the TIME-WAIT timer. This causes connect to fail when tying to re-use ports or very long delays (until syn retry interval exceeds MSL). packetdrill test case: // Demonstrate bogus rearming of TIME-WAIT timer in rfc1337 mode. `sysctl net.ipv4.tcp_rfc1337=1` 0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 0.000 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 0.000 bind(3, ..., ...) = 0 0.000 listen(3, 1) = 0 0.100 < S 0:0(0) win 29200 <mss 1460,nop,nop,sackOK,nop,wscale 7> 0.100 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7> 0.200 < . 1:1(0) ack 1 win 257 0.200 accept(3, ..., ...) = 4 // Receive first segment 0.310 < P. 1:1001(1000) ack 1 win 46 // Send one ACK 0.310 > . 1:1(0) ack 1001 // read 1000 byte 0.310 read(4, ..., 1000) = 1000 // Application writes 100 bytes 0.350 write(4, ..., 100) = 100 0.350 > P. 1:101(100) ack 1001 // ACK 0.500 < . 1001:1001(0) ack 101 win 257 // close the connection 0.600 close(4) = 0 0.600 > F. 101:101(0) ack 1001 win 244 // Our side is in FIN_WAIT_1 & waits for ack to fin 0.7 < . 1001:1001(0) ack 102 win 244 // Our side is in FIN_WAIT_2 with no outstanding data. 0.8 < F. 1001:1001(0) ack 102 win 244 0.8 > . 102:102(0) ack 1002 win 244 // Our side is now in TIME_WAIT state, send ack for fin. 0.9 < F. 1002:1002(0) ack 102 win 244 0.9 > . 102:102(0) ack 1002 win 244 // Peer reopens with in-window SYN: 1.000 < S 1000:1000(0) win 9200 <mss 1460,nop,nop,sackOK,nop,wscale 7> // Therefore, reply with ACK. 1.000 > . 102:102(0) ack 1002 win 244 // Peer sends RST for this ACK. Normally this RST results // in tw socket removal, but rfc1337=1 setting prevents this. 1.100 < R 1002:1002(0) win 244 // second syn. Due to rfc1337=1 expect another pure ACK. 31.0 < S 1000:1000(0) win 9200 <mss 1460,nop,nop,sackOK,nop,wscale 7> 31.0 > . 102:102(0) ack 1002 win 244 // .. and another RST from peer. 31.1 < R 1002:1002(0) win 244 31.2 `echo no timer restart;ss -m -e -a -i -n -t -o state TIME-WAIT` // third syn after one minute. Time-Wait socket should have expired by now. 63.0 < S 1000:1000(0) win 9200 <mss 1460,nop,nop,sackOK,nop,wscale 7> // so we expect a syn-ack & 3whs to proceed from here on. 63.0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 7> Without this patch, 'ss' shows restarts of tw timer and last packet is thus just another pure ack, more than one minute later. This restores the original code from commit 283fd6cf0be690a83 ("Merge in ANK networking jumbo patch") in netdev-vger-cvs.git . For some reason the else branch was removed/lost in 1f28b683339f7 ("Merge in TCP/UDP optimizations and [..]") and timer restart became unconditional. Reported-by: Michal Tesar <mtesar@redhat.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-31 23:10:35 -07:00
Pavel Machek	b0e0b0abbd	net/rds: RDS is not Radio Data System Getting prompt "The RDS Protocol" (RDS) is not too helpful, and it is easily confused with Radio Data System (which we may want to support in kernel, too). Signed-off-by: Pavel Machek <pavel@ucw.cz> Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Acked-by: Sowmini Varadhan <sowmini.varadhan@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-31 23:09:53 -07:00
Dexuan Cui	e04e7a7bbd	hv_netvsc: Fix a deadlock by getting rtnl lock earlier in netvsc_probe() This patch fixes the race between netvsc_probe() and rndis_set_subchannel(), which can cause a deadlock. These are the related 3 paths which show the deadlock: path #1: Workqueue: hv_vmbus_con vmbus_onmessage_work [hv_vmbus] Call Trace: schedule schedule_preempt_disabled __mutex_lock __device_attach bus_probe_device device_add vmbus_device_register vmbus_onoffer vmbus_onmessage_work process_one_work worker_thread kthread ret_from_fork path #2: schedule schedule_preempt_disabled __mutex_lock netvsc_probe vmbus_probe really_probe __driver_attach bus_for_each_dev driver_attach_async async_run_entry_fn process_one_work worker_thread kthread ret_from_fork path #3: Workqueue: events netvsc_subchan_work [hv_netvsc] Call Trace: schedule rndis_set_subchannel netvsc_subchan_work process_one_work worker_thread kthread ret_from_fork Before path #1 finishes, path #2 can start to run, because just before the "bus_probe_device(dev);" in device_add() in path #1, there is a line "object_uevent(&dev->kobj, KOBJ_ADD);", so systemd-udevd can immediately try to load hv_netvsc and hence path #2 can start to run. Next, path #2 offloads the subchannal's initialization to a workqueue, i.e. path #3, so we can end up in a deadlock situation like this: Path #2 gets the device lock, and is trying to get the rtnl lock; Path #3 gets the rtnl lock and is waiting for all the subchannel messages to be processed; Path #1 is trying to get the device lock, but since #2 is not releasing the device lock, path #1 has to sleep; since the VMBus messages are processed one by one, this means the sub-channel messages can't be procedded, so #3 has to sleep with the rtnl lock held, and finally #2 has to sleep... Now all the 3 paths are sleeping and we hit the deadlock. With the patch, we can make sure #2 gets both the device lock and the rtnl lock together, gets its job done, and releases the locks, so #1 and #3 will not be blocked for ever. Fixes: `8195b1396e` ("hv_netvsc: fix deadlock on hotplug") Signed-off-by: Dexuan Cui <decui@microsoft.com> Cc: Stephen Hemminger <sthemmin@microsoft.com> Cc: K. Y. Srinivasan <kys@microsoft.com> Cc: Haiyang Zhang <haiyangz@microsoft.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-31 23:07:54 -07:00
Jakub Kicinski	9ad716b95f	nfp: wait for posted reconfigs when disabling the device To avoid leaking a running timer we need to wait for the posted reconfigs after netdev is unregistered. In common case the process of deinitializing the device will perform synchronous reconfigs which wait for posted requests, but especially with VXLAN ports being actively added and removed there can be a race condition leaving a timer running after adapter structure is freed leading to a crash. Add an explicit flush after deregistering and for a good measure a warning to check if timer is running just before structures are freed. Fixes: `3d780b926a` ("nfp: add async reconfiguration mechanism") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Dirk van der Merwe <dirk.vandermerwe@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-31 23:01:30 -07:00
Eric Dumazet	3a7ad0634f	Revert "packet: switch kvzalloc to allocate memory" This reverts commit `71e4128620`. mmap()/munmap() can not be backed by kmalloced pages : We fault in : VM_BUG_ON_PAGE(PageSlab(page), page); unmap_single_vma+0x8a/0x110 unmap_vmas+0x4b/0x90 unmap_region+0xc9/0x140 do_munmap+0x274/0x360 vm_munmap+0x81/0xc0 SyS_munmap+0x2b/0x40 do_syscall_64+0x13e/0x1c0 entry_SYSCALL_64_after_hwframe+0x42/0xb7 Fixes: `71e4128620` ("packet: switch kvzalloc to allocate memory") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: John Sperbeck <jsperbeck@google.com> Bisected-by: John Sperbeck <jsperbeck@google.com> Cc: Zhang Yu <zhangyu31@baidu.com> Cc: Li RongQing <lirongqing@baidu.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-08-31 23:00:28 -07:00
Guoqing Jiang	41a9504112	md-cluster: release RESYNC lock after the last resync message All the RESYNC messages are sent with resync lock held, the only exception is resync_finish which releases resync_lockres before send the last resync message, this should be changed as well. Otherwise, we can see deadlock issue as follows: clustermd2-gqjiang2:~ # cat /proc/mdstat Personalities : [raid10] [raid1] md0 : active raid1 sdg[0] sdf[1] 134144 blocks super 1.2 [2/2] [UU] [===================>.] resync = 99.6% (134144/134144) finish=0.0min speed=26K/sec bitmap: 1/1 pages [4KB], 65536KB chunk unused devices: <none> clustermd2-gqjiang2:~ # ps aux\|grep md\|grep D root 20497 0.0 0.0 0 0 ? D 16:00 0:00 [md0_raid1] clustermd2-gqjiang2:~ # cat /proc/20497/stack [<ffffffffc05ff51e>] dlm_lock_sync+0x8e/0xc0 [md_cluster] [<ffffffffc05ff7e8>] __sendmsg+0x98/0x130 [md_cluster] [<ffffffffc05ff900>] sendmsg+0x20/0x30 [md_cluster] [<ffffffffc05ffc35>] resync_info_update+0xb5/0xc0 [md_cluster] [<ffffffffc0593e84>] md_reap_sync_thread+0x134/0x170 [md_mod] [<ffffffffc059514c>] md_check_recovery+0x28c/0x510 [md_mod] [<ffffffffc060c882>] raid1d+0x42/0x800 [raid1] [<ffffffffc058ab61>] md_thread+0x121/0x150 [md_mod] [<ffffffff9a0a5b3f>] kthread+0xff/0x140 [<ffffffff9a800235>] ret_from_fork+0x35/0x40 [<ffffffffffffffff>] 0xffffffffffffffff clustermd-gqjiang1:~ # ps aux\|grep md\|grep D root 20531 0.0 0.0 0 0 ? D 16:00 0:00 [md0_raid1] root 20537 0.0 0.0 0 0 ? D 16:00 0:00 [md0_cluster_rec] root 20676 0.0 0.0 0 0 ? D 16:01 0:00 [md0_resync] clustermd-gqjiang1:~ # cat /proc/mdstat Personalities : [raid10] [raid1] md0 : active raid1 sdf[1] sdg[0] 134144 blocks super 1.2 [2/2] [UU] [===================>.] resync = 97.3% (131072/134144) finish=8076.8min speed=0K/sec bitmap: 1/1 pages [4KB], 65536KB chunk unused devices: <none> clustermd-gqjiang1:~ # cat /proc/20531/stack [<ffffffffc080974d>] metadata_update_start+0xcd/0xd0 [md_cluster] [<ffffffffc079c897>] md_update_sb.part.61+0x97/0x820 [md_mod] [<ffffffffc079f15b>] md_check_recovery+0x29b/0x510 [md_mod] [<ffffffffc0816882>] raid1d+0x42/0x800 [raid1] [<ffffffffc0794b61>] md_thread+0x121/0x150 [md_mod] [<ffffffff9e0a5b3f>] kthread+0xff/0x140 [<ffffffff9e800235>] ret_from_fork+0x35/0x40 [<ffffffffffffffff>] 0xffffffffffffffff clustermd-gqjiang1:~ # cat /proc/20537/stack [<ffffffffc0813222>] freeze_array+0xf2/0x140 [raid1] [<ffffffffc080a56e>] recv_daemon+0x41e/0x580 [md_cluster] [<ffffffffc0794b61>] md_thread+0x121/0x150 [md_mod] [<ffffffff9e0a5b3f>] kthread+0xff/0x140 [<ffffffff9e800235>] ret_from_fork+0x35/0x40 [<ffffffffffffffff>] 0xffffffffffffffff clustermd-gqjiang1:~ # cat /proc/20676/stack [<ffffffffc080951e>] dlm_lock_sync+0x8e/0xc0 [md_cluster] [<ffffffffc080957f>] lock_token+0x2f/0xa0 [md_cluster] [<ffffffffc0809622>] lock_comm+0x32/0x90 [md_cluster] [<ffffffffc08098f5>] sendmsg+0x15/0x30 [md_cluster] [<ffffffffc0809c0a>] resync_info_update+0x8a/0xc0 [md_cluster] [<ffffffffc08130ba>] raid1_sync_request+0xa9a/0xb10 [raid1] [<ffffffffc079b8ea>] md_do_sync+0xbaa/0xf90 [md_mod] [<ffffffffc0794b61>] md_thread+0x121/0x150 [md_mod] [<ffffffff9e0a5b3f>] kthread+0xff/0x140 [<ffffffff9e800235>] ret_from_fork+0x35/0x40 [<ffffffffffffffff>] 0xffffffffffffffff Reviewed-by: NeilBrown <neilb@suse.com> Signed-off-by: Guoqing Jiang <gqjiang@suse.com> Signed-off-by: Shaohua Li <shli@fb.com>	2018-08-31 17:38:10 -07:00
Xiao Ni	1d0ffd2642	RAID10 BUG_ON in raise_barrier when force is true and conf->barrier is 0 In raid10 reshape_request it gets max_sectors in read_balance. If the underlayer disks have bad blocks, the max_sectors is less than last. It will call goto read_more many times. It calls raise_barrier(conf, sectors_done != 0) every time. In this condition sectors_done is not 0. So the value passed to the argument force of raise_barrier is true. In raise_barrier it checks conf->barrier when force is true. If force is true and conf->barrier is 0, it panic. In this case reshape_request submits bio to under layer disks. And in the callback function of the bio it calls lower_barrier. If the bio finishes before calling raise_barrier again, it can trigger the BUG_ON. Add one pair of raise_barrier/lower_barrier to fix this bug. Signed-off-by: Xiao Ni <xni@redhat.com> Suggested-by: Neil Brown <neilb@suse.com> Signed-off-by: Shaohua Li <shli@fb.com>	2018-08-31 17:38:10 -07:00
Shaohua Li	e254de6bcf	md/raid5-cache: disable reshape completely We don't support reshape yet if an array supports log device. Previously we determine the fact by checking ->log. However, ->log could be NULL after a log device is removed, but the array is still marked to support log device. Don't allow reshape in this case too. User can disable log device support by setting 'consistency_policy' to 'resync' then do reshape. Reported-by: Xiao Ni <xni@redhat.com> Tested-by: Xiao Ni <xni@redhat.com> Signed-off-by: Shaohua Li <shli@fb.com>	2018-08-31 17:38:09 -07:00
Dennis Zhou (Facebook)	3111885015	blkcg: use tryget logic when associating a blkg with a bio There is a very small change a bio gets caught up in a really unfortunate race between a task migration, cgroup exiting, and itself trying to associate with a blkg. This is due to css offlining being performed after the css->refcnt is killed which triggers removal of blkgs that reach their blkg->refcnt of 0. To avoid this, association with a blkg should use tryget and fallback to using the root_blkg. Fixes: `08e18eab0c` ("block: add bi_blkg to the bio for cgroups") Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Cc: Jiufei Xue <jiufei.xue@linux.alibaba.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Tejun Heo <tj@kernel.org> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-08-31 14:48:58 -06:00
Dennis Zhou (Facebook)	59b57717ff	blkcg: delay blkg destruction until after writeback has finished Currently, blkcg destruction relies on a sequence of events: 1. Destruction starts. blkcg_css_offline() is called and blkgs release their reference to the blkcg. This immediately destroys the cgwbs (writeback). 2. With blkgs giving up their reference, the blkcg ref count should become zero and eventually call blkcg_css_free() which finally frees the blkcg. Jiufei Xue reported that there is a race between blkcg_bio_issue_check() and cgroup_rmdir(). To remedy this, blkg destruction becomes contingent on the completion of all writeback associated with the blkcg. A count of the number of cgwbs is maintained and once that goes to zero, blkg destruction can follow. This should prevent premature blkg destruction related to writeback. The new process for blkcg cleanup is as follows: 1. Destruction starts. blkcg_css_offline() is called which offlines writeback. Blkg destruction is delayed on the cgwb_refcnt count to avoid punting potentially large amounts of outstanding writeback to root while maintaining any ongoing policies. Here, the base cgwb_refcnt is put back. 2. When the cgwb_refcnt becomes zero, blkcg_destroy_blkgs() is called and handles destruction of blkgs. This is where the css reference held by each blkg is released. 3. Once the blkcg ref count goes to zero, blkcg_css_free() is called. This finally frees the blkg. It seems in the past blk-throttle didn't do the most understandable things with taking data from a blkg while associating with current. So, the simplification and unification of what blk-throttle is doing caused this. Fixes: `08e18eab0c` ("block: add bi_blkg to the bio for cgroups") Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Cc: Jiufei Xue <jiufei.xue@linux.alibaba.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Tejun Heo <tj@kernel.org> Cc: Josef Bacik <josef@toxicpanda.com> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-08-31 14:48:56 -06:00
Dennis Zhou (Facebook)	6b06546206	Revert "blk-throttle: fix race between blkcg_bio_issue_check() and cgroup_rmdir()" This reverts commit `4c6994806f`. Destroying blkgs is tricky because of the nature of the relationship. A blkg should go away when either a blkcg or a request_queue goes away. However, blkg's pin the blkcg to ensure they remain valid. To break this cycle, when a blkcg is offlined, blkgs put back their css ref. This eventually lets css_free() get called which frees the blkcg. The above commit (`4c6994806f`) breaks this order of events by trying to destroy blkgs in css_free(). As the blkgs still hold references to the blkcg, css_free() is never called. The race between blkcg_bio_issue_check() and cgroup_rmdir() will be addressed in the following patch by delaying destruction of a blkg until all writeback associated with the blkcg has been finished. Fixes: `4c6994806f` ("blk-throttle: fix race between blkcg_bio_issue_check() and cgroup_rmdir()") Reviewed-by: Josef Bacik <josef@toxicpanda.com> Signed-off-by: Dennis Zhou <dennisszhou@gmail.com> Cc: Jiufei Xue <jiufei.xue@linux.alibaba.com> Cc: Joseph Qi <joseph.qi@linux.alibaba.com> Cc: Tejun Heo <tj@kernel.org> Cc: Jens Axboe <axboe@kernel.dk> Signed-off-by: Jens Axboe <axboe@kernel.dk>	2018-08-31 14:48:54 -06:00

... 3 4 5 6 7 ...

782336 Commits