2
0
mirror of https://github.com/edk2-porting/linux-next.git synced 2025-01-20 03:24:03 +08:00
linux-next/drivers/base
David Hildenbrand 956f8b4450 drivers/base/memory: rename MMOP_ONLINE_KEEP to MMOP_ONLINE
Patch series "mm/memory_hotplug: allow to specify a default online_type", v3.

Distributions nowadays use udev rules ([1] [2]) to specify if and how to
online hotplugged memory.  The rules seem to get more complex with many
special cases.  Due to the various special cases,
CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE cannot be used.  All memory hotplug
is handled via udev rules.

Every time we hotplug memory, the udev rule will come to the same
conclusion.  Especially Hyper-V (but also soon virtio-mem) add a lot of
memory in separate memory blocks and wait for memory to get onlined by
user space before continuing to add more memory blocks (to not add memory
faster than it is getting onlined).  This of course slows down the whole
memory hotplug process.

To make the job of distributions easier and to avoid udev rules that get
more and more complicated, let's extend the mechanism provided by
- /sys/devices/system/memory/auto_online_blocks
- "memhp_default_state=" on the kernel cmdline
to be able to specify also "online_movable" as well as "online_kernel"

=== Example /usr/libexec/config-memhotplug ===

#!/bin/bash

VIRT=`systemd-detect-virt --vm`
ARCH=`uname -p`

sense_virtio_mem() {
  if [ -d "/sys/bus/virtio/drivers/virtio_mem/" ]; then
    DEVICES=`find /sys/bus/virtio/drivers/virtio_mem/ -maxdepth 1 -type l | wc -l`
    if [ $DEVICES != "0" ]; then
        return 0
    fi
  fi
  return 1
}

if [ ! -e "/sys/devices/system/memory/auto_online_blocks" ]; then
  echo "Memory hotplug configuration support missing in the kernel"
  exit 1
fi

if grep "memhp_default_state=" /proc/cmdline > /dev/null; then
  echo "Memory hotplug configuration overridden in kernel cmdline (memhp_default_state=)"
  exit 1
fi

if [ $VIRT == "microsoft" ]; then
  echo "Detected Hyper-V on $ARCH"
  # Hyper-V wants all memory in ZONE_NORMAL
  ONLINE_TYPE="online_kernel"
elif sense_virtio_mem; then
  echo "Detected virtio-mem on $ARCH"
  # virtio-mem wants all memory in ZONE_NORMAL
  ONLINE_TYPE="online_kernel"
elif [ $ARCH == "s390x" ] || [ $ARCH == "s390" ]; then
  echo "Detected $ARCH"
  # standby memory should not be onlined automatically
  ONLINE_TYPE="offline"
elif [ $ARCH == "ppc64" ] || [ $ARCH == "ppc64le" ]; then
  echo "Detected" $ARCH
  # PPC64 onlines all hotplugged memory right from the kernel
  ONLINE_TYPE="offline"
elif [ $VIRT == "none" ]; then
  echo "Detected bare-metal on $ARCH"
  # Bare metal users expect hotplugged memory to be unpluggable. We assume
  # that ZONE imbalances on such enterpise servers cannot happen and is
  # properly documented
  ONLINE_TYPE="online_movable"
else
  # TODO: Hypervisors that want to unplug DIMMs and can guarantee that ZONE
  # imbalances won't happen
  echo "Detected $VIRT on $ARCH"
  # Usually, ballooning is used in virtual environments, so memory should go to
  # ZONE_NORMAL. However, sometimes "movable_node" is relevant.
  ONLINE_TYPE="online"
fi

echo "Selected online_type:" $ONLINE_TYPE

# Configure what to do with memory that will be hotplugged in the future
echo $ONLINE_TYPE 2>/dev/null > /sys/devices/system/memory/auto_online_blocks
if [ $? != "0" ]; then
  echo "Memory hotplug cannot be configured (e.g., old kernel or missing permissions)"
  # A backup udev rule should handle old kernels if necessary
  exit 1
fi

# Process all already pluggedd blocks (e.g., DIMMs, but also Hyper-V or virtio-mem)
if [ $ONLINE_TYPE != "offline" ]; then
  for MEMORY in /sys/devices/system/memory/memory*; do
    STATE=`cat $MEMORY/state`
    if [ $STATE == "offline" ]; then
        echo $ONLINE_TYPE > $MEMORY/state
    fi
  done
fi

=== Example /usr/lib/systemd/system/config-memhotplug.service ===

[Unit]
Description=Configure memory hotplug behavior
DefaultDependencies=no
Conflicts=shutdown.target
Before=sysinit.target shutdown.target
After=systemd-modules-load.service
ConditionPathExists=|/sys/devices/system/memory/auto_online_blocks

[Service]
ExecStart=/usr/libexec/config-memhotplug
Type=oneshot
TimeoutSec=0
RemainAfterExit=yes

[Install]
WantedBy=sysinit.target

=== Example modification to the 40-redhat.rules [2] ===

: diff --git a/40-redhat.rules b/40-redhat.rules-new
: index 2c690e5..168fd03 100644
: --- a/40-redhat.rules
: +++ b/40-redhat.rules-new
: @@ -6,6 +6,9 @@ SUBSYSTEM=="cpu", ACTION=="add", TEST=="online", ATTR{online}=="0", ATTR{online}
:  # Memory hotadd request
:  SUBSYSTEM!="memory", GOTO="memory_hotplug_end"
:  ACTION!="add", GOTO="memory_hotplug_end"
: +# memory hotplug behavior configured
: +PROGRAM=="grep online /sys/devices/system/memory/auto_online_blocks", GOTO="memory_hotplug_end"
: +
:  PROGRAM="/bin/uname -p", RESULT=="s390*", GOTO="memory_hotplug_end"
:
:  ENV{.state}="online"

===

[1] https://github.com/lnykryn/systemd-rhel/pull/281
[2] https://github.com/lnykryn/systemd-rhel/blob/staging/rules/40-redhat.rules

This patch (of 8):

The name is misleading and it's not really clear what is "kept".  Let's
just name it like the online_type name we expose to user space ("online").

Add some documentation to the types.

Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Baoquan He <bhe@redhat.com>
Acked-by: Pankaj Gupta <pankaj.gupta.linux@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Vitaly Kuznetsov <vkuznets@redhat.com>
Cc: Yumei Huang <yuhuang@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Haiyang Zhang <haiyangz@microsoft.com>
Cc: K. Y. Srinivasan <kys@microsoft.com>
Cc: Michael Ellerman <mpe@ellerman.id.au> (powerpc)
Cc: Paul Mackerras <paulus@samba.org>
Cc: Stephen Hemminger <sthemmin@microsoft.com>
Cc: Wei Liu <wei.liu@kernel.org>
Link: http://lkml.kernel.org/r/20200319131221.14044-1-david@redhat.com
Link: http://lkml.kernel.org/r/20200317104942.11178-2-david@redhat.com
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2020-04-07 10:43:40 -07:00
..
firmware_loader firmware: Add new platform fallback mechanism and firmware_request_platform() 2020-03-20 14:54:04 +01:00
power Additional power management updates for 5.7-rc1 2020-04-06 10:14:39 -07:00
regmap regmap: fix writes to non incrementing registers 2020-01-21 17:16:26 +00:00
test Driver core changes for 5.6-rc1 2020-01-29 10:18:20 -08:00
arch_topology.c arm64 updates for 5.7: 2020-03-31 10:05:01 -07:00
attribute_container.c scsi: drivers: base: Support atomic version of attribute_container_device_trigger 2020-01-15 22:55:36 -05:00
base.h device.h: move devtmpfs prototypes out of the file 2019-12-16 10:10:18 +01:00
bus.c device.h: move 'struct bus' stuff out to device/bus.h 2019-12-16 10:11:12 +01:00
cacheinfo.c Driver Core and debugfs changes for 5.3-rc1 2019-07-12 12:24:03 -07:00
class.c device.h: move 'struct class' stuff out to device/class.h 2019-12-16 10:11:14 +01:00
component.c component: allow missing unbind callback 2020-03-18 14:13:33 +01:00
container.c driver core: Remove redundant license text 2017-12-07 18:36:44 +01:00
core.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-03-31 17:29:33 -07:00
cpu.c CPU (hotplug) updates: 2020-03-30 18:06:39 -07:00
dd.c driver core: Replace open-coded list_last_entry() 2020-03-24 13:33:26 +01:00
devcon.c Merge generic_lookup_helpers into usb-next 2019-09-03 17:11:07 +02:00
devcoredump.c devcoredump: fix typo in comment 2019-08-15 17:38:11 +02:00
devres.c drivers/base/devres: introduce devm_release_action() 2019-06-13 17:34:56 -10:00
devtmpfs.c Merge branch 'merge.nfs-fs_parse.1' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-02-08 13:26:41 -08:00
driver.c device.h: move 'struct driver' stuff out to device/driver.h 2019-12-16 10:11:16 +01:00
firmware.c driver core: Remove redundant license text 2017-12-07 18:36:44 +01:00
hypervisor.c driver core: Remove redundant license text 2017-12-07 18:36:44 +01:00
init.c base: fix order of OF initialization 2018-07-07 17:54:29 +02:00
isa.c Merge 4.15-rc3 into driver-core-next 2017-12-11 08:50:05 +01:00
Kconfig kunit: building kunit as a module breaks allmodconfig 2020-01-10 14:36:37 -07:00
Makefile drivers: base: Introducing software nodes to the firmware node framework 2018-11-26 18:19:11 +01:00
map.c driver core: Remove redundant license text 2017-12-07 18:36:44 +01:00
memory.c drivers/base/memory: rename MMOP_ONLINE_KEEP to MMOP_ONLINE 2020-04-07 10:43:40 -07:00
module.c driver core: Remove redundant license text 2017-12-07 18:36:44 +01:00
node.c mm/sparse: rename pfn_present() to pfn_in_present_section() 2020-04-02 09:35:30 -07:00
pinctrl.c driver core: Remove redundant license text 2017-12-07 18:36:44 +01:00
platform-msi.c platform-msi: Free descriptors in platform_msi_domain_free() 2018-12-13 09:35:31 +00:00
platform.c driver core: platform: Reimplement devm_platform_ioremap_resource 2020-03-24 12:09:40 +01:00
property.c device property: Export fwnode_get_name() 2020-03-16 07:47:58 +01:00
soc.c base: soc: Handle custom soc information sysfs entries 2019-10-10 14:35:32 +02:00
swnode.c Revert "software node: Simplify software_node_release() function" 2020-03-04 22:31:44 +01:00
syscore.c treewide: Switch printk users from %pf and %pF to %ps and %pS, respectively 2019-04-09 14:19:06 +02:00
topology.c topology: Create core_cpus and die_cpus sysfs attributes 2019-05-23 10:08:34 +02:00
transport_class.c scsi: drivers: base: Propagate errors through the transport component 2020-01-15 22:55:37 -05:00