2015-03-24 22:02:41 +08:00
|
|
|
acpi= [HW,ACPI,X86,ARM64]
|
2007-03-06 18:29:44 +08:00
|
|
|
Advanced Configuration and Power Interface
|
2016-04-12 22:09:11 +08:00
|
|
|
Format: { force | on | off | strict | noirq | rsdt |
|
2015-09-27 00:27:57 +08:00
|
|
|
copy_dsdt }
|
2005-04-17 06:20:36 +08:00
|
|
|
force -- enable ACPI if default was off
|
2016-04-12 22:09:11 +08:00
|
|
|
on -- enable ACPI but allow fallback to DT [arm64]
|
2005-04-17 06:20:36 +08:00
|
|
|
off -- disable ACPI if default was on
|
|
|
|
noirq -- do not use ACPI for IRQ routing
|
2005-10-24 03:57:11 +08:00
|
|
|
strict -- Be less tolerant of platforms that are not
|
2005-04-17 06:20:36 +08:00
|
|
|
strictly ACPI specification compliant.
|
2008-12-17 16:55:18 +08:00
|
|
|
rsdt -- prefer RSDT over (default) XSDT
|
2010-04-08 14:34:27 +08:00
|
|
|
copy_dsdt -- copy DSDT to memory
|
2016-04-12 22:09:11 +08:00
|
|
|
For ARM64, ONLY "acpi=off", "acpi=on" or "acpi=force"
|
|
|
|
are available
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2019-06-13 18:10:36 +08:00
|
|
|
See also Documentation/power/runtime_pm.rst, pci=noacpi
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2007-03-11 15:26:14 +08:00
|
|
|
acpi_apic_instance= [ACPI, IOAPIC]
|
|
|
|
Format: <int>
|
|
|
|
2: use 2nd APIC table, if available
|
|
|
|
1,0: use 1st APIC table
|
2007-03-31 02:16:10 +08:00
|
|
|
default: 0
|
2007-03-11 15:26:14 +08:00
|
|
|
|
2008-08-01 23:37:55 +08:00
|
|
|
acpi_backlight= [HW,ACPI]
|
2020-03-31 08:17:37 +08:00
|
|
|
{ vendor | video | native | none }
|
|
|
|
If set to vendor, prefer vendor-specific driver
|
2008-08-01 23:37:55 +08:00
|
|
|
(e.g. thinkpad_acpi, sony_acpi, etc.) instead
|
|
|
|
of the ACPI video.ko driver.
|
2020-03-31 08:17:37 +08:00
|
|
|
If set to video, use the ACPI video.ko driver.
|
|
|
|
If set to native, use the device's native backlight mode.
|
|
|
|
If set to none, disable the ACPI backlight interface.
|
2008-08-01 23:37:55 +08:00
|
|
|
|
2016-01-22 01:05:47 +08:00
|
|
|
acpi_force_32bit_fadt_addr
|
|
|
|
force FADT to use 32 bit addresses rather than the
|
|
|
|
64 bit X_* addresses. Some firmware has broken 64
|
|
|
|
bit addresses; this option forces ACPI to ignore them and use
|
|
|
|
the older legacy 32 bit addresses.
|
|
|
|
|
2015-05-14 21:31:28 +08:00
|
|
|
acpica_no_return_repair [HW, ACPI]
|
|
|
|
Disable AML predefined validation mechanism
|
|
|
|
This mechanism can repair the evaluation result to make
|
|
|
|
the return objects more ACPI specification compliant.
|
|
|
|
This option is useful for developers to identify the
|
|
|
|
root cause of an AML interpreter issue when the issue
|
|
|
|
has something to do with the repair mechanism.
|
|
|
|
|
2008-11-08 07:58:05 +08:00
|
|
|
acpi.debug_layer= [HW,ACPI,ACPI_DEBUG]
|
|
|
|
acpi.debug_level= [HW,ACPI,ACPI_DEBUG]
|
2005-04-17 06:20:36 +08:00
|
|
|
Format: <int>
|
2008-11-08 07:58:05 +08:00
|
|
|
CONFIG_ACPI_DEBUG must be enabled to produce any ACPI
|
|
|
|
debug output. Bits in debug_layer correspond to a
|
|
|
|
_COMPONENT in an ACPI source file, e.g.,
|
2021-02-20 02:16:54 +08:00
|
|
|
#define _COMPONENT ACPI_EVENTS
|
2008-11-08 07:58:05 +08:00
|
|
|
Bits in debug_level correspond to a level in
|
|
|
|
ACPI_DEBUG_PRINT statements, e.g.,
|
|
|
|
ACPI_DEBUG_PRINT((ACPI_DB_INFO, ...
|
2008-11-14 07:30:13 +08:00
|
|
|
The debug_level mask defaults to "info". See
|
2019-06-08 02:54:32 +08:00
|
|
|
Documentation/firmware-guide/acpi/debug.rst for more information about
|
2008-11-14 07:30:13 +08:00
|
|
|
debug layers and levels.
|
2008-11-08 07:58:05 +08:00
|
|
|
|
2008-11-14 07:30:13 +08:00
|
|
|
Enable processor driver info messages:
|
|
|
|
acpi.debug_layer=0x20000000
|
2008-11-08 07:58:05 +08:00
|
|
|
Enable AML "Debug" output, i.e., stores to the Debug
|
|
|
|
object while interpreting AML:
|
|
|
|
acpi.debug_layer=0xffffffff acpi.debug_level=0x2
|
|
|
|
Enable all messages related to ACPI hardware:
|
|
|
|
acpi.debug_layer=0x2 acpi.debug_level=0xffffffff
|
|
|
|
|
|
|
|
Some values produce so much output that the system is
|
|
|
|
unusable. The "log_buf_len" parameter may be useful
|
|
|
|
if you need to capture more output.
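The mask examples above can be assembled programmatically. The following
Python sketch is illustrative only: the helper name acpi_debug_args is
hypothetical, the 32-bit range check is an assumption based on the
0xffffffff examples, and the mask values are the ones quoted above rather
than a full list of layers or levels.

```python
def acpi_debug_args(layer=None, level=None):
    """Build a kernel command line fragment for the ACPI debug masks.

    Each mask is an integer; None means "leave that parameter out".
    The 32-bit limit is an assumption taken from the 0xffffffff examples.
    """
    parts = []
    for name, mask in (("acpi.debug_layer", layer), ("acpi.debug_level", level)):
        if mask is None:
            continue
        if not 0 <= mask <= 0xFFFFFFFF:
            raise ValueError(f"{name} must fit in 32 bits")
        parts.append(f"{name}={mask:#x}")
    return " ".join(parts)


# The three examples from the text above.
print(acpi_debug_args(layer=0x20000000))
print(acpi_debug_args(layer=0xFFFFFFFF, level=0x2))
print(acpi_debug_args(layer=0x2, level=0xFFFFFFFF))
```

The output only has an effect when CONFIG_ACPI_DEBUG is enabled, as noted
above.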
|
2007-04-24 13:53:22 +08:00
|
|
|
|
2015-05-14 21:31:28 +08:00
|
|
|
acpi_enforce_resources= [ACPI]
|
|
|
|
{ strict | lax | no }
|
|
|
|
Check for resource conflicts between native drivers
|
|
|
|
and ACPI OperationRegions (SystemIO and SystemMemory
|
|
|
|
only). IO ports and memory declared in ACPI might be
|
|
|
|
used by the ACPI subsystem in arbitrary AML code and
|
|
|
|
can interfere with legacy drivers.
|
|
|
|
strict (default): access to resources claimed by ACPI
|
|
|
|
is denied; legacy drivers trying to access reserved
|
|
|
|
resources will fail to bind to device using them.
|
|
|
|
lax: access to resources claimed by ACPI is allowed;
|
|
|
|
legacy drivers trying to access reserved resources
|
|
|
|
will bind successfully but a warning message is logged.
|
|
|
|
no: ACPI OperationRegions are not marked as reserved,
|
|
|
|
no further checks are performed.
|
|
|
|
|
2014-05-31 08:15:02 +08:00
|
|
|
acpi_force_table_verification [HW,ACPI]
|
|
|
|
Enable table checksum verification during early stage.
|
|
|
|
By default, this is disabled due to x86 early mapping
|
|
|
|
size limitation.
|
|
|
|
|
2009-04-06 06:55:22 +08:00
|
|
|
acpi_irq_balance [HW,ACPI]
|
|
|
|
ACPI will balance active IRQs
|
|
|
|
default in APIC mode
|
|
|
|
|
|
|
|
acpi_irq_nobalance [HW,ACPI]
|
|
|
|
ACPI will not move active IRQs (default)
|
|
|
|
default in PIC mode
|
|
|
|
|
|
|
|
acpi_irq_isa= [HW,ACPI] If irq_balance, mark listed IRQs used by ISA
|
|
|
|
Format: <irq>,<irq>...
|
|
|
|
|
|
|
|
acpi_irq_pci= [HW,ACPI] If irq_balance, clear listed IRQs for
|
|
|
|
use by PCI
|
|
|
|
Format: <irq>,<irq>...
|
|
|
|
|
2018-04-19 02:51:39 +08:00
|
|
|
acpi_mask_gpe= [HW,ACPI]
|
2016-12-16 12:07:57 +08:00
|
|
|
Due to the existence of _Lxx/_Exx, some GPEs triggered
|
|
|
|
by unsupported hardware/firmware features can result in
|
2018-04-19 02:51:39 +08:00
|
|
|
GPE floodings that cannot be automatically disabled by
|
|
|
|
the GPE dispatcher.
|
2016-12-16 12:07:57 +08:00
|
|
|
This facility can be used to prevent such uncontrolled
|
|
|
|
GPE floodings.
|
2021-06-17 01:03:33 +08:00
|
|
|
Format: <byte> or <bitmap-list>
|
2016-12-16 12:07:57 +08:00
|
|
|
|
2014-03-24 14:49:22 +08:00
|
|
|
acpi_no_auto_serialize [HW,ACPI]
|
|
|
|
Disable auto-serialization of AML methods
|
2014-03-24 14:49:00 +08:00
|
|
|
AML control methods that contain the opcodes to create
|
|
|
|
named objects will be marked as "Serialized" by the
|
|
|
|
auto-serialization feature.
|
2014-03-24 14:49:22 +08:00
|
|
|
This feature is enabled by default.
|
|
|
|
This option turns the feature off.
|
2014-03-24 14:49:00 +08:00
|
|
|
|
2015-05-14 21:31:28 +08:00
|
|
|
acpi_no_memhotplug [ACPI] Disable memory hotplug. Useful for kdump
|
|
|
|
kernels.
|
|
|
|
|
2014-04-04 12:39:11 +08:00
|
|
|
acpi_no_static_ssdt [HW,ACPI]
|
|
|
|
Disable installation of static SSDTs at early boot time
|
|
|
|
By default, SSDTs contained in the RSDT/XSDT will be
|
|
|
|
installed automatically and they will appear under
|
|
|
|
/sys/firmware/acpi/tables.
|
|
|
|
This option turns off this feature.
|
|
|
|
Note that specifying this option does not affect
|
|
|
|
dynamic table installation which will install SSDT
|
|
|
|
tables to /sys/firmware/acpi/tables/dynamic.
|
2009-04-06 06:55:22 +08:00
|
|
|
|
2020-02-06 23:58:45 +08:00
|
|
|
acpi_no_watchdog [HW,ACPI,WDT]
|
|
|
|
Ignore the ACPI-based watchdog interface (WDAT) and let
|
|
|
|
a native driver control the watchdog device instead.
|
|
|
|
|
2015-05-14 21:31:28 +08:00
|
|
|
acpi_rsdp= [ACPI,EFI,KEXEC]
|
|
|
|
Pass the RSDP address to the kernel, mostly used
|
|
|
|
on machines running EFI runtime service to boot the
|
|
|
|
second kernel for kdump.
|
2014-02-11 11:01:52 +08:00
|
|
|
|
2009-04-06 06:55:22 +08:00
|
|
|
acpi_os_name= [HW,ACPI] Tell ACPI BIOS the name of the OS
|
|
|
|
Format: To spoof as Windows 98: ="Microsoft Windows"
|
|
|
|
|
2015-07-03 07:06:00 +08:00
|
|
|
acpi_rev_override [ACPI] Override the _REV object to return 5 (instead
|
|
|
|
of 2 which is mandated by ACPI 6) as the supported ACPI
|
|
|
|
specification revision (when using this switch, it may
|
|
|
|
be necessary to carry out a cold reboot _twice_ in a
|
|
|
|
row to make it take effect on the platform firmware).
|
|
|
|
|
2009-04-06 06:55:22 +08:00
|
|
|
acpi_osi= [HW,ACPI] Modify list of supported OS interface strings
|
2013-07-22 16:08:25 +08:00
|
|
|
acpi_osi="string1" # add string1
|
|
|
|
acpi_osi="!string2" # remove string2
|
2013-07-22 16:08:36 +08:00
|
|
|
acpi_osi=!* # remove all strings
|
2013-07-22 16:08:25 +08:00
|
|
|
acpi_osi=! # disable all built-in OS vendor
|
|
|
|
strings
|
2016-05-03 16:48:32 +08:00
|
|
|
acpi_osi=!! # enable all built-in OS vendor
|
|
|
|
strings
|
2009-04-06 06:55:22 +08:00
|
|
|
acpi_osi= # disable all strings
|
|
|
|
|
2013-07-22 16:08:25 +08:00
|
|
|
'acpi_osi=!' can be used in combination with single or
|
|
|
|
multiple 'acpi_osi="string1"' to support specific OS
|
|
|
|
vendor string(s). Note that such command can only
|
|
|
|
affect the default state of the OS vendor strings, thus
|
|
|
|
it cannot affect the default state of the feature group
|
|
|
|
strings and the current state of the OS vendor strings,
|
|
|
|
specifying it multiple times through kernel command line
|
2013-07-22 16:08:36 +08:00
|
|
|
is meaningless. This command is useful when one does not
|
|
|
|
care about the state of the feature group strings which
|
|
|
|
should be controlled by the OSPM.
|
2013-07-22 16:08:25 +08:00
|
|
|
Examples:
|
|
|
|
1. 'acpi_osi=! acpi_osi="Windows 2000"' is equivalent
|
|
|
|
to 'acpi_osi="Windows 2000" acpi_osi=!', they all
|
|
|
|
can make '_OSI("Windows 2000")' TRUE.
|
|
|
|
|
|
|
|
'acpi_osi=' cannot be used in combination with other
|
|
|
|
'acpi_osi=' command lines, the _OSI method will not
|
|
|
|
exist in the ACPI namespace. NOTE that such command can
|
|
|
|
only affect the _OSI support state, thus specifying it
|
|
|
|
multiple times through kernel command line is also
|
|
|
|
meaningless.
|
|
|
|
Examples:
|
|
|
|
1. 'acpi_osi=' can make 'CondRefOf(_OSI, Local1)'
|
|
|
|
FALSE.
|
|
|
|
|
2013-07-22 16:08:36 +08:00
|
|
|
'acpi_osi=!*' can be used in combination with single or
|
|
|
|
multiple 'acpi_osi="string1"' to support specific
|
|
|
|
string(s). Note that such command can affect the
|
|
|
|
current state of both the OS vendor strings and the
|
|
|
|
feature group strings, thus specifying it multiple times
|
|
|
|
through kernel command line is meaningful. But it may
|
|
|
|
still not be able to affect the final state of a string if
|
|
|
|
there are quirks related to this string. This command
|
|
|
|
is useful when one wants to control the state of the
|
|
|
|
feature group strings to debug BIOS issues related to
|
|
|
|
the OSPM features.
|
|
|
|
Examples:
|
|
|
|
1. 'acpi_osi="Module Device" acpi_osi=!*' can make
|
|
|
|
'_OSI("Module Device")' FALSE.
|
|
|
|
2. 'acpi_osi=!* acpi_osi="Module Device"' can make
|
|
|
|
'_OSI("Module Device")' TRUE.
|
|
|
|
3. 'acpi_osi=! acpi_osi=!* acpi_osi="Windows 2000"' is
|
|
|
|
equivalent to
|
|
|
|
'acpi_osi=!* acpi_osi=! acpi_osi="Windows 2000"'
|
|
|
|
and
|
|
|
|
'acpi_osi=!* acpi_osi="Windows 2000" acpi_osi=!',
|
|
|
|
they all will make '_OSI("Windows 2000")' TRUE.
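The ordering rules in the acpi_osi= examples above can be checked against
a toy model. The Python sketch below is not the kernel's implementation;
it is a simplified model that assumes "!" only changes the default state
of the OS vendor strings (position-independent), while "!*", "string" and
"!string" act on the current state in command line order. The two string
sets are illustrative subsets using only names that appear in the
examples, and the forms 'acpi_osi=' and 'acpi_osi=!!' are deliberately
left out.

```python
VENDOR_STRINGS = {"Windows 2000"}    # OS vendor strings (illustrative subset)
FEATURE_STRINGS = {"Module Device"}  # feature group strings (illustrative subset)


def osi_supported(args, query):
    """Modelled result of _OSI(query) for a list of acpi_osi= values."""
    vendor_default = True  # cleared by "!", wherever it appears
    override = {}          # explicit per-string state; later arguments win
    for arg in args:
        if arg in ("", "!!"):
            raise NotImplementedError("not covered by this toy model")
        if arg == "!":
            vendor_default = False
        elif arg == "!*":
            # Remove every string known so far, vendor and feature alike.
            for name in VENDOR_STRINGS | FEATURE_STRINGS | set(override):
                override[name] = False
        elif arg.startswith("!"):
            override[arg[1:]] = False
        else:
            override[arg] = True
    if query in override:
        return override[query]
    if query in VENDOR_STRINGS:
        return vendor_default
    return query in FEATURE_STRINGS


print(osi_supported(["Module Device", "!*"], "Module Device"))
print(osi_supported(["!*", "Module Device"], "Module Device"))
```

Under these assumptions the model agrees with every numbered example
above, including the three equivalent orderings of "!", "!*" and
"Windows 2000". Quirks, which the text says can still change the final
state of a string, are not modelled.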
|
|
|
|
|
2009-04-14 16:33:43 +08:00
|
|
|
acpi_pm_good [X86]
|
2009-04-06 06:55:22 +08:00
|
|
|
Override the pmtimer bug detection: force the kernel
|
|
|
|
to assume that this machine's pmtimer latches its value
|
|
|
|
and always returns good values.
|
|
|
|
|
2009-04-18 09:30:28 +08:00
|
|
|
acpi_sci= [HW,ACPI] ACPI System Control Interrupt trigger mode
|
|
|
|
Format: { level | edge | high | low }
|
|
|
|
|
|
|
|
acpi_skip_timer_override [HW,ACPI]
|
|
|
|
Recognize and ignore IRQ0/pin2 Interrupt Override.
|
|
|
|
For broken nForce2 BIOS resulting in XT-PIC timer.
|
|
|
|
|
|
|
|
acpi_sleep= [HW,ACPI] Sleep options
|
|
|
|
Format: { s3_bios, s3_mode, s3_beep, s4_nohwsig,
|
2017-11-15 09:16:55 +08:00
|
|
|
old_ordering, nonvs, sci_force_enable, nobl }
|
2019-06-13 18:10:36 +08:00
|
|
|
See Documentation/power/video.rst for information on
|
2009-04-18 09:30:28 +08:00
|
|
|
s3_bios and s3_mode.
|
|
|
|
s3_beep is for debugging; it makes the PC's speaker beep
|
|
|
|
as soon as the kernel's real-mode entry point is called.
|
|
|
|
s4_nohwsig prevents ACPI hardware signature from being
|
|
|
|
used during resume from hibernation.
|
|
|
|
old_ordering causes the ACPI 1.0 ordering of the _PTS
|
|
|
|
control method, with respect to putting devices into
|
|
|
|
low power states, to be enforced (the ACPI 2.0 ordering
|
|
|
|
of _PTS is used by default).
|
2010-07-24 04:59:09 +08:00
|
|
|
nonvs prevents the kernel from saving/restoring the
|
|
|
|
ACPI NVS memory during suspend/hibernation and resume.
|
2009-12-30 15:36:42 +08:00
|
|
|
sci_force_enable causes the kernel to set SCI_EN directly
|
|
|
|
on resume from S1/S3 (which is against the ACPI spec,
|
|
|
|
but some broken systems don't work without it).
|
2017-11-15 09:16:55 +08:00
|
|
|
nobl causes the internal blacklist of systems known to
|
|
|
|
behave incorrectly in some ways with respect to system
|
|
|
|
suspend and resume to be ignored (use wisely).
|
2009-04-18 09:30:28 +08:00
|
|
|
|
|
|
|
acpi_use_timer_override [HW,ACPI]
|
|
|
|
Use timer override. For some broken Nvidia NF5 boards
|
|
|
|
that require a timer override, but don't have HPET.
|
|
|
|
|
|
|
|
add_efi_memmap [EFI; X86] Include EFI memory map in
|
|
|
|
kernel's map of available physical RAM.
|
|
|
|
|
2009-04-06 06:55:22 +08:00
|
|
|
agp= [AGP]
|
|
|
|
{ off | try_unsupported }
|
|
|
|
off: disable AGP support
|
|
|
|
try_unsupported: try to drive unsupported chipsets
|
|
|
|
(may crash computer or cause data corruption)
|
|
|
|
|
2010-06-08 08:10:38 +08:00
|
|
|
ALSA [HW,ALSA]
|
2018-06-14 18:43:07 +08:00
|
|
|
See Documentation/sound/alsa-configuration.rst
|
2010-06-08 08:10:38 +08:00
|
|
|
|
2010-02-21 00:13:29 +08:00
|
|
|
alignment= [KNL,ARM]
|
|
|
|
Allow the default userspace alignment fault handler
|
|
|
|
behaviour to be specified. Bit 0 enables warnings,
|
|
|
|
bit 1 enables fixups, and bit 2 sends a segfault.
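As an illustration of the bit layout described above, the following
Python sketch decodes an alignment= value into its three flags. The
function name decode_alignment and the dictionary keys are hypothetical;
only the bit positions come from the text.

```python
def decode_alignment(value):
    """Decode an alignment= value into the behaviours it selects."""
    return {
        "warn": bool(value & 0b001),      # bit 0: enable warnings
        "fixup": bool(value & 0b010),     # bit 1: enable fixups
        "segfault": bool(value & 0b100),  # bit 2: send a segfault
    }


# alignment=3 sets bits 0 and 1: warn and fix up, but do not segfault.
print(decode_alignment(3))
```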
|
|
|
|
|
2011-08-05 21:15:08 +08:00
|
|
|
align_va_addr= [X86-64]
|
|
|
|
Align virtual addresses by clearing slice [14:12] when
|
|
|
|
allocating a VMA at process creation time. This option
|
|
|
|
gives you up to 3% performance improvement on AMD F15h
|
|
|
|
machines (where it is enabled by default) for a
|
|
|
|
CPU-intensive style benchmark, and it can vary highly in
|
|
|
|
a microbenchmark depending on workload and compiler.
|
|
|
|
|
2011-11-21 19:10:19 +08:00
|
|
|
32: only for 32-bit processes
|
|
|
|
64: only for 64-bit processes
|
2011-08-05 21:15:08 +08:00
|
|
|
on: enable for both 32- and 64-bit processes
|
|
|
|
off: disable for both 32- and 64-bit processes
|
|
|
|
|
2013-03-08 11:48:09 +08:00
|
|
|
alloc_snapshot [FTRACE]
|
|
|
|
Allocate the ftrace snapshot buffer on boot up when the
|
|
|
|
main buffer is allocated. This is handy if debugging
|
|
|
|
and you need to use tracing_snapshot() on boot up, and
|
|
|
|
do not want to use tracing_snapshot_alloc() as it needs
|
|
|
|
to be done where GFP_KERNEL allocations are allowed.
|
|
|
|
|
2011-12-06 06:08:32 +08:00
|
|
|
amd_iommu= [HW,X86-64]
|
2008-06-27 03:28:10 +08:00
|
|
|
Pass parameters to the AMD IOMMU driver in the system.
|
|
|
|
Possible values are:
|
2008-09-20 00:23:30 +08:00
|
|
|
fullflush - enable flushing of IO/TLB entries when
|
|
|
|
they are unmapped. Otherwise they are
|
|
|
|
flushed before they will be reused, which
|
|
|
|
is a lot faster
|
2010-05-11 23:12:33 +08:00
|
|
|
off - do not initialize any AMD IOMMU found in
|
|
|
|
the system
|
2011-12-01 22:49:45 +08:00
|
|
|
force_isolation - Force device isolation for all
|
|
|
|
devices. The IOMMU driver is not
|
|
|
|
allowed anymore to lift isolation
|
|
|
|
requirements as needed. This option
|
|
|
|
does not override iommu=pt
|
2021-06-03 21:02:03 +08:00
|
|
|
force_enable - Force enable the IOMMU on platforms known
|
|
|
|
to be buggy with IOMMU enabled. Use this
|
|
|
|
option with care.
|
2008-09-20 00:23:30 +08:00
|
|
|
|
2012-05-25 05:58:25 +08:00
|
|
|
amd_iommu_dump= [HW,X86-64]
|
|
|
|
Enable AMD IOMMU driver option to dump the ACPI table
|
|
|
|
for AMD IOMMU. With this option enabled, AMD IOMMU
|
|
|
|
driver will print ACPI tables for AMD IOMMU during
|
|
|
|
IOMMU initialization.
|
|
|
|
|
2016-08-24 02:52:32 +08:00
|
|
|
amd_iommu_intr= [HW,X86-64]
|
|
|
|
Specifies one of the following AMD IOMMU interrupt
|
|
|
|
remapping modes:
|
|
|
|
legacy - Use legacy interrupt remapping mode.
|
|
|
|
vapic - Use virtual APIC mode, which allows IOMMU
|
|
|
|
to inject interrupts directly into guest.
|
|
|
|
This mode requires kvm-amd.avic=1.
|
|
|
|
(Default when IOMMU HW support is present.)
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
amijoy.map= [HW,JOY] Amiga joystick support
|
|
|
|
Map of devices attached to JOY0DAT and JOY1DAT
|
|
|
|
Format: <a>,<b>
|
2017-10-11 01:36:23 +08:00
|
|
|
See also Documentation/input/joydev/joystick.rst
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
analog.map= [HW,JOY] Analog joystick and gamepad support
|
|
|
|
Specifies type or capabilities of an analog joystick
|
|
|
|
connected to one of 16 gameports
|
|
|
|
Format: <type1>,<type2>,..<type16>
|
|
|
|
|
2005-10-24 03:57:11 +08:00
|
|
|
apc= [HW,SPARC]
|
|
|
|
Power management functions (SPARCstation-4/5 + deriv.)
|
2005-04-17 06:20:36 +08:00
|
|
|
Format: noidle
|
|
|
|
Disable APC CPU standby support. SPARCstation-Fox does
|
|
|
|
not play well with APC CPU idle - disable it if you have
|
|
|
|
APC and your system crashes randomly.
|
|
|
|
|
2017-12-04 12:03:13 +08:00
|
|
|
apic= [APIC,X86] Advanced Programmable Interrupt Controller
|
2018-11-19 19:02:45 +08:00
|
|
|
Change the output verbosity while booting
|
2005-04-17 06:20:36 +08:00
|
|
|
Format: { quiet (default) | verbose | debug }
|
|
|
|
Change the amount of debugging information output
|
|
|
|
when initialising the APIC and IO-APIC components.
|
2017-12-04 12:03:13 +08:00
|
|
|
For X86-32, this can also be used to specify an APIC
|
|
|
|
driver name.
|
|
|
|
Format: apic=driver_name
|
|
|
|
Examples: apic=bigsmp
|
2005-10-24 03:57:11 +08:00
|
|
|
|
2015-12-14 18:19:12 +08:00
|
|
|
apic_extnmi= [APIC,X86] External NMI delivery setting
|
|
|
|
Format: { bsp (default) | all | none }
|
|
|
|
bsp: External NMI is delivered only to CPU 0
|
|
|
|
all: External NMIs are broadcast to all CPUs as a
|
|
|
|
backup of CPU 0
|
|
|
|
none: External NMI is masked for all CPUs. This is
|
|
|
|
useful so that a dump capture kernel won't be
|
|
|
|
shot down by NMI
|
|
|
|
|
2010-02-05 05:36:50 +08:00
|
|
|
autoconf= [IPV6]
|
2020-04-28 06:01:50 +08:00
|
|
|
See Documentation/networking/ipv6.rst.
|
2010-02-05 05:36:50 +08:00
|
|
|
|
2009-10-14 23:09:04 +08:00
|
|
|
show_lapic= [APIC,X86] Advanced Programmable Interrupt Controller
|
|
|
|
Limit apic dumping. The parameter defines the maximal
|
|
|
|
number of local apics being dumped. Also it is possible
|
|
|
|
to set it to "all", meaning no limit.
|
|
|
|
Format: { 1 (default) | 2 | ... | all }.
|
|
|
|
The parameter is valid only if apic=debug or
|
|
|
|
apic=verbose is specified.
|
|
|
|
Example: apic=debug show_lapic=all
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
apm= [APM] Advanced Power Management
|
2008-07-05 00:59:43 +08:00
|
|
|
See header of arch/x86/kernel/apm_32.c.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
arcrimi= [HW,NET] ARCnet - "RIM I" (entirely mem-mapped) cards
|
|
|
|
Format: <io>,<irq>,<nodeID>
|
|
|
|
|
2021-02-08 17:57:29 +08:00
|
|
|
arm64.nobti [ARM64] Unconditionally disable Branch Target
|
|
|
|
Identification support
|
|
|
|
|
2021-02-08 17:57:31 +08:00
|
|
|
arm64.nopauth [ARM64] Unconditionally disable Pointer Authentication
|
|
|
|
support
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
ataflop= [HW,M68k]
|
|
|
|
|
|
|
|
atarimouse= [HW,MOUSE] Atari Mouse
|
|
|
|
|
|
|
|
atkbd.extra= [HW] Enable extra LEDs and keys on IBM RapidAccess,
|
|
|
|
EzKey and similar keyboards
|
|
|
|
|
|
|
|
atkbd.reset= [HW] Reset keyboard during initialization
|
|
|
|
|
2005-10-24 03:57:11 +08:00
|
|
|
atkbd.set= [HW] Select keyboard code set
|
|
|
|
Format: <int> (2 = AT (default), 3 = PS/2)
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
atkbd.scroll= [HW] Enable scroll wheel on MS Office and similar
|
|
|
|
keyboards
|
|
|
|
|
|
|
|
atkbd.softraw= [HW] Choose between synthetic and real raw mode
|
|
|
|
Format: <bool> (0 = real, 1 = synthetic (default))
|
2005-10-24 03:57:11 +08:00
|
|
|
|
|
|
|
atkbd.softrepeat= [HW]
|
|
|
|
Use software keyboard repeat
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2013-09-24 09:53:35 +08:00
|
|
|
audit= [KNL] Enable the audit sub-system
|
2018-03-06 06:05:20 +08:00
|
|
|
Format: { "0" | "1" | "off" | "on" }
|
|
|
|
0 | off - kernel audit is disabled and can not be
|
|
|
|
enabled until the next reboot
|
2014-01-14 05:01:06 +08:00
|
|
|
unset - kernel audit is initialized but disabled and
|
|
|
|
will be fully enabled by the userspace auditd.
|
2018-03-06 06:05:20 +08:00
|
|
|
1 | on - kernel audit is initialized and partially
|
|
|
|
enabled, storing at most audit_backlog_limit
|
|
|
|
messages in RAM until it is fully enabled by the
|
|
|
|
userspace auditd.
|
2013-09-24 09:53:35 +08:00
|
|
|
Default: unset
|
2013-09-18 00:34:52 +08:00
|
|
|
|
2013-09-18 00:34:52 +08:00
|
|
|
audit_backlog_limit= [KNL] Set the audit queue size limit.
|
|
|
|
Format: <int> (must be >=0)
|
|
|
|
Default: 64
|
|
|
|
|
2016-04-01 03:18:29 +08:00
|
|
|
bau= [X86_UV] Enable the BAU on SGI UV. The default
|
|
|
|
behavior is to disable the BAU (i.e. bau=0).
|
|
|
|
Format: { "0" | "1" }
|
|
|
|
0 - Disable the BAU.
|
|
|
|
1 - Enable the BAU.
|
|
|
|
unset - Disable the BAU.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
baycom_epp= [HW,AX25]
|
|
|
|
Format: <io>,<mode>
|
2005-10-24 03:57:11 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
baycom_par= [HW,AX25] BayCom Parallel Port AX.25 Modem
|
|
|
|
Format: <io>,<mode>
|
|
|
|
See header of drivers/net/hamradio/baycom_par.c.
|
|
|
|
|
2005-10-24 03:57:11 +08:00
|
|
|
baycom_ser_fdx= [HW,AX25]
|
|
|
|
BayCom Serial Port AX.25 Modem (Full Duplex Mode)
|
2005-04-17 06:20:36 +08:00
|
|
|
Format: <io>,<irq>,<mode>[,<baud>]
|
|
|
|
See header of drivers/net/hamradio/baycom_ser_fdx.c.
|
|
|
|
|
2005-10-24 03:57:11 +08:00
|
|
|
baycom_ser_hdx= [HW,AX25]
|
|
|
|
BayCom Serial Port AX.25 Modem (Half Duplex Mode)
|
2005-04-17 06:20:36 +08:00
|
|
|
Format: <io>,<irq>,<mode>
|
|
|
|
See header of drivers/net/hamradio/baycom_ser_hdx.c.
|
|
|
|
|
2013-10-01 04:45:19 +08:00
|
|
|
blkdevparts= Manual partition parsing of block device(s) for
|
|
|
|
embedded devices based on command line input.
|
2019-04-19 06:45:00 +08:00
|
|
|
See Documentation/block/cmdline-partition.rst
|
2013-10-01 04:45:19 +08:00
|
|
|
|
2007-10-16 16:23:46 +08:00
|
|
|
boot_delay= Milliseconds to delay each printk during boot.
|
|
|
|
Values larger than 10 seconds (10000) are changed to
|
|
|
|
no delay (0).
|
|
|
|
Format: integer
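The clamping rule above is easy to misread, so here is a minimal Python
sketch of it. The function name effective_boot_delay is hypothetical and
the sketch assumes a non-negative integer input; it models only the
documented rule, not the kernel code.

```python
def effective_boot_delay(ms):
    """Delay per printk, in milliseconds, that a boot_delay= value yields.

    Values above 10 seconds (10000 ms) are treated as no delay at all,
    not capped at 10000.
    """
    return 0 if ms > 10000 else ms


print(effective_boot_delay(500))
print(effective_boot_delay(20000))
```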
|
|
|
|
|
2020-02-04 20:33:53 +08:00
|
|
|
bootconfig [KNL]
|
|
|
|
Extended command line options can be added to an initrd
|
|
|
|
and this will cause the kernel to look for it.
|
|
|
|
|
|
|
|
See Documentation/admin-guide/bootconfig.rst
|
|
|
|
|
2016-06-30 04:04:29 +08:00
|
|
|
bert_disable [ACPI]
|
|
|
|
Disable BERT OS support on buggy BIOSes.
|
|
|
|
|
2020-03-05 06:55:29 +08:00
|
|
|
bgrt_disable [ACPI][X86]
|
|
|
|
Disable BGRT to avoid flickering OEM logo.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
bttv.card= [HW,V4L] bttv (bt848 + bt878 based grabber cards)
|
2005-10-24 03:57:11 +08:00
|
|
|
bttv.radio= Most important insmod options are available as
|
|
|
|
kernel args too.
|
2020-03-04 20:08:03 +08:00
|
|
|
bttv.pll= See Documentation/admin-guide/media/bttv.rst
|
2011-08-15 08:02:26 +08:00
|
|
|
bttv.tuner=
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2010-09-28 23:33:12 +08:00
|
|
|
bulk_remove=off [PPC] This parameter disables the use of the pSeries
|
|
|
|
firmware feature for flushing multiple hpte entries
|
|
|
|
at a time.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
c101= [NET] Moxa C101 synchronous serial card
|
|
|
|
|
2007-07-31 15:37:59 +08:00
|
|
|
cachesize= [BUGS=X86-32] Override level 2 CPU cache size detection.
|
2005-04-17 06:20:36 +08:00
|
|
|
Sometimes CPU hardware bugs make them report the cache
|
|
|
|
size incorrectly. The kernel will attempt workarounds
|
|
|
|
to fix known problems, but for some CPUs it is not
|
|
|
|
possible to determine what the correct size should be.
|
|
|
|
This option provides an override for these situations.
|
|
|
|
|
2019-01-31 18:14:18 +08:00
|
|
|
carrier_timeout=
|
|
|
|
[NET] Specifies amount of time (in seconds) that
|
|
|
|
the kernel should wait for a network carrier. By default
|
|
|
|
it waits 120 seconds.
|
|
|
|
|
2014-06-17 16:56:58 +08:00
|
|
|
ca_keys= [KEYS] This parameter identifies a specific key(s) on
|
|
|
|
the system trusted keyring to be used for certificate
|
|
|
|
trust validation.
|
2014-06-17 16:56:59 +08:00
|
|
|
format: { id:<keyid> | builtin }
|
2014-06-17 16:56:58 +08:00
|
|
|
|
2014-06-26 07:41:13 +08:00
|
|
|
cca= [MIPS] Override the kernel pages' cache coherency
|
|
|
|
algorithm. Accepted values range from 0 to 7
|
|
|
|
inclusive. See arch/mips/include/asm/pgtable-bits.h
|
|
|
|
for platform specific values (SB1, Loongson3 and
|
|
|
|
others).
|
|
|
|
|
2018-04-19 02:51:39 +08:00
|
|
|
ccw_timeout_log [S390]
|
2019-06-09 10:27:16 +08:00
|
|
|
See Documentation/s390/common_io.rst for details.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2021-05-25 03:53:39 +08:00
|
|
|
cgroup_disable= [KNL] Disable a particular controller or optional feature
|
|
|
|
Format: {name of the controller(s) or feature(s) to disable}
|
2013-11-07 05:18:09 +08:00
|
|
|
The effects of cgroup_disable=foo are:
|
|
|
|
- foo isn't auto-mounted if you mount all cgroups in
|
|
|
|
a single hierarchy
|
|
|
|
- foo isn't visible as an individually mountable
|
|
|
|
subsystem
|
2021-05-25 03:53:39 +08:00
|
|
|
- if foo is an optional feature then the feature is
|
|
|
|
disabled and corresponding cgroup files are not
|
|
|
|
created
|
2013-11-07 05:18:09 +08:00
|
|
|
{Currently only the "memory" controller deals with this and
|
|
|
|
cuts the overhead; others just disable the usage. So
|
|
|
|
only cgroup_disable=memory is actually worthwhile}
|
2021-05-25 03:53:39 +08:00
|
|
|
Specifying "pressure" disables per-cgroup pressure
|
|
|
|
stall information accounting feature
|
2008-04-05 05:29:57 +08:00
|
|
|
|
2018-12-29 02:31:07 +08:00
|
|
|
cgroup_no_v1= [KNL] Disable cgroup controllers and named hierarchies in v1
|
|
|
|
Format: { { controller | "all" | "named" }
|
|
|
|
[,{ controller | "all" | "named" }...] }
|
2016-02-17 02:21:14 +08:00
|
|
|
Like cgroup_disable, but only applies to cgroup v1;
|
|
|
|
the blacklisted controllers remain available in cgroup2.
|
2018-12-29 02:31:07 +08:00
|
|
|
"all" blacklists all controllers and "named" disables
|
|
|
|
named mounts. Specifying both "all" and "named" disables
|
|
|
|
all v1 hierarchies.
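The format above can be illustrated with a small parser. This Python
sketch is an assumption-laden model, not kernel code: the function names
parse_cgroup_no_v1 and disables_all_v1 are hypothetical, and controller
names are not validated against any real list.

```python
def parse_cgroup_no_v1(value):
    """Split a cgroup_no_v1= value into its three kinds of entries."""
    tokens = [t for t in value.split(",") if t]
    return {
        "all": "all" in tokens,      # blacklist every controller in v1
        "named": "named" in tokens,  # disable named v1 mounts
        "controllers": sorted(set(tokens) - {"all", "named"}),
    }


def disables_all_v1(value):
    """True when both "all" and "named" are given, per the text above."""
    parsed = parse_cgroup_no_v1(value)
    return parsed["all"] and parsed["named"]


print(parse_cgroup_no_v1("memory,named"))
print(disables_all_v1("all,named"))
```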
|
2016-02-17 02:21:14 +08:00
|
|
|
|
2016-01-15 07:21:29 +08:00
|
|
|
cgroup.memory= [KNL] Pass options to the cgroup memory controller.
|
|
|
|
Format: <string>
|
|
|
|
nosocket -- Disable socket memory accounting.
|
2016-01-21 07:02:38 +08:00
|
|
|
nokmem -- Disable kernel memory accounting.
|
2016-01-15 07:21:29 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
checkreqprot [SELINUX] Set initial checkreqprot flag value.
|
|
|
|
Format: { "0" | "1" }
|
|
|
|
See security/selinux/Kconfig help text.
|
2005-10-24 03:57:11 +08:00
|
|
|
0 -- check protection applied by kernel (includes
|
|
|
|
any implied execute protection).
|
2005-04-17 06:20:36 +08:00
|
|
|
1 -- check protection requested by application.
|
|
|
|
Default value is set via a kernel config option.
|
2005-10-24 03:57:11 +08:00
|
|
|
Value can be changed at runtime via
|
2020-01-08 00:35:04 +08:00
|
|
|
/sys/fs/selinux/checkreqprot.
|
2020-01-09 00:24:47 +08:00
|
|
|
Setting checkreqprot to 1 is deprecated.
|
2005-10-24 03:57:11 +08:00
|
|
|
|
2008-01-26 21:10:36 +08:00
|
|
|
cio_ignore= [S390]
|
2019-06-09 10:27:16 +08:00
|
|
|
See Documentation/s390/common_io.rst for details.
|
2013-04-28 05:10:18 +08:00
|
|
|
clk_ignore_unused
|
|
|
|
[CLK]
|
2014-10-01 05:24:38 +08:00
|
|
|
Prevents the clock framework from automatically gating
|
|
|
|
clocks that have not been explicitly enabled by a Linux
|
|
|
|
device driver but are enabled in hardware at reset or
|
|
|
|
by the bootloader/firmware. Note that this does not
|
|
|
|
force such clocks to be always-on nor does it reserve
|
|
|
|
those clocks in any way. This parameter is useful for
|
|
|
|
debug and development, but should not be needed on a
|
|
|
|
platform with proper driver support. For more
|
2018-05-07 17:35:44 +08:00
|
|
|
information, see Documentation/driver-api/clk.rst.
|
2008-01-26 21:10:36 +08:00
|
|
|
|
2007-07-31 15:37:59 +08:00
|
|
|
clock= [BUGS=X86-32, HW] gettimeofday clocksource override.
|
2006-06-26 15:25:05 +08:00
|
|
|
[Deprecated]
|
2006-10-04 04:45:33 +08:00
|
|
|
Forces specified clocksource (if available) to be used
|
2006-06-26 15:25:05 +08:00
|
|
|
when calculating gettimeofday(). If specified
|
2006-10-04 04:45:33 +08:00
|
|
|
clocksource is not available, it defaults to PIT.
|
2005-04-17 06:20:36 +08:00
|
|
|
Format: { pit | tsc | cyclone | pmtmr }
|
|
|
|
|
2010-07-14 08:56:20 +08:00
|
|
|
clocksource= Override the default clocksource
|
2007-05-24 04:58:16 +08:00
|
|
|
Format: <string>
|
|
|
|
Override the default clocksource and use the clocksource
|
|
|
|
with the name specified.
|
|
|
|
Some clocksource names to choose from, depending on
|
|
|
|
the platform:
|
|
|
|
[all] jiffies (this is the base, fallback clocksource)
|
|
|
|
[ACPI] acpi_pm
|
|
|
|
[ARM] imx_timer1,OSTS,netx_timer,mpu_timer2,
|
|
|
|
pxa_timer,timer3,32k_counter,timer0_1
|
2010-08-24 05:49:11 +08:00
|
|
|
[X86-32] pit,hpet,tsc;
|
2007-05-24 04:58:16 +08:00
|
|
|
scx200_hrt on Geode; cyclone on IBM x440
|
|
|
|
[MIPS] MIPS
|
|
|
|
[PARISC] cr16
|
|
|
|
[S390] tod
|
|
|
|
[SH] SuperH
|
|
|
|
[SPARC64] tick
|
|
|
|
[X86-64] hpet,tsc
|
|
|
|
|
2016-06-28 00:30:13 +08:00
|
|
|
clocksource.arm_arch_timer.evtstrm=
|
|
|
|
[ARM,ARM64]
|
|
|
|
Format: <bool>
|
|
|
|
Enable/disable the eventstream feature of the ARM
|
|
|
|
architected timer so that code using WFE-based polling
|
|
|
|
loops can be debugged more effectively on production
|
|
|
|
systems.
|
|
|
|
|
clocksource: Retry clock read if long delays detected
When the clocksource watchdog marks a clock as unstable, this might be due
to that clock being unstable or it might be due to delays that happen to
occur between the reads of the two clocks. Yes, interrupts are disabled
across those two reads, but there are no shortage of things that can delay
interrupts-disabled regions of code ranging from SMI handlers to vCPU
preemption. It would be good to have some indication as to why the clock
was marked unstable.
Therefore, re-read the watchdog clock on either side of the read from the
clock under test. If the watchdog clock shows an excessive time delta
between its pair of reads, the reads are retried.
The maximum number of retries is specified by a new kernel boot parameter
clocksource.max_cswd_read_retries, which defaults to three, that is, up to
four reads, one initial and up to three retries. If more than one retry
was required, a message is printed on the console (the occasional single
retry is expected behavior, especially in guest OSes). If the maximum
number of retries is exceeded, the clock under test will be marked
unstable. However, the probability of this happening due to various sorts
of delays is quite small. In addition, the reason (clock-read delays) for
the unstable marking will be apparent.
Reported-by: Chris Mason <clm@fb.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Feng Tang <feng.tang@intel.com>
Link: https://lore.kernel.org/r/20210527190124.440372-1-paulmck@kernel.org
2021-05-28 03:01:19 +08:00
|
|
|
clocksource.max_cswd_read_retries= [KNL]
|
|
|
|
Number of clocksource_watchdog() retries due to
|
|
|
|
external delays before the clock will be marked
|
|
|
|
unstable. Defaults to three retries, that is,
|
|
|
|
four attempts to read the clock under test.
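The retry behaviour described above can be sketched as follows. This is a simplified model, not the kernel's clocksource_watchdog() code: the function name, the callables and the max_delay_ns threshold are hypothetical names chosen for illustration.

```python
def read_with_retries(read_watchdog, read_clock, max_delay_ns, max_retries=3):
    """Read the clock under test, bracketed by two watchdog reads.

    Returns (value, retries_used) on success, or (None, max_retries) when
    every attempt was delayed, in which case the caller would mark the
    clock unstable.
    """
    # One initial attempt plus up to max_retries retries.
    for attempt in range(max_retries + 1):
        wd_before = read_watchdog()
        value = read_clock()
        wd_after = read_watchdog()
        # A small watchdog delta means the read was not delayed externally.
        if wd_after - wd_before <= max_delay_ns:
            return value, attempt
    return None, max_retries


def demo():
    # Fake watchdog: the first pair of reads is 1000 ns apart (delayed),
    # the second pair is 10 ns apart (acceptable).
    readings = iter([0, 1000, 2000, 2010])
    return read_with_retries(lambda: next(readings), lambda: 42, max_delay_ns=100)
```

With the default of three retries the loop runs at most four times, matching the "four attempts" wording above.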
|
|
|
|
|
2021-05-28 03:01:21 +08:00
|
|
|
clocksource.verify_n_cpus= [KNL]
|
|
|
|
Limit the number of CPUs checked for clocksources
|
|
|
|
marked with CLOCK_SOURCE_VERIFY_PERCPU that
|
|
|
|
are marked unstable due to excessive skew.
|
|
|
|
A negative value says to check all CPUs, while
|
|
|
|
zero says not to check any. Values larger than
|
|
|
|
nr_cpu_ids are silently truncated to nr_cpu_ids.
|
|
|
|
The actual CPUs are chosen randomly, with
|
|
|
|
no replacement if the same CPU is chosen twice.
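As a rough illustration of the selection rules above (negative means all CPUs, zero means none, large values are truncated, duplicates are not replaced), consider the sketch below. The function name and the use of Python's random module are assumptions made for the example; the kernel's own selection code differs in detail.

```python
import random


def cpus_to_verify(verify_n_cpus, nr_cpu_ids, rng=random):
    """Return the set of CPU numbers that would be checked."""
    if verify_n_cpus < 0:
        # A negative value says to check all CPUs.
        return set(range(nr_cpu_ids))
    # Values larger than nr_cpu_ids are silently truncated.
    wanted = min(verify_n_cpus, nr_cpu_ids)
    chosen = set()
    for _ in range(wanted):
        # A duplicate pick is not replaced, so fewer than `wanted`
        # distinct CPUs may end up being checked.
        chosen.add(rng.randrange(nr_cpu_ids))
    return chosen
```

Because duplicates are simply absorbed by the set, the number of CPUs actually checked can be smaller than the value given on the command line.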
|
|
|
|
|
clocksource: Provide kernel module to test clocksource watchdog
When the clocksource watchdog marks a clock as unstable, this might
be due to that clock being unstable or it might be due to delays that
happen to occur between the reads of the two clocks. It would be good
to have a way of testing the clocksource watchdog's ability to
distinguish between these two causes of clock skew and instability.
Therefore, provide a new clocksource-wdtest module selected by a new
TEST_CLOCKSOURCE_WATCHDOG Kconfig option. This module has a single module
parameter named "holdoff" that provides the number of seconds of delay
before testing should start, which defaults to zero when built as a module
and to 10 seconds when built directly into the kernel. Very large systems
that boot slowly may need to increase the value of this module parameter.
This module uses hand-crafted clocksource structures to do its testing,
thus avoiding messing up timing for the rest of the kernel and for user
applications. This module first verifies that the ->uncertainty_margin
field of the clocksource structures are set sanely. It then tests the
delay-detection capability of the clocksource watchdog, increasing the
number of consecutive delays injected, first provoking console messages
complaining about the delays and finally forcing a clock-skew event.
Unexpected test results cause at least one WARN_ON_ONCE() console splat.
If there are no splats, the test has passed. Finally, it fuzzes the
value returned from a clocksource to test the clocksource watchdog's
ability to detect time skew.
This module checks the state of its clocksource after each test, and
uses WARN_ON_ONCE() to emit a console splat if there are any failures.
This should enable all types of test frameworks to detect any such
failures.
This facility is intended for diagnostic use only, and should be avoided
on production systems.
Reported-by: Chris Mason <clm@fb.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Feng Tang <feng.tang@intel.com>
Link: https://lore.kernel.org/r/20210527190124.440372-5-paulmck@kernel.org
2021-05-28 03:01:23 +08:00
|
|
|
clocksource-wdtest.holdoff= [KNL]
|
|
|
|
Set the time in seconds that the clocksource
|
|
|
|
watchdog test waits before commencing its tests.
|
|
|
|
Defaults to zero when built as a module and to
|
|
|
|
10 seconds when built into the kernel.
|
|
|
|
|
2020-09-08 05:39:19 +08:00
|
|
|
clearcpuid=BITNUM[,BITNUM...] [X86]
|
2008-01-30 20:33:21 +08:00
|
|
|
Disable CPUID feature X for the kernel. See
|
2016-01-27 05:12:04 +08:00
|
|
|
arch/x86/include/asm/cpufeatures.h for the valid bit
|
2009-01-07 06:42:41 +08:00
|
|
|
numbers. Note the Linux specific bits are not necessarily
|
2008-01-30 20:33:21 +08:00
|
|
|
stable over kernel options, but the vendor specific
|
|
|
|
ones should be.
|
|
|
|
Also note that user programs calling CPUID directly
|
|
|
|
or using the feature without checking anything
|
|
|
|
will still see it. This just prevents it from
|
|
|
|
being used by the kernel or shown in /proc/cpuinfo.
|
|
|
|
Also note the kernel might malfunction if you disable
|
|
|
|
some critical bits.
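The header writes each feature as a 32-bit word index times 32 plus a bit position, so a BITNUM can be computed as in the sketch below. The helper names are hypothetical, and the (word, bit) pairs in the usage note are only illustrative; always take the real values from the cpufeatures.h of the kernel being booted.

```python
def cpuid_bitnum(word, bit):
    """Combine a feature word index and a bit position into one BITNUM."""
    if not 0 <= bit < 32:
        raise ValueError("bit must be in the range 0..31")
    return word * 32 + bit


def clearcpuid_arg(bitnums):
    """Format a list of bit numbers as clearcpuid=BITNUM[,BITNUM...]."""
    return "clearcpuid=" + ",".join(str(n) for n in bitnums)
```

For example, clearcpuid_arg([cpuid_bitnum(0, 25), cpuid_bitnum(0, 26)]) builds the string "clearcpuid=25,26".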
|
|
|
|
|
2014-06-05 07:06:54 +08:00
|
|
|
cma=nn[MG]@[start[MG][-end[MG]]]
|
2020-09-18 15:05:58 +08:00
|
|
|
[KNL,CMA]
|
2014-06-05 07:06:54 +08:00
|
|
|
Sets the size of kernel global memory area for
|
|
|
|
contiguous memory allocations and optionally the
|
|
|
|
placement constraint by the physical address range of
|
2014-10-10 06:29:41 +08:00
|
|
|
memory allocations. A value of 0 disables CMA
|
|
|
|
altogether. For more information, see
|
2020-09-11 16:56:52 +08:00
|
|
|
kernel/dma/contiguous.c
|
2011-12-29 20:09:51 +08:00
|
|
|
|
2020-08-24 07:03:07 +08:00
|
|
|
cma_pernuma=nn[MG]
|
2021-01-25 12:32:02 +08:00
|
|
|
[ARM64,KNL,CMA]
|
2020-08-24 07:03:07 +08:00
|
|
|
Sets the size of kernel per-numa memory area for
|
|
|
|
contiguous memory allocations. A value of 0 disables
|
|
|
|
per-numa CMA altogether. If this option is not
|
|
|
|
specified, the default value is 0.
|
|
|
|
With per-numa CMA enabled, DMA users on node nid will
|
|
|
|
first try to allocate buffer from the pernuma area
|
|
|
|
which is located in node nid, if the allocation fails,
|
|
|
|
they will fall back to the global default memory area.
|
2011-12-29 20:09:51 +08:00
|
|
|
|
2009-04-15 13:55:32 +08:00
|
|
|
cmo_free_hint= [PPC] Format: { yes | no }
|
|
|
|
Specify whether pages are marked as being inactive
|
|
|
|
when they are freed. This is used in CMO environments
|
|
|
|
to determine OS memory pressure for page stealing by
|
|
|
|
a hypervisor.
|
|
|
|
Default: yes
|
|
|
|
|
2011-12-29 20:09:51 +08:00
|
|
|
coherent_pool=nn[KMG] [ARM,KNL]
|
|
|
|
Sets the size of memory pool for coherent, atomic dma
|
2012-07-30 15:11:33 +08:00
|
|
|
allocations, by default set to 256K.
|
2011-12-29 20:09:51 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
com20020= [HW,NET] ARCnet - COM20020 chipset
|
2005-10-24 03:57:11 +08:00
|
|
|
Format:
|
|
|
|
<io>[,<irq>[,<nodeID>[,<backplane>[,<ckp>[,<timeout>]]]]]
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
com90io= [HW,NET] ARCnet - COM90xx chipset (IO-mapped buffers)
|
|
|
|
Format: <io>[,<irq>]
|
|
|
|
|
2005-10-24 03:57:11 +08:00
|
|
|
com90xx= [HW,NET]
|
|
|
|
ARCnet - COM90xx chipset (memory-mapped buffers)
|
2005-04-17 06:20:36 +08:00
|
|
|
Format: <io>[,<irq>[,<memstart>]]
|
|
|
|
|
|
|
|
condev= [HW,S390] console device
|
|
|
|
conmode=
|
2005-10-24 03:57:11 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
console= [KNL] Output console device and options.
|
|
|
|
|
|
|
|
tty<n> Use the virtual console device <n>.
|
|
|
|
|
|
|
|
ttyS<n>[,options]
|
2006-03-25 19:08:17 +08:00
|
|
|
ttyUSB0[,options]
|
2005-04-17 06:20:36 +08:00
|
|
|
Use the specified serial port. The options are of
|
2006-03-25 19:08:17 +08:00
|
|
|
the form "bbbbpnf", where "bbbb" is the baud rate,
|
|
|
|
"p" is parity ("n", "o", or "e"), "n" is number of
|
|
|
|
bits, and "f" is flow control ("r" for RTS or
|
|
|
|
omit it). Default is "9600n8".
|
|
|
|
|
2016-11-03 18:10:10 +08:00
|
|
|
See Documentation/admin-guide/serial-console.rst for more
|
2006-03-25 19:08:17 +08:00
|
|
|
information. See
|
2020-05-01 00:04:02 +08:00
|
|
|
Documentation/networking/netconsole.rst for an
|
2006-03-25 19:08:17 +08:00
|
|
|
alternative.
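The "bbbbpnf" option string for ttyS<n> can be decoded as in this minimal sketch. It follows only the description above and the stated default of "9600n8"; the function name and the returned dictionary keys are made up for the example, and the kernel's own parser may accept or reject inputs differently.

```python
import re

# baud rate, optional parity (n/o/e), optional data bits, optional "r" for RTS
_OPTIONS = re.compile(r"^(\d+)([noe])?(\d)?(r)?$")


def parse_serial_options(options="9600n8"):
    """Split a serial console option string into its documented fields."""
    match = _OPTIONS.match(options)
    if match is None:
        raise ValueError("unrecognised serial options: %r" % options)
    baud, parity, bits, flow = match.groups()
    return {
        "baud": int(baud),
        "parity": parity or "n",   # assumed default, taken from "9600n8"
        "bits": int(bits or 8),    # assumed default, taken from "9600n8"
        "rts_flow": flow == "r",
    }
```

For instance, console=ttyS0,115200n8r would pass "115200n8r" to this helper.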
|
2005-04-17 06:20:36 +08:00
|
|
|
|
serial: convert early_uart to earlycon for 8250
Because SERIAL_PORT_DFNS is removed from include/asm-i386/serial.h and
include/asm-x86_64/serial.h. the serial8250_ports need to be probed late in
serial initializing stage. the console_init=>serial8250_console_init=>
register_console=>serial8250_console_setup will return -ENDEV, and console
ttyS0 can not be enabled at that time. need to wait till uart_add_one_port in
drivers/serial/serial_core.c to call register_console to get console ttyS0.
that is too late.
Make early_uart to use early_param, so uart console can be used earlier. Make
it to be bootconsole with CON_BOOT flag, so can use console handover feature.
and it will switch to corresponding normal serial console automatically.
new command line will be:
console=uart8250,io,0x3f8,9600n8
console=uart8250,mmio,0xff5e0000,115200n8
or
earlycon=uart8250,io,0x3f8,9600n8
earlycon=uart8250,mmio,0xff5e0000,115200n8
it will print in very early stage:
Early serial console at I/O port 0x3f8 (options '9600n8')
console [uart0] enabled
later for console it will print:
console handover: boot [uart0] -> real [ttyS0]
Signed-off-by: <yinghai.lu@sun.com>
Cc: Andi Kleen <ak@suse.de>
Cc: Bjorn Helgaas <bjorn.helgaas@hp.com>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Gerd Hoffmann <kraxel@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-07-16 14:37:59 +08:00
|
|
|
uart[8250],io,<addr>[,options]
|
|
|
|
uart[8250],mmio,<addr>[,options]
|
2015-10-28 11:46:05 +08:00
|
|
|
uart[8250],mmio16,<addr>[,options]
|
earlycon: 8250: Document kernel command line options
Document the expected behavior of kernel command lines of the forms:
console=uart[8250],io|mmio|mmio32,<addr>[,options]
console=uart[8250],<addr>[,options]
and
earlycon=uart[8250],io|mmio|mmio32,<addr>[,options]
earlycon=uart[8250],<addr>[,options]
Signed-off-by: Peter Hurley <peter@hurleysoftware.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2015-04-06 22:52:39 +08:00
|
|
|
uart[8250],mmio32,<addr>[,options]
|
|
|
|
uart[8250],0x<addr>[,options]
|
2005-04-17 06:20:36 +08:00
|
|
|
Start an early, polled-mode console on the 8250/16550
|
|
|
|
UART at the specified I/O port or MMIO address,
|
|
|
|
switching to the matching ttyS device later.
|
|
|
|
MMIO inter-register address stride is either 8-bit
|
2015-10-28 11:46:05 +08:00
|
|
|
(mmio), 16-bit (mmio16), or 32-bit (mmio32).
|
|
|
|
If none of [io|mmio|mmio16|mmio32], <addr> is assumed
|
|
|
|
to be equivalent to 'mmio'. 'options' are specified in
|
|
|
|
the same format described for ttyS above; if unspecified,
|
|
|
|
the h/w is not re-initialized.
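The uart[8250] forms listed above can be told apart with a small parser such as the following sketch. The function name and return shape are hypothetical; the only rules encoded are the ones stated above, including the fallback to 'mmio' when no access type is given.

```python
IO_TYPES = ("io", "mmio", "mmio16", "mmio32")


def parse_uart_console(value):
    """Parse the value of console=uart[8250],... into (iotype, addr, options)."""
    parts = value.split(",")
    if parts[0] not in ("uart", "uart8250"):
        raise ValueError("not a uart[8250] console: %r" % value)
    rest = parts[1:]
    if rest and rest[0] in IO_TYPES:
        iotype, rest = rest[0], rest[1:]
    else:
        # No access type given: <addr> is assumed to be 'mmio'.
        iotype = "mmio"
    if not rest:
        raise ValueError("missing address in %r" % value)
    addr = int(rest[0], 0)  # base 0 accepts the 0x prefix
    # options is None when unspecified (the h/w is not re-initialized).
    options = rest[1] if len(rest) > 1 else None
    return iotype, addr, options
```

An options value of None corresponds to the case where the hardware is left as the firmware configured it.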
|
|
|
|
|
2013-02-26 04:54:09 +08:00
|
|
|
hvc<n> Use the hypervisor console device <n>. This is for
|
|
|
|
both Xen and PowerPC hypervisors.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2018-04-19 02:51:39 +08:00
|
|
|
If the device connected to the port is not a TTY but a braille
|
|
|
|
device, prepend "brl," before the device type, for instance
|
2008-04-30 15:54:51 +08:00
|
|
|
console=brl,ttyS0
|
|
|
|
For now, only VisioBraille is supported.
|
|
|
|
|
printk: add console_msg_format command line option
0day and kernelCI automatically parse kernel log - basically some sort
of grepping using the pre-defined text patterns - in order to detect
and report regressions/errors. There are several sources they get the
kernel logs from:
a) dmesg or /proc/kmsg
This is the preferred way. Because `dmesg --raw' (see later Note)
and /proc/kmsg output contains facility and log level, which greatly
simplifies grepping for EMERG/ALERT/CRIT/ERR messages.
b) serial consoles
This option is harder to maintain, because serial console messages
don't contain facility and log level.
This patch introduces a `console_msg_format=' command line option,
to switch between different message formatting on serial consoles.
For the time being we have just two options - default and syslog.
The "default" option just keeps the existing format. While the
"syslog" option makes serial console messages to appear in syslog
format [syslog() syscall], matching the `dmesg -S --raw' and
`cat /proc/kmsg' output formats:
- facility and log level
- time stamp (depends on printk_time/PRINTK_TIME)
- message
<%u>[time stamp] text\n
NOTE: while Kevin and Fengguang talk about "dmesg --raw", it's actually
"dmesg -S --raw" that always prints messages in syslog format [per
Petr Mladek]. Running "dmesg --raw" may produce output in non-syslog
format sometimes. console_msg_format=syslog enables syslog format,
thus in documentation we mention "dmesg -S --raw", not "dmesg --raw".
Per Kevin Hilman:
: Right now we can get this info from a "dmesg --raw" after bootup,
: but it would be really nice in certain automation frameworks to
: have a kernel command-line option to enable printing of loglevels
: in default boot log.
:
: This is especially useful when ingesting kernel logs into advanced
: search/analytics frameworks (I'm playing with and ELK stack: Elastic
: Search, Logstash, Kibana).
:
: The other important reason for having this on the command line is that
: for testing linux-next (and other bleeding edge developer branches),
: it's common that we never make it to userspace, so can't even run
: "dmesg --raw" (or equivalent.) So we really want this on the primary
: boot (serial) console.
Per Fengguang Wu, 0day scripts should quickly benefit from that
feature, because they will be able to switch to a more reliable
parsing, based on messages' facility and log levels [1]:
`#{grep} -a -E -e '^<[0123]>' -e '^kern :(err |crit |alert |emerg )'
instead of doing text pattern matching
`#{grep} -a -F -f /lkp/printk-error-messages #{kmsg_file} |
grep -a -v -E -f #{LKP_SRC}/etc/oops-pattern |
grep -a -v -F -f #{LKP_SRC}/etc/kmsg-blacklist`
[1] https://github.com/fengguang/lkp-tests/blob/master/lib/dmesg.rb
Link: http://lkml.kernel.org/r/20171221054149.4398-1-sergey.senozhatsky@gmail.com
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Fengguang Wu <fengguang.wu@intel.com>
Cc: Kevin Hilman <khilman@baylibre.com>
Cc: Mark Brown <broonie@kernel.org>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: LKML <linux-kernel@vger.kernel.org>
Signed-off-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reviewed-by: Fengguang Wu <fengguang.wu@intel.com>
Reviewed-by: Kevin Hilman <khilman@baylibre.com>
Tested-by: Kevin Hilman <khilman@baylibre.com>
Signed-off-by: Petr Mladek <pmladek@suse.com>
2017-12-21 13:41:49 +08:00
|
|
|
console_msg_format=
|
|
|
|
[KNL] Change console messages format
|
|
|
|
default
|
|
|
|
By default we print messages on consoles in
|
|
|
|
"[time stamp] text\n" format (time stamp may not be
|
|
|
|
printed, depending on CONFIG_PRINTK_TIME or
|
|
|
|
`printk_time' param).
|
|
|
|
syslog
|
|
|
|
Switch to syslog format: "<%u>[time stamp] text\n"
|
|
|
|
IOW, each message will have a facility and loglevel
|
|
|
|
prefix. The format is similar to one used by syslog()
|
|
|
|
syscall, or to executing "dmesg -S --raw" or to reading
|
|
|
|
from /proc/kmsg.
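A log scraper can split a syslog-format console line into its parts as sketched below. The function name is hypothetical. The "<%u>" number is decoded using the usual syslog convention of facility * 8 + level; the time stamp is treated as optional because it depends on printk_time.

```python
import re

_LINE = re.compile(r"^<(\d+)>(?:\[\s*(\d+\.\d+)\]\s)?(.*)$")


def parse_syslog_console_line(line):
    """Return (facility, level, timestamp_or_None, text) for one console line."""
    match = _LINE.match(line)
    if match is None:
        raise ValueError("not in syslog console format: %r" % line)
    prefix = int(match.group(1))
    facility, level = prefix >> 3, prefix & 7
    return facility, level, match.group(2), match.group(3)
```

Filtering on level <= 3 then selects the EMERG, ALERT, CRIT and ERR messages without any text pattern matching.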
|
|
|
|
|
2009-06-17 06:33:52 +08:00
|
|
|
consoleblank= [KNL] The console blank (screen saver) timeout in
|
2017-09-19 13:21:25 +08:00
|
|
|
seconds. A value of 0 disables the blank timer.
|
2018-04-19 02:51:39 +08:00
|
|
|
Defaults to 0.
|
2009-06-17 06:33:52 +08:00
|
|
|
|
2009-01-07 06:42:47 +08:00
|
|
|
coredump_filter=
|
|
|
|
[KNL] Change the default value for
|
|
|
|
/proc/<pid>/coredump_filter.
|
2020-04-03 01:26:14 +08:00
|
|
|
See also Documentation/filesystems/proc.rst.
|
2009-01-07 06:42:47 +08:00
|
|
|
|
2017-06-06 04:15:12 +08:00
|
|
|
coresight_cpu_debug.enable
|
|
|
|
[ARM,ARM64]
|
|
|
|
Format: <bool>
|
|
|
|
Enable/disable the CPU sampling based debugging.
|
|
|
|
0: default value, disable debugging
|
|
|
|
1: enable debugging at boot time
|
|
|
|
|
2011-04-02 06:13:10 +08:00
|
|
|
cpuidle.off=1 [CPU_IDLE]
|
|
|
|
disable the cpuidle sub-system
|
|
|
|
|
2018-12-06 06:45:34 +08:00
|
|
|
cpuidle.governor=
|
|
|
|
[CPU_IDLE] Name of the cpuidle governor to use.
|
|
|
|
|
2017-03-01 05:44:16 +08:00
|
|
|
cpufreq.off=1 [CPU_FREQ]
|
|
|
|
disable the cpufreq sub-system
|
|
|
|
|
cpufreq: Specify default governor on command line
Currently, the only way to specify the default CPUfreq governor is
via Kconfig options, which suits users who can build the kernel
themselves perfectly.
However, for those who use a distro-like kernel (such as Android,
with the Generic Kernel Image project), the only way to use a
non-default governor is to boot to userspace, and to then switch
using the sysfs interface. Being able to specify the default governor
on the command line, like is the case for cpuidle, would allow those
users to specify their governor of choice earlier on, and to simplify
the userspace boot procedure slightly.
To support this use-case, add a kernel command line parameter
allowing the default governor for CPUfreq to be specified, which
takes precedence over the built-in default.
This implementation has one notable limitation: the default governor
must be registered before the driver. This is solved for builtin
governors and drivers using appropriate *_initcall() functions. And
in the modular case, this must be reflected as a constraint on the
module loading order.
Signed-off-by: Quentin Perret <qperret@google.com>
[ Viresh: Converted 'default_governor' to a string and parsing it only
at initcall level, and several updates to
cpufreq_init_policy(). ]
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
[ rjw: Changelog ]
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2020-06-29 16:25:00 +08:00
|
|
|
cpufreq.default_governor=
|
|
|
|
[CPU_FREQ] Name of the default cpufreq governor or
|
|
|
|
policy to use. This governor must be registered in the
|
|
|
|
kernel before the cpufreq driver probes.
|
|
|
|
|
2015-05-12 05:27:09 +08:00
|
|
|
cpu_init_udelay=N
|
|
|
|
[X86] Delay for N microsec between assert and de-assert
|
|
|
|
of APIC INIT to start processors. This delay occurs
|
|
|
|
on every CPU online, such as boot, and resume from suspend.
|
|
|
|
Default: 10000
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
cpcihp_generic= [HW,PCI] Generic port I/O CompactPCI driver
|
2005-10-24 03:57:11 +08:00
|
|
|
Format:
|
|
|
|
<first_slot>,<last_slot>,<port>,<enum_bit>[,<debug>]
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2011-02-21 12:08:35 +08:00
|
|
|
crashkernel=size[KMG][@offset[KMG]]
|
|
|
|
[KNL] Using kexec, Linux can switch to a 'crash kernel'
|
|
|
|
upon panic. This parameter reserves the physical
|
|
|
|
memory region [offset, offset + size] for that kernel
|
|
|
|
image. If '@offset' is omitted, then a suitable offset
|
2019-04-22 11:19:05 +08:00
|
|
|
is selected automatically.
|
2020-08-10 10:49:41 +08:00
|
|
|
[KNL, X86-64] Select a region under 4G first, and
|
2019-04-22 11:19:05 +08:00
|
|
|
fall back to reserve region above 4G when '@offset'
|
|
|
|
hasn't been specified.
|
2019-06-14 02:21:39 +08:00
|
|
|
See Documentation/admin-guide/kdump/kdump.rst for further details.
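The size[KMG][@offset[KMG]] form can be read as in the sketch below. Both helper names are hypothetical, and only the K, M and G suffixes shown in the format above are handled.

```python
import re

_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Convert an amount[KMG] string into a number of bytes."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text)
    if match is None:
        raise ValueError("bad size: %r" % text)
    return int(match.group(1)) * _UNITS[match.group(2)]


def parse_crashkernel(value):
    """Return (size, offset) in bytes; offset is None when '@offset' is omitted."""
    size_text, at_sign, offset_text = value.partition("@")
    offset = parse_size(offset_text) if at_sign else None
    return parse_size(size_text), offset
```

An offset of None corresponds to the case where a suitable offset is selected automatically.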
|
2005-06-26 05:57:52 +08:00
|
|
|
|
2007-10-19 14:41:02 +08:00
|
|
|
crashkernel=range1:size1[,range2:size2,...][@offset]
|
|
|
|
[KNL] Same as above, but depends on the memory
|
|
|
|
in the running system. The syntax of range is
|
|
|
|
start-[end] where start and end are both
|
|
|
|
a memory unit (amount[KMG]). See also
|
2019-06-14 02:21:39 +08:00
|
|
|
Documentation/admin-guide/kdump/kdump.rst for an example.
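The range form picks a reservation size according to the amount of RAM in the running system. The sketch below models that choice; the function names are hypothetical, and treating each range as start <= RAM < end (with an open end meaning "no upper limit") is an assumption about boundary handling.

```python
import re

_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Convert an amount[KMG] string into a number of bytes."""
    match = re.fullmatch(r"(\d+)([KMG]?)", text)
    if match is None:
        raise ValueError("bad size: %r" % text)
    return int(match.group(1)) * _UNITS[match.group(2)]


def crash_size_for(value, system_ram):
    """Return the reserved size in bytes for this much RAM, or 0 if no range matches."""
    ranges, _, _offset = value.partition("@")  # the optional @offset is ignored here
    for item in ranges.split(","):
        span, _, size = item.partition(":")
        start, _, end = span.partition("-")
        low = parse_size(start)
        high = parse_size(end) if end else float("inf")
        if low <= system_ram < high:
            return parse_size(size)
    return 0
```

With the value "512M-2G:64M,2G-:128M", a machine with 1G of RAM gets 64M, one with 4G gets 128M, and one with 256M gets no reservation.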
|
2007-10-19 14:41:02 +08:00
|
|
|
|
x86, kdump: Change crashkernel_high/low= to crashkernel=,high/low
Per hpa, use crashkernel=X,high crashkernel=Y,low instead of
crashkernel_hign=X crashkernel_low=Y. As that could be extensible.
-v2: according to Vivek, change delimiter to ;
-v3: let hign and low only handle simple form and it conforms to
description in kernel-parameters.txt
still keep crashkernel=X override any crashkernel=X,high
crashkernel=Y,low
-v4: update get_last_crashkernel returning and add more strict
checking in parse_crashkernel_simple() found by HATAYAMA.
-v5: Change delimiter back to , according to HPA.
also separate parse_suffix from parse_simper according to vivek.
so we can avoid @pos in that path.
-v6: Tight the checking about crashkernel=X,highblahblah,high
found by HTYAYAMA.
Cc: HATAYAMA Daisuke <d.hatayama@jp.fujitsu.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1366089828-19692-5-git-send-email-yinghai@kernel.org
Acked-by: Vivek Goyal <vgoyal@redhat.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
2013-04-16 13:23:48 +08:00
|
|
|
crashkernel=size[KMG],high
|
2020-08-10 10:49:41 +08:00
|
|
|
[KNL, X86-64] range could be above 4G. Allow kernel
|
2013-04-16 13:23:47 +08:00
|
|
|
to allocate physical memory region from top, so could
|
|
|
|
be above 4G if the system has more than 4G of RAM installed.
|
|
|
|
Otherwise memory region will be allocated below 4G, if
|
|
|
|
available.
|
|
|
|
It will be ignored if crashkernel=X is specified.
|
|
|
|
crashkernel=size[KMG],low
|
2020-08-10 10:49:41 +08:00
|
|
|
[KNL, X86-64] range under 4G. When crashkernel=X,high
|
|
|
|
is passed, kernel could allocate physical memory region
|
2013-04-16 13:23:45 +08:00
|
|
|
above 4G, which can cause the second kernel to crash on systems
|
|
|
|
that require some amount of low memory, e.g. swiotlb
|
2015-09-24 16:51:25 +08:00
|
|
|
requires at least 64M+32K low memory, also enough extra
|
|
|
|
low memory is needed to make sure DMA buffers for 32-bit
|
|
|
|
devices won't run out. Kernel would try to allocate at
|
|
|
|
least 256M below 4G automatically.
|
2013-04-16 13:23:45 +08:00
|
|
|
This option lets the user specify their own low range under 4G
|
|
|
|
for second kernel instead.
|
|
|
|
0: to disable low allocation.
|
|
|
|
It will be ignored when crashkernel=X,high is not used
|
2013-04-16 13:23:47 +08:00
|
|
|
or memory reserved is below 4G.
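The precedence between the plain, ",high" and ",low" forms can be summarised by the sketch below. It models only the command-line rules stated above; the function name and result keys are hypothetical, and the "memory reserved is below 4G" case is not modelled because it depends on where the reservation actually lands at boot.

```python
def effective_crashkernel(values):
    """Given every crashkernel= value on the command line, return the ones honoured."""
    plain = high = low = None
    for value in values:
        if value.endswith(",high"):
            high = value[: -len(",high")]
        elif value.endswith(",low"):
            low = value[: -len(",low")]
        else:
            plain = value
    if plain is not None:
        # crashkernel=X makes crashkernel=X,high (and with it ,low) ignored.
        return {"plain": plain}
    if high is not None:
        honoured = {"high": high}
        if low is not None:
            honoured["low"] = low
        return honoured
    # ,low on its own is ignored because ,high is not used.
    return {}
```

For example, passing ["256M,high", "128M,low"] keeps both values, while ["512M", "256M,high"] keeps only the plain 512M request.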
|
2013-04-16 13:23:45 +08:00
|
|
|
|
2016-05-03 17:00:17 +08:00
|
|
|
cryptomgr.notests
|
2018-04-19 02:51:39 +08:00
|
|
|
[KNL] Disable crypto self-tests
|
2016-05-03 17:00:17 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
cs89x0_dma= [HW,NET]
|
|
|
|
Format: <dma>
|
|
|
|
|
|
|
|
cs89x0_media= [HW,NET]
|
|
|
|
Format: { rj45 | aui | bnc }
|
2005-10-24 03:57:11 +08:00
|
|
|
|
2021-03-01 18:13:34 +08:00
|
|
|
csdlock_debug= [KNL] Enable debug add-ons of cross-CPU function call
|
|
|
|
handling. When switched on, additional debug data is
|
|
|
|
printed to the console in case a hanging CPU is
|
|
|
|
detected, and that CPU is pinged again in order to try
|
|
|
|
to resolve the hang situation.
|
locking/csd_lock: Add more data to CSD lock debugging
In order to help identifying problems with IPI handling and remote
function execution add some more data to IPI debugging code.
There have been multiple reports of CPUs looping long times (many
seconds) in smp_call_function_many() waiting for another CPU executing
a function like tlb flushing. Most of these reports have been for
cases where the kernel was running as a guest on top of KVM or Xen
(there are rumours of that happening under VMWare, too, and even on
bare metal).
Finding the root cause hasn't been successful yet, even after more than
2 years of chasing this bug by different developers.
Commit:
35feb60474bf4f7 ("kernel/smp: Provide CSD lock timeout diagnostics")
tried to address this by adding some debug code and by issuing another
IPI when a hang was detected. This helped mitigating the problem
(the repeated IPI unlocks the hang), but the root cause is still unknown.
Current available data suggests that either an IPI wasn't sent when it
should have been, or that the IPI didn't result in the target CPU
executing the queued function (due to the IPI not reaching the CPU,
the IPI handler not being called, or the handler not seeing the queued
request).
Try to add more diagnostic data by introducing a global atomic counter
which is being incremented when doing critical operations (before and
        0: disable csdlock debugging (default)
        1: enable basic csdlock debugging (minor impact)
        ext: enable extended csdlock debugging (more impact,
             but more data)

dasd= [HW,NET]
        See header of drivers/s390/block/dasd_devmap.c.

db9.dev[2|3]= [HW,JOY] Multisystem joystick support via parallel port
        (one device per port)
        Format: <port#>,<type>
        See also Documentation/input/devices/joystick-parport.rst

ddebug_query= [KNL,DYNAMIC_DEBUG] Enable debug messages at early boot
        time. See
        Documentation/admin-guide/dynamic-debug-howto.rst for
        details. Deprecated, see dyndbg.

debug [KNL] Enable kernel debugging (events log level).

debug_boot_weak_hash
        [KNL] Enable printing [hashed] pointers early in the
        boot sequence. If enabled, we use a weak hash instead
        of siphash to hash pointers. Use this option if you are
        seeing instances of '(___ptrval___)' and need to see a
        value (hashed pointer) instead. Cryptographically
        insecure, please do not use on production kernels.

debug_locks_verbose=
        [KNL] verbose locking self-tests
        Format: <int>
        Print debugging info while doing the locking API
        self-tests.
        Bitmask for the various LOCKTYPE_ tests. Defaults to 0
        (no extra messages); setting it to -1 (all bits set)
        will print _a_lot_ more information - normally only
        useful to lockdep developers.

debug_objects [KNL] Enable object debugging

no_debug_objects
        [KNL] Disable object debugging

debug_guardpage_minorder=
        [KNL] When CONFIG_DEBUG_PAGEALLOC is set, this
        parameter allows control of the order of pages that will
        be intentionally kept free (and hence protected) by the
        buddy allocator. Bigger values increase the probability
        of catching random memory corruption, but reduce the
        amount of memory for normal system use. The maximum
        possible value is MAX_ORDER/2. Setting this parameter
        to 1 or 2 should be enough to identify most random
        memory corruption problems caused by bugs in kernel or
        driver code when a CPU writes to (or reads from) a
        random memory location. Note that there exists a class
        of memory corruption problems caused by buggy H/W or
        F/W or by drivers badly programming DMA (basically when
        memory is written at bus level and the CPU MMU is
        bypassed) which are not detectable by
        CONFIG_DEBUG_PAGEALLOC, hence this option will not help
        tracking down these problems.

debug_pagealloc=
        [KNL] When CONFIG_DEBUG_PAGEALLOC is set, this parameter
        enables the feature at boot time. By default, it is
        disabled and the system will work mostly the same as a
        kernel built without CONFIG_DEBUG_PAGEALLOC.
        Note: to get the most out of debug_pagealloc error reports,
        it's useful to also enable the page_owner functionality.
        on: enable the feature

debugfs= [KNL] This parameter controls what is exposed to userspace
        and debugfs internal clients.
        Format: { on, no-mount, off }
        on: All functions are enabled.
        no-mount:
                Filesystem is not registered but kernel clients can
                access APIs and a crashkernel can be used to read
                its content. There is nothing to mount.
        off: Filesystem is not registered and clients
                get -EPERM as the result when trying to register files
                or directories within debugfs.
                This is equivalent to the runtime functionality if
                debugfs was not enabled in the kernel at all.
        The default value is set at build time by the kernel configuration.

debugpat [X86] Enable PAT debugging

decnet.addr= [HW,NET]
        Format: <area>[,<node>]
        See also Documentation/networking/decnet.rst.

default_hugepagesz=
        [HW] The size of the default HugeTLB page. This is
        the size represented by the legacy /proc/ hugepages
        APIs. In addition, this is the default hugetlb size
        used for shmget(), mmap() and mounting hugetlbfs
        filesystems. If not specified, defaults to the
        architecture's default huge page size. Huge page
        sizes are architecture dependent. See also
        Documentation/admin-guide/mm/hugetlbpage.rst.
        Format: size[KMG]

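As an illustration of the size[KMG] format used by default_hugepagesz=, the sketch below converts such a string to a byte count. It is not the kernel's parser: the helper name parse_size is hypothetical, the suffixes are assumed to be binary multiples (K = 1024), and only decimal digits are accepted.

```python
import re

# Assumed binary multipliers for the optional K, M and G suffixes.
_MULTIPLIER = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Convert a 'size[KMG]' string such as '2M' to a number of bytes.

    Hypothetical helper for illustration only; it accepts decimal digits
    followed by at most one of the suffixes K, M or G.
    """
    match = re.fullmatch(r"(\d+)([KMG]?)", text.strip().upper())
    if match is None:
        raise ValueError(f"not a size[KMG] value: {text!r}")
    number, suffix = match.groups()
    return int(number) * _MULTIPLIER[suffix]


if __name__ == "__main__":
    for value in ("2M", "1G", "512"):
        print(f"default_hugepagesz={value} -> {parse_size(value)} bytes")
```

Whether a given size is usable still depends on the architecture, as noted in the entry above.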
deferred_probe_timeout=
        [KNL] Debugging option to set a timeout in seconds for
        deferred probe to give up waiting on dependencies to
        probe. Only specific dependencies (subsystems or
        drivers) that have opted in will be ignored. A timeout of 0
        will time out at the end of initcalls. This option will also
        dump out devices still on the deferred probe list after
        retrying.

dfltcc= [HW,S390]
        Format: { on | off | def_only | inf_only | always }
        on: s390 zlib hardware support for compression on
                level 1 and decompression (default)
        off: No s390 zlib hardware support
        def_only: s390 zlib hardware support for deflate
                only (compression on level 1)
        inf_only: s390 zlib hardware support for inflate
                only (decompression)
        always: Same as 'on' but ignores the selected compression
                level, always using hardware support (used for debugging)

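The dfltcc= modes above can be read as a small decision table. The sketch below encodes that reading literally; it is not taken from the s390 zlib code, and the function name and the 'deflate'/'inflate' operation labels are assumptions made for this example.

```python
MODES = ("on", "off", "def_only", "inf_only", "always")


def dfltcc_uses_hardware(mode, operation, level=1):
    """Return True if the documented mode selects hardware support.

    Hypothetical helper: 'operation' is 'deflate' (compression) or
    'inflate' (decompression); 'level' is the zlib compression level
    and is ignored for inflate.
    """
    if mode not in MODES:
        raise ValueError(f"unknown dfltcc mode: {mode!r}")
    if operation not in ("deflate", "inflate"):
        raise ValueError(f"unknown operation: {operation!r}")
    if mode == "off":
        return False
    if operation == "inflate":
        # Decompression is covered by on, inf_only and always.
        return mode in ("on", "inf_only", "always")
    if mode == "always":
        # 'always' ignores the selected compression level.
        return True
    # 'on' and 'def_only' cover compression on level 1 only.
    return mode in ("on", "def_only") and level == 1


if __name__ == "__main__":
    for mode in MODES:
        print(mode,
              dfltcc_uses_hardware(mode, "deflate", level=6),
              dfltcc_uses_hardware(mode, "inflate"))
```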
dhash_entries= [KNL]
        Set number of hash buckets for dentry cache.

disable_1tb_segments [PPC]
        Disables the use of 1TB hash page table segments. This
        causes the kernel to fall back to 256MB segments which
        can be useful when debugging issues that require an SLB
        miss to occur.

stress_slb [PPC]
        Limits the number of kernel SLB entries, and flushes
        them frequently to increase the rate of SLB faults
        on kernel addresses.

disable= [IPV6]
        See Documentation/networking/ipv6.rst.

hardened_usercopy=
        [KNL] Under CONFIG_HARDENED_USERCOPY, whether
        hardening is enabled for this boot. Hardened
        usercopy checking is used to protect the kernel
        from reading or writing beyond known memory
        allocation boundaries as a proactive defense
        against bounds-checking flaws in the kernel's
        copy_to_user()/copy_from_user() interface.
        on      Perform hardened usercopy checks (default).
        off     Disable hardened usercopy checks.

disable_radix [PPC]
        Disable RADIX MMU mode on POWER9

radix_hcall_invalidate=on [PPC/PSERIES]
        Disable RADIX GTSE feature and use hcall for TLB
        invalidate.

disable_tlbie [PPC]
        Disable TLBIE instruction. Currently does not work
        with KVM, with HASH MMU, or with coherent accelerators.

disable_cpu_apicid= [X86,APIC,SMP]
        Format: <int>
        The initial APIC ID of the CPU to be disabled at
        boot. Mostly used by the kdump 2nd kernel to
        disable the BSP, so that multiple CPUs can be woken
        up without causing a system reset or hang due to
        an INIT being sent from an AP to the BSP.

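One possible way to obtain the value for disable_cpu_apicid= is to read the "initial apicid" field that x86 kernels print in /proc/cpuinfo. The sketch below parses such text; the function name is hypothetical, the sample text is made up for the example, and treating "processor 0" as the BSP is an assumption that should be checked on the machine in question.

```python
def initial_apicid(cpuinfo_text, processor=0):
    """Return the 'initial apicid' value of one logical CPU.

    Hypothetical helper: 'cpuinfo_text' is the content of /proc/cpuinfo
    (or a sample of it) and 'processor' is the logical CPU number.
    """
    current = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator line between CPU blocks
        key, value = key.strip(), value.strip()
        if key == "processor":
            current = int(value)
        elif key == "initial apicid" and current == processor:
            return int(value)
    raise LookupError(f"no initial apicid found for processor {processor}")


# Made-up sample in the layout of /proc/cpuinfo, for illustration only.
SAMPLE = (
    "processor\t: 0\n"
    "apicid\t\t: 0\n"
    "initial apicid\t: 0\n"
    "\n"
    "processor\t: 1\n"
    "apicid\t\t: 2\n"
    "initial apicid\t: 2\n"
)

if __name__ == "__main__":
    # Assumption: the BSP is logical CPU 0 in the first kernel.
    print(f"disable_cpu_apicid={initial_apicid(SAMPLE)}")
```

On a live system the text would come from reading /proc/cpuinfo instead of SAMPLE.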
disable_ddw [PPC/PSERIES]
        Disable Dynamic DMA Window support. Use this
        to work around buggy firmware.

disable_ipv6= [IPV6]
        See Documentation/networking/ipv6.rst.

disable_mtrr_cleanup [X86]
        The kernel tries to adjust MTRR layout from continuous
        to discrete, to make X server driver able to add WB
        entry later. This parameter disables that.

disable_mtrr_trim [X86, Intel and AMD only]
        By default the kernel will trim any uncacheable
        memory out of your available memory pool based on
        MTRR settings. This parameter disables that behavior,
        possibly causing your machine to run very slowly.

disable_timer_pin_1 [X86]
        Disable PIN 1 of APIC timer
        Can be useful to work around chipset bugs.

dis_ucode_ldr [X86] Disable the microcode loader.

dma_debug=off If the kernel is compiled with DMA_API_DEBUG support,
        this option disables the debugging code at boot.

dma_debug_entries=<number>
        This option allows tuning the number of preallocated
        entries for DMA-API debugging code. One entry is
        required per DMA-API allocation. Use this if the
        DMA-API debugging code disables itself because the
        architectural default is too low.

dma_debug_driver=<driver_name>
        With this option the DMA-API debugging driver
        filter feature can be enabled at boot time. Just
        pass the driver to filter for as the parameter.
        The filter can be disabled or changed to another
        driver later using sysfs.

driver_async_probe= [KNL]
        List of driver names to be probed asynchronously.
        Format: <driver_name1>,<driver_name2>...

drm.edid_firmware=[<connector>:]<file>[,[<connector>:]<file>]
        Broken monitors, graphic adapters, KVMs and EDIDless
        panels may send no or incorrect EDID data sets.
        This parameter allows specifying EDID data sets
        in the /lib/firmware directory that are used instead.
        Generic built-in EDID data sets are used, if one of
        edid/1024x768.bin, edid/1280x1024.bin,
        edid/1680x1050.bin, or edid/1920x1080.bin is given
        and no file with the same name exists. Details and
        instructions on how to build your own EDID data are
        available in Documentation/admin-guide/edid.rst. An EDID
        data set will only be used for a particular connector,
        if its name and a colon are prepended to the EDID
        name. Each connector may use a unique EDID data
        set by separating the files with a comma. An EDID
        data set with no connector name will be used for
        any connectors not explicitly specified.

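The connector/file syntax of drm.edid_firmware= can be sketched as a small parser. This is an illustration of the documented format, not the DRM implementation; the function names and the connector names in the example (DP-1, HDMI-A-1) are assumptions chosen for the example.

```python
def parse_edid_firmware(value):
    """Split '[<connector>:]<file>[,[<connector>:]<file>]'.

    Returns (per_connector, default): a dict mapping connector names to
    files, and the file given without a connector name (or None).
    Hypothetical helper for illustration only.
    """
    per_connector = {}
    default = None
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue
        connector, sep, path = item.partition(":")
        if sep:
            per_connector[connector] = path
        else:
            default = item
    return per_connector, default


def edid_for(connector, value):
    """Pick the EDID file for one connector, falling back to the
    entry that has no connector name."""
    per_connector, default = parse_edid_firmware(value)
    return per_connector.get(connector, default)


if __name__ == "__main__":
    setting = "DP-1:edid/1920x1080.bin,edid/1024x768.bin"
    print(edid_for("DP-1", setting))
    print(edid_for("HDMI-A-1", setting))
```

With the setting above, DP-1 gets its own data set and every other connector falls back to the entry without a connector name.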
dscc4.setup= [NET]

dt_cpu_ftrs= [PPC]
        Format: {"off" | "known"}
        Control how the dt_cpu_ftrs device-tree binding is
        used for CPU feature discovery and setup (if it
        exists).
        off: Do not use it, fall back to legacy cpu table.
        known: Do not pass through unknown features to guests
                or userspace, only those that the kernel is aware of.

dump_apple_properties [X86]
        Dump name and content of EFI device properties on
        x86 Macs. Useful for driver authors to determine
        what data is available or for reverse-engineering.

dyndbg[="val"] [KNL,DYNAMIC_DEBUG]
<module>.dyndbg[="val"]
        Enable debug messages at boot time. See
        Documentation/admin-guide/dynamic-debug-howto.rst
        for details.

nopku [X86] Disable Memory Protection Keys CPU feature found
        in some Intel CPUs.

<module>.async_probe [KNL]
        Enable asynchronous probe on this module.

early_ioremap_debug [KNL]
        Enable debug messages in early_ioremap support. This
        is useful for tracking down temporary early mappings
        which are not unmapped.

earlycon= [KNL] Output early console device and options.

        When used with no options, the early console is
        determined by stdout-path property in device tree's
        chosen node or the ACPI SPCR table if supported by
        the platform.

        cdns,<addr>[,options]
                Start an early, polled-mode console on a Cadence
                (xuartps) serial port at the specified address. The
                only supported option is baud rate. If baud rate is not
                specified, the serial port must already be set up and
                configured.

        uart[8250],io,<addr>[,options]
        uart[8250],mmio,<addr>[,options]
        uart[8250],mmio32,<addr>[,options]
        uart[8250],mmio32be,<addr>[,options]
        uart[8250],0x<addr>[,options]
                Start an early, polled-mode console on the 8250/16550
                UART at the specified I/O port or MMIO address.
                MMIO inter-register address stride is either 8-bit
                (mmio) or 32-bit (mmio32 or mmio32be).
                If none of [io|mmio|mmio32|mmio32be] is given, <addr>
                is assumed to be equivalent to 'mmio'. 'options' are
                specified in the same format described for
                "console=ttyS<n>"; if unspecified, the h/w is not
                initialized.

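The uart[8250] forms listed above share one shape, which the following sketch splits into its parts. It only illustrates the documented syntax and is not the kernel's earlycon parser; the function name, the returned dictionary layout and the example addresses are assumptions.

```python
IO_TYPES = ("io", "mmio", "mmio32", "mmio32be")


def parse_uart_earlycon(value):
    """Split 'uart[8250],[<iotype>,]<addr>[,options]' into its fields.

    Hypothetical helper: returns a dict with the keys 'iotype', 'addr'
    and 'options' ('options' is None when absent).
    """
    fields = value.split(",")
    if fields[0] not in ("uart", "uart8250"):
        raise ValueError(f"not a uart[8250] earlycon value: {value!r}")
    rest = fields[1:]
    if rest and rest[0] in IO_TYPES:
        iotype = rest.pop(0)
    else:
        # Documented fallback when no I/O type is given.
        iotype = "mmio"
    if not rest:
        raise ValueError("missing <addr>")
    addr = int(rest.pop(0), 0)  # base 0 accepts the 0x prefix
    options = ",".join(rest) if rest else None
    return {"iotype": iotype, "addr": addr, "options": options}


if __name__ == "__main__":
    print(parse_uart_earlycon("uart8250,io,0x3f8,115200n8"))
    print(parse_uart_earlycon("uart,0xfe215040"))
```

When 'options' comes back as None, the entry above says the hardware is left uninitialized, so the UART must already have been configured by firmware or a boot loader.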
        pl011,<addr>
        pl011,mmio32,<addr>
                Start an early, polled-mode console on a pl011 serial
                port at the specified address. The pl011 serial port
                must already be set up and configured. Options are not
                yet supported. If 'mmio32' is specified, then
                the driver will use only 32-bit accessors to read/write
                the device registers.

        liteuart,<addr>
                Start an early console on a litex serial port at the
                specified address. The serial port must already be
                set up and configured. Options are not yet supported.

        meson,<addr>
                Start an early, polled-mode console on a meson serial
                port at the specified address. The serial port must
                already be set up and configured. Options are not yet
                supported.

        msm_serial,<addr>
                Start an early, polled-mode console on an msm serial
                port at the specified address. The serial port
                must already be set up and configured. Options are not
                yet supported.

        msm_serial_dm,<addr>
                Start an early, polled-mode console on an msm serial
                dm port at the specified address. The serial port
                must already be set up and configured. Options are not
                yet supported.

2017-06-19 09:46:40 +08:00
|
|
|
owl,<addr>
|
|
|
|
Start an early, polled-mode console on a serial port
|
|
|
|
of an Actions Semi SoC, such as S500 or S900, at the
|
|
|
|
specified address. The serial port must already be
|
|
|
|
setup and configured. Options are not yet supported.
|
|
|
|
|
2018-12-18 23:02:37 +08:00
|
|
|
rda,<addr>
|
|
|
|
Start an early, polled-mode console on a serial port
|
|
|
|
of an RDA Micro SoC, such as RDA8810PL, at the
|
|
|
|
specified address. The serial port must already be
|
2017-06-19 09:46:40 +08:00
|
|
|
setup and configured. Options are not yet supported.
|
|
|
|
|
2019-09-14 04:38:43 +08:00
|
|
|
sbi
|
|
|
|
Use RISC-V SBI (Supervisor Binary Interface) for early
|
|
|
|
console.
|
|
|
|
|
2014-04-19 06:19:58 +08:00
|
|
|
smh Use ARM semihosting calls for early console.
|
|
|
|
|
2015-01-23 21:47:41 +08:00
|
|
|
s3c2410,<addr>
|
|
|
|
s3c2412,<addr>
|
|
|
|
s3c2440,<addr>
|
|
|
|
s3c6400,<addr>
|
|
|
|
s5pv210,<addr>
|
|
|
|
exynos4210,<addr>
|
|
|
|
Use early console provided by serial driver available
|
|
|
|
on Samsung SoCs, requires selecting proper type and
|
|
|
|
a correct base address of the selected UART port. The
|
|
|
|
serial port must already be setup and configured.
|
|
|
|
Options are not yet supported.
|
|
|
|
|
2016-12-12 04:42:23 +08:00
|
|
|
lantiq,<addr>
|
|
|
|
Start an early, polled-mode console on a lantiq serial
|
|
|
|
(lqasc) port at the specified address. The serial port
|
|
|
|
must already be setup and configured. Options are not
|
|
|
|
yet supported.
|
|
|
|
|
2015-10-17 15:45:55 +08:00
|
|
|
lpuart,<addr>
|
|
|
|
lpuart32,<addr>
|
|
|
|
Use early console provided by Freescale LP UART driver
|
|
|
|
found on Freescale Vybrid and QorIQ LS1021A processors.
|
|
|
|
A valid base address must be provided, and the serial
|
|
|
|
port must already be setup and configured.
|
|
|
|
|
2020-02-29 21:27:48 +08:00
|
|
|
ec_imx21,<addr>
|
|
|
|
ec_imx6q,<addr>
|
|
|
|
Start an early, polled-mode, output-only console on the
|
|
|
|
Freescale i.MX UART at the specified address. The UART
|
|
|
|
must already be setup and configured.
|
|
|
|
|
2017-05-04 07:49:36 +08:00
|
|
|
ar3700_uart,<addr>
|
2016-02-17 02:14:53 +08:00
|
|
|
Start an early, polled-mode console on the
|
|
|
|
Armada 3700 serial port at the specified
|
|
|
|
address. The serial port must already be setup
|
|
|
|
and configured. Options are not yet supported.
|
|
|
|
|
2018-05-04 04:14:40 +08:00
|
|
|
qcom_geni,<addr>
|
|
|
|
Start an early, polled-mode console on a Qualcomm
|
|
|
|
Generic Interface (GENI) based serial port at the
|
|
|
|
specified address. The serial port must already be
|
|
|
|
setup and configured. Options are not yet supported.
|
|
|
|
|
2019-02-02 17:41:18 +08:00
|
|
|
efifb,[options]
|
|
|
|
Start an early, unaccelerated console on the EFI
|
|
|
|
memory mapped framebuffer (if available). On cache
|
|
|
|
coherent non-x86 systems that use system memory for
|
|
|
|
the framebuffer, pass the 'ram' option so that it is
|
|
|
|
mapped with the correct attributes.
|
|
|
|
|
2019-08-09 19:29:16 +08:00
|
|
|
linflex,<addr>
|
2019-10-16 20:48:25 +08:00
|
|
|
Use early console provided by Freescale LINFlexD UART
|
2019-08-09 19:29:16 +08:00
|
|
|
serial driver for NXP S32V234 SoCs. A valid base
|
|
|
|
address must be provided, and the serial port must
|
|
|
|
already be setup and configured.
|
|
|
|
|
2018-03-08 05:23:24 +08:00
|
|
|
earlyprintk= [X86,SH,ARM,M68k,S390]
|
2005-04-17 06:20:36 +08:00
|
|
|
earlyprintk=vga
|
2017-01-11 16:14:52 +08:00
|
|
|
earlyprintk=sclp
|
2013-02-26 04:54:08 +08:00
|
|
|
earlyprintk=xen
|
2005-04-17 06:20:36 +08:00
|
|
|
earlyprintk=serial[,ttySn[,baudrate]]
|
2013-04-11 05:03:38 +08:00
|
|
|
earlyprintk=serial[,0x...[,baudrate]]
|
2009-09-24 22:08:30 +08:00
|
|
|
earlyprintk=ttySn[,baudrate]
|
2009-08-21 04:39:57 +08:00
|
|
|
earlyprintk=dbgp[debugController#]
|
2018-10-03 00:49:21 +08:00
|
|
|
earlyprintk=pciserial[,force],bus:device.function[,baudrate]
|
2017-03-21 16:01:31 +08:00
|
|
|
earlyprintk=xdbc[xhciController#]
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2013-04-11 05:03:38 +08:00
|
|
|
earlyprintk is useful when the kernel crashes before
|
|
|
|
the normal console is initialized. It is not enabled by
|
|
|
|
default because it has some cosmetic problems.
|
|
|
|
|
2005-10-24 03:57:11 +08:00
|
|
|
Append ",keep" to not disable it when the real console
|
2005-04-17 06:20:36 +08:00
|
|
|
takes over.
|
|
|
|
|
2013-10-04 16:36:56 +08:00
|
|
|
Only one of vga, efi, serial, or usb debug port can
|
|
|
|
be used at a time.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2013-04-11 05:03:38 +08:00
|
|
|
Currently only ttyS0 and ttyS1 may be specified by
|
|
|
|
name. Other I/O ports may be explicitly specified
|
|
|
|
on some architectures (x86 and arm at least) by
|
|
|
|
replacing ttySn with an I/O port address, like this:
|
|
|
|
earlyprintk=serial,0x1008,115200
|
|
|
|
You can find the port for a given device in
|
|
|
|
/proc/tty/driver/serial:
|
|
|
|
2: uart:ST16650V2 port:00001008 irq:18 ...
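The lookup described above can be sketched in Python. This is only an illustration: the helper name earlyprintk_from_proc_line is hypothetical, and it assumes the port: field is printed in hexadecimal, as the example line and the matching 0x1008 argument suggest.

```python
import re


def earlyprintk_from_proc_line(line, baud=115200):
    """Build an earlyprintk= argument from one /proc/tty/driver/serial line."""
    match = re.search(r"\bport:([0-9A-Fa-f]+)", line)
    if match is None:
        raise ValueError("no port: field found in line")
    # Assumption: the port field is hexadecimal without a 0x prefix.
    port = int(match.group(1), 16)
    return f"earlyprintk=serial,{port:#x},{baud}"


print(earlyprintk_from_proc_line("2: uart:ST16650V2 port:00001008 irq:18"))
```

The baud rate is not part of the /proc line, so it is passed in separately; 115200 is simply the value used in the example above.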
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
Interaction with the standard serial driver is not
|
|
|
|
very good.
|
|
|
|
|
2013-10-04 16:36:56 +08:00
|
|
|
The VGA and EFI output is eventually overwritten by
|
|
|
|
the real console.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2013-02-26 04:54:08 +08:00
|
|
|
The xen output can only be used by Xen PV guests.
|
|
|
|
|
2017-01-11 16:14:52 +08:00
|
|
|
The sclp output can only be used on s390.
|
|
|
|
|
2018-10-03 00:49:21 +08:00
|
|
|
The optional "force" to "pciserial" enables use of a
|
|
|
|
PCI device even when its classcode is not of the
|
|
|
|
UART class.
|
|
|
|
|
2013-12-06 14:17:08 +08:00
|
|
|
edac_report= [HW,EDAC] Control how to report EDAC event
|
|
|
|
Format: {"on" | "off" | "force"}
|
|
|
|
on: enable EDAC to report H/W event. May be overridden
|
|
|
|
by other higher priority error reporting module.
|
|
|
|
off: disable H/W event reporting through EDAC.
|
|
|
|
force: enforce the use of EDAC to report H/W event.
|
|
|
|
default: on.
|
|
|
|
|
2010-05-21 10:04:30 +08:00
|
|
|
ekgdboc= [X86,KGDB] Allow early kernel console debugging
|
|
|
|
ekgdboc=kbd
|
|
|
|
|
2011-03-31 09:57:33 +08:00
|
|
|
This is designed to be used in conjunction with
|
2010-05-21 10:04:30 +08:00
|
|
|
the boot argument: earlyprintk=vga
|
|
|
|
|
2020-05-08 04:08:47 +08:00
|
|
|
This parameter works in place of the kgdboc parameter
|
|
|
|
but can only be used if the backing tty is available
|
|
|
|
very early in the boot process. For early debugging
|
|
|
|
via a serial port see kgdboc_earlycon instead.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
edd= [EDD]
|
2008-04-29 16:02:45 +08:00
|
|
|
Format: {"off" | "on" | "skip[mbr]"}
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2013-11-01 00:25:08 +08:00
|
|
|
efi= [EFI]
|
2020-06-16 18:40:12 +08:00
|
|
|
Format: { "debug", "disable_early_pci_dma",
|
|
|
|
"nochunk", "noruntime", "nosoftreserve",
|
2020-08-17 18:00:17 +08:00
|
|
|
"novamap", "no_disable_early_pci_dma" }
|
2020-06-16 18:40:12 +08:00
|
|
|
debug: enable misc debug output.
|
|
|
|
disable_early_pci_dma: disable the busmaster bit on all
|
|
|
|
PCI bridges while in the EFI boot stub.
|
2014-08-05 18:52:11 +08:00
|
|
|
nochunk: disable reading files in "chunks" in the EFI
|
|
|
|
boot stub, as chunking can cause problems with some
|
|
|
|
firmware implementations.
|
2014-08-14 17:15:28 +08:00
|
|
|
noruntime: disable EFI runtime services support.
|
2019-11-07 09:43:11 +08:00
|
|
|
nosoftreserve: The EFI_MEMORY_SP (Specific Purpose)
|
|
|
|
attribute may cause the kernel to reserve the
|
|
|
|
memory range for a memory mapping driver to
|
|
|
|
claim. Specify efi=nosoftreserve to disable this
|
|
|
|
reservation and treat the memory by its base type
|
|
|
|
(i.e. EFI_CONVENTIONAL_MEMORY / "System RAM").
|
2020-06-16 18:40:12 +08:00
|
|
|
novamap: do not call SetVirtualAddressMap().
|
2020-01-03 19:39:50 +08:00
|
|
|
no_disable_early_pci_dma: Leave the busmaster bit set
|
|
|
|
on all PCI bridges while in the EFI boot stub.
|
2013-11-01 00:25:08 +08:00
|
|
|
|
2013-04-17 07:00:53 +08:00
|
|
|
efi_no_storage_paranoia [EFI; X86]
|
|
|
|
Using this parameter you can use more than 50% of
|
|
|
|
your efi variable storage. Use this parameter only if
|
|
|
|
you are really sure that your UEFI does sane gc and
|
|
|
|
fulfills the spec; otherwise your board may brick.
|
|
|
|
|
2015-09-30 22:01:56 +08:00
|
|
|
efi_fake_mem= nn[KMG]@ss[KMG]:aa[,nn[KMG]@ss[KMG]:aa,..] [EFI; X86]
|
|
|
|
Add arbitrary attribute to specific memory range by
|
|
|
|
updating original EFI memory map.
|
|
|
|
The memory region to which attribute aa is added runs
|
|
|
|
from ss to ss+nn.
|
2019-11-07 09:43:26 +08:00
|
|
|
|
2015-09-30 22:01:56 +08:00
|
|
|
If efi_fake_mem=2G@4G:0x10000,2G@0x10a0000000:0x10000
|
|
|
|
is specified, EFI_MEMORY_MORE_RELIABLE(0x10000)
|
|
|
|
attribute is added to range 0x100000000-0x180000000 and
|
|
|
|
0x10a0000000-0x1120000000.
|
|
|
|
|
2019-11-07 09:43:26 +08:00
|
|
|
If efi_fake_mem=8G@9G:0x40000 is specified, the
|
|
|
|
EFI_MEMORY_SP(0x40000) attribute is added to
|
|
|
|
range 0x240000000-0x43fffffff.
|
|
|
|
|
2015-09-30 22:01:56 +08:00
|
|
|
Using this parameter you can do debugging of EFI memmap
|
2019-11-07 09:43:26 +08:00
|
|
|
related features. For example, you can do debugging of
|
2015-09-30 22:01:56 +08:00
|
|
|
Address Range Mirroring feature even if your box
|
2019-11-07 09:43:26 +08:00
|
|
|
doesn't support it, or mark specific memory as
|
|
|
|
"soft reserved".
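The address arithmetic in the efi_fake_mem= examples can be checked with a short Python sketch. The function names parse_size and parse_efi_fake_mem are hypothetical, and the parsing is a simplification of what the kernel does: only the K, M and G suffixes and decimal or 0x-prefixed numbers are handled.

```python
SUFFIX_SHIFT = {"K": 10, "M": 20, "G": 30}


def parse_size(text):
    """Parse nn[KMG]; decimal and 0x-prefixed hex are accepted."""
    shift = 0
    if text and text[-1].upper() in SUFFIX_SHIFT:
        shift = SUFFIX_SHIFT[text[-1].upper()]
        text = text[:-1]
    return int(text, 0) << shift


def parse_efi_fake_mem(value):
    """Return (start, end, attribute) tuples; end is exclusive (ss+nn)."""
    regions = []
    for entry in value.split(","):
        span, attr = entry.split(":")
        size, start = span.split("@")
        base = parse_size(start)
        regions.append((base, base + parse_size(size), int(attr, 0)))
    return regions


for start, end, attr in parse_efi_fake_mem(
        "2G@4G:0x10000,2G@0x10a0000000:0x10000"):
    print(f"{start:#x}-{end:#x} attribute {attr:#x}")
```

For 8G@9G:0x40000 the same arithmetic gives an exclusive end of 0x440000000, which corresponds to the inclusive end 0x43fffffff quoted above.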
|
2015-09-30 22:01:56 +08:00
|
|
|
|
2016-07-09 00:13:12 +08:00
|
|
|
efivar_ssdt= [EFI; X86] Name of an EFI variable that contains an SSDT
|
|
|
|
that is to be dynamically loaded by Linux. If there are
|
|
|
|
multiple variables with the same name but with different
|
|
|
|
vendor GUIDs, all of them will be loaded. See
|
2019-06-08 02:54:32 +08:00
|
|
|
Documentation/admin-guide/acpi/ssdt-overlays.rst for details.
|
2016-07-09 00:13:12 +08:00
|
|
|
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
eisa_irq_edge= [PARISC,HW]
|
|
|
|
See header of drivers/parisc/eisa.c.
|
|
|
|
|
2007-07-31 15:37:59 +08:00
|
|
|
elanfreq= [X86-32]
|
2005-04-17 06:20:36 +08:00
|
|
|
See comment before function elanfreq_setup() in
|
2008-07-05 00:59:43 +08:00
|
|
|
arch/x86/kernel/cpu/cpufreq/elanfreq.c.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2011-10-30 22:16:37 +08:00
|
|
|
elfcorehdr=[size[KMG]@]offset[KMG] [IA64,PPC,SH,X86,S390]
|
2005-10-24 03:57:11 +08:00
|
|
|
Specifies physical address of start of kernel core
|
2011-10-30 22:16:37 +08:00
|
|
|
image elf header and optionally the size. Generally
|
|
|
|
kexec loader will pass this option to capture kernel.
|
2019-06-14 02:21:39 +08:00
|
|
|
See Documentation/admin-guide/kdump/kdump.rst for details.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2009-04-06 06:55:22 +08:00
|
|
|
enable_mtrr_cleanup [X86]
|
|
|
|
The kernel tries to adjust MTRR layout from continuous
|
|
|
|
to discrete, so that the X server driver can add a WB
|
|
|
|
entry later. This parameter enables that.
|
|
|
|
|
2009-05-07 07:02:58 +08:00
|
|
|
enable_timer_pin_1 [X86]
|
2009-04-06 06:55:22 +08:00
|
|
|
Enable PIN 1 of APIC timer
|
|
|
|
Can be useful to work around chipset bugs
|
|
|
|
(in particular on some ATI chipsets).
|
|
|
|
The kernel tries to set a reasonable default.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
enforcing [SELINUX] Set initial enforcing status.
|
|
|
|
Format: {"0" | "1"}
|
|
|
|
See security/selinux/Kconfig help text.
|
|
|
|
0 -- permissive (log only, no denials).
|
|
|
|
1 -- enforcing (deny and log).
|
|
|
|
Default value is 0.
|
2020-01-08 00:35:04 +08:00
|
|
|
Value can be changed at runtime via
|
|
|
|
/sys/fs/selinux/enforce.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2010-05-18 14:35:21 +08:00
|
|
|
erst_disable [ACPI]
|
|
|
|
Disable Error Record Serialization Table (ERST)
|
|
|
|
support.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
ether= [HW,NET] Ethernet cards parameters
|
|
|
|
This option is obsoleted by the "netdev=" option, which
|
|
|
|
has equivalent usage. See its documentation for details.
|
|
|
|
|
2011-05-13 06:33:20 +08:00
|
|
|
evm= [EVM]
|
|
|
|
Format: { "fix" }
|
|
|
|
Permit 'security.evm' to be updated regardless of
|
|
|
|
current integrity status.
|
|
|
|
|
2006-12-08 18:39:42 +08:00
|
|
|
failslab=
|
2020-10-16 11:13:46 +08:00
|
|
|
fail_usercopy=
|
2006-12-08 18:39:42 +08:00
|
|
|
fail_page_alloc=
|
|
|
|
fail_make_request=[KNL]
|
|
|
|
General fault injection mechanism.
|
|
|
|
Format: <interval>,<probability>,<space>,<times>
|
2011-08-15 08:02:26 +08:00
|
|
|
See also Documentation/fault-injection/.
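As an illustration of the four-field format, the following Python sketch splits a value into named fields. The FaultAttr type and parse_fault_attr helper are hypothetical, the sample value is made up, and treating every field as a plain signed integer is an assumption; the meaning of each field is described in the fault-injection documentation.

```python
from typing import NamedTuple


class FaultAttr(NamedTuple):
    interval: int
    probability: int
    space: int
    times: int


def parse_fault_attr(value):
    """Split <interval>,<probability>,<space>,<times> into named fields."""
    fields = value.split(",")
    if len(fields) != 4:
        raise ValueError("expected <interval>,<probability>,<space>,<times>")
    return FaultAttr(*(int(field) for field in fields))


# Hypothetical value, e.g. as it might follow "failslab=" on the command line.
print(parse_fault_attr("1,100,0,-1"))
```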
|
2006-12-08 18:39:42 +08:00
|
|
|
|
2020-08-27 00:05:35 +08:00
|
|
|
fb_tunnels= [NET]
|
|
|
|
Format: { initns | none }
|
|
|
|
See Documentation/admin-guide/sysctl/net.rst for
|
|
|
|
fb_tunnels_only_for_init_ns
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
floppy= [HW]
|
2019-06-18 22:47:10 +08:00
|
|
|
See Documentation/admin-guide/blockdev/floppy.rst.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2008-05-09 04:03:23 +08:00
|
|
|
force_pal_cache_flush
|
|
|
|
[IA-64] Avoid check_sal_cache_flush which may hang on
|
|
|
|
buggy SAL_CACHE_FLUSH implementations. Using this
|
|
|
|
parameter will force ia64_sal_cache_flush to call
|
|
|
|
ia64_pal_cache_flush instead of SAL_CACHE_FLUSH.
|
|
|
|
|
2018-04-19 02:51:39 +08:00
|
|
|
forcepae [X86-32]
|
2014-03-07 19:40:42 +08:00
|
|
|
Forcefully enable Physical Address Extension (PAE).
|
|
|
|
Many Pentium M systems disable PAE but may have a
|
|
|
|
functionally usable PAE implementation.
|
|
|
|
Warning: use of this parameter will taint the kernel
|
|
|
|
and may cause unknown problems.
|
|
|
|
|
2008-11-02 02:57:37 +08:00
|
|
|
ftrace=[tracer]
|
2009-05-29 01:37:24 +08:00
|
|
|
[FTRACE] will set and start the specified tracer
|
2008-11-02 02:57:37 +08:00
|
|
|
as early as possible in order to facilitate early
|
|
|
|
boot debugging.
|
|
|
|
|
2010-04-19 01:08:41 +08:00
|
|
|
ftrace_dump_on_oops[=orig_cpu]
|
2009-05-29 01:37:24 +08:00
|
|
|
[FTRACE] will dump the trace buffers on oops.
|
2010-04-19 01:08:41 +08:00
|
|
|
If no parameter is passed, ftrace will dump
|
|
|
|
buffers of all CPUs, but if you pass orig_cpu, it will
|
|
|
|
dump only the buffer of the CPU that triggered the
|
|
|
|
oops.
|
2009-05-29 01:37:24 +08:00
|
|
|
|
|
|
|
ftrace_filter=[function-list]
|
|
|
|
[FTRACE] Limit the functions traced by the function
|
2021-01-01 12:08:31 +08:00
|
|
|
tracer at boot up. function-list is a comma-separated
|
2009-05-29 01:37:24 +08:00
|
|
|
list of functions. This list can be changed at run
|
|
|
|
time by the set_ftrace_filter file in the debugfs
|
2011-08-14 03:34:52 +08:00
|
|
|
tracing directory.
|
2009-05-29 01:37:24 +08:00
|
|
|
|
|
|
|
ftrace_notrace=[function-list]
|
|
|
|
[FTRACE] Do not trace the functions specified in
|
|
|
|
function-list. This list can be changed at run time
|
|
|
|
by the set_ftrace_notrace file in the debugfs
|
|
|
|
tracing directory.
|
2008-11-02 02:57:37 +08:00
|
|
|
|
2009-10-13 04:17:21 +08:00
|
|
|
ftrace_graph_filter=[function-list]
|
|
|
|
[FTRACE] Limit the top-level caller functions traced
|
|
|
|
by the function graph tracer at boot up.
|
2021-01-01 12:08:31 +08:00
|
|
|
function-list is a comma-separated list of functions
|
2009-10-13 04:17:21 +08:00
|
|
|
that can be changed at run time by the
|
|
|
|
set_graph_function file in the debugfs tracing directory.
|
|
|
|
|
2014-06-13 00:23:50 +08:00
|
|
|
ftrace_graph_notrace=[function-list]
|
|
|
|
[FTRACE] Do not trace from the functions specified in
|
2021-01-01 12:08:31 +08:00
|
|
|
function-list. This list is a comma-separated list of
|
2014-06-13 00:23:50 +08:00
|
|
|
functions that can be changed at run time by the
|
|
|
|
set_graph_notrace file in the debugfs tracing directory.
|
|
|
|
|
2017-03-03 08:12:15 +08:00
|
|
|
ftrace_graph_max_depth=<uint>
|
|
|
|
[FTRACE] Used with the function graph tracer. This is
|
|
|
|
the max depth it will trace into a function. This value
|
|
|
|
can be changed at run time by the max_graph_depth file
|
|
|
|
in the tracefs tracing directory. default: 0 (no limit)
|
|
|
|
|
2020-02-22 09:40:35 +08:00
|
|
|
fw_devlink= [KNL] Create device links between consumer and supplier
|
|
|
|
devices by scanning the firmware to infer the
|
|
|
|
consumer/supplier relationships. This feature is
|
|
|
|
especially useful when drivers are loaded as modules as
|
|
|
|
it ensures proper ordering of tasks like device probing
|
|
|
|
(suppliers first, then consumers), supplier boot state
|
|
|
|
clean up (only after all consumers have probed),
|
|
|
|
suspend/resume & runtime PM (consumers first, then
|
|
|
|
suppliers).
|
|
|
|
Format: { off | permissive | on | rpm }
|
|
|
|
off -- Don't create device links from firmware info.
|
|
|
|
permissive -- Create device links from firmware info
|
|
|
|
but use it only for ordering boot state clean
|
|
|
|
up (sync_state() calls).
|
|
|
|
on -- Create device links from firmware info and use it
|
|
|
|
to enforce probe and suspend/resume ordering.
|
|
|
|
rpm -- Like "on", but also use to order runtime PM.
|
|
|
|
|
2021-02-06 06:26:39 +08:00
|
|
|
fw_devlink.strict=<bool>
|
|
|
|
[KNL] Treat all inferred dependencies as mandatory
|
|
|
|
dependencies. This only applies for fw_devlink=on|rpm.
|
|
|
|
Format: <bool>
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
gamecon.map[2|3]=
|
|
|
|
[HW,JOY] Multisystem joystick and NES/SNES/PSX pad
|
|
|
|
support via parallel port (up to 5 devices per port)
|
|
|
|
Format: <port#>,<pad1>,<pad2>,<pad3>,<pad4>,<pad5>
|
2017-10-11 01:36:23 +08:00
|
|
|
See also Documentation/input/devices/joystick-parport.rst
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
gamma= [HW,DRM]
|
|
|
|
|
2020-08-10 10:49:41 +08:00
|
|
|
gart_fix_e820= [X86-64] disable the e820 fix for K8 GART
|
2008-01-30 20:33:09 +08:00
|
|
|
Format: off | on
|
|
|
|
default: on
|
|
|
|
|
2009-06-18 07:28:08 +08:00
|
|
|
gcov_persist= [GCOV] When non-zero (default), profiling data for
|
|
|
|
kernel modules is saved and remains accessible via
|
|
|
|
debugfs, even when the module is unloaded/reloaded.
|
|
|
|
When zero, profiling data is discarded and associated
|
|
|
|
debugfs files are removed at module unload time.
|
|
|
|
|
2017-02-15 18:11:50 +08:00
|
|
|
goldfish [X86] Enable the goldfish android emulator platform.
|
|
|
|
Do not use this unless you are running on the
|
|
|
|
Android emulator.
|
|
|
|
|
2021-03-29 19:16:47 +08:00
|
|
|
gpio-mockup.gpio_mockup_ranges
|
|
|
|
[HW] Sets the ranges of gpiochips for this device.
|
|
|
|
Format: <start1>,<end1>,<start2>,<end2>...
|
2021-03-29 19:16:48 +08:00
|
|
|
gpio-mockup.gpio_mockup_named_lines
|
|
|
|
[HW] Let the driver know GPIO lines should be named.
|
2021-03-29 19:16:47 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
gpt [EFI] Forces disk with valid GPT signature but
|
2014-01-24 07:56:03 +08:00
|
|
|
invalid Protective MBR to be treated as GPT. If the
|
|
|
|
primary GPT is corrupted, it enables the backup/alternate
|
|
|
|
GPT to be used instead.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2012-11-15 15:47:14 +08:00
|
|
|
grcan.enable0= [HW] Configuration of physical interface 0. Determines
|
|
|
|
the "Enable 0" bit of the configuration register.
|
|
|
|
Format: 0 | 1
|
|
|
|
Default: 0
|
|
|
|
grcan.enable1= [HW] Configuration of physical interface 1. Determines
|
|
|
|
the "Enable 1" bit of the configuration register.
|
|
|
|
Format: 0 | 1
|
|
|
|
Default: 0
|
|
|
|
grcan.select= [HW] Select which physical interface to use.
|
|
|
|
Format: 0 | 1
|
|
|
|
Default: 0
|
|
|
|
grcan.txsize= [HW] Sets the size of the tx buffer.
|
|
|
|
Format: <unsigned int> such that (txsize & ~0x1fffc0) == 0.
|
|
|
|
Default: 1024
|
|
|
|
grcan.rxsize= [HW] Sets the size of the rx buffer.
|
|
|
|
Format: <unsigned int> such that (rxsize & ~0x1fffc0) == 0.
|
|
|
|
Default: 1024
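The size constraint on grcan.txsize= and grcan.rxsize= can be read as "only bits 6 through 20 may be set", that is, a multiple of 64 no larger than 0x1fffc0. A minimal Python check, with the hypothetical helper name is_valid_grcan_size:

```python
SIZE_MASK = 0x1fffc0  # bits 6..20, taken from the Format lines above


def is_valid_grcan_size(size):
    """True if size satisfies (size & ~0x1fffc0) == 0."""
    return size >= 0 and (size & ~SIZE_MASK) == 0


for candidate in (1024, 100, 0x1fffc0, 0x200000):
    print(candidate, is_valid_grcan_size(candidate))
```

The default of 1024 passes the check, while 100 (not a multiple of 64) and 0x200000 (a bit above bit 20) do not.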
|
|
|
|
|
2015-11-06 10:44:41 +08:00
|
|
|
hardlockup_all_cpu_backtrace=
|
|
|
|
[KNL] Should the hard-lockup detector generate
|
|
|
|
backtraces on all cpus.
|
2020-06-08 12:40:42 +08:00
|
|
|
Format: 0 | 1
|
2015-11-06 10:44:41 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
hashdist= [KNL,NUMA] Large hashes allocated during boot
|
|
|
|
are distributed across NUMA nodes. Defaults on
|
2011-08-14 03:34:52 +08:00
|
|
|
for 64-bit NUMA, off otherwise.
|
2005-10-24 03:57:11 +08:00
|
|
|
Format: 0 | 1 (for off | on)
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
hcl= [IA-64] SGI's Hardware Graph compatibility layer
|
|
|
|
|
|
|
|
hd= [EIDE] (E)IDE hard drive subsystem geometry
|
|
|
|
Format: <cyl>,<head>,<sect>
|
|
|
|
|
2010-05-18 14:35:15 +08:00
|
|
|
hest_disable [ACPI]
|
|
|
|
Disable Hardware Error Source Table (HEST) support;
|
|
|
|
corresponding firmware-first mode error processing
|
|
|
|
logic will be disabled.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
highmem=nn[KMG] [KNL,BOOT] forces the highmem zone to have an exact
|
|
|
|
size of <nn>. This works even on boxes that have no
|
|
|
|
highmem otherwise. This also works to reduce highmem
|
|
|
|
size on bigger boxes.
|
|
|
|
|
2007-02-16 17:28:11 +08:00
|
|
|
highres= [KNL] Enable/disable high resolution timer mode.
|
|
|
|
Valid parameters: "on", "off"
|
|
|
|
Default: "on"
|
|
|
|
|
2009-04-06 06:55:22 +08:00
|
|
|
hlt [BUGS=ARM,SH]
|
|
|
|
|
|
|
|
hpet= [X86-32,HPET] option to control HPET usage
|
|
|
|
Format: { enable (default) | disable | force |
|
|
|
|
verbose }
|
|
|
|
disable: disable HPET and use PIT instead
|
|
|
|
force: allow forced enabling of undocumented chips (ICH4,
|
|
|
|
VIA, nVidia)
|
|
|
|
verbose: show contents of HPET registers during setup
|
|
|
|
|
2013-11-13 07:08:33 +08:00
|
|
|
hpet_mmap= [X86, HPET_MMAP] Allow userspace to mmap HPET
|
|
|
|
registers. Default set by CONFIG_HPET_MMAP_DEFAULT.
|
|
|
|
|
2021-01-25 12:32:02 +08:00
|
|
|
hugetlb_cma= [HW,CMA] The size of a CMA area used for allocation
|
2020-04-11 05:32:45 +08:00
|
|
|
of gigantic hugepages.
|
|
|
|
Format: nn[KMGTPE]
|
|
|
|
|
2021-01-25 12:32:02 +08:00
|
|
|
Reserve a CMA area of given size and allocate gigantic
|
|
|
|
hugepages using the CMA allocator. If enabled, the
|
2020-04-11 05:32:45 +08:00
|
|
|
boot-time allocation of gigantic hugepages is skipped.
|
|
|
|
|
2020-06-04 07:00:46 +08:00
|
|
|
hugepages= [HW] Number of HugeTLB pages to allocate at boot.
|
|
|
|
If this follows hugepagesz (below), it specifies
|
|
|
|
the number of pages of hugepagesz to be allocated.
|
|
|
|
If this is the first HugeTLB parameter on the command
|
|
|
|
line, it specifies the number of pages to allocate for
|
|
|
|
the default huge page size. See also
|
|
|
|
Documentation/admin-guide/mm/hugetlbpage.rst.
|
|
|
|
Format: <integer>
|
|
|
|
|
|
|
|
hugepagesz=
|
|
|
|
[HW] The size of the HugeTLB pages. This is used in
|
|
|
|
conjunction with hugepages (above) to allocate huge
|
|
|
|
pages of a specific size at boot. The pair
|
|
|
|
hugepagesz=X hugepages=Y can be specified once for
|
|
|
|
each supported huge page size. Huge page sizes are
|
|
|
|
architecture dependent. See also
|
|
|
|
Documentation/admin-guide/mm/hugetlbpage.rst.
|
|
|
|
Format: size[KMG]
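The pairing rules for hugepagesz= and hugepages= can be sketched as a small Python walk over the command line. The function name hugetlb_requests and the "default" key are hypothetical. Only the two documented cases are modelled: a hugepages= that follows a hugepagesz=, and a hugepages= that is the first HugeTLB parameter. Any other ordering is silently skipped here, which is an assumption about the sketch, not a statement of kernel behavior.

```python
def hugetlb_requests(cmdline):
    """Map each huge page size (or "default") to the requested page count."""
    requests = {}
    pending_size = None    # size from the most recent unmatched hugepagesz=
    seen_hugetlb = False   # True once any HugeTLB parameter has been seen
    for token in cmdline.split():
        key, _, value = token.partition("=")
        if key == "hugepagesz":
            pending_size = value
            seen_hugetlb = True
        elif key == "hugepages":
            if pending_size is not None:
                # hugepages= follows hugepagesz=: count applies to that size.
                requests[pending_size] = int(value)
                pending_size = None
            elif not seen_hugetlb:
                # First HugeTLB parameter: count applies to the default size.
                requests["default"] = int(value)
            seen_hugetlb = True
    return requests


print(hugetlb_requests("hugepages=16 hugepagesz=1G hugepages=4 quiet"))
```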
|
2008-09-21 16:14:42 +08:00
|
|
|
|
2021-07-01 09:47:25 +08:00
|
|
|
hugetlb_free_vmemmap=
|
|
|
|
[KNL] Requires CONFIG_HUGETLB_PAGE_FREE_VMEMMAP
|
|
|
|
enabled.
|
|
|
|
Allows heavy hugetlb users to free up some more
|
|
|
|
memory (6 * PAGE_SIZE for each 2MB hugetlb page).
|
|
|
|
Format: { on | off (default) }
|
|
|
|
|
|
|
|
on: enable the feature
|
|
|
|
off: disable the feature
|
|
|
|
|
2021-07-01 09:48:28 +08:00
|
|
|
Built with CONFIG_HUGETLB_PAGE_FREE_VMEMMAP_DEFAULT_ON=y,
|
|
|
|
the default is on.
|
|
|
|
|
2021-07-01 09:47:29 +08:00
|
|
|
This is not compatible with memory_hotplug.memmap_on_memory.
|
|
|
|
If both parameters are enabled, hugetlb_free_vmemmap takes
|
|
|
|
precedence over memory_hotplug.memmap_on_memory.
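The "6 * PAGE_SIZE for each 2MB hugetlb page" figure above can be turned into
a quick estimate. The sketch below is only arithmetic on that figure; the
function name is hypothetical and the 4096-byte default for PAGE_SIZE is an
assumption that holds for common x86-64 configurations but not everywhere.

```python
def vmemmap_savings_bytes(nr_2mb_hugepages, page_size=4096):
    """Bytes freed, using the documented 6 * PAGE_SIZE per 2MB hugetlb page.

    page_size defaults to 4096, which is an assumption, not a kernel constant.
    """
    if nr_2mb_hugepages < 0:
        raise ValueError("page count must not be negative")
    return 6 * page_size * nr_2mb_hugepages


if __name__ == "__main__":
    pages = 1024  # 1024 x 2MB = 2 GiB of hugetlb pages
    saved = vmemmap_savings_bytes(pages)
    print(f"{pages} pages -> {saved} bytes ({saved >> 20} MiB) freed")
```

Under these assumptions, 2 GiB worth of 2MB hugetlb pages frees 24 MiB.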
|
|
|
|
|
2018-05-22 02:18:17 +08:00
|
|
|
hung_task_panic=
|
|
|
|
[KNL] Controls whether the hung task detector generates panics.
|
2020-06-08 12:40:42 +08:00
|
|
|
Format: 0 | 1
|
2008-12-25 20:39:55 +08:00
|
|
|
|
2020-06-08 12:40:31 +08:00
|
|
|
A value of 1 instructs the kernel to panic when a
|
2018-05-22 02:18:17 +08:00
|
|
|
hung task is detected. The default value is controlled
|
|
|
|
by the CONFIG_BOOTPARAM_HUNG_TASK_PANIC build-time
|
|
|
|
option. The value selected by this boot parameter can
|
|
|
|
be changed later by the kernel.hung_task_panic sysctl.
|
|
|
|
|
2018-04-19 02:51:39 +08:00
|
|
|
hvc_iucv= [S390] Number of z/VM IUCV hypervisor console (HVC)
|
|
|
|
terminal devices. Valid values: 0..8
|
|
|
|
hvc_iucv_allow= [S390] Comma-separated list of z/VM user IDs.
|
|
|
|
If specified, z/VM IUCV HVC accepts connections
|
|
|
|
from listed z/VM user IDs only.
|
2018-10-08 16:29:34 +08:00
|
|
|
|
|
|
|
hv_nopvspin [X86,HYPER_V] Disables the paravirt spinlock optimizations
|
|
|
|
which allow the hypervisor to 'idle' the
|
|
|
|
guest on lock contention.
|
|
|
|
|
2011-03-23 07:34:20 +08:00
|
|
|
keep_bootcon [KNL]
|
|
|
|
Do not unregister boot console at start. This is only
|
|
|
|
useful for debugging when something happens in the window
|
|
|
|
between unregistering the boot console and initializing
|
|
|
|
the real console.
|
|
|
|
|
2018-04-19 02:51:39 +08:00
|
|
|
i2c_bus= [HW] Override the default board specific I2C bus speed
|
|
|
|
or register an additional I2C bus that is not
|
|
|
|
registered from board initialization code.
|
|
|
|
Format:
|
|
|
|
<bus_id>,<clkrate>
|
2009-03-24 09:07:47 +08:00
|
|
|
|
2008-10-06 14:51:09 +08:00
|
|
|
i8042.debug [HW] Toggle i8042 debug mode
|
2015-07-16 01:20:17 +08:00
|
|
|
i8042.unmask_kbd_data
|
|
|
|
[HW] Enable printing of interrupt data from the KBD port
|
|
|
|
(disabled by default, and as a pre-condition
|
|
|
|
requires that i8042.debug=1 be enabled)
|
2005-04-17 06:20:36 +08:00
|
|
|
i8042.direct [HW] Put keyboard port into non-translated mode
|
2006-10-04 04:53:09 +08:00
|
|
|
i8042.dumbkbd [HW] Pretend that controller can only read data from
|
|
|
|
keyboard and cannot control its state
|
2005-04-17 06:20:36 +08:00
|
|
|
(Don't attempt to blink the leds)
|
|
|
|
i8042.noaux [HW] Don't check for auxiliary (== mouse) port
|
2005-09-04 14:42:00 +08:00
|
|
|
i8042.nokbd [HW] Don't check/create keyboard port
|
2008-03-14 04:13:59 +08:00
|
|
|
i8042.noloop [HW] Disable the AUX Loopback command while probing
|
|
|
|
for the AUX port
|
2005-04-17 06:20:36 +08:00
|
|
|
i8042.nomux [HW] Don't check presence of an active multiplexing
|
2014-11-01 00:35:53 +08:00
|
|
|
controller
|
2005-04-17 06:20:36 +08:00
|
|
|
i8042.nopnp [HW] Don't use ACPIPnP / PnPBIOS to discover KBD/AUX
|
|
|
|
controllers
|
2012-02-14 23:26:42 +08:00
|
|
|
i8042.notimeout [HW] Ignore timeout condition signalled by controller
|
2016-10-02 03:07:35 +08:00
|
|
|
i8042.reset [HW] Reset the controller during init, cleanup and
|
|
|
|
suspend-to-ram transitions, only during s2r
|
|
|
|
transitions, or never reset
|
|
|
|
Format: { 1 | Y | y | 0 | N | n }
|
|
|
|
1, Y, y: always reset controller
|
|
|
|
0, N, n: don't ever reset controller
|
|
|
|
Default: only on s2r transitions on x86; most other
|
|
|
|
architectures force reset to be always executed
|
2005-04-17 06:20:36 +08:00
|
|
|
i8042.unlock [HW] Unlock (ignore) the keylock
|
2018-04-19 02:51:39 +08:00
|
|
|
i8042.kbdreset [HW] Reset device connected to KBD port
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
i810= [HW,DRM]
|
|
|
|
|
2005-06-26 05:54:25 +08:00
|
|
|
i8k.ignore_dmi [HW] Continue probing hardware even if DMI data
|
|
|
|
indicates that the driver is running on unsupported
|
|
|
|
hardware.
|
2005-04-17 06:20:36 +08:00
|
|
|
i8k.force [HW] Activate i8k driver even if SMM BIOS signature
|
|
|
|
does not match list of supported models.
|
|
|
|
i8k.power_status
|
|
|
|
[HW] Report power status in /proc/i8k
|
|
|
|
(disabled by default)
|
|
|
|
i8k.restricted [HW] Allow controlling fans only if SYS_ADMIN
|
|
|
|
capability is set.
|
|
|
|
|
2012-03-15 22:56:26 +08:00
|
|
|
i915.invert_brightness=
|
2012-03-15 22:56:25 +08:00
|
|
|
[DRM] Invert the sense of the variable that is used to
|
|
|
|
set the brightness of the panel backlight. Normally a
|
2012-03-15 22:56:26 +08:00
|
|
|
brightness value of 0 indicates backlight switched off,
|
|
|
|
and the maximum of the brightness value sets the backlight
|
|
|
|
to maximum brightness. If this parameter is set to 0
|
|
|
|
(default) and the machine requires it, or this parameter
|
|
|
|
is set to 1, a brightness value of 0 sets the backlight
|
|
|
|
to maximum brightness, and the maximum of the brightness
|
|
|
|
value switches the backlight off.
|
|
|
|
-1 -- never invert brightness
|
|
|
|
0 -- machine default
|
|
|
|
1 -- force brightness inversion
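The three settings above combine with the per-machine default as follows.
This is a small Python model of the description, not the driver's code: the
function names are hypothetical, "machine_requires_inversion" stands in for
whatever per-machine information the driver consults, and mapping intermediate
values as max_value - value is an assumption (the text only describes the two
end points).

```python
def brightness_inverted(invert_brightness, machine_requires_inversion):
    """Decide whether the brightness sense is inverted.

    -1 never inverts, 1 always inverts, 0 follows the machine default.
    """
    if invert_brightness == 1:
        return True
    if invert_brightness == -1:
        return False
    if invert_brightness == 0:
        return bool(machine_requires_inversion)
    raise ValueError("invert_brightness must be -1, 0 or 1")


def backlight_level(value, max_value, invert_brightness=0,
                    machine_requires_inversion=False):
    """Map a requested brightness value to the level that is applied."""
    if not 0 <= value <= max_value:
        raise ValueError("value out of range")
    if brightness_inverted(invert_brightness, machine_requires_inversion):
        # Assumed linear inversion: 0 becomes maximum, maximum becomes 0.
        return max_value - value
    return value


if __name__ == "__main__":
    print(backlight_level(0, 255, invert_brightness=1))
    print(backlight_level(0, 255, invert_brightness=0,
                          machine_requires_inversion=True))
    print(backlight_level(0, 255, invert_brightness=-1,
                          machine_requires_inversion=True))
```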
|
2012-03-15 22:56:25 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
icn= [HW,ISDN]
|
|
|
|
Format: <io>[,<membase>[,<icn_id>[,<icn_id2>]]]
|
|
|
|
|
2009-02-26 03:28:21 +08:00
|
|
|
ide-core.nodma= [HW] (E)IDE subsystem
|
|
|
|
Format: =0.0 to prevent dma on hda, =0.1 hdb =1.0 hdc
|
2009-06-07 19:52:52 +08:00
|
|
|
.vlb_clock .pci_clock .noflush .nohpa .noprobe .nowerr
|
|
|
|
.cdrom .chs .ignore_cable are additional options
|
2019-06-13 01:52:47 +08:00
|
|
|
See Documentation/ide/ide.rst.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2014-10-26 00:03:52 +08:00
|
|
|
ide-generic.probe-mask= [HW] (E)IDE subsystem
|
|
|
|
Format: <int>
|
|
|
|
Probe mask for legacy ISA IDE ports. Depending on
|
|
|
|
platform up to 6 ports are supported, enabled by
|
|
|
|
setting corresponding bits in the mask to 1. The
|
|
|
|
default value is 0x0, which has a special meaning.
|
|
|
|
On systems that have PCI, it triggers scanning the
|
|
|
|
PCI bus for the first and the second port, which
|
|
|
|
are then probed. On systems without PCI the value
|
|
|
|
of 0x0 enables probing the two first ports as if it
|
|
|
|
was 0x3.
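To make the bit layout concrete, here is a hedged Python sketch that decodes
a probe mask into the port indices it enables. The function name and the
has_pci argument are hypothetical, and the limit of 6 ports is taken from the
"up to 6 ports" wording above. For the special value 0x0 on a PCI system the
sketch returns None, since the first two ports are then found by scanning the
PCI bus rather than taken from the mask.

```python
MAX_PORTS = 6  # "up to 6 ports", depending on platform


def ports_to_probe(mask, has_pci=False):
    """Return the legacy ISA IDE port indices enabled by a probe mask.

    Returns None when mask is 0x0 on a PCI system (ports come from a PCI scan).
    """
    if mask < 0 or mask >> MAX_PORTS:
        raise ValueError("mask has bits outside the supported ports")
    if mask == 0:
        if has_pci:
            return None
        mask = 0x3  # without PCI, 0x0 behaves as if it was 0x3
    return [bit for bit in range(MAX_PORTS) if mask & (1 << bit)]


if __name__ == "__main__":
    for mask in (0x0, 0x1, 0x5, 0x3F):
        print(hex(mask), ports_to_probe(mask))
```

For example, a mask of 0x5 sets bits 0 and 2, so the first and third ports
are probed.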
|
|
|
|
|
2009-04-06 06:55:22 +08:00
|
|
|
ide-pci-generic.all-generic-ide [HW] (E)IDE subsystem
|
|
|
|
Claim all unknown PCI IDE storage controllers.
|
|
|
|
|
2007-05-03 01:27:12 +08:00
|
|
|
idle= [X86]
|
2013-02-10 14:38:39 +08:00
|
|
|
Format: idle=poll, idle=halt, idle=nomwait
|
2008-12-20 02:57:32 +08:00
|
|
|
Poll forces a polling idle loop that can slightly
|
|
|
|
improve the performance of waking up an idle CPU, but
|
|
|
|
will use a lot of power and make the system run hot.
|
|
|
|
Not recommended.
|
|
|
|
idle=halt: Halt is forced to be used for CPU idle.
|
2008-06-24 17:58:53 +08:00
|
|
|
In that case, C2/C3 won't be used again.
|
2008-12-20 02:57:32 +08:00
|
|
|
idle=nomwait: Disable mwait for CPU C-states
|
2005-10-24 03:57:11 +08:00
|
|
|
|
2021-01-23 02:46:00 +08:00
|
|
|
idxd.sva= [HW]
|
|
|
|
Format: <bool>
|
|
|
|
Allow force disabling of Shared Virtual Addressing (SVA)
|
|
|
|
support for the idxd driver. By default it is set to
|
|
|
|
true (1).
|
|
|
|
|
2015-11-13 08:48:29 +08:00
|
|
|
ieee754= [MIPS] Select IEEE Std 754 conformance mode
|
|
|
|
Format: { strict | legacy | 2008 | relaxed }
|
|
|
|
Default: strict
|
|
|
|
|
|
|
|
Choose which programs will be accepted for execution
|
|
|
|
based on the IEEE 754 NaN encoding(s) supported by
|
|
|
|
the FPU and the NaN encoding requested with the value
|
|
|
|
of an ELF file header flag individually set by each
|
|
|
|
binary. Hardware implementations are permitted to
|
|
|
|
support either or both of the legacy and the 2008 NaN
|
|
|
|
encoding mode.
|
|
|
|
|
|
|
|
Available settings are as follows:
|
|
|
|
strict accept binaries that request a NaN encoding
|
|
|
|
supported by the FPU
|
|
|
|
legacy only accept legacy-NaN binaries, if supported
|
|
|
|
by the FPU
|
|
|
|
2008 only accept 2008-NaN binaries, if supported
|
|
|
|
by the FPU
|
|
|
|
relaxed accept any binaries regardless of whether
|
|
|
|
supported by the FPU
|
|
|
|
|
|
|
|
The FPU emulator is always able to support both NaN
|
|
|
|
encodings, so if no FPU hardware is present or it has
|
|
|
|
been disabled with 'nofpu', then the settings of
|
|
|
|
'legacy' and '2008' strap the emulator accordingly,
|
|
|
|
'relaxed' straps the emulator for both legacy-NaN and
|
|
|
|
2008-NaN, whereas 'strict' enables legacy-NaN only on
|
|
|
|
legacy processors and both NaN encodings on MIPS32 or
|
|
|
|
MIPS64 CPUs.
|
|
|
|
|
|
|
|
The setting for ABS.fmt/NEG.fmt instruction execution
|
|
|
|
mode generally follows that for the NaN encoding,
|
|
|
|
except where unsupported by hardware.
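The acceptance rules for the four settings can be summarised as a small
decision function. The Python sketch below covers only the case where a
hardware FPU is present; the function name and the string labels "legacy" and
"2008" for the NaN encodings are illustrative choices, and the emulator-only
behaviour and the ABS.fmt/NEG.fmt mode described above are not modelled.

```python
NAN_MODES = ("legacy", "2008")


def binary_accepted(setting, binary_nan, fpu_nan_modes):
    """Decide whether a binary is accepted for execution.

    setting       -- one of 'strict', 'legacy', '2008', 'relaxed'
    binary_nan    -- NaN encoding requested by the binary's ELF header flag
    fpu_nan_modes -- set of NaN encodings the hardware FPU supports
    """
    if binary_nan not in NAN_MODES:
        raise ValueError("unknown NaN encoding: %r" % (binary_nan,))
    if setting == "relaxed":
        return True  # accept regardless of FPU support
    if setting == "strict":
        return binary_nan in fpu_nan_modes
    if setting in NAN_MODES:
        # Only binaries of that encoding, and only if the FPU supports it.
        return binary_nan == setting and setting in fpu_nan_modes
    raise ValueError("unknown ieee754= setting: %r" % (setting,))


if __name__ == "__main__":
    legacy_only_fpu = {"legacy"}
    for setting in ("strict", "legacy", "2008", "relaxed"):
        row = [binary_accepted(setting, nan, legacy_only_fpu)
               for nan in NAN_MODES]
        print(setting, dict(zip(NAN_MODES, row)))
```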
|
|
|
|
|
2006-12-07 12:40:51 +08:00
|
|
|
ignore_loglevel [KNL]
|
|
|
|
Ignore loglevel setting - this will print /all/
|
|
|
|
kernel messages to the console. Useful for debugging.
|
2011-11-01 08:11:25 +08:00
|
|
|
It is also available as a printk module parameter, so users
|
|
|
|
can change it dynamically, usually via
|
|
|
|
/sys/module/printk/parameters/ignore_loglevel.
|
2006-12-07 12:40:51 +08:00
|
|
|
|
2016-02-03 08:57:43 +08:00
|
|
|
ignore_rlimit_data
|
|
|
|
Ignore RLIMIT_DATA setting for data mappings,
|
|
|
|
print warning at first misuse. Can be changed via
|
|
|
|
/sys/module/kernel/parameters/ignore_rlimit_data.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
ihash_entries= [KNL]
|
|
|
|
Set number of hash buckets for inode cache.
|
|
|
|
|
ima: integrity appraisal extension
IMA currently maintains an integrity measurement list used to assert the
integrity of the running system to a third party. The IMA-appraisal
extension adds local integrity validation and enforcement of the
measurement against a "good" value stored as an extended attribute
'security.ima'. The initial methods for validating 'security.ima' are
hash based, which provides file data integrity, and digital signature
based, which in addition to providing file data integrity, provides
authenticity.
This patch creates and maintains the 'security.ima' xattr, containing
the file data hash measurement. Protection of the xattr is provided by
EVM, if enabled and configured.
Based on policy, IMA calls evm_verifyxattr() to verify a file's metadata
integrity and, assuming success, compares the file's current hash value
with the one stored as an extended attribute in 'security.ima'.
Changelog v4:
- changed iint cache flags to hex values
Changelog v3:
- change appraisal default for filesystems without xattr support to fail
Changelog v2:
- fix audit msg 'res' value
- removed unused 'ima_appraise=' values
Changelog v1:
- removed unused iint mutex (Dmitry Kasatkin)
- setattr hook must not reset appraised (Dmitry Kasatkin)
- evm_verifyxattr() now differentiates between no 'security.evm' xattr
(INTEGRITY_NOLABEL) and no EVM 'protected' xattrs included in the
'security.evm' (INTEGRITY_NOXATTRS).
- replace hash_status with ima_status (Dmitry Kasatkin)
- re-initialize slab element ima_status on free (Dmitry Kasatkin)
- include 'security.ima' in EVM if CONFIG_IMA_APPRAISE, not CONFIG_IMA
- merged half "ima: ima_must_appraise_or_measure API change" (Dmitry Kasatkin)
- removed unnecessary error variable in process_measurement() (Dmitry Kasatkin)
- use ima_inode_post_setattr() stub function, if IMA_APPRAISE not configured
(moved ima_inode_post_setattr() to ima_appraise.c)
- make sure ima_collect_measurement() can read file
Changelog:
- add 'iint' to evm_verifyxattr() call (Dmitry Kasatkin)
- fix the race condition between chmod, which takes the i_mutex and then
iint->mutex, and ima_file_free() and process_measurement(), which take
the locks in the reverse order, by eliminating iint->mutex. (Dmitry Kasatkin)
- cleanup of ima_appraise_measurement() (Dmitry Kasatkin)
- changes as a result of the iint not allocated for all regular files, but
only for those measured/appraised.
- don't try to appraise new/empty files
- expanded ima_appraisal description in ima/Kconfig
- IMA appraise definitions required even if IMA_APPRAISE not enabled
- add return value to ima_must_appraise() stub
- unconditionally set status = INTEGRITY_PASS *after* testing status,
not before. (Found by Joe Perches)
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
2012-02-13 23:15:05 +08:00
|
|
|
ima_appraise= [IMA] appraise integrity measurements
|
2014-05-08 18:11:29 +08:00
|
|
|
Format: { "off" | "enforce" | "fix" | "log" }
|
2012-02-13 23:15:05 +08:00
|
|
|
default: "enforce"
|
|
|
|
|
2019-04-05 02:23:22 +08:00
|
|
|
ima_appraise_tcb [IMA] Deprecated. Use ima_policy= instead.
|
ima: add appraise action keywords and default rules
Unlike the IMA measurement policy, the appraise policy can not be dependent
on runtime process information, such as the task uid, as the 'security.ima'
xattr is written on file close and must be updated each time the file changes,
regardless of the current task uid.
This patch extends the policy language with 'fowner', defines an appraise
policy, which appraises all files owned by root, and defines 'ima_appraise_tcb',
a new boot command line option, to enable the appraise policy.
Changelog v3:
- separate the measure from the appraise rules in order to support measuring
without appraising and appraising without measuring.
- change appraisal default for filesystems without xattr support to fail
- update default appraise policy for cgroups
Changelog v1:
- don't appraise RAMFS (Dmitry Kasatkin)
- merged rest of "ima: ima_must_appraise_or_measure API change" commit
(Dmitry Kasatkin)
ima_must_appraise_or_measure() called ima_match_policy twice, which
searched the policy for a matching rule. Once for a matching measurement
rule and subsequently for an appraisal rule. Searching the policy twice
is unnecessary overhead, which could be noticeable with a large policy.
The new version of ima_must_appraise_or_measure() does everything in a
single iteration using a new version of ima_match_policy(). It returns
IMA_MEASURE, IMA_APPRAISE mask.
With the use of action mask only one efficient matching function
is enough. Removed other specific versions of matching functions.
Changelog:
- change 'owner' to 'fowner' to conform to the new LSM conditions posted by
Roberto Sassu.
- fix calls to ima_log_string()
Signed-off-by: Mimi Zohar <zohar@us.ibm.com>
Signed-off-by: Dmitry Kasatkin <dmitry.kasatkin@intel.com>
2011-03-10 11:25:48 +08:00
|
|
|
The builtin appraise policy appraises all files
|
|
|
|
owned by uid=0.
|
|
|
|
|
2016-12-20 08:22:57 +08:00
|
|
|
ima_canonical_fmt [IMA]
|
|
|
|
Use the canonical format for the binary runtime
|
|
|
|
measurements, instead of host native format.
|
|
|
|
|
2009-02-04 22:06:58 +08:00
|
|
|
ima_hash= [IMA]
|
2013-06-07 18:16:37 +08:00
|
|
|
Format: { md5 | sha1 | rmd160 | sha256 | sha384
|
|
|
|
| sha512 | ... }
|
2009-02-04 22:06:58 +08:00
|
|
|
default: "sha1"
|
|
|
|
|
2013-06-07 18:16:37 +08:00
|
|
|
The list of supported hash algorithms is defined
|
|
|
|
in crypto/hash_info.h.
|
|
|
|
|
2015-06-12 08:48:33 +08:00
|
|
|
ima_policy= [IMA]
|
2017-04-25 00:04:09 +08:00
|
|
|
The builtin policies to load during IMA setup.
|
2018-02-22 00:36:32 +08:00
|
|
|
Format: "tcb | appraise_tcb | secure_boot |
|
2021-01-08 12:07:07 +08:00
|
|
|
fail_securely | critical_data"
|
2017-04-25 00:04:09 +08:00
|
|
|
|
|
|
|
The "tcb" policy measures all programs exec'd, files
|
|
|
|
mmap'd for exec, and all files opened with the read
|
|
|
|
mode bit set by either the effective uid (euid=0) or
|
|
|
|
uid=0.
|
|
|
|
|
|
|
|
The "appraise_tcb" policy appraises the integrity of
|
2019-04-05 02:23:22 +08:00
|
|
|
all files owned by root.
|
2015-06-12 08:48:33 +08:00
|
|
|
|
2017-04-22 06:58:27 +08:00
|
|
|
The "secure_boot" policy appraises the integrity
|
|
|
|
of files (e.g. kexec kernel image, kernel modules,
|
|
|
|
firmware, policy, etc) based on file signatures.
|
2015-06-12 08:48:33 +08:00
|
|
|
|
2018-02-22 00:36:32 +08:00
|
|
|
The "fail_securely" policy forces file signature
|
|
|
|
verification failure also on privileged mounted
|
|
|
|
filesystems with the SB_I_UNVERIFIABLE_SIGNATURE
|
|
|
|
flag.
|
|
|
|
|
2021-01-08 12:07:07 +08:00
|
|
|
The "critical_data" policy measures kernel integrity
|
|
|
|
critical data.
|
|
|
|
|
2015-06-12 08:48:33 +08:00
|
|
|
ima_tcb [IMA] Deprecated. Use ima_policy= instead.
|
2009-05-22 03:47:06 +08:00
|
|
|
Load a policy which meets the needs of the Trusted
|
|
|
|
Computing Base. This means IMA will measure all
|
|
|
|
programs exec'd, files mmap'd for exec, and all files
|
|
|
|
opened for read by uid=0.
|
|
|
|
|
2018-04-19 02:51:39 +08:00
|
|
|
ima_template= [IMA]
|
2013-06-07 18:16:35 +08:00
|
|
|
Select one of the defined IMA measurement template formats.
|
2015-04-11 23:07:03 +08:00
|
|
|
Formats: { "ima" | "ima-ng" | "ima-sig" }
|
2013-06-07 18:16:35 +08:00
|
|
|
Default: "ima-ng"
|
|
|
|
|
2014-10-13 20:08:42 +08:00
|
|
|
ima_template_fmt=
|
2018-04-19 02:51:39 +08:00
|
|
|
[IMA] Define a custom template format.
|
2014-10-13 20:08:42 +08:00
|
|
|
Format: { "field1|...|fieldN" }
|
|
|
|
|
2014-02-26 23:05:20 +08:00
|
|
|
ima.ahash_minsize= [IMA] Minimum file size for asynchronous hash usage
|
|
|
|
Format: <min_file_size>
|
|
|
|
Set the minimal file size for using asynchronous hash.
|
|
|
|
If left unspecified, ahash usage is disabled.
|
|
|
|
|
|
|
|
ahash performance varies for different data sizes on
|
|
|
|
different crypto accelerators. This option can be used
|
|
|
|
to achieve the best performance for a particular HW.
|
|
|
|
|
2014-05-06 19:47:13 +08:00
|
|
|
ima.ahash_bufsize= [IMA] Asynchronous hash buffer size
|
|
|
|
Format: <bufsize>
|
|
|
|
Set hashing buffer size. Default: 4k.
|
|
|
|
|
|
|
|
ahash performance varies for different chunk sizes on
|
|
|
|
different crypto accelerators. This option can be used
|
|
|
|
to achieve the best performance for a particular HW.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
init= [KNL]
|
|
|
|
Format: <full_path>
|
|
|
|
Run specified binary instead of /sbin/init as init
|
|
|
|
process.
|
|
|
|
|
|
|
|
initcall_debug [KNL] Trace initcalls as they are executed. Useful
|
|
|
|
for working out where the kernel is dying during
|
|
|
|
startup.
|
|
|
|
|
2014-06-05 07:12:17 +08:00
|
|
|
initcall_blacklist= [KNL] Do not execute a comma-separated list of
|
|
|
|
initcall functions. Useful for debugging built-in
|
|
|
|
modules and initcalls.
|
|
|
|
|
init/initramfs.c: do unpacking asynchronously
Patch series "background initramfs unpacking, and CONFIG_MODPROBE_PATH", v3.
These two patches are independent, but better-together.
The second is a rather trivial patch that simply allows the developer to
change "/sbin/modprobe" to something else - e.g. the empty string, so
that all request_module() during early boot return -ENOENT early, without
even spawning a usermode helper, needlessly synchronizing with the
initramfs unpacking.
The first patch delegates decompressing the initramfs to a worker thread,
allowing do_initcalls() in main.c to proceed to the device_ and late_
initcalls without waiting for that decompression (and populating of
rootfs) to finish. Obviously, some of those later calls may rely on the
initramfs being available, so I've added synchronization points in the
firmware loader and usermodehelper paths - there might be other places
that would need this, but so far no one has been able to think of any
places I have missed.
There's not much to win if most of the functionality needed during boot is
only available as modules. But systems with a custom-made .config and
initramfs can boot faster, partly due to utilizing more than one cpu
earlier, partly by avoiding known-futile modprobe calls (which would still
trigger synchronization with the initramfs unpacking, thus eliminating
most of the first benefit).
This patch (of 2):
Most of the boot process doesn't actually need anything from the
initramfs, until of course PID1 is to be executed. So instead of doing
the decompressing and populating of the initramfs synchronously in
populate_rootfs() itself, push that off to a worker thread.
This is primarily motivated by an embedded ppc target, where unpacking
even the rather modest sized initramfs takes 0.6 seconds, which is long
enough that the external watchdog becomes unhappy that it doesn't get
attention soon enough. By doing the initramfs decompression in a worker
thread, we get to do the device_initcalls and hence start petting the
watchdog much sooner.
Normal desktops might benefit as well. On my mostly stock Ubuntu kernel,
my initramfs is a 26M xz-compressed blob, decompressing to around 126M.
That takes almost two seconds:
[ 0.201454] Trying to unpack rootfs image as initramfs...
[ 1.976633] Freeing initrd memory: 29416K
Before this patch, these lines occur consecutively in dmesg. With this
patch, the timestamps on these two lines are roughly the same as above, but
with 172 lines in between - so more than one cpu has been kept busy doing
work that would otherwise only happen after the populate_rootfs()
finished.
Should one of the initcalls done after rootfs_initcall time (i.e., device_
and late_ initcalls) need something from the initramfs (say, a kernel
module or a firmware blob), it will simply wait for the initramfs
unpacking to be done before proceeding, which should in theory make this
completely safe.
But if some driver pokes around in the filesystem directly and not via one
of the official kernel interfaces (i.e. request_firmware*(),
call_usermodehelper*) that theory may not hold - also, I certainly might
have missed a spot when sprinkling wait_for_initramfs(). So there is an
escape hatch in the form of an initramfs_async= command line parameter.
Link: https://lkml.kernel.org/r/20210313212528.2956377-1-linux@rasmusvillemoes.dk
Link: https://lkml.kernel.org/r/20210313212528.2956377-2-linux@rasmusvillemoes.dk
Signed-off-by: Rasmus Villemoes <linux@rasmusvillemoes.dk>
Reviewed-by: Luis Chamberlain <mcgrof@kernel.org>
Cc: Jessica Yu <jeyu@kernel.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Takashi Iwai <tiwai@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-05-07 09:05:42 +08:00
|
|
|
initramfs_async= [KNL]
|
|
|
|
Format: <bool>
|
|
|
|
Default: 1
|
|
|
|
This parameter controls whether the initramfs
|
|
|
|
image is unpacked asynchronously, concurrently
|
|
|
|
with devices being probed and
|
|
|
|
initialized. This should normally just work,
|
|
|
|
but as a debugging aid, one can get the
|
|
|
|
historical behaviour of the initramfs
|
|
|
|
unpacking being completed before device_ and
|
|
|
|
late_ initcalls.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
initrd= [BOOT] Specify the location of the initial ramdisk
|
|
|
|
|
x86/setup: Add an initrdmem= option to specify initrd physical address
Add the initrdmem option:
initrdmem=ss[KMG],nn[KMG]
which is used to specify the physical address of the initrd, almost
always an address in FLASH. Also add code for x86 to use the existing
phys_init_start and phys_init_size variables in the kernel.
This is useful in cases where a kernel and an initrd are placed in FLASH,
but there is no firmware file system structure in the FLASH.
One such situation occurs when unused FLASH space on UEFI systems has
been reclaimed by, e.g., taking it from the Management Engine. For
example, on many systems, the ME is given half the FLASH part; not only
is 2.75M of an 8M part unused, but 10.75M of a 16M part is unused. This
space can be used to contain an initrd, but Linux needs to be told where
it is.
This space is "raw": due to, e.g., UEFI limitations, it cannot be added
to UEFI firmware volumes without rebuilding UEFI from source or writing
a UEFI device driver. It can be referenced only as a physical address
and size.
At the same time, if a kernel can be "netbooted" or loaded from GRUB or
syslinux, the option of not using the physical address specification
should be available.
Then, it is easy to boot the kernel and provide an initrd; or boot the
kernel and let it use the initrd in FLASH. In practice, this has
proven to be very helpful when integrating Linux into FLASH on x86.
Hence, the most flexible and convenient path is to enable the initrdmem
command line option in a way that it is the last choice tried.
For example, on the DigitalLoggers Atomic Pi, an image into FLASH can be
burnt in with a built-in command line which includes:
initrdmem=0xff968000,0x200000
which specifies a location and size.
[ bp: Massage commit message, make it passive. ]
[akpm@linux-foundation.org: coding style fixes]
Signed-off-by: Ronald G. Minnich <rminnich@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: H. Peter Anvin (Intel) <hpa@zytor.com>
Link: http://lkml.kernel.org/r/CAP6exYLK11rhreX=6QPyDQmW7wPHsKNEFtXE47pjx41xS6O7-A@mail.gmail.com
Link: https://lkml.kernel.org/r/20200426011021.1cskg0AGd%akpm@linux-foundation.org
2020-04-26 09:10:21 +08:00
|
|
|
initrdmem= [KNL] Specify a physical address and size from which to
|
|
|
|
load the initrd. If an initrd is compiled in or
|
|
|
|
specified in the bootparams, it takes priority over this
|
|
|
|
setting.
|
|
|
|
Format: ss[KMG],nn[KMG]
|
|
|
|
Default is 0, 0
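As an illustration of the ss[KMG],nn[KMG] format, the Python sketch below
splits a value into a start address and a size. The function names are
hypothetical, K/M/G are assumed to mean powers of 1024, and a 0x prefix is
assumed to select hexadecimal. This is a reading aid for the format only, not
a copy of the kernel's own parser.

```python
_SHIFT = {"K": 10, "M": 20, "G": 30}  # assumed powers of 1024


def parse_mem(text):
    """Parse a number with an optional K, M or G suffix into bytes."""
    text = text.strip()
    shift = 0
    if text and text[-1].upper() in _SHIFT:
        shift = _SHIFT[text[-1].upper()]
        text = text[:-1]
    # Base 0 lets int() accept both decimal and 0x-prefixed hexadecimal.
    return int(text, 0) << shift


def parse_initrdmem(value):
    """Split 'ss[KMG],nn[KMG]' into (start_address, size_in_bytes)."""
    start, sep, size = value.partition(",")
    if not sep:
        raise ValueError("expected ss[KMG],nn[KMG]")
    return parse_mem(start), parse_mem(size)


if __name__ == "__main__":
    start, size = parse_initrdmem("0xff968000,0x200000")
    print(f"start={start:#x} size={size:#x} ({size >> 20} MiB)")
```

With the value quoted in the commit message for this parameter,
0xff968000,0x200000, the size works out to 2 MiB.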
|
|
|
|
|
mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options
Patch series "add init_on_alloc/init_on_free boot options", v10.
Provide init_on_alloc and init_on_free boot options.
These are aimed at preventing possible information leaks and making the
control-flow bugs that depend on uninitialized values more deterministic.
Enabling either of the options guarantees that the memory returned by the
page allocator and SL[AU]B is initialized with zeroes. SLOB allocator
isn't supported at the moment, as its emulation of kmem caches complicates
handling of SLAB_TYPESAFE_BY_RCU caches correctly.
Enabling init_on_free also guarantees that pages and heap objects are
initialized right after they're freed, so it won't be possible to access
stale data by using a dangling pointer.
As suggested by Michal Hocko, right now we don't let heap users
disable initialization for certain allocations. There's not enough
evidence that doing so can speed up real-life cases, and introducing ways
to opt-out may result in things going out of control.
This patch (of 2):
The new options are needed to prevent possible information leaks and make
control-flow bugs that depend on uninitialized values more deterministic.
This is expected to be on-by-default on Android and Chrome OS. And it
gives the opportunity for anyone else to use it under distros too via the
boot args. (The init_on_free feature is regularly requested by folks
where memory forensics is included in their threat models.)
init_on_alloc=1 makes the kernel initialize newly allocated pages and heap
objects with zeroes. Initialization is done at allocation time at the
places where checks for __GFP_ZERO are performed.
init_on_free=1 makes the kernel initialize freed pages and heap objects
with zeroes upon their deletion. This helps to ensure sensitive data
doesn't leak via use-after-free accesses.
Both init_on_alloc=1 and init_on_free=1 guarantee that the allocator
returns zeroed memory. The two exceptions are slab caches with
constructors and SLAB_TYPESAFE_BY_RCU flag. Those are never
zero-initialized to preserve their semantics.
Both init_on_alloc and init_on_free default to zero, but those defaults
can be overridden with CONFIG_INIT_ON_ALLOC_DEFAULT_ON and
CONFIG_INIT_ON_FREE_DEFAULT_ON.
If either SLUB poisoning or page poisoning is enabled, those options take
precedence over init_on_alloc and init_on_free: initialization is only
applied to unpoisoned allocations.
Slowdown for the new features compared to init_on_free=0, init_on_alloc=0:
hackbench, init_on_free=1: +7.62% sys time (st.err 0.74%)
hackbench, init_on_alloc=1: +7.75% sys time (st.err 2.14%)
Linux build with -j12, init_on_free=1: +8.38% wall time (st.err 0.39%)
Linux build with -j12, init_on_free=1: +24.42% sys time (st.err 0.52%)
Linux build with -j12, init_on_alloc=1: -0.13% wall time (st.err 0.42%)
Linux build with -j12, init_on_alloc=1: +0.57% sys time (st.err 0.40%)
The slowdown for init_on_free=0, init_on_alloc=0 compared to the baseline
is within the standard error.
The new features are also going to pave the way for hardware memory
tagging (e.g. arm64's MTE), which will require both on_alloc and on_free
hooks to set the tags for heap objects. With MTE, tagging will have the
same cost as memory initialization.
Although init_on_free is rather costly, there are paranoid use-cases where
in-memory data lifetime is desired to be minimized. There are various
arguments for/against the realism of the associated threat models, but
given that we'll need the infrastructure for MTE anyway, and there are
people who want wipe-on-free behavior no matter what the performance cost,
it seems reasonable to include it in this series.
[glider@google.com: v8]
Link: http://lkml.kernel.org/r/20190626121943.131390-2-glider@google.com
[glider@google.com: v9]
Link: http://lkml.kernel.org/r/20190627130316.254309-2-glider@google.com
[glider@google.com: v10]
Link: http://lkml.kernel.org/r/20190628093131.199499-2-glider@google.com
Link: http://lkml.kernel.org/r/20190617151050.92663-2-glider@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Acked-by: Kees Cook <keescook@chromium.org>
Acked-by: Michal Hocko <mhocko@suse.cz> [page and dmapool parts]
Acked-by: James Morris <jamorris@linux.microsoft.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
Cc: "Serge E. Hallyn" <serge@hallyn.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Kostya Serebryany <kcc@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Sandeep Patil <sspatil@android.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Jann Horn <jannh@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Marco Elver <elver@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-07-12 11:59:19 +08:00
|
|
|
init_on_alloc= [MM] Fill newly allocated pages and heap objects with
|
|
|
|
zeroes.
|
|
|
|
Format: 0 | 1
|
|
|
|
Default set by CONFIG_INIT_ON_ALLOC_DEFAULT_ON.
|
|
|
|
|
|
|
|
init_on_free= [MM] Fill freed pages and heap objects with zeroes.
|
|
|
|
Format: 0 | 1
|
|
|
|
Default set by CONFIG_INIT_ON_FREE_DEFAULT_ON.
|
|
|
|
|
2020-08-10 10:49:41 +08:00
|
|
|
init_pkru= [X86] Specify the default memory protection keys rights
|
2016-07-30 00:30:21 +08:00
|
|
|
register contents for all processes. 0x55555554 by
|
|
|
|
default (disallow access to all but pkey 0). Can
|
|
|
|
override in debugfs after boot.
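To illustrate why 0x55555554 leaves only pkey 0 accessible, the sketch below decodes a PKRU value into per-key rights. It assumes the x86 PKRU layout of 16 keys with two bits each (bit 2*i is access-disable, bit 2*i+1 is write-disable). The name decode_pkru is hypothetical and is not a kernel interface.

```python
def decode_pkru(value):
    """Map each pkey (0..15) to (access_disabled, write_disabled).

    Assumed layout: bit 2*i is access-disable, bit 2*i+1 is write-disable.
    """
    rights = {}
    for pkey in range(16):
        bits = (value >> (2 * pkey)) & 0b11
        rights[pkey] = (bool(bits & 0b01), bool(bits & 0b10))
    return rights


DEFAULT_INIT_PKRU = 0x55555554  # default quoted in the init_pkru= entry

rights = decode_pkru(DEFAULT_INIT_PKRU)
# Collect the keys whose access-disable bit is clear.
accessible = [pkey for pkey, (ad, _wd) in rights.items() if not ad]
print(accessible)  # prints [0]
```

With the default value, every key except pkey 0 has its access-disable bit set, which matches the "disallow access to all but pkey 0" description.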
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
inport.irq= [HW] Inport (ATI XL and Microsoft) busmouse driver
|
|
|
|
Format: <irq>
|
|
|
|
|
2020-08-10 10:49:41 +08:00
|
|
|
int_pln_enable [X86] Enable power limit notification interrupt
|
2013-05-22 03:35:17 +08:00
|
|
|
|
2013-03-19 02:48:02 +08:00
|
|
|
integrity_audit=[IMA]
|
|
|
|
Format: { "0" | "1" }
|
|
|
|
0 -- basic integrity auditing messages. (Default)
|
|
|
|
1 -- additional integrity auditing messages.
|
|
|
|
|
2007-10-22 07:41:49 +08:00
|
|
|
intel_iommu= [DMAR] Intel IOMMU driver (DMAR) option
|
2009-02-05 06:29:19 +08:00
|
|
|
on
|
|
|
|
Enable intel iommu driver.
|
2007-10-22 07:41:49 +08:00
|
|
|
off
|
|
|
|
Disable intel iommu driver.
|
|
|
|
igfx_off [Default Off]
|
|
|
|
By default, gfx is mapped as a normal device. If a gfx
|
|
|
|
device has a dedicated DMAR unit, the DMAR unit is
|
|
|
|
bypassed by not enabling DMAR with this option. In
|
|
|
|
this case, gfx device will use physical address for
|
|
|
|
DMA.
|
2008-03-05 07:22:08 +08:00
|
|
|
strict [Default Off]
|
|
|
|
With this option on, every unmap_single operation will
|
|
|
|
result in a hardware IOTLB flush operation as opposed
|
|
|
|
to batching them for performance.
|
2011-05-26 02:13:49 +08:00
|
|
|
sp_off [Default Off]
|
|
|
|
By default, super page will be supported if Intel IOMMU
|
|
|
|
has the capability. With this option, super page will
|
|
|
|
not be supported.
|
2019-01-24 10:31:32 +08:00
|
|
|
sm_on [Default Off]
|
|
|
|
By default, scalable mode will be disabled even if the
|
2018-12-10 09:58:55 +08:00
|
|
|
hardware advertises that it has support for the scalable
|
|
|
|
mode translation. With this option set, scalable mode
|
2019-01-24 10:31:32 +08:00
|
|
|
will be used on hardware which claims to support it.
|
2017-04-27 00:18:35 +08:00
|
|
|
tboot_noforce [Default Off]
|
|
|
|
Do not force the Intel IOMMU enabled under tboot.
|
|
|
|
By default, tboot will force Intel IOMMU on, which
|
|
|
|
could harm performance of some high-throughput
|
|
|
|
devices like 40GBit network cards, even if identity
|
|
|
|
mapping is enabled.
|
|
|
|
Note that using this option lowers the security
|
|
|
|
provided by tboot because it makes the system
|
|
|
|
vulnerable to DMA attacks.
|
2011-12-15 00:18:52 +08:00
|
|
|
|
|
|
|
intel_idle.max_cstate= [KNL,HW,ACPI,X86]
|
|
|
|
0 disables intel_idle and falls back on acpi_idle.
|
2016-07-11 09:57:37 +08:00
|
|
|
1 to 9 specify maximum depth of C-state.
|
2011-12-15 00:18:52 +08:00
|
|
|
|
2018-04-19 02:51:39 +08:00
|
|
|
intel_pstate= [X86]
|
|
|
|
disable
|
|
|
|
Do not enable intel_pstate as the default
|
|
|
|
scaling driver for the supported processors
|
|
|
|
passive
|
|
|
|
Use intel_pstate as a scaling driver, but configure it
|
|
|
|
to work with generic cpufreq governors (instead of
|
|
|
|
enabling its internal governor). This mode cannot be
|
|
|
|
used along with the hardware-managed P-states (HWP)
|
|
|
|
feature.
|
|
|
|
force
|
|
|
|
Enable intel_pstate on systems that prohibit it by default
|
|
|
|
in favor of acpi-cpufreq. Forcing the intel_pstate driver
|
|
|
|
instead of acpi-cpufreq may disable platform features, such
|
|
|
|
as thermal controls and power capping, that rely on ACPI
|
|
|
|
P-States information being indicated to OSPM and therefore
|
|
|
|
should be used with caution. This option does not work with
|
|
|
|
processors that aren't supported by the intel_pstate driver
|
|
|
|
or on platforms that use pcc-cpufreq instead of acpi-cpufreq.
|
|
|
|
no_hwp
|
|
|
|
Do not enable hardware P state control (HWP)
|
|
|
|
if available.
|
|
|
|
hwp_only
|
|
|
|
Only load intel_pstate on systems which support
|
|
|
|
hardware P state control (HWP) if available.
|
|
|
|
support_acpi_ppc
|
|
|
|
Enforce ACPI _PPC performance limits. If the Fixed ACPI
|
|
|
|
Description Table specifies the preferred power management
|
|
|
|
profile as "Enterprise Server" or "Performance Server",
|
|
|
|
then this feature is turned on by default.
|
|
|
|
per_cpu_perf_limits
|
|
|
|
Allow per-logical-CPU P-State performance control limits using
|
|
|
|
cpufreq sysfs interface
|
2013-02-16 05:55:10 +08:00
|
|
|
|
2010-07-21 02:06:49 +08:00
|
|
|
intremap= [X86-64, Intel-IOMMU]
|
|
|
|
on enable Interrupt Remapping (default)
|
|
|
|
off disable Interrupt Remapping
|
|
|
|
nosid disable Source ID checking
|
2011-08-24 08:05:18 +08:00
|
|
|
no_x2apic_optout
|
|
|
|
BIOS x2APIC opt-out request will be ignored
|
2015-09-18 22:29:56 +08:00
|
|
|
nopost disable Interrupt Posting
|
2010-07-21 02:06:49 +08:00
|
|
|
|
2009-04-06 06:55:22 +08:00
|
|
|
iomem= Disable strict checking of access to MMIO memory
|
|
|
|
strict regions from userspace.
|
|
|
|
relaxed
|
|
|
|
|
2020-08-10 10:49:41 +08:00
|
|
|
iommu= [X86]
|
2009-04-06 06:55:22 +08:00
|
|
|
off
|
|
|
|
force
|
|
|
|
noforce
|
|
|
|
biomerge
|
|
|
|
panic
|
|
|
|
nopanic
|
|
|
|
merge
|
|
|
|
nomerge
|
|
|
|
soft
|
2020-08-10 10:49:41 +08:00
|
|
|
pt [X86]
|
|
|
|
nopt [X86]
|
2014-10-24 05:19:35 +08:00
|
|
|
nobypass [PPC/POWERNV]
|
|
|
|
Disable IOMMU bypass, using IOMMU for PCI devices.
|
2011-10-22 03:56:24 +08:00
|
|
|
|
2021-03-06 00:32:34 +08:00
|
|
|
iommu.forcedac= [ARM64, X86] Control IOVA allocation for PCI devices.
|
|
|
|
Format: { "0" | "1" }
|
|
|
|
0 - Try to allocate a 32-bit DMA address first, before
|
|
|
|
falling back to the full range if needed.
|
|
|
|
1 - Allocate directly from the full usable range,
|
|
|
|
forcing Dual Address Cycle for PCI cards supporting
|
|
|
|
greater than 32-bit addressing.
|
|
|
|
|
2021-06-14 22:57:26 +08:00
|
|
|
iommu.strict= [ARM64, X86] Configure TLB invalidation behaviour
|
2018-09-21 00:10:23 +08:00
|
|
|
Format: { "0" | "1" }
|
|
|
|
0 - Lazy mode.
|
|
|
|
Request that DMA unmap operations use deferred
|
|
|
|
invalidation of hardware TLBs, for increased
|
|
|
|
throughput at the cost of reduced device isolation.
|
|
|
|
Will fall back to strict mode if not supported by
|
|
|
|
the relevant IOMMU driver.
|
|
|
|
1 - Strict mode (default).
|
|
|
|
DMA unmap operations invalidate IOMMU hardware TLBs
|
|
|
|
synchronously.
|
2021-06-14 22:57:26 +08:00
|
|
|
Note: on x86, the default behaviour depends on the
|
|
|
|
equivalent driver-specific parameters, but a strict
|
|
|
|
mode explicitly specified by either method takes
|
|
|
|
precedence.
|
2018-09-21 00:10:23 +08:00
|
|
|
|
2017-01-06 02:38:26 +08:00
|
|
|
iommu.passthrough=
|
2019-08-19 21:22:56 +08:00
|
|
|
[ARM64, X86] Configure DMA to bypass the IOMMU by default.
|
2017-01-06 02:38:26 +08:00
|
|
|
Format: { "0" | "1" }
|
|
|
|
0 - Use IOMMU translation for DMA.
|
|
|
|
1 - Bypass the IOMMU for DMA.
|
2018-09-20 21:14:26 +08:00
|
|
|
unset - Use value of CONFIG_IOMMU_DEFAULT_PASSTHROUGH.
|
2009-04-06 06:55:22 +08:00
|
|
|
|
2020-09-18 13:47:51 +08:00
|
|
|
io7= [HW] IO7 for Marvel-based Alpha systems
|
2009-04-06 06:55:22 +08:00
|
|
|
See comment before marvel_specify_io7 in
|
|
|
|
arch/alpha/kernel/core_marvel.c.
|
|
|
|
|
2009-04-14 16:33:43 +08:00
|
|
|
io_delay= [X86] I/O delay method
|
2008-01-30 20:30:05 +08:00
|
|
|
0x80
|
|
|
|
Standard port 0x80 based delay
|
|
|
|
0xed
|
|
|
|
Alternate port 0xed based delay (needed on some systems)
|
2008-01-30 20:30:05 +08:00
|
|
|
udelay
|
2008-01-30 20:30:05 +08:00
|
|
|
Simple two microseconds delay
|
|
|
|
none
|
|
|
|
No delay
|
2008-01-30 20:30:05 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
ip= [IP_PNP]
|
2020-02-13 02:13:32 +08:00
|
|
|
See Documentation/admin-guide/nfs/nfsroot.rst.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2019-05-15 06:46:29 +08:00
|
|
|
ipcmni_extend [KNL] Extend the maximum number of unique System V
|
|
|
|
IPC identifiers from 32,768 to 16,777,216.
|
|
|
|
|
2016-02-04 02:52:23 +08:00
|
|
|
irqaffinity= [SMP] Set the default irq affinity mask
|
2016-10-12 04:51:35 +08:00
|
|
|
The argument is a cpu list, as described above.
|
2016-02-04 02:52:23 +08:00
|
|
|
|
2017-10-27 16:34:22 +08:00
|
|
|
irqchip.gicv2_force_probe=
|
|
|
|
[ARM, ARM64]
|
|
|
|
Format: <bool>
|
|
|
|
Force the kernel to look for the second 4kB page
|
|
|
|
of a GICv2 controller even if the memory range
|
|
|
|
exposed by the device tree is too small.
|
|
|
|
|
2018-02-25 19:27:04 +08:00
|
|
|
irqchip.gicv3_nolpi=
|
|
|
|
[ARM, ARM64]
|
|
|
|
Force the kernel to ignore the availability of
|
|
|
|
LPIs (and by consequence ITSs). Intended for systems
|
|
|
|
that use the kernel as a bootloader, and thus want
|
|
|
|
to leave secondary kernels in charge of setting up
|
|
|
|
LPIs.
|
|
|
|
|
2019-01-31 22:59:03 +08:00
|
|
|
irqchip.gicv3_pseudo_nmi= [ARM64]
|
|
|
|
Enables support for pseudo-NMIs in the kernel. This
|
|
|
|
requires the kernel to be built with
|
|
|
|
CONFIG_ARM64_PSEUDO_NMI.
|
|
|
|
|
2005-06-29 11:45:18 +08:00
|
|
|
irqfixup [HW]
|
|
|
|
When an interrupt is not handled search all handlers
|
|
|
|
for it. Intended to get systems with badly broken
|
|
|
|
firmware running.
|
|
|
|
|
|
|
|
irqpoll [HW]
|
|
|
|
When an interrupt is not handled search all handlers
|
|
|
|
for it. Also check all handlers each timer
|
|
|
|
interrupt. Intended to get systems with badly broken
|
|
|
|
firmware running.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
isapnp= [ISAPNP]
|
2005-10-24 03:57:11 +08:00
|
|
|
Format: <RDP>,<reset>,<pci_scan>,<verbosity>
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2017-12-15 02:18:27 +08:00
|
|
|
isolcpus= [KNL,SMP,ISOL] Isolate a given set of CPUs from disturbance.
|
2017-10-31 11:18:34 +08:00
|
|
|
[Deprecated - use cpusets instead]
|
|
|
|
Format: [flag-list,]<cpu-list>
|
|
|
|
|
|
|
|
Specify one or more CPUs to isolate from disturbances
|
|
|
|
specified in the flag list (default: domain):
|
|
|
|
|
|
|
|
nohz
|
|
|
|
Disable the tick when a single task runs.
|
2018-02-21 12:17:29 +08:00
|
|
|
|
|
|
|
A residual 1Hz tick is offloaded to workqueues, which you
|
|
|
|
need to affine to housekeeping through the global
|
|
|
|
workqueue's affinity configured via the
|
|
|
|
/sys/devices/virtual/workqueue/cpumask sysfs file, or
|
|
|
|
by using the 'domain' flag described below.
|
|
|
|
|
|
|
|
NOTE: by default the global workqueue runs on all CPUs,
|
|
|
|
so to protect individual CPUs the 'cpumask' file has to
|
|
|
|
be configured manually after bootup.
|
|
|
|
|
2017-10-31 11:18:34 +08:00
|
|
|
domain
|
|
|
|
Isolate from the general SMP balancing and scheduling
|
|
|
|
algorithms. Note that performing domain isolation this way
|
|
|
|
is irreversible: it's not possible to bring back a CPU to
|
|
|
|
the domains once isolated through isolcpus. It's strongly
|
|
|
|
advised to use cpusets instead to disable scheduler load
|
|
|
|
balancing through the "cpuset.sched_load_balance" file.
|
|
|
|
It offers a much more flexible interface where CPUs can
|
|
|
|
move in and out of an isolated set anytime.
|
|
|
|
|
|
|
|
You can move a process onto or off an "isolated" CPU via
|
|
|
|
the CPU affinity syscalls or cpuset.
|
|
|
|
<cpu number> begins at 0 and the maximum value is
|
|
|
|
"number of CPUs in system - 1".
|
|
|
|
|
2020-01-20 17:16:25 +08:00
|
|
|
managed_irq
|
|
|
|
|
|
|
|
Isolate from being targeted by managed interrupts
|
|
|
|
which have an interrupt mask containing isolated
|
|
|
|
CPUs. The affinity of managed interrupts is
|
|
|
|
handled by the kernel and cannot be changed via
|
|
|
|
the /proc/irq/* interfaces.
|
|
|
|
|
|
|
|
This isolation is best effort and only effective
|
|
|
|
if the automatically assigned interrupt mask of a
|
|
|
|
device queue contains isolated and housekeeping
|
|
|
|
CPUs. If housekeeping CPUs are online then such
|
|
|
|
interrupts are directed to the housekeeping CPU
|
|
|
|
so that IO submitted on the housekeeping CPU
|
|
|
|
cannot disturb the isolated CPU.
|
|
|
|
|
|
|
|
If a queue's affinity mask contains only isolated
|
|
|
|
CPUs then this parameter has no effect on the
|
|
|
|
interrupt routing decision, though interrupts are
|
|
|
|
only delivered when tasks running on those
|
|
|
|
isolated CPUs submit IO. IO submitted on
|
|
|
|
housekeeping CPUs has no influence on those
|
|
|
|
queues.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2020-01-20 17:16:25 +08:00
|
|
|
The format of <cpu-list> is described above.
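The "[flag-list,]<cpu-list>" format can be sketched as a small parser. This is a simplified illustration, not the kernel's parser: it accepts only the three flags listed above, and the cpu-list part handles only single numbers and N-M ranges rather than the full cpu-list syntax. The function names and the example values are hypothetical.

```python
KNOWN_FLAGS = {"nohz", "domain", "managed_irq"}


def parse_cpu_list(tokens):
    """Expand tokens such as ['2-5', '8'] into a sorted list of CPU numbers."""
    cpus = set()
    for token in tokens:
        if "-" in token:
            first, last = token.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(token))
    return sorted(cpus)


def parse_isolcpus(value):
    """Split an isolcpus= value into (flags, cpus)."""
    tokens = [t for t in value.split(",") if t]
    flags = []
    # Leading tokens that name a known flag belong to the flag list.
    while tokens and tokens[0] in KNOWN_FLAGS:
        flags.append(tokens.pop(0))
    if not tokens:
        raise ValueError("isolcpus= needs a cpu-list")
    # With no flag list given, the documented default is 'domain'.
    return (flags or ["domain"]), parse_cpu_list(tokens)


print(parse_isolcpus("nohz,domain,2-5,8"))
print(parse_isolcpus("1,3"))
```

The second call shows the default: with no flags given, the result carries the single flag "domain".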
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2005-10-24 03:57:11 +08:00
|
|
|
iucv= [HW,NET]
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2020-08-10 10:49:41 +08:00
|
|
|
ivrs_ioapic [HW,X86-64]
|
2013-04-10 03:27:19 +08:00
|
|
|
Provide an override to the IOAPIC-ID<->DEVICE-ID
|
|
|
|
mapping provided in the IVRS ACPI table. For
|
|
|
|
example, to map IOAPIC-ID decimal 10 to
|
|
|
|
PCI device 00:14.0 write the parameter as:
|
|
|
|
ivrs_ioapic[10]=00:14.0
|
|
|
|
|
2020-08-10 10:49:41 +08:00
|
|
|
ivrs_hpet [HW,X86-64]
|
2013-04-10 03:27:19 +08:00
|
|
|
Provide an override to the HPET-ID<->DEVICE-ID
|
|
|
|
mapping provided in the IVRS ACPI table. For
|
|
|
|
example, to map HPET-ID decimal 0 to
|
|
|
|
PCI device 00:14.0 write the parameter as:
|
|
|
|
ivrs_hpet[0]=00:14.0
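The ivrs_ioapic and ivrs_hpet examples above share one shape: a decimal ID in brackets and a PCI device written as bus:device.function. A minimal sketch of how such an argument could be taken apart follows. It assumes the bus and device fields are two hexadecimal digits each, as in the examples; the function name is hypothetical and the kernel's own parsing may accept other forms.

```python
import re

# Pattern modeled on the two examples in the text, e.g. ivrs_ioapic[10]=00:14.0
IVRS_RE = re.compile(
    r"^ivrs_(ioapic|hpet)\[(\d+)\]=([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$"
)


def parse_ivrs_override(arg):
    """Return (kind, id, (bus, device, function)) for an override argument."""
    match = IVRS_RE.match(arg)
    if match is None:
        raise ValueError("unrecognized override: %r" % arg)
    kind, ident, bus, dev, fn = match.groups()
    # The ID is decimal; the PCI address fields are hexadecimal.
    return kind, int(ident), (int(bus, 16), int(dev, 16), int(fn, 16))


print(parse_ivrs_override("ivrs_ioapic[10]=00:14.0"))
print(parse_ivrs_override("ivrs_hpet[0]=00:14.0"))
```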
|
|
|
|
|
2020-08-10 10:49:41 +08:00
|
|
|
ivrs_acpihid [HW,X86-64]
|
2016-04-01 21:06:01 +08:00
|
|
|
Provide an override to the ACPI-HID:UID<->DEVICE-ID
|
|
|
|
mapping provided in the IVRS ACPI table. For
|
|
|
|
example, to map UART-HID:UID AMD0020:0 to
|
|
|
|
PCI device 00:14.5 write the parameter as:
|
|
|
|
ivrs_acpihid[00:14.5]=AMD0020:0
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
js= [HW,JOY] Analog joystick
|
2017-10-11 01:36:23 +08:00
|
|
|
See Documentation/input/joydev/joystick.rst.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2016-06-14 06:10:02 +08:00
|
|
|
nokaslr [KNL]
|
|
|
|
When CONFIG_RANDOMIZE_BASE is set, this disables
|
|
|
|
kernel and module base offset ASLR (Address Space
|
|
|
|
Layout Randomization).
|
2014-06-14 04:30:36 +08:00
|
|
|
|
2017-04-01 06:12:04 +08:00
|
|
|
kasan_multi_shot
|
|
|
|
[KNL] Force KASAN (Kernel Address Sanitizer) to print a
|
|
|
|
report on every invalid memory access. Without this
|
|
|
|
parameter KASAN will print a report only for the first
|
|
|
|
invalid access.
|
|
|
|
|
2009-04-06 06:55:22 +08:00
|
|
|
keepinitrd [HW,ARM]
|
|
|
|
|
2016-03-16 05:55:22 +08:00
|
|
|
kernelcore= [KNL,X86,IA-64,PPC]
|
mm, page_alloc: extend kernelcore and movablecore for percent
Both kernelcore= and movablecore= can be used to define the amount of
ZONE_NORMAL and ZONE_MOVABLE on a system, respectively. This requires
the system memory capacity to be known when specifying the command line,
however.
This introduces the ability to define both kernelcore= and movablecore=
as a percentage of total system memory. This is convenient for systems
software that wants to define the amount of ZONE_MOVABLE, for example,
as a proportion of a system's memory rather than a hardcoded byte value.
To define the percentage, the final character of the parameter should be
a '%'.
mhocko: "why is anyone using these options nowadays?"
rientjes:
:
: Fragmentation of non-__GFP_MOVABLE pages due to low on memory
: situations can pollute most pageblocks on the system, as much as 1GB of
: slab being fragmented over 128GB of memory, for example. When the
: amount of kernel memory is well bounded for certain systems, it is
: better to aggressively reclaim from existing MIGRATE_UNMOVABLE
: pageblocks rather than eagerly fallback to others.
:
: We have additional patches that help with this fragmentation if you're
: interested, specifically kcompactd compaction of MIGRATE_UNMOVABLE
: pageblocks triggered by fallback of non-__GFP_MOVABLE allocations and
: draining of pcp lists back to the zone free area to prevent stranding.
[rientjes@google.com: updates]
Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1802131700160.71590@chino.kir.corp.google.com
Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1802121622470.179479@chino.kir.corp.google.com
Signed-off-by: David Rientjes <rientjes@google.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-06 07:23:09 +08:00
|
|
|
Format: nn[KMGTPE] | nn% | "mirror"
|
|
|
|
This parameter specifies the amount of memory usable by
|
|
|
|
the kernel for non-movable allocations. The requested
|
|
|
|
amount is spread evenly throughout all nodes in the
|
|
|
|
system as ZONE_NORMAL. The remaining memory is used for
|
|
|
|
movable memory in its own zone, ZONE_MOVABLE. In the
|
|
|
|
event a node is too small to have both ZONE_NORMAL and
|
|
|
|
ZONE_MOVABLE, kernelcore memory will take priority and
|
|
|
|
other nodes will have a larger ZONE_MOVABLE.
|
|
|
|
|
|
|
|
ZONE_MOVABLE is used for the allocation of pages that
|
|
|
|
may be reclaimed or moved by the page migration
|
|
|
|
subsystem. Note that allocations like PTEs-from-HighMem
|
|
|
|
still use the HighMem zone if it exists, and the Normal
|
2007-07-17 19:03:14 +08:00
|
|
|
zone if it does not.
|
|
|
|
|
2018-04-06 07:23:09 +08:00
|
|
|
It is possible to specify the exact amount of memory in
|
|
|
|
the form of "nn[KMGTPE]", a percentage of total system
|
|
|
|
memory in the form of "nn%", or "mirror". If "mirror"
|
2016-03-16 05:55:22 +08:00
|
|
|
option is specified, mirrored (reliable) memory is used
|
|
|
|
for non-movable allocations and remaining memory is used
|
2018-04-06 07:23:09 +08:00
|
|
|
for Movable pages. "nn[KMGTPE]", "nn%", and "mirror"
|
|
|
|
are exclusive, so you cannot specify multiple forms.
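The three mutually exclusive forms of the value can be sketched as follows. This is an illustration only: it assumes the K/M/G/T/P/E suffixes are powers of 1024 and that a percentage applies to total system memory, passed in here as the hypothetical parameter total_bytes. The function name is likewise hypothetical.

```python
SUFFIX_SHIFT = {"K": 10, "M": 20, "G": 30, "T": 40, "P": 50, "E": 60}


def parse_kernelcore(value, total_bytes):
    """Classify a kernelcore= value as ('mirror' | 'percent' | 'bytes', amount)."""
    if not value:
        raise ValueError("empty kernelcore= value")
    if value == "mirror":
        # Mirrored (reliable) memory is used; no byte amount applies.
        return ("mirror", None)
    if value.endswith("%"):
        # A trailing '%' selects a share of total system memory.
        return ("percent", total_bytes * int(value[:-1]) // 100)
    suffix = value[-1].upper()
    if suffix in SUFFIX_SHIFT:
        return ("bytes", int(value[:-1]) << SUFFIX_SHIFT[suffix])
    return ("bytes", int(value))


total = 128 << 30  # example system with 128 GiB of memory
print(parse_kernelcore("4G", total))
print(parse_kernelcore("25%", total))
print(parse_kernelcore("mirror", total))
```

Because each branch returns immediately, a value can match only one form, which mirrors the rule that the forms cannot be combined.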
|
2007-07-17 19:03:14 +08:00
|
|
|
|
2010-05-21 10:04:31 +08:00
|
|
|
kgdbdbgp= [KGDB,HW] kgdb over EHCI usb debug port.
|
|
|
|
Format: <Controller#>[,poll interval]
|
|
|
|
The controller # is the number of the ehci usb debug
|
|
|
|
port as it is probed via PCI. The poll interval is
|
|
|
|
optional and is the number of seconds between
|
|
|
|
each poll cycle to the debug port in case you need
|
|
|
|
the functionality for interrupting the kernel with
|
|
|
|
gdb or control-c on the dbgp connection. When
|
|
|
|
not using this parameter you use sysrq-g to break into
|
|
|
|
the kernel debugger.
|
|
|
|
|
2010-05-21 10:04:24 +08:00
|
|
|
kgdboc= [KGDB,HW] kgdb over consoles.
|
2010-05-21 10:04:24 +08:00
|
|
|
Requires a tty driver that supports console polling,
|
|
|
|
or a supported polling keyboard driver (non-usb).
|
2010-08-05 22:22:33 +08:00
|
|
|
Serial only format: <serial_device>[,baud]
|
|
|
|
keyboard only format: kbd
|
|
|
|
keyboard and serial format: kbd,<serial_device>[,baud]
|
|
|
|
Optional Kernel mode setting:
|
|
|
|
kms, kbd format: kms,kbd
|
|
|
|
kms, kbd and serial format: kms,kbd,<ser_dev>[,baud]
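The formats listed above all follow one pattern: an optional "kms", an optional "kbd", then an optional serial device with an optional baud rate. A minimal sketch of that pattern is shown below. The function name is hypothetical, and the device name ttyS0 is only an example value.

```python
def parse_kgdboc(value):
    """Split a kgdboc= value into its optional parts, in documented order."""
    tokens = [t for t in value.split(",") if t]
    config = {"kms": False, "kbd": False, "serial": None, "baud": None}
    if tokens and tokens[0] == "kms":
        config["kms"] = True
        tokens.pop(0)
    if tokens and tokens[0] == "kbd":
        config["kbd"] = True
        tokens.pop(0)
    if tokens:
        config["serial"] = tokens.pop(0)
    if tokens:
        config["baud"] = int(tokens.pop(0))
    if tokens:
        raise ValueError("unexpected trailing fields: %r" % tokens)
    return config


print(parse_kgdboc("kbd"))
print(parse_kgdboc("kms,kbd,ttyS0,115200"))
```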
|
2008-04-18 02:05:38 +08:00
|
|
|
|
2020-05-08 04:08:47 +08:00
|
|
|
kgdboc_earlycon= [KGDB,HW]
|
|
|
|
If the boot console provides the ability to read
|
|
|
|
characters and can work in polling mode, you can use
|
|
|
|
this parameter to tell kgdb to use it as a backend
|
|
|
|
until the normal console is registered. Intended to
|
|
|
|
be used together with the kgdboc parameter which
|
|
|
|
specifies the normal console to transition to.
|
|
|
|
|
|
|
|
The name of the early console should be specified
|
|
|
|
as the value of this parameter. Note that the name of
|
|
|
|
the early console might be different than the tty
|
|
|
|
name passed to kgdboc. It's OK to leave the value
|
|
|
|
blank and the first boot console that implements
|
|
|
|
read() will be picked.
|
|
|
|
|
2010-05-21 10:04:24 +08:00
|
|
|
kgdbwait [KGDB] Stop kernel execution and enter the
|
|
|
|
kernel debugger at the earliest opportunity.
|
|
|
|
|
2020-09-18 13:47:22 +08:00
|
|
|
kmac= [MIPS] Korina ethernet MAC address.
|
2008-08-24 00:54:37 +08:00
|
|
|
Configure the RouterBoard 532 series on-chip
|
|
|
|
Ethernet adapter MAC address.
|
|
|
|
|
2009-06-11 20:22:39 +08:00
|
|
|
kmemleak= [KNL] Boot-time kmemleak enable/disable
|
|
|
|
Valid arguments: on, off
|
|
|
|
Default: on
|
2014-10-24 20:24:59 +08:00
|
|
|
Built with CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF=y,
|
|
|
|
the default is off.
|
2009-06-11 20:22:39 +08:00
|
|
|
|
2019-05-22 16:32:35 +08:00
|
|
|
kprobe_event=[probe-list]
|
|
|
|
[FTRACE] Add kprobe events and enable at boot time.
|
|
|
|
The probe-list is a semicolon delimited list of probe
|
|
|
|
definitions. Each definition is the same as the kprobe_events
|
|
|
|
interface, but the parameters are comma delimited.
|
|
|
|
For example, to add a kprobe event on vfs_read with
|
|
|
|
arg1 and arg2, add the following to the command line:
|
|
|
|
|
|
|
|
kprobe_event=p,vfs_read,$arg1,$arg2
|
|
|
|
|
|
|
|
See also Documentation/trace/kprobetrace.rst "Kernel
|
|
|
|
Boot Parameter" section.
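The relationship between the boot parameter and the kprobe_events syntax can be sketched as follows. This is only an illustration of the delimiter rules described above (semicolons between definitions, commas instead of spaces inside a definition); the function name to_kprobe_events is hypothetical and the kernel's own parser is not reproduced here.

```python
def to_kprobe_events(probe_list):
    """Convert a kprobe_event= value into kprobe_events-style lines.

    Definitions are separated by ';' and, inside each definition,
    the parameters are separated by ',' instead of spaces.
    """
    lines = []
    for definition in probe_list.split(";"):
        if not definition:
            continue  # tolerate a trailing ';' in this sketch
        lines.append(" ".join(definition.split(",")))
    return lines


if __name__ == "__main__":
    for line in to_kprobe_events("p,vfs_read,$arg1,$arg2"):
        print(line)
```

Each returned string has the form that would be written to the kprobe_events interface at runtime.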
|
|
|
|
|
2019-01-26 02:07:00 +08:00
|
|
|
kpti= [ARM64] Control page table isolation of user
|
|
|
|
and kernel address spaces.
|
|
|
|
Default: enabled on cores which need mitigation.
|
|
|
|
0: force disabled
|
|
|
|
1: force enabled
|
|
|
|
|
2009-07-10 20:20:35 +08:00
|
|
|
kvm.ignore_msrs=[KVM] Ignore guest accesses to unhandled MSRs.
|
|
|
|
Default is 0 (don't ignore, but inject #GP)
|
|
|
|
|
2018-03-12 19:12:47 +08:00
|
|
|
kvm.enable_vmware_backdoor=[KVM] Support VMware backdoor PV interface.
|
|
|
|
Default is false (don't support).
|
|
|
|
|
2010-09-20 22:17:48 +08:00
|
|
|
kvm.mmu_audit= [KVM] This is an R/W parameter which allows auditing the
|
|
|
|
KVM MMU at runtime.
|
2009-07-10 20:20:35 +08:00
|
|
|
Default is 0 (off)
|
|
|
|
|
2019-11-04 19:22:02 +08:00
|
|
|
kvm.nx_huge_pages=
|
|
|
|
[KVM] Controls the software workaround for the
|
|
|
|
X86_BUG_ITLB_MULTIHIT bug.
|
|
|
|
force : Always deploy workaround.
|
|
|
|
off : Never deploy workaround.
|
|
|
|
auto : Deploy workaround based on the presence of
|
|
|
|
X86_BUG_ITLB_MULTIHIT.
|
|
|
|
|
|
|
|
Default is 'auto'.
|
|
|
|
|
|
|
|
If the software workaround is enabled for the host,
|
|
|
|
guests do not need to enable it for nested guests.
|
|
|
|
|
2019-11-05 03:26:00 +08:00
|
|
|
kvm.nx_huge_pages_recovery_ratio=
|
|
|
|
[KVM] Controls how many 4KiB pages are periodically zapped
|
|
|
|
back to huge pages. 0 disables the recovery, otherwise if
|
|
|
|
the value is N KVM will zap 1/Nth of the 4KiB pages every
|
|
|
|
minute. The default is 60.
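As a rough illustration of the arithmetic described above, the following sketch computes how many split 4KiB pages would be zapped in each one-minute period for a given ratio. The function name and the floor rounding are assumptions made for illustration; they are not taken from the KVM source.

```python
def pages_zapped_per_period(split_pages, ratio):
    """Return how many 4KiB pages are zapped per period.

    ratio == 0 disables recovery; otherwise roughly 1/ratio of the
    pages is zapped each period (floor rounding is an assumption).
    """
    if ratio == 0:
        return 0
    return split_pages // ratio


if __name__ == "__main__":
    # With the default ratio of 60, 6000 split pages give 100 per period.
    print(pages_zapped_per_period(6000, 60))
```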
|
|
|
|
|
2009-07-10 20:20:35 +08:00
|
|
|
kvm-amd.nested= [KVM,AMD] Allow nested virtualization in KVM/SVM.
|
2010-09-20 22:16:45 +08:00
|
|
|
Default is 1 (enabled)
|
2009-07-10 20:20:35 +08:00
|
|
|
|
|
|
|
kvm-amd.npt= [KVM,AMD] Disable nested paging (virtualized MMU)
|
|
|
|
for all guests.
|
2011-08-14 03:34:52 +08:00
|
|
|
Default is 1 (enabled) if in 64-bit or 32-bit PAE mode.
|
2009-07-10 20:20:35 +08:00
|
|
|
|
2020-12-03 02:40:57 +08:00
|
|
|
kvm-arm.mode=
|
|
|
|
[KVM,ARM] Select one of KVM/arm64's modes of operation.
|
|
|
|
|
2021-02-08 17:57:26 +08:00
|
|
|
nvhe: Standard nVHE-based mode, without support for
|
|
|
|
protected guests.
|
|
|
|
|
2020-12-03 02:40:57 +08:00
|
|
|
protected: nVHE-based mode with support for guests whose
|
|
|
|
state is kept private from the host.
|
|
|
|
Not valid if the kernel is running in EL2.
|
|
|
|
|
2021-04-08 21:10:10 +08:00
|
|
|
Defaults to VHE/nVHE based on hardware support.
|
2020-12-03 02:40:57 +08:00
|
|
|
|
2017-06-09 19:49:46 +08:00
|
|
|
kvm-arm.vgic_v3_group0_trap=
|
|
|
|
[KVM,ARM] Trap guest accesses to GICv3 group-0
|
|
|
|
system registers
|
|
|
|
|
2017-06-09 19:49:41 +08:00
|
|
|
kvm-arm.vgic_v3_group1_trap=
|
|
|
|
[KVM,ARM] Trap guest accesses to GICv3 group-1
|
|
|
|
system registers
|
|
|
|
|
2017-06-09 19:49:53 +08:00
|
|
|
kvm-arm.vgic_v3_common_trap=
|
|
|
|
[KVM,ARM] Trap guest accesses to GICv3 common
|
|
|
|
system registers
|
|
|
|
|
2017-10-27 22:28:54 +08:00
|
|
|
kvm-arm.vgic_v4_enable=
|
|
|
|
[KVM,ARM] Allow use of GICv4 for direct injection of
|
|
|
|
LPIs.
|
|
|
|
|
2020-09-21 17:02:20 +08:00
|
|
|
kvm_cma_resv_ratio=n [PPC]
|
|
|
|
Reserves given percentage from system memory area for
|
|
|
|
contiguous memory allocation for KVM hash pagetable
|
|
|
|
allocation.
|
|
|
|
By default it reserves 5% of total system memory.
|
|
|
|
Format: <integer>
|
|
|
|
Default: 5
|
|
|
|
|
2009-07-10 20:20:35 +08:00
|
|
|
kvm-intel.ept= [KVM,Intel] Disable extended page tables
|
|
|
|
(virtualized MMU) support on capable Intel chips.
|
|
|
|
Default is 1 (enabled)
|
|
|
|
|
|
|
|
kvm-intel.emulate_invalid_guest_state=
|
|
|
|
[KVM,Intel] Enable emulation of invalid guest states
|
|
|
|
Default is 0 (disabled)
|
|
|
|
|
|
|
|
kvm-intel.flexpriority=
|
|
|
|
[KVM,Intel] Disable FlexPriority feature (TPR shadow).
|
|
|
|
Default is 1 (enabled)
|
|
|
|
|
2011-08-09 19:28:35 +08:00
|
|
|
kvm-intel.nested=
|
|
|
|
[KVM,Intel] Enable VMX nesting (nVMX).
|
|
|
|
Default is 0 (disabled)
|
|
|
|
|
2009-07-10 20:20:35 +08:00
|
|
|
kvm-intel.unrestricted_guest=
|
|
|
|
[KVM,Intel] Disable unrestricted guest feature
|
|
|
|
(virtualized real and unpaged mode) on capable
|
|
|
|
Intel chips. Default is 1 (enabled)
|
|
|
|
|
2018-07-02 18:29:30 +08:00
|
|
|
kvm-intel.vmentry_l1d_flush=[KVM,Intel] Mitigation for L1 Terminal Fault
|
|
|
|
CVE-2018-3620.
|
|
|
|
|
|
|
|
Valid arguments: never, cond, always
|
|
|
|
|
|
|
|
always: L1D cache flush on every VMENTER.
|
|
|
|
cond: Flush L1D on VMENTER only when the code between
|
|
|
|
VMEXIT and VMENTER can leak host memory.
|
|
|
|
never: Disables the mitigation
|
|
|
|
|
|
|
|
Default is cond (do L1 cache flush in specific instances)
|
|
|
|
|
2009-07-10 20:20:35 +08:00
|
|
|
kvm-intel.vpid= [KVM,Intel] Disable Virtual Processor Identification
|
|
|
|
feature (tagged TLBs) on capable Intel chips.
|
|
|
|
Default is 1 (enabled)
|
|
|
|
|
2021-01-08 20:10:56 +08:00
|
|
|
l1d_flush= [X86,INTEL]
|
|
|
|
Control mitigation for L1D based snooping vulnerability.
|
|
|
|
|
|
|
|
Certain CPUs are vulnerable to an exploit against CPU
|
|
|
|
internal buffers which can forward information to a
|
|
|
|
disclosure gadget under certain conditions.
|
|
|
|
|
|
|
|
In vulnerable processors, the speculatively
|
|
|
|
forwarded data can be used in a cache side channel
|
|
|
|
attack, to access data to which the attacker does
|
|
|
|
not have direct access.
|
|
|
|
|
|
|
|
This parameter controls the mitigation. The
|
|
|
|
options are:
|
|
|
|
|
|
|
|
on - enable the interface for the mitigation
|
|
|
|
|
x86/bugs, kvm: Introduce boot-time control of L1TF mitigations
Introduce the 'l1tf=' kernel command line option to allow for boot-time
switching of mitigation that is used on processors affected by L1TF.
The possible values are:
full
Provides all available mitigations for the L1TF vulnerability. Disables
SMT and enables all mitigations in the hypervisors. SMT control via
/sys/devices/system/cpu/smt/control is still possible after boot.
Hypervisors will issue a warning when the first VM is started in
a potentially insecure configuration, i.e. SMT enabled or L1D flush
disabled.
full,force
Same as 'full', but disables SMT control. Implies the 'nosmt=force'
command line option. sysfs control of SMT and the hypervisor flush
control is disabled.
flush
Leaves SMT enabled and enables the conditional hypervisor mitigation.
Hypervisors will issue a warning when the first VM is started in a
potentially insecure configuration, i.e. SMT enabled or L1D flush
disabled.
flush,nosmt
Disables SMT and enables the conditional hypervisor mitigation. SMT
control via /sys/devices/system/cpu/smt/control is still possible
after boot. If SMT is reenabled or flushing disabled at runtime
hypervisors will issue a warning.
flush,nowarn
Same as 'flush', but hypervisors will not warn when
a VM is started in a potentially insecure configuration.
off
Disables hypervisor mitigations and doesn't emit any warnings.
Default is 'flush'.
Let KVM adhere to these semantics, which means:
- 'l1tf=full,force' : Perform L1D flushes. No runtime control
possible.
- 'l1tf=full'
- 'l1tf=flush'
- 'l1tf=flush,nosmt' : Perform L1D flushes and warn on VM start if
SMT has been runtime enabled or L1D flushing
has been run-time disabled
- 'l1tf=flush,nowarn' : Perform L1D flushes and no warnings are emitted.
- 'l1tf=off' : L1D flushes are not performed and no warnings
are emitted.
KVM can always override the L1D flushing behavior using its 'vmentry_l1d_flush'
module parameter except when l1tf=full,force is set.
This makes KVM's private 'nosmt' option redundant, and as it is a bit
non-systematic anyway (this is something to control globally, not on
hypervisor level), remove that option.
Add the missing Documentation entry for the l1tf vulnerability sysfs file
while at it.
Signed-off-by: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jiri Kosina <jkosina@suse.cz>
Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Link: https://lkml.kernel.org/r/20180713142323.202758176@linutronix.de
2018-07-13 22:23:25 +08:00
|
|
|
l1tf= [X86] Control mitigation of the L1TF vulnerability on
|
|
|
|
affected CPUs
|
|
|
|
|
|
|
|
The kernel PTE inversion protection is unconditionally
|
|
|
|
enabled and cannot be disabled.
|
|
|
|
|
|
|
|
full
|
|
|
|
Provides all available mitigations for the
|
|
|
|
L1TF vulnerability. Disables SMT and
|
|
|
|
enables all mitigations in the
|
|
|
|
hypervisors, i.e. unconditional L1D flush.
|
|
|
|
|
|
|
|
SMT control and L1D flush control via the
|
|
|
|
sysfs interface is still possible after
|
|
|
|
boot. Hypervisors will issue a warning
|
|
|
|
when the first VM is started in a
|
|
|
|
potentially insecure configuration,
|
|
|
|
i.e. SMT enabled or L1D flush disabled.
|
|
|
|
|
|
|
|
full,force
|
|
|
|
Same as 'full', but disables SMT and L1D
|
|
|
|
flush runtime control. Implies the
|
|
|
|
'nosmt=force' command line option.
|
|
|
|
(i.e. sysfs control of SMT is disabled.)
|
|
|
|
|
|
|
|
flush
|
|
|
|
Leaves SMT enabled and enables the default
|
|
|
|
hypervisor mitigation, i.e. conditional
|
|
|
|
L1D flush.
|
|
|
|
|
|
|
|
SMT control and L1D flush control via the
|
|
|
|
sysfs interface is still possible after
|
|
|
|
boot. Hypervisors will issue a warning
|
|
|
|
when the first VM is started in a
|
|
|
|
potentially insecure configuration,
|
|
|
|
i.e. SMT enabled or L1D flush disabled.
|
|
|
|
|
|
|
|
flush,nosmt
|
|
|
|
|
|
|
|
Disables SMT and enables the default
|
|
|
|
hypervisor mitigation.
|
|
|
|
|
|
|
|
SMT control and L1D flush control via the
|
|
|
|
sysfs interface is still possible after
|
|
|
|
boot. Hypervisors will issue a warning
|
|
|
|
when the first VM is started in a
|
|
|
|
potentially insecure configuration,
|
|
|
|
i.e. SMT enabled or L1D flush disabled.
|
|
|
|
|
|
|
|
flush,nowarn
|
|
|
|
Same as 'flush', but hypervisors will not
|
|
|
|
warn when a VM is started in a potentially
|
|
|
|
insecure configuration.
|
|
|
|
|
|
|
|
off
|
|
|
|
Disables hypervisor mitigations and doesn't
|
|
|
|
emit any warnings.
|
2018-11-14 02:49:10 +08:00
|
|
|
It also drops the swap size and available
|
|
|
|
RAM limit restriction on both hypervisor and
|
|
|
|
bare metal.
|
2018-07-13 22:23:25 +08:00
|
|
|
|
|
|
|
Default is 'flush'.
|
|
|
|
|
2019-02-19 18:10:49 +08:00
|
|
|
For details see: Documentation/admin-guide/hw-vuln/l1tf.rst
|
2018-07-13 22:23:25 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
l2cr= [PPC]
|
|
|
|
|
2008-03-29 04:20:23 +08:00
|
|
|
l3cr= [PPC]
|
|
|
|
|
2007-07-31 15:37:59 +08:00
|
|
|
lapic [X86-32,APIC] Enable the local APIC even if BIOS
|
2005-10-24 03:57:11 +08:00
|
|
|
disabled it.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2020-09-18 13:47:39 +08:00
|
|
|
lapic= [X86,APIC] Do not use TSC deadline
|
2012-10-23 05:37:58 +08:00
|
|
|
value for LAPIC timer one-shot implementation. Fall
|
|
|
|
back to the programmable timer unit in the LAPIC.
|
2020-09-18 13:47:39 +08:00
|
|
|
Format: notscdeadline
|
2012-10-23 05:37:58 +08:00
|
|
|
|
2009-04-14 16:33:43 +08:00
|
|
|
lapic_timer_c2_ok [X86,APIC] trust the local apic timer
|
2008-12-20 02:57:32 +08:00
|
|
|
in C2 power state.
|
2007-03-23 23:08:01 +08:00
|
|
|
|
2008-01-07 02:08:56 +08:00
|
|
|
libata.dma= [LIBATA] DMA control
|
|
|
|
libata.dma=0 Disable all PATA and SATA DMA
|
|
|
|
libata.dma=1 PATA and SATA Disk DMA only
|
|
|
|
libata.dma=2 ATAPI (CDROM) DMA only
|
2011-08-14 03:34:52 +08:00
|
|
|
libata.dma=4 Compact Flash DMA only
|
2008-01-07 02:08:56 +08:00
|
|
|
Combinations also work, so libata.dma=3 enables DMA
|
|
|
|
for disks and CDROMs, but not CFs.
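Because the values combine as a bitmask, a small decoder makes the combinations easier to read. This is a minimal sketch based only on the bit values listed above; the dictionary labels and the function name decode_libata_dma are illustrative and do not come from libata itself.

```python
# Bit values as listed for libata.dma; the labels are illustrative.
DMA_BITS = {
    1: "PATA and SATA disk",
    2: "ATAPI (CDROM)",
    4: "Compact Flash",
}


def decode_libata_dma(mask):
    """Return the device classes for which DMA is enabled by `mask`."""
    return [label for bit, label in DMA_BITS.items() if mask & bit]


if __name__ == "__main__":
    # libata.dma=3 enables DMA for disks and CDROMs, but not CFs.
    print(decode_libata_dma(3))
```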
|
2011-08-14 03:34:52 +08:00
|
|
|
|
2009-08-06 06:14:10 +08:00
|
|
|
libata.ignore_hpa= [LIBATA] Ignore HPA limit
|
|
|
|
libata.ignore_hpa=0 keep BIOS limits (default)
|
|
|
|
libata.ignore_hpa=1 ignore limits, using full disk
|
2008-01-07 02:08:56 +08:00
|
|
|
|
2007-09-27 23:50:13 +08:00
|
|
|
libata.noacpi [LIBATA] Disables use of ACPI in libata suspend/resume
|
|
|
|
when set.
|
|
|
|
Format: <int>
|
|
|
|
|
2021-01-01 12:08:31 +08:00
|
|
|
libata.force= [LIBATA] Force configurations. The format is a comma-
|
2008-02-13 08:15:09 +08:00
|
|
|
separated list of "[ID:]VAL" where ID is
|
2010-04-21 18:17:12 +08:00
|
|
|
PORT[.DEVICE]. PORT and DEVICE are decimal numbers
|
2008-02-13 08:15:09 +08:00
|
|
|
matching port, link or device. Basically, it matches
|
|
|
|
the ATA ID string printed on console by libata. If
|
|
|
|
the whole ID part is omitted, the last PORT and DEVICE
|
|
|
|
values are used. If ID hasn't been specified yet, the
|
|
|
|
configuration applies to all ports, links and devices.
|
|
|
|
|
|
|
|
If only DEVICE is omitted, the parameter applies to
|
|
|
|
the port and all links and devices behind it. DEVICE
|
|
|
|
number of 0 either selects the first device or the
|
|
|
|
first fan-out link behind PMP device. It does not
|
|
|
|
select the host link. DEVICE number of 15 selects the
|
|
|
|
host link and device attached to it.
|
|
|
|
|
|
|
|
The VAL specifies the configuration to force. As long
|
|
|
|
as there's no ambiguity, shortcut notation is allowed.
|
|
|
|
For example, both 1.5 and 1.5G would work for 1.5Gbps.
|
|
|
|
The following configurations can be forced.
|
|
|
|
|
|
|
|
* Cable type: 40c, 80c, short40c, unk, ign or sata.
|
|
|
|
Any ID with matching PORT is used.
|
|
|
|
|
|
|
|
* SATA link speed limit: 1.5Gbps or 3.0Gbps.
|
|
|
|
|
|
|
|
* Transfer mode: pio[0-7], mwdma[0-4] and udma[0-7].
|
|
|
|
udma[/][16,25,33,44,66,100,133] notation is also
|
|
|
|
allowed.
|
|
|
|
|
|
|
|
* [no]ncq: Turn on or off NCQ.
|
|
|
|
|
2015-05-05 09:54:18 +08:00
|
|
|
* [no]ncqtrim: Turn on or off queued DSM TRIM.
|
|
|
|
|
2008-08-13 19:19:09 +08:00
|
|
|
* nohrst, nosrst, norst: suppress hard, soft
|
2018-04-19 02:51:39 +08:00
|
|
|
and both resets.
|
2008-08-13 19:19:09 +08:00
|
|
|
|
2012-06-22 14:41:41 +08:00
|
|
|
* rstonce: only attempt one reset during
|
|
|
|
hot-unplug link recovery
|
|
|
|
|
2010-05-23 18:59:11 +08:00
|
|
|
* dump_id: dump IDENTIFY data.
|
|
|
|
|
2013-05-22 04:30:58 +08:00
|
|
|
* atapi_dmadir: Enable ATAPI DMADIR bridge support
|
|
|
|
|
2013-12-17 01:31:19 +08:00
|
|
|
* disable: Disable this device.
|
|
|
|
|
2008-02-13 08:15:09 +08:00
|
|
|
If there are multiple matching configurations changing
|
|
|
|
the same attribute, the last one is used.
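The ID inheritance rule described above (an omitted ID reuses the last PORT and DEVICE, and entries before any ID apply everywhere) can be sketched as below. This is an illustration of the documented format only, not libata's parser; the function name parse_libata_force and the tuple layout are hypothetical, and VAL strings are passed through without validation.

```python
def parse_libata_force(value):
    """Split a libata.force= value into (id, val) pairs.

    id is None when no ID has been given yet (applies to all ports,
    links and devices), otherwise a (port, device) tuple where device
    is None if only PORT was specified.
    """
    current_id = None
    entries = []
    for item in value.split(","):
        if ":" in item:
            id_part, val = item.split(":", 1)
            port, _, device = id_part.partition(".")
            current_id = (int(port), int(device) if device else None)
        else:
            # No ID part: reuse the last PORT and DEVICE values.
            val = item
        entries.append((current_id, val))
    return entries


if __name__ == "__main__":
    for entry in parse_libata_force("40c,1:udma4,1.5Gbps,2.15:nohrst"):
        print(entry)
```

In the usage line, 40c applies to every port, udma4 and 1.5Gbps both apply to port 1, and nohrst applies to the host link of port 2 (DEVICE 15).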
|
|
|
|
|
2010-07-12 12:36:09 +08:00
|
|
|
memblock=debug [KNL] Enable memblock debug messages.
|
2009-01-07 06:42:44 +08:00
|
|
|
|
2020-09-18 09:56:40 +08:00
|
|
|
load_ramdisk= [RAM] [Deprecated]
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2006-01-15 05:21:19 +08:00
|
|
|
lockd.nlm_grace_period=P [NFS] Assign grace period.
|
|
|
|
Format: <integer>
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2006-01-15 05:21:19 +08:00
|
|
|
lockd.nlm_tcpport=N [NFS] Assign TCP port.
|
|
|
|
Format: <integer>
|
|
|
|
|
|
|
|
lockd.nlm_timeout=T [NFS] Assign timeout value.
|
|
|
|
Format: <integer>
|
|
|
|
|
|
|
|
lockd.nlm_udpport=M [NFS] Assign UDP port.
|
|
|
|
Format: <integer>
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2019-08-20 08:17:39 +08:00
|
|
|
lockdown= [SECURITY]
|
|
|
|
{ integrity | confidentiality }
|
|
|
|
Enable the kernel lockdown feature. If set to
|
|
|
|
integrity, kernel features that allow userland to
|
|
|
|
modify the running kernel are disabled. If set to
|
|
|
|
confidentiality, kernel features that allow userland
|
|
|
|
to extract confidential information from the kernel
|
|
|
|
are also disabled.
|
|
|
|
|
2014-09-13 01:50:01 +08:00
|
|
|
locktorture.nreaders_stress= [KNL]
|
|
|
|
Set the number of locking read-acquisition kthreads.
|
|
|
|
Defaults to being automatically set based on the
|
|
|
|
number of online CPUs.
|
|
|
|
|
|
|
|
locktorture.nwriters_stress= [KNL]
|
|
|
|
Set the number of locking write-acquisition kthreads.
|
|
|
|
|
|
|
|
locktorture.onoff_holdoff= [KNL]
|
|
|
|
Set time (s) after boot for CPU-hotplug testing.
|
|
|
|
|
|
|
|
locktorture.onoff_interval= [KNL]
|
|
|
|
Set time (s) between CPU-hotplug operations, or
|
|
|
|
zero to disable CPU-hotplug testing.
|
|
|
|
|
|
|
|
locktorture.shuffle_interval= [KNL]
|
|
|
|
Set task-shuffle interval (jiffies). Shuffling
|
|
|
|
tasks allows some CPUs to go into dyntick-idle
|
|
|
|
mode during the locktorture test.
|
|
|
|
|
|
|
|
locktorture.shutdown_secs= [KNL]
|
|
|
|
Set time (s) after boot for system shutdown. This
|
|
|
|
is useful for hands-off automated testing.
|
|
|
|
|
|
|
|
locktorture.stat_interval= [KNL]
|
|
|
|
Time (s) between statistics printk()s.
|
|
|
|
|
|
|
|
locktorture.stutter= [KNL]
|
|
|
|
Time (s) to stutter testing, for example,
|
|
|
|
specifying five seconds causes the test to run for
|
|
|
|
five seconds, wait for five seconds, and so on.
|
|
|
|
This tests the locking primitive's ability to
|
|
|
|
transition abruptly to and from idle.
|
|
|
|
|
|
|
|
locktorture.torture_type= [KNL]
|
|
|
|
Specify the locking implementation to test.
|
|
|
|
|
|
|
|
locktorture.verbose= [KNL]
|
|
|
|
Enable additional printk() statements.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
logibm.irq= [HW,MOUSE] Logitech Bus Mouse Driver
|
|
|
|
Format: <irq>
|
|
|
|
|
|
|
|
loglevel= All Kernel Messages with a loglevel smaller than the
|
|
|
|
console loglevel will be printed to the console. It can
|
|
|
|
also be changed with klogd or other programs. The
|
|
|
|
loglevels are defined as follows:
|
|
|
|
|
|
|
|
0 (KERN_EMERG) system is unusable
|
|
|
|
1 (KERN_ALERT) action must be taken immediately
|
|
|
|
2 (KERN_CRIT) critical conditions
|
|
|
|
3 (KERN_ERR) error conditions
|
|
|
|
4 (KERN_WARNING) warning conditions
|
|
|
|
5 (KERN_NOTICE) normal but significant condition
|
|
|
|
6 (KERN_INFO) informational
|
|
|
|
7 (KERN_DEBUG) debug-level messages
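The filtering rule stated above (a message reaches the console only if its loglevel is smaller than the console loglevel) can be expressed as a short sketch. The table mirrors the list above; the function name printed_to_console is illustrative.

```python
LOGLEVELS = {
    0: "KERN_EMERG",
    1: "KERN_ALERT",
    2: "KERN_CRIT",
    3: "KERN_ERR",
    4: "KERN_WARNING",
    5: "KERN_NOTICE",
    6: "KERN_INFO",
    7: "KERN_DEBUG",
}


def printed_to_console(message_level, console_loglevel):
    """True if a message at message_level is shown on the console."""
    return message_level < console_loglevel


if __name__ == "__main__":
    # With loglevel=4, list which message classes reach the console.
    for level, name in LOGLEVELS.items():
        print(level, name, printed_to_console(level, 4))
```

With loglevel=4, for example, KERN_ERR and more severe messages are printed, while KERN_WARNING and less severe messages are not.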
|
|
|
|
|
2011-02-21 12:08:35 +08:00
|
|
|
log_buf_len=n[KMG] Sets the size of the printk ring buffer,
|
2014-08-07 07:08:56 +08:00
|
|
|
in bytes. n must be a power of two and greater
|
|
|
|
than the minimal size. The minimal size is defined
|
|
|
|
by LOG_BUF_SHIFT kernel config parameter. There is
|
|
|
|
also CONFIG_LOG_CPU_MAX_BUF_SHIFT config parameter
|
|
|
|
that allows increasing the default size depending on
|
|
|
|
the number of CPUs. See init/Kconfig for more details.
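The constraints on n can be checked with a small sketch. It assumes that the minimal size equals 1 << LOG_BUF_SHIFT bytes and handles only decimal numbers with an optional K, M or G suffix; the function names and the log_buf_shift argument are hypothetical helpers for illustration, not kernel interfaces.

```python
SUFFIXES = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}


def parse_size(text):
    """Parse a decimal size with an optional K, M or G suffix into bytes."""
    text = text.strip()
    if text and text[-1].upper() in SUFFIXES:
        return int(text[:-1]) * SUFFIXES[text[-1].upper()]
    return int(text)


def is_power_of_two(n):
    return n > 0 and (n & (n - 1)) == 0


def valid_log_buf_len(text, log_buf_shift):
    """Check the documented rules: power of two, greater than the minimum.

    Assumption: the minimal size is 1 << log_buf_shift bytes.
    """
    size = parse_size(text)
    return is_power_of_two(size) and size > (1 << log_buf_shift)


if __name__ == "__main__":
    print(valid_log_buf_len("1M", 17))
```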
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2007-10-16 16:29:37 +08:00
|
|
|
logo.nologo [FB] Disables display of the built-in Linux logo.
|
|
|
|
This may be used to provide more screen space for
|
|
|
|
kernel log messages and is useful when debugging
|
|
|
|
kernel boot problems.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
lp=0 [LP] Specify parallel ports to use, e.g.,
|
|
|
|
lp=port[,port...] lp=none,parport0 (lp0 not configured, lp1 uses
|
|
|
|
lp=reset first parallel port). 'lp=0' disables the
|
|
|
|
lp=auto printer driver. 'lp=reset' (which can be
|
|
|
|
specified in addition to the ports) causes
|
|
|
|
attached printers to be reset. Using
|
|
|
|
lp=port1,port2,... specifies the parallel ports
|
|
|
|
to associate lp devices with, starting with
|
|
|
|
lp0. A port specification may be 'none' to skip
|
|
|
|
that lp device, or a parport name such as
|
|
|
|
'parport0'. Specifying 'lp=auto' instead of a
|
|
|
|
port specification list means that device IDs
|
|
|
|
from each port should be examined, to see if
|
|
|
|
an IEEE 1284-compliant printer is attached; if
|
|
|
|
so, the driver will manage that printer.
|
|
|
|
See also header of drivers/char/lp.c.
|
|
|
|
|
|
|
|
lpj=n [KNL]
|
|
|
|
Sets loops_per_jiffy to given constant, thus avoiding
|
|
|
|
time-consuming boot-time autodetection (up to 250 ms per
|
|
|
|
CPU). 0 enables autodetection (default). To determine
|
|
|
|
the correct value for your kernel, boot with normal
|
|
|
|
autodetection and see what value is printed. Note that
|
|
|
|
on SMP systems the preset will be applied to all CPUs,
|
|
|
|
which is likely to cause problems if your CPUs need
|
|
|
|
significantly divergent settings. An incorrect value
|
|
|
|
will cause delays in the kernel to be wrong, leading to
|
|
|
|
unpredictable I/O errors and other breakage. Although
|
|
|
|
unlikely, in the extreme case this might damage your
|
|
|
|
hardware.
|
|
|
|
|
|
|
|
ltpc= [NET]
|
|
|
|
Format: <io>,<irq>,<dma>
|
|
|
|
|
2018-10-11 08:18:25 +08:00
|
|
|
lsm.debug [SECURITY] Enable LSM initialization debugging output.
|
|
|
|
|
2018-09-20 08:30:09 +08:00
|
|
|
lsm=lsm1,...,lsmN
|
|
|
|
[SECURITY] Choose order of LSM initialization. This
|
2019-02-13 02:23:18 +08:00
|
|
|
overrides CONFIG_LSM, and the "security=" parameter.
|
2018-09-20 08:30:09 +08:00
|
|
|
|
2011-08-14 03:34:52 +08:00
|
|
|
machvec= [IA-64] Force the use of a particular machine-vector
|
2005-10-24 03:57:11 +08:00
|
|
|
(machvec) in a generic kernel.
|
2019-08-13 15:25:06 +08:00
|
|
|
Example: machvec=hpzx1
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2020-09-19 08:52:02 +08:00
|
|
|
machtype= [Loongson] Share the same kernel image file between
|
|
|
|
different yeeloong laptops.
|
2009-07-02 23:27:12 +08:00
|
|
|
Example: machtype=lemote-yeeloong-2f-7inch
|
|
|
|
|
2009-04-06 06:55:22 +08:00
|
|
|
max_addr=nn[KMG] [KNL,BOOT,ia64] All physical memory greater
|
|
|
|
than or equal to this physical address is ignored.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
maxcpus= [SMP] Maximum number of processors that an SMP kernel
|
2016-08-24 13:06:45 +08:00
|
|
|
will bring up during bootup. maxcpus=n : n >= 0 limits
|
|
|
|
the kernel to bring up 'n' processors. After
|
|
|
|
bootup you can bring up the other plugged cpu by executing
|
|
|
|
"echo 1 > /sys/devices/system/cpu/cpuX/online". So maxcpus
|
|
|
|
only takes effect during system bootup.
|
|
|
|
n=0 is a special case: it is equivalent to "nosmp",
|
|
|
|
which also disables the IO APIC.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2011-08-01 04:08:04 +08:00
|
|
|
max_loop= [LOOP] The number of loop block devices that get
|
|
|
|
(loop.max_loop) unconditionally pre-created at init time. The default
|
|
|
|
number is configured by BLK_DEV_LOOP_MIN_COUNT. Instead
|
|
|
|
of statically allocating a predefined number, loop
|
|
|
|
devices can be requested on-demand with the
|
|
|
|
/dev/loop-control interface.
|
2005-06-30 09:00:00 +08:00
|
|
|
|
2007-07-31 15:37:59 +08:00
|
|
|
mce [X86-32] Machine Check Exception
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2019-06-08 02:54:32 +08:00
|
|
|
mce=option [X86-64] See Documentation/x86/x86_64/boot-options.rst
|
2007-10-18 00:04:38 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
md= [HW] RAID subsystems devices and level
|
2016-11-03 18:10:10 +08:00
|
|
|
See Documentation/admin-guide/md.rst.
|
2005-10-24 03:57:11 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
mdacon= [MDA]
|
|
|
|
Format: <first>,<last>
|
|
|
|
Specifies range of consoles to be captured by the MDA.
|
2005-10-24 03:57:11 +08:00
|
|
|
|
2019-02-19 05:04:08 +08:00
|
|
|
mds= [X86,INTEL]
|
|
|
|
Control mitigation for the Micro-architectural Data
|
|
|
|
Sampling (MDS) vulnerability.
|
|
|
|
|
|
|
|
Certain CPUs are vulnerable to an exploit against CPU
|
|
|
|
internal buffers which can forward information to a
|
|
|
|
disclosure gadget under certain conditions.
|
|
|
|
|
|
|
|
In vulnerable processors, the speculatively
|
|
|
|
forwarded data can be used in a cache side channel
|
|
|
|
attack, to access data to which the attacker does
|
|
|
|
not have direct access.
|
|
|
|
|
|
|
|
This parameter controls the MDS mitigation. The
|
|
|
|
options are:
|
|
|
|
|
2019-04-02 22:59:33 +08:00
|
|
|
full - Enable MDS mitigation on vulnerable CPUs
|
|
|
|
full,nosmt - Enable MDS mitigation and disable
|
|
|
|
SMT on vulnerable CPUs
|
|
|
|
off - Unconditionally disable MDS mitigation
|
2019-02-19 05:04:08 +08:00
|
|
|
|
2019-11-16 00:14:44 +08:00
|
|
|
On TAA-affected machines, mds=off can be prevented by
|
|
|
|
an active TAA mitigation as both vulnerabilities are
|
|
|
|
mitigated with the same mechanism so in order to disable
|
|
|
|
this mitigation, you need to specify tsx_async_abort=off
|
|
|
|
too.
|
|
|
|
|
2019-02-19 05:04:08 +08:00
|
|
|
Not specifying this option is equivalent to
|
|
|
|
mds=full.
|
|
|
|
|
2019-02-19 07:02:31 +08:00
|
|
|
For details see: Documentation/admin-guide/hw-vuln/mds.rst
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
mem=nn[KMG] [KNL,BOOT] Force usage of a specific amount of memory
|
2020-04-07 11:06:50 +08:00
|
|
|
Amount of memory to be used in the following cases:
|
|
|
|
|
|
|
|
1 for test;
|
|
|
|
2 when the kernel is not able to see the whole system memory;
|
|
|
|
3 memory that lies after 'mem=' boundary is excluded from
|
|
|
|
the hypervisor, then assigned to KVM guests.
|
|
|
|
|
2012-12-18 07:59:29 +08:00
|
|
|
[X86] Works by limiting the max address. Use together
|
|
|
|
with memmap= to avoid physical address space collisions.
|
|
|
|
Without memmap= PCI devices could be placed at addresses
|
|
|
|
belonging to unused RAM.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2020-04-07 11:06:50 +08:00
|
|
|
Note that this only takes effect during boot time since
|
|
|
|
in case 3 above, memory may need to be hot-added after boot
|
|
|
|
if system memory of hypervisor is not sufficient.
|
|
|
|
|
2007-07-31 15:37:59 +08:00
|
|
|
mem=nopentium [BUGS=X86-32] Disable usage of 4MB pages for kernel
|
2005-04-17 06:20:36 +08:00
|
|
|
memory.
|
|
|
|
|
2008-09-21 16:14:42 +08:00
|
|
|
memchunk=nn[KMG]
|
|
|
|
[KNL,SH] Allow user to override the default size for
|
|
|
|
per-device physically contiguous DMA buffers.
|
|
|
|
|
2018-04-19 02:51:39 +08:00
|
|
|
memhp_default_state=online/offline
|
2016-05-20 08:13:06 +08:00
|
|
|
[KNL] Set the initial state for the memory hotplug
|
|
|
|
onlining policy. If not specified, the default value is
|
|
|
|
set according to the
|
|
|
|
CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE kernel config
|
|
|
|
option.
|
2019-06-08 02:54:32 +08:00
|
|
|
See Documentation/admin-guide/mm/memory-hotplug.rst.
|
2016-05-20 08:13:06 +08:00
|
|
|
|
2009-04-14 16:33:43 +08:00
|
|
|
memmap=exactmap [KNL,X86] Enable setting of an exact
|
2005-04-17 06:20:36 +08:00
|
|
|
E820 memory map, as specified by the user.
|
|
|
|
Such memmap=exactmap lines can be constructed based on
|
|
|
|
BIOS output or other requirements. See the memmap=nn@ss
|
|
|
|
option description.
|
|
|
|
|
|
|
|
memmap=nn[KMG]@ss[KMG]
|
2020-11-29 03:51:21 +08:00
|
|
|
[KNL, X86, MIPS, XTENSA] Force usage of a specific region of memory.
|
2014-02-07 04:04:19 +08:00
|
|
|
Region of memory to be used is from ss to ss+nn.
|
Documentation/kernel-parameters.txt: Update 'memmap=' boot option description
In commit:
9710f581bb4c ("x86, mm: Let "memmap=" take more entries one time")
... 'memmap=' was changed to adopt multiple, comma delimited values in a
single entry, so update the related description.
In the special case of only specifying size value without an offset,
like memmap=nn[KMG], memmap behaves similarly to mem=nn[KMG], so update
it too here.
Furthermore, for memmap=nn[KMG]$ss[KMG], an escape character needs be added
before '$' for some bootloaders. E.g in grub2, if we specify memmap=100M$5G
as suggested by the documentation, "memmap=100MG" gets passed to the kernel.
Clarify all this.
Signed-off-by: Baoquan He <bhe@redhat.com>
Acked-by: Kees Cook <keescook@chromium.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: dan.j.williams@intel.com
Cc: douly.fnst@cn.fujitsu.com
Cc: dyoung@redhat.com
Cc: m.mizuma@jp.fujitsu.com
Link: http://lkml.kernel.org/r/1494654390-23861-4-git-send-email-bhe@redhat.com
[ Various spelling fixes. ]
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-05-13 13:46:30 +08:00
|
|
|
If @ss[KMG] is omitted, it is equivalent to mem=nn[KMG],
|
|
|
|
which limits max address to nn[KMG].
|
|
|
|
Multiple different regions can be specified,
|
|
|
|
comma delimited.
|
|
|
|
Example:
|
|
|
|
memmap=100M@2G,100M#3G,1G!1024G
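The comma-delimited form can be illustrated with a short Python sketch that splits a memmap= value into regions. The names parse_size and parse_memmap, and the labels in KINDS, are hypothetical and chosen to mirror the descriptions in this document; the kernel's own parser may differ in detail.

```python
import re

SUFFIXES = {"K": 1 << 10, "M": 1 << 20, "G": 1 << 30}
# Separator characters and the meaning this document gives them.
KINDS = {"@": "usable", "#": "ACPI data", "$": "reserved", "!": "protected"}


def parse_size(text):
    """Convert 'nn[KMG]' (decimal or 0x-prefixed hex) to bytes."""
    multiplier = 1
    if text[-1].upper() in SUFFIXES:
        multiplier = SUFFIXES[text[-1].upper()]
        text = text[:-1]
    return int(text, 0) * multiplier


def parse_memmap(value):
    """Return a list of (kind, start, end) tuples, end being ss+nn."""
    regions = []
    for entry in value.split(","):
        match = re.fullmatch(r"([^@#$!]+)([@#$!])([^@#$!]+)", entry)
        if match is None:
            raise ValueError(f"unrecognised memmap entry: {entry!r}")
        size = parse_size(match.group(1))
        start = parse_size(match.group(3))
        regions.append((KINDS[match.group(2)], start, start + size))
    return regions
```

Calling parse_memmap("100M@2G,100M#3G,1G!1024G") yields three regions, one per separator type used in the example.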
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
memmap=nn[KMG]#ss[KMG]
|
|
|
|
[KNL,ACPI] Mark specific memory as ACPI data.
|
2014-02-07 04:04:19 +08:00
|
|
|
Region of memory to be marked is from ss to ss+nn.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
memmap=nn[KMG]$ss[KMG]
|
|
|
|
[KNL,ACPI] Mark specific memory as reserved.
|
2014-02-07 04:04:19 +08:00
|
|
|
Region of memory to be reserved is from ss to ss+nn.
|
2008-03-25 03:29:43 +08:00
|
|
|
Example: Exclude memory from 0x18690000-0x1869ffff
|
|
|
|
memmap=64K$0x18690000
|
|
|
|
or
|
|
|
|
memmap=0x10000$0x18690000
|
2017-05-13 13:46:30 +08:00
|
|
|
Some bootloaders may need an escape character before '$',
|
|
|
|
like Grub2, otherwise '$' and the following number
|
|
|
|
will be eaten.
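The two spellings in the exclusion example describe the same region, which a few lines of Python arithmetic can confirm. The function name region_bounds is hypothetical and used only for this illustration.

```python
def region_bounds(size, start):
    """Return the first and last byte address of a region of `size` bytes."""
    return start, start + size - 1


# 64K and 0x10000 are the same size, so both forms cover the same range.
first, last = region_bounds(64 * 1024, 0x18690000)
same = region_bounds(0x10000, 0x18690000) == (first, last)
```

Here last evaluates to 0x1869ffff, matching the range quoted in the example.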
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2015-04-01 15:12:18 +08:00
|
|
|
memmap=nn[KMG]!ss[KMG]
|
|
|
|
[KNL,X86] Mark specific memory as protected.
|
|
|
|
Region of memory to be used, from ss to ss+nn.
|
|
|
|
The memory region may be marked as e820 type 12 (0xc)
|
|
|
|
and is NVDIMM or ADR memory.
|
|
|
|
|
2018-02-03 07:10:20 +08:00
|
|
|
memmap=<size>%<offset>-<oldtype>+<newtype>
|
|
|
|
[KNL,ACPI] Convert memory within the specified region
|
|
|
|
from <oldtype> to <newtype>. If "-<oldtype>" is left
|
|
|
|
out, the whole region will be marked as <newtype>,
|
|
|
|
even if previously unavailable. If "+<newtype>" is left
|
|
|
|
out, matching memory will be removed. Types are
|
|
|
|
specified as e820 types, e.g., 1 = RAM, 2 = reserved,
|
|
|
|
3 = ACPI, 12 = PRAM.
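The three cases described above (convert, mark, remove) can be sketched in Python as follows. The function describe_conversion and its return strings are hypothetical; the type table contains only the e820 types listed in this entry.

```python
import re

# e820 type numbers named in the description above.
E820_TYPES = {1: "RAM", 2: "reserved", 3: "ACPI", 12: "PRAM"}

PATTERN = re.compile(
    r"(?P<size>[^%]+)%(?P<offset>[^+\-]+)"
    r"(?:-(?P<old>\d+))?(?:\+(?P<new>\d+))?"
)


def type_name(number):
    """Map an e820 type number (as text) to a readable name."""
    return E820_TYPES.get(int(number), f"type {number}")


def describe_conversion(value):
    """Summarise what a '<size>%<offset>-<oldtype>+<newtype>' value requests."""
    match = PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"unrecognised value: {value!r}")
    old, new = match.group("old"), match.group("new")
    if old is not None and new is not None:
        return f"convert {type_name(old)} to {type_name(new)}"
    if new is not None:
        # "-<oldtype>" left out: the whole region gets the new type.
        return f"mark whole region as {type_name(new)}"
    if old is not None:
        # "+<newtype>" left out: matching memory is removed.
        return f"remove {type_name(old)}"
    raise ValueError("need -<oldtype> and/or +<newtype>")
```

The size and offset values used with this sketch (for example 4G%0x100000000-1+12) are made up for illustration.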
|
|
|
|
|
2008-09-07 16:51:34 +08:00
|
|
|
memory_corruption_check=0/1 [X86]
|
|
|
|
Some BIOSes seem to corrupt the first 64k of
|
|
|
|
memory when doing things like suspend/resume.
|
|
|
|
Setting this option will scan the memory
|
|
|
|
looking for corruption. Enabling this will
|
|
|
|
both detect corruption and prevent the kernel
|
|
|
|
from using the memory being corrupted.
|
|
|
|
However, it's intended as a diagnostic tool; if
|
|
|
|
repeatable BIOS-originated corruption always
|
|
|
|
affects the same memory, you can use memmap=
|
|
|
|
to prevent the kernel from using that memory.
|
|
|
|
|
|
|
|
memory_corruption_check_size=size [X86]
|
|
|
|
By default it checks for corruption in the low
|
|
|
|
64k, making this memory unavailable for normal
|
|
|
|
use. Use this parameter to scan for
|
|
|
|
corruption in more or less memory.
|
|
|
|
|
|
|
|
memory_corruption_check_period=seconds [X86]
|
|
|
|
By default it checks for corruption every 60
|
|
|
|
seconds. Use this parameter to check at some
|
|
|
|
other rate. 0 disables periodic checking.
|
|
|
|
|
2021-05-05 09:39:48 +08:00
|
|
|
memory_hotplug.memmap_on_memory
|
|
|
|
[KNL,X86,ARM] Boolean flag to enable this feature.
|
|
|
|
Format: {on | off (default)}
|
|
|
|
When enabled, runtime hotplugged memory will
|
|
|
|
allocate its internal metadata (struct pages)
|
|
|
|
from the hotadded memory, which makes it possible to
|
|
|
|
hotadd a lot of memory without requiring
|
|
|
|
additional memory to do so.
|
|
|
|
This feature is disabled by default because it
|
|
|
|
has some implications for large (e.g. GB)
|
|
|
|
allocations in some configurations (e.g. small
|
|
|
|
memory blocks).
|
|
|
|
The state of the flag can be read in
|
|
|
|
/sys/module/memory_hotplug/parameters/memmap_on_memory.
|
|
|
|
Note that even when enabled, there are a few cases where
|
|
|
|
the feature is not effective.
|
|
|
|
|
2021-07-01 09:47:29 +08:00
|
|
|
This is not compatible with hugetlb_free_vmemmap. If
|
|
|
|
both parameters are enabled, hugetlb_free_vmemmap takes
|
|
|
|
precedence over memory_hotplug.memmap_on_memory.
|
|
|
|
|
2021-02-25 14:54:17 +08:00
|
|
|
memtest= [KNL,X86,ARM,PPC,RISCV] Enable memtest
|
2008-03-22 09:56:19 +08:00
|
|
|
Format: <integer>
|
|
|
|
default : 0 <disable>
|
2009-02-25 18:30:45 +08:00
|
|
|
Specifies the number of memtest passes to be
|
|
|
|
performed. Each pass selects another test
|
|
|
|
pattern from a given set of patterns. Memtest
|
|
|
|
fills the memory with this pattern, validates
|
|
|
|
memory contents and reserves bad memory
|
|
|
|
regions that are detected.
|
2008-03-22 09:56:19 +08:00
|
|
|
|
2017-07-18 05:09:58 +08:00
|
|
|
mem_encrypt= [X86-64] AMD Secure Memory Encryption (SME) control
|
|
|
|
Valid arguments: on, off
|
|
|
|
Default (depends on kernel configuration option):
|
|
|
|
on (CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=y)
|
|
|
|
off (CONFIG_AMD_MEM_ENCRYPT_ACTIVE_BY_DEFAULT=n)
|
|
|
|
mem_encrypt=on: Activate SME
|
|
|
|
mem_encrypt=off: Do not activate SME
|
|
|
|
|
2019-07-24 15:24:49 +08:00
|
|
|
Refer to Documentation/virt/kvm/amd-memory-encryption.rst
|
2017-07-18 05:09:58 +08:00
|
|
|
for details on when memory encryption can be activated.
|
|
|
|
|
PM / sleep: System sleep state selection interface rework
There are systems in which the platform doesn't support any special
sleep states, so suspend-to-idle (PM_SUSPEND_FREEZE) is the only
available system sleep state. However, some user space frameworks
only use the "mem" and (sometimes) "standby" sleep state labels, so
the users of those systems need to modify user space in order to be
able to use system suspend at all and that may be a pain in practice.
Commit 0399d4db3edf (PM / sleep: Introduce command line argument for
sleep state enumeration) attempted to address this problem by adding
a command line argument to change the meaning of the "mem" string in
/sys/power/state to make it trigger suspend-to-idle (instead of
suspend-to-RAM).
However, there also are systems in which the platform does support
special sleep states, but suspend-to-idle is the preferred one anyway
(it even may save more energy than the platform-provided sleep states
in some cases) and the above commit doesn't help in those cases.
For this reason, rework the system sleep state selection interface
again (but preserve backwards compatibility). Namely, add a new
sysfs file, /sys/power/mem_sleep, that will control the system
suspend mode triggered by writing "mem" to /sys/power/state (in
analogy with what /sys/power/disk does for hibernation). Make it
select suspend-to-RAM ("deep" sleep) by default (if supported) and
fall back to suspend-to-idle ("s2idle") otherwise and add a new
command line argument, mem_sleep_default, allowing that default to
be overridden if need be.
At the same time, drop the relative_sleep_states command line
argument that doesn't make sense any more.
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
Tested-by: Mario Limonciello <mario.limonciello@dell.com>
2016-11-22 05:45:40 +08:00
|
|
|
mem_sleep_default= [SUSPEND] Default system suspend mode:
|
|
|
|
s2idle - Suspend-To-Idle
|
|
|
|
shallow - Power-On Suspend or equivalent (if supported)
|
|
|
|
deep - Suspend-To-RAM or equivalent (if supported)
|
2017-10-06 06:38:49 +08:00
|
|
|
See Documentation/admin-guide/pm/sleep-states.rst.
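The default chosen at boot can be inspected at run time through /sys/power/mem_sleep. The Python sketch below parses the contents of that file, assuming the convention that the currently selected mode is enclosed in square brackets; parse_mem_sleep is a hypothetical helper and the sample string is illustrative only.

```python
def parse_mem_sleep(text):
    """Return (available_modes, current_mode) from mem_sleep file contents."""
    available = []
    current = None
    for token in text.split():
        # Assumption: the active mode is written as "[mode]".
        if token.startswith("[") and token.endswith("]"):
            token = token[1:-1]
            current = token
        available.append(token)
    return available, current


# Illustrative contents on a system where "deep" is the selected mode.
modes, active = parse_mem_sleep("s2idle [deep]\n")
```

On a real system the text would come from reading /sys/power/mem_sleep rather than from a literal string.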
|
2016-11-22 05:45:40 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
meye.*= [HW] Set MotionEye Camera parameters
|
2020-03-04 20:08:03 +08:00
|
|
|
See Documentation/admin-guide/media/meye.rst.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2007-10-13 05:04:06 +08:00
|
|
|
mfgpt_irq= [IA-32] Specify the IRQ to use for the
|
|
|
|
Multi-Function General Purpose Timers on AMD Geode
|
|
|
|
platforms.
|
|
|
|
|
2008-01-30 20:33:33 +08:00
|
|
|
mfgptfix [X86-32] Fix MFGPT timers on AMD Geode platforms when
|
|
|
|
the BIOS has incorrectly applied a workaround. TinyBIOS
|
|
|
|
version 0.98 is known to be affected; 0.99 fixes the
|
|
|
|
problem by letting the user disable the workaround.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
mga= [HW,DRM]
|
|
|
|
|
2008-11-20 07:36:16 +08:00
|
|
|
min_addr=nn[KMG] [KNL,BOOT,ia64] All physical memory below this
|
|
|
|
physical address is ignored.
|
|
|
|
|
2009-05-20 18:10:31 +08:00
|
|
|
mini2440= [ARM,HW,KNL]
|
|
|
|
Format:[0..2][b][c][t]
|
|
|
|
Default: "0tb"
|
|
|
|
MINI2440 configuration specification:
|
|
|
|
0 - The attached screen is the 3.5" TFT
|
|
|
|
1 - The attached screen is the 7" TFT
|
|
|
|
2 - The VGA Shield is attached (1024x768)
|
|
|
|
Leaving out the screen size parameter will not load
|
|
|
|
the TFT driver, and the framebuffer will be left
|
|
|
|
unconfigured.
|
|
|
|
b - Enable backlight. The TFT backlight pin will be
|
|
|
|
linked to the kernel VESA blanking code and a GPIO
|
|
|
|
LED. This parameter is not necessary when using the
|
|
|
|
VGA shield.
|
|
|
|
c - Enable the s3c camera interface.
|
|
|
|
t - Reserved for enabling touchscreen support. The
|
|
|
|
touchscreen support is not enabled in the mainstream
|
|
|
|
kernel as of 2.6.30; a preliminary port can be found
|
|
|
|
in the "bleeding edge" mini2440 support kernel at
|
2020-06-27 15:29:35 +08:00
|
|
|
https://repo.or.cz/w/linux-2.6/mini2440.git
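The mini2440= format string can be decoded with a small Python sketch. The function parse_mini2440 and the dictionary keys are hypothetical; flags are accepted in any order because the documented default "0tb" does not follow the order shown in the format line.

```python
SCREENS = {
    "0": '3.5" TFT',
    "1": '7" TFT',
    "2": "VGA Shield (1024x768)",
}


def parse_mini2440(value="0tb"):
    """Decode an optional screen digit followed by any of the flags b, c, t."""
    screen = None
    flags = value
    if value[:1] in SCREENS:
        screen = SCREENS[value[0]]
        flags = value[1:]
    unknown = set(flags) - set("bct")
    if unknown:
        raise ValueError(f"unknown flags: {sorted(unknown)}")
    return {
        "screen": screen,  # None means the TFT driver is not loaded
        "backlight": "b" in flags,
        "camera": "c" in flags,
        "touchscreen": "t" in flags,
    }
```

With no argument the sketch decodes the documented default, "0tb".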
|
2009-05-20 18:10:31 +08:00
|
|
|
|
2019-04-13 04:39:28 +08:00
|
|
|
mitigations=
|
2019-04-13 04:39:32 +08:00
|
|
|
[X86,PPC,S390,ARM64] Control optional mitigations for
|
|
|
|
CPU vulnerabilities. This is a set of curated,
|
2019-04-13 04:39:29 +08:00
|
|
|
arch-independent options, each of which is an
|
|
|
|
aggregation of existing arch-specific options.
|
2019-04-13 04:39:28 +08:00
|
|
|
|
|
|
|
off
|
|
|
|
Disable all optional CPU mitigations. This
|
|
|
|
improves system performance, but it may also
|
|
|
|
expose users to several CPU vulnerabilities.
|
2019-04-13 04:39:30 +08:00
|
|
|
Equivalent to: nopti [X86,PPC]
|
2019-04-13 04:39:32 +08:00
|
|
|
kpti=0 [ARM64]
|
x86/speculation: Enable Spectre v1 swapgs mitigations
The previous commit added macro calls in the entry code which mitigate the
Spectre v1 swapgs issue if the X86_FEATURE_FENCE_SWAPGS_* features are
enabled. Enable those features where applicable.
The mitigations may be disabled with "nospectre_v1" or "mitigations=off".
There are different features which can affect the risk of attack:
- When FSGSBASE is enabled, unprivileged users are able to place any
value in GS, using the wrgsbase instruction. This means they can
write a GS value which points to any value in kernel space, which can
be useful with the following gadget in an interrupt/exception/NMI
handler:
if (coming from user space)
swapgs
mov %gs:<percpu_offset>, %reg1
// dependent load or store based on the value of %reg
// for example: mov %(reg1), %reg2
If an interrupt is coming from user space, and the entry code
speculatively skips the swapgs (due to user branch mistraining), it
may speculatively execute the GS-based load and a subsequent dependent
load or store, exposing the kernel data to an L1 side channel leak.
Note that, on Intel, a similar attack exists in the above gadget when
coming from kernel space, if the swapgs gets speculatively executed to
switch back to the user GS. On AMD, this variant isn't possible
because swapgs is serializing with respect to future GS-based
accesses.
NOTE: The FSGSBASE patch set hasn't been merged yet, so the above case
doesn't exist quite yet.
- When FSGSBASE is disabled, the issue is mitigated somewhat because
unprivileged users must use prctl(ARCH_SET_GS) to set GS, which
restricts GS values to user space addresses only. That means the
gadget would need an additional step, since the target kernel address
needs to be read from user space first. Something like:
if (coming from user space)
swapgs
mov %gs:<percpu_offset>, %reg1
mov (%reg1), %reg2
// dependent load or store based on the value of %reg2
// for example: mov %(reg2), %reg3
It's difficult to audit for this gadget in all the handlers, so while
there are no known instances of it, it's entirely possible that it
exists somewhere (or could be introduced in the future). Without
tooling to analyze all such code paths, consider it vulnerable.
Effects of SMAP on the !FSGSBASE case:
- If SMAP is enabled, and the CPU reports RDCL_NO (i.e., not
susceptible to Meltdown), the kernel is prevented from speculatively
reading user space memory, even L1 cached values. This effectively
disables the !FSGSBASE attack vector.
- If SMAP is enabled, but the CPU *is* susceptible to Meltdown, SMAP
still prevents the kernel from speculatively reading user space
memory. But it does *not* prevent the kernel from reading the
user value from L1, if it has already been cached. This is probably
only a small hurdle for an attacker to overcome.
Thanks to Dave Hansen for contributing the speculative_smap() function.
Thanks to Andrew Cooper for providing the inside scoop on whether swapgs
is serializing on AMD.
[ tglx: Fixed the USER fence decision and polished the comment as suggested
by Dave Hansen ]
Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Dave Hansen <dave.hansen@intel.com>
2019-07-09 00:52:26 +08:00
|
|
|
nospectre_v1 [X86,PPC]
|
2019-04-13 04:39:31 +08:00
|
|
|
nobp=0 [S390]
|
2019-04-13 04:39:32 +08:00
|
|
|
nospectre_v2 [X86,PPC,S390,ARM64]
|
2019-04-13 04:39:29 +08:00
|
|
|
spectre_v2_user=off [X86]
|
2019-04-13 04:39:30 +08:00
|
|
|
spec_store_bypass_disable=off [X86,PPC]
|
2019-04-13 04:39:32 +08:00
|
|
|
ssbd=force-off [ARM64]
|
2019-04-13 04:39:29 +08:00
|
|
|
l1tf=off [X86]
|
2019-04-18 05:39:02 +08:00
|
|
|
mds=off [X86]
|
2019-10-23 18:32:55 +08:00
|
|
|
tsx_async_abort=off [X86]
|
2019-11-04 19:22:02 +08:00
|
|
|
kvm.nx_huge_pages=off [X86]
|
2020-11-17 13:59:12 +08:00
|
|
|
no_entry_flush [PPC]
|
2020-11-17 13:59:13 +08:00
|
|
|
no_uaccess_flush [PPC]
|
2019-11-04 19:22:02 +08:00
|
|
|
|
|
|
|
Exceptions:
|
|
|
|
This does not have any effect on
|
|
|
|
kvm.nx_huge_pages when
|
|
|
|
kvm.nx_huge_pages=force.
|
2019-04-13 04:39:28 +08:00
|
|
|
|
|
|
|
auto (default)
|
|
|
|
Mitigate all CPU vulnerabilities, but leave SMT
|
|
|
|
enabled, even if it's vulnerable. This is for
|
|
|
|
users who don't want to be surprised by SMT
|
|
|
|
getting disabled across kernel upgrades, or who
|
|
|
|
have other ways of avoiding SMT-based attacks.
|
2019-04-13 04:39:29 +08:00
|
|
|
Equivalent to: (default behavior)
|
2019-04-13 04:39:28 +08:00
|
|
|
|
|
|
|
auto,nosmt
|
|
|
|
Mitigate all CPU vulnerabilities, disabling SMT
|
|
|
|
if needed. This is for users who always want to
|
|
|
|
be fully mitigated, even if it means losing SMT.
|
2019-04-13 04:39:29 +08:00
|
|
|
Equivalent to: l1tf=flush,nosmt [X86]
|
2019-04-18 05:39:02 +08:00
|
|
|
mds=full,nosmt [X86]
|
2019-10-23 18:32:55 +08:00
|
|
|
tsx_async_abort=full,nosmt [X86]
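The "Equivalent to" lists above can be expressed as a lookup table. The Python sketch below copies the options and architecture tags from this entry; the names EQUIVALENTS and equivalent_options are hypothetical, and the table reflects only what is listed here, not the kernel's internal logic.

```python
# (option, architectures) pairs copied from the "Equivalent to" lists.
EQUIVALENTS = {
    "off": [
        ("nopti", {"X86", "PPC"}),
        ("kpti=0", {"ARM64"}),
        ("nospectre_v1", {"X86", "PPC"}),
        ("nobp=0", {"S390"}),
        ("nospectre_v2", {"X86", "PPC", "S390", "ARM64"}),
        ("spectre_v2_user=off", {"X86"}),
        ("spec_store_bypass_disable=off", {"X86", "PPC"}),
        ("ssbd=force-off", {"ARM64"}),
        ("l1tf=off", {"X86"}),
        ("mds=off", {"X86"}),
        ("tsx_async_abort=off", {"X86"}),
        ("kvm.nx_huge_pages=off", {"X86"}),
        ("no_entry_flush", {"PPC"}),
        ("no_uaccess_flush", {"PPC"}),
    ],
    "auto": [],  # default behavior, no extra options
    "auto,nosmt": [
        ("l1tf=flush,nosmt", {"X86"}),
        ("mds=full,nosmt", {"X86"}),
        ("tsx_async_abort=full,nosmt", {"X86"}),
    ],
}


def equivalent_options(mode, arch):
    """List the arch-specific options a mitigations= mode stands for."""
    return [opt for opt, arches in EQUIVALENTS[mode] if arch in arches]
```

For example, equivalent_options("off", "ARM64") returns the three ARM64 entries from the list for mitigations=off.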
|
2019-04-13 04:39:28 +08:00
|
|
|
|
2008-07-24 12:26:49 +08:00
|
|
|
mminit_loglevel=
|
|
|
|
[KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this
|
|
|
|
parameter allows control of the logging verbosity for
|
|
|
|
the additional memory initialisation checks. A value
|
|
|
|
of 0 disables mminit logging and a level of 4 will
|
|
|
|
log everything. Information is printed at KERN_DEBUG
|
|
|
|
so loglevel=8 may also need to be specified.
|
|
|
|
|
2012-09-26 17:09:40 +08:00
|
|
|
module.sig_enforce
|
|
|
|
[KNL] When CONFIG_MODULE_SIG is set, this means that
|
|
|
|
modules without (valid) signatures will fail to load.
|
2013-03-26 03:42:06 +08:00
|
|
|
Note that if CONFIG_MODULE_SIG_FORCE is set, that
|
2012-09-26 17:09:40 +08:00
|
|
|
is always true, so this option does nothing.
|
|
|
|
|
2016-07-21 14:07:56 +08:00
|
|
|
module_blacklist= [KNL] Do not load a comma-separated list of
|
|
|
|
modules. Useful for debugging problem modules.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
mousedev.tap_time=
|
|
|
|
[MOUSE] Maximum time between finger touching and
|
|
|
|
leaving touchpad surface for touch to be considered
|
|
|
|
a tap and be reported as a left button click (for
|
|
|
|
touchpads working in absolute mode only).
|
|
|
|
Format: <msecs>
|
|
|
|
mousedev.xres= [MOUSE] Horizontal screen resolution, used for devices
|
|
|
|
reporting absolute coordinates, such as tablets
|
|
|
|
mousedev.yres= [MOUSE] Vertical screen resolution, used for devices
|
|
|
|
reporting absolute coordinates, such as tablets
|
|
|
|
|
mm, page_alloc: extend kernelcore and movablecore for percent
Both kernelcore= and movablecore= can be used to define the amount of
ZONE_NORMAL and ZONE_MOVABLE on a system, respectively. This requires
the system memory capacity to be known when specifying the command line,
however.
This introduces the ability to define both kernelcore= and movablecore=
as a percentage of total system memory. This is convenient for systems
software that wants to define the amount of ZONE_MOVABLE, for example,
as a proportion of a system's memory rather than a hardcoded byte value.
To define the percentage, the final character of the parameter should be
a '%'.
mhocko: "why is anyone using these options nowadays?"
rientjes:
:
: Fragmentation of non-__GFP_MOVABLE pages due to low on memory
: situations can pollute most pageblocks on the system, as much as 1GB of
: slab being fragmented over 128GB of memory, for example. When the
: amount of kernel memory is well bounded for certain systems, it is
: better to aggressively reclaim from existing MIGRATE_UNMOVABLE
: pageblocks rather than eagerly fallback to others.
:
: We have additional patches that help with this fragmentation if you're
: interested, specifically kcompactd compaction of MIGRATE_UNMOVABLE
: pageblocks triggered by fallback of non-__GFP_MOVABLE allocations and
: draining of pcp lists back to the zone free area to prevent stranding.
[rientjes@google.com: updates]
Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1802131700160.71590@chino.kir.corp.google.com
Link: http://lkml.kernel.org/r/alpine.DEB.2.10.1802121622470.179479@chino.kir.corp.google.com
Signed-off-by: David Rientjes <rientjes@google.com>
Reviewed-by: Andrew Morton <akpm@linux-foundation.org>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2018-04-06 07:23:09 +08:00
|
|
|
movablecore= [KNL,X86,IA-64,PPC]
|
|
|
|
Format: nn[KMGTPE] | nn%
|
|
|
|
This parameter is the complement to kernelcore=; it
|
|
|
|
specifies the amount of memory used for migratable
|
|
|
|
allocations. If both kernelcore and movablecore are
|
|
|
|
specified, then kernelcore will be at *least* the
|
|
|
|
specified value but may be more. If movablecore on its
|
|
|
|
own is specified, the administrator must be careful
|
2009-04-06 06:55:22 +08:00
|
|
|
that the amount of memory usable for all allocations
|
|
|
|
is not too small.
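The two accepted forms, nn[KMGTPE] and nn%, can be compared with a Python sketch that resolves either one to a byte count. The function resolve_core_size is hypothetical, suffixes are assumed to be binary multiples, and the percentage is assumed to apply to total system memory with integer rounding; the kernel's exact rounding may differ.

```python
# Binary multiples for K, M, G, T, P, E (an assumption of this sketch).
SUFFIXES = {s: 1 << (10 * (i + 1)) for i, s in enumerate("KMGTPE")}


def resolve_core_size(value, total_bytes):
    """Resolve 'nn[KMGTPE]' or 'nn%' to a number of bytes."""
    if value.endswith("%"):
        return total_bytes * int(value[:-1]) // 100
    try:
        # Plain numbers, including hex such as 0x1E, need no suffix handling.
        return int(value, 0)
    except ValueError:
        pass
    suffix = value[-1:].upper()
    if suffix not in SUFFIXES:
        raise ValueError(f"unrecognised size: {value!r}")
    return int(value[:-1], 0) * SUFFIXES[suffix]
```

On a machine with 8 GiB of memory, "25%" and "2G" resolve to the same amount under these assumptions.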
|
|
|
|
|
2017-07-07 06:41:02 +08:00
|
|
|
movable_node [KNL] Boot-time switch to make hotpluggable memory
|
|
|
|
NUMA nodes movable. This means that the memory
|
|
|
|
of such nodes will be usable only for movable
|
|
|
|
allocations which rules out almost all kernel
|
|
|
|
allocations. Use with caution!
|
mem-hotplug: introduce movable_node boot option
The hot-Pluggable field in SRAT specifies which memory is hotpluggable.
As we mentioned before, if hotpluggable memory is used by the kernel, it
cannot be hot-removed. So memory hotplug users may want to set all
hotpluggable memory in ZONE_MOVABLE so that the kernel won't use it.
Memory hotplug users may also set a node as movable node, which has
ZONE_MOVABLE only, so that the whole node can be hot-removed.
But the kernel cannot use memory in ZONE_MOVABLE. By doing this, the
kernel cannot use memory in movable nodes. This will cause NUMA
performance down. And other users may be unhappy.
So we need a way to allow users to enable and disable this functionality.
In this patch, we introduce movable_node boot option to allow users to
choose to not to consume hotpluggable memory at early boot time and later
we can set it as ZONE_MOVABLE.
To achieve this, the movable_node boot option will control the memblock
allocation direction. That said, after memblock is ready, before SRAT is
parsed, we should allocate memory near the kernel image as we explained in
the previous patches. So if movable_node boot option is set, the kernel
does the following:
1. After memblock is ready, make memblock allocate memory bottom up.
2. After SRAT is parsed, make memblock behave as default, allocate memory
top down.
Users can specify "movable_node" in kernel commandline to enable this
functionality. For those who don't use memory hotplug or who don't want
to lose their NUMA performance, just don't specify anything. The kernel
will work as before.
Signed-off-by: Tang Chen <tangchen@cn.fujitsu.com>
Signed-off-by: Zhang Yanfei <zhangyanfei@cn.fujitsu.com>
Suggested-by: Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Suggested-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Tejun Heo <tj@kernel.org>
Acked-by: Toshi Kani <toshi.kani@hp.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Wanpeng Li <liwanp@linux.vnet.ibm.com>
Cc: Thomas Renninger <trenn@suse.de>
Cc: Yinghai Lu <yinghai@kernel.org>
Cc: Jiang Liu <jiang.liu@huawei.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Cc: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: Yasuaki Ishimatsu <isimatu.yasuaki@jp.fujitsu.com>
Cc: Taku Izumi <izumi.taku@jp.fujitsu.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Rik van Riel <riel@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-11-13 07:08:10 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
MTD_Partition= [MTD]
|
|
|
|
Format: <name>,<region-number>,<size>,<offset>
|
|
|
|
|
2005-10-24 03:57:11 +08:00
|
|
|
MTD_Region= [MTD] Format:
|
|
|
|
<name>,<region-number>[,<base>,<size>,<buswidth>,<altbuswidth>]
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
mtdparts= [MTD]
|
2020-02-18 23:02:19 +08:00
|
|
|
See drivers/mtd/parsers/cmdlinepart.c
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2010-09-28 23:33:12 +08:00
|
|
|
multitce=off [PPC] This parameter disables the use of the pSeries
|
|
|
|
firmware feature for updating multiple TCE entries
|
|
|
|
at a time.
|
|
|
|
|
2009-05-13 04:46:57 +08:00
|
|
|
onenand.bdry= [HW,MTD] Flex-OneNAND Boundary Configuration
|
|
|
|
|
|
|
|
Format: [die0_boundary][,die0_lock][,die1_boundary][,die1_lock]
|
|
|
|
|
|
|
|
boundary - index of last SLC block on Flex-OneNAND.
|
|
|
|
The remaining blocks are configured as MLC blocks.
|
|
|
|
lock - Configure if Flex-OneNAND boundary should be locked.
|
|
|
|
Once locked, the boundary cannot be changed.
|
|
|
|
1 indicates lock status, 0 indicates unlock status.
|
|
|
|
|
2008-07-03 18:24:29 +08:00
|
|
|
mtdset= [ARM]
|
|
|
|
ARM/S3C2412 JIVE boot control
|
|
|
|
|
2020-09-11 22:33:41 +08:00
|
|
|
See arch/arm/mach-s3c/mach-jive.c
|
2008-07-03 18:24:29 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
mtouchusb.raw_coordinates=
|
2005-10-24 03:57:11 +08:00
|
|
|
[HW] Make the MicroTouch USB driver use raw coordinates
|
|
|
|
('y', default) or cooked coordinates ('n')
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2009-04-06 06:55:22 +08:00
|
|
|
mtrr_chunk_size=nn[KMG] [X86]
|
2009-04-27 21:06:31 +08:00
|
|
|
Used for mtrr cleanup. It is the largest continuous chunk
|
2009-04-06 06:55:22 +08:00
|
|
|
that could hold holes (aka UC entries).
|
|
|
|
|
|
|
|
mtrr_gran_size=nn[KMG] [X86]
|
|
|
|
Used for mtrr cleanup. It is the granularity of an mtrr block.
|
|
|
|
Default is 1.
|
|
|
|
A large value could prevent small alignments from
|
|
|
|
using up MTRRs.
|
|
|
|
|
|
|
|
mtrr_spare_reg_nr=n [X86]
|
|
|
|
Format: <integer>
|
|
|
|
Range: 0,7 : spare reg number
|
|
|
|
Default : 1
|
|
|
|
Used for mtrr cleanup. It is the number of spare mtrr entries.
|
|
|
|
Set to 2 or more if your graphics card needs more.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
n2= [NET] SDL Inc. RISCom/N2 synchronous serial card
|
|
|
|
|
|
|
|
netdev= [NET] Network devices parameters
|
|
|
|
Format: <irq>,<io>,<mem_start>,<mem_end>,<name>
|
|
|
|
Note that mem_start is often overloaded to mean
|
|
|
|
something different and driver-specific.
|
2005-10-24 03:57:11 +08:00
|
|
|
This usage is only documented in each driver source
|
|
|
|
file if at all.
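The five-field netdev= format can be illustrated with a Python sketch that splits the value into named fields. NetdevArgs and parse_netdev are hypothetical names, the example values are made up, and, as noted above, a given driver may interpret mem_start differently.

```python
from typing import NamedTuple


class NetdevArgs(NamedTuple):
    irq: int
    io: int
    mem_start: int
    mem_end: int
    name: str


def parse_netdev(value):
    """Split '<irq>,<io>,<mem_start>,<mem_end>,<name>' into typed fields."""
    fields = value.split(",")
    if len(fields) != 5:
        raise ValueError(f"expected 5 comma-separated fields: {value!r}")
    # int(..., 0) accepts both decimal and 0x-prefixed hexadecimal values.
    irq, io, mem_start, mem_end = (int(field, 0) for field in fields[:4])
    return NetdevArgs(irq, io, mem_start, mem_end, fields[4])
```

A value such as "5,0x300,0,0,eth0" (illustrative only) decodes to IRQ 5, I/O base 0x300, and interface name eth0.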
|
|
|
|
|
netfilter: accounting rework: ct_extend + 64bit counters (v4)
Initially netfilter has had 64bit counters for conntrack-based accounting, but
it was changed in 2.6.14 to save memory. Unfortunately in-kernel 64bit counters are
still required, for example for "connbytes" extension. However, 64bit counters
waste a lot of memory and it was not possible to enable/disable it runtime.
This patch:
- reimplements accounting with respect to the extension infrastructure,
- makes one global version of seq_print_acct() instead of two seq_print_counters(),
- makes it possible to enable it at boot time (for CONFIG_SYSCTL/CONFIG_SYSFS=n),
- makes it possible to enable/disable it at runtime by sysctl or sysfs,
- extends counters from 32bit to 64bit,
- renames ip_conntrack_counter -> nf_conn_counter,
- enables accounting code unconditionally (no longer depends on CONFIG_NF_CT_ACCT),
- set initial accounting enable state based on CONFIG_NF_CT_ACCT
- removes buggy IPCT_COUNTER_FILLING event handling.
If accounting is enabled newly created connections get additional acct extend.
Old connections are not changed as it is not possible to add a ct_extend area
to confirmed conntrack. Accounting is performed for all connections with
acct extend regardless of a current state of "net.netfilter.nf_conntrack_acct".
Signed-off-by: Krzysztof Piotr Oledzki <ole@ans.pl>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2008-07-22 01:01:34 +08:00
|
|
|
nf_conntrack.acct=
|
|
|
|
[NETFILTER] Enable connection tracking flow accounting
|
|
|
|
0 to disable accounting
|
|
|
|
1 to enable accounting
|
2010-06-25 20:46:56 +08:00
|
|
|
Default value is 0.
|
2008-07-22 01:01:34 +08:00
|
|
|
|
2010-09-17 22:54:37 +08:00
|
|
|
nfsaddrs= [NFS] Deprecated. Use ip= instead.
|
2020-02-13 02:13:32 +08:00
|
|
|
See Documentation/admin-guide/nfs/nfsroot.rst.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
nfsroot= [NFS] nfs root filesystem for disk-less boxes.
|
2020-02-13 02:13:32 +08:00
|
|
|
See Documentation/admin-guide/nfs/nfsroot.rst.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2010-09-17 22:54:37 +08:00
|
|
|
nfsrootdebug [NFS] enable nfsroot debugging messages.
|
2020-02-13 02:13:32 +08:00
|
|
|
See Documentation/admin-guide/nfs/nfsroot.rst.
|
2010-09-17 22:54:37 +08:00
|
|
|
|
2016-08-30 08:03:52 +08:00
|
|
|
nfs.callback_nr_threads=
|
|
|
|
[NFSv4] set the total number of threads that the
|
|
|
|
NFS client will assign to service NFSv4 callback
|
|
|
|
requests.
|
|
|
|
|
2006-01-03 16:55:41 +08:00
|
|
|
nfs.callback_tcpport=
|
|
|
|
[NFS] set the TCP port on which the NFSv4 callback
|
|
|
|
channel should listen.
|
|
|
|
|
2009-08-20 06:12:27 +08:00
|
|
|
nfs.cache_getent=
|
|
|
|
[NFS] sets the pathname to the program which is used
|
|
|
|
to update the NFS client cache entries.
|
|
|
|
|
|
|
|
nfs.cache_getent_timeout=
|
|
|
|
[NFS] sets the timeout after which an attempt to
|
|
|
|
update a cache entry is deemed to have failed.
|
|
|
|
|
2006-01-03 16:55:57 +08:00
|
|
|
nfs.idmap_cache_timeout=
|
|
|
|
[NFS] set the maximum lifetime for idmapper cache
|
|
|
|
entries.
|
|
|
|
|
2007-10-10 00:01:04 +08:00
|
|
|
nfs.enable_ino64=
|
|
|
|
[NFS] enable 64-bit inode numbers.
|
|
|
|
If zero, the NFS client will fake up a 32-bit inode
|
|
|
|
number for the readdir() and stat() syscalls instead
|
|
|
|
of returning the full 64-bit number.
|
|
|
|
The default is to return 64-bit inode numbers.
|
|
|
|
|
2016-08-30 08:03:52 +08:00
|
|
|
nfs.max_session_cb_slots=
|
|
|
|
[NFSv4.1] Sets the maximum number of session
|
|
|
|
slots the client will assign to the callback
|
|
|
|
channel. This determines the maximum number of
|
|
|
|
callbacks the client will process in parallel for
|
|
|
|
a particular server.
|
|
|
|
|
2012-02-07 08:50:40 +08:00
|
|
|
nfs.max_session_slots=
|
|
|
|
[NFSv4.1] Sets the maximum number of session slots
|
|
|
|
the client will attempt to negotiate with the server.
|
|
|
|
This limits the number of simultaneous RPC requests
|
|
|
|
that the client can send to the NFSv4.1 server.
|
|
|
|
Note that there is little point in setting this
|
|
|
|
value higher than the max_tcp_slot_table_limit.
|
|
|
|
|
2011-02-23 07:44:32 +08:00
|
|
|
nfs.nfs4_disable_idmapping=
|
2012-01-10 02:46:26 +08:00
|
|
|
[NFSv4] When set to the default of '1', this option
|
|
|
|
ensures that both the RPC level authentication
|
|
|
|
scheme and the NFS level operations agree to use
|
|
|
|
numeric uids/gids if the mount is using the
|
|
|
|
'sec=sys' security flavour. In effect it is
|
|
|
|
disabling idmapping, which can make migration from
|
|
|
|
legacy NFSv2/v3 systems to NFSv4 easier.
|
|
|
|
Servers that do not support this mode of operation
|
|
|
|
will be autodetected by the client, and it will fall
|
|
|
|
back to using the idmapper.
|
|
|
|
To turn off this behaviour, set the value to '0'.
|
2012-09-15 05:24:41 +08:00
|
|
|
nfs.nfs4_unique_id=
|
|
|
|
[NFS4] Specify an additional fixed unique
|
|
|
|
identification string that NFSv4 clients can insert into
|
|
|
|
their nfs_client_id4 string. This is typically a
|
|
|
|
UUID that is generated at system install time.
|
2011-02-23 07:44:32 +08:00
|
|
|
|
2012-02-18 04:20:24 +08:00
|
|
|
nfs.send_implementation_id =
|
|
|
|
[NFSv4.1] Send client implementation identification
|
|
|
|
information in exchange_id requests.
|
|
|
|
If zero, no implementation identification information
|
|
|
|
will be sent.
|
|
|
|
The default is to send the implementation identification
|
|
|
|
information.
|
2016-11-03 18:10:10 +08:00
|
|
|
|
2013-09-04 22:08:54 +08:00
|
|
|
nfs.recover_lost_locks =
|
|
|
|
[NFSv4] Attempt to recover locks that were lost due
|
|
|
|
to a lease timeout on the server. Please note that
|
|
|
|
doing this risks data corruption, since there are
|
|
|
|
no guarantees that the file will remain unchanged
|
|
|
|
after the locks are lost.
|
|
|
|
If you want to enable the kernel legacy behaviour of
|
|
|
|
attempting to recover these locks, then set this
|
|
|
|
parameter to '1'.
|
|
|
|
The default parameter value of '0' causes the kernel
|
|
|
|
not to attempt recovery of lost locks.
|
2012-02-18 04:20:24 +08:00
|
|
|
|
2015-08-25 08:39:18 +08:00
|
|
|
nfs4.layoutstats_timer =
|
|
|
|
[NFSv4.2] Change the rate at which the kernel sends
|
|
|
|
layoutstats to the pNFS metadata server.
|
|
|
|
|
|
|
|
Setting this value to 0 causes the kernel to use
|
|
|
|
whatever value is the default set by the layout
|
|
|
|
driver. A non-zero value sets the minimum interval
|
|
|
|
in seconds between layoutstats transmissions.
|
|
|
|
|
2012-03-23 04:07:18 +08:00
|
|
|
nfsd.nfs4_disable_idmapping=
|
|
|
|
[NFSv4] When set to the default of '1', the NFSv4
|
|
|
|
server will return only numeric uids and gids to
|
|
|
|
clients using auth_sys, and will accept numeric uids
|
|
|
|
and gids from such clients. This is intended to ease
|
|
|
|
migration from NFSv2/v3.
|
2012-02-18 04:20:24 +08:00
|
|
|
|
2020-07-09 07:25:43 +08:00
|
|
|
nmi_backtrace.backtrace_idle [KNL]
|
|
|
|
Dump stacks even of idle CPUs in response to an
|
|
|
|
NMI stack-backtrace request.
|
|
|
|
|
2017-02-26 20:17:39 +08:00
|
|
|
nmi_debug= [KNL,SH] Specify one or more actions to take
|
2007-10-10 20:58:29 +08:00
|
|
|
when an NMI is triggered.
|
|
|
|
Format: [state][,regs][,debounce][,die]
|
|
|
|
|
2009-04-14 16:33:43 +08:00
|
|
|
nmi_watchdog= [KNL,BUGS=X86] Debugging features for SMP kernels
|
2011-03-23 07:34:16 +08:00
|
|
|
Format: [panic,][nopanic,][num]
|
2015-04-15 06:44:13 +08:00
|
|
|
Valid num: 0 or 1
|
2015-10-11 03:40:42 +08:00
|
|
|
0 - turn hardlockup detector in nmi_watchdog off
|
|
|
|
1 - turn hardlockup detector in nmi_watchdog on
|
2009-04-06 06:55:22 +08:00
|
|
|
When panic is specified, panic when an NMI watchdog
|
2019-05-21 10:32:08 +08:00
|
|
|
timeout occurs (or 'nopanic' to not panic on an NMI
|
|
|
|
watchdog, if CONFIG_BOOTPARAM_HARDLOCKUP_PANIC is set)
|
|
|
|
To disable both hard and soft lockup detectors,
|
2015-10-11 03:40:42 +08:00
|
|
|
please see 'nowatchdog'.
|
2009-04-06 06:55:22 +08:00
|
|
|
This is useful when you use a panic=... timeout and
|
|
|
|
need the box quickly up again.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2017-12-10 15:48:46 +08:00
|
|
|
These settings can be accessed at runtime via
|
|
|
|
the nmi_watchdog and hardlockup_panic sysctls.
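The documented format [panic,][nopanic,][num] can be illustrated with a small user-space parser. This is only a sketch of how such a value could be interpreted by a script that inspects boot arguments; it is not the kernel's own parser, and the function name parse_nmi_watchdog and the returned dictionary layout are assumptions made for the example.

```python
from typing import Dict, Optional


def parse_nmi_watchdog(value: str) -> Dict[str, Optional[bool]]:
    """Interpret a value written as [panic,][nopanic,][num].

    Hypothetical helper: returns which settings were given explicitly.
    None means the token was absent and the kernel default applies.
    """
    result: Dict[str, Optional[bool]] = {"panic": None, "enabled": None}
    for token in value.split(","):
        if token == "panic":
            result["panic"] = True
        elif token == "nopanic":
            result["panic"] = False
        elif token in ("0", "1"):
            # 0 turns the hardlockup detector off, 1 turns it on.
            result["enabled"] = token == "1"
        else:
            raise ValueError(f"unrecognised nmi_watchdog token: {token!r}")
    return result
```

For example, parse_nmi_watchdog("panic,1") reports both the panic flag and the detector as enabled, while a value outside "0" or "1" is rejected, matching the "Valid num: 0 or 1" note above.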
|
|
|
|
|
2009-07-09 02:10:56 +08:00
|
|
|
netpoll.carrier_timeout=
|
|
|
|
[NET] Specifies amount of time (in seconds) that
|
|
|
|
netpoll should wait for a carrier. By default netpoll
|
|
|
|
waits 4 seconds.
|
|
|
|
|
2007-07-31 15:37:59 +08:00
|
|
|
no387 [BUGS=X86-32] Tells the kernel to use the 387 maths
|
2005-04-17 06:20:36 +08:00
|
|
|
emulation library even if a 387 maths coprocessor
|
|
|
|
is present.
|
|
|
|
|
2018-05-18 18:35:25 +08:00
|
|
|
no5lvl [X86-64] Disable 5-level paging mode. Forces
|
|
|
|
kernel to use 4-level paging instead.
|
|
|
|
|
2020-05-29 04:13:58 +08:00
|
|
|
nofsgsbase [X86] Disables FSGSBASE instructions.
|
2020-05-29 04:13:48 +08:00
|
|
|
|
2009-04-06 06:55:22 +08:00
|
|
|
no_console_suspend
|
|
|
|
[HW] Never suspend the console
|
|
|
|
Disable suspending of consoles during suspend and
|
|
|
|
hibernate operations. Once disabled, debugging
|
|
|
|
messages can reach various consoles while the rest
|
|
|
|
of the system is being put to sleep (i.e., while
|
|
|
|
debugging driver suspend/resume hooks). This may
|
|
|
|
not work reliably with all consoles, but is known
|
|
|
|
to work with serial and VGA consoles.
|
2011-11-01 08:11:27 +08:00
|
|
|
To facilitate more flexible debugging, we also add
|
|
|
|
console_suspend, a printk module parameter to control
|
|
|
|
it. Users could use console_suspend (usually
|
|
|
|
/sys/module/printk/parameters/console_suspend) to
|
|
|
|
turn it on/off dynamically.
|
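As a rough illustration of the runtime control described for no_console_suspend, the sketch below reads the console_suspend module parameter from the sysfs path quoted in the text. The helper name read_console_suspend is hypothetical, and the code assumes only that the file, when present, holds a short text value; it returns None on systems where the path does not exist or cannot be read.

```python
from pathlib import Path
from typing import Optional

# Path quoted in the parameter description; it may be absent on some kernels.
CONSOLE_SUSPEND_PARAM = Path("/sys/module/printk/parameters/console_suspend")


def read_console_suspend(path: Path = CONSOLE_SUSPEND_PARAM) -> Optional[str]:
    """Return the current parameter text, or None if it cannot be read."""
    try:
        return path.read_text().strip()
    except OSError:
        return None
```

Changing the value means writing to the same file, which normally requires root privileges and is deliberately left out of this sketch.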
2009-04-06 06:55:22 +08:00
|
|
|
|
2019-07-17 07:26:39 +08:00
|
|
|
novmcoredd [KNL,KDUMP]
|
|
|
|
Disable device dump. Device dump allows drivers to
|
|
|
|
append dump data to vmcore so you can collect driver
|
|
|
|
specified debug info. Drivers can append the data
|
|
|
|
without any limit and this data is stored in memory,
|
|
|
|
so this may cause significant memory stress. Disabling
|
|
|
|
device dump can help save memory but the driver debug
|
|
|
|
data will no longer be available. This parameter
|
|
|
|
is only available when CONFIG_PROC_VMCORE_DEVICE_DUMP
|
|
|
|
is set.
|
|
|
|
|
2007-05-31 15:40:47 +08:00
|
|
|
noaliencache [MM, NUMA, SLAB] Disables the allocation of alien
|
|
|
|
caches in the slab allocator. Saves per-node memory,
|
|
|
|
but will impact performance.
|
2006-12-07 12:32:16 +08:00
|
|
|
|
2005-10-24 03:57:11 +08:00
|
|
|
noalign [KNL,ARM]
|
|
|
|
|
2017-10-12 19:01:47 +08:00
|
|
|
noaltinstr [S390] Disables alternative instructions patching
|
|
|
|
(CPU alternatives feature).
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
noapic [SMP,APIC] Tells the kernel to not make use of any
|
|
|
|
IOAPICs that may be present in the system.
|
|
|
|
|
2010-11-30 21:18:03 +08:00
|
|
|
noautogroup Disable scheduler automatic task group creation.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
nobats [PPC] Do not use BATs for mapping kernel lowmem
|
|
|
|
on "Classic" PPC cores.
|
|
|
|
|
|
|
|
nocache [ARM]
|
2005-10-24 03:57:11 +08:00
|
|
|
|
2009-04-06 06:55:22 +08:00
|
|
|
noclflush [BUGS=X86] Don't use the CLFLUSH instruction
|
|
|
|
|
2021-05-05 04:43:32 +08:00
|
|
|
delayacct [KNL] Enable per-task delay accounting
|
2006-07-30 18:03:11 +08:00
|
|
|
|
2008-09-21 16:14:42 +08:00
|
|
|
nodsp [SH] Disable hardware DSP at boot time.
|
|
|
|
|
2014-08-14 17:15:26 +08:00
|
|
|
noefi Disable EFI runtime services support.
|
2008-01-30 20:32:11 +08:00
|
|
|
|
2020-11-17 13:59:12 +08:00
|
|
|
no_entry_flush [PPC] Don't flush the L1-D cache when entering the kernel.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
noexec [IA-64]
|
|
|
|
|
2009-04-14 16:33:43 +08:00
|
|
|
noexec [X86]
|
2008-04-12 16:28:25 +08:00
|
|
|
On X86-32 available only on PAE configured kernels.
|
2005-04-17 06:20:36 +08:00
|
|
|
noexec=on: enable non-executable mappings (default)
|
2008-04-12 16:28:25 +08:00
|
|
|
noexec=off: disable non-executable mappings
|
|
|
|
|
2019-04-18 14:51:20 +08:00
|
|
|
nosmap [X86,PPC]
|
2012-09-22 03:43:13 +08:00
|
|
|
Disable SMAP (Supervisor Mode Access Prevention)
|
|
|
|
even if it is supported by the processor.
|
|
|
|
|
2019-04-18 14:51:19 +08:00
|
|
|
nosmep [X86,PPC]
|
2012-09-22 03:43:13 +08:00
|
|
|
Disable SMEP (Supervisor Mode Execution Prevention)
|
2011-05-12 07:51:05 +08:00
|
|
|
even if it is supported by the processor.
|
|
|
|
|
2008-04-12 16:28:25 +08:00
|
|
|
noexec32 [X86-64]
|
|
|
|
This affects only 32-bit executables.
|
|
|
|
noexec32=on: enable non-executable mappings (default)
|
|
|
|
read doesn't imply executable mappings
|
|
|
|
noexec32=off: disable non-executable mappings
|
|
|
|
read implies executable mappings
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2015-04-04 06:23:34 +08:00
|
|
|
nofpu [MIPS,SH] Disable hardware FPU at boot time.
|
2008-09-21 16:14:42 +08:00
|
|
|
|
2007-07-31 15:37:59 +08:00
|
|
|
nofxsr [BUGS=X86-32] Disables x86 floating point extended
|
2006-03-23 18:59:34 +08:00
|
|
|
register save and restore. The kernel will only save
|
|
|
|
legacy floating-point registers on task switch.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2020-09-10 20:19:46 +08:00
|
|
|
nohugeiomap [KNL,X86,PPC,ARM64] Disable kernel huge I/O mappings.
|
2015-04-15 06:47:20 +08:00
|
|
|
|
2021-05-03 17:17:55 +08:00
|
|
|
nohugevmalloc [PPC] Disable kernel huge vmalloc mappings.
|
|
|
|
|
2016-04-05 18:53:38 +08:00
|
|
|
nosmt [KNL,S390] Disable symmetric multithreading (SMT).
|
|
|
|
Equivalent to smt=1.
|
|
|
|
|
2020-08-10 10:49:41 +08:00
|
|
|
[KNL,X86] Disable symmetric multithreading (SMT).
|
2018-06-29 22:05:47 +08:00
|
|
|
nosmt=force: Force disable SMT, cannot be undone
|
|
|
|
via the sysfs control file.
|
2018-08-18 02:32:50 +08:00
|
|
|
|
2019-07-09 00:52:26 +08:00
|
|
|
nospectre_v1 [X86,PPC] Disable mitigations for Spectre Variant 1
|
|
|
|
(bounds check bypass). With this option data leaks are
|
|
|
|
possible in the system.
|
2018-05-29 23:48:27 +08:00
|
|
|
|
2019-04-16 05:21:20 +08:00
|
|
|
nospectre_v2 [X86,PPC_FSL_BOOK3E,ARM64] Disable all mitigations for
|
|
|
|
the Spectre variant 2 (indirect branch prediction)
|
|
|
|
vulnerability. System may allow data leaks with this
|
|
|
|
option.
|
2018-01-12 05:46:26 +08:00
|
|
|
|
2018-04-26 10:04:21 +08:00
|
|
|
nospec_store_bypass_disable
|
|
|
|
[HW] Disable all mitigations for the Speculative Store Bypass vulnerability
|
|
|
|
|
2020-11-17 13:59:13 +08:00
|
|
|
no_uaccess_flush
|
|
|
|
[PPC] Don't flush the L1-D cache after accessing user data.
|
|
|
|
|
2009-05-23 03:17:45 +08:00
|
|
|
noxsave [BUGS=X86] Disables x86 extended register state save
|
|
|
|
and restore using xsave. The kernel will fall back to
|
|
|
|
enabling legacy floating-point and sse state.
|
|
|
|
|
2014-05-30 02:12:31 +08:00
|
|
|
noxsaveopt [X86] Disables xsaveopt used in saving x86 extended
|
|
|
|
register states. The kernel will fall back to use
|
|
|
|
xsave to save the states. By using this parameter,
|
|
|
|
performance of saving the states is degraded because
|
|
|
|
xsave doesn't support modified optimization while
|
|
|
|
xsaveopt supports it on xsaveopt enabled systems.
|
|
|
|
|
|
|
|
noxsaves [X86] Disables xsaves and xrstors used in saving and
|
|
|
|
restoring x86 extended register state in compacted
|
|
|
|
form of xsave area. The kernel will fall back to use
|
|
|
|
xsaveopt and xrstor to save and restore the states
|
|
|
|
in standard form of xsave area. By using this
|
|
|
|
parameter, xsave area per process might occupy more
|
|
|
|
memory on xsaves enabled systems.
|
|
|
|
|
2021-02-10 01:23:48 +08:00
|
|
|
nohlt [ARM,ARM64,MICROBLAZE,SH] Forces the kernel to busy wait
|
|
|
|
in do_idle() and not use the arch_cpu_idle()
|
|
|
|
implementation; requires CONFIG_GENERIC_IDLE_POLL_SETUP
|
|
|
|
to be effective. This is useful on platforms where the
|
|
|
|
sleep(SH) or wfi(ARM,ARM64) instructions do not work
|
|
|
|
correctly or when doing power measurements to evaluate
|
|
|
|
the impact of the sleep instructions. This is also
|
|
|
|
useful when using a JTAG debugger.
|
2005-10-24 03:57:11 +08:00
|
|
|
|
2008-11-06 06:08:52 +08:00
|
|
|
no_file_caps Tells the kernel not to honor file capabilities. The
|
|
|
|
only way then for a file to be executed with privilege
|
|
|
|
is to be setuid root or executed by root.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
nohalt [IA-64] Tells the kernel not to use the power saving
|
|
|
|
function PAL_HALT_LIGHT when idle. This increases
|
|
|
|
power-consumption. On the positive side, it reduces
|
|
|
|
interrupt wake-up latency, which may improve performance
|
|
|
|
in certain environments such as networked servers or
|
|
|
|
real-time systems.
|
|
|
|
|
2021-02-15 00:13:48 +08:00
|
|
|
no_hash_pointers
|
|
|
|
Force pointers printed to the console or buffers to be
|
|
|
|
unhashed. By default, when a pointer is printed via %p
|
|
|
|
format string, that pointer is "hashed", i.e. obscured
|
|
|
|
by hashing the pointer value. This is a security feature
|
|
|
|
that hides actual kernel addresses from unprivileged
|
|
|
|
users, but it also makes debugging the kernel more
|
|
|
|
difficult since unequal pointers can no longer be
|
|
|
|
compared. However, if this command-line option is
|
|
|
|
specified, then all normal pointers will have their true
|
|
|
|
value printed. Pointers printed via %pK may still be
|
|
|
|
hashed. This option should only be specified when
|
|
|
|
debugging the kernel. Please do not use on production
|
|
|
|
kernels.
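Because no_hash_pointers should not be left enabled on production kernels, a deployment check may want to confirm that the flag is absent from the running command line. The following is a minimal sketch of such a check, not an official tool: the helper name has_boot_param is hypothetical, the split on whitespace ignores quoted values containing spaces, and the code relies on two documented command-line conventions (dashes and underscores are interchangeable in parameter names, and everything after "--" is passed to init).

```python
def _normalise(name: str) -> str:
    """Treat '-' and '_' as the same character in parameter names."""
    return name.replace("-", "_")


def has_boot_param(cmdline: str, name: str) -> bool:
    """Return True if `name` appears as a kernel parameter in `cmdline`."""
    wanted = _normalise(name)
    for word in cmdline.split():
        if word == "--":
            break  # remaining words are arguments for init, not the kernel
        key = word.split("=", 1)[0]
        if _normalise(key) == wanted:
            return True
    return False
```

On a running system the command line can be read from /proc/cmdline, for example has_boot_param(open("/proc/cmdline").read(), "no_hash_pointers").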
|
|
|
|
|
2014-06-14 04:30:35 +08:00
|
|
|
nohibernate [HIBERNATION] Disable hibernation and resume.
|
|
|
|
|
2007-02-16 17:28:03 +08:00
|
|
|
nohz= [KNL] Boottime enable/disable dynamic ticks
|
|
|
|
Valid arguments: on, off
|
|
|
|
Default: on
|
|
|
|
|
2017-12-15 02:18:27 +08:00
|
|
|
nohz_full= [KNL,BOOT,SMP,ISOL]
|
2016-10-12 04:51:35 +08:00
|
|
|
The argument is a cpu list, as described above.
|
2013-04-12 22:45:34 +08:00
|
|
|
In kernels built with CONFIG_NO_HZ_FULL=y, set
|
2012-12-19 00:32:19 +08:00
|
|
|
the specified list of CPUs whose tick will be stopped
|
2013-03-27 09:18:34 +08:00
|
|
|
whenever possible. The boot CPU will be forced outside
|
2017-06-03 02:26:43 +08:00
|
|
|
the range to maintain the timekeeping. Any CPUs
|
|
|
|
in this list will have their RCU callbacks offloaded,
|
|
|
|
just as if they had also been called out in the
|
|
|
|
rcu_nocbs= boot parameter.
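As a rough illustration of the cpu list argument, the Python sketch below expands a simple list such as "1-3,5" into individual CPU numbers. The helper name parse_cpu_list is hypothetical, and the sketch assumes only plain numbers and inclusive N-M ranges; any richer syntax the kernel may accept is out of scope.

```python
def parse_cpu_list(spec):
    """Expand a simple cpu list such as "1-3,5" into sorted CPU numbers.

    Hypothetical helper for illustration only: it handles single numbers
    and inclusive N-M ranges separated by commas, nothing else.
    """
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            first, last = (int(x) for x in part.split("-", 1))
            if first > last:
                raise ValueError(f"bad range: {part}")
            cpus.update(range(first, last + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


# Example value as it might follow nohz_full= on the command line.
print(parse_cpu_list("1-3,5"))
```

The set removes duplicates, so overlapping ranges such as "1-3,2-4" collapse to a single sorted list.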
|
2012-12-19 00:32:19 +08:00
|
|
|
|
2009-04-02 11:31:16 +08:00
|
|
|
noiotrap [SH] Disables trapped I/O port accesses.
|
|
|
|
|
2007-07-31 15:37:59 +08:00
|
|
|
noirqdebug [X86-32] Disables the code which attempts to detect and
|
2005-04-17 06:20:36 +08:00
|
|
|
disable unhandled interrupt sources.
|
|
|
|
|
2009-04-14 16:33:43 +08:00
|
|
|
no_timer_check [X86,APIC] Disables the code which tests for
|
2006-12-07 09:14:09 +08:00
|
|
|
broken timer IRQ sources.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
noisapnp [ISAPNP] Disables ISA PnP code.
|
|
|
|
|
|
|
|
noinitrd [RAM] Tells the kernel not to load any configured
|
|
|
|
initial RAM disk.
|
|
|
|
|
2009-04-17 16:42:15 +08:00
|
|
|
nointremap [X86-64, Intel-IOMMU] Do not enable interrupt
|
|
|
|
remapping.
|
2010-07-21 02:06:49 +08:00
|
|
|
[Deprecated - use intremap=off]
|
2009-04-17 16:42:15 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
nointroute [IA-64]
|
|
|
|
|
2016-01-30 03:42:58 +08:00
|
|
|
noinvpcid [X86] Disable the INVPCID cpu feature.
|
|
|
|
|
2011-08-14 03:34:52 +08:00
|
|
|
nojitter [IA-64] Disables jitter checking for ITC timers.
|
2007-07-21 02:22:30 +08:00
|
|
|
|
2010-08-16 23:51:20 +08:00
|
|
|
no-kvmclock [X86,KVM] Disable paravirtualized KVM clock driver
|
|
|
|
|
2010-10-14 17:22:51 +08:00
|
|
|
no-kvmapf [X86,KVM] Disable paravirtualized asynchronous page
|
|
|
|
fault handling.
|
|
|
|
|
2016-10-28 15:54:32 +08:00
|
|
|
no-vmw-sched-clock
|
|
|
|
[X86,PV_OPS] Disable paravirtualized VMware scheduler
|
|
|
|
clock and use the default one.
|
|
|
|
|
2020-03-24 03:57:06 +08:00
|
|
|
no-steal-acc [X86,PV_OPS,ARM64] Disable paravirtualized steal time
|
2019-10-21 23:28:23 +08:00
|
|
|
accounting. Steal time is computed, but won't
|
|
|
|
influence scheduler behaviour
|
2011-07-12 03:28:19 +08:00
|
|
|
|
2007-07-31 15:37:59 +08:00
|
|
|
nolapic [X86-32,APIC] Do not enable or use the local APIC.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2007-07-31 15:37:59 +08:00
|
|
|
nolapic_timer [X86-32,APIC] Do not use the local APIC timer.
|
2007-03-22 16:11:21 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
noltlbs [PPC] Do not use large page/tlb entries for kernel
|
2016-02-10 00:07:52 +08:00
|
|
|
lowmem mapping on PPC40x and PPC8xx
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2006-02-22 08:57:55 +08:00
|
|
|
nomca [IA-64] Disable machine check abort handling
|
|
|
|
|
2015-05-16 01:16:43 +08:00
|
|
|
nomce [X86-32] Disable Machine Check Exception
|
2006-04-01 07:36:09 +08:00
|
|
|
|
2007-10-13 05:04:06 +08:00
|
|
|
nomfgpt [X86-32] Disable Multi-Function General Purpose
|
|
|
|
Timer usage (for AMD Geode machines).
|
|
|
|
|
2011-10-14 03:14:27 +08:00
|
|
|
nonmi_ipi [X86] Disable using NMI IPIs during panic/reboot to
|
|
|
|
shut down the other CPUs. Instead use the REBOOT_VECTOR
|
|
|
|
irq.
|
|
|
|
|
2012-02-01 10:33:14 +08:00
|
|
|
nomodule Disable module loading.
|
|
|
|
|
2010-01-19 00:05:40 +08:00
|
|
|
nopat [X86] Disable PAT (page attribute table extension of
|
|
|
|
pagetables) support.
|
|
|
|
|
2017-06-29 23:53:20 +08:00
|
|
|
nopcid [X86-64] Disable the PCID cpu feature.
|
|
|
|
|
2009-04-06 06:55:22 +08:00
|
|
|
norandmaps Don't use address space randomization. Equivalent to
|
|
|
|
echo 0 > /proc/sys/kernel/randomize_va_space
|
|
|
|
|
2007-07-31 15:37:59 +08:00
|
|
|
noreplace-smp [X86-32,SMP] Don't replace SMP instructions
|
2007-05-03 01:27:13 +08:00
|
|
|
with UP alternatives
|
|
|
|
|
2014-05-12 11:25:20 +08:00
|
|
|
nordrand [X86] Disable kernel use of the RDRAND and
|
|
|
|
RDSEED instructions even if they are supported
|
|
|
|
by the processor. RDRAND and RDSEED are still
|
|
|
|
available to user space applications.
|
2011-08-01 05:02:19 +08:00
|
|
|
|
2005-10-24 03:57:11 +08:00
|
|
|
noresume [SWSUSP] Disables resume and restores original swap
|
|
|
|
space.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
no-scroll [VGA] Disables scrollback.
|
|
|
|
This is required for the Braillex ib80-piezo Braille
|
|
|
|
reader made by F.H. Papenmeier (Germany).
|
|
|
|
|
|
|
|
nosbagart [IA-64]
|
|
|
|
|
2007-07-31 15:37:59 +08:00
|
|
|
nosep [BUGS=X86-32] Disables x86 SYSENTER/SYSEXIT support.
|
2006-03-23 18:59:34 +08:00
|
|
|
|
2020-11-13 06:01:19 +08:00
|
|
|
nosgx [X86-64,SGX] Disables Intel SGX kernel support.
|
|
|
|
|
2007-08-16 15:34:22 +08:00
|
|
|
nosmp [SMP] Tells an SMP kernel to act as a UP kernel,
|
|
|
|
and disable the IO APIC. Legacy equivalent of "maxcpus=0".
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2007-07-16 14:41:05 +08:00
|
|
|
nosoftlockup [KNL] Disable the soft-lockup detector.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
nosync [HW,M68K] Disables sync negotiation for all devices.
|
|
|
|
|
2015-04-15 06:44:13 +08:00
|
|
|
nowatchdog [KNL] Disable both lockup detectors, i.e.
|
2018-04-19 02:51:39 +08:00
|
|
|
soft-lockup and NMI watchdog (hard-lockup).
|
2010-05-08 05:11:44 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
nowb [ARM]
|
2005-10-24 03:57:11 +08:00
|
|
|
|
2009-04-17 16:42:12 +08:00
|
|
|
nox2apic [X86-64,APIC] Do not enable x2APIC mode.
|
|
|
|
|
2012-11-14 03:32:38 +08:00
|
|
|
cpu0_hotplug [X86] Turn on CPU0 hotplug feature when
|
|
|
|
CONFIG_BOOTPARAM_HOTPLUG_CPU0 is off.
|
|
|
|
Some features depend on CPU0. Known dependencies are:
|
|
|
|
1. Resume from suspend/hibernate depends on CPU0.
|
|
|
|
Suspend/hibernate will fail if CPU0 is offline and you
|
|
|
|
need to online CPU0 before suspend/hibernate.
|
|
|
|
2. PIC interrupts also depend on CPU0. CPU0 can't be
|
|
|
|
removed if a PIC interrupt is detected.
|
|
|
|
It's said poweroff/reboot may depend on CPU0 on some
|
|
|
|
machines although I haven't seen such issues so far
|
|
|
|
after CPU0 is offline on a few tested machines.
|
|
|
|
If the dependencies are under your control, you can
|
|
|
|
turn on cpu0_hotplug.
|
|
|
|
|
2018-04-19 02:51:39 +08:00
|
|
|
nps_mtm_hs_ctr= [KNL,ARC]
|
2017-06-15 16:43:57 +08:00
|
|
|
This parameter sets the maximum duration, in
|
|
|
|
cycles, each HW thread of the CTOP can run
|
|
|
|
without interruptions, before HW switches it.
|
|
|
|
The actual maximum duration is 16 times this
|
|
|
|
parameter's value.
|
|
|
|
Format: integer between 1 and 255
|
|
|
|
Default: 255
|
|
|
|
|
2011-08-14 03:34:52 +08:00
|
|
|
nptcg= [IA-64] Override max number of concurrent global TLB
|
2008-03-15 04:57:08 +08:00
|
|
|
purges which is reported from either PAL_VM_SUMMARY or
|
|
|
|
SAL PALO.
|
|
|
|
|
2010-02-10 17:20:37 +08:00
|
|
|
nr_cpus= [SMP] Maximum number of processors that an SMP kernel
|
|
|
|
could support. nr_cpus=n : n >= 1 limits the kernel to
|
2016-08-24 13:06:45 +08:00
|
|
|
support 'n' processors. It could be larger than the
|
|
|
|
number of CPUs already plugged in during bootup; later in
|
|
|
|
runtime you can physically add extra CPUs until the count reaches
|
|
|
|
n. So during boot up some boot time memory for per-cpu
|
|
|
|
variables needs to be pre-allocated for later physical CPU
|
|
|
|
hot plugging.
|
2010-02-10 17:20:37 +08:00
|
|
|
|
2009-04-06 06:55:22 +08:00
|
|
|
nr_uarts= [SERIAL] maximum number of UARTs to be registered.
|
|
|
|
|
2021-05-24 13:17:15 +08:00
|
|
|
numa=off [KNL, ARM64, PPC, RISCV, SPARC, X86] Disable NUMA. Only
|
|
|
|
set up a single NUMA node spanning all memory.
|
|
|
|
|
2021-03-02 16:41:59 +08:00
|
|
|
numa_balancing= [KNL,ARM64,PPC,RISCV,S390,X86] Enable or disable automatic
|
|
|
|
NUMA balancing.
|
2012-11-22 19:16:36 +08:00
|
|
|
Allowed values are enable and disable
|
|
|
|
|
2007-07-16 14:38:01 +08:00
|
|
|
numa_zonelist_order= [KNL, BOOT] Select zonelist order for NUMA.
|
2017-09-07 07:20:13 +08:00
|
|
|
'node', 'default' can be specified
|
2007-07-16 14:38:01 +08:00
|
|
|
This can be set from sysctl after boot.
|
2019-04-23 03:48:00 +08:00
|
|
|
See Documentation/admin-guide/sysctl/vm.rst for details.
|
2007-07-16 14:38:01 +08:00
|
|
|
|
2009-01-07 06:42:44 +08:00
|
|
|
ohci1394_dma=early [HW] enable debugging via the ohci1394 driver.
|
2020-05-01 23:37:50 +08:00
|
|
|
See Documentation/core-api/debugging-via-ohci1394.rst for more
|
2009-01-07 06:42:44 +08:00
|
|
|
info.
|
|
|
|
|
2008-04-29 15:59:53 +08:00
|
|
|
olpc_ec_timeout= [OLPC] ms delay when issuing EC commands
|
|
|
|
Rather than timing out after 20 ms if an EC
|
|
|
|
command is not properly ACKed, override the length
|
|
|
|
of the timeout. We have interrupts disabled while
|
|
|
|
waiting for the ACK, so if this is set too high
|
|
|
|
interrupts *may* be lost!
|
|
|
|
|
2009-12-12 08:16:32 +08:00
|
|
|
omap_mux= [OMAP] Override bootloader pin multiplexing.
|
|
|
|
Format: <mux_mode0.mode_name=value>...
|
|
|
|
For example, to override I2C bus2:
|
|
|
|
omap_mux=i2c2_scl.i2c2_scl=0x100,i2c2_sda.i2c2_sda=0x100
|
|
|
|
|
2011-04-05 06:02:24 +08:00
|
|
|
oops=panic Always panic on oopses. Default is to just kill the
|
|
|
|
process, but there is a small probability of
|
|
|
|
deadlocking the machine.
|
2011-03-23 07:34:04 +08:00
|
|
|
This will also cause panics on machine check exceptions.
|
|
|
|
Useful together with panic=30 to trigger a reboot.
|
|
|
|
|
mm: shuffle initial free memory to improve memory-side-cache utilization
Patch series "mm: Randomize free memory", v10.
This patch (of 3):
Randomization of the page allocator improves the average utilization of
a direct-mapped memory-side-cache. Memory side caching is a platform
capability that Linux has been previously exposed to in HPC
(high-performance computing) environments on specialty platforms. In
that instance it was a smaller pool of high-bandwidth-memory relative to
higher-capacity / lower-bandwidth DRAM. Now, this capability is going
to be found on general purpose server platforms where DRAM is a cache in
front of higher latency persistent memory [1].
Robert offered an explanation of the state of the art of Linux
interactions with memory-side-caches [2], and I copy it here:
It's been a problem in the HPC space:
http://www.nersc.gov/research-and-development/knl-cache-mode-performance-coe/
A kernel module called zonesort is available to try to help:
https://software.intel.com/en-us/articles/xeon-phi-software
and this abandoned patch series proposed that for the kernel:
https://lkml.kernel.org/r/20170823100205.17311-1-lukasz.daniluk@intel.com
Dan's patch series doesn't attempt to ensure buffers won't conflict, but
also reduces the chance that the buffers will. This will make performance
more consistent, albeit slower than "optimal" (which is near impossible
to attain in a general-purpose kernel). That's better than forcing
users to deploy remedies like:
"To eliminate this gradual degradation, we have added a Stream
measurement to the Node Health Check that follows each job;
nodes are rebooted whenever their measured memory bandwidth
falls below 300 GB/s."
A replacement for zonesort was merged upstream in commit cc9aec03e58f
("x86/numa_emulation: Introduce uniform split capability"). With this
numa_emulation capability, memory can be split into cache sized
("near-memory" sized) numa nodes. A bind operation to such a node, and
disabling workloads on other nodes, enables full cache performance.
However, once the workload exceeds the cache size then cache conflicts
are unavoidable. While HPC environments might be able to tolerate
time-scheduling of cache sized workloads, for general purpose server
platforms, the oversubscribed cache case will be the common case.
The worst case scenario is that a server system owner benchmarks a
workload at boot with an un-contended cache only to see that performance
degrade over time, even below the average cache performance due to
excessive conflicts. Randomization clips the peaks and fills in the
valleys of cache utilization to yield steady average performance.
Here are some performance impact details of the patches:
1/ An Intel internal synthetic memory bandwidth measurement tool, saw a
3X speedup in a contrived case that tries to force cache conflicts.
The contrived case used the numa_emulation capability to force an
instance of the benchmark to be run in two of the near-memory sized
numa nodes. If both instances were placed on the same emulated node they
would fit and cause zero conflicts. While on separate emulated nodes
without randomization they underutilized the cache and conflicted
unnecessarily due to the in-order allocation per node.
2/ A well known Java server application benchmark was run with a heap
size that exceeded cache size by 3X. The cache conflict rate was 8%
for the first run and degraded to 21% after page allocator aging. With
randomization enabled the rate levelled out at 11%.
3/ A MongoDB workload did not observe measurable difference in
cache-conflict rates, but the overall throughput dropped by 7% with
randomization in one case.
4/ Mel Gorman ran his suite of performance workloads with randomization
enabled on platforms without a memory-side-cache and saw a mix of some
improvements and some losses [3].
While there is potentially significant improvement for applications that
depend on low latency access across a wide working-set, the performance
may be negligible to negative for other workloads. For this reason the
shuffle capability defaults to off unless a direct-mapped
memory-side-cache is detected. Even then, the page_alloc.shuffle=0
parameter can be specified to disable the randomization on those systems.
Outside of memory-side-cache utilization concerns there is potentially
security benefit from randomization. Some data exfiltration and
return-oriented-programming attacks rely on the ability to infer the
location of sensitive data objects. The kernel page allocator, especially
early in system boot, has predictable first-in-first out behavior for
physical pages. Pages are freed in physical address order when first
onlined.
Quoting Kees:
"While we already have a base-address randomization
(CONFIG_RANDOMIZE_MEMORY), attacks against the same hardware and
memory layouts would certainly be using the predictability of
allocation ordering (i.e. for attacks where the base address isn't
important: only the relative positions between allocated memory).
This is common in lots of heap-style attacks. They try to gain
control over ordering by spraying allocations, etc.
I'd really like to see this because it gives us something similar
to CONFIG_SLAB_FREELIST_RANDOM but for the page allocator."
While SLAB_FREELIST_RANDOM reduces the predictability of some local slab
caches it leaves vast bulk of memory to be predictably in order allocated.
However, it should be noted, the concrete security benefits are hard to
quantify, and no known CVE is mitigated by this randomization.
Introduce shuffle_free_memory(), and its helper shuffle_zone(), to perform
a Fisher-Yates shuffle of the page allocator 'free_area' lists when they
are initially populated with free memory at boot and at hotplug time. Do
this based on either the presence of a page_alloc.shuffle=Y command line
parameter, or autodetection of a memory-side-cache (to be added in a
follow-on patch).
The shuffling is done in terms of CONFIG_SHUFFLE_PAGE_ORDER sized free
pages where the default CONFIG_SHUFFLE_PAGE_ORDER is MAX_ORDER-1 i.e. 10,
4MB. This trades off randomization granularity for time spent shuffling.
MAX_ORDER-1 was chosen to be minimally invasive to the page allocator
while still showing memory-side cache behavior improvements, and the
expectation that the security implications of finer granularity
randomization is mitigated by CONFIG_SLAB_FREELIST_RANDOM. The
performance impact of the shuffling appears to be in the noise compared to
other memory initialization work.
This initial randomization can be undone over time so a follow-on patch is
introduced to inject entropy on page free decisions. It is reasonable to
ask if the page free entropy is sufficient, but it is not enough due to
the in-order initial freeing of pages. At the start of that process
putting page1 in front or behind page0 still keeps them close together,
page2 is still near page1 and has a high chance of being adjacent. As
more pages are added ordering diversity improves, but there is still high
page locality for the low address pages and this leads to no significant
impact to the cache conflict rate.
[1]: https://itpeernetwork.intel.com/intel-optane-dc-persistent-memory-operating-modes/
[2]: https://lkml.kernel.org/r/AT5PR8401MB1169D656C8B5E121752FC0F8AB120@AT5PR8401MB1169.NAMPRD84.PROD.OUTLOOK.COM
[3]: https://lkml.org/lkml/2018/10/12/309
[dan.j.williams@intel.com: fix shuffle enable]
Link: http://lkml.kernel.org/r/154943713038.3858443.4125180191382062871.stgit@dwillia2-desk3.amr.corp.intel.com
[cai@lca.pw: fix SHUFFLE_PAGE_ALLOCATOR help texts]
Link: http://lkml.kernel.org/r/20190425201300.75650-1-cai@lca.pw
Link: http://lkml.kernel.org/r/154899811738.3165233.12325692939590944259.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Qian Cai <cai@lca.pw>
Reviewed-by: Kees Cook <keescook@chromium.org>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Keith Busch <keith.busch@intel.com>
Cc: Robert Elliott <elliott@hpe.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-05-15 06:41:28 +08:00
|
|
|
page_alloc.shuffle=
|
|
|
|
[KNL] Boolean flag to control whether the page allocator
|
|
|
|
should randomize its free lists. The randomization may
|
|
|
|
be automatically enabled if the kernel detects it is
|
|
|
|
running on a platform with a direct-mapped memory-side
|
|
|
|
cache, and this parameter can be used to
|
|
|
|
override/disable that behavior. The state of the flag
|
|
|
|
can be read from sysfs at:
|
|
|
|
/sys/module/page_alloc/parameters/shuffle.
|
|
|
|
|
mm/page_owner: keep track of page owners
This is the page owner tracking code, which was introduced long ago. It
has resided in Andrew's tree, but nobody tried to upstream it, so it
remained as is. Our company actively uses this feature to debug memory
leaks or to find memory hogs, so I decided to upstream it.
This functionality help us to know who allocates the page. When
allocating a page, we store some information about allocation in extra
memory. Later, if we need to know status of all pages, we can get and
analyze it from this stored information.
In previous version of this feature, extra memory is statically defined in
struct page, but, in this version, extra memory is allocated outside of
struct page. It enables us to turn on/off this feature at boottime
without considerable memory waste.
Although we already have tracepoint for tracing page allocation/free,
using it to analyze page owner is rather complex. We need to enlarge the
trace buffer for preventing overlapping until userspace program launched.
And, launched program continually dump out the trace buffer for later
analysis and it would change system behaviour with more possibility rather
than just keeping it in memory, so bad for debug.
Moreover, we can use page_owner feature further for various purposes. For
example, we can use it for fragmentation statistics implemented in this
patch. And, I also plan to implement some CMA failure debugging feature
using this interface.
I'd like to give the credit for all developers contributed this feature,
but, it's not easy because I don't know exact history. Sorry about that.
Below is people who has "Signed-off-by" in the patches in Andrew's tree.
Contributor:
Alexander Nyberg <alexn@dsv.su.se>
Mel Gorman <mgorman@suse.de>
Dave Hansen <dave@linux.vnet.ibm.com>
Minchan Kim <minchan@kernel.org>
Michal Nazarewicz <mina86@mina86.com>
Andrew Morton <akpm@linux-foundation.org>
Jungsoo Son <jungsoo.son@lge.com>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Dave Hansen <dave@sr71.net>
Cc: Michal Nazarewicz <mina86@mina86.com>
Cc: Jungsoo Son <jungsoo.son@lge.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2014-12-13 08:56:01 +08:00
|
|
|
page_owner= [KNL] Boot-time page_owner enabling option.
|
|
|
|
Storage of the information about who allocated
|
|
|
|
each page is disabled by default. With this switch,
|
|
|
|
we can turn it on.
|
|
|
|
on: enable the feature
|
|
|
|
|
2016-03-16 05:56:27 +08:00
|
|
|
page_poison= [KNL] Boot-time parameter changing the state of
|
2018-08-22 12:53:10 +08:00
|
|
|
poisoning on the buddy allocator, available with
|
|
|
|
CONFIG_PAGE_POISONING=y.
|
|
|
|
off: turn off poisoning (default)
|
2016-03-16 05:56:27 +08:00
|
|
|
on: turn on poisoning
|
|
|
|
|
2021-06-29 10:35:19 +08:00
|
|
|
page_reporting.page_reporting_order=
|
|
|
|
[KNL] Minimal page reporting order
|
|
|
|
Format: <integer>
|
|
|
|
Adjust the minimal page reporting order. The page
|
|
|
|
reporting is disabled when it exceeds (MAX_ORDER-1).
|
|
|
|
|
2011-04-05 06:02:24 +08:00
|
|
|
panic= [KNL] Kernel behaviour on panic: delay <timeout>
|
2011-07-27 07:08:52 +08:00
|
|
|
timeout > 0: seconds before rebooting
|
|
|
|
timeout = 0: wait forever
|
|
|
|
timeout < 0: reboot immediately
|
2005-04-17 06:20:36 +08:00
|
|
|
Format: <timeout>
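The three timeout cases above can be summarised in a small Python sketch. The function name panic_action and the returned strings are hypothetical; they only restate the documented behaviour and do not reflect any kernel code.

```python
def panic_action(timeout):
    """Describe what panic=<timeout> requests, per the three documented cases.

    Hypothetical helper for illustration; timeout is an integer in seconds.
    """
    if timeout > 0:
        return f"reboot after {timeout} seconds"
    if timeout == 0:
        return "wait forever"
    return "reboot immediately"


# panic=30 is the value suggested alongside oops=panic.
print(panic_action(30))
```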
|
|
|
|
|
2019-01-04 07:28:17 +08:00
|
|
|
panic_print= Bitmask for printing system info when panic happens.
|
|
|
|
The user can choose a combination of the following bits:
|
|
|
|
bit 0: print all tasks info
|
|
|
|
bit 1: print system memory info
|
|
|
|
bit 2: print timer info
|
|
|
|
bit 3: print locks info if CONFIG_LOCKDEP is on
|
|
|
|
bit 4: print ftrace buffer
|
2019-05-18 05:31:50 +08:00
|
|
|
bit 5: print all printk messages in buffer
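To show how the bits listed above combine into a single value, here is a minimal Python sketch. The dictionary keys and the function name panic_print_mask are hypothetical labels chosen for this example; only the bit positions come from the list above.

```python
# Bit positions as listed for panic_print=. The key names are
# hypothetical labels used only in this sketch.
PANIC_PRINT_BITS = {
    "tasks": 0,    # bit 0: all tasks info
    "memory": 1,   # bit 1: system memory info
    "timers": 2,   # bit 2: timer info
    "locks": 3,    # bit 3: locks info (needs CONFIG_LOCKDEP)
    "ftrace": 4,   # bit 4: ftrace buffer
    "printk": 5,   # bit 5: all printk messages in buffer
}


def panic_print_mask(*names):
    """OR together the bits for the requested kinds of panic output."""
    mask = 0
    for name in names:
        mask |= 1 << PANIC_PRINT_BITS[name]
    return mask


# Tasks, memory and buffered printk messages: bits 0, 1 and 5.
print(f"panic_print={panic_print_mask('tasks', 'memory', 'printk')}")
```

Using named bits rather than a bare number makes it harder to set the wrong bit when composing the mask by hand.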
|
2019-01-04 07:28:17 +08:00
|
|
|
|
2020-06-08 12:40:17 +08:00
|
|
|
panic_on_taint= Bitmask for conditionally calling panic() in add_taint()
|
|
|
|
Format: <hex>[,nousertaint]
|
|
|
|
Hexadecimal bitmask representing the set of TAINT flags
|
|
|
|
that will cause the kernel to panic when add_taint() is
|
|
|
|
called with any of the flags in this set.
|
|
|
|
The optional switch "nousertaint" can be utilized to
|
|
|
|
prevent userspace forced crashes by writing to sysctl
|
|
|
|
/proc/sys/kernel/tainted any flagset matching with the
|
|
|
|
bitmask set on panic_on_taint.
|
|
|
|
See Documentation/admin-guide/tainted-kernels.rst for
|
|
|
|
extra details on the taint flags that users can pick
|
|
|
|
to compose the bitmask to assign to panic_on_taint.
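The "<hex>[,nousertaint]" format can be split as in the Python sketch below. The function name parse_panic_on_taint is hypothetical and the sample value is arbitrary; the sketch illustrates the format only and does not mirror the kernel's own parser.

```python
def parse_panic_on_taint(value):
    """Split "<hex>[,nousertaint]" into (bitmask, nousertaint flag).

    Hypothetical helper for illustration of the documented format.
    """
    mask_text, _, option = value.partition(",")
    if option not in ("", "nousertaint"):
        raise ValueError(f"unknown option: {option}")
    # int(..., 16) accepts the hex digits with or without a 0x prefix.
    return int(mask_text, 16), option == "nousertaint"


# Arbitrary sample value; see tainted-kernels.rst for real flag meanings.
print(parse_panic_on_taint("0x20,nousertaint"))
```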
|
|
|
|
|
2014-12-11 07:45:50 +08:00
|
|
|
panic_on_warn panic() instead of WARN(). Useful to cause kdump
|
|
|
|
on a WARN().
|
|
|
|
|
2014-06-07 05:37:07 +08:00
|
|
|
crash_kexec_post_notifiers
|
|
|
|
Run kdump after running panic-notifiers and dumping
|
|
|
|
kmsg. This is only for users who doubt that kdump always
|
|
|
|
succeeds in any situation.
|
|
|
|
Note that this also increases risks of kdump failure,
|
|
|
|
because some panic notifiers can make the crashed
|
|
|
|
kernel more unstable.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
parkbd.port= [HW] Parallel port number the keyboard adapter is
|
|
|
|
connected to, default is 0.
|
|
|
|
Format: <parport#>
|
|
|
|
parkbd.mode= [HW] Parallel port keyboard adapter mode of operation,
|
|
|
|
0 for XT, 1 for AT (default is AT).
|
2005-10-24 03:57:11 +08:00
|
|
|
Format: <mode>
|
|
|
|
|
|
|
|
parport= [HW,PPT] Specify parallel ports. 0 disables.
|
|
|
|
Format: { 0 | auto | 0xBBB[,IRQ[,DMA]] }
|
|
|
|
Use 'auto' to force the driver to use any
|
|
|
|
IRQ/DMA settings detected (the default is to
|
|
|
|
ignore detected IRQ/DMA settings because of
|
|
|
|
possible conflicts). You can specify the base
|
|
|
|
address, IRQ, and DMA settings; IRQ and DMA
|
|
|
|
should be numbers, or 'auto' (for using detected
|
|
|
|
settings on that particular port), or 'nofifo'
|
|
|
|
(to avoid using a FIFO even if it is detected).
|
|
|
|
Parallel ports are assigned in the order they
|
|
|
|
are specified on the command line, starting
|
|
|
|
with parport0.
|
|
|
|
|
|
|
|
parport_init_mode= [HW,PPT]
|
|
|
|
Configure VIA parallel port to operate in
|
|
|
|
a specific mode. This is necessary on Pegasos
|
|
|
|
computer where firmware has no options for setting
|
|
|
|
up parallel port mode and sets it to spp.
|
|
|
|
Currently this function knows 686a and 8231 chips.
|
2005-04-17 06:20:36 +08:00
|
|
|
Format: [spp|ps2|epp|ecp|ecpepp]
|
|
|
|
|
2021-03-22 03:55:22 +08:00
|
|
|
pata_legacy.all= [HW,LIBATA]
|
|
|
|
Format: <int>
|
|
|
|
Set to non-zero to probe primary and secondary ISA
|
|
|
|
port ranges on PCI systems where no PCI PATA device
|
|
|
|
has been found at either range. Disabled by default.
|
|
|
|
|
|
|
|
pata_legacy.autospeed= [HW,LIBATA]
|
|
|
|
Format: <int>
|
|
|
|
Set to non-zero if a chip is present that snoops speed
|
|
|
|
changes. Disabled by default.
|
|
|
|
|
|
|
|
pata_legacy.ht6560a= [HW,LIBATA]
|
|
|
|
Format: <int>
|
|
|
|
Set to 1, 2, or 3 for HT 6560A on the primary channel,
|
|
|
|
the secondary channel, or both channels respectively.
|
|
|
|
Disabled by default.
|
|
|
|
|
|
|
|
pata_legacy.ht6560b= [HW,LIBATA]
|
|
|
|
Format: <int>
|
|
|
|
Set to 1, 2, or 3 for HT 6560B on the primary channel,
|
|
|
|
the secondary channel, or both channels respectively.
|
|
|
|
Disabled by default.
|
|
|
|
|
|
|
|
pata_legacy.iordy_mask= [HW,LIBATA]
|
|
|
|
Format: <int>
|
|
|
|
IORDY enable mask. Set individual bits to allow IORDY
|
|
|
|
for the respective channel. Bit 0 is for the first
|
|
|
|
legacy channel handled by this driver, bit 1 is for
|
|
|
|
the second channel, and so on. The sequence will often
|
|
|
|
correspond to the primary legacy channel, the secondary
|
|
|
|
legacy channel, and so on, but the handling of a PCI
|
|
|
|
bus and the use of other driver options may interfere
|
|
|
|
with the sequence. By default IORDY is allowed across
|
|
|
|
all channels.
|
|
|
|
|
|
|
|
pata_legacy.opti82c46x= [HW,LIBATA]
|
|
|
|
Format: <int>
|
|
|
|
Set to 1, 2, or 3 for Opti 82c465MV on the primary
|
|
|
|
channel, the secondary channel, or both channels
|
|
|
|
respectively. Disabled by default.
|
|
|
|
|
|
|
|
pata_legacy.opti82c611a= [HW,LIBATA]
|
|
|
|
Format: <int>
|
|
|
|
Set to 1, 2, or 3 for Opti 82c611A on the primary
|
|
|
|
channel, the secondary channel, or both channels
|
|
|
|
respectively. Disabled by default.
|
|
|
|
|
|
|
|
pata_legacy.pio_mask= [HW,LIBATA]
|
|
|
|
Format: <int>
|
|
|
|
PIO mode mask for autospeed devices. Set individual
|
|
|
|
bits to allow the use of the respective PIO modes.
|
|
|
|
Bit 0 is for mode 0, bit 1 is for mode 1, and so on.
|
|
|
|
All modes allowed by default.
|
|
|
|
|
|
|
|
pata_legacy.probe_all= [HW,LIBATA]
|
|
|
|
Format: <int>
|
|
|
|
Set to non-zero to probe tertiary and further ISA
|
|
|
|
port ranges on PCI systems. Disabled by default.
|
|
|
|
|
pata_legacy: Add `probe_mask' parameter like with ide-generic
Carry the `probe_mask' parameter over from ide-generic to pata_legacy so
that there is a way to prevent random poking at ISA port I/O locations
in attempt to discover adapter option cards with libata like with the
old IDE driver. By default all enabled locations are tried, however it
may interfere with a different kind of hardware responding there.
For example with a plain (E)ISA system the driver tries all the six
possible locations:
scsi host0: pata_legacy
ata1: PATA max PIO4 cmd 0x1f0 ctl 0x3f6 irq 14
ata1.00: ATA-4: ST310211A, 3.54, max UDMA/100
ata1.00: 19541088 sectors, multi 16: LBA
ata1.00: configured for PIO
scsi 0:0:0:0: Direct-Access ATA ST310211A 3.54 PQ: 0 ANSI: 5
scsi 0:0:0:0: Attached scsi generic sg0 type 0
sd 0:0:0:0: [sda] 19541088 512-byte logical blocks: (10.0 GB/9.32 GiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sda: sda1 sda2 sda3
sd 0:0:0:0: [sda] Attached SCSI disk
scsi host1: pata_legacy
ata2: PATA max PIO4 cmd 0x170 ctl 0x376 irq 15
scsi host1: pata_legacy
ata3: PATA max PIO4 cmd 0x1e8 ctl 0x3ee irq 11
scsi host1: pata_legacy
ata4: PATA max PIO4 cmd 0x168 ctl 0x36e irq 10
scsi host1: pata_legacy
ata5: PATA max PIO4 cmd 0x1e0 ctl 0x3e6 irq 8
scsi host1: pata_legacy
ata6: PATA max PIO4 cmd 0x160 ctl 0x366 irq 12
however giving the kernel "pata_legacy.probe_mask=21" makes it try every
other location only:
scsi host0: pata_legacy
ata1: PATA max PIO4 cmd 0x1f0 ctl 0x3f6 irq 14
ata1.00: ATA-4: ST310211A, 3.54, max UDMA/100
ata1.00: 19541088 sectors, multi 16: LBA
ata1.00: configured for PIO
scsi 0:0:0:0: Direct-Access ATA ST310211A 3.54 PQ: 0 ANSI: 5
scsi 0:0:0:0: Attached scsi generic sg0 type 0
sd 0:0:0:0: [sda] 19541088 512-byte logical blocks: (10.0 GB/9.32 GiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sda: sda1 sda2 sda3
sd 0:0:0:0: [sda] Attached SCSI disk
scsi host1: pata_legacy
ata2: PATA max PIO4 cmd 0x1e8 ctl 0x3ee irq 11
scsi host1: pata_legacy
ata3: PATA max PIO4 cmd 0x1e0 ctl 0x3e6 irq 8
Signed-off-by: Maciej W. Rozycki <macro@orcam.me.uk>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/alpine.DEB.2.21.2103211800110.21463@angie.orcam.me.uk
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2021-03-22 03:55:32 +08:00
|
|
|
pata_legacy.probe_mask= [HW,LIBATA]
|
|
|
|
Format: <int>
|
|
|
|
Probe mask for legacy ISA PATA ports. Depending on
|
|
|
|
platform configuration and the use of other driver
|
|
|
|
options up to 6 legacy ports are supported: 0x1f0,
|
|
|
|
0x170, 0x1e8, 0x168, 0x1e0, 0x160, however probing
|
|
|
|
of individual ports can be disabled by setting the
|
|
|
|
corresponding bits in the mask to 0. Bit 0 is for
|
|
|
|
the first port in the list above (0x1f0), and so on.
|
|
|
|
By default all supported ports are probed.
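As a sketch of how a mask value maps onto the six legacy ports, the Python below assumes that a set bit means the port is probed. That assumption matches the "pata_legacy.probe_mask=21" example in the commit message, where only 0x1f0, 0x1e8 and 0x1e0 are tried. The names LEGACY_PORTS and probed_ports are hypothetical.

```python
# Legacy ISA PATA ports in the order given for probe_mask; bit 0 is 0x1f0.
LEGACY_PORTS = [0x1F0, 0x170, 0x1E8, 0x168, 0x1E0, 0x160]


def probed_ports(mask):
    """Return the ports whose bit is set in mask.

    Assumption for this sketch: a set bit means the port is probed.
    """
    return [port for bit, port in enumerate(LEGACY_PORTS) if mask & (1 << bit)]


# 21 is binary 010101, so bits 0, 2 and 4 are set.
print([hex(p) for p in probed_ports(21)])
```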
|
|
|
|
|
2021-03-22 03:55:22 +08:00
|
|
|
pata_legacy.qdi= [HW,LIBATA]
|
|
|
|
Format: <int>
|
|
|
|
Set to non-zero to probe QDI controllers. By default
|
|
|
|
set to 1 if CONFIG_PATA_QDI_MODULE, 0 otherwise.
|
|
|
|
|
|
|
|
pata_legacy.winbond= [HW,LIBATA]
|
|
|
|
Format: <int>
|
|
|
|
Set to non-zero to probe Winbond controllers. Use
|
|
|
|
the standard I/O port (0x130) if 1, otherwise the
|
|
|
|
value given is the I/O port to use (typically 0x1b0).
|
|
|
|
By default set to 1 if CONFIG_PATA_WINBOND_VLB_MODULE,
|
|
|
|
0 otherwise.
|
|
|
|
|
2021-03-22 03:55:27 +08:00
|
|
|
pata_platform.pio_mask= [HW,LIBATA]
|
|
|
|
Format: <int>
|
|
|
|
Supported PIO mode mask. Set individual bits to allow
|
|
|
|
the use of the respective PIO modes. Bit 0 is for
|
|
|
|
mode 0, bit 1 is for mode 1, and so on. Mode 0 only
|
|
|
|
allowed by default.
|
|
|
|
|
2006-03-23 19:00:57 +08:00
|
|
|
pause_on_oops=
|
|
|
|
Halt all CPUs after the first oops has been printed for
|
|
|
|
the specified number of seconds. This is to be used if
|
|
|
|
your oopses keep scrolling off the screen.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
pcbit= [HW,ISDN]
|
|
|
|
|
|
|
|
pcd. [PARIDE]
|
|
|
|
See header of drivers/block/paride/pcd.c.
|
2019-06-18 22:47:10 +08:00
|
|
|
See also Documentation/admin-guide/blockdev/paride.rst.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2018-07-31 00:18:37 +08:00
|
|
|
pci=option[,option...] [PCI] various PCI subsystem options.
|
|
|
|
|
|
|
|
Some options herein operate on a specific device
|
|
|
|
or a set of devices (<pci_dev>). These are
|
|
|
|
specified in one of the following formats:
|
|
|
|
|
2018-07-31 00:18:38 +08:00
|
|
|
[<domain>:]<bus>:<dev>.<func>[/<dev>.<func>]*
|
2018-07-31 00:18:37 +08:00
|
|
|
pci:<vendor>:<device>[:<subvendor>:<subdevice>]
|
|
|
|
|
|
|
|
Note: the first format specifies a PCI
|
|
|
|
bus/device/function address which may change
|
|
|
|
if new hardware is inserted, if motherboard
|
|
|
|
firmware changes, or due to changes caused
|
|
|
|
by other kernel parameters. If the
|
|
|
|
domain is left unspecified, it is
|
2018-07-31 00:18:38 +08:00
|
|
|
taken to be zero. Optionally, a path
|
|
|
|
to a device through multiple device/function
|
|
|
|
addresses can be specified after the base
|
|
|
|
address (this is more robust against
|
|
|
|
renumbering issues). The second format
|
2018-07-31 00:18:37 +08:00
|
|
|
selects devices using IDs from the
|
|
|
|
configuration space which may match multiple
|
|
|
|
devices in the system.
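To make the two <pci_dev> formats concrete, here is a minimal Python sketch of a parser for them. It is not the kernel's parser: the function name parse_pci_dev and the returned dictionary layout are hypothetical, and the sketch assumes that every numeric field is hexadecimal, as in lspci output.

```python
import re

HEX = r"[0-9a-fA-F]+"

# [<domain>:]<bus>:<dev>.<func>[/<dev>.<func>]*
ADDR_RE = re.compile(
    rf"^(?:(?P<domain>{HEX}):)?(?P<bus>{HEX}):(?P<dev>{HEX})\.(?P<func>{HEX})"
    rf"(?P<path>(?:/{HEX}\.{HEX})*)$"
)

# pci:<vendor>:<device>[:<subvendor>:<subdevice>]
ID_RE = re.compile(
    rf"^pci:(?P<vendor>{HEX}):(?P<device>{HEX})"
    rf"(?::(?P<subvendor>{HEX}):(?P<subdevice>{HEX}))?$"
)


def parse_pci_dev(spec):
    """Parse one <pci_dev> string into a dict (assumed layout, for illustration)."""
    m = ID_RE.match(spec)
    if m:
        ids = {k: int(v, 16) for k, v in m.groupdict().items() if v is not None}
        return {"kind": "id", **ids}
    m = ADDR_RE.match(spec)
    if m:
        # Each "/<dev>.<func>" hop becomes a (dev, func) tuple.
        hops = [h for h in m["path"].split("/") if h]
        path = [tuple(int(x, 16) for x in h.split(".")) for h in hops]
        return {
            "kind": "address",
            "domain": int(m["domain"] or "0", 16),  # unspecified domain is zero
            "bus": int(m["bus"], 16),
            "dev": int(m["dev"], 16),
            "func": int(m["func"], 16),
            "path": path,
        }
    raise ValueError(f"not a valid <pci_dev>: {spec!r}")


if __name__ == "__main__":
    print(parse_pci_dev("00:1c.0"))
    print(parse_pci_dev("0000:00:02.0/00.0"))
    print(parse_pci_dev("pci:8086:9c22:103c:198f"))
```

The ID form is tried first because its "pci:" prefix cannot be confused with a hexadecimal domain or bus number.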
|
|
|
|
|
2018-06-05 10:16:09 +08:00
|
|
|
earlydump dump PCI config space before the kernel
|
2018-04-19 02:51:39 +08:00
|
|
|
changes anything
|
2008-08-22 15:53:39 +08:00
|
|
|
off [X86] don't probe for the PCI bus
|
2007-07-31 15:37:59 +08:00
|
|
|
bios [X86-32] force use of PCI BIOS, don't access
|
2005-10-24 03:57:11 +08:00
|
|
|
the hardware directly. Use this if your machine
|
|
|
|
has a non-standard PCI host bridge.
|
2007-07-31 15:37:59 +08:00
|
|
|
nobios [X86-32] disallow use of PCI BIOS, only direct
|
2005-10-24 03:57:11 +08:00
|
|
|
hardware access methods are allowed. Use this
|
|
|
|
if you experience crashes upon bootup and you
|
|
|
|
suspect they are caused by the BIOS.
|
2016-01-13 23:48:51 +08:00
|
|
|
conf1 [X86] Force use of PCI Configuration Access
|
|
|
|
Mechanism 1 (config address in IO port 0xCF8,
|
|
|
|
data in IO port 0xCFC, both 32-bit).
|
|
|
|
conf2 [X86] Force use of PCI Configuration Access
|
|
|
|
Mechanism 2 (IO port 0xCF8 is an 8-bit port for
|
|
|
|
the function, IO port 0xCFA, also 8-bit, sets
|
|
|
|
bus number. The config space is then accessed
|
|
|
|
through ports 0xC000-0xCFFF).
|
|
|
|
See http://wiki.osdev.org/PCI for more info
|
|
|
|
on the configuration access mechanisms.
|
2007-10-06 04:17:58 +08:00
|
|
|
noaer [PCIE] If the PCIEAER kernel config parameter is
|
|
|
|
enabled, this kernel boot option can be used to
|
|
|
|
disable the use of PCIE advanced error reporting.
|
2007-10-12 04:57:27 +08:00
|
|
|
nodomains [PCI] Disable support for multiple PCI
|
|
|
|
root domains (aka PCI segments, in ACPI-speak).
|
2009-04-14 16:33:43 +08:00
|
|
|
nommconf [X86] Disable use of MMCONFIG for PCI
|
2006-02-16 07:17:43 +08:00
|
|
|
Configuration
|
2009-06-07 22:15:16 +08:00
|
|
|
check_enable_amd_mmconf [X86] check for and enable
|
|
|
|
properly configured MMIO access to PCI
|
|
|
|
config space on AMD family 10h CPUs
|
2006-03-06 13:33:34 +08:00
|
|
|
nomsi [MSI] If the PCI_MSI kernel config parameter is
|
|
|
|
enabled, this kernel boot option can be used to
|
|
|
|
disable the use of MSI interrupts system-wide.
|
2008-06-11 22:35:14 +08:00
|
|
|
noioapicquirk [APIC] Disable all boot interrupt quirks.
|
|
|
|
Safety option to keep boot IRQs enabled. This
|
|
|
|
should never be necessary.
|
2008-06-11 22:35:15 +08:00
|
|
|
ioapicreroute [APIC] Enable rerouting of boot IRQs to the
|
|
|
|
primary IO-APIC for bridges that cannot disable
|
|
|
|
boot IRQs. This fixes a source of spurious IRQs
|
|
|
|
when the system masks IRQs.
|
2008-07-15 19:48:55 +08:00
|
|
|
noioapicreroute [APIC] Disable workaround that uses the
|
|
|
|
boot IRQ equivalent of an IRQ that connects to
|
|
|
|
a chipset where boot IRQs cannot be disabled.
|
|
|
|
The opposite of ioapicreroute.
|
2007-07-31 15:37:59 +08:00
|
|
|
biosirq [X86-32] Use PCI BIOS calls to get the interrupt
|
2005-10-24 03:57:11 +08:00
|
|
|
routing table. These calls are known to be buggy
|
|
|
|
on several machines and they hang the machine
|
|
|
|
when used, but on other computers it's the only
|
|
|
|
way to get the interrupt routing table. Try
|
|
|
|
this option if the kernel is unable to allocate
|
|
|
|
IRQs or discover secondary PCI buses on your
|
|
|
|
motherboard.
|
2008-08-22 15:53:39 +08:00
|
|
|
rom [X86] Assign address space to expansion ROMs.
|
2005-10-24 03:57:11 +08:00
|
|
|
Use with caution as certain devices share
|
|
|
|
address decoders between ROMs and other
|
|
|
|
resources.
|
2008-08-22 15:53:39 +08:00
|
|
|
norom [X86] Do not assign address space to
|
2008-05-13 04:57:46 +08:00
|
|
|
expansion ROMs that do not already have
|
|
|
|
BIOS assigned address ranges.
|
2010-05-13 02:14:32 +08:00
|
|
|
nobar [X86] Do not assign address space to the
|
|
|
|
BARs that weren't assigned by the BIOS.
|
2008-08-22 15:53:39 +08:00
|
|
|
irqmask=0xMMMM [X86] Set a bit mask of IRQs allowed to be
|
2005-10-24 03:57:11 +08:00
|
|
|
assigned automatically to PCI devices. You can
|
|
|
|
make the kernel exclude IRQs of your ISA cards
|
|
|
|
this way.
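A small Python sketch of how such a mask might be derived. It assumes that bit N of the mask stands for IRQ N and that the mask is 16 bits wide, as the 0xMMMM notation suggests; the helper name irqmask and the IRQ numbers in the example are hypothetical.

```python
def irqmask(excluded_irqs, width=16):
    """Return a mask with every IRQ allowed except the excluded ones.

    Assumption: bit N set means IRQ N may be assigned to PCI devices.
    """
    mask = (1 << width) - 1
    for irq in excluded_irqs:
        if not 0 <= irq < width:
            raise ValueError(f"IRQ out of range: {irq}")
        mask &= ~(1 << irq)
    return mask


if __name__ == "__main__":
    # Hypothetical ISA cards on IRQ 5 and IRQ 10.
    print(f"pci=irqmask=0x{irqmask([5, 10]):04x}")
```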
|
2008-08-22 15:53:39 +08:00
|
|
|
pirqaddr=0xAAAAA [X86] Specify the physical address
|
2005-10-24 03:57:11 +08:00
|
|
|
of the PIRQ table (normally generated
|
|
|
|
by the BIOS) if it is outside the
|
|
|
|
F0000h-100000h range.
|
2008-08-22 15:53:39 +08:00
|
|
|
lastbus=N [X86] Scan all buses through bus #N. Can be
|
2005-10-24 03:57:11 +08:00
|
|
|
useful if the kernel is unable to find your
|
|
|
|
secondary buses and you want to tell it
|
|
|
|
explicitly which ones they are.
|
2008-08-22 15:53:39 +08:00
|
|
|
assign-busses [X86] Always assign all PCI bus
|
2005-10-24 03:57:11 +08:00
|
|
|
numbers ourselves, overriding
|
|
|
|
whatever the firmware may have done.
|
2008-08-22 15:53:39 +08:00
|
|
|
usepirqmask [X86] Honor the possible IRQ mask stored
|
2005-10-24 03:57:11 +08:00
|
|
|
in the BIOS $PIR table. This is needed on
|
|
|
|
some systems with broken BIOSes, notably
|
|
|
|
some HP Pavilion N5400 and Omnibook XE3
|
|
|
|
notebooks. This will have no effect if ACPI
|
|
|
|
IRQ routing is enabled.
|
2008-08-22 15:53:39 +08:00
|
|
|
noacpi [X86] Do not use ACPI for IRQ routing
|
2005-10-24 03:57:11 +08:00
|
|
|
or for PCI scanning.
|
2010-02-24 01:24:41 +08:00
|
|
|
use_crs [X86] Use PCI host bridge window information
|
|
|
|
from ACPI. On BIOSes from 2008 or later, this
|
|
|
|
is enabled by default. If you need to use this,
|
|
|
|
please report a bug.
|
|
|
|
nocrs [X86] Ignore PCI host bridge windows from ACPI.
|
2018-04-19 02:51:39 +08:00
|
|
|
If you need to use this, please report a bug.
|
2005-10-24 03:57:11 +08:00
|
|
|
routeirq Do IRQ routing for all PCI devices.
|
|
|
|
This is normally done in pci_enable_device(),
|
|
|
|
so this option is a temporary workaround
|
|
|
|
for broken drivers that don't call it.
|
2008-03-27 16:31:18 +08:00
|
|
|
skip_isa_align [X86] Do not align I/O start addresses, so that
|
|
|
|
more PCI cards can be handled.
|
2006-09-26 16:52:41 +08:00
|
|
|
noearly [X86] Don't do any early type 1 scanning.
|
|
|
|
This might help on some broken boards which
|
|
|
|
machine check when some devices' config space
|
|
|
|
is read. But various workarounds are disabled
|
|
|
|
and some IOMMU drivers will not work.
|
PCI: optionally sort device lists breadth-first
Problem:
New Dell PowerEdge servers have 2 embedded ethernet ports, which are
labeled NIC1 and NIC2 on the chassis, in the BIOS setup screens, and
in the printed documentation. Assuming no other add-in ethernet ports
in the system, Linux 2.4 kernels name these eth0 and eth1
respectively. Many people have come to expect this naming. Linux 2.6
kernels name these eth1 and eth0 respectively (backwards from
expectations). I also have reports that various Sun and HP servers
have similar behavior.
Root cause:
Linux 2.4 kernels walk the pci_devices list, which happens to be
sorted in breadth-first order (or pcbios_find_device order on i386,
which most often is breadth-first also). 2.6 kernels have both the
pci_devices list and the pci_bus_type.klist_devices list, the latter
is what is walked at driver load time to match the pci_id tables; this
klist happens to be in depth-first order.
On systems where, for physical routing reasons, NIC1 appears on a
lower bus number than NIC2, but NIC2's bridge is discovered first in
the depth-first ordering, NIC2 will be discovered before NIC1. If the
list were sorted breadth-first, NIC1 would be discovered before NIC2.
A PowerEdge 1955 system has the following topology which easily
exhibits the difference between depth-first and breadth-first device
lists.
-[0000:00]-+-00.0 Intel Corporation 5000P Chipset Memory Controller Hub
+-02.0-[0000:03-08]--+-00.0-[0000:04-07]--+-00.0-[0000:05-06]----00.0-[0000:06]----00.0 Broadcom Corporation NetXtreme II BCM5708S Gigabit Ethernet (labeled NIC2, 2.4 kernel name eth1, 2.6 kernel name eth0)
+-1c.0-[0000:01-02]----00.0-[0000:02]----00.0 Broadcom Corporation NetXtreme II BCM5708S Gigabit Ethernet (labeled NIC1, 2.4 kernel name eth0, 2.6 kernel name eth1)
Other factors, such as device driver load order and the presence of
PCI slots at various points in the bus hierarchy further complicate
this problem; I'm not trying to solve those here, just restore the
device order, and thus basic behavior, that 2.4 kernels had.
Solution:
The solution can come in multiple steps.
Suggested fix #1: kernel
Patch below optionally sorts the two device lists into breadth-first
ordering to maintain compatibility with 2.4 kernels. It adds two new
command line options:
pci=bfsort
pci=nobfsort
to force the sort order, or not, as you wish. It also adds DMI checks
for the specific Dell systems which exhibit "backwards" ordering, to
make them "right".
Suggested fix #2: udev rules from userland
Many people also have the expectation that embedded NICs are always
discovered before add-in NICs (which this patch does not try to do).
Using the PCI IRQ Routing Table provided by system BIOS, it's easy to
determine which PCI devices are embedded, or if add-in, which PCI slot
they're in. I'm working on a tool that would allow udev to name
ethernet devices in ascending embedded, slot 1 .. slot N order,
subsort by PCI bus/dev/fn breadth-first. It'll be possible to use it
independent of udev as well for those distributions that don't use
udev in their installers.
Suggested fix #3: system board routing rules
One can constrain the system board layout to put NIC1 ahead of NIC2
regardless of breadth-first or depth-first discovery order. This adds
a significant level of complexity to board routing, and may not be
possible in all instances (witness the above systems from several
major manufacturers). I don't want to encourage this particular train
of thought too far, at the expense of not doing #1 or #2 above.
Feedback appreciated. Patch tested on a Dell PowerEdge 1955 blade
with 2.6.18.
You'll also note I took some liberty and temporarily break the klist
abstraction to simplify and speed up the sort algorithm. I think
that's both safe and appropriate in this instance.
Signed-off-by: Matt Domsch <Matt_Domsch@dell.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-09-30 04:23:23 +08:00
|
|
|
bfsort Sort PCI devices into breadth-first order.
|
|
|
|
This sorting is done to get a device
|
|
|
|
order compatible with older (<= 2.4) kernels.
|
|
|
|
nobfsort Don't sort PCI devices into breadth-first order.
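The difference between the two orderings can be illustrated with a toy Python model of the PowerEdge 1955 topology quoted in the commit message above. This is only a sketch of generic depth-first and breadth-first tree walks, not the kernel's sorting code; the node labels are made up for readability.

```python
from collections import deque

# Simplified bridge hierarchy: NIC2 sits behind four bridges below 02.0,
# NIC1 sits behind one bridge below 1c.0.
TREE = {
    "root": ["bridge 00:02.0", "bridge 00:1c.0"],
    "bridge 00:02.0": ["bridge 03:00.0"],
    "bridge 03:00.0": ["bridge 04:00.0"],
    "bridge 04:00.0": ["bridge 05:00.0"],
    "bridge 05:00.0": ["NIC2"],
    "bridge 00:1c.0": ["bridge 01:00.0"],
    "bridge 01:00.0": ["NIC1"],
}


def depth_first(node):
    """Visit a node, then each child subtree in full before the next sibling."""
    order = [node]
    for child in TREE.get(node, []):
        order.extend(depth_first(child))
    return order


def breadth_first(root):
    """Visit nodes level by level, starting at the root."""
    order, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        order.append(node)
        queue.extend(TREE.get(node, []))
    return order


def nics(order):
    """Keep only the NIC entries, preserving discovery order."""
    return [n for n in order if n.startswith("NIC")]


if __name__ == "__main__":
    print("depth-first:  ", nics(depth_first("root")))
    print("breadth-first:", nics(breadth_first("root")))
```

In this model the depth-first walk reaches NIC2 before NIC1, while the breadth-first walk reaches NIC1 first, which mirrors the eth0/eth1 swap that the commit message describes.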
|
2013-01-30 09:40:52 +08:00
|
|
|
pcie_bus_tune_off Disable PCIe MPS (Max Payload Size)
|
|
|
|
tuning and use the BIOS-configured MPS defaults.
|
|
|
|
pcie_bus_safe Set every device's MPS to the largest value
|
|
|
|
supported by all devices below the root complex.
|
|
|
|
pcie_bus_perf Set device MPS to the largest allowable MPS
|
|
|
|
based on its parent bus. Also set MRRS (Max
|
|
|
|
Read Request Size) to the largest supported
|
|
|
|
value (no larger than the MPS that the device
|
|
|
|
or bus can support) for best performance.
|
|
|
|
pcie_bus_peer2peer Set every device's MPS to 128B, which
|
|
|
|
every device is guaranteed to support. This
|
|
|
|
configuration allows peer-to-peer DMA between
|
|
|
|
any pair of devices, possibly at the cost of
|
|
|
|
reduced performance. This also guarantees
|
|
|
|
that hot-added devices will work.
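As a rough sketch of two of the policies above, the Python below picks an MPS value from a table of per-device maximums. The device names and sizes are hypothetical, and the functions only model the selection rule stated in the text, not the kernel's bus walk. pcie_bus_perf depends on each device's parent bus and is not modelled here.

```python
MIN_MPS = 128  # bytes; per the text, every device is guaranteed to support this


def safe_mps(max_supported):
    """pcie_bus_safe: the largest MPS that every listed device supports."""
    return min(max_supported.values())


def peer2peer_mps(max_supported):
    """pcie_bus_peer2peer: always 128 bytes, regardless of device capabilities."""
    return MIN_MPS


if __name__ == "__main__":
    # Hypothetical maximum payload sizes (bytes) below one root complex.
    devices = {"root port": 512, "switch": 256, "nic": 256, "nvme": 512}
    print("pcie_bus_safe      ->", safe_mps(devices))
    print("pcie_bus_peer2peer ->", peer2peer_mps(devices))
```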
|
2007-02-06 08:36:06 +08:00
|
|
|
cbiosize=nn[KMG] The fixed amount of bus space which is
|
|
|
|
reserved for the CardBus bridge's IO window.
|
|
|
|
The default value is 256 bytes.
|
|
|
|
cbmemsize=nn[KMG] The fixed amount of bus space which is
|
|
|
|
reserved for the CardBus bridge's memory
|
|
|
|
window. The default value is 64 megabytes.
|
2009-03-16 16:13:39 +08:00
|
|
|
resource_alignment=
|
|
|
|
Format:
|
2018-07-31 00:18:37 +08:00
|
|
|
[<order of align>@]<pci_dev>[; ...]
|
2009-03-16 16:13:39 +08:00
|
|
|
Specifies alignment and device to reassign
|
2018-07-31 00:18:37 +08:00
|
|
|
aligned memory resources. How to
|
|
|
|
specify the device is described above.
|
2009-03-16 16:13:39 +08:00
|
|
|
If <order of align> is not specified,
|
|
|
|
PAGE_SIZE is used as alignment.
|
2019-06-06 11:25:57 +08:00
|
|
|
A PCI-PCI bridge can be specified if resource
|
2009-03-16 16:13:39 +08:00
|
|
|
windows need to be expanded.
|
2016-08-09 16:33:31 +08:00
|
|
|
To specify the alignment for several
|
|
|
|
instances of a device, the PCI vendor,
|
|
|
|
device, subvendor, and subdevice may be
|
2019-06-06 11:25:57 +08:00
|
|
|
specified, e.g., 12@pci:8086:9c22:103c:198f
|
|
|
|
for 4096-byte alignment.
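The relationship between <order of align> and the resulting alignment can be sketched in Python as follows. The helper names are hypothetical; the only facts taken from the text are that an order of 12 gives 4096-byte alignment and that the order is joined to the device with "@".

```python
def alignment_bytes(order):
    """Alignment in bytes for a given <order of align> (2 to the power of order)."""
    return 1 << order


def resource_alignment(entries):
    """Build the option string from (order or None, <pci_dev>) pairs.

    None means no order is given, in which case PAGE_SIZE is used.
    """
    parts = [dev if order is None else f"{order}@{dev}" for order, dev in entries]
    return "pci=resource_alignment=" + ";".join(parts)


if __name__ == "__main__":
    print(alignment_bytes(12))
    print(resource_alignment([(12, "pci:8086:9c22:103c:198f"), (None, "00:1c.0")]))
```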
|
2009-04-23 06:52:09 +08:00
|
|
|
ecrc= Enable/disable PCIe ECRC (transaction layer
|
|
|
|
end-to-end CRC checking).
|
|
|
|
bios: Use BIOS/firmware settings. This is the
|
|
|
|
default.
|
|
|
|
off: Turn ECRC off
|
|
|
|
on: Turn ECRC on.
|
2013-01-23 20:29:06 +08:00
|
|
|
hpiosize=nn[KMG] The fixed amount of bus space which is
|
|
|
|
reserved for hotplug bridge's IO window.
|
|
|
|
Default size is 256 bytes.
|
2019-10-23 20:12:29 +08:00
|
|
|
hpmmiosize=nn[KMG] The fixed amount of bus space which is
|
|
|
|
reserved for hotplug bridge's MMIO window.
|
|
|
|
Default size is 2 megabytes.
|
|
|
|
hpmmioprefsize=nn[KMG] The fixed amount of bus space which is
|
|
|
|
reserved for hotplug bridge's MMIO_PREF window.
|
|
|
|
Default size is 2 megabytes.
|
2013-01-23 20:29:06 +08:00
|
|
|
hpmemsize=nn[KMG] The fixed amount of bus space which is
|
2019-10-23 20:12:29 +08:00
|
|
|
reserved for hotplug bridge's MMIO and
|
|
|
|
MMIO_PREF window.
|
2013-01-23 20:29:06 +08:00
|
|
|
Default size is 2 megabytes.
|
2016-07-22 11:40:28 +08:00
|
|
|
hpbussize=nn The minimum amount of additional bus numbers
|
|
|
|
reserved for buses below a hotplug bridge.
|
|
|
|
Default is 1.
|
2012-02-24 11:23:30 +08:00
|
|
|
realloc= Enable/disable reallocating PCI bridge resources
|
|
|
|
if allocations done by BIOS are too small to
|
|
|
|
accommodate resources required by all child
|
|
|
|
devices.
|
|
|
|
off: Turn realloc off
|
|
|
|
on: Turn realloc on
|
|
|
|
realloc same as realloc=on
|
2012-03-01 07:06:33 +08:00
|
|
|
noari do not use PCIe ARI.
|
2018-05-11 06:56:02 +08:00
|
|
|
noats [PCIE, Intel-IOMMU, AMD-IOMMU]
|
|
|
|
do not use PCIe ATS (and IOMMU device IOTLB).
|
2012-05-01 05:21:02 +08:00
|
|
|
pcie_scan_all Scan all possible PCIe devices. Otherwise we
|
|
|
|
only look for one device below a PCIe downstream
|
|
|
|
port.
|
2018-01-11 21:23:29 +08:00
|
|
|
big_root_window Try to add a big 64bit memory window to the PCIe
|
|
|
|
root complex on AMD CPUs. Some GFX hardware
|
|
|
|
can resize a BAR to allow access to all VRAM.
|
|
|
|
Adding the window is slightly risky (it may
|
|
|
|
conflict with unreported devices), so this
|
|
|
|
taints the kernel.
|
2018-07-31 00:18:40 +08:00
|
|
|
disable_acs_redir=<pci_dev>[; ...]
|
|
|
|
Specify one or more PCI devices (in the format
|
|
|
|
specified above) separated by semicolons.
|
|
|
|
Each device specified will have the PCI ACS
|
|
|
|
redirect capabilities forced off which will
|
|
|
|
allow P2P traffic between devices through
|
|
|
|
bridges without forcing it upstream. Note:
|
|
|
|
this removes isolation between devices and
|
|
|
|
may put more devices in an IOMMU group.
|
2019-02-26 23:07:32 +08:00
|
|
|
force_floating [S390] Force usage of floating interrupts.
|
2019-04-19 03:39:06 +08:00
|
|
|
nomio [S390] Do not use MIO instructions.
|
2020-04-01 17:12:24 +08:00
|
|
|
norid [S390] ignore the RID field and force use of
|
|
|
|
one PCI domain per PCI function
|
2006-09-30 04:23:23 +08:00
|
|
|
|
2008-09-25 08:40:34 +08:00
|
|
|
pcie_aspm= [PCIE] Forcibly enable or disable PCIe Active State Power
|
|
|
|
Management.
|
|
|
|
off Disable ASPM.
|
|
|
|
force Enable ASPM even on devices that claim not to support it.
|
|
|
|
WARNING: Forcing ASPM on may cause system lockups.
|
|
|
|
|
2018-03-10 01:21:28 +08:00
|
|
|
pcie_ports= [PCIE] PCIe port services handling:
|
|
|
|
native Use native PCIe services (PME, AER, DPC, PCIe hotplug)
|
|
|
|
even if the platform doesn't give the OS permission to
|
|
|
|
use them. This may cause conflicts if the platform
|
|
|
|
also tries to use these services.
|
2019-10-24 03:22:05 +08:00
|
|
|
dpc-native Use native PCIe service for DPC only. May
|
|
|
|
cause conflicts if firmware uses AER or DPC.
|
2018-03-10 01:21:28 +08:00
|
|
|
compat Disable native PCIe services (PME, AER, DPC, PCIe
|
|
|
|
hotplug).
|
2010-08-21 07:51:44 +08:00
|
|
|
|
2016-06-02 16:17:12 +08:00
|
|
|
pcie_port_pm= [PCIE] PCIe port power management handling:
|
|
|
|
off Disable power management of all PCIe ports
|
|
|
|
force Forcibly enable power management of all PCIe ports
|
|
|
|
|
2010-02-18 06:39:08 +08:00
|
|
|
pcie_pme= [PCIE,PM] Native PCIe PME signaling options:
|
2010-02-18 06:40:07 +08:00
|
|
|
nomsi Do not use MSI for native PCIe PME signaling (this makes
|
2010-08-22 04:02:38 +08:00
|
|
|
all PCIe root ports use INTx for all services).
|
2010-02-18 06:39:08 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
pcmv= [HW,PCMCIA] BadgePAD 4
|
|
|
|
|
2014-03-28 13:20:21 +08:00
|
|
|
pd_ignore_unused
|
|
|
|
[PM]
|
|
|
|
Keep all power-domains already enabled by bootloader on,
|
|
|
|
even if no driver has claimed them. This is useful
|
|
|
|
for debug and development, but should not be
|
|
|
|
needed on a platform with proper driver support.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
pd. [PARIDE]
|
2019-06-18 22:47:10 +08:00
|
|
|
See Documentation/admin-guide/blockdev/paride.rst.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
pdcchassis= [PARISC,HW] Disable/Enable PDC Chassis Status codes at
|
|
|
|
boot time.
|
|
|
|
Format: { 0 | 1 }
|
|
|
|
See arch/parisc/kernel/pdc_chassis.c
|
|
|
|
|
2009-08-14 14:00:50 +08:00
|
|
|
percpu_alloc= Select which percpu first chunk allocator to use.
|
2009-08-14 14:00:53 +08:00
|
|
|
Currently supported values are "embed" and "page".
|
|
|
|
Architectures may support a subset or none of the selections.
|
|
|
|
See comments in mm/percpu.c for details on each
|
|
|
|
allocator. This parameter is primarily for debugging
|
|
|
|
and performance comparison.
|
2009-06-22 10:56:24 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
pf. [PARIDE]
|
2019-06-18 22:47:10 +08:00
|
|
|
See Documentation/admin-guide/blockdev/paride.rst.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
pg. [PARIDE]
|
2019-06-18 22:47:10 +08:00
|
|
|
See Documentation/admin-guide/blockdev/paride.rst.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
pirq= [SMP,APIC] Manual mp-table setup
|
2019-06-08 02:54:32 +08:00
|
|
|
See Documentation/x86/i386/IO-APIC.rst.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
plip= [PPT,NET] Parallel port network link
|
|
|
|
Format: { parport<nr> | timid | 0 }
|
2017-10-11 01:36:16 +08:00
|
|
|
See also Documentation/admin-guide/parport.rst.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2011-08-14 03:34:52 +08:00
|
|
|
pmtmr= [X86] Manual setup of pmtmr I/O Port.
|
2008-07-12 11:33:30 +08:00
|
|
|
Override pmtimer IOPort with a hex value.
|
|
|
|
e.g. pmtmr=0x508
|
|
|
|
|
2020-04-02 15:56:52 +08:00
|
|
|
pm_debug_messages [SUSPEND,KNL]
|
|
|
|
Enable suspend/resume debug messages during boot up.
|
|
|
|
|
2011-08-12 02:14:05 +08:00
|
|
|
pnp.debug=1 [PNP]
|
|
|
|
Enable PNP debug messages (depends on the
|
|
|
|
CONFIG_PNP_DEBUG_MESSAGES option). Change at run-time
|
|
|
|
via /sys/module/pnp/parameters/debug. We always show
|
|
|
|
current resource usage; turning this on also shows
|
|
|
|
possible settings and some assignment information.
|
2008-08-20 06:53:41 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
pnpacpi= [ACPI]
|
|
|
|
{ off }
|
|
|
|
|
|
|
|
pnpbios= [ISAPNP]
|
|
|
|
{ on | off | curr | res | no-curr | no-res }
|
|
|
|
|
|
|
|
pnp_reserve_irq=
|
|
|
|
[ISAPNP] Exclude IRQs for the autoconfiguration
|
|
|
|
|
|
|
|
pnp_reserve_dma=
|
|
|
|
[ISAPNP] Exclude DMAs for the autoconfiguration
|
|
|
|
|
|
|
|
pnp_reserve_io= [ISAPNP] Exclude I/O ports for the autoconfiguration
|
2005-10-24 03:57:11 +08:00
|
|
|
Ranges are in pairs (I/O port base and size).
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
pnp_reserve_mem=
|
2005-10-24 03:57:11 +08:00
|
|
|
[ISAPNP] Exclude memory regions for the
|
|
|
|
autoconfiguration.
|
2005-04-17 06:20:36 +08:00
|
|
|
Ranges are in pairs (memory base and size).
|
|
|
|
|
2009-04-18 09:30:28 +08:00
|
|
|
ports= [IP_VS_FTP] IPVS ftp helper module
|
|
|
|
Default is 21.
|
|
|
|
Up to 8 (IP_VS_APP_MAX_PORTS) ports
|
|
|
|
may be specified.
|
|
|
|
Format: <port>,<port>....
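A minimal Python sketch of the format above, assuming plain decimal port numbers. The function name parse_ports is hypothetical; the limit of 8 and the default of 21 come from the text.

```python
IP_VS_APP_MAX_PORTS = 8
DEFAULT_PORTS = [21]


def parse_ports(value=None):
    """Parse a "<port>,<port>..." string; fall back to the default when empty."""
    if not value:
        return list(DEFAULT_PORTS)
    ports = [int(p) for p in value.split(",")]
    if len(ports) > IP_VS_APP_MAX_PORTS:
        raise ValueError(f"at most {IP_VS_APP_MAX_PORTS} ports may be specified")
    for port in ports:
        if not 0 < port < 65536:
            raise ValueError(f"invalid port: {port}")
    return ports


if __name__ == "__main__":
    print(parse_ports("21,2121"))
```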
|
|
|
|
|
2016-12-01 21:08:26 +08:00
|
|
|
powersave=off [PPC] This option disables power saving features.
|
|
|
|
It specifically disables cpuidle and sets the
|
|
|
|
platform machine description specific power_save
|
|
|
|
function to NULL. On idle, the CPU just reduces
|
|
|
|
execution priority.
|
|
|
|
|
2015-10-29 08:44:06 +08:00
|
|
|
ppc_strict_facility_enable
|
|
|
|
[PPC] This option catches any kernel floating point,
|
|
|
|
Altivec, VSX and SPE use outside of regions specifically
|
|
|
|
allowed (e.g. kernel_enable_fpu()/kernel_disable_fpu()).
|
|
|
|
There is some performance impact when enabling this.
|
|
|
|
|
2017-10-12 18:17:16 +08:00
|
|
|
ppc_tm= [PPC]
|
|
|
|
Format: {"off"}
|
|
|
|
Disable Hardware Transactional Memory
|
|
|
|
|
2021-01-18 22:12:19 +08:00
|
|
|
preempt= [KNL]
|
|
|
|
Select preemption mode if you have CONFIG_PREEMPT_DYNAMIC
|
|
|
|
none - Limited to cond_resched() calls
|
|
|
|
voluntary - Limited to cond_resched() and might_sleep() calls
|
|
|
|
full - Any section that isn't explicitly preempt disabled
|
|
|
|
can be preempted anytime.
|
|
|
|
|
2007-07-16 14:40:10 +08:00
|
|
|
print-fatal-signals=
|
|
|
|
[KNL] debug: print fatal signals
|
2009-11-08 23:46:42 +08:00
|
|
|
|
|
|
|
If enabled, warn about various signal handling
|
|
|
|
related application anomalies: too many signals,
|
|
|
|
too many POSIX.1 timers, fatal signals causing a
|
|
|
|
coredump - etc.
|
|
|
|
|
|
|
|
If you hit the warning due to signal overflow,
|
|
|
|
you might want to try "ulimit -i unlimited".
|
|
|
|
|
2007-07-16 14:40:10 +08:00
|
|
|
default: off.
|
|
|
|
|
2012-03-06 06:59:10 +08:00
|
|
|
printk.always_kmsg_dump=
|
|
|
|
Trigger kmsg_dump for cases other than kernel oops or
|
|
|
|
panics
|
|
|
|
Format: <bool> (1/Y/y=enable, 0/N/n=disable)
|
|
|
|
default: disabled
|
|
|
|
|
2016-08-03 05:04:07 +08:00
|
|
|
printk.devkmsg={on,off,ratelimit}
|
|
|
|
Control writing to /dev/kmsg.
|
|
|
|
on - unlimited logging to /dev/kmsg from userspace
|
|
|
|
off - logging to /dev/kmsg disabled
|
|
|
|
ratelimit - ratelimit the logging
|
|
|
|
Default: ratelimit
|
|
|
|
|
2007-07-16 14:40:25 +08:00
|
|
|
printk.time= Show timing data prefixed to each printk message line
|
|
|
|
Format: <bool> (1/Y/y=enable, 0/N/n=disable)
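The <bool> spellings listed above can be modelled with a few lines of Python. This sketch accepts only the six values named in the Format line; whether other spellings are accepted is not stated here, so the function (a hypothetical helper) rejects them.

```python
def parse_bool(value):
    """Map the documented <bool> spellings to True/False."""
    if value in ("1", "Y", "y"):
        return True
    if value in ("0", "N", "n"):
        return False
    raise ValueError(f"not a documented <bool> value: {value!r}")


if __name__ == "__main__":
    for text in ("1", "y", "N"):
        print(f"printk.time={text} ->", parse_bool(text))
```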
|
|
|
|
|
2009-04-06 06:55:22 +08:00
|
|
|
processor.max_cstate= [HW,ACPI]
|
|
|
|
Limit processor to maximum C-state
|
|
|
|
max_cstate=9 overrides any DMI blacklist limit.
|
|
|
|
|
|
|
|
processor.nocst [HW,ACPI]
|
|
|
|
Ignore the _CST method to determine C-states,
|
|
|
|
instead using the legacy FADT method
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
profile= [KNL] Enable kernel profiling via /proc/profile
|
2017-11-20 13:08:11 +08:00
|
|
|
Format: [<profiletype>,]<number>
|
|
|
|
Param: <profiletype>: "schedule", "sleep", or "kvm"
|
|
|
|
[defaults to kernel profiling]
|
2005-10-24 03:57:11 +08:00
|
|
|
Param: "schedule" - profile schedule points.
|
2007-10-25 00:23:50 +08:00
|
|
|
Param: "sleep" - profile D-state sleeping (millisecs).
|
|
|
|
Requires CONFIG_SCHEDSTATS
|
2007-10-20 09:08:22 +08:00
|
|
|
Param: "kvm" - profile VM exits.
|
2017-11-20 13:08:11 +08:00
|
|
|
Param: <number> - step/bucket size as a power of 2 for
|
|
|
|
statistical time based profiling.
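The Format line of this entry can be read as in the following Python sketch. The function name parse_profile and the "kernel" label for the default profile type are assumptions made for illustration; the sketch follows the documented format only and treats <number> as an exponent of 2.

```python
PROFILE_TYPES = {"schedule", "sleep", "kvm"}


def parse_profile(value):
    """Split "[<profiletype>,]<number>" into (type, number, step size)."""
    head, sep, tail = value.partition(",")
    if sep:
        if head not in PROFILE_TYPES:
            raise ValueError(f"unknown profile type: {head!r}")
        ptype, number = head, tail
    else:
        # No type given: defaults to kernel profiling.
        ptype, number = "kernel", head
    shift = int(number)
    if shift < 0:
        raise ValueError("<number> must not be negative")
    # <number> is a power of 2, so the step/bucket size is 2**number.
    return ptype, shift, 1 << shift


if __name__ == "__main__":
    print(parse_profile("2"))
    print(parse_profile("schedule,2"))
```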
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2020-09-18 09:56:40 +08:00
|
|
|
prompt_ramdisk= [RAM] [Deprecated]
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2019-10-23 19:56:36 +08:00
|
|
|
prot_virt= [S390] enable hosting protected virtual machines
|
|
|
|
isolated from the hypervisor (if hardware supports
|
|
|
|
that).
|
|
|
|
Format: <bool>
|
|
|
|
|
2018-12-01 06:09:58 +08:00
|
|
|
psi= [KNL] Enable or disable pressure stall information
|
|
|
|
tracking.
|
|
|
|
Format: <bool>
|
|
|
|
|
2005-10-24 03:57:11 +08:00
|
|
|
psmouse.proto= [HW,MOUSE] Highest PS2 mouse protocol extension to
|
|
|
|
probe for; one of (bare|imps|exps|lifebook|any).
|
2005-04-17 06:20:36 +08:00
|
|
|
psmouse.rate= [HW,MOUSE] Set desired mouse report rate, in reports
|
|
|
|
per second.
|
2005-10-24 03:57:11 +08:00
|
|
|
psmouse.resetafter= [HW,MOUSE]
|
|
|
|
Try to reset the device after so many bad packets
|
2005-04-17 06:20:36 +08:00
|
|
|
(0 = never).
|
|
|
|
psmouse.resolution=
|
|
|
|
[HW,MOUSE] Set desired mouse resolution, in dpi.
|
|
|
|
psmouse.smartscroll=
|
2005-10-24 03:57:11 +08:00
|
|
|
[HW,MOUSE] Controls Logitech smartscroll autorepeat.
|
2005-04-17 06:20:36 +08:00
|
|
|
0 = disabled, 1 = enabled (default).
|
|
|
|
|
2011-07-22 04:57:55 +08:00
|
|
|
pstore.backend= Specify the name of the pstore backend to use
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
pt. [PARIDE]
|
2019-06-18 22:47:10 +08:00
|
|
|
See Documentation/admin-guide/blockdev/paride.rst.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2020-08-10 10:49:41 +08:00
|
|
|
pti= [X86-64] Control Page Table Isolation of user and
|
2018-01-06 01:44:36 +08:00
|
|
|
kernel address spaces. Disabling this feature
|
|
|
|
removes hardening, but improves performance of
|
|
|
|
system calls and interrupts.
|
|
|
|
|
|
|
|
on - unconditionally enable
|
|
|
|
off - unconditionally disable
|
|
|
|
auto - kernel detects whether your CPU model is
|
|
|
|
vulnerable to issues that PTI mitigates
|
|
|
|
|
|
|
|
Not specifying this option is equivalent to pti=auto.
|
|
|
|
|
2020-08-10 10:49:41 +08:00
|
|
|
nopti [X86-64]
|
2018-01-06 01:44:36 +08:00
|
|
|
Equivalent to pti=off
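The relationship between pti= and nopti described above can be sketched as a small command-line resolver. This is only an illustration: resolve_pti is a hypothetical helper, and both the "last option wins" rule and the silent skipping of unknown values are assumptions, not documented behavior.

```python
def resolve_pti(cmdline: str) -> str:
    """Return "on", "off", or "auto" for a kernel command line string."""
    mode = "auto"  # not specifying the option is equivalent to pti=auto
    for token in cmdline.split():
        if token == "nopti":
            mode = "off"  # nopti is equivalent to pti=off
        elif token.startswith("pti="):
            value = token[len("pti="):]
            if value in ("on", "off", "auto"):
                mode = value  # assumption: a later option overrides an earlier one
    return mode
```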
|
2017-12-12 21:39:52 +08:00
|
|
|
|
2007-08-15 18:25:38 +08:00
|
|
|
pty.legacy_count=
|
|
|
|
[KNL] Number of legacy pty's. Overrides compiled-in
|
|
|
|
default number.
|
|
|
|
|
2006-09-29 17:01:02 +08:00
|
|
|
quiet [KNL] Disable most log messages
|
2005-10-24 03:57:11 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
r128= [HW,DRM]
|
|
|
|
|
|
|
|
raid= [HW,RAID]
|
2016-11-03 18:10:10 +08:00
|
|
|
See Documentation/admin-guide/md.rst.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
ramdisk_size= [RAM] Sizes of RAM disks in kilobytes
|
2019-06-18 22:47:10 +08:00
|
|
|
See Documentation/admin-guide/blockdev/ramdisk.rst.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2020-09-18 09:56:40 +08:00
|
|
|
ramdisk_start= [RAM] RAM disk image start address
|
|
|
|
|
2018-08-28 05:51:54 +08:00
|
|
|
random.trust_cpu={on,off}
|
|
|
|
[KNL] Enable or disable trusting the use of the
|
|
|
|
CPU's random number generator (if available) to
|
|
|
|
fully seed the kernel's CRNG. Default is controlled
|
|
|
|
by CONFIG_RANDOM_TRUST_CPU.
|
|
|
|
|
stack: Optionally randomize kernel stack offset each syscall
This provides the ability for architectures to enable kernel stack base
address offset randomization. This feature is controlled by the boot
param "randomize_kstack_offset=on/off", with its default value set by
CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT.
This feature is based on the original idea from the last public release
of PaX's RANDKSTACK feature: https://pax.grsecurity.net/docs/randkstack.txt
All the credit for the original idea goes to the PaX team. Note that
the design and implementation of this upstream randomize_kstack_offset
feature differs greatly from the RANDKSTACK feature (see below).
Reasoning for the feature:
This feature aims to hinder the various stack-based attacks that
rely on deterministic stack structure. We have had many such attacks in
the past (to name just a few):
https://jon.oberheide.org/files/infiltrate12-thestackisback.pdf
https://jon.oberheide.org/files/stackjacking-infiltrate11.pdf
https://googleprojectzero.blogspot.com/2016/06/exploiting-recursion-in-linux-kernel_20.html
As Linux kernel stack protections have been constantly improving
(vmap-based stack allocation with guard pages, removal of thread_info,
STACKLEAK), attackers have had to find new ways for their exploits
to work. They have done so, continuing to rely on the kernel's stack
determinism, in situations where VMAP_STACK and THREAD_INFO_IN_TASK_STRUCT
were not relevant. For example, the following recent attacks would have
been hampered if the stack offset was non-deterministic between syscalls:
https://repositorio-aberto.up.pt/bitstream/10216/125357/2/374717.pdf
(page 70: targeting the pt_regs copy with linear stack overflow)
https://a13xp0p0v.github.io/2020/02/15/CVE-2019-18683.html
(leaked stack address from one syscall as a target during next syscall)
The main idea is that since the stack offset is randomized on each system
call, it is harder for an attack to reliably land in any particular place
on the thread stack, even with address exposures, as the stack base will
change on the next syscall. Also, since randomization is performed after
placing pt_regs, the ptrace-based approach[1] to discover the randomized
offset during a long-running syscall should not be possible.
Design description:
During most of the kernel's execution, it runs on the "thread stack",
which is pretty deterministic in its structure: it is fixed in size,
and on every entry from userspace to kernel on a syscall the thread
stack starts construction from an address fetched from the per-cpu
cpu_current_top_of_stack variable. The first element to be pushed to the
thread stack is the pt_regs struct that stores all required CPU registers
and syscall parameters. Finally the specific syscall function is called,
with the stack being used as the kernel executes the resulting request.
The goal of the randomize_kstack_offset feature is to add a random offset
after the pt_regs has been pushed to the stack and before the rest of the
thread stack is used during the syscall processing, and to change it every
time a process issues a syscall. The source of randomness is currently
architecture-defined (but x86 is using the low byte of rdtsc()). Future
improvements using different entropy sources are possible, but out of scope
for this patch. Furthermore, to add more unpredictability, new offsets
are chosen at the end of syscalls (the timing of which should be less
easy to measure from userspace than at syscall entry time), and stored
in a per-CPU variable, so that the life of the value does not stay
explicitly tied to a single task.
As suggested by Andy Lutomirski, the offset is added using alloca()
and an empty asm() statement with an output constraint, since it avoids
changes to assembly syscall entry code, to the unwinder, and provides
correct stack alignment as defined by the compiler.
In order to make this available by default with zero performance impact
for those that don't want it, it is boot-time selectable with static
branches. This way, if the overhead is not wanted, it can just be
left turned off with no performance impact.
The generated assembly for x86_64 with GCC looks like this:
...
ffffffff81003977: 65 8b 05 02 ea 00 7f mov %gs:0x7f00ea02(%rip),%eax
# 12380 <kstack_offset>
ffffffff8100397e: 25 ff 03 00 00 and $0x3ff,%eax
ffffffff81003983: 48 83 c0 0f add $0xf,%rax
ffffffff81003987: 25 f8 07 00 00 and $0x7f8,%eax
ffffffff8100398c: 48 29 c4 sub %rax,%rsp
ffffffff8100398f: 48 8d 44 24 0f lea 0xf(%rsp),%rax
ffffffff81003994: 48 83 e0 f0 and $0xfffffffffffffff0,%rax
...
As a result of the above stack alignment, this patch introduces about
5 bits of randomness after pt_regs is spilled to the thread stack on
x86_64, and 6 bits on x86_32 (since it has 1 fewer bit required for
stack alignment). The amount of entropy could be adjusted based on how
much of the stack space we wish to trade for security.
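The "about 5 bits" figure can be checked with a small sketch that replays the and/add/and sequence from the x86_64 listing over every possible input. It assumes the stored offset is a single byte (the low byte of rdtsc(), as stated in the design description); the function name stack_adjustment is hypothetical.

```python
import math

def stack_adjustment(kstack_offset: int) -> int:
    """Mirror the masking steps applied before 'sub %rax,%rsp'."""
    masked = kstack_offset & 0x3FF   # and $0x3ff,%eax
    return (masked + 0xF) & 0x7F8    # add $0xf,%rax ; and $0x7f8,%eax

# Assumption: the random input is one byte, so it ranges over 0..255.
adjustments = sorted({stack_adjustment(x) for x in range(256)})
distinct = len(adjustments)          # number of distinct stack adjustments
bits = math.log2(distinct)           # equivalent entropy in bits
```

Under that one-byte assumption the sequence yields 33 distinct adjustments, all multiples of 8, which is just over 5 bits.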
My measure of syscall performance overhead (on x86_64):
lmbench: /usr/lib/lmbench/bin/x86_64-linux-gnu/lat_syscall -N 10000 null
randomize_kstack_offset=y Simple syscall: 0.7082 microseconds
randomize_kstack_offset=n Simple syscall: 0.7016 microseconds
So, roughly 0.9% overhead growth for a no-op syscall, which is very
manageable. And for people that don't want this, it's off by default.
There are two gotchas with using the alloca() trick. First,
compilers that have Stack Clash protection (-fstack-clash-protection)
enabled by default (e.g. Ubuntu[3]) add pagesize stack probes to
any dynamic stack allocations. While the randomization offset is
always less than a page, the resulting assembly would still contain
(unreachable!) probing routines, bloating the resulting assembly. To
avoid this, -fno-stack-clash-protection is unconditionally added to
the kernel Makefile since this is the only dynamic stack allocation in
the kernel (now that VLAs have been removed) and it is provably safe
from Stack Clash style attacks.
The second gotcha with alloca() is a negative interaction with
-fstack-protector*, in that it sees the alloca() as an array allocation,
which triggers the unconditional addition of the stack canary function
pre/post-amble which slows down syscalls regardless of the static
branch. In order to avoid adding this unneeded check and its associated
performance impact, architectures need to carefully remove uses of
-fstack-protector-strong (or -fstack-protector) in the compilation units
that use the add_random_kstack() macro and to audit the resulting stack
mitigation coverage (to make sure no desired coverage disappears). No
change is visible for this on x86 because the stack protector is already
unconditionally disabled for the compilation unit, but the change is
required on arm64. There is, unfortunately, no attribute that can be
used to disable stack protector for specific functions.
Comparison to PaX RANDKSTACK feature:
The RANDKSTACK feature randomizes the location of the stack start
(cpu_current_top_of_stack), i.e. including the location of pt_regs
structure itself on the stack. Initially this patch followed the same
approach, but during the recent discussions[2], it has been determined
to be of little value since, if ptrace functionality is available for
an attacker, they can use PTRACE_PEEKUSR/PTRACE_POKEUSR to read/write
different offsets in the pt_regs struct, observe the cache behavior of
the pt_regs accesses, and figure out the random stack offset. Another
difference is that the random offset is stored in a per-cpu variable,
rather than having it be per-thread. As a result, these implementations
differ a fair bit in their implementation details and results, though
obviously the intent is similar.
[1] https://lore.kernel.org/kernel-hardening/2236FBA76BA1254E88B949DDB74E612BA4BC57C1@IRSMSX102.ger.corp.intel.com/
[2] https://lore.kernel.org/kernel-hardening/20190329081358.30497-1-elena.reshetova@intel.com/
[3] https://lists.ubuntu.com/archives/ubuntu-devel/2019-June/040741.html
Co-developed-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20210401232347.2791257-4-keescook@chromium.org
2021-04-02 07:23:44 +08:00
|
|
|
randomize_kstack_offset=
|
|
|
|
[KNL] Enable or disable kernel stack offset
|
|
|
|
randomization, which provides roughly 5 bits of
|
|
|
|
entropy, frustrating memory corruption attacks
|
|
|
|
that depend on stack address determinism or
|
|
|
|
cross-syscall address exposures. This is only
|
|
|
|
available on architectures that have defined
|
|
|
|
CONFIG_HAVE_ARCH_RANDOMIZE_KSTACK_OFFSET.
|
|
|
|
Format: <bool> (1/Y/y=enable, 0/N/n=disable)
|
|
|
|
Default is CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT.
|
|
|
|
|
2017-03-27 17:33:02 +08:00
|
|
|
ras=option[,option,...] [KNL] RAS-specific options
|
|
|
|
|
|
|
|
cec_disable [X86]
|
|
|
|
Disable the Correctable Errors Collector,
|
|
|
|
see CONFIG_RAS_CEC help text.
|
|
|
|
|
2013-10-09 11:23:47 +08:00
|
|
|
rcu_nocbs= [KNL]
|
2021-02-21 16:08:27 +08:00
|
|
|
The argument is a cpu list, as described above.
|
2016-10-12 04:51:35 +08:00
|
|
|
|
2012-08-20 12:35:53 +08:00
|
|
|
In kernels built with CONFIG_RCU_NOCB_CPU=y, set
|
|
|
|
the specified list of CPUs to be no-callback CPUs.
|
2018-07-02 23:25:57 +08:00
|
|
|
Invocation of these CPUs' RCU callbacks will be
|
|
|
|
offloaded to "rcuox/N" kthreads created for that
|
|
|
|
purpose, where "x" is "p" for RCU-preempt, and
|
|
|
|
"s" for RCU-sched, and "N" is the CPU number.
|
|
|
|
This reduces OS jitter on the offloaded CPUs,
|
|
|
|
which can be useful for HPC and real-time
|
|
|
|
workloads. It can also improve energy efficiency
|
|
|
|
for asymmetric multiprocessors.
|
2012-08-20 12:35:53 +08:00
|
|
|
|
2013-10-09 11:23:47 +08:00
|
|
|
rcu_nocb_poll [KNL]
|
2012-08-20 12:35:53 +08:00
|
|
|
Rather than requiring that offloaded CPUs
|
|
|
|
(specified by rcu_nocbs= above) explicitly
|
|
|
|
awaken the corresponding "rcuoN" kthreads,
|
|
|
|
make these kthreads poll for callbacks.
|
|
|
|
This improves the real-time response for the
|
|
|
|
offloaded CPUs by relieving them of the need to
|
|
|
|
wake up the corresponding kthread, but degrades
|
|
|
|
energy efficiency by requiring that the kthreads
|
|
|
|
periodically wake up to do the polling.
|
|
|
|
|
2013-10-09 11:23:47 +08:00
|
|
|
rcutree.blimit= [KNL]
|
2013-10-28 00:44:03 +08:00
|
|
|
Set maximum number of finished RCU callbacks to
|
|
|
|
process in one batch.
|
2006-03-08 13:55:33 +08:00
|
|
|
|
2015-04-21 02:40:50 +08:00
|
|
|
rcutree.dump_tree= [KNL]
|
|
|
|
Dump the structure of the rcu_node combining tree
|
|
|
|
out at early boot. This is used for diagnostic
|
|
|
|
purposes, to verify correct tree setup.
|
|
|
|
|
2015-03-11 09:33:20 +08:00
|
|
|
rcutree.gp_cleanup_delay= [KNL]
|
|
|
|
Set the number of jiffies to delay each step of
|
rcu: Remove *_SLOW_* Kconfig options
The RCU_TORTURE_TEST_SLOW_PREINIT, RCU_TORTURE_TEST_SLOW_PREINIT_DELAY,
RCU_TORTURE_TEST_SLOW_INIT,
RCU_TORTURE_TEST_SLOW_INIT_DELAY, RCU_TORTURE_TEST_SLOW_CLEANUP,
and RCU_TORTURE_TEST_SLOW_CLEANUP_DELAY Kconfig options are only
useful for torture testing, and there are the rcutree.gp_cleanup_delay,
rcutree.gp_init_delay, and rcutree.gp_preinit_delay kernel boot parameters
that rcutorture can use instead. The effect of these parameters is to
artificially slow down grace period initialization and cleanup in order
to make some types of race conditions happen more often.
This commit therefore simplifies Tree RCU a bit by removing the Kconfig
options and adding the corresponding kernel parameters to rcutorture's
.boot files instead. However, this commit also leaves out the kernel
parameters for TREE02, TREE04, and TREE07 in order to have about the
same number of tests slowed as not slowed. TREE01, TREE03, TREE05,
and TREE06 are slowed, and the rest are not slowed.
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2017-05-11 05:36:55 +08:00
|
|
|
RCU grace-period cleanup.
|
2015-03-11 09:33:20 +08:00
|
|
|
|
2015-01-23 10:24:08 +08:00
|
|
|
rcutree.gp_init_delay= [KNL]
|
|
|
|
Set the number of jiffies to delay each step of
|
2017-05-11 05:36:55 +08:00
|
|
|
RCU grace-period initialization.
|
2015-03-11 09:33:20 +08:00
|
|
|
|
|
|
|
rcutree.gp_preinit_delay= [KNL]
|
|
|
|
Set the number of jiffies to delay each step of
|
|
|
|
RCU grace-period pre-initialization, that is,
|
|
|
|
the propagation of recent CPU-hotplug changes up
|
2017-05-11 05:36:55 +08:00
|
|
|
the rcu_node combining tree.
|
2015-01-23 10:24:08 +08:00
|
|
|
|
2019-03-21 05:13:33 +08:00
|
|
|
rcutree.use_softirq= [KNL]
|
|
|
|
If set to zero, move all RCU_SOFTIRQ processing to
|
|
|
|
per-CPU rcuc kthreads. Defaults to a non-zero
|
|
|
|
value, meaning that RCU_SOFTIRQ is used by default.
|
|
|
|
Specify rcutree.use_softirq=0 to use rcuc kthreads.
|
|
|
|
|
2020-12-15 22:16:46 +08:00
|
|
|
But note that CONFIG_PREEMPT_RT=y kernels disable
|
|
|
|
this kernel boot parameter, forcibly setting it
|
|
|
|
to zero.
|
|
|
|
|
2015-04-21 01:27:15 +08:00
|
|
|
rcutree.rcu_fanout_exact= [KNL]
|
|
|
|
Disable autobalancing of the rcu_node combining
|
|
|
|
tree. This is used by rcutorture, and might
|
|
|
|
possibly be useful for architectures having high
|
|
|
|
cache-to-cache transfer latencies.
|
2015-01-23 10:24:08 +08:00
|
|
|
|
2013-10-09 11:23:47 +08:00
|
|
|
rcutree.rcu_fanout_leaf= [KNL]
|
2015-07-31 23:28:35 +08:00
|
|
|
Change the number of CPUs assigned to each
|
|
|
|
leaf rcu_node structure. Useful for very
|
|
|
|
large systems, which will choose the value 64,
|
|
|
|
and for NUMA systems with large remote-access
|
|
|
|
latencies, which will choose a value aligned
|
|
|
|
with the appropriate hardware boundaries.
|
2012-04-24 06:52:53 +08:00
|
|
|
|
2020-05-26 05:47:52 +08:00
|
|
|
rcutree.rcu_min_cached_objs= [KNL]
|
|
|
|
Minimum number of objects which are cached and
|
|
|
|
maintained per one CPU. Object size is equal
|
|
|
|
to PAGE_SIZE. The cache reduces the pressure
|
|
|
|
on the page allocator and helps the whole
|
|
|
|
algorithm behave better under low-memory
|
|
|
|
conditions.
|
|
|
|
|
2021-04-16 01:19:56 +08:00
|
|
|
rcutree.rcu_delay_page_cache_fill_msec= [KNL]
|
|
|
|
Set the page-cache refill delay (in milliseconds)
|
|
|
|
in response to low-memory conditions. Permitted
|
|
|
|
values are in the range 0:100000.
|
|
|
|
|
2013-10-09 11:23:47 +08:00
|
|
|
rcutree.jiffies_till_first_fqs= [KNL]
|
2012-12-29 03:30:36 +08:00
|
|
|
Set delay from grace-period initialization to
|
|
|
|
first attempt to force quiescent states.
|
|
|
|
Units are jiffies, minimum value is zero,
|
|
|
|
and maximum value is HZ.
|
|
|
|
|
2013-10-09 11:23:47 +08:00
|
|
|
rcutree.jiffies_till_next_fqs= [KNL]
|
2012-12-29 03:30:36 +08:00
|
|
|
Set delay between subsequent attempts to force
|
|
|
|
quiescent states. Units are jiffies, minimum
|
|
|
|
value is one, and maximum value is HZ.
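The bounds stated for rcutree.jiffies_till_first_fqs and rcutree.jiffies_till_next_fqs can be sketched as follows. The helper names are hypothetical, HZ is passed in because it depends on the kernel configuration, and clamping out-of-range values (rather than rejecting them) is an assumption of this sketch.

```python
def clamp_first_fqs(jiffies: int, hz: int) -> int:
    """Bound jiffies_till_first_fqs: minimum zero, maximum HZ."""
    return max(0, min(jiffies, hz))

def clamp_next_fqs(jiffies: int, hz: int) -> int:
    """Bound jiffies_till_next_fqs: minimum one, maximum HZ."""
    return max(1, min(jiffies, hz))
```

With an example HZ of 250 (an assumed value), a request of 1000 jiffies would be bounded to 250 by either helper, while a request of 0 stays 0 for the first-FQS delay but becomes 1 for the next-FQS delay.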
|
|
|
|
|
2018-11-21 02:22:00 +08:00
|
|
|
rcutree.jiffies_till_sched_qs= [KNL]
|
|
|
|
Set required age in jiffies for a
|
|
|
|
given grace period before RCU starts
|
|
|
|
soliciting quiescent-state help from
|
|
|
|
rcu_note_context_switch() and cond_resched().
|
|
|
|
If not specified, the kernel will calculate
|
|
|
|
a value based on the most recent settings
|
|
|
|
of rcutree.jiffies_till_first_fqs
|
|
|
|
and rcutree.jiffies_till_next_fqs.
|
|
|
|
This calculated value may be viewed in
|
|
|
|
rcutree.jiffies_to_sched_qs. Any attempt to set
|
|
|
|
rcutree.jiffies_to_sched_qs will be cheerfully
|
|
|
|
overwritten.
|
|
|
|
|
2014-09-13 10:21:09 +08:00
|
|
|
rcutree.kthread_prio= [KNL,BOOT]
|
2015-01-21 15:54:59 +08:00
|
|
|
Set the SCHED_FIFO priority of the RCU per-CPU
|
|
|
|
kthreads (rcuc/N). This value is also used for
|
|
|
|
the priority of the RCU boost threads (rcub/N)
|
|
|
|
and for the RCU grace-period kthreads (rcu_bh,
|
|
|
|
rcu_preempt, and rcu_sched). If RCU_BOOST is
|
|
|
|
set, valid values are 1-99 and the default is 1
|
|
|
|
(the least-favored priority). Otherwise, when
|
|
|
|
RCU_BOOST is not set, valid values are 0-99 and
|
|
|
|
the default is zero (non-realtime operation).
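The valid ranges and defaults described above for rcutree.kthread_prio can be summarized in a short sketch. effective_kthread_prio is a hypothetical helper, and raising an error for out-of-range requests is an assumption; the kernel may adjust such values instead.

```python
from typing import Optional

def effective_kthread_prio(requested: Optional[int], rcu_boost: bool) -> int:
    """Return the SCHED_FIFO priority implied by the documented rules.

    With RCU_BOOST set, valid values are 1-99 and the default is 1.
    Without RCU_BOOST, valid values are 0-99 and the default is 0.
    """
    lowest = 1 if rcu_boost else 0
    if requested is None:
        return lowest  # the default equals the lowest valid value
    if not lowest <= requested <= 99:
        raise ValueError(f"priority {requested} outside {lowest}-99")
    return requested
```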
|
2014-09-13 10:21:09 +08:00
|
|
|
|
2019-04-02 23:05:55 +08:00
|
|
|
rcutree.rcu_nocb_gp_stride= [KNL]
|
|
|
|
Set the number of NOCB callback kthreads in
|
|
|
|
each group, which defaults to the square root
|
|
|
|
of the number of CPUs. Larger numbers reduce
|
|
|
|
the wakeup overhead on the global grace-period
|
|
|
|
kthread, but increase that same overhead on
|
|
|
|
each group's NOCB grace-period kthread.
|
rcu: Parallelize and economize NOCB kthread wakeups
An 80-CPU system with a context-switch-heavy workload can require so
many NOCB kthread wakeups that the RCU grace-period kthreads spend several
tens of percent of a CPU just awakening things. This clearly will not
scale well: If you add enough CPUs, the RCU grace-period kthreads would
get behind, increasing grace-period latency.
To avoid this problem, this commit divides the NOCB kthreads into leaders
and followers, where the grace-period kthreads awaken the leaders each of
whom in turn awakens its followers. By default, the number of groups of
kthreads is the square root of the number of CPUs, but this default may
be overridden using the rcutree.rcu_nocb_leader_stride boot parameter.
This reduces the number of wakeups done per grace period by the RCU
grace-period kthread by the square root of the number of CPUs, but of
course by shifting those wakeups to the leaders. In addition, because
the leaders do grace periods on behalf of their respective followers,
the number of wakeups of the followers decreases by up to a factor of two.
Instead of being awakened once when new callbacks arrive and again
at the end of the grace period, the followers are awakened only at
the end of the grace period.
For a numerical example, in a 4096-CPU system, the grace-period kthread
would awaken 64 leaders, each of which would awaken its 63 followers
at the end of the grace period. This compares favorably with the 79
wakeups for the grace-period kthread on an 80-CPU system.
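The numerical example above can be reproduced with a short sketch. It assumes the default grouping is the integer square root of the CPU count and that CPUs are spread evenly across groups; nocb_wakeup_plan is a hypothetical name, and the rounding for non-square CPU counts is a guess rather than the kernel's actual rule.

```python
import math

def nocb_wakeup_plan(ncpus: int):
    """Return (leaders, followers_per_leader) for the default grouping."""
    if ncpus < 1:
        raise ValueError("need at least one CPU")
    groups = math.isqrt(ncpus)               # default: square root of CPU count
    per_group = math.ceil(ncpus / groups)    # kthreads in each group, leader included
    return groups, per_group - 1             # each leader awakens the others
```

For 4096 CPUs this gives 64 leaders with 63 followers each, matching the figures in the example.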
Reported-by: Rik van Riel <riel@redhat.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
2014-06-25 00:26:11 +08:00
|
|
|
|
2013-10-09 11:23:47 +08:00
|
|
|
rcutree.qhimark= [KNL]
|
2013-10-28 00:44:03 +08:00
|
|
|
Set threshold of queued RCU callbacks beyond which
|
|
|
|
batch limiting is disabled.
|
2006-03-08 13:55:33 +08:00
|
|
|
|
2013-10-09 11:23:47 +08:00
|
|
|
rcutree.qlowmark= [KNL]
|
2008-02-03 21:20:26 +08:00
|
|
|
Set threshold of queued RCU callbacks below which
|
|
|
|
batch limiting is re-enabled.
|
2006-03-08 13:55:33 +08:00
|
|
|
|
2019-10-31 02:56:10 +08:00
|
|
|
rcutree.qovld= [KNL]
|
|
|
|
Set threshold of queued RCU callbacks beyond which
|
|
|
|
RCU's force-quiescent-state scan will aggressively
|
|
|
|
enlist help from cond_resched() and sched IPIs to
|
|
|
|
help CPUs more quickly reach quiescent states.
|
|
|
|
Set to less than zero to make this be set based
|
|
|
|
on rcutree.qhimark at boot time and to zero to
|
|
|
|
disable more aggressive help enlistment.
|
|
|
|
|
2013-10-09 11:23:47 +08:00
|
|
|
rcutree.rcu_idle_gp_delay= [KNL]
|
2012-12-29 03:30:36 +08:00
|
|
|
Set wakeup interval for idle CPUs that have
|
|
|
|
RCU callbacks (RCU_FAST_NO_HZ=y).
|
rcu: Control grace-period duration from sysfs
Although almost everyone is well-served by the defaults, some uses of RCU
benefit from shorter grace periods, while others benefit more from the
greater efficiency provided by longer grace periods. Situations requiring
a large number of grace periods to elapse (and wireshark startup has
been called out as an example of this) are helped by lower-latency
grace periods. Furthermore, in some embedded applications, people are
willing to accept a small degradation in update efficiency (due to there
being more of the shorter grace-period operations) in order to gain the
lower latency.
In contrast, those few systems with thousands of CPUs need longer grace
periods because the CPU overhead of a grace period rises roughly
linearly with the number of CPUs. Such systems normally do not make
much use of facilities that require large numbers of grace periods to
elapse, so this is a good tradeoff.
Therefore, this commit allows the durations to be controlled from sysfs.
There are two sysfs parameters, one named "jiffies_till_first_fqs" that
specifies the delay in jiffies from the end of grace-period initialization
until the first attempt to force quiescent states, and the other named
"jiffies_till_next_fqs" that specifies the delay (again in jiffies)
between subsequent attempts to force quiescent states. They both default
to three jiffies, which is compatible with the old hard-coded behavior.
At some future time, it may be possible to automatically increase the
grace-period length with the number of CPUs, but we do not yet have
sufficient data to do a good job. Preliminary data indicates that we
should add an additional jiffy to each of the delays for every 200 CPUs
in the system, but more experimentation is needed. For now, the number
of systems with more than 1,000 CPUs is small enough that this can be
relegated to boot-time hand tuning.
Signed-off-by: Paul E. McKenney <paul.mckenney@linaro.org>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Reviewed-by: Josh Triplett <josh@joshtriplett.org>
2012-06-27 11:45:57 +08:00
|
|
|
|
2017-01-07 07:14:11 +08:00
|
|
|
rcutree.rcu_kick_kthreads= [KNL]
|
|
|
|
Cause the grace-period kthread to get an extra
|
|
|
|
wake_up() if it sleeps three times longer than
|
|
|
|
it should at force-quiescent-state time.
|
|
|
|
This wake_up() will be accompanied by a
|
|
|
|
WARN_ONCE() splat and an ftrace_dump().
|
|
|
|
|
2020-08-08 04:44:10 +08:00
|
|
|
rcutree.rcu_unlock_delay= [KNL]
|
|
|
|
In CONFIG_RCU_STRICT_GRACE_PERIOD=y kernels,
|
|
|
|
this specifies an rcu_read_unlock()-time delay
|
|
|
|
in microseconds. This defaults to zero.
|
|
|
|
Larger delays increase the probability of
|
|
|
|
catching RCU pointer leaks, that is, buggy use
|
|
|
|
of RCU-protected pointers after the relevant
|
|
|
|
rcu_read_unlock() has completed.
|
|
|
|
|
2018-12-13 04:32:06 +08:00
|
|
|
rcutree.sysrq_rcu= [KNL]
|
|
|
|
Commandeer a sysrq key to dump out Tree RCU's
|
|
|
|
rcu_node tree with an eye towards determining
|
|
|
|
why a new grace period has not yet started.
|
|
|
|
|
2020-08-12 12:18:12 +08:00
|
|
|
rcuscale.gp_async= [KNL]
|
2017-04-18 03:47:10 +08:00
|
|
|
Measure performance of asynchronous
|
|
|
|
grace-period primitives such as call_rcu().
|
|
|
|
|
2020-08-12 12:18:12 +08:00
|
|
|
rcuscale.gp_async_max= [KNL]
|
2017-04-18 03:47:10 +08:00
|
|
|
Specify the maximum number of outstanding
|
|
|
|
callbacks per writer thread. When a writer
|
|
|
|
thread exceeds this limit, it invokes the
|
|
|
|
corresponding flavor of rcu_barrier() to allow
|
|
|
|
previously posted callbacks to drain.
|
|
|
|
|
2020-08-12 12:18:12 +08:00
|
|
|
rcuscale.gp_exp= [KNL]
|
2016-01-02 05:47:19 +08:00
|
|
|
Measure performance of expedited synchronous
|
|
|
|
grace-period primitives.
|
|
|
|
|
2020-08-12 12:18:12 +08:00
|
|
|
rcuscale.holdoff= [KNL]
|
2016-01-31 12:56:38 +08:00
|
|
|
Set test-start holdoff period. The purpose of
|
|
|
|
this parameter is to delay the start of the
|
|
|
|
test until boot completes in order to avoid
|
|
|
|
interference.
|
|
|
|
|
2020-08-12 12:18:12 +08:00
|
|
|
rcuscale.kfree_rcu_test= [KNL]
|
2019-08-31 00:36:29 +08:00
|
|
|
Set to measure performance of kfree_rcu() flooding.
|
|
|
|
|
2021-02-18 02:51:10 +08:00
|
|
|
rcuscale.kfree_rcu_test_double= [KNL]
|
|
|
|
Test the double-argument variant of kfree_rcu().
|
|
|
|
If this parameter has the same value as
|
|
|
|
rcuscale.kfree_rcu_test_single, both the single-
|
|
|
|
and double-argument variants are tested.
|
|
|
|
|
|
|
|
rcuscale.kfree_rcu_test_single= [KNL]
|
|
|
|
Test the single-argument variant of kfree_rcu().
|
|
|
|
If this parameter has the same value as
|
|
|
|
rcuscale.kfree_rcu_test_double, both the single-
|
|
|
|
and double-argument variants are tested.
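The interaction between rcuscale.kfree_rcu_test_single and rcuscale.kfree_rcu_test_double can be sketched as a small selection function. kfree_rcu_variants is a hypothetical name; the sketch only encodes the rule stated above, that equal values select both variants.

```python
def kfree_rcu_variants(test_single: bool, test_double: bool):
    """Return which kfree_rcu() variants would be exercised."""
    if test_single == test_double:
        # Same value for both parameters: test both variants.
        return ("single", "double")
    return ("single",) if test_single else ("double",)
```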
|
|
|
|
|
2020-08-12 12:18:12 +08:00
|
|
|
rcuscale.kfree_nthreads= [KNL]
|
2019-08-31 00:36:29 +08:00
|
|
|
The number of threads running loops of kfree_rcu().
|
|
|
|
|
2020-08-12 12:18:12 +08:00
|
|
|
rcuscale.kfree_alloc_num= [KNL]
|
2019-08-31 00:36:29 +08:00
|
|
|
Number of allocations and frees done in an iteration.
|
|
|
|
|
2020-08-12 12:18:12 +08:00
|
|
|
rcuscale.kfree_loops= [KNL]
|
|
|
|
Number of loops doing rcuscale.kfree_alloc_num number
|
2019-08-31 00:36:29 +08:00
|
|
|
of allocations and frees.
|
|
|
|
|
2020-08-12 12:18:12 +08:00
|
|
|
rcuscale.nreaders= [KNL]
|
2016-01-02 05:47:19 +08:00
|
|
|
Set number of RCU readers. The value -1 selects
|
|
|
|
N, where N is the number of CPUs. A value
|
|
|
|
"n" less than -1 selects N+n+1, where N is again
|
|
|
|
the number of CPUs. For example, -2 selects N-1
|
|
|
|
(all but one CPU), -3 selects N-2, and so on.
|
|
|
|
A value of "n" less than or equal to -N selects
|
|
|
|
a single reader.
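A minimal sketch of how a negative reader count might be resolved. It assumes negative values resolve to N + n + 1 with a floor of one, which is the reading consistent with both -1 selecting N and values at or below -N selecting a single reader; treating non-negative values literally is also an assumption. resolve_thread_count is a hypothetical name.

```python
def resolve_thread_count(n: int, ncpus: int) -> int:
    """Turn an nreaders/nwriters value into an actual thread count."""
    if n >= 0:
        return n                   # assumption: non-negative values are used as given
    return max(1, ncpus + n + 1)   # -1 -> N, and never fewer than one thread
```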
|
|
|
|
|
2020-08-12 12:18:12 +08:00
|
|
|
rcuscale.nwriters= [KNL]
|
2016-01-02 05:47:19 +08:00
|
|
|
Set number of RCU writers. The values operate
|
2020-08-12 12:18:12 +08:00
|
|
|
the same as for rcuscale.nreaders.
|
2016-01-02 05:47:19 +08:00
|
|
|
|
|
|
|
|
2020-08-12 12:18:12 +08:00
|
|
|
rcuscale.perf_type= [KNL]
|
2017-04-26 06:12:56 +08:00
|
|
|
Specify the RCU implementation to test.
|
|
|
|
|
2020-08-12 12:18:12 +08:00
|
|
|
rcuscale.shutdown= [KNL]
|
2016-01-02 05:47:19 +08:00
|
|
|
Shut the system down after performance tests
|
|
|
|
complete. This is useful for hands-off automated
|
|
|
|
testing.
|
|
|
|
|
2020-08-12 12:18:12 +08:00
|
|
|
rcuscale.verbose= [KNL]
|
2016-01-02 05:47:19 +08:00
|
|
|
Enable additional printk() statements.
|
|
|
|
|
2020-08-12 12:18:12 +08:00
|
|
|
rcuscale.writer_holdoff= [KNL]
|
2017-04-26 06:12:56 +08:00
|
|
|
Write-side holdoff between grace periods,
|
|
|
|
in microseconds. The default of zero says
|
|
|
|
no holdoff.
|
|
|
|
|
2013-10-09 11:23:47 +08:00
|
|
|
rcutorture.fqs_duration= [KNL]
|
2015-05-15 08:29:51 +08:00
|
|
|
Set duration of force_quiescent_state bursts
|
|
|
|
in microseconds.
|
2012-04-24 01:54:45 +08:00
|
|
|
|
2013-10-09 11:23:47 +08:00
|
|
|
rcutorture.fqs_holdoff= [KNL]
|
2015-05-15 08:29:51 +08:00
|
|
|
Set holdoff time within force_quiescent_state bursts
|
|
|
|
in microseconds.
|
2012-04-24 01:54:45 +08:00
|
|
|
|
2013-10-09 11:23:47 +08:00
|
|
|
rcutorture.fqs_stutter= [KNL]
|
2015-05-15 08:29:51 +08:00
|
|
|
Set wait time between force_quiescent_state bursts
|
|
|
|
in seconds.
|
|
|
|
|
2018-10-01 23:38:54 +08:00
|
|
|
rcutorture.fwd_progress= [KNL]
|
|
|
|
Enable RCU grace-period forward-progress testing
|
|
|
|
for the types of RCU supporting this notion.
|
|
|
|
|
|
|
|
rcutorture.fwd_progress_div= [KNL]
|
|
|
|
Specify the fraction of a CPU-stall-warning
|
|
|
|
period to do tight-loop forward-progress testing.
|
|
|
|
|
|
|
|
rcutorture.fwd_progress_holdoff= [KNL]
|
|
|
|
Number of seconds to wait between successive
|
|
|
|
forward-progress tests.
|
|
|
|
|
|
|
|
rcutorture.fwd_progress_need_resched= [KNL]
|
|
|
|
Enclose cond_resched() calls within checks for
|
|
|
|
need_resched() during tight-loop forward-progress
|
|
|
|
testing.
|
|
|
|
|
2015-05-15 08:29:51 +08:00
|
|
|
rcutorture.gp_cond= [KNL]
|
|
|
|
Use conditional/asynchronous update-side
|
|
|
|
primitives, if available.
|
2012-04-24 01:54:45 +08:00
|
|
|
|
2013-10-09 11:23:47 +08:00
|
|
|
rcutorture.gp_exp= [KNL]
|
2015-05-15 08:29:51 +08:00
|
|
|
Use expedited update-side primitives, if available.
|
2013-10-09 11:23:47 +08:00
|
|
|
|
|
|
|
rcutorture.gp_normal= [KNL]
|
2015-05-15 08:29:51 +08:00
|
|
|
Use normal (non-expedited) asynchronous
|
|
|
|
update-side primitives, if available.
|
|
|
|
|
|
|
|
rcutorture.gp_sync= [KNL]
|
|
|
|
Use normal (non-expedited) synchronous
|
|
|
|
update-side primitives, if available. If all
|
|
|
|
of rcutorture.gp_cond=, rcutorture.gp_exp=,
|
|
|
|
rcutorture.gp_normal=, and rcutorture.gp_sync=
|
|
|
|
are zero, rcutorture acts as if
|
|
|
|
they are all non-zero.
|
2012-04-24 01:54:45 +08:00
|
|
|
|
2020-08-12 01:33:39 +08:00
|
|
|
rcutorture.irqreader= [KNL]
|
|
|
|
Run RCU readers from irq handlers, or, more
|
|
|
|
accurately, from a timer handler. Not all RCU
|
|
|
|
flavors take kindly to this sort of thing.
|
|
|
|
|
|
|
|
rcutorture.leakpointer= [KNL]
|
|
|
|
Leak an RCU-protected pointer out of the reader.
|
|
|
|
This can of course result in splats, and is
|
|
|
|
intended to test the ability of things like
|
|
|
|
CONFIG_RCU_STRICT_GRACE_PERIOD=y to detect
|
|
|
|
such leaks.
|
|
|
|
|
2013-10-09 11:23:47 +08:00
|
|
|
rcutorture.n_barrier_cbs= [KNL]
|
2012-04-24 01:54:45 +08:00
|
|
|
Set callbacks/threads for rcu_barrier() testing.
|
|
|
|
|
2013-10-09 11:23:47 +08:00
|
|
|
rcutorture.nfakewriters= [KNL]
|
2012-04-24 01:54:45 +08:00
|
|
|
Set number of concurrent RCU writers. These just
|
|
|
|
stress RCU, they don't participate in the actual
|
|
|
|
test, hence the "fake".
|
|
|
|
|
2020-09-24 08:39:46 +08:00
|
|
|
rcutorture.nocbs_nthreads= [KNL]
|
|
|
|
Set number of RCU callback-offload togglers.
|
|
|
|
Zero (the default) disables toggling.
|
|
|
|
|
|
|
|
rcutorture.nocbs_toggle= [KNL]
|
|
|
|
Set the delay in milliseconds between successive
|
|
|
|
callback-offload toggling attempts.
|
|
|
|
|
2013-10-09 11:23:47 +08:00
|
|
|
rcutorture.nreaders= [KNL]
|
2015-03-13 04:55:48 +08:00
|
|
|
Set number of RCU readers. The value -1 selects
|
|
|
|
N-1, where N is the number of CPUs. A value
|
|
|
|
"n" less than -1 selects N-n-2, where N is again
|
|
|
|
the number of CPUs. For example, -2 selects N
|
|
|
|
(the number of CPUs), -3 selects N+1, and so on.
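The mapping from negative settings to reader counts can be sketched as below. This is an illustrative Python helper, not kernel code: the name effective_nreaders is hypothetical, and treating non-negative values as literal reader counts is an assumption that the text above does not state.

```python
def effective_nreaders(nreaders: int, ncpus: int) -> int:
    """Return the reader count implied by the documented rule.

    Assumption: non-negative values are used as given.
    Negative values follow N-n-2, which also yields N-1 for n = -1.
    """
    if nreaders >= 0:
        return nreaders
    return ncpus - nreaders - 2


if __name__ == "__main__":
    # Show the documented examples for a hypothetical 8-CPU system.
    for n in (-1, -2, -3):
        print(n, effective_nreaders(n, ncpus=8))
```

The single formula covers every negative value, so -1 needs no special case.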
|
2012-04-24 01:54:45 +08:00
|
|
|
|
2013-10-09 11:23:47 +08:00
|
|
|
rcutorture.object_debug= [KNL]
|
|
|
|
Enable debug-object double-call_rcu() testing.
|
|
|
|
|
|
|
|
rcutorture.onoff_holdoff= [KNL]
|
2012-04-24 01:54:45 +08:00
|
|
|
Set time (s) after boot for CPU-hotplug testing.
|
|
|
|
|
2013-10-09 11:23:47 +08:00
|
|
|
rcutorture.onoff_interval= [KNL]
|
2018-05-09 00:20:34 +08:00
|
|
|
Set time (jiffies) between CPU-hotplug operations,
|
|
|
|
or zero to disable CPU-hotplug testing.
|
2012-04-24 01:54:45 +08:00
|
|
|
|
2020-04-25 02:21:40 +08:00
|
|
|
rcutorture.read_exit= [KNL]
|
|
|
|
Set the number of read-then-exit kthreads used
|
|
|
|
to test the interaction of RCU updaters and
|
|
|
|
task-exit processing.
|
|
|
|
|
|
|
|
rcutorture.read_exit_burst= [KNL]
|
|
|
|
The number of times in a given read-then-exit
|
|
|
|
episode that a set of read-then-exit kthreads
|
|
|
|
is spawned.
|
|
|
|
|
|
|
|
rcutorture.read_exit_delay= [KNL]
|
|
|
|
The delay, in seconds, between successive
|
|
|
|
read-then-exit testing episodes.
|
|
|
|
|
2013-10-09 11:23:47 +08:00
|
|
|
rcutorture.shuffle_interval= [KNL]
|
2012-04-24 01:54:45 +08:00
|
|
|
Set task-shuffle interval (s). Shuffling tasks
|
|
|
|
allows some CPUs to go into dyntick-idle mode
|
|
|
|
during the rcutorture test.
|
|
|
|
|
2013-10-09 11:23:47 +08:00
|
|
|
rcutorture.shutdown_secs= [KNL]
|
2012-04-24 01:54:45 +08:00
|
|
|
Set time (s) after boot for system shutdown. This
|
|
|
|
is useful for hands-off automated testing.
|
|
|
|
|
2013-10-09 11:23:47 +08:00
|
|
|
rcutorture.stall_cpu= [KNL]
|
2012-04-24 01:54:45 +08:00
|
|
|
Duration of CPU stall (s) to test RCU CPU stall
|
|
|
|
warnings, zero to disable.
|
|
|
|
|
2020-03-12 08:39:12 +08:00
|
|
|
rcutorture.stall_cpu_block= [KNL]
|
|
|
|
Sleep while stalling if set. This will result
|
|
|
|
in warnings from preemptible RCU in addition
|
|
|
|
to any other stall-related activity.
|
|
|
|
|
2013-10-09 11:23:47 +08:00
|
|
|
rcutorture.stall_cpu_holdoff= [KNL]
|
2012-04-24 01:54:45 +08:00
|
|
|
Time to wait (s) after boot before inducing stall.
|
|
|
|
|
2017-08-19 07:11:37 +08:00
|
|
|
rcutorture.stall_cpu_irqsoff= [KNL]
|
|
|
|
Disable interrupts while stalling if set.
|
|
|
|
|
2020-04-02 10:57:52 +08:00
|
|
|
rcutorture.stall_gp_kthread= [KNL]
|
|
|
|
Duration (s) of forced sleep within RCU
|
|
|
|
grace-period kthread to test RCU CPU stall
|
|
|
|
warnings, zero to disable. If both stall_cpu
|
|
|
|
and stall_gp_kthread are specified, the
|
|
|
|
kthread is starved first, then the CPU.
|
|
|
|
|
2013-10-09 11:23:47 +08:00
|
|
|
rcutorture.stat_interval= [KNL]
|
2012-04-24 01:54:45 +08:00
|
|
|
Time (s) between statistics printk()s.
|
|
|
|
|
2013-10-09 11:23:47 +08:00
|
|
|
rcutorture.stutter= [KNL]
|
2012-04-24 01:54:45 +08:00
|
|
|
Time (s) to stutter testing, for example, specifying
|
|
|
|
five seconds causes the test to run for five seconds,
|
|
|
|
wait for five seconds, and so on. This tests RCU's
|
|
|
|
ability to transition abruptly to and from idle.
|
|
|
|
|
2013-10-09 11:23:47 +08:00
|
|
|
rcutorture.test_boost= [KNL]
|
2012-04-24 01:54:45 +08:00
|
|
|
Test RCU priority boosting? 0=no, 1=maybe, 2=yes.
|
|
|
|
"Maybe" means test if the RCU implementation
|
|
|
|
under test supports RCU priority boosting.
|
|
|
|
|
2013-10-09 11:23:47 +08:00
|
|
|
rcutorture.test_boost_duration= [KNL]
|
2012-04-24 01:54:45 +08:00
|
|
|
Duration (s) of each individual boost test.
|
|
|
|
|
2013-10-09 11:23:47 +08:00
|
|
|
rcutorture.test_boost_interval= [KNL]
|
2012-04-24 01:54:45 +08:00
|
|
|
Interval (s) between each boost test.
|
|
|
|
|
2013-10-09 11:23:47 +08:00
|
|
|
rcutorture.test_no_idle_hz= [KNL]
|
2012-04-24 01:54:45 +08:00
|
|
|
Test RCU's dyntick-idle handling. See also the
|
|
|
|
rcutorture.shuffle_interval parameter.
|
|
|
|
|
2013-10-09 11:23:47 +08:00
|
|
|
rcutorture.torture_type= [KNL]
|
2012-04-24 01:54:45 +08:00
|
|
|
Specify the RCU implementation to test.
|
|
|
|
|
2013-10-09 11:23:47 +08:00
|
|
|
rcutorture.verbose= [KNL]
|
2012-04-24 01:54:45 +08:00
|
|
|
Enable additional printk() statements.
|
|
|
|
|
2019-06-14 06:30:49 +08:00
|
|
|
rcupdate.rcu_cpu_stall_ftrace_dump= [KNL]
|
|
|
|
Dump ftrace buffer after reporting RCU CPU
|
|
|
|
stall warning.
|
|
|
|
|
2015-11-25 07:44:06 +08:00
|
|
|
rcupdate.rcu_cpu_stall_suppress= [KNL]
|
|
|
|
Suppress RCU CPU stall warning messages.
|
|
|
|
|
2019-12-06 03:29:01 +08:00
|
|
|
rcupdate.rcu_cpu_stall_suppress_at_boot= [KNL]
|
|
|
|
Suppress RCU CPU stall warning messages and
|
|
|
|
rcutorture writer stall warnings that occur
|
|
|
|
during early boot, that is, during the time
|
|
|
|
before the init task is spawned.
|
|
|
|
|
2015-11-25 07:44:06 +08:00
|
|
|
rcupdate.rcu_cpu_stall_timeout= [KNL]
|
|
|
|
Set timeout for RCU CPU stall warning messages.
|
|
|
|
|
2013-10-09 11:23:47 +08:00
|
|
|
rcupdate.rcu_expedited= [KNL]
|
|
|
|
Use expedited grace-period primitives, for
|
|
|
|
example, synchronize_rcu_expedited() instead
|
|
|
|
of synchronize_rcu(). This reduces latency,
|
|
|
|
but can increase CPU utilization, degrade
|
|
|
|
real-time latency, and degrade energy efficiency.
|
2015-12-08 05:09:52 +08:00
|
|
|
No effect on CONFIG_TINY_RCU kernels.
|
2013-10-09 11:23:47 +08:00
|
|
|
|
2015-11-25 07:44:06 +08:00
|
|
|
rcupdate.rcu_normal= [KNL]
|
|
|
|
Use only normal grace-period primitives,
|
|
|
|
for example, synchronize_rcu() instead of
|
|
|
|
synchronize_rcu_expedited(). This improves
|
2015-12-08 05:09:52 +08:00
|
|
|
real-time latency, CPU utilization, and
|
|
|
|
energy efficiency, but can expose users to
|
|
|
|
increased grace-period latency. This parameter
|
|
|
|
overrides rcupdate.rcu_expedited. No effect on
|
|
|
|
CONFIG_TINY_RCU kernels.
|
2013-10-09 11:23:47 +08:00
|
|
|
|
2015-11-26 10:56:00 +08:00
|
|
|
rcupdate.rcu_normal_after_boot= [KNL]
|
|
|
|
Once boot has completed (that is, after
|
|
|
|
rcu_end_inkernel_boot() has been invoked), use
|
2015-12-08 05:09:52 +08:00
|
|
|
only normal grace-period primitives. No effect
|
|
|
|
on CONFIG_TINY_RCU kernels.
|
2015-11-26 10:56:00 +08:00
|
|
|
|
2020-12-15 22:16:47 +08:00
|
|
|
But note that CONFIG_PREEMPT_RT=y kernels enable
|
|
|
|
this kernel boot parameter, forcibly setting
|
|
|
|
it to the value one, that is, converting any
|
|
|
|
post-boot attempt at an expedited RCU grace
|
|
|
|
period to instead use normal non-expedited
|
|
|
|
grace-period processing.
|
|
|
|
|
2020-03-18 02:39:26 +08:00
|
|
|
rcupdate.rcu_task_ipi_delay= [KNL]
|
|
|
|
Set time in jiffies during which RCU tasks will
|
|
|
|
avoid sending IPIs, starting with the beginning
|
|
|
|
of a given grace period. Setting a large
|
|
|
|
number avoids disturbing real-time workloads,
|
|
|
|
but lengthens grace periods.
|
|
|
|
|
2014-07-02 09:16:30 +08:00
|
|
|
rcupdate.rcu_task_stall_timeout= [KNL]
|
|
|
|
Set timeout in jiffies for RCU task stall warning
|
|
|
|
messages. Disable with a value less than or equal
|
|
|
|
to zero.
|
|
|
|
|
2014-09-19 23:34:09 +08:00
|
|
|
rcupdate.rcu_self_test= [KNL]
|
|
|
|
Run the RCU early boot self tests
|
|
|
|
|
2005-09-07 06:17:19 +08:00
|
|
|
rdinit= [KNL]
|
|
|
|
Format: <full_path>
|
|
|
|
Run specified binary instead of /init from the ramdisk,
|
|
|
|
used for early userspace startup. See initrd.
|
|
|
|
|
2019-08-19 23:52:35 +08:00
|
|
|
rdrand= [X86]
|
|
|
|
force - Override the decision by the kernel to hide the
|
|
|
|
advertisement of RDRAND support (this affects
|
|
|
|
certain AMD processors because of buggy BIOS
|
|
|
|
support, specifically around the suspend/resume
|
|
|
|
path).
|
|
|
|
|
2017-08-25 00:26:51 +08:00
|
|
|
rdt= [HW,X86,RDT]
|
|
|
|
Turn on/off individual RDT features. List is:
|
2017-12-21 06:57:24 +08:00
|
|
|
cmt, mbmtotal, mbmlocal, l3cat, l3cdp, l2cat, l2cdp,
|
|
|
|
mba.
|
2017-08-25 00:26:51 +08:00
|
|
|
E.g. to turn on cmt and turn off mba use:
|
|
|
|
rdt=cmt,!mba
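The option syntax above (a comma-separated list in which a leading "!" turns a feature off) can be sketched in Python as follows. This is only an illustration of the syntax, not the kernel's parser; the function name parse_rdt and the choice to reject unknown names are assumptions.

```python
# Feature names copied from the list in the rdt= entry.
KNOWN_FEATURES = {
    "cmt", "mbmtotal", "mbmlocal", "l3cat", "l3cdp", "l2cat", "l2cdp", "mba",
}


def parse_rdt(value: str):
    """Split an rdt= value into (enabled, disabled) feature lists."""
    enabled, disabled = [], []
    for token in value.split(","):
        token = token.strip()
        if not token:
            continue
        turn_off = token.startswith("!")
        name = token[1:] if turn_off else token
        if name not in KNOWN_FEATURES:
            # Assumption: this sketch rejects names it does not know.
            raise ValueError(f"unknown RDT feature: {name!r}")
        (disabled if turn_off else enabled).append(name)
    return enabled, disabled


if __name__ == "__main__":
    print(parse_rdt("cmt,!mba"))
```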
|
|
|
|
|
2013-07-09 07:01:42 +08:00
|
|
|
reboot= [KNL]
|
|
|
|
Format (x86 or x86_64):
|
|
|
|
[w[arm] | c[old] | h[ard] | s[oft] | g[pio]] \
|
|
|
|
[[,]s[mp]####] \
|
|
|
|
[[,]b[ios] | a[cpi] | k[bd] | t[riple] | e[fi] | p[ci]] \
|
|
|
|
[[,]f[orce]]
|
2019-05-15 06:45:37 +08:00
|
|
|
Where reboot_mode is one of warm (soft) or cold (hard) or gpio
|
|
|
|
(prefix with 'panic_' to set mode for panic
|
|
|
|
reboot only),
|
2013-07-09 07:01:42 +08:00
|
|
|
reboot_type is one of bios, acpi, kbd, triple, efi, or pci,
|
|
|
|
reboot_force is either force or not specified,
|
|
|
|
reboot_cpu is s[mp]#### with #### being the processor
|
|
|
|
to be used for rebooting.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2020-06-18 02:53:53 +08:00
|
|
|
refscale.holdoff= [KNL]
|
2020-05-30 05:24:03 +08:00
|
|
|
Set test-start holdoff period. The purpose of
|
|
|
|
this parameter is to delay the start of the
|
|
|
|
test until boot completes in order to avoid
|
|
|
|
interference.
|
|
|
|
|
2020-06-18 02:53:53 +08:00
|
|
|
refscale.loops= [KNL]
|
2020-05-30 05:24:03 +08:00
|
|
|
Set the number of loops over the synchronization
|
|
|
|
primitive under test. Increasing this number
|
|
|
|
reduces noise due to loop start/end overhead,
|
|
|
|
but the default has already reduced the per-pass
|
|
|
|
noise to a handful of picoseconds on ca. 2020
|
|
|
|
x86 laptops.
|
|
|
|
|
2020-06-18 02:53:53 +08:00
|
|
|
refscale.nreaders= [KNL]
|
2020-05-30 05:24:03 +08:00
|
|
|
Set number of readers. The default value of -1
|
|
|
|
selects N, where N is roughly 75% of the number
|
|
|
|
of CPUs. A value of zero is an interesting choice.
|
|
|
|
|
2020-06-18 02:53:53 +08:00
|
|
|
refscale.nruns= [KNL]
|
2020-05-30 05:24:03 +08:00
|
|
|
Set number of runs, each of which is dumped onto
|
|
|
|
the console log.
|
|
|
|
|
2020-06-18 02:53:53 +08:00
|
|
|
refscale.readdelay= [KNL]
|
2020-05-30 05:24:03 +08:00
|
|
|
Set the read-side critical-section duration,
|
|
|
|
measured in microseconds.
|
|
|
|
|
2020-06-18 02:53:53 +08:00
|
|
|
refscale.scale_type= [KNL]
|
|
|
|
Specify the read-protection implementation to test.
|
|
|
|
|
|
|
|
refscale.shutdown= [KNL]
|
2020-05-30 05:24:03 +08:00
|
|
|
Shut down the system at the end of the performance
|
|
|
|
test. This defaults to 1 (shut it down) when
|
2020-08-12 12:18:12 +08:00
|
|
|
refscale is built into the kernel and to 0 (leave
|
|
|
|
it running) when refscale is built as a module.
|
2020-05-30 05:24:03 +08:00
|
|
|
|
2020-06-18 02:53:53 +08:00
|
|
|
refscale.verbose= [KNL]
|
2020-05-30 05:24:03 +08:00
|
|
|
Enable additional printk() statements.
|
|
|
|
|
2020-11-16 02:24:52 +08:00
|
|
|
refscale.verbose_batched= [KNL]
|
|
|
|
Batch the additional printk() statements. If zero
|
|
|
|
(the default) or negative, print everything. Otherwise,
|
|
|
|
print every Nth verbose statement, where N is the value
|
|
|
|
specified.
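The batching rule can be illustrated with a short Python sketch. It is not the refscale implementation: the function name should_print is hypothetical, and it assumes statements are counted from 1 and that "every Nth" means counts that are multiples of N.

```python
def should_print(count: int, verbose_batched: int) -> bool:
    """Decide whether the count-th verbose statement is printed.

    Zero or negative settings print everything; otherwise only
    every Nth statement is printed (assumed: multiples of N).
    """
    if verbose_batched <= 0:
        return True
    return count % verbose_batched == 0


if __name__ == "__main__":
    printed = [i for i in range(1, 10) if should_print(i, 3)]
    print(printed)
```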
|
|
|
|
|
2008-07-05 01:00:09 +08:00
|
|
|
relax_domain_level=
|
|
|
|
[KNL, SMP] Set scheduler's default relax_domain_level.
|
2019-06-28 00:08:35 +08:00
|
|
|
See Documentation/admin-guide/cgroup-v1/cpusets.rst.
|
2008-07-05 01:00:09 +08:00
|
|
|
|
2017-12-02 01:50:33 +08:00
|
|
|
reserve= [KNL,BUGS] Force kernel to ignore I/O ports or memory
|
|
|
|
Format: <base1>,<size1>[,<base2>,<size2>,...]
|
|
|
|
Reserve I/O ports or memory so the kernel won't use
|
|
|
|
them. If <base> is less than 0x10000, the region
|
|
|
|
is assumed to be I/O ports; otherwise it is memory.
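The base/size pairing and the 0x10000 threshold can be sketched as below. This Python helper is illustrative only: the name parse_reserve is hypothetical, and it assumes plain decimal or 0x-prefixed integers, which may be narrower than what the kernel accepts.

```python
def parse_reserve(value: str):
    """Split a reserve= value into (kind, base, size) tuples.

    A base below 0x10000 is classified as I/O ports, otherwise memory,
    following the rule in the reserve= entry.
    """
    # int(text, 0) accepts both decimal and 0x-prefixed hexadecimal.
    fields = [int(field, 0) for field in value.split(",")]
    if not fields or len(fields) % 2:
        raise ValueError("expected <base>,<size> pairs")
    regions = []
    for base, size in zip(fields[0::2], fields[1::2]):
        kind = "io" if base < 0x10000 else "mem"
        regions.append((kind, base, size))
    return regions


if __name__ == "__main__":
    print(parse_reserve("0x300,32,0x100000,0x1000"))
```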
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2007-07-31 15:37:59 +08:00
|
|
|
reservetop= [X86-32]
|
2006-09-26 14:32:25 +08:00
|
|
|
Format: nn[KMG]
|
|
|
|
Reserves a hole at the top of the kernel virtual
|
|
|
|
address space.
|
|
|
|
|
2006-09-27 16:50:44 +08:00
|
|
|
reset_devices [KNL] Force drivers to reset the underlying device
|
|
|
|
during initialization.
|
|
|
|
|
2005-10-24 03:57:11 +08:00
|
|
|
resume= [SWSUSP]
|
|
|
|
Specify the partition device for software suspend
|
2012-05-15 03:45:31 +08:00
|
|
|
Format:
|
|
|
|
{/dev/<dev> | PARTUUID=<uuid> | <int>:<int> | <hex>}
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2006-12-07 12:34:13 +08:00
|
|
|
resume_offset= [SWSUSP]
|
|
|
|
Specify the offset from the beginning of the partition
|
|
|
|
given by "resume=" at which the swap header is located,
|
|
|
|
in <PAGE_SIZE> units (needed only for swap files).
|
2019-06-13 18:10:36 +08:00
|
|
|
See Documentation/power/swsusp-and-swap-files.rst
|
2006-12-07 12:34:13 +08:00
|
|
|
|
2011-10-11 05:38:41 +08:00
|
|
|
resumedelay= [HIBERNATION] Delay (in seconds) to pause before attempting to
|
|
|
|
read the resume files
|
|
|
|
|
2011-10-07 02:34:46 +08:00
|
|
|
resumewait [HIBERNATION] Wait (indefinitely) for resume device to show up.
|
|
|
|
Useful for devices that are detected asynchronously
|
|
|
|
(e.g. USB and MMC devices).
|
|
|
|
|
2010-09-10 05:06:23 +08:00
|
|
|
hibernate= [HIBERNATION]
|
|
|
|
noresume Don't check if there's a hibernation image
|
|
|
|
present during boot.
|
|
|
|
nocompress Don't compress/decompress hibernation images.
|
2014-06-14 04:30:35 +08:00
|
|
|
no Disable hibernation and resume.
|
2016-07-10 08:12:10 +08:00
|
|
|
protect_image Turn on image protection during restoration
|
|
|
|
(that will set all pages holding image data
|
|
|
|
during restoration read-only).
|
2010-09-10 05:06:23 +08:00
|
|
|
|
2007-02-10 17:44:33 +08:00
|
|
|
retain_initrd [RAM] Keep initrd memory after extraction
|
|
|
|
|
2015-01-10 04:24:55 +08:00
|
|
|
rfkill.default_state=
|
|
|
|
0 "airplane mode". All wifi, bluetooth, wimax, gps, fm,
|
|
|
|
etc. communication is blocked by default.
|
|
|
|
1 Unblocked.
|
|
|
|
|
|
|
|
rfkill.master_switch_mode=
|
|
|
|
0 The "airplane mode" button does nothing.
|
|
|
|
1 The "airplane mode" button toggles between everything
|
|
|
|
blocked and the previous configuration.
|
|
|
|
2 The "airplane mode" button toggles between everything
|
|
|
|
blocked and everything unblocked.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
rhash_entries= [KNL,NET]
|
|
|
|
Set number of hash buckets for route cache
|
|
|
|
|
2017-01-20 21:22:36 +08:00
|
|
|
ring3mwait=disable
|
|
|
|
[KNL] Disable ring 3 MONITOR/MWAIT feature on supported
|
|
|
|
CPUs.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
ro [KNL] Mount root device read-only on boot
|
|
|
|
|
2016-02-18 06:41:13 +08:00
|
|
|
rodata= [KNL]
|
|
|
|
on Mark read-only kernel memory as read-only (default).
|
|
|
|
off Leave read-only kernel memory writable for debugging.
|
|
|
|
|
2016-02-22 19:55:01 +08:00
|
|
|
rockchip.usb_uart
|
|
|
|
Enable the uart passthrough on the designated usb port
|
|
|
|
on Rockchip SoCs. When active, the signals of the
|
|
|
|
debug-uart get routed to the D+ and D- pins of the usb
|
|
|
|
port and the regular usb controller gets disabled.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
root= [KNL] Root filesystem
|
2011-08-04 07:21:08 +08:00
|
|
|
See name_to_dev_t comment in init/do_mounts.c.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
rootdelay= [KNL] Delay (in seconds) to pause before attempting to
|
|
|
|
mount the root filesystem
|
|
|
|
|
|
|
|
rootflags= [KNL] Set root filesystem mount option string
|
|
|
|
|
|
|
|
rootfstype= [KNL] Set root filesystem type
|
|
|
|
|
2007-07-16 14:40:35 +08:00
|
|
|
rootwait [KNL] Wait (indefinitely) for root device to show up.
|
|
|
|
Useful for devices that are detected asynchronously
|
|
|
|
(e.g. USB and MMC devices).
|
|
|
|
|
2013-03-29 09:41:46 +08:00
|
|
|
rproc_mem=nn[KMG][@address]
|
|
|
|
[KNL,ARM,CMA] Remoteproc physical memory block.
|
|
|
|
Memory area to be used by remote processor image,
|
|
|
|
managed by CMA.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
rw [KNL] Mount root device read-write on boot
|
|
|
|
|
|
|
|
S [KNL] Run init in single mode
|
|
|
|
|
2014-07-18 23:37:08 +08:00
|
|
|
s390_iommu= [HW,S390]
|
|
|
|
Set s390 IOTLB flushing mode
|
|
|
|
strict
|
|
|
|
With strict flushing every unmap operation will result in
|
|
|
|
an IOTLB flush. Default is lazy flushing before reuse,
|
|
|
|
which is faster.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
sa1100ir [NET]
|
|
|
|
See drivers/net/irda/sa1100_ir.c.
|
|
|
|
|
|
|
|
sbni= [NET] Granch SBNI12 leased line adapter
|
2005-10-24 03:57:11 +08:00
|
|
|
|
2021-04-16 00:23:17 +08:00
|
|
|
sched_verbose [KNL] Enables verbose scheduler debug messages.
|
2009-11-18 08:22:15 +08:00
|
|
|
|
2016-02-05 17:08:36 +08:00
|
|
|
schedstats= [KNL,X86] Enable or disable scheduler statistics.
|
|
|
|
Allowed values are enable and disable. This feature
|
|
|
|
incurs a small amount of overhead in the scheduler
|
|
|
|
but is useful for debugging and performance tuning.
|
2009-11-18 08:22:15 +08:00
|
|
|
|
2020-02-22 08:52:13 +08:00
|
|
|
sched_thermal_decay_shift=
|
|
|
|
[KNL, SMP] Set a decay shift for scheduler thermal
|
|
|
|
pressure signal. Thermal pressure signal follows the
|
|
|
|
default decay period of other scheduler pelt
|
|
|
|
signals (usually 32 ms but configurable). Setting
|
|
|
|
sched_thermal_decay_shift will left shift the decay
|
|
|
|
period for the thermal pressure signal by the shift
|
|
|
|
value.
|
|
|
|
i.e. with the default pelt decay period of 32 ms
|
|
|
|
sched_thermal_decay_shift thermal pressure decay period
|
|
|
|
1 64 ms
|
|
|
|
2 128 ms
|
|
|
|
and so on.
|
|
|
|
Format: integer between 0 and 10
|
|
|
|
Default is 0.
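The relationship in the table above is a left shift of the base decay period. A minimal Python sketch, assuming the 32 ms default mentioned in the text (the function name thermal_decay_period_ms is hypothetical):

```python
PELT_DECAY_MS = 32  # default decay period quoted in the entry above


def thermal_decay_period_ms(shift: int, base_ms: int = PELT_DECAY_MS) -> int:
    """Return the thermal pressure decay period for a given shift."""
    if not 0 <= shift <= 10:
        raise ValueError("shift must be an integer between 0 and 10")
    return base_ms << shift


if __name__ == "__main__":
    for shift in (0, 1, 2):
        print(shift, thermal_decay_period_ms(shift), "ms")
```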
|
|
|
|
|
2020-06-25 06:59:59 +08:00
|
|
|
scftorture.holdoff= [KNL]
|
|
|
|
Number of seconds to hold off before starting
|
|
|
|
test. Defaults to zero for module insertion and
|
|
|
|
to 10 seconds for built-in smp_call_function()
|
|
|
|
tests.
|
|
|
|
|
|
|
|
scftorture.longwait= [KNL]
|
|
|
|
Request ridiculously long waits randomly selected
|
|
|
|
up to the chosen limit in seconds. Zero (the
|
|
|
|
default) disables this feature. Please note
|
|
|
|
that requesting even small non-zero numbers of
|
|
|
|
seconds can result in RCU CPU stall warnings,
|
|
|
|
softlockup complaints, and so on.
|
|
|
|
|
|
|
|
scftorture.nthreads= [KNL]
|
|
|
|
Number of kthreads to spawn to invoke the
|
|
|
|
smp_call_function() family of functions.
|
|
|
|
The default of -1 specifies a number of kthreads
|
|
|
|
equal to the number of CPUs.
|
|
|
|
|
|
|
|
scftorture.onoff_holdoff= [KNL]
|
|
|
|
Number of seconds to wait after the start of the
|
|
|
|
test before initiating CPU-hotplug operations.
|
|
|
|
|
|
|
|
scftorture.onoff_interval= [KNL]
|
|
|
|
Number of seconds to wait between successive
|
|
|
|
CPU-hotplug operations. Specifying zero (which
|
|
|
|
is the default) disables CPU-hotplug operations.
|
|
|
|
|
|
|
|
scftorture.shutdown_secs= [KNL]
|
|
|
|
The number of seconds following the start of the
|
|
|
|
test after which to shut down the system. The
|
|
|
|
default of zero avoids shutting down the system.
|
|
|
|
Non-zero values are useful for automated tests.
|
|
|
|
|
|
|
|
scftorture.stat_interval= [KNL]
|
|
|
|
The number of seconds between outputting the
|
|
|
|
current test statistics to the console. A value
|
|
|
|
of zero disables statistics output.
|
|
|
|
|
|
|
|
scftorture.stutter_cpus= [KNL]
|
|
|
|
The number of jiffies to wait between each change
|
|
|
|
to the set of CPUs under test.
|
|
|
|
|
|
|
|
scftorture.use_cpus_read_lock= [KNL]
|
|
|
|
Use cpus_read_lock() instead of the default
|
|
|
|
preempt_disable() to disable CPU hotplug
|
|
|
|
while invoking one of the smp_call_function*()
|
|
|
|
functions.
|
|
|
|
|
|
|
|
scftorture.verbose= [KNL]
|
|
|
|
Enable additional printk() statements.
|
|
|
|
|
|
|
|
scftorture.weight_single= [KNL]
|
|
|
|
The probability weighting to use for the
|
|
|
|
smp_call_function_single() function with a zero
|
|
|
|
"wait" parameter. A value of -1 selects the
|
|
|
|
default if all other weights are -1. However,
|
|
|
|
if at least one weight has some other value, a
|
|
|
|
value of -1 will instead select a weight of zero.
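The -1 convention for the weight parameters can be sketched in Python as follows. This illustrates the rule as described and is not the scftorture code; the function name resolve_weights and the default values passed in the example are hypothetical, since the actual defaults are not given here.

```python
def resolve_weights(weights: dict, defaults: dict) -> dict:
    """Apply the documented -1 rule to a set of weight parameters.

    If every weight is -1, the defaults are used. If at least one
    weight has another value, each remaining -1 becomes zero.
    """
    if all(w == -1 for w in weights.values()):
        return dict(defaults)
    return {name: (0 if w == -1 else w) for name, w in weights.items()}


if __name__ == "__main__":
    # Hypothetical defaults, for illustration only.
    defaults = {"single": 1, "single_wait": 1, "many": 1}
    print(resolve_weights({"single": -1, "single_wait": -1, "many": -1}, defaults))
    print(resolve_weights({"single": 5, "single_wait": -1, "many": -1}, defaults))
```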
|
|
|
|
|
|
|
|
scftorture.weight_single_wait= [KNL]
|
|
|
|
The probability weighting to use for the
|
|
|
|
smp_call_function_single() function with a
|
|
|
|
non-zero "wait" parameter. See weight_single.
|
|
|
|
|
|
|
|
scftorture.weight_many= [KNL]
|
|
|
|
The probability weighting to use for the
|
|
|
|
smp_call_function_many() function with a zero
|
|
|
|
"wait" parameter. See weight_single.
|
|
|
|
Note well that setting a high probability for
|
|
|
|
this weighting can place serious IPI load
|
|
|
|
on the system.
|
|
|
|
|
|
|
|
scftorture.weight_many_wait= [KNL]
|
|
|
|
The probability weighting to use for the
|
|
|
|
smp_call_function_many() function with a
|
|
|
|
non-zero "wait" parameter. See weight_single
|
|
|
|
and weight_many.
|
|
|
|
|
|
|
|
scftorture.weight_all= [KNL]
|
|
|
|
The probability weighting to use for the
|
|
|
|
smp_call_function() function with a zero
|
|
|
|
"wait" parameter. See weight_single and
|
|
|
|
weight_many.
|
|
|
|
|
|
|
|
scftorture.weight_all_wait= [KNL]
|
|
|
|
The probability weighting to use for the
|
|
|
|
smp_call_function() function with a
|
|
|
|
non-zero "wait" parameter. See weight_single
|
|
|
|
and weight_many.
|
|
|
|
|
2012-05-08 18:20:58 +08:00
|
|
|
skew_tick= [KNL] Offset the periodic timer tick per cpu to mitigate
|
|
|
|
xtime_lock contention on larger systems, and/or RCU lock
|
|
|
|
contention on all systems with CONFIG_MAXSMP set.
|
|
|
|
Format: { "0" | "1" }
|
|
|
|
0 -- disable. (may be 1 via CONFIG_CMDLINE="skew_tick=1")
|
|
|
|
1 -- enable.
|
|
|
|
Note: increases power consumption, thus should only be
|
|
|
|
enabled if running jitter-sensitive (HPC/RT) workloads.
|
|
|
|
|
2019-02-13 02:23:18 +08:00
|
|
|
security= [SECURITY] Choose a legacy "major" security module to
|
|
|
|
enable at boot. This has been deprecated by the
|
|
|
|
"lsm=" parameter.
|
2009-04-06 06:55:22 +08:00
|
|
|
|
|
|
|
selinux= [SELINUX] Disable or enable SELinux at boot time.
|
2005-04-17 06:20:36 +08:00
|
|
|
Format: { "0" | "1" }
|
|
|
|
See security/selinux/Kconfig help text.
|
|
|
|
0 -- disable.
|
|
|
|
1 -- enable.
|
2020-01-08 00:35:04 +08:00
|
|
|
Default value is 1.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2010-07-30 05:48:09 +08:00
|
|
|
apparmor= [APPARMOR] Disable or enable AppArmor at boot time
|
|
|
|
Format: { "0" | "1" }
|
|
|
|
See security/apparmor/Kconfig help text
|
|
|
|
0 -- disable.
|
|
|
|
1 -- enable.
|
|
|
|
Default value is set via kernel config option.
|
|
|
|
|
2007-07-31 15:37:59 +08:00
|
|
|
serialnumber [BUGS=X86-32]
|
2005-04-17 06:20:36 +08:00
|
|
|
|
|
|
|
shapers= [NET]
|
|
|
|
Maximal number of shapers.
|
2005-10-24 03:57:11 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
simeth= [IA-64]
|
|
|
|
simscsi=
|
2005-10-24 03:57:11 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
slram= [HW,MTD]
|
|
|
|
|
2021-04-30 13:54:39 +08:00
|
|
|
slab_merge [MM]
|
|
|
|
Enable merging of slabs with similar size when the
|
|
|
|
kernel is built without CONFIG_SLAB_MERGE_DEFAULT.
|
|
|
|
|
2014-10-10 06:26:22 +08:00
|
|
|
slab_nomerge [MM]
|
|
|
|
Disable merging of slabs with similar size. May be
|
|
|
|
necessary if there is some reason to distinguish
|
2017-07-07 06:36:40 +08:00
|
|
|
allocs to different slabs, especially in hardened
|
|
|
|
environments where the risk of heap overflows and
|
|
|
|
layout control by attackers can usually be
|
|
|
|
frustrated by disabling merging. This will reduce
|
|
|
|
most of the exposure of a heap attack to a single
|
|
|
|
cache (risks via metadata attacks are mostly
|
|
|
|
unchanged). Debug options disable merging on their
|
|
|
|
own.
|
2018-03-22 03:22:47 +08:00
|
|
|
For more information see Documentation/vm/slub.rst.
|
2014-10-10 06:26:22 +08:00
|
|
|
|
2011-10-19 13:09:28 +08:00
|
|
|
slab_max_order= [MM, SLAB]
|
|
|
|
Determines the maximum allowed order for slabs.
|
|
|
|
A high setting may cause OOMs due to memory
|
|
|
|
fragmentation. Defaults to 1 for systems with
|
|
|
|
more than 32MB of RAM, 0 otherwise.
|
|
|
|
|
2020-08-07 14:18:35 +08:00
|
|
|
slub_debug[=options[,slabs][;[options[,slabs]]...]] [MM, SLUB]
|
2007-07-16 14:38:14 +08:00
|
|
|
Enabling slub_debug allows one to determine the
|
|
|
|
culprit if slab objects become corrupted. Enabling
|
|
|
|
slub_debug can create guard zones around objects and
|
|
|
|
may poison objects when not in use. Also tracks the
|
|
|
|
last alloc / free. For more information see
|
2018-03-22 03:22:47 +08:00
|
|
|
Documentation/vm/slub.rst.
|
2007-05-31 15:40:47 +08:00
|
|
|
|
|
|
|
slub_max_order= [MM, SLUB]
|
2007-07-16 14:38:14 +08:00
|
|
|
Determines the maximum allowed order for slabs.
|
|
|
|
A high setting may cause OOMs due to memory
|
|
|
|
fragmentation. For more information see
|
2018-03-22 03:22:47 +08:00
|
|
|
Documentation/vm/slub.rst.
|
2007-05-31 15:40:47 +08:00
|
|
|
|
|
|
|
slub_min_objects= [MM, SLUB]
|
2007-07-16 14:38:14 +08:00
|
|
|
The minimum number of objects per slab. SLUB will
|
|
|
|
increase the slab order up to slub_max_order to
|
|
|
|
generate a sufficiently large slab able to contain
|
|
|
|
the number of objects indicated. The higher the number
|
|
|
|
of objects the smaller the overhead of tracking slabs
|
|
|
|
and the less frequently locks need to be acquired.
|
2018-03-22 03:22:47 +08:00
|
|
|
For more information see Documentation/vm/slub.rst.
|
2007-05-31 15:40:47 +08:00
|
|
|
|
|
|
|
slub_min_order= [MM, SLUB]
|
2012-02-14 23:26:42 +08:00
|
|
|
Determines the minimum page order for slabs. Must be
|
2007-07-16 14:38:14 +08:00
|
|
|
lower than slub_max_order.
|
2018-03-22 03:22:47 +08:00
|
|
|
For more information see Documentation/vm/slub.rst.
|
2007-05-31 15:40:47 +08:00
|
|
|
|
2021-04-30 13:54:39 +08:00
|
|
|
slub_merge [MM, SLUB]
|
|
|
|
Same as slab_merge.
|
|
|
|
|
2007-05-31 15:40:47 +08:00
|
|
|
slub_nomerge [MM, SLUB]
|
2014-10-10 06:26:22 +08:00
|
|
|
Same as slab_nomerge. This is supported for legacy reasons.
|
|
|
|
See slab_nomerge for more information.
|
2007-05-31 15:40:47 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
smart2= [HW]
|
|
|
|
Format: <io1>[,<io2>[,...,<io8>]]
|
|
|
|
|
2007-05-08 15:36:05 +08:00
|
|
|
smsc-ircc2.nopnp [HW] Don't use PNP to discover SMC devices
|
|
|
|
smsc-ircc2.ircc_cfg= [HW] Device configuration I/O port
|
|
|
|
smsc-ircc2.ircc_sir= [HW] SIR base I/O port
|
|
|
|
smsc-ircc2.ircc_fir= [HW] FIR base I/O port
|
|
|
|
smsc-ircc2.ircc_irq= [HW] IRQ line
|
|
|
|
smsc-ircc2.ircc_dma= [HW] DMA channel
|
|
|
|
smsc-ircc2.ircc_transceiver= [HW] Transceiver type:
|
|
|
|
0: Toshiba Satellite 1800 (GP data pin select)
|
|
|
|
1: Fast pin select (default)
|
|
|
|
2: ATC IRMode
|
|
|
|
|
2016-04-05 18:53:38 +08:00
|
|
|
smt [KNL,S390] Set the maximum number of threads (logical
|
|
|
|
CPUs) to use per physical CPU on systems capable of
|
|
|
|
symmetric multithreading (SMT). Will be capped to the
|
|
|
|
actual hardware limit.
|
|
|
|
Format: <integer>
|
|
|
|
Default: -1 (no limit)
|
|
|
|
|
2008-05-13 03:21:04 +08:00
|
|
|
softlockup_panic=
|
|
|
|
[KNL] Should the soft-lockup detector generate panics.
|
2020-06-08 12:40:42 +08:00
|
|
|
Format: 0 | 1
|
2008-05-13 03:21:04 +08:00
|
|
|
|
2020-06-08 12:40:42 +08:00
|
|
|
A value of 1 instructs the soft-lockup detector
|
2020-03-11 02:36:49 +08:00
|
|
|
to panic the machine when a soft-lockup occurs. It is
|
|
|
|
also controlled by the kernel.softlockup_panic sysctl
|
|
|
|
and CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC, which is the
|
|
|
|
respective build-time switch to that functionality.
|
2017-10-03 23:54:07 +08:00
|
|
|
|
2014-06-24 04:22:05 +08:00
|
|
|
softlockup_all_cpu_backtrace=
|
|
|
|
[KNL] Should the soft-lockup detector generate
|
|
|
|
backtraces on all cpus.
|
2020-06-08 12:40:42 +08:00
|
|
|
Format: 0 | 1
|
2014-06-24 04:22:05 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
sonypi.*= [HW] Sony Programmable I/O Control Device driver
|
2019-06-14 02:07:43 +08:00
|
|
|
See Documentation/admin-guide/laptops/sonypi.rst
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2018-01-12 05:46:26 +08:00
|
|
|
spectre_v2= [X86] Control mitigation of Spectre variant 2
|
|
|
|
(indirect branch speculation) vulnerability.
|
2018-11-26 02:33:45 +08:00
|
|
|
The default operation protects the kernel from
|
|
|
|
user space attacks.
|
2018-01-12 05:46:26 +08:00
|
|
|
|
2018-11-26 02:33:45 +08:00
|
|
|
on - unconditionally enable, implies
|
|
|
|
spectre_v2_user=on
|
|
|
|
off - unconditionally disable, implies
|
|
|
|
spectre_v2_user=off
|
2018-01-12 05:46:26 +08:00
|
|
|
auto - kernel detects whether your CPU model is
|
|
|
|
vulnerable
|
|
|
|
|
|
|
|
Selecting 'on' will, and 'auto' may, choose a
|
|
|
|
mitigation method at run time according to the
|
|
|
|
CPU, the available microcode, the setting of the
|
|
|
|
CONFIG_RETPOLINE configuration option, and the
|
|
|
|
compiler with which the kernel was built.
|
|
|
|
|
2018-11-26 02:33:45 +08:00
|
|
|
Selecting 'on' will also enable the mitigation
|
|
|
|
against user space to user space task attacks.
|
|
|
|
|
|
|
|
Selecting 'off' will disable both the kernel and
|
|
|
|
the user space protections.
|
|
|
|
|
2018-01-12 05:46:26 +08:00
|
|
|
Specific mitigations can also be selected manually:
|
|
|
|
|
|
|
|
retpoline - replace indirect branches
|
|
|
|
retpoline,generic - Google's original retpoline
|
|
|
|
retpoline,amd - AMD-specific minimal thunk
|
|
|
|
|
|
|
|
Not specifying this option is equivalent to
|
|
|
|
spectre_v2=auto.
|
|
|
|
|
2018-11-26 02:33:45 +08:00
|
|
|
spectre_v2_user=
|
|
|
|
[X86] Control mitigation of Spectre variant 2
|
|
|
|
(indirect branch speculation) vulnerability between
|
|
|
|
user space tasks
|
|
|
|
|
|
|
|
on - Unconditionally enable mitigations. Is
|
|
|
|
enforced by spectre_v2=on
|
|
|
|
|
|
|
|
off - Unconditionally disable mitigations. Is
|
|
|
|
enforced by spectre_v2=off
|
|
|
|
|
2018-11-26 02:33:54 +08:00
|
|
|
prctl - Indirect branch speculation is enabled,
|
|
|
|
but mitigation can be enabled via prctl
|
|
|
|
per thread. The mitigation control state
|
|
|
|
is inherited on fork.
|
|
|
|
|
2018-11-26 02:33:56 +08:00
|
|
|
prctl,ibpb
|
|
|
|
- Like "prctl" above, but only STIBP is
|
|
|
|
controlled per thread. IBPB is issued
|
|
|
|
always when switching between different user
|
|
|
|
space processes.
|
|
|
|
|
2018-11-26 02:33:55 +08:00
|
|
|
seccomp
|
|
|
|
- Same as "prctl" above, but all seccomp
|
|
|
|
threads will enable the mitigation unless
|
|
|
|
they explicitly opt out.
|
|
|
|
|
2018-11-26 02:33:56 +08:00
|
|
|
seccomp,ibpb
|
|
|
|
- Like "seccomp" above, but only STIBP is
|
|
|
|
controlled per thread. IBPB is issued
|
|
|
|
always when switching between different
|
|
|
|
user space processes.
|
|
|
|
|
2018-11-26 02:33:45 +08:00
|
|
|
auto - Kernel selects the mitigation depending on
|
|
|
|
the available CPU features and vulnerability.
|
2018-11-26 02:33:55 +08:00
|
|
|
|
|
|
|
Default mitigation:
|
|
|
|
If CONFIG_SECCOMP=y then "seccomp", otherwise "prctl"
|
2018-11-26 02:33:45 +08:00
|
|
|
|
|
|
|
Not specifying this option is equivalent to
|
|
|
|
spectre_v2_user=auto.
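The interaction between spectre_v2= and spectre_v2_user= described above can be summarised as a small decision function. This is a sketch of the documented rules only; the function name is hypothetical and the kernel's real selection logic also depends on CPU features.

```shell
# Resolve the spectre_v2_user= setting that results from both options.
# Usage: effective_spectre_v2_user <spectre_v2 value> <spectre_v2_user value>
# An empty argument stands for "option not given", which the text
# equates to auto. (hypothetical helper, not a kernel interface)
effective_spectre_v2_user() {
    v2=${1:-auto} user=${2:-auto}
    case $v2 in
        on)  echo on ;;       # spectre_v2=on enforces spectre_v2_user=on
        off) echo off ;;      # spectre_v2=off enforces spectre_v2_user=off
        *)   echo "$user" ;;  # otherwise the user-space option applies
    esac
}

effective_spectre_v2_user on prctl
effective_spectre_v2_user auto seccomp,ibpb
effective_spectre_v2_user "" ""
```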
|
|
|
|
|
2018-04-26 10:04:21 +08:00
|
|
|
spec_store_bypass_disable=
|
|
|
|
[HW] Control Speculative Store Bypass (SSB) Disable mitigation
|
|
|
|
(Speculative Store Bypass vulnerability)
|
|
|
|
|
|
|
|
Certain CPUs are vulnerable to an exploit against a
|
|
|
|
common industry-wide performance optimization known
|
|
|
|
as "Speculative Store Bypass" in which recent stores
|
|
|
|
to the same memory location may not be observed by
|
|
|
|
later loads during speculative execution. The idea
|
|
|
|
is that such stores are unlikely and that they can
|
|
|
|
be detected prior to instruction retirement at the
|
|
|
|
end of a particular speculative execution window.
|
|
|
|
|
|
|
|
In vulnerable processors, the speculatively forwarded
|
|
|
|
store can be used in a cache side channel attack, for
|
|
|
|
example to read memory to which the attacker does not
|
|
|
|
directly have access (e.g. inside sandboxed code).
|
|
|
|
|
|
|
|
This parameter controls whether the Speculative Store
|
|
|
|
Bypass optimization is used.
|
|
|
|
|
2018-07-10 10:08:36 +08:00
|
|
|
On x86 the options are:
|
|
|
|
|
2018-05-04 05:37:54 +08:00
|
|
|
on - Unconditionally disable Speculative Store Bypass
|
|
|
|
off - Unconditionally enable Speculative Store Bypass
|
|
|
|
auto - Kernel detects whether the CPU model contains an
|
|
|
|
implementation of Speculative Store Bypass and
|
|
|
|
picks the most appropriate mitigation. If the
|
|
|
|
CPU is not vulnerable, "off" is selected. If the
|
|
|
|
CPU is vulnerable the default mitigation is
|
|
|
|
architecture and Kconfig dependent. See below.
|
|
|
|
prctl - Control Speculative Store Bypass per thread
|
|
|
|
via prctl. Speculative Store Bypass is enabled
|
|
|
|
for a process by default. The state of the control
|
|
|
|
is inherited on fork.
|
|
|
|
seccomp - Same as "prctl" above, but all seccomp threads
|
|
|
|
will disable SSB unless they explicitly opt out.
|
2018-04-26 10:04:21 +08:00
|
|
|
|
2018-05-04 05:37:54 +08:00
|
|
|
Default mitigations:
|
|
|
|
X86: If CONFIG_SECCOMP=y "seccomp", otherwise "prctl"
|
|
|
|
|
2018-07-10 10:08:36 +08:00
|
|
|
On powerpc the options are:
|
|
|
|
|
|
|
|
on,auto - On Power8 and Power9 insert a store-forwarding
|
|
|
|
barrier on kernel entry and exit. On Power7
|
|
|
|
perform a software flush on kernel entry and
|
|
|
|
exit.
|
|
|
|
off - No action.
|
|
|
|
|
|
|
|
Not specifying this option is equivalent to
|
|
|
|
spec_store_bypass_disable=auto.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
spia_io_base= [HW,MTD]
|
|
|
|
spia_fio_base=
|
|
|
|
spia_pedr=
|
|
|
|
spia_peddr=
|
|
|
|
|
2020-01-27 04:05:35 +08:00
|
|
|
split_lock_detect=
|
2021-03-22 21:53:25 +08:00
|
|
|
[X86] Enable split lock detection or bus lock detection
|
2020-01-27 04:05:35 +08:00
|
|
|
|
|
|
|
When enabled (and if hardware support is present), atomic
|
|
|
|
instructions that access data across cache line
|
2021-03-22 21:53:25 +08:00
|
|
|
boundaries will result in an alignment check exception
|
|
|
|
for split lock detection or a debug exception for
|
|
|
|
bus lock detection.
|
2020-01-27 04:05:35 +08:00
|
|
|
|
|
|
|
off - not enabled
|
|
|
|
|
2021-03-22 21:53:25 +08:00
|
|
|
warn - the kernel will emit rate-limited warnings
|
2020-01-27 04:05:35 +08:00
|
|
|
about applications triggering the #AC
|
2021-03-22 21:53:25 +08:00
|
|
|
exception or the #DB exception. This mode is
|
|
|
|
the default on CPUs that support split lock
|
|
|
|
detection or bus lock detection. Default
|
|
|
|
behavior is by #AC if both features are
|
|
|
|
enabled in hardware.
|
2020-01-27 04:05:35 +08:00
|
|
|
|
|
|
|
fatal - the kernel will send SIGBUS to applications
|
2021-03-22 21:53:25 +08:00
|
|
|
that trigger the #AC exception or the #DB
|
|
|
|
exception. Default behavior is by #AC if
|
|
|
|
both features are enabled in hardware.
|
2020-01-27 04:05:35 +08:00
|
|
|
|
2021-04-20 05:49:57 +08:00
|
|
|
ratelimit:N -
|
|
|
|
Set system wide rate limit to N bus locks
|
|
|
|
per second for bus lock detection.
|
|
|
|
0 < N <= 1000.
|
|
|
|
|
|
|
|
N/A for split lock detection.
|
|
|
|
|
|
|
|
|
2020-01-27 04:05:35 +08:00
|
|
|
If an #AC exception is hit in the kernel or in
|
|
|
|
firmware (i.e. not while executing in user mode)
|
|
|
|
the kernel will oops in either "warn" or "fatal"
|
|
|
|
mode.
|
|
|
|
|
2021-03-22 21:53:25 +08:00
|
|
|
#DB exception for bus lock is triggered only when
|
|
|
|
CPL > 0.
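The accepted values, including the 0 < N <= 1000 bound on ratelimit:N, can be expressed as a validity check. This is a sketch based only on the list above; the function name is hypothetical and the kernel's own parser may differ in details such as how it treats malformed numbers.

```shell
# Return success if the argument is a documented split_lock_detect= value.
# (hypothetical helper for illustration)
split_lock_detect_valid() {
    case $1 in
        off|warn|fatal) return 0 ;;
        ratelimit:*)
            n=${1#ratelimit:}
            case $n in
                ''|*[!0-9]*) return 1 ;;   # N must be a plain decimal number
            esac
            # Documented range: 0 < N <= 1000
            [ "$n" -gt 0 ] && [ "$n" -le 1000 ]
            ;;
        *) return 1 ;;
    esac
}

split_lock_detect_valid ratelimit:500 && echo "ratelimit:500 accepted"
split_lock_detect_valid ratelimit:0 || echo "ratelimit:0 rejected"
```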
|
|
|
|
|
2020-04-16 23:54:04 +08:00
|
|
|
srbds= [X86,INTEL]
|
|
|
|
Control the Special Register Buffer Data Sampling
|
|
|
|
(SRBDS) mitigation.
|
|
|
|
|
|
|
|
Certain CPUs are vulnerable to an MDS-like
|
|
|
|
exploit which can leak bits from the random
|
|
|
|
number generator.
|
|
|
|
|
|
|
|
By default, this issue is mitigated by
|
|
|
|
microcode. However, the microcode fix can cause
|
|
|
|
the RDRAND and RDSEED instructions to become
|
|
|
|
much slower. Among other effects, this will
|
|
|
|
result in reduced throughput from /dev/urandom.
|
|
|
|
|
|
|
|
The microcode mitigation can be disabled with
|
|
|
|
the following option:
|
|
|
|
|
|
|
|
off: Disable mitigation and remove
|
|
|
|
performance impact to RDRAND and RDSEED
|
|
|
|
|
2017-05-04 06:35:32 +08:00
|
|
|
srcutree.counter_wrap_check [KNL]
|
|
|
|
Specifies how frequently to check for
|
|
|
|
grace-period sequence counter wrap for the
|
|
|
|
srcu_data structure's ->srcu_gp_seq_needed field.
|
|
|
|
The greater the number of bits set in this kernel
|
|
|
|
parameter, the less frequently counter wrap will
|
|
|
|
be checked for. Note that the bottom two bits
|
|
|
|
are ignored.
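Because the bottom two bits are ignored, values that differ only in those bits behave alike. The arithmetic can be sketched as a mask; the function name is hypothetical and the masking is an illustration of the statement above, not a copy of the kernel's check.

```shell
# Show the part of a counter_wrap_check value that matters once the
# bottom two bits are disregarded. (hypothetical helper)
effective_wrap_check() {
    echo $(( $1 & ~3 ))
}

effective_wrap_check 255
effective_wrap_check 252
```

Both calls print the same number, since 255 and 252 differ only in their low two bits.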
|
|
|
|
|
2017-04-26 05:03:11 +08:00
|
|
|
srcutree.exp_holdoff [KNL]
|
|
|
|
Specifies how many nanoseconds must elapse
|
|
|
|
since the end of the last SRCU grace period for
|
|
|
|
a given srcu_struct until the next normal SRCU
|
|
|
|
grace period will be considered for automatic
|
|
|
|
expediting. Set to zero to disable automatic
|
|
|
|
expediting.
|
|
|
|
|
2018-05-29 20:11:09 +08:00
|
|
|
ssbd= [ARM64,HW]
|
|
|
|
Speculative Store Bypass Disable control
|
|
|
|
|
|
|
|
On CPUs that are vulnerable to the Speculative
|
|
|
|
Store Bypass vulnerability and offer a
|
|
|
|
firmware based mitigation, this parameter
|
|
|
|
indicates how the mitigation should be used:
|
|
|
|
|
|
|
|
force-on: Unconditionally enable mitigation for
|
|
|
|
both kernel and userspace
|
|
|
|
force-off: Unconditionally disable mitigation for
|
|
|
|
both kernel and userspace
|
|
|
|
kernel: Always enable mitigation in the
|
|
|
|
kernel, and offer a prctl interface
|
|
|
|
to allow userspace to register its
|
|
|
|
interest in being mitigated too.
|
|
|
|
|
2017-06-19 19:03:24 +08:00
|
|
|
stack_guard_gap= [MM]
|
|
|
|
override the default stack gap protection. The value
|
|
|
|
is in page units and it defines how many pages prior
|
|
|
|
to (for stacks growing down) or after (for stacks
|
|
|
|
growing up) the main stack are reserved for no other
|
|
|
|
mapping. Default value is 256 pages.
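Since the value is given in pages, the resulting gap in bytes depends on the page size. A minimal sketch of the conversion follows; the function name is hypothetical, and the page size is taken from getconf when not passed explicitly.

```shell
# Convert a stack_guard_gap= value (in pages) to bytes.
# Usage: stack_guard_gap_bytes <pages> [page size in bytes]
# (hypothetical helper for illustration)
stack_guard_gap_bytes() {
    pages=$1
    page_size=${2:-$(getconf PAGESIZE)}
    echo $(( pages * page_size ))
}

# Default of 256 pages with 4096-byte pages.
stack_guard_gap_bytes 256 4096
```

With 4096-byte pages the default of 256 pages works out to 1048576 bytes.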
|
|
|
|
|
2021-02-26 09:21:27 +08:00
|
|
|
stack_depot_disable= [KNL]
|
|
|
|
Setting this to true through kernel command line will
|
|
|
|
disable the stack depot thereby saving the static memory
|
|
|
|
consumed by the stack hash table. By default this is set
|
|
|
|
to false.
|
|
|
|
|
2008-12-17 12:06:40 +08:00
|
|
|
stacktrace [FTRACE]
|
|
|
|
Enable the stack tracer on boot up.
|
|
|
|
|
2011-12-20 11:01:00 +08:00
|
|
|
stacktrace_filter=[function-list]
|
|
|
|
[FTRACE] Limit the functions that the stack tracer
|
2021-01-01 12:08:31 +08:00
|
|
|
will trace at boot up. function-list is a comma-separated
|
2011-12-20 11:01:00 +08:00
|
|
|
list of functions. This list can be changed at run
|
|
|
|
time by the stack_trace_filter file in the debugfs
|
|
|
|
tracing directory. Note, this enables stack tracing
|
|
|
|
and the stacktrace above is not needed.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
sti= [PARISC,HW]
|
|
|
|
Format: <num>
|
|
|
|
Set the STI (builtin display/keyboard on the HP-PARISC
|
|
|
|
machines) console (graphic card) which should be used
|
|
|
|
as the initial boot-console.
|
|
|
|
See also comment in drivers/video/console/sticore.c.
|
|
|
|
|
|
|
|
sti_font= [HW]
|
|
|
|
See comment in drivers/video/console/sticore.c.
|
|
|
|
|
|
|
|
stifb= [HW]
|
|
|
|
Format: bpp:<bpp1>[:<bpp2>[:<bpp3>...]]
|
|
|
|
|
2009-08-10 03:06:19 +08:00
|
|
|
sunrpc.min_resvport=
|
|
|
|
sunrpc.max_resvport=
|
|
|
|
[NFS,SUNRPC]
|
|
|
|
SunRPC servers often require that client requests
|
|
|
|
originate from a privileged port (i.e. a port in the
|
|
|
|
range 0 < portnr < 1024).
|
|
|
|
An administrator who wishes to reserve some of these
|
|
|
|
ports for other uses may adjust the range that the
|
|
|
|
kernel's sunrpc client considers to be privileged
|
|
|
|
using these two parameters to set the minimum and
|
|
|
|
maximum port values.
|
|
|
|
|
2016-06-24 22:55:50 +08:00
|
|
|
sunrpc.svc_rpc_per_connection_limit=
|
|
|
|
[NFS,SUNRPC]
|
|
|
|
Limit the number of requests that the server will
|
|
|
|
process in parallel from a single connection.
|
|
|
|
The default value is 0 (no limit).
|
|
|
|
|
2007-03-06 17:42:23 +08:00
|
|
|
sunrpc.pool_mode=
|
|
|
|
[NFS]
|
|
|
|
Control how the NFS server code allocates CPUs to
|
|
|
|
service thread pools. Depending on how many NICs
|
|
|
|
you have and where their interrupts are bound, this
|
|
|
|
option will affect which CPUs will do NFS serving.
|
|
|
|
Note: this parameter cannot be changed while the
|
|
|
|
NFS server is running.
|
|
|
|
|
|
|
|
auto the server chooses an appropriate mode
|
|
|
|
automatically using heuristics
|
|
|
|
global a single global pool contains all CPUs
|
|
|
|
percpu one pool for each CPU
|
|
|
|
pernode one pool for each NUMA node (equivalent
|
|
|
|
to global on non-NUMA machines)
|
|
|
|
|
2009-08-10 03:06:19 +08:00
|
|
|
sunrpc.tcp_slot_table_entries=
|
|
|
|
sunrpc.udp_slot_table_entries=
|
|
|
|
[NFS,SUNRPC]
|
|
|
|
Sets the upper limit on the number of simultaneous
|
|
|
|
RPC calls that can be sent from the client to a
|
|
|
|
server. Increasing these values may allow you to
|
|
|
|
improve throughput, but will also increase the
|
|
|
|
amount of memory reserved for use by the client.
|
|
|
|
|
2015-02-23 13:16:49 +08:00
|
|
|
suspend.pm_test_delay=
|
|
|
|
[SUSPEND]
|
|
|
|
Sets the number of seconds to remain in a suspend test
|
|
|
|
mode before resuming the system (see
|
|
|
|
/sys/power/pm_test). Only available when CONFIG_PM_DEBUG
|
|
|
|
is set. Default value is 5.
|
|
|
|
|
2019-08-20 10:13:14 +08:00
|
|
|
svm= [PPC]
|
|
|
|
Format: { on | off | y | n | 1 | 0 }
|
|
|
|
This parameter controls use of the Protected
|
|
|
|
Execution Facility on pSeries.
|
|
|
|
|
2013-08-23 07:35:46 +08:00
|
|
|
swapaccount=[0|1]
|
2010-11-25 04:57:08 +08:00
|
|
|
[KNL] Enable accounting of swap in memory resource
|
|
|
|
controller if no parameter or 1 is given or disable
|
2019-06-28 00:08:35 +08:00
|
|
|
it if 0 is given (See Documentation/admin-guide/cgroup-v1/memory.rst)
|
2010-11-25 04:57:08 +08:00
|
|
|
|
2013-11-27 20:48:09 +08:00
|
|
|
swiotlb= [ARM,IA-64,PPC,MIPS,X86]
|
2016-12-16 21:28:42 +08:00
|
|
|
Format: { <int> | force | noforce }
|
2013-11-27 20:48:09 +08:00
|
|
|
<int> -- Number of I/O TLB slabs
|
|
|
|
force -- force using of bounce buffers even if they
|
|
|
|
wouldn't be automatically used by the kernel
|
2016-12-16 21:28:42 +08:00
|
|
|
noforce -- Never use bounce buffers (for debugging)
|
2005-10-24 03:57:11 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
switches= [HW,M68k]
|
|
|
|
|
2020-06-08 12:40:24 +08:00
|
|
|
sysctl.*= [KNL]
|
|
|
|
Set a sysctl parameter, right before loading the init
|
|
|
|
process, as if the value was written to the respective
|
|
|
|
/proc/sys/... file. Both '.' and '/' are recognized as
|
|
|
|
separators. Unrecognized parameters and invalid values
|
|
|
|
are reported in the kernel log. Sysctls registered
|
|
|
|
later by a loaded module cannot be set this way.
|
|
|
|
Example: sysctl.vm.swappiness=40
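The mapping from a sysctl.* argument to the /proc/sys/... file it is written to can be sketched as follows. The function name is hypothetical; the sketch only mirrors the separator rule stated above and prints the target instead of writing to it.

```shell
# Print the /proc/sys path and the value that a sysctl.* boot argument
# corresponds to. Both '.' and '/' in the name act as separators.
# (hypothetical helper; it writes nothing)
sysctl_arg_to_write() {
    arg=${1#sysctl.}       # drop the "sysctl." prefix
    name=${arg%%=*}        # text before the first '='
    value=${arg#*=}        # text after the first '='
    path=$(printf '%s' "$name" | tr . /)
    printf '/proc/sys/%s %s\n' "$path" "$value"
}

sysctl_arg_to_write sysctl.vm.swappiness=40
sysctl_arg_to_write sysctl.vm/swappiness=40
```

Both calls name the same file, /proc/sys/vm/swappiness, with the value 40.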
|
|
|
|
|
2010-09-08 22:54:17 +08:00
|
|
|
sysfs.deprecated=0|1 [KNL]
|
|
|
|
Enable/disable old style sysfs layout for old udev
|
|
|
|
on older distributions. When this option is enabled
|
|
|
|
very new udev will not work anymore. When this option
|
|
|
|
is disabled (or CONFIG_SYSFS_DEPRECATED not compiled)
|
|
|
|
older udev will not work anymore.
|
|
|
|
Default depends on CONFIG_SYSFS_DEPRECATED_V2 set in
|
|
|
|
the kernel configuration.
|
|
|
|
|
2006-12-13 16:34:36 +08:00
|
|
|
sysrq_always_enabled
|
|
|
|
[KNL]
|
|
|
|
Ignore sysrq setting - this boot parameter will
|
|
|
|
neutralize any effect of /proc/sys/kernel/sysrq.
|
|
|
|
Useful for debugging.
|
|
|
|
|
2014-11-07 02:46:50 +08:00
|
|
|
tcpmhash_entries= [KNL,NET]
|
|
|
|
Set the number of tcp_metrics_hash slots.
|
|
|
|
Default value is 8192 or 16384 depending on total
|
|
|
|
ram pages. This is used to specify the TCP metrics
|
2020-04-28 06:01:49 +08:00
|
|
|
cache size. See Documentation/networking/ip-sysctl.rst
|
2014-11-07 02:46:50 +08:00
|
|
|
"tcp_no_metrics_save" section for more details.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
tdfx= [HW,DRM]
|
|
|
|
|
2014-09-03 02:54:41 +08:00
|
|
|
test_suspend= [SUSPEND][,N]
|
2008-07-24 12:28:33 +08:00
|
|
|
Specify "mem" (for Suspend-to-RAM) or "standby" (for
|
2014-09-03 02:54:41 +08:00
|
|
|
standby suspend) or "freeze" (for suspend type freeze)
|
|
|
|
as the system sleep state during system startup with
|
|
|
|
the optional capability to repeat N number of times.
|
|
|
|
The system is woken from this state using a
|
|
|
|
wakeup-capable RTC alarm.
|
2008-07-24 12:28:33 +08:00
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
thash_entries= [KNL,NET]
|
|
|
|
Set the number of hash buckets for TCP connections.
|
|
|
|
|
2007-08-12 12:12:54 +08:00
|
|
|
thermal.act= [HW,ACPI]
|
|
|
|
-1: disable all active trip points in all thermal zones
|
|
|
|
<degrees C>: override all lowest active trip points
|
|
|
|
|
2007-08-15 03:49:32 +08:00
|
|
|
thermal.crt= [HW,ACPI]
|
|
|
|
-1: disable all critical trip points in all thermal zones
|
2008-10-17 14:41:20 +08:00
|
|
|
<degrees C>: override all critical trip points
|
2007-08-15 03:49:32 +08:00
|
|
|
|
2007-08-12 12:12:44 +08:00
|
|
|
thermal.nocrt= [HW,ACPI]
|
|
|
|
Set to disable actions on ACPI thermal zone
|
|
|
|
critical and hot trip points.
|
|
|
|
|
2007-08-12 12:12:17 +08:00
|
|
|
thermal.off= [HW,ACPI]
|
|
|
|
1: disable ACPI thermal control
|
|
|
|
|
2007-08-12 12:12:35 +08:00
|
|
|
thermal.psv= [HW,ACPI]
|
|
|
|
-1: disable all passive trip points
|
2008-12-20 02:57:32 +08:00
|
|
|
<degrees C>: override all passive trip points to this
|
|
|
|
value
|
2007-08-12 12:12:35 +08:00
|
|
|
|
2007-08-12 12:12:26 +08:00
|
|
|
thermal.tzp= [HW,ACPI]
|
|
|
|
Specify global default ACPI thermal zone polling rate
|
|
|
|
<deci-seconds>: poll all thermal zones at this frequency
|
|
|
|
0: no polling (default)
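Because the unit is deci-seconds (tenths of a second), a polling period in seconds has to be multiplied by ten. A small sketch of the conversion, with a hypothetical function name:

```shell
# Convert a polling period in whole seconds to a thermal.tzp= value.
# (hypothetical helper for illustration)
seconds_to_tzp() {
    echo $(( $1 * 10 ))
}

# A 30 second polling period.
seconds_to_tzp 30
```

A 30 second period therefore corresponds to thermal.tzp=300.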
|
|
|
|
|
2011-02-24 07:52:23 +08:00
|
|
|
threadirqs [KNL]
|
|
|
|
Force threading of all interrupt handlers except those
|
2012-02-14 23:26:42 +08:00
|
|
|
marked explicitly IRQF_NO_THREAD.
|
2011-02-24 07:52:23 +08:00
|
|
|
|
2008-12-25 20:39:23 +08:00
|
|
|
topology= [S390]
|
|
|
|
Format: {off | on}
|
|
|
|
Specify if the kernel should make use of the cpu
|
2011-04-05 06:04:46 +08:00
|
|
|
topology information if the hardware supports this.
|
|
|
|
The scheduler will make use of this information and
|
2008-12-25 20:39:23 +08:00
|
|
|
e.g. base its process migration decisions on it.
|
2010-10-25 22:10:43 +08:00
|
|
|
Default is on.
|
2008-12-25 20:39:23 +08:00
|
|
|
|
2014-10-11 00:04:49 +08:00
|
|
|
topology_updates= [KNL, PPC, NUMA]
|
|
|
|
Format: {off}
|
|
|
|
Specify if the kernel should ignore (off)
|
|
|
|
topology updates sent by the hypervisor to this
|
|
|
|
LPAR.
|
|
|
|
|
2019-12-07 07:02:59 +08:00
|
|
|
torture.disable_onoff_at_boot= [KNL]
|
|
|
|
Prevent the CPU-hotplug component of torturing
|
|
|
|
until after init has spawned.
|
|
|
|
|
2020-06-17 06:38:24 +08:00
|
|
|
torture.ftrace_dump_at_shutdown= [KNL]
|
|
|
|
Dump the ftrace buffer at torture-test shutdown,
|
|
|
|
even if there were no errors. This can be a
|
|
|
|
very costly operation when many torture tests
|
|
|
|
are running concurrently, especially on systems
|
|
|
|
with rotating-rust storage.
|
|
|
|
|
2020-11-26 05:00:04 +08:00
|
|
|
torture.verbose_sleep_frequency= [KNL]
|
|
|
|
Specifies how many verbose printk()s should be
|
|
|
|
emitted between each sleep. The default of zero
|
|
|
|
disables verbose-printk() sleeping.
|
|
|
|
|
|
|
|
torture.verbose_sleep_duration= [KNL]
|
|
|
|
Duration of each verbose-printk() sleep in jiffies.
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
tp720= [HW,PS2]
|
|
|
|
|
2010-03-25 11:55:32 +08:00
|
|
|
tpm_suspend_pcr=[HW,TPM]
|
|
|
|
Format: integer pcr id
|
|
|
|
Specify that at suspend time, the tpm driver
|
|
|
|
should extend the specified pcr with zeros,
|
|
|
|
as a workaround for some chips which fail to
|
|
|
|
flush the last written pcr on TPM_SaveState.
|
|
|
|
This will guarantee that all the other pcrs
|
|
|
|
are saved.
|
|
|
|
|
2009-06-24 17:33:15 +08:00
|
|
|
trace_buf_size=nn[KMG]
|
2014-12-03 09:39:20 +08:00
|
|
|
[FTRACE] Set the tracing buffer size for each CPU.
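The nn[KMG] format can be expanded to a byte count as sketched below. The function name is hypothetical, and two assumptions are made: the suffixes are binary multiples (1K = 1024) and lower-case suffixes are accepted as well.

```shell
# Expand a size such as 10K or 1M to bytes.
# Assumes binary multiples and case-insensitive suffixes.
# (hypothetical helper for illustration)
trace_buf_size_bytes() {
    arg=$1
    num=${arg%[KMGkmg]}    # strip one trailing suffix letter, if any
    case $arg in
        *[Kk]) echo $(( num * 1024 )) ;;
        *[Mm]) echo $(( num * 1024 * 1024 )) ;;
        *[Gg]) echo $(( num * 1024 * 1024 * 1024 )) ;;
        *)     echo "$num" ;;
    esac
}

trace_buf_size_bytes 10K
trace_buf_size_bytes 1M
trace_buf_size_bytes 4096
```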
|
2009-03-10 12:57:10 +08:00
|
|
|
|
2009-07-01 10:47:05 +08:00
|
|
|
trace_event=[event-list]
|
|
|
|
[FTRACE] Set and start specified trace events in order
|
2016-05-24 04:37:58 +08:00
|
|
|
to facilitate early boot debugging. The event-list is a
|
2021-01-01 12:08:31 +08:00
|
|
|
comma-separated list of trace events to enable. See
|
2018-05-09 02:14:57 +08:00
|
|
|
also Documentation/trace/events.rst
|
2009-07-01 10:47:05 +08:00
|
|
|
|
2012-11-02 10:56:07 +08:00
|
|
|
trace_options=[option-list]
|
|
|
|
[FTRACE] Enable or disable tracer options at boot.
|
|
|
|
The option-list is a comma delimited list of options
|
|
|
|
that can be enabled or disabled just as if you were
|
|
|
|
to echo the option name into
|
|
|
|
|
|
|
|
/sys/kernel/debug/tracing/trace_options
|
|
|
|
|
|
|
|
For example, to enable stacktrace option (to dump the
|
|
|
|
stack trace of each event), add to the command line:
|
|
|
|
|
|
|
|
trace_options=stacktrace
|
|
|
|
|
2018-05-09 02:14:57 +08:00
|
|
|
See also Documentation/trace/ftrace.rst "trace options"
|
2012-11-02 10:56:07 +08:00
|
|
|
section.
|
|
|
|
|
2014-12-13 11:27:10 +08:00
|
|
|
tp_printk[FTRACE]
|
|
|
|
Have the tracepoints sent to printk as well as the
|
|
|
|
tracing ring buffer. This is useful for early boot up
|
|
|
|
where the system hangs or reboots and does not give the
|
|
|
|
option for reading the tracing buffer or performing a
|
|
|
|
ftrace_dump_on_oops.
|
|
|
|
|
|
|
|
To turn off having tracepoints sent to printk,
|
|
|
|
echo 0 > /proc/sys/kernel/tracepoint_printk
|
|
|
|
Note, echoing 1 into this file without the
|
|
|
|
tracepoint_printk kernel cmdline option has no effect.
|
|
|
|
|
2021-06-17 22:51:02 +08:00
|
|
|
The tp_printk_stop_on_boot option (see below) can also be used
|
|
|
|
to stop the printing of events to console at
|
|
|
|
late_initcall_sync.
|
|
|
|
|
2014-12-13 11:27:10 +08:00
|
|
|
** CAUTION **
|
|
|
|
|
|
|
|
Having tracepoints sent to printk() and activating high
|
|
|
|
frequency tracepoints such as irq or sched, can cause
|
|
|
|
the system to live lock.
|
|
|
|
|
2021-06-17 22:51:02 +08:00
|
|
|
tp_printk_stop_on_boot [FTRACE]
|
|
|
|
When tp_printk (above) is set, it can cause a lot of noise
|
|
|
|
on the console. It may be useful to only include the
|
|
|
|
printing of events during boot up, as user space may
|
|
|
|
make the system inoperable.
|
|
|
|
|
|
|
|
This command line option will stop the printing of events
|
|
|
|
to console at the late_initcall_sync() time frame.
|
|
|
|
|
2013-06-15 04:21:43 +08:00
|
|
|
traceoff_on_warning
|
|
|
|
[FTRACE] enable this option to disable tracing when a
|
|
|
|
warning is hit. This turns off "tracing_on". Tracing can
|
|
|
|
be enabled again by echoing '1' into the "tracing_on"
|
|
|
|
file located in /sys/kernel/debug/tracing/
|
|
|
|
|
|
|
|
This option is useful, as it disables the trace before
|
|
|
|
the WARNING dump is called, which prevents the trace from
|
|
|
|
being filled with content caused by the warning output.
|
|
|
|
|
|
|
|
This option can also be set at run time via the sysctl
|
|
|
|
option: kernel/traceoff_on_warning
|
|
|
|
|
2012-03-22 07:34:02 +08:00
|
|
|
transparent_hugepage=
|
|
|
|
[KNL]
|
|
|
|
Format: [always|madvise|never]
|
|
|
|
Can be used to control the default behavior of the system
|
|
|
|
with respect to transparent hugepages.
|
2018-05-14 16:13:40 +08:00
|
|
|
See Documentation/admin-guide/mm/transhuge.rst
|
|
|
|
for more details.
|
2012-03-22 07:34:02 +08:00
|
|
|
|
2021-03-01 21:11:24 +08:00
|
|
|
trusted.source= [KEYS]
|
|
|
|
Format: <string>
|
|
|
|
This parameter identifies the trust source as a backend
|
|
|
|
for trusted keys implementation. Supported trust
|
|
|
|
sources:
|
|
|
|
- "tpm"
|
|
|
|
- "tee"
|
|
|
|
If not specified then it defaults to iterating through
|
|
|
|
the trust source list starting with TPM and assigns the
|
|
|
|
first trust source as a backend which is initialized
|
|
|
|
successfully during iteration.
|
|
|
|
|
2009-08-18 07:40:47 +08:00
|
|
|
tsc= Disable clocksource stability checks for TSC.
|
2008-10-25 08:22:01 +08:00
|
|
|
Format: <string>
|
|
|
|
[x86] reliable: mark tsc clocksource as reliable, this
|
2009-08-18 07:40:47 +08:00
|
|
|
disables clocksource verification at runtime, as well
|
|
|
|
as the stability checks done at bootup. Used to enable
|
|
|
|
high-resolution timer mode on older hardware, and in
|
|
|
|
virtualized environment.
|
2010-10-05 08:03:20 +08:00
|
|
|
[x86] noirqtime: Do not use TSC to do irq accounting.
|
|
|
|
Used to run time disable IRQ_TIME_ACCOUNTING on any
|
|
|
|
platforms where RDTSC is slow and this accounting
|
|
|
|
can add overhead.
|
2017-10-09 17:03:33 +08:00
|
|
|
[x86] unstable: mark the TSC clocksource as unstable, this
|
|
|
|
marks the TSC unconditionally unstable at bootup and
|
|
|
|
avoids any further wobbles once the TSC watchdog notices.
|
2019-03-07 20:09:13 +08:00
|
|
|
[x86] nowatchdog: disable clocksource watchdog. Used
|
|
|
|
in situations with strict latency requirements (where
|
|
|
|
interruptions from clocksource watchdog are not
|
|
|
|
acceptable).
|
2008-10-25 08:22:01 +08:00
|
|
|
|
2020-01-24 00:09:26 +08:00
|
|
|
tsc_early_khz= [X86] Skip early TSC calibration and use the given
|
|
|
|
value instead. Useful when the early TSC frequency discovery
|
|
|
|
procedure is not reliable, such as on overclocked systems
|
|
|
|
with CPUID.16h support and partial CPUID.15h support.
|
|
|
|
Format: <unsigned int>
|
|
|
|
|
2019-10-23 17:01:53 +08:00
|
|
|
tsx= [X86] Control Transactional Synchronization
|
|
|
|
Extensions (TSX) feature in Intel processors that
|
|
|
|
support TSX control.
|
|
|
|
|
|
|
|
This parameter controls the TSX feature. The options are:
|
|
|
|
|
|
|
|
on - Enable TSX on the system. Although there are
|
|
|
|
mitigations for all known security vulnerabilities,
|
|
|
|
TSX has been known to be an accelerator for
|
|
|
|
several previous speculation-related CVEs, and
|
|
|
|
so there may be unknown security risks associated
|
|
|
|
with leaving it enabled.
|
|
|
|
|
|
|
|
off - Disable TSX on the system. (Note that this
|
|
|
|
option takes effect only on newer CPUs which are
|
|
|
|
not vulnerable to MDS, i.e., have
|
|
|
|
MSR_IA32_ARCH_CAPABILITIES.MDS_NO=1 and which get
|
|
|
|
the new IA32_TSX_CTRL MSR through a microcode
|
|
|
|
update. This new MSR allows for the reliable
|
|
|
|
deactivation of the TSX functionality.)
|
|
|
|
|
2019-10-23 18:28:57 +08:00
|
|
|
auto - Disable TSX if X86_BUG_TAA is present,
|
|
|
|
otherwise enable TSX on the system.
|
|
|
|
|
2019-10-23 17:01:53 +08:00
|
|
|
Not specifying this option is equivalent to tsx=off.
|
|
|
|
|
|
|
|
See Documentation/admin-guide/hw-vuln/tsx_async_abort.rst
|
|
|
|
for more details.
|
|
|
|
|
2019-10-23 18:32:55 +08:00
|
|
|
tsx_async_abort= [X86,INTEL] Control mitigation for the TSX Async
|
|
|
|
Abort (TAA) vulnerability.
|
|
|
|
|
|
|
|
Similar to Micro-architectural Data Sampling (MDS),
|
|
|
|
certain CPUs that support Transactional
|
|
|
|
Synchronization Extensions (TSX) are vulnerable to an
|
|
|
|
exploit against CPU internal buffers which can forward
|
|
|
|
information to a disclosure gadget under certain
|
|
|
|
conditions.
|
|
|
|
|
|
|
|
In vulnerable processors, the speculatively forwarded
|
|
|
|
data can be used in a cache side channel attack, to
|
|
|
|
access data to which the attacker does not have direct
|
|
|
|
access.
|
|
|
|
|
|
|
|
This parameter controls the TAA mitigation. The
|
|
|
|
options are:
|
|
|
|
|
|
|
|
full - Enable TAA mitigation on vulnerable CPUs
|
|
|
|
if TSX is enabled.
|
|
|
|
|
|
|
|
full,nosmt - Enable TAA mitigation and disable SMT on
|
|
|
|
vulnerable CPUs. If TSX is disabled, SMT
|
|
|
|
is not disabled because CPU is not
|
|
|
|
vulnerable to cross-thread TAA attacks.
|
|
|
|
off - Unconditionally disable TAA mitigation
|
|
|
|
|
2019-11-16 00:14:44 +08:00
|
|
|
On MDS-affected machines, tsx_async_abort=off can be
|
|
|
|
prevented by an active MDS mitigation as both vulnerabilities
|
|
|
|
are mitigated with the same mechanism so in order to disable
|
|
|
|
this mitigation, you need to specify mds=off too.
|
|
|
|
|
2019-10-23 18:32:55 +08:00
|
|
|
Not specifying this option is equivalent to
|
|
|
|
tsx_async_abort=full. On CPUs which are MDS affected
|
|
|
|
and deploy MDS mitigation, TAA mitigation is not
|
|
|
|
required and doesn't provide any additional
|
|
|
|
mitigation.
|
|
|
|
|
|
|
|
For details see:
|
|
|
|
Documentation/admin-guide/hw-vuln/tsx_async_abort.rst
|
|
|
|
|
2005-10-24 03:57:11 +08:00
|
|
|
turbografx.map[2|3]= [HW,JOY]
|
|
|
|
TurboGraFX parallel port interface
|
|
|
|
Format:
|
|
|
|
<port#>,<js1>,<js2>,<js3>,<js4>,<js5>,<js6>,<js7>
|
2017-10-11 01:36:23 +08:00
|
|
|
See also Documentation/input/devices/joystick-parport.rst
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2011-05-31 23:22:05 +08:00
|
|
|
udbg-immortal [PPC] When debugging early kernel crashes that
|
2016-11-03 18:10:10 +08:00
|
|
|
happen after console_init() and before a proper
|
2011-05-31 23:22:05 +08:00
|
|
|
console driver takes over, this boot option might
|
|
|
|
help "seeing" what's going on.
|
|
|
|
|
2009-10-07 08:37:59 +08:00
|
|
|
uhash_entries= [KNL,NET]
|
|
|
|
Set number of hash buckets for UDP/UDP-Lite connections
|
|
|
|
|
2006-12-06 05:29:55 +08:00
|
|
|
uhci-hcd.ignore_oc=
|
|
|
|
[USB] Ignore overcurrent events (default N).
|
|
|
|
Some badly-designed motherboards generate lots of
|
|
|
|
bogus events, for ports that aren't wired to
|
|
|
|
anything. Set this parameter to avoid log spamming.
|
|
|
|
Note that genuine overcurrent events won't be
|
|
|
|
reported either.
|
|
|
|
|
2008-07-20 06:32:54 +08:00
|
|
|
unknown_nmi_panic
|
2011-04-05 06:02:24 +08:00
|
|
|
[X86] Cause panic on unknown NMI.
|
2008-07-20 06:32:54 +08:00
|
|
|
|
2011-06-01 03:31:08 +08:00
|
|
|
usbcore.authorized_default=
|
|
|
|
[USB] Default USB device authorization:
|
|
|
|
(default -1 = authorized except for wireless USB,
|
2019-02-17 15:21:51 +08:00
|
|
|
0 = not authorized, 1 = authorized, 2 = authorized
|
|
|
|
if device connected to internal port)
|
2011-06-01 03:31:08 +08:00
|
|
|
|
2007-02-21 04:00:53 +08:00
|
|
|
usbcore.autosuspend=
|
|
|
|
[USB] The autosuspend time delay (in seconds) used
|
|
|
|
for newly-detected USB devices (default 2). This
|
|
|
|
is the time required before an idle device will be
|
|
|
|
autosuspended. Devices for which the delay is set
|
2007-03-14 04:39:15 +08:00
|
|
|
to a negative value won't be autosuspended at all.
|
2007-02-21 04:00:53 +08:00
|
|
|
|
2008-10-10 22:24:45 +08:00
|
|
|
usbcore.usbfs_snoop=
|
|
|
|
[USB] Set to log all usbfs traffic (default 0 = off).
|
|
|
|
|
2015-11-21 02:53:22 +08:00
|
|
|
usbcore.usbfs_snoop_max=
|
|
|
|
[USB] Maximum number of bytes to snoop in each URB
|
|
|
|
(default = 65536).
|
|
|
|
|
2008-10-10 22:24:45 +08:00
|
|
|
usbcore.blinkenlights=
|
|
|
|
[USB] Set to cycle leds on hubs (default 0 = off).
|
|
|
|
|
|
|
|
usbcore.old_scheme_first=
|
|
|
|
[USB] Start with the old device initialization
|
2020-04-23 04:13:08 +08:00
|
|
|
scheme (default 0 = off).
|
2008-10-10 22:24:45 +08:00
|
|
|
|
2011-11-18 05:41:35 +08:00
|
|
|
usbcore.usbfs_memory_mb=
|
|
|
|
[USB] Memory limit (in MB) for buffers allocated by
|
|
|
|
usbfs (default = 16, 0 = max = 2047).
|
|
|
|
|
2008-10-10 22:24:45 +08:00
|
|
|
usbcore.use_both_schemes=
|
|
|
|
[USB] Try the other device initialization scheme
|
|
|
|
if the first one fails (default 1 = enabled).
|
|
|
|
|
|
|
|
usbcore.initial_descriptor_timeout=
|
|
|
|
[USB] Specifies timeout for the initial 64-byte
|
2018-04-19 02:51:39 +08:00
|
|
|
USB_REQ_GET_DESCRIPTOR request in milliseconds
|
2008-10-10 22:24:45 +08:00
|
|
|
(default 5000 = 5.0 seconds).
|
|
|
|
|
2015-12-03 22:03:32 +08:00
|
|
|
usbcore.nousb [USB] Disable the USB subsystem
|
|
|
|
|
2018-03-20 00:26:06 +08:00
|
|
|
usbcore.quirks=
|
|
|
|
[USB] A list of quirk entries to augment the built-in
|
|
|
|
usb core quirk list. List entries are separated by
|
|
|
|
commas. Each entry has the form
|
|
|
|
VendorID:ProductID:Flags. The IDs are 4-digit hex
|
|
|
|
numbers and Flags is a set of letters. Each letter
|
|
|
|
will change the built-in quirk; setting it if it is
|
|
|
|
clear and clearing it if it is set. The letters have
|
|
|
|
the following meanings:
|
|
|
|
a = USB_QUIRK_STRING_FETCH_255 (string
|
|
|
|
descriptors must not be fetched using
|
|
|
|
a 255-byte read);
|
|
|
|
b = USB_QUIRK_RESET_RESUME (device can't resume
|
|
|
|
correctly so reset it instead);
|
|
|
|
c = USB_QUIRK_NO_SET_INTF (device can't handle
|
|
|
|
Set-Interface requests);
|
|
|
|
d = USB_QUIRK_CONFIG_INTF_STRINGS (device can't
|
|
|
|
handle its Configuration or Interface
|
|
|
|
strings);
|
|
|
|
e = USB_QUIRK_RESET (device can't be reset
|
|
|
|
(e.g. morph devices), don't use reset);
|
|
|
|
f = USB_QUIRK_HONOR_BNUMINTERFACES (device has
|
|
|
|
more interface descriptions than the
|
|
|
|
bNumInterfaces count, and can't handle
|
|
|
|
talking to these interfaces);
|
|
|
|
g = USB_QUIRK_DELAY_INIT (device needs a pause
|
|
|
|
during initialization, after we read
|
|
|
|
the device descriptor);
|
|
|
|
h = USB_QUIRK_LINEAR_UFRAME_INTR_BINTERVAL (For
|
|
|
|
high speed and super speed interrupt
|
|
|
|
endpoints, the USB 2.0 and USB 3.0 spec
|
|
|
|
require the interval in microframes (1
|
|
|
|
microframe = 125 microseconds) to be
|
|
|
|
calculated as interval = 2 ^
|
|
|
|
(bInterval-1).
|
|
|
|
Devices with this quirk report their
|
|
|
|
bInterval as the result of this
|
|
|
|
calculation instead of the exponent
|
|
|
|
variable used in the calculation);
|
|
|
|
i = USB_QUIRK_DEVICE_QUALIFIER (device can't
|
|
|
|
handle device_qualifier descriptor
|
|
|
|
requests);
|
|
|
|
j = USB_QUIRK_IGNORE_REMOTE_WAKEUP (device
|
|
|
|
generates spurious wakeup, ignore
|
|
|
|
remote wakeup capability);
|
|
|
|
k = USB_QUIRK_NO_LPM (device can't handle Link
|
|
|
|
Power Management);
|
|
|
|
l = USB_QUIRK_LINEAR_FRAME_INTR_BINTERVAL
|
|
|
|
(Device reports its bInterval as linear
|
|
|
|
frames instead of the USB 2.0
|
|
|
|
calculation);
|
|
|
|
m = USB_QUIRK_DISCONNECT_SUSPEND (Device needs
|
|
|
|
to be disconnected before suspend to
|
2018-03-24 03:26:36 +08:00
|
|
|
prevent spurious wakeup);
|
|
|
|
n = USB_QUIRK_DELAY_CTRL_MSG (Device needs a
|
|
|
|
pause after every control message);
|
2018-10-19 16:14:50 +08:00
|
|
|
o = USB_QUIRK_HUB_SLOW_RESET (Hub needs extra
|
|
|
|
delay after resetting its port);
|
2018-03-20 00:26:06 +08:00
|
|
|
Example: quirks=0781:5580:bk,0a5c:5834:gij
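The entry format above can be checked by hand with a short script. The following is a minimal sketch in Python, not the kernel's own parser: the function name parse_usbcore_quirks and the strict validation are assumptions made for illustration, while the letter table is copied from the list above.

```python
import re

# Letter-to-flag table copied from the usbcore.quirks= list above.
USBCORE_QUIRK_FLAGS = {
    "a": "USB_QUIRK_STRING_FETCH_255",
    "b": "USB_QUIRK_RESET_RESUME",
    "c": "USB_QUIRK_NO_SET_INTF",
    "d": "USB_QUIRK_CONFIG_INTF_STRINGS",
    "e": "USB_QUIRK_RESET",
    "f": "USB_QUIRK_HONOR_BNUMINTERFACES",
    "g": "USB_QUIRK_DELAY_INIT",
    "h": "USB_QUIRK_LINEAR_UFRAME_INTR_BINTERVAL",
    "i": "USB_QUIRK_DEVICE_QUALIFIER",
    "j": "USB_QUIRK_IGNORE_REMOTE_WAKEUP",
    "k": "USB_QUIRK_NO_LPM",
    "l": "USB_QUIRK_LINEAR_FRAME_INTR_BINTERVAL",
    "m": "USB_QUIRK_DISCONNECT_SUSPEND",
    "n": "USB_QUIRK_DELAY_CTRL_MSG",
    "o": "USB_QUIRK_HUB_SLOW_RESET",
}

# VendorID:ProductID:Flags, with 4-digit hex IDs and one or more letters a-o.
ENTRY_RE = re.compile(r"([0-9a-fA-F]{4}):([0-9a-fA-F]{4}):([a-o]+)")


def parse_usbcore_quirks(value):
    """Split a usbcore.quirks= value into (vendor, product, [flag names])."""
    entries = []
    for entry in value.split(","):
        match = ENTRY_RE.fullmatch(entry)
        if match is None:
            # Hypothetical strictness: this sketch rejects anything malformed.
            raise ValueError(f"malformed quirk entry: {entry!r}")
        vendor, product, letters = match.groups()
        flags = [USBCORE_QUIRK_FLAGS[letter] for letter in letters]
        entries.append((int(vendor, 16), int(product, 16), flags))
    return entries


if __name__ == "__main__":
    for vendor, product, flags in parse_usbcore_quirks("0781:5580:bk,0a5c:5834:gij"):
        print(f"{vendor:04x}:{product:04x} toggles {', '.join(flags)}")
```

Note that each letter toggles the built-in quirk rather than setting it, so the decoded names only say which bits change, not their final state.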
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
usbhid.mousepoll=
|
|
|
|
[USBHID] The interval at which mice are to be polled.
|
2005-10-24 03:57:11 +08:00
|
|
|
|
2017-02-26 03:27:27 +08:00
|
|
|
usbhid.jspoll=
|
|
|
|
[USBHID] The interval at which joysticks are to be polled.
|
|
|
|
|
2018-03-22 00:28:25 +08:00
|
|
|
usbhid.kbpoll=
|
|
|
|
[USBHID] The interval at which keyboards are to be polled.
|
|
|
|
|
2008-11-11 03:07:45 +08:00
|
|
|
usb-storage.delay_use=
|
|
|
|
[UMS] The delay in seconds before a new device is
|
2014-11-04 21:00:15 +08:00
|
|
|
scanned for Logical Units (default 1).
|
2008-11-11 03:07:45 +08:00
|
|
|
|
|
|
|
usb-storage.quirks=
|
|
|
|
[UMS] A list of quirks entries to supplement or
|
|
|
|
override the built-in unusual_devs list. List
|
|
|
|
entries are separated by commas. Each entry has
|
|
|
|
the form VID:PID:Flags where VID and PID are Vendor
|
|
|
|
and Product ID values (4-digit hex numbers) and
|
|
|
|
Flags is a set of characters, each corresponding
|
|
|
|
to a common usb-storage quirk flag as follows:
|
2008-12-15 23:40:06 +08:00
|
|
|
a = SANE_SENSE (collect more than 18 bytes
|
2019-11-14 19:27:58 +08:00
|
|
|
of sense data, not on uas);
|
2009-12-08 05:39:16 +08:00
|
|
|
b = BAD_SENSE (don't collect more than 18
|
2019-11-14 19:27:58 +08:00
|
|
|
bytes of sense data, not on uas);
|
2008-11-11 03:07:45 +08:00
|
|
|
c = FIX_CAPACITY (decrease the reported
|
|
|
|
device capacity by one sector);
|
2011-05-19 04:42:34 +08:00
|
|
|
d = NO_READ_DISC_INFO (don't use
|
2019-11-14 19:27:58 +08:00
|
|
|
READ_DISC_INFO command, not on uas);
|
2011-05-19 04:42:34 +08:00
|
|
|
e = NO_READ_CAPACITY_16 (don't use
|
|
|
|
READ_CAPACITY_16 command);
|
2014-09-17 00:36:52 +08:00
|
|
|
f = NO_REPORT_OPCODES (don't use report opcodes
|
|
|
|
command, uas only);
|
2015-04-21 17:20:31 +08:00
|
|
|
g = MAX_SECTORS_240 (don't transfer more than
|
|
|
|
240 sectors at a time, uas only);
|
2008-12-15 23:40:06 +08:00
|
|
|
h = CAPACITY_HEURISTICS (decrease the
|
|
|
|
reported device capacity by one
|
|
|
|
sector if the number is odd);
|
2008-11-11 03:07:45 +08:00
|
|
|
i = IGNORE_DEVICE (don't bind to this
|
|
|
|
device);
|
2016-04-12 18:27:09 +08:00
|
|
|
j = NO_REPORT_LUNS (don't use report luns
|
|
|
|
command, uas only);
|
2020-12-09 23:26:39 +08:00
|
|
|
k = NO_SAME (do not use WRITE_SAME, uas only);
|
2008-11-11 03:07:45 +08:00
|
|
|
l = NOT_LOCKABLE (don't try to lock and
|
2019-11-14 19:27:58 +08:00
|
|
|
unlock ejectable media, not on uas);
|
2008-11-11 03:07:45 +08:00
|
|
|
m = MAX_SECTORS_64 (don't transfer more
|
2019-11-14 19:27:58 +08:00
|
|
|
than 64 sectors = 32 KB at a time,
|
|
|
|
not on uas);
|
2011-06-07 23:35:52 +08:00
|
|
|
n = INITIAL_READ10 (force a retry of the
|
2019-11-14 19:27:58 +08:00
|
|
|
initial READ(10) command, not on uas);
|
2008-12-15 23:40:06 +08:00
|
|
|
o = CAPACITY_OK (accept the capacity
|
2019-11-14 19:27:58 +08:00
|
|
|
reported by the device, not on uas);
|
2012-07-08 11:05:28 +08:00
|
|
|
p = WRITE_CACHE (the device cache is ON
|
2019-11-14 19:27:58 +08:00
|
|
|
by default, not on uas);
|
2008-11-11 03:07:45 +08:00
|
|
|
r = IGNORE_RESIDUE (the device reports
|
2019-11-14 19:27:58 +08:00
|
|
|
bogus residue values, not on uas);
|
2008-11-11 03:07:45 +08:00
|
|
|
s = SINGLE_LUN (the device has only one
|
|
|
|
Logical Unit);
|
2014-09-15 22:04:12 +08:00
|
|
|
t = NO_ATA_1X (don't allow ATA(12) and ATA(16)
|
|
|
|
commands, uas only);
|
2014-09-03 03:42:18 +08:00
|
|
|
u = IGNORE_UAS (don't bind to the uas driver);
|
2008-11-11 03:07:45 +08:00
|
|
|
w = NO_WP_DETECT (don't test whether the
|
|
|
|
medium is write-protected);
|
2016-09-12 21:19:41 +08:00
|
|
|
y = ALWAYS_SYNC (issue a SYNCHRONIZE_CACHE
|
2019-11-14 19:27:58 +08:00
|
|
|
even if the device claims no cache,
|
|
|
|
not on uas).
|
2008-11-11 03:07:45 +08:00
|
|
|
Example: quirks=0419:aaf5:rl,0421:0433:rc
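Because several flags apply only to uas and others only outside it, it can help to decode a quirks string together with those notes. The sketch below is illustrative Python, not driver code: the name describe_ums_quirks is hypothetical, and an empty scope string simply means the list above gives no uas note for that flag.

```python
# (flag name, scope note) per letter, copied from the usb-storage.quirks= list.
# An empty scope note means the list above does not mention uas for that flag.
UMS_QUIRKS = {
    "a": ("SANE_SENSE", "not on uas"),
    "b": ("BAD_SENSE", "not on uas"),
    "c": ("FIX_CAPACITY", ""),
    "d": ("NO_READ_DISC_INFO", "not on uas"),
    "e": ("NO_READ_CAPACITY_16", ""),
    "f": ("NO_REPORT_OPCODES", "uas only"),
    "g": ("MAX_SECTORS_240", "uas only"),
    "h": ("CAPACITY_HEURISTICS", ""),
    "i": ("IGNORE_DEVICE", ""),
    "j": ("NO_REPORT_LUNS", "uas only"),
    "k": ("NO_SAME", "uas only"),
    "l": ("NOT_LOCKABLE", "not on uas"),
    "m": ("MAX_SECTORS_64", "not on uas"),
    "n": ("INITIAL_READ10", "not on uas"),
    "o": ("CAPACITY_OK", "not on uas"),
    "p": ("WRITE_CACHE", "not on uas"),
    "r": ("IGNORE_RESIDUE", "not on uas"),
    "s": ("SINGLE_LUN", ""),
    "t": ("NO_ATA_1X", "uas only"),
    "u": ("IGNORE_UAS", ""),
    "w": ("NO_WP_DETECT", ""),
    "y": ("ALWAYS_SYNC", "not on uas"),
}


def describe_ums_quirks(value):
    """Map each (VID, PID) in a usb-storage.quirks= value to its decoded flags."""
    result = {}
    for entry in value.split(","):
        vid, pid, letters = entry.split(":")
        # An unknown letter raises KeyError; this sketch does not try to recover.
        result[(int(vid, 16), int(pid, 16))] = [UMS_QUIRKS[ch] for ch in letters]
    return result


if __name__ == "__main__":
    for (vid, pid), flags in describe_ums_quirks("0419:aaf5:rl,0421:0433:rc").items():
        print(f"{vid:04x}:{pid:04x}", flags)
```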
|
|
|
|
|
2011-08-14 03:34:50 +08:00
|
|
|
user_debug= [KNL,ARM]
|
|
|
|
Format: <int>
|
|
|
|
See arch/arm/Kconfig.debug help text.
|
|
|
|
1 - undefined instruction events
|
|
|
|
2 - system calls
|
|
|
|
4 - invalid data aborts
|
|
|
|
8 - SIGSEGV faults
|
|
|
|
16 - SIGBUS faults
|
|
|
|
Example: user_debug=31
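The example value 31 equals 1 + 2 + 4 + 8 + 16, which suggests the values combine as a bit mask. Under that assumption, a value can be decoded with a few lines of Python; decode_user_debug is a hypothetical helper name, not anything the kernel provides.

```python
# Bit values copied from the user_debug= list above.
USER_DEBUG_BITS = {
    1: "undefined instruction events",
    2: "system calls",
    4: "invalid data aborts",
    8: "SIGSEGV faults",
    16: "SIGBUS faults",
}


def decode_user_debug(value):
    """Return the description of every bit that is set in a user_debug= value."""
    # Assumption: the listed values are independent bits that may be OR-ed.
    return [name for bit, name in USER_DEBUG_BITS.items() if value & bit]


if __name__ == "__main__":
    for name in decode_user_debug(31):
        print(name)
```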
|
|
|
|
|
2010-02-17 18:38:10 +08:00
|
|
|
userpte=
|
|
|
|
[X86] Flags controlling user PTE allocations.
|
|
|
|
|
|
|
|
nohigh = do not allocate PTE pages in
|
|
|
|
HIGHMEM regardless of setting
|
|
|
|
of CONFIG_HIGHPTE.
|
|
|
|
|
2009-04-14 16:33:43 +08:00
|
|
|
vdso= [X86,SH]
|
2014-03-14 07:01:26 +08:00
|
|
|
On X86_32, this is an alias for vdso32=. Otherwise:
|
|
|
|
|
|
|
|
vdso=1: enable VDSO (the default)
|
2006-06-27 17:53:50 +08:00
|
|
|
vdso=0: disable VDSO mapping
|
|
|
|
|
2014-03-14 07:01:26 +08:00
|
|
|
vdso32= [X86] Control the 32-bit vDSO
|
|
|
|
vdso32=1: enable 32-bit VDSO
|
|
|
|
vdso32=0 or vdso32=2: disable 32-bit VDSO
|
|
|
|
|
|
|
|
See the help text for CONFIG_COMPAT_VDSO for more
|
|
|
|
details. If CONFIG_COMPAT_VDSO is set, the default is
|
|
|
|
vdso32=0; otherwise, the default is vdso32=1.
|
|
|
|
|
|
|
|
For compatibility with older kernels, vdso32=2 is an
|
|
|
|
alias for vdso32=0.
|
|
|
|
|
|
|
|
Try vdso32=0 if you encounter an error that says:
|
|
|
|
dl_main: Assertion `(void *) ph->p_vaddr == _rtld_local._dl_sysinfo_dso' failed!
|
2008-01-30 20:30:43 +08:00
|
|
|
|
2007-07-17 20:22:55 +08:00
|
|
|
vector= [IA-64,SMP]
|
|
|
|
vector=percpu: enable percpu vector domain
|
|
|
|
|
2005-04-17 06:20:36 +08:00
|
|
|
video= [FB] Frame buffer configuration
|
2019-06-13 01:52:45 +08:00
|
|
|
See Documentation/fb/modedb.rst.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2013-06-20 15:08:55 +08:00
|
|
|
video.brightness_switch_enabled= [0,1]
|
|
|
|
If set to 1, on receiving an ACPI notify event
|
|
|
|
generated by hotkey, video driver will adjust brightness
|
|
|
|
level and then send out the event to user space through
|
|
|
|
the allocated input device. If set to 0, video driver
|
|
|
|
will only send out the event without touching backlight
|
|
|
|
brightness level.
|
2014-07-15 01:35:45 +08:00
|
|
|
default: 1
|
2013-06-20 15:08:55 +08:00
|
|
|
|
2012-05-10 01:30:16 +08:00
|
|
|
virtio_mmio.device=
|
|
|
|
[VMMIO] Memory mapped virtio (platform) device.
|
|
|
|
|
|
|
|
<size>@<baseaddr>:<irq>[:<id>]
|
|
|
|
where:
|
|
|
|
<size> := size (can use standard suffixes
|
|
|
|
like K, M and G)
|
|
|
|
<baseaddr> := physical base address
|
|
|
|
<irq> := interrupt number (as passed to
|
|
|
|
request_irq())
|
|
|
|
<id> := (optional) platform device id
|
|
|
|
example:
|
|
|
|
virtio_mmio.device=1K@0x100b0000:48:7
|
|
|
|
|
|
|
|
Can be used multiple times for multiple devices.
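To make the field layout concrete, the sketch below splits a device string into its parts. It is illustrative Python under stated assumptions: the helper name parse_virtio_mmio_device is hypothetical, the K, M and G suffixes are treated as binary multiples (1024-based), and only decimal or 0x-prefixed numbers are accepted, which may be narrower than what the kernel allows.

```python
import re

# Assumption: K, M and G are binary multiples.
SUFFIX = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}

# <size>@<baseaddr>:<irq>[:<id>]
DEVICE_RE = re.compile(
    r"(?P<size>\d+)(?P<suffix>[KMG]?)"
    r"@(?P<base>0[xX][0-9a-fA-F]+|\d+)"
    r":(?P<irq>\d+)"
    r"(?::(?P<id>\d+))?"
)


def parse_virtio_mmio_device(value):
    """Split a virtio_mmio.device= value into size, base address, irq and id."""
    m = DEVICE_RE.fullmatch(value)
    if m is None:
        raise ValueError(f"malformed virtio_mmio.device value: {value!r}")
    base = m["base"]
    return {
        "size": int(m["size"]) * SUFFIX[m["suffix"]],
        "baseaddr": int(base, 16) if base.lower().startswith("0x") else int(base),
        "irq": int(m["irq"]),
        # The platform device id is optional; None means it was not given.
        "id": int(m["id"]) if m["id"] is not None else None,
    }


if __name__ == "__main__":
    print(parse_virtio_mmio_device("1K@0x100b0000:48:7"))
```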
|
|
|
|
|
2007-07-31 15:37:59 +08:00
|
|
|
vga= [BOOT,X86-32] Select a particular video mode
|
2019-06-08 02:54:32 +08:00
|
|
|
See Documentation/x86/boot.rst and
|
2019-06-28 01:56:51 +08:00
|
|
|
Documentation/admin-guide/svga.rst.
|
2005-04-17 06:20:36 +08:00
|
|
|
Use vga=ask for menu.
|
|
|
|
This is actually a boot loader parameter; the value is
|
|
|
|
passed to the kernel using a special protocol.
|
|
|
|
|
2018-10-27 06:07:45 +08:00
|
|
|
vm_debug[=options] [KNL] Available with CONFIG_DEBUG_VM=y.
|
|
|
|
May slow down system boot speed, especially when
|
|
|
|
enabled on systems with a large amount of memory.
|
|
|
|
All options are enabled by default, and this
|
|
|
|
interface is meant to allow for selectively
|
|
|
|
enabling or disabling specific virtual memory
|
|
|
|
debugging features.
|
|
|
|
|
|
|
|
Available options are:
|
|
|
|
P Enable page structure init time poisoning
|
|
|
|
- Disable all of the above options
|
|
|
|
|
2005-10-24 03:57:11 +08:00
|
|
|
vmalloc=nn[KMG] [KNL,BOOT] Forces the vmalloc area to have an exact
|
2005-04-17 06:20:36 +08:00
|
|
|
size of <nn>. This can be used to increase the
|
|
|
|
minimum size (128MB on x86). It can also be used to
|
|
|
|
decrease the size and leave more room for directly
|
|
|
|
mapped kernel RAM.
|
|
|
|
|
2017-08-07 21:16:15 +08:00
|
|
|
vmcp_cma=nn[MG] [KNL,S390]
|
|
|
|
Sets the memory size reserved for contiguous memory
|
|
|
|
allocations for the vmcp device driver.
|
|
|
|
|
2006-06-29 21:08:25 +08:00
|
|
|
vmhalt= [KNL,S390] Perform z/VM CP command after system halt.
|
|
|
|
Format: <command>
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2006-06-29 21:08:25 +08:00
|
|
|
vmpanic= [KNL,S390] Perform z/VM CP command after kernel panic.
|
|
|
|
Format: <command>
|
|
|
|
|
|
|
|
vmpoff= [KNL,S390] Perform z/VM CP command after power off.
|
|
|
|
Format: <command>
|
2005-10-24 03:57:11 +08:00
|
|
|
|
2011-08-10 23:15:32 +08:00
|
|
|
vsyscall= [X86-64]
|
|
|
|
Controls the behavior of vsyscalls (i.e. calls to
|
|
|
|
fixed addresses of 0xffffffffff600x00 from legacy
|
|
|
|
code). Most statically-linked binaries and older
|
|
|
|
versions of glibc use these calls. Because these
|
|
|
|
functions are at fixed addresses, they make nice
|
|
|
|
targets for exploits that can control RIP.
|
|
|
|
|
2011-11-08 08:33:41 +08:00
|
|
|
emulate [default] Vsyscalls turn into traps and are
|
2019-06-27 12:45:03 +08:00
|
|
|
emulated reasonably safely. The vsyscall
|
|
|
|
page is readable.
|
2011-08-10 23:15:32 +08:00
|
|
|
|
2019-06-27 12:45:03 +08:00
|
|
|
xonly Vsyscalls turn into traps and are
|
|
|
|
emulated reasonably safely. The vsyscall
|
|
|
|
page is not readable.
|
2011-08-10 23:15:32 +08:00
|
|
|
|
|
|
|
none Vsyscalls don't work at all. This makes
|
|
|
|
them quite hard to use for exploits but
|
|
|
|
might break your system.
|
|
|
|
|
2013-08-04 19:09:50 +08:00
|
|
|
vt.color= [VT] Default text color.
|
|
|
|
Format: 0xYX, X = foreground, Y = background.
|
|
|
|
Default: 0x07 = light gray on black.
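The 0xYX layout packs two color indices into one byte. As a small Python sketch, assuming X and Y are one hex digit each (the function names are hypothetical), a value can be split and rebuilt like this:

```python
def split_vt_color(value):
    """Split a vt.color= value 0xYX into (foreground X, background Y)."""
    if not 0 <= value <= 0xFF:
        raise ValueError("vt.color is expected to fit in two hex digits")
    foreground = value & 0x0F          # low digit X
    background = (value >> 4) & 0x0F   # high digit Y
    return foreground, background


def make_vt_color(foreground, background):
    """Combine two 0-15 color indices into a vt.color= value."""
    if not (0 <= foreground <= 15 and 0 <= background <= 15):
        raise ValueError("color indices must be in the range 0-15")
    return (background << 4) | foreground


if __name__ == "__main__":
    fg, bg = split_vt_color(0x07)
    print(f"foreground={fg} background={bg}")
    print(f"vt.color=0x{make_vt_color(fg, bg):02x}")
```

For the default 0x07 this gives foreground 7 and background 0, matching "light gray on black".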
|
|
|
|
|
2009-12-16 08:45:39 +08:00
|
|
|
vt.cur_default= [VT] Default cursor shape.
|
|
|
|
Format: 0xCCBBAA, where AA, BB, and CC are the same as
|
|
|
|
the parameters of the <Esc>[?A;B;Cc escape sequence;
|
|
|
|
see VGA-softcursor.txt. Default: 2 = underline.
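The 0xCCBBAA layout places AA in the lowest byte. A minimal Python sketch of the split follows; split_cur_default is a hypothetical name, and the meaning of each field is left to VGA-softcursor.txt.

```python
def split_cur_default(value):
    """Split a vt.cur_default= value 0xCCBBAA into its AA, BB and CC bytes."""
    if not 0 <= value <= 0xFFFFFF:
        raise ValueError("vt.cur_default is expected to fit in three bytes")
    return {
        "AA": value & 0xFF,          # lowest byte
        "BB": (value >> 8) & 0xFF,   # middle byte
        "CC": (value >> 16) & 0xFF,  # highest byte
    }


if __name__ == "__main__":
    print(split_cur_default(2))
```

With the default value 2, only AA is non-zero, which corresponds to the underline cursor named above.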
|
|
|
|
|
2009-04-06 06:55:22 +08:00
|
|
|
vt.default_blu= [VT]
|
|
|
|
Format: <blue0>,<blue1>,<blue2>,...,<blue15>
|
|
|
|
Change the default blue palette of the console.
|
|
|
|
This is a 16-member array composed of values
|
|
|
|
ranging from 0-255.
|
|
|
|
|
|
|
|
vt.default_grn= [VT]
|
|
|
|
Format: <green0>,<green1>,<green2>,...,<green15>
|
|
|
|
Change the default green palette of the console.
|
|
|
|
This is a 16-member array composed of values
|
|
|
|
ranging from 0-255.
|
|
|
|
|
|
|
|
vt.default_red= [VT]
|
|
|
|
Format: <red0>,<red1>,<red2>,...,<red15>
|
|
|
|
Change the default red palette of the console.
|
|
|
|
This is a 16-member array composed of values
|
|
|
|
ranging from 0-255.
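Since the palette is given as three parallel 16-member arrays, a full set of RGB colors has to be split per channel. The Python sketch below shows one way to do that; palette_to_vt_params is a hypothetical helper, and writing the values in decimal is an assumption.

```python
def palette_to_vt_params(palette):
    """Turn 16 (red, green, blue) tuples into the three vt.default_* parameters."""
    if len(palette) != 16:
        raise ValueError("the console palette needs exactly 16 entries")
    for rgb in palette:
        if len(rgb) != 3 or any(not 0 <= c <= 255 for c in rgb):
            raise ValueError(f"bad palette entry: {rgb!r}")
    # zip(*palette) regroups the data as all reds, all greens, all blues.
    columns = list(zip(*palette))
    names = ("vt.default_red", "vt.default_grn", "vt.default_blu")
    return [f"{n}={','.join(str(v) for v in col)}" for n, col in zip(names, columns)]


if __name__ == "__main__":
    # Placeholder gray ramp, used here only to show the output shape.
    ramp = [(i * 17, i * 17, i * 17) for i in range(16)]
    print(" ".join(palette_to_vt_params(ramp)))
```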
|
|
|
|
|
|
|
|
vt.default_utf8=
|
|
|
|
[VT]
|
|
|
|
Format: <0|1>
|
|
|
|
Set system-wide default UTF-8 mode for all tty's.
|
|
|
|
Default is 1, i.e. UTF-8 mode is enabled for all
|
|
|
|
newly opened terminals.
|
|
|
|
|
2009-11-14 04:14:11 +08:00
|
|
|
vt.global_cursor_default=
|
|
|
|
[VT]
|
|
|
|
Format: <-1|0|1>
|
|
|
|
Set system-wide default for whether a cursor
|
|
|
|
is shown on new VTs. Default is -1,
|
|
|
|
i.e. cursors will be created by default unless
|
|
|
|
overridden by individual drivers. 0 will hide
|
|
|
|
cursors, 1 will display them.
|
|
|
|
|
2013-08-04 19:09:50 +08:00
|
|
|
vt.italic= [VT] Default color for italic text; 0-15.
|
|
|
|
Default: 2 = green.
|
|
|
|
|
|
|
|
vt.underline= [VT] Default color for underlined text; 0-15.
|
|
|
|
Default: 3 = cyan.
|
|
|
|
|
2010-05-04 02:42:52 +08:00
|
|
|
watchdog timers [HW,WDT] For information on watchdog timers,
|
2019-06-13 01:53:01 +08:00
|
|
|
see Documentation/watchdog/watchdog-parameters.rst
|
2010-05-04 02:42:52 +08:00
|
|
|
or other driver-specific files in the
|
|
|
|
Documentation/watchdog/ directory.
|
2005-04-17 06:20:36 +08:00
|
|
|
|
2018-11-01 21:30:18 +08:00
|
|
|
watchdog_thresh=
|
|
|
|
[KNL]
|
|
|
|
Set the hard lockup detector stall duration
|
|
|
|
threshold in seconds. The soft lockup detector
|
|
|
|
threshold is set to twice the value. A value of 0
|
|
|
|
disables both lockup detectors. Default is 10
|
|
|
|
seconds.
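The relationship between the two thresholds is simple arithmetic. As a sketch in Python (lockup_thresholds and its dictionary keys are hypothetical names, and negative values are rejected only for the sake of the example):

```python
def lockup_thresholds(watchdog_thresh=10):
    """Return the hard and soft lockup thresholds in seconds, or None if disabled."""
    if watchdog_thresh < 0:
        raise ValueError("watchdog_thresh must not be negative in this sketch")
    if watchdog_thresh == 0:
        # A value of 0 disables both lockup detectors.
        return None
    return {
        "hard_lockup_s": watchdog_thresh,
        "soft_lockup_s": 2 * watchdog_thresh,  # soft threshold is twice the value
    }


if __name__ == "__main__":
    print(lockup_thresholds())
    print(lockup_thresholds(0))
```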
|
|
|
|
|
2015-12-09 00:28:04 +08:00
|
|
|
workqueue.watchdog_thresh=
|
|
|
|
If CONFIG_WQ_WATCHDOG is configured, workqueue can
|
|
|
|
warn about stall conditions and dump internal state to
|
|
|
|
help debugging. 0 disables workqueue stall
|
|
|
|
detection; otherwise, it's the stall threshold
|
|
|
|
duration in seconds. The default value is 30 and
|
|
|
|
it can be updated at runtime by writing to the
|
|
|
|
corresponding sysfs file.
|
|
|
|
|
2013-04-02 02:23:38 +08:00
|
|
|
workqueue.disable_numa
|
|
|
|
By default, all work items queued to unbound
|
|
|
|
workqueues are affine to the NUMA nodes they're
|
|
|
|
issued on, which results in better behavior in
|
|
|
|
general. If NUMA affinity needs to be disabled for
|
|
|
|
whatever reason, this option can be used. Note
|
|
|
|
that this also can be controlled per-workqueue for
|
|
|
|
workqueues visible under /sys/bus/workqueue/.
|
|
|
|
|
2013-04-08 19:15:40 +08:00
|
|
|
workqueue.power_efficient
|
|
|
|
Per-cpu workqueues are generally preferred because
|
|
|
|
they show better performance thanks to cache
|
|
|
|
locality; unfortunately, per-cpu workqueues tend to
|
|
|
|
be more power hungry than unbound workqueues.
|
|
|
|
|
|
|
|
Enabling this makes the per-cpu workqueues which
|
|
|
|
were observed to contribute significantly to power
|
|
|
|
consumption unbound, leading to measurably lower
|
|
|
|
power usage at the cost of small performance
|
|
|
|
overhead.
|
|
|
|
|
|
|
|
The default value of this parameter is determined by
|
|
|
|
the config option CONFIG_WQ_POWER_EFFICIENT_DEFAULT.
|
|
|
|
|
2016-02-10 06:59:38 +08:00
|
|
|
workqueue.debug_force_rr_cpu
|
|
|
|
Workqueue used to implicitly guarantee that work
|
|
|
|
items queued without explicit CPU specified are put
|
|
|
|
on the local CPU. This guarantee is no longer true
|
|
|
|
and while local CPU is still preferred work items
|
|
|
|
may be put on foreign CPUs. This debug option
|
|
|
|
forces round-robin CPU selection to flush out
|
|
|
|
usages which depend on the now broken guarantee.
|
|
|
|
When enabled, memory and cache locality will be
|
|
|
|
impacted.
|
|
|
|
|
2009-04-06 06:55:22 +08:00
|
|
|
x2apic_phys [X86-64,APIC] Use x2apic physical mode instead of
|
|
|
|
default x2apic cluster mode on platforms
|
|
|
|
supporting x2apic.
|
|
|
|
|
2015-07-17 12:51:36 +08:00
|
|
|
xen_512gb_limit [KNL,X86-64,XEN]
|
|
|
|
Restricts the kernel running paravirtualized under Xen
|
|
|
|
to use only up to 512 GB of RAM. The reason to do so is that
|
|
|
|
crash analysis tools and Xen tools for doing domain
|
|
|
|
save/restore/migration must be enabled to handle larger
|
|
|
|
domains.
|
|
|
|
|
2010-05-14 19:44:30 +08:00
|
|
|
xen_emul_unplug= [HW,X86,XEN]
	Unplug Xen emulated devices
	Format: [unplug0,][unplug1]
	ide-disks -- unplug primary master IDE devices
	aux-ide-disks -- unplug non-primary-master IDE devices
	nics -- unplug network devices
	all -- unplug all emulated devices (NICs and IDE disks)
	unnecessary -- unplugging emulated devices is
		unnecessary even if the host did not respond to
		the unplug protocol
	never -- do not unplug even if version check succeeds

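As a rough illustration of the xen_emul_unplug= format, the following Python sketch extracts the comma-separated option list from a kernel command line string and checks each option against the names listed in the entry. The function name `parse_xen_emul_unplug` is hypothetical, and returning only the first occurrence of the parameter is a simplifying assumption of this sketch, not a statement about how the kernel handles repeated parameters.

```python
# Option names copied from the xen_emul_unplug= entry in this document.
VALID_UNPLUG_OPTIONS = {
    "ide-disks", "aux-ide-disks", "nics", "all", "unnecessary", "never",
}


def parse_xen_emul_unplug(cmdline):
    """Return the xen_emul_unplug options found on a command line.

    Returns None when the parameter is absent. Only the first occurrence
    is considered (an assumption made to keep the sketch short).
    """
    for token in cmdline.split():
        if token.startswith("xen_emul_unplug="):
            value = token.split("=", 1)[1]
            # Drop empty items so a trailing comma is tolerated.
            options = [opt for opt in value.split(",") if opt]
            unknown = [opt for opt in options if opt not in VALID_UNPLUG_OPTIONS]
            if unknown:
                raise ValueError("unknown unplug option(s): " + ", ".join(unknown))
            return options
    return None
```

For example, a command line containing "xen_emul_unplug=ide-disks,nics" would yield the two options "ide-disks" and "nics".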
xen_legacy_crash [X86,XEN]
	Crash from Xen panic notifier, without executing late
	panic() code such as dumping handler.

xen_nopvspin [X86,XEN]
	Disables the qspinlock slowpath using Xen PV optimizations.
	This parameter is obsoleted by the "nopvspin" parameter,
	which has an equivalent effect on the XEN platform.

xen_nopv [X86]
	Disables the PV optimizations, forcing the HVM guest to
	run as a generic HVM guest with no PV drivers.
	This option is obsoleted by the "nopv" option, which
	has an equivalent effect on the XEN platform.

xen_no_vector_callback
	[KNL,X86,XEN] Disable the vector callback for Xen
	event channel interrupts.

xen_scrub_pages= [XEN]
	Boolean option to control scrubbing pages before giving them back
	to Xen, for use by other domains. Can also be changed at runtime
	with /sys/devices/system/xen_memory/xen_memory0/scrub_pages.
	Default value controlled with CONFIG_XEN_SCRUB_PAGES_DEFAULT.

xen_timer_slop= [X86-64,XEN]
	Set the timer slop (in nanoseconds) for the virtual Xen
	timers (default is 100000). This adjusts the minimum
	delta of virtualized Xen timers, where lower values
	improve timer resolution at the expense of processing
	more timer interrupts.

xen.event_eoi_delay= [XEN]
	How long to delay EOI handling in case of event
	storms (jiffies). Default is 10.

xen.event_loop_timeout= [XEN]
	Time (in jiffies) after which the event handling loop
	should start to delay EOI handling. Default is 2.

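Both xen.event_eoi_delay= and xen.event_loop_timeout= are expressed in jiffies, so their wall-clock meaning depends on the kernel tick rate. The Python sketch below shows the conversion under the assumption that the tick rate (HZ, set by the kernel configuration) is known; the helper name `jiffies_to_milliseconds` and the sample HZ values are illustrative only.

```python
def jiffies_to_milliseconds(jiffies, hz):
    """Convert a jiffies count to milliseconds for a given tick rate.

    `hz` is the number of jiffies per second; it is an input here
    because this document does not fix its value.
    """
    if hz <= 0:
        raise ValueError("hz must be positive")
    return jiffies * 1000 / hz


# Documented defaults: event_eoi_delay=10, event_loop_timeout=2.
for hz in (100, 250, 1000):  # sample tick rates, assumed for illustration
    delay_ms = jiffies_to_milliseconds(10, hz)
    timeout_ms = jiffies_to_milliseconds(2, hz)
    print(f"HZ={hz}: eoi_delay={delay_ms} ms, loop_timeout={timeout_ms} ms")
```

With a tick rate of 250, for instance, the default delay of 10 jiffies corresponds to 40 ms and the default loop timeout of 2 jiffies to 8 ms.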
xen.fifo_events= [XEN]
	Boolean parameter to disable using fifo event handling
	even if available. Normally fifo event handling is
	preferred over the 2-level event handling, as it is
	fairer and the number of possible event channels is
	much higher. Default is on (use fifo events).

nopv= [X86,XEN,KVM,HYPER_V,VMWARE]
	Disables the PV optimizations, forcing the guest to run
	as a generic guest with no PV drivers. Currently supports
	XEN HVM, KVM, HYPER_V and VMWARE guests.

nopvspin [X86,XEN,KVM]
	Disables the qspinlock slow path using PV optimizations
	which allow the hypervisor to 'idle' the guest on lock
	contention.

xirc2ps_cs= [NET,PCMCIA]
	Format:
	<irq>,<irq_mask>,<io>,<full_duplex>,<do_sound>,<lockup_hack>[,<irq2>[,<irq3>[,<irq4>]]]

xive= [PPC]
	By default on POWER9 and above, the kernel will
	natively use the XIVE interrupt controller. This option
	allows the fallback firmware mode to be used:

	off	Fallback to firmware control of XIVE interrupt
		controller on both pseries and powernv
		platforms. Only useful on POWER9 and above.

xhci-hcd.quirks [USB,KNL]
	A hex value specifying a bitmask of supplemental xhci
	host controller quirks. The meaning of each bit is
	defined in the header drivers/usb/host/xhci.h.

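Because xhci-hcd.quirks takes a hex bitmask, it can help to list which bit positions a given value sets before looking them up in drivers/usb/host/xhci.h. The Python sketch below does only that decoding step; the helper name `set_bit_positions` is hypothetical, and no quirk names are reproduced here because they must be taken from the header of the kernel version in use.

```python
def set_bit_positions(hex_value):
    """Return the bit positions set in a hex bitmask string.

    Accepts values with or without a leading "0x", e.g. "0x81" or "81".
    """
    mask = int(hex_value, 16)
    if mask < 0:
        raise ValueError("bitmask must not be negative")
    return [bit for bit in range(mask.bit_length()) if mask & (1 << bit)]


# Example value chosen for illustration only.
print(set_bit_positions("0x81"))
```

Each reported position then has to be matched by hand against the corresponding quirk definition in the header.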
xmon [PPC]
	Format: { early | on | rw | ro | off }
	Controls if xmon debugger is enabled. Default is off.
	Passing only "xmon" is equivalent to "xmon=early".
	early	Call xmon as early as possible on boot; xmon
		debugger is called from setup_arch().
	on	xmon debugger hooks will be installed so xmon
		is only called on a kernel crash. Default mode,
		i.e. either "ro" or "rw" mode, is controlled
		with CONFIG_XMON_DEFAULT_RO_MODE.
	rw	xmon debugger hooks will be installed so xmon
		is called only on a kernel crash; the mode is
		write, meaning SPR registers, memory, and other
		data can be written using xmon commands.
	ro	same as "rw" option above but SPR registers,
		memory, and other data can't be written using
		xmon commands.
	off	xmon is disabled.

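The xmon rules above (bare "xmon" means "early", the default is off, and only five values are accepted) can be summarized in a short Python sketch. The function name `xmon_setting` is hypothetical, and letting the last occurrence on the command line win is an assumption of this sketch rather than documented behavior.

```python
# Accepted values, copied from the Format line of the xmon entry.
XMON_VALUES = ("early", "on", "rw", "ro", "off")


def xmon_setting(cmdline):
    """Return the effective xmon value for a kernel command line string.

    A bare "xmon" maps to "early"; an absent parameter maps to "off",
    the default stated in the entry. If the parameter appears more than
    once, the last occurrence wins (an assumption of this sketch).
    """
    setting = "off"
    for token in cmdline.split():
        if token == "xmon":
            setting = "early"
        elif token.startswith("xmon="):
            value = token.split("=", 1)[1]
            if value not in XMON_VALUES:
                raise ValueError("unsupported xmon value: " + value)
            setting = value
    return setting
```

For instance, a command line ending in "xmon" resolves to "early", while one without any xmon token resolves to "off".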