Kernel

Found a nice article on https://capsule8.com by Theofilos Petsios, and I'm sharing it here for me and you =)

In our post “Millions of Binaries Later: a Look Into Linux Hardening in the Wild”, we examined the security properties of different distributions. In the following, we provide a glossary for the security-relevant kernel configuration options discussed in that post (scraped from the Linux Kernel Driver Database).
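To see which of the options in this glossary a given kernel was built with, you can parse its build configuration. Here is a minimal sketch, assuming the configuration is available as plain text (many distributions ship it as /boot/config-<version>, but that is not guaranteed). The helper names `parse_kconfig` and `audit` are made up for this example.

```python
import re


def parse_kconfig(text):
    """Return {option: value} for kernel .config-style text.

    Lines of the form '# CONFIG_FOO is not set' are recorded as 'n'.
    """
    options = {}
    for line in text.splitlines():
        line = line.strip()
        unset = re.fullmatch(r"# (CONFIG_\w+) is not set", line)
        if unset:
            options[unset.group(1)] = "n"
            continue
        assigned = re.fullmatch(r"(CONFIG_\w+)=(.*)", line)
        if assigned:
            options[assigned.group(1)] = assigned.group(2).strip('"')
    return options


def audit(options, wanted):
    """Return the options whose value differs from the wanted one.

    Options missing from the config are treated as 'n'.
    """
    return {name: options.get(name, "n")
            for name, value in wanted.items()
            if options.get(name, "n") != value}


# Hypothetical config fragment, for illustration only.
sample = """\
CONFIG_X86_SMAP=y
# CONFIG_DEVKMEM is not set
CONFIG_RANDOMIZE_BASE=y
CONFIG_DEFAULT_MMAP_MIN_ADDR=65536
"""
opts = parse_kconfig(sample)
mismatches = audit(opts, {"CONFIG_X86_SMAP": "y",
                          "CONFIG_DEVKMEM": "n",
                          "CONFIG_STRICT_DEVMEM": "y"})
print(mismatches)
```

On a real system you would replace `sample` with the contents of the kernel's config file.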

Option

Description

Significance

CONFIG_X86_SMAP

Supervisor Mode Access Prevention (SMAP) is a security feature in newer Intel processors. There is a small performance cost if this is enabled and turned on; there is also a small increase in the kernel size if this is enabled.

Critical
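SMAP only helps if the processor supports it. On x86 Linux, supported CPU features show up as names on the "flags" line of /proc/cpuinfo ("smap" for SMAP, and likewise "umip" for the UMIP feature described further down). A small sketch of that lookup follows; `cpu_flags` is a made-up helper name and the sample text is shortened.

```python
def cpu_flags(cpuinfo_text):
    """Collect the flag names from the first 'flags' line of /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


# Shortened, hypothetical /proc/cpuinfo excerpt.
sample = "processor\t: 0\nflags\t\t: fpu vme smep smap umip\n"
flags = cpu_flags(sample)
print("smap" in flags, "umip" in flags)
```

To check the running machine, pass in `open("/proc/cpuinfo").read()` instead of `sample`.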

CONFIG_STRICT_KERNEL_RWX

If this is set, kernel text and rodata memory will be made read-only, and non-text memory will be made non-executable. This provides protection against certain security exploits (e.g. executing the heap or modifying text).

Critical

CONFIG_RANDOMIZE_BASE

In support of Kernel Address Space Layout Randomization (KASLR), this randomizes the physical address at which the kernel image is decompressed and the virtual address where the kernel image is mapped, as a security feature that deters exploit attempts relying on knowledge of the location of kernel code internals.

Critical
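Even with CONFIG_RANDOMIZE_BASE compiled in, KASLR can be switched off at boot with the `nokaslr` kernel parameter, so it is worth checking the boot command line as well. A minimal sketch, assuming the command line text comes from /proc/cmdline; the function name is made up.

```python
def kaslr_disabled_on_cmdline(cmdline):
    """True if the boot command line contains the standalone 'nokaslr' token."""
    return "nokaslr" in cmdline.split()


# Hypothetical command lines, for illustration only.
print(kaslr_disabled_on_cmdline("BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet"))
print(kaslr_disabled_on_cmdline("BOOT_IMAGE=/vmlinuz root=/dev/sda1 nokaslr"))
```

Splitting on whitespace avoids false matches on longer parameters that merely contain the substring.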

CONFIG_RANDOMIZE_MEMORY

Randomizes the base virtual address of kernel memory sections (physical memory mapping, vmalloc & vmemmap). This security feature makes exploits relying on predictable memory locations less reliable.

Critical

CONFIG_STACKPROTECTOR_STRONG

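Functions will have the stack-protector canary logic added in any of the following conditions: a local variable's address is used as part of the right-hand side of an assignment or as a function argument; a local variable is an array (or a union containing an array), regardless of array type or length; or the function uses register local variables.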

Critical

CONFIG_HARDENED_USERCOPY

This option checks for obviously wrong memory regions when copying memory to/from the kernel (via copy_to_user() and copy_from_user() functions) by rejecting memory ranges that are larger than the specified heap object, span multiple separately allocated pages, are not on the process stack, or are part of the kernel text. This kills entire classes of heap overflow exploits and similar kernel memory exposures.

Critical

CONFIG_LOCK_DOWN_KERNEL

Critical

CONFIG_STRICT_MODULE_RWX

If this is set, module text and rodata memory will be made read-only, and non-text memory will be made non-executable. This provides protection against certain security exploits (e.g. writing to text).

Critical

CONFIG_SECURITY

This allows you to choose different security modules to be configured into your kernel.

Critical

CONFIG_SECCOMP

This kernel feature is useful for number crunching applications that may need to compute untrusted bytecode during their execution. By using pipes or other transports made available to the process as file descriptors supporting the read/write syscalls, it's possible to isolate those applications in their own address space using seccomp. Once seccomp is enabled via prctl(PR_SET_SECCOMP), it cannot be disabled and the task is only allowed to execute a few safe syscalls defined by each seccomp mode.

Critical
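Whether a given process is actually running under seccomp can be read from the "Seccomp:" field of /proc/<pid>/status, where 0 means disabled, 1 means strict mode and 2 means filter mode. A small sketch of decoding that field follows; the helper name and the sample status text are made up.

```python
SECCOMP_MODES = {0: "disabled", 1: "strict", 2: "filter"}


def seccomp_mode(status_text):
    """Return the seccomp mode name from /proc/<pid>/status text, or None."""
    for line in status_text.splitlines():
        if line.startswith("Seccomp:"):
            # The value follows the colon, separated by whitespace.
            return SECCOMP_MODES.get(int(line.split(":", 1)[1]))
    return None


# Shortened, hypothetical status excerpt.
sample = "Name:\tworker\nState:\tS (sleeping)\nSeccomp:\t2\n"
print(seccomp_mode(sample))
```

For the current process, the input would be `open("/proc/self/status").read()`. The field is absent on kernels built without CONFIG_SECCOMP, in which case the function returns None.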

CONFIG_STRICT_DEVMEM

If this option is disabled, you allow userspace (root) access to all of memory, including kernel and userspace memory. Accidental access to this is obviously disastrous, but specific access can be used by people debugging the kernel.

Critical

CONFIG_DEVKMEM

Say Y here if you want to support the /dev/kmem device. The /dev/kmem device is rarely used, but can be used for certain kinds of kernel debugging operations. When in doubt, say "N".

Critical

CONFIG_X86_INTEL_UMIP

The User Mode Instruction Prevention (UMIP) is a security feature in newer Intel processors. If enabled, a general protection fault is issued if the SGDT, SLDT, SIDT, SMSW or STR instructions are executed in user mode. These instructions unnecessarily expose information about the hardware state.

High

CONFIG_VMAP_STACK

Enable this if you want to use virtually-mapped kernel stacks with guard pages. This causes kernel stack overflows to be caught immediately rather than causing difficult-to-diagnose corruption.

High

CONFIG_SLAB_FREELIST_HARDENED

Many kernel heap attacks try to target slab cache metadata and other infrastructure. This option makes minor performance sacrifices to harden the kernel slab allocator against common freelist exploit methods.

High

CONFIG_SLAB_FREELIST_RANDOM

Randomizes the freelist order used on creating new pages. This security feature reduces the predictability of the kernel slab allocator against heap overflows.

High

CONFIG_FORTIFY_SOURCE

Detect overflows of buffers in common string and memory functions where the compiler can determine and validate the buffer sizes.

High

CONFIG_BUG_ON_DATA_CORRUPTION

Select this option if the kernel should BUG when it encounters data corruption in kernel memory structures when they get checked for validity.

High

CONFIG_HARDENED_USERCOPY_FALLBACK

This is a temporary option that allows missing usercopy whitelists to be discovered via a WARN() to the kernel log, instead of rejecting the copy, falling back to non-whitelisted hardened usercopy that checks the slab allocation size instead of the whitelist size. This option will be removed once it seems like all missing usercopy whitelists have been identified and fixed. Booting with "slab_common.usercopy_fallback=Y/N" can change this setting.

High

CONFIG_SECURITY_DMESG_RESTRICT

This enforces restrictions on unprivileged users reading the kernel syslog via dmesg(8).

High

CONFIG_SECURITY_YAMA

This selects Yama, which extends DAC support with additional system-wide security settings beyond regular Linux discretionary access controls. Currently available is ptrace scope restriction. Like capabilities, this security module stacks with other LSMs. Further information can be found in Documentation/admin-guide/LSM/Yama.rst.

High
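Yama's ptrace scope restriction is controlled at runtime through /proc/sys/kernel/yama/ptrace_scope, which holds a value from 0 to 3. The sketch below only translates that value into a short description; the wording of the descriptions is mine, and the function name is made up.

```python
PTRACE_SCOPES = {
    0: "classic ptrace permissions",
    1: "restricted: only descendants (or declared tracers) may be traced",
    2: "admin-only: attaching requires CAP_SYS_PTRACE",
    3: "no attach: ptrace attach is disabled entirely",
}


def describe_ptrace_scope(raw):
    """Map the text read from yama/ptrace_scope to a human-readable description."""
    return PTRACE_SCOPES.get(int(raw.strip()), "unknown value")


print(describe_ptrace_scope("1\n"))
```

If the file does not exist, the kernel was most likely built without CONFIG_SECURITY_YAMA or Yama is not active.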

CONFIG_SECURITY_SELINUX_DISABLE

This option enables writing to a selinuxfs node 'disable', which allows SELinux to be disabled at runtime prior to the policy load. SELinux will then remain disabled until the next boot. This option is similar to the selinux=0 boot parameter, but is to support runtime disabling of SELinux, e.g. from /sbin/init, for portability across platforms where boot parameters are difficult to employ.

High

CONFIG_SECCOMP_FILTER

Enable tasks to build secure computing environments defined in terms of Berkeley Packet Filter programs which implement task-defined system call filtering policies.

High

CONFIG_ACPI_CUSTOM_METHOD

This debug facility allows ACPI AML methods to be inserted and/or replaced without rebooting the system. For details refer to: Documentation/acpi/method-customizing.txt.

High

CONFIG_COMPAT_BRK

Randomizing heap placement makes heap exploits harder, but it also breaks ancient binaries (including anything libc5 based). This option changes the bootup default to heap randomization disabled, and can be overridden at runtime by setting /proc/sys/kernel/randomize_va_space to 2.

High
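The randomize_va_space sysctl mentioned for CONFIG_COMPAT_BRK takes three values: 0 turns address space randomization off, 1 randomizes the stack, mmap base and VDSO, and 2 additionally randomizes the heap (brk). A minimal sketch that interprets the value; the function name is made up.

```python
def heap_randomized(raw):
    """True if the randomize_va_space value enables heap (brk) randomization."""
    value = int(raw.strip())
    if value not in (0, 1, 2):
        raise ValueError(f"unexpected randomize_va_space value: {value}")
    return value == 2


print(heap_randomized("2\n"))
print(heap_randomized("1\n"))
```

The input would normally be the contents of /proc/sys/kernel/randomize_va_space.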

CONFIG_IO_STRICT_DEVMEM

If this option is disabled, you allow userspace (root) access to all io-memory regardless of whether a driver is actively using that range. Accidental access to this is obviously disastrous, but specific access can be used by people debugging kernel drivers.

High

CONFIG_LEGACY_VSYSCALL_NONE

There will be no vsyscall mapping at all. This will eliminate any risk of ASLR bypass due to the vsyscall fixed address mapping. Attempts to use the vsyscalls will be reported to dmesg so that either old or malicious userspace programs can be identified.

High
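One way to see whether a vsyscall mapping is present is to look for a region labelled "[vsyscall]" in a process's memory map (/proc/<pid>/maps). A small sketch, using a made-up function name and a shortened sample map:

```python
def has_vsyscall_mapping(maps_text):
    """True if any line of /proc/<pid>/maps text is labelled [vsyscall]."""
    for line in maps_text.splitlines():
        fields = line.split()
        if fields and fields[-1] == "[vsyscall]":
            return True
    return False


# Shortened, hypothetical /proc/self/maps excerpt.
sample = (
    "7ffd1c3f2000-7ffd1c413000 rw-p 00000000 00:00 0    [stack]\n"
    "ffffffffff600000-ffffffffff601000 --xp 00000000 00:00 0    [vsyscall]\n"
)
print(has_vsyscall_mapping(sample))
```

With the "none" setting described above, no such line should appear.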

CONFIG_USERFAULTFD

Enable the userfaultfd() system call, which allows page faults to be intercepted and handled in userland.

High

CONFIG_LIVEPATCH

Say Y here if you want to support kernel live patching. This option has no runtime impact until a kernel "patch" module uses the interface provided by this option to register a patch, causing calls to patched functions to be redirected to new function code contained in the patch module.

High

CONFIG_BPF_JIT

Berkeley Packet Filter filtering capabilities are normally handled by an interpreter. This option allows the kernel to generate native code when a filter is loaded in memory. This should speed up packet sniffing (libpcap/tcpdump).

High

CONFIG_PAGE_TABLE_ISOLATION

This feature reduces the number of hardware side channels by ensuring that the majority of kernel addresses are not mapped into userspace.

High

CONFIG_RETPOLINE

Compile kernel with the retpoline compiler options to guard against kernel-to-user data leaks by avoiding speculative indirect branches. Requires a compiler with -mindirect-branch=thunk-extern support for full protection. The kernel may run slower.

High

CONFIG_X86_64

Port to the x86-64 architecture. x86-64 is a 64-bit extension to the classical 32-bit x86 architecture. For details see http://www.x86-64.org/.

High

CONFIG_DEBUG_WX

Generate a warning if any W+X mappings are found at boot.

High

CONFIG_SCHED_STACK_END_CHECK

This option checks for a stack overrun on calls to schedule(). If the stack end location is found to be overwritten, always panic, as the content of the corrupted region can no longer be trusted. This is to ensure no erroneous behaviour occurs which could result in data corruption or a sporadic crash at a later stage once the region is examined. The runtime overhead introduced is minimal.

High

CONFIG_MODULE_SIG

Check modules for valid signatures upon load: the signature is simply appended to the module. For more information see Documentation/admin-guide/module-signing.rst.

High
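Because the signature is appended to the module file, a signed module ends with a fixed marker string, which makes a quick presence check easy. The sketch below only checks for that marker on an uncompressed module image. It does not verify the signature, and the function name is made up.

```python
# Marker the kernel build appends after the signature data.
MODULE_SIG_MARKER = b"~Module signature appended~\n"


def has_appended_signature(module_bytes):
    """True if the raw (uncompressed) module image ends with the signature marker."""
    return module_bytes.endswith(MODULE_SIG_MARKER)


# Fake module images, for illustration only.
unsigned = b"\x7fELF" + b"\x00" * 32
signed = unsigned + b"<signature blob>" + MODULE_SIG_MARKER
print(has_appended_signature(unsigned), has_appended_signature(signed))
```

A compressed module (for example a .ko.xz file) would need to be decompressed before this check makes sense.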

CONFIG_REFCOUNT_FULL

Enabling this switches the refcounting infrastructure from a fast unchecked atomic_t implementation to a fully state checked implementation which can be (slightly) slower but provides protections against various use-after-free conditions that can be used in security flaw exploits.

High

CONFIG_STATIC_USERMODEHELPER

By default, the kernel can call many different userspace binary programs through the "usermode helper" kernel interface. Some of these binaries are statically defined, either in the kernel code itself or as a kernel configuration option. However, some of these are dynamically created at runtime or can be modified after the kernel has started up. To provide an additional layer of security, route all of these calls through a single executable that cannot have its name changed.

High

CONFIG_COMPAT_VDSO

Map the VDSO to the predictable old-style address too. Say N here if you are running a sufficiently recent glibc version (2.3.3 or later) to remove the high-mapped VDSO mapping and to exclusively use the randomized VDSO.

High

CONFIG_BINFMT_MISC

If you say Y here, it will be possible to plug wrapper-driven binary formats into the kernel. You will like this especially when you use programs that need an interpreter to run, like Java, Python, .NET or Emacs-Lisp. It's also useful if you often run DOS executables under the Linux DOS emulator DOSEMU (read the DOSEMU-HOWTO, available from http://www.tldp.org/docs.html#howto). Once you have registered such a binary class with the kernel, you can start one of those programs simply by typing in its name at a shell prompt; Linux will automatically feed it to the correct interpreter.

High
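Registering a binary class means writing a colon-separated rule of the form `:name:type:offset:magic:mask:interpreter:flags` to /proc/sys/fs/binfmt_misc/register. The sketch below only builds such a rule string. The function name is made up, and the format name and interpreter path in the usage line are hypothetical.

```python
def binfmt_rule(name, kind, magic, interpreter, offset="", mask="", flags=""):
    """Build a binfmt_misc registration rule string.

    kind is 'E' to match on a file extension or 'M' to match on magic bytes.
    """
    if kind not in ("E", "M"):
        raise ValueError("kind must be 'E' (extension) or 'M' (magic)")
    fields = [name, kind, offset, magic, mask, interpreter, flags]
    if any(":" in field for field in fields):
        raise ValueError("fields must not contain the ':' separator")
    return ":" + ":".join(fields)


# Hypothetical rule: run files ending in .pyc through a Python interpreter.
rule = binfmt_rule("PythonBytecode", "E", "pyc", "/usr/bin/python3")
print(rule)
```

Writing the rule requires root and a mounted binfmt_misc filesystem, which is part of why this option widens the attack surface.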

CONFIG_PROC_KCORE

Provides a virtual ELF core file of the live kernel. This can be read with gdb and other ELF tools. No modifications can be made using this mechanism.

High

CONFIG_MODIFY_LDT_SYSCALL

Linux can allow user programs to install a per-process x86 Local Descriptor Table (LDT) using the modify_ldt(2) system call. This is required to run 16-bit or segmented code such as DOSEMU or some Wine programs. It is also used by some very old threading libraries.

High

CONFIG_KPROBES

Kprobes allows you to trap at almost any kernel address and execute a callback function. register_kprobe() establishes a probepoint and specifies the callback. Kprobes is useful for kernel debugging, non-intrusive instrumentation and testing. If in doubt, say "N".

High

CONFIG_UPROBES

Uprobes is the user-space counterpart to kprobes: they enable instrumentation applications (such as 'perf probe') to establish unintrusive probes in user-space binaries and libraries by executing handler functions when the probes are hit by user-space applications.

High

CONFIG_DEBUG_FS

debugfs is a virtual file system that kernel developers use to put debugging files into. Enable this option to be able to read and write to these files.

High

CONFIG_BPF_SYSCALL

Enable the bpf() system call, which allows eBPF programs and maps to be manipulated via file descriptors.

High

CONFIG_USER_NS

Support user namespaces. This allows containers, i.e. vservers, to use user namespaces to provide different user info for different servers. If unsure, say N.

High

CONFIG_FTRACE

Enable the kernel to trace every kernel function. This is done by using a compiler feature to insert a small, 5-byte No-Operation instruction at the beginning of every kernel function; this NOP sequence is then dynamically patched into a tracer call when tracing is enabled by the administrator. If it is disabled at runtime (the bootup default), the overhead of the instructions is very small and not measurable even in micro-benchmarks.

High

CONFIG_ARCH_MMAP_RND_BITS

This value can be used to select the number of bits to use to determine the random offset to the base address of vma regions resulting from mmap allocations. This value will be bounded by the architecture's minimum and maximum supported values.

High
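To get a feel for what the number of bits means: the random offset is chosen in units of pages, so the range it can cover is 2^bits pages. A quick arithmetic sketch follows, assuming 4 KiB pages and 28 bits as example inputs; the function name is made up.

```python
def mmap_random_span(rnd_bits, page_size=4096):
    """Bytes covered by a random mmap offset of rnd_bits bits, in page units."""
    if rnd_bits < 0:
        raise ValueError("rnd_bits must be non-negative")
    return (1 << rnd_bits) * page_size


span = mmap_random_span(28)   # example value, see the assumptions above
print(span // 2**30, "GiB")
```

On a running system the value currently in effect can be read from /proc/sys/vm/mmap_rnd_bits.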

CONFIG_BUG

Disabling this option eliminates support for BUG and WARN, reducing the size of your kernel image and potentially quietly ignoring numerous fatal conditions. You should only consider disabling this option for embedded systems with no facilities for reporting errors. Just say Y.

Medium

CONFIG_THREAD_INFO_IN_TASK

Select this to move thread_info off the stack into task_struct. To make this work, an arch will need to remove all thread_info fields except flags and fix any runtime bugs.

Medium

CONFIG_MODULE_SIG_ALL

Sign all modules during make modules_install. Without this option, modules must be signed manually using the scripts/sign-file tool.

Medium

CONFIG_PAGE_POISONING

Fill the pages with poison patterns after free_pages() and verify the patterns before alloc_pages. The filling of the memory helps reduce the risk of information leaks from freed data. This does have a potential performance impact if enabled with the "page_poison=1" kernel boot option.

Medium
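The idea behind page poisoning can be illustrated with a toy userspace model: overwrite a page with a known pattern when it is freed, then check that the pattern is still intact before the page is handed out again. This is only a simulation on a bytearray, not kernel code. The 0xAA pattern is assumed here for illustration and the helper names are made up.

```python
PAGE_POISON = 0xAA  # pattern assumed for this illustration


def poison_page(page):
    """Overwrite a freed page (a bytearray) in place with the poison pattern."""
    page[:] = bytes([PAGE_POISON]) * len(page)


def poison_intact(page):
    """True if nothing wrote to the page after it was poisoned."""
    return all(byte == PAGE_POISON for byte in page)


page = bytearray(b"secret key material".ljust(64, b"\0"))
poison_page(page)                      # "free": the old contents are gone
intact_after_free = poison_intact(page)
page[10] = 0x41                        # simulated write-after-free
intact_after_write = poison_intact(page)
print(intact_after_free, intact_after_write)
```

The fill step is what limits information leaks from freed data; the check step is what catches writes to memory that should no longer be in use.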

CONFIG_GCC_PLUGIN_RANDSTRUCT

If you say Y here, the layouts of structures that are entirely function pointers (and have not been manually annotated with __no_randomize_layout), or structures that have been explicitly marked with __randomize_layout, will be randomized at compile-time. This can introduce the requirement of an additional information exposure vulnerability for exploits targeting these structure types.

Medium

CONFIG_HIBERNATION

Enable the suspend to disk (STD) functionality which is usually called "hibernation" in user interfaces. STD checkpoints the system and powers it off; and restores that checkpoint on reboot.

Medium

CONFIG_PROC_VMCORE

Exports the dump image of crashed kernel in ELF format.

Medium

CONFIG_HWPOISON_INJECT

Medium

CONFIG_SLUB_DEBUG

SLUB has extensive debug support features. Disabling these can result in significant savings in code size. This also disables SLUB sysfs support. /sys/slab will not exist and there will be no support for cache validation etc.

Medium

CONFIG_SYN_COOKIES

Normal TCP/IP networking is open to an attack known as "SYN flooding". This denial-of-service attack prevents legitimate remote users from being able to connect to your computer during an ongoing attack and requires very little work from the attacker, who can operate from anywhere on the Internet.

Medium

CONFIG_DEFAULT_MMAP_MIN_ADDR

This is the portion of low virtual memory which should be protected from userspace allocation. Keeping a user from writing to low pages can help reduce the impact of kernel NULL pointer bugs.

Medium

CONFIG_GCC_PLUGIN_LATENT_ENTROPY

By saying Y here, the kernel will instrument some kernel code to extract some entropy from both original and artificially created program state. This will help especially embedded systems where there is little 'natural' source of entropy normally. The cost is some slowdown of the boot process (about 0.5%) and of fork and irq processing.

Medium

CONFIG_DEBUG_LIST

Enable this to turn on extended checks in the linked-list walking routines.

Medium

CONFIG_DEBUG_CREDENTIALS

Enable this to turn on some debug checking for credential management. The additional code keeps track of the number of pointers from task_structs to any given cred struct and checks to see that this number never exceeds the usage count of the cred struct.

Medium

CONFIG_MODULE_SIG_FORCE

Reject unsigned modules or signed modules for which we don't have a key. Without this such modules will simply taint the kernel.

Medium

CONFIG_GCC_PLUGIN_STACKLEAK

Medium

CONFIG_SECURITY_LOADPIN

Any files read through the kernel file reading interface (kernel modules, firmware, kexec images, security policy) can be pinned to the first filesystem used for loading. When enabled, any files that come from other filesystems will be rejected. This is best used on systems without an initrd that have a root filesystem backed by a read-only device such as dm-verity or a CDROM.

Medium

CONFIG_PAGE_POISONING_NO_SANITY

Skip the sanity checking on alloc; only fill the pages with poison on free. This reduces some of the overhead of the poisoning feature.

Medium

CONFIG_PAGE_POISONING_ZERO

Instead of using the existing poison value, fill the pages with zeros. This makes it harder to detect when errors are occurring due to sanitization, but the zeroing at free means that it is no longer necessary to write zeros when GFP_ZERO is used on allocation.

Medium

CONFIG_SLAB_MERGE_DEFAULT

For reduced kernel memory fragmentation, slab caches can be merged when they share the same size and other characteristics. This carries a risk of kernel heap overflows being able to overwrite objects from merged caches (and more easily control cache layout), which makes such heap attacks easier for attackers to exploit. By keeping caches unmerged, these kinds of exploits can usually only damage objects in the same cache. To disable merging at runtime, "slab_nomerge" can be passed on the kernel command line.

Medium

CONFIG_X86_PTDUMP

Say Y here if you want to show the kernel pagetable layout in a debugfs file. This information is only useful for kernel developers who are working in architecture-specific areas of the kernel. It is probably not a good idea to enable this feature in a production kernel. If in doubt, say "N".

Medium

CONFIG_DEBUG_KMEMLEAK

Say Y here if you want to enable the memory leak detector. The memory allocation/freeing is traced in a way similar to Boehm's conservative garbage collector, the difference being that the orphan objects are not freed but only shown in /sys/kernel/debug/kmemleak. Enabling this feature will introduce an overhead to memory allocations. See Documentation/dev-tools/kmemleak.rst for more details.

Medium

CONFIG_KEXEC

kexec is a system call that implements the ability to shut down your current kernel and to start another kernel. It is like a reboot, but it is independent of the system firmware. And like a reboot, you can start any kernel with it, not just Linux.

Medium

CONFIG_LEGACY_PTYS

A pseudo terminal (PTY) is a software device consisting of two halves: a master and a slave. The slave device behaves identically to a physical terminal; the master device is used by a process to read data from and write data to the slave, thereby emulating a terminal. Typical programs for the master side are telnet servers and xterms.

Medium

CONFIG_IA32_EMULATION

Include code to run legacy 32-bit programs under a 64-bit kernel. You should likely turn this on unless you're 100% sure that you don't have any 32-bit programs left.

Medium

CONFIG_PROC_PAGE_MONITOR

Various /proc files exist to monitor process memory utilization: /proc/pid/smaps, /proc/pid/clear_refs, /proc/pid/pagemap, /proc/kpagecount and /proc/kpageflags. Disabling these interfaces will reduce the size of the kernel by approximately 4kb.

Medium
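To show what these interfaces expose: /proc/pid/pagemap holds one 64-bit entry per virtual page, with bit 63 meaning "present", bit 62 "swapped", bit 55 "soft-dirty" and, for present pages, the page frame number in bits 0-54. The sketch below decodes a single raw entry. The function name is made up and the entry is constructed by hand rather than read from a live system.

```python
import struct

PFN_MASK = (1 << 55) - 1  # bits 0-54


def decode_pagemap_entry(raw8):
    """Decode one 8-byte pagemap entry (native byte order) into a dict."""
    (entry,) = struct.unpack("=Q", raw8)
    present = bool((entry >> 63) & 1)
    return {
        "present": present,
        "swapped": bool((entry >> 62) & 1),
        "soft_dirty": bool((entry >> 55) & 1),
        # The PFN field is only meaningful for pages present in RAM.
        "pfn": (entry & PFN_MASK) if present else None,
    }


# Hand-built entry: present, soft-dirty, PFN 0x1234.
raw = struct.pack("=Q", (1 << 63) | (1 << 55) | 0x1234)
decoded = decode_pagemap_entry(raw)
print(decoded)
```

Physical frame numbers are exactly the kind of detail an attacker wants, which is why recent kernels report the PFN as zero to unprivileged readers.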

CONFIG_GCC_PLUGIN_STRUCTLEAK

This plugin zero-initializes any structures containing a __user attribute. This can prevent some classes of information exposures.

Medium

CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL

Zero initialize any struct type local variable that may be passed by reference without having been initialized.

Medium

CONFIG_DEBUG_SG

Enable this to turn on checks on scatter-gather tables. This can help find problems with drivers that do not properly initialize their sg tables.

Medium

CONFIG_SLUB_DEBUG_ON

Boot with debugging on by default. SLUB boots by default with the runtime debug capabilities switched off. Enabling this is equivalent to specifying the "slub_debug" parameter on boot. There is no support for more fine-grained debug control, as is possible with slub_debug=xxx. SLUB debugging may be switched off in a kernel built with SLUB_DEBUG_ON by specifying "slub_debug=-".

Medium

CONFIG_INET_DIAG

Support for INET (TCP, DCCP, etc.) socket monitoring interface used by native Linux tools such as ss. ss is included in iproute2, currently downloadable at:

Medium

CONFIG_X86_X32

Include code to run binaries for the x32 native 32-bit ABI for 64-bit processors. An x32 process gets access to the full 64-bit register file and wide data path while leaving pointers at 32 bits for smaller memory footprint.

Medium

CONFIG_USELIB

This option enables the uselib syscall, a system call used in the dynamic linker from libc5 and earlier. glibc does not use this system call. If you intend to run programs built on libc5 or earlier, you may need to enable this syscall. Current systems running glibc can safely disable this.

Medium

CONFIG_CHECKPOINT_RESTORE

Enables additional kernel features for the sake of checkpoint/restore. In particular, it adds auxiliary prctl codes to set up process text, data and heap segment sizes, and a few additional /proc filesystem entries.

Medium

CONFIG_MEM_SOFT_DIRTY

This option enables memory change tracking by introducing a soft-dirty bit on PTEs. This bit is set when someone writes into a page, just like the regular dirty bit, but unlike the latter it can be cleared by hand.

Medium

CONFIG_MMIOTRACE

Mmiotrace traces Memory Mapped I/O access and is meant for debugging and reverse engineering. It is called from the ioremap implementation and works via page faults. Tracing is disabled by default and can be enabled at run-time.

Medium

CONFIG_KEXEC_FILE

This is a new version of the kexec system call. This system call is file-based and takes file descriptors as system call arguments for the kernel and initramfs, as opposed to the list of segments accepted by the previous system call.

Medium

CONFIG_DEBUG_NOTIFIERS

Enable this to turn on sanity checking for notifier call chains. This is most useful for kernel developers to make sure that modules properly unregister themselves from notifier chains. This is a relatively cheap check, but if you care about maximum performance, say N.

Low

CONFIG_ZSMALLOC_STAT

This option enables code in zsmalloc to collect various statistics about what's happening in zsmalloc and exports that information to userspace via debugfs. If unsure, say N.

Low

CONFIG_PAGE_OWNER

This keeps track of which call chain is the owner of a page, which may help to find bare alloc_page(s) leaks. Even if you include this feature in your build, it is disabled by default. You should pass "page_owner=on" as a boot parameter in order to enable it. Eats a fair amount of memory if enabled. See tools/vm/page_owner_sort.c for a user-space helper.

Low

CONFIG_BINFMT_AOUT

A.out (Assembler.OUTput) is a set of formats for libraries and executables used in the earliest versions of UNIX. Linux used the a.out formats QMAGIC and ZMAGIC until they were replaced with the ELF format.

Low

CONFIG_IP_DCCP

Datagram Congestion Control Protocol (RFC 4340)

Low

CONFIG_IP_SCTP

Stream Control Transmission Protocol

Low

CONFIG_DEVPORT

Say Y here if you want to support the /dev/port device. The /dev/port device is similar to /dev/mem but for I/O ports.

Low

CONFIG_NOTIFIER_ERROR_INJECTION

This option provides the ability to inject artificial errors to specified notifier chain callbacks. It is useful to test the error handling of notifier call chain failures.

Low

CONFIG_ACPI_TABLE_UPGRADE

This option provides functionality to upgrade arbitrary ACPI tables via initrd. There is no functional change if no ACPI tables are passed via initrd, therefore it's safe to say Y. See Documentation/acpi/initrd_table_override.txt for details.

Low

CONFIG_ACPI_APEI_EINJ

EINJ provides a hardware error injection mechanism; it is mainly used for debugging and testing the other parts of APEI and some other RAS features.

Low

CONFIG_PROFILING

Say Y here to enable the extended profiling support mechanisms used by profilers such as OProfile.

Low

CONFIG_GCC_PLUGINS

GCC plugins are loadable modules that provide extra features to the compiler. They are useful for runtime instrumentation and static analysis.

Low

CONFIG_MMIOTRACE_TEST

This is a dumb module for testing mmiotrace. It is very dangerous, as it will write garbage to IO memory starting at a given address. However, it should be safe to use on e.g. an unused portion of VRAM.

Low