* [PATCH v2-updated 1/2] mm/vmemmap/devdax: Fix kernel crash when probing devdax devices
@ 2023-04-11 14:22 Aneesh Kumar K.V
2023-04-11 14:22 ` [PATCH v2-updated 2/2] m/hugetlb: Rename ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP Aneesh Kumar K.V
0 siblings, 1 reply; 2+ messages in thread
From: Aneesh Kumar K.V @ 2023-04-11 14:22 UTC (permalink / raw)
To: linux-mm, akpm
Cc: Aneesh Kumar K.V, Joao Martins, Muchun Song, Dan Williams, Tarun Sahu
commit 4917f55b4ef9 ("mm/sparse-vmemmap: improve memory savings for compound
devmaps") added support for using optimized vmmemap for devdax devices. But how
vmemmap mappings are created are architecture specific. For example, powerpc
with hash translation doesn't have vmemmap mappings in init_mm page table
instead they are bolted table entries in the hardware page table
vmemmap_populate_compound_pages() used by vmemmap optimization code is not aware
of these architecture-specific mapping. Hence allow architecture to opt for this
feature. I selected architectures supporting HUGETLB_PAGE_OPTIMIZE_VMEMMAP
option as also supporting this feature.
This patch fixes the below crash on ppc64.
BUG: Unable to handle kernel data access on write at 0xc00c000100400038
Faulting instruction address: 0xc000000001269d90
Oops: Kernel access of bad area, sig: 11 [#1]
LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
Modules linked in:
CPU: 7 PID: 1 Comm: swapper/0 Not tainted 6.3.0-rc5-150500.34-default+ #2 5c90a668b6bbd142599890245c2fb5de19d7d28a
Hardware name: IBM,9009-42G POWER9 (raw) 0x4e0202 0xf000005 of:IBM,FW950.40 (VL950_099) hv:phyp pSeries
NIP: c000000001269d90 LR: c0000000004c57d4 CTR: 0000000000000000
REGS: c000000003632c30 TRAP: 0300 Not tainted (6.3.0-rc5-150500.34-default+)
MSR: 8000000000009033 <SF,EE,ME,IR,DR,RI,LE> CR: 24842228 XER: 00000000
CFAR: c0000000004c57d0 DAR: c00c000100400038 DSISR: 42000000 IRQMASK: 0
....
NIP [c000000001269d90] __init_single_page.isra.74+0x14/0x4c
LR [c0000000004c57d4] __init_zone_device_page+0x44/0xd0
Call Trace:
[c000000003632ed0] [c000000003632f60] 0xc000000003632f60 (unreliable)
[c000000003632f10] [c0000000004c5ca0] memmap_init_zone_device+0x170/0x250
[c000000003632fe0] [c0000000005575f8] memremap_pages+0x2c8/0x7f0
[c0000000036330c0] [c000000000557b5c] devm_memremap_pages+0x3c/0xa0
[c000000003633100] [c000000000d458a8] dev_dax_probe+0x108/0x3e0
[c0000000036331a0] [c000000000d41430] dax_bus_probe+0xb0/0x140
[c0000000036331d0] [c000000000cef27c] really_probe+0x19c/0x520
[c000000003633260] [c000000000cef6b4] __driver_probe_device+0xb4/0x230
[c0000000036332e0] [c000000000cef888] driver_probe_device+0x58/0x120
[c000000003633320] [c000000000cefa6c] __device_attach_driver+0x11c/0x1e0
[c0000000036333a0] [c000000000cebc58] bus_for_each_drv+0xa8/0x130
[c000000003633400] [c000000000ceefcc] __device_attach+0x15c/0x250
[c0000000036334a0] [c000000000ced458] bus_probe_device+0x108/0x110
[c0000000036334f0] [c000000000ce92dc] device_add+0x7fc/0xa10
[c0000000036335b0] [c000000000d447c8] devm_create_dev_dax+0x1d8/0x530
[c000000003633640] [c000000000d46b60] __dax_pmem_probe+0x200/0x270
[c0000000036337b0] [c000000000d46bf0] dax_pmem_probe+0x20/0x70
[c0000000036337d0] [c000000000d2279c] nvdimm_bus_probe+0xac/0x2b0
[c000000003633860] [c000000000cef27c] really_probe+0x19c/0x520
[c0000000036338f0] [c000000000cef6b4] __driver_probe_device+0xb4/0x230
[c000000003633970] [c000000000cef888] driver_probe_device+0x58/0x120
[c0000000036339b0] [c000000000cefd08] __driver_attach+0x1d8/0x240
[c000000003633a30] [c000000000cebb04] bus_for_each_dev+0xb4/0x130
[c000000003633a90] [c000000000cee564] driver_attach+0x34/0x50
[c000000003633ab0] [c000000000ced878] bus_add_driver+0x218/0x300
[c000000003633b40] [c000000000cf1144] driver_register+0xa4/0x1b0
[c000000003633bb0] [c000000000d21a0c] __nd_driver_register+0x5c/0x100
[c000000003633c10] [c00000000206a2e8] dax_pmem_init+0x34/0x48
[c000000003633c30] [c0000000000132d0] do_one_initcall+0x60/0x320
[c000000003633d00] [c0000000020051b0] kernel_init_freeable+0x360/0x400
[c000000003633de0] [c000000000013764] kernel_init+0x34/0x1d0
[c000000003633e50] [c00000000000de14] ret_from_kernel_thread+0x5c/0x64
Fixes: 4917f55b4ef9 ("mm/sparse-vmemmap: improve memory savings for compound devmaps")
Cc: Joao Martins <joao.m.martins@oracle.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Reported-by: Tarun Sahu <tsahu@linux.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
Changes from V1:
* Only disable memory saving part of compound devmaps
* Update the correct Fixes: commit
* Add patch to drop HUGETLB specific kconfig
include/linux/mm.h | 16 ++++++++++++++++
mm/page_alloc.c | 9 ++++++---
mm/sparse-vmemmap.c | 3 +--
3 files changed, 23 insertions(+), 5 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 716d30d93616..08799ad0cf0f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3442,6 +3442,22 @@ void vmemmap_populate_print_last(void);
void vmemmap_free(unsigned long start, unsigned long end,
struct vmem_altmap *altmap);
#endif
+
+#ifdef CONFIG_ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
+static inline bool vmemmap_can_optimize(struct vmem_altmap *altmap,
+ struct dev_pagemap *pgmap)
+{
+ return is_power_of_2(sizeof(struct page)) &&
+ pgmap && (pgmap_vmemmap_nr(pgmap) > 1) && !altmap;
+}
+#else
+static inline bool vmemmap_can_optimize(struct vmem_altmap *altmap,
+ struct dev_pagemap *pgmap)
+{
+ return false;
+}
+#endif
+
void register_page_bootmem_memmap(unsigned long section_nr, struct page *map,
unsigned long nr_pages);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3bb3484563ed..292411d8816f 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6844,10 +6844,13 @@ static void __ref __init_zone_device_page(struct page *page, unsigned long pfn,
* of an altmap. See vmemmap_populate_compound_pages().
*/
static inline unsigned long compound_nr_pages(struct vmem_altmap *altmap,
+ struct dev_pagemap *pgmap,
unsigned long nr_pages)
{
- return is_power_of_2(sizeof(struct page)) &&
- !altmap ? 2 * (PAGE_SIZE / sizeof(struct page)) : nr_pages;
+ if (vmemmap_can_optimize(altmap, pgmap))
+ return 2 * (PAGE_SIZE / sizeof(struct page));
+ else
+ return nr_pages;
}
static void __ref memmap_init_compound(struct page *head,
@@ -6912,7 +6915,7 @@ void __ref memmap_init_zone_device(struct zone *zone,
continue;
memmap_init_compound(page, pfn, zone_idx, nid, pgmap,
- compound_nr_pages(altmap, pfns_per_compound));
+ compound_nr_pages(altmap, pgmap, pfns_per_compound));
}
pr_info("%s initialised %lu pages in %ums\n", __func__,
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index c5398a5960d0..10d73a0dfcec 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -458,8 +458,7 @@ struct page * __meminit __populate_section_memmap(unsigned long pfn,
!IS_ALIGNED(nr_pages, PAGES_PER_SUBSECTION)))
return NULL;
- if (is_power_of_2(sizeof(struct page)) &&
- pgmap && pgmap_vmemmap_nr(pgmap) > 1 && !altmap)
+ if (vmemmap_can_optimize(altmap, pgmap))
r = vmemmap_populate_compound_pages(pfn, start, end, nid, pgmap);
else
r = vmemmap_populate(start, end, nid, altmap);
--
2.39.2
^ permalink raw reply [flat|nested] 2+ messages in thread* [PATCH v2-updated 2/2] m/hugetlb: Rename ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
2023-04-11 14:22 [PATCH v2-updated 1/2] mm/vmemmap/devdax: Fix kernel crash when probing devdax devices Aneesh Kumar K.V
@ 2023-04-11 14:22 ` Aneesh Kumar K.V
0 siblings, 0 replies; 2+ messages in thread
From: Aneesh Kumar K.V @ 2023-04-11 14:22 UTC (permalink / raw)
To: linux-mm, akpm; +Cc: Aneesh Kumar K.V, Joao Martins, Muchun Song, Dan Williams
Now we use ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP config option
to indicate devdax and hugetlb vmemmap optimization support. Hence
rename that to a generic ARCH_WANT_OPTIMIZE_VMEMMAP
Cc: Joao Martins <joao.m.martins@oracle.com>
Cc: Muchun Song <songmuchun@bytedance.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
---
arch/arm64/Kconfig | 2 +-
arch/loongarch/Kconfig | 2 +-
arch/s390/Kconfig | 2 +-
arch/x86/Kconfig | 2 +-
fs/Kconfig | 9 +--------
include/linux/mm.h | 2 +-
mm/Kconfig | 6 ++++++
7 files changed, 12 insertions(+), 13 deletions(-)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 27b2592698b0..77d9713dcd9c 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -100,9 +100,9 @@ config ARM64
select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT
select ARCH_WANT_FRAME_POINTERS
select ARCH_WANT_HUGE_PMD_SHARE if ARM64_4K_PAGES || (ARM64_16K_PAGES && !ARM64_VA_BITS_36)
- select ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
select ARCH_WANT_LD_ORPHAN_WARN
select ARCH_WANTS_NO_INSTR
+ select ARCH_WANT_OPTIMIZE_VMEMMAP
select ARCH_WANTS_THP_SWAP if ARM64_4K_PAGES
select ARCH_HAS_UBSAN_SANITIZE_ALL
select ARM_AMBA
diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
index 9cc8b84f7eb0..9cb00f962de1 100644
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -53,9 +53,9 @@ config LOONGARCH
select ARCH_USE_QUEUED_RWLOCKS
select ARCH_USE_QUEUED_SPINLOCKS
select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT
- select ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
select ARCH_WANT_LD_ORPHAN_WARN
select ARCH_WANTS_NO_INSTR
+ select ARCH_WANT_OPTIMIZE_VMEMMAP
select BUILDTIME_TABLE_SORT
select COMMON_CLK
select CPU_PM
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 933771b0b07a..df2cd510480a 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -126,7 +126,7 @@ config S390
select ARCH_WANTS_NO_INSTR
select ARCH_WANT_DEFAULT_BPF_JIT
select ARCH_WANT_IPC_PARSE_VERSION
- select ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
+ select ARCH_WANT_OPTIMIZE_VMEMMAP
select BUILDTIME_TABLE_SORT
select CLONE_BACKWARDS2
select DMA_OPS if PCI
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index a825bf031f49..5269131cc248 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -125,8 +125,8 @@ config X86
select ARCH_WANTS_NO_INSTR
select ARCH_WANT_GENERAL_HUGETLB
select ARCH_WANT_HUGE_PMD_SHARE
- select ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP if X86_64
select ARCH_WANT_LD_ORPHAN_WARN
+ select ARCH_WANT_OPTIMIZE_VMEMMAP if X86_64
select ARCH_WANTS_THP_SWAP if X86_64
select ARCH_HAS_PARANOID_L1D_FLUSH
select BUILDTIME_TABLE_SORT
diff --git a/fs/Kconfig b/fs/Kconfig
index e99830c65033..cc07a0cd3172 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -250,16 +250,9 @@ config HUGETLBFS
config HUGETLB_PAGE
def_bool HUGETLBFS
-#
-# Select this config option from the architecture Kconfig, if it is preferred
-# to enable the feature of HugeTLB Vmemmap Optimization (HVO).
-#
-config ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
- bool
-
config HUGETLB_PAGE_OPTIMIZE_VMEMMAP
def_bool HUGETLB_PAGE
- depends on ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
+ depends on ARCH_WANT_OPTIMIZE_VMEMMAP
depends on SPARSEMEM_VMEMMAP
config HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 08799ad0cf0f..fb71e21df23d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3443,7 +3443,7 @@ void vmemmap_free(unsigned long start, unsigned long end,
struct vmem_altmap *altmap);
#endif
-#ifdef CONFIG_ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
+#ifdef CONFIG_ARCH_WANT_OPTIMIZE_VMEMMAP
static inline bool vmemmap_can_optimize(struct vmem_altmap *altmap,
struct dev_pagemap *pgmap)
{
diff --git a/mm/Kconfig b/mm/Kconfig
index ff7b209dec05..492919cf62a4 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -460,6 +460,12 @@ config SPARSEMEM_VMEMMAP
SPARSEMEM_VMEMMAP uses a virtually mapped memmap to optimise
pfn_to_page and page_to_pfn operations. This is the most
efficient option when sufficient kernel resources are available.
+#
+# Select this config option from the architecture Kconfig, if it is preferred
+# to enable the feature of HugeTLB/dev_dax vmemmap optimization.
+#
+config ARCH_WANT_OPTIMIZE_VMEMMAP
+ bool
config HAVE_MEMBLOCK_PHYS_MAP
bool
--
2.39.2
^ permalink raw reply [flat|nested] 2+ messages in thread
end of thread, other threads:[~2023-04-11 14:22 UTC | newest]
Thread overview: 2+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-11 14:22 [PATCH v2-updated 1/2] mm/vmemmap/devdax: Fix kernel crash when probing devdax devices Aneesh Kumar K.V
2023-04-11 14:22 ` [PATCH v2-updated 2/2] m/hugetlb: Rename ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP Aneesh Kumar K.V
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox