linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/5] ptdump: add intermediate directory support
@ 2024-06-18 14:37 Maxwell Bland
  2024-06-18 14:40 ` [PATCH v4 1/5] mm: add ARCH_SUPPORTS_NON_LEAF_PTDUMP Maxwell Bland
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Maxwell Bland @ 2024-06-18 14:37 UTC (permalink / raw)
  To: linux-mm
  Cc: Catalin Marinas, Will Deacon, Jonathan Corbet, Andrew Morton,
	Ard Biesheuvel, Mark Rutland, Christophe Leroy, Maxwell Bland,
	Alexandre Ghiti, linux-arm-kernel, linux-doc, linux-kernel

Makes many several improvements to (arm64) ptdump debugging, including:

- support note_page on intermediate table entries
- (arm64) print intermediate entries and add an array for their specific
  attributes
- (arm64) adjust the entry ranges to remove the implicit exclusive upper
  bound
- (arm64) indent page table by level while maintaining attribute
  alignment
- (arm64) improve documentation clarity, detail, and precision

Thank you again to the maintainers for their review of this patch.

A comparison of the differences in output is provided here:
github.com/maxwell-bland/linux-patch-data/tree/main/ptdump-non-leaf

New in v4:
- Inclusive upper bounds on range specifications
- Splits commit into multiple smaller commits and separates cosmetic,
  documentation, and logic changes
- Updates documentation more sensibly
- Fixes bug in size computation and handles ULONG_MAX bound overflow

v3:
https://lore.kernel.org/all/fik5ys53dbkpkl22o4s7sw7cxi6dqjcpm2f3kno5tyms73jm5y@buo4jsktsnrt/
- Added tabulation to delineate entries
- Fixed formatting issues with mailer and rebased to mm/linus

v2:
https://lore.kernel.org/r/20240423142307.495726312-1-mbland@motorola.com
- Rebased onto linux-next/akpm (the incorrect branch)

v1:
https://lore.kernel.org/all/20240423121820.874441838-1-mbland@motorola.com/


Maxwell Bland (5):
  mm: add ARCH_SUPPORTS_NON_LEAF_PTDUMP
  arm64: non leaf ptdump support
  arm64: indent ptdump by level, aligning attributes
  arm64: exclusive upper bound for ptdump entries
  arm64: add attrs and format to ptdump document

 Documentation/arch/arm64/ptdump.rst | 126 ++++++++++++-----------
 arch/arm64/Kconfig                  |   1 +
 arch/arm64/mm/ptdump.c              | 149 +++++++++++++++++++++++++---
 mm/Kconfig.debug                    |   9 ++
 mm/ptdump.c                         |  21 ++--
 5 files changed, 217 insertions(+), 89 deletions(-)

-- 
2.39.2



^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v4 1/5] mm: add ARCH_SUPPORTS_NON_LEAF_PTDUMP
  2024-06-18 14:37 [PATCH v4 0/5] ptdump: add intermediate directory support Maxwell Bland
@ 2024-06-18 14:40 ` Maxwell Bland
  2024-06-18 18:38   ` LEROY Christophe
  2024-06-18 14:40 ` [PATCH v4 2/5] arm64: non leaf ptdump support Maxwell Bland
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 10+ messages in thread
From: Maxwell Bland @ 2024-06-18 14:40 UTC (permalink / raw)
  To: linux-mm
  Cc: Catalin Marinas, Will Deacon, Jonathan Corbet, Andrew Morton,
	Ard Biesheuvel, Mark Rutland, Christophe Leroy, Maxwell Bland,
	Alexandre Ghiti, linux-arm-kernel, linux-doc, linux-kernel

Provide a Kconfig option indicating if note_page can be called for
intermediate page directories during ptdump.

Signed-off-by: Maxwell Bland <mbland@motorola.com>
---
 mm/Kconfig.debug |  9 +++++++++
 mm/ptdump.c      | 21 +++++++++++++--------
 2 files changed, 22 insertions(+), 8 deletions(-)

diff --git a/mm/Kconfig.debug b/mm/Kconfig.debug
index afc72fde0f03..6af5ecfdef93 100644
--- a/mm/Kconfig.debug
+++ b/mm/Kconfig.debug
@@ -201,6 +201,15 @@ config PTDUMP_DEBUGFS
 
 	  If in doubt, say N.
 
+config ARCH_SUPPORTS_NON_LEAF_PTDUMP
+	bool "Include intermediate directory entries in pagetable dumps"
+	default n
+	help
+	  Enable the inclusion of intermediate page directory entries in calls
+	  to the ptdump API. Once an architecture defines correct ptdump
+	  behavior for PGD, PUD, P4D, and PMD entries, this config can be
+	  selected.
+
 config HAVE_DEBUG_KMEMLEAK
 	bool
 
diff --git a/mm/ptdump.c b/mm/ptdump.c
index 106e1d66e9f9..6180708669fe 100644
--- a/mm/ptdump.c
+++ b/mm/ptdump.c
@@ -41,10 +41,11 @@ static int ptdump_pgd_entry(pgd_t *pgd, unsigned long addr,
 	if (st->effective_prot)
 		st->effective_prot(st, 0, pgd_val(val));
 
-	if (pgd_leaf(val)) {
+	if (IS_ENABLED(CONFIG_ARCH_SUPPORTS_NON_LEAF_PTDUMP) || pgd_leaf(val))
 		st->note_page(st, addr, 0, pgd_val(val));
+
+	if (pgd_leaf(val))
 		walk->action = ACTION_CONTINUE;
-	}
 
 	return 0;
 }
@@ -64,10 +65,11 @@ static int ptdump_p4d_entry(p4d_t *p4d, unsigned long addr,
 	if (st->effective_prot)
 		st->effective_prot(st, 1, p4d_val(val));
 
-	if (p4d_leaf(val)) {
+	if (IS_ENABLED(CONFIG_ARCH_SUPPORTS_NON_LEAF_PTDUMP) || pgd_leaf(val))
 		st->note_page(st, addr, 1, p4d_val(val));
+
+	if (p4d_leaf(val))
 		walk->action = ACTION_CONTINUE;
-	}
 
 	return 0;
 }
@@ -87,10 +89,11 @@ static int ptdump_pud_entry(pud_t *pud, unsigned long addr,
 	if (st->effective_prot)
 		st->effective_prot(st, 2, pud_val(val));
 
-	if (pud_leaf(val)) {
+	if (IS_ENABLED(CONFIG_ARCH_SUPPORTS_NON_LEAF_PTDUMP) || pgd_leaf(val))
 		st->note_page(st, addr, 2, pud_val(val));
+
+	if (pud_leaf(val))
 		walk->action = ACTION_CONTINUE;
-	}
 
 	return 0;
 }
@@ -108,10 +111,12 @@ static int ptdump_pmd_entry(pmd_t *pmd, unsigned long addr,
 
 	if (st->effective_prot)
 		st->effective_prot(st, 3, pmd_val(val));
-	if (pmd_leaf(val)) {
+
+	if (IS_ENABLED(CONFIG_ARCH_SUPPORTS_NON_LEAF_PTDUMP) || pgd_leaf(val))
 		st->note_page(st, addr, 3, pmd_val(val));
+
+	if (pmd_leaf(val))
 		walk->action = ACTION_CONTINUE;
-	}
 
 	return 0;
 }
-- 
2.39.2




^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v4 2/5] arm64: non leaf ptdump support
  2024-06-18 14:37 [PATCH v4 0/5] ptdump: add intermediate directory support Maxwell Bland
  2024-06-18 14:40 ` [PATCH v4 1/5] mm: add ARCH_SUPPORTS_NON_LEAF_PTDUMP Maxwell Bland
@ 2024-06-18 14:40 ` Maxwell Bland
  2024-06-18 14:59   ` Ard Biesheuvel
  2024-06-18 14:42 ` [PATCH v4 3/5] arm64: indent ptdump by level, aligning attributes Maxwell Bland
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 10+ messages in thread
From: Maxwell Bland @ 2024-06-18 14:40 UTC (permalink / raw)
  To: linux-mm
  Cc: Catalin Marinas, Will Deacon, Jonathan Corbet, Andrew Morton,
	Ard Biesheuvel, Mark Rutland, Christophe Leroy, Maxwell Bland,
	Alexandre Ghiti, linux-arm-kernel, linux-doc, linux-kernel

Separate the pte_bits used in ptdump from pxd_bits used by pmd, p4d,
pud, and pgd descriptors, thereby adding support for printing key
intermediate directory protection bits, such as PXNTable, and enable the
associated support Kconfig option.

Signed-off-by: Maxwell Bland <mbland@motorola.com>
---
 arch/arm64/Kconfig     |   1 +
 arch/arm64/mm/ptdump.c | 140 ++++++++++++++++++++++++++++++++++++-----
 2 files changed, 125 insertions(+), 16 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 5d91259ee7b5..f4c3290160db 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -98,6 +98,7 @@ config ARM64
 	select ARCH_SUPPORTS_NUMA_BALANCING
 	select ARCH_SUPPORTS_PAGE_TABLE_CHECK
 	select ARCH_SUPPORTS_PER_VMA_LOCK
+	select ARCH_SUPPORTS_NON_LEAF_PTDUMP
 	select ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH
 	select ARCH_WANT_COMPAT_IPC_PARSE_VERSION if COMPAT
 	select ARCH_WANT_DEFAULT_BPF_JIT
diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
index 6986827e0d64..8f0b459c13ed 100644
--- a/arch/arm64/mm/ptdump.c
+++ b/arch/arm64/mm/ptdump.c
@@ -24,6 +24,7 @@
 #include <asm/memory.h>
 #include <asm/pgtable-hwdef.h>
 #include <asm/ptdump.h>
+#include <asm/pgalloc.h>
 
 
 #define pt_dump_seq_printf(m, fmt, args...)	\
@@ -105,11 +106,6 @@ static const struct prot_bits pte_bits[] = {
 		.val	= PTE_CONT,
 		.set	= "CON",
 		.clear	= "   ",
-	}, {
-		.mask	= PTE_TABLE_BIT,
-		.val	= PTE_TABLE_BIT,
-		.set	= "   ",
-		.clear	= "BLK",
 	}, {
 		.mask	= PTE_UXN,
 		.val	= PTE_UXN,
@@ -143,34 +139,129 @@ static const struct prot_bits pte_bits[] = {
 	}
 };
 
+static const struct prot_bits pxd_bits[] = {
+	{
+		.mask	= PMD_SECT_VALID,
+		.val	= PMD_SECT_VALID,
+		.set	= " ",
+		.clear	= "F",
+	}, {
+		.mask	= PMD_TABLE_BIT,
+		.val	= PMD_TABLE_BIT,
+		.set	= "TBL",
+		.clear	= "BLK",
+	}, {
+		.mask	= PMD_SECT_USER,
+		.val	= PMD_SECT_USER,
+		.set	= "USR",
+		.clear	= "   ",
+	}, {
+		.mask	= PMD_SECT_RDONLY,
+		.val	= PMD_SECT_RDONLY,
+		.set	= "ro",
+		.clear	= "RW",
+	}, {
+		.mask	= PMD_SECT_S,
+		.val	= PMD_SECT_S,
+		.set	= "SHD",
+		.clear	= "   ",
+	}, {
+		.mask	= PMD_SECT_AF,
+		.val	= PMD_SECT_AF,
+		.set	= "AF",
+		.clear	= "  ",
+	}, {
+		.mask	= PMD_SECT_NG,
+		.val	= PMD_SECT_NG,
+		.set	= "NG",
+		.clear	= "  ",
+	}, {
+		.mask	= PMD_SECT_CONT,
+		.val	= PMD_SECT_CONT,
+		.set	= "CON",
+		.clear	= "   ",
+	}, {
+		.mask	= PMD_SECT_PXN,
+		.val	= PMD_SECT_PXN,
+		.set	= "NX",
+		.clear	= "x ",
+	}, {
+		.mask	= PMD_SECT_UXN,
+		.val	= PMD_SECT_UXN,
+		.set	= "UXN",
+		.clear	= "   ",
+	}, {
+		.mask	= PMD_TABLE_PXN,
+		.val	= PMD_TABLE_PXN,
+		.set	= "NXTbl",
+		.clear	= "     ",
+	}, {
+		.mask	= PMD_TABLE_UXN,
+		.val	= PMD_TABLE_UXN,
+		.set	= "UXNTbl",
+		.clear	= "      ",
+	}, {
+		.mask	= PTE_GP,
+		.val	= PTE_GP,
+		.set	= "GP",
+		.clear	= "  ",
+	}, {
+		.mask	= PMD_ATTRINDX_MASK,
+		.val	= PMD_ATTRINDX(MT_DEVICE_nGnRnE),
+		.set	= "DEVICE/nGnRnE",
+	}, {
+		.mask	= PMD_ATTRINDX_MASK,
+		.val	= PMD_ATTRINDX(MT_DEVICE_nGnRE),
+		.set	= "DEVICE/nGnRE",
+	}, {
+		.mask	= PMD_ATTRINDX_MASK,
+		.val	= PMD_ATTRINDX(MT_NORMAL_NC),
+		.set	= "MEM/NORMAL-NC",
+	}, {
+		.mask	= PMD_ATTRINDX_MASK,
+		.val	= PMD_ATTRINDX(MT_NORMAL),
+		.set	= "MEM/NORMAL",
+	}, {
+		.mask	= PMD_ATTRINDX_MASK,
+		.val	= PMD_ATTRINDX(MT_NORMAL_TAGGED),
+		.set	= "MEM/NORMAL-TAGGED",
+	}
+};
+
 struct pg_level {
 	const struct prot_bits *bits;
 	char name[4];
 	int num;
 	u64 mask;
+	unsigned long size;
 };
 
 static struct pg_level pg_level[] __ro_after_init = {
 	{ /* pgd */
 		.name	= "PGD",
-		.bits	= pte_bits,
-		.num	= ARRAY_SIZE(pte_bits),
+		.bits	= pxd_bits,
+		.num	= ARRAY_SIZE(pxd_bits),
+		.size	= PGDIR_SIZE,
 	}, { /* p4d */
 		.name	= "P4D",
-		.bits	= pte_bits,
-		.num	= ARRAY_SIZE(pte_bits),
+		.bits	= pxd_bits,
+		.num	= ARRAY_SIZE(pxd_bits),
+		.size	= P4D_SIZE,
 	}, { /* pud */
 		.name	= "PUD",
-		.bits	= pte_bits,
-		.num	= ARRAY_SIZE(pte_bits),
+		.bits	= pxd_bits,
+		.num	= ARRAY_SIZE(pxd_bits),
+		.size	= PUD_SIZE,
 	}, { /* pmd */
 		.name	= "PMD",
-		.bits	= pte_bits,
-		.num	= ARRAY_SIZE(pte_bits),
+		.bits	= pxd_bits,
+		.num	= ARRAY_SIZE(pxd_bits),
+		.size	= PMD_SIZE,
 	}, { /* pte */
 		.name	= "PTE",
 		.bits	= pte_bits,
 		.num	= ARRAY_SIZE(pte_bits),
+		.size	= PAGE_SIZE
 	},
 };
 
@@ -251,10 +342,27 @@ static void note_page(struct ptdump_state *pt_st, unsigned long addr, int level,
 			note_prot_wx(st, addr);
 		}
 
-		pt_dump_seq_printf(st->seq, "0x%016lx-0x%016lx   ",
-				   st->start_address, addr);
+		/*
+		 * Non-leaf entries use a fixed size for their range
+		 * specification, whereas leaf entries are grouped by
+		 * attributes and may not have a range larger than the type
+		 * specifier.
+		 */
+		if (st->start_address == addr) {
+			if (check_add_overflow(addr, pg_level[st->level].size,
+					       &delta))
+				delta = ULONG_MAX - addr + 1;
+			else
+				delta = pg_level[st->level].size;
+			pt_dump_seq_printf(st->seq, "0x%016lx-0x%016lx   ",
+					   addr, addr + delta);
+		} else {
+			delta = (addr - st->start_address);
+			pt_dump_seq_printf(st->seq, "0x%016lx-0x%016lx   ",
+					   st->start_address, addr);
+		}
 
-		delta = (addr - st->start_address) >> 10;
+		delta >>= 10;
 		while (!(delta & 1023) && unit[1]) {
 			delta >>= 10;
 			unit++;
-- 
2.39.2




^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v4 3/5] arm64: indent ptdump by level, aligning attributes
  2024-06-18 14:37 [PATCH v4 0/5] ptdump: add intermediate directory support Maxwell Bland
  2024-06-18 14:40 ` [PATCH v4 1/5] mm: add ARCH_SUPPORTS_NON_LEAF_PTDUMP Maxwell Bland
  2024-06-18 14:40 ` [PATCH v4 2/5] arm64: non leaf ptdump support Maxwell Bland
@ 2024-06-18 14:42 ` Maxwell Bland
  2024-06-18 14:42 ` [PATCH v4 4/5] arm64: exclusive upper bound for ptdump entries Maxwell Bland
  2024-06-18 14:43 ` [PATCH v4 5/5] arm64: add attrs and format to ptdump document Maxwell Bland
  4 siblings, 0 replies; 10+ messages in thread
From: Maxwell Bland @ 2024-06-18 14:42 UTC (permalink / raw)
  To: linux-mm
  Cc: Catalin Marinas, Will Deacon, Jonathan Corbet, Andrew Morton,
	Ard Biesheuvel, Mark Rutland, Christophe Leroy, Maxwell Bland,
	Alexandre Ghiti, linux-arm-kernel, linux-doc, linux-kernel

Outputs each level of the page table with two additional spaces for
parsers, distinction, and readability while maintaining the alignment of
region size and attributes.

Signed-off-by: Maxwell Bland <mbland@motorola.com>
---
 arch/arm64/mm/ptdump.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
index 8f0b459c13ed..2ec16b523043 100644
--- a/arch/arm64/mm/ptdump.c
+++ b/arch/arm64/mm/ptdump.c
@@ -336,6 +336,10 @@ static void note_page(struct ptdump_state *pt_st, unsigned long addr, int level,
 		   addr >= st->marker[1].start_address) {
 		const char *unit = units;
 		unsigned long delta;
+		unsigned int i;
+
+		for (i = 0; i < st->level; i++)
+			pt_dump_seq_printf(st->seq, "  ");
 
 		if (st->current_prot) {
 			note_prot_uxn(st, addr);
@@ -362,6 +366,10 @@ static void note_page(struct ptdump_state *pt_st, unsigned long addr, int level,
 					   st->start_address, addr);
 		}
 
+		/* Align region information regardlesss of level */
+		for (i = st->level; i < 4; i++)
+			pt_dump_seq_printf(st->seq, "  ");
+
 		delta >>= 10;
 		while (!(delta & 1023) && unit[1]) {
 			delta >>= 10;
@@ -369,6 +377,7 @@ static void note_page(struct ptdump_state *pt_st, unsigned long addr, int level,
 		}
 		pt_dump_seq_printf(st->seq, "%9lu%c %s", delta, *unit,
 				   pg_level[st->level].name);
+
 		if (st->current_prot && pg_level[st->level].bits)
 			dump_prot(st, pg_level[st->level].bits,
 				  pg_level[st->level].num);
-- 
2.39.2




^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v4 4/5] arm64: exclusive upper bound for ptdump entries
  2024-06-18 14:37 [PATCH v4 0/5] ptdump: add intermediate directory support Maxwell Bland
                   ` (2 preceding siblings ...)
  2024-06-18 14:42 ` [PATCH v4 3/5] arm64: indent ptdump by level, aligning attributes Maxwell Bland
@ 2024-06-18 14:42 ` Maxwell Bland
  2024-06-18 14:43 ` [PATCH v4 5/5] arm64: add attrs and format to ptdump document Maxwell Bland
  4 siblings, 0 replies; 10+ messages in thread
From: Maxwell Bland @ 2024-06-18 14:42 UTC (permalink / raw)
  To: linux-mm
  Cc: Catalin Marinas, Will Deacon, Jonathan Corbet, Andrew Morton,
	Ard Biesheuvel, Mark Rutland, Christophe Leroy, Maxwell Bland,
	Alexandre Ghiti, linux-arm-kernel, linux-doc, linux-kernel

Update the upper bound of all ptdump entries to not include the byte
which is actually governed by the next entry. As the lowest byte is
included and governed, this makes the size specifications exact.

Signed-off-by: Maxwell Bland <mbland@motorola.com>
---
 arch/arm64/mm/ptdump.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
index 2ec16b523043..63f17c08c406 100644
--- a/arch/arm64/mm/ptdump.c
+++ b/arch/arm64/mm/ptdump.c
@@ -359,11 +359,11 @@ static void note_page(struct ptdump_state *pt_st, unsigned long addr, int level,
 			else
 				delta = pg_level[st->level].size;
 			pt_dump_seq_printf(st->seq, "0x%016lx-0x%016lx   ",
-					   addr, addr + delta);
+					   addr, addr + delta - 1);
 		} else {
 			delta = (addr - st->start_address);
 			pt_dump_seq_printf(st->seq, "0x%016lx-0x%016lx   ",
-					   st->start_address, addr);
+					   st->start_address, addr - 1);
 		}
 
 		/* Align region information regardlesss of level */
-- 
2.39.2




^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v4 5/5] arm64: add attrs and format to ptdump document
  2024-06-18 14:37 [PATCH v4 0/5] ptdump: add intermediate directory support Maxwell Bland
                   ` (3 preceding siblings ...)
  2024-06-18 14:42 ` [PATCH v4 4/5] arm64: exclusive upper bound for ptdump entries Maxwell Bland
@ 2024-06-18 14:43 ` Maxwell Bland
  2024-06-18 23:06   ` Randy Dunlap
  4 siblings, 1 reply; 10+ messages in thread
From: Maxwell Bland @ 2024-06-18 14:43 UTC (permalink / raw)
  To: linux-mm
  Cc: Catalin Marinas, Will Deacon, Jonathan Corbet, Andrew Morton,
	Ard Biesheuvel, Mark Rutland, Christophe Leroy, Maxwell Bland,
	Alexandre Ghiti, linux-arm-kernel, linux-doc, linux-kernel

Update the ptdump content with a precise explanation of the attribute
symbols and the identical-entry coalescing implicit in the code.

Remove unnecessary layout example given the existing cat example,
and opt instead for a precise, clear explantination of address markers,
format, attributes.

Update example to match the new cosmetic and intermediate-directory
printing changes.

Signed-off-by: Maxwell Bland <mbland@motorola.com>
---
 Documentation/arch/arm64/ptdump.rst | 126 ++++++++++++++--------------
 1 file changed, 61 insertions(+), 65 deletions(-)

diff --git a/Documentation/arch/arm64/ptdump.rst b/Documentation/arch/arm64/ptdump.rst
index 5dcfc5d7cddf..fee7600dd4d1 100644
--- a/Documentation/arch/arm64/ptdump.rst
+++ b/Documentation/arch/arm64/ptdump.rst
@@ -29,68 +29,64 @@ configurations and mount debugfs::
  mount -t debugfs nodev /sys/kernel/debug
  cat /sys/kernel/debug/kernel_page_tables
 
-On analysing the output of ``cat /sys/kernel/debug/kernel_page_tables``
-one can derive information about the virtual address range of the entry,
-followed by size of the memory region covered by this entry, the
-hierarchical structure of the page tables and finally the attributes
-associated with each page. The page attributes provide information about
-access permissions, execution capability, type of mapping such as leaf
-level PTE or block level PGD, PMD and PUD, and access status of a page
-within the kernel memory. Assessing these attributes can assist in
-understanding the memory layout, access patterns and security
-characteristics of the kernel pages.
-
-Kernel virtual memory layout example::
-
- start address        end address         size             attributes
- +---------------------------------------------------------------------------------------+
- | ---[ Linear Mapping start ]---------------------------------------------------------- |
- | ..................                                                                    |
- | 0xfff0000000000000-0xfff0000000210000  2112K PTE RW NX SHD AF  UXN  MEM/NORMAL-TAGGED |
- | 0xfff0000000210000-0xfff0000001c00000 26560K PTE ro NX SHD AF  UXN  MEM/NORMAL        |
- | ..................                                                                    |
- | ---[ Linear Mapping end ]------------------------------------------------------------ |
- +---------------------------------------------------------------------------------------+
- | ---[ Modules start ]----------------------------------------------------------------- |
- | ..................                                                                    |
- | 0xffff800000000000-0xffff800008000000   128M PTE                                      |
- | ..................                                                                    |
- | ---[ Modules end ]------------------------------------------------------------------- |
- +---------------------------------------------------------------------------------------+
- | ---[ vmalloc() area ]---------------------------------------------------------------- |
- | ..................                                                                    |
- | 0xffff800008010000-0xffff800008200000  1984K PTE ro x  SHD AF       UXN  MEM/NORMAL   |
- | 0xffff800008200000-0xffff800008e00000    12M PTE ro x  SHD AF  CON  UXN  MEM/NORMAL   |
- | ..................                                                                    |
- | ---[ vmalloc() end ]----------------------------------------------------------------- |
- +---------------------------------------------------------------------------------------+
- | ---[ Fixmap start ]------------------------------------------------------------------ |
- | ..................                                                                    |
- | 0xfffffbfffdb80000-0xfffffbfffdb90000    64K PTE ro x  SHD AF  UXN  MEM/NORMAL        |
- | 0xfffffbfffdb90000-0xfffffbfffdba0000    64K PTE ro NX SHD AF  UXN  MEM/NORMAL        |
- | ..................                                                                    |
- | ---[ Fixmap end ]-------------------------------------------------------------------- |
- +---------------------------------------------------------------------------------------+
- | ---[ PCI I/O start ]----------------------------------------------------------------- |
- | ..................                                                                    |
- | 0xfffffbfffe800000-0xfffffbffff800000    16M PTE                                      |
- | ..................                                                                    |
- | ---[ PCI I/O end ]------------------------------------------------------------------- |
- +---------------------------------------------------------------------------------------+
- | ---[ vmemmap start ]----------------------------------------------------------------- |
- | ..................                                                                    |
- | 0xfffffc0002000000-0xfffffc0002200000     2M PTE RW NX SHD AF  UXN  MEM/NORMAL        |
- | 0xfffffc0002200000-0xfffffc0020000000   478M PTE                                      |
- | ..................                                                                    |
- | ---[ vmemmap end ]------------------------------------------------------------------- |
- +---------------------------------------------------------------------------------------+
-
-``cat /sys/kernel/debug/kernel_page_tables`` output::
-
- 0xfff0000001c00000-0xfff0000080000000     2020M PTE  RW NX SHD AF   UXN    MEM/NORMAL-TAGGED
- 0xfff0000080000000-0xfff0000800000000       30G PMD
- 0xfff0000800000000-0xfff0000800700000        7M PTE  RW NX SHD AF   UXN    MEM/NORMAL-TAGGED
- 0xfff0000800700000-0xfff0000800710000       64K PTE  ro NX SHD AF   UXN    MEM/NORMAL-TAGGED
- 0xfff0000800710000-0xfff0000880000000  2089920K PTE  RW NX SHD AF   UXN    MEM/NORMAL-TAGGED
- 0xfff0000880000000-0xfff0040000000000     4062G PMD
- 0xfff0040000000000-0xffff800000000000     3964T PGD
+``/sys/kernel/debug/kernel_page_tables`` provides a line of information
+for each group of page table entries sharing the same attributes and
+type of mapping, i.e. leaf level PTE or block level PGD, PMD, and PUD.
+Assessing these attributes can assist in determining memory layout,
+access patterns and security characteristics of the kernel pages.
+
+Lines are formatted as follows::
+
+ <start_vaddr>-<end_vaddr> <size> <type> <attributes>
+
+Note that the set of attributes, and therefore formatting, is not
+equivalent between leaf and non-leaf entries. For example, PMD entries
+can support the PXNTable permission bit and do not share that same set
+of attributes as leaf level PTE entries.
+
+The following attributes are presently supported::
+
+F		Entry is invalid
+USER		Memory is user mapped
+ro		Memory is read-only
+RW		Memory is read-write
+NX		Memory is privileged execute never
+x               Memory is privileged executable
+SHD		Memory is shared
+AF		Entry accessed flag is set
+NG		Entry Not-Global flag is set
+CON		Entry contiguous bit is set
+UXN		Memory is unprivileged execute never
+GP		Memory supports BTI
+TBL		Entry is a table descriptor
+BLK		Entry is a block descriptor
+NXTbl		Entry's referenced table is PXN
+UXNTbl		Entry's referenced table is unprivileged execute never
+DEVICE/*	Entry is device memory, see ARM reference for types
+MEM/*		Entry is non-device memory, see ARM reference for types
+
+The beginning and end of each region is also delineated by a single line
+tag in the following format::
+
+ ---[ <marker_name> ]---
+
+With supported address markers including the kernel's linear mapping,
+kasan shadow memory, kernel modules memory, vmalloc memory, PCI I/O
+memory, and the kernel's fixmap region.
+
+Example ``cat /sys/kernel/debug/kernel_page_tables`` output::
+
+---[ Linear Mapping start ]---
+0xffff000000000000-0xffff31ffffffffff                  50T PGD
+0xffff320000000000-0xffffffffffffffff                 206T PGD   TBL     RW               x      NXTbl UXNTbl    MEM/NORMAL
+    0xffff320000000000-0xffff3251ffffffff             328G PUD
+    0xffff325200000000-0xffff32523fffffff               1G PUD   TBL     RW               x      NXTbl UXNTbl    MEM/NORMAL
+      0xffff325200000000-0xffff3252001fffff             2M PMD   TBL     RW               x      NXTbl UXNTbl    MEM/NORMAL
+        0xffff325200000000-0xffff3252001fffff           2M PTE       RW NX SHD AF NG     UXN    MEM/NORMAL-TAGGED
+      0xffff325200200000-0xffff3252003fffff             2M PMD   TBL     RW               x      NXTbl UXNTbl    MEM/NORMAL
+        0xffff325200200000-0xffff32520020ffff          64K PTE       RW NX SHD AF NG     UXN    MEM/NORMAL-TAGGED
+        0xffff325200210000-0xffff3252003fffff        1984K PTE       ro NX SHD AF NG     UXN    MEM/NORMAL
+      0xffff325200400000-0xffff325201dfffff            26M PMD   BLK     ro SHD AF NG     NX UXN                 MEM/NORMAL
+      0xffff325201e00000-0xffff325201ffffff             2M PMD   TBL     RW               x      NXTbl UXNTbl    MEM/NORMAL
+        0xffff325201e00000-0xffff325201e0ffff          64K PTE       ro NX SHD AF NG     UXN    MEM/NORMAL
+        0xffff325201e10000-0xffff325201ffffff        1984K PTE       RW NX SHD AF NG     UXN    MEM/NORMAL-TAGGED
-- 
2.39.2




^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v4 2/5] arm64: non leaf ptdump support
  2024-06-18 14:40 ` [PATCH v4 2/5] arm64: non leaf ptdump support Maxwell Bland
@ 2024-06-18 14:59   ` Ard Biesheuvel
  2024-06-18 16:55     ` Maxwell Bland
  0 siblings, 1 reply; 10+ messages in thread
From: Ard Biesheuvel @ 2024-06-18 14:59 UTC (permalink / raw)
  To: Maxwell Bland
  Cc: linux-mm, Catalin Marinas, Will Deacon, Jonathan Corbet,
	Andrew Morton, Mark Rutland, Christophe Leroy, Alexandre Ghiti,
	linux-arm-kernel, linux-doc, linux-kernel

On Tue, 18 Jun 2024 at 16:40, Maxwell Bland <mbland@motorola.com> wrote:
>
> Separate the pte_bits used in ptdump from pxd_bits used by pmd, p4d,
> pud, and pgd descriptors, thereby adding support for printing key
> intermediate directory protection bits, such as PXNTable, and enable the
> associated support Kconfig option.
>
> Signed-off-by: Maxwell Bland <mbland@motorola.com>
> ---
>  arch/arm64/Kconfig     |   1 +
>  arch/arm64/mm/ptdump.c | 140 ++++++++++++++++++++++++++++++++++++-----
>  2 files changed, 125 insertions(+), 16 deletions(-)
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 5d91259ee7b5..f4c3290160db 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -98,6 +98,7 @@ config ARM64
>         select ARCH_SUPPORTS_NUMA_BALANCING
>         select ARCH_SUPPORTS_PAGE_TABLE_CHECK
>         select ARCH_SUPPORTS_PER_VMA_LOCK
> +       select ARCH_SUPPORTS_NON_LEAF_PTDUMP
>         select ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH
>         select ARCH_WANT_COMPAT_IPC_PARSE_VERSION if COMPAT
>         select ARCH_WANT_DEFAULT_BPF_JIT
> diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
> index 6986827e0d64..8f0b459c13ed 100644
> --- a/arch/arm64/mm/ptdump.c
> +++ b/arch/arm64/mm/ptdump.c
> @@ -24,6 +24,7 @@
>  #include <asm/memory.h>
>  #include <asm/pgtable-hwdef.h>
>  #include <asm/ptdump.h>
> +#include <asm/pgalloc.h>
>
>
>  #define pt_dump_seq_printf(m, fmt, args...)    \
> @@ -105,11 +106,6 @@ static const struct prot_bits pte_bits[] = {
>                 .val    = PTE_CONT,
>                 .set    = "CON",
>                 .clear  = "   ",
> -       }, {
> -               .mask   = PTE_TABLE_BIT,
> -               .val    = PTE_TABLE_BIT,
> -               .set    = "   ",
> -               .clear  = "BLK",
>         }, {
>                 .mask   = PTE_UXN,
>                 .val    = PTE_UXN,
> @@ -143,34 +139,129 @@ static const struct prot_bits pte_bits[] = {
>         }
>  };
>
> +static const struct prot_bits pxd_bits[] = {

This table will need to distinguish between table and block entries.
In your sample output, I see

2M PMD   TBL     RW               x            UXNTbl    MEM/NORMAL

for a table entry, which includes a memory type and access permissions
based on descriptor fields that are not used for table descriptors.

Some other attributes listed below are equally inapplicable to table
entries, but happen to be 0x0 so they don't appear in the output, but
they would if the IGNORED bit in the descriptor happened to be set.

So I suspect that the distinction pte_bits <-> pxd_bits is not so
useful here. It would be better to have tbl_bits[], with pointers to
it in the pg_level array, where the PTE level one is set to NULL.


> +       {
> +               .mask   = PMD_SECT_VALID,
> +               .val    = PMD_SECT_VALID,
> +               .set    = " ",
> +               .clear  = "F",
> +       }, {
> +               .mask   = PMD_TABLE_BIT,
> +               .val    = PMD_TABLE_BIT,
> +               .set    = "TBL",
> +               .clear  = "BLK",
> +       }, {
> +               .mask   = PMD_SECT_USER,
> +               .val    = PMD_SECT_USER,
> +               .set    = "USR",
> +               .clear  = "   ",
> +       }, {
> +               .mask   = PMD_SECT_RDONLY,
> +               .val    = PMD_SECT_RDONLY,
> +               .set    = "ro",
> +               .clear  = "RW",
> +       }, {
> +               .mask   = PMD_SECT_S,
> +               .val    = PMD_SECT_S,
> +               .set    = "SHD",
> +               .clear  = "   ",
> +       }, {
> +               .mask   = PMD_SECT_AF,
> +               .val    = PMD_SECT_AF,
> +               .set    = "AF",
> +               .clear  = "  ",
> +       }, {
> +               .mask   = PMD_SECT_NG,
> +               .val    = PMD_SECT_NG,
> +               .set    = "NG",
> +               .clear  = "  ",
> +       }, {
> +               .mask   = PMD_SECT_CONT,
> +               .val    = PMD_SECT_CONT,
> +               .set    = "CON",
> +               .clear  = "   ",
> +       }, {
> +               .mask   = PMD_SECT_PXN,
> +               .val    = PMD_SECT_PXN,
> +               .set    = "NX",
> +               .clear  = "x ",
> +       }, {
> +               .mask   = PMD_SECT_UXN,
> +               .val    = PMD_SECT_UXN,
> +               .set    = "UXN",
> +               .clear  = "   ",
> +       }, {
> +               .mask   = PMD_TABLE_PXN,
> +               .val    = PMD_TABLE_PXN,
> +               .set    = "NXTbl",
> +               .clear  = "     ",
> +       }, {
> +               .mask   = PMD_TABLE_UXN,
> +               .val    = PMD_TABLE_UXN,
> +               .set    = "UXNTbl",
> +               .clear  = "      ",
> +       }, {
> +               .mask   = PTE_GP,
> +               .val    = PTE_GP,
> +               .set    = "GP",
> +               .clear  = "  ",
> +       }, {
> +               .mask   = PMD_ATTRINDX_MASK,
> +               .val    = PMD_ATTRINDX(MT_DEVICE_nGnRnE),
> +               .set    = "DEVICE/nGnRnE",
> +       }, {
> +               .mask   = PMD_ATTRINDX_MASK,
> +               .val    = PMD_ATTRINDX(MT_DEVICE_nGnRE),
> +               .set    = "DEVICE/nGnRE",
> +       }, {
> +               .mask   = PMD_ATTRINDX_MASK,
> +               .val    = PMD_ATTRINDX(MT_NORMAL_NC),
> +               .set    = "MEM/NORMAL-NC",
> +       }, {
> +               .mask   = PMD_ATTRINDX_MASK,
> +               .val    = PMD_ATTRINDX(MT_NORMAL),
> +               .set    = "MEM/NORMAL",
> +       }, {
> +               .mask   = PMD_ATTRINDX_MASK,
> +               .val    = PMD_ATTRINDX(MT_NORMAL_TAGGED),
> +               .set    = "MEM/NORMAL-TAGGED",
> +       }
> +};
> +
>  struct pg_level {
>         const struct prot_bits *bits;
>         char name[4];
>         int num;
>         u64 mask;
> +       unsigned long size;
>  };
>
>  static struct pg_level pg_level[] __ro_after_init = {
>         { /* pgd */
>                 .name   = "PGD",
> -               .bits   = pte_bits,
> -               .num    = ARRAY_SIZE(pte_bits),
> +               .bits   = pxd_bits,
> +               .num    = ARRAY_SIZE(pxd_bits),
> +               .size   = PGDIR_SIZE,
>         }, { /* p4d */
>                 .name   = "P4D",
> -               .bits   = pte_bits,
> -               .num    = ARRAY_SIZE(pte_bits),
> +               .bits   = pxd_bits,
> +               .num    = ARRAY_SIZE(pxd_bits),
> +               .size   = P4D_SIZE,
>         }, { /* pud */
>                 .name   = "PUD",
> -               .bits   = pte_bits,
> -               .num    = ARRAY_SIZE(pte_bits),
> +               .bits   = pxd_bits,
> +               .num    = ARRAY_SIZE(pxd_bits),
> +               .size   = PUD_SIZE,
>         }, { /* pmd */
>                 .name   = "PMD",
> -               .bits   = pte_bits,
> -               .num    = ARRAY_SIZE(pte_bits),
> +               .bits   = pxd_bits,
> +               .num    = ARRAY_SIZE(pxd_bits),
> +               .size   = PMD_SIZE,
>         }, { /* pte */
>                 .name   = "PTE",
>                 .bits   = pte_bits,
>                 .num    = ARRAY_SIZE(pte_bits),
> +               .size   = PAGE_SIZE
>         },
>  };
>
> @@ -251,10 +342,27 @@ static void note_page(struct ptdump_state *pt_st, unsigned long addr, int level,
>                         note_prot_wx(st, addr);
>                 }
>
> -               pt_dump_seq_printf(st->seq, "0x%016lx-0x%016lx   ",
> -                                  st->start_address, addr);
> +               /*
> +                * Non-leaf entries use a fixed size for their range
> +                * specification, whereas leaf entries are grouped by
> +                * attributes and may not have a range larger than the type
> +                * specifier.
> +                */
> +               if (st->start_address == addr) {
> +                       if (check_add_overflow(addr, pg_level[st->level].size,
> +                                              &delta))
> +                               delta = ULONG_MAX - addr + 1;
> +                       else
> +                               delta = pg_level[st->level].size;
> +                       pt_dump_seq_printf(st->seq, "0x%016lx-0x%016lx   ",
> +                                          addr, addr + delta);
> +               } else {
> +                       delta = (addr - st->start_address);
> +                       pt_dump_seq_printf(st->seq, "0x%016lx-0x%016lx   ",
> +                                          st->start_address, addr);
> +               }
>
> -               delta = (addr - st->start_address) >> 10;
> +               delta >>= 10;
>                 while (!(delta & 1023) && unit[1]) {
>                         delta >>= 10;
>                         unit++;
> --
> 2.39.2
>
>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v4 2/5] arm64: non leaf ptdump support
  2024-06-18 14:59   ` Ard Biesheuvel
@ 2024-06-18 16:55     ` Maxwell Bland
  0 siblings, 0 replies; 10+ messages in thread
From: Maxwell Bland @ 2024-06-18 16:55 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-mm, Catalin Marinas, Will Deacon, Jonathan Corbet,
	Andrew Morton, Mark Rutland, Christophe Leroy, Alexandre Ghiti,
	linux-arm-kernel, linux-doc, linux-kernel

On Tue, Jun 18, 2024 at 04:59:22PM GMT, Ard Biesheuvel wrote:
> On Tue, 18 Jun 2024 at 16:40, Maxwell Bland <mbland@motorola.com> wrote:
> > @@ -105,11 +106,6 @@ static const struct prot_bits pte_bits[] = {
> >                 .val    = PTE_CONT,
> >                 .set    = "CON",
> >                 .clear  = "   ",
> > -       }, {
> > -               .mask   = PTE_TABLE_BIT,
> > -               .val    = PTE_TABLE_BIT,
> > -               .set    = "   ",
> > -               .clear  = "BLK",
> >         }, {
> >                 .mask   = PTE_UXN,
> >                 .val    = PTE_UXN,
> This table will need to distinguish between table and block entries.
> 
> I suspect that the distinction pte_bits <-> pxd_bits is not so useful
> here. It would be better to have tbl_bits[], with pointers to it in
> the pg_level array, where the PTE level one is set to NULL.

Nice, thanks! Adding now. I'll slate a v5 release for next monday.

Maxwell


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v4 1/5] mm: add ARCH_SUPPORTS_NON_LEAF_PTDUMP
  2024-06-18 14:40 ` [PATCH v4 1/5] mm: add ARCH_SUPPORTS_NON_LEAF_PTDUMP Maxwell Bland
@ 2024-06-18 18:38   ` LEROY Christophe
  0 siblings, 0 replies; 10+ messages in thread
From: LEROY Christophe @ 2024-06-18 18:38 UTC (permalink / raw)
  To: Maxwell Bland, linux-mm
  Cc: Catalin Marinas, Will Deacon, Jonathan Corbet, Andrew Morton,
	Ard Biesheuvel, Mark Rutland, Alexandre Ghiti, linux-arm-kernel,
	linux-doc, linux-kernel



Le 18/06/2024 à 16:40, Maxwell Bland a écrit :
> [Vous ne recevez pas souvent de courriers de mbland@motorola.com. D?couvrez pourquoi ceci est important ? https://aka.ms/LearnAboutSenderIdentification ]
> 
> Provide a Kconfig option indicating if note_page can be called for
> intermediate page directories during ptdump.
> 
> Signed-off-by: Maxwell Bland <mbland@motorola.com>
> ---
>   mm/Kconfig.debug |  9 +++++++++
>   mm/ptdump.c      | 21 +++++++++++++--------
>   2 files changed, 22 insertions(+), 8 deletions(-)
> 
> diff --git a/mm/Kconfig.debug b/mm/Kconfig.debug
> index afc72fde0f03..6af5ecfdef93 100644
> --- a/mm/Kconfig.debug
> +++ b/mm/Kconfig.debug
> @@ -201,6 +201,15 @@ config PTDUMP_DEBUGFS
> 
>            If in doubt, say N.
> 
> +config ARCH_SUPPORTS_NON_LEAF_PTDUMP
> +       bool "Include intermediate directory entries in pagetable dumps"
> +       default n
> +       help
> +         Enable the inclusion of intermediate page directory entries in calls
> +         to the ptdump API. Once an architecture defines correct ptdump
> +         behavior for PGD, PUD, P4D, and PMD entries, this config can be
> +         selected.
> +
>   config HAVE_DEBUG_KMEMLEAK
>          bool
> 
> diff --git a/mm/ptdump.c b/mm/ptdump.c
> index 106e1d66e9f9..6180708669fe 100644
> --- a/mm/ptdump.c
> +++ b/mm/ptdump.c
> @@ -41,10 +41,11 @@ static int ptdump_pgd_entry(pgd_t *pgd, unsigned long addr,
>          if (st->effective_prot)
>                  st->effective_prot(st, 0, pgd_val(val));
> 
> -       if (pgd_leaf(val)) {
> +       if (IS_ENABLED(CONFIG_ARCH_SUPPORTS_NON_LEAF_PTDUMP) || pgd_leaf(val))
>                  st->note_page(st, addr, 0, pgd_val(val));
> +
> +       if (pgd_leaf(val))
>                  walk->action = ACTION_CONTINUE;
> -       }
> 
>          return 0;
>   }
> @@ -64,10 +65,11 @@ static int ptdump_p4d_entry(p4d_t *p4d, unsigned long addr,
>          if (st->effective_prot)
>                  st->effective_prot(st, 1, p4d_val(val));
> 
> -       if (p4d_leaf(val)) {
> +       if (IS_ENABLED(CONFIG_ARCH_SUPPORTS_NON_LEAF_PTDUMP) || pgd_leaf(val))

Don't you mean p4d_leaf() here instead of pgd_leaf() ?

>                  st->note_page(st, addr, 1, p4d_val(val));
> +
> +       if (p4d_leaf(val))
>                  walk->action = ACTION_CONTINUE;
> -       }
> 
>          return 0;
>   }
> @@ -87,10 +89,11 @@ static int ptdump_pud_entry(pud_t *pud, unsigned long addr,
>          if (st->effective_prot)
>                  st->effective_prot(st, 2, pud_val(val));
> 
> -       if (pud_leaf(val)) {
> +       if (IS_ENABLED(CONFIG_ARCH_SUPPORTS_NON_LEAF_PTDUMP) || pgd_leaf(val))

Don't you mean pud_leaf() here instead of pgd_leaf() ?

>                  st->note_page(st, addr, 2, pud_val(val));
> +
> +       if (pud_leaf(val))
>                  walk->action = ACTION_CONTINUE;
> -       }
> 
>          return 0;
>   }
> @@ -108,10 +111,12 @@ static int ptdump_pmd_entry(pmd_t *pmd, unsigned long addr,
> 
>          if (st->effective_prot)
>                  st->effective_prot(st, 3, pmd_val(val));
> -       if (pmd_leaf(val)) {
> +
> +       if (IS_ENABLED(CONFIG_ARCH_SUPPORTS_NON_LEAF_PTDUMP) || pgd_leaf(val))

Don't you mean pmd_leaf() here instead of pgd_leaf() ?

>                  st->note_page(st, addr, 3, pmd_val(val));
> +
> +       if (pmd_leaf(val))
>                  walk->action = ACTION_CONTINUE;
> -       }
> 
>          return 0;
>   }
> --
> 2.39.2
> 
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v4 5/5] arm64: add attrs and format to ptdump document
  2024-06-18 14:43 ` [PATCH v4 5/5] arm64: add attrs and format to ptdump document Maxwell Bland
@ 2024-06-18 23:06   ` Randy Dunlap
  0 siblings, 0 replies; 10+ messages in thread
From: Randy Dunlap @ 2024-06-18 23:06 UTC (permalink / raw)
  To: Maxwell Bland, linux-mm
  Cc: Catalin Marinas, Will Deacon, Jonathan Corbet, Andrew Morton,
	Ard Biesheuvel, Mark Rutland, Christophe Leroy, Alexandre Ghiti,
	linux-arm-kernel, linux-doc, linux-kernel

Hi,-

On 6/18/24 7:43 AM, Maxwell Bland wrote:
> Update the ptdump content with a precise explanation of the attribute
> symbols and the identical-entry coalescing implicit in the code.
> 
> Remove unnecessary layout example given the existing cat example,
> and opt instead for a precise, clear explantination of address markers,

                                       explanation

> format, attributes.
> 
> Update example to match the new cosmetic and intermediate-directory
> printing changes.
> 
> Signed-off-by: Maxwell Bland <mbland@motorola.com>
> ---
>  Documentation/arch/arm64/ptdump.rst | 126 ++++++++++++++--------------
>  1 file changed, 61 insertions(+), 65 deletions(-)
> 
> diff --git a/Documentation/arch/arm64/ptdump.rst b/Documentation/arch/arm64/ptdump.rst
> index 5dcfc5d7cddf..fee7600dd4d1 100644
> --- a/Documentation/arch/arm64/ptdump.rst
> +++ b/Documentation/arch/arm64/ptdump.rst
> @@ -29,68 +29,64 @@ configurations and mount debugfs::
>   mount -t debugfs nodev /sys/kernel/debug
>   cat /sys/kernel/debug/kernel_page_tables
>  
> -On analysing the output of ``cat /sys/kernel/debug/kernel_page_tables``
> -one can derive information about the virtual address range of the entry,
> -followed by size of the memory region covered by this entry, the
> -hierarchical structure of the page tables and finally the attributes
> -associated with each page. The page attributes provide information about
> -access permissions, execution capability, type of mapping such as leaf
> -level PTE or block level PGD, PMD and PUD, and access status of a page
> -within the kernel memory. Assessing these attributes can assist in
> -understanding the memory layout, access patterns and security
> -characteristics of the kernel pages.
> -
> -Kernel virtual memory layout example::
> -
> - start address        end address         size             attributes
> - +---------------------------------------------------------------------------------------+
> - | ---[ Linear Mapping start ]---------------------------------------------------------- |
> - | ..................                                                                    |
> - | 0xfff0000000000000-0xfff0000000210000  2112K PTE RW NX SHD AF  UXN  MEM/NORMAL-TAGGED |
> - | 0xfff0000000210000-0xfff0000001c00000 26560K PTE ro NX SHD AF  UXN  MEM/NORMAL        |
> - | ..................                                                                    |
> - | ---[ Linear Mapping end ]------------------------------------------------------------ |
> - +---------------------------------------------------------------------------------------+
> - | ---[ Modules start ]----------------------------------------------------------------- |
> - | ..................                                                                    |
> - | 0xffff800000000000-0xffff800008000000   128M PTE                                      |
> - | ..................                                                                    |
> - | ---[ Modules end ]------------------------------------------------------------------- |
> - +---------------------------------------------------------------------------------------+
> - | ---[ vmalloc() area ]---------------------------------------------------------------- |
> - | ..................                                                                    |
> - | 0xffff800008010000-0xffff800008200000  1984K PTE ro x  SHD AF       UXN  MEM/NORMAL   |
> - | 0xffff800008200000-0xffff800008e00000    12M PTE ro x  SHD AF  CON  UXN  MEM/NORMAL   |
> - | ..................                                                                    |
> - | ---[ vmalloc() end ]----------------------------------------------------------------- |
> - +---------------------------------------------------------------------------------------+
> - | ---[ Fixmap start ]------------------------------------------------------------------ |
> - | ..................                                                                    |
> - | 0xfffffbfffdb80000-0xfffffbfffdb90000    64K PTE ro x  SHD AF  UXN  MEM/NORMAL        |
> - | 0xfffffbfffdb90000-0xfffffbfffdba0000    64K PTE ro NX SHD AF  UXN  MEM/NORMAL        |
> - | ..................                                                                    |
> - | ---[ Fixmap end ]-------------------------------------------------------------------- |
> - +---------------------------------------------------------------------------------------+
> - | ---[ PCI I/O start ]----------------------------------------------------------------- |
> - | ..................                                                                    |
> - | 0xfffffbfffe800000-0xfffffbffff800000    16M PTE                                      |
> - | ..................                                                                    |
> - | ---[ PCI I/O end ]------------------------------------------------------------------- |
> - +---------------------------------------------------------------------------------------+
> - | ---[ vmemmap start ]----------------------------------------------------------------- |
> - | ..................                                                                    |
> - | 0xfffffc0002000000-0xfffffc0002200000     2M PTE RW NX SHD AF  UXN  MEM/NORMAL        |
> - | 0xfffffc0002200000-0xfffffc0020000000   478M PTE                                      |
> - | ..................                                                                    |
> - | ---[ vmemmap end ]------------------------------------------------------------------- |
> - +---------------------------------------------------------------------------------------+
> -
> -``cat /sys/kernel/debug/kernel_page_tables`` output::
> -
> - 0xfff0000001c00000-0xfff0000080000000     2020M PTE  RW NX SHD AF   UXN    MEM/NORMAL-TAGGED
> - 0xfff0000080000000-0xfff0000800000000       30G PMD
> - 0xfff0000800000000-0xfff0000800700000        7M PTE  RW NX SHD AF   UXN    MEM/NORMAL-TAGGED
> - 0xfff0000800700000-0xfff0000800710000       64K PTE  ro NX SHD AF   UXN    MEM/NORMAL-TAGGED
> - 0xfff0000800710000-0xfff0000880000000  2089920K PTE  RW NX SHD AF   UXN    MEM/NORMAL-TAGGED
> - 0xfff0000880000000-0xfff0040000000000     4062G PMD
> - 0xfff0040000000000-0xffff800000000000     3964T PGD
> +``/sys/kernel/debug/kernel_page_tables`` provides a line of information
> +for each group of page table entries sharing the same attributes and
> +type of mapping, i.e. leaf level PTE or block level PGD, PMD, and PUD.
> +Assessing these attributes can assist in determining memory layout,
> +access patterns and security characteristics of the kernel pages.
> +
> +Lines are formatted as follows::
> +
> + <start_vaddr>-<end_vaddr> <size> <type> <attributes>
> +
> +Note that the set of attributes, and therefore formatting, is not
> +equivalent between leaf and non-leaf entries. For example, PMD entries
> +can support the PXNTable permission bit and do not share that same set
> +of attributes as leaf level PTE entries.
> +
> +The following attributes are presently supported::
> +
> +F		Entry is invalid
> +USER		Memory is user mapped
> +ro		Memory is read-only
> +RW		Memory is read-write
> +NX		Memory is privileged execute never
> +x               Memory is privileged executable

Please use tabs above for indentation, like the other lines.

Why lower case x and ro but upper case for the others?

> +SHD		Memory is shared
> +AF		Entry accessed flag is set
> +NG		Entry Not-Global flag is set
> +CON		Entry contiguous bit is set
> +UXN		Memory is unprivileged execute never
> +GP		Memory supports BTI

Most of the abbreviations make some sense, but not that one (IMHO). ;)

> +TBL		Entry is a table descriptor
> +BLK		Entry is a block descriptor
> +NXTbl		Entry's referenced table is PXN
> +UXNTbl		Entry's referenced table is unprivileged execute never
> +DEVICE/*	Entry is device memory, see ARM reference for types
> +MEM/*		Entry is non-device memory, see ARM reference for types
> +
> +The beginning and end of each region is also delineated by a single line
> +tag in the following format::
> +
> + ---[ <marker_name> ]---
> +
> +With supported address markers including the kernel's linear mapping,
> +kasan shadow memory, kernel modules memory, vmalloc memory, PCI I/O
> +memory, and the kernel's fixmap region.
> +
> +Example ``cat /sys/kernel/debug/kernel_page_tables`` output::
> +
> +---[ Linear Mapping start ]---
> +0xffff000000000000-0xffff31ffffffffff                  50T PGD
> +0xffff320000000000-0xffffffffffffffff                 206T PGD   TBL     RW               x      NXTbl UXNTbl    MEM/NORMAL
> +    0xffff320000000000-0xffff3251ffffffff             328G PUD
> +    0xffff325200000000-0xffff32523fffffff               1G PUD   TBL     RW               x      NXTbl UXNTbl    MEM/NORMAL
> +      0xffff325200000000-0xffff3252001fffff             2M PMD   TBL     RW               x      NXTbl UXNTbl    MEM/NORMAL
> +        0xffff325200000000-0xffff3252001fffff           2M PTE       RW NX SHD AF NG     UXN    MEM/NORMAL-TAGGED
> +      0xffff325200200000-0xffff3252003fffff             2M PMD   TBL     RW               x      NXTbl UXNTbl    MEM/NORMAL
> +        0xffff325200200000-0xffff32520020ffff          64K PTE       RW NX SHD AF NG     UXN    MEM/NORMAL-TAGGED
> +        0xffff325200210000-0xffff3252003fffff        1984K PTE       ro NX SHD AF NG     UXN    MEM/NORMAL
> +      0xffff325200400000-0xffff325201dfffff            26M PMD   BLK     ro SHD AF NG     NX UXN                 MEM/NORMAL
> +      0xffff325201e00000-0xffff325201ffffff             2M PMD   TBL     RW               x      NXTbl UXNTbl    MEM/NORMAL
> +        0xffff325201e00000-0xffff325201e0ffff          64K PTE       ro NX SHD AF NG     UXN    MEM/NORMAL
> +        0xffff325201e10000-0xffff325201ffffff        1984K PTE       RW NX SHD AF NG     UXN    MEM/NORMAL-TAGGED

-- 
thanks.
~Randy


^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2024-06-18 23:07 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-06-18 14:37 [PATCH v4 0/5] ptdump: add intermediate directory support Maxwell Bland
2024-06-18 14:40 ` [PATCH v4 1/5] mm: add ARCH_SUPPORTS_NON_LEAF_PTDUMP Maxwell Bland
2024-06-18 18:38   ` LEROY Christophe
2024-06-18 14:40 ` [PATCH v4 2/5] arm64: non leaf ptdump support Maxwell Bland
2024-06-18 14:59   ` Ard Biesheuvel
2024-06-18 16:55     ` Maxwell Bland
2024-06-18 14:42 ` [PATCH v4 3/5] arm64: indent ptdump by level, aligning attributes Maxwell Bland
2024-06-18 14:42 ` [PATCH v4 4/5] arm64: exclusive upper bound for ptdump entries Maxwell Bland
2024-06-18 14:43 ` [PATCH v4 5/5] arm64: add attrs and format to ptdump document Maxwell Bland
2024-06-18 23:06   ` Randy Dunlap

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox