* [PATCH man-pages v4] madvise.2: add documentation for MADV_COLLAPSE
@ 2022-10-31 22:55 Zach OKeefe
2022-10-31 23:36 ` Alejandro Colomar
0 siblings, 1 reply; 5+ messages in thread
From: Zach OKeefe @ 2022-10-31 22:55 UTC (permalink / raw)
To: Alejandro Colomar, Michael Kerrisk
Cc: Yang Shi, linux-mm, linux-man, Zach O'Keefe
From: Zach O'Keefe <zokeefe@google.com>
Linux 6.1 introduced MADV_COLLAPSE in upstream commit 7d8faaf15545
("mm/madvise: introduce MADV_COLLAPSE sync hugepage collapse") and
upstream commit 34488399fa08 ("mm/madvise: add file and shmem support to
MADV_COLLAPSE"). Update the man-pages for madvise(2) and
process_madvise(2).
Link: https://lore.kernel.org/linux-mm/20220922224046.1143204-1-zokeefe@google.com/
Link: https://lore.kernel.org/linux-mm/20220706235936.2197195-1-zokeefe@google.com/
Signed-off-by: Zach O'Keefe <zokeefe@google.com>
---
v3[1] -> v4
- Rebased to latest master
- (Alejandro Colomar) Fixed weird, non-ascii chars: e2 80 99 -> "'"
- (Alejandro Colomar) Replaced .BR with .B directive when the entire
line was bold (no non-bold part)
[1] https://lore.kernel.org/linux-man/bb3b5c3c-3966-ea1a-6d84-4f7f3afa37ca@gmail.com/T/#u
man2/madvise | 0
man2/madvise.2 | 90 +++++++++++++++++++++++++++++++++++++++++-
man2/process_madvise.2 | 10 +++++
3 files changed, 98 insertions(+), 2 deletions(-)
create mode 100644 man2/madvise
diff --git a/man2/madvise b/man2/madvise
new file mode 100644
index 000000000..e69de29bb
diff --git a/man2/madvise.2 b/man2/madvise.2
index edf805740..dca42c7d6 100644
--- a/man2/madvise.2
+++ b/man2/madvise.2
@@ -386,9 +386,10 @@ set (see
.BR prctl (2)).
.IP
The
-.B MADV_HUGEPAGE
+.BR MADV_HUGEPAGE ,
+.BR MADV_NOHUGEPAGE ,
and
-.B MADV_NOHUGEPAGE
+.B MADV_COLLAPSE
operations are available only if the kernel was configured with
.B CONFIG_TRANSPARENT_HUGEPAGE
and file/shmem memory is only supported if the kernel was configured with
@@ -401,6 +402,81 @@ and
.I length
will not be backed by transparent hugepages.
.TP
+.BR MADV_COLLAPSE " (since Linux 6.1)"
+.\" commit 7d8faaf155454f8798ec56404faca29a82689c77
+.\" commit 34488399fa08faaf664743fa54b271eb6f9e1321
+Perform a best-effort synchronous collapse of the native pages mapped by the
+memory range into Transparent Huge Pages (THPs).
+.B MADV_COLLAPSE
+operates on the current state of memory of the calling process and makes no
+persistent changes or guarantees on how pages will be mapped,
+constructed,
+or faulted in the future.
+.IP
+.B MADV_COLLAPSE
+supports private anonymous pages (see
+.BR mmap (2)),
+shmem pages,
+and file-backed pages.
+See
+.B MADV_HUGEPAGE
+for general information on memory requirements for THP.
+If the range provided spans multiple VMAs,
+the semantics of the collapse over each VMA is independent from the others.
+If collapse of a given huge page-aligned/sized region fails,
+the operation may continue to attempt collapsing the remainder of the
+specified memory.
+.B MADV_COLLAPSE
+will automatically clamp the provided range to be hugepage-aligned.
+.IP
+All non-resident pages covered by the range will first be
+swapped/faulted-in,
+before being copied onto a freshly allocated hugepage.
+If the native pages compose the same PTE-mapped hugepage,
+and are suitably aligned,
+allocation of a new hugepage may be elided and collapse may happen
+in-place.
+Unmapped pages will have their data directly initialized to 0 in the new
+hugepage.
+However,
+for every eligible hugepage-aligned/sized region to be collapsed,
+at least one page must currently be backed by physical memory.
+.IP
+.B MADV_COLLAPSE
+is independent of any sysfs
+(see
+.BR sysfs (5))
+setting under
+.IR /sys/kernel/mm/transparent_hugepage ,
+both in terms of determining THP eligibility,
+and allocation semantics.
+See Linux kernel source file
+.I Documentation/admin\-guide/mm/transhuge.rst
+for more information.
+.B MADV_COLLAPSE
+also ignores
+.B huge=
+tmpfs mount when operating on tmpfs files.
+Allocation for the new hugepage may enter direct reclaim and/or compaction,
+regardless of VMA flags
+(though
+.B VM_NOHUGEPAGE
+is still respected).
+.IP
+When the system has multiple NUMA nodes,
+the hugepage will be allocated from the node providing the most native
+pages.
+.IP
+If all hugepage-sized/aligned regions covered by the provided range were
+either successfully collapsed,
+or were already PMD-mapped THPs,
+this operation will be deemed successful.
+Note that this doesn't guarantee anything about other possible mappings of
+the memory.
+Also note that many failures might have occurred since the operation may
+continue to collapse in the event collapse of a single hugepage-sized/aligned
+region fails.
+.TP
.BR MADV_DONTDUMP " (since Linux 3.4)"
.\" commit 909af768e88867016f427264ae39d27a57b6a8ed
.\" commit accb61fe7bb0f5c2a4102239e4981650f9048519
@@ -620,6 +696,11 @@ A kernel resource was temporarily unavailable.
.B EBADF
The map exists, but the area maps something that isn't a file.
.TP
+.B EBUSY
+(for
+.BR MADV_COLLAPSE )
+Could not charge hugepage to cgroup: cgroup limit exceeded.
+.TP
.B EFAULT
.I advice
is
@@ -717,6 +798,11 @@ maximum resident set size.
Not enough memory: paging in failed.
.TP
.B ENOMEM
+(for
+.BR MADV_COLLAPSE )
+Not enough memory: could not allocate hugepage.
+.TP
+.B ENOMEM
Addresses in the specified range are not currently
mapped, or are outside the address space of the process.
.TP
diff --git a/man2/process_madvise.2 b/man2/process_madvise.2
index ac98850a9..92878286b 100644
--- a/man2/process_madvise.2
+++ b/man2/process_madvise.2
@@ -73,6 +73,10 @@ argument is one of the following values:
See
.BR madvise (2).
.TP
+.B MADV_COLLAPSE
+See
+.BR madvise (2).
+.TP
.B MADV_PAGEOUT
See
.BR madvise (2).
@@ -173,6 +177,12 @@ The caller does not have permission to access the address space of the process
.TP
.B ESRCH
The target process does not exist (i.e., it has terminated and been waited on).
+.PP
+See
+.BR madvise (2)
+for
+.IR advice -specific
+errors.
.SH VERSIONS
This system call first appeared in Linux 5.10.
.\" commit ecb8ac8b1f146915aa6b96449b66dd48984caacc
--
2.38.1.273.g43a17bfeac-goog
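
For readers who want to try the advice described in the patch, here is a minimal sketch; it is not part of the patch. Several things in it are assumptions rather than statements from the patch: a 2 MiB hugepage size (typical on x86-64, not guaranteed elsewhere), the reading of "clamp" as rounding the start up and the end down to hugepage boundaries, and the fallback definition of MADV_COLLAPSE as 25 (its value in the generic Linux UAPI headers) for C libraries that predate it. The helper names clamp_to_hugepage() and try_collapse() are made up for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25        /* generic Linux UAPI value; older libcs lack it */
#endif

/* Assumption: 2 MiB PMD-sized hugepages (common on x86-64). */
#define HPAGE_SIZE ((uintptr_t)2 << 20)

/*
 * Hypothetical helper: shrink [start, start + len) to the largest
 * hugepage-aligned subrange.  The start is rounded up and the end is
 * rounded down.  Returns the clamped length (0 if no whole hugepage
 * fits) and stores the clamped start in *out.
 */
size_t clamp_to_hugepage(uintptr_t start, size_t len, uintptr_t *out)
{
    assert(out != NULL);

    uintptr_t end = start + len;
    uintptr_t cstart = (start + HPAGE_SIZE - 1) & ~(HPAGE_SIZE - 1);
    uintptr_t cend = end & ~(HPAGE_SIZE - 1);

    *out = cstart;
    return cend > cstart ? (size_t)(cend - cstart) : 0;
}

/*
 * Hypothetical wrapper: ask the kernel to collapse the range.
 * Returns 0 on success or a negative errno value.  Per the patch,
 * ENOMEM and EBUSY are possible; EINVAL is what madvise(2) reports
 * for an advice value the running kernel does not know (pre-6.1).
 */
int try_collapse(void *addr, size_t len)
{
    if (madvise(addr, len, MADV_COLLAPSE) == 0)
        return 0;
    return -errno;
}
```

Since the patch text requires at least one page per hugepage-sized region to be backed by physical memory, a caller would normally touch the mapping before calling try_collapse(). A return of 0 only says that the calling process's view of the range is now PMD-mapped; as the patch notes, it promises nothing about future faults or other mappings.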
* Re: [PATCH man-pages v4] madvise.2: add documentation for MADV_COLLAPSE
2022-10-31 22:55 [PATCH man-pages v4] madvise.2: add documentation for MADV_COLLAPSE Zach OKeefe
@ 2022-10-31 23:36 ` Alejandro Colomar
2022-11-01 0:38 ` Zach O'Keefe
0 siblings, 1 reply; 5+ messages in thread
From: Alejandro Colomar @ 2022-10-31 23:36 UTC (permalink / raw)
To: Zach OKeefe; +Cc: Yang Shi, linux-mm, linux-man, Michael Kerrisk

Hey Zach!

On 10/31/22 23:55, Zach OKeefe wrote:
> From: Zach O'Keefe <zokeefe@google.com>
>
> Linux 6.1 introduced MADV_COLLAPSE in upstream commit 7d8faaf15545
> ("mm/madvise: introduce MADV_COLLAPSE sync hugepage collapse") and
> upstream commit 34488399fa08 ("mm/madvise: add file and shmem support to
> MADV_COLLAPSE"). Update the man-pages for madvise(2) and
> process_madvise(2).
>
> [...]

Okay, now I have some more comments:

- A few changes about semantic newlines. See a diff at the bottom of this email
that you can apply.

- An accident.

- Some paragraph I don't really understand.

Cheers,

Alex

> [...]
>
> diff --git a/man2/madvise b/man2/madvise
> new file mode 100644
> index 000000000..e69de29bb

Heh! This was a funny accident. I realized because autocomplete showed it as a
possibility. :)

The diff at the bottom removes it.

> diff --git a/man2/madvise.2 b/man2/madvise.2
> index edf805740..dca42c7d6 100644
> [...]
> +.IP
> +If all hugepage-sized/aligned regions covered by the provided range were
> +either successfully collapsed,
> +or were already PMD-mapped THPs,
> +this operation will be deemed successful.
> +Note that this doesn't guarantee anything about other possible mappings of
> +the memory.
> +Also note that many failures might have occurred since the operation may
> +continue to collapse in the event collapse of a single hugepage-sized/aligned
> +region fails.

I don't understand this last paragraph (since "Also note ..."). Could you
please reword it a little bit?

> [...]

Diff for changing a few line breaks (and removing the spurious file):

diff --git a/man2/madvise b/man2/madvise
deleted file mode 100644
index e69de29bb..000000000
diff --git a/man2/madvise.2 b/man2/madvise.2
index dca42c7d6..7f34301d3 100644
--- a/man2/madvise.2
+++ b/man2/madvise.2
@@ -405,11 +405,12 @@ .SS Linux-specific advice values
 .BR MADV_COLLAPSE " (since Linux 6.1)"
 .\" commit 7d8faaf155454f8798ec56404faca29a82689c77
 .\" commit 34488399fa08faaf664743fa54b271eb6f9e1321
-Perform a best-effort synchronous collapse of the native pages mapped by the
-memory range into Transparent Huge Pages (THPs).
+Perform a best-effort synchronous collapse of
+the native pages mapped by the memory range
+into Transparent Huge Pages (THPs).
 .B MADV_COLLAPSE
-operates on the current state of memory of the calling process and makes no
-persistent changes or guarantees on how pages will be mapped,
+operates on the current state of memory of the calling process and
+makes no persistent changes or guarantees on how pages will be mapped,
 constructed,
 or faulted in the future.
 .IP
@@ -424,20 +425,20 @@ .SS Linux-specific advice values
 If the range provided spans multiple VMAs,
 the semantics of the collapse over each VMA is independent from the others.
 If collapse of a given huge page-aligned/sized region fails,
-the operation may continue to attempt collapsing the remainder of the
-specified memory.
+the operation may continue to attempt collapsing
+the remainder of the specified memory.
 .B MADV_COLLAPSE
 will automatically clamp the provided range to be hugepage-aligned.
 .IP
-All non-resident pages covered by the range will first be
-swapped/faulted-in,
+All non-resident pages covered by the range
+will first be swapped/faulted-in,
 before being copied onto a freshly allocated hugepage.
 If the native pages compose the same PTE-mapped hugepage,
 and are suitably aligned,
-allocation of a new hugepage may be elided and collapse may happen
-in-place.
-Unmapped pages will have their data directly initialized to 0 in the new
-hugepage.
+allocation of a new hugepage may be elided and
+collapse may happen in-place.
+Unmapped pages will have their data directly initialized to 0
+in the new hugepage.
 However,
 for every eligible hugepage-aligned/sized region to be collapsed,
 at least one page must currently be backed by physical memory.
@@ -464,15 +465,15 @@ .SS Linux-specific advice values
 is still respected).
 .IP
 When the system has multiple NUMA nodes,
-the hugepage will be allocated from the node providing the most native
-pages.
+the hugepage will be allocated from
+the node providing the most native pages.
 .IP
 If all hugepage-sized/aligned regions covered by the provided range were
 either successfully collapsed,
 or were already PMD-mapped THPs,
 this operation will be deemed successful.
-Note that this doesn't guarantee anything about other possible mappings of
-the memory.
+Note that this doesn't guarantee anything about
+other possible mappings of the memory.
 Also note that many failures might have occurred since the operation may
 continue to collapse in the event collapse of a single hugepage-sized/aligned
 region fails.

--
<http://www.alejandro-colomar.es/>
* Re: [PATCH man-pages v4] madvise.2: add documentation for MADV_COLLAPSE
2022-10-31 23:36 ` Alejandro Colomar
@ 2022-11-01 0:38 ` Zach O'Keefe
2022-11-01 11:38 ` Alejandro Colomar
0 siblings, 1 reply; 5+ messages in thread
From: Zach O'Keefe @ 2022-11-01 0:38 UTC (permalink / raw)
To: Alejandro Colomar; +Cc: Yang Shi, linux-mm, linux-man, Michael Kerrisk

Hey Alex,

On Mon, Oct 31, 2022 at 4:37 PM Alejandro Colomar <alx.manpages@gmail.com> wrote:
>
> Hey Zach!
>
> [...]
>
> Okay, now I have some more comments:
>

Thank you :)

> - A few changes about semantic newlines. See a diff at the bottom of this email
> that you can apply.
>
> - An accident.
>
> - Some paragraph I don't really understand.
>
> [...]
>
> > diff --git a/man2/madvise b/man2/madvise
> > new file mode 100644
> > index 000000000..e69de29bb
>
> Heh! This was a funny accident. I realized because autocomplete showed it as a
> possibility. :)
>
> The diff at the bottom removes it.
>

Sorry about that - thanks for noticing!

> [...]
>
> > +Note that this doesn't guarantee anything about other possible mappings of
> > +the memory.
> > +Also note that many failures might have occurred since the operation may
> > +continue to collapse in the event collapse of a single hugepage-sized/aligned
> > +region fails.
>
> I don't understand this last paragraph (since "Also note ..."). Could you
> please reword it a little bit?
>

Sure - I can see that it's hard to parse.

Further up I note that, "If collapse of a given huge
page-aligned/sized region fails, the operation may continue to attempt
collapsing the remainder of the specified memory."

Then perhaps it's enough to just state, "In the event multiple
hugepage-aligned/sized areas fail to collapse, only the most
recently-failed code will be set in errno"

The idea here being: errno only communicates the reason for 1/N
failures that might have occurred.

However -- on second thought -- perhaps this isn't particularly
useful, as it's already implied. So, my new suggestion would be that
we should drop it. What do you think?

> [...]
>
> Diff for changing a few line breaks (and removing the spurious file):
>

Thank you so much for this! :)

> [...]
>
> --
> <http://www.alejandro-colomar.es/>

Best,
Zach
* Re: [PATCH man-pages v4] madvise.2: add documentation for MADV_COLLAPSE
2022-11-01 0:38 ` Zach O'Keefe
@ 2022-11-01 11:38 ` Alejandro Colomar
2022-11-01 15:04 ` Zach O'Keefe
0 siblings, 1 reply; 5+ messages in thread
From: Alejandro Colomar @ 2022-11-01 11:38 UTC (permalink / raw)
To: Zach O'Keefe; +Cc: Yang Shi, linux-mm, linux-man, Michael Kerrisk

Hey Zach,

On 11/1/22 01:38, Zach O'Keefe wrote:
>>
>> I don't understand this last paragraph (since "Also note ..."). Could you
>> please reword it a little bit?
>>
>
> Sure - I can see that it's hard to parse.
>
> Further up I note that, "If collapse of a given huge
> page-aligned/sized region fails, the operation may continue to attempt
> collapsing the remainder of the specified memory."
>
> Then perhaps it's enough to just state, "In the event multiple
> hugepage-aligned/sized areas fail to collapse, only the most
> recently-failed code will be set in errno"

I like this.

>
> The idea here being: errno only communicates the reason for 1/N
> failures that might have occurred.
>
> However -- on second thought -- perhaps this isn't particularly
> useful, as it's already implied. So, my new suggestion would be that
> we should drop it. What do you think?

errno usually behaves like that if you make consecutive calls, but it's not so
obvious how a single call will behave: it could report the last one as in this
case, or the first one since it's the one that made it break. I'd keep it.

[...]

>> Diff for changing a few line breaks (and removing the spurious file):
>>
>
> Thank you so much for this! :)

:)

Cheers,
Alex

--
<http://www.alejandro-colomar.es/>
* Re: [PATCH man-pages v4] madvise.2: add documentation for MADV_COLLAPSE
2022-11-01 11:38 ` Alejandro Colomar
@ 2022-11-01 15:04 ` Zach O'Keefe
0 siblings, 0 replies; 5+ messages in thread
From: Zach O'Keefe @ 2022-11-01 15:04 UTC (permalink / raw)
To: Alejandro Colomar; +Cc: Yang Shi, linux-mm, linux-man, Michael Kerrisk

Hey Alex,

On Tue, Nov 1, 2022 at 4:38 AM Alejandro Colomar <alx.manpages@gmail.com> wrote:
>
> Hey Zach,
>
> On 11/1/22 01:38, Zach O'Keefe wrote:
>
> [...]
>
> > Then perhaps it's enough to just state, "In the event multiple
> > hugepage-aligned/sized areas fail to collapse, only the most
> > recently-failed code will be set in errno"
>
> I like this.
>
> >
> > The idea here being: errno only communicates the reason for 1/N
> > failures that might have occurred.
> >
> > However -- on second thought -- perhaps this isn't particularly
> > useful, as it's already implied. So, my new suggestion would be that
> > we should drop it. What do you think?
>
> errno usually behaves like that if you make consecutive calls, but it's not so
> obvious how a single call will behave: it could report the last one as in this
> case, or the first one since it's the one that made it break. I'd keep it.
>

Roger that - done && have sent out v5.

Thank you so much again!

Best,
Zach

> [...]
>
> Cheers,
> Alex
>
> --
> <http://www.alejandro-colomar.es/>
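
The errno question settled in this thread can be modelled in a few lines of C. This is only a toy model of the reporting policy, not kernel code: last_failure(), first_failure(), and the convention that each array entry holds 0 for a collapsed region or a positive errno value for a failed one are all invented for the illustration. It contrasts the "most recent failure wins" behaviour Zach describes with the "first failure" alternative Alex mentions.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * results[i] is 0 if hugepage-sized region i collapsed (or was already
 * a PMD-mapped THP), or a positive errno value if it failed.
 *
 * Models the behaviour described in the thread: every region is
 * attempted, and only the most recent failure is reported.
 */
int last_failure(const int *results, size_t n)
{
    assert(results != NULL || n == 0);

    int err = 0;
    for (size_t i = 0; i < n; i++) {
        if (results[i] != 0)
            err = results[i];   /* a later failure overwrites an earlier one */
    }
    return err;
}

/*
 * The alternative a reader might otherwise assume: report the first
 * failure encountered and ignore any that follow.
 */
int first_failure(const int *results, size_t n)
{
    assert(results != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (results[i] != 0)
            return results[i];
    }
    return 0;
}
```

With per-region results of success, ENOMEM, success, EBUSY, the first function yields EBUSY and the second yields ENOMEM. That difference is why the sentence is worth keeping in the page: a caller that sees one error code cannot infer how many regions failed or why the earlier ones did.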
end of thread

Thread overview: 5+ messages
2022-10-31 22:55 [PATCH man-pages v4] madvise.2: add documentation for MADV_COLLAPSE Zach OKeefe
2022-10-31 23:36 ` Alejandro Colomar
2022-11-01 0:38 ` Zach O'Keefe
2022-11-01 11:38 ` Alejandro Colomar
2022-11-01 15:04 ` Zach O'Keefe