linux-mm.kvack.org archive mirror
From: Alejandro Colomar <alx.manpages@gmail.com>
To: Zach OKeefe <zokeefe@google.com>
Cc: Yang Shi <shy828301@gmail.com>,
	linux-mm@kvack.org, linux-man@vger.kernel.org,
	Michael Kerrisk <mtk.manpages@gmail.com>
Subject: Re: [PATCH man-pages v4] madvise.2: add documentation for MADV_COLLAPSE
Date: Tue, 1 Nov 2022 00:36:53 +0100
Message-ID: <4b4a42ee-9243-96aa-b581-d56ae420f84a@gmail.com>
In-Reply-To: <20221031225500.3994542-1-zokeefe@google.com>


Hey Zach!

On 10/31/22 23:55, Zach OKeefe wrote:
> From: Zach O'Keefe <zokeefe@google.com>
> 
> Linux 6.1 introduced MADV_COLLAPSE in upstream commit 7d8faaf15545
> ("mm/madvise: introduce MADV_COLLAPSE sync hugepage collapse") and
> upstream commit 34488399fa08 ("mm/madvise: add file and shmem support to
> MADV_COLLAPSE").  Update the man-pages for madvise(2) and
> process_madvise(2).
> 
> Link: https://lore.kernel.org/linux-mm/20220922224046.1143204-1-zokeefe@google.com/
> Link: https://lore.kernel.org/linux-mm/20220706235936.2197195-1-zokeefe@google.com/
> Signed-off-by: Zach O'Keefe <zokeefe@google.com>

Okay, now I have some more comments:

- A few changes about semantic newlines.  See the diff at the bottom of this
email, which you can apply.

- An accidentally added empty file (man2/madvise).

- A paragraph I don't really understand (the part starting with "Also note").

Cheers,

Alex

> ---
> 
> v3[1] -> v4
> - Rebased to latest master
> - (Alejandro Colomar) Fixed weird, non-ascii chars: e2 80 99 -> "'"
> - (Alejandro Colomar) Replaced .BR with .B directive when the entire
>    line was bold (no non-bold part)
> 
> [1] https://lore.kernel.org/linux-man/bb3b5c3c-3966-ea1a-6d84-4f7f3afa37ca@gmail.com/T/#u
> 
>   man2/madvise           |  0
>   man2/madvise.2         | 90 +++++++++++++++++++++++++++++++++++++++++-
>   man2/process_madvise.2 | 10 +++++
>   3 files changed, 98 insertions(+), 2 deletions(-)
>   create mode 100644 man2/madvise
> 
> diff --git a/man2/madvise b/man2/madvise
> new file mode 100644
> index 000000000..e69de29bb

Heh!  This was a funny accident.  I noticed it because autocomplete showed it
as a possibility. :)

The diff at the bottom removes it.

> diff --git a/man2/madvise.2 b/man2/madvise.2
> index edf805740..dca42c7d6 100644
> --- a/man2/madvise.2
> +++ b/man2/madvise.2
> @@ -386,9 +386,10 @@ set (see
>   .BR prctl (2)).
>   .IP
>   The
> -.B MADV_HUGEPAGE
> +.BR MADV_HUGEPAGE ,
> +.BR MADV_NOHUGEPAGE ,
>   and
> -.B MADV_NOHUGEPAGE
> +.B MADV_COLLAPSE
>   operations are available only if the kernel was configured with
>   .B CONFIG_TRANSPARENT_HUGEPAGE
>   and file/shmem memory is only supported if the kernel was configured with
> @@ -401,6 +402,81 @@ and
>   .I length
>   will not be backed by transparent hugepages.
>   .TP
> +.BR MADV_COLLAPSE " (since Linux 6.1)"
> +.\" commit 7d8faaf155454f8798ec56404faca29a82689c77
> +.\" commit 34488399fa08faaf664743fa54b271eb6f9e1321
> +Perform a best-effort synchronous collapse of the native pages mapped by the
> +memory range into Transparent Huge Pages (THPs).
> +.B MADV_COLLAPSE
> +operates on the current state of memory of the calling process and makes no
> +persistent changes or guarantees on how pages will be mapped,
> +constructed,
> +or faulted in the future.
> +.IP
> +.B MADV_COLLAPSE
> +supports private anonymous pages (see
> +.BR mmap (2)),
> +shmem pages,
> +and file-backed pages.
> +See
> +.B MADV_HUGEPAGE
> +for general information on memory requirements for THP.
> +If the range provided spans multiple VMAs,
> +the semantics of the collapse over each VMA is independent from the others.
> +If collapse of a given huge page-aligned/sized region fails,
> +the operation may continue to attempt collapsing the remainder of the
> +specified memory.
> +.B MADV_COLLAPSE
> +will automatically clamp the provided range to be hugepage-aligned.
> +.IP
> +All non-resident pages covered by the range will first be
> +swapped/faulted-in,
> +before being copied onto a freshly allocated hugepage.
> +If the native pages compose the same PTE-mapped hugepage,
> +and are suitably aligned,
> +allocation of a new hugepage may be elided and collapse may happen
> +in-place.
> +Unmapped pages will have their data directly initialized to 0 in the new
> +hugepage.
> +However,
> +for every eligible hugepage-aligned/sized region to be collapsed,
> +at least one page must currently be backed by physical memory.
> +.IP
> +.B MADV_COLLAPSE
> +is independent of any sysfs
> +(see
> +.BR sysfs (5))
> +setting under
> +.IR /sys/kernel/mm/transparent_hugepage ,
> +both in terms of determining THP eligibility,
> +and allocation semantics.
> +See Linux kernel source file
> +.I Documentation/admin\-guide/mm/transhuge.rst
> +for more information.
> +.B MADV_COLLAPSE
> +also ignores
> +.B huge=
> +tmpfs mount when operating on tmpfs files.
> +Allocation for the new hugepage may enter direct reclaim and/or compaction,
> +regardless of VMA flags
> +(though
> +.B VM_NOHUGEPAGE
> +is still respected).
> +.IP
> +When the system has multiple NUMA nodes,
> +the hugepage will be allocated from the node providing the most native
> +pages.
> +.IP
> +If all hugepage-sized/aligned regions covered by the provided range were
> +either successfully collapsed,
> +or were already PMD-mapped THPs,
> +this operation will be deemed successful.
> +Note that this doesn't guarantee anything about other possible mappings of
> +the memory.
> +Also note that many failures might have occurred since the operation may
> +continue to collapse in the event collapse of a single hugepage-sized/aligned
> +region fails.

I don't understand the last part of this paragraph (from "Also note ..."
onwards).  Could you please reword it a little bit?
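
If the intent is that a single return value can hide several per-region
failures, it may help to picture a caller that wants to know about each region
separately.  Below is a rough sketch under that reading, not something taken
from the patch: collapse_each_region() and HPAGE_SIZE are made-up names, the
2 MiB size assumes PMD-sized hugepages as on x86-64, and the fallback value 25
for MADV_COLLAPSE is the one from the Linux 6.1 UAPI headers.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* exposes madvise() under strict -std= modes */
#endif
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25         /* value in the Linux 6.1 UAPI headers */
#endif

/* Assumption: 2 MiB PMD-sized hugepages, as on x86-64. */
#define HPAGE_SIZE ((uintptr_t) 2 << 20)

/*
 * Hypothetical helper: issue one madvise() call per hugepage-aligned
 * region, so the caller learns how many regions failed.
 * Returns the number of failed regions; *last_errno receives the errno
 * of the last failure, or 0 if no call failed.
 */
static size_t
collapse_each_region(void *addr, size_t len, int *last_errno)
{
    uintptr_t  start, end;
    size_t     failed = 0;

    /* Consider only whole, aligned regions inside [addr, addr + len). */
    start = ((uintptr_t) addr + HPAGE_SIZE - 1) & ~(HPAGE_SIZE - 1);
    end = ((uintptr_t) addr + len) & ~(HPAGE_SIZE - 1);

    *last_errno = 0;
    for (uintptr_t p = start; p < end; p += HPAGE_SIZE) {
        if (madvise((void *) p, HPAGE_SIZE, MADV_COLLAPSE) == -1) {
            failed++;
            *last_errno = errno;
        }
    }
    return failed;
}
```

A caller would pass the same addr and len it would have given to a single
madvise() call.  A non-zero return says how many regions failed, which the one
errno value of a single call over the whole range cannot express.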

> +.TP
>   .BR MADV_DONTDUMP " (since Linux 3.4)"
>   .\" commit 909af768e88867016f427264ae39d27a57b6a8ed
>   .\" commit accb61fe7bb0f5c2a4102239e4981650f9048519
> @@ -620,6 +696,11 @@ A kernel resource was temporarily unavailable.
>   .B EBADF
>   The map exists, but the area maps something that isn't a file.
>   .TP
> +.B EBUSY
> +(for
> +.BR MADV_COLLAPSE )
> +Could not charge hugepage to cgroup: cgroup limit exceeded.
> +.TP
>   .B EFAULT
>   .I advice
>   is
> @@ -717,6 +798,11 @@ maximum resident set size.
>   Not enough memory: paging in failed.
>   .TP
>   .B ENOMEM
> +(for
> +.BR MADV_COLLAPSE )
> +Not enough memory: could not allocate hugepage.
> +.TP
> +.B ENOMEM
>   Addresses in the specified range are not currently
>   mapped, or are outside the address space of the process.
>   .TP
> diff --git a/man2/process_madvise.2 b/man2/process_madvise.2
> index ac98850a9..92878286b 100644
> --- a/man2/process_madvise.2
> +++ b/man2/process_madvise.2
> @@ -73,6 +73,10 @@ argument is one of the following values:
>   See
>   .BR madvise (2).
>   .TP
> +.B MADV_COLLAPSE
> +See
> +.BR madvise (2).
> +.TP
>   .B MADV_PAGEOUT
>   See
>   .BR madvise (2).
> @@ -173,6 +177,12 @@ The caller does not have permission to access the address space of the process
>   .TP
>   .B ESRCH
>   The target process does not exist (i.e., it has terminated and been waited on).
> +.PP
> +See
> +.BR madvise (2)
> +for
> +.IR advice -specific
> +errors.
>   .SH VERSIONS
>   This system call first appeared in Linux 5.10.
>   .\" commit ecb8ac8b1f146915aa6b96449b66dd48984caacc
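
For the new ERRORS entries, a caller-side sketch may make the distinction
easier to review.  collapse_strerror() is a hypothetical helper, and its
strings only paraphrase the text added by this patch; they are not kernel
messages.

```c
#include <errno.h>
#include <string.h>

/*
 * Hypothetical helper: describe the errno left behind by a failed
 * madvise(addr, len, MADV_COLLAPSE) call.
 */
static const char *
collapse_strerror(int err)
{
    switch (err) {
    case EBUSY:
        /* New entry in the patch: cgroup charge failed. */
        return "cgroup limit exceeded: could not charge hugepage";
    case ENOMEM:
        /* Either the new hugepage-allocation entry or the existing
           "addresses not mapped" entry; errno alone cannot tell. */
        return "could not allocate hugepage, or range not mapped";
    default:
        /* Anything else: fall back to the generic description. */
        return strerror(err);
    }
}
```

Note that ENOMEM stays ambiguous after the patch, since madvise.2 would then
list it both for a failed hugepage allocation and for unmapped addresses.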

Diff for changing a few line breaks (and removing the spurious file):

diff --git a/man2/madvise b/man2/madvise
deleted file mode 100644
index e69de29bb..000000000
diff --git a/man2/madvise.2 b/man2/madvise.2
index dca42c7d6..7f34301d3 100644
--- a/man2/madvise.2
+++ b/man2/madvise.2
@@ -405,11 +405,12 @@ .SS Linux-specific advice values
  .BR MADV_COLLAPSE " (since Linux 6.1)"
  .\" commit 7d8faaf155454f8798ec56404faca29a82689c77
  .\" commit 34488399fa08faaf664743fa54b271eb6f9e1321
-Perform a best-effort synchronous collapse of the native pages mapped by the
-memory range into Transparent Huge Pages (THPs).
+Perform a best-effort synchronous collapse of
+the native pages mapped by the memory range
+into Transparent Huge Pages (THPs).
  .B MADV_COLLAPSE
-operates on the current state of memory of the calling process and makes no
-persistent changes or guarantees on how pages will be mapped,
+operates on the current state of memory of the calling process and
+makes no persistent changes or guarantees on how pages will be mapped,
  constructed,
  or faulted in the future.
  .IP
@@ -424,20 +425,20 @@ .SS Linux-specific advice values
  If the range provided spans multiple VMAs,
  the semantics of the collapse over each VMA is independent from the others.
  If collapse of a given huge page-aligned/sized region fails,
-the operation may continue to attempt collapsing the remainder of the
-specified memory.
+the operation may continue to attempt collapsing
+the remainder of the specified memory.
  .B MADV_COLLAPSE
  will automatically clamp the provided range to be hugepage-aligned.
  .IP
-All non-resident pages covered by the range will first be
-swapped/faulted-in,
+All non-resident pages covered by the range
+will first be swapped/faulted-in,
  before being copied onto a freshly allocated hugepage.
  If the native pages compose the same PTE-mapped hugepage,
  and are suitably aligned,
-allocation of a new hugepage may be elided and collapse may happen
-in-place.
-Unmapped pages will have their data directly initialized to 0 in the new
-hugepage.
+allocation of a new hugepage may be elided and
+collapse may happen in-place.
+Unmapped pages will have their data directly initialized to 0
+in the new hugepage.
  However,
  for every eligible hugepage-aligned/sized region to be collapsed,
  at least one page must currently be backed by physical memory.
@@ -464,15 +465,15 @@ .SS Linux-specific advice values
  is still respected).
  .IP
  When the system has multiple NUMA nodes,
-the hugepage will be allocated from the node providing the most native
-pages.
+the hugepage will be allocated from
+the node providing the most native pages.
  .IP
  If all hugepage-sized/aligned regions covered by the provided range were
  either successfully collapsed,
  or were already PMD-mapped THPs,
  this operation will be deemed successful.
-Note that this doesn't guarantee anything about other possible mappings of
-the memory.
+Note that this doesn't guarantee anything about
+other possible mappings of the memory.
  Also note that many failures might have occurred since the operation may
  continue to collapse in the event collapse of a single hugepage-sized/aligned
  region fails.
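
Separately from the line breaks, the requirement that at least one page of
each region be backed by physical memory could be illustrated roughly as
follows.  This is only a sketch: collapse_demo() and HPAGE_SIZE are invented
names, 2 MiB hugepages are assumed, and on kernels older than 6.1 the call is
expected to fail with EINVAL because the advice value is unknown there.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE          /* exposes madvise() and MAP_ANONYMOUS */
#endif
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25         /* value in the Linux 6.1 UAPI headers */
#endif

/* Assumption: 2 MiB PMD-sized hugepages, as on x86-64. */
#define HPAGE_SIZE ((size_t) 2 << 20)

/*
 * Hypothetical demo: map private anonymous memory, make one page of an
 * aligned region resident, then request a collapse of that region.
 * Returns 0 on success, or -1 with errno set on failure.
 */
static int
collapse_demo(void)
{
    size_t  maplen = 2 * HPAGE_SIZE;   /* always holds one aligned region */
    char    *map, *aligned;
    int     ret, saved_errno;

    map = mmap(NULL, maplen, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (map == MAP_FAILED)
        return -1;

    /* Round up to the first hugepage-aligned address in the mapping. */
    aligned = (char *) (((uintptr_t) map + HPAGE_SIZE - 1)
                        & ~((uintptr_t) HPAGE_SIZE - 1));

    /* Touch one byte so that at least one page is resident. */
    aligned[0] = 1;

    ret = madvise(aligned, HPAGE_SIZE, MADV_COLLAPSE);

    /* Preserve madvise()'s errno across the cleanup. */
    saved_errno = errno;
    munmap(map, maplen);
    errno = saved_errno;
    return ret;
}
```

The mapping is twice the hugepage size so that an aligned, hugepage-sized
region is guaranteed to fit inside it wherever mmap() places it.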


-- 
<http://www.alejandro-colomar.es/>
