From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Zach O'Keefe"
Date: Sun, 11 Dec 2022 14:37:10 -0800
Subject: Re: [PATCH man-pages v3 4/4] madvise.2: add documentation for MADV_COLLAPSE
To: Alejandro Colomar
Cc: Yang Shi, linux-mm@kvack.org, linux-man@vger.kernel.org
References: <20221021223300.3675201-1-zokeefe@google.com> <20221021223300.3675201-5-zokeefe@google.com> <374b1dcd-6a2c-a452-9c1b-9f5945df493b@gmail.com>

On Sun, Dec 11, 2022 at 1:55 PM Alejandro Colomar wrote:
>
> Hey Zach,
>
> On 12/11/22 22:51, Zach O'Keefe wrote:
> > On Sun, Dec 11, 2022 at 9:59 AM Alejandro Colomar wrote:
> >>
> >> Hi Zach,
> >
> > Hey Alex,
> >
> >> On 10/22/22 00:33, Zach OKeefe wrote:
> >>> From: Zach O'Keefe
> >>>
> >>> Linux 6.1 introduced MADV_COLLAPSE in upstream commit 7d8faaf15545
> >>> ("mm/madvise: introduce MADV_COLLAPSE sync hugepage collapse") and
> >>> upstream commit 34488399fa08 ("mm/madvise: add file and shmem support to
> >>> MADV_COLLAPSE"). Update the man-pages for madvise(2) and
> >>> process_madvise(2).
> >>>
> >>> Link: https://lore.kernel.org/linux-mm/20220922224046.1143204-1-zokeefe@google.com/
> >>> Link: https://lore.kernel.org/linux-mm/20220706235936.2197195-1-zokeefe@google.com/
> >>> Signed-off-by: Zach O'Keefe
> >>
> >> Please see a few comments below.
> >>
> >
> > Thanks for the mail. So, this patch was taken as commit b106cd5bf
> > ("madvise.2: add documentation for MADV_COLLAPSE"). Some of your
> > comments below were applied (I think, by you) as fixes pre-commit.
> > However, there are some new comments (or ones that address the same
> > lines, but in different ways). Is this mail to log ~ what changes
> > were done, or is there anything actionable here on my side?
>
> Ah no, it's just that I had it marked as unread for some reason, so I thought I
> had forgotten to respond (and I forgot that I had applied it). :-)
>
> So, no action required.
>
> Regarding different suggestions, heh, it demonstrates that it's not exactly
> deterministic :P
>

Heh -- no worries :) Thanks for following up!

> Cheers,
>
> Alex
>
> P.S.: Do you know if I have anything missing from you or any of your colleagues?

At least on my part, I think you've taken all my patches (with help &
edits -- thank you!). I can't speak for anyone else at Google, however
(though, just a very hasty cross reference between git log and
lore.kernel.org/linux-man seems to indicate patches sent from
*@google.com since man-pages-6.00 have previously made it into
man-pages-6.01, and nothing afterwards).

Have a great rest of your weekend,

Best,
Zach

> >
> > Best,
> > Zach
> >
> > Thanks for this.
> >> Cheers,
> >>
> >> Alex
> >>
> >>> ---
> >>>  man2/madvise.2         | 90 +++++++++++++++++++++++++++++++++++++++++-
> >>>  man2/process_madvise.2 | 10 +++++
> >>>  2 files changed, 98 insertions(+), 2 deletions(-)
> >>>
> >>> diff --git a/man2/madvise.2 b/man2/madvise.2
> >>> index df3413cc8..b03fc731d 100644
> >>> --- a/man2/madvise.2
> >>> +++ b/man2/madvise.2
> >>> @@ -385,9 +385,10 @@ set (see
> >>>  .BR prctl (2) ).
> >>>  .IP
> >>>  The
> >>> -.B MADV_HUGEPAGE
> >>> +.BR MADV_HUGEPAGE ,
> >>> +.BR MADV_NOHUGEPAGE ,
> >>>  and
> >>> -.B MADV_NOHUGEPAGE
> >>> +.B MADV_COLLAPSE
> >>>  operations are available only if the kernel was configured with
> >>>  .B CONFIG_TRANSPARENT_HUGEPAGE
> >>>  and file/shmem memory is only supported if the kernel was configured with
> >>> @@ -400,6 +401,81 @@ and
> >>>  .I length
> >>>  will not be backed by transparent hugepages.
> >>>  .TP
> >>> +.BR MADV_COLLAPSE " (since Linux 6.1)"
> >>> +.\" commit 7d8faaf155454f8798ec56404faca29a82689c77
> >>> +.\" commit 34488399fa08faaf664743fa54b271eb6f9e1321
> >>> +Perform a best-effort synchronous collapse of the native pages mapped by the
> >>
> >> Please use semantic line breaks.  In this case, I'd break after "pages".
> >>
> >> man-pages(7):
> >>     Use semantic newlines
> >>         In the source of a manual page, new sentences should be started on new
> >>         lines, long sentences should be split into lines at clause breaks (commas,
> >>         semicolons, colons, and so on), and long clauses should be split
> >>         at phrase boundaries.  This convention, sometimes known as "semantic
> >>         newlines", makes it easier to see the effect of patches, which often
> >>         operate at the level of individual sentences, clauses, or phrases.
> >>
> >>> +memory range into Transparent Huge Pages (THPs).
> >>> +.B MADV_COLLAPSE
> >>> +operates on the current state of memory of the calling process and makes no
> >>
> >> Here I'd break after "and".
> >>
> >>> +persistent changes or guarantees on how pages will be mapped,
> >>> +constructed,
> >>> +or faulted in the future.
> >>> +.IP
> >>> +.B MADV_COLLAPSE
> >>> +supports private anonymous pages (see
> >>> +.BR mmap (2)),
> >>> +shmem pages,
> >>> +and file-backed pages.
> >>> +See
> >>> +.B MADV_HUGEPAGE
> >>> +for general information on memory requirements for THP.
> >>> +If the range provided spans multiple VMAs,
> >>> +the semantics of the collapse over each VMA is independent from the others.
> >>> +If collapse of a given huge page-aligned/sized region fails,
> >>> +the operation may continue to attempt collapsing the remainder of the
> >>
> >> Break after "collapsing".
> >>
> >>> +specified memory.
> >>> +.B MADV_COLLAPSE
> >>> +will automatically clamp the provided range to be hugepage-aligned.
> >>> +.IP
> >>> +All non-resident pages covered by the range will first be
> >>
> >> Break after "range".
> >>
> >>> +swapped/faulted-in,
> >>> +before being copied onto a freshly allocated hugepage.
> >>> +If the native pages compose the same PTE-mapped hugepage,
> >>> +and are suitably aligned,
> >>> +allocation of a new hugepage may be elided and collapse may happen
> >>
> >> Break before or after "and".
> >>
> >>> +in-place.
> >>> +Unmapped pages will have their data directly initialized to 0 in the new
> >>
> >> Break after "0".
> >>
> >>> +hugepage.
> >>> +However,
> >>> +for every eligible hugepage-aligned/sized region to be collapsed,
> >>> +at least one page must currently be backed by physical memory.
> >>> +.IP
> >>> +.BR MADV_COLLAPSE
> >>
> >> s/BR/B/
> >>
> >>> +is independent of any sysfs
> >>> +(see
> >>> +.BR sysfs (5))
> >>> +setting under
> >>> +.IR /sys/kernel/mm/transparent_hugepage ,
> >>> +both in terms of determining THP eligibility,
> >>> +and allocation semantics.
> >>> +See Linux kernel source file
> >>> +.I Documentation/admin\-guide/mm/transhuge.rst
> >>> +for more information.
> >>> +.BR MADV_COLLAPSE
> >>
> >> s/BR/B/
> >>
> >>> +also ignores
> >>> +.B huge=
> >>> +tmpfs mount when operating on tmpfs files.
> >>> +Allocation for the new hugepage may enter direct reclaim and/or compaction,
> >>> +regardless of VMA flags
> >>> +(though
> >>> +.BR VM_NOHUGEPAGE
> >>
> >> s/BR/B/
> >>
> >>> +is still respected).
> >>> +.IP
> >>> +When the system has multiple NUMA nodes,
> >>> +the hugepage will be allocated from the node providing the most native
> >>
> >> Break after "from".
> >>
> >>> +pages.
> >>> +.IP
> >>> +If all hugepage-sized/aligned regions covered by the provided range were
> >>
> >> Prefer English rather than "/".
> >>
> >>> +either successfully collapsed,
> >>> +or were already PMD-mapped THPs,
> >>> +this operation will be deemed successful.
> >>> +Note that this doesn’t guarantee anything about other possible mappings of
> >>
> >> Break after "about".
> >>
> >>> +the memory.
> >>> +Also note that many failures might have occurred since the operation may
> >>> +continue to collapse in the event collapse of a single hugepage-sized/aligned
> >>
> >> Add some omitted "that" or something that will help readability to
> >> non-native-English readers.
> >>
> >> And break at a better place.
> >>
> >>> +region fails.
> >>> +.TP
> >>>  .BR MADV_DONTDUMP " (since Linux 3.4)"
> >>>  .\" commit 909af768e88867016f427264ae39d27a57b6a8ed
> >>>  .\" commit accb61fe7bb0f5c2a4102239e4981650f9048519
> >>> @@ -619,6 +695,11 @@ A kernel resource was temporarily unavailable.
> >>>  .B EBADF
> >>>  The map exists, but the area maps something that isn't a file.
> >>>  .TP
> >>> +.B EBUSY
> >>> +(for
> >>> +.BR MADV_COLLAPSE )
> >>> +Could not charge hugepage to cgroup: cgroup limit exceeded.
> >>> +.TP
> >>>  .B EFAULT
> >>>  .I advice
> >>>  is
> >>> @@ -716,6 +797,11 @@ maximum resident set size.
> >>>  Not enough memory: paging in failed.
> >>>  .TP
> >>>  .B ENOMEM
> >>> +(for
> >>> +.BR MADV_COLLAPSE )
> >>> +Not enough memory: could not allocate hugepage.
> >>> +.TP
> >>> +.B ENOMEM
> >>>  Addresses in the specified range are not currently
> >>>  mapped, or are outside the address space of the process.
> >>>  .TP
> >>> diff --git a/man2/process_madvise.2 b/man2/process_madvise.2
> >>> index 44d3b94e8..8b0ddccdd 100644
> >>> --- a/man2/process_madvise.2
> >>> +++ b/man2/process_madvise.2
> >>> @@ -73,6 +73,10 @@ argument is one of the following values:
> >>>  See
> >>>  .BR madvise (2).
> >>>  .TP
> >>> +.B MADV_COLLAPSE
> >>> +See
> >>> +.BR madvise (2).
> >>> +.TP
> >>>  .B MADV_PAGEOUT
> >>>  See
> >>>  .BR madvise (2).
> >>> @@ -173,6 +177,12 @@ The caller does not have permission to access the address space of the process
> >>>  .TP
> >>>  .B ESRCH
> >>>  The target process does not exist (i.e., it has terminated and been waited on).
> >>> +.PP
> >>> +See
> >>> +.BR madvise (2)
> >>> +for
> >>> +.IR advice -specific
> >>> +errors.
> >>>  .SH VERSIONS
> >>>  This system call first appeared in Linux 5.10.
> >>>  .\" commit ecb8ac8b1f146915aa6b96449b66dd48984caacc
> >>
> >> --
> >>
> > --
>
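
For anyone who wants to try the operation documented in the quoted patch, here is a minimal C sketch. It is not part of the patch or of man-pages. The helper names (align_up, try_collapse_one_region) and the ASSUMED_HPAGE_SIZE constant are made up for illustration, the 2 MiB hugepage size is an assumption that holds on typical x86-64 setups but not everywhere, and the fallback value 25 for MADV_COLLAPSE is taken from the generic Linux 6.1 UAPI headers for use when the libc headers predate it.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE 25   /* generic UAPI value, Linux 6.1; fallback for older libc headers */
#endif

/* Assumption: PMD-sized huge pages are 2 MiB (typical on x86-64). */
#define ASSUMED_HPAGE_SIZE ((size_t)2 << 20)

/* Round addr up to the next multiple of align; align must be a power of two. */
static uintptr_t align_up(uintptr_t addr, uintptr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + align - 1) & ~(align - 1);
}

/*
 * Map private anonymous memory, touch one page of a hugepage-aligned
 * region, then ask the kernel to collapse that region synchronously.
 * Returns 0 on success, otherwise the errno reported by mmap() or madvise().
 */
static int try_collapse_one_region(void)
{
    /* Map twice the hugepage size so an aligned region always fits inside. */
    size_t map_len = 2 * ASSUMED_HPAGE_SIZE;
    char *map = mmap(NULL, map_len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (map == MAP_FAILED)
        return errno;

    char *region = (char *)align_up((uintptr_t)map, ASSUMED_HPAGE_SIZE);

    /* The quoted text requires at least one page of the region to be
     * backed by physical memory, so fault one in before collapsing. */
    region[0] = 1;

    int err = 0;
    if (madvise(region, ASSUMED_HPAGE_SIZE, MADV_COLLAPSE) == -1)
        err = errno;   /* e.g. EINVAL where the kernel lacks MADV_COLLAPSE or THP */

    munmap(map, map_len);
    return err;
}
```

A caller would invoke try_collapse_one_region() and compare the result against the errors listed in the patch (EBUSY, ENOMEM) or against EINVAL. A return of 0 only says that the region was collapsed or was already PMD-mapped at the time of the call; as the quoted text notes, it makes no promise about how the pages are mapped later.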