From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Tue, 12 Jul 2022 09:31:27 -0700
From: Zach O'Keefe
To: Yang Shi
Cc: Alex Shi, David Hildenbrand, David Rientjes, Matthew Wilcox,
	Michal Hocko, Pasha Tatashin, Peter Xu, Rongwei Wang,
	SeongJae Park, Song Liu, Vlastimil Babka, Zi Yan, Linux MM,
	Andrea Arcangeli, Andrew Morton, Arnd Bergmann, Axel Rasmussen,
	Chris Kennelly, Chris Zankel, Helge Deller, Hugh Dickins,
	Ivan Kokshaysky, "James E.J. Bottomley", Jens Axboe,
	"Kirill A. Shutemov", Matt Turner, Max Filippov, Miaohe Lin,
	Minchan Kim, Patrick Xia, Pavel Begunkov, Thomas Bogendoerfer,
	jthoughton@google.com
Subject: Re: [mm-unstable v7 13/18] proc/smaps: add PMDMappable field to smaps
References: <20220706235936.2197195-1-zokeefe@google.com>
	<20220706235936.2197195-14-zokeefe@google.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Sender: owner-linux-mm@kvack.org
Precedence: bulk

On Jul 11 14:37, Yang Shi wrote:
> On Wed, Jul 6, 2022 at 5:06 PM Zach O'Keefe wrote:
> >
> > Add PMDMappable field to smaps output which informs the user if memory
> > in the VMA can be PMD-mapped by MADV_COLLAPSE.
> >
> > The distinction from THPeligible is needed for two reasons:
> >
> > 1) For THP, MADV_COLLAPSE is not coupled to THP sysfs controls, which
> >    THPeligible reports.
> >
> > 2) PMDMappable can also be used in HugeTLB fine-granularity mappings,
> >    which are independent from THP.
>
> Could you please elaborate the usecase? The user checks this hint
> before calling MADV_COLLAPSE? Is it really necessary?
>
> And, TBH it sounds confusing and we don't have to maintain both
> THPeligible and PMDMappable. We could just relax THPeligible to make
> it return 1 even though THP is disabled by sysfs but MADV_COLLAPSE
> could collapse it if such hint is useful.
>

Hey Yang,

Thanks for taking the time to review this series again, and thanks for
challenging this.

TLDR: "Is it really necessary" - at the moment, no, probably not .. but
I think it's "useful".

Rationale:

1. IMO, I thought it was confusing seeing:

	...
	AnonHugePages:      2048 kB
	ShmemPmdMapped:        0 kB
	FilePmdMapped:         0 kB
	Shared_Hugetlb:        0 kB
	Private_Hugetlb:       0 kB
	Swap:                  0 kB
	SwapPss:               0 kB
	Locked:                0 kB
	THPeligible:    0
	...

   Maybe this could simply be clarified in the docs though. I guess we
   can already get:

	...
	AnonHugePages:         0 kB
	ShmemPmdMapped:        0 kB
	FilePmdMapped:      2048 kB
	Shared_Hugetlb:        0 kB
	Private_Hugetlb:       0 kB
	Swap:                  0 kB
	SwapPss:               0 kB
	Locked:                0 kB
	THPeligible:    0
	...

   today[1], so perhaps it's not a big deal.

2. It was useful for debugging - similar to the rationale for including
   THPeligible[2], the logic for determining if a VMA is eligible is
   pretty complicated. I.e. is this file mapped suitably? Unlike
   THPeligible, however, madvise(2) has the ability to set errno on
   failure to help* diagnose why some memory isn't being backed.

3. For the immediately-envisioned usecases, the user "knows" about what
   memory they are acting on. However, eventually we'd like to
   experiment with moving THP utilization policy to userspace. Here, it
   would be useful if the userspace agent doing the managing were made
   aware of what memory it should be managing. I don't have a working
   prototype of what this would look like yet, however.

4. I thought it was neat that this field could be reused for HugeTLB
   fine-granularity mappings - but TBH I'm not sure how useful it'd be
   there.

I figured relaxing the existing THPeligible could break existing users /
tests, and it'd likewise be confusing for them to see THPeligible: 1,
but then have faults fail, and they'd then have to go check sysfs
settings and vma flags; we'd be back in pre-commit 7635d9cbe832 ("mm,
thp, proc: report THP eligibility for each vma").

Thanks,
Zach

[1] https://lore.kernel.org/linux-mm/YrxbQGiwml24APCx@google.com/

>
> >
> > Signed-off-by: Zach O'Keefe
> > ---
> >  Documentation/filesystems/proc.rst | 10 ++++++++--
> >  fs/proc/task_mmu.c                 |  2 ++
> >  2 files changed, 10 insertions(+), 2 deletions(-)
> >
> > diff --git a/Documentation/filesystems/proc.rst b/Documentation/filesystems/proc.rst
> > index 47e95dbc820d..f207903a57a5 100644
> > --- a/Documentation/filesystems/proc.rst
> > +++ b/Documentation/filesystems/proc.rst
> > @@ -466,6 +466,7 @@ Memory Area, or VMA) there is a series of lines such as the following::
> >      MMUPageSize:           4 kB
> >      Locked:                0 kB
> >      THPeligible:           0
> > +    PMDMappable:           0
> >      VmFlags: rd ex mr mw me dw
> >
> >  The first of these lines shows the same information as is displayed for the
> > @@ -518,9 +519,14 @@ replaced by copy-on-write) part of the underlying shmem object out on swap.
> >  does not take into account swapped out page of underlying shmem objects.
> >  "Locked" indicates whether the mapping is locked in memory or not.
> >
> > +"PMDMappable" indicates if the memory can be mapped by PMDs - 1 if true, 0
> > +otherwise. It just shows the current status. Note that this is memory
> > +operable on explicitly by MADV_COLLAPSE.
> > +
> >  "THPeligible" indicates whether the mapping is eligible for allocating THP
> > -pages as well as the THP is PMD mappable or not - 1 if true, 0 otherwise.
> > -It just shows the current status.
> > +pages by the kernel, as well as the THP is PMD mappable or not - 1 if true, 0
> > +otherwise. It just shows the current status. Note this is memory the kernel can
> > +transparently provide as THPs.
> >
> >  "VmFlags" field deserves a separate description. This member represents the
> >  kernel flags associated with the particular virtual memory area in two letter
> > diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> > index f8cd58846a28..29f2089456ba 100644
> > --- a/fs/proc/task_mmu.c
> > +++ b/fs/proc/task_mmu.c
> > @@ -867,6 +867,8 @@ static int show_smap(struct seq_file *m, void *v)
> >
> >  	seq_printf(m, "THPeligible:    %d\n",
> >  		   hugepage_vma_check(vma, vma->vm_flags, true, false, true));
> > +	seq_printf(m, "PMDMappable:    %d\n",
> > +		   hugepage_vma_check(vma, vma->vm_flags, true, false, false));
> >
> >  	if (arch_pkeys_enabled())
> >  		seq_printf(m, "ProtectionKey:  %8u\n", vma_pkey(vma));
> > --
> > 2.37.0.rc0.161.g10f37bed90-goog
> >
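
For reference, the userspace side of the "check the hint before calling
MADV_COLLAPSE" usecase discussed above could look roughly like the
sketch below. It only parses the text of a single smaps entry that the
caller has already read into a buffer. The helper name smaps_field() is
made up for illustration and is not part of this series, and a
PMDMappable line would only be present on a kernel carrying this patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: find "key:" at the start of a line in the text of
 * one smaps entry and parse the integer that follows it.
 *
 * Returns 0 and stores the value in *out on success, or -1 if the key is
 * absent or is not followed by a number (e.g. "VmFlags: rd ex ...").
 */
static int smaps_field(const char *entry, const char *key, long *out)
{
	size_t klen;
	const char *line;

	assert(entry && key && out);
	klen = strlen(key);

	for (line = entry; line && *line; ) {
		/* Require the colon so "THP" does not match "THPeligible". */
		if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
			const char *start = line + klen + 1;
			char *end;
			long val = strtol(start, &end, 10);

			if (end == start)
				return -1;	/* no digits after the colon */
			*out = val;
			return 0;
		}
		line = strchr(line, '\n');
		if (line)
			line++;		/* step past the newline */
	}
	return -1;
}
```

An agent of the kind described in point 3 might treat a VMA as a
collapse candidate when smaps_field(entry, "PMDMappable", &v) succeeds
with v == 1, regardless of what THPeligible reports, and fall back to
the errno from madvise(2) when the field is missing.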