Date: Wed, 10 Nov 2021 16:27:20 +0800
From: Peter Xu
To: David Hildenbrand
Cc: Mina Almasry, Matthew Wilcox, "Paul E . McKenney", Yu Zhao,
    Jonathan Corbet, Andrew Morton, Ivan Teterevkov, Florian Schmidt,
    linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    linux-mm@kvack.org
Subject: Re: [PATCH v4] mm: Add PM_HUGE_THP_MAPPING to /proc/pid/pagemap
References: <20211107235754.1395488-1-almasrymina@google.com>

On Wed, Nov 10, 2021 at 09:14:42AM +0100, David Hildenbrand wrote:
> On 10.11.21 08:03, Peter Xu wrote:
> > Hi, Mina,
> >
> > Sorry to comment late.
> >
> > On Sun, Nov 07, 2021 at 03:57:54PM -0800, Mina Almasry wrote:
> >> diff --git a/Documentation/admin-guide/mm/pagemap.rst b/Documentation/admin-guide/mm/pagemap.rst
> >> index fdc19fbc10839..8a0f0064ff336 100644
> >> --- a/Documentation/admin-guide/mm/pagemap.rst
> >> +++ b/Documentation/admin-guide/mm/pagemap.rst
> >> @@ -23,7 +23,8 @@ There are four components to pagemap:
> >>   * Bit 56 page exclusively mapped (since 4.2)
> >>   * Bit 57 pte is uffd-wp write-protected (since 5.13) (see
> >>     :ref:`Documentation/admin-guide/mm/userfaultfd.rst `)
> >> - * Bits 57-60 zero
> >> + * Bit 58 page is a huge (PMD size) THP mapping
> >> + * Bits 59-60 zero
> >>   * Bit 61 page is file-page or shared-anon (since 3.5)
> >>   * Bit 62 page swapped
> >>   * Bit 63 page present
> >> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> >> index ad667dbc96f5c..6f1403f83b310 100644
> >> --- a/fs/proc/task_mmu.c
> >> +++ b/fs/proc/task_mmu.c
> >> @@ -1302,6 +1302,7 @@ struct pagemapread {
> >>  #define PM_SOFT_DIRTY BIT_ULL(55)
> >>  #define PM_MMAP_EXCLUSIVE BIT_ULL(56)
> >>  #define PM_UFFD_WP BIT_ULL(57)
> >> +#define PM_HUGE_THP_MAPPING BIT_ULL(58)
> >
> > The ending "_MAPPING" seems redundant to me, how about just call it "PM_THP" or
> > "PM_HUGE" (as THP also means HUGE already)?
> >
> > IMHO the core problem is about permission controls, and it seems to me we're
> > actually trying to workaround it by duplicating some information we have.. so
> > it's kind of a pity. Totally not against this patch, but imho it'll be nicer
> > if it's the permission part that to be enhanced, rather than a new but slightly
> > duplicated interface.
>
> It's not a permission problem AFAIKS: even with permissions "changed",
> any attempt to use /proc/kpageflags is just racy. Let's not go down that
> path, it's really the wrong mechanism to export to random userspace.

I agree it's racy, but IMHO that's fine. These are hints for userspace to make
decisions; they cannot always be right.
Even if we fetch atomically and see that this pte is swapped out, the page can
be accessed right afterwards and be back in memory. That could only be avoided
by freezing the whole pgtable, which we can't do, so these values can only be
used as hints.

>
> We do have an interface to access this information from userspace
> already: /proc/self/smaps IIRC. Mina commented that they are seeing
> performance issues with that approach.
>
> It would be valuable to add these details to the patch description,
> including a performance difference when using both interfaces we have
> available. As the patch description stands, there is no explanation
> "why" we want this change.

I didn't notice Mina mentioning performance issues with kpageflags; if so, then
I agree this solution helps. I doubt performance is an issue, though: THP info
shouldn't change rapidly, so it should only serve as a hint for sanity checks
(e.g., to make sure no unwanted THP split is happening). The scanning does not
need to be super fast; it could be done with a relatively long scanning period.

If there's a performance concern, yes, it would be great to mention it in the
commit message too.

-- 
Peter Xu
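
For illustration only, here is a minimal userspace sketch of how the pagemap
bits discussed above could be consumed as hints. The names pm_hint, pm_decode
and pm_read_self are hypothetical and not part of any kernel or libc API. Bit
58 is assumed to mean what this patch proposes, so it is not an established
ABI. The other bit positions follow the pagemap.rst hunk and the PM_* defines
quoted above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Bit positions as listed in Documentation/admin-guide/mm/pagemap.rst. */
#define PM_PFN_MASK         ((UINT64_C(1) << 55) - 1)   /* bits 0-54 */
#define PM_SOFT_DIRTY       (UINT64_C(1) << 55)
#define PM_MMAP_EXCLUSIVE   (UINT64_C(1) << 56)
#define PM_UFFD_WP          (UINT64_C(1) << 57)
/* Assumption: bit 58 as proposed by this patch; not an established ABI. */
#define PM_HUGE_THP_MAPPING (UINT64_C(1) << 58)
#define PM_FILE             (UINT64_C(1) << 61)
#define PM_SWAP             (UINT64_C(1) << 62)
#define PM_PRESENT          (UINT64_C(1) << 63)

/* Hypothetical decoded view of one 64-bit pagemap entry. */
struct pm_hint {
	bool present, swapped, file_or_shared;
	bool soft_dirty, exclusive, uffd_wp, huge_thp;
	uint64_t pfn;	/* reads as 0 without CAP_SYS_ADMIN (since 4.2) */
};

/* Pure decoding step: split one raw entry into its flag bits. */
static struct pm_hint pm_decode(uint64_t entry)
{
	struct pm_hint h;

	h.present = entry & PM_PRESENT;
	h.swapped = entry & PM_SWAP;
	h.file_or_shared = entry & PM_FILE;
	h.soft_dirty = entry & PM_SOFT_DIRTY;
	h.exclusive = entry & PM_MMAP_EXCLUSIVE;
	h.uffd_wp = entry & PM_UFFD_WP;
	h.huge_thp = entry & PM_HUGE_THP_MAPPING;
	/* Bits 0-54 hold the PFN only when the page is present. */
	h.pfn = h.present ? (entry & PM_PFN_MASK) : 0;
	return h;
}

/*
 * Read the pagemap entry covering vaddr in the calling process.
 * Returns 0 on success and -1 on failure.  The value is a snapshot: the
 * page may be swapped, faulted back in, or split right after the read.
 */
static int pm_read_self(const void *vaddr, uint64_t *entry)
{
	long page_size = sysconf(_SC_PAGESIZE);
	off_t off;
	ssize_t n;
	int fd;

	assert(page_size > 0);
	fd = open("/proc/self/pagemap", O_RDONLY);
	if (fd < 0)
		return -1;
	/* One 8-byte entry per virtual page, indexed by virtual page number. */
	off = (off_t)((uintptr_t)vaddr / (uintptr_t)page_size) *
	      (off_t)sizeof(*entry);
	n = pread(fd, entry, sizeof(*entry), off);
	close(fd);
	return n == (ssize_t)sizeof(*entry) ? 0 : -1;
}
```

A scanner along these lines would call pm_read_self() per page of interest and
feed the result to pm_decode(). Since nothing freezes the page table between
the read and the decision, the decoded flags are only as good as a hint, which
matches the point made in the reply above.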