linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: "Zach O'Keefe" <zokeefe@google.com>
To: Saurabh Singh Sengar <ssengar@microsoft.com>,
	Matthew Wilcox <willy@infradead.org>,
	 Dan Williams <dan.j.williams@intel.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
	Yang Shi <shy828301@gmail.com>,
	 "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [EXTERNAL] [PATCH] mm/thp: fix "mm: thp: kill __transhuge_page_enabled()"
Date: Mon, 14 Aug 2023 11:47:50 -0700	[thread overview]
Message-ID: <CAAa6QmSrwe2m4MjS9mGO+DeGNGSv=B2uZ72EAxnZk2jsDh39rQ@mail.gmail.com> (raw)
In-Reply-To: <PUZP153MB06358FF02518EF3B279F5DD4BE16A@PUZP153MB0635.APCP153.PROD.OUTLOOK.COM>

On Sat, Aug 12, 2023 at 11:19 PM Saurabh Singh Sengar
<ssengar@microsoft.com> wrote:
>
>
>
> > -----Original Message-----
> > From: Zach O'Keefe <zokeefe@google.com>
> > Sent: Sunday, August 13, 2023 2:31 AM
> > To: linux-mm@kvack.org; Yang Shi <shy828301@gmail.com>
> > Cc: linux-kernel@vger.kernel.org; Zach O'Keefe <zokeefe@google.com>;
> > Saurabh Singh Sengar <ssengar@microsoft.com>
> > Subject: [EXTERNAL] [PATCH] mm/thp: fix "mm: thp: kill
> > __transhuge_page_enabled()"
> >
> > [You don't often get email from zokeefe@google.com. Learn why this is
> > important at https://aka.ms/LearnAboutSenderIdentification ]
> >
> > The 6.0 commits:
> >
> > commit 9fec51689ff6 ("mm: thp: kill transparent_hugepage_active()") commit
> > 7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()")
> >
> > merged "can we have THPs in this VMA?" logic that was previously done
> > separately by fault-path, khugepaged, and smaps "THPeligible".
> >
> > During the process, the check on VM_NO_KHUGEPAGED from the
> > khugepaged path was accidentally added to fault and smaps paths.  Certainly
> > the previous behavior for fault should be restored, and since smaps should
> > report the union of THP eligibility for fault and khugepaged, also opt smaps
> > out of this constraint.
> >
> > Fixes: 7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()")
> > Reported-by: Saurabh Singh Sengar <ssengar@microsoft.com>
> > Signed-off-by: Zach O'Keefe <zokeefe@google.com>
> > Cc: Yang Shi <shy828301@gmail.com>
> > ---
> >  mm/huge_memory.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c index
> > eb3678360b97..e098c26d5e2e 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -96,11 +96,11 @@ bool hugepage_vma_check(struct vm_area_struct
> > *vma, unsigned long vm_flags,
> >                 return in_pf;
> >
> >         /*
> > -        * Special VMA and hugetlb VMA.
> > +        * khugepaged check for special VMA and hugetlb VMA.
> >          * Must be checked after dax since some dax mappings may have
> >          * VM_MIXEDMAP set.
> >          */
> > -       if (vm_flags & VM_NO_KHUGEPAGED)
> > +       if (!in_pf && !smaps && (vm_flags & VM_NO_KHUGEPAGED))
> >                 return false;
> >
> >         /*
> > --
> > 2.41.0.694.ge786442a9b-goog
>
> Thanks for the patch, I realized with the commit 9fec51689ff60,
> !vma_is_anonymous restriction is also introduced. To make fault path
> work same as before we need relaxation for this check as well. Can we
> add the below as will in this patch:
>
> -       if (!vma_is_anonymous(vma))
> +       if (!is_pf && !vma_is_anonymous(vma))
>                 return false;

Hey Saurabh,

Thanks for pointing this out, and sorry for the mixup.

I'll try looping in some folks from DAX and fs worlds to be sure,
since my knowledge doesn't extend far into those realms.

I was under the understanding that CONFIG_READ_ONLY_THP_FOR_FS was
supposed to keep the filesystem blissfully unaware of hugepages; IOW,
that assembling file-backed hugepages was supposed to be a
pagecache-only thing .. or be DAX.

The early check:

if (vma_is_dax(vma))
        return in_pf;

Should handle the DAX case.

IIUC, the check, lower down:

if (!in_pf && file_thp_enabled(vma))
        return true;

Was supposed to be the last check for eligible file-backed memory, and
here it's clear that we don't support faulting-in hugepages over
file-backed memory.

Looking at current users of struct vm_operations_struct->huge_fault, I see:

drivers/dax/device.c : dev_dax_huge_fault
fs/ext4/file.c : ext4_dax_huge_fault
fs/xfs/xfs_file.c : xfs_filemap_huge_fault
fs/erofs/data.c : erofs_dax_huge_fault
fs/fuse/dax.c: fuse_dax_huge_fault

All of which *look* like they operate on DAX-backed memory (I checked
the xfs handler, it does so as well) -- so they should have been
whitelisted by the vma_is_dax() check.

All this to say, the kernel doesn't _currently_ support faulting-in
hugepages over non-DAX file-backed memory. However, it seems we don't
give that ->huge_fault handler a fair shake.

Saurabh, does your use case fall outside this?

Willy -- I'm not up-to-date on what is happening on the THP-fs front.
Should we be checking for a ->huge_fault handler here?

Thanks,
Zach

> - Saurabh
>


  reply	other threads:[~2023-08-14 18:48 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-12 21:00 Zach O'Keefe
2023-08-12 21:24 ` Zach O'Keefe
2023-08-13  6:19 ` [EXTERNAL] " Saurabh Singh Sengar
2023-08-14 18:47   ` Zach O'Keefe [this message]
2023-08-14 19:06     ` Matthew Wilcox
2023-08-15  0:04       ` Zach O'Keefe
2023-08-15  2:24         ` Matthew Wilcox
2023-08-16 16:52           ` Saurabh Singh Sengar
2023-08-16 21:47             ` Zach O'Keefe
2023-08-17 17:46               ` Yang Shi
2023-08-17 18:29                 ` Zach O'Keefe
2023-08-18 21:21                   ` Yang Shi
2023-08-21 15:08                     ` Zach O'Keefe
2023-08-21 22:59                       ` Yang Shi
2023-08-16 21:31           ` Zach O'Keefe
2023-08-17 12:18             ` Matthew Wilcox
2023-08-17 18:13               ` Zach O'Keefe
2023-08-17 19:01                 ` Matthew Wilcox
2023-08-17 21:12                   ` Zach O'Keefe
2023-08-16 16:49         ` Saurabh Singh Sengar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAAa6QmSrwe2m4MjS9mGO+DeGNGSv=B2uZ72EAxnZk2jsDh39rQ@mail.gmail.com' \
    --to=zokeefe@google.com \
    --cc=dan.j.williams@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=shy828301@gmail.com \
    --cc=ssengar@microsoft.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox