linux-mm.kvack.org archive mirror
From: "Kasireddy, Vivek" <vivek.kasireddy@intel.com>
To: Gerd Hoffmann <kraxel@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>,
	James Houghton <jthoughton@google.com>,
	"Chang, Junxiao" <junxiao.chang@intel.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"kirill.shutemov@linux.intel.com"
	<kirill.shutemov@linux.intel.com>,
	"Hocko, Michal" <mhocko@suse.com>,
	"jmarchan@redhat.com" <jmarchan@redhat.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"muchun.song@linux.dev" <muchun.song@linux.dev>,
	"Kim, Dongwon" <dongwon.kim@intel.com>,
	"dri-devel@lists.freedesktop.org"
	<dri-devel@lists.freedesktop.org>
Subject: RE: [PATCH] mm: fix hugetlb page unmap count balance issue
Date: Tue, 20 Jun 2023 06:23:31 +0000
Message-ID: <IA0PR11MB71851ED7AF1E49597B15F0B9F85CA@IA0PR11MB7185.namprd11.prod.outlook.com>
In-Reply-To: <rig64luafho5rflev25kwyejxwi44n7hj5kioxhogj4wpg5pch@bvxc2mfiyhyd>

Hi Gerd,

> 
> On Mon, May 15, 2023 at 10:04:42AM -0700, Mike Kravetz wrote:
> > On 05/12/23 16:29, Mike Kravetz wrote:
> > > On 05/12/23 14:26, James Houghton wrote:
> > > > On Fri, May 12, 2023 at 12:20 AM Junxiao Chang <junxiao.chang@intel.com> wrote:
> > > >
> > > > This alone doesn't fix mapcounting for PTE-mapped HugeTLB pages. You
> > > > need something like [1]. I can resend it if that's what we should be
> > > > doing, but this mapcounting scheme doesn't work when the page structs
> > > > have been freed.
> > > >
> > > > It seems like it was a mistake to include support for hugetlb memfds in
> > > > udmabuf.
> > >
> > > IIUC, it was added with commit 16c243e99d33 ("udmabuf: Add support for
> > > mapping hugepages (v4)").  Looks like it was never sent to linux-mm?  That
> > > is unfortunate as hugetlb vmemmap freeing went in at about the same time.
> > > And, as you have noted, udmabuf will not work if hugetlb vmemmap freeing
> > > is enabled.
> > >
> > > Sigh!
> > >
> > > Trying to think of a way forward.
> > > --
> > > Mike Kravetz
> > >
> > > >
> > > > [1]: https://lore.kernel.org/linux-mm/20230306230004.1387007-2-jthoughton@google.com/
> > > >
> > > > - James
> >
> > Adding people and list on Cc: involved with commit 16c243e99d33.
> >
> > There are several issues with trying to map tail pages of hugetlb pages
> > not taken into account with udmabuf.  James spent quite a bit of time trying
> > to understand and address all the issues with the HGM code.  While using
> > the scheme proposed by James may be an approach to the mapcount issue, there
> > are also other issues that need attention.  For example, I do not see how
> > the fault code checks the state of the hugetlb page (such as poison) as none
> > of that state is carried in tail pages.
> >
> > The more I think about it, the more I think udmabuf should treat hugetlb
> > pages as hugetlb pages.  They should be mapped at the appropriate level
> > in the page table.  Of course, this would impose new restrictions on the
> > API (mmap and ioctl) that may break existing users.  I have no idea how
> > extensively udmabuf is being used with hugetlb mappings.
> 
> The user of this is qemu.  It can use the udmabuf driver to create host
> dma-bufs for guest resources (virtio-gpu buffers), to avoid copying
> data when showing the guest display in a host window.
> 
> hugetlb support is needed in case qemu guest memory is backed by
> hugetlbfs.  That does not imply the virtio-gpu buffers are hugepage
> aligned though, udmabuf would still need to operate on smaller chunks
> of memory.  So with additional restrictions this will not work any
> more for qemu.  I'd suggest to just revert hugetlb support instead
> and go back to the drawing board.
> 
> Also not sure why hugetlbfs is used for guest memory in the first place.
> It used to be a thing years ago, but with the arrival of transparent
> hugepages there is as far as I know little reason to still use hugetlbfs.

The main reason we are interested in using hugetlbfs for guest memory is that
we observed a non-trivial performance improvement while running certain
3D-heavy workloads in the guest. We noticed this after only switching the
guest memory backend to use hugepages (i.e., hugetlb=on), with no other
changes.

To address the current situation, I am preparing a patch for the udmabuf driver
that adds back support for mapping hugepages, but without using the subpages
directly.
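
To make the idea concrete, here is a minimal Python sketch of the arithmetic
involved. It is not the actual patch and not kernel code: it only shows how a
page-aligned buffer range, which need not be hugepage aligned, could be
described as (hugepage index, offset within the hugepage, length) chunks
rather than as a list of individual subpages. The function name split_range
and the 4 KiB / 2 MiB sizes are assumptions made for illustration.

```python
PAGE_SIZE = 4096               # assumed base page size (4 KiB)
HPAGE_SIZE = 2 * 1024 * 1024   # assumed hugepage size (2 MiB)


def split_range(offset, size, hpage_size=HPAGE_SIZE, page_size=PAGE_SIZE):
    """Split a page-aligned byte range into per-hugepage chunks.

    Returns a list of (hugepage_index, offset_in_hugepage, length) tuples.
    Hypothetical helper for illustration only.
    """
    if offset % page_size or size % page_size:
        raise ValueError("offset and size must be page aligned")

    chunks = []
    end = offset + size
    while offset < end:
        index = offset // hpage_size      # which hugepage the chunk starts in
        within = offset % hpage_size      # byte offset inside that hugepage
        # A chunk never crosses a hugepage boundary.
        length = min(hpage_size - within, end - offset)
        chunks.append((index, within, length))
        offset += length
    return chunks


if __name__ == "__main__":
    # A 16 KiB buffer that starts 8 KiB before the end of the first hugepage.
    print(split_range(HPAGE_SIZE - 2 * PAGE_SIZE, 4 * PAGE_SIZE))
    # prints [(0, 2088960, 8192), (1, 0, 8192)]
```

In this sketch, a buffer that straddles a hugepage boundary simply becomes
two chunks, each expressed relative to its hugepage.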

Thanks,
Vivek

> 
> Vivek? Dongwon?
> 
> take care,
>   Gerd


