From: Jason Gunthorpe <jgg@nvidia.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: Gerald Schaefer <gerald.schaefer@linux.ibm.com>,
Joao Martins <joao.m.martins@oracle.com>,
Christoph Hellwig <hch@lst.de>,
Heiko Carstens <hca@linux.ibm.com>,
Vasily Gorbik <gor@linux.ibm.com>,
Christian Borntraeger <borntraeger@de.ibm.com>,
Linux NVDIMM <nvdimm@lists.linux.dev>,
linux-s390 <linux-s390@vger.kernel.org>,
Matthew Wilcox <willy@infradead.org>,
Alex Sierra <alex.sierra@amd.com>,
"Kuehling, Felix" <Felix.Kuehling@amd.com>,
Linux MM <linux-mm@kvack.org>,
Ralph Campbell <rcampbell@nvidia.com>,
Alistair Popple <apopple@nvidia.com>,
Vishal Verma <vishal.l.verma@intel.com>,
Dave Jiang <dave.jiang@intel.com>
Subject: Re: can we finally kill off CONFIG_FS_DAX_LIMITED
Date: Thu, 14 Oct 2021 20:04:39 -0300
Message-ID: <20211014230439.GA3592864@nvidia.com>
In-Reply-To: <CAPcyv4iFeVDVPn6uc=aKsyUvkiu3-fK-N16iJVZQ3N8oT00hWA@mail.gmail.com>

On Tue, Aug 24, 2021 at 11:44:20AM -0700, Dan Williams wrote:

> Yes, that's along the lines of what I'm thinking. I.e don't expect
> pte_devmap() to be there in the slow path, and use the vma to check
> for DAX.

I think we should delete pte_devmap() completely from gup.c.

It is doing a few things that are better done in more general ways:

1) Doing the get_dev_pagemap() stuff, which should be entirely deleted
   from gup.c in favour of proper use of struct page references.

2) Denying FOLL_LONGTERM

   Once GUP has grabbed the page we can call is_zone_device_page() on
   the struct page. If true we can check page->pgmap and read some
   DENY_FOLL_LONGTERM flag from there.

3) Different refcounts for pud/pmd pages

   Ideally DAX cases would not do this (i.e. Joao is fixing device-dax)
   but in the interim we can just loop over the PUD/PMD in all
   cases. Looping is safe for THP AFAIK. I described how this can work
   here:

   https://lore.kernel.org/all/20211013174140.GJ2744544@nvidia.com/

After that there are only two remaining uses:

4) The pud/pmd_devmap() in vm_normal_page() should just go
   away. ZONE_DEVICE memory with struct pages SHOULD be a normal
   page. This also means dropping pte_special too.

5) dev_pagemap_mapping_shift() - I don't know what this does
   but why not use the is_zone_device_page() approach from 2?

In this way ZONE_DEVICE pages can be fully normal pages with no
requirements on PTE flags.
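
The check described in point 2 can be sketched as a small userspace
model. This is an illustration under stated assumptions, not kernel
code: struct page_model, struct dev_pagemap_model,
PGMAP_DENY_FOLL_LONGTERM, FOLL_LONGTERM_MODEL and gup_page_allowed() are
hypothetical names. The zone_device field stands in for
is_zone_device_page() and the pgmap field for page->pgmap. What it tries
to show is that the decision needs only the struct page, not a PTE flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only: these types mimic, but are not, kernel types. */
struct dev_pagemap_model {
	unsigned int flags;
};

/* Hypothetical per-pgmap flag, named after the idea in point 2. */
#define PGMAP_DENY_FOLL_LONGTERM (1u << 0)

/* Stand-in for the GUP FOLL_LONGTERM flag; the value is arbitrary. */
#define FOLL_LONGTERM_MODEL (1u << 0)

struct page_model {
	bool zone_device;                /* models is_zone_device_page(page) */
	struct dev_pagemap_model *pgmap; /* models page->pgmap */
};

/*
 * Decide, after the page has already been grabbed, whether the pin may
 * be kept. Only the page and its pgmap are consulted.
 */
static bool gup_page_allowed(const struct page_model *page,
			     unsigned int gup_flags)
{
	assert(page != NULL);

	/* Short-term pins are never restricted by this check. */
	if (!(gup_flags & FOLL_LONGTERM_MODEL))
		return true;

	/* Ordinary (non ZONE_DEVICE) pages carry no pgmap restriction. */
	if (!page->zone_device)
		return true;

	/* ZONE_DEVICE page: the owning pgmap may veto long-term pins. */
	return !(page->pgmap &&
		 (page->pgmap->flags & PGMAP_DENY_FOLL_LONGTERM));
}
```

In this model a caller would drop its reference and fail the request
when gup_page_allowed() returns false. Where such a flag would actually
live, and who would set it, is left open here.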

Where have I gone wrong? :)

pud/pmd_devmap() looks a little more involved to remove, but I wonder
if we can change logic like this:

    if (pmd_trans_huge(*vmf->pmd) || pmd_devmap(*vmf->pmd)) {

Into

    if (pmd_is_page(*pmd))

? And rely on struct page based stuff as above to discern THP vs devmap?
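
A rough userspace model of that substitution follows. Everything in it
is an assumption made for illustration: pmd_model_t, the three
PMD_MODEL_* bits and the *_model() helpers are hypothetical, and real
architectures encode these states differently. pmd_is_page() does not
exist in the kernel at the time of this mail; the sketch assumes it
would mean "present leaf entry that maps a page", with no reference to a
devmap bit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Userspace model of a PMD entry; the bit layout is invented. */
typedef struct { uint64_t val; } pmd_model_t;

#define PMD_MODEL_PRESENT (1ull << 0)
#define PMD_MODEL_LEAF    (1ull << 1) /* maps a page, not a PTE table */
#define PMD_MODEL_DEVMAP  (1ull << 2) /* bit the proposal stops reading */

static bool pmd_leaf_present(pmd_model_t pmd)
{
	const uint64_t mask = PMD_MODEL_PRESENT | PMD_MODEL_LEAF;

	return (pmd.val & mask) == mask;
}

/* Models today's split: a huge leaf that is not devmap. */
static bool pmd_trans_huge_model(pmd_model_t pmd)
{
	return pmd_leaf_present(pmd) && !(pmd.val & PMD_MODEL_DEVMAP);
}

/* Models today's split: a huge leaf that is devmap. */
static bool pmd_devmap_model(pmd_model_t pmd)
{
	return pmd_leaf_present(pmd) && (pmd.val & PMD_MODEL_DEVMAP);
}

/*
 * Models the proposed single test: any present leaf counts, and the
 * devmap bit is never examined. THP vs devmap would then be decided
 * from the struct page, not from the entry.
 */
static bool pmd_is_page_model(pmd_model_t pmd)
{
	return pmd_leaf_present(pmd);
}
```

Under these assumptions the single test accepts exactly the entries that
the two-way test accepts, which is why the devmap bit becomes redundant
in the model.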
Thanks,
Jason