From: Dan Williams <dan.j.williams@intel.com>
To: "Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Jan Kara <jack@suse.cz>, linux-nvdimm <linux-nvdimm@lists.01.org>,
Michael Ellerman <mpe@ellerman.id.au>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Linux MM <linux-mm@kvack.org>, Ross Zwisler <zwisler@kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>
Subject: Re: [PATCH 2/2] mm/dax: Don't enable huge dax mapping by default
Date: Wed, 13 Mar 2019 21:02:11 -0700 [thread overview]
Message-ID: <CAPcyv4i0SahDP=_ZQV3RG_b5pMkjn-9Cjy7OpY2sm1PxLdO8jA@mail.gmail.com> (raw)
In-Reply-To: <871s3aqfup.fsf@linux.ibm.com>
On Wed, Mar 13, 2019 at 8:45 PM Aneesh Kumar K.V
<aneesh.kumar@linux.ibm.com> wrote:
[..]
> >> Now w.r.t to failures, can device-dax do an opportunistic huge page
> >> usage?
> >
> > device-dax explicitly disclaims the ability to do opportunistic mappings.
> >
> >> I haven't looked at the device-dax details fully yet. Do we make the
> >> assumption of the mapping page size as a format w.r.t device-dax? Is that
> >> derived from nd_pfn->align value?
> >
> > Correct.
> >
> >>
> >> Here is what I am working on:
> >> 1) If the platform doesn't support huge page and if the device superblock
> >> indicated that it was created with huge page support, we fail the device
> >> init.
> >
> > Ok.
> >
> >> 2) Now if we are creating a new namespace without huge page support in
> >> the platform, then we force the align details to PAGE_SIZE. In such a
> >> configuration when handling dax fault even with THP enabled during
> >> the build, we should not try to use hugepage. This I think we can
> >> achieve by using TRANSPARENT_HUGEPAEG_DAX_FLAG.
> >
> > How is this dynamic property communicated to the guest?
>
> via device tree on powerpc. We have a device tree node indicating
> supported page sizes.
Ah, ok, yeah let's plumb that straight to the device-dax driver and
leave out the interaction / interpretation of the thp-enabled flags.
>
> >
> >>
> >> Also even if the user decided to not use THP, by
> >> echo "never" > transparent_hugepage/enabled , we should continue to map
> >> dax fault using huge page on platforms that can support huge pages.
> >>
> >> This still doesn't cover the details of a device-dax created with
> >> PAGE_SIZE align later booted with a kernel that can do hugepage dax.How
> >> should we handle that? That makes me think, this should be a VMA flag
> >> which got derived from device config? May be use VM_HUGEPAGE to indicate
> >> if device should use a hugepage mapping or not?
> >
> > device-dax configured with PAGE_SIZE always gets PAGE_SIZE mappings.
>
> Now what will be page size used for mapping vmemmap?
That's up to the architecture's vmemmap_populate() implementation.
> Architectures
> possibly will use PMD_SIZE mapping if supported for vmemmap. Now a
> device-dax with struct page in the device will have pfn reserve area aligned
> to PAGE_SIZE with the above example? We can't map that using
> PMD_SIZE page size?
IIUC, that's a different alignment. Currently that's handled by
padding the reservation area up to a section (128MB on x86) boundary,
but I'm working on patches to allow sub-section sized ranges to be
mapped.
Now, that said, I expect there may be bugs lurking in the
implementation if PAGE_SIZE changes from one boot to the next simply
because I've never tested that.
I think this also indicates that the section padding logic can't be
removed until all arch vmemmap_populate() implementations understand
the sub-section case.
next prev parent reply other threads:[~2019-03-14 4:02 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-02-28 8:35 [PATCH 1/2] fs/dax: deposit pagetable even when installing zero page Aneesh Kumar K.V
2019-02-28 8:35 ` [PATCH 2/2] mm/dax: Don't enable huge dax mapping by default Aneesh Kumar K.V
2019-02-28 9:40 ` Jan Kara
2019-02-28 12:32 ` Aneesh Kumar K.V
2019-02-28 9:40 ` Oliver
2019-02-28 12:43 ` Aneesh Kumar K.V
2019-02-28 16:45 ` Dan Williams
2019-03-06 9:17 ` Aneesh Kumar K.V
2019-03-06 11:44 ` Michal Suchánek
2019-03-06 12:45 ` Aneesh Kumar K.V
2019-03-06 13:06 ` Kirill A. Shutemov
2019-03-13 16:07 ` Dan Williams
2019-03-19 8:44 ` Kirill A. Shutemov
2019-03-19 15:36 ` Dan Williams
2019-03-13 16:02 ` Dan Williams
2019-03-14 3:45 ` Aneesh Kumar K.V
2019-03-14 4:02 ` Dan Williams [this message]
2019-03-20 8:06 ` Aneesh Kumar K.V
2019-03-20 8:09 ` Aneesh Kumar K.V
2019-03-20 15:34 ` Dan Williams
2019-03-20 20:57 ` Dan Williams
2019-03-21 3:08 ` Oliver
2019-03-21 3:12 ` Dan Williams
2019-02-28 9:21 ` [PATCH 1/2] fs/dax: deposit pagetable even when installing zero page Jan Kara
2019-02-28 12:34 ` Aneesh Kumar K.V
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to='CAPcyv4i0SahDP=_ZQV3RG_b5pMkjn-9Cjy7OpY2sm1PxLdO8jA@mail.gmail.com' \
--to=dan.j.williams@intel.com \
--cc=akpm@linux-foundation.org \
--cc=aneesh.kumar@linux.ibm.com \
--cc=jack@suse.cz \
--cc=kirill.shutemov@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=linux-nvdimm@lists.01.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=mpe@ellerman.id.au \
--cc=zwisler@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox