linux-mm.kvack.org archive mirror
From: Muchun Song <songmuchun@bytedance.com>
To: Joao Martins <joao.m.martins@oracle.com>
Cc: Linux Memory Management List <linux-mm@kvack.org>,
	Dan Williams <dan.j.williams@intel.com>,
	 Vishal Verma <vishal.l.verma@intel.com>,
	Matthew Wilcox <willy@infradead.org>,
	 Jason Gunthorpe <jgg@ziepe.ca>, Jane Chu <jane.chu@oracle.com>,
	 Mike Kravetz <mike.kravetz@oracle.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	 Jonathan Corbet <corbet@lwn.net>, Christoph Hellwig <hch@lst.de>,
	nvdimm@lists.linux.dev,
	 Linux Doc Mailing List <linux-doc@vger.kernel.org>
Subject: Re: [PATCH v5 5/5] mm/page_alloc: reuse tail struct pages for compound devmaps
Date: Sat, 12 Feb 2022 19:11:13 +0800
Message-ID: <CAMZfGtUSH9cKWmQpsD2BzvVMAjQJCpyO_p7sFchEVx6ywxDEyw@mail.gmail.com>
In-Reply-To: <cfd0690f-bbc5-0fba-e085-1385041c470d@oracle.com>

On Fri, Feb 11, 2022 at 8:48 PM Joao Martins <joao.m.martins@oracle.com> wrote:
>
> On 2/11/22 05:07, Muchun Song wrote:
> > On Fri, Feb 11, 2022 at 3:34 AM Joao Martins <joao.m.martins@oracle.com> wrote:
> >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> >> index cface1d38093..c10df2fd0ec2 100644
> >> --- a/mm/page_alloc.c
> >> +++ b/mm/page_alloc.c
> >> @@ -6666,6 +6666,20 @@ static void __ref __init_zone_device_page(struct page *page, unsigned long pfn,
> >>         }
> >>  }
> >>
> >> +/*
> >> + * With compound page geometry and when struct pages are stored in ram most
> >> + * tail pages are reused. Consequently, the amount of unique struct pages to
> >> + * initialize is a lot smaller than the total amount of struct pages being
> >> + * mapped. This is a paired / mild layering violation with explicit knowledge
> >> + * of how the sparse_vmemmap internals handle compound pages in the lack
> >> + * of an altmap. See vmemmap_populate_compound_pages().
> >> + */
> >> +static inline unsigned long compound_nr_pages(struct vmem_altmap *altmap,
> >> +                                             unsigned long nr_pages)
> >> +{
> >> +       return !altmap ? 2 * (PAGE_SIZE/sizeof(struct page)) : nr_pages;
> >> +}
> >> +
> >
> > This means only the first 2 pages will be modified; the remaining 6 or 4094 pages
> > will not.  In the HugeTLB case, those tail pages are mapped read-only
> > to catch invalid usage of tail pages (e.g. write operations). Quick question:
> > should we do something similar for DAX?
> >
> What sort of gets in the way of marking deduplicated pages as read-only is one
> particular CONFIG_DEBUG_VM feature, namely page_init_poison(). HugeTLB
> gets its memory from the page allocator or from already pre-populated (at boot)
> system RAM sections, and needs those to be 'given back' before they can be
> hot-unplugged. So I guess it never goes through page_init_poison(). With
> device-dax, however, the sections are populated and dedicated to device-dax when
> hotplugged, and then removed on hotunplug when the last devdax user drops the page
> reference.
>
> So page_init_poison() is called on those two occasions. It actually writes to
> whole sections of memmap, not just one page. So either I gate read-only page
> protection on CONFIG_DEBUG_VM=n (which feels very wrong), or I detect inside
> page_init_poison() that the caller is trying to init compound devmap backed
> struct pages that were already watermarked (i.e. essentially when the pfn offset
> between the passed page and the head page is bigger than 128).

Got it. I hadn't realized that page_init_poison() poisons the struct pages.
I agree with you that mapping them read-only is wrong.
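
To make the second option from the quoted text concrete, here is a rough
userspace sketch of the pfn-offset check. It does not model the real
page_init_poison(); the helper names are hypothetical, and it assumes
4 KiB pages, a 64-byte struct page, and that the first 128 struct pages
(offsets 0 to 127 from the head) are the unique ones.

```c
#include <stdbool.h>

/* Assumed for illustration: 4 KiB base pages and a 64-byte struct page. */
#define SKETCH_PAGE_SIZE        4096UL
#define SKETCH_STRUCT_PAGE_SIZE 64UL
#define SKETCH_UNIQUE_PAGES     (2 * (SKETCH_PAGE_SIZE / SKETCH_STRUCT_PAGE_SIZE))

/*
 * True when the struct page for pfn would sit in a reused (deduplicated)
 * vmemmap page, judged only by its offset from the compound head.
 */
static bool sketch_is_deduplicated_tail(unsigned long pfn,
					unsigned long head_pfn)
{
	return pfn - head_pfn >= SKETCH_UNIQUE_PAGES;
}

/*
 * Count the struct pages a poisoning pass would still have to write for
 * one compound page if it skipped the deduplicated tails.
 */
static unsigned long sketch_pages_to_poison(unsigned long head_pfn,
					    unsigned long nr_pages)
{
	unsigned long pfn, count = 0;

	for (pfn = head_pfn; pfn < head_pfn + nr_pages; pfn++)
		if (!sketch_is_deduplicated_tail(pfn, head_pfn))
			count++;
	return count;
}
```

With these assumptions, a 512-page (2 MiB) compound page would have 128
struct pages written and the remaining 384 skipped.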

Thanks.



Thread overview: 18+ messages
2022-02-10 19:33 [PATCH v5 0/5] sparse-vmemmap: memory savings for compound devmaps (device-dax) Joao Martins
2022-02-10 19:33 ` [PATCH v5 1/5] mm/sparse-vmemmap: add a pgmap argument to section activation Joao Martins
2022-02-11  8:03   ` Muchun Song
2022-02-11 12:37     ` Joao Martins
2022-02-10 19:33 ` [PATCH v5 2/5] mm/sparse-vmemmap: refactor core of vmemmap_populate_basepages() to helper Joao Martins
2022-02-11  7:54   ` Muchun Song
2022-02-10 19:33 ` [PATCH v5 3/5] mm/hugetlb_vmemmap: move comment block to Documentation/vm Joao Martins
2022-02-10 19:33 ` [PATCH v5 4/5] mm/sparse-vmemmap: improve memory savings for compound devmaps Joao Martins
2022-02-11  7:54   ` Muchun Song
2022-02-11 12:37     ` Joao Martins
2022-02-12 10:08       ` Muchun Song
2022-02-12 14:49         ` Muchun Song
2022-02-14 10:57           ` Joao Martins
2022-02-14 10:55         ` Joao Martins
2022-02-10 19:33 ` [PATCH v5 5/5] mm/page_alloc: reuse tail struct pages " Joao Martins
2022-02-11  5:07   ` Muchun Song
2022-02-11 12:48     ` Joao Martins
2022-02-12 11:11       ` Muchun Song [this message]
