From: Muchun Song <songmuchun@bytedance.com>
To: Joao Martins <joao.m.martins@oracle.com>
Cc: Linux Memory Management List <linux-mm@kvack.org>,
Dan Williams <dan.j.williams@intel.com>,
Vishal Verma <vishal.l.verma@intel.com>,
Matthew Wilcox <willy@infradead.org>,
Jason Gunthorpe <jgg@ziepe.ca>, Jane Chu <jane.chu@oracle.com>,
Mike Kravetz <mike.kravetz@oracle.com>,
Andrew Morton <akpm@linux-foundation.org>,
Jonathan Corbet <corbet@lwn.net>, Christoph Hellwig <hch@lst.de>,
nvdimm@lists.linux.dev,
Linux Doc Mailing List <linux-doc@vger.kernel.org>
Subject: Re: [PATCH v6 4/5] mm/sparse-vmemmap: improve memory savings for compound devmaps
Date: Thu, 24 Feb 2022 13:54:54 +0800
Message-ID: <CAMZfGtXm5pLbTnzMCrWPg8Vm3gykB8XEg5DHFm0z1p1x2fhySQ@mail.gmail.com>
In-Reply-To: <20220223194807.12070-5-joao.m.martins@oracle.com>
On Thu, Feb 24, 2022 at 3:48 AM Joao Martins <joao.m.martins@oracle.com> wrote:
>
> A compound devmap is a dev_pagemap with @vmemmap_shift > 0 and it
> means that pages are mapped at a given huge page alignment and use
> compound pages as opposed to order-0 pages.
>
> Take advantage of the fact that most tail pages look the same (except
> the first two) to minimize struct page overhead. Allocate a separate
> page for the vmemmap area which contains the head page and a separate one
> for the next 64 pages. The rest of the subsections then reuse this tail
> vmemmap page to initialize the rest of the tail pages.
>
> Sections are arch-dependent (e.g. on x86 it's 64M, 128M or 512M) and
> when initializing compound devmap with big enough @vmemmap_shift (e.g.
> 1G PUD) it may cross multiple sections. The vmemmap code needs to
> consult @pgmap so that multiple sections that all map the same tail
> data can refer back to the first copy of that data for a given
> gigantic page.
>
> On compound devmaps with 2M alignment, this mechanism saves 6 of the
> 8 PFNs necessary to map the subsection's 512 struct pages. On a 1G
> compound devmap it saves 4094 pages.
>
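The 6-of-8 and 4094 figures can be checked with a little arithmetic. The sketch below is only an illustration, not kernel code: it assumes 4K base pages and a 64-byte struct page (consistent with 64 struct pages per vmemmap page), and that exactly two vmemmap pages are kept per compound page, one for the head and one for the shared tail. The helper names are made up for the example.

```c
#include <stddef.h>

#define BASE_PAGE_SIZE   4096UL /* assumption: 4K base pages */
#define STRUCT_PAGE_SIZE 64UL   /* assumption: sizeof(struct page) == 64 */
#define KEPT_VMEMMAP_PAGES 2UL  /* head vmemmap page + one shared tail page */

/* vmemmap pages needed to hold every struct page of one compound page */
static unsigned long vmemmap_pages(unsigned long compound_size)
{
	unsigned long nr_struct_pages = compound_size / BASE_PAGE_SIZE;

	return nr_struct_pages * STRUCT_PAGE_SIZE / BASE_PAGE_SIZE;
}

/* vmemmap pages saved when all remaining tail pages reuse the shared one */
static unsigned long vmemmap_pages_saved(unsigned long compound_size)
{
	return vmemmap_pages(compound_size) - KEPT_VMEMMAP_PAGES;
}
```

With these assumptions, vmemmap_pages(2UL << 20) is 8 and vmemmap_pages_saved(2UL << 20) is 6, while vmemmap_pages_saved(1UL << 30) is 4094, matching the numbers in the commit message.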
> Altmap isn't supported yet, given various restrictions in the altmap pfn
> allocator, thus fall back to the already in-use vmemmap_populate(). It
> is worth noting that altmap for devmap mappings was there to relieve the
> pressure of inordinate amounts of memmap space to map terabytes of pmem.
> With compound pages the motivation for altmaps for pmem gets reduced.
>
> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
[...]
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 5f549cf6a4e8..b0798b9c6a6a 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -3118,7 +3118,7 @@ p4d_t *vmemmap_p4d_populate(pgd_t *pgd, unsigned long addr, int node);
> pud_t *vmemmap_pud_populate(p4d_t *p4d, unsigned long addr, int node);
> pmd_t *vmemmap_pmd_populate(pud_t *pud, unsigned long addr, int node);
> pte_t *vmemmap_pte_populate(pmd_t *pmd, unsigned long addr, int node,
> - struct vmem_altmap *altmap);
> + struct vmem_altmap *altmap, struct page *block);
You forgot to rename @block to @reuse here.
[...]
> +
> +static int __meminit vmemmap_populate_range(unsigned long start,
> + unsigned long end,
> + int node, struct page *page)
All of the callers pass a valid @page, so this function only populates
the vmemmap with @page and never allocates memory. The @node parameter
therefore seems unnecessary.
If you want to make this function more generic, like
vmemmap_populate_address(), so that it also handles memory allocation
(the @page == NULL case), then I think vmemmap_populate_range() should
take another parameter, `struct vmem_altmap *altmap`.
Otherwise, would it be better to remove @node and rename @page to @reuse?
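To illustrate the variant without @node, here is a rough user-space model, not the kernel implementation. The page table is modelled as a plain array of pointers, and `struct model_page`, `MODEL_PAGE_SIZE` and `model_populate_range()` are hypothetical names used only for this sketch. It shows that when every caller supplies the page to reuse, the loop needs neither a node id nor an allocator.

```c
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL /* assumption: 4K pages in this model */

struct model_page { int id; }; /* stand-in for struct page */

/*
 * Point every page-sized slot in [start, end) at @reuse.
 * Nothing is allocated, so no node id is required.
 * Returns 0 on success, -1 if @reuse is NULL or the range is misaligned.
 */
static int model_populate_range(struct model_page **ptes,
				unsigned long start, unsigned long end,
				struct model_page *reuse)
{
	unsigned long addr;

	if (!reuse || start % MODEL_PAGE_SIZE || end % MODEL_PAGE_SIZE)
		return -1;

	for (addr = start; addr < end; addr += MODEL_PAGE_SIZE)
		ptes[addr / MODEL_PAGE_SIZE] = reuse;

	return 0;
}
```

In this model the @reuse == NULL case is simply rejected; supporting it would mean allocating, which is where @node and an altmap argument would come back in.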
Thanks.