From: Matthew Wilcox <willy@infradead.org>
To: linux-mm@kvack.org
Cc: Uladzislau Rezki <urezki@gmail.com>,
David Hildenbrand <david@redhat.com>,
Christoph Hellwig <hch@lst.de>
Subject: Re: Mapping vmalloc pages to userspace
Date: Fri, 6 Dec 2024 21:01:44 +0000
Message-ID: <Z1NmOC_bkbzW4Fw0@casper.infradead.org>
In-Reply-To: <Z1MmIYZAvj1rE2Fn@casper.infradead.org>
On Fri, Dec 06, 2024 at 04:28:17PM +0000, Matthew Wilcox wrote:
> 4. Introduce an indirection structure between the page and vm_struct which
> contains the refcount.
I'm starting to really warm up to this one. There are a number of
places where we allocate "some pages" but want to treat them as a single
object; vmalloc is not the only one. Let's call this a 'scamem', short
for "scattered memory".
But this is going to be challenging. Assuming we want to support GUP,
we need to be able to go from page->scamem [1]. In the skinniest
version of shrinking struct page, we have just 8 bytes per page, and
we need to both store a pointer to the scamem and store information
like node, zone, section for _each_ page. We don't need to worry about
this for folios/slabs/... because all pages in the folio have the same
node/zone/section, so we can store this information once in the folio
and then copy it back to the page on free. We can't do that for scamem
without a (potentially large) allocation. And even if we do something
like:
	struct scamem {
		unsigned int nr;
		refcount_t refcount;
		unsigned long flags[];
	};
to be able to implement page_to_nid() on a page, we'd have to figure
out which page within the scamem this was. So either we have to give up
on our dream of an 8 byte memdesc, or figure out some other way to do
this.
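To make the lookup problem concrete, here is a rough userspace sketch (not kernel code). It assumes a hypothetical pfns[] array next to the flags, which is not part of the struct above; without something like it there is no way to get from a page to its slot at all. The names mock_scamem, mock_scamem_slot and mock_page_to_nid are made up for illustration, and the encoding of the node id in the low 8 bits of flags is likewise an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace model only; none of these names exist in the kernel. */
struct mock_scamem {
	unsigned int nr;
	unsigned long *pfns;	/* hypothetical: which pfn sits in each slot */
	unsigned long flags[];	/* per-page node/zone/section bits */
};

#define MOCK_NID_MASK 0xffUL	/* assumption: node id in the low 8 bits */

static struct mock_scamem *mock_scamem_alloc(unsigned int nr)
{
	struct mock_scamem *s;

	s = calloc(1, sizeof(*s) + nr * sizeof(s->flags[0]));
	if (!s)
		return NULL;
	/* Allocate at least one slot so nr == 0 does not return NULL. */
	s->pfns = calloc(nr ? nr : 1, sizeof(*s->pfns));
	if (!s->pfns) {
		free(s);
		return NULL;
	}
	s->nr = nr;
	return s;
}

static void mock_scamem_free(struct mock_scamem *s)
{
	assert(s != NULL);
	free(s->pfns);
	free(s);
}

/* Linear scan, O(nr): returns the slot holding pfn, or -1 if absent. */
static long mock_scamem_slot(const struct mock_scamem *s, unsigned long pfn)
{
	assert(s != NULL);
	for (unsigned int i = 0; i < s->nr; i++)
		if (s->pfns[i] == pfn)
			return (long)i;
	return -1;
}

/* Every node lookup pays for the scan above. */
static int mock_page_to_nid(const struct mock_scamem *s, unsigned long pfn)
{
	long slot = mock_scamem_slot(s, pfn);

	return slot < 0 ? -1 : (int)(s->flags[slot] & MOCK_NID_MASK);
}
```

In this model every node lookup costs a scan over all nr slots, and the extra pfns[] array doubles the per-page overhead, which is the cost the paragraph above is worried about.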
So what if we store the scamem pointer in vma->vm_file->private_data,
or vma->vm_private_data? That would let us keep the node/section/zone
in the struct page. GUP has the VMA, so this can work.
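A rough userspace sketch of that idea follows (not kernel code). The mock_vma, mock_page and mock_gup_pin names are hypothetical, the plain unsigned int stands in for refcount_t and is not atomic, and only the pin side is modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace model only; none of these names exist in the kernel. */
struct mock_scamem {
	unsigned int nr;
	unsigned int refcount;	/* stands in for refcount_t; not atomic */
};

struct mock_page {
	unsigned long flags;	/* node/section/zone stay in the page */
};

struct mock_vma {
	void *vm_private_data;	/* holds the scamem pointer in this sketch */
};

/* The page carries no scamem pointer; the VMA is the only route to it. */
static struct mock_scamem *mock_vma_scamem(const struct mock_vma *vma)
{
	assert(vma != NULL);
	return vma->vm_private_data;
}

/*
 * GUP-like pin: take a reference on the scamem found through the VMA.
 * Returns 1 on success, 0 if the VMA has no scamem attached.
 */
static int mock_gup_pin(struct mock_vma *vma, struct mock_page *page)
{
	struct mock_scamem *s = mock_vma_scamem(vma);

	(void)page;	/* the page itself is left untouched */
	if (!s)
		return 0;
	s->refcount++;
	return 1;
}
```

The page's flags word is never modified here, so node/section/zone can remain where they are today.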
Yet another possibility would be if we can look up the page's pfn in
some data structure and reconstruct the zone/section/node information at
freeing time. I don't fully understand the meaning of this information,
so I have no idea if this is possible.
My current thought is:
	struct scamem {
		unsigned int nr;
		refcount_t refcount;
		struct page *pages[];
	};
and changing vm_struct:
-	struct page **pages;
+	struct scamem *scamem;
(I don't think we want to embed it in vm_struct, since we want vm_struct
to hold one refcount on the scamem, and for the scamem to be freed once
its refcount reaches zero rather than freed as part of vm_struct.)
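The lifetime rule in that parenthetical can be modelled in userspace roughly as below (not kernel code). All mock_* names are hypothetical, the plain unsigned int stands in for refcount_t and is not atomic, and mock_scamem_frees exists only so the behaviour can be observed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace model only; none of these names exist in the kernel. */
struct mock_page {
	unsigned long flags;
};

struct mock_scamem {
	unsigned int nr;
	unsigned int refcount;	/* stands in for refcount_t; not atomic */
	struct mock_page *pages[];
};

struct mock_vm_struct {
	struct mock_scamem *scamem;	/* replaces struct page **pages */
};

static int mock_scamem_frees;	/* counts frees, for observation only */

static struct mock_scamem *mock_scamem_alloc(unsigned int nr)
{
	struct mock_scamem *s;

	s = calloc(1, sizeof(*s) + nr * sizeof(s->pages[0]));
	if (!s)
		return NULL;
	s->nr = nr;
	s->refcount = 1;	/* the reference owned by the creator */
	return s;
}

static void mock_scamem_get(struct mock_scamem *s)
{
	assert(s->refcount > 0);
	s->refcount++;
}

/* Drop one reference; the scamem is freed only when the last one goes. */
static void mock_scamem_put(struct mock_scamem *s)
{
	assert(s->refcount > 0);
	if (--s->refcount == 0) {
		mock_scamem_frees++;
		free(s);
	}
}

/* Tearing down the vm_struct drops its reference, nothing more. */
static void mock_vfree(struct mock_vm_struct *vm)
{
	struct mock_scamem *s = vm->scamem;

	vm->scamem = NULL;
	if (s)
		mock_scamem_put(s);
}
```

If some other user took a reference with mock_scamem_get() before mock_vfree() runs, the scamem outlives the vm_struct and is freed by that user's final mock_scamem_put(). An embedded scamem could not do this.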