From: Claudio Imbrenda <imbrenda@linux.ibm.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: linux-mm@kvack.org, linux-s390@vger.kernel.org
Subject: Re: Inaccessible pages & folios
Date: Thu, 15 Apr 2021 11:28:14 +0200
Message-ID: <20210415112814.303f7f02@ibm-vm>
In-Reply-To: <20210412135514.GK2531743@casper.infradead.org>

On Mon, 12 Apr 2021 14:55:14 +0100
Matthew Wilcox <willy@infradead.org> wrote:
[...]
>
> I was only thinking about the page cache case ...
>
> access_ret = arch_make_page_accessible(page);
> /*
>  * If writeback has been triggered on a page that cannot be made
>  * accessible, it is too late to recover here.
>  */
> VM_BUG_ON_PAGE(access_ret != 0, page);
>
> ... where it seems all pages _can_ be made accessible.

yes, for that case it is straightforward

> > also, I assume you keep the semantic difference between get_page and
> > pin_page? that's also very important for us
>
> I haven't changed anything in gup.c yet. Just trying to get the page
> cache to suck less right now.

fair enough :)

> > > So what you're saying is that the host might allocate, eg a 1GB
> > > folio for a guest, then the guest splits that up into smaller
> > > chunks (eg 1MB), and would only want one of those small chunks
> > > accessible to the hypervisor?
> >
> > qemu will allocate a big chunk of memory, and I/O would happen only
> > on small chunks (depending on what the guest does). I don't know
> > how swap and pagecache would behave in the folio scenario.
> >
> > Also consider that currently we need 4k hardware pages for protected
> > guests (so folios would be ok, as long as they are backed by small
> > pages)
> >
> > How and when are folios created actually?
> >
> > is there a way to prevent creation of multi-page folios?
>
> Today there's no way to create multi-page folios because I haven't
> submitted the patch to add alloc_folio() and friends:
>
> https://git.infradead.org/users/willy/pagecache.git/commitdiff/4fe26f7a28ffdc850cd016cdaaa74974c59c5f53
>
> We do have a way to allocate compound pages and add them to the page
> cache, but that's only in use by tmpfs/shmem.
>
> What will happen is that (for filesystems which support multipage
> folios), they'll be allocated by the page cache. I expect other
> places will start to use folios after that (eg anonymous memory), but
> I don't know where all those places will be. I hope not to be
> involved in that!
>
> The general principle, though, is that the overhead of tracking
> memory in page-sized units is too high, and we need to use larger
> units by default. There are occasions when we need to do things to
> memory in smaller units, and for those, we can choose to either
> handle sub-folio things, or we can split a folio apart into smaller
> folios.
>
> > > > a possible approach maybe would be to keep the _page variant,
> > > > and add a _folio wrapper around it
> > >
> > > Yes, we can do that. It's what I'm currently doing for
> > > flush_dcache_folio().
> >
> > where would the page flags be stored? as I said, we really depend on
> > that bit to be set correctly to prevent potentially disruptive I/O
> > errors. It's ok if the bit overindicates protection (non-protected
> > pages can be marked as protected), but protected pages must at all
> > times have the bit set.
> >
> > the reason why this hook exists at all, is to prevent secure pages
> > from being accidentally (or maliciously) fed into I/O
>
> You can still use PG_arch_1 on the sub-pages of a folio. It's one of
> the things you'll have to decide, actually. Does setting PG_arch_1 on
> the head page of the folio indicate that the entire page is
> accessible, or just that the head page is accessible? Different page
> flags have made different decisions here.

ok then, I think the simplest and safest thing to do right now is to
keep the flag on each page

in short:

* pagecache -> you can put a loop or introduce a _folio wrapper around
  arch_make_page_accessible
* gup.c -> won't be touched for now, but when the time comes, the
  PG_arch_1 bit should be set for each page
Thread overview: 6+ messages
2021-04-09 19:40 Matthew Wilcox
2021-04-12 12:18 ` Claudio Imbrenda
2021-04-12 12:43 ` Matthew Wilcox
2021-04-12 13:37 ` Claudio Imbrenda
2021-04-12 13:55 ` Matthew Wilcox
2021-04-15 9:28 ` Claudio Imbrenda [this message]