Re: PCID review? - Mel Gorman

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Mel Gorman <mgorman@techsingularity.net>
To: Andy Lutomirski <luto@amacapital.net>
Cc: Andy Lutomirski <luto@kernel.org>,
	Nadav Amit <nadav.amit@gmail.com>, Borislav Petkov <bp@alien8.de>,
	Kees Cook <keescook@chromium.org>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Subject: Re: PCID review?
Date: Thu, 9 Feb 2017 00:10:42 +0000	[thread overview]
Message-ID: <20170209001042.ahxmoqegr6h74mle@techsingularity.net> (raw)
In-Reply-To: <CALCETrVfah6AFG5mZDjVcRrdXKL=07+WC9ES9ZKU90XqVpWCOg@mail.gmail.com>

On Wed, Feb 08, 2017 at 12:51:24PM -0800, Andy Lutomirski wrote:
> On Tue, Feb 7, 2017 at 10:56 AM, Andy Lutomirski <luto@kernel.org> wrote:
> > Quite a few people have expressed interest in enabling PCID on (x86)
> > Linux.  Here's the code:
> >
> > https://git.kernel.org/cgit/linux/kernel/git/luto/linux.git/log/?h=x86/pcid
> >
> > The main hold-up is that the code needs to be reviewed very carefully.
> > It's quite subtle.  In particular, "x86/mm: Try to preserve old TLB
> > entries using PCID" ought to be looked at carefully to make sure the
> > locking is right, but there are plenty of other ways this this could
> > all break.
> >
> > Anyone want to take a look or maybe scare up some other reviewers?
> > (Kees, you seemed *really* excited about getting this in.)
> 
> Nadav pointed out that this doesn't work right with
> ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH.  Mel, here's the issue:
> 

I confess I didn't read the thread or patch, I'm only looking at this
question.

> I want to add ASID (Intel calls it PCID) support to x86.  This means
> that "flush the TLB on a given CPU" will no longer be a particularly
> well defined operation because it's not clear which ASID tag to flush.
> Instead there's "flush the TLB for a given mm on a given CPU".
> 

Understood.

> If I'm understanding the batched flush code, all it's trying to do is
> to flush more than one mm at a time. 

More or less. It would be more accurate to say it's flushing any CPU TLB
that potentially accessed a list of pages recently.  It's not flushing
"multiple MMs at a time" as such, it's flushing "only CPUs that accessed
pages mapped by a mm recently". The distinction may not matter.

The requirements do matter though.

mm/vmscan.c is where TTU_BATCH_FLUSH is set and that is processing a list
of pages that can be mapped by multiple MMs.

The TLBs must be flushed before either IO starts (try_to_unmap_flush_dirty)
or they are freed to the page allocator (try_to_unmap_flush).

To do this, all it has to track is a simple mask of CPUs, whether a flush
is necessary and whether any of the PTEs were dirty. This is trivial to
assemble during an rmap walk as it's a PTE check and a cpumask_or.

try_to_unmap_flush then flushes the entire TLB as the cost of targetted
a specific page to flush was so high (both maintaining the PFNs and the
individual flush operations).

> Would it make sense to add a new
> arch API to flush more than one mm?  Presumably it would take a linked
> list, and the batched flush code would fall back to flushing in pieces
> if it can't allocate a new linked list node when needed.
> 

Conceptually it's ok but the details are a headache.

The defer code would need to maintain a list of mm's (or ASIDs) that is
unbounded in size to match the number of IPIs sent as the current code as
opposed to a simple cpumask. There are SWAP_CLUSTER_MAX pages to consider
with each page potentially mapped by an arbitrary number of MMs. The same
mm's could exist on multiple lists for each active kswapd instance and
direct reclaimer.

As multiple reclaimers are interested in the same mm, that pretty much
rules out linking them off mm_struct unless the locking would serialise
the parallel reclaimers and prevent an mm existing on more than one list
at a time. You could try allowing multiple tasks to share the one list
(not sure how to find that list quickly) but each entry would have to
be locked and as each user can flush at any time, multiple reclaimers
potentially have to block while an IPI is being sent. It's hard to see
how this could be scaled to match the existing code.

It would be easier to track via an array stored in task_struct but the
number of MMs is unknown in advance so all you can do is guess a reasonable
size. It would have to flush if the array files resulting in more IPIs
than the current code depending on how many MMs map the list of pages.

-- 
Mel Gorman
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

next prev parent reply	other threads:[~2017-02-09  0:10 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-02-07 18:56 Andy Lutomirski
2017-02-07 19:11 ` Kees Cook
2017-02-07 19:24   ` Thomas Garnier
2017-02-07 19:37 ` Nadav Amit
2017-02-08 16:24   ` Andy Lutomirski
2017-02-07 21:06 ` Paul E. McKenney
2017-02-08 16:25   ` Andy Lutomirski
2017-02-08 16:52     ` Paul E. McKenney
2017-02-08 20:51 ` Andy Lutomirski
2017-02-09  0:10   ` Mel Gorman [this message]
2017-02-10  2:46     ` Andy Lutomirski
2017-02-10 11:01       ` Mel Gorman
2017-02-10 16:44         ` Andy Lutomirski
2017-02-10 21:57           ` Mel Gorman
2017-02-10 22:07             ` Andy Lutomirski
2017-02-10 22:25               ` Borislav Petkov
2017-02-10 22:58                 ` Andy Lutomirski
2017-02-13 10:05               ` Mel Gorman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170209001042.ahxmoqegr6h74mle@techsingularity.net \
    --to=mgorman@techsingularity.net \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=keescook@chromium.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@amacapital.net \
    --cc=luto@kernel.org \
    --cc=nadav.amit@gmail.com \
    --cc=paulmck@linux.vnet.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox