linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: John Hubbard <jhubbard@nvidia.com>
To: Linus Torvalds <torvalds@linux-foundation.org>,
	Peter Xu <peterx@redhat.com>
Cc: Linux MM Mailing List <linux-mm@kvack.org>,
	Jason Gunthorpe <jgg@nvidia.com>, Jan Kara <jack@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	Andrea Arcangeli <aarcange@redhat.com>
Subject: Re: Possible race with page_maybe_dma_pinned?
Date: Wed, 29 Sep 2021 18:57:33 -0700	[thread overview]
Message-ID: <9636a101-4445-0ede-e3ad-4cecd531f433@nvidia.com> (raw)
In-Reply-To: <CAHk-=wjdiXyXFukBUwLKdSXRrwTzyRnynLFvJUe1BM4D6wBvsw@mail.gmail.com>

On 9/29/21 15:47, Linus Torvalds wrote:
> On Wed, Sep 29, 2021 at 12:57 PM Peter Xu <peterx@redhat.com> wrote:
>>
>> Now we have 3 callers of page_maybe_dma_pinned():
>>
>>          1. page_needs_cow_for_dma
>>          2. pte_is_pinned
>>          3. shrink_page_list
>>
>> The 1st one is good as it takes the seqlock for write properly.  The 2nd & 3rd
>> are missing, we may need to add them.
> 
> Well, the pte_is_pinned() case at least could do the seqlock in
> clear_soft_dirty() - it has the vma and mm available.

That part seems straightforward, OK.

> 
> The page shrinker has always been problematic since it doesn't have
> the vm (and by "always" I mean "modern times" - long ago we used to
> scan virtually, in the days before rmap)
> 
> One option might be for fast-gup to give up on locked pages. I think
> the page lock is the only thing that shrink_page_list() serializes
> with.
> 

In order to avoid locked pages in gup fast, it is easiest to do a
check for locked pages *after* fast-pinning them, and unpin them before
returning to the caller. This makes the change much smaller.

However, doing so leaves a window of time during which the pages are
still marked as maybe-dma-pinned, although those pages are never
returned to the caller as such. There is already code that is subject to
this in lockless_pages_from_mm(), for the case of a failed seqlock read.
I'm thinking it's probably OK, because the pages are not going to be
held long-term. They will be unpinned before returning from
lockless_pages_from_mm().

The counter argument is that this is merely making the race window
smaller, which is usually something that I argue against because it just
leads to harder-to-find bugs...

To be specific, here's what I mean:

diff --git a/mm/gup.c b/mm/gup.c
index 886d6148d3d03..8ba871a927668 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -2657,6 +2657,7 @@ static unsigned long lockless_pages_from_mm(unsigned long start,
  {
  	unsigned long flags;
  	int nr_pinned = 0;
+	int i;
  	unsigned seq;

  	if (!IS_ENABLED(CONFIG_HAVE_FAST_GUP) ||
@@ -2693,7 +2694,23 @@ static unsigned long lockless_pages_from_mm(unsigned long start,
  			unpin_user_pages(pages, nr_pinned);
  			return 0;
  		}
+
+		/*
+		 * Avoiding locked pages, in this fast/lockless context, will
+		 * avoid interfering with shrink_page_list(), in particular.
+		 * Give up upon finding the first locked page, but keep the
+		 * earlier pages, so that slow gup does not have to re-pin them.
+		 */
+		for (i = 0; i < nr_pinned; i++)  {
+			if (PageLocked(pages[i])) {
+				unpin_user_pages(&pages[i], nr_pinned - i);
+				nr_pinned = i + 1;
+				break;
+			}
+		}
  	}
+
+
  	return nr_pinned;
  }


thanks,
-- 
John Hubbard
NVIDIA


  reply	other threads:[~2021-09-30  1:57 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-29 19:57 Peter Xu
2021-09-29 22:47 ` Linus Torvalds
2021-09-30  1:57   ` John Hubbard [this message]
2021-09-30 11:11     ` Jan Kara

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9636a101-4445-0ede-e3ad-4cecd531f433@nvidia.com \
    --to=jhubbard@nvidia.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=jack@suse.cz \
    --cc=jgg@nvidia.com \
    --cc=linux-mm@kvack.org \
    --cc=peterx@redhat.com \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox