linux-mm.kvack.org archive mirror
From: John Hubbard <jhubbard@nvidia.com>
To: Ira Weiny <ira.weiny@intel.com>
Cc: Vlastimil Babka <vbabka@suse.cz>,
	Michal Hocko <mhocko@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Christoph Hellwig <hch@infradead.org>, Jan Kara <jack@suse.cz>,
	Jason Gunthorpe <jgg@ziepe.ca>,
	Jerome Glisse <jglisse@redhat.com>,
	LKML <linux-kernel@vger.kernel.org>, <linux-mm@kvack.org>,
	<linux-fsdevel@vger.kernel.org>,
	Dan Williams <dan.j.williams@intel.com>,
	Daniel Black <daniel@linux.ibm.com>,
	Matthew Wilcox <willy@infradead.org>,
	Mike Kravetz <mike.kravetz@oracle.com>
Subject: Re: [PATCH 1/3] mm/mlock.c: convert put_page() to put_user_page*()
Date: Thu, 8 Aug 2019 16:57:51 -0700
Message-ID: <5713cc2b-b41c-142a-eb52-f5cda999eca7@nvidia.com>
In-Reply-To: <20190808234138.GA15908@iweiny-DESK2.sc.intel.com>

On 8/8/19 4:41 PM, Ira Weiny wrote:
> On Thu, Aug 08, 2019 at 03:59:15PM -0700, John Hubbard wrote:
>> On 8/8/19 12:20 PM, John Hubbard wrote:
>>> On 8/8/19 4:09 AM, Vlastimil Babka wrote:
>>>> On 8/8/19 8:21 AM, Michal Hocko wrote:
>>>>> On Wed 07-08-19 16:32:08, John Hubbard wrote:
>>>>>> On 8/7/19 4:01 AM, Michal Hocko wrote:
>>>>>>> On Mon 05-08-19 15:20:17, john.hubbard@gmail.com wrote:
...
>> Oh, and meanwhile, I'm leaning toward a cheap fix: just use gup_fast() instead
>> of get_page(), and also fix the releasing code. So this incremental patch, on
>> top of the existing one, should do it:
>>
>> diff --git a/mm/mlock.c b/mm/mlock.c
>> index b980e6270e8a..2ea272c6fee3 100644
>> --- a/mm/mlock.c
>> +++ b/mm/mlock.c
>> @@ -318,18 +318,14 @@ static void __munlock_pagevec(struct pagevec *pvec, struct zone *zone)
>>                 /*
>>                  * We won't be munlocking this page in the next phase
>>                  * but we still need to release the follow_page_mask()
>> -                * pin. We cannot do it under lru_lock however. If it's
>> -                * the last pin, __page_cache_release() would deadlock.
>> +                * pin.
>>                  */
>> -               pagevec_add(&pvec_putback, pvec->pages[i]);
>> +               put_user_page(pages[i]);

Correction, make that:
                   put_user_page(pvec->pages[i]);

(This is not fully tested yet.)

>>                 pvec->pages[i] = NULL;
>>         }
>>         __mod_zone_page_state(zone, NR_MLOCK, delta_munlocked);
>>         spin_unlock_irq(&zone->zone_pgdat->lru_lock);
>>  
>> -       /* Now we can release pins of pages that we are not munlocking */
>> -       pagevec_release(&pvec_putback);
>> -
> 
> I'm not an expert but this skips a call to lru_add_drain().  Is that ok?

Yes: unless I'm missing something, there is no reason to go through
lru_add_drain() in this case. These are gup'd pages that are not going to get
any further processing.

> 
>>         /* Phase 2: page munlock */
>>         for (i = 0; i < nr; i++) {
>>                 struct page *page = pvec->pages[i];
>> @@ -394,6 +390,8 @@ static unsigned long __munlock_pagevec_fill(struct pagevec *pvec,
>>         start += PAGE_SIZE;
>>         while (start < end) {
>>                 struct page *page = NULL;
>> +               int ret;
>> +
>>                 pte++;
>>                 if (pte_present(*pte))
>>                         page = vm_normal_page(vma, start, *pte);
>> @@ -411,7 +409,13 @@ static unsigned long __munlock_pagevec_fill(struct pagevec *pvec,
>>                 if (PageTransCompound(page))
>>                         break;
>>  
>> -               get_page(page);
>> +               /*
>> +                * Use get_user_pages_fast(), instead of get_page() so that the
>> +                * releasing code can unconditionally call put_user_page().
>> +                */
>> +               ret = get_user_pages_fast(start, 1, 0, &page);
>> +               if (ret != 1)
>> +                       break;
> 
> I like the idea of making this a get/put pair but I'm feeling uneasy about how
> this is really supposed to work.
> 
> For sure the GUP/PUP was supposed to be separate from [get|put]_page.
> 

Actually, both gup and get_page() take a reference on the page, and it is
absolutely OK to call them both on the same page.

But anyway, we're not mixing them up here. If you follow the code paths, the
page is acquired via either gup or follow_page_mask(), and then
put_user_page() releases it.
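
To make the point above concrete, here is a toy userspace model. It is not
kernel code: every toy_* name is made up for this sketch, and it assumes (as a
simplification) that both acquisition paths bump the same per-page count, so
the only thing that matters is that each acquire is matched by one release.

```c
#include <assert.h>

/* Toy stand-in for struct page: only the reference count is modeled. */
struct toy_page {
	int refcount;
};

/* Models get_page(): take one reference. */
static void toy_get_page(struct toy_page *page)
{
	page->refcount++;
}

/* Models put_page(): drop one reference. */
static void toy_put_page(struct toy_page *page)
{
	assert(page->refcount > 0);
	page->refcount--;
}

/*
 * Models a gup-style acquire of a single page. In this toy it takes one
 * reference on the same counter and returns the number of pages acquired.
 */
static int toy_gup_fast(struct toy_page *page)
{
	page->refcount++;
	return 1;
}

/* Models put_user_page(): the release that pairs with toy_gup_fast(). */
static void toy_put_user_page(struct toy_page *page)
{
	toy_put_page(page);
}

/*
 * Take both kinds of reference on one page, then release each with its
 * matching call. Returns the final count, which should equal the starting
 * count if the pairs balance.
 */
static int toy_pairing_demo(void)
{
	struct toy_page page = { .refcount = 1 }; /* hypothetical base ref */

	toy_get_page(&page);            /* plain reference */
	if (toy_gup_fast(&page) != 1)   /* gup-style reference */
		return -1;

	toy_put_user_page(&page);       /* pairs with toy_gup_fast() */
	toy_put_page(&page);            /* pairs with toy_get_page() */

	return page.refcount;
}
```

Calling toy_pairing_demo() from a small main() should leave the count where it
started, which is the property the mlock change relies on: whichever path took
the reference, exactly one matching release drops it.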

So...you haven't actually pointed to a bug here, right? :)


thanks,
-- 
John Hubbard
NVIDIA



Thread overview: 23+ messages
2019-08-05 22:20 [PATCH 0/3] mm/: 3 more put_user_page() conversions john.hubbard
2019-08-05 22:20 ` [PATCH 1/3] mm/mlock.c: convert put_page() to put_user_page*() john.hubbard
2019-08-07 11:01   ` Michal Hocko
2019-08-07 23:32     ` John Hubbard
2019-08-08  6:21       ` Michal Hocko
2019-08-08 11:09         ` Vlastimil Babka
2019-08-08 19:20           ` John Hubbard
2019-08-08 22:59             ` John Hubbard
2019-08-08 23:41               ` Ira Weiny
2019-08-08 23:57                 ` John Hubbard [this message]
2019-08-09 18:22                   ` Weiny, Ira
2019-08-09  8:12               ` Vlastimil Babka
2019-08-09  8:23                 ` Michal Hocko
2019-08-09  9:05                   ` John Hubbard
2019-08-09  9:16                     ` Michal Hocko
2019-08-09 13:58                   ` Jan Kara
2019-08-09 17:52                     ` Michal Hocko
2019-08-09 18:14                       ` Weiny, Ira
2019-08-09 18:36                         ` John Hubbard
2019-08-05 22:20 ` [PATCH 2/3] mm/mempolicy.c: " john.hubbard
2019-08-05 22:20 ` [PATCH 3/3] mm/ksm: " john.hubbard
2019-08-06 21:59 ` [PATCH 0/3] mm/: 3 more put_user_page() conversions Andrew Morton
2019-08-06 22:05   ` John Hubbard
