linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Ingo Molnar <mingo@kernel.org>
To: Dan Williams <dan.j.williams@intel.com>
Cc: "linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"Linux MM" <linux-mm@kvack.org>,
	"Jérôme Glisse" <jglisse@redhat.com>,
	"Ingo Molnar" <mingo@redhat.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Logan Gunthorpe" <logang@deltatee.com>,
	"Kirill Shutemov" <kirill.shutemov@linux.intel.com>
Subject: Re: [PATCH v2] mm, zone_device: replace {get, put}_zone_device_page() with a single reference
Date: Mon, 1 May 2017 09:12:59 +0200	[thread overview]
Message-ID: <20170501071259.5vya524wcdddm42b@gmail.com> (raw)
In-Reply-To: <CAPcyv4i8WrNPzu_-Lu1uKi8NT-vj1PF0h0SW_Pi=QGn5PPhQfQ@mail.gmail.com>


* Dan Williams <dan.j.williams@intel.com> wrote:

> On Sat, Apr 29, 2017 at 7:18 AM, Ingo Molnar <mingo@kernel.org> wrote:
> >
> > * Dan Williams <dan.j.williams@intel.com> wrote:
> >
> >> Kirill points out that the calls to {get,put}_dev_pagemap() can be
> >> removed from the mm fast path if we take a single get_dev_pagemap()
> >> reference to signify that the page is alive and use the final put of the
> >> page to drop that reference.
> >>
> >> This does require some care to make sure that any waits for the
> >> percpu_ref to drop to zero occur *after* devm_memremap_page_release(),
> >> since it now maintains its own elevated reference.
> >>
> >> Cc: Ingo Molnar <mingo@redhat.com>
> >> Cc: Jerome Glisse <jglisse@redhat.com>
> >> Cc: Andrew Morton <akpm@linux-foundation.org>
> >> Reviewed-by: Logan Gunthorpe <logang@deltatee.com>
> >> Suggested-by: Kirill Shutemov <kirill.shutemov@linux.intel.com>
> >> Tested-by: Kirill Shutemov <kirill.shutemov@linux.intel.com>
> >> Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> >
> > This changelog is lacking an explanation about how this solves the crashes you
> > were seeing.
> 
> Kirill? It wasn't clear to me why the conversion to generic 
> get_user_pages_fast() caused the reference counts to be off.

Ok, the merge window is open and we really need this fix for x86/mm, so this is 
what I've decoded:

 The x86 conversion to the generic GUP code included a small change which causes
 crashes and data corruption in the pmem code - not good.

 The root cause is that the /dev/pmem driver code implicitly relies on the x86
 get_user_pages() implementation doing a get_page() on the page refcount, because
 get_page() does a get_zone_device_page() which properly refcounts pmem's separate
 page struct arrays that are not present in the regular page struct structures.
 (The pmem driver does this because it can cover huge memory areas.)

 But the x86 conversion to the generic GUP code changed the get_page() to
 page_cache_get_speculative() which is faster but doesn't do the
 get_zone_device_page() call the pmem code relies on.

 One way to solve the regression would be to change the generic GUP code to use 
 get_page(), but that would slow things down a bit and punish other generic-GUP 
 using architectures for an x86-ism they did not care about. (Arguably the pmem 
 driver was probably not working reliably for them: but nvdimm is an Intel
 feature, so non-x86 exposure is probably still limited.)

 So restructure the pmem code's interface with the MM instead: get rid of the 
 get/put_zone_device_page() distinction, integrate put_zone_device_page() into 
 __put_page() and and restructure the pmem completion-wait and teardown machinery.

 This speeds up things while also making the pmem refcounting more robust going 
 forward.

... is this extension to the changelog correct?

I'll apply this for the time being - but can still amend the text before sending 
it to Linus later today.

Thanks,

	Ingo

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2017-05-01  7:13 UTC|newest]

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CAA9_cmf7=aGXKoQFkzS_UJtznfRtWofitDpV2AyGwpaRGKyQkg@mail.gmail.com>
2017-04-23 23:31 ` get_zone_device_page() in get_page() and page_cache_get_speculative() Kirill A. Shutemov
2017-04-24 17:23   ` Dan Williams
2017-04-24 17:30     ` Kirill A. Shutemov
2017-04-24 17:47       ` Dan Williams
2017-04-24 18:01         ` Kirill A. Shutemov
2017-04-24 18:25           ` Kirill A. Shutemov
2017-04-24 18:41             ` Dan Williams
2017-04-25 13:19               ` Kirill A. Shutemov
2017-04-25 16:44                 ` Dan Williams
2017-04-27  0:55   ` [PATCH] mm, zone_device: replace {get, put}_zone_device_page() with a single reference Dan Williams
2017-04-27  8:33     ` Kirill A. Shutemov
2017-04-28  6:39       ` Ingo Molnar
2017-04-28  8:14         ` [PATCH] mm, zone_device: Replace " Kirill A. Shutemov
2017-04-28 17:23         ` [PATCH v2] mm, zone_device: replace " Dan Williams
2017-04-28 17:34           ` Jerome Glisse
2017-04-28 17:41             ` Dan Williams
2017-04-28 18:00               ` Jerome Glisse
2017-04-28 19:02                 ` Dan Williams
2017-04-28 19:16                   ` Jerome Glisse
2017-04-28 19:22                     ` Dan Williams
2017-04-28 19:33                       ` Jerome Glisse
2017-04-29 10:17                         ` Kirill A. Shutemov
2017-04-30 23:14                           ` Jerome Glisse
2017-05-01  1:42                             ` Dan Williams
2017-05-01  1:54                               ` Jerome Glisse
2017-05-01  2:40                                 ` Dan Williams
2017-05-01  3:48                             ` Logan Gunthorpe
2017-05-01 10:23                             ` Kirill A. Shutemov
2017-05-01 13:55                               ` Jerome Glisse
2017-05-01 20:19                                 ` Dan Williams
2017-05-01 20:32                                   ` Jerome Glisse
2017-05-02 11:37                                 ` Kirill A. Shutemov
2017-05-02 13:22                                   ` Jerome Glisse
2017-04-29 14:18           ` Ingo Molnar
2017-05-01  2:45             ` Dan Williams
2017-05-01  7:12               ` Ingo Molnar [this message]
2017-05-01  9:33                 ` Kirill A. Shutemov
2017-04-27 16:11     ` [PATCH] " Logan Gunthorpe
2017-04-27 16:14       ` Dan Williams
2017-04-27 16:33         ` Logan Gunthorpe
2017-04-27 16:38           ` Dan Williams
2017-04-27 16:45             ` Logan Gunthorpe
2017-04-27 16:46               ` Dan Williams

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170501071259.5vya524wcdddm42b@gmail.com \
    --to=mingo@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=dan.j.williams@intel.com \
    --cc=jglisse@redhat.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=logang@deltatee.com \
    --cc=mingo@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox