From: Hillf Danton <hdanton@sina.com>
To: Jerome Glisse <jglisse@redhat.com>
Cc: Hillf Danton <hdanton@sina.com>,
	John Hubbard <jhubbard@nvidia.com>, linux-mm <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Vlastimil Babka <vbabka@suse.cz>, Mel Gorman <mgorman@suse.de>,
	Dan Williams <dan.j.williams@intel.com>,
	Ira Weiny <ira.weiny@intel.com>, Christoph Hellwig <hch@lst.de>,
	Jonathan Corbet <corbet@lwn.net>
Subject: Re: [RFC] mm: gup: add helper page_try_gup_pin(page)
Date: Fri,  8 Nov 2019 17:38:37 +0800
Message-ID: <20191108093837.1696-1-hdanton@sina.com>
In-Reply-To: <20191107095017.17544-1-hdanton@sina.com>


On Thu, 7 Nov 2019 09:57:48 -0500 Jerome Glisse wrote:
> 
> I am not sure I follow. Today we cannot differentiate between GUP
> and regular get_page(). If you use some combination of specific fs
> and hardware, you might get some BUG_ON() thrown at you depending on
> how lucky/unlucky you are. We cannot solve this without being able
> to differentiate between GUP and regular get_page(). Hence John's
> patchset is the first step in the right direction.
> 
What is the second one? And when? By whom?

> If there is no GUP on a page then regular writeback happens as it has
> for years now, so in the absence of GUP I do not see any issue.
> 
> 
> > > still something where there is no agreement, as far as I remember the
> > > outcome of the last discussion we had. I expect this will be a topic
> > > at next LSF/MM or maybe something we can flush out before.
> >
> > These are the constraints we know of:
> >
> > A, multiple gup pins
> > B, mutual data corruptions
> > C, no break of existing use cases
> > D, zero copy
> 
> What do you mean by zero copy?
> 
Here is a snippet from https://lwn.net/Articles/784574/:

"get_user_pages() is a way to map user-space memory into the kernel's
address space; it will ensure that all of the requested pages have
been faulted into RAM (and locked there) and provide a kernel mapping
that, in turn, can be used for direct access by the kernel or (more
often) to set up zero-copy I/O operations."

> > E, feel free to add
> >
> > then what is preventing an agreement like bounce page?
> 
> There are 2 sides (AFAIR):
>     - do not write back GUPed pages and wait until GUP goes away to
>       write them. But GUP can last as long as the uptime and we can
>       lose data on power failure.
>     - use a bounce page so that there is a chance we have some data
>       on power failure
> 
> >
> > Because page migrate and reclaim have been working for a while with
> > gup pin taken into account, detecting it has no priority in any form
> > over the agreement on how to make a writeback page stable.
> 
> migrate just ignores GUPed pages and thus there is no issue with migrate.
> writeback is a special case here because some filesystems need stable
> page content and also we need to inhibit some fs-specific things that
> trigger BUG_ON() in set_page_dirty*()
> 
Which drivers so far have been snared by the BUG_ON()? Is there any
chance to fix them one after another? Otherwise, what makes them
special (long-lived pins)?

After the page is set dirty, is there any DMA transfer still pending to
the dirty page? If yes, what is the point of writing back corrupted
data? If no, what is preventing the gup pin from being released?

> > What seems more important, restriction B above makes C hard to meet
> > in any feasible approach trying to keep a writeback page stable, and
> > zero-copy makes it harder AFAICS.
> 
> writeback can use a bounce page. Zero copy, i.e. not having to use a
> bounce page, is not an issue; in fact in some cases we already use
> bounce pages (at the block device level).
> 
> >
> > > In any case my opinion is bounce page is the best thing we can do,
> > > from application and FS point of view it mimics the characteristics
> > > of regular write-back just as if the write protection window of the
> > > write-backed page was infinitely short.
> >
> > A 100-line patch tells more than a 200-line explanation can and helps
> > to shorten the discussion prior to reaching an agreement.
> 
> It is not that trivial: you need to make sure every layer from the fs
> down to the block device driver behaves properly when given a bounce
> page. We have such a mechanism for bio, but it is at the bio level;
> maybe it can be dumped one level.
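
To make the bounce page idea above concrete, here is a minimal user-space
sketch, not kernel code. It assumes that a pinned page is snapshotted into
a bounce buffer at writeback time and that the snapshot, not the live
page, is what reaches storage. All names (sketch_page, sketch_disk,
sketch_writeback_prepare, sketch_writeback_complete) and the gup_pinned
flag are hypothetical, and the fs, bio and driver layers mentioned above
are left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096

/* Hypothetical user-space stand-in for a page cache page. */
struct sketch_page {
	unsigned char data[SKETCH_PAGE_SIZE];
	int gup_pinned;	/* nonzero while a (modelled) GUP pin is held */
};

/* Hypothetical stand-in for one block on the storage device. */
struct sketch_disk {
	unsigned char block[SKETCH_PAGE_SIZE];
};

/*
 * Pick the buffer that will be handed to the device.
 * An unpinned page is written directly, as regular writeback does.
 * A pinned page is copied into the caller-provided bounce buffer so
 * that later stores into the page cannot change the data in flight.
 */
const unsigned char *sketch_writeback_prepare(const struct sketch_page *page,
					      unsigned char *bounce)
{
	assert(page != NULL && bounce != NULL);

	if (!page->gup_pinned)
		return page->data;

	memcpy(bounce, page->data, SKETCH_PAGE_SIZE);
	return bounce;
}

/* Model the device consuming the chosen buffer some time later. */
void sketch_writeback_complete(struct sketch_disk *disk,
			       const unsigned char *src)
{
	assert(disk != NULL && src != NULL);
	memcpy(disk->block, src, SKETCH_PAGE_SIZE);
}
```

If the pin holder stores into the page between prepare and complete, the
disk block still holds the snapshot taken at prepare time, which is the
stable page content the filesystem wants. The cost is one page copy per
writeback of a pinned page.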


