From: Christopher Lameter <cl@linux.com>
To: linux-rdma@vger.kernel.org
Cc: linux-mm@kvack.org, Michal Hocko <mhocko@kernel.org>,
	Jason Gunthorpe <jgg@ziepe.ca>
Subject: [LSFMM] RDMA data corruption potential during FS writeback
Date: Fri, 18 May 2018 14:37:52 +0000
Message-ID: <0100016373af827b-e6164b8d-f12e-4938-bf1f-2f85ec830bc0-000000@email.amazonses.com>

There was a session at the Linux Filesystem and Memory Management summit
on issues that are caused by devices using get_user_pages() or elevated
refcounts to pin pages and then do I/O on them.

See https://lwn.net/Articles/753027/

Basically, filesystems need to mark pages read-only during writeback.
Concurrent DMA into a page while the filesystem is writing it out can cause
corrupted data to be written to disk, incorrect checksums, and so on.
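
To make the failure mode concrete, here is a minimal userspace sketch of the
race, not kernel code. The checksum, the page size constant and all names
(toy_checksum, writeback_is_consistent, TOY_PAGE_SIZE) are assumptions made
up for illustration. It models a filesystem that checksums a page and then
copies it to "disk", with an optional device store landing in between.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096  /* illustrative page size, not taken from any header */

/* Toy checksum standing in for whatever a filesystem would compute. */
static uint32_t toy_checksum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;

    for (size_t i = 0; i < len; i++)
        sum = sum * 31 + buf[i];
    return sum;
}

/*
 * Model one writeback of a page. If dma_during_writeback is nonzero, a
 * device store lands after the checksum is computed but before the data
 * is copied out to "disk". Returns 1 if the data on "disk" still matches
 * the stored checksum, 0 otherwise.
 */
static int writeback_is_consistent(int dma_during_writeback)
{
    uint8_t page[TOY_PAGE_SIZE];
    uint8_t disk[TOY_PAGE_SIZE];
    uint32_t stored;

    memset(page, 0xAA, sizeof(page));

    stored = toy_checksum(page, sizeof(page));  /* filesystem checksums the page */

    if (dma_during_writeback)
        page[128] = 0x55;                       /* device writes into the pinned page */

    memcpy(disk, page, sizeof(disk));           /* page contents go out to disk */

    return toy_checksum(disk, sizeof(disk)) == stored;
}
```

Calling writeback_is_consistent(0) models the case where the page stays
stable during writeback; writeback_is_consistent(1) models a device that
keeps writing through its pin and leaves the checksum stale.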

The solution that was proposed at the meeting was that mmu notifiers can
remedy that situation by allowing callbacks to the RDMA device to ensure
that the RDMA device and the filesystem do not do concurrent writeback.

This issue has been around for a long time and so far has not caused too
much grief, it seems. Doing I/O to two devices from the same memory
location is naturally a bit inconsistent in itself.

But could we do more to prevent issues here? I think it may be useful to
disallow memory registrations of file-backed writable mappings unless the
device driver provides mmu notifier callbacks or something like that.
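
As a rough sketch of what such a registration check could look like, the
following models the policy only. The structures and the function
toy_may_register() are hypothetical and do not correspond to any existing
kernel or verbs interface.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical description of the mapping a registration would cover. */
struct toy_mapping {
    bool file_backed;   /* backed by a file rather than anonymous memory */
    bool writable;      /* mapped with write permission */
};

/* Hypothetical description of what the device driver supports. */
struct toy_driver {
    bool has_mmu_notifier_callbacks;
};

/*
 * Policy sketch: refuse registrations of file-backed writable mappings
 * unless the driver can be called back when the mapping changes.
 * Everything else is allowed as before.
 */
static bool toy_may_register(const struct toy_mapping *m,
                             const struct toy_driver *d)
{
    if (m->file_backed && m->writable)
        return d->has_mmu_notifier_callbacks;
    return true;
}
```

Under this sketch, anonymous memory and read-only file mappings register as
they do today; only the combination that can collide with writeback is gated
on callback support.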

There is also the longstanding issue of refcounts that are held over long
time periods. If we require mmu notifier callbacks then we may as well go
to on-demand paging mode for RDMA memory registrations. This avoids
elevating the refcounts long term and allows easy access control / page
removal for memory management.
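
The difference from a long-term pin can be sketched with a toy model. All
names here (toy_page, toy_odp_mr and the two functions) are hypothetical,
and whether a real implementation holds a page reference while the device
translation exists is an implementation detail this sketch does not try to
capture. It only shows that no reference outlives an invalidation.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a page with a reference count. */
struct toy_page {
    int refcount;
};

/* Toy on-demand-paging registration covering a single page. */
struct toy_odp_mr {
    struct toy_page *page;
    bool device_mapped;   /* does the device currently hold a translation? */
};

/* Device touches the memory: fault the translation in if it is missing. */
static void toy_odp_device_access(struct toy_odp_mr *mr)
{
    if (!mr->device_mapped) {
        mr->page->refcount++;       /* assumed: reference held only while mapped */
        mr->device_mapped = true;
    }
}

/* Notifier-style callback: memory management wants the page back. */
static void toy_odp_invalidate(struct toy_odp_mr *mr)
{
    if (mr->device_mapped) {
        mr->device_mapped = false;  /* device must fault again before next use */
        mr->page->refcount--;
    }
}
```

After toy_odp_invalidate() the page is back at its original refcount, so
writeback or page removal can proceed, and the next device access simply
faults the translation in again.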

There may be even more issues if DAX is being used, but FS writeback has
the potential to bite anyone at this point, it seems.

I think we need to put some thought into these issues and we need some
coordination between the RDMA developers and memory management. RDMA seems
to be more and more important and thus it is likely that issues like this
will become more important.

Thread overview: 13+ messages
2018-05-18 14:37 Christopher Lameter [this message]
2018-05-18 15:49 ` Jason Gunthorpe
2018-05-18 16:47   ` Christopher Lameter
2018-05-18 17:36     ` Jason Gunthorpe
2018-05-18 20:23       ` Dan Williams
2018-05-19  2:33         ` John Hubbard
2018-05-19  3:24           ` Jason Gunthorpe
2018-05-19  3:51             ` Dan Williams
2018-05-19  5:38               ` John Hubbard
2018-05-21 14:38               ` Matthew Wilcox
2018-05-23 23:03                 ` John Hubbard
2018-05-21 13:37             ` Christopher Lameter
2018-05-21 13:59           ` Christopher Lameter
