linux-mm.kvack.org archive mirror
From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Yury Norov <ynorov@caviumnetworks.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Dan Williams <dan.j.williams@intel.com>,
	Huang Ying <ying.huang@intel.com>,
	"Michael S . Tsirkin" <mst@redhat.com>,
	Michel Lespinasse <walken@google.com>,
	Souptick Joarder <jrdr.linux@gmail.com>, Willy Tarreau <w@1wt.eu>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH] mm: fix COW faults after mlock()
Date: Tue, 25 Sep 2018 13:48:29 +0300
Message-ID: <20180925104829.jld5xd6evr7uhwfw@kshutemo-mobl1>
In-Reply-To: <20180924234843.GA23726@yury-thinkpad>

On Tue, Sep 25, 2018 at 02:48:43AM +0300, Yury Norov wrote:
> On Tue, Sep 25, 2018 at 12:22:47AM +0300, Kirill A. Shutemov wrote:
> > On Mon, Sep 24, 2018 at 04:08:52PM +0300, Yury Norov wrote:
> > > After mlock() on newly mmap()ed shared memory I observe page faults.
> > >
> > > The problem is that populate_vma_page_range() doesn't set the FOLL_WRITE
> > > flag for writable shared memory in the mlock() path, with this justification:
> > > /*
> > >  * We want to touch writable mappings with a write fault in order
> > >  * to break COW, except for shared mappings because these don't COW
> > >  * and we would not want to dirty them for nothing.
> > >  */
> > >
> > > But they are actually COWed. The most straightforward way to avoid it
> > > is to set FOLL_WRITE flag for shared mappings as well as for private ones.
> > 
> > Huh? How do shared mappings get CoWed?
> > 
> > In this context CoW means to create a private copy of the page for the
> > process. It only makes sense for private mappings as all pages in shared
> > mappings do not belong to the process.
> > 
> > Shared mappings will still get faults, but a bit later -- after the page
> > is written back to disk, the page gets cleaned and write-protected to
> > catch the next write access.
> > 
> > A notable exception is tmpfs/shmem. These pages are not subject to the
> > normal writeback process. But the code path is used for other
> > filesystems as well.
> > 
> > Therefore, NAK. You only create unneeded write back traffic.
> 
> Hi Kirill,
> 
> (My first reaction was exactly like yours indeed, but) on my real
> system (Cavium OcteonTX2), and on my qemu simulation I can reproduce
> the same behavior: freshly mlock()ed memory causes faults. Those faults
> happen because the page is mapped into the process read-only, while the
> underlying VMA is read-write. So the faults are resolved simply by
> granting write access to the page.

mlock() doesn't guarantee that you'll never get a *minor* fault. Writeback
or page migration will leave these pages write-protected.
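
One way to observe this from user space is to count minor faults around
the first writes to a freshly mlock()ed shared mapping. The following is
only a sketch: the function name minor_faults_after_mlock() is made up
for illustration, it uses a shared anonymous mapping as the test case,
and the count it reports will vary with kernel version and configuration.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/*
 * Hypothetical helper: map `len` bytes of shared anonymous memory,
 * mlock() it, write one byte per page, and return the number of minor
 * faults taken by those writes. Returns -1 if any system call fails.
 */
long minor_faults_after_mlock(size_t len)
{
	struct rusage before, after;
	long page = sysconf(_SC_PAGESIZE);
	long ret = -1;
	size_t off;
	char *map;

	if (page <= 0)
		return -1;

	map = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED)
		return -1;

	/* mlock() populates the range; it may fail on RLIMIT_MEMLOCK. */
	if (mlock(map, len) != 0)
		goto out;

	if (getrusage(RUSAGE_SELF, &before) != 0)
		goto out;

	/* Touch every page with a write; volatile keeps the stores. */
	for (off = 0; off < len; off += (size_t)page)
		((volatile char *)map)[off] = 1;

	if (getrusage(RUSAGE_SELF, &after) != 0)
		goto out;

	ret = after.ru_minflt - before.ru_minflt;
out:
	munmap(map, len);
	return ret;
}
```

Calling it with a length that is a multiple of the page size and printing
the result gives a rough fault count for the write pass. A nonzero value
is consistent with the pages having been mapped read-only by the populate
step, but it is not proof of the cause on its own.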

Write-protecting pages is what we rely on for proper dirty accounting:
filesystems need to know when a page gets dirty so they can allocate the
resources needed to write it back properly. Once the page has been written
back to storage, it gets write-protected again to catch the next write
access to it.

I guess we can improve the situation a bit for shmem/tmpfs: we can
populate such shared mappings with FOLL_WRITE. But this patch is not the
right way to do that.
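
To make the distinction concrete, here is a small user-space model of the
flag selection, not kernel code. The first function follows the rule in
the comment quoted above (write-fault only private writable mappings).
The second is a hypothetical variant that also write-faults writable
shared mappings backed by shmem/tmpfs. The bit values, both function
names, and the backed_by_shmem parameter are assumptions made for this
sketch.

```c
#include <stdbool.h>

/* Illustrative bit values for this sketch only, not the kernel's. */
#define VM_WRITE   0x1u
#define VM_SHARED  0x2u
#define FOLL_WRITE 0x1u

/*
 * Rule described in the quoted comment: use a write fault only for
 * writable private mappings, so COW is broken up front and shared
 * mappings are not dirtied for nothing.
 */
unsigned int gup_flags_current(unsigned int vm_flags)
{
	unsigned int gup_flags = 0;

	if ((vm_flags & (VM_WRITE | VM_SHARED)) == VM_WRITE)
		gup_flags |= FOLL_WRITE;
	return gup_flags;
}

/*
 * Hypothetical variant: additionally use a write fault for writable
 * shared mappings when the caller says they are backed by shmem/tmpfs,
 * which are not subject to normal writeback.
 */
unsigned int gup_flags_shmem_aware(unsigned int vm_flags, bool backed_by_shmem)
{
	unsigned int gup_flags = gup_flags_current(vm_flags);

	if ((vm_flags & VM_WRITE) && (vm_flags & VM_SHARED) && backed_by_shmem)
		gup_flags |= FOLL_WRITE;
	return gup_flags;
}
```

With this model, a writable shared mapping on a regular filesystem still
gets no FOLL_WRITE, so no extra writeback traffic is generated, while the
shmem/tmpfs case is populated writable.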

-- 
 Kirill A. Shutemov
