linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Hugh Dickins <hugh@veritas.com>
To: Carsten Otte <carsteno@de.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Nick Piggin <nickpiggin@yahoo.com.au>,
	Linux Memory Management <linux-mm@kvack.org>,
	Andrew Morton <akpm@osdl.org>, Ralf Baechle <ralf@linux-mips.org>
Subject: Re: [patch] mm: mremap correct rmap accounting
Date: Tue, 30 Jan 2007 22:04:57 +0000 (GMT)	[thread overview]
Message-ID: <Pine.LNX.4.64.0701302157250.22828@blonde.wat.veritas.com> (raw)
In-Reply-To: <45BF68A4.5070002@de.ibm.com>

On Tue, 30 Jan 2007, Carsten Otte wrote:
> Hugh Dickins wrote:
> > Could make it loop over them all, but a quicker patch would be as
> > below.  I've no idea if the intersection of filemap_xip users and
> > MIPS users is the empty set or more interesting.  But I'd prefer
> > you don't just slam in the patch, better have an opinion from
> > Carsten and/or Nick first.
> Took me some time to catch up on this thread, sorry for that.

No problem.  And sorry for being so slow to reply to you,
there seems to have been an assault on my home connectivity.

I think it's now clear that XIP won't be impacted at all by my
ZERO_PAGE(0) change, and that's the patch Linus should put in
for 2.6.20 (given how much he disliked Nick's patch to maintain
the different zeropage counts across mremap move).  Ah, good,
that's now gone into his tree since last I looked.

> Yea, I think xip can be implemented correctly that it works on
> mips when we loop over all zero pages on unmap. Let me try to
> come up with a patch for that.

Sorry, no, I didn't mean for you to try that, and there appears
to be no need for it at all, for the foreseeable future anyway:
please don't waste your time on that.

But there is a change which I now think you do need to make,
for 2.6.21 - let it not distract attention from the pagecount
correctness issue we've been discussing so far.  Something I
should have noticed when I first looked at your clever use of
the ZERO_PAGE, but have only noticed now.  Doesn't it clash
with the clever use of the ZERO_PAGE when reading /dev/zero
(see read_zero_pagealigned in drivers/char/mem.c)?

Consider two PROT_READ|PROT_WRITE,MAP_PRIVATE mappings of a
four-page hole in a XIP file.  One of them just readfaults the
four pages in (and is given ZERO_PAGE for each), the other has
four pages read from /dev/zero into it (which also maps the
ZERO_PAGE into its four ptes).

Then imagine that non-zero data is written to the first page of
that hole, by a write syscall, or through a PROT_WRITE,MAP_SHARED
mapping.  __xip_unmap will replace the first ZERO_PAGE in each of
the MAP_PRIVATE mappings by the new non-zero data page.  Which is
correct for the first mapping which just did readfaults, but wrong
for the second mapping which has overwritten by reading /dev/zero
- those pages ought to remain zeroed, never seeing the later data.

I've never much liked the read_zero_pagealigned cleverness, seems
"too clever by half" as we say, and this overlooked clash shows why.
But I'm also scared to remove any long-established optimization(?),
for fear of impacting some unknown workload.

So, if you're to retain your share-one-page-for-holes cleverness,
I think you need to switch over to allocating a page of your own
for it, instead of using the ZERO_PAGE.

Or have I got it wrong?  A simple test should show.

Hugh

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2007-01-30 22:04 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-01-23 14:19 Nick Piggin
2007-01-23 20:55 ` Hugh Dickins
2007-01-23 23:49   ` Nick Piggin
2007-01-29  3:31   ` Nick Piggin
2007-01-29  6:40     ` Andrew Morton
2007-01-29  6:57       ` Nick Piggin
2007-01-29 19:08     ` Hugh Dickins
2007-01-29 19:27       ` Linus Torvalds
2007-01-29 20:03         ` Andrew Morton
2007-01-29 20:18           ` Linus Torvalds
2007-01-29 21:27             ` Ralf Baechle
2007-01-29 20:10         ` Hugh Dickins
2007-01-29 20:22           ` Linus Torvalds
2007-01-29 20:38             ` Hugh Dickins
2007-01-29 21:24               ` Hugh Dickins
2007-01-30  1:00                 ` Nick Piggin
2007-01-30 14:24                 ` Carsten Otte
2007-01-30 16:41                   ` Ralf Baechle
2007-01-30 17:35                     ` Carsten Otte
2007-01-30 15:47                 ` Carsten Otte
2007-01-30 22:04                   ` Hugh Dickins [this message]
2007-01-31 13:51                     ` Carsten Otte
2007-01-31 13:59                     ` Carsten Otte
2007-01-31 16:31                       ` Hugh Dickins
2007-02-01 16:21                         ` Carsten Otte

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Pine.LNX.4.64.0701302157250.22828@blonde.wat.veritas.com \
    --to=hugh@veritas.com \
    --cc=akpm@osdl.org \
    --cc=carsteno@de.ibm.com \
    --cc=linux-mm@kvack.org \
    --cc=nickpiggin@yahoo.com.au \
    --cc=ralf@linux-mips.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox