linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Hugh Dickins <hugh.dickins@tiscali.co.uk>
To: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Nick Piggin <npiggin@suse.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH] ZERO_PAGE again v5.
Date: Mon, 10 Aug 2009 12:24:22 +0100 (BST)	[thread overview]
Message-ID: <Pine.LNX.4.64.0908101152150.8339@sister.anvils> (raw)
In-Reply-To: <20090810091458.1e889cdc.kamezawa.hiroyu@jp.fujitsu.com>

On Mon, 10 Aug 2009, KAMEZAWA Hiroyuki wrote:
> On Sun, 9 Aug 2009 18:28:48 +0100 (BST)
> Hugh Dickins <hugh.dickins@tiscali.co.uk> wrote:
> > 
> > Actually, I don't understand ignore_zero at all: it's used solely by
> > the mlock case, yet its effect seems to be precisely not to fault in
> > pages if they're missing - I wonder if you've got in a muddle between
> > the two very different awkward cases, mlocking and coredumps of
> > sparsely populated areas.
> > 
> Ah, then, you say mlock() should allocate 'real' page if zero page
> is mapped. Right ?

No.

(That would be a possibility if it gets us out of some difficulty
with the newer mlocking code, but it's not something we ever did or
wanted to do in the past.)

What I was saying in that paragraph was that (it appears to me that)
in your patch only __mlock_vma_pages_range sets GUP_FLAGS_IGNORE_ZERO,
that __get_user_pages sets ignore_zero according to that flag, some
conditions may clear it, but then it goes on to say
			while (!(page = follow_page(vma, start, foll_flags))) {
				/*
				 * When we ignore zero pages, no more ops to do.
				 */
				if (ignore_zero)
					break;
which means that when ignore_zero is set and follow_page returns NULL,
we emerge from the loop with NULL page, don't we?  Whereas when mlocking,
we want to fault in any pages which were not already there.

Or am I just reading this all wrong?

> > And I don't at all like the way you flush_dcache_page(page) on a
> > page which may now be NULL: okay, you're only encouraging x86 to
> > say Yes to the Kconfig option, but that's a landmine for the first
> > arch with a real flush_dcache_page(page) which says Yes to it.
> > 
> do_wp_page()
> 	-> cow_user_page()
> 		-> (src is NULL)
> Ah....ok, it's bug. I added ....Sorry, I didn't see this in older version
> and missed this.

That's an entirely different issue, and I don't see that it's a bug,
just the inefficiency I mentioned elsewhere that we'd be better off
doing a memset than trying to memcpy the ZERO_PAGE.

What I was saying in that paragraph is that when you break from the
loop in __get_user_pages with ignore_zero and NULL page, you reach
			if (IS_ERR(page))
				return i ? i : PTR_ERR(page);
			if (pages) {
				pages[i] = page;

				flush_anon_page(vma, page, start);
				flush_dcache_page(page);
			}
which inserts a NULL into pages[i] (which may be okay if the other
end is prepared for it, as I think __mlock_vma_pages_range is),
then passes a NULL page to flush_anon_page and flush_dcache_page.

I looked up one of the non-empty implementations of flush_dcache_page
and saw it testing a bit in page->flags, assuming (very reasonably!)
that the page pointer is not NULL.  Oops.

> > Because I hate reviewing things and trying to direct other people
> > by remote control: what usually happens is I send them off in some
> > direction which, once I try to do it myself, turns out to have been
> > the wrong direction.  I do need to try to do this myself, instead of
> > standing on the sidelines criticizing.
> > 
> > In fairness, I think Linus himself was a little confused when he
> > separated off use_zero_page(): I think we've all got confused around
> > there (as we noticed a month or so ago when discussing its hugetlb
> > equivalent), and I need to think it through again at last.
> > 
> > I'll get on to it now.
> > 
> 
> Thank you for comments. I'll go to a trip until Aug/17, programming-camp,

Sorry for not getting to all this sooner, yes I'd seen your warning in
another mail that you'd be away, but I just didn't get here fast enough.

> I'll be able to consider this patch and the whole things aroung paging in calm
> enviroment. I'll try to restart from scratch.

What I'm saying above is that I'd much prefer to try doing the patch
myself and have you review that, than us to keep on going back and
forth with different versions like this.

I am not confident that you've grasped all the issues, and I am sure
that there's a least one issue which I have not grasped: maybe it'll
end up irrelevant, but I do need to understand Linus's Fix ZERO_PAGE
breakage with vmware, 672ca28e300c17bf8d792a2a7a8631193e580c74

But I'll discuss that separately, probably offlist; or, if I'm lucky,
just composing the mail will bring me to answer my own question!

Hugh

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

      reply	other threads:[~2009-08-10 11:24 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2009-08-05 10:16 KAMEZAWA Hiroyuki
2009-08-05 10:17 ` [PATCH 1/2] ZERO_PAGE config KAMEZAWA Hiroyuki
2009-08-05 10:19 ` [PATCH 2/2] ZERO_PAGE based on pte_special KAMEZAWA Hiroyuki
2009-08-09 17:28 ` [PATCH] ZERO_PAGE again v5 Hugh Dickins
2009-08-10  0:14   ` KAMEZAWA Hiroyuki
2009-08-10 11:24     ` Hugh Dickins [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Pine.LNX.4.64.0908101152150.8339@sister.anvils \
    --to=hugh.dickins@tiscali.co.uk \
    --cc=akpm@linux-foundation.org \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=npiggin@suse.de \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox