* Re: [RFC][PATCH 0/4] ZERO PAGE again v2
@ 2009-07-07 15:50 KAMEZAWA Hiroyuki
0 siblings, 0 replies; 26+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-07-07 15:50 UTC (permalink / raw)
To: Nick Piggin
Cc: KAMEZAWA Hiroyuki, linux-mm, hugh.dickins, avi, akpm, torvalds
Nick Piggin wrote:
> On Tue, Jul 07, 2009 at 06:06:29PM +0900, KAMEZAWA Hiroyuki wrote:
>> 3. Considering save&restore application's data table, ZERO_PAGE is
>> useful.
>> maybe.
>
> I just wouldn't like to re-add significant complexity back to
> the vm without good and concrete examples. OK I agree that
> just saying "rewrite your code" is not so good, but are there
> real significant problems? Is it inside just a particuar linear
> algebra library or something that might be able to be updated?
>
From my limited experience in user support, I know of 2 cases.
1. A middleware maps /dev/zero with a PRIVATE mapping and uses copy-on-write
intentionally. I think this is because their Solaris(?) apps required
/dev/zero to use ZERO_PAGE or anon.
I don't know much about Solaris, but
"mapping /dev/zero eats up tons of memory" sounds strange to me.
2. An HPC middleware seems to make use of ZERO_PAGE to do checkpoint/restart
of its jobs. (Maybe they can rewrite programs as you say.)
Maybe there are others. (I'm not worried about famous OSS applications/libraries.
There will be enough technical support for such apps.)
To be honest, I'd like to support /dev/zero, at least.
"mmap(/dev/zero, PROT_READ) causes OOM" sounds like crazy behavior for an OS.
Is it OK to write a fault handler for /dev/zero and use the zero page even if
this request is rejected?
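The access pattern in question can be sketched in user space as follows. This is only an illustration under assumptions: count_zero_pages_read() is a hypothetical helper name, and the program shows what the application does, not how any kernel version backs the pages.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map `len` bytes of /dev/zero with a private, read-only mapping and read
 * one byte per page. Returns the number of pages read, or -1 on failure.
 * count_zero_pages_read() is a hypothetical name used for illustration.
 */
long count_zero_pages_read(size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || len == 0)
        return -1;

    int fd = open("/dev/zero", O_RDONLY);
    if (fd < 0)
        return -1;

    const volatile unsigned char *p =
        mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                        /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return -1;

    long touched = 0;
    for (size_t off = 0; off < len; off += (size_t)page) {
        if (p[off] != 0) {            /* every read fault must yield zeros */
            munmap((void *)p, len);
            return -1;
        }
        touched++;
    }
    /* One read per page, including a trailing partial page. */
    assert(touched == (long)((len + (size_t)page - 1) / (size_t)page));

    munmap((void *)p, len);
    return touched;
}
```

Whether each of those read faults consumes a fresh page of RAM or shares one zero page is exactly the behavior under discussion; the program itself cannot tell the difference except through memory pressure.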
One option was to announce "ZERO PAGE is not available any more, please
check and rewrite your applications" to all my customers. But I'm
pessimistic about that approach. (So, I'm trying this patch.)
Users will not understand what the change is, and I'll see some OOM
reports caused by this change.
Thanks,
-Kame
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [RFC][PATCH 0/4] ZERO PAGE again v2
2009-07-13 6:46 ` Nick Piggin
@ 2009-07-13 7:24 ` Nick Piggin
0 siblings, 0 replies; 26+ messages in thread
From: Nick Piggin @ 2009-07-13 7:24 UTC (permalink / raw)
To: Hugh Dickins
Cc: Andrea Arcangeli, KAMEZAWA Hiroyuki, linux-mm, avi, akpm, torvalds
On Mon, Jul 13, 2009 at 08:46:41AM +0200, Nick Piggin wrote:
> On Fri, Jul 10, 2009 at 12:18:07PM +0100, Hugh Dickins wrote:
> > On Wed, 8 Jul 2009, Andrea Arcangeli wrote:
> > > On Tue, Jul 07, 2009 at 06:06:29PM +0900, KAMEZAWA Hiroyuki wrote:
> > > harmful as there's a double page fault generated instead of a single
> > > one, kksmd has a cost but zeropage isn't free either in term of page
> > > faults too)
> >
> > Much as I like KSM, I have to agree with Avi, that if people are
> > wanting the ZERO_PAGE back in compute-intensive loads, then relying
>
> I can't imagine ZERO_PAGE would be too widely used in compute-intensive
> loads. At least, not serious stuff. Nobody wants to spend 4K of cache
> and one TLB entry for one or two non-zero floating point numbers in a
> big sparse matrix. Not to mention the cache and memory overhead of just
> scanning through lots of zeros.
Heh, oops: before anyone thinks it will be fun to make some
personal insults, there won't be much memory overhead from
zero page of course! Cache and *TLB* overhead is going to be
involved.
* Re: [RFC][PATCH 0/4] ZERO PAGE again v2
2009-07-10 11:18 ` Hugh Dickins
2009-07-10 13:42 ` Andrea Arcangeli
@ 2009-07-13 6:46 ` Nick Piggin
2009-07-13 7:24 ` Nick Piggin
1 sibling, 1 reply; 26+ messages in thread
From: Nick Piggin @ 2009-07-13 6:46 UTC (permalink / raw)
To: Hugh Dickins
Cc: Andrea Arcangeli, KAMEZAWA Hiroyuki, linux-mm, avi, akpm, torvalds
On Fri, Jul 10, 2009 at 12:18:07PM +0100, Hugh Dickins wrote:
> On Wed, 8 Jul 2009, Andrea Arcangeli wrote:
> > On Tue, Jul 07, 2009 at 06:06:29PM +0900, KAMEZAWA Hiroyuki wrote:
> > harmful as there's a double page fault generated instead of a single
> > one, kksmd has a cost but zeropage isn't free either in term of page
> > faults too)
>
> Much as I like KSM, I have to agree with Avi, that if people are
> wanting the ZERO_PAGE back in compute-intensive loads, then relying
I can't imagine ZERO_PAGE would be too widely used in compute-intensive
loads. At least, not serious stuff. Nobody wants to spend 4K of cache
and one TLB entry for one or two non-zero floating point numbers in a
big sparse matrix. Not to mention the cache and memory overhead of just
scanning through lots of zeros.
* Re: [RFC][PATCH 0/4] ZERO PAGE again v2
2009-07-10 13:42 ` Andrea Arcangeli
2009-07-10 14:12 ` KAMEZAWA Hiroyuki
@ 2009-07-10 17:09 ` Hugh Dickins
1 sibling, 0 replies; 26+ messages in thread
From: Hugh Dickins @ 2009-07-10 17:09 UTC (permalink / raw)
To: Andrea Arcangeli
Cc: KAMEZAWA Hiroyuki, Nick Piggin, linux-mm, avi, akpm, torvalds
On Fri, 10 Jul 2009, Andrea Arcangeli wrote:
> On Fri, Jul 10, 2009 at 12:18:07PM +0100, Hugh Dickins wrote:
> > as an "automatic" KSM page, I don't know; or we'll need to teach KSM
> > not to waste its time remerging instances of the ZERO_PAGE to a
> > zeroed KSM page. We'll worry about that once both sets in mmotm.
>
> There is no risk of collision, zero page is not anonymous so...
You're right, yes, no change required.
>
> I think it's a mistake for them not to try ksm first regardless of the
> new zeropage patches being floating around, because my whole point is
> that those kind of apps will save more than just zero page with
> ksm. Sure not guaranteed... but possible and worth checking.
Okay, you're right to ask people to give KSM a try: there may be some
apps wanting ZERO_PAGE back, which would really benefit from having
other pages also merged for them, despite the cost.
(And the cost may not be so bad, given that you can stop KSM scanning
for merges, while still keeping all the merges already made.)
But I'm not going to hold my breath on that, and I don't think Kame
should hold back his patch for that. Particularly since it would
need the extensions to apply KSM to other processes, and we're not
giving those any thought this time around.
(Beyond musing that if we're going to apply madvise MADV_MERGEABLE
to other processes, wouldn't we do better to extend the idea, to be
able to apply madvise and mlock generally to other processes?).
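For reference, opting a range into KSM from inside the owning process looks roughly like the sketch below. It assumes a Linux kernel and headers that provide MADV_MERGEABLE; try_mark_mergeable() is a hypothetical helper name, and the fill pattern is arbitrary.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* for MAP_ANONYMOUS and MADV_MERGEABLE on glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

/*
 * Allocate `len` bytes of anonymous memory, fill every page with the same
 * content and ask KSM to consider the range for merging. Returns 0 if the
 * kernel accepted the hint, otherwise a positive errno value (for example
 * EINVAL when the kernel was built without KSM).
 * try_mark_mergeable() is a hypothetical name used for illustration.
 */
int try_mark_mergeable(size_t len)
{
    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return errno;

    memset(buf, 0x5a, len);           /* identical content in every page */

    int rc = 0;
#ifdef MADV_MERGEABLE
    if (madvise(buf, len, MADV_MERGEABLE) != 0)
        rc = errno;                   /* save before munmap can change it */
#else
    rc = ENOSYS;                      /* headers predate KSM */
#endif
    assert(rc >= 0);

    munmap(buf, len);
    return rc;
}
```

Note that the hint can only be given by the process that owns the mapping, which is why applying it to unmodified applications would need the extensions mentioned above. Accepting the hint also does not mean pages get merged: that only happens while the KSM daemon is running.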
Hugh
* Re: [RFC][PATCH 0/4] ZERO PAGE again v2
2009-07-10 15:16 ` Andrea Arcangeli
@ 2009-07-10 15:32 ` KAMEZAWA Hiroyuki
0 siblings, 0 replies; 26+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-07-10 15:32 UTC (permalink / raw)
To: Andrea Arcangeli
Cc: KAMEZAWA Hiroyuki, Hugh Dickins, Nick Piggin, linux-mm, avi,
akpm, torvalds
Andrea Arcangeli wrote:
> On Fri, Jul 10, 2009 at 11:12:38PM +0900, KAMEZAWA Hiroyuki wrote:
>> BTW, ksm has no refcnt pingpong problem ?
>
> Well sure it has, the refcount has to be increased when pages are
> shared, just like for regular fork() on anonymous memory, but the
> point is that you pay for it only when you're saving ram, so the
> probability that is just pure overhead is lower than for the zero
> page... it always depend on the app. I simply suggest in trying
> it... perhaps zero page is way to go for your users.. they should
> tell, not us...
>
My point is that we don't have to say "Unless you evolve yourself,
you'll die" to users. They will evolve by themselves if they are sane.
As I said, I like ksm. But demanding that users rewrite private apps is
a different problem. I'd like to say "You can live as you are, but here,
there are better options" rather than "die!".
Adding documentation/advertisement and showing the pros and cons of ksm or
other correct approaches is what we can do to increase the number of sane users.
Thanks,
-Kame
* Re: [RFC][PATCH 0/4] ZERO PAGE again v2
2009-07-10 14:12 ` KAMEZAWA Hiroyuki
@ 2009-07-10 15:16 ` Andrea Arcangeli
2009-07-10 15:32 ` KAMEZAWA Hiroyuki
0 siblings, 1 reply; 26+ messages in thread
From: Andrea Arcangeli @ 2009-07-10 15:16 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: Hugh Dickins, Nick Piggin, linux-mm, avi, akpm, torvalds
On Fri, Jul 10, 2009 at 11:12:38PM +0900, KAMEZAWA Hiroyuki wrote:
> BTW, ksm has no refcnt pingpong problem ?
Well sure it has, the refcount has to be increased when pages are
shared, just like for regular fork() on anonymous memory, but the
point is that you pay for it only when you're saving ram, so the
probability that it is just pure overhead is lower than for the zero
page... it always depends on the app. I simply suggest trying
it... perhaps zero page is the way to go for your users.. they should
tell, not us...
* Re: [RFC][PATCH 0/4] ZERO PAGE again v2
2009-07-10 13:42 ` Andrea Arcangeli
@ 2009-07-10 14:12 ` KAMEZAWA Hiroyuki
2009-07-10 15:16 ` Andrea Arcangeli
2009-07-10 17:09 ` Hugh Dickins
1 sibling, 1 reply; 26+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-07-10 14:12 UTC (permalink / raw)
To: Andrea Arcangeli
Cc: Hugh Dickins, KAMEZAWA Hiroyuki, Nick Piggin, linux-mm, avi,
akpm, torvalds
Andrea Arcangeli wrote:
> On Fri, Jul 10, 2009 at 12:18:07PM +0100, Hugh Dickins wrote:
>> as an "automatic" KSM page, I don't know; or we'll need to teach KSM
>> not to waste its time remerging instances of the ZERO_PAGE to a
>> zeroed KSM page. We'll worry about that once both sets in mmotm.
>
> There is no risk of collision, zero page is not anonymous so...
>
> I think it's a mistake for them not to try ksm first regardless of the
> new zeropage patches being floating around, because my whole point is
> that those kind of apps will save more than just zero page with
> ksm. Sure not guaranteed... but possible and worth checking.
>
How many merciless teachers who know what is correct there are...
BTW, does ksm have no refcnt pingpong problem?
Thanks,
-Kame
* Re: [RFC][PATCH 0/4] ZERO PAGE again v2
2009-07-10 11:18 ` Hugh Dickins
@ 2009-07-10 13:42 ` Andrea Arcangeli
2009-07-10 14:12 ` KAMEZAWA Hiroyuki
2009-07-10 17:09 ` Hugh Dickins
2009-07-13 6:46 ` Nick Piggin
1 sibling, 2 replies; 26+ messages in thread
From: Andrea Arcangeli @ 2009-07-10 13:42 UTC (permalink / raw)
To: Hugh Dickins
Cc: KAMEZAWA Hiroyuki, Nick Piggin, linux-mm, avi, akpm, torvalds
On Fri, Jul 10, 2009 at 12:18:07PM +0100, Hugh Dickins wrote:
> as an "automatic" KSM page, I don't know; or we'll need to teach KSM
> not to waste its time remerging instances of the ZERO_PAGE to a
> zeroed KSM page. We'll worry about that once both sets in mmotm.
There is no risk of collision, zero page is not anonymous so...
I think it's a mistake for them not to try ksm first regardless of the
new zeropage patches floating around, because my whole point is
that those kinds of apps will save more than just zero page with
ksm. Sure not guaranteed... but possible and worth checking.
* Re: [RFC][PATCH 0/4] ZERO PAGE again v2
2009-07-08 17:32 ` Andrea Arcangeli
2009-07-09 1:12 ` KAMEZAWA Hiroyuki
@ 2009-07-10 11:18 ` Hugh Dickins
2009-07-10 13:42 ` Andrea Arcangeli
2009-07-13 6:46 ` Nick Piggin
1 sibling, 2 replies; 26+ messages in thread
From: Hugh Dickins @ 2009-07-10 11:18 UTC (permalink / raw)
To: Andrea Arcangeli
Cc: KAMEZAWA Hiroyuki, Nick Piggin, linux-mm, avi, akpm, torvalds
On Wed, 8 Jul 2009, Andrea Arcangeli wrote:
> On Tue, Jul 07, 2009 at 06:06:29PM +0900, KAMEZAWA Hiroyuki wrote:
> > Then, most of users will not notice that ZERO_PAGE is not available until
> > he(she) find OOM-Killer message. This is very terrible situation for me.
> > (and most of system admins.)
>
> Can you try to teach them to use KSM and see if they gain a while lot
> more from it (surely they also do some memset(dst, 0) sometime not
> only memcpy(zerosrc, dst)). Not to tell when they init to non zero
> values their arrays/matrix which is a bit harder to optimize for with
> zero page...
>
> My only dislike is that zero page requires a flood of "if ()" new
> branches in fast paths that benefits nothing but badly written app,
> and that's the only reason I liked its removal.
>
> For goodly (and badly) written scientific app there KSM that will do
> more than zeropage while dealing with matrix algorithms and such. If
> they try KSM and they don't gain a lot more free memory than with the
> zero page hack, then I agree in reintroducing it, but I guess when
> they try KSM they will ask you to patch kernel with it, instead of
> patch kernel with zeropage. If they don't gain anything more with KSM
> than with zeropage, and the kksmd overhead is too high, then it would
> make sense to use zeropage for them I agree even if it bites in the
> fast path of all apps that can't benefit from it. (not to tell the
> fact that reading zero and writing non zero back for normal apps is
> harmful as there's a double page fault generated instead of a single
> one, kksmd has a cost but zeropage isn't free either in term of page
> faults too)
Much as I like KSM, I have to agree with Avi, that if people are
wanting the ZERO_PAGE back in compute-intensive loads, then relying
on ksmd to put Humpty Dumpty together again is much too expensive a
way to go about it: ZERO_PAGE saves him from falling off the wall
in the first place, and that's much the better way to deal with it.
It might turn out in the end to be convenient to treat the ZERO_PAGE
as an "automatic" KSM page, I don't know; or we'll need to teach KSM
not to waste its time remerging instances of the ZERO_PAGE to a
zeroed KSM page. We'll worry about that once both sets are in mmotm.
I didn't care for Kamezawa-san's original patchsets, seemed messy
and branchy, but it looks to be heading the right way now using
vm_normal_page (pity about arches without pte_special, oh well).
Hugh
* Re: [RFC][PATCH 0/4] ZERO PAGE again v2
2009-07-10 3:38 ` Linus Torvalds
@ 2009-07-10 3:51 ` Nick Piggin
0 siblings, 0 replies; 26+ messages in thread
From: Nick Piggin @ 2009-07-10 3:51 UTC (permalink / raw)
To: Linus Torvalds; +Cc: KAMEZAWA Hiroyuki, linux-mm, hugh.dickins, avi, akpm
On Thu, Jul 09, 2009 at 08:38:41PM -0700, Linus Torvalds wrote:
>
>
> On Fri, 10 Jul 2009, Nick Piggin wrote:
> >
> > So if you were going to re-add the zero page when a single regression
> > is reported after a year or two, then it was wrong of you to remove
> > the zero page to begin with.
>
> Oh, I argued against it. And I told people we can always revert it.
>
> But even better than reverting it is to just fix it cleanly in the new
> world order, wouldn't you say?
If it is put back in without being refcounted, that should be
fine. That's what I first proposed for it (although you didn't
think my actual implementation was clean and preferred to remove
it completely).
I would like to see support for architectures which don't define
a pte_special bit too, however.
> > So to answer your question, I guess I would like to know a bit
> > more about the regression and what the app is doing.
>
> Ok, go ahead and try to figure it out. But please don't cc me on it any
> more. I'm not interested in your hang-ups with ZERO_PAGE.
>
> Because I just don't care. I think ZERO_PAGE was great to begin with, I
> put it to use muyself historically at Transmeta, and I didn't like your
> crusade against it.
>
> People (including me) have told you why it's useful. Whatever. If you
> still want more information, go bother somebody else.
You're apparently not reading what I write when I do cc you, so
I don't think there would be much difference.
* Re: [RFC][PATCH 0/4] ZERO PAGE again v2
2009-07-10 2:09 ` Nick Piggin
@ 2009-07-10 3:38 ` Linus Torvalds
2009-07-10 3:51 ` Nick Piggin
0 siblings, 1 reply; 26+ messages in thread
From: Linus Torvalds @ 2009-07-10 3:38 UTC (permalink / raw)
To: Nick Piggin; +Cc: KAMEZAWA Hiroyuki, linux-mm, hugh.dickins, avi, akpm
On Fri, 10 Jul 2009, Nick Piggin wrote:
>
> So if you were going to re-add the zero page when a single regression
> is reported after a year or two, then it was wrong of you to remove
> the zero page to begin with.
Oh, I argued against it. And I told people we can always revert it.
But even better than reverting it is to just fix it cleanly in the new
world order, wouldn't you say?
> So to answer your question, I guess I would like to know a bit
> more about the regression and what the app is doing.
Ok, go ahead and try to figure it out. But please don't cc me on it any
more. I'm not interested in your hang-ups with ZERO_PAGE.
Because I just don't care. I think ZERO_PAGE was great to begin with, I
put it to use myself historically at Transmeta, and I didn't like your
crusade against it.
People (including me) have told you why it's useful. Whatever. If you
still want more information, go bother somebody else.
Linus
* Re: [RFC][PATCH 0/4] ZERO PAGE again v2
2009-07-09 17:54 ` Linus Torvalds
@ 2009-07-10 2:09 ` Nick Piggin
2009-07-10 3:38 ` Linus Torvalds
0 siblings, 1 reply; 26+ messages in thread
From: Nick Piggin @ 2009-07-10 2:09 UTC (permalink / raw)
To: Linus Torvalds; +Cc: KAMEZAWA Hiroyuki, linux-mm, hugh.dickins, avi, akpm
On Thu, Jul 09, 2009 at 10:54:02AM -0700, Linus Torvalds wrote:
>
>
> On Thu, 9 Jul 2009, Nick Piggin wrote:
> >
> > Having a ZERO_PAGE I'm not against, so I don't know why you claim
> > I am. Al I'm saying is that now we don't have one, we should have
> > some good reasons to introduce it again. Unreasonable?
>
> Umm. I had good reasons to introduce it in the _first_ place.
>
> And now you have reports of people who depend on the behaviour, and point
> to the new behaviour as a *regression*.
>
> What the _hell_ more do you want?
Well there is obviously no way to test a representative sample of
workloads, and we pretty much knew that some people are going to
prefer to have a ZERO_PAGE with their app.
So if you were going to re-add the zero page when a single regression
is reported after a year or two, then it was wrong of you to remove
the zero page to begin with.
So to answer your question, I guess I would like to know a bit
more about the regression and what the app is doing.
* Re: [RFC][PATCH 0/4] ZERO PAGE again v2
2009-07-09 7:47 ` Nick Piggin
@ 2009-07-09 17:54 ` Linus Torvalds
2009-07-10 2:09 ` Nick Piggin
0 siblings, 1 reply; 26+ messages in thread
From: Linus Torvalds @ 2009-07-09 17:54 UTC (permalink / raw)
To: Nick Piggin; +Cc: KAMEZAWA Hiroyuki, linux-mm, hugh.dickins, avi, akpm
On Thu, 9 Jul 2009, Nick Piggin wrote:
>
> Having a ZERO_PAGE I'm not against, so I don't know why you claim
> I am. Al I'm saying is that now we don't have one, we should have
> some good reasons to introduce it again. Unreasonable?
Umm. I had good reasons to introduce it in the _first_ place.
And now you have reports of people who depend on the behaviour, and point
to the new behaviour as a *regression*.
What the _hell_ more do you want?
Linus
* Re: [RFC][PATCH 0/4] ZERO PAGE again v2
2009-07-08 16:07 ` Linus Torvalds
@ 2009-07-09 7:47 ` Nick Piggin
2009-07-09 17:54 ` Linus Torvalds
0 siblings, 1 reply; 26+ messages in thread
From: Nick Piggin @ 2009-07-09 7:47 UTC (permalink / raw)
To: Linus Torvalds; +Cc: KAMEZAWA Hiroyuki, linux-mm, hugh.dickins, avi, akpm
On Wed, Jul 08, 2009 at 09:07:08AM -0700, Linus Torvalds wrote:
>
>
> On Wed, 8 Jul 2009, Nick Piggin wrote:
> >
> > I'm talking about the cases where you would want to use ZERO_PAGE for
> > computing with anonymous memory (not for zeroing IO). In that case,
> > the TLB would probably be the primary one.
>
> Umm. Are you even listening to yourself?
>
> OF COURSE the TLB would be the primary issue, since the zero page has made
> cache effects go away.
Yes, that's what I said.
> BUT THAT IS A GOOD THING.
>
> Instead of making it sound like "that's a bad thing, because now TLB
> dominates", just say what's really going on: "that's a good thing, because
> you made the cache access patterns wonderful".
>
> See? You claim TLB is a problem, but it's really that you made all _other_
> problems go away.
No I don't. Re-read what I wrote. I said that an app that scans huge
sparse matrices *might* be better off with a different data format
rather than relying on ZERO_PAGE with a naive format. Of course if it
does rely on ZERO_PAGE for this, then having ZERO_PAGE is going to be
better than allocating lots of anonymous memory for it, I didn't claim
otherwise.
> Now, it's true that you can avoid the TLB costs by moving the costs into a
> "software TLB" (aka "indirection table"), and make the TLB footprint go
> away by turning it into something else (indirection through a pointer).
>
> Sometimes that speeds things up - because you may be able to actually
> avoid doing other things by noticing huge gaps etc - but sometimes it
> slows you down too - because indirection isn't free, and maybe there are
> common cases where there isn't so many sparse accesses.
Sometimes there are much more efficient data formats for sparse
matrices too, which can also avoid the quantization effects
(and cache usage) of page size.
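As one illustration of such a format, here is a minimal sketch of compressed sparse row (CSR) storage with a matrix-vector product. CSR is an assumption chosen for the example; the thread does not name a specific format, and struct csr_matrix and csr_matvec() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Compressed sparse row (CSR) matrix: only non-zero entries are stored. */
struct csr_matrix {
    size_t rows;
    const size_t *row_start;  /* rows + 1 offsets into col[] and val[] */
    const size_t *col;        /* column index of each stored value */
    const double *val;        /* the stored (non-zero) values */
};

/*
 * y = A * x. Only the stored non-zeros are visited, so the cost scales
 * with the number of non-zeros, not with the number of pages the dense
 * matrix would have spanned.
 */
void csr_matvec(const struct csr_matrix *a, const double *x, double *y)
{
    for (size_t r = 0; r < a->rows; r++) {
        assert(a->row_start[r] <= a->row_start[r + 1]);
        double sum = 0.0;
        for (size_t k = a->row_start[r]; k < a->row_start[r + 1]; k++)
            sum += a->val[k] * x[a->col[k]];
        y[r] = sum;
    }
}
```

An empty row costs two index loads here, whereas a dense layout backed by zero pages still costs a TLB entry and a scan per page of zeros.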
> > I don't fight it. I had proposals to get rid of cache pingpong too,
> > but you rejected that ;)
>
> Yeah, and they were ugly as hell. I had a suggestion to just continue to
> use PG_reserved (which was _way_ simpler than your version) before the
> counting, but you and Hugh were on a religious agenda against the whole
> PG_reserved bit.
No I had no problem with it. I didn't see the big difference between
explicitly testing for ZERO_PAGE or using a new page flag bit (which
aren't free -- PG_reserved can basically be reclaimed now if somebody
cares to go through arch init code).
Now if there was more than one type of page to test for, then yes
a page flag would be better because it would reduce branches. I
just didn't see why you were religiously against testing ZERO_PAGE
but thought PG_zero (or PG_reserved or whatever) was so much better.
> So I don't understand why you claim that you fight it, when you CLEARLY
> do. The patches that KAMEZAWA-san posted were already simpler than your
> complicated models were - I just think they can be simpler still.
Having a ZERO_PAGE I'm not against, so I don't know why you claim
I am. All I'm saying is that now we don't have one, we should have
some good reasons to introduce it again. Unreasonable?
* Re: [RFC][PATCH 0/4] ZERO PAGE again v2
2009-07-08 17:32 ` Andrea Arcangeli
@ 2009-07-09 1:12 ` KAMEZAWA Hiroyuki
2009-07-10 11:18 ` Hugh Dickins
1 sibling, 0 replies; 26+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-07-09 1:12 UTC (permalink / raw)
To: Andrea Arcangeli; +Cc: Nick Piggin, linux-mm, hugh.dickins, avi, akpm, torvalds
On Wed, 8 Jul 2009 19:32:06 +0200
Andrea Arcangeli <aarcange@redhat.com> wrote:
> On Tue, Jul 07, 2009 at 06:06:29PM +0900, KAMEZAWA Hiroyuki wrote:
> > Then, most of users will not notice that ZERO_PAGE is not available until
> > he(she) find OOM-Killer message. This is very terrible situation for me.
> > (and most of system admins.)
>
> Can you try to teach them to use KSM and see if they gain a while lot
> more from it (surely they also do some memset(dst, 0) sometime not
> only memcpy(zerosrc, dst)). Not to tell when they init to non zero
> values their arrays/matrix which is a bit harder to optimize for with
> zero page...
>
Hmm, scan & take diff & merge user pages in the kernel?
IIUC, it can only help if the zero pages' lifetimes are very long.
> My only dislike is that zero page requires a flood of "if ()" new
> branches in fast paths that benefits nothing but badly written app,
> and that's the only reason I liked its removal.
>
I'll take Linus's suggestion "use pte_special() in vm_normal_page()".
Then, the number of "if()"s will not increase as much as the expected flood.
In usual apps which don't use any zero-page, the following paths will be checked.
- "is this WRITE fault ?" in do_anonymous_page().
- vm_normal_page() never finds pte_special(), so no more "if"s.
- get_user_pages() etc. will have 2-3 more if()s depending on the passed flags.
Anyway, I'll reduce overheads as much as possible. Please see v3.
The pte_special() checks (which are already used) reduce the "if()"s to some extent.
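The shape of those checks can be modeled in user space as below. This is a toy sketch, not kernel code and not the patch: struct toy_pte, toy_anon_fault() and toy_normal_frame() are hypothetical stand-ins, and frame numbers replace real pages.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these are NOT kernel structures. */
struct toy_pte {
    bool present;
    bool special;  /* stands in for pte_special() */
    int frame;     /* toy frame number; TOY_ZERO_FRAME is the shared one */
};

enum { TOY_ZERO_FRAME = 0 };

/*
 * Toy stand-in for the anonymous fault decision: the only new branch on
 * the common path is the read/write test.
 */
void toy_anon_fault(struct toy_pte *pte, bool write, int *next_free_frame)
{
    pte->present = true;
    if (!write) {
        /* Read fault: map the shared zero frame and mark it special. */
        pte->special = true;
        pte->frame = TOY_ZERO_FRAME;
        return;
    }
    /* Write fault: hand out a private frame, as without the zero page. */
    pte->special = false;
    pte->frame = (*next_free_frame)++;
}

/*
 * Toy stand-in for vm_normal_page(): special entries yield "no page"
 * (-1 here), so callers never refcount the shared zero frame.
 */
int toy_normal_frame(const struct toy_pte *pte)
{
    assert(pte->present);
    if (pte->special)
        return -1;
    return pte->frame;
}
```

In this model an application that only ever takes write faults never creates a special entry, so the lookup takes the same existing branch it already had.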
> For goodly (and badly) written scientific app there KSM that will do
> more than zeropage while dealing with matrix algorithms and such. If
> they try KSM and they don't gain a lot more free memory than with the
> zero page hack, then I agree in reintroducing it, but I guess when
> they try KSM they will ask you to patch kernel with it, instead of
> patch kernel with zeropage.
The main difference between zeropage and the KSM solution is that
zeropage requires no refcnt/rmap handling, never pollutes caches, etc.
This will be a big advantage.
> If they don't gain anything more with KSM
> than with zeropage, and the kksmd overhead is too high, then it would
> make sense to use zeropage for them I agree even if it bites in the
> fast path of all apps that can't benefit from it. (not to tell the
> fact that reading zero and writing non zero back for normal apps is
> harmful as there's a double page fault generated instead of a single
> one, kksmd has a cost but zeropage isn't free either in term of page
> faults too)
>
Sorry, _all_ my customers use RHEL5 and there is no ksm there yet.
BTW, I love the concept of KSM but I don't yet trust it enough to recommend
it to my customers. It's a bit young for production from my point of view.
AFAIK, no bug reports about ksm have reached this mailing list, yet.
Thanks,
-Kame
* Re: [RFC][PATCH 0/4] ZERO PAGE again v2
2009-07-07 9:06 ` KAMEZAWA Hiroyuki
2009-07-07 14:00 ` Nick Piggin
@ 2009-07-08 17:32 ` Andrea Arcangeli
2009-07-09 1:12 ` KAMEZAWA Hiroyuki
2009-07-10 11:18 ` Hugh Dickins
1 sibling, 2 replies; 26+ messages in thread
From: Andrea Arcangeli @ 2009-07-08 17:32 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki
Cc: Nick Piggin, linux-mm, hugh.dickins, avi, akpm, torvalds
On Tue, Jul 07, 2009 at 06:06:29PM +0900, KAMEZAWA Hiroyuki wrote:
> Then, most of users will not notice that ZERO_PAGE is not available until
> he(she) find OOM-Killer message. This is very terrible situation for me.
> (and most of system admins.)
Can you try to teach them to use KSM and see if they gain a whole lot
more from it (surely they also do some memset(dst, 0) sometimes, not
only memcpy(zerosrc, dst)). Not to mention when they init their
arrays/matrices to non zero values, which is a bit harder to optimize
for with zero page...
My only dislike is that zero page requires a flood of new "if ()"
branches in fast paths that benefit nothing but badly written apps,
and that's the only reason I liked its removal.
For well (and badly) written scientific apps there is KSM, which will do
more than zeropage while dealing with matrix algorithms and such. If
they try KSM and they don't gain a lot more free memory than with the
zero page hack, then I agree with reintroducing it, but I guess when
they try KSM they will ask you to patch the kernel with it, instead of
patching the kernel with zeropage. If they don't gain anything more with KSM
than with zeropage, and the kksmd overhead is too high, then it would
make sense to use zeropage for them, I agree, even if it bites in the
fast path of all apps that can't benefit from it. (Not to mention the
fact that reading zero and writing non zero back for normal apps is
harmful as there's a double page fault generated instead of a single
one; kksmd has a cost but zeropage isn't free in terms of page
faults either.)
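The double fault claim can be checked on a given system with a small measurement like the one below. It is a sketch under assumptions: faults_for_touch() is a hypothetical helper, it relies on getrusage() minor fault counts, and the numbers depend on the kernel version and on features such as transparent huge pages.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* for MAP_ANONYMOUS on glibc */
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

/* Minor fault count of this process so far, or -1 on error. */
static long minor_faults(void)
{
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    return ru.ru_minflt;
}

/*
 * Touch `npages` fresh anonymous pages, optionally reading each page
 * before writing it, and return the number of minor faults taken
 * (or -1 on error). faults_for_touch() is a hypothetical name.
 */
long faults_for_touch(size_t npages, int read_first)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || npages == 0)
        return -1;

    size_t len = npages * (size_t)page;
    volatile unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    long before = minor_faults();
    unsigned char sink = 0;
    for (size_t off = 0; off < len; off += (size_t)page) {
        if (read_first)
            sink |= p[off];  /* read fault on an untouched page */
        p[off] = 1;          /* write fault: needs a private page */
    }
    long after = minor_faults();
    assert(sink == 0);       /* fresh anonymous memory reads as zero */

    munmap((void *)p, len);
    if (before < 0 || after < 0)
        return -1;
    return after - before;
}
```

Comparing faults_for_touch(n, 1) against faults_for_touch(n, 0) shows whether read-then-write costs an extra fault per page on that system; a kernel that hands out a private page on the first read would be expected to show no difference.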
* Re: [RFC][PATCH 0/4] ZERO PAGE again v2
2009-07-08 6:21 ` Nick Piggin
@ 2009-07-08 16:07 ` Linus Torvalds
2009-07-09 7:47 ` Nick Piggin
0 siblings, 1 reply; 26+ messages in thread
From: Linus Torvalds @ 2009-07-08 16:07 UTC (permalink / raw)
To: Nick Piggin; +Cc: KAMEZAWA Hiroyuki, linux-mm, hugh.dickins, avi, akpm
On Wed, 8 Jul 2009, Nick Piggin wrote:
>
> I'm talking about the cases where you would want to use ZERO_PAGE for
> computing with anonymous memory (not for zeroing IO). In that case,
> the TLB would probably be the primary one.
Umm. Are you even listening to yourself?
OF COURSE the TLB would be the primary issue, since the zero page has made
cache effects go away.
BUT THAT IS A GOOD THING.
Instead of making it sound like "that's a bad thing, because now TLB
dominates", just say what's really going on: "that's a good thing, because
you made the cache access patterns wonderful".
See? You claim TLB is a problem, but it's really that you made all _other_
problems go away.
Now, it's true that you can avoid the TLB costs by moving the costs into a
"software TLB" (aka "indirection table"), and make the TLB footprint go
away by turning it into something else (indirection through a pointer).
Sometimes that speeds things up - because you may be able to actually
avoid doing other things by noticing huge gaps etc - but sometimes it
slows you down too - because indirection isn't free, and maybe there are
common cases where there aren't that many sparse accesses.
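A minimal sketch of such an indirection table, in C, might look like the
following. All names (struct sparse_array, CHUNK_SLOTS, sparse_get, and so on)
are hypothetical and chosen only for this example. Chunks that were never
written stay as NULL pointers and read back as zero, so they cost neither
memory nor page-table entries, but every access pays for one extra pointer
load.

```c
#include <stddef.h>
#include <stdlib.h>

#define CHUNK_SLOTS 1024   /* elements per chunk; an arbitrary choice */

struct sparse_array {
    size_t nchunks;
    double **chunks;       /* NULL entry: chunk never written, reads as 0 */
};

/* Allocate only the table of chunk pointers. Returns 0 or -1. */
int sparse_init(struct sparse_array *a, size_t nelems)
{
    a->nchunks = (nelems + CHUNK_SLOTS - 1) / CHUNK_SLOTS;
    a->chunks = calloc(a->nchunks, sizeof(*a->chunks));
    return a->chunks ? 0 : -1;
}

/* Read element i; a missing chunk behaves like zero-filled memory. */
double sparse_get(const struct sparse_array *a, size_t i)
{
    const double *chunk = a->chunks[i / CHUNK_SLOTS];
    return chunk ? chunk[i % CHUNK_SLOTS] : 0.0;
}

/* Write element i, allocating its chunk on first use. Returns 0 or -1. */
int sparse_set(struct sparse_array *a, size_t i, double v)
{
    double **slot = &a->chunks[i / CHUNK_SLOTS];
    if (!*slot) {
        *slot = calloc(CHUNK_SLOTS, sizeof(**slot));
        if (!*slot)
            return -1;
    }
    (*slot)[i % CHUNK_SLOTS] = v;
    return 0;
}

/* Release every allocated chunk and the table itself. */
void sparse_free(struct sparse_array *a)
{
    for (size_t c = 0; c < a->nchunks; c++)
        free(a->chunks[c]);
    free(a->chunks);
    a->chunks = NULL;
    a->nchunks = 0;
}
```

A scan over this structure can skip a whole chunk when its pointer is NULL,
which is the "noticing huge gaps" advantage; a dense access pattern only sees
the added indirection.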
> I don't fight it. I had proposals to get rid of cache pingpong too,
> but you rejected that ;)
Yeah, and they were ugly as hell. I had a suggestion to just continue to
use PG_reserved (which was _way_ simpler than your version) before the
counting, but you and Hugh were on a religious agenda against the whole
PG_reserved bit.
So I don't understand why you claim that you don't fight it, when you CLEARLY
do. The patches that KAMEZAWA-san posted were already simpler than your
complicated models were - I just think they can be simpler still.
Linus
* Re: [RFC][PATCH 0/4] ZERO PAGE again v2
2009-07-07 16:59 ` Linus Torvalds
@ 2009-07-08 6:21 ` Nick Piggin
2009-07-08 16:07 ` Linus Torvalds
0 siblings, 1 reply; 26+ messages in thread
From: Nick Piggin @ 2009-07-08 6:21 UTC (permalink / raw)
To: Linus Torvalds; +Cc: KAMEZAWA Hiroyuki, linux-mm, hugh.dickins, avi, akpm
On Tue, Jul 07, 2009 at 09:59:39AM -0700, Linus Torvalds wrote:
>
>
> On Tue, 7 Jul 2009, Nick Piggin wrote:
> >
> > I just wouldn't like to re-add significant complexity back to
> > the vm without good and concrete examples. OK I agree that
> > just saying "rewrite your code" is not so good, but are there
> > real significant problems? Is it inside just a particular linear
> > algebra library or something that might be able to be updated?
>
> The thing is, ZERO_PAGE really used to work very well.
>
> It was not only useful for simple "I want lots of memory, and I'm going to
> use it pretty sparsely" (which _is_ a very valid thing to do), but it was
> useful for TLB benchmarking, and for cache-efficient "I'm going to write
> lots of zeroes to files", and for a number of other uses.
>
> You can talk about TLB pressure all you want, but the fact is, quite often
> normal cache effects dominate - and ZERO_PAGE is _wonderful_ for sharing
> cachelines (which is why it was so useful for TLB performance testing: map
> a huge area, and you know that there will be no cache effects, only TLB
> effects).
>
> There are actually very few cases where TLB effects are the primary ones -
> they tend to happen when you have truly random accesses that have no
> locality even on a small case. That's pretty rare. Even things that depend
> on sparse arrays etc tend to mainly _access_ the parts it works on (ie you
> may have allocated hundreds of megs of memory to simplify your memory
> management, but you work on only a small part of it).
I'm talking about the cases where you would want to use ZERO_PAGE for
computing with anonymous memory (not for zeroing IO). In that case,
the TLB would probably be the primary one. For IO, having zero page
for /dev/zero mapping would be a good idea (I think I actually
implemented that in a SLES kernel for someone doing benchmarking).
> So it's not just "people actually use it". It really was a useful feature,
> with valid uses. We got rid of it, but if we can re-introduce it cleanly,
> we definitely should.
>
> I don't understand why you fight it. If we can do it well (read: without
> having fork/exit cause endless amounts of cache ping-pongs due to touching
> 'struct page *'), there are no downsides that I can see. It's not like
> it's a complicated feature.
I don't fight it. I had proposals to get rid of cache pingpong too,
but you rejected that ;)
I just think that, seeing as we got rid of it a year or so ago, it
would be good to know of some real cases where it helps before
reintroducing it. I'm not saying none exist, I just want to know
about them.
* Re: [RFC][PATCH 0/4] ZERO PAGE again v2
2009-07-07 14:00 ` Nick Piggin
@ 2009-07-07 16:59 ` Linus Torvalds
2009-07-08 6:21 ` Nick Piggin
0 siblings, 1 reply; 26+ messages in thread
From: Linus Torvalds @ 2009-07-07 16:59 UTC (permalink / raw)
To: Nick Piggin; +Cc: KAMEZAWA Hiroyuki, linux-mm, hugh.dickins, avi, akpm
On Tue, 7 Jul 2009, Nick Piggin wrote:
>
> I just wouldn't like to re-add significant complexity back to
> the vm without good and concrete examples. OK I agree that
> just saying "rewrite your code" is not so good, but are there
> real significant problems? Is it inside just a particular linear
> algebra library or something that might be able to be updated?
The thing is, ZERO_PAGE really used to work very well.
It was not only useful for simple "I want lots of memory, and I'm going to
use it pretty sparsely" (which _is_ a very valid thing to do), but it was
useful for TLB benchmarking, and for cache-efficient "I'm going to write
lots of zeroes to files", and for a number of other uses.
You can talk about TLB pressure all you want, but the fact is, quite often
normal cache effects dominate - and ZERO_PAGE is _wonderful_ for sharing
cachelines (which is why it was so useful for TLB performance testing: map
a huge area, and you know that there will be no cache effects, only TLB
effects).
There are actually very few cases where TLB effects are the primary ones -
they tend to happen when you have truly random accesses that have no
locality even on a small case. That's pretty rare. Even things that depend
on sparse arrays etc tend to mainly _access_ the parts it works on (ie you
may have allocated hundreds of megs of memory to simplify your memory
management, but you work on only a small part of it).
So it's not just "people actually use it". It really was a useful feature,
with valid uses. We got rid of it, but if we can re-introduce it cleanly,
we definitely should.
I don't understand why you fight it. If we can do it well (read: without
having fork/exit cause endless amounts of cache ping-pongs due to touching
'struct page *'), there are no downsides that I can see. It's not like
it's a complicated feature.
Linus
* Re: [RFC][PATCH 0/4] ZERO PAGE again v2
2009-07-07 9:06 ` KAMEZAWA Hiroyuki
@ 2009-07-07 14:00 ` Nick Piggin
2009-07-07 16:59 ` Linus Torvalds
2009-07-08 17:32 ` Andrea Arcangeli
1 sibling, 1 reply; 26+ messages in thread
From: Nick Piggin @ 2009-07-07 14:00 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki; +Cc: linux-mm, hugh.dickins, avi, akpm, torvalds
On Tue, Jul 07, 2009 at 06:06:29PM +0900, KAMEZAWA Hiroyuki wrote:
> On Tue, 7 Jul 2009 10:47:50 +0200
> Nick Piggin <npiggin@suse.de> wrote:
>
> > On Tue, Jul 07, 2009 at 04:51:01PM +0900, KAMEZAWA Hiroyuki wrote:
> > > Hi, this is ZERO_PAGE mapping revival patch v2.
> > >
> > > ZERO PAGE was removed in 2.6.24 (=> http://lkml.org/lkml/2007/10/9/112)
> > > and I had no objections.
> > >
> > > In these days, at user support jobs, I noticed a few of customers
> > > are making use of ZERO_PAGE intentionally...brutal mmap and scan, etc.
> > > (For example, scanning big sparse table and save the contents.)
> > >
> > > They are using RHEL4-5(before 2.6.18) then they don't notice that ZERO_PAGE
> > > is gone, yet.
> > > yes, I can say "ZERO PAGE is gone" to them in next generation distro.
> > >
> > > Recently, a question comes to lkml (http://lkml.org/lkml/2009/6/4/383)
> > >
> > > Maybe there are some users of ZERO_PAGE other than my customers.
> > > So, can't we use ZERO_PAGE again ?
> > >
> > > IIUC, the problem of ZERO_PAGE was
> > > - reference count cache ping-pong
> > > - complicated handling.
> > > - the behavior page-fault-twice can make applications slow.
> > >
> > > This patch is a trial to de-refcounted ZERO_PAGE.
> > >
> > > This includes 4 patches.
> > > [1/4] introduce pte_zero() et al.
> > > [2/4] use ZERO_PAGE for READ fault in anonymous mapping.
> > > [3/4] corner cases, get_user_pages()
> > > [4/4] introduce get_user_pages_nozero().
> > >
> > > I feel these patches needs to be clearer but includes almost all
> > > messes we have to handle at using ZERO_PAGE again.
> > >
> > > What I feel now is
> > > a. technically, we can do because we did.
> > > b. Considering maintenance, code's beauty etc.. ZERO_PAGE adds messes.
> > > c. Very big benefits for some (a few?) users but no benefits to usual programs.
> > >
> > > There are trade-off between b. and c.
> > >
> > > Any comments are welcome.
> >
> > Can we just try to wean them off it? Using zero page for huge sparse
> > matrices is probably not ideal anyway because it needs to still be
> > faulted in and it occupies TLB space. They might see better performance
> > by using a better algorithm.
> >
> TLB usage is another problem I think...
>
> I agreed removal of ZERO_PAGE in 2.6.24. But I'm now retrying this
> because of following reasons.
>
> 1. From programmer's perspective, I almost agree to you. But considering users,
> most of them are _not_ programmers, saying "please rewrite your program
> because OS changed its implementation" is no help.
> What they want is calculating something and not writing a program.
>
> 2. This change is _very_ implicit and doesn't affect almost all programs.
> I think ZERO_PAGE() is used only when an application does some special jobs.
>
> Then, most of users will not notice that ZERO_PAGE is not available until
> he(she) find OOM-Killer message. This is very terrible situation for me.
> (and most of system admins.)
>
> 3. Considering save&restore application's data table, ZERO_PAGE is useful.
> maybe.
I just wouldn't like to re-add significant complexity back to
the vm without good and concrete examples. OK I agree that
just saying "rewrite your code" is not so good, but are there
real significant problems? Is it inside just a particular linear
algebra library or something that might be able to be updated?
* Re: [RFC][PATCH 0/4] ZERO PAGE again v2
2009-07-07 9:18 ` KAMEZAWA Hiroyuki
@ 2009-07-07 9:26 ` Avi Kivity
0 siblings, 0 replies; 26+ messages in thread
From: Avi Kivity @ 2009-07-07 9:26 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki; +Cc: Nick Piggin, linux-mm, hugh.dickins, akpm, torvalds
On 07/07/2009 12:18 PM, KAMEZAWA Hiroyuki wrote:
>> For kvm live migration, I've thought of extending mincore() to report if
>> a page will be read as zeros.
>>
>>
> BTW, ksm can scale enough to combine all pages which just includes zero ?
> No heavy cache ping-pong without zero-page ?
>
ksm will increase cpu and cache load; it's oriented towards workloads
where reducing memory pressure is more important than cpu load. For
cpu-intensive, low sharing workloads it will be disabled. That's why I
want an alternative way to deal with zero pages; it can be ZERO_PAGE,
mincore(), or madvise(MADV_DROP_IFZERO).
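For context, here is a small C sketch of what mincore() reports today. It only
tells the caller which pages of a range are resident; the "will read as zeros"
answer discussed above would be an extension and is not shown, and
MADV_DROP_IFZERO is a proposal in this thread rather than an existing flag.
The function name count_resident_after_touch() is hypothetical.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map `npages` anonymous pages, write one byte into page `touched`,
 * and return how many pages mincore() reports as resident.
 * Returns -1 on any error.
 */
long count_resident_after_touch(size_t npages, size_t touched)
{
    long psz = sysconf(_SC_PAGESIZE);
    if (psz <= 0 || npages == 0 || touched >= npages)
        return -1;

    size_t len = npages * (size_t)psz;
    unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;

    long resident = -1;
    unsigned char *vec = malloc(npages);   /* one status byte per page */
    if (vec) {
        buf[touched * (size_t)psz] = 1;    /* write fault: page gets real memory */
        if (mincore(buf, len, vec) == 0) {
            resident = 0;
            for (size_t i = 0; i < npages; i++)
                resident += vec[i] & 1;    /* bit 0 set: page is resident */
        }
    }

    free(vec);
    munmap(buf, len);
    return resident;
}
```

On Linux the count is typically small for a mostly untouched mapping, but a
page that is not resident is not the same thing as a page that reads as zeros
(it may be swapped out), which is why a migration tool would need the extended
semantics rather than this call as it stands.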
--
error compiling committee.c: too many arguments to function
* Re: [RFC][PATCH 0/4] ZERO PAGE again v2
2009-07-07 9:05 ` Avi Kivity
@ 2009-07-07 9:18 ` KAMEZAWA Hiroyuki
2009-07-07 9:26 ` Avi Kivity
0 siblings, 1 reply; 26+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-07-07 9:18 UTC (permalink / raw)
To: Avi Kivity; +Cc: Nick Piggin, linux-mm, hugh.dickins, akpm, torvalds
On Tue, 07 Jul 2009 12:05:24 +0300
Avi Kivity <avi@redhat.com> wrote:
> On 07/07/2009 11:47 AM, Nick Piggin wrote:
> >> Any comments are welcome.
> >>
> >
> > Can we just try to wean them off it? Using zero page for huge sparse
> > matrices is probably not ideal anyway because it needs to still be
> > faulted in and it occupies TLB space. They might see better performance
> > by using a better algorithm.
> >
>
> For kvm live migration, I've thought of extending mincore() to report if
> a page will be read as zeros.
>
BTW, can ksm scale well enough to merge all pages that contain only zeros?
Is there no heavy cache ping-pong without the zero page?
Thanks,
-Kame
* Re: [RFC][PATCH 0/4] ZERO PAGE again v2
2009-07-07 8:47 ` Nick Piggin
2009-07-07 9:05 ` Avi Kivity
@ 2009-07-07 9:06 ` KAMEZAWA Hiroyuki
2009-07-07 14:00 ` Nick Piggin
2009-07-08 17:32 ` Andrea Arcangeli
1 sibling, 2 replies; 26+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-07-07 9:06 UTC (permalink / raw)
To: Nick Piggin; +Cc: linux-mm, hugh.dickins, avi, akpm, torvalds
On Tue, 7 Jul 2009 10:47:50 +0200
Nick Piggin <npiggin@suse.de> wrote:
> On Tue, Jul 07, 2009 at 04:51:01PM +0900, KAMEZAWA Hiroyuki wrote:
> > Hi, this is ZERO_PAGE mapping revival patch v2.
> >
> > ZERO PAGE was removed in 2.6.24 (=> http://lkml.org/lkml/2007/10/9/112)
> > and I had no objections.
> >
> > In these days, at user support jobs, I noticed a few of customers
> > are making use of ZERO_PAGE intentionally...brutal mmap and scan, etc.
> > (For example, scanning big sparse table and save the contents.)
> >
> > They are using RHEL4-5(before 2.6.18) then they don't notice that ZERO_PAGE
> > is gone, yet.
> > yes, I can say "ZERO PAGE is gone" to them in next generation distro.
> >
> > Recently, a question comes to lkml (http://lkml.org/lkml/2009/6/4/383)
> >
> > Maybe there are some users of ZERO_PAGE other than my customers.
> > So, can't we use ZERO_PAGE again ?
> >
> > IIUC, the problem of ZERO_PAGE was
> > - reference count cache ping-pong
> > - complicated handling.
> > - the behavior page-fault-twice can make applications slow.
> >
> > This patch is a trial to de-refcounted ZERO_PAGE.
> >
> > This includes 4 patches.
> > [1/4] introduce pte_zero() et al.
> > [2/4] use ZERO_PAGE for READ fault in anonymous mapping.
> > [3/4] corner cases, get_user_pages()
> > [4/4] introduce get_user_pages_nozero().
> >
> > I feel these patches needs to be clearer but includes almost all
> > messes we have to handle at using ZERO_PAGE again.
> >
> > What I feel now is
> > a. technically, we can do because we did.
> > b. Considering maintenance, code's beauty etc.. ZERO_PAGE adds messes.
> > c. Very big benefits for some (a few?) users but no benefits to usual programs.
> >
> > There are trade-off between b. and c.
> >
> > Any comments are welcome.
>
> Can we just try to wean them off it? Using zero page for huge sparse
> matrices is probably not ideal anyway because it needs to still be
> faulted in and it occupies TLB space. They might see better performance
> by using a better algorithm.
>
TLB usage is another problem, I think...
I agreed with the removal of ZERO_PAGE in 2.6.24, but I'm now retrying
this for the following reasons.
1. From a programmer's perspective, I almost agree with you. But most users
are _not_ programmers, and telling them "please rewrite your program
because the OS changed its implementation" is no help.
What they want is to calculate something, not to write a program.
2. This change is _very_ implicit and doesn't affect almost all programs.
I think ZERO_PAGE() is used only when an application does some special job.
So most users will not notice that ZERO_PAGE is unavailable until
they find an OOM-killer message. This is a terrible situation for me
(and for most system admins).
3. For saving and restoring an application's data table, ZERO_PAGE is useful.
Maybe.
Thanks,
-Kame
* Re: [RFC][PATCH 0/4] ZERO PAGE again v2
2009-07-07 8:47 ` Nick Piggin
@ 2009-07-07 9:05 ` Avi Kivity
2009-07-07 9:18 ` KAMEZAWA Hiroyuki
2009-07-07 9:06 ` KAMEZAWA Hiroyuki
1 sibling, 1 reply; 26+ messages in thread
From: Avi Kivity @ 2009-07-07 9:05 UTC (permalink / raw)
To: Nick Piggin; +Cc: KAMEZAWA Hiroyuki, linux-mm, hugh.dickins, akpm, torvalds
On 07/07/2009 11:47 AM, Nick Piggin wrote:
> On Tue, Jul 07, 2009 at 04:51:01PM +0900, KAMEZAWA Hiroyuki wrote:
>
>> Hi, this is ZERO_PAGE mapping revival patch v2.
>>
>> ZERO PAGE was removed in 2.6.24 (=> http://lkml.org/lkml/2007/10/9/112)
>> and I had no objections.
>>
>> In these days, at user support jobs, I noticed a few of customers
>> are making use of ZERO_PAGE intentionally...brutal mmap and scan, etc.
>> (For example, scanning big sparse table and save the contents.)
>>
>> They are using RHEL4-5(before 2.6.18) then they don't notice that ZERO_PAGE
>> is gone, yet.
>> yes, I can say "ZERO PAGE is gone" to them in next generation distro.
>>
>> Recently, a question comes to lkml (http://lkml.org/lkml/2009/6/4/383)
>>
>> Maybe there are some users of ZERO_PAGE other than my customers.
>> So, can't we use ZERO_PAGE again ?
>>
>> IIUC, the problem of ZERO_PAGE was
>> - reference count cache ping-pong
>> - complicated handling.
>> - the behavior page-fault-twice can make applications slow.
>>
>> This patch is a trial to de-refcounted ZERO_PAGE.
>>
>> This includes 4 patches.
>> [1/4] introduce pte_zero() et al.
>> [2/4] use ZERO_PAGE for READ fault in anonymous mapping.
>> [3/4] corner cases, get_user_pages()
>> [4/4] introduce get_user_pages_nozero().
>>
>> I feel these patches needs to be clearer but includes almost all
>> messes we have to handle at using ZERO_PAGE again.
>>
>> What I feel now is
>> a. technically, we can do because we did.
>> b. Considering maintenance, code's beauty etc.. ZERO_PAGE adds messes.
>> c. Very big benefits for some (a few?) users but no benefits to usual programs.
>>
>> There are trade-off between b. and c.
>>
>> Any comments are welcome.
>>
>
> Can we just try to wean them off it? Using zero page for huge sparse
> matrices is probably not ideal anyway because it needs to still be
> faulted in and it occupies TLB space. They might see better performance
> by using a better algorithm.
>
For kvm live migration, I've thought of extending mincore() to report if
a page will be read as zeros.
--
error compiling committee.c: too many arguments to function
* Re: [RFC][PATCH 0/4] ZERO PAGE again v2
2009-07-07 7:51 KAMEZAWA Hiroyuki
@ 2009-07-07 8:47 ` Nick Piggin
2009-07-07 9:05 ` Avi Kivity
2009-07-07 9:06 ` KAMEZAWA Hiroyuki
0 siblings, 2 replies; 26+ messages in thread
From: Nick Piggin @ 2009-07-07 8:47 UTC (permalink / raw)
To: KAMEZAWA Hiroyuki; +Cc: linux-mm, hugh.dickins, avi, akpm, torvalds
On Tue, Jul 07, 2009 at 04:51:01PM +0900, KAMEZAWA Hiroyuki wrote:
> Hi, this is ZERO_PAGE mapping revival patch v2.
>
> ZERO PAGE was removed in 2.6.24 (=> http://lkml.org/lkml/2007/10/9/112)
> and I had no objections.
>
> In these days, at user support jobs, I noticed a few of customers
> are making use of ZERO_PAGE intentionally...brutal mmap and scan, etc.
> (For example, scanning big sparse table and save the contents.)
>
> They are using RHEL4-5(before 2.6.18) then they don't notice that ZERO_PAGE
> is gone, yet.
> yes, I can say "ZERO PAGE is gone" to them in next generation distro.
>
> Recently, a question comes to lkml (http://lkml.org/lkml/2009/6/4/383)
>
> Maybe there are some users of ZERO_PAGE other than my customers.
> So, can't we use ZERO_PAGE again ?
>
> IIUC, the problem of ZERO_PAGE was
> - reference count cache ping-pong
> - complicated handling.
> - the behavior page-fault-twice can make applications slow.
>
> This patch is a trial to de-refcounted ZERO_PAGE.
>
> This includes 4 patches.
> [1/4] introduce pte_zero() et al.
> [2/4] use ZERO_PAGE for READ fault in anonymous mapping.
> [3/4] corner cases, get_user_pages()
> [4/4] introduce get_user_pages_nozero().
>
> I feel these patches needs to be clearer but includes almost all
> messes we have to handle at using ZERO_PAGE again.
>
> What I feel now is
> a. technically, we can do because we did.
> b. Considering maintenance, code's beauty etc.. ZERO_PAGE adds messes.
> c. Very big benefits for some (a few?) users but no benefits to usual programs.
>
> There are trade-off between b. and c.
>
> Any comments are welcome.
Can we just try to wean them off it? Using zero page for huge sparse
matrices is probably not ideal anyway because it still needs to be
faulted in and it occupies TLB space. They might see better performance
by using a better algorithm.
* [RFC][PATCH 0/4] ZERO PAGE again v2
@ 2009-07-07 7:51 KAMEZAWA Hiroyuki
2009-07-07 8:47 ` Nick Piggin
0 siblings, 1 reply; 26+ messages in thread
From: KAMEZAWA Hiroyuki @ 2009-07-07 7:51 UTC (permalink / raw)
To: linux-mm; +Cc: npiggin, hugh.dickins, avi, akpm, torvalds
Hi, this is ZERO_PAGE mapping revival patch v2.
ZERO PAGE was removed in 2.6.24 (=> http://lkml.org/lkml/2007/10/9/112)
and I had no objections.
Recently, in user support work, I noticed that a few customers
are making use of ZERO_PAGE intentionally... brutal mmap and scan, etc.
(For example, scanning a big sparse table and saving the contents.)
They are using RHEL4-5 (before 2.6.18), so they haven't noticed yet that
ZERO_PAGE is gone.
Yes, I can tell them "ZERO PAGE is gone" in the next-generation distro.
Recently, a question came to lkml (http://lkml.org/lkml/2009/6/4/383).
Maybe there are some users of ZERO_PAGE other than my customers.
So, can't we use ZERO_PAGE again?
IIUC, the problems with ZERO_PAGE were
- reference count cache ping-pong
- complicated handling.
- the page-fault-twice behavior can make applications slow.
This patch set is an attempt at a de-refcounted ZERO_PAGE.
It includes 4 patches.
[1/4] introduce pte_zero() et al.
[2/4] use ZERO_PAGE for READ fault in anonymous mapping.
[3/4] corner cases, get_user_pages()
[4/4] introduce get_user_pages_nozero().
I feel these patches need to be clearer, but they include almost all
the messes we have to handle when using ZERO_PAGE again.
What I feel now is
a. Technically, we can do it, because we did it before.
b. Considering maintenance, code beauty, etc., ZERO_PAGE adds messes.
c. Very big benefits for some (a few?) users but no benefits to usual programs.
There is a trade-off between b. and c.
Any comments are welcome.
-Kame
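For illustration, the "brutal mmap and scan" pattern described in this cover
letter might look like the following C sketch. The function name and the choice
of a read-only private anonymous mapping are assumptions made for the example,
not something taken from the patches. Whether the untouched pages end up backed
by one shared zero page or by a freshly zeroed page each depends on the kernel;
the scan result is the same either way, only the memory footprint differs.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/*
 * Map `len` bytes of private anonymous memory, read every byte, and
 * return the sum of the bytes, or -1 if the mapping fails.
 * Nothing is ever written, so every read sees zero-filled memory.
 */
long scan_untouched_mapping(size_t len)
{
    unsigned char *table = mmap(NULL, len, PROT_READ,
                                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (table == MAP_FAILED)
        return -1;

    long sum = 0;
    for (size_t i = 0; i < len; i++)
        sum += table[i];   /* first read of each page takes a read fault */

    munmap(table, len);
    return sum;
}
```

With a zero page, a scan like this over a table much larger than RAM costs
almost no physical memory; without one, each read fault allocates a page, which
is the route to the OOM-killer reports mentioned in the follow-ups.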
end of thread (~2009-07-13 7:03 UTC)
Thread overview: 26+ messages
2009-07-07 15:50 [RFC][PATCH 0/4] ZERO PAGE again v2 KAMEZAWA Hiroyuki
-- strict thread matches above, loose matches on Subject: below --
2009-07-07 7:51 KAMEZAWA Hiroyuki
2009-07-07 8:47 ` Nick Piggin
2009-07-07 9:05 ` Avi Kivity
2009-07-07 9:18 ` KAMEZAWA Hiroyuki
2009-07-07 9:26 ` Avi Kivity
2009-07-07 9:06 ` KAMEZAWA Hiroyuki
2009-07-07 14:00 ` Nick Piggin
2009-07-07 16:59 ` Linus Torvalds
2009-07-08 6:21 ` Nick Piggin
2009-07-08 16:07 ` Linus Torvalds
2009-07-09 7:47 ` Nick Piggin
2009-07-09 17:54 ` Linus Torvalds
2009-07-10 2:09 ` Nick Piggin
2009-07-10 3:38 ` Linus Torvalds
2009-07-10 3:51 ` Nick Piggin
2009-07-08 17:32 ` Andrea Arcangeli
2009-07-09 1:12 ` KAMEZAWA Hiroyuki
2009-07-10 11:18 ` Hugh Dickins
2009-07-10 13:42 ` Andrea Arcangeli
2009-07-10 14:12 ` KAMEZAWA Hiroyuki
2009-07-10 15:16 ` Andrea Arcangeli
2009-07-10 15:32 ` KAMEZAWA Hiroyuki
2009-07-10 17:09 ` Hugh Dickins
2009-07-13 6:46 ` Nick Piggin
2009-07-13 7:24 ` Nick Piggin