* vDSO vs. mm : problems with ppc vdso
@ 2006-02-28 5:39 Benjamin Herrenschmidt
2006-02-28 5:54 ` Andrew Morton
0 siblings, 1 reply; 15+ messages in thread
From: Benjamin Herrenschmidt @ 2006-02-28 5:39 UTC (permalink / raw)
To: linux-mm; +Cc: Hugh Dickins, akpm, Paul Mackerras, Nick Piggin
(Andrew: I think it's important to assess at least how bad the problem
is for 2.6.16 and see if we want to do something about it).
I have discovered some issues with my vDSO implementation that went
unnoticed so far but might cause problems with the VM.
The problems are related to the way the powerpc vDSO is implemented in
order to support COW (for breakpoints) and randomisation. It's not
implemented as a gate_area() hack. Instead, I create a vma at process
exec (see arch_setup_additional_pages() in arch/powerpc/kernel/vdso.c,
which is called from binfmt_elf.c).
This vma has custom vm_ops with a nopage() function that maps in pages
from the vdso on demand. Those pages are kernel pages shared by all
processes at first, though if a COW happens, they will be replaced by
normal anonymous pages by the normal COW code.
A first problem happens here (though it's not my main concern right now.
It's a bug I need to fix but at least I have a good handle on it). The
nopage function decides whether to map the pages from the 32 or the 64
bits vdso based on test_thread_flag(). This is broken if those pages end
up being faulted in as the result of a get_user_pages() done by another
process. Typically, that means that a 64 bits gdb tracing a 32 bits
program will fault the wrong pages in. So I need a way to "know" what
vdso to fault in based on the vma ... that will require me to either
hack something in the vma (stuff a flag somewhere ?) or find a way to
identify a 32 bits vma from a 64 bits vma...
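To make the failure mode concrete, here is a small user-space sketch in C.
Everything in it (the toy_* types, the is_32bit field, the two pick_*
helpers) is a hypothetical stand-in, not kernel code. It only illustrates
why keying the choice on the faulting thread can go wrong when a tracer
faults pages into another process's address space, and why keying it on
the vma's owner would not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for kernel objects; none of these are the real types. */
enum toy_vdso_kind { TOY_VDSO32, TOY_VDSO64 };

struct toy_mm   { bool is_32bit; };	/* hypothetical per-mm marker */
struct toy_task { bool tif_32bit; struct toy_mm *mm; };
struct toy_vma  { struct toy_mm *vm_mm; };

/* Selection keyed on whoever happens to be running the fault handler. */
static enum toy_vdso_kind pick_by_current(const struct toy_task *running)
{
	assert(running != NULL);
	return running->tif_32bit ? TOY_VDSO32 : TOY_VDSO64;
}

/* Selection keyed on the address space that owns the vma. */
static enum toy_vdso_kind pick_by_vma(const struct toy_vma *vma)
{
	assert(vma != NULL && vma->vm_mm != NULL);
	return vma->vm_mm->is_32bit ? TOY_VDSO32 : TOY_VDSO64;
}
```

With a 32-bit tracee and a 64-bit tracer, pick_by_current() gives a
different answer depending on which of the two takes the fault, while
pick_by_vma() always answers for the tracee's address space.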
The second problem is more subtle and that's where I really need a VM
guru to help me assess how bad the situation is and what should be done
to fix it.
Since when not-COWed, those vDSO pages are actually kernel pages mapped
into every process, they aren't per-se anonymous pages, nor file
pages... in fact, they don't quite fit in anything rmap knows about.
However, I can't mark the VMA as VM_RESERVED or anything like that since
that would prevent COW from working.
Thus we hit some "interesting" code path in rmap of that sort:
- page_address_in_vma() will always fail for those pages afaik. Not
sure of the consequences at this point. (Neither PageAnon() nor
page->mapping)
- page_referenced() will not get into any of the code path under "if
(page_mapped(page) && page->mapping) {" thanks to page->mapping being
NULL afaik. I think that's a good thing in this case. We rely solely on
the PTE information for these pages
- try_to_unmap() gets more funny... It will call try_to_unmap_file().
Maybe we shouldn't ... maybe I should mark the vdso's kernel pages as
PageLocked(), though I would have to dig through the possible side
effects of that (notably vs. COW). If that works though, it may be a
good workaround to avoid nasty code paths in the VM.
- If we hit try_to_unmap_one(), we'll probably do dec_mm_counter(mm,
file_rss). But file_rss has never been incremented when the page was
faulted in in the first place, was it ? Those shared kernel pages
shouldn't be accounted there anyway
- There may be other problematic code paths outside of rmap.c that I
missed.
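The "neither anonymous nor file" situation described in the list above can
be pictured with a toy classifier. This is only a sketch under stated
assumptions: the low-bit tag on the mapping word imitates how anonymous
pages are distinguished, and all names (toy_page, TOY_MAPPING_ANON,
toy_classify) are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: these names mimic, but are not, the kernel's. */
#define TOY_MAPPING_ANON ((uintptr_t)1)	/* low bit tags an anonymous mapping */

enum toy_page_kind { TOY_PAGE_ANON, TOY_PAGE_FILE, TOY_PAGE_UNTRACKED };

struct toy_page {
	uintptr_t mapping;	/* 0, a file mapping pointer, or anon pointer | 1 */
};

/* Decide which of the two rmap walks, if any, could find this page. */
static enum toy_page_kind toy_classify(const struct toy_page *page)
{
	assert(page != NULL);
	if (page->mapping & TOY_MAPPING_ANON)
		return TOY_PAGE_ANON;
	if (page->mapping != 0)
		return TOY_PAGE_FILE;
	return TOY_PAGE_UNTRACKED;	/* e.g. a shared vDSO kernel page */
}
```

A page whose mapping word is zero falls through both tests, which is the
case the shared vDSO pages are in before any COW happens.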
I'd really like to assess the situation and maybe get a few band aids in
2.6.16 if proper fixes are too complicated...
Thanks !
Ben.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
* Re: vDSO vs. mm : problems with ppc vdso
2006-02-28 5:39 vDSO vs. mm : problems with ppc vdso Benjamin Herrenschmidt
@ 2006-02-28 5:54 ` Andrew Morton
2006-02-28 6:08 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 15+ messages in thread
From: Andrew Morton @ 2006-02-28 5:54 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: linux-mm, hugh, paulus, nickpiggin, David S. Miller
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
>
> (Andrew: I think it's important to assess at least how bad the problem
> is for 2.6.16 and see if we want to do something about it).
>
> I have discovered some issues with my vDSO implementation that went
> unnoticed so far but might cause problems with the VM.
>
> The problems are related to the way the powerpc vDSO is implemented in
> order to support COW (for breakpoints) and randomisation. It's not
> implemented as a gate_area() hack. Instead, I create a vma at process
> exec (see arch_setup_additional_pages() in arch/powerpc/kernel/vdso.c,
> which is called from binfmt_elf.c).
>
> This vma has custom vm_ops with a nopage() function that maps in pages
> from the vdso on demand. Those pages are kernel pages shared by all
> processes at first, though if a COW happens, they will be replaced by
> normal anonymous pages by the normal COW code.
>
> A first problem happens here (though it's not my main concern right now.
> It's a bug I need to fix but at least I have a good handle on it). The
> nopage function decides whether to map the pages from the 32 or the 64
> bits vdso based on test_thread_flag(). This is broken if those pages end
> up being faulted in as the result of a get_user_pages() done by another
> process. Typically, that means that a 64 bits gdb tracing a 32 bits
> program will fault the wrong pages in. So I need a way to "know" what
> vdso to fault in based on the vma ... that will require me to either
> hack something in the vma (stuff a flag somewhere ?) or find a way to
> identify a 32 bits vma from a 64 bits vma...
As mentioned on IRC, we keep on getting bugs because we don't have a clear
separation between 64-bit tasks (a task_struct thing) and 64-bit mm's (an
mm_struct thing). I'd propose adding mm_struct.task_size and testing that
in the appropriate places.
> The second problem is more subtle and that's where I really need a VM
> guru to help me assess how bad the situation is and what should be done
> to fix it.
>
> Since when not-COWed, those vDSO pages are actually kernel pages mapped
> into every process, they aren't per-se anonymous pages, nor file
> pages... in fact, they don't quite fit in anything rmap knows about.
> However, I can't mark the VMA as VM_RESERVED or anything like that since
> that would prevent COW from working.
>
> Thus we hit some "interesting" code path in rmap of that sort:
rmap won't touch this page unless your ->nopage handler put it onto the
page LRU.
* Re: vDSO vs. mm : problems with ppc vdso
2006-02-28 5:54 ` Andrew Morton
@ 2006-02-28 6:08 ` Benjamin Herrenschmidt
2006-02-28 6:20 ` Andrew Morton
2006-02-28 6:27 ` [PATCH] Add mm->task_size and fix powerpc vdso Benjamin Herrenschmidt
0 siblings, 2 replies; 15+ messages in thread
From: Benjamin Herrenschmidt @ 2006-02-28 6:08 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm, hugh, paulus, nickpiggin, David S. Miller
> As mentioned on IRC, we keep on getting bugs because we don't have a clear
> separation between 64-bit tasks (a task_struct thing) and 64-bit mm's (an
> mm_struct thing). I'd propose adding mm_struct.task_size and testing that
> in the appropriate places.
OK, what about a patch that adds mm->task_size, sets it to TASK_SIZE
asap, and uses that to fix my bug at least? It would have to be done in
flush_old_exec(), after the call to flush_thread(): at least on powerpc,
that's where we properly switch the TIF_32BIT flag. I can't do it
earlier. Does that sound all right ?
I'll send the patch as a reply to this message.
> > The second problem is more subtle and that's where I really need a VM
> > guru to help me assess how bad the situation is and what should be done
> > to fix it.
> >
> > Since when not-COWed, those vDSO pages are actually kernel pages mapped
> > into every process, they aren't per-se anonymous pages, nor file
> > pages... in fact, they don't quite fit in anything rmap knows about.
> > However, I can't mark the VMA as VM_RESERVED or anything like that since
> > that would prevent COW from working.
> >
> > Thus we hit some "interesting" code path in rmap of that sort:
>
> rmap won't touch this page unless your ->nopage handler put it onto the
> page LRU.
It indeed looks like try_to_unmap() is never called if page->mapping is
NULL.
Do you guys see any other case where my "special" vma & those kernel
pages in could be a problem ?
Cheers,
Ben.
* Re: vDSO vs. mm : problems with ppc vdso
2006-02-28 6:08 ` Benjamin Herrenschmidt
@ 2006-02-28 6:20 ` Andrew Morton
2006-02-28 6:30 ` Benjamin Herrenschmidt
2006-02-28 6:27 ` [PATCH] Add mm->task_size and fix powerpc vdso Benjamin Herrenschmidt
1 sibling, 1 reply; 15+ messages in thread
From: Andrew Morton @ 2006-02-28 6:20 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: linux-mm, hugh, paulus, nickpiggin, davem
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
>
>
> > As mentioned on IRC, we keep on getting bugs because we don't have a clear
> > separation between 64-bit tasks (a task_struct thing) and 64-bit mm's (an
> > mm_struct thing). I'd propose adding mm_struct.task_size and testing that
> > in the appropriate places.
>
> OK, what about a patch that adds mm->task_size, sets it to TASK_SIZE
> asap, and uses that to fix my bug at least? It would have to be done in
> flush_old_exec(), after the call to flush_thread(): at least on powerpc,
> that's where we properly switch the TIF_32BIT flag. I can't do it
> earlier. Does that sound all right ?
It should be done with some care - I suspect this will become *the*
way in which we recognise a 64-bit mm and quite a bit of stuff will end up
migrating to it. We do need input from the various 64-bit people who have
wrestled with these things.
> I'll send the patch as a reply to this message.
Please copy linux-arch.
> > > The second problem is more subtle and that's where I really need a VM
> > > guru to help me assess how bad the situation is and what should be done
> > > to fix it.
> > >
> > > Since when not-COWed, those vDSO pages are actually kernel pages mapped
> > > into every process, they aren't per-se anonymous pages, nor file
> > > pages... in fact, they don't quite fit in anything rmap knows about.
> > > However, I can't mark the VMA as VM_RESERVED or anything like that since
> > > that would prevent COW from working.
> > >
> > > Thus we hit some "interesting" code path in rmap of that sort:
> >
> > rmap won't touch this page unless your ->nopage handler put it onto the
> > page LRU.
>
> It indeed looks like try_to_unmap() is never called if page->mapping is
> NULL.
It's not ->mapping. It's the fact that rmap only operates on pages which
were found on the LRU. If you don't add it to the LRU (and surely you do
not) then no problem.
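A toy model of that point, in C: reclaim only ever sees pages that were put
on the list it scans. The names here (toy_lru, toy_lru_add,
toy_reclaim_scan) are invented for the sketch and greatly simplify the real
LRU handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_page {
	bool unmap_attempted;		/* set when the scan reaches this page */
	struct toy_page *lru_next;
};

struct toy_lru { struct toy_page *head; };

/* Only pages added here are ever visible to the scan below. */
static void toy_lru_add(struct toy_lru *lru, struct toy_page *page)
{
	assert(lru != NULL && page != NULL);
	page->lru_next = lru->head;
	lru->head = page;
}

/* Walk the list and "try to unmap" each page; returns pages visited. */
static int toy_reclaim_scan(struct toy_lru *lru)
{
	int visited = 0;

	assert(lru != NULL);
	for (struct toy_page *p = lru->head; p != NULL; p = p->lru_next) {
		p->unmap_attempted = true;
		visited++;
	}
	return visited;
}
```

A page that is mapped into processes but never passed to toy_lru_add() is
simply never reached, whatever its mapping field contains.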
> Do you guys see any other case where my "special" vma & those kernel
> pages in could be a problem ?
It sounds just like a sound card DMA buffer to me - that's a solved
problem? (Well, we keep unsolving it, but it's a relatively common
pattern).
* Re: vDSO vs. mm : problems with ppc vdso
2006-02-28 6:20 ` Andrew Morton
@ 2006-02-28 6:30 ` Benjamin Herrenschmidt
2006-02-28 6:47 ` Andrew Morton
2006-02-28 10:24 ` Nick Piggin
0 siblings, 2 replies; 15+ messages in thread
From: Benjamin Herrenschmidt @ 2006-02-28 6:30 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm, hugh, paulus, nickpiggin, davem
> It should be done with some care - I suspect this will become *the*
> way in which we recognise a 64-bit mm and quite a bit of stuff will end up
> migrating to it. We do need input from the various 64-bit people who have
> wrestled with these things.
Patch sent, now let's get feedback ;)
> > I'll send the patch as a reply to this message.
>
> Please copy linux-arch.
Did that.
> It's not ->mapping. It's the fact that rmap only operates on pages which
> were found on the LRU. If you don't add it to the LRU (and surely you do
> not) then no problem.
Ok.
> > Do you guys see any other case where my "special" vma & those kernel
> > pages in could be a problem ?
>
> It sounds just like a sound card DMA buffer to me - that's a solved
> problem? (Well, we keep unsolving it, but it's a relatively common
> pattern).
Might be ... though I thought the latter had VM_RESERVED or some similar
thing ... the trick with that vma is that i don't want any of these
things to allow for COW ... But yeah, it _looks_ like it will just work
(well... it appears to work so far anyway....)
Ben.
* Re: vDSO vs. mm : problems with ppc vdso
2006-02-28 6:30 ` Benjamin Herrenschmidt
@ 2006-02-28 6:47 ` Andrew Morton
2006-02-28 7:36 ` Benjamin Herrenschmidt
2006-02-28 12:13 ` Hugh Dickins
2006-02-28 10:24 ` Nick Piggin
1 sibling, 2 replies; 15+ messages in thread
From: Andrew Morton @ 2006-02-28 6:47 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: linux-mm, hugh, paulus, nickpiggin, davem
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
>
> > > I'll send the patch as a reply to this message.
> >
> > Please copy linux-arch.
>
> Did that.
You did not, you meanie.
> > > pages in could be a problem ?
> >
> > It sounds just like a sound card DMA buffer to me - that's a solved
> > problem? (Well, we keep unsolving it, but it's a relatively common
> > pattern).
>
> Might be ... though I thought the latter had VM_RESERVED or some similar
> thing ... the trick with that vma is that i don't want any of these
> things to allow for COW ... But yeah, it _looks_ like it will just work
> (well... it appears to work so far anyway....)
Hugh's the man - he loves that stuff.
* Re: vDSO vs. mm : problems with ppc vdso
2006-02-28 6:47 ` Andrew Morton
@ 2006-02-28 7:36 ` Benjamin Herrenschmidt
2006-02-28 12:13 ` Hugh Dickins
1 sibling, 0 replies; 15+ messages in thread
From: Benjamin Herrenschmidt @ 2006-02-28 7:36 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm, hugh, paulus, nickpiggin, davem
On Mon, 2006-02-27 at 22:47 -0800, Andrew Morton wrote:
> Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> >
> > > > I'll send the patch as a reply to this message.
> > >
> > > Please copy linux-arch.
> >
> > Did that.
>
> You did not, you meanie.
I did :) Under the title "[PATCH] Add mm->task_size and fix powerpc vdso".
Check the CC list :)
> Hugh's the man - he loves that stuff.
Ok.
* Re: vDSO vs. mm : problems with ppc vdso
2006-02-28 6:47 ` Andrew Morton
2006-02-28 7:36 ` Benjamin Herrenschmidt
@ 2006-02-28 12:13 ` Hugh Dickins
1 sibling, 0 replies; 15+ messages in thread
From: Hugh Dickins @ 2006-02-28 12:13 UTC (permalink / raw)
To: Andrew Morton, Benjamin Herrenschmidt; +Cc: linux-mm, paulus, nickpiggin, davem
On Mon, 27 Feb 2006, Andrew Morton wrote:
> Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote:
> >
> > > > I'll send the patch as a reply to this message.
> > >
> > > Please copy linux-arch.
> >
> > Did that.
>
> You did not, you meanie.
I couldn't see linux-arch in there either. I won't comment on the patch
as there are others with a _much_ better grasp of the 32-on-64 issues.
But yes, something like that is long overdue, it's been a recurrent
hassle not to have any indication in the mm.
> > > > pages in could be a problem ?
> > >
> > > It sounds just like a sound card DMA buffer to me - that's a solved
> > > problem? (Well, we keep unsolving it, but it's a relatively common
> > > pattern).
> >
> > Might be ... though I thought the latter had VM_RESERVED or some similar
> > thing ... the trick with that vma is that i don't want any of these
> > things to allow for COW ... But yeah, it _looks_ like it will just work
> > (well... it appears to work so far anyway....)
>
> Hugh's the man - he loves that stuff.
And here I am, limping along behind - wild applause as I enter the ring!
Ben, I agree completely with Andrew, you should be just fine with that
vma. I've noticed it in the past when checking users of insert_vm_struct,
and saw no problem with it. Andi copied that code to use in x86_64 a few
months back; and Fedora have something similar on i386 (though they use
install_page rather than nopage, and so have to patch install_page to
cope with !vma->vm_file).
Pages with NULL page->mapping pass through page_add_file_rmap and
page_remove_rmap without causing any stir, and nobody puts them on
the LRU anyway, and (in your case - one day we might worry more about
sound's case) you've only got one lot of these pages so we're not in
the least interested in freeing them under memory pressure. It is a
surprising case, but there are plenty of other examples of it: sleep soundly.
(But I didn't understand your comment "i don't want any of these things
to allow for COW" - I thought that was just what you are allowing for.)
Hugh
* Re: vDSO vs. mm : problems with ppc vdso
2006-02-28 6:30 ` Benjamin Herrenschmidt
2006-02-28 6:47 ` Andrew Morton
@ 2006-02-28 10:24 ` Nick Piggin
2006-02-28 12:32 ` Hugh Dickins
1 sibling, 1 reply; 15+ messages in thread
From: Nick Piggin @ 2006-02-28 10:24 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: Andrew Morton, linux-mm, hugh, paulus, davem
Benjamin Herrenschmidt wrote:
>>>Do you guys see any other case where my "special" vma & those kernel
>>>pages in could be a problem ?
>>
>>It sounds just like a sound card DMA buffer to me - that's a solved
>>problem? (Well, we keep unsolving it, but it's a relatively common
>>pattern).
>
>
> Might be ... though I thought the latter had VM_RESERVED or some similar
> thing ... the trick with that vma is that i don't want any of these
> things to allow for COW ... But yeah, it _looks_ like it will just work
> (well... it appears to work so far anyway....)
>
You should be OK. VM_RESERVED itself is something of an anachronism
these days. If you're not getting your page from the page allocator
then you'll want to make sure each of their count, and mapcount is
reset before allowing them to be mapped.
--
SUSE Labs, Novell Inc.
* Re: vDSO vs. mm : problems with ppc vdso
2006-02-28 10:24 ` Nick Piggin
@ 2006-02-28 12:32 ` Hugh Dickins
2006-02-28 17:55 ` Benjamin Herrenschmidt
2006-03-01 2:24 ` Nick Piggin
0 siblings, 2 replies; 15+ messages in thread
From: Hugh Dickins @ 2006-02-28 12:32 UTC (permalink / raw)
To: Nick Piggin
Cc: Benjamin Herrenschmidt, Andrew Morton, linux-mm, paulus, davem
On Tue, 28 Feb 2006, Nick Piggin wrote:
>
> You should be OK. VM_RESERVED itself is something of an anachronism
> these days. If you're not getting your page from the page allocator
> then you'll want to make sure each of their count, and mapcount is
> reset before allowing them to be mapped.
Yes, it's fine that VM_RESERVED isn't set on it.
But I don't understand your remarks about count and mapcount at all:
perhaps you meant to say something else?
If I ignore what you actually said, and think of what problems there
might be in that area, then yes, if the pages come from kernel memory
(they do) rather than page allocator, we'd better make sure page_count
starts above 0, so it doesn't go down to zero on last free from userspace:
and indeed, Ben's vdso_init does a get_page on each to ensure that.
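The effect of that extra reference can be shown with a toy refcount in C.
This is a sketch under the assumption that the page starts with a count of
zero; the toy_* helpers are hypothetical and only imitate the shape of
get_page()/put_page().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_page {
	int count;	/* toy reference count */
	bool freed;	/* set when the count drops to zero */
};

static void toy_get_page(struct toy_page *page)
{
	assert(page != NULL && !page->freed);
	page->count++;
}

static void toy_put_page(struct toy_page *page)
{
	assert(page != NULL && page->count > 0);
	if (--page->count == 0)
		page->freed = true;	/* would go back to the allocator */
}

/* One process faults the page in, then unmaps it again. */
static void toy_map_then_unmap(struct toy_page *page)
{
	toy_get_page(page);
	toy_put_page(page);
}
```

Without a reference taken at init time, the first unmap drives the count to
zero and the page is treated as free; with one taken up front, any number
of map/unmap cycles leaves the count at one.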
Hugh
* Re: vDSO vs. mm : problems with ppc vdso
2006-02-28 12:32 ` Hugh Dickins
@ 2006-02-28 17:55 ` Benjamin Herrenschmidt
2006-03-01 2:24 ` Nick Piggin
1 sibling, 0 replies; 15+ messages in thread
From: Benjamin Herrenschmidt @ 2006-02-28 17:55 UTC (permalink / raw)
To: Hugh Dickins; +Cc: Nick Piggin, Andrew Morton, linux-mm, paulus, davem
On Tue, 2006-02-28 at 12:32 +0000, Hugh Dickins wrote:
> On Tue, 28 Feb 2006, Nick Piggin wrote:
> >
> > You should be OK. VM_RESERVED itself is something of an anachronism
> > these days. If you're not getting your page from the page allocator
> > then you'll want to make sure each of their count, and mapcount is
> > reset before allowing them to be mapped.
>
> Yes, it's fine that VM_RESERVED isn't set on it.
> But I don't understand your remarks about count and mapcount at all:
> perhaps you meant to say something else?
Ah thanks, I was worried there too ;)
> If I ignore what you actually said, and think of what problems there
> might be in that area, then yes, if the pages come from kernel memory
> (they do) rather than page allocator, we'd better make sure page_count
> starts above 0, so it doesn't go down to zero on last free from userspace:
> and indeed, Ben's vdso_init does a get_page on each to ensure that.
Yup, I took care of that and that part seems to work. I don't touch
mapcount at all.
Ben.
* Re: vDSO vs. mm : problems with ppc vdso
2006-02-28 12:32 ` Hugh Dickins
2006-02-28 17:55 ` Benjamin Herrenschmidt
@ 2006-03-01 2:24 ` Nick Piggin
2006-03-01 2:26 ` Benjamin Herrenschmidt
1 sibling, 1 reply; 15+ messages in thread
From: Nick Piggin @ 2006-03-01 2:24 UTC (permalink / raw)
To: Hugh Dickins
Cc: Benjamin Herrenschmidt, Andrew Morton, linux-mm, paulus, davem
Hugh Dickins wrote:
> On Tue, 28 Feb 2006, Nick Piggin wrote:
>
>>You should be OK. VM_RESERVED itself is something of an anachronism
>>these days. If you're not getting your page from the page allocator
>>then you'll want to make sure each of their count, and mapcount is
>>reset before allowing them to be mapped.
>
>
> Yes, it's fine that VM_RESERVED isn't set on it.
> But I don't understand your remarks about count and mapcount at all:
> perhaps you meant to say something else?
>
Yes, count should be elevated.
mapcount should be reset, to avoid the bug in page_remove_rmap.
--
SUSE Labs, Novell Inc.
* Re: vDSO vs. mm : problems with ppc vdso
2006-03-01 2:24 ` Nick Piggin
@ 2006-03-01 2:26 ` Benjamin Herrenschmidt
2006-03-01 2:38 ` Nick Piggin
0 siblings, 1 reply; 15+ messages in thread
From: Benjamin Herrenschmidt @ 2006-03-01 2:26 UTC (permalink / raw)
To: Nick Piggin; +Cc: Hugh Dickins, Andrew Morton, linux-mm, paulus, davem
On Wed, 2006-03-01 at 13:24 +1100, Nick Piggin wrote:
> Hugh Dickins wrote:
> > On Tue, 28 Feb 2006, Nick Piggin wrote:
> >
> >>You should be OK. VM_RESERVED itself is something of an anachronism
> >>these days. If you're not getting your page from the page allocator
> >>then you'll want to make sure each of their count, and mapcount is
> >>reset before allowing them to be mapped.
> >
> >
> > Yes, it's fine that VM_RESERVED isn't set on it.
> > But I don't understand your remarks about count and mapcount at all:
> > perhaps you meant to say something else?
> >
>
> Yes, count should be elevated.
>
> mapcount should be reset, to avoid the bug in page_remove_rmap.
Can you be more explicit ?
Cheers,
Ben.
* Re: vDSO vs. mm : problems with ppc vdso
2006-03-01 2:26 ` Benjamin Herrenschmidt
@ 2006-03-01 2:38 ` Nick Piggin
0 siblings, 0 replies; 15+ messages in thread
From: Nick Piggin @ 2006-03-01 2:38 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: Hugh Dickins, Andrew Morton, linux-mm, paulus, davem
Benjamin Herrenschmidt wrote:
> On Wed, 2006-03-01 at 13:24 +1100, Nick Piggin wrote:
>>mapcount should be reset, to avoid the bug in page_remove_rmap.
>
>
> Can you be more explicit ?
>
reset_page_mapcount() -- if you don't already know that mapcount is
the right value.
It might not be unreasonable to say "bah my arch initialises it to
0, and I didn't care for it to be accounted in nr_mapped anyway",
however not using the mapcount accessors means you might break in
future if they change.
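Why the accessors matter can be sketched with a toy counter in C. The
sketch assumes the stored value is biased by -1 (so "unmapped" is stored
as -1), which is how kernels of this era appear to encode it; the toy_*
functions are hypothetical and only loosely follow the real ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_page {
	int _mapcount;	/* stored value; -1 means "not mapped" in this model */
};

/* The only place that knows the stored encoding of "zero mappings". */
static void toy_reset_page_mapcount(struct toy_page *page)
{
	assert(page != NULL);
	page->_mapcount = -1;
}

static int toy_page_mapcount(const struct toy_page *page)
{
	assert(page != NULL);
	return page->_mapcount + 1;
}

static void toy_page_add_rmap(struct toy_page *page)
{
	assert(page != NULL);
	page->_mapcount++;
}

/* Returns false where the real code would complain about a negative count. */
static bool toy_page_remove_rmap(struct toy_page *page)
{
	assert(page != NULL);
	page->_mapcount--;
	return toy_page_mapcount(page) >= 0;
}
```

A page initialised through toy_reset_page_mapcount() survives an add/remove
pair; one whose raw field was set by hand to the wrong encoding trips the
check on removal, and would do so silently if the encoding ever changed.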
--
SUSE Labs, Novell Inc.
* [PATCH] Add mm->task_size and fix powerpc vdso
2006-02-28 6:08 ` Benjamin Herrenschmidt
2006-02-28 6:20 ` Andrew Morton
@ 2006-02-28 6:27 ` Benjamin Herrenschmidt
1 sibling, 0 replies; 15+ messages in thread
From: Benjamin Herrenschmidt @ 2006-02-28 6:27 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm, hugh, paulus, nickpiggin, David S. Miller
This patch adds mm->task_size to keep track of the task size of a given
mm and uses that to fix the powerpc vdso so that it uses the mm task
size to decide what pages to fault in instead of the current thread
flags (which broke when ptracing).
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Index: linux-work/arch/powerpc/kernel/vdso.c
===================================================================
--- linux-work.orig/arch/powerpc/kernel/vdso.c	2005-11-29 10:56:02.000000000 +1100
+++ linux-work/arch/powerpc/kernel/vdso.c	2006-02-28 17:07:18.000000000 +1100
@@ -182,8 +182,8 @@ static struct page * vdso_vma_nopage(str
 	unsigned long offset = address - vma->vm_start;
 	struct page *pg;
 #ifdef CONFIG_PPC64
-	void *vbase = test_thread_flag(TIF_32BIT) ?
-		vdso32_kbase : vdso64_kbase;
+	void *vbase = (vma->vm_mm->task_size > TASK_SIZE_USER32) ?
+		vdso64_kbase : vdso32_kbase;
 #else
 	void *vbase = vdso32_kbase;
 #endif
Index: linux-work/fs/exec.c
===================================================================
--- linux-work.orig/fs/exec.c	2006-02-17 14:38:43.000000000 +1100
+++ linux-work/fs/exec.c	2006-02-28 17:05:50.000000000 +1100
@@ -885,6 +885,12 @@ int flush_old_exec(struct linux_binprm *
 	current->flags &= ~PF_RANDOMIZE;
 	flush_thread();
 
+	/* Set the new mm task size. We have to do that late because it may
+	 * depend on TIF_32BIT which is only updated in flush_thread() on
+	 * some architectures like powerpc
+	 */
+	current->mm->task_size = TASK_SIZE;
+
 	if (bprm->e_uid != current->euid || bprm->e_gid != current->egid ||
 	    file_permission(bprm->file, MAY_READ) ||
 	    (bprm->interp_flags & BINPRM_FLAGS_ENFORCE_NONDUMP)) {
Index: linux-work/include/linux/sched.h
===================================================================
--- linux-work.orig/include/linux/sched.h	2006-02-17 14:38:43.000000000 +1100
+++ linux-work/include/linux/sched.h	2006-02-28 17:03:52.000000000 +1100
@@ -299,6 +299,7 @@ struct mm_struct {
 				unsigned long pgoff, unsigned long flags);
 	void (*unmap_area) (struct mm_struct *mm, unsigned long addr);
 	unsigned long mmap_base;		/* base of mmap area */
+	unsigned long task_size;		/* size of task vm space */
 	unsigned long cached_hole_size;		/* if non-zero, the largest hole below free_area_cache */
 	unsigned long free_area_cache;		/* first hole of size cached_hole_size or larger */
 	pgd_t * pgd;
end of thread
Thread overview: 15+ messages
2006-02-28 5:39 vDSO vs. mm : problems with ppc vdso Benjamin Herrenschmidt
2006-02-28 5:54 ` Andrew Morton
2006-02-28 6:08 ` Benjamin Herrenschmidt
2006-02-28 6:20 ` Andrew Morton
2006-02-28 6:30 ` Benjamin Herrenschmidt
2006-02-28 6:47 ` Andrew Morton
2006-02-28 7:36 ` Benjamin Herrenschmidt
2006-02-28 12:13 ` Hugh Dickins
2006-02-28 10:24 ` Nick Piggin
2006-02-28 12:32 ` Hugh Dickins
2006-02-28 17:55 ` Benjamin Herrenschmidt
2006-03-01 2:24 ` Nick Piggin
2006-03-01 2:26 ` Benjamin Herrenschmidt
2006-03-01 2:38 ` Nick Piggin
2006-02-28 6:27 ` [PATCH] Add mm->task_size and fix powerpc vdso Benjamin Herrenschmidt