linux-mm.kvack.org archive mirror
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: linux-mm@kvack.org
Cc: Hugh Dickins <hugh@veritas.com>,
	akpm@osdl.org, Paul Mackerras <paulus@samba.org>,
	Nick Piggin <nickpiggin@yahoo.com.au>
Subject: vDSO vs. mm : problems with ppc vdso
Date: Tue, 28 Feb 2006 16:39:14 +1100
Message-ID: <1141105154.3767.27.camel@localhost.localdomain>

is for 2.6.16 and see if we want to do something about it).

I have discovered some issues with my vDSO implementation that went
unnoticed so far but might cause problems with the VM.

The problems are related to the way the powerpc vDSO is implemented in
order to support COW (for breakpoints) and randomisation. It's not
implemented as a gate_area() hack. Instead, I create a vma at process
exec (see arch_setup_additional_pages() in arch/powerpc/kernel/vdso.c,
which is called from binfmt_elf.c).

This vma has custom vm_ops with a nopage() function that maps in pages
from the vdso on demand. Those pages are kernel pages shared by all
processes at first, though if a COW happens, they will be replaced by
normal anonymous pages by the normal COW code.
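
As a rough illustration of the on-demand mapping described above, here
is a userspace C model, not kernel code. All names (vdso_range,
vdso_page_index, MODEL_PAGE_SHIFT) are hypothetical and a 4K page size
is assumed; the real handler lives in arch/powerpc/kernel/vdso.c and
hands back a page rather than an index.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4K pages. The real kernel uses PAGE_SHIFT. */
#define MODEL_PAGE_SHIFT 12UL

/* Hypothetical stand-in for the vdso vma: just its address range. */
struct vdso_range {
	unsigned long vm_start;
	unsigned long vm_end;	/* exclusive */
};

/*
 * Model of what a nopage-style handler has to compute: which page of
 * the shared vdso image backs the faulting address. Returns -1 when the
 * address is outside the range or past the end of the image.
 */
long vdso_page_index(const struct vdso_range *vma, unsigned long address,
		     size_t vdso_pages)
{
	unsigned long index;

	assert(vma != NULL && vma->vm_start <= vma->vm_end);

	if (address < vma->vm_start || address >= vma->vm_end)
		return -1;

	index = (address - vma->vm_start) >> MODEL_PAGE_SHIFT;
	if (index >= vdso_pages)
		return -1;

	return (long)index;
}
```

In this model every process faulting at the same offset gets the same
index, which is the "shared by all processes" part; COW is not modelled.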

A first problem happens here (though it's not my main concern right now;
it's a bug I need to fix, but at least I have a good handle on it). The
nopage function decides whether to map the pages from the 32-bit or the
64-bit vdso based on test_thread_flag(), which tests the current thread.
This is broken if those pages end up being faulted in as the result of a
get_user_pages() done by another process. Typically, that means that a
64-bit gdb tracing a 32-bit program will fault the wrong pages in. So I
need a way to "know" which vdso to fault in based on the vma... that
will require me to either hack something into the vma (stuff a flag
somewhere?) or find a way to tell a 32-bit vma from a 64-bit vma...
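
To make the gdb case concrete, here is a small userspace C model, not
kernel code. The types and the vdso_is_32bit marker are hypothetical;
it only contrasts "decide from the faulting task" with "decide from the
vma", under the assumption that some per-vma marker can be recorded at
exec time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum vdso_kind { VDSO_32, VDSO_64 };

/* Hypothetical stand-in for the task that is executing the fault path. */
struct model_task {
	bool is_32bit;	/* models the thread flag read by test_thread_flag() */
};

/* Hypothetical stand-in for the vdso vma, carrying its own marker. */
struct model_vma {
	bool vdso_is_32bit;	/* assumed to be recorded when the vma is set up */
};

/* Behaviour as described above: decide from whoever is faulting. */
enum vdso_kind pick_by_current(const struct model_task *faulting_task)
{
	assert(faulting_task != NULL);
	return faulting_task->is_32bit ? VDSO_32 : VDSO_64;
}

/* One possible direction: decide from the vma being faulted. */
enum vdso_kind pick_by_vma(const struct model_vma *vma)
{
	assert(vma != NULL);
	return vma->vdso_is_32bit ? VDSO_32 : VDSO_64;
}
```

With a 64-bit tracer (is_32bit false) touching a 32-bit tracee's vdso
vma, pick_by_current() selects the 64-bit image while pick_by_vma()
selects the 32-bit one.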

The second problem is more subtle and that's where I really need a VM
guru to help me assess how bad the situation is and what should be done
to fix it.

When not COWed, those vDSO pages are actually kernel pages mapped into
every process, so they aren't per se anonymous pages, nor file pages...
in fact, they don't quite fit in anything rmap knows about. However, I
can't mark the VMA as VM_RESERVED or anything like that since that would
prevent COW from working.

Thus we hit some "interesting" code paths in rmap, such as:

 - page_address_in_vma() will always fail for those pages afaik, since
they are neither PageAnon() nor have a page->mapping. Not sure of the
consequences at this point.

 - page_referenced() will not get into any of the code paths under "if
(page_mapped(page) && page->mapping) {" thanks to page->mapping being
NULL afaik. I think that's a good thing in this case: we rely solely on
the PTE information for these pages.

 - try_to_unmap() gets more funny... It will call try_to_unmap_file().
Maybe it shouldn't... maybe I should set PageLocked() on the vdso's
kernel pages, though I would have to dig through the possible side
effects of that (notably vs. COW). If that works though, it may be a
good workaround to avoid nasty code paths in the VM.

 - If we hit try_to_unmap_one(), we'll probably do dec_mm_counter(mm,
file_rss). But file_rss was never incremented when the page was faulted
in in the first place, was it? Those shared kernel pages shouldn't be
accounted there anyway.

 - There may be other problematic code paths outside of rmap.c that I
missed.
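
The try_to_unmap() and file_rss points above can be sketched with a
userspace C model, not kernel code. Everything here (model_page,
model_mm, the function names) is hypothetical and heavily simplified,
and it assumes the vdso fault path really does leave file_rss alone,
which is only a suspicion at this point.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for struct page and the mm counters. */
struct model_page {
	bool anon;		/* models PageAnon(page) */
	const void *mapping;	/* models page->mapping for file pages */
};

struct model_mm {
	long anon_rss;
	long file_rss;
};

enum unmap_path { UNMAP_ANON, UNMAP_FILE };

/* Models the dispatch in try_to_unmap(): anything not anon goes the file way. */
enum unmap_path unmap_path_for(const struct model_page *page)
{
	assert(page != NULL);
	return page->anon ? UNMAP_ANON : UNMAP_FILE;
}

/* Models the vdso fault as suspected above: no rss counter is touched. */
void fault_in_vdso_page(struct model_mm *mm, const struct model_page *page)
{
	(void)mm;
	(void)page;
}

/* Models the counter update done when a mapping is torn down. */
void unmap_one(struct model_mm *mm, const struct model_page *page)
{
	if (unmap_path_for(page) == UNMAP_ANON)
		mm->anon_rss--;
	else
		mm->file_rss--;
}
```

A non-COWed vdso page has anon false and mapping NULL, so in this model
it takes the file path, and one fault followed by one unmap leaves
file_rss one below where it started.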

I'd really like to assess the situation and maybe get a few band-aids
into 2.6.16 if proper fixes are too complicated...

Thanks !

Ben.




Thread overview: 15+ messages
2006-02-28  5:39 Benjamin Herrenschmidt [this message]
2006-02-28  5:54 ` Andrew Morton
2006-02-28  6:08   ` Benjamin Herrenschmidt
2006-02-28  6:20     ` Andrew Morton
2006-02-28  6:30       ` Benjamin Herrenschmidt
2006-02-28  6:47         ` Andrew Morton
2006-02-28  7:36           ` Benjamin Herrenschmidt
2006-02-28 12:13           ` Hugh Dickins
2006-02-28 10:24         ` Nick Piggin
2006-02-28 12:32           ` Hugh Dickins
2006-02-28 17:55             ` Benjamin Herrenschmidt
2006-03-01  2:24             ` Nick Piggin
2006-03-01  2:26               ` Benjamin Herrenschmidt
2006-03-01  2:38                 ` Nick Piggin
2006-02-28  6:27     ` [PATCH] Add mm->task_size and fix powerpc vdso Benjamin Herrenschmidt
