From: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
From: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
To: Andrew Morton <akpm@osdl.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH RFP-V4 11/13] RFP prot support: fix get_user_pages() on VM_MANYPROTS vmas
Date: Sat, 26 Aug 2006 19:42:46 +0200 [thread overview]
Message-ID: <20060826174246.14790.94903.stgit@memento.home.lan> (raw)
In-Reply-To: <200608261933.36574.blaisorblade@yahoo.it>
*Non unit-tested patch* - I've not written a test case to verify functionality of
ptrace on VM_MANYPROTS area.
get_user_pages may well call __handle_mm_fault() wanting to override protections,
so in this case __handle_mm_fault() should still avoid checking VM access rights.
Also, get_user_pages() may give write faults on present readonly PTEs in
VM_MANYPROTS areas (think of PTRACE_POKETEXT), so we must still do do_wp_page
even on VM_MANYPROTS areas.
So, possibly use VM_MAYWRITE and/or VM_MAYREAD in the access_mask and check
VM_MANYPROTS in maybe_mkwrite_file (new variant of maybe_mkwrite).
API Note: there are many flags parameter which can be constructed but which
don't make any sense, but the code very freely interprets them too.
For instance VM_MAYREAD|VM_WRITE is interpreted as VM_MAYWRITE|VM_WRITE.
XXX: Todo: add checking to reject all meaningless flag combination.
====
pte_to_pgprot is to be used only with encoded PTEs.
This is needed since now pte_to_pgprot does heavy changes to the pte, it looks
for _PAGE_FILE_PROTNONE and translates it to _PAGE_PROTNONE.
---
include/linux/mm.h | 17 ++++++++++++++++-
mm/memory.c | 43 ++++++++++++++++++++++++++++++++-----------
2 files changed, 48 insertions(+), 12 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 67fe661..b04973f 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -747,7 +747,22 @@ extern int install_file_pte(struct mm_st
#ifdef CONFIG_MMU
-/* We reuse VM_READ, VM_WRITE and VM_EXEC for the @access_mask. */
+/* We reuse VM_READ, VM_WRITE and (optionally) VM_EXEC for the @access_mask, to
+ * report the kind of access we request for permission checking, in case the VMA
+ * is VM_MANYPROTS.
+ *
+ * get_user_pages( force == 1 ) is a special case. It's allowed to override
+ * protection checks, even on VM_MANYPROTS vma.
+ *
+ * To express that, it must replace VM_READ / VM_WRITE with the corresponding
+ * MAY flags.
+ * This allows to force copying COW pages to break sharing even on read-only
+ * page table entries.
+ * PITFALL: you're not allowed to override only part of the checks, and in
+ * general specifying strange combinations of flags may lead to unspecified
+ * results.
+ */
+
extern int __handle_mm_fault(struct mm_struct *mm,struct vm_area_struct *vma,
unsigned long address, int access_mask);
diff --git a/mm/memory.c b/mm/memory.c
index 992d877..db71673 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1058,16 +1058,17 @@ int get_user_pages(struct task_struct *t
cond_resched();
while (!(page = follow_page(vma, start, foll_flags))) {
int ret;
- ret = __handle_mm_fault(mm, vma, start,
- foll_flags & FOLL_WRITE);
+ ret = __handle_mm_fault(mm, vma, start, vm_flags);
/*
* The VM_FAULT_WRITE bit tells us that do_wp_page has
* broken COW when necessary, even if maybe_mkwrite
* decided not to set pte_write. We can thus safely do
* subsequent page lookups as if they were reads.
*/
- if (ret & VM_FAULT_WRITE)
+ if (ret & VM_FAULT_WRITE) {
foll_flags &= ~FOLL_WRITE;
+ vm_flags &= ~(VM_WRITE|VM_MAYWRITE);
+ }
switch (ret & ~VM_FAULT_WRITE) {
case VM_FAULT_MINOR:
@@ -1402,7 +1403,20 @@ #endif
* servicing faults for write access. In the normal case, do always want
* pte_mkwrite. But get_user_pages can cause write faults for mappings
* that do not have writing enabled, when used by access_process_vm.
+ *
+ * Also, we must never change protections on VM_MANYPROTS pages; that's only
+ * allowed in do_no_page(), so test only VMA protections there. For other cases
+ * we *know* that VM_MANYPROTS is clear, such as anonymous/swap pages, and in
+ * that case using plain maybe_mkwrite() is an optimization.
+ * Instead, when we may be mapping a file, we must use maybe_mkwrite_file.
*/
+static inline pte_t maybe_mkwrite_file(pte_t pte, struct vm_area_struct *vma)
+{
+ if (likely((vma->vm_flags & (VM_WRITE | VM_MANYPROTS)) == VM_WRITE))
+ pte = pte_mkwrite(pte);
+ return pte;
+}
+
static inline pte_t maybe_mkwrite(pte_t pte, struct vm_area_struct *vma)
{
if (likely(vma->vm_flags & VM_WRITE))
@@ -1454,6 +1468,9 @@ static inline void cow_user_page(struct
* We enter with non-exclusive mmap_sem (to exclude vma changes,
* but allow concurrent faults), with pte both mapped and locked.
* We return with mmap_sem still held, but pte unmapped and unlocked.
+ *
+ * Note that a page here can be a shared readonly page where
+ * get_user_pages() (for instance for ptrace()) wants to write to it!
*/
static int do_wp_page(struct mm_struct *mm, struct vm_area_struct *vma,
unsigned long address, pte_t *page_table, pmd_t *pmd,
@@ -1509,7 +1526,8 @@ static int do_wp_page(struct mm_struct *
if (reuse) {
flush_cache_page(vma, address, pte_pfn(orig_pte));
entry = pte_mkyoung(orig_pte);
- entry = maybe_mkwrite(pte_mkdirty(entry), vma);
+ /* Since it can be shared, it can be VM_MANYPROTS! */
+ entry = maybe_mkwrite_file(pte_mkdirty(entry), vma);
ptep_set_access_flags(vma, address, page_table, entry, 1);
update_mmu_cache(vma, address, entry);
lazy_mmu_prot_update(entry);
@@ -1552,7 +1570,7 @@ gotten:
inc_mm_counter(mm, anon_rss);
flush_cache_page(vma, address, pte_pfn(orig_pte));
entry = mk_pte(new_page, vma->vm_page_prot);
- entry = maybe_mkwrite(pte_mkdirty(entry), vma);
+ entry = maybe_mkwrite_file(pte_mkdirty(entry), vma);
lazy_mmu_prot_update(entry);
ptep_establish(vma, address, page_table, entry);
update_mmu_cache(vma, address, entry);
@@ -1982,7 +2000,7 @@ static int do_swap_page(struct mm_struct
inc_mm_counter(mm, anon_rss);
pte = mk_pte(page, vma->vm_page_prot);
if (write_access && can_share_swap_page(page)) {
- pte = maybe_mkwrite(pte_mkdirty(pte), vma);
+ pte = maybe_mkwrite_file(pte_mkdirty(pte), vma);
write_access = 0;
}
@@ -2305,7 +2323,7 @@ static inline int handle_pte_fault(struc
pte_t entry;
pte_t old_entry;
spinlock_t *ptl;
- int write_access = access_mask & VM_WRITE;
+ int write_access = access_mask & (VM_WRITE|VM_MAYWRITE);
old_entry = entry = *pte;
if (!pte_present(entry)) {
@@ -2313,7 +2331,7 @@ static inline int handle_pte_fault(struc
* we need to check VM_MANYPROTS, because in that case the arch
* fault handler skips the VMA protection check. */
if (!pte_file(entry) && unlikely(insufficient_vma_perms(vma, access_mask)))
- goto out_segv;
+ goto segv;
if (pte_none(entry)) {
if (!vma->vm_ops || !vma->vm_ops->nopage)
@@ -2340,8 +2358,9 @@ static inline int handle_pte_fault(struc
* condition with another thread having already fixed the fault. So go
* the slow way. */
if (unlikely(vma->vm_flags & VM_MANYPROTS) &&
- unlikely(insufficient_perms(entry, access_mask)))
- goto out_segv;
+ unlikely(insufficient_perms(entry, access_mask)) &&
+ likely(!(access_mask & (VM_MAYWRITE|VM_MAYREAD))))
+ goto segv_unlock;
if (write_access) {
if (!pte_write(entry))
@@ -2367,8 +2386,10 @@ static inline int handle_pte_fault(struc
unlock:
pte_unmap_unlock(pte, ptl);
return VM_FAULT_MINOR;
-out_segv:
+
+segv_unlock:
pte_unmap_unlock(pte, ptl);
+segv:
return VM_FAULT_SIGSEGV;
}
Chiacchiera con i tuoi amici in tempo reale!
http://it.yahoo.com/mail_it/foot/*http://it.messenger.yahoo.com
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2006-08-26 17:42 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2006-08-26 17:33 [PATCH RFP-V4 00/13] remap_file_pages protection support - 4th attempt Blaisorblade
2006-08-26 17:37 ` [PATCH RFP-V4 03/13] RFP prot support: add needed macros Paolo 'Blaisorblade' Giarrusso, Paolo 'Blaisorblade' Giarrusso
2006-08-26 17:42 ` [PATCH RFP-V4 01/13] RFP: new bitmask_trans in <linux/bitops.h> Paolo 'Blaisorblade' Giarrusso, Paolo 'Blaisorblade' Giarrusso
2006-08-26 17:42 ` [PATCH RFP-V4 02/13] Fix comment about remap_file_pages Paolo 'Blaisorblade' Giarrusso, Paolo 'Blaisorblade' Giarrusso
2006-08-26 17:42 ` [PATCH RFP-V4 03/13] RFP prot support: add needed macros Paolo 'Blaisorblade' Giarrusso, Paolo 'Blaisorblade' Giarrusso
2006-08-26 17:42 ` [PATCH RFP-V4 04/13] RFP prot support: handle MANYPROTS VMAs Paolo 'Blaisorblade' Giarrusso, Paolo 'Blaisorblade' Giarrusso
2006-08-26 17:42 ` [PATCH RFP-V4 05/13] RFP prot support: disallow mprotect() on manyprots mappings Paolo 'Blaisorblade' Giarrusso, Paolo 'Blaisorblade' Giarrusso
2006-08-26 17:42 ` [PATCH RFP-V4 06/13] RFP prot support: cleanup syscall checks Paolo 'Blaisorblade' Giarrusso, Paolo 'Blaisorblade' Giarrusso
2006-08-26 17:42 ` [PATCH RFP-V4 07/13] RFP prot support: enhance syscall interface Paolo 'Blaisorblade' Giarrusso, Ingo Molnar, Paolo 'Blaisorblade' Giarrusso
2006-08-26 17:42 ` [PATCH RFP-V4 08/13] RFP prot support: support private vma for MAP_POPULATE Paolo 'Blaisorblade' Giarrusso, Ingo Molnar
2006-08-26 17:42 ` [PATCH RFP-V4 09/13] RFP prot support: use FAULT_SIGSEGV for protection checking Paolo 'Blaisorblade' Giarrusso, Paolo 'Blaisorblade' Giarrusso, Ingo Molnar
2006-08-26 17:42 ` [PATCH RFP-V4 10/13] RFP prot support: fix race condition with concurrent faults on same address space Paolo 'Blaisorblade' Giarrusso, Paolo 'Blaisorblade' Giarrusso
2006-08-26 17:42 ` Paolo 'Blaisorblade' Giarrusso, Paolo 'Blaisorblade' Giarrusso [this message]
2006-08-26 17:42 ` [PATCH RFP-V4 12/13] RFP prot support: also set VM_NONLINEAR on nonuniform VMAs Paolo 'Blaisorblade' Giarrusso, Paolo 'Blaisorblade' Giarrusso
2006-08-26 17:42 ` [PATCH RFP-V4 13/13] RFP prot support: uml, i386, x64 bits Paolo 'Blaisorblade' Giarrusso, Paolo 'Blaisorblade' Giarrusso, Ingo Molnar
2006-08-28 20:49 ` [PATCH RFP-V4 00/13] remap_file_pages protection support - 4th attempt Andrew Morton
2006-08-29 8:22 ` Paolo Giarrusso
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20060826174246.14790.94903.stgit@memento.home.lan \
--to=blaisorblade@yahoo.it \
--cc=akpm@osdl.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox