linux-mm.kvack.org archive mirror
* Re: Assumed Failure rates in Various o.s's ?
       [not found] <199905191428.QAA1295681@beryllium.daimi.au.dk>
@ 1999-05-19 17:37 ` Kanoj Sarcar
  1999-05-21 10:07   ` Erik Corry
  0 siblings, 1 reply; 7+ messages in thread
From: Kanoj Sarcar @ 1999-05-19 17:37 UTC
  To: Erik Corry; +Cc: ak-uu, Linux-MM

> 
> It's rather a pity no one has ported it to the 386 then.
> You raised the problem yourself in the post leading up
> to http://x35.deja.com/=dnc/[ST_rn=ps]/getdoc.xp?AN=467741389
> and as far as I know this was never resolved.

Unfortunately, I couldn't quite trace back to the roots of this
thread, so I am guessing at what the problem is by looking at the
replies to the original post. Maybe one of you guys can explain 
why my proposed fix down below will not work ...

> 
> Perhaps something could be done about this.  Like rechecking
> when a blocked thread wakes up again in the middle of a 
> copy_from_fs.
> 
> > appropriate implementations of the *_user functions/macros. Actually Linux
> > versions up to 2.0 used exactly such a "software MMU" scheme for user space
> > addresses from the kernel. Something similar was done recently by some guy
> > at SGI to support up to 3.8GB of physical memory. In his patch kernel and
> > user space have separate page tables, which means that the kernel has to check
> > page tables by hand when it accesses user space. It all works by just changing
> > some architecture specific files and macros in asm/uaccess.h - generic Linux
> > code is not touched.
> 
> > See http://www.linux.sgi.com/intel/bigmem/

Remember, this patch has not been fully tested, and I have only tested
it on i686 (that's where big memory is interesting anyway). It should
have the same bugs that Linux has, plus some more (which I am hoping
sharp reviewers like you will be able to point out).

> 
> Did this stuff work around the scenario in the link above?
> He doesn't mention a workaround, and because SMP is possible
> it seems like it would need some kind of locking to prevent
> mmaps/munmaps while a copy_to_fs is taking place.  I didn't
> look at the patch.
> 
> -- 
> Erik Corry erik@arbat.com     Ceterum censeo, Microsoftem esse delendam!
> 

I think my patch might actually help your situation, given that the
*software* is checking the pte bits and making decisions about writability,
rather than relying on broken *hardware* which ignores the pte writability
bit.

Now for a proposal: I don't see a down(mm->mmap_sem) being done
in the code path leading up to calls to __verify_write. Am I missing
it? If a down(mm->mmap_sem) were added around __verify_write, you could
quit worrying about simultaneous munmaps while a user access function
was executing. 

Kanoj
--
To unsubscribe, send a message with 'unsubscribe linux-mm my@address'
in the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://humbolt.geo.uu.nl/Linux-MM/

* Re: Assumed Failure rates in Various o.s's ?
  1999-05-19 17:37 ` Assumed Failure rates in Various o.s's ? Kanoj Sarcar
@ 1999-05-21 10:07   ` Erik Corry
  1999-05-21 14:25     ` Benjamin C.R. LaHaise
  1999-05-21 17:23     ` Kanoj Sarcar
  0 siblings, 2 replies; 7+ messages in thread
From: Erik Corry @ 1999-05-21 10:07 UTC
  To: Kanoj Sarcar; +Cc: ak-uu, Linux-MM

On Wed, May 19, 1999 at 10:37:42AM -0700, Kanoj Sarcar wrote:
> > 
> > to http://x35.deja.com/=dnc/[ST_rn=ps]/getdoc.xp?AN=467741389
> 
> Unfortunately, I couldn't quite trace back to the roots of this
> thread,

You can click on the 'Thread' button to get an overview of the
thread.  I don't think there is an actual bug demonstration or
exploit available.

> I think my patch might actually help your situation, given that the
> *software* is checking the pte bits and making decisions about writability,
> rather than relying on broken *hardware* which ignores the pte writability
> bit.

Yes.  Though the performance hit would be even worse on the i386.

> Now for a proposal: I don't see a down(mm->mmap_sem) being done
> in the code path leading up to calls to __verify_write. Am I missing
> it? If a down(mm->mmap_sem) were added around __verify_write, you could
> quit worrying about simultaneous munmaps while a user access function
> was executing. 

I think this is the wrong place.  As far as I understand it,
the verify_write runs before the actual copying takes place.
So after verify_write has run, while the copy_to_user is
taking place there can be a page fault (is that even necessary
on SMP?).  While that is happening, the black hat user can do
an mmap/munmap in another thread.  But I haven't really looked
into it much, I am relying mostly on hearsay here.

According to Andi you already fixed this with a read lock that
prevents mmap and munmap from doing anything while the copy
is running.  This makes sense, since if you do it right with a
readers/writers lock you can keep out mmap without serialising
copy_to_user or copy_from_user.
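
A minimal single-threaded C model of the readers/writers admission rule
described above: any number of copies may be in flight together, while a
mapping change is admitted only when none are. The names (toy_rwlock,
toy_copy_begin, toy_munmap_begin) are hypothetical, and the model is not a
real lock (no atomics, no blocking); a return of false stands in for a
caller that would sleep in a real kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a readers/writers lock. NOT thread-safe: it only models
 * who would be admitted, with "false" standing in for "would block". */
struct toy_rwlock {
    int  readers;  /* user copies currently in flight */
    bool writer;   /* a mapping change is in progress */
};

/* A user copy enters as a reader; many may hold the lock at once. */
static bool toy_copy_begin(struct toy_rwlock *l)
{
    if (l->writer)
        return false;
    l->readers++;
    return true;
}

static void toy_copy_end(struct toy_rwlock *l)
{
    assert(l->readers > 0);
    l->readers--;
}

/* munmap enters as the writer: admitted only with no copy in flight. */
static bool toy_munmap_begin(struct toy_rwlock *l)
{
    if (l->writer || l->readers > 0)
        return false;
    l->writer = true;
    return true;
}

static void toy_munmap_end(struct toy_rwlock *l)
{
    assert(l->writer);
    l->writer = false;
}
```

Two overlapping toy_copy_begin calls both succeed, which is the point of
the scheme: the copies exclude munmap without serialising each other.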

-- 
Erik Corry erik@arbat.com     Ceterum censeo, Microsoftem esse delendam!

* Re: Assumed Failure rates in Various o.s's ?
  1999-05-21 10:07   ` Erik Corry
@ 1999-05-21 14:25     ` Benjamin C.R. LaHaise
  1999-05-21 14:54       ` Erik Corry
  1999-05-21 17:06       ` Kanoj Sarcar
  1999-05-21 17:23     ` Kanoj Sarcar
  1 sibling, 2 replies; 7+ messages in thread
From: Benjamin C.R. LaHaise @ 1999-05-21 14:25 UTC
  To: Erik Corry; +Cc: Kanoj Sarcar, ak-uu, linux-mm

On Fri, 21 May 1999, Erik Corry wrote:

> According to Andi you already fixed this with a read lock that
> prevents mmap and munmap from doing anything while the copy
> is running.  This makes sense, since if you do it right with a
> readers/writers lock you can keep out mmap without serialising
> copy_to_user or copy_from_user.

I really like the cleanliness of this approach, but it's troublesome:
memory allocations in other threads would then get blocked during large
IOs -- very bad.  What if we instead move from the mm level semaphore to a
per vma locking scheme?  The mmap semaphore could become a spinlock for
fudging with list of vmas, and mmap/page faults/... could lock the
specific vma.  Or would this be too heavy?

		-ben

* Re: Assumed Failure rates in Various o.s's ?
  1999-05-21 14:25     ` Benjamin C.R. LaHaise
@ 1999-05-21 14:54       ` Erik Corry
  1999-05-21 16:02         ` Benjamin C.R. LaHaise
  1999-05-21 17:06       ` Kanoj Sarcar
  1 sibling, 1 reply; 7+ messages in thread
From: Erik Corry @ 1999-05-21 14:54 UTC
  To: Benjamin C.R. LaHaise; +Cc: Kanoj Sarcar, ak-uu, linux-mm

On Fri, May 21, 1999 at 10:25:42AM -0400, Benjamin C.R. LaHaise wrote:
> On Fri, 21 May 1999, Erik Corry wrote:
> 
> > According to Andi you already fixed this with a read lock that
> > prevents mmap and munmap from doing anything while the copy
> > is running.  This makes sense, since if you do it right with a
> > readers/writers lock you can keep out mmap without serialising
> > copy_to_user or copy_from_user.
> 
> I really like the cleanliness of this approach, but it's troublesome:
> memory allocations in other threads would then get blocked during large
> IOs -- very bad.

Actually, isn't it just munmap that is problematic?

After the access_ok you can't map a read-only file into the
path of an oncoming copy_to_user without first unmapping
what was there before (this is assuming a version of
access_ok that checks whether something was mapped).
So mmaps can safely happen in parallel with copy_to_user.

-- 
Erik Corry erik@arbat.com           Ceterum censeo, Microsoftem esse delendam!

* Re: Assumed Failure rates in Various o.s's ?
  1999-05-21 14:54       ` Erik Corry
@ 1999-05-21 16:02         ` Benjamin C.R. LaHaise
  0 siblings, 0 replies; 7+ messages in thread
From: Benjamin C.R. LaHaise @ 1999-05-21 16:02 UTC
  To: Erik Corry; +Cc: Kanoj Sarcar, ak-uu, linux-mm

On Fri, 21 May 1999, Erik Corry wrote:

> Actually, isn't it just munmap that is problematic?
> 
> After the access_ok you can't map a read-only file into the
> path of an oncoming copy_to_user without first unmapping
> what was there before (this is assuming a version of
> access_ok that checks whether something was mapped).
> So mmaps can safely happen in parallel with copy_to_user.

Both mmap and munmap are safe -- the i386 bug is that writes to read-only
pages succeed while in the kernel.  Mmap needs to lock the vma during
initialization in case the driver has to sleep.  To avoid the bug, we just
need to protect against making any pages read-only in the vma after the vma
is in a safe state: fork, read mappings of non-present pages, swapout --
just about anything that can modify the page table can put a read-only
page in place.
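
The hardware behaviour behind this can be modelled in a few lines of C.
This is a sketch under assumptions: the function names are hypothetical,
the page is taken to be present, and the cr0_wp flag models the CR0.WP
control bit that the 486 and later provide and the 386 lacks. User-mode
writes always honour the pte write bit; supervisor-mode writes honour it
only when WP is set.

```c
#include <assert.h>
#include <stdbool.h>

/* Would the MMU let this write through? Toy model of a present page. */
static bool toy_hw_write_allowed(bool pte_writable, bool supervisor,
                                 bool cr0_wp)
{
    if (!supervisor)
        return pte_writable;        /* user mode: pte bit is enforced */
    return pte_writable || !cr0_wp; /* kernel mode: enforced only with WP */
}

/* Without WP the kernel cannot rely on a fault, so a software check of
 * the pte is needed before writing to user memory. */
static bool toy_needs_software_check(bool cr0_wp)
{
    /* Sanity check of the model: with WP set, a kernel write to a
     * read-only page is refused by the hardware itself. */
    assert(!toy_hw_write_allowed(false, true, true));
    return !cr0_wp;
}
```

With cr0_wp false, a kernel-mode write to a read-only page is allowed by
the model, which is the case the software check has to cover.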

		-ben

* Re: Assumed Failure rates in Various o.s's ?
  1999-05-21 14:25     ` Benjamin C.R. LaHaise
  1999-05-21 14:54       ` Erik Corry
@ 1999-05-21 17:06       ` Kanoj Sarcar
  1 sibling, 0 replies; 7+ messages in thread
From: Kanoj Sarcar @ 1999-05-21 17:06 UTC
  To: Benjamin C.R. LaHaise; +Cc: erik, ak-uu, linux-mm

> 
> On Fri, 21 May 1999, Erik Corry wrote:
> 
> > According to Andi you already fixed this with a read lock that
> > prevents mmap and munmap from doing anything while the copy
> > is running.  This makes sense, since if you do it right with a
> > readers/writers lock you can keep out mmap without serialising
> > copy_to_user or copy_from_user.
> 
> I really like the cleanliness of this approach, but it's troublesome:
> memory allocations in other threads would then get blocked during large
> IOs -- very bad.  What if we instead move from the mm level semaphore to a
> per vma locking scheme?  The mmap semaphore could become a spinlock for
> fudging with list of vmas, and mmap/page faults/... could lock the
> specific vma.  Or would this be too heavy?
>

I am sorry I did not clear up this misconception in your original mail.
Though the uaccess procedures in my patch are called upage_rlock/upage_wlock,
they do not do any kind of locking. The code looks at the pte, decides
whether it is in a readable/writable state, and if so, fastpaths out,
returning a kernel virtual address for the user page, that kernel code can
use (without incurring faults). The reason this will work is because 
uaccess callers already have the kernel_lock, so no one can steal the
page (or munmap it). If the page is not in the proper state for the
access, then the procedure longpaths into grabbing the mmap_sem and 
doing a handle_mm_fault, which it keeps on doing until the page is in
the proper state. 
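
A rough C sketch of the fast path / long path split described above, not
the code from the patch. Everything here is a hypothetical stand-in: the
pte flags are simplified rather than the real i386 layout,
toy_handle_fault plays the role of handle_mm_fault, sem_held plays the
role of mmap_sem, and the function returns a success flag instead of a
kernel virtual address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified, hypothetical pte flags. */
enum { PTE_PRESENT = 1u << 0, PTE_WRITE = 1u << 1 };

struct toy_mm {
    unsigned pte[8];       /* one toy pte per user page */
    bool     vma_writable; /* whether the toy mapping permits writes */
    int      sem_held;     /* stand-in for mm->mmap_sem */
    int      slow_paths;   /* counts long-path entries, for illustration */
};

/* Is the pte already in the right state for this access? */
static bool toy_pte_ok(unsigned pte, bool write)
{
    if (!(pte & PTE_PRESENT))
        return false;
    return !write || (pte & PTE_WRITE) != 0;
}

/* Stand-in for handle_mm_fault(): move one page toward the wanted state.
 * Returns false if the mapping does not permit the access at all. */
static bool toy_handle_fault(struct toy_mm *mm, size_t page, bool write)
{
    assert(mm->sem_held == 1); /* the long path must hold the semaphore */
    if (write && !mm->vma_writable)
        return false;
    mm->pte[page] |= PTE_PRESENT | (write ? PTE_WRITE : 0u);
    return true;
}

/* Check the pte in software; fault the page in only when needed. */
static bool toy_upage_check(struct toy_mm *mm, size_t page, bool write)
{
    bool ok = true;

    if (toy_pte_ok(mm->pte[page], write))
        return true;           /* fast path: no semaphore taken */

    mm->sem_held = 1;          /* long path: models down() on mmap_sem */
    mm->slow_paths++;
    while (ok && !toy_pte_ok(mm->pte[page], write))
        ok = toy_handle_fault(mm, page, write);
    mm->sem_held = 0;          /* models up() */
    return ok;
}
```

The loop mirrors the "keeps on doing until the page is in the proper
state" wording; the early exit on a refused fault is an addition of this
sketch so that a write to a read-only mapping terminates.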

Kanoj

* Re: Assumed Failure rates in Various o.s's ?
  1999-05-21 10:07   ` Erik Corry
  1999-05-21 14:25     ` Benjamin C.R. LaHaise
@ 1999-05-21 17:23     ` Kanoj Sarcar
  1 sibling, 0 replies; 7+ messages in thread
From: Kanoj Sarcar @ 1999-05-21 17:23 UTC
  To: Erik Corry; +Cc: ak-uu, Linux-MM

> 
> > Now for a proposal: I don't see a down(mm->mmap_sem) being done
> > in the code path leading up to calls to __verify_write. Am I missing
> > it? If a down(mm->mmap_sem) were added around __verify_write, you could
> > quit worrying about simultaneous munmaps while a user access function
> > was executing. 
> 
> I think this is the wrong place.  As far as I understand it,
> the verify_write runs before the actual copying takes place.
> So after verify_write has run, while the copy_to_user is
> taking place there can be a page fault (is that even necessary
> on SMP?).  While that is happening, the black hat user can do
> an mmap/munmap in another thread.  But I haven't really looked
> into it much, I am relying mostly on hearsay here.
>

Note that verify_write loops through pages, making them go to the
proper state. After the pages in the first vma have been verified,
the code might fault verifying pages in the second vma. It gives
up the kernel lock, letting another thread munmap the already
verified vma, and replace it with a readonly vma. I didn't
see any checks to prevent this, thus my proposal for the mmap_sem
in this path. On a different note, check out
http://humbolt.nl.linux.org/lists/linux-mm/1999-05/msg00022.html
about why the mmap_sem is needed anyway for correctness.

If this were done, say the copy_to_user faults (with the kernel_lock
held by caller of copy_to_user). Then, the fault handling code will
grab mmap_sem, before possibly going to sleep releasing the kernel_lock.
No munmaps can happen since mmap_sem is held.
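
A deterministic C sketch of the window described above, assuming a toy
address space of two regions. All names are hypothetical. The "other
thread" is modelled as a function call made at the point where the
verifier would sleep, and a refused remap stands in for a thread blocking
on mmap_sem.

```c
#include <assert.h>
#include <stdbool.h>

struct toy_as {
    bool writable[2]; /* two toy vmas */
    bool sem_held;    /* stand-in for mm->mmap_sem */
};

/* The other thread's munmap plus read-only mmap of one region.
 * Refused (returns false) while the semaphore is held. */
static bool toy_remap_readonly(struct toy_as *as, int region)
{
    assert(region == 0 || region == 1);
    if (as->sem_held)
        return false;
    as->writable[region] = false;
    return true;
}

/* Verify both regions for writing, "sleeping" between them.
 * Returns true if the check passed but region 0 is no longer writable
 * by the time the copy would start, i.e. the verification went stale. */
static bool toy_verify_goes_stale(struct toy_as *as, bool take_sem)
{
    bool verified, stale;

    if (take_sem)
        as->sem_held = true;

    verified = as->writable[0];             /* region 0 verified */
    (void)toy_remap_readonly(as, 0);        /* other thread runs here */
    verified = verified && as->writable[1]; /* region 1 verified */

    stale = verified && !as->writable[0];

    if (take_sem)
        as->sem_held = false;
    return stale;
}
```

Called with take_sem false the function reports a stale verification;
with take_sem true the remap is refused and region 0 stays writable.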

Kanoj

Thread overview: 7+ messages
     [not found] <199905191428.QAA1295681@beryllium.daimi.au.dk>
1999-05-19 17:37 ` Assumed Failure rates in Various o.s's ? Kanoj Sarcar
1999-05-21 10:07   ` Erik Corry
1999-05-21 14:25     ` Benjamin C.R. LaHaise
1999-05-21 14:54       ` Erik Corry
1999-05-21 16:02         ` Benjamin C.R. LaHaise
1999-05-21 17:06       ` Kanoj Sarcar
1999-05-21 17:23     ` Kanoj Sarcar
