* Re: [RFC] PAGE_RW Should be added to PAGE_COPY ?
       [not found] ` <Pine.LNX.4.64.0609151431320.22674@blonde.wat.veritas.com>
@ 2006-09-16  7:42 ` Nick Piggin
  2006-09-23 18:51   ` Hugh Dickins
  0 siblings, 1 reply; 3+ messages in thread
From: Nick Piggin @ 2006-09-16  7:42 UTC
To: Hugh Dickins
Cc: Yingchao Zhou, linux-kernel, akpm, alan, zxc, Linux Memory Management

(adding linux-mm)

Hugh Dickins wrote:

>>but for the app to use a shared mapping instead of a private.
>
>
> But that suggestion wasn't helpful: you'd much prefer not to
> restrict what areas of userspace are used in this way.
>
> The problem, as I now see it, is precisely with do_wp_page()'s
> TestSetPageLocked, as you first said.  There is indeed a small
> but real chance that will fail.  At some time in the past I did
> realize that, but pushed it to the back of my mind, waiting for
> someone actually to complain: now you have.
>
> Yes, it would be good if we could do that check in some other,
> reliable way.  The problem is that can_share_swap_page has to
> check page_mapcount (and PageSwapCache) and page_swapcount in
> an atomic way: the page lock is what we have used to guard the
> movement between mapcount and swapcount.
>
> I'll try to think whether we can do that better,
> but not until next week.

I don't think TestSetPageLocked is the problem. Indeed you may be
able to get around a few specific cases say, by turning that into
a plain lock_page()... but the problem is still fundamentally COW.

In other words, one should always be able to return 0 from that
can_share_swap_page and have the system continue to work... right?
Because even if you hadn't done that mprotect trick, you may still
have a problem because the page may *have* to be copied on write
if it is shared over fork.

So if we filled in the missing mm/ implementation of VM_DONTCOPY
(and call it MAP_DONTCOPY rather than the confusing MAP_DONTFORK)
such that it withstands such an mprotect sequence, we can then ask
that all userspace drivers do their get_user_pages memory on these
types of vmas.

Would that work?

-- 
SUSE Labs, Novell Inc.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
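For readers who want to see the existing madvise(MADV_DONTFORK) behaviour that this proposal builds on, here is a minimal userspace sketch. It assumes Linux 2.6.16 or later; the helper names alloc_dontfork_buffer() and child_lacks_buffer() are hypothetical and do not come from the thread. It only shows that a range marked this way is not carried into a forked child, so fork alone cannot leave the parent's pages shared copy-on-write. It does not implement the stronger mprotect-proof semantics asked for above.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from the thread): map a private anonymous
 * buffer and ask the kernel not to copy it into children on fork(). */
void *alloc_dontfork_buffer(size_t len)
{
    assert(len > 0);
    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return NULL;
    if (madvise(buf, len, MADV_DONTFORK) != 0) {
        munmap(buf, len);
        return NULL;
    }
    return buf;
}

/* Returns 1 if a forked child dies with SIGSEGV when touching the buffer,
 * 0 if the child could read it, -1 on a setup error. */
int child_lacks_buffer(size_t len)
{
    char *buf = alloc_dontfork_buffer(len);
    if (buf == NULL)
        return -1;
    memset(buf, 0x5a, len);          /* fault the pages in before fork */

    pid_t pid = fork();
    if (pid < 0) {
        munmap(buf, len);
        return -1;
    }
    if (pid == 0) {
        /* The range is absent in the child, so this read should fault. */
        char c = *(volatile char *)buf;
        _exit(c == 0x5a ? 0 : 1);
    }

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    munmap(buf, len);
    if (waited != pid)
        return -1;
    return (WIFSIGNALED(status) && WTERMSIG(status) == SIGSEGV) ? 1 : 0;
}
```

A buffer handed to a driver for get_user_pages would be allocated with something like alloc_dontfork_buffer() before any fork takes place.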
* Re: [RFC] PAGE_RW Should be added to PAGE_COPY ?
  2006-09-16  7:42 ` [RFC] PAGE_RW Should be added to PAGE_COPY ? Nick Piggin
@ 2006-09-23 18:51   ` Hugh Dickins
  2006-09-25  2:53     ` Nick Piggin
  0 siblings, 1 reply; 3+ messages in thread
From: Hugh Dickins @ 2006-09-23 18:51 UTC
To: Nick Piggin
Cc: Yingchao Zhou, linux-kernel, akpm, alan, zxc, Linux Memory Management

On Sat, 16 Sep 2006, Nick Piggin wrote:
> Hugh Dickins wrote:
> >
> > Yes, it would be good if we could do that check in some other,
> > reliable way.  The problem is that can_share_swap_page has to
> > check page_mapcount (and PageSwapCache) and page_swapcount in
> > an atomic way: the page lock is what we have used to guard the
> > movement between mapcount and swapcount.
> >
> > I'll try to think whether we can do that better,
> > but not until next week.

I currently believe we can do it without TestSetPageLocked in
do_wp_page(): just with a few memory barriers added.  But that
belief may evaporate once I actually focus on each site: it
wouldn't be the first time I've fooled myself like that, so
please keep on the alert, Nick.

>
> I don't think TestSetPageLocked is the problem. Indeed you may be
> able to get around a few specific cases say, by turning that into
> a plain lock_page()...

Hmm, an actual lock_page, that wasn't what I was intending.  Would
be simple, I'm trying to remember why I ruled it out earlier.  Oh,
yes, we're holding the pte lock there, that's why.

> ... but the problem is still fundamentally COW.

Well, yes, we wouldn't have all these problems if we didn't have
to respect COW.  But generally a process can, one way or another,
make sure it won't get into those problems: Yingchao is concerned
with the way the TestSetPageLocked unpredictably upsets correctness.
I'd say it's a more serious error than the general problems with COW.

>
> In other words, one should always be able to return 0 from that
> can_share_swap_page and have the system continue to work... right?
> Because even if you hadn't done that mprotect trick, you may still
> have a problem because the page may *have* to be copied on write
> if it is shared over fork.

Most processes won't fork, and exec has freed them from sharing
their parents' pages, and their private file mappings aren't being
used as buffers.  Maybe Yingchao will later have to worry about
those cases, but for now it seems not.

>
> So if we filled in the missing mm/ implementation of VM_DONTCOPY
> (and call it MAP_DONTCOPY rather than the confusing MAP_DONTFORK)
> such that it withstands such an mprotect sequence, we can then ask
> that all userspace drivers do their get_user_pages memory on these
> types of vmas.

(madvise MADV_DONTFORK)

For the longest time I couldn't understand you there at all, perhaps
distracted by your parenthetical line: at last I think you're proposing
we tweak mprotect to behave differently on a VM_DONTCOPY area.

But differently in what way?  Allow it to ignore Copy-On-Write?
No, of course not.  Go down to the struct page level (one of the
nice things about mprotect is that it doesn't need to look at
struct pages at present, except perhaps inside the ia64
lazy_mmu_prot_update), and decide if it has to do the copying
itself instead?  Doesn't sound a good place to do it; and would
involve just the same problematic TestSetPageLocked within pte
lock that do_wp_page is doing, so wouldn't solve anything.

Or I'm still misunderstanding you.

Hugh
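The fork case quoted above can be observed from userspace. The following is a minimal sketch, not code from the thread: the function name private_child_view_after_parent_write() is hypothetical, and the forked child merely stands in for any other holder of the original page (the thread's real concern is a driver holding it via get_user_pages, which cannot be reproduced in a plain userspace test). Once a MAP_PRIVATE page is shared over fork, the parent's next write must be given a copy, so the child keeps seeing the old contents.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns the byte the child reads from a MAP_PRIVATE
 * page after the parent has overwritten it post-fork, or -1 on error. */
int private_child_view_after_parent_write(void)
{
    const size_t len = 4096;
    int fds[2];
    int status = 0;

    unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;
    buf[0] = 1;                      /* contents at the time of fork */

    if (pipe(fds) != 0) {
        munmap(buf, len);
        return -1;
    }
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        munmap(buf, len);
        return -1;
    }
    if (pid == 0) {
        char go;
        close(fds[1]);
        /* Wait until the parent has written, then report what we see. */
        if (read(fds[0], &go, 1) != 1)
            _exit(255);
        _exit(buf[0]);
    }

    close(fds[0]);
    buf[0] = 2;                      /* write fault: parent gets its own copy */
    ssize_t sent = write(fds[1], "g", 1);
    close(fds[1]);

    pid_t waited = waitpid(pid, &status, 0);
    munmap(buf, len);
    if (sent != 1 || waited != pid || !WIFEXITED(status) ||
        WEXITSTATUS(status) == 255)
        return -1;
    return WEXITSTATUS(status);      /* the child's view of buf[0] */
}
```

The child reports the pre-fork value even though the parent wrote a new one, which is the divergence a driver would see if it still held the original page.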
* Re: [RFC] PAGE_RW Should be added to PAGE_COPY ?
  2006-09-23 18:51   ` Hugh Dickins
@ 2006-09-25  2:53     ` Nick Piggin
  0 siblings, 0 replies; 3+ messages in thread
From: Nick Piggin @ 2006-09-25  2:53 UTC
To: Hugh Dickins
Cc: Yingchao Zhou, linux-kernel, akpm, alan, zxc, Linux Memory Management

Hugh Dickins wrote:

>On Sat, 16 Sep 2006, Nick Piggin wrote:
>
>>... but the problem is still fundamentally COW.
>>
>
>Well, yes, we wouldn't have all these problems if we didn't have
>to respect COW.  But generally a process can, one way or another,
>make sure it won't get into those problems: Yingchao is concerned
>with the way the TestSetPageLocked unpredictably upsets correctness.
>I'd say it's a more serious error than the general problems with COW.
>

But correctness is no more upset here than with any other reason that
the page gets COWed.

>>In other words, one should always be able to return 0 from that
>>can_share_swap_page and have the system continue to work... right?
>>Because even if you hadn't done that mprotect trick, you may still
>>have a problem because the page may *have* to be copied on write
>>if it is shared over fork.
>>
>
>Most processes won't fork, and exec has freed them from sharing
>their parents pages, and their private file mappings aren't being
>used as buffers.  Maybe Yingchao will later have to worry about
>those cases, but for now it seems not.
>

So we should still solve it once and for all just by turning off
COW completely.

>>So if we filled in the missing mm/ implementation of VM_DONTCOPY
>>(and call it MAP_DONTCOPY rather than the confusing MAP_DONTFORK)
>>such that it withstands such an mprotect sequence, we can then ask
>>that all userspace drivers do their get_user_pages memory on these
>>types of vmas.
>>
>
>(madvise MADV_DONTFORK)
>
>For the longest time I couldn't understand you there at all, perhaps
>distracted by your parenthetical line: at last I think you're proposing
>we tweak mprotect to behave differently on a VM_DONTCOPY area.
>
>But differently in what way?  Allow it to ignore Copy-On-Write?
>

Well I think that we should have a flag that just prevents copy
on write from ever happening. Maybe that would mean it would be
easiest to implement in mmap rather than as madvise, but that
should be OK.
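No such flag exists in the thread, so the sketch below does not show the proposal itself. It shows the closest existing mechanism, a MAP_SHARED | MAP_ANONYMOUS mapping, which is the "shared mapping instead of a private" alternative mentioned at the start of the thread. The function name shared_child_view_after_parent_write() is hypothetical. A shared mapping is never copied on write, so every holder of the page sees the same contents, at the cost of restricting which userspace areas can be used as buffers.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: returns the byte the child reads from a MAP_SHARED
 * page after the parent has overwritten it post-fork, or -1 on error. */
int shared_child_view_after_parent_write(void)
{
    const size_t len = 4096;
    int fds[2];
    int status = 0;

    unsigned char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (buf == MAP_FAILED)
        return -1;
    buf[0] = 1;                      /* contents at the time of fork */

    if (pipe(fds) != 0) {
        munmap(buf, len);
        return -1;
    }
    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        munmap(buf, len);
        return -1;
    }
    if (pid == 0) {
        char go;
        close(fds[1]);
        /* Wait until the parent has written, then report what we see. */
        if (read(fds[0], &go, 1) != 1)
            _exit(255);
        _exit(buf[0]);
    }

    close(fds[0]);
    buf[0] = 2;                      /* no copy: both sides share the page */
    ssize_t sent = write(fds[1], "g", 1);
    close(fds[1]);

    pid_t waited = waitpid(pid, &status, 0);
    munmap(buf, len);
    if (sent != 1 || waited != pid || !WIFEXITED(status) ||
        WEXITSTATUS(status) == 255)
        return -1;
    return WEXITSTATUS(status);      /* the child's view of buf[0] */
}
```

Here the child observes the parent's post-fork write, because the write never triggers a copy.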
Thread overview: 3+ messages
[not found] <20060915033842.C205FFB045@ncic.ac.cn>
[not found] ` <Pine.LNX.4.64.0609150514190.7397@blonde.wat.veritas.com>
[not found] ` <Pine.LNX.4.64.0609151431320.22674@blonde.wat.veritas.com>
2006-09-16 7:42 ` [RFC] PAGE_RW Should be added to PAGE_COPY ? Nick Piggin
2006-09-23 18:51 ` Hugh Dickins
2006-09-25 2:53 ` Nick Piggin