* A couple of questions
From: Neil Booth @ 1999-03-02 13:11 UTC
To: linux-mm
I have a couple of questions about do_wp_page; I hope they're welcome
here.
1) do_wp_page has most execution paths doing an unlock_kernel() but
there are a couple that don't. Why isn't this inconsistent? e.g. any of
the branches that call end_wp_page do not unlock the kernel. What am I
missing? Is it that these branches only happen if we slept while getting
the free page, and sleeping always unlocks the kernel?
2) The last 2 of the 3 branches to end_wp_page seem to me to be
impossible code paths.
	if (!pte_present(pte))
		goto end_wp_page;
	if (pte_write(pte))
		goto end_wp_page;
At entry, pte (= *page_table) is present and not writable as this is the
only way do_wp_page gets called from handle_pte_fault (and we hold the
kernel lock so nothing else can change *page_table). Being a local
variable, its contents cannot change, so why these 2 tests?
Cheers,
Neil.
--
To unsubscribe, send a message with 'unsubscribe linux-mm my@address'
in the body to majordomo@kvack.org. For more info on Linux MM,
see: http://humbolt.geo.uu.nl/Linux-MM/
* Re: A couple of questions
From: Stephen C. Tweedie @ 1999-03-15 18:58 UTC
To: Neil Booth; +Cc: linux-mm, Stephen Tweedie
Hi,
<Late answer: I've been offline for a couple of weeks>
On Tue, 02 Mar 1999 22:11:45 +0900, Neil Booth <NeilB@earthling.net> said:
> I have a couple of questions about do_wp_page; I hope they're welcome
> here.
> 1) do_wp_page has most execution paths doing an unlock_kernel() but
> there are a couple that don't. Why isn't this inconsistent?
Good question, and a possible bug. Anyone else care to glance at this?
It's a possible problem only on SMP, of course. The obvious fix is:
----------------------------------------------------------------
--- mm/memory.c~ Tue Jan 19 01:33:10 1999
+++ mm/memory.c Mon Mar 15 18:57:31 1999
@@ -651,13 +651,13 @@
 			delete_from_swap_cache(page_map);
 		/* FallThrough */
 	case 1:
-		/* We can release the kernel lock now.. */
-		unlock_kernel();
-
 		flush_cache_page(vma, address);
 		set_pte(page_table, pte_mkdirty(pte_mkwrite(pte)));
 		flush_tlb_page(vma, address);
 end_wp_page:
+		/* We can release the kernel lock now.. */
+		unlock_kernel();
+
 		if (new_page)
 			free_page(new_page);
 		return 1;
----------------------------------------------------------------
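As an illustration only, the idea behind the patch can be modelled in userspace C: if every exit from the function passes through one shared label, the lock is released exactly once on each path. This is a minimal sketch, not kernel code. The names lock_depth, fake_lock_kernel(), fake_unlock_kernel() and wp_fault_sketch() are hypothetical stand-ins, and the "raced" flag stands in for the early exits taken when the page table entry changed while we slept.

```c
#include <assert.h>

/* Hypothetical stand-in for the big kernel lock: just a depth counter. */
static int lock_depth = 0;

static void fake_lock_kernel(void)
{
	lock_depth++;
}

static void fake_unlock_kernel(void)
{
	/* Releasing a lock we do not hold would be a bug in the caller. */
	assert(lock_depth > 0);
	lock_depth--;
}

/*
 * Toy fault handler: entered with the lock held, must return with it
 * released. Returns 1 if it "updated the PTE", 0 if it bailed out early.
 */
static int wp_fault_sketch(int raced)
{
	int updated = 0;

	if (raced)
		goto end_wp_page;	/* early exit, lock still held here */

	updated = 1;			/* stands in for set_pte() and the flushes */

end_wp_page:
	fake_unlock_kernel();		/* single release point for both paths */
	return updated;
}
```

A caller would take the lock with fake_lock_kernel(), call wp_fault_sketch() with either argument, and find lock_depth back at 0 afterwards. If the unlock sat only on the normal path, the raced path would return with the lock still held.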
> 2) The last 2 of the 3 branches to end_wp_page seem to me to be
> impossible code paths.
> 	if (!pte_present(pte))
> 		goto end_wp_page;
> 	if (pte_write(pte))
> 		goto end_wp_page;
No, the start of do_wp_page() looks like:
	pte = *page_table;
	new_page = __get_free_page(GFP_USER);
and the get_free_page() call can block if we are out of memory, dropping
the kernel lock in the process. The page table can be modified by
kswapd during this interval.
--Stephen
* Re: A couple of questions
From: neil @ 1999-03-15 22:46 UTC
To: Stephen C. Tweedie; +Cc: Linux-MM
Hi Stephen,
Stephen C. Tweedie wrote:-
> Hi,
>
[..snip..]
>
> > 2) The last 2 of the 3 branches to end_wp_page seem to me to be
> > impossible code paths.
>
> > 	if (!pte_present(pte))
> > 		goto end_wp_page;
> > 	if (pte_write(pte))
> > 		goto end_wp_page;
>
> No, the start of do_wp_page() looks like:
>
> 	pte = *page_table;
> 	new_page = __get_free_page(GFP_USER);
>
> and the get_free_page() call can block if we are out of memory, dropping
> the kernel lock in the process. The page table can be modified by
> kswapd during this interval.
Thanks for your reply. I think you've missed my point on this one.
The variable "pte" is set before calling __get_free_page(), and being
local cannot be modified by other processes. Hence I still believe
the 2 branches shown are impossible, their negation having been the
condition for entering do_wp_page().
The case you mention is captured by the initial test
	if (pte_val(*page_table) != pte_val(pte))
		goto end_wp_page;
performed before the two above. Do you agree?
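To make the argument concrete, here is a small userspace model in C. It is a sketch under simplifying assumptions, not the kernel's types or bit layout: sk_pte_t, SK_PRESENT, SK_WRITE and sk_recheck() are hypothetical names. The snapshot is a private copy taken before sleeping, so only the comparison against the live entry can observe a change made by someone else.

```c
#include <assert.h>

/* Hypothetical flag bits for a toy page-table entry; not the real layout. */
#define SK_PRESENT 0x1UL
#define SK_WRITE   0x2UL

typedef unsigned long sk_pte_t;

enum sk_result { SK_RACED, SK_DEAD_BRANCH, SK_PROCEED };

/*
 * 'live' points at the shared entry, which may change while we sleep.
 * 'snapshot' is our private copy, taken before sleeping. The caller
 * guarantees the snapshot was present and not writable at entry.
 */
static enum sk_result sk_recheck(const sk_pte_t *live, sk_pte_t snapshot)
{
	/* Entry condition, as enforced by the caller in this model. */
	assert((snapshot & SK_PRESENT) && !(snapshot & SK_WRITE));

	/* Only this test reads shared state, so only it can detect a race. */
	if (*live != snapshot)
		return SK_RACED;

	/* These test the private copy, whose bits were fixed at entry. */
	if (!(snapshot & SK_PRESENT))
		return SK_DEAD_BRANCH;
	if (snapshot & SK_WRITE)
		return SK_DEAD_BRANCH;

	return SK_PROCEED;
}
```

With the entry condition holding, no change to the live entry can make sk_recheck() return SK_DEAD_BRANCH: a modified entry is caught by the first comparison, and an unmodified one passes all three tests.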
Cheers,
Neil.
* Re: A couple of questions
From: Andrea Arcangeli @ 1999-03-16 2:11 UTC
To: Stephen C. Tweedie; +Cc: Neil Booth, linux-mm, Linus Torvalds
On Mon, 15 Mar 1999, Stephen C. Tweedie wrote:
>--- mm/memory.c~ Tue Jan 19 01:33:10 1999
>+++ mm/memory.c Mon Mar 15 18:57:31 1999
>@@ -651,13 +651,13 @@
> 			delete_from_swap_cache(page_map);
> 		/* FallThrough */
> 	case 1:
>-		/* We can release the kernel lock now.. */
>-		unlock_kernel();
>-
> 		flush_cache_page(vma, address);
> 		set_pte(page_table, pte_mkdirty(pte_mkwrite(pte)));
> 		flush_tlb_page(vma, address);
> end_wp_page:
>+		/* We can release the kernel lock now.. */
>+		unlock_kernel();
>+
> 		if (new_page)
> 			free_page(new_page);
> 		return 1;
>----------------------------------------------------------------
Your patch is certainly safe and, in my opinion, strictly needed in order
to release the kernel lock on the end_wp_page path.
The reason I think it is already safe to drop the kernel lock before
updating the page table of the process is that the swap_out engine will do
nothing with the page until it is a clean page (and it should be clean
because it was read-only in the first place... am I really right here?).
Every other part of the VM will block on the semaphore, so it won't race
with the page fault handler anyway.
I think this patch against 2.2.3 is needed (except for the first hunk,
which only removes superfluous code).
It seems to work fine after a few minutes of stress-testing.
Index: mm//memory.c
===================================================================
RCS file: /var/cvs/linux/mm/memory.c,v
retrieving revision 1.1.2.3
diff -u -r1.1.2.3 memory.c
--- memory.c 1999/01/24 02:46:31 1.1.2.3
+++ linux/mm/memory.c 1999/03/16 01:55:45
@@ -624,10 +624,6 @@
 	/* Did someone else copy this page for us while we slept? */
 	if (pte_val(*page_table) != pte_val(pte))
 		goto end_wp_page;
-	if (!pte_present(pte))
-		goto end_wp_page;
-	if (pte_write(pte))
-		goto end_wp_page;
 	old_page = pte_page(pte);
 	if (MAP_NR(old_page) >= max_mapnr)
 		goto bad_wp_page;
@@ -651,13 +647,18 @@
 			delete_from_swap_cache(page_map);
 		/* FallThrough */
 	case 1:
-		/* We can release the kernel lock now.. */
+		/*
+		 * We can release the kernel lock now.. because the swap_out
+		 * engine will do nothing with the page table until it
+		 * will be a clean page (and we are sure it's clean because it
+		 * wasn't writable yet). All other parts of the VM will
+		 * stop on the mmap semaphore. -arca
+		 */
 		unlock_kernel();
 
 		flush_cache_page(vma, address);
 		set_pte(page_table, pte_mkdirty(pte_mkwrite(pte)));
 		flush_tlb_page(vma, address);
-end_wp_page:
 		if (new_page)
 			free_page(new_page);
 		return 1;
@@ -681,9 +682,15 @@
 bad_wp_page:
 	printk("do_wp_page: bogus page at address %08lx (%08lx)\n",address,old_page);
 	send_sig(SIGKILL, tsk, 1);
+	unlock_kernel();
 	if (new_page)
 		free_page(new_page);
 	return 0;
+end_wp_page:
+	unlock_kernel();
+	if (new_page)
+		free_page(new_page);
+	return 1;
 }
 
 /*
Andrea Arcangeli
* Re: A couple of questions
From: Stephen C. Tweedie @ 1999-03-16 12:22 UTC
To: neil; +Cc: Stephen C. Tweedie, Linux-MM
Hi,
On Tue, 16 Mar 1999 07:46:06 +0900, neil@tc-1-192.ariake.gol.ne.jp
said:
> Thanks for your reply. I think you've missed my point on this one.
> The variable "pte" is set before calling __get_free_page(), and being
> local cannot be modified by other processes.
Umm, OK, you've convinced me. :) I think we have enough locks held
throughout this to prevent the present or writable bits in *page_table
from changing between the test in handle_pte_fault() and do_wp_page()
itself, even on SMP.
--Stephen