* [PATCH 2.5.66-mm2] Fix page_convert_anon locking issues
@ 2003-04-02 17:13 Dave McCracken
2003-04-02 21:29 ` Andrew Morton
0 siblings, 1 reply; 20+ messages in thread
From: Dave McCracken @ 2003-04-02 17:13 UTC (permalink / raw)
To: Andrew Morton; +Cc: Linux Memory Management, Linux Kernel
[-- Attachment #1: Type: text/plain, Size: 445 bytes --]
I came up with a scheme for accessing the page tables in page_convert_anon
that should work without requiring locks. Hugh has looked at it and agrees
it addresses the problems he found. Anyway, here's the patch.
Dave McCracken
======================================================================
Dave McCracken IBM Linux Base Kernel Team 1-512-838-3059
dmccr@us.ibm.com T/L 678-3059
[-- Attachment #2: objlock-2.5.66-mm2-1.diff --]
[-- Type: text/plain, Size: 1116 bytes --]
--- 2.5.66-mm2/./mm/rmap.c 2003-04-01 11:23:35.000000000 -0600
+++ 2.5.66-mm2-fix/./mm/rmap.c 2003-04-01 11:30:21.000000000 -0600
@@ -95,7 +95,9 @@ find_pte(struct vm_area_struct *vma, str
{
struct mm_struct *mm = vma->vm_mm;
pgd_t *pgd;
+ pgd_t pgdval;
pmd_t *pmd;
+ pmd_t pmdval;
pte_t *pte;
unsigned long loffset;
unsigned long address;
@@ -106,17 +108,27 @@ find_pte(struct vm_area_struct *vma, str
goto out;
pgd = pgd_offset(mm, address);
- if (!pgd_present(*pgd))
+ pgdval = *pgd;
+ if (!pgd_present(pgdval))
goto out;
- pmd = pmd_offset(pgd, address);
- if (!pmd_present(*pmd))
+ pmd = pmd_offset(&pgdval, address);
+ pmdval = *pmd;
+ if (!pmd_present(pmdval))
goto out;
- pte = pte_offset_map(pmd, address);
+ /* Double check to make sure the pmd page hasn't been freed */
+ if (!pgd_present(*pgd))
+ goto out;
+
+ pte = pte_offset_map(&pmdval, address);
if (!pte_present(*pte))
goto out_unmap;
+ /* Double check to make sure the pte page hasn't been freed */
+ if (!pmd_present(*pmd))
+ goto out_unmap;
+
if (page_to_pfn(page) != pte_pfn(*pte))
goto out_unmap;
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2.5.66-mm2] Fix page_convert_anon locking issues
2003-04-02 17:13 [PATCH 2.5.66-mm2] Fix page_convert_anon locking issues Dave McCracken
@ 2003-04-02 21:29 ` Andrew Morton
2003-04-02 21:56 ` Dave McCracken
0 siblings, 1 reply; 20+ messages in thread
From: Andrew Morton @ 2003-04-02 21:29 UTC (permalink / raw)
To: Dave McCracken; +Cc: linux-mm, linux-kernel
Dave McCracken <dmccr@us.ibm.com> wrote:
>
>
> I came up with a scheme for accessing the page tables in page_convert_anon
> that should work without requiring locks. Hugh has looked at it and agrees
> it addresses the problems he found. Anyway, here's the patch.
>
I am unable to convince myself that this is correct. It's playing with pmd
and pte pages which can be freed, reallocated and filled with random stuff.
I really don't see how that can work, but am willing to be taught.
Because we hold i_shared_sem we know that the pgd layer is stable and that
the mm's aren't going away.
Is it not possible to take each mm's page_table_lock? There's a ranking
problem with pte_chain_lock(), but that can presumably be resolved by doing a
trylock on the page_table_lock and if that fails, restart the whole operation.
But then again, why is it not possible to just do:
list_for_each_entry(vma, &mapping->i_mmap, shared) {
if (!pte_chain)
pte_chain = pte_chain_alloc(GFP_KERNEL);
spin_lock(&mm->page_table_lock);
pte = find_pte(vma, page, NULL);
if (pte)
pte_chain = page_add_rmap(page, pte, pte_chain);
spin_unlock(&mm->page_table_lock);
}
pte_chain_free(pte_chain);
up(&mapping->i_shared_sem);
?
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2.5.66-mm2] Fix page_convert_anon locking issues
2003-04-02 21:29 ` Andrew Morton
@ 2003-04-02 21:56 ` Dave McCracken
2003-04-02 23:09 ` Andrew Morton
0 siblings, 1 reply; 20+ messages in thread
From: Dave McCracken @ 2003-04-02 21:56 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm, linux-kernel
--On Wednesday, April 02, 2003 13:29:39 -0800 Andrew Morton
<akpm@digeo.com> wrote:
> I am unable to convince myself that this is correct. It's playing with
> pmd and pte pages which can be freed, reallocated and filled with random
> stuff. I really don't see how that can work, but am willing to be taught.
>
> Because we hold i_shared_sem we know that the pgd layer is stable and that
> the mm's aren't going away.
The sequence is the following:
1. take a copy of the reference to the page (the pgd or pmd entry)
2. validate the copy
3. establish a pointer into the page
4. pull the data from the page (pmd or pte entry)
5. validate the original reference again
6. use the data
This guarantees that the data is from a page that's still valid, since the
pgd or pmd entry are cleared when the page is released. We're helped by
the fact that for an invalid page we can simply return failure.
> Is it not possible to take each mm's page_table_lock? There's a ranking
> problem with pte_chain_lock(), but that can presumably be resolved by
> doing a trylock on the page_table_lock and if that fails, restart the
> whole operation.
It is possible, but would be much more painful.
> But then again, why is it not possible to just do:
>
> list_for_each_entry(vma, &mapping->i_mmap, shared) {
> if (!pte_chain)
> pte_chain = pte_chain_alloc(GFP_KERNEL);
> spin_lock(&mm->page_table_lock);
> pte = find_pte(vma, page, NULL);
> if (pte)
> pte_chain = page_add_rmap(page, pte, pte_chain);
> spin_unlock(&mm->page_table_lock);
> }
>
> pte_chain_free(pte_chain);
> up(&mapping->i_shared_sem);
>
> ?
Because the page is in transition from !PageAnon to PageAnon. We have to
hold the pte_chain lock during the entire transition in case someone else
tries to do something like page_remove_rmap, which would break.
Dave
======================================================================
Dave McCracken IBM Linux Base Kernel Team 1-512-838-3059
dmccr@us.ibm.com T/L 678-3059
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2.5.66-mm2] Fix page_convert_anon locking issues
2003-04-02 21:56 ` Dave McCracken
@ 2003-04-02 23:09 ` Andrew Morton
2003-04-02 23:23 ` Dave McCracken
0 siblings, 1 reply; 20+ messages in thread
From: Andrew Morton @ 2003-04-02 23:09 UTC (permalink / raw)
To: Dave McCracken; +Cc: linux-mm, linux-kernel
Dave McCracken <dmccr@us.ibm.com> wrote:
>
> The sequence is the following:
Boy you owe me a big fat comment on top of this one.
> 1. take a copy of the reference to the page (the pgd or pmd entry)
> 2. validate the copy
> 3. establish a pointer into the page
> 4. pull the data from the page (pmd or pte entry)
> 5. validate the original reference again
> 6. use the data
>
> This guarantees that the data is from a page that's still valid, since the
> pgd or pmd entry are cleared when the page is released. We're helped by
> the fact that for an invalid page we can simply return failure.
+ if (page_to_pfn(page) != pte_pfn(*pte))
+ goto out_unmap;
+
+ if (addr)
+ *addr = address;
+
==>munmap here
+ return pte;
i_shared_sem won't stop that. The pte points into thin air, and may now
point at a value which looks like our page.
> > But then again, why is it not possible to just do:
> >
> > list_for_each_entry(vma, &mapping->i_mmap, shared) {
> > if (!pte_chain)
> > pte_chain = pte_chain_alloc(GFP_KERNEL);
> > spin_lock(&mm->page_table_lock);
> > pte = find_pte(vma, page, NULL);
> > if (pte)
> > pte_chain = page_add_rmap(page, pte, pte_chain);
> > spin_unlock(&mm->page_table_lock);
> > }
> >
> > pte_chain_free(pte_chain);
> > up(&mapping->i_shared_sem);
> >
> > ?
>
> Because the page is in transition from !PageAnon to PageAnon.
These are file-backed pages. So what does PageAnon really mean?
> We have to
> hold the pte_chain lock during the entire transition in case someone else
> tries to do something like page_remove_rmap, which would break.
How about setting PageAnon at the _start_ of the operation?
page_remove_rmap() will cope with that OK.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2.5.66-mm2] Fix page_convert_anon locking issues
2003-04-02 23:09 ` Andrew Morton
@ 2003-04-02 23:23 ` Dave McCracken
2003-04-02 23:38 ` Andrew Morton
2003-04-03 0:06 ` Hugh Dickins
0 siblings, 2 replies; 20+ messages in thread
From: Dave McCracken @ 2003-04-02 23:23 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm, linux-kernel
--On Wednesday, April 02, 2003 15:09:03 -0800 Andrew Morton
<akpm@digeo.com> wrote:
> i_shared_sem won't stop that. The pte points into thin air, and may now
> point at a value which looks like our page.
Once we find a match in the pte entry, we have the additional protection of
the pte_chain lock. The pte entry is never cleared without a call to
page_remove_rmap, which will block on the pte_chain lock.
>> Because the page is in transition from !PageAnon to PageAnon.
>
> These are file-backed pages. So what does PageAnon really mean?
I suppose PageAnon should be renamed to PageChain, to mean it's using
pte_chains. It did mean anon pages until I used it for nonlinear pages.
>> We have to
>> hold the pte_chain lock during the entire transition in case someone else
>> tries to do something like page_remove_rmap, which would break.
>
> How about setting PageAnon at the _start_ of the operation?
> page_remove_rmap() will cope with that OK.
Hmm... I was gonna say that page_remove_rmap will BUG() if it doesn't find
the entry, but it's only under DEBUG and could easily be changed. Lemme
think on this one a bit. I need to assure myself it's safe to go unlocked
in the middle.
Dave
======================================================================
Dave McCracken IBM Linux Base Kernel Team 1-512-838-3059
dmccr@us.ibm.com T/L 678-3059
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2.5.66-mm2] Fix page_convert_anon locking issues
2003-04-02 23:23 ` Dave McCracken
@ 2003-04-02 23:38 ` Andrew Morton
2003-04-02 23:42 ` Dave McCracken
2003-04-03 0:06 ` Hugh Dickins
1 sibling, 1 reply; 20+ messages in thread
From: Andrew Morton @ 2003-04-02 23:38 UTC (permalink / raw)
To: Dave McCracken; +Cc: linux-mm, linux-kernel
Dave McCracken <dmccr@us.ibm.com> wrote:
>
> > i_shared_sem won't stop that. The pte points into thin air, and may now
> > point at a value which looks like our page.
>
> Once we find a match in the pte entry, we have the additional protection of
> the pte_chain lock. The pte entry is never cleared without a call to
> page_remove_rmap, which will block on the pte_chain lock.
But:
+ /* Double check to make sure the pte page hasn't been freed */
+ if (!pmd_present(*pmd))
+ goto out_unmap;
+
==> munmap, pte page is freed, reallocated for pagecache, someone
happens to write the correct value into it.
+ if (page_to_pfn(page) != pte_pfn(*pte))
+ goto out_unmap;
+
+ if (addr)
+ *addr = address;
+
> >> Because the page is in transition from !PageAnon to PageAnon.
> >
> > These are file-backed pages. So what does PageAnon really mean?
>
> I suppose PageAnon should be renamed to PageChain, to mean it's using
> pte_chains. It did mean anon pages until I used it for nonlinear pages.
OK, I'll edit the diffs at a convenient time.
> >> We have to
> >> hold the pte_chain lock during the entire transition in case someone else
> >> tries to do something like page_remove_rmap, which would break.
> >
> > How about setting PageAnon at the _start_ of the operation?
> > page_remove_rmap() will cope with that OK.
>
> Hmm... I was gonna say that page_remove_rmap will BUG() if it doesn't find
> the entry, but it's only under DEBUG and could easily be changed.
That debug already triggers if it is enabled. I forget why, but it's OK.
> Lemme
> think on this one a bit. I need to assure myself it's safe to go unlocked
> in the middle.
Thanks. It's tricky. I wish we had more tests for this stuff, and real
applications which use it...
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2.5.66-mm2] Fix page_convert_anon locking issues
2003-04-02 23:38 ` Andrew Morton
@ 2003-04-02 23:42 ` Dave McCracken
2003-04-02 23:52 ` Andrew Morton
2003-04-02 23:59 ` Hugh Dickins
0 siblings, 2 replies; 20+ messages in thread
From: Dave McCracken @ 2003-04-02 23:42 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm, linux-kernel
--On Wednesday, April 02, 2003 15:38:45 -0800 Andrew Morton
<akpm@digeo.com> wrote:
> But:
>
> + /* Double check to make sure the pte page hasn't been freed */
> + if (!pmd_present(*pmd))
> + goto out_unmap;
> +
> ==> munmap, pte page is freed, reallocated for pagecache, someone
> happens to write the correct value into it.
>
> + if (page_to_pfn(page) != pte_pfn(*pte))
> + goto out_unmap;
> +
> + if (addr)
> + *addr = address;
> +
Oops. The pmd_present() check should be after the page_to_pfn() !=
pte_pfn() check.
Dave
======================================================================
Dave McCracken IBM Linux Base Kernel Team 1-512-838-3059
dmccr@us.ibm.com T/L 678-3059
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2.5.66-mm2] Fix page_convert_anon locking issues
2003-04-02 23:42 ` Dave McCracken
@ 2003-04-02 23:52 ` Andrew Morton
2003-04-02 23:58 ` Dave McCracken
2003-04-02 23:59 ` Hugh Dickins
1 sibling, 1 reply; 20+ messages in thread
From: Andrew Morton @ 2003-04-02 23:52 UTC (permalink / raw)
To: Dave McCracken; +Cc: linux-mm, linux-kernel
Dave McCracken <dmccr@us.ibm.com> wrote:
>
>
> Oops. The pmd_present() check should be after the page_to_pfn() !=
> pte_pfn() check.
>
hmmmm. It also probably needs both compiler barriers and memory barriers.
It does give me creepy feelings. I worry that because nobody uses
remap_file_pages() yet, we will hit 2.6.25 before discovering that we have
fundamental VM locking problems which affect $major$ applications.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2.5.66-mm2] Fix page_convert_anon locking issues
2003-04-02 23:52 ` Andrew Morton
@ 2003-04-02 23:58 ` Dave McCracken
2003-04-03 14:49 ` Dave McCracken
0 siblings, 1 reply; 20+ messages in thread
From: Dave McCracken @ 2003-04-02 23:58 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm, linux-kernel
--On Wednesday, April 02, 2003 15:52:20 -0800 Andrew Morton
<akpm@digeo.com> wrote:
> hmmmm. It also probably needs both compiler barriers and memory barriers.
>
> It does give me creepy feelings. I worry that because nobody uses
> remap_file_pages() yet, we will hit 2.6.25 before discovering that we have
> fundamental VM locking problems which affect $major$ applications.
It's looking more and more like we should use your other suggestion. It's
definitely simpler if we can make it failsafe. I'll code it up tomorrow.
Dave
======================================================================
Dave McCracken IBM Linux Base Kernel Team 1-512-838-3059
dmccr@us.ibm.com T/L 678-3059
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2.5.66-mm2] Fix page_convert_anon locking issues
2003-04-02 23:42 ` Dave McCracken
2003-04-02 23:52 ` Andrew Morton
@ 2003-04-02 23:59 ` Hugh Dickins
1 sibling, 0 replies; 20+ messages in thread
From: Hugh Dickins @ 2003-04-02 23:59 UTC (permalink / raw)
To: Dave McCracken; +Cc: Andrew Morton, linux-mm, linux-kernel
On Wed, 2 Apr 2003, Dave McCracken wrote:
> --On Wednesday, April 02, 2003 15:38:45 -0800 Andrew Morton
> <akpm@digeo.com> wrote:
>
> > But:
> >
> > + /* Double check to make sure the pte page hasn't been freed */
> > + if (!pmd_present(*pmd))
> > + goto out_unmap;
> > +
> > ==> munmap, pte page is freed, reallocated for pagecache, someone
> > happens to write the correct value into it.
> >
> > + if (page_to_pfn(page) != pte_pfn(*pte))
> > + goto out_unmap;
> > +
> > + if (addr)
> > + *addr = address;
> > +
>
> Oops. The pmd_present() check should be after the page_to_pfn() !=
> pte_pfn() check.
No, you're forgetting that the case Andrew rightly indicates is
covered by the ptecount check I added to page_convert_anon, and
commented at length there. As I said yesterday, I don't think
this "Double check" on *pmd serves any real purpose as coded
(whereas the earlier "Double check" on *pgd is vital).
Hugh
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2.5.66-mm2] Fix page_convert_anon locking issues
2003-04-02 23:23 ` Dave McCracken
2003-04-02 23:38 ` Andrew Morton
@ 2003-04-03 0:06 ` Hugh Dickins
1 sibling, 0 replies; 20+ messages in thread
From: Hugh Dickins @ 2003-04-03 0:06 UTC (permalink / raw)
To: Dave McCracken; +Cc: Andrew Morton, linux-mm, linux-kernel
On Wed, 2 Apr 2003, Dave McCracken wrote:
> --On Wednesday, April 02, 2003 15:09:03 -0800 Andrew Morton
> <akpm@digeo.com> wrote:
> >
> > How about setting PageAnon at the _start_ of the operation?
> > page_remove_rmap() will cope with that OK.
>
> Hmm... I was gonna say that page_remove_rmap will BUG() if it doesn't find
> the entry, but it's only under DEBUG and could easily be changed. Lemme
> think on this one a bit. I need to assure myself it's safe to go unlocked
> in the middle.
Yes, it's an interesting idea, but by no means clear it's safe.
I'll think about it too, but sorry, no more tonight.
Hugh
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2.5.66-mm2] Fix page_convert_anon locking issues
2003-04-02 23:58 ` Dave McCracken
@ 2003-04-03 14:49 ` Dave McCracken
2003-04-03 15:38 ` Hugh Dickins
` (2 more replies)
0 siblings, 3 replies; 20+ messages in thread
From: Dave McCracken @ 2003-04-03 14:49 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm, linux-kernel
--On Wednesday, April 02, 2003 17:58:08 -0600 Dave McCracken
<dmccr@us.ibm.com> wrote:
> It's looking more and more like we should use your other suggestion. It's
> definitely simpler if we can make it failsafe. I'll code it up tomorrow.
I thought of a big hole in the simpler scheme you suggested. It occurred
to me that try_to_unmap will fail. It will see the PageAnon flag so it'll
just walk the pte_chain and assume it doesn't have to walk the vmas. This
will leave the page with some stranded mappings. Actually
page_convert_anon will then finish, and we'll have a page where
try_to_unmap claims it has succeeded but still has mappings.
Dave McCracken
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2.5.66-mm2] Fix page_convert_anon locking issues
2003-04-03 14:49 ` Dave McCracken
@ 2003-04-03 15:38 ` Hugh Dickins
2003-04-03 16:00 ` Dave McCracken
2003-04-03 20:00 ` Andrew Morton
2003-04-03 20:06 ` Andrew Morton
2 siblings, 1 reply; 20+ messages in thread
From: Hugh Dickins @ 2003-04-03 15:38 UTC (permalink / raw)
To: Dave McCracken; +Cc: Andrew Morton, linux-mm, linux-kernel
On Thu, 3 Apr 2003, Dave McCracken wrote:
>
> I thought of a big hole in the simpler scheme you suggested. It occurred
> to me that try_to_unmap will fail. It will see the PageAnon flag so it'll
> just walk the pte_chain and assume it doesn't have to walk the vmas. This
> will leave the page with some stranded mappings. Actually
> page_convert_anon will then finish, and we'll have a page where
> try_to_unmap claims it has succeeded but still has mappings.
I don't see that as a big hole at all. While we're in page_convert_anon,
yes, page_referenced won't find all the ptes and try_to_unmap won't be
able to unmap them all; but there are plenty of other reasons why a page
may be briefly unfreeable even though try_to_unmap succeeded, it'll just
try again later.
I haven't really had my think yet, but the only difficulty I've seen so
far is in maintaining the nr_mapped stats correctly. page_convert_anon
insert an initial dummy entry (the entry install_page is about to add?)
to make sure page_mapped cannot go false?
(Hmm, is the current page_convert_anon maintaining nr_reverse_maps
correctly? I would think not, since it's doing nothing about it, and
page_remove_rmap would decrement seeing an Anon. But perhaps I'm
confused again, a quick test doesn't show the drop I'd expect.)
Hugh
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2.5.66-mm2] Fix page_convert_anon locking issues
2003-04-03 15:38 ` Hugh Dickins
@ 2003-04-03 16:00 ` Dave McCracken
2003-04-03 16:33 ` Hugh Dickins
0 siblings, 1 reply; 20+ messages in thread
From: Dave McCracken @ 2003-04-03 16:00 UTC (permalink / raw)
To: Hugh Dickins; +Cc: Andrew Morton, linux-mm, linux-kernel
--On Thursday, April 03, 2003 16:38:12 +0100 Hugh Dickins
<hugh@veritas.com> wrote:
> I don't see that as a big hole at all. While we're in page_convert_anon,
> yes, page_referenced won't find all the ptes and try_to_unmap won't be
> able to unmap them all; but there are plenty of other reasons why a page
> may be briefly unfreeable even though try_to_unmap succeeded, it'll just
> try again later.
No, try_to_unmap will claim success when in fact there are still mappings.
It'd be all right if it failed, but there's no way to tell it to fail. The
page will be freed by kswapd based on try_to_unmap's claim of success.
> (Hmm, is the current page_convert_anon maintaining nr_reverse_maps
> correctly? I would think not, since it's doing nothing about it, and
> page_remove_rmap would decrement seeing an Anon. But perhaps I'm
> confused again, a quick test doesn't show the drop I'd expect.)
You're right, it is a hole.
Dave
======================================================================
Dave McCracken IBM Linux Base Kernel Team 1-512-838-3059
dmccr@us.ibm.com T/L 678-3059
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2.5.66-mm2] Fix page_convert_anon locking issues
2003-04-03 16:00 ` Dave McCracken
@ 2003-04-03 16:33 ` Hugh Dickins
2003-04-03 16:39 ` Dave McCracken
0 siblings, 1 reply; 20+ messages in thread
From: Hugh Dickins @ 2003-04-03 16:33 UTC (permalink / raw)
To: Dave McCracken; +Cc: Andrew Morton, linux-mm, linux-kernel
On Thu, 3 Apr 2003, Dave McCracken wrote:
>
> No, try_to_unmap will claim success when in fact there are still mappings.
> It'd be all right if it failed, but there's no way to tell it to fail. The
> page will be freed by kswapd based on try_to_unmap's claim of success.
No: see the various checks on page_count(page) in vmscan.c:
though page_convert_anon temporarily leaves a page with neither
mapcount nor the right number of pte pointers, page_count is unaffected.
Hugh
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2.5.66-mm2] Fix page_convert_anon locking issues
2003-04-03 16:33 ` Hugh Dickins
@ 2003-04-03 16:39 ` Dave McCracken
0 siblings, 0 replies; 20+ messages in thread
From: Dave McCracken @ 2003-04-03 16:39 UTC (permalink / raw)
To: Hugh Dickins; +Cc: Andrew Morton, linux-mm, linux-kernel
--On Thursday, April 03, 2003 17:33:31 +0100 Hugh Dickins
<hugh@veritas.com> wrote:
> No: see the various checks on page_count(page) in vmscan.c:
> though page_convert_anon temporarily leaves a page with neither
> mapcount nor the right number of pte pointers, page_count is unaffected.
Oh, hmm... Ok, so it's safe... This could have some interesting
implications all around.
Dave
======================================================================
Dave McCracken IBM Linux Base Kernel Team 1-512-838-3059
dmccr@us.ibm.com T/L 678-3059
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2.5.66-mm2] Fix page_convert_anon locking issues
2003-04-03 14:49 ` Dave McCracken
2003-04-03 15:38 ` Hugh Dickins
@ 2003-04-03 20:00 ` Andrew Morton
2003-04-03 20:06 ` Andrew Morton
2 siblings, 0 replies; 20+ messages in thread
From: Andrew Morton @ 2003-04-03 20:00 UTC (permalink / raw)
To: Dave McCracken; +Cc: linux-mm, linux-kernel
Dave McCracken <dmccr@us.ibm.com> wrote:
>
>
> --On Wednesday, April 02, 2003 17:58:08 -0600 Dave McCracken
> <dmccr@us.ibm.com> wrote:
>
> > It's looking more and more like we should use your other suggestion. It's
> > definitely simpler if we can make it failsafe. I'll code it up tomorrow.
>
> I thought of a big hole in the simpler scheme you suggested. It occurred
> to me that try_to_unmap will fail. It will see the PageAnon flag so it'll
> just walk the pte_chain and assume it doesn't have to walk the vmas. This
> will leave the page with some stranded mappings. Actually
> page_convert_anon will then finish, and we'll have a page where
> try_to_unmap claims it has succeeded but still has mappings.
>
Lock the page, and page reclaim will not go near it. It needs to be locked
across page_convert_anon() anyway, to protect ->mapping.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2.5.66-mm2] Fix page_convert_anon locking issues
2003-04-03 14:49 ` Dave McCracken
2003-04-03 15:38 ` Hugh Dickins
2003-04-03 20:00 ` Andrew Morton
@ 2003-04-03 20:06 ` Andrew Morton
2003-04-03 20:56 ` Andrew Morton
2 siblings, 1 reply; 20+ messages in thread
From: Andrew Morton @ 2003-04-03 20:06 UTC (permalink / raw)
To: Dave McCracken; +Cc: linux-mm, linux-kernel
Dave McCracken <dmccr@us.ibm.com> wrote:
>
>
> --On Wednesday, April 02, 2003 17:58:08 -0600 Dave McCracken
> <dmccr@us.ibm.com> wrote:
>
> > It's looking more and more like we should use your other suggestion. It's
> > definitely simpler if we can make it failsafe. I'll code it up tomorrow.
>
> I thought of a big hole in the simpler scheme you suggested. It occurred
> to me that try_to_unmap will fail. It will see the PageAnon flag so it'll
> just walk the pte_chain and assume it doesn't have to walk the vmas. This
> will leave the page with some stranded mappings. Actually
> page_convert_anon will then finish, and we'll have a page where
> try_to_unmap claims it has succeeded but still has mappings.
>
page_referenced() has the same problem, so refill_inactive_zone() will need
to lock pages too.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2.5.66-mm2] Fix page_convert_anon locking issues
2003-04-03 20:06 ` Andrew Morton
@ 2003-04-03 20:56 ` Andrew Morton
2003-04-03 21:01 ` Dave McCracken
0 siblings, 1 reply; 20+ messages in thread
From: Andrew Morton @ 2003-04-03 20:56 UTC (permalink / raw)
To: dmccr; +Cc: linux-mm, linux-kernel
Andrew Morton <akpm@digeo.com> wrote:
>
> page_referenced() has the same problem, so refill_inactive_zone() will need
> to lock pages too.
Complete bollocks. As long as the pte chains are consistent while
refill_inactive_zone holds pte_chain_lock (they darn well should be),
concurrent page_referenced() and page_convert_anon() is fine.
It could be that page_referenced() returns an inappropriate answer, but it's
so rare we don't care.
Which is good. We really don't want to lock pages in refill_inactive_zone()
to keep the extremely rare page_convert_anon() away. refill_inactive_zone()
is more a bath-temperature path than a hotpath, but still...
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 2.5.66-mm2] Fix page_convert_anon locking issues
2003-04-03 20:56 ` Andrew Morton
@ 2003-04-03 21:01 ` Dave McCracken
0 siblings, 0 replies; 20+ messages in thread
From: Dave McCracken @ 2003-04-03 21:01 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm, linux-kernel
--On Thursday, April 03, 2003 12:56:34 -0800 Andrew Morton <akpm@digeo.com>
wrote:
> Complete bollocks. As long as the pte chains are consistent while
> refill_inactive_zone holds pte_chain_lock (they darn well should be),
> concurrent page_referenced() and page_convert_anon() is fine.
Good. You just saved me from having to butcher refill_inactive_zone :)
Dave
======================================================================
Dave McCracken IBM Linux Base Kernel Team 1-512-838-3059
dmccr@us.ibm.com T/L 678-3059
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>
^ permalink raw reply [flat|nested] 20+ messages in thread
end of thread, other threads:[~2003-04-03 21:01 UTC | newest]
Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2003-04-02 17:13 [PATCH 2.5.66-mm2] Fix page_convert_anon locking issues Dave McCracken
2003-04-02 21:29 ` Andrew Morton
2003-04-02 21:56 ` Dave McCracken
2003-04-02 23:09 ` Andrew Morton
2003-04-02 23:23 ` Dave McCracken
2003-04-02 23:38 ` Andrew Morton
2003-04-02 23:42 ` Dave McCracken
2003-04-02 23:52 ` Andrew Morton
2003-04-02 23:58 ` Dave McCracken
2003-04-03 14:49 ` Dave McCracken
2003-04-03 15:38 ` Hugh Dickins
2003-04-03 16:00 ` Dave McCracken
2003-04-03 16:33 ` Hugh Dickins
2003-04-03 16:39 ` Dave McCracken
2003-04-03 20:00 ` Andrew Morton
2003-04-03 20:06 ` Andrew Morton
2003-04-03 20:56 ` Andrew Morton
2003-04-03 21:01 ` Dave McCracken
2003-04-02 23:59 ` Hugh Dickins
2003-04-03 0:06 ` Hugh Dickins
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox