* Re: ZERO_PAGE refcounting causes cache line bouncing
2007-03-19 5:56 ` Nick Piggin
@ 2007-03-19 6:24 ` Nick Piggin
2007-03-19 6:41 ` Nick Piggin
2007-03-19 17:06 ` Christoph Lameter
2007-03-19 12:03 ` Robin Holt
` (2 subsequent siblings)
3 siblings, 2 replies; 13+ messages in thread
From: Nick Piggin @ 2007-03-19 6:24 UTC (permalink / raw)
Cc: William Lee Irwin III, Christoph Lameter, linux-mm
[-- Attachment #1: Type: text/plain, Size: 550 bytes --]
Nick Piggin wrote:
> I've always thought the bouncing issue was a silly one and should be
> fixed, of course. Maybe the reason my fix was vetoed was lack of numbers.
> Christoph, would you oblige? I'll dig out the patch and repost.
Something like this roughly should get rid of ZERO_PAGE _count and _mapcount
manipulation for anonymous pages. (others still exist, XIP and /dev/zero, but
they should not be a large concern AFAIKS).
I haven't booted this, but it is a quick forward port + some fixes and
simplifications.
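
For readers following along, here is a minimal user-space sketch of the idea, not the kernel code itself. It assumes a toy struct page with plain int counters (the real _mapcount starts at -1), and the names zero_page, model_map_anon and model_unmap_anon are hypothetical. The point it illustrates is that once the zero page is special-cased, mapping and unmapping it never writes to a shared counter, so there is no cache line left to bounce between CPUs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct page: only the two counters the patch touches. */
struct page {
	int count;	/* models _count */
	int mapcount;	/* models _mapcount (simplified: starts at 0) */
};

/* One shared zero page; a hypothetical model of ZERO_PAGE(addr). */
static struct page zero_page = { .count = 1, .mapcount = 0 };

static bool is_zero_page(const struct page *page)
{
	return page == &zero_page;
}

/* Map a page into an anonymous mapping; the zero page skips all accounting. */
static void model_map_anon(struct page *page)
{
	assert(page != NULL);
	if (is_zero_page(page))
		return;		/* no write to the shared struct page */
	page->count++;
	page->mapcount++;
}

/* Unmap; mirrors the early "continue" the patch adds to zap_pte_range(). */
static void model_unmap_anon(struct page *page)
{
	assert(page != NULL);
	if (is_zero_page(page))
		return;
	page->mapcount--;
	page->count--;
}
```

The map and unmap paths must agree on the special case: if only one side skipped the accounting, the counters would drift.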
--
SUSE Labs, Novell Inc.
[-- Attachment #2: mm-special-case-ZERO_PAGE.patch --]
[-- Type: text/plain, Size: 1012 bytes --]
Index: linux-2.6/mm/memory.c
===================================================================
--- linux-2.6.orig/mm/memory.c
+++ linux-2.6/mm/memory.c
@@ -665,7 +665,8 @@ static unsigned long zap_pte_range(struc
 			ptent = ptep_get_and_clear_full(mm, addr, pte,
 							tlb->fullmm);
 			tlb_remove_tlb_entry(tlb, pte, addr);
-			if (unlikely(!page))
+			if (unlikely(!page ||
+				(!vma->vm_file && page == ZERO_PAGE(addr))))
 				continue;
 			if (unlikely(details) && details->nonlinear_vma
 			    && linear_page_index(details->nonlinear_vma,
@@ -2152,15 +2153,12 @@ static int do_anonymous_page(struct mm_s
 	} else {
 		/* Map the ZERO_PAGE - vm_page_prot is readonly */
 		page = ZERO_PAGE(address);
-		page_cache_get(page);
 		entry = mk_pte(page, vma->vm_page_prot);
 
 		ptl = pte_lockptr(mm, pmd);
 		spin_lock(ptl);
 		if (!pte_none(*page_table))
-			goto release;
-		inc_mm_counter(mm, file_rss);
-		page_add_file_rmap(page);
+			goto unlock;
 	}
 
 	set_pte_at(mm, address, page_table, entry);
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: ZERO_PAGE refcounting causes cache line bouncing
2007-03-19 6:24 ` Nick Piggin
@ 2007-03-19 6:41 ` Nick Piggin
2007-03-19 10:04 ` Nick Piggin
2007-03-19 17:06 ` Christoph Lameter
1 sibling, 1 reply; 13+ messages in thread
From: Nick Piggin @ 2007-03-19 6:41 UTC (permalink / raw)
To: Nick Piggin; +Cc: William Lee Irwin III, Christoph Lameter, linux-mm
Nick Piggin wrote:
> Something like this roughly should get rid of ZERO_PAGE _count and
> _mapcount
> manipulation for anonymous pages. (others still exist, XIP and
> /dev/zero, but
> they should not be a large concern AFAIKS).
>
> I haven't booted this, but it is a quick forward port + some fixes and
> simplifications.
>
>
> ------------------------------------------------------------------------
>
> Index: linux-2.6/mm/memory.c
> ===================================================================
> --- linux-2.6.orig/mm/memory.c
> +++ linux-2.6/mm/memory.c
> @@ -665,7 +665,8 @@ static unsigned long zap_pte_range(struc
> ptent = ptep_get_and_clear_full(mm, addr, pte,
> tlb->fullmm);
> tlb_remove_tlb_entry(tlb, pte, addr);
> - if (unlikely(!page))
> + if (unlikely(!page ||
> + (!vma->vm_file && page == ZERO_PAGE(addr))))
> continue;
Hmm, well I suppose it would be cleaner if this check used the one in
handle_pte_fault instead of !vma->vm_file ie. (!vma->vm_ops ||
!vma->vm_ops->nopage)
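
A rough sketch of what that predicate could look like as a helper, written as a user-space model with mock structures. The helper name vma_takes_anon_faults and the stub model_nopage are hypothetical, and the nopage signature is simplified; only the condition itself comes from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal mock types; the real kernel structures have many more fields. */
struct vm_operations_struct {
	void *(*nopage)(void);	/* signature simplified for this model */
};

struct vm_area_struct {
	const struct vm_operations_struct *vm_ops;
	void *vm_file;
};

/* Stub fault handler, present only so a vma with ->nopage can be built. */
static void *model_nopage(void)
{
	return NULL;
}

/*
 * Hypothetical helper: true when the vma has no ->nopage handler,
 * i.e. the condition (!vma->vm_ops || !vma->vm_ops->nopage).
 */
static bool vma_takes_anon_faults(const struct vm_area_struct *vma)
{
	assert(vma != NULL);
	return !vma->vm_ops || !vma->vm_ops->nopage;
}
```

The difference from !vma->vm_file shows up for a vma that has a file but no ->nopage handler: this predicate treats it like the anonymous case, which is the same decision the fault path makes.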
> if (unlikely(details) && details->nonlinear_vma
> && linear_page_index(details->nonlinear_vma,
> @@ -2152,15 +2153,12 @@ static int do_anonymous_page(struct mm_s
> } else {
> /* Map the ZERO_PAGE - vm_page_prot is readonly */
> page = ZERO_PAGE(address);
> - page_cache_get(page);
> entry = mk_pte(page, vma->vm_page_prot);
>
> ptl = pte_lockptr(mm, pmd);
> spin_lock(ptl);
> if (!pte_none(*page_table))
> - goto release;
> - inc_mm_counter(mm, file_rss);
> - page_add_file_rmap(page);
> + goto unlock;
> }
>
> set_pte_at(mm, address, page_table, entry);
--
SUSE Labs, Novell Inc.
Send instant messages to your online friends http://au.messenger.yahoo.com
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: ZERO_PAGE refcounting causes cache line bouncing
2007-03-19 6:41 ` Nick Piggin
@ 2007-03-19 10:04 ` Nick Piggin
0 siblings, 0 replies; 13+ messages in thread
From: Nick Piggin @ 2007-03-19 10:04 UTC (permalink / raw)
To: Nick Piggin; +Cc: William Lee Irwin III, Christoph Lameter, linux-mm
Nick Piggin wrote:
> Nick Piggin wrote:
>
>> Something like this roughly should get rid of ZERO_PAGE _count and
>> _mapcount
>> manipulation for anonymous pages. (others still exist, XIP and
>> /dev/zero, but
>> they should not be a large concern AFAIKS).
>>
>> I haven't booted this, but it is a quick forward port + some fixes and
>> simplifications.
>>
>>
>> ------------------------------------------------------------------------
>>
>> Index: linux-2.6/mm/memory.c
>> ===================================================================
>> --- linux-2.6.orig/mm/memory.c
>> +++ linux-2.6/mm/memory.c
>> @@ -665,7 +665,8 @@ static unsigned long zap_pte_range(struc
>> ptent = ptep_get_and_clear_full(mm, addr, pte,
>> tlb->fullmm);
>> tlb_remove_tlb_entry(tlb, pte, addr);
>> - if (unlikely(!page))
>> + if (unlikely(!page ||
>> + (!vma->vm_file && page == ZERO_PAGE(addr))))
>> continue;
>
>
> Hmm, well I suppose it would be cleaner if this check used the one in
> handle_pte_fault instead of !vma->vm_file ie. (!vma->vm_ops ||
> !vma->vm_ops->nopage)
Bah, I also missed a reject for a similar hunk required in copy_one_pte.
I don't think there is anything more required after that, though... I
will actually test it tomorrow and send an updated patch with proper
changelog.
--
SUSE Labs, Novell Inc.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: ZERO_PAGE refcounting causes cache line bouncing
2007-03-19 6:24 ` Nick Piggin
2007-03-19 6:41 ` Nick Piggin
@ 2007-03-19 17:06 ` Christoph Lameter
2007-03-20 2:53 ` Nick Piggin
1 sibling, 1 reply; 13+ messages in thread
From: Christoph Lameter @ 2007-03-19 17:06 UTC (permalink / raw)
To: Nick Piggin; +Cc: William Lee Irwin III, linux-mm
On Mon, 19 Mar 2007, Nick Piggin wrote:
> I haven't booted this, but it is a quick forward port + some fixes and
> simplifications.
Eeek patch vanished.
The comparison with ZERO_PAGE may fail if we have multiple zero pages.
Would it be possible to check for PageReserved?
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: ZERO_PAGE refcounting causes cache line bouncing
2007-03-19 17:06 ` Christoph Lameter
@ 2007-03-20 2:53 ` Nick Piggin
0 siblings, 0 replies; 13+ messages in thread
From: Nick Piggin @ 2007-03-20 2:53 UTC (permalink / raw)
To: Christoph Lameter; +Cc: William Lee Irwin III, linux-mm
Christoph Lameter wrote:
> On Mon, 19 Mar 2007, Nick Piggin wrote:
>
>
>>I haven't booted this, but it is a quick forward port + some fixes and
>>simplifications.
>
>
> Eeek patch vanished.
>
> The comparison with ZERO_PAGE may fail if we have multiple zero pages.
> Would it be possible to check for PageReserved?
That still wasn't quite right either.
I don't want to check for PageReserved, because I want to get rid of
that flag one day. The Robin/Bill approach for multiple zero pages
will work, though.
--
SUSE Labs, Novell Inc.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: ZERO_PAGE refcounting causes cache line bouncing
2007-03-19 5:56 ` Nick Piggin
2007-03-19 6:24 ` Nick Piggin
@ 2007-03-19 12:03 ` Robin Holt
2007-03-20 2:35 ` Nick Piggin
2007-03-19 12:46 ` William Lee Irwin III
2007-03-19 17:04 ` Christoph Lameter
3 siblings, 1 reply; 13+ messages in thread
From: Robin Holt @ 2007-03-19 12:03 UTC (permalink / raw)
To: Nick Piggin; +Cc: William Lee Irwin III, Christoph Lameter, linux-mm
On Mon, Mar 19, 2007 at 04:56:47PM +1100, Nick Piggin wrote:
> Yes, I have the patch to do it quite easily. Per-node ZERO_PAGE could be
> another option, but that's going to cost another page flag if we wish to
> recognise the zero page in wp faults like we do now (hmm, for some reason
> it is OK to special case it _there_).
Could we do a per-node ZERO_PAGE as a pointer from the node structure
and then use a page_to_nid to get back to the node and compare the page
to the node's zero page instead of using another page flag which would
actually only be used on numa?
Thanks,
Robin
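
A minimal user-space sketch of the per-node pointer idea, under stated assumptions: struct model_node stands in for the node structure, the nid field stands in for whatever page_to_nid() decodes in the kernel, and every model_* name and MODEL_MAX_NODES are hypothetical. It only shows that the zero-page test needs no page flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_NODES 4	/* arbitrary size for this model */

struct page {
	int nid;	/* owning node; the kernel derives this via page_to_nid() */
};

struct model_node {
	struct page *zero_page;	/* hypothetical per-node zero page pointer */
};

static struct page model_zero_pages[MODEL_MAX_NODES];
static struct model_node model_nodes[MODEL_MAX_NODES];

/* Give every node its own zero page and record it in the node structure. */
static void model_init_zero_pages(void)
{
	for (int nid = 0; nid < MODEL_MAX_NODES; nid++) {
		model_zero_pages[nid].nid = nid;
		model_nodes[nid].zero_page = &model_zero_pages[nid];
	}
}

static int model_page_to_nid(const struct page *page)
{
	return page->nid;
}

/* A page is a zero page iff it is the zero page of the node it lives on. */
static bool model_is_zero_page(const struct page *page)
{
	int nid = model_page_to_nid(page);

	assert(nid >= 0 && nid < MODEL_MAX_NODES);
	return model_nodes[nid].zero_page == page;
}
```

The check costs one node lookup and one pointer comparison, and on a non-NUMA build it collapses to a comparison against a single page.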
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: ZERO_PAGE refcounting causes cache line bouncing
2007-03-19 12:03 ` Robin Holt
@ 2007-03-20 2:35 ` Nick Piggin
0 siblings, 0 replies; 13+ messages in thread
From: Nick Piggin @ 2007-03-20 2:35 UTC (permalink / raw)
To: Robin Holt; +Cc: William Lee Irwin III, Christoph Lameter, linux-mm
Robin Holt wrote:
> On Mon, Mar 19, 2007 at 04:56:47PM +1100, Nick Piggin wrote:
>
>>Yes, I have the patch to do it quite easily. Per-node ZERO_PAGE could be
>>another option, but that's going to cost another page flag if we wish to
>>recognise the zero page in wp faults like we do now (hmm, for some reason
>>it is OK to special case it _there_).
>
>
> Could we do a per-node ZERO_PAGE as a pointer from the node structure
> and then use a page_to_nid to get back to the node and compare the page
> to the node's zero page instead of using another page flag which would
> actually only be used on numa?
Yes, that's a nice way to do it.
--
SUSE Labs, Novell Inc.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: ZERO_PAGE refcounting causes cache line bouncing
2007-03-19 5:56 ` Nick Piggin
2007-03-19 6:24 ` Nick Piggin
2007-03-19 12:03 ` Robin Holt
@ 2007-03-19 12:46 ` William Lee Irwin III
2007-03-20 2:34 ` Nick Piggin
2007-03-19 17:04 ` Christoph Lameter
3 siblings, 1 reply; 13+ messages in thread
From: William Lee Irwin III @ 2007-03-19 12:46 UTC (permalink / raw)
To: Nick Piggin; +Cc: Christoph Lameter, linux-mm
On Mon, Mar 19, 2007 at 04:56:47PM +1100, Nick Piggin wrote:
> Yes, I have the patch to do it quite easily. Per-node ZERO_PAGE could be
> another option, but that's going to cost another page flag if we wish to
> recognise the zero page in wp faults like we do now (hmm, for some reason
> it is OK to special case it _there_).
No need for a page flag. A per-node array of struct page * can be used
to check by merely indexing into it with the nid of the page's node. e.g.
struct page *get_zero_page(int nid, unsigned long addr)
{
	return zero_pages[nid][(addr & SOME_ARCHDEP_MASK) >> PAGE_SHIFT];
}

/* any time we fish one out of a pte we have a uvaddr */
int is_zero_page_addr(struct page *page, unsigned long address)
{
	return page == get_zero_page(page_to_nid(page), address);
}
-- wli
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: ZERO_PAGE refcounting causes cache line bouncing
2007-03-19 12:46 ` William Lee Irwin III
@ 2007-03-20 2:34 ` Nick Piggin
0 siblings, 0 replies; 13+ messages in thread
From: Nick Piggin @ 2007-03-20 2:34 UTC (permalink / raw)
To: William Lee Irwin III; +Cc: Christoph Lameter, linux-mm
William Lee Irwin III wrote:
> On Mon, Mar 19, 2007 at 04:56:47PM +1100, Nick Piggin wrote:
>
>>Yes, I have the patch to do it quite easily. Per-node ZERO_PAGE could be
>>another option, but that's going to cost another page flag if we wish to
>>recognise the zero page in wp faults like we do now (hmm, for some reason
>>it is OK to special case it _there_).
>
>
> No need for a page flag. A per-node array of struct page * can be used
> to check by merely indexing into it with the nid of the page's node. e.g.
>
> struct page *get_zero_page(int nid, unsigned long addr)
> {
> 	return zero_pages[nid][(addr & SOME_ARCHDEP_MASK) >> PAGE_SHIFT];
> }
>
> /* any time we fish one out of a pte we have a uvaddr */
> int is_zero_page_addr(struct page *page, unsigned long address)
> {
> 	return page == get_zero_page(page_to_nid(page), address);
> }
Ah, good point :)
--
SUSE Labs, Novell Inc.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: ZERO_PAGE refcounting causes cache line bouncing
2007-03-19 5:56 ` Nick Piggin
` (2 preceding siblings ...)
2007-03-19 12:46 ` William Lee Irwin III
@ 2007-03-19 17:04 ` Christoph Lameter
3 siblings, 0 replies; 13+ messages in thread
From: Christoph Lameter @ 2007-03-19 17:04 UTC (permalink / raw)
To: Nick Piggin; +Cc: William Lee Irwin III, linux-mm
On Mon, 19 Mar 2007, Nick Piggin wrote:
> I've always thought the bouncing issue was a silly one and should be
> fixed, of course. Maybe the reason my fix was vetoed was lack of numbers.
> Christoph, would you oblige? I'll dig out the patch and repost.
Well this occurs on a 1024p system that is only sporadically available. It
gets so bad it hangs completely. Could you also provide a patch against
SLES10? We can get some bug action on this one, I think.
^ permalink raw reply [flat|nested] 13+ messages in thread