From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Sat, 1 May 2004 00:54:08 -0700
From: Andrew Morton <akpm@osdl.org>
Subject: rmap spin_trylock success rates
Message-Id: <20040501005408.1cd77796.akpm@osdl.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
Return-Path:
To: Hugh Dickins
Cc: linux-mm@kvack.org
List-ID:

I applied the appended patch to determine the spin_trylock success rate
in rmap.c.  The machine is a single P4-HT pseudo-2-way with 256MB of
memory, and the workload is a straightforward `usemem -m 400': allocate
and touch 400MB of memory.

page_referenced_one_miss = 4027
page_referenced_one_hit = 212605
try_to_unmap_one_miss = 3257
try_to_unmap_one_hit = 61153

That's a 5% failure rate (3257 misses out of 3257 + 61153 = 64410
attempts) in try_to_unmap_one()'s spin_trylock().

I suspect this is the reason for the problem which Martin Schwidefsky
reported a while back: this particular workload achieves only half the
disk bandwidth on SMP when compared with UP.  I poked around with that
a bit at the time and determined that it was due to poor I/O submission
patterns.  Increasing the disk queue from 128 slots to 1024 fixed it
completely, because the deeper request queue was able to fix up the bad
I/O submission patterns.

With `./qsbench -p 4 -m 96':

page_referenced_one_miss = 401
page_referenced_one_hit = 1224748
try_to_unmap_one_miss = 103
try_to_unmap_one_hit = 339944

That's negligible - a miss rate of around 0.03% in both cases.

I don't think we really need to do anything about this - the
everything-in-one-mm case isn't the most interesting situation.  In a
way it's an argument for serialising the whole page reclaim path.  It'd
be nice if we could reduce the page_table_lock hold times in there.

hm, now I'm confused.  We're running try_to_unmap_one() under
anonhd->lock, so why is there any contention for page_table_lock at all
with this workload?  It must be contending with page_referenced_one().
Taking anonhd->lock in page_referenced_one() also might fix this up.
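As an aside, for anyone unfamiliar with the pattern these counters sit
on: it is just trylock-or-bail.  Below is a minimal user-space sketch
of the same idea - pthread spinlocks standing in for
mm->page_table_lock, and per-thread counters instead of the (harmlessly
racy) shared ones in the patch.  The thread count and iteration count
are illustrative choices, not taken from the patch:

	#include <pthread.h>
	#include <stdio.h>

	static pthread_spinlock_t lock;

	struct counts {
		long hit;
		long miss;
	};

	static void *worker(void *arg)
	{
		struct counts *c = arg;
		int i;

		for (i = 0; i < 1000000; i++) {
			if (pthread_spin_trylock(&lock) != 0) {
				/* contended: bail out, as try_to_unmap_one() does */
				c->miss++;
				continue;
			}
			c->hit++;
			/* critical section would go here */
			pthread_spin_unlock(&lock);
		}
		return NULL;
	}

	int main(void)
	{
		pthread_t t[2];
		struct counts c[2] = { { 0, 0 }, { 0, 0 } };
		long hit = 0, miss = 0;
		int i;

		pthread_spin_init(&lock, PTHREAD_PROCESS_PRIVATE);
		for (i = 0; i < 2; i++)
			pthread_create(&t[i], NULL, worker, &c[i]);
		for (i = 0; i < 2; i++) {
			pthread_join(t[i], NULL);
			hit += c[i].hit;
			miss += c[i].miss;
		}
		printf("hit = %ld, miss = %ld (%.2f%% failure)\n",
		       hit, miss, 100.0 * miss / (hit + miss));
		return 0;
	}

Build with `gcc -O2 -pthread'; the failure rate it prints is the same
miss / (miss + hit) ratio computed from the counters above.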
 25-akpm/mm/rmap.c |   21 +++++++++++++++++++--
 1 files changed, 19 insertions(+), 2 deletions(-)

diff -puN mm/rmap.c~rmap-trylock-instrumentation mm/rmap.c
--- 25/mm/rmap.c~rmap-trylock-instrumentation	2004-05-01 00:19:05.485768648 -0700
+++ 25-akpm/mm/rmap.c	2004-05-01 00:22:32.029369248 -0700
@@ -27,6 +27,15 @@
 #include

+static struct stats {
+	int page_referenced_one_miss;
+	int page_referenced_one_hit;
+	int try_to_unmap_one_miss;
+	int try_to_unmap_one_hit;
+	int try_to_unmap_cluster_miss;
+	int try_to_unmap_cluster_hit;
+} stats;
+
 /*
  * struct anonmm: to track a bundle of anonymous memory mappings.
  *
@@ -178,6 +187,7 @@ static int page_referenced_one(struct pa
 	int referenced = 0;

 	if (!spin_trylock(&mm->page_table_lock)) {
+		stats.page_referenced_one_miss++;
 		/*
 		 * For debug we're currently warning if not all found,
 		 * but in this case that's expected: suppress warning.
@@ -185,6 +195,7 @@ static int page_referenced_one(struct pa
 		(*failed)++;
 		return 0;
 	}
+	stats.page_referenced_one_hit++;

 	pgd = pgd_offset(mm, address);
 	if (!pgd_present(*pgd))
@@ -495,8 +506,11 @@ static int try_to_unmap_one(struct page
 	 * We need the page_table_lock to protect us from page faults,
 	 * munmap, fork, etc...
 	 */
-	if (!spin_trylock(&mm->page_table_lock))
+	if (!spin_trylock(&mm->page_table_lock)) {
+		stats.try_to_unmap_one_miss++;
 		goto out;
+	}
+	stats.try_to_unmap_one_hit++;

 	pgd = pgd_offset(mm, address);
 	if (!pgd_present(*pgd))
@@ -596,8 +610,11 @@ static int try_to_unmap_cluster(struct m
 	 * We need the page_table_lock to protect us from page faults,
 	 * munmap, fork, etc...
 	 */
-	if (!spin_trylock(&mm->page_table_lock))
+	if (!spin_trylock(&mm->page_table_lock)) {
+		stats.try_to_unmap_cluster_miss++;
 		return SWAP_FAIL;
+	}
+	stats.try_to_unmap_cluster_hit++;

 	address = (vma->vm_start + cursor) & CLUSTER_MASK;
 	end = address + CLUSTER_SIZE;
_
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: aart@kvack.org