linux-mm.kvack.org archive mirror
* [PATCH] remove_inode_page rewrite.
@ 2000-05-09 20:14 Dave Jones
  2000-05-10 10:10 ` Steve Dodd
  0 siblings, 1 reply; 4+ messages in thread
From: Dave Jones @ 2000-05-09 20:14 UTC (permalink / raw)
  To: Linux Kernel Mailing List; +Cc: linux-mm

Hi,
 I'm not entirely convinced that remove_inode_page() is
SMP safe.  The diff below moves the pagecache_lock out into its
callers and rewrites invalidate_inode_pages() so that it doesn't
repeatedly take/drop the lock.

I believe that after CPU0 drops the pagecache_lock and starts
removing one page, CPU1 fails to lock the same page (as CPU0 grabbed it
with the trylock), moves to the next page in the list, succeeds,
removes it, and then rescans from the top.

With the current locking I believe it's then possible for CPU1 to
lock that page (again in the TryLockPage(page) call) just before CPU0
calls page_cache_release(page).
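
For reference, here is the loop as it stands today (the code the diff
below replaces), with the points the race hinges on marked:

void invalidate_inode_pages(struct inode * inode)
{
	struct list_head *head, *curr;
	struct page * page;

repeat:
	head = &inode->i_mapping->pages;
	spin_lock(&pagecache_lock);
	curr = head->next;

	while (curr != head) {
		page = list_entry(curr, struct page, list);
		curr = curr->next;

		/* We cannot invalidate a locked page */
		if (TryLockPage(page))
			continue;

		/* lock dropped here; the page is locked by us but
		 * still on the inode's list */
		spin_unlock(&pagecache_lock);

		lru_cache_del(page);
		remove_inode_page(page);	/* off the list from here on */
		UnlockPage(page);
		page_cache_release(page);
		goto repeat;
	}
	spin_unlock(&pagecache_lock);
}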

This patch probably kills us latency-wise, but looks a lot more
sane in my eyes.

 Any comments?

-- 
Dave.

--- filemap.c~	Tue May  9 19:37:13 2000
+++ filemap.c	Tue May  9 19:37:41 2000
@@ -91,44 +91,50 @@
  * Remove a page from the page cache and free it. Caller has to make
  * sure the page is locked and that nobody else uses it - or that usage
  * is safe.
+ * Caller must also be holding pagecache_lock
  */
 void remove_inode_page(struct page *page)
 {
 	if (!PageLocked(page))
 		PAGE_BUG(page);
 
-	spin_lock(&pagecache_lock);
 	remove_page_from_inode_queue(page);
 	remove_page_from_hash_queue(page);
 	page->mapping = NULL;
-	spin_unlock(&pagecache_lock);
 }
 
+
 void invalidate_inode_pages(struct inode * inode)
 {
 	struct list_head *head, *curr;
 	struct page * page;
 
- repeat:
-	head = &inode->i_mapping->pages;
 	spin_lock(&pagecache_lock);
+
+	head = &inode->i_mapping->pages;
+
+	if (head == head->next)
+		goto empty_list;
+
 	curr = head->next;
 
-	while (curr != head) {
+	do {
 		page = list_entry(curr, struct page, list);
 		curr = curr->next;
 
 		/* We cannot invalidate a locked page */
 		if (TryLockPage(page))
 			continue;
-		spin_unlock(&pagecache_lock);
 
 		lru_cache_del(page);
 		remove_inode_page(page);
 		UnlockPage(page);
 		page_cache_release(page);
-		goto repeat;
-	}
+		head = &inode->i_mapping->pages;
+
+	} while (curr != head); 
+
+empty_list:
 	spin_unlock(&pagecache_lock);
 }
 
@@ -180,7 +186,9 @@
 			 * page cache and creates a buffer-cache alias
 			 * to it causing all sorts of fun problems ...
 			 */
+			spin_lock(&pagecache_lock);
 			remove_inode_page(page);
+			spin_unlock(&pagecache_lock);
 
 			UnlockPage(page);
 			page_cache_release(page);


* Re: [PATCH] remove_inode_page rewrite.
  2000-05-09 20:14 [PATCH] remove_inode_page rewrite Dave Jones
@ 2000-05-10 10:10 ` Steve Dodd
  2000-05-10 17:25   ` Dave Jones
  0 siblings, 1 reply; 4+ messages in thread
From: Steve Dodd @ 2000-05-10 10:10 UTC (permalink / raw)
  To: Dave Jones; +Cc: Linux Kernel Mailing List, linux-mm

On Tue, May 09, 2000 at 09:14:08PM +0100, Dave Jones wrote:

> I believe that after CPU0 drops the pagecache_lock and starts
> removing one page, CPU1 fails to lock the same page (as CPU0 grabbed it
> with the trylock), moves to the next page in the list, succeeds,
> removes it, and then rescans from the top.
> 
> With the current locking I believe it's then possible for CPU1 to
> lock that page

Which page? CPU1 should never find the page CPU0 is freeing because it will
either be locked, or not on the list at all. By the time CPU0 unlocks the
page, it's removed it from the list (and it grabs the spinlock while messing
with the list structure).
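
Concretely, CPU0's sequence through the existing loop body is:

	if (TryLockPage(page))		/* CPU0 now holds the page lock */
		continue;
	spin_unlock(&pagecache_lock);

	lru_cache_del(page);
	remove_inode_page(page);	/* retakes pagecache_lock, takes the
					 * page off the inode and hash lists */
	UnlockPage(page);		/* only from here can CPU1's TryLockPage
					 * succeed - but the page is already off
					 * the list, so a rescan never sees it */
	page_cache_release(page);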

> (again in the TryLockPage(page) call) just before CPU0
> calls page_cache_release(page)
> 
> This patch probably kills us latency-wise, but looks a lot more
> sane in my eyes.

Now that invalidate_inode_pages() isn't calling sync_page, there seems to be
no reason to drop and retake the spinlock, I agree.

[..]
> - repeat:
> -	head = &inode->i_mapping->pages;
>  	spin_lock(&pagecache_lock);
> +
> +	head = &inode->i_mapping->pages;

That shouldn't be necessary - nobody is likely to change the address of
inode->i_mapping->pages under us :)

* Re: [PATCH] remove_inode_page rewrite.
  2000-05-10 10:10 ` Steve Dodd
@ 2000-05-10 17:25   ` Dave Jones
  2000-05-10 20:55     ` Dave Jones
  0 siblings, 1 reply; 4+ messages in thread
From: Dave Jones @ 2000-05-10 17:25 UTC (permalink / raw)
  To: Steve Dodd; +Cc: Linux Kernel Mailing List, linux-mm

On Wed, 10 May 2000, Steve Dodd wrote:

> Now that invalidate_inode_pages() isn't calling sync_page, there seems to be
> no reason to drop and retake the spinlock, I agree.

*nod*

> > +	head = &inode->i_mapping->pages;
> That shouldn't be necessary - nobody is likely to change the address of
> inode->i_mapping->pages under us :)

I spotted that, but wasn't entirely sure that the pagecache_lock was
enough to guarantee it. With that line removed as well,
invalidate_inode_pages() becomes a lot faster, as we only pass through
the list once, so maybe holding the spinlock for the whole function
isn't such a big deal.

Even if the race I thought was there doesn't exist, this could still be
worth applying for the performance increase alone. I'll do some
performance tests in the next day or so.

regards,

-- 
Dave.


* Re: [PATCH] remove_inode_page rewrite.
  2000-05-10 17:25   ` Dave Jones
@ 2000-05-10 20:55     ` Dave Jones
  0 siblings, 0 replies; 4+ messages in thread
From: Dave Jones @ 2000-05-10 20:55 UTC (permalink / raw)
  To: linux-mm

Ok, I've thrown some ideas around in #kernelnewbies regarding my last
patch and, with the help of Arjan & Quintela, have arrived at the
following patch.

The last patch I sent would 'forget' about locked pages and never
release them. This one makes multiple passes over the list until they
have all been freed (but in a much better way than the original code).
When we hit a locked page, we now sleep in lock_page() until whoever
holds it unlocks it, and then rescan the list.
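
In outline, the shape I'm aiming for is roughly this (only a sketch of
the intent - the real diff is below; the sketch keeps the current
TryLockPage()/UnlockPage() pairing, since remove_inode_page() still
expects the page to be locked):

void invalidate_inode_pages(struct inode * inode)
{
	struct list_head *head = &inode->i_mapping->pages;
	struct list_head *curr;
	struct page * page;

	for (;;) {
		spin_lock(&pagecache_lock);

		/* One pass over the list, freeing every page we can lock. */
		curr = head->next;
		while (curr != head) {
			page = list_entry(curr, struct page, list);
			curr = curr->next;

			/* We cannot invalidate a locked page */
			if (TryLockPage(page))
				continue;

			lru_cache_del(page);
			remove_inode_page(page);
			UnlockPage(page);
			page_cache_release(page);
		}

		/* Everything gone?  Then we're done. */
		if (head->next == head) {
			spin_unlock(&pagecache_lock);
			break;
		}

		/* Only locked pages are left: sleep on the first one until
		 * whoever holds it lets go, then rescan from the top. */
		page = list_entry(head->next, struct page, list);
		spin_unlock(&pagecache_lock);
		lock_page(page);
		UnlockPage(page);
	}
}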

This diff is untested, and has been sent here primarily as a sanity
check that I'm not going down some blind alley and making things worse
than they already are.

Arjan pointed out that a page could be unlocked from interrupt context,
which this code doesn't take into consideration and which I haven't
tested. If others want to prove or disprove that, please do so.

regards,

-- 
Dave.


--- filemap.c~	Tue May  9 19:37:13 2000
+++ filemap.c	Wed May 10 20:50:52 2000
@@ -91,45 +91,64 @@
  * Remove a page from the page cache and free it. Caller has to make
  * sure the page is locked and that nobody else uses it - or that usage
  * is safe.
+ * Caller must also be holding pagecache_lock
  */
 void remove_inode_page(struct page *page)
 {
 	if (!PageLocked(page))
 		PAGE_BUG(page);
 
-	spin_lock(&pagecache_lock);
 	remove_page_from_inode_queue(page);
 	remove_page_from_hash_queue(page);
 	page->mapping = NULL;
-	spin_unlock(&pagecache_lock);
 }
 
+
 void invalidate_inode_pages(struct inode * inode)
 {
 	struct list_head *head, *curr;
 	struct page * page;
 
- repeat:
-	head = &inode->i_mapping->pages;
-	spin_lock(&pagecache_lock);
-	curr = head->next;
+	while (head != head->next) {
 
-	while (curr != head) {
-		page = list_entry(curr, struct page, list);
-		curr = curr->next;
+		spin_lock(&pagecache_lock);
+
+		head = &inode->i_mapping->pages;
+		curr = head->next;
+
+		while (curr != head) {
+
+			page = list_entry(curr, struct page, list);
+			curr = curr->next;
 
-		/* We cannot invalidate a locked page */
-		if (TryLockPage(page))
-			continue;
-		spin_unlock(&pagecache_lock);
-
-		lru_cache_del(page);
-		remove_inode_page(page);
-		UnlockPage(page);
-		page_cache_release(page);
-		goto repeat;
+			/* We cannot invalidate a locked page */
+			if (PageLocked(page))
+				continue;
+
+			lru_cache_del(page);
+			remove_inode_page(page);
+			page_cache_release(page);
+		}
+
+		/* At this stage we have passed through the list
+		 * once, and there may still be locked pages. */
+
+		if (head->next!=head) {
+			page = list_entry(head->next,struct page,list);
+			spin_unlock(&pagecache_lock);
+
+			/* We need to block */
+			lock_page(page);
+			UnlockPage(page);
+
+		} else {
+		
+			/* No pages left in list. */
+			spin_unlock(&pagecache_lock);
+		}
 	}
-	spin_unlock(&pagecache_lock);
+
+empty_list:
 }
 
 /*
@@ -180,7 +199,9 @@
 			 * page cache and creates a buffer-cache alias
 			 * to it causing all sorts of fun problems ...
 			 */
+			spin_lock(&pagecache_lock);
 			remove_inode_page(page);
+			spin_unlock(&pagecache_lock);
 
 			UnlockPage(page);
 			page_cache_release(page);
