[PATCH]: VM 7/8 cluster pageout

linux-mm.kvack.org archive mirror

* [PATCH]: VM 7/8 cluster pageout
@ 2005-04-17 17:38 Nikita Danilov
  2005-04-26  4:15 ` Andrew Morton
  0 siblings, 1 reply; 8+ messages in thread
From: Nikita Danilov @ 2005-04-17 17:38 UTC (permalink / raw)
  To: linux-mm; +Cc: Andrew Morton

Implement pageout clustering at the VM level.

With this patch the VM scanner calls pageout_cluster() instead of
->writepage(). pageout_cluster() tries to find a group of dirty pages around
the target page, called the "pivot" page of the cluster. If a group of
suitable size is found, ->writepages() is called for it; otherwise
pageout_cluster() falls back to ->writepage().
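
For illustration, the window selection can be modeled in user space roughly
as follows. This is only a sketch, not the patch itself: a plain bool array
stands in for the radix-tree lookup plus the PageDirty() test, and the names
cluster_window(), page_is_dirty(), CLUSTER_WING and CLUSTER_SIZE are made up
for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed to mirror PAGE_CLUSTER_WING / PAGE_CLUSTER_SIZE from the patch. */
enum { CLUSTER_WING = 16, CLUSTER_SIZE = 2 * CLUSTER_WING };

/* Hypothetical stand-in for radix_tree_lookup() plus PageDirty(). */
static bool page_is_dirty(const bool *dirty, size_t npages, size_t index)
{
	return index < npages && dirty[index];
}

/*
 * Compute the inclusive range [*start, *end] of contiguous dirty pages
 * around the pivot: at most CLUSTER_WING pages below it, stopping early
 * once a CLUSTER_SIZE-aligned page has been included, and at most
 * CLUSTER_SIZE pages in total.
 */
static void cluster_window(const bool *dirty, size_t npages, size_t pivot,
			   size_t *start, size_t *end)
{
	size_t lo = pivot;
	size_t hi = pivot;

	assert(pivot < npages);

	/* grow downwards while the page just below is dirty */
	while (lo > 0 && pivot - lo < CLUSTER_WING &&
	       page_is_dirty(dirty, npages, lo - 1)) {
		lo--;
		if (lo % CLUSTER_SIZE == 0)
			break;	/* aligned page becomes the cluster start */
	}
	/* grow upwards until a gap or the size limit is hit */
	while (hi - lo + 1 < CLUSTER_SIZE &&
	       page_is_dirty(dirty, npages, hi + 1))
		hi++;

	*start = lo;
	*end = hi;
}
```

With every page of a 64-page file dirty and the pivot at index 40, this
selects pages 32..63: the downward scan stops at the aligned index 32 and
the upward scan stops once the cluster holds CLUSTER_SIZE pages.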

This is supposed to help workloads with significant page-out of file-system
pages from the tail of the inactive list (for example, heavy dirtying
through mmap), because a file system usually writes multiple pages more
efficiently. It should also be advantageous for file systems doing delayed
allocation, as in this case they will allocate whole extents at once.

A few points:

 - swap-cache pages are not clustered (they could be, but by
   page->private rather than page->index)

 - only kswapd does clustering, because the direct reclaim path should be
   low latency.

 - this patch adds new fields to struct writeback_control and expects
   ->writepages() to interpret them. This is needed, because pageout_cluster()
   calls ->writepages() with pivot page already locked, so that ->writepages()
   is allowed to only trylock other pages in the cluster.

   Besides, rather rough plumbing (the wbc->pivot_ret field) is added to check
   whether ->writepages() failed to write the pivot page for any reason (in
   which case pageout_cluster() falls back to ->writepage()).

   Only mpage_writepages() was updated to honor these new fields, but
   all in-tree ->writepages() implementations seem to call
   mpage_writepages(). (Except reiser4, of course, for which I'll send a
   (trivial) patch, if necessary).
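
The locking rule from the third point above can be sketched as a user-space
model. All names here (model_page, model_wbc, model_trylock(),
model_writepage(), model_writepages(), PIVOT_UNTOUCHED) are hypothetical and
only mimic the roles of wbc->pivot, wbc->pivot_ret and TestSetPageLocked()
in the patch. The model covers only the clustered case, where a pivot is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { PIVOT_UNTOUCHED = 42 };	/* sentinel, in the role of PIVOT_RET_MAGIC */

struct model_page {
	bool locked;
	bool dirty;
};

struct model_wbc {
	struct model_page *pivot;	/* already locked by the caller */
	int pivot_ret;			/* result for the pivot page */
};

/* Non-blocking lock attempt: returns true if the lock was taken. */
static bool model_trylock(struct model_page *page)
{
	if (page->locked)
		return false;
	page->locked = true;
	return true;
}

/* Pretend to write one locked page and unlock it afterwards. */
static int model_writepage(struct model_page *page)
{
	page->dirty = false;
	page->locked = false;
	return 0;
}

/* Write pages[0..n) and return the number of pages written. */
static size_t model_writepages(struct model_page *pages, size_t n,
			       struct model_wbc *wbc)
{
	size_t written = 0;

	assert(wbc->pivot != NULL);
	for (size_t i = 0; i < n; i++) {
		struct model_page *page = &pages[i];
		int ret;

		if (page != wbc->pivot) {
			/* never sleep on a page lock while the pivot is held */
			if (!model_trylock(page))
				continue;
			if (!page->dirty) {
				page->locked = false;
				continue;
			}
		}
		ret = model_writepage(page);
		if (page == wbc->pivot)
			wbc->pivot_ret = ret;	/* overwrite the sentinel */
		written++;
	}
	return written;
}
```

A caller in the position of pageout_cluster() would set pivot_ret to the
sentinel before the call and fall back to writing the pivot alone if the
sentinel is still there afterwards.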

Signed-off-by: Nikita Danilov <nikita@clusterfs.com>


 fs/mpage.c                |  118 +++++++++++++++++++++-------------------------
 include/linux/writeback.h |    6 ++
 mm/vmscan.c               |   72 +++++++++++++++++++++++++++-
 3 files changed, 133 insertions(+), 63 deletions(-)

diff -puN mm/vmscan.c~cluster-pageout mm/vmscan.c
--- bk-linux/mm/vmscan.c~cluster-pageout	2005-04-17 17:52:52.000000000 +0400
+++ bk-linux-nikita/mm/vmscan.c	2005-04-17 17:52:52.000000000 +0400
@@ -349,6 +349,76 @@ static void send_page_to_kaiod(struct pa
 	spin_unlock(&kaio_queue_lock);
 }
 
+enum {
+	PAGE_CLUSTER_WING = 16,
+	PAGE_CLUSTER_SIZE = 2 * PAGE_CLUSTER_WING,
+};
+
+enum {
+	PIVOT_RET_MAGIC = 42
+};
+
+static int pageout_cluster(struct page *page, struct address_space *mapping,
+			   struct writeback_control *wbc)
+{
+	pgoff_t punct;
+	pgoff_t start;
+	pgoff_t end;
+	struct page *opage = page;
+
+	if (PageSwapCache(page) || !current_is_kswapd())
+		return mapping->a_ops->writepage(page, wbc);
+
+	wbc->pivot = page;
+	punct = page->index;
+	read_lock_irq(&mapping->tree_lock);
+	for (start = punct - 1;
+	     start < punct && punct - start <= PAGE_CLUSTER_WING; -- start) {
+		page = radix_tree_lookup(&mapping->page_tree, start);
+		if (page == NULL || !PageDirty(page))
+			/*
+			 * no suitable page, stop cluster at this point
+			 */
+			break;
+		if ((start % PAGE_CLUSTER_SIZE) == 0) {
+			/* we reached an aligned page: make it the first
+			 * page of the cluster and stop */
+			-- start;
+			break;
+		}
+	}
+	++ start;
+	for (end = punct + 1;
+	     end > punct && end - start < PAGE_CLUSTER_SIZE; ++ end) {
+		/*
+		 * XXX nikita: consider find_get_pages_tag()
+		 */
+		page = radix_tree_lookup(&mapping->page_tree, end);
+		if (page == NULL || !PageDirty(page))
+			/*
+			 * no suitable page, stop cluster at this point
+			 */
+			break;
+	}
+	read_unlock_irq(&mapping->tree_lock);
+	-- end;
+	wbc->pivot_ret = PIVOT_RET_MAGIC; /* magic */
+	if (end > start) {
+		wbc->start = ((loff_t)start) << PAGE_CACHE_SHIFT;
+		wbc->end   = ((loff_t)end) << PAGE_CACHE_SHIFT;
+		wbc->end  += PAGE_CACHE_SIZE - 1;
+		wbc->nr_to_write = end - start + 1;
+		do_writepages(mapping, wbc);
+	}
+	if (wbc->pivot_ret == PIVOT_RET_MAGIC)
+		/*
+		 * single page, or ->writepages() skipped pivot for any
+		 * reason: just call ->writepage()
+		 */
+		wbc->pivot_ret = mapping->a_ops->writepage(opage, wbc);
+	return wbc->pivot_ret;
+}
+
 /*
  * Called by shrink_list() for each dirty page. Calls ->writepage().
  */
@@ -434,7 +504,7 @@ static pageout_t pageout(struct page *pa
 
 		ClearPageSkipped(page);
 		SetPageReclaim(page);
-		res = mapping->a_ops->writepage(page, &wbc);
+		res = pageout_cluster(page, mapping, &wbc);
 
 		if (res < 0)
 			handle_write_error(mapping, page, res);
diff -puN include/linux/writeback.h~cluster-pageout include/linux/writeback.h
--- bk-linux/include/linux/writeback.h~cluster-pageout	2005-04-17 17:52:52.000000000 +0400
+++ bk-linux-nikita/include/linux/writeback.h	2005-04-17 17:52:52.000000000 +0400
@@ -55,6 +55,12 @@ struct writeback_control {
 	unsigned encountered_congestion:1;	/* An output: a queue is full */
 	unsigned for_kupdate:1;			/* A kupdate writeback */
 	unsigned for_reclaim:1;			/* Invoked from the page allocator */
+	/* if non-NULL, page already locked by ->writepages()
+	 * caller. ->writepages() should use trylock on all other pages it
+	 * submits for IO */
+	struct page *pivot;
+	/* if ->pivot is not NULL, result for pivot page is stored here */
+	int pivot_ret;
 };
 
 /*
diff -puN fs/mpage.c~cluster-pageout fs/mpage.c
--- bk-linux/fs/mpage.c~cluster-pageout	2005-04-17 17:52:52.000000000 +0400
+++ bk-linux-nikita/fs/mpage.c	2005-04-17 17:52:52.000000000 +0400
@@ -391,7 +391,6 @@ __mpage_writepage(struct bio *bio, struc
 	sector_t *last_block_in_bio, int *ret, struct writeback_control *wbc,
 	writepage_t writepage_fn)
 {
-	struct address_space *mapping = page->mapping;
 	struct inode *inode = page->mapping->host;
 	const unsigned blkbits = inode->i_blkbits;
 	unsigned long end_index;
@@ -409,6 +408,7 @@ __mpage_writepage(struct bio *bio, struc
 	struct buffer_head map_bh;
 	loff_t i_size = i_size_read(inode);
 
+	*ret = 0;
 	if (page_has_buffers(page)) {
 		struct buffer_head *head = page_buffers(page);
 		struct buffer_head *bh = head;
@@ -582,30 +582,22 @@ alloc_new:
 confused:
 	if (bio)
 		bio = mpage_bio_submit(WRITE, bio);
-
-	if (writepage_fn) {
-		*ret = (*writepage_fn)(page, wbc);
-	} else {
-		*ret = -EAGAIN;
-		goto out;
-	}
-	/*
-	 * The caller has a ref on the inode, so *mapping is stable
-	 */
-	if (*ret) {
-		if (*ret == -ENOSPC)
-			set_bit(AS_ENOSPC, &mapping->flags);
-		else
-			set_bit(AS_EIO, &mapping->flags);
-	}
 out:
 	return bio;
 }
 
+static void handle_writepage_error(int err, struct address_space *mapping)
+{
+	if (unlikely(err == -ENOSPC))
+		set_bit(AS_ENOSPC, &mapping->flags);
+	else if (unlikely(err != 0))
+		set_bit(AS_EIO, &mapping->flags);
+}
+
 /**
  * mpage_writepages - walk the list of dirty pages of the given
  * address space and writepage() all of them.
- * 
+ *
  * @mapping: address space structure to write
  * @wbc: subtract the number of written pages from *@wbc->nr_to_write
  * @get_block: the filesystem's block mapper function.
@@ -682,51 +674,53 @@ retry:
 		for (i = 0; i < nr_pages; i++) {
 			struct page *page = pvec.pages[i];
 
-			/*
-			 * At this point we hold neither mapping->tree_lock nor
-			 * lock on the page itself: the page may be truncated or
-			 * invalidated (changing page->mapping to NULL), or even
-			 * swizzled back from swapper_space to tmpfs file
-			 * mapping
-			 */
-
-			lock_page(page);
+			if (page != wbc->pivot) {
+				/*
+				 * At this point we hold neither
+				 * mapping->tree_lock nor lock on the page
+				 * itself: the page may be truncated or
+				 * invalidated (changing page->mapping to
+				 * NULL), or even swizzled back from
+				 * swapper_space to tmpfs file mapping
+				 */
 
-			if (unlikely(page->mapping != mapping)) {
-				unlock_page(page);
-				continue;
-			}
+				if (wbc->pivot != NULL) {
+					if (unlikely(TestSetPageLocked(page)))
+						continue;
+				} else
+					lock_page(page);
+
+				if (unlikely(page->mapping != mapping)) {
+					unlock_page(page);
+					continue;
+				}
 
-			if (unlikely(is_range) && page->index > end) {
-				done = 1;
-				unlock_page(page);
-				continue;
-			}
+				if (unlikely(is_range) && page->index > end) {
+					done = 1;
+					unlock_page(page);
+					continue;
+				}
 
-			if (wbc->sync_mode != WB_SYNC_NONE)
-				wait_on_page_writeback(page);
+				if (wbc->sync_mode != WB_SYNC_NONE)
+					wait_on_page_writeback(page);
 
-			if (PageWriteback(page) ||
-					!clear_page_dirty_for_io(page)) {
-				unlock_page(page);
-				continue;
+				if (PageWriteback(page) ||
+				    !clear_page_dirty_for_io(page)) {
+					unlock_page(page);
+					continue;
+				}
 			}
 
-			if (writepage) {
+			if (writepage)
 				ret = (*writepage)(page, wbc);
-				if (ret) {
-					if (ret == -ENOSPC)
-						set_bit(AS_ENOSPC,
-							&mapping->flags);
-					else
-						set_bit(AS_EIO,
-							&mapping->flags);
-				}
-			} else {
+			else
 				bio = __mpage_writepage(bio, page, get_block,
-						&last_block_in_bio, &ret, wbc,
-						writepage_fn);
-			}
+							&last_block_in_bio,
+							&ret, wbc,
+							writepage_fn);
+			handle_writepage_error(ret, mapping);
+			if (page == wbc->pivot)
+				wbc->pivot_ret = ret;
 			if (ret || (--(wbc->nr_to_write) <= 0))
 				done = 1;
 			if (wbc->nonblocking && bdi_write_congested(bdi)) {
@@ -766,7 +760,7 @@ int mpage_writepage(struct page *page, g
 			&last_block_in_bio, &ret, wbc, NULL);
 	if (bio)
 		mpage_bio_submit(WRITE, bio);
-
+	handle_writepage_error(ret, page->mapping);
 	return ret;
 }
 EXPORT_SYMBOL(mpage_writepage);

_
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: aart@kvack.org


* Re: [PATCH]: VM 7/8 cluster pageout
  2005-04-17 17:38 [PATCH]: VM 7/8 cluster pageout Nikita Danilov
@ 2005-04-26  4:15 ` Andrew Morton
  2005-04-26  9:16   ` Nikita Danilov
  2005-05-02  4:12   ` William Lee Irwin III
  0 siblings, 2 replies; 8+ messages in thread
From: Andrew Morton @ 2005-04-26  4:15 UTC (permalink / raw)
  To: Nikita Danilov; +Cc: linux-mm

Nikita Danilov <nikita@clusterfs.com> wrote:
>
> Implement pageout clustering at the VM level.

I dunno...

Once __mpage_writepages() has started I/O against the pivot page, I don't
see that we have any guarantees that some other CPU cannot come in,
truncate or reclaim all the inode's pages and then reclaim the inode
altogether.  While __mpage_writepages() is still dinking with it all.

I had something like this happening in 2.5.10(ish), but ended up deciding
it was all too complex and writeout from the LRU is rare and the pages are
probably close-by on the LRU and the elevator sorting would catch most
cases so I tossed it all out.

Plus some of your other patches make LRU-based writeout even less common.

* Re: [PATCH]: VM 7/8 cluster pageout
  2005-04-26  4:15 ` Andrew Morton
@ 2005-04-26  9:16   ` Nikita Danilov
  2005-04-26  9:36     ` Andrew Morton
  2005-05-02  4:12   ` William Lee Irwin III
  1 sibling, 1 reply; 8+ messages in thread
From: Nikita Danilov @ 2005-04-26  9:16 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm

Andrew Morton writes:
 > Nikita Danilov <nikita@clusterfs.com> wrote:
 > >
 > > Implement pageout clustering at the VM level.
 > 
 > I dunno...
 > 
 > Once __mpage_writepages() has started I/O against the pivot page, I don't
 > see that we have any guarantees that some other CPU cannot come in,
 > truncate or reclaim all the inode's pages and then reclaim the inode
 > altogether.  While __mpage_writepages() is still dinking with it all.

Ah, silly me. Will __iget(page->mapping->host) in pageout_cluster() be
enough? We risk truncate on matching iput(), but VM scanner calls iput()
on inodes with ->i_nlink == 0 already (from shrink_dcache()).

Also, that patch fixes what I believe is a bug in mpage_writepages(): if
->writepage() returns WRITEPAGE_ACTIVATE, the page is still _locked_, but
__mpage_writepages() doesn't unlock it. Attached is a documentation fix.
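
A user-space model of the contract described in the attached documentation
fix, as a sketch only: MODEL_WRITEPAGE_ACTIVATE, model_page and the helper
names are invented for the example, and the constant's value is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>

enum { MODEL_WRITEPAGE_ACTIVATE = 1 };	/* arbitrary value for the model */

struct model_page {
	bool locked;
	bool active;	/* moved to the head of the active list */
};

typedef int (*model_writepage_t)(struct model_page *page);

/* Normal case: the page is unlocked before returning to the caller. */
static int normal_writepage(struct model_page *page)
{
	page->locked = false;
	return 0;
}

/* Special case: refuse the page and return with it still locked. */
static int refusing_writepage(struct model_page *page)
{
	(void)page;
	return MODEL_WRITEPAGE_ACTIVATE;
}

/* Caller side: unlocking is the caller's job only in the special case. */
static int model_pageout(struct model_page *page, model_writepage_t writepage)
{
	int ret;

	assert(page->locked);
	ret = writepage(page);
	if (ret == MODEL_WRITEPAGE_ACTIVATE) {
		page->active = true;
		page->locked = false;
	}
	return ret;
}
```

A caller that ignores the special return value leaves the page locked
forever, which is the problem described above.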

 > 
 > I had something like this happening in 2.5.10(ish), but ended up deciding
 > it was all too complex and writeout from the LRU is rare and the pages are
 > probably close-by on the LRU and the elevator sorting would catch most
 > cases so I tossed it all out.

Are you talking about ->vm_writeback()?

 > 
 > Plus some of your other patches make LRU-based writeout even less common.

The idea is that if we do pageout anyway, it's better to send a few
neighboring dirty pages to the disk too while we are there. Plus, this allows
file systems with delayed allocation to improve layout. I think XFS already
does similar clustering from ->writepage() by itself.

Nikita.
 Documentation/filesystems/Locking |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff -puN Documentation/filesystems/Locking~WRITEPAGE_ACTIVATE-doc-fix Documentation/filesystems/Locking
--- bk-linux/Documentation/filesystems/Locking~WRITEPAGE_ACTIVATE-doc-fix	2005-04-22 12:11:38.000000000 +0400
+++ bk-linux-nikita/Documentation/filesystems/Locking	2005-04-22 12:11:38.000000000 +0400
@@ -219,8 +219,12 @@ This may also be done to avoid internal 
 If the filesytem is called for sync then it must wait on any
 in-progress I/O and then start new I/O.
 
-The filesystem should unlock the page synchronously, before returning
-to the caller.
+The filesystem should unlock the page synchronously, before returning to the
+caller, unless ->writepage() returns the special WRITEPAGE_ACTIVATE
+value. WRITEPAGE_ACTIVATE means that the page cannot really be written out
+currently, and the VM should stop calling ->writepage() on this page for some
+time. The VM does this by moving the page to the head of the active list,
+hence the name.
 
 Unless the filesystem is going to redirty_page_for_writepage(), unlock the page
 and return zero, writepage *must* run set_page_writeback() against the page,

_

* Re: [PATCH]: VM 7/8 cluster pageout
  2005-04-26  9:16   ` Nikita Danilov
@ 2005-04-26  9:36     ` Andrew Morton
  2005-04-26 16:19       ` Nikita Danilov
  0 siblings, 1 reply; 8+ messages in thread
From: Andrew Morton @ 2005-04-26  9:36 UTC (permalink / raw)
  To: Nikita Danilov; +Cc: linux-mm

Nikita Danilov <nikita@clusterfs.com> wrote:
>
> Andrew Morton writes:
>  > Nikita Danilov <nikita@clusterfs.com> wrote:
>  > >
>  > > Implement pageout clustering at the VM level.
>  > 
>  > I dunno...
>  > 
>  > Once __mpage_writepages() has started I/O against the pivot page, I don't
>  > see that we have any guarantees that some other CPU cannot come in,
>  > truncate or reclaim all the inode's pages and then reclaim the inode
>  > altogether.  While __mpage_writepages() is still dinking with it all.
> 
> Ah, silly me. Will __iget(page->mapping->host) in pageout_cluster() be
> enough? We risk truncate on matching iput(), but VM scanner calls iput()
> on inodes with ->i_nlink == 0 already (from shrink_dcache()).

I have vague memories about iput() in page reclaim causing deadlocks or
some other nastiness.  Maybe not.

ummm, generic_vm_writeback() used igrab(), to avoid races with the inode
disappearing.  Which would seem to be an odd thing to happen if we had a
locked page.  Maybe I used igrab because a bare atomic_inc(&inode->i_count)
seemed grubby, and there's no API function to do it.  But I do seem to
recall that igrab() was needed for other reasons.  It's all lost in the
mists of time.

> Also, that patch fixes what I believe is a bug in mpage_writepages(): if
> ->writepage() returns WRITEPAGE_ACTIVATE, the page is still _locked_, but
> __mpage_writepages() doesn't unlock it. Attached is a documentation fix.

OK.  WRITEPAGE_ACTIVATE is supposed to be a secret hack which filesystems
don't use.

>  > 
>  > I had something like this happening in 2.5.10(ish), but ended up deciding
>  > it was all too complex and writeout from the LRU is rare and the pages are
>  > probably close-by on the LRU and the elevator sorting would catch most
>  > cases so I tossed it all out.
> 
> Are you talking about ->vm_writeback()?

yup.


* Re: [PATCH]: VM 7/8 cluster pageout
  2005-04-26  9:36     ` Andrew Morton
@ 2005-04-26 16:19       ` Nikita Danilov
  2005-04-26 19:39         ` Andrew Morton
  0 siblings, 1 reply; 8+ messages in thread
From: Nikita Danilov @ 2005-04-26 16:19 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm

Andrew Morton writes:
 > Nikita Danilov <nikita@clusterfs.com> wrote:
 > >
 > > Andrew Morton writes:
 > >  > Nikita Danilov <nikita@clusterfs.com> wrote:
 > >  > >
 > >  > > Implement pageout clustering at the VM level.
 > >  > 
 > >  > I dunno...
 > >  > 
 > >  > Once __mpage_writepages() has started I/O against the pivot page, I don't
 > >  > see that we have any guarantees that some other CPU cannot come in,
 > >  > truncate or reclaim all the inode's pages and then reclaim the inode
 > >  > altogether.  While __mpage_writepages() is still dinking with it all.
 > > 
 > > Ah, silly me. Will __iget(page->mapping->host) in pageout_cluster() be
 > > enough? We risk truncate on matching iput(), but VM scanner calls iput()
 > > on inodes with ->i_nlink == 0 already (from shrink_dcache()).
 > 
 > I have vague memories about iput() in page reclaim causing deadlocks or
 > some other nastiness.  Maybe not.

Aren't you talking about

http://marc.theaimsgroup.com/?t=108272583200001&r=1&w=2

by any chance? As I remember it, the conclusion was that the file system has
to be ready to handle a final iput() from within a GFP_FS allocation.

 > 

Nikita.

* Re: [PATCH]: VM 7/8 cluster pageout
  2005-04-26 16:19       ` Nikita Danilov
@ 2005-04-26 19:39         ` Andrew Morton
  0 siblings, 0 replies; 8+ messages in thread
From: Andrew Morton @ 2005-04-26 19:39 UTC (permalink / raw)
  To: Nikita Danilov; +Cc: linux-mm

Nikita Danilov <nikita@clusterfs.com> wrote:
>
> Andrew Morton writes:
>   > Nikita Danilov <nikita@clusterfs.com> wrote:
>   > >
>   > > Andrew Morton writes:
>   > >  > Nikita Danilov <nikita@clusterfs.com> wrote:
>   > >  > >
>   > >  > > Implement pageout clustering at the VM level.
>   > >  > 
>   > >  > I dunno...
>   > >  > 
>   > >  > Once __mpage_writepages() has started I/O against the pivot page, I don't
>   > >  > see that we have any guarantees that some other CPU cannot come in,
>   > >  > truncate or reclaim all the inode's pages and then reclaim the inode
>   > >  > altogether.  While __mpage_writepages() is still dinking with it all.
>   > > 
>   > > Ah, silly me. Will __iget(page->mapping->host) in pageout_cluster() be
>   > > enough? We risk truncate on matching iput(), but VM scanner calls iput()
>   > > on inodes with ->i_nlink == 0 already (from shrink_dcache()).
>   > 
>   > I have vague memories about iput() in page reclaim causing deadlocks or
>   > some other nastiness.  Maybe not.
> 
>  Aren't you talking about
> 
>  http://marc.theaimsgroup.com/?t=108272583200001&r=1&w=2
> 
>  by any chance?

Nope, this all happened in the early 2002 timeframe.

* Re: [PATCH]: VM 7/8 cluster pageout
  2005-04-26  4:15 ` Andrew Morton
  2005-04-26  9:16   ` Nikita Danilov
@ 2005-05-02  4:12   ` William Lee Irwin III
  2005-05-02  5:51     ` Rik van Riel
  1 sibling, 1 reply; 8+ messages in thread
From: William Lee Irwin III @ 2005-05-02  4:12 UTC (permalink / raw)
  To: Andrew Morton; +Cc: Nikita Danilov, linux-mm

Nikita Danilov <nikita@clusterfs.com> wrote:
>> Implement pageout clustering at the VM level.

On Mon, Apr 25, 2005 at 09:15:14PM -0700, Andrew Morton wrote:
> I had something like this happening in 2.5.10(ish), but ended up deciding
> it was all too complex and writeout from the LRU is rare and the pages are
> probably close-by on the LRU and the elevator sorting would catch most
> cases so I tossed it all out.
> Plus some of your other patches make LRU-based writeout even less common.

Sorry for chiming in late on this issue.

I would be careful in dismissing the case as "rare"; what I've
discovered in this kind of performance scenario is that the rare case
happens to someone, who is willing to tolerate poor performance and
understands they're not the common case, but discovers pathological
performance instead and cries out for help (unfortunately, this is all
subjective). I'd be glad to see some bulletproofing of the VM against
this case go into mainline, not to specifically recommend this approach
against any other.

By and large I've seen writeout from the LRU get dismissed and I'm
convinced that although it should be rare, some (moderate?) steps
are in order to ensure the degradation from such is not too severe
(though poor performance can be tolerated, pathological can't).

-- wli

* Re: [PATCH]: VM 7/8 cluster pageout
  2005-05-02  4:12   ` William Lee Irwin III
@ 2005-05-02  5:51     ` Rik van Riel
  0 siblings, 0 replies; 8+ messages in thread
From: Rik van Riel @ 2005-05-02  5:51 UTC (permalink / raw)
  To: William Lee Irwin III; +Cc: Andrew Morton, Nikita Danilov, linux-mm

On Sun, 1 May 2005, William Lee Irwin III wrote:

> I would be careful in dismissing the case as "rare"; what I've
> discovered in this kind of performance scenario is that the rare case
> happens to someone, who is willing to tolerate poor performance and
> understands they're not the common case, but discovers pathological
> performance instead and cries out for help (unfortunately, this is all
> subjective). I'd be glad to see some bulletproofing of the VM against
> this case go into mainline, not to specifically recommend this approach
> against any other.

Agreed.  The VM is all about preventing these "corner cases",
because there will always be users who run into them the whole
time - from bootup till shutdown - and we can't degenerate to
pathological performance for somebody's main workload ;)

Of course, if there isn't an actual workload that's being
improved by some patch we should avoid the complexity, but
if a patch helps enough to outweigh its complexity ...

-- 
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan
