From: "Stephen C. Tweedie" <sct@redhat.com>
To: Andrew Morton <akpm@zip.com.au>, linux-mm@kvack.org
Cc: Stephen Tweedie <sct@redhat.com>
Subject: [patch] Buffers pinning inodes in icache forever
Date: Wed, 6 Nov 2002 21:59:48 GMT [thread overview]
Message-ID: <200211062159.gA6LxmK23126@sisko.scot.redhat.com> (raw)
[-- Attachment #1: Type: text/plain, Size: 1414 bytes --]
Hi,
In chasing a performance problem on a 2.4.9-based VM (yes, that one!),
we found a case where kswapd was consuming massive CPU time, 97% of
which was in prune_icache (and of that, about 7% was in the
inode_has_buffers() sub-function). slabinfo showed about 100k inodes
in use.
The hypothesis is that we've got buffers in cache pinning the inodes.
It's not pages doing the pinning because if the inode page count is
zero we never perform the inode_has_buffers() test.
On buffer write, the bh goes onto BUF_LOCKED, but never gets removed
from there. In other testing I've seen several GB of memory in
BUF_LOCKED bh'es during extensive write loads.
That's normally no problem, except that the lack of a refile_buffer()
on those bh'es also keeps them on the inode's own buffer lists. If
it's metadata that the buffers back (ie. it's all in low memory) and
the demand on the system is for highmem pages, then we're not
necessarily going to be aggressively doing try_to_release_page() on
the lowmem pages which would allow the bhes to be refiled.
Doing the refile really isn't hard, either. We expect IO completion
to be happening in approximately list order on the BUF_LOCKED list, so
simply doing a refile on any unlocked buffers at the head of that list
is going to keep it under control in O(1) time per buffer.
With the patch below we've not seen this particular pathology recur.
Comments?
--Stephen
[-- Attachment #2: io-postprocess.patch --]
[-- Type: text/plain, Size: 844 bytes --]
--- linux/fs/buffer.c.orig Fri Oct 25 09:53:43 2002
+++ linux/fs/buffer.c Fri Oct 25 10:15:51 2002
@@ -2835,6 +2835,30 @@
}
}
+
+/*
+ * Do some IO post-processing here!!!
+ */
+void do_io_postprocessing(void)
+{
+ int i;
+ struct buffer_head *bh, *next;
+
+ spin_lock(&lru_list_lock);
+ bh = lru_list[BUF_LOCKED];
+ if (bh) {
+ for (i = nr_buffers_type[BUF_LOCKED]; i-- > 0; bh = next) {
+ next = bh->b_next_free;
+
+ if (!buffer_locked(bh))
+ __refile_buffer(bh);
+ else
+ break;
+ }
+ }
+ spin_unlock(&lru_list_lock);
+}
+
/*
* This is the kernel update daemon. It was used to live in userspace
* but since it's need to run safely we want it unkillable by mistake.
@@ -2886,6 +2910,7 @@
#ifdef DEBUG
printk(KERN_DEBUG "kupdate() activated...\n");
#endif
+ do_io_postprocessing();
sync_old_buffers();
}
}
next reply other threads:[~2002-11-06 21:59 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2002-11-06 21:59 Stephen C. Tweedie [this message]
2002-11-06 22:40 ` Andrew Morton
2002-11-07 2:16 ` Rik van Riel
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=200211062159.gA6LxmK23126@sisko.scot.redhat.com \
--to=sct@redhat.com \
--cc=akpm@zip.com.au \
--cc=linux-mm@kvack.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox