From: Nick Piggin <nickpiggin@yahoo.com.au>
To: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
Jens Axboe <jens.axboe@oracle.com>,
akpm@linux-foundation.org,
Peter Zijlstra <a.p.zijlstra@chello.nl>,
Linus Torvalds <torvalds@linux-foundation.org>,
Ingo Molnar <mingo@elte.hu>,
thomas.pi@arcor.dea, Yuriy Lalym <ylalym@gmail.com>,
ltt-dev@lists.casi.polymtl.ca, linux-kernel@vger.kernel.org,
linux-mm@kvack.org
Subject: Re: [PATCH] mm fix page writeback accounting to fix oom condition under heavy I/O
Date: Tue, 10 Feb 2009 14:55:55 +1100
Message-ID: <200902101455.56789.nickpiggin@yahoo.com.au>
In-Reply-To: <20090210033652.GA28435@Krystal>

On Tuesday 10 February 2009 14:36:53 Mathieu Desnoyers wrote:
> Related to:
> http://bugzilla.kernel.org/show_bug.cgi?id=12309
>
> Very annoying I/O latencies (20-30 seconds) are occurring under heavy I/O
> since ~2.6.18.
>
> Yuriy Lalym noticed that the oom killer was eventually called. So I took a
> look at /proc/meminfo and noticed that under my test case (fio job created
> from a LTTng block I/O trace, reproducing dd writing to a 20GB file and ssh
> sessions being opened), the Inactive(file) value increased, and the total
> memory consumed increased until only 80kB (out of 16GB) were left.
>
> So I first used cgroups to limit the memory usable by fio (or dd). This
> seems to fix the problem.
>
> Thomas noted that there seems to be a problem with pages being passed to
> the block I/O elevator not being counted as dirty. I looked at
> clear_page_dirty_for_io and noticed that page_mkclean clears the dirty bit
> and then set_page_dirty(page) is called on the page. This calls
> mm/page-writeback.c:set_page_dirty(). I assume that the
> mapping->a_ops->set_page_dirty is NULL, so it calls
> buffer.c:__set_page_dirty_buffers(). This calls set_buffer_dirty(bh).
>
> So we come back in clear_page_dirty_for_io where we decrement the dirty
> accounting. This is a problem, because we assume that the block layer will
> re-increment it when it gets the page, but because the buffer is marked as
> dirty, this won't happen.
>
> So this patch fixes this behavior by only decrementing the page accounting
> _after_ the block I/O writepage has been done.
>
> The effect on my workload is that the memory stops being completely filled
> by page cache under heavy I/O. The vfs_cache_pressure value seems to work
> again.

I don't think we're supposed to assume that the block layer will
re-increment the dirty count? The accounting should all be done in the VM.
The VM should increment the writeback count before sending the page to the
block device, and dirty page throttling also takes the number of writeback
pages into account. So it should not be possible to fill up memory with
dirty pages, even if the block device queue size is unlimited.
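
To make the accounting argument concrete, here is a toy userspace model
of it. It is only a sketch, not kernel code: struct vm_counters, the
helper names and dirty_limit are all made up for illustration, and the
real throttling logic is more involved. The point it shows is that a page
moving from "dirty" to "writeback" stays counted, so the throttle check
does not relax until the I/O actually completes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of global page accounting; all names are hypothetical. */
struct vm_counters {
	size_t nr_dirty;     /* dirtied, not yet submitted for I/O */
	size_t nr_writeback; /* submitted to the block device, I/O pending */
	size_t dirty_limit;  /* assumed throttling threshold, in pages */
};

/* A writer dirties one page cache page. */
static void page_dirtied(struct vm_counters *vm)
{
	vm->nr_dirty++;
}

/*
 * The VM picks a dirty page for writeout: it leaves the dirty count and
 * enters the writeback count *before* it is handed to the block device.
 * Returns false if there was no dirty page to write.
 */
static bool start_writeback(struct vm_counters *vm)
{
	if (vm->nr_dirty == 0)
		return false;
	vm->nr_dirty--;
	vm->nr_writeback++;
	return true;
}

/* I/O completion: only now does the page stop being counted. */
static void end_writeback(struct vm_counters *vm)
{
	assert(vm->nr_writeback > 0);
	vm->nr_writeback--;
}

/* Throttle on dirty + writeback, so queued I/O still counts. */
static bool must_throttle(const struct vm_counters *vm)
{
	return vm->nr_dirty + vm->nr_writeback > vm->dirty_limit;
}
```

In this model, calling start_writeback() on every dirty page leaves
must_throttle() unchanged, however deep the (imaginary) device queue is;
only end_writeback() brings the total back down.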