[LSF/MM TOPIC] balancing dirty pages - how to keep growing dirty memory in reasonable limits

From: Maxim Patlasov <mpatlasov@parallels.com>
To: lsf-pc@lists.linux-foundation.org
Cc: "linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"fuse-devel@lists.sourceforge.net"
	<fuse-devel@lists.sourceforge.net>,
	linux-mm@kvack.org
Subject: [LSF/MM TOPIC] balancing dirty pages - how to keep growing dirty memory in reasonable limits
Date: Thu, 30 Jan 2014 16:40:31 +0400
Message-ID: <52EA483F.50105@parallels.com>

Hi,

A recent patch from Linus limiting global_dirtyable_memory to 1GB (see
the "Disabling in-memory write cache for x86-64 in Linux" thread) drew
attention to a long-standing problem: on a node with a huge amount of
RAM installed, the global dirty threshold is high, and the existing
behaviour of balance_dirty_pages() skips throttling until the global
limit is reached. So, by the time balance_dirty_pages() starts
throttling, you can easily end up with a huge amount of dirty pages
backed by some (e.g. slow USB) device.
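
To put rough numbers on the problem described above, here is a
back-of-the-envelope sketch. It assumes the global threshold is simply
vm.dirty_ratio (default 20) percent of dirtyable memory; the kernel's
real calculation has further terms, and the 64 GiB machine and the
10 MiB/s USB device are hypothetical figures chosen for illustration.

```python
GiB = 1 << 30
MiB = 1 << 20


def dirty_threshold_bytes(dirtyable_bytes, dirty_ratio_percent=20):
    """Simplified global dirty threshold: a percentage of dirtyable memory."""
    return dirtyable_bytes * dirty_ratio_percent // 100


def drain_seconds(dirty_bytes, device_bytes_per_sec):
    """Time needed to write back dirty_bytes at a constant device speed."""
    return dirty_bytes / device_bytes_per_sec


usb_speed = 10 * MiB  # hypothetical slow USB device

# Hypothetical machine with 64 GiB of dirtyable memory.
big_threshold = dirty_threshold_bytes(64 * GiB)
big_drain = drain_seconds(big_threshold, usb_speed)

# Same machine with dirtyable memory capped at 1 GiB, as in the patch.
capped_threshold = dirty_threshold_bytes(1 * GiB)
capped_drain = drain_seconds(capped_threshold, usb_speed)

print(f"uncapped: {big_threshold / GiB:.1f} GiB dirty, {big_drain:.0f} s to drain")
print(f"capped:   {capped_threshold / MiB:.1f} MiB dirty, {capped_drain:.0f} s to drain")
```

Under these assumptions the uncapped case leaves over twenty minutes of
writeback queued against the slow device, while the 1GB cap brings it
down to about twenty seconds.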

A lot of ideas were proposed, but no conclusion was reached. In
particular, one of the suggested approaches is to develop per-BDI
time-based limits and to enable them for all: don't allow the dirty
cache of a BDI to grow over 5s of measured writeback speed. The approach
looks pretty straightforward, but in practice it may be tricky to
implement: you cannot discover how fast a device is until you load it
heavily enough, and conversely, you must go far beyond the current
per-BDI limit to load the device heavily. The other approaches have
caveats of their own.
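
The chicken-and-egg problem with the time-based limit can be sketched
with a deliberately crude toy model. Everything here except the 5s
window is an assumption made for illustration: the device is modelled
as reaching its true bandwidth only once a hypothetical
saturation_bytes of dirty data is queued, with throughput scaling
linearly below that point, and the overshoot factor stands in for
"going beyond the current limit". This is not how the kernel estimates
writeback bandwidth.

```python
MiB = 1 << 20


def bdi_dirty_limit(bandwidth_estimate, window_seconds=5.0):
    """Time-based cap: window_seconds of writeback at the estimated speed."""
    return bandwidth_estimate * window_seconds


def observed_bandwidth(dirty_bytes, true_bandwidth, saturation_bytes):
    """Toy device model (assumption): throughput grows linearly with the
    amount of queued dirty data and saturates at true_bandwidth."""
    return true_bandwidth * min(1.0, dirty_bytes / saturation_bytes)


def run(estimate, true_bandwidth, saturation_bytes, rounds, overshoot=1.0):
    """Fill the cache to overshoot * limit, re-measure, and repeat."""
    for _ in range(rounds):
        dirty = overshoot * bdi_dirty_limit(estimate)
        estimate = observed_bandwidth(dirty, true_bandwidth, saturation_bytes)
    return estimate


true_bw = 100 * MiB        # hypothetical real device speed
saturation = 1000 * MiB    # hypothetical load needed to reach that speed
initial = 10 * MiB         # pessimistic starting estimate

# Strictly honouring the limit: the device is never loaded enough,
# so the estimate keeps shrinking instead of converging.
stuck = run(initial, true_bw, saturation, rounds=10)

# Allowing dirty data to exceed the limit lets the estimate climb
# until it reaches the true bandwidth.
probed = run(initial, true_bw, saturation, rounds=10, overshoot=4.0)

print(f"strict limit:   {stuck / MiB:.3f} MiB/s")
print(f"with overshoot: {probed / MiB:.3f} MiB/s")
```

In this model the estimate only recovers when the writer is allowed to
exceed the limit derived from the current estimate, which is exactly
the property that makes a strict per-BDI limit hard to get right.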

I'm interested in attending the upcoming LSF/MM to discuss the topic
above as well as two other unrelated ones:

* future improvements of FUSE. With the "write-back cache policy"
patch-set almost adopted and patches for synchronous close(2) and
umount(2) in the queue, I'd like to keep my efforts in sync with other
FUSE developers.

* reboot-less kernel updates. Since memory reset can be avoided by
booting the new kernel using kexec, and almost any application can be
checkpointed and then restored by CRIU, the downtime can be reduced
significantly by keeping userspace processes' working set in memory
while the system gets updated. Questions to discuss are how to prevent
the kernel from using some memory regions on boot, what interface can be
reused or introduced for managing the regions, and how they can be
re-installed into processes' address space on restore.

Thanks,
Maxim
