linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
To: Greg Thelen <gthelen@google.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@suse.cz>, Tejun Heo <tj@kernel.org>
Cc: Dave Chinner <david@fromorbit.com>,
	Sha Zhengju <handai.szj@gmail.com>,
	Kamezawa Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH] memcg: add per cgroup dirty page accounting
Date: Wed, 11 Mar 2015 19:25:25 +0300	[thread overview]
Message-ID: <55006C75.2050604@yandex-team.ru> (raw)
In-Reply-To: <1425876632-6681-1-git-send-email-gthelen@google.com>

On 09.03.2015 07:50, Greg Thelen wrote:> When modifying PG_Dirty on 
cached file pages, update the new
 > MEM_CGROUP_STAT_DIRTY counter.  This is done in the same places where
 > global NR_FILE_DIRTY is managed.  The new memcg stat is visible in the
 > per memcg memory.stat cgroupfs file.  The most recent past attempt at
 > this was http://thread.gmane.org/gmane.linux.kernel.cgroups/8632
 >
 > The new accounting supports future efforts to add per cgroup dirty
 > page throttling and writeback.  It also helps an administrator break
 > down a container's memory usage and provides evidence to understand
 > memcg oom kills (the new dirty count is included in memcg oom kill
 > messages).
 >
 > The ability to move page accounting between memcg
 > (memory.move_charge_at_immigrate) makes this accounting more
 > complicated than the global counter.  The existing
 > mem_cgroup_{begin,end}_page_stat() lock is used to serialize move
 > accounting with stat updates.
 > Typical update operation:
 > 	memcg = mem_cgroup_begin_page_stat(page)
 > 	if (TestSetPageDirty()) {
 > 		[...]
 > 		mem_cgroup_update_page_stat(memcg)
 > 	}
 > 	mem_cgroup_end_page_stat(memcg)
 >
 > Summary of mem_cgroup_end_page_stat() overhead:
 > - Without CONFIG_MEMCG it's a no-op
 > - With CONFIG_MEMCG and no inter memcg task movement, it's just
 >    rcu_read_lock()
 > - With CONFIG_MEMCG and inter memcg  task movement, it's
 >    rcu_read_lock() + spin_lock_irqsave()
 >
 > A memcg parameter is added to several routines because their callers
 > now grab mem_cgroup_begin_page_stat() which returns the memcg later
 > needed by for mem_cgroup_update_page_stat().
 >
 > Because mem_cgroup_begin_page_stat() may disable interrupts, some
 > adjustments are needed:
 > - move __mark_inode_dirty() from __set_page_dirty() to its caller.
 >    __mark_inode_dirty() locking does not want interrupts disabled.
 > - use spin_lock_irqsave(tree_lock) rather than spin_lock_irq() in
 >    __delete_from_page_cache(), replace_page_cache_page(),
 >    invalidate_complete_page2(), and __remove_mapping().

This patch conflicts with my cleanup which is already in mm tree:
("page_writeback: clean up mess around cancel_dirty_page()")
Nothing nontrivial but I've killed cancel_dirty_page() and replaced
it which account_page_cleaned() symmetrical to account_page_dirtied().


I think this accounting can be done without mem_cgroup_begin_page_stat()
All page cleaning happens under page is lock.
Some dirtying is called without page-lock when kernel moves
dirty status from pte to page, but in this case acconting happens
under mapping->tree_lock.

Memcg already locks pages when moves them between cgroups,
maybe it could also lock mapping->tree_lock?

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  parent reply	other threads:[~2015-03-11 16:25 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-09  4:50 Greg Thelen
2015-03-09 13:52 ` Tejun Heo
2015-03-09 15:29   ` Greg Thelen
2015-03-09 15:31     ` Tejun Heo
2015-03-11 16:25 ` Konstantin Khlebnikov [this message]
2015-03-11 19:57   ` Greg Thelen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=55006C75.2050604@yandex-team.ru \
    --to=khlebnikov@yandex-team.ru \
    --cc=akpm@linux-foundation.org \
    --cc=david@fromorbit.com \
    --cc=gthelen@google.com \
    --cc=handai.szj@gmail.com \
    --cc=hannes@cmpxchg.org \
    --cc=kamezawa.hiroyu@jp.fujitsu.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.cz \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox