From: Johannes Weiner <hannes@cmpxchg.org>
To: Greg Thelen <gthelen@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
Michal Hocko <mhocko@kernel.org>,
Vladimir Davydov <vdavydov.dev@gmail.com>,
Tejun Heo <tj@kernel.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] writeback: sum memcg dirty counters as needed
Date: Thu, 28 Mar 2019 10:20:16 -0400
Message-ID: <20190328142016.GA15763@cmpxchg.org>
In-Reply-To: <20190307165632.35810-1-gthelen@google.com>

On Thu, Mar 07, 2019 at 08:56:32AM -0800, Greg Thelen wrote:
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -3880,6 +3880,7 @@ struct wb_domain *mem_cgroup_wb_domain(struct bdi_writeback *wb)
> * @pheadroom: out parameter for number of allocatable pages according to memcg
> * @pdirty: out parameter for number of dirty pages
> * @pwriteback: out parameter for number of pages under writeback
> + * @exact: determines exact counters are required, indicates more work.
> *
> * Determine the numbers of file, headroom, dirty, and writeback pages in
> * @wb's memcg. File, dirty and writeback are self-explanatory. Headroom
> @@ -3890,18 +3891,29 @@ struct wb_domain *mem_cgroup_wb_domain(struct bdi_writeback *wb)
> * ancestors. Note that this doesn't consider the actual amount of
> * available memory in the system. The caller should further cap
> * *@pheadroom accordingly.
> + *
> + * Return value is the error precision associated with *@pdirty
> + * and *@pwriteback. When @exact is set this a minimal value.
> */
> -void mem_cgroup_wb_stats(struct bdi_writeback *wb, unsigned long *pfilepages,
> - unsigned long *pheadroom, unsigned long *pdirty,
> - unsigned long *pwriteback)
> +unsigned long
> +mem_cgroup_wb_stats(struct bdi_writeback *wb, unsigned long *pfilepages,
> + unsigned long *pheadroom, unsigned long *pdirty,
> + unsigned long *pwriteback, bool exact)
> {
> struct mem_cgroup *memcg = mem_cgroup_from_css(wb->memcg_css);
> struct mem_cgroup *parent;
> + unsigned long precision;
>
> - *pdirty = memcg_page_state(memcg, NR_FILE_DIRTY);
> -
> + if (exact) {
> + precision = 0;
> + *pdirty = memcg_exact_page_state(memcg, NR_FILE_DIRTY);
> + *pwriteback = memcg_exact_page_state(memcg, NR_WRITEBACK);
> + } else {
> + precision = MEMCG_CHARGE_BATCH * num_online_cpus();
> + *pdirty = memcg_page_state(memcg, NR_FILE_DIRTY);
> + *pwriteback = memcg_page_state(memcg, NR_WRITEBACK);
> + }
> /* this should eventually include NR_UNSTABLE_NFS */
> - *pwriteback = memcg_page_state(memcg, NR_WRITEBACK);
> *pfilepages = mem_cgroup_nr_lru_pages(memcg, (1 << LRU_INACTIVE_FILE) |
> (1 << LRU_ACTIVE_FILE));
> *pheadroom = PAGE_COUNTER_MAX;
> @@ -3913,6 +3925,8 @@ void mem_cgroup_wb_stats(struct bdi_writeback *wb, unsigned long *pfilepages,
> *pheadroom = min(*pheadroom, ceiling - min(ceiling, used));
> memcg = parent;
> }
> +
> + return precision;

Have you considered unconditionally using the exact version here?

It does for_each_online_cpu(), but until very, very recently we did
this per default for all stats, for years. It only became a problem in
conjunction with the for_each_memcg loops when frequently reading
memory stats at the top of a very large hierarchy.

balance_dirty_pages() is called against memcgs that actually own the
inodes/memory and doesn't do the additional recursive tree collection.

It's also not *that* hot of a function, and in the io path...

It would simplify this patch immensely.
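
To make the trade-off concrete, here is a minimal user-space sketch of
a batched per-CPU counter. It is not the kernel code: struct counter,
counter_mod(), counter_read_approx() and counter_read_exact() are
hypothetical names, NR_CPUS and CHARGE_BATCH are made-up values standing
in for the online CPU count and MEMCG_CHARGE_BATCH, and there is no real
per-CPU storage or atomics. It only illustrates why the cheap read can
be off by up to roughly batch * nr_cpus while the exact read has to
walk every CPU.

```c
#include <assert.h>

#define NR_CPUS      4   /* assumption: simulated number of online CPUs */
#define CHARGE_BATCH 32  /* assumption: stand-in for MEMCG_CHARGE_BATCH */

/* Hypothetical model of one memcg stat counter. */
struct counter {
	long global;            /* shared value, updated only in batches */
	long percpu[NR_CPUS];   /* per-CPU deltas not yet folded in */
};

/*
 * Apply delta on the given CPU. The local delta is folded into the
 * shared value only once its magnitude exceeds the batch size.
 */
static void counter_mod(struct counter *c, int cpu, long delta)
{
	long x;

	assert(cpu >= 0 && cpu < NR_CPUS);

	x = c->percpu[cpu] + delta;
	if (x > CHARGE_BATCH || x < -CHARGE_BATCH) {
		c->global += x;
		x = 0;
	}
	c->percpu[cpu] = x;
}

/* Cheap read: shared value only, ignores pending per-CPU deltas. */
static long counter_read_approx(const struct counter *c)
{
	return c->global < 0 ? 0 : c->global;
}

/* Exact read: also walks every CPU, like a for_each_online_cpu() loop. */
static long counter_read_exact(const struct counter *c)
{
	long sum = c->global;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		sum += c->percpu[cpu];
	return sum < 0 ? 0 : sum;
}
```

In this model, dirtying 30 pages on each of the 4 CPUs leaves the
approximate read at 0 while the exact read reports 120, because no
per-CPU delta has crossed the batch threshold yet. The exact read costs
one extra loop over the CPUs per call, which is the cost being weighed
above.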