From: Shakeel Butt <shakeel.butt@linux.dev>
To: Qu Wenruo <wqu@suse.com>
Cc: linux-btrfs@vger.kernel.org, hannes@cmpxchg.org,
mhocko@kernel.org, roman.gushchin@linux.dev,
muchun.song@linux.dev, akpm@linux-foundation.org,
cgroups@vger.kernel.org, linux-mm@kvack.org,
Michal Hocko <mhocko@suse.com>,
"Vlastimil Babka (SUSE)" <vbabka@kernel.org>
Subject: Re: [PATCH] btrfs: root memcgroup for metadata filemap_add_folio()
Date: Mon, 30 Sep 2024 10:23:16 -0700
Message-ID: <iwjlzsphxhqdpml5gn3t3qt5zhizgcmizel5vug7g7bwlkzeob@g2jlar2nynqb>
In-Reply-To: <b5fef5372ae454a7b6da4f2f75c427aeab6a07d6.1727498749.git.wqu@suse.com>

Hi Qu,

On Sat, Sep 28, 2024 at 02:15:56PM GMT, Qu Wenruo wrote:
> [BACKGROUND]
> The function filemap_add_folio() charges the memory cgroup,
> as we assume all page cache is accessible by user space processes
> and thus needs the cgroup accounting.
>
> However, btrfs is a special case: it has very large metadata thanks to
> its support of data csum (by default it's 4 bytes per 4K data, and can
> be as large as 32 bytes per 4K data).
> This means btrfs has to use the page cache for its metadata pages, to
> take advantage of both the caching and the reclaim ability of filemap.
>
> This has a small problem: all btrfs metadata pages have to go through
> the memcgroup charge, even though those metadata pages are not
> accessible by user space, and doing the charging can introduce some
> latency if there is a memory limit set.
>
> Btrfs currently uses __GFP_NOFAIL flag as a workaround for this cgroup
> charge situation so that metadata pages won't really be limited by
> memcgroup.
>
> [ENHANCEMENT]
> Instead of relying on __GFP_NOFAIL to avoid charge failure, use root
> memory cgroup to attach metadata pages.
>
> This does need to export the symbol root_mem_cgroup for
> CONFIG_MEMCG, or define root_mem_cgroup as NULL for !CONFIG_MEMCG.
>
> With root memory cgroup, we directly skip the charging part, and only
> rely on __GFP_NOFAIL for the real memory allocation part.
>
I have a couple of questions:

1. Were you using __GFP_NOFAIL just to avoid ENOMEMs? Are you ok with
   oom-kills?
2. What is the normal overhead of this metadata in real-world production
   environments? I see 4 to 32 bytes per 4K, but which is the most
   commonly used, and does it depend on the contents of the 4K block
   or on something else?
3. Most probably multiple metadata values are colocated on a single 4K
   page of the btrfs page cache even though the corresponding page cache
   might be charged to different cgroups. Is that correct?
4. What is stopping us from using a reclaimable slab cache for this
   metadata?
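
For a rough sense of scale behind question 2, here is a back-of-envelope
sketch using only the two figures quoted above (4 and 32 bytes of csum
per 4K of data). The 1 TiB data size and the helper name csum_bytes are
assumptions made for illustration; the estimate counts raw checksum
bytes only and ignores any tree node or item header overhead, so real
metadata usage would be somewhat higher.

```python
# Rough estimate of raw checksum bytes; not a measurement of real btrfs usage.
BLOCK_SIZE = 4096  # bytes of data covered by one checksum (4K, per the quoted text)


def csum_bytes(data_bytes: int, csum_size: int) -> int:
    """Return the raw checksum bytes needed to cover data_bytes of data."""
    blocks = -(-data_bytes // BLOCK_SIZE)  # ceiling division: partial blocks count
    return blocks * csum_size


TIB = 1 << 40  # hypothetical data set size, chosen only for illustration

if __name__ == "__main__":
    for csum_size in (4, 32):  # the two per-4K sizes mentioned in the patch description
        total = csum_bytes(TIB, csum_size)
        ratio = 100 * csum_size / BLOCK_SIZE
        print(f"{csum_size} bytes per 4K: {total / (1 << 30):.0f} GiB "
              f"of csum for 1 TiB of data ({ratio:.3f}% of data size)")
```

Under these assumptions the raw csum data is about 1 GiB per TiB at
4 bytes per 4K and about 8 GiB per TiB at 32 bytes per 4K, i.e. well
under 1% of the data size in both cases.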
thanks,
Shakeel