From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DA626C433FE for ; Tue, 9 Nov 2021 23:56:17 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 654D961211 for ; Tue, 9 Nov 2021 23:56:17 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 654D961211 Authentication-Results: mail.kernel.org; dmarc=fail (p=reject dis=none) header.from=google.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id 5D2336B0078; Tue, 9 Nov 2021 18:56:16 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 57FD26B007B; Tue, 9 Nov 2021 18:56:16 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3D2FC6B007D; Tue, 9 Nov 2021 18:56:16 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0082.hostedemail.com [216.40.44.82]) by kanga.kvack.org (Postfix) with ESMTP id 2B86D6B0078 for ; Tue, 9 Nov 2021 18:56:16 -0500 (EST) Received: from smtpin28.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id DAB20184B4B09 for ; Tue, 9 Nov 2021 23:56:15 +0000 (UTC) X-FDA: 78791053110.28.BB3B640 Received: from mail-io1-f42.google.com (mail-io1-f42.google.com [209.85.166.42]) by imf05.hostedemail.com (Postfix) with ESMTP id 0576C5095560 for ; Tue, 9 Nov 2021 23:55:54 +0000 (UTC) Received: by mail-io1-f42.google.com with SMTP id x10so727064ioj.9 for ; Tue, 09 Nov 2021 15:56:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=ZoK2BBW7Jb6r8LW9mP4KvyVyymYNKZIHae58TqDdmjM=; b=e62I04EFslO8JvLVvNHAQ9M278/k+EW9C9hQfqvs/2+1DafaBcurjF1Tw40Jotuw+8 a76hUHEIBUw5sz5N2VC+8FIXdNs5LWOHUchzTowWdzRQ/qhobM2E6HSI8tealG9nyest n2XPAp6ghoqcSFzmU+dOTMiMh92xrGfzgVxjnE32SvOSK3UTFa+XyHQAYUb3cuogUoI3 ktrpsTE3cnwIVA+a7jEVel4GzAXQTy3pA5r5hQezKzGG9mHJiJKuHKaOieoaFn5B8RRJ ylafV0ZEx69LyWI/2bXv3H+IPK2Jfx1t20og61f7SXBp24yFlnMaUX9QSaLexn24zRvF S8pQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=ZoK2BBW7Jb6r8LW9mP4KvyVyymYNKZIHae58TqDdmjM=; b=3eZpoITB+Lt+0wR0vntv5FLPsg2PLTdznuOiI6+O9JGV4aLksrAGnfuQoJzxSl4KkA bfnip+gKwwTpOJEWFhyNhuKG/WHLdHsoaQXIdhpfQTvl7zZxOKaqMBVu1Y2hChMJuJAn /PDSOuhe0ZVeLbH87nO0qcSkqMU5eKx7222gc0ZUuA/OyYOTunZ9aP8AzCw3FlRY+XIM m/BPpK/Lg2MOvkfDuIdoME+2WNgLWXKKACRqBjxj3a4n3Un4hc1poPAoR+VYdZK7DNjr cE0OWivap7bA47bByUIOqUczh+vbtIWvD5eaPDS6rKcfhxPQvHgyYNy41K+kDTEx+uQF op5w== X-Gm-Message-State: AOAM531PrN8xCgMFIQ6bBt8ME+xJY6f0SDYGHRtWBvRxlqtImeSa3THC +m1+5u6MCGiyTDRSy1hTy/CACPCVWNkwwZ3PvzWEuA== X-Google-Smtp-Source: ABdhPJzG072qHQ/0yi0HddSOZXPMaJDZ+ZFS0cuRcHJhcqYNWirO7+r+0gMr2/R+GvX5VWGlK+sRjtiZ3BPsTw10ozg= X-Received: by 2002:a05:6638:4183:: with SMTP id az3mr431731jab.56.1636502174733; Tue, 09 Nov 2021 15:56:14 -0800 (PST) MIME-Version: 1.0 References: <20211108211959.1750915-1-almasrymina@google.com> <20211108211959.1750915-2-almasrymina@google.com> <20211108221047.GE418105@dread.disaster.area> <20211109011837.GF418105@dread.disaster.area> In-Reply-To: <20211109011837.GF418105@dread.disaster.area> From: Mina Almasry Date: Tue, 9 Nov 2021 15:56:03 -0800 Message-ID: Subject: Re: [PATCH v1 1/5] mm/shmem: support deterministic charging of tmpfs To: david@fromorbit.com Cc: Matthew Wilcox , Michal Hocko , "Theodore Ts'o" , Greg Thelen , Shakeel Butt , Andrew Morton , Hugh Dickins , Roman Gushchin , Johannes Weiner , Tejun Heo , Vladimir Davydov , riel@surriel.com, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, cgroups@vger.kernel.org Content-Type: text/plain; charset="UTF-8" X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 0576C5095560 X-Stat-Signature: z3uzztpz3pifr1k5nthsw51ni5oahac5 Authentication-Results: imf05.hostedemail.com; dkim=pass header.d=google.com header.s=20210112 header.b=e62I04EF; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf05.hostedemail.com: domain of almasrymina@google.com designates 209.85.166.42 as permitted sender) smtp.mailfrom=almasrymina@google.com X-HE-Tag: 1636502154-350357 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Mon, Nov 8, 2021 at 5:18 PM Dave Chinner wrote: > > On Mon, Nov 08, 2021 at 11:41:51PM +0000, Matthew Wilcox wrote: > > On Tue, Nov 09, 2021 at 09:10:47AM +1100, Dave Chinner wrote: > > > > + rcu_read_lock(); > > > > + memcg = rcu_dereference(mapping->host->i_sb->s_memcg_to_charge); > > > > > > Anything doing pointer chasing to obtain static, unchanging > > > superblock state is poorly implemented. The s_memcg_to_charge value never > > > changes, so this code should associate the memcg to charge directly > > > on the mapping when the mapping is first initialised by the > > > filesystem. We already do this with things like attaching address > > > space ops and mapping specific gfp masks (i.e > > > mapping_set_gfp_mask()), so this association should be set up that > > > way, too (e.g. mapping_set_memcg_to_charge()). > > > > I'm not a fan of enlarging struct address_space with another pointer > > unless it's going to be used by all/most filesystems. If this is > > destined to be a shmem-only feature, then it should be in the > > shmem_inode instead of the mapping. > > Neither am I, but I'm also not a fan of the filemap code still > having to drill through the mapping to the host inode just to check > if it needs to do special stuff for shmem inodes on every call that > adds a page to the page cache. This is just as messy and intrusive > and the memcg code really has no business digging about in the > filesystem specific details of the inode behind the mapping. > > Hmmm. The mem_cgroup_charge() call in filemap_add_folio() passes a > null mm context, so deep in the guts it ends getting the memcg from > active_memcg() in get_mem_cgroup_from_mm(). That ends up using > current->active_memcg, so maybe a better approach here is to have > shmem override current->active_memcg via set_active_memcg() before > it enters the generic fs paths and restore it on return... > > current_fsmemcg()? > Thank you for providing a detailed alternative. To be honest it seems a bit brittle to me, as in folks can easily add calls to generic fs paths forgetting to override the active_memcg and having memory charged incorrectly, but if there is no other option and we want to make this a shmem-only feature, I can do this anyway. > > If we are to have this for all filesystems, then let's do that properly > > and make it generic functionality from its introduction. > > Fully agree. So the tmpfs feature addresses the first 2 usecases I mention in the cover letter. For the 3rd usecase I will likely need to extend this support to 1 disk-based filesystem, and I'm not sure which at this point. It also looks like Roman has in mind 1 or more use cases and may extend it to other filesystems as a result. I'm hoping that I can provide the generic implementation and the tmpfs support and in follow up patches folks can extend this to other file systems by providing the fs-specific changes needed for that filesystem. AFAICT with this patch the work to extend to another file system is to parse the memcg= option in that filesystem, set the s_memcg_to_charge on the superblock (or mapping) of that filesystem, and to charge s_memcg_to_charge in fs specific code paths, so all are fs-specific changes. Based on this, it seems to me the suggestion is to hang the memcg_to_charge off the mapping? I'm not sure if *most/all* filesystems will eventually support it, but likely more than just tmpfs. > > Cheers, > > Dave. > -- > Dave Chinner > david@fromorbit.com