Date: Fri, 5 Mar 2021 00:06:31 -0800 (PST)
From: Hugh Dickins
To: Shakeel Butt
Cc: Hugh Dickins, Johannes Weiner, Roman Gushchin, Michal Hocko,
    Andrew Morton, cgroups@vger.kernel.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3] memcg: charge before adding to swapcache on swapin
In-Reply-To: <20210304014229.521351-1-shakeelb@google.com>
References: <20210304014229.521351-1-shakeelb@google.com>

On Wed, 3 Mar 2021, Shakeel Butt wrote:

> Currently the kernel adds the page, allocated for swapin, to the
> swapcache before charging the page. This is fine but now we want a
> per-memcg swapcache stat which is essential for folks who want to
> transparently migrate from cgroup v1's memsw to cgroup v2's memory and
> swap counters. In addition, charging a page before exposing it to other
> parts of the kernel is a step in the right direction.
>
> To correctly maintain the per-memcg swapcache stat, this patch charges
> the page before adding it to the swapcache. One challenge in this
> option is the failure case of add_to_swap_cache(), on which we need to
> undo the mem_cgroup_charge(). Specifically, undoing
> mem_cgroup_uncharge_swap() is not simple.
>
> To resolve the issue, this patch introduces a transaction-like
> interface to charge a page for swapin. The function
> mem_cgroup_charge_swapin_page() initiates the charging of the page and
> mem_cgroup_finish_swapin_page() completes the charging process. So the
> kernel starts the charging process of the page for swapin with
> mem_cgroup_charge_swapin_page(), adds the page to the swapcache, and on
> success completes the charging process with
> mem_cgroup_finish_swapin_page().
>
> Signed-off-by: Shakeel Butt

Quite apart from helping with the stat you want, what you've ended up
with here is a nice cleanup in several different ways (and I'm glad
Johannes talked you out of __GFP_NOFAIL: much better like this). I'll say

Acked-by: Hugh Dickins

but I am quite unhappy with the name mem_cgroup_finish_swapin_page():
it doesn't finish the swapin, it doesn't finish the page, and I'm not
persuaded by your paragraph above that there's any "transaction" here
(if there were, I'd suggest "commit" instead of "finish"; and I'd get
worried by the css_put before it's called - but no, that's fine, it's
independent).

How about complementing mem_cgroup_charge_swapin_page() with
mem_cgroup_uncharge_swapin_swap()? I think that describes well what it
does, at least in the do_memsw_account() case, and I hope we can
overlook that it does nothing at all in the other case.

And it really doesn't need a page argument: both places it's called
have just allocated an order-0 page, so there's no chance of a THP here;
but you might have some idea of future expansion, or of matching
put_swap_page() - I won't object if you prefer to pass in the page.

But more interesting, though off-topic, comments on it below...
> +/*
> + * mem_cgroup_finish_swapin_page - complete the swapin page charge transaction
> + * @page: page charged for swapin
> + * @entry: swap entry for which the page is charged
> + *
> + * This function completes the transaction of charging the page allocated for
> + * swapin.
> + */
> +void mem_cgroup_finish_swapin_page(struct page *page, swp_entry_t entry)
> +{
> 	/*
> 	 * Cgroup1's unified memory+swap counter has been charged with the
> 	 * new swapcache page, finish the transfer by uncharging the swap
> @@ -6760,20 +6796,14 @@ int mem_cgroup_charge(struct page *page, struct mm_struct *mm, gfp_t gfp_mask)
> 	 * correspond 1:1 to page and swap slot lifetimes: we charge the
> 	 * page to memory here, and uncharge swap when the slot is freed.
> 	 */
> -	if (do_memsw_account() && PageSwapCache(page)) {
> -		swp_entry_t entry = { .val = page_private(page) };
> +	if (!mem_cgroup_disabled() && do_memsw_account()) {

I understand why you put that !mem_cgroup_disabled() check in there,
but I have a series of observations on that.

First I was going to say that it would be better left to
mem_cgroup_uncharge_swap() itself. Then I was going to say that I think
it's already covered here by the cgroup_memory_noswap check inside
do_memsw_account(). Then, going back to mem_cgroup_uncharge_swap(), I
realized that 5.8's 2d1c498072de ("mm: memcontrol: make swap tracking
an integral part of memory control") removed the do_swap_account or
cgroup_memory_noswap checks from mem_cgroup_uncharge_swap() and
swap_cgroup_swapon() and swap_cgroup_swapoff() - so since then we have
been allocating totally unnecessary swap_cgroup arrays when
mem_cgroup_disabled() (and mem_cgroup_uncharge_swap() has worked by
reading the zalloced array). I think, or am I confused?
If I'm right on that, one of us ought to send another patch putting
back either cgroup_memory_noswap checks or mem_cgroup_disabled() checks
in those three places - I suspect the static key mem_cgroup_disabled()
is preferable, but I'm getting dozy.

Whatever we do with that - and it's really not any business for this
patch - I think you can drop the mem_cgroup_disabled() check from
mem_cgroup_uncharge_swapin_swap().

> 	/*
> 	 * The swap entry might not get freed for a long time,
> 	 * let's not wait for it. The page already received a
> 	 * memory+swap charge, drop the swap entry duplicate.
> 	 */
> -	mem_cgroup_uncharge_swap(entry, nr_pages);
> +	mem_cgroup_uncharge_swap(entry, thp_nr_pages(page));
> }
> -
> -out_put:
> -	css_put(&memcg->css);
> -out:
> -	return ret;
> }