Date: Thu, 11 Mar 2021 10:21:39 -0500
From: Johannes Weiner
To: Michal Hocko
Cc: Matthew Wilcox, Zhou Guanghui, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, akpm@linux-foundation.org, hughd@google.com,
    kirill.shutemov@linux.intel.com, npiggin@gmail.com, ziy@nvidia.com,
    wangkefeng.wang@huawei.com, guohanjun@huawei.com,
    dingtianhong@huawei.com, chenweilong@huawei.com, rui.xiang@huawei.com
Subject: Re: [PATCH v2 2/2] mm/memcg: set memcg when split page
References: <20210304074053.65527-1-zhouguanghui1@huawei.com>
    <20210304074053.65527-3-zhouguanghui1@huawei.com>
    <20210308210225.GF3479805@casper.infradead.org>
    <20210309123255.GI3479805@casper.infradead.org>

On Thu, Mar 11, 2021 at 09:37:02AM +0100, Michal Hocko wrote:
> Johannes, Hugh,
>
> what do you think about this approach? If we want to stick with
> split_page approach then we need to update the missing place Matthew has
> pointed out.

I find the __free_pages() code quite tricky as well. But for that
reason I would actually prefer to initiate the splitting in there,
since that's the place where we actually split the page, rather than
spread the handling of this situation further out.

The race condition shouldn't be hot, so I don't think we need to be as
efficient about setting page->memcg_data only on the higher-order
buddies as in Willy's scratch patch. We can call split_page_memcg(),
which IMO should actually help document what's happening to the page.

I think that function could also benefit a bit more from step-by-step
documentation about what's going on. The kerneldoc is helpful, but I
don't think it does justice to how tricky this race condition is.

Something like this?

void __free_pages(struct page *page, unsigned int order)
{
	/*
	 * Drop the base reference from __alloc_pages and free. In
	 * case there is an outstanding speculative reference, from
	 * e.g. the page cache, it will put and free the page later.
	 */
	if (likely(put_page_testzero(page))) {
		free_the_page(page, order);
		return;
	}

	/*
	 * The speculative reference will put and free the page.
	 *
	 * However, if the speculation was into a higher-order page
	 * that isn't marked compound, the other side will know
	 * nothing about our buddy pages and only free the order-0
	 * page at the start of our chunk! We must split off and free
	 * the buddy pages here.
	 *
	 * The buddy pages aren't individually refcounted, so they
	 * can't have any pending speculative references themselves.
	 */
	if (!PageHead(page) && order > 0) {
		split_page_memcg(page, 1 << order);
		while (order-- > 0)
			free_the_page(page + (1 << order), order);
	}
}