Subject: Re: [PATCH -next v2] mm, proc: collect percpu free pages into the free pages
Date: Sat, 25 Nov 2023 10:22:10 +0800
Message-ID: <2dbbd4e7-036f-4643-b05f-5967f4253ab8@huawei.com>
To: Dmytro Maluka, Michal Hocko
CC: Liu Shixin, Andrew Morton, Greg Kroah-Hartman, huang ying, Aaron Lu, Dave Hansen, Jesper Dangaard Brouer, Vlastimil Babka, Kemi Wang
References: <20220822023311.909316-1-liushixin2@huawei.com> <20220822033354.952849-1-liushixin2@huawei.com> <20220822141207.24ff7252913a62f80ea55e90@linux-foundation.org> <6b2977fc-1e4a-f3d4-db24-7c4699e0773f@huawei.com>
From: Kefeng Wang
In-Reply-To:

On 2023/11/25 1:54, Dmytro Maluka wrote:
> On Tue, Aug 23, 2022 at 03:37:52PM +0200, Michal Hocko wrote:
>> On Tue 23-08-22 20:46:43, Liu Shixin wrote:
>>> On 2022/8/23 15:50, Michal Hocko wrote:
>>>> On Mon 22-08-22 14:12:07, Andrew Morton wrote:
>>>>> On Mon, 22 Aug 2022 11:33:54 +0800 Liu Shixin wrote:
>>>>>
>>>>>> The pages on the pcplists can be used, but they are not counted into free
>>>>>> or available memory, and the pcp free count is currently only shown by
>>>>>> show_mem(). Since commit d8a759b57035 ("mm, page_alloc: double zone's
>>>>>> batchsize"), there has been a significant decrease in the reported free
>>>>>> memory; with a large number of CPUs and zones, the number of pages on the
>>>>>> percpu lists can be very large, so it is better to let the user know the
>>>>>> pcp count.
>>>>>>
>>>>>> On a machine with 3 zones and 72 CPUs, before commit d8a759b57035 the
>>>>>> maximum amount of pages in the pcp lists was theoretically 162MB
>>>>>> (3*72*768KB). After that commit, the lists can hold 324MB. In practice,
>>>>>> 114MB has been observed in the idle state after system startup (an
>>>>>> increase of 80MB).
>>>>>>
>>>>> Seems reasonable.
>>>> I have asked in the previous incarnation of the patch but haven't really
>>>> received any answer[1]. Is this a _real_ problem? The absolute amount of
>>>> memory could be perceived as a lot, but is it really noticeable wrt the
>>>> overall memory on those systems?
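For anyone skimming the thread, the number being discussed is the per-zone,
per-CPU pcp "count" that show_mem() already sums up. Below is only a rough
sketch of that summation, not the patch itself; the field names
(zone->per_cpu_pageset, struct per_cpu_pages::count) are assumed from recent
kernels and may differ on older versions:

	#include <linux/mm.h>
	#include <linux/mmzone.h>
	#include <linux/cpumask.h>
	#include <linux/percpu.h>

	/*
	 * Rough sketch only: walk every populated zone and add up the pcp
	 * count of each online CPU, the same way the show_mem() path does.
	 */
	static unsigned long nr_free_pcp_pages(void)
	{
		struct zone *zone;
		unsigned long free_pcp = 0;
		int cpu;

		for_each_populated_zone(zone)
			for_each_online_cpu(cpu)
				free_pcp += per_cpu_ptr(zone->per_cpu_pageset, cpu)->count;

		return free_pcp;	/* pages, not bytes */
	}

The patch under discussion would fold a total like this into
MemFree/MemAvailable; the sketch is only meant to show where the number
comes from.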
>
> Let me provide some other numbers, from the desktop side. On a low-end
> chromebook with 4GB RAM and a dual-core CPU, after commit b92ca18e8ca5
> (mm/page_alloc: disassociate the pcp->high from pcp->batch) the max
> amount of PCP pages increased 56x: from 2.9MB (1.45MB per CPU) to
> 165MB (82.5MB per CPU).
>
> On such a system, memory pressure conditions are not a rare occurrence,
> so several dozen MB make a lot of difference.

And with the "mm: PCP high auto-tuning" series merged in v6.7, the pcp
lists can be even bigger than before.

>
> (The reason it increased so much is that it now corresponds to the
> low watermark, which is 165MB. And the low watermark, in turn, is so
> high because of khugepaged, which bumps up min_free_kbytes to 132MB
> regardless of the total amount of memory.)
>
>>> This may not be obvious when memory is sufficient. However, products
>>> monitor memory in order to plan its use, and this change has caused
>>> warnings.
>>
>> Is it possible that the said monitor is over-sensitive and looking at
>> the wrong numbers? Overall free memory doesn't really tell much TBH.
>> MemAvailable is a very rough estimation as well.
>>
>> In reality what matters much more is whether the memory is readily
>> available when it is required, and neither MemFree nor MemAvailable
>> gives you that information in the general case.
>>
>>> We have also considered using /proc/zoneinfo to calculate the total
>>> number of pages on the pcplists. However, we think it is more
>>> appropriate to add that total to the free and available pages. After
>>> all, this part is also free pages.
>>
>> Those free pages are not generally available, as explained. They are
>> available to a specific CPU, drained under memory pressure and other
>> events, but there is still no guarantee a specific process can harvest
>> that memory because the pcp caches are replenished all the time.
>> So in a sense it is semi-hidden memory.
>
> I was intuitively assuming that per-CPU pages should always be available
> for allocation without resorting to paging out allocated pages (and thus
> it should be non-controversially a good idea to include per-CPU pages in
> MemFree, to make it more accurate).
>
> But looking at the code in __alloc_pages() and around, I see you are
> right: we don't try draining other CPUs' PCP lists *before* resorting to
> direct reclaim, compaction etc.
>
> BTW, why not? Shouldn't draining PCP lists be cheaper than pageout() in
> any case?

Same question here: could we drain the pcp lists before direct reclaim?

>
>> That being said, I am still not convinced this is actually going to help
>> all that much. You will see slightly different numbers which do not
>> tell much one way or another, and if the sole reason for tweaking these
>> numbers is that some monitor is complaining because X became X-epsilon,
>> then that sounds like a weak justification to me. That epsilon happens
>> all the time because there are quite a few hidden caches that are
>> released under memory pressure. I am not sure it is maintainable to
>> consider each one of them and pretend that MemFree/MemAvailable is
>> somehow precise. It has never been and likely never will be.
>> --
>> Michal Hocko
>> SUSE Labs
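For completeness, here is a rough userspace sketch of the /proc/zoneinfo
calculation mentioned above. It naively assumes each per-cpu "cpu: N" entry
is followed by its "count:" line, which is how current kernels print it but
is not a stable ABI:

	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>
	#include <unistd.h>

	int main(void)
	{
		FILE *f = fopen("/proc/zoneinfo", "r");
		char line[256];
		long pcp_pages = 0;
		long page_kb = sysconf(_SC_PAGESIZE) / 1024;
		int in_cpu = 0;

		if (!f) {
			perror("/proc/zoneinfo");
			return 1;
		}

		while (fgets(line, sizeof(line), f)) {
			const char *p = line + strspn(line, " \t");

			if (!strncmp(p, "cpu:", 4))
				in_cpu = 1;	/* the next "count:" is this CPU's pcp count */
			else if (in_cpu && !strncmp(p, "count:", 6)) {
				pcp_pages += strtol(p + 6, NULL, 10);
				in_cpu = 0;
			}
		}
		fclose(f);

		printf("pcp pages: %ld (%ld kB)\n", pcp_pages, pcp_pages * page_kb);
		return 0;
	}

Compiled and run as-is it prints one total across all zones and CPUs, which
is the same quantity a MemFree/MemAvailable change would expose directly.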