References: <8a2f2644-71d0-05d7-49d8-878aafa99652@huawei.com>
From: Yang Shi
Date: Tue, 29 Nov 2022 09:49:25 -0800
Subject: Re: [QUESTION] memcg page_counter seems broken in MADV_DONTNEED with THP enabled
To: Michal Hocko
Cc: Yongqiang Liu, "linux-kernel@vger.kernel.org", "linux-mm@kvack.org", "akpm@linux-foundation.org", aarcange@redhat.com, hughd@google.com, mgorman@suse.de, cl@gentwo.org, zokeefe@google.com, rientjes@google.com, Matthew Wilcox, peterx@redhat.com, "Wangkefeng (OS Kernel Lab)", "zhangxiaoxu (A)", kirill.shutemov@linux.intel.com, Lu Jialin

On Tue, Nov 29, 2022 at 12:10 AM Michal Hocko wrote:
>
> On Mon 28-11-22 12:01:37, Yang Shi wrote:
> > On Sat, Nov 26, 2022 at 5:10 AM Yongqiang Liu wrote:
> > >
> > > Hi,
> > >
> > > We use mm_counter to count how much physical memory a process uses.
> > > Meanwhile, the page_counter of a memcg is used to count how much
> > > physical memory a cgroup uses.
> > > If a cgroup contains only one process, the two look almost the same.
> > > But with THP enabled, memory.usage_in_bytes in the memcg can
> > > sometimes be twice the rss in /proc/[pid]/smaps_rollup or more,
> > > as follows:
> [...]
> > > node_page_stat, which is shown in meminfo, also decreased.
> > > __split_huge_pmd seems to free no physical memory unless the
> > > whole THP is freed. I am confused about which one is the true
> > > physical memory usage of a process.
> >
> > This should be caused by the deferred split of THP. When MADV_DONTNEED
> > is called on part of the mapping, the huge PMD is split, but the
> > THP itself will not be split until memory pressure is hit (global
> > or memcg limit). So the unmapped subpages are not actually freed
> > until that point. The mm counter is decreased by the zapping,
> > but the physical pages are not yet freed and uncharged from the
> > memcg.
>
> Yes, and this is not really bound to THP. Consider page cache. It can
> be accessed via syscalls, in which case it doesn't correspond to rss at
> all while it is still charged to a memcg. Or it can be mapped and then
> later unmapped, so it disappears from rss while it is still charged
> until it gets reclaimed under memory pressure. Or it can be an in-memory
> object that is not bound to any process lifetime (e.g. tmpfs). Or it can
> be kernel memory charged to a memcg which is not covered by rss because
> it is either not mapped or it is unknown to rss counters.

Yes, good points. Thanks, Michal. And one more thing worth mentioning
is that the RSS shown by ps or smaps is different from the RSS shown by
memcg.

> --
> Michal Hocko
> SUSE Labs
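
To make the gap discussed in this thread concrete, here is a minimal
sketch of how one might compare the two numbers from userspace. It is
only an illustration: the function names are hypothetical, and the
sample values are made up to mimic a single 2 MiB THP with half of it
zapped by MADV_DONTNEED (RSS drops to 1 MiB while the memcg still
charges the whole huge page until the deferred split happens). On a
real system the inputs would be the contents of
/proc/[pid]/smaps_rollup and the cgroup's memory.usage_in_bytes.

```python
import re


def rss_bytes_from_smaps_rollup(text: str) -> int:
    """Return the Rss value, in bytes, from smaps_rollup-style text."""
    match = re.search(r"^Rss:\s+(\d+)\s+kB$", text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no Rss line found")
    return int(match.group(1)) * 1024


def charged_but_unmapped(memcg_usage_bytes: int, rss_bytes: int) -> int:
    """Bytes charged to the memcg that the process RSS does not cover."""
    return max(memcg_usage_bytes - rss_bytes, 0)


# Made-up sample inputs (assumption: one 2 MiB THP, half of it unmapped).
sample_rollup = """\
00400000-7ffd1c5f6000 ---p 00000000 00:00 0    [rollup]
Rss:                1024 kB
Pss:                1024 kB
"""
sample_usage = "2097152\n"  # what memory.usage_in_bytes might contain

rss = rss_bytes_from_smaps_rollup(sample_rollup)
usage = int(sample_usage)
gap = charged_but_unmapped(usage, rss)
print(f"rss={rss} usage={usage} gap={gap}")
```

A nonzero gap is not by itself a sign of a leak. As noted above, it can
also come from page cache, tmpfs, or kernel memory charged to the memcg.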