From: Frank van der Linden
Date: Tue, 26 Sep 2023 13:50:10 -0700
Subject: Re: [PATCH 0/2] hugetlb memcg accounting
To: Nhat Pham
Cc: akpm@linux-foundation.org, riel@surriel.com, hannes@cmpxchg.org,
    mhocko@kernel.org, roman.gushchin@linux.dev, shakeelb@google.com,
    muchun.song@linux.dev, tj@kernel.org, lizefan.x@bytedance.com,
    shuah@kernel.org, mike.kravetz@oracle.com, yosryahmed@google.com,
    linux-mm@kvack.org, kernel-team@meta.com,
    linux-kernel@vger.kernel.org, cgroups@vger.kernel.org
In-Reply-To: <20230926194949.2637078-1-nphamcs@gmail.com>

On Tue, Sep 26, 2023 at 12:49 PM Nhat Pham wrote:
>
> Currently, hugetlb memory usage is not accounted for in the memory
> controller, which could lead to memory overprotection for cgroups with
> hugetlb-backed memory. This has been observed in our production system.
>
> This patch series rectifies this issue by charging the memcg when the
> hugetlb folio is allocated, and uncharging when the folio is freed. In
> addition, a new selftest is added to demonstrate and verify this new
> behavior.
>
> Nhat Pham (2):
>   hugetlb: memcg: account hugetlb-backed memory in memory controller
>   selftests: add a selftest to verify hugetlb usage in memcg
>
>  MAINTAINERS                                       |   2 +
>  fs/hugetlbfs/inode.c                              |   2 +-
>  include/linux/hugetlb.h                           |   6 +-
>  include/linux/memcontrol.h                        |   8 +
>  mm/hugetlb.c                                      |  23 +-
>  mm/memcontrol.c                                   |  40 ++++
>  tools/testing/selftests/cgroup/.gitignore         |   1 +
>  tools/testing/selftests/cgroup/Makefile           |   2 +
>  .../selftests/cgroup/test_hugetlb_memcg.c         | 222 ++++++++++++++++++
>  9 files changed, 297 insertions(+), 9 deletions(-)
>  create mode 100644 tools/testing/selftests/cgroup/test_hugetlb_memcg.c
>
> --
> 2.34.1
>

We've had this behavior at Google for a long time, and we're actually
getting rid of it. hugetlb pages are a precious resource that should be
accounted for separately. They are not just any memory; they are
physically contiguous memory. Charging them the same as any other
region of the same size ended up not making sense, especially not for
larger hugetlb page sizes.

Additionally, if this behavior is changed just like that, there will be
quite a few workloads that will break badly because they'll hit their
limits immediately. Imagine a container that uses 1G hugetlb pages to
back something large (a database, a VM), and 'plain' memory for control
processes.

What do your workloads do? Is it not possible for you to account for
hugetlb pages separately? Sure, it can be annoying to have to deal with
2 separate totals that you need to take into account, but again,
hugetlb pages are a resource that is best dealt with separately.

- Frank
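
To put rough numbers on the container scenario above, here is a minimal
sketch of the arithmetic. Everything in it is an assumption made for
illustration: the helper name charged_bytes, the 4 GiB limit, and the
2 GiB / 16 GiB split are hypothetical, and the function only adds totals;
it does not model how the kernel actually charges a memcg.

```python
GIB = 1 << 30


def charged_bytes(plain_bytes, hugetlb_bytes, hugetlb_in_memcg):
    """Bytes counted against the memory controller limit.

    Hypothetical helper: it only sums totals and is not kernel logic.
    """
    return plain_bytes + (hugetlb_bytes if hugetlb_in_memcg else 0)


# Hypothetical container: sixteen 1G hugetlb pages back a VM, 2 GiB of
# plain memory serves the control processes, and the memory limit was
# sized for the plain memory only.
memory_limit = 4 * GIB
plain = 2 * GIB
hugetlb = 16 * GIB

separate = charged_bytes(plain, hugetlb, hugetlb_in_memcg=False)
combined = charged_bytes(plain, hugetlb, hugetlb_in_memcg=True)

print("separate accounting over limit:", separate > memory_limit)
print("combined accounting over limit:", combined > memory_limit)
```

With these made-up numbers the container fits under its limit while
hugetlb pages are tracked separately, and exceeds it as soon as the same
pages are also charged to the memory controller.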