Date: Mon, 2 Oct 2023 11:25:55 -0400
From: Johannes Weiner
To: Michal Hocko
Cc: Nhat Pham, akpm@linux-foundation.org, riel@surriel.com,
	roman.gushchin@linux.dev, shakeelb@google.com, muchun.song@linux.dev,
	tj@kernel.org, lizefan.x@bytedance.com, shuah@kernel.org,
	mike.kravetz@oracle.com, yosryahmed@google.com, linux-mm@kvack.org,
	kernel-team@meta.com, linux-kernel@vger.kernel.org,
	cgroups@vger.kernel.org
Subject: Re: [PATCH v2 1/2] hugetlb: memcg: account hugetlb-backed memory in memory controller
Message-ID: <20231002152555.GA5054@cmpxchg.org>
References: <20230928005723.1709119-1-nphamcs@gmail.com>
	<20230928005723.1709119-2-nphamcs@gmail.com>
	<20231002145026.GB4414@cmpxchg.org>

On Mon, Oct 02, 2023 at 05:08:34PM +0200, Michal Hocko wrote:
> On Mon 02-10-23 10:50:26, Johannes Weiner wrote:
> > On Mon, Oct 02, 2023 at 03:43:19PM +0200, Michal Hocko wrote:
> > > On Wed 27-09-23 17:57:22, Nhat Pham wrote:
> [...]
> > > - memcg limit reclaim doesn't assist hugetlb pages allocation when
> > >   hugetlb overcommit is configured (i.e.
> > >   pages are not consumed from the
> > >   pool) which means that the page allocation might disrupt workloads
> > >   from other memcgs.
> > > - failure to charge a hugetlb page results in SIGBUS rather
> > >   than memcg oom killer. That could be the case even if the
> > >   hugetlb pool still has pages available and there is
> > >   reclaimable memory in the memcg.
> >
> > Are these actually true? AFAICS, regardless of whether the page comes
> > from the pool or the buddy allocator, the memcg code will go through
> > the regular charge path, attempt reclaim, and OOM if that fails.
>
> OK, I should have been more explicit. Let me expand. Charges are
> accounted only _after_ the actual allocation is done. So the actual
> allocation is not constrained by the memcg context. It might reclaim
> from the memcg at that time but the disruption could have already
> happened. Not really any different from regular memory allocation
> attempt but much more visible with GB pages and one could reasonably
> expect that memcg should stop such a GB allocation if the local reclaim
> would be hopeless to free up enough from its own consumption.
>
> Makes more sense?

Yes, that makes sense.

This should be fairly easy to address by having hugetlb do the split
transaction that charge_memcg() does in one go, similar to what we do
for the hugetlb controller as well. IOW,

alloc_hugetlb_folio()
{
	if (mem_cgroup_hugetlb_try_charge())
		return ERR_PTR(-ENOMEM);

	folio = dequeue();
	if (!folio) {
		folio = alloc_buddy();
		if (!folio)
			goto uncharge;
	}

	mem_cgroup_hugetlb_commit_charge();
}
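
To make the ordering above concrete (charge first, allocate second, cancel
the charge if the allocation fails), here is a rough userspace model in C.
It is only a sketch under stated assumptions: every name in it (toy_memcg,
toy_allocator, toy_try_charge, toy_cancel_charge, toy_take_page,
toy_alloc_hugetlb) is hypothetical and none of it is kernel API. It also
leaves out reclaim entirely, which the real charge path would attempt
before refusing a charge.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memcg: a usage counter with a hard limit. */
struct toy_memcg {
	size_t usage;	/* pages currently charged */
	size_t limit;	/* maximum pages this group may charge */
};

/* Hypothetical stand-in for the hugetlb pool plus the buddy allocator. */
struct toy_allocator {
	size_t pool_free;	/* huge pages sitting in the reserved pool */
	size_t buddy_free;	/* huge pages obtainable through overcommit */
};

/* Phase 1: reserve room against the limit before touching any allocator. */
bool toy_try_charge(struct toy_memcg *cg, size_t nr_pages)
{
	/* usage never exceeds limit, so the subtraction cannot wrap */
	if (nr_pages > cg->limit - cg->usage)
		return false;
	cg->usage += nr_pages;
	return true;
}

/* Undo phase 1 when the allocation itself fails. */
void toy_cancel_charge(struct toy_memcg *cg, size_t nr_pages)
{
	assert(cg->usage >= nr_pages);
	cg->usage -= nr_pages;
}

/* Take one huge page: the pool is tried first, then the buddy fallback. */
bool toy_take_page(struct toy_allocator *a)
{
	if (a->pool_free) {
		a->pool_free--;
		return true;
	}
	if (a->buddy_free) {
		a->buddy_free--;
		return true;
	}
	return false;
}

/*
 * Returns 0 on success, -1 if the charge is refused (no allocation is
 * attempted in that case), -2 if the charge succeeded but no page was
 * available (the charge is cancelled before returning).
 */
int toy_alloc_hugetlb(struct toy_memcg *cg, struct toy_allocator *a,
		      size_t nr_pages)
{
	if (!toy_try_charge(cg, nr_pages))
		return -1;

	if (!toy_take_page(a)) {
		toy_cancel_charge(cg, nr_pages);
		return -2;
	}

	/*
	 * Phase 2 (commit) would bind the charge to the folio here; this
	 * model has no folio object, so there is nothing left to do.
	 */
	return 0;
}
```

The point of the model is the -1 path: a group that is already at its limit
is turned away before the pool or the buddy fallback is touched, so a
hopeless allocation cannot disturb anyone else first.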