From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id CA02DD12D77 for ; Mon, 11 Nov 2024 06:33:41 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 53BF76B0089; Mon, 11 Nov 2024 01:33:41 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 4EAC06B008A; Mon, 11 Nov 2024 01:33:41 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3B23C6B008C; Mon, 11 Nov 2024 01:33:41 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 1B5F26B0089 for ; Mon, 11 Nov 2024 01:33:41 -0500 (EST) Received: from smtpin09.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id BC4671A1764 for ; Mon, 11 Nov 2024 06:33:40 +0000 (UTC) X-FDA: 82772845974.09.40ECDA7 Received: from out-183.mta1.migadu.com (out-183.mta1.migadu.com [95.215.58.183]) by imf05.hostedemail.com (Postfix) with ESMTP id 307ED100003 for ; Mon, 11 Nov 2024 06:32:22 +0000 (UTC) Authentication-Results: imf05.hostedemail.com; dkim=pass header.d=linux.dev header.s=key1 header.b="lqtL6g/7"; dmarc=pass (policy=none) header.from=linux.dev; spf=pass (imf05.hostedemail.com: domain of shakeel.butt@linux.dev designates 95.215.58.183 as permitted sender) smtp.mailfrom=shakeel.butt@linux.dev ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1731306757; a=rsa-sha256; cv=none; b=Jya4srBXCW4q9keLB1V4T9Y1oCNuhE++8xJJyDq+EYH3yR36Eh9pLmkA+54jvJ49qx0v0L qURRErLOoddpnbxONa35kBpBgTmZq/gm5F5fUR+bURi+LJx3OOraIXYNnMpe+N/2HxvTKk uRG+lomgEZsdn/2lX7R61R/KDhV3rNw= ARC-Authentication-Results: i=1; imf05.hostedemail.com; dkim=pass header.d=linux.dev header.s=key1 header.b="lqtL6g/7"; dmarc=pass (policy=none) header.from=linux.dev; spf=pass (imf05.hostedemail.com: domain of shakeel.butt@linux.dev designates 95.215.58.183 as permitted sender) smtp.mailfrom=shakeel.butt@linux.dev ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1731306757; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=X/MVBgEHJ44Ez5PPP0jKD7akI1JfToDyS0gIR7NVZiU=; b=emhkyVXmS7tCLvCzZ8S+wT9DO2CTdL8zWmHb//8WEI3OanUdSYwdR6EzKKUkICNI6cJ+DQ hoHUEs4jsYYchkTiJ2RctdD7SAplDp6dftO0JRl0VQiOP9BTjb+LkWKQBPNIYU3iglNn7f ntBGqFON/DNz8agqLicK5EEdvEL8Wng= Date: Sun, 10 Nov 2024 22:33:26 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1731306816; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=X/MVBgEHJ44Ez5PPP0jKD7akI1JfToDyS0gIR7NVZiU=; b=lqtL6g/7LKpANkQ22HN8NMcNyIyxrkUBsHIOtBAkoLfYHxjoMT1TuCCV6mDpMe132sO9rB 8eP8A1RhzbqI1jlPCcceRMomea+mTsXtuPWCBzNU+FdlxHaqsIiVd7pOSea0N0ge4bo9FV IeJDshKG0gomRma/l63v30/N4CrjD8I= X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Shakeel Butt To: Joshua Hahn Cc: hannes@cmpxchg.org, mhocko@kernel.org, roman.gushchin@linux.dev, muchun.song@linux.dev, akpm@linux-foundation.org, cgroups@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@meta.com Subject: Re: [PATCH 2/3] memcg/hugetlb: Introduce mem_cgroup_charge_hugetlb Message-ID: References: <20241108212946.2642085-1-joshua.hahnjy@gmail.com> <20241108212946.2642085-3-joshua.hahnjy@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Migadu-Flow: FLOW_OUT X-Rspam-User: X-Rspamd-Queue-Id: 307ED100003 X-Rspamd-Server: rspam01 X-Stat-Signature: 31spqqkigagfixdxfmkiybmbsxs6ony8 X-HE-Tag: 1731306742-54806 X-HE-Meta: U2FsdGVkX1+f9SVhQRJdgo/cITZLnJT90gRyyF+VuFDhQc7CChoAatAqJkUdp2EdoLvpe5ZkmbSNt3olO7XNVh0Y1/KdMtUThQs8ZdSt+0wy5dvEoA3YGXepEYnYRywzKSX4f39SuK6L3KVXqPnFfkBeTL08y84Qpu3v6ldfVp8fFVejNFeMA6sykpzLTlfcqeh5GMZRmfOBtJsPMX3prA7lGiLOsFYbT6EhZpY/nrq1geEPHWxwy+S7m6/A76m6nX3RM75BQ86/TVgpglmzE3fAnOeJt0S3v26U8YozrOC2ZgPsLyt9eJDtXv08xH9fR0uKOIDZh1gPWGf7EQkP1ZAx7iVfKQT2PAdacjrbQF2nuKYfnOo/OkLM3g9AKiyD2pwKkAWROSer894wPeG1pF0VZJfbStWFoJAHP3VxZ5uWqlq3fF4crtUW/iZWFTOVJqg4ZimnYlnLISMexdc2uTH4KKB0k8L6TTY9NTmgFdsAPLRAghslRgN8G+b7Q+8w+nERQmhrs2ckRI/Z2QSy+tk/gMN8Hqt3GIE7gpuqYfVdX0CsAYQ9yUWn74UeSzzU4wppYeERLHN98QxG5z2S0Sk9renB9va2CQthSUrNfx9u8yQYi5Rm/UsJ+aDacG71C/knXjri8fgWvzsg51Q+V/2T9nIwhCFyvK6U3yWUlj7yIADn9Wj4nIhZM6HK/PmlW9Z4wJcWORYQT4JFWW4qT6FRF3Mw3sfbLnCsLKGrLP67t1F23EfMgnEy6/368s0uej2Aj7d9wNr3Exm3UYAj14piF5bmDMxUiLYVflZlCKX5/e0vB9aAc9w3BNn8hdsJ/KdXc1qMt6CF11Wy0J2PjS1REt+sVKU/yIYzxxpOA9GW4SV2e5HhfegbciLZnu7k8bgf4p5fYLY6De7COPU5yFDrctLvpgAW+nQrfIiRu+uE6gDWTiezjiVGx4g+TvHd4t1Wvxrw40R0/R1wHHG 0pm2Nfqz F0AzW88mtadevw1ENNYTDAxXYom6Ltdfqaqg0aDPv1u6IwYp80pWMCHb+nSV60uyiQS+50W6XwQWMFgIS3CCTeaRxfhM7jNrzSQ+cmDYjpxVOrxKtQ9ze6xb5Ry6VYBMXhGozdDhgvmE5U2ckyE2uSGOREs5MjbE2elU6I9BJ+7ClqU47lzoXpKWYTRC6cNvawZ9OGW4FWDq4dhuNwiZUV44gLVp/68U1ODBqQFCWS+g8OTjOT4udFd6fM+YlvW1FID8ju+rdJpTOgVk= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Sat, Nov 09, 2024 at 01:58:46PM -0500, Joshua Hahn wrote: [...] > > > > + free_huge_folio(folio); > > > > free_huge_folio() will call lruvec_stat_mod_folio() unconditionally but > > you are only calling it on success. This may underflow the metric. > > I was actually thinking about this too. I was wondering what would > make sense -- in the original draft of this patch, I had the charge > increment be called unconditionally as well. The idea was that > even though it would not make sense to have the stat incremented > when there is an error, it would eventually be corrected by > free_huge_folio's decrement. However, because there is nothing > stopping the user from checking the stat in this period, they may > temporarily see that the value is incremented in memory.stat, > even though they were not able to obtain this page. > On the alloc path, unconditionally calling lruvec_stat_mod_folio() after mem_cgroup_charge_hugetlb() but before free_huge_folio() is fine. Please note that if mem_cgroup_charge_hugetlb() has failed, lruvec_stat_mod_folio() will not be incrementing the memcg stat. The system level stats may get elevated for small time window but that is fine. > With that said, maybe it makes sense to increment unconditionally, > if free also decrements unconditionally. This race condition is > not something that will cause a huge problem for the user, > although users relying on userspace monitors for memory.stat > to handle memory management may see some problems. > > Maybe what would make the most sense is to do both > incrementing & decrementing conditionally as well. > Thank you for your feedback, I will iterate on this for the next version! > > > > +int mem_cgroup_charge_hugetlb(struct folio *folio, gfp_t gfp) > > > +{ > > > + struct mem_cgroup *memcg = get_mem_cgroup_from_current(); > > > + int ret = 0; > > > + > > > + if (mem_cgroup_disabled() || !memcg_accounts_hugetlb() || > > > + !memcg || !cgroup_subsys_on_dfl(memory_cgrp_subsys)) { > > > + ret = -EOPNOTSUPP; > > > > why EOPNOTSUPP? You need to return 0 here. We do want > > lruvec_stat_mod_folio() to be called. > > In this case, I was just preserving the original code's return statements. > That is, in mem_cgroup_hugetlb_try_charge, the intended behavior > was to return -EOPNOTSUPP if any of these conditions were met. > If I understand the code correctly, calling lruvec_stat_mod_folio() > on this failure will be a noop, since either memcg doesn't account > hugetlb folios / there is no memcg / memcg is disabled. > lruvec_stat_mod_folio() does manipulate system level stats, so even if memcg accounting of hugetlb is not enabled we do want system level stats get updated.