From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 59CABD3E18D for ; Mon, 21 Oct 2024 07:15:21 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id B63306B007B; Mon, 21 Oct 2024 03:15:20 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id B12566B0082; Mon, 21 Oct 2024 03:15:20 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9DA1C6B0083; Mon, 21 Oct 2024 03:15:20 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 7FF526B007B for ; Mon, 21 Oct 2024 03:15:20 -0400 (EDT) Received: from smtpin05.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id 304A281698 for ; Mon, 21 Oct 2024 07:15:07 +0000 (UTC) X-FDA: 82696748022.05.AD37E86 Received: from mail-lf1-f53.google.com (mail-lf1-f53.google.com [209.85.167.53]) by imf25.hostedemail.com (Postfix) with ESMTP id D130FA000A for ; Mon, 21 Oct 2024 07:15:07 +0000 (UTC) Authentication-Results: imf25.hostedemail.com; dkim=pass header.d=suse.com header.s=google header.b=BibXo6kO; spf=pass (imf25.hostedemail.com: domain of mhocko@suse.com designates 209.85.167.53 as permitted sender) smtp.mailfrom=mhocko@suse.com; dmarc=pass (policy=quarantine) header.from=suse.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1729494768; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=O66gPrArBhu1Lkf2k94VS51buUsSbDnUdJxUilOqKoU=; b=8iLo+EeS+jd+sZdUeV9VknYaLdqPduhwoDplNz0J36ymeBh1mVQqvXNCHUjmoFWsXnvRVK fC15g4ci8uhajimGwPIDVbpUmGpNaRINecEDSfQErFOJdypFW2xQkh3hZzs0l4s+XxOd0w vH3LkOb/Af4rG3MmS7AQHIeqe/P1DCQ= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1729494768; a=rsa-sha256; cv=none; b=dEzhIZv72YpkZOCuwX0wD4MACzjdqu2khlOd/rWdJEj6Tl6G4ILsEBEr+IVOhif7rKw85C 11yFRBmIEuP/af2/msv+Y2M3mrtCN1NCMZEdsTe+xS4Cu+XtRc0K9CZ4b2uqv/xsG4r7bX j2XP5jXqW+GE5hfEIqfyuL/oo+uEanI= ARC-Authentication-Results: i=1; imf25.hostedemail.com; dkim=pass header.d=suse.com header.s=google header.b=BibXo6kO; spf=pass (imf25.hostedemail.com: domain of mhocko@suse.com designates 209.85.167.53 as permitted sender) smtp.mailfrom=mhocko@suse.com; dmarc=pass (policy=quarantine) header.from=suse.com Received: by mail-lf1-f53.google.com with SMTP id 2adb3069b0e04-539fe76e802so4558705e87.1 for ; Mon, 21 Oct 2024 00:15:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=suse.com; s=google; t=1729494916; x=1730099716; darn=kvack.org; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:from:date:from:to :cc:subject:date:message-id:reply-to; bh=O66gPrArBhu1Lkf2k94VS51buUsSbDnUdJxUilOqKoU=; b=BibXo6kOFF1OqPrQ2dpEVorXwzGFCudmNyrsgiuwq/NPYoP96OdwKklGaY7rpLFAFM MwQrDBVK76RM+P4cIYBEGM6fLrob4d2wCrpbN9ofbeAbLwTSiU2E2e/tyuldoCQ7fy1M 4XHs93dmVOenB/WdwatwUqqpCOvbIaoe4uy79XbOfO+V6dNNZEQBMbB5AqFQdhtlfIXQ 9A1NVwXA1Y6KVNlW9MQWjEYAy22ZWgrBAic+tHyfdg+wMSQZN5L49dPKi2iXtzclfPqR QIoRdfeIxrge5HA9/60rs59lJtXLBCZ1aKijxARTIgVUpJJtN3v9QxaMqRy4Rmnh71+w tJcQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1729494916; x=1730099716; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=O66gPrArBhu1Lkf2k94VS51buUsSbDnUdJxUilOqKoU=; b=k8pOQ726Y5gbGtkYhmfBBCe6Yfj+SK40zqHyj/CdNE76g7jD79/MpfVAG9y0QqCGgW tj3OGC/MhAQkRYgXKsuID8ua5xWTfW8OxYbiYU62syW7zUaE8vO2oaxWuBuST+ux7l/U nRBOxY0zl3JgGyd2gLMmECJmtKJcvfMgNqZ/bk/rJJgCdfJiDqsA98cIoQIAHTCx+4y0 cKS6JqienkyDXmDoWRo94x6FpNqBj6OdTc+wDqhuX8T25CrW59ZxbjJok4WoaXIUK7pw uqpduFoyfy3HakRQxDyD95SS2s9CiV6+iyZpkxemLL7ywnL6VJVf3GyYf6h6b0njBx5J 7gFg== X-Forwarded-Encrypted: i=1; AJvYcCV+gTbtt8qg3US+Yond4WDHg0Ssh5d/0rNvoU/f96ooF7G154SYfWzL6xhtBFvLm8j8nOG3KU9qQA==@kvack.org X-Gm-Message-State: AOJu0YwQE5Plo7HjdByeETWttwmK1/Fwc/41UU5JXJt9rK0CWyUGMmRk N8fktnEKI2k1xEwu1zZHhOqU+lkGY5QmUjoqLslForTfe6qyHwTEzPqS7StBy84= X-Google-Smtp-Source: AGHT+IETEFXR8/cKXRm1qK4kLTWa+k3b+HsgghOppLGOl3d3G+/sF4TWEhiXzLustHLoNu+OGM33Cw== X-Received: by 2002:a05:6512:39cd:b0:539:8ade:2d0 with SMTP id 2adb3069b0e04-53a155092a2mr4776049e87.51.1729494916059; Mon, 21 Oct 2024 00:15:16 -0700 (PDT) Received: from localhost (109-81-89-238.rct.o2.cz. [109.81.89.238]) by smtp.gmail.com with ESMTPSA id a640c23a62f3a-a9a912d641esm169229666b.3.2024.10.21.00.15.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 21 Oct 2024 00:15:15 -0700 (PDT) Date: Mon, 21 Oct 2024 09:15:14 +0200 From: Michal Hocko To: Joshua Hahn Cc: Shakeel Butt , Johannes Weiner , nphamcs@gmail.com, roman.gushchin@linux.dev, muchun.song@linux.dev, akpm@linux-foundation.org, cgroups@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, lnyng@meta.com Subject: Re: [PATCH 0/1] memcg/hugetlb: Adding hugeTLB counters to memory controller Message-ID: References: <20241017160438.3893293-1-joshua.hahnjy@gmail.com> <20241018123122.GB71939@cmpxchg.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: X-Stat-Signature: 775wk7cji91pb375t74h93cfrc3jh7qd X-Rspamd-Queue-Id: D130FA000A X-Rspam-User: X-Rspamd-Server: rspam08 X-HE-Tag: 1729494907-933361 X-HE-Meta: U2FsdGVkX1/vjEFCgWX2FjVdaxKViaTkyFZ+YDxNRXAsF6S8ENft8EAiKhBbCNCAjTwjmvhgzeCDg+xWtro2cQU/eXR1T8Co2whSEZharsM/fAiOVKgHOiCU9zCB/p61Qea5KIcS6NS7KJlz4GaNMRhWjultL3Hc0gHpw+Hb8me8blox/wYf8FbylhaGs9D4e/4MmvlUAZfIc9X6FTu+3oqeH0U55kd5UfRSfNxc1A2NMXKPp0X7uqjFzsd5A4gV0ofxDJ9GRMsPSEJlPP4znElA7OMKNwg8mfzpbyXzpyfp76Jo+X4GBxlzqi7VJSVvLmf9gvN0a8Jf7ZkAT8B3Mku2WIiANSI4LQfwG6YxssXdlLalNI+8PH9ZIsn1aKX2JMEa1MYDDyBrg4+W4hEjAXfVVHw8RsRCfKVeiJ5gxMRnOW3Hf9iPfORJfiSHIHJNhZlCxv8ZEN5tLuWiEFoS0xGkYmXD+1qTHieBCyksZEmcMqwM6dH1nBx5Yd9sFkXMFlE5hB/nhbNAeJZiCAZ3XjsKdH0MELoywoGvBsUEdYp8YNDijiEjJTNoh2inGyKWiYmL1sqDRMx8Kt76SuAJYRD2P7YSQ2QV1s4K/194tbwvB6fqeTv+S7Giz/CPlrc2LSSFwfqp7ppPUKwqlLgXThqjc6UwT8TfX8JvP6YMJMDb9wiAU+1fzEoOLM/P0uGR9ptkRyx+4xvXyZX11ysvsprv2KkgxwhZ6Yg2NiBKZA/roLZ8fPHe0t/JtK+biEoLzHP6mjS92BkhErhF0bxFzc01lWl8xBM81wVm1Bjwp4ZqpLBq+15MHb6pJs0ki0BRO0vavTQidh6Wv6kwJDcd1NGqOMxxy/dyw7hN1O7XqDH2JQaSoqRlPyuAj5uyWadIyYAr/lyaI7vDxaPcQWQZlh9ro0WPIc6/9eE8CaKkN+8Gs2JNbJy6akkF0a02cTPy/aoR9cySQ4gJ9gSqjTH yFBMU8sJ Wj1cwSW2k9B12wgSTiJaT7pbwhRcytrH5gDYYj9/RFwxHA8uMAPhUwgI0x+BGF3mZ3g6hW/tiDOUgfBDXPFBZP5rD93KfoE+pPsVzkyuBl8m1INXbK57GE9XQFKbY0UDs64RGWTbr9KXJ2Rk4fo0DSOuxc4JTVc4FvizCll+D+rJtL1YeQnkBAttSVQgXf5xi1Qa3WO1TEqsgSmpYL1LUP9hxAeWw149T+ccYSE1Q8qmys2z/F7KcVXE1aE+SvXR+lExyeD3UtjGGvvTEBW1t8kD08LnXIInADh7rxxM2BDJkNiCeZPMtjHbApnWZw292VnYQ46KT9KVp/nDCwYV/ITyZgeA1g6mrz9yaQnooi9UXwFSOEeaHXNUDzhYLyoR5FnRAk9H9wgMnLgmAnrJn+0evUw== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000087, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Fri 18-10-24 14:38:48, Joshua Hahn wrote: > On Fri, Oct 18, 2024 at 2:11 PM Shakeel Butt wrote: > > On Fri, Oct 18, 2024 at 03:42:13PM GMT, Michal Hocko wrote: [...] > > > and it would be great to have an explanation why the lack of tracking > > > has proven problematic. Also the above doesn't really explain why those > > > who care cannot really enabled hugetlb controller to gain the > > > consumption information. > > > > Let me give my take on this. The reason is the ease and convenience to > > see what is happening when I see unexpectedly large memory.current > > value. Logically I would look at memory.stat to make sense of it. > > Without this I have to remember that the user might have hugetlb memcg > > accounting option enabled and they are use hugetlb cgroup to find the > > answer. If they have not enabled hugetlb cgroup then I am in dark. Yes, I thought that this was an acceptable limitation of the accounting. If that is not the case then it is really preferable to mention reasons in the changelog. The reasoning was that hugetlb controller which would be a natural source of that information is not really great because of an overhead which hasn't really been evaluated - hence my questioning. [...] > Aside from consistency between the two files, we can see benefits in > observability. There are many reasons userspace might be intersted in > understanding the hugeTLB footprint of cgroups. To name a few, system > administrators might want to verify that hugeTLB usage is distributed as > expected across tasks: i.e. memory-intensive tasks are using more hugeTLB > pages than tasks that don't consume a lot of memory, or is seen to fault > frequently. Note that this is separate from wanting to inspect the > distribution for limiting purposes (in that case, it makes sense to use > the controller) Please add this information into the changelog. > 2. Why can't you enable the hugeTLB controller, if tracking is so important? > > By turning on the hugeTLB controller, we gain all of the observability > that I mentioned above; users can just check the respective hugetlb files. > However, the discrepancy between memory.stat and memory.current is still > there. When I check memory.current, I expect to be able to explain the usage > by looking at memory.stat and trying to understand the breakdown, not by going > into the various hugetlb controller files to check how/if the memory is > accounted for. Well, as mentioned in the previous response this has been an acceptable limitation of the hugetlb accounting. It is fair to reconsider this based on existing experience but that should be a part of the changelog. > But even if we are okay with this, I think it might be overkill to > enable the hugeTLB controller for the convenience of being able to inspect > the hugeTLB usage for cgroups. This is especially true in workloads where > we can predict what usage patterns will be like, and we do not need to enforce > specific limits on hugeTLB usage. I am sorry but I do not understand the overkill part of the argument. Is there any runtime or configuration cost that is visible? > 3. What if CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING is disabled? > > This is a great point. The way the patch is currently implemented, it > should still do the accounting to memory.stat, even if > CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING is disabled. This would give us the reverse > problem where hugeTLB usage that is reported in the statistics are no longer > accounted for in current... Exactly! And this is a problem. > I think it makes sense to show hugeTLB statistics in memory.stat only if > hugeTLB is accounted for in memory.current as well (i.e. check if > CGRP_ROOT_MEMORY_HUGETLB_ACCOUNTING is enabled before doing the accounting, > or move the accounting from hugeTLB alloc/free --> hugeTLB charge/uncharge, > which should only happen if hugeTLBs are accounted for in memory.current). > > What do you think? yes, not only shown but also accounted only if the feature is enabled because we do not want to introduce any (even tiny) overhead for feature that is not enabled. TL;DR 1) you need to make the stats accounting aligned with the existing charge accounting. 2) make the stat visible only when feature is enabled 3) work more on the justification - i.e. changelog part and give us a better insight why a) hugetlb cgroup is seen is a bad choice and b) why the original limitation hasn't proven good since the feature was introduced. Makes sense? -- Michal Hocko SUSE Labs