Date: Fri, 3 Feb 2023 11:00:00 -0800
From: Roman Gushchin
To: Johannes Weiner
Cc: Michal Hocko, Shakeel Butt, Tejun Heo, linux-mm@kvack.org,
    cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Christian Brauner
Subject: Re: [RFC PATCH] mm: memcontrol: don't account swap failures not due to cgroup limits
In-Reply-To: <20230202155626.1829121-1-hannes@cmpxchg.org>

On Thu, Feb 02, 2023 at
10:56:26AM -0500, Johannes Weiner wrote:
> Christian reports the following situation in a cgroup that doesn't
> have memory.swap.max configured:
>
>   $ cat memory.swap.events
>   high 0
>   max 0
>   fail 6218
>
> Upon closer examination, this is an ARM64 machine that doesn't support
> swapping out THPs.

Do we expect support to be added any time soon, or is this caused by some
system limitation?

> In that case, the first get_swap_page() fails, and
> the kernel falls back to splitting the THP and swapping the 4k
> constituents one by one. /proc/vmstat confirms this with a high rate
> of thp_swpout_fallback events.
>
> While the behavior can ultimately be explained, it's unexpected and
> confusing. I see three choices how to address this:
>
> a) Specifically exclude THP fallbacks from being counted, as the
>    failure is transient and the memory is ultimately swapped.
>
>    Arguably, though, the user would like to know if their cgroup's
>    swap limit is causing high rates of THP splitting during swapout.

I agree, but it's probably better to reflect that in the form of a
per-memcg THP split failure counter (e.g. in memory.stat), not as swap-out
failures.

Overall, option a) looks preferable to me, especially if the arm64
limitation is going to be fixed in the long run.

>
> b) Only count cgroup swap events when they are actually due to a
>    cgroup's own limit. Exclude failures that are due to physical swap
>    shortage or other system-level conditions (like !THP_SWAP). Also
>    count them at the level where the limit is configured, which may be
>    above the local cgroup that holds the page-to-be-swapped.
>
>    This is in line with how memory.swap.high, memory.high and
>    memory.max events are counted.
>
>    However, it's a change in documented behavior.

I'm not sure about this option: I can easily imagine a setup with
memcg-specific swap space, which would require setting an artificial
memory.swap.max just to get the fail counter working. On the other hand,
that's not a deal breaker.

Thanks!
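
For reference, here is a rough userspace sketch (not part of the patch) of how
the counters in the report above could be read to tell limit-driven failures
from the rest. The helper names parse_swap_events() and
failures_not_from_limit() are made up for illustration. The subtraction relies
on the assumption that every "max" event is also counted as a "fail" event, so
treat the result as an estimate only.

```python
def parse_swap_events(text):
    """Parse the flat 'key value' lines of memory.swap.events into a dict."""
    events = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            events[parts[0]] = int(parts[1])
    return events


def failures_not_from_limit(events):
    """Estimate failures that were not caused by the cgroup's own swap limit.

    Assumption (hypothetical heuristic): each 'max' event is also counted
    under 'fail', so the difference approximates system-level failures such
    as THP swap-out fallbacks or physical swap shortage.
    """
    return max(events.get("fail", 0) - events.get("max", 0), 0)


# Sample values copied from the memory.swap.events output quoted above.
sample = "high 0\nmax 0\nfail 6218\n"
events = parse_swap_events(sample)
print(events, failures_not_from_limit(events))
```

With the quoted numbers, "max" is zero, so all of the reported failures fall
into the "not due to the cgroup limit" bucket, which is the confusing case
discussed in this thread.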