From: Yang Shi
Date: Fri, 3 Feb 2023 11:07:30 -0800
Subject: Re: [RFC PATCH] mm: memcontrol: don't account swap failures not due to cgroup limits
To: Roman Gushchin
Cc: Johannes Weiner, Michal Hocko, Shakeel Butt, Tejun Heo, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, Christian Brauner
References: <20230202155626.1829121-1-hannes@cmpxchg.org>

On Fri, Feb 3, 2023 at 11:00 AM Roman Gushchin wrote:
>
> On Thu, Feb 02, 2023 at 10:56:26AM -0500, Johannes Weiner wrote:
> > Christian reports the following situation in a cgroup that doesn't
> > have memory.swap.max configured:
> >
> >   $ cat memory.swap.events
> >   high 0
> >   max 0
> >   fail 6218
> >
> > Upon closer examination, this is an ARM64 machine that doesn't support
> > swapping out THPs.
>
> Do we expect it to be added any time soon or it's caused by some system
> limitations?

AFAIK, it has been supported since 6.0. See commit d0637c505f8a1

>
> > In that case, the first get_swap_page() fails, and
> > the kernel falls back to splitting the THP and swapping the 4k
> > constituents one by one. /proc/vmstat confirms this with a high rate
> > of thp_swpout_fallback events.
> >
> > While the behavior can ultimately be explained, it's unexpected and
> > confusing. I see three choices how to address this:
> >
> > a) Specifically exclude THP fallbacks from being counted, as the
> >    failure is transient and the memory is ultimately swapped.
> >
> >    Arguably, though, the user would like to know if their cgroup's
> >    swap limit is causing high rates of THP splitting during swapout.
>
> I agree, but it's probably better to reflect it in a form of a per-memcg
> thp split failure counter (e.g. in memory.stat), not as swap out failures.
> Overall option a) looks preferable to me. Especially if in the long run
> the arm64 limitation will be fixed.
>
> >
> > b) Only count cgroup swap events when they are actually due to a
> >    cgroup's own limit. Exclude failures that are due to physical swap
> >    shortage or other system-level conditions (like !THP_SWAP). Also
> >    count them at the level where the limit is configured, which may be
> >    above the local cgroup that holds the page-to-be-swapped.
> >
> >    This is in line with how memory.swap.high, memory.high and
> >    memory.max events are counted.
> >
> >    However, it's a change in documented behavior.
>
> I'm not sure about this option: I can easily imagine a setup with a
> memcg-specific swap space, which would require setting an artificial
> memory.swap.max to get the fail counter working. On the other side not a deal
> breaker.
>
> Thanks!
>
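
To make the accounting rule in option b) concrete, here is a rough userspace model in Python. It is only a sketch of the idea described above, not kernel code and not the patch itself: the Cgroup class, try_charge_swap(), the system_swap_available flag and the events dictionary are all hypothetical names invented for illustration. The model records "max" and "fail" only when a cgroup limit rejects the charge, and records them on the cgroup whose limit was hit, which may be an ancestor of the cgroup holding the page.

```python
from dataclasses import dataclass, field
from typing import Optional

UNLIMITED = float("inf")  # models an unset memory.swap.max


@dataclass
class Cgroup:
    """Toy stand-in for a memory cgroup (hypothetical, not a kernel struct)."""
    name: str
    parent: Optional["Cgroup"] = None
    swap_max: float = UNLIMITED   # models memory.swap.max
    swap_usage: int = 0           # pages charged to this subtree
    events: dict = field(default_factory=lambda: {"max": 0, "fail": 0})


def ancestors(cg):
    """Yield cg, then its parent, and so on up to the root."""
    while cg is not None:
        yield cg
        cg = cg.parent


def try_charge_swap(cg, nr_pages, system_swap_available=True):
    """Return True if the swap charge succeeds, False otherwise.

    Events are counted only for failures caused by a cgroup limit,
    and on the level where that limit is configured.
    """
    if not system_swap_available:
        # System-level failure (no swap space, THP swapout unsupported):
        # fail the charge without touching any cgroup counter.
        return False
    for level in ancestors(cg):
        if level.swap_usage + nr_pages > level.swap_max:
            level.events["max"] += 1
            level.events["fail"] += 1
            return False
    # No limit was hit: charge every level of the hierarchy.
    for level in ancestors(cg):
        level.swap_usage += nr_pages
    return True
```

With a limit of 4 pages set on a parent and none on its child, a charge from the child that fails for a system-level reason leaves every counter at zero, while a later charge that would exceed the parent's limit bumps the parent's counters and leaves the child's untouched. This also shows the concern raised above: with no memory.swap.max configured anywhere in the hierarchy, the fail counter in this model never moves.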