Date: Tue, 23 Jan 2024 11:48:19 -0500
From: Johannes Weiner
To: "T.J. Mercier"
Cc: Michal Hocko, Roman Gushchin, Shakeel Butt, Muchun Song, Andrew Morton, android-mm@google.com, yuzhao@google.com, yangyifei03@kuaishou.com, cgroups@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] Revert "mm:vmscan: fix inaccurate reclaim during proactive reclaim"
Message-ID: <20240123164819.GB1745986@cmpxchg.org>
References: <20240121214413.833776-1-tjmercier@google.com>

The revert isn't a straight-forward solution.

The patch you're reverting fixed conventional reclaim and broke
MGLRU. Your revert fixes MGLRU and breaks conventional reclaim.

On Tue, Jan 23, 2024 at 05:58:05AM -0800, T.J. Mercier wrote:
> They both are able to make progress. The main difference is that a
> single iteration of try_to_free_mem_cgroup_pages with MGLRU ends soon
> after it reclaims nr_to_reclaim, and before it touches all memcgs. So
> a single iteration really will reclaim only about SWAP_CLUSTER_MAX-ish
> pages with MGLRU. WIthout MGLRU the memcg walk is not aborted
> immediately after nr_to_reclaim is reached, so a single call to
> try_to_free_mem_cgroup_pages can actually reclaim thousands of pages
> even when sc->nr_to_reclaim is 32. (I.E. MGLRU overreclaims less.)
> https://lore.kernel.org/lkml/20221201223923.873696-1-yuzhao@google.com/

Is that a feature or a bug?

 * 1. Memcg LRU only applies to global reclaim, and the round-robin incrementing
 *    of their max_seq counters ensures the eventual fairness to all eligible
 *    memcgs. For memcg reclaim, it still relies on mem_cgroup_iter().

If it bails out exactly after nr_to_reclaim, it'll overreclaim
less. But with steady reclaim in a complex subtree, it will always hit
the first cgroup returned by mem_cgroup_iter() and then bail.
This seems like a fairness issue.

We should figure out what the right method for balancing fairness
with overreclaim is, regardless of reclaim implementation. Because
having two different approaches and reverting dependent things back
and forth doesn't make sense.

Using an LRU to rotate through memcgs over multiple reclaim cycles
seems like a good idea. Why is this specific to MGLRU? Shouldn't this
be a generic piece of memcg infrastructure?

Then there is the question of why there is an LRU for global reclaim,
but not for subtree reclaim. Reclaiming a container with multiple
subtrees would benefit from the fairness provided by a container-level
LRU order just as much; having fairness for root but not for subtrees
would produce different reclaim and pressure behavior, and can cause
regressions when moving a service from bare-metal into a container.

Figuring out these differences and converging on a method for cgroup
fairness would be the better way of fixing this. Because of the
regression risk to the default reclaim implementation, I'm inclined to
NAK this revert.
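
To make the fairness-versus-overreclaim tradeoff concrete, here is a
rough toy model in Python. It is not kernel code and does not mirror
mem_cgroup_iter() or the MGLRU implementation; every name in it
(reclaim_cycle, simulate, the policy strings) is hypothetical, and it
assumes each visited cgroup gives up exactly one batch of pages. It
compares a full walk, an early bail that always starts at the first
cgroup, and an early bail whose starting point rotates across cycles.

```python
# Toy model only: NOT kernel code. All names below are hypothetical
# and exist purely to illustrate the argument in this mail.


def reclaim_cycle(pages_taken, start, nr_to_reclaim, batch, bail_early):
    """Walk the cgroups once, starting at index `start`.

    Each visited cgroup gives up `batch` pages (an assumption of this
    model). With bail_early=True the walk stops as soon as the target
    is met; otherwise every cgroup is visited regardless of the target.
    `pages_taken` is a per-cgroup counter updated in place.
    Returns the number of pages reclaimed in this cycle.
    """
    n = len(pages_taken)
    reclaimed = 0
    for step in range(n):
        if bail_early and reclaimed >= nr_to_reclaim:
            break
        idx = (start + step) % n
        pages_taken[idx] += batch
        reclaimed += batch
    return reclaimed


def simulate(policy, n_cgroups=4, cycles=8, nr_to_reclaim=32, batch=32):
    """Run `cycles` reclaim cycles; return (per-cgroup pages, total pages)."""
    pages_taken = [0] * n_cgroups
    total = 0
    for cycle in range(cycles):
        if policy == "full_walk":
            # Never abort the walk: fair, but overshoots the target.
            total += reclaim_cycle(pages_taken, 0, nr_to_reclaim, batch, False)
        elif policy == "bail_fixed_start":
            # Abort at the target, always restarting from the first cgroup.
            total += reclaim_cycle(pages_taken, 0, nr_to_reclaim, batch, True)
        elif policy == "bail_rotating_start":
            # Abort at the target, but rotate the starting cgroup per cycle.
            start = cycle % n_cgroups
            total += reclaim_cycle(pages_taken, start, nr_to_reclaim, batch, True)
        else:
            raise ValueError(f"unknown policy: {policy}")
    return pages_taken, total


if __name__ == "__main__":
    for policy in ("full_walk", "bail_fixed_start", "bail_rotating_start"):
        per_cgroup, total = simulate(policy)
        print(f"{policy:20s} per-cgroup={per_cgroup} total={total}")
```

With the defaults (4 cgroups, 8 cycles, a target of 32 pages per
cycle), the requested total is 256 pages. In this model the full walk
takes 1024 pages spread evenly, the fixed-start bail takes 256 pages
all from the first cgroup, and the rotating bail takes 256 pages
spread evenly. Whether rotation state of that kind belongs in generic
memcg infrastructure is the open question raised above.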