Date: Mon, 14 Jul 2025 11:22:47 -0400
From: Johannes Weiner
To: Roman Gushchin
Cc: Andrew Morton, Shakeel Butt, Lorenzo Stoakes, Michal Hocko,
    David Hildenbrand, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] mm: skip lru_note_cost() when scanning only file or anon
Message-ID: <20250714152247.GB991@cmpxchg.org>
References: <20250711155044.137652-1-roman.gushchin@linux.dev>
    <20250711172028.GA991@cmpxchg.org> <8734b2vcgr.fsf@linux.dev>
In-Reply-To: <8734b2vcgr.fsf@linux.dev>

On Fri, Jul 11, 2025 at 10:55:48AM -0700, Roman Gushchin wrote:
> Johannes Weiner writes:
> > The caveat with this patch is that, aside from the static noswap
> > scenario, modes can switch back and forth abruptly or even overlap.
> >
> > So if you leave a pressure scenario and go back to cache trimming, you
> > will no longer age the cost information. The next spike could
> > be starting out with potentially quite stale information.
> >
> > Or say proactive reclaim recently already targeted anon, and there
> > were rotations and pageouts; that would be useful data for a reactive
> > reclaimer doing work at around the same time, or shortly thereafter.
>
> Agree, but at the same time it's possible to come up with a scenario
> where it's not good.
>
>         A
>        / \
>       B   C  memory.max=X
>          / \
>         D   E
>
> Let's say we have a cgroup structure like this, we apply a lot
> of proactive anon pressure on E, then the pressure on D from
> C's limit will be biased towards file without a good reason.

No, this is on purpose. D and E are not independent. They're in the
same memory domain, C. So if you want to reclaim C, and a subset of
its anon has already been pressured to resistance, then a larger part
of the reclaim candidates in C will need to come from file.

> Or as in my case, if a cgroup has memory.memsw.limit set and is
> thrashing, does it make sense to bias the rest of the system
> into anon reclaim? The recorded cost can be really large.
>
> >
> > So for everything but the static noswap case, the patch makes me
> > nervous. And I'm not sure it actually helps in the cases where it
> > would matter the most.
>
> I understand, but do you think it's acceptable with some additional
> conditions: e.g. narrow it down to only very high scanning priorities?
> Or !sc.may_swap case?
>
> In the end, we have the following code in get_scan_count(), so at
> least on priority 0 we ignore all costs anyway.
>
> 	if (!sc->priority && swappiness) {
> 		scan_balance = SCAN_EQUAL;
> 		goto out;
> 	}
>
> Wdyt?

I think relitigating a proven aging mechanism after half a decade in
production is going to be tough and require extensive testing. If your
primary problem is the cost of the locking, I'd focus on that.

> > It might make more sense to look into the cost (ha) of the cost
> > recording itself. Can we turn it into a vmstat item? That would make
> > it lockless, would get rstat batching up the cgroup tree etc. This
> > doesn't need to be 100% precise and race free after all.
>
> Idk, maybe yes, but rstat flushing was a source of the issues as well
> and now it's mostly ratelimited, so I'm concerned that because of that
> we'll have sudden changes in the reclaim behavior every 2 seconds.

That's not a new hazard, though. prepare_scan_control() decisions are
already subject to this, as is the lru cost aging itself.
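
To make the tradeoff in that last exchange concrete, here is a rough
userspace sketch of batched, mostly lockless cost recording. It is not
kernel code and not the vmstat or rstat API: the names cost_counter,
cost_note(), cost_read() and cost_read_flushed(), as well as the
NR_CPUS and COST_BATCH values, are all made up for illustration, and
"per-CPU" ownership is only modeled by an array index. It shows why a
cheap read can lag the true sum until a flush folds the pending deltas.

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_CPUS    4   /* hypothetical CPU count for this model */
#define COST_BATCH 32  /* hypothetical per-CPU fold threshold */

struct cost_counter {
	atomic_long total;         /* shared, approximate sum */
	long pcpu_delta[NR_CPUS];  /* pending cost, written only by its owner */
};

/* Record cost on one CPU; touch the shared total only once per batch. */
static void cost_note(struct cost_counter *c, int cpu, long cost)
{
	long d;

	assert(cpu >= 0 && cpu < NR_CPUS);
	d = c->pcpu_delta[cpu] + cost;
	if (d >= COST_BATCH) {
		atomic_fetch_add_explicit(&c->total, d, memory_order_relaxed);
		d = 0;
	}
	c->pcpu_delta[cpu] = d;
}

/*
 * Cheap read without flushing. For non-negative costs this lags the
 * true sum by less than NR_CPUS * COST_BATCH.
 */
static long cost_read(struct cost_counter *c)
{
	return atomic_load_explicit(&c->total, memory_order_relaxed);
}

/* Precise read: fold every pending per-CPU delta first (the "flush"). */
static long cost_read_flushed(struct cost_counter *c)
{
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		atomic_fetch_add_explicit(&c->total, c->pcpu_delta[cpu],
					  memory_order_relaxed);
		c->pcpu_delta[cpu] = 0;
	}
	return cost_read(c);
}
```

In this model a ratelimited flush means readers mostly go through
cost_read(), so the value they see moves in steps rather than
continuously. That is the staleness being debated above, and its size
is bounded by the batch threshold times the number of CPUs.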