From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 29CB2C678D5 for ; Wed, 8 Mar 2023 20:16:35 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 4E6E96B007B; Wed, 8 Mar 2023 15:16:35 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 49459280001; Wed, 8 Mar 2023 15:16:35 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 335386B007E; Wed, 8 Mar 2023 15:16:35 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id 205266B007B for ; Wed, 8 Mar 2023 15:16:35 -0500 (EST) Received: from smtpin13.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id EC31B1A01F8 for ; Wed, 8 Mar 2023 20:16:34 +0000 (UTC) X-FDA: 80546838708.13.BC4F4CD Received: from mail-qt1-f177.google.com (mail-qt1-f177.google.com [209.85.160.177]) by imf14.hostedemail.com (Postfix) with ESMTP id EE031100018 for ; Wed, 8 Mar 2023 20:16:31 +0000 (UTC) Authentication-Results: imf14.hostedemail.com; dkim=pass header.d=cmpxchg-org.20210112.gappssmtp.com header.s=20210112 header.b=yag6FV13; spf=pass (imf14.hostedemail.com: domain of hannes@cmpxchg.org designates 209.85.160.177 as permitted sender) smtp.mailfrom=hannes@cmpxchg.org; dmarc=pass (policy=none) header.from=cmpxchg.org ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1678306592; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=+jvBsh7LbTE3BBTYqkpZsR8VusiDvh1HfdbNeXAcFCM=; b=XVzsxj3MIaFXeHR9l58B7oM8UTTSYWa4IvBSCgpBqiY17rWILS7XsoClcaawu5dFe7msh2 ZKge9dMS459G8MPnAKYZ1xAv/4tjGrq1Rdtn+Dimr+1yiyGyEx3PElt5oy3hzILry5hcry dixXM7tKlBnB1f/xcu5utcvh1trNJeo= ARC-Authentication-Results: i=1; imf14.hostedemail.com; dkim=pass header.d=cmpxchg-org.20210112.gappssmtp.com header.s=20210112 header.b=yag6FV13; spf=pass (imf14.hostedemail.com: domain of hannes@cmpxchg.org designates 209.85.160.177 as permitted sender) smtp.mailfrom=hannes@cmpxchg.org; dmarc=pass (policy=none) header.from=cmpxchg.org ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1678306592; a=rsa-sha256; cv=none; b=MwB/tv4Z3P2BhtsX2C8yrxtk0q7zA3yJ6KK2L+qhFlkPYXgfHhwki6pa5rPA/4bYprr35B p4Sxy8jNxLXZv/x8+5ygMt3SRzQSLauRbPzYR8jZThnax0D3OkfYUrqDwXI0VtWASZQdz8 3CDmXspiVpUDnCJfFJ5YLOsYY9vmb+k= Received: by mail-qt1-f177.google.com with SMTP id w23so19465520qtn.6 for ; Wed, 08 Mar 2023 12:16:31 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cmpxchg-org.20210112.gappssmtp.com; s=20210112; t=1678306591; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:from:date:from:to :cc:subject:date:message-id:reply-to; bh=+jvBsh7LbTE3BBTYqkpZsR8VusiDvh1HfdbNeXAcFCM=; b=yag6FV13Ig3iSSTEMkdIaAry85Fcek0dOQvKFYaXiOTLwD+yt6xwMobzPBqyyGuwnE x6e9O75UrWI/EhG/hKB+SoRid6lJd6jsVqgogeJ+M/KLtcDOy+1wiyLRFJZdMhJZuy67 4cFEH9iJ9rt78ebthg4KuViKITwgGrKIBZfVtg4jejfD3NJfBZlWyLRWfKCiUUDJhk3M AUZ91ZRAELBUgAwYSmxyn+R9LKXstzdhXdS3aCt/AcIFxXpwEhx/C4JnoPSVrsWtp56O Pw7f9hyfXpo61FvfHLR2VrxrczLUh00iJJPQPGLxSGZhaRQbITAln0l3Io0L2LGlMTqm z49A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1678306591; h=in-reply-to:content-transfer-encoding:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=+jvBsh7LbTE3BBTYqkpZsR8VusiDvh1HfdbNeXAcFCM=; b=mA3MZjE+/6fq1UG+lGwCX6qXdv62dNLd1TwiIZcMdjT433Z4V5d0cHm6xH3kj4gaE1 EIvITkdekcQkX+aJ2rgIoyMXcjCXwAEsXOjRNYs1AhDDSL8jS3wB/Qpg7EqjkhsjgNOk PBbfy/dTuHchfeKsVDyvHzVZVY1/sC7BAWbD9UZtuseG7MBuNyQnMo87AEAcS/qHAeSo +JoMej9uUr1ULjr8WtbJSrMNyNn043wiuP9lX0ffnt7MO534mdY7x/N2K6YEFfiKS+uN 4rJbWMoYQ/Nclye8IR5PBYoHfoDMqkp7028POCxMDCwZmMV10JokO/rLofrBEkJXVELL 3zaQ== X-Gm-Message-State: AO0yUKXbmt08DGYSCu7RNi3F0Qg5v4nJjJorWvsdwPPPG+TUPkz1WvMC 6Rc4EQ93iPphppmVex6hp5JEwA== X-Google-Smtp-Source: AK7set/wqHwokeLnuFtipo28PT4sjFacv/VxOQnhvXvVV+Nsg7Ki3HCu1cum2w+Q3/hSnb/ROYu+7w== X-Received: by 2002:ac8:4e94:0:b0:3bf:c085:e953 with SMTP id 20-20020ac84e94000000b003bfc085e953mr25211678qtp.24.1678306590929; Wed, 08 Mar 2023 12:16:30 -0800 (PST) Received: from localhost ([2620:10d:c091:400::5:d32c]) by smtp.gmail.com with ESMTPSA id m8-20020aed27c8000000b003bfc355c3a6sm12285754qtg.80.2023.03.08.12.16.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 08 Mar 2023 12:16:30 -0800 (PST) Date: Wed, 8 Mar 2023 15:16:29 -0500 From: Johannes Weiner To: Yosry Ahmed Cc: Alexander Viro , "Darrick J. Wong" , Christoph Lameter , David Rientjes , Joonsoo Kim , Vlastimil Babka , Roman Gushchin , Hyeonggon Yoo <42.hyeyoo@gmail.com>, "Matthew Wilcox (Oracle)" , Miaohe Lin , David Hildenbrand , Peter Xu , NeilBrown , Shakeel Butt , Michal Hocko , linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-xfs@vger.kernel.org, linux-mm@kvack.org Subject: Re: [PATCH v1 0/2] Ignore non-LRU-based reclaim in memcg reclaim Message-ID: <20230308201629.GB476158@cmpxchg.org> References: <20230228085002.2592473-1-yosryahmed@google.com> <20230308160056.GA414058@cmpxchg.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: X-Rspamd-Server: rspam07 X-Rspamd-Queue-Id: EE031100018 X-Rspam-User: X-Stat-Signature: 1f5ob5owjp9to8j1nmw3pqts4bifduxe X-HE-Tag: 1678306591-298658 X-HE-Meta: U2FsdGVkX18gA51pJWVAunSJFVVLXG3Mrq2wN646qEdCVmA9JBHT6XUxlt3uBd6hrEk79uAbZE7PSquCy8Z5ZcFSmgadKCz6otCynBOYK9n0H9iUzloV5IeXMp9xY8+MIxoCwQ9Jb+zQUn65aiWCQwjd4ySo96obeH12JhNenJB1pMiYHY+C27jQD99wn54ugpRLzWmBGEOBWVGjDWKrnAO1ZuIAqvQ7YtmcP+ywKsU4bPvvMKCI6EnL71aNAr92+vjMeA/JxKwsoE7Ne0aGYdj8WP2VPq8DjR5j0s6awAJDDlVAp1X5l1BL02v2lj118VOxSIZ5U5+cq6FXd6JJKezdKTAx55Qq58gkjHGbx9e8skz1FnOPS4qUhMaJhD9qnT8VtH9dpZTY0BlZYVm27gERVRaqBf/EJsB7JgGixhxxzuzei8/Hwy/FynnoIA3r7W53/s881KwCXAvbF8zYi8HOjr/QqdcCfTDevR/+BZsNMD47S97ieFYQ+i3tw2IdSN3Lj8EZRBBikRV3Uc+rdL+KA7KRbJh3hW5EHG7BtHPc7lVdRB4joK1GxQWTBXziGcQt0IYv4ONaO7offViDOdZXtbkjsKHS3+rq+vFW4H7H/jyzL+vRDtRVXtG7nmfEJicuaJqlSVmoUarN2613DXkuSiAImqd+YNdqM3jkwh6/Y8cBBJcHnh9LTqUzzLRAb5kBlr4Zi72qhErc+SKS8jGmVdHcQl4gZpMWjJ7/Pbn9NX/wnMDTImk7nvFmu1cIOmGtrfba1VkYKh+SIZJwp3a4CuFYj73jZhla4iE+l3fmUHZEAKPgmotckcS3DoZ6uz18Ep91aSzutEAazga0I5jNqyCP2gca2x4tTO3bWTvZYe3Yo5d+lX6XyLWvRGYIm+FHiruVTOHFzdpG+ZVd5e6bHZNp4Lo807+t/PE+J1L+ztGMLcYtz1lkjpSRGKrFb6iM6H6fJXV+AWMX1lv B/K766IJ qf+6LZJDsuuvNP5UDlfp/zKCurs/UCi3XFmnaUozEni7yOM3VxHe+kWdx6WTEZapGdSc3W1KoLkH8rqDvgyZygWiscYAMn343KBii8P5HX1GiSqUaHB0hZKIdizpa4qbtwO1VZHaW/KnBtTX4XZ2ZcW7QzW7KVxCT/14VYQ+XlYDUiNqZ4aU9OsqXnJU8DO1L792sRtCo3WsZ73gWT5Uvu0PaFvg3VVj+c7+KkTXyIeXUbD0N3Pouwl7mRIz69W4odCGZdjfQSaY8/S662LpFTol9lTmeSwcCEihwicZBeChrjy0Sgnqu+IFu58ScuKDY39TkxdEpg+4p27y4yv8jzy0sPnKNTObzORRwn0BIaOm7VpBzvpDKhWqFihs4L8b77sUQyWO88KHFv3Fi/vfa2dWnt4GA8fkZg35hBgk6ZTtS9p19GP9z7OcrcADlaKngUe0xr+DcbhU4osGxltumlue+5DPBCukZ1gLthgWScLhK3ECTuoz3x7YLH9zDf1Mt5pUYio2VujsiL9Cj6NFe/OqTZwcgUeDDDwk7cVQphvSvbmE7xVgq3Iy1jx1rIHpO0f4xGaZD4g4H3elouwqIPCDnDQ== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Wed, Mar 08, 2023 at 10:01:24AM -0800, Yosry Ahmed wrote: > On Wed, Mar 8, 2023 at 8:00 AM Johannes Weiner wrote: > > > > Hello Yosry, > > > > On Tue, Feb 28, 2023 at 08:50:00AM +0000, Yosry Ahmed wrote: > > > Reclaimed pages through other means than LRU-based reclaim are tracked > > > through reclaim_state in struct scan_control, which is stashed in > > > current task_struct. These pages are added to the number of reclaimed > > > pages through LRUs. For memcg reclaim, these pages generally cannot be > > > linked to the memcg under reclaim and can cause an overestimated count > > > of reclaimed pages. This short series tries to address that. > > > > Could you please add more details on how this manifests as a problem > > with real workloads? > > We haven't observed problems in production workloads, but we have > observed problems in testing using memory.reclaim when sometimes a > write to memory.reclaim would succeed when we didn't fully reclaim the > requested amount. This leads to tests flaking sometimes, and we have > to look into the failures to find out if there is a real problem or > not. Ah, that would be great to have in the cover letter. Thanks! Have you also tested this patch against prod without memory.reclaim? Just to make sure there are no problems with cgroup OOMs or similar. There shouldn't be, but, you know... > > > Patch 1 is just refactoring updating reclaim_state into a helper > > > function, and renames reclaimed_slab to just reclaimed, with a comment > > > describing its true purpose. > > > > Looking through the code again, I don't think these helpers add value. > > > > report_freed_pages() is fairly vague. Report to who? It abstracts only > > two lines of code, and those two lines are more descriptive of what's > > happening than the helper is. Just leave them open-coded. > > I agree the name is not great, I am usually bad at naming things and > hope people would point that out (like you're doing now). The reason I > added it is to contain the logic within mm/vmscan.c such that future > changes do not have to add noisy diffs to a lot of unrelated files. If > you have a better name that makes more sense to you please let me > know, otherwise I'm fine dropping the helper as well, no strong > opinions here. I tried to come up with something better, but wasn't happy with any of the options, either. So I defaulted to just leaving it alone :-) It's part of the shrinker API and the name hasn't changed since the initial git import of the kernel tree. It should be fine, churn-wise. > > add_non_vmanscan_reclaimed() may or may not add anything. But let's > > take a step back. It only has two callsites because lrugen duplicates > > the entire reclaim implementation, including the call to shrink_slab() > > and the transfer of reclaim_state to sc->nr_reclaimed. > > > > IMO the resulting code would overall be simpler, less duplicative and > > easier to follow if you added a common shrink_slab_reclaim() that > > takes sc, handles the transfer, and documents the memcg exception. > > IIUC you mean something like: > > void shrink_slab_reclaim(struct scan_control *sc, pg_data_t *pgdat, > struct mem_cgroup *memcg) > { > shrink_slab(sc->gfp_mask, pgdat->node_id, memcg, sc->priority); > > /* very long comment */ > if (current->reclaim_state && !cgroup_reclaim(sc)) { > sc->nr_reclaimed += current->reclaim_state->reclaimed; > current->reclaim_state->reclaimed = 0; > } > } Sorry, I screwed up, that doesn't actually work. reclaim_state is used by buffer heads freed in shrink_folio_list() -> filemap_release_folio(). So flushing the count cannot be shrink_slab() specific. Bummer. Your patch had it right by making a helper for just flushing the reclaim state. But add_non_vmscan_reclaimed() is then also not a great name because these frees are directly from vmscan. Maybe simply flush_reclaim_state()? As far as the name reclaimed_slab, I agree it's not optimal, although 90% accurate ;-) I wouldn't mind a rename to just 'reclaimed'.