Date: Thu, 6 Oct 2022 11:32:45 -0400
From: Johannes Weiner
To: Yosry Ahmed
Cc: Yu Zhao, Andrew Morton, Michal Hocko, Roman Gushchin, Shakeel Butt, Muchun Song, Greg Thelen, David Rientjes, Cgroups, Linux-MM
Subject: Re: [PATCH v2] mm/vmscan: check references from all memcgs for swapbacked memory
References: <20221005173713.1308832-1-yosryahmed@google.com>

On Thu, Oct 06, 2022 at 12:30:45AM -0700, Yosry Ahmed wrote:
> On Wed, Oct 5, 2022 at 9:19 PM Johannes Weiner wrote:
> >
> > On Wed, Oct 05, 2022 at 03:13:38PM -0600, Yu Zhao wrote:
> > > On Wed, Oct 5, 2022 at 3:02 PM Yosry Ahmed wrote:
> > > >
> > > > On Wed, Oct 5, 2022 at 1:48 PM Yu Zhao wrote:
> > > > >
> > > > > On Wed, Oct 5, 2022 at 11:37 AM Yosry Ahmed wrote:
> > > > > >
> > > > > > During page/folio reclaim, we check if a folio is referenced using
> > > > > > folio_referenced() to avoid reclaiming folios that have been recently
> > > > > > accessed (hot memory). The rationale is that this memory is likely to be
> > > > > > accessed soon, and hence reclaiming it will cause a refault.
> > > > > >
> > > > > > For memcg reclaim, we currently only check accesses to the folio from
> > > > > > processes in the subtree of the target memcg. This behavior was
> > > > > > originally introduced by commit bed7161a519a ("Memory controller: make
> > > > > > page_referenced() cgroup aware") a long time ago. Back then, refaulted
> > > > > > pages would get charged to the memcg of the process that was faulting them
> > > > > > in. It made sense to only consider accesses coming from processes in the
> > > > > > subtree of target_mem_cgroup. If a page was charged to memcg A but only
> > > > > > being accessed by a sibling memcg B, we would reclaim it if memcg A
> > > > > > is the reclaim target. memcg B can then fault it back in and get charged
> > > > > > for it appropriately.
> > > > > >
> > > > > > Today, this behavior still makes sense for file pages. However, unlike
> > > > > > file pages, when swapbacked pages are refaulted they are charged to the
> > > > > > memcg that was originally charged for them during swapping out. Which
> > > > > > means that if a swapbacked page is charged to memcg A but only used by
> > > > > > memcg B, and we reclaim it from memcg A, it would simply be faulted back
> > > > > > in and charged again to memcg A once memcg B accesses it. In that sense,
> > > > > > accesses from all memcgs matter equally when considering if a swapbacked
> > > > > > page/folio is a viable reclaim target.
> > > > > >
> > > > > > Modify folio_referenced() to always consider accesses from all memcgs if
> > > > > > the folio is swapbacked.
> > > > >
> > > > > It seems to me this change can potentially increase the number of
> > > > > zombie memcgs. Any risk assessment done on this?
> > > >
> > > > Do you mind elaborating the case(s) where this could happen? Is this
> > > > the cgroup v1 case in mem_cgroup_swapout() where we are reclaiming
> > > > from a zombie memcg and swapping out would let us move the charge to
> > > > the parent?
> > >
> > > The scenario is quite straightforward: for a page charged to memcg A
> > > and also actively used by memcg B, if we don't ignore the access from
> > > memcg B, we won't be able to reclaim it after memcg A is deleted.
> >
> > This patch changes the behavior of limit-induced reclaim. There is no
> > limit reclaim on A after it's been deleted. And parental/global
> > reclaim has always recognized outside references.
>
> Do you mind elaborating on the parental reclaim part?
>
> I am looking at the code and it looks like memcg reclaim of a parent
> (limit-induced or proactive) will only consider references coming from
> its subtree, even when reclaiming from its dead children. It looks
> like as long as sc->target_mem_cgroup is set, we ignore outside
> references (relative to sc->target_mem_cgroup).

Yes, I was referring to outside of A.

As of today, any siblings of A can already pin its memory after it's
dead. I suppose your patch would add cousins to that list. It doesn't
seem like a categorial difference to me.

> If that is true, maybe we want to keep ignoring outside references for
> swap-backed pages if the folio is charged to a dead memcg? My
> understanding is that in this case we will uncharge the page from the
> dead memcg and charge the swapped entry to the parent, reducing the
> number of refs on the dead memcg. Without this check, this patch might
> prevent the charge from being moved to the parent in this case. WDYT?

I don't think it's worth it. Keeping the semantics simple and behavior
predictable is IMO more valuable.

It also wouldn't fix the scrape-before-rmdir issue Yu points out,
which I think is the more practical concern. In light of that, it
might be best to table the patch for now. (Until we have
reparent-on-delete for anon and file pages...)
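
To make the sibling/cousin point concrete, here is a toy userspace
model of the filter being discussed. It is only a sketch and not the
kernel's folio_referenced(): struct memcg, in_subtree(),
reference_counts() and the all_memcgs_for_swapbacked flag are made-up
names for illustration, and the hierarchy (A and B under P, C under
Q) is an assumed example layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a memcg: only the parent link matters here. */
struct memcg {
	const char *name;
	struct memcg *parent;
};

/* Assumed example hierarchy: root -> P -> {A, B}, root -> Q -> C. */
static struct memcg root = { "root", NULL };
static struct memcg P = { "P", &root };
static struct memcg Q = { "Q", &root };
static struct memcg A = { "A", &P };	/* folio is charged here */
static struct memcg B = { "B", &P };	/* sibling of A */
static struct memcg C = { "C", &Q };	/* cousin of A */

/* True if 'cg' is 'subtree_root' itself or one of its descendants. */
static bool in_subtree(const struct memcg *cg, const struct memcg *subtree_root)
{
	assert(subtree_root);
	for (; cg; cg = cg->parent)
		if (cg == subtree_root)
			return true;
	return false;
}

/*
 * Toy model of the filter (illustrative only, not kernel code).
 *
 * target == NULL models global reclaim, which counts every reference.
 * all_memcgs_for_swapbacked models the behavior the patch proposes:
 * for swapbacked folios, count references from any memcg.
 * Otherwise only references from inside the target's subtree count.
 */
static bool reference_counts(const struct memcg *accessor,
			     const struct memcg *target,
			     bool swapbacked,
			     bool all_memcgs_for_swapbacked)
{
	if (!target)
		return true;
	if (swapbacked && all_memcgs_for_swapbacked)
		return true;
	return in_subtree(accessor, target);
}
```

With reclaim targeting P in this model, a reference from sibling B
counts with or without the proposed change, which is the "siblings
can already pin" case. A reference from cousin C starts to count only
when the folio is swapbacked and the flag is set. File folios are
unaffected either way.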