From: Yu Zhao <yuzhao@google.com>
Date: Fri, 18 Aug 2023 15:51:59 -0600
Subject: Re: [PATCH v2] workingset: ensure memcg is valid for recency check
To: Shakeel Butt
Cc: Yosry Ahmed, Johannes Weiner, Nhat Pham, akpm@linux-foundation.org,
 kernel-team@meta.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 stable@vger.kernel.org
In-Reply-To: <20230818213502.nsur4qbs7t7nxg54@google.com>
References: <20230818134906.GA138967@cmpxchg.org>
 <20230818173544.GA142196@cmpxchg.org>
 <20230818183538.GA142974@cmpxchg.org>
 <20230818213502.nsur4qbs7t7nxg54@google.com>
On Fri, Aug 18, 2023 at 3:35 PM Shakeel Butt wrote:
>
> On Fri, Aug 18, 2023 at 11:44:45AM -0700, Yosry Ahmed wrote:
> > On Fri, Aug 18, 2023 at 11:35 AM Johannes Weiner wrote:
> > >
> > > On Fri, Aug 18, 2023 at 10:45:56AM -0700, Yosry Ahmed wrote:
> > > > On Fri, Aug 18, 2023 at 10:35 AM Johannes Weiner wrote:
> > > > > On Fri, Aug 18, 2023 at 07:56:37AM -0700, Yosry Ahmed wrote:
> > > > > > If this happens it seems possible for this to happen:
> > > > > >
> > > > > > cpu #1                              cpu#2
> > > > > > css_put()
> > > > > > /* css_free_rwork_fn is queued */
> > > > > >                                     rcu_read_lock()
> > > > > >                                     mem_cgroup_from_id()
> > > > > > mem_cgroup_id_remove()
> > > > > >                                     /* access memcg */
> > > > >
> > > > > I don't quite see how that'd possible. IDR uses rcu_assign_pointer()
> > > > > during deletion, which inserts the necessary barriering. My
> > > > > understanding is that this should always be safe:
> > > > >
> > > > > rcu_read_lock()          (writer serialization, in this case ref count == 0)
> > > > > foo = idr_find(x)        idr_remove(x)
> > > > > if (foo)                 kfree_rcu(foo)
> > > > >   LOAD(foo->bar)
> > > > > rcu_read_unlock()
> > > >
> > > > How does a barrier inside IDR removal protect against the memcg being
> > > > freed here though?
> > > >
> > > > If css_put() is executed out-of-order before mem_cgroup_id_remove(),
> > > > the memcg can be freed even before mem_cgroup_id_remove() is called,
> > > > right?
> > >
> > > css_put() can start earlier, but it's not allowed to reorder the rcu
> > > callback that frees past the rcu_assign_pointer() in idr_remove().
> > >
> > > This is what RCU and its access primitives guarantees. It ensures that
> > > after "unpublishing" the pointer, all concurrent RCU-protected
> > > accesses to the object have finished, and the memory can be freed.
> >
> > I am not sure I understand, this is the scenario I mean:
> >
> > cpu#1                  cpu#2                  cpu#3
> > css_put()
> > /* schedule free */
> >                        rcu_read_lock()
> >                                               idr_remove()
> >                        mem_cgroup_from_id()
> > /* free memcg */
> >                        /* use memcg */
> >
> > If I understand correctly you are saying that the scheduled free
> > callback cannot run before idr_remove() due to the barrier in there,
> > but it can run after the rcu_read_lock() in cpu #2 because it was
> > scheduled before that RCU critical section started, right?
>
> Isn't there a simpler explanation.
> The memcg whose id is stored in the shadow entry has been freed and
> there is an ongoing new memcg allocation which by chance has acquired
> the same id and has not yet initialized completely. More specifically
> the new memcg creation is between css_alloc() and init_and_link_css()
> and there is a refault for the shadow entry holding that id.

I think so, and this fix would just crash at tryget() instead when
hitting the problem.