From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yang Shi <shy828301@gmail.com>
Date: Fri, 31 May 2024 17:01:18 -0700
Subject: Re: [linus:master] [mm] efa7df3e3b: kernel_BUG_at_include/linux/page_ref.h
To: Peter Xu
Cc: David Hildenbrand, kernel test robot, Jason Gunthorpe, Vivek Kasireddy,
 Rik van Riel, oe-lkp@lists.linux.dev, lkp@intel.com,
 linux-kernel@vger.kernel.org, Andrew Morton, Matthew Wilcox,
 Christopher Lameter, linux-mm@kvack.org
References: <202405311534.86cd4043-lkp@intel.com> <890e5a79-8574-4a24-90ab-b9888968d5e5@redhat.com>
Content-Type: text/plain; charset="UTF-8"
On Fri, May 31, 2024 at 4:25 PM Peter Xu wrote:
>
> On Fri, May 31, 2024 at 07:46:41PM +0200, David Hildenbrand wrote:
> > try_grab_folio()->try_get_folio()->folio_ref_try_add_rcu()
> >
> > Is called (mm-unstable) from:
> >
> > (1) gup_fast functions, here IRQs are disabled
> > (2) gup_hugepte(), possibly problematic
> > (3) memfd_pin_folios(), possibly problematic
> > (4) __get_user_pages(), likely problematic
> >
> > (1) should be fine.
> >
> > (2) is possibly problematic on the !fast path. If so, due to commit
> > a12083d721d7 ("mm/gup: handle hugepd for follow_page()")? CCing Peter.
> >
> > (3) is possibly wrong. CCing Vivek.
> >
> > (4) is what we hit here.
>
> I guess it was overlooked because try_grab_folio() didn't have any comment
> or implication on RCU or IRQ internal helpers being used, hence a bit
> confusing. E.g. it has a different context requirement than try_grab_page(),
> even though they look like sister functions. It might be helpful to have a
> better name, something like try_grab_folio_rcu() in this case.
>
> Btw, none of the above cases (2-4) have a real bug; we're just looking at
> some way to avoid triggering the sanity check, am I right? I hope besides
> the host splash I didn't overlook any other side effect this issue would
> cause, and the splash IIUC should so far be benign, as either gup slow
> (2,4) or the newly added memfd_pin_folios() (3) looks like it has the
> refcount stabilized anyway.
>
> Yang's patch in the other email looks sane to me, just that we'd then add
> quite some code just to avoid this sanity check in paths 2-4, which seems
> like a slight overkill.
>
> One thing I'm thinking is whether folio_ref_try_add_rcu() can get rid of
> its RCU limitation. It boils down to whether we can use atomic_add_unless()
> on a TINY_RCU / UP setup too. I mean, we do plenty of similar things
> (get_page_unless_zero, etc.) in generic code and I don't understand why
> here we need to treat folio_ref_try_add_rcu() specially.
>
> IOW, the assertion we added here:
>
>     VM_BUG_ON(!in_atomic() && !irqs_disabled());
>
> is because we need atomicity of the sequence below:
>
>     VM_BUG_ON_FOLIO(folio_ref_count(folio) == 0, folio);
>     folio_ref_add(folio, count);
>
> But atomic ops avoid it.

Yeah, I can't think of why atomic can't do it either.
But is it written in this way because we want to catch the refcount == 0
case, since it means a severe bug? Did we miss something?

> This chunk of code was (mostly) originally added in 2008 in commit
> e286781d5f2e ("mm: speculative page references").
>
> In short, I'm wondering whether something like below would make sense and
> be easier:
>
> ===8<===
> diff --git a/include/linux/page_ref.h b/include/linux/page_ref.h
> index 1acf5bac7f50..c89a67d239d1 100644
> --- a/include/linux/page_ref.h
> +++ b/include/linux/page_ref.h
> @@ -258,26 +258,9 @@ static inline bool folio_try_get(struct folio *folio)
>  	return folio_ref_add_unless(folio, 1, 0);
>  }
>
> -static inline bool folio_ref_try_add_rcu(struct folio *folio, int count)
> -{
> -#ifdef CONFIG_TINY_RCU
> -	/*
> -	 * The caller guarantees the folio will not be freed from interrupt
> -	 * context, so (on !SMP) we only need preemption to be disabled
> -	 * and TINY_RCU does that for us.
> -	 */
> -# ifdef CONFIG_PREEMPT_COUNT
> -	VM_BUG_ON(!in_atomic() && !irqs_disabled());
> -# endif
> -	VM_BUG_ON_FOLIO(folio_ref_count(folio) == 0, folio);
> -	folio_ref_add(folio, count);
> -#else
> -	if (unlikely(!folio_ref_add_unless(folio, count, 0))) {
> -		/* Either the folio has been freed, or will be freed. */
> -		return false;
> -	}
> -#endif
> -	return true;
> +static inline bool folio_ref_try_add(struct folio *folio, int count)
> +{
> +	return folio_ref_add_unless(folio, count, 0);
>  }
>
>  /**
> @@ -305,7 +288,7 @@ static inline bool folio_ref_try_add_rcu(struct folio *folio, int count)
>   */
>  static inline bool folio_try_get_rcu(struct folio *folio)
>  {
> -	return folio_ref_try_add_rcu(folio, 1);
> +	return folio_ref_try_add(folio, 1);
>  }
>
>  static inline int page_ref_freeze(struct page *page, int count)
> diff --git a/mm/gup.c b/mm/gup.c
> index e17466fd62bb..17f89e8d31f1 100644
> --- a/mm/gup.c
> +++ b/mm/gup.c
> @@ -78,7 +78,7 @@ static inline struct folio *try_get_folio(struct page *page, int refs)
>  	folio = page_folio(page);
>  	if (WARN_ON_ONCE(folio_ref_count(folio) < 0))
>  		return NULL;
> -	if (unlikely(!folio_ref_try_add_rcu(folio, refs)))
> +	if (unlikely(!folio_ref_try_add(folio, refs)))
>  		return NULL;
>
>  	/*
> ===8<===
>
> So instead of adding new code, we fix it by removing some. There might be
> some implication on the TINY_RCU / UP config from using atomic_add_unless()
> to replace one atomic_add(), but I'm not sure whether that's a major issue.
>
> The benefit is that try_get_folio() then doesn't need a renaming, because
> the RCU implication just went away.
>
> Thanks,
>
> --
> Peter Xu
>