From: yangge1116
To: David Hildenbrand, akpm@linux-foundation.org, Matthew Wilcox
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, baolin.wang@linux.alibaba.com, liuzixing@hygon.cn
Subject: Re: [PATCH] mm/gup: don't check page lru flag before draining it
Date: Thu, 6 Jun 2024 09:35:35 +0800
In-Reply-To: <7063920f-963a-4b3e-a3f3-c5cc227bc877@redhat.com>
References: <1717498121-20926-1-git-send-email-yangge1116@126.com> <0d7a4405-9a2e-4bd1-ba89-a31486155233@redhat.com> <776de760-e817-43b2-bd00-8ce96f4e37a8@redhat.com> <7063920f-963a-4b3e-a3f3-c5cc227bc877@redhat.com>

On 2024/6/5 5:53 PM, David Hildenbrand wrote:
> On 05.06.24 11:41, David Hildenbrand wrote:
>> On 05.06.24 03:18, yangge1116 wrote:
>>>
>>>
>>> On 2024/6/4 9:47 PM, David Hildenbrand wrote:
>>>> On 04.06.24 12:48, yangge1116@126.com wrote:
>>>>> From: yangge
>>>>>
>>>>> If a page is added in pagevec, its ref count increases one, remove
>>>>> the page from pagevec decreases one. Page migration requires the
>>>>> page is not referenced by others except page mapping. Before
>>>>> migrating a page, we should try to drain the page from pagevec in
>>>>> case the page is in it, however, folio_test_lru() is not sufficient
>>>>> to tell whether the page is in pagevec or not, if the page is in
>>>>> pagevec, the migration will fail.
>>>>>
>>>>> Remove the condition and drain lru once to ensure the page is not
>>>>> referenced by pagevec.
>>>>
>>>> What you are saying is that we might have a page on which
>>>> folio_test_lru() succeeds, that was added to one of the cpu_fbatches,
>>>> correct?
>>>
>>> Yes
>>>
>>>>
>>>> Can you describe under which circumstances that happens?
>>>>
>>>
>>> If we call folio_activate() to move a page from inactive LRU list to
>>> active LRU list, the page is not only in LRU list, but also in one of
>>> the cpu_fbatches.
>>>
>>> void folio_activate(struct folio *folio)
>>> {
>>>        if (folio_test_lru(folio) && !folio_test_active(folio) &&
>>>            !folio_test_unevictable(folio)) {
>>>            struct folio_batch *fbatch;
>>>
>>>            folio_get(folio);
>>>            //After this, folio is in LRU list, and its ref count have
>>> increased one.
>>>
>>>            local_lock(&cpu_fbatches.lock);
>>>            fbatch = this_cpu_ptr(&cpu_fbatches.activate);
>>>            folio_batch_add_and_move(fbatch, folio, folio_activate_fn);
>>>            local_unlock(&cpu_fbatches.lock);
>>>        }
>>> }
>>
>> Interesting, the !SMP variant does the folio_test_clear_lru().
>>
>> It would be really helpful if we could reliably identify whether LRU
>> batching code has a raised reference on a folio.
>>
>> We have the same scenario in
>> * folio_deactivate()
>> * folio_mark_lazyfree()
>>
>> In folio_batch_move_lru() we do the folio_test_clear_lru(folio).
>>
>> No expert on that code, I'm wondering if we could move the
>> folio_test_clear_lru() out, such that we can more reliably identify
>> whether a folio is on the LRU batch or not.
>
> I'm sure there would be something extremely broken with the following
> (I don't know what I'm doing ;) ), but I wonder if there would be a way
> to make something like that work (and perform well enough?).
>
> diff --git a/mm/swap.c b/mm/swap.c
> index 67786cb771305..642e471c3ec5a 100644
> --- a/mm/swap.c
> +++ b/mm/swap.c
> @@ -212,10 +212,6 @@ static void folio_batch_move_lru(struct folio_batch *fbatch, move_fn_t move_fn)
>         for (i = 0; i < folio_batch_count(fbatch); i++) {
>                 struct folio *folio = fbatch->folios[i];
>
> -               /* block memcg migration while the folio moves between lru */
> -               if (move_fn != lru_add_fn && !folio_test_clear_lru(folio))
> -                       continue;
> -
>                 folio_lruvec_relock_irqsave(folio, &lruvec, &flags);
>                 move_fn(lruvec, folio);
>
> @@ -255,8 +251,9 @@ static void lru_move_tail_fn(struct lruvec *lruvec, struct folio *folio)
>   */
>  void folio_rotate_reclaimable(struct folio *folio)
>  {
> -       if (!folio_test_locked(folio) && !folio_test_dirty(folio) &&
> -           !folio_test_unevictable(folio) && folio_test_lru(folio)) {
> +       if (folio_test_lru(folio) && !folio_test_locked(folio) &&
> +           !folio_test_dirty(folio) && !folio_test_unevictable(folio) &&
> +           folio_test_clear_lru(folio)) {
>                 struct folio_batch *fbatch;
>                 unsigned long flags;
>
> @@ -354,7 +351,7 @@ static void folio_activate_drain(int cpu)
>  void folio_activate(struct folio *folio)
>  {
>         if (folio_test_lru(folio) && !folio_test_active(folio) &&
> -           !folio_test_unevictable(folio)) {
> +           !folio_test_unevictable(folio) && folio_test_clear_lru(folio)) {
>                 struct folio_batch *fbatch;
>
>                 folio_get(folio);
> @@ -699,6 +696,8 @@ void deactivate_file_folio(struct folio *folio)
>         /* Deactivating an unevictable folio will not accelerate reclaim */
>         if (folio_test_unevictable(folio))
>                 return;
> +       if (!folio_test_clear_lru(folio))
> +               return;
>
>         folio_get(folio);
>         local_lock(&cpu_fbatches.lock);
> @@ -718,7 +717,8 @@ void deactivate_file_folio(struct folio *folio)
>  void folio_deactivate(struct folio *folio)
>  {
>         if (folio_test_lru(folio) && !folio_test_unevictable(folio) &&
> -           (folio_test_active(folio) || lru_gen_enabled())) {
> +           (folio_test_active(folio) || lru_gen_enabled()) &&
> +           folio_test_clear_lru(folio)) {
>                 struct folio_batch *fbatch;
>
>                 folio_get(folio);
> @@ -740,7 +740,8 @@ void folio_mark_lazyfree(struct folio *folio)
>  {
>         if (folio_test_lru(folio) && folio_test_anon(folio) &&
>             folio_test_swapbacked(folio) && !folio_test_swapcache(folio) &&
> -           !folio_test_unevictable(folio)) {
> +           !folio_test_unevictable(folio) &&
> +           folio_test_clear_lru(folio)) {
>                 struct folio_batch *fbatch;
>
>                 folio_get(folio);

With your changes, we would call folio_test_clear_lru(folio) first to
clear the LRU flag and only then call folio_get(folio) to pin the folio,
which seems a little unreasonable. Normally, folio_get(folio) is called
first to pin the folio, and then other functions are called to handle
it.
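
As a rough illustration of the problem the original patch addresses,
here is a minimal userspace sketch. It is not kernel code. It assumes
the check being removed has the form "drain only if the LRU flag is
clear", and every toy_* name is a hypothetical stand-in: toy_activate()
models the folio_get() plus batch add in folio_activate() quoted above,
and toy_drain() models draining the per-CPU batch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace model only; all toy_* names are hypothetical. */
struct toy_folio {
	int refcount;	/* stands in for the folio reference count */
	bool lru;	/* stands in for the flag folio_test_lru() reads */
};

#define TOY_BATCH_SIZE 15

struct toy_batch {
	size_t nr;
	struct toy_folio *folios[TOY_BATCH_SIZE];
};

/* Models folio_activate(): take a reference, then queue the folio. */
static void toy_activate(struct toy_batch *b, struct toy_folio *f)
{
	if (!f->lru)
		return;
	assert(b->nr < TOY_BATCH_SIZE);
	f->refcount++;			/* models folio_get() */
	b->folios[b->nr++] = f;		/* the LRU flag stays set */
}

/* Models a drain: the batch drops the reference it was holding. */
static void toy_drain(struct toy_batch *b)
{
	for (size_t i = 0; i < b->nr; i++)
		b->folios[i]->refcount--;
	b->nr = 0;
}

/* Migration is allowed only when no unexpected reference is held. */
static bool toy_can_migrate(const struct toy_folio *f, int expected_refs)
{
	return f->refcount == expected_refs;
}

/*
 * check_lru_first == true:  drain only when the LRU flag is clear
 *                           (assumed shape of the current check).
 * check_lru_first == false: drain unconditionally (the proposed change).
 */
bool toy_try_migrate(bool check_lru_first)
{
	struct toy_batch batch = { 0 };
	struct toy_folio folio = { .refcount = 1, .lru = true };

	toy_activate(&batch, &folio);	/* refcount goes from 1 to 2 */

	if (!check_lru_first || !folio.lru)
		toy_drain(&batch);

	return toy_can_migrate(&folio, 1);
}
```

In this model, toy_try_migrate(true) reports failure: the folio still
has the LRU flag set while the batch holds an extra reference, so the
drain is skipped. toy_try_migrate(false) drains unconditionally and
succeeds.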