linux-mm.kvack.org archive mirror
From: Barry Song <21cnbao@gmail.com>
To: chenridong <chenridong@huawei.com>
Cc: Yu Zhao <yuzhao@google.com>, Matthew Wilcox <willy@infradead.org>,
	Chris Li <chrisl@kernel.org>,
	 Chen Ridong <chenridong@huaweicloud.com>,
	akpm@linux-foundation.org, mhocko@suse.com,  hannes@cmpxchg.org,
	yosryahmed@google.com, david@redhat.com,  ryan.roberts@arm.com,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	 wangweiyang2@huawei.com, xieym_ict@hotmail.com
Subject: Re: [RFC PATCH v2 1/1] mm/vmscan: move the written-back folios to the tail of LRU after shrinking
Date: Fri, 29 Nov 2024 16:07:29 +1300
Message-ID: <CAGsJ_4zqL8ZHNRZ44o_CC69kE7DBVXvbZfvmQxMGiFqRxqHQdA@mail.gmail.com>
In-Reply-To: <bf98a80a-2be0-413f-8a7a-34bb17f053cc@huawei.com>

On Fri, Nov 29, 2024 at 3:25 PM chenridong <chenridong@huawei.com> wrote:
>
>
>
> On 2024/11/29 7:08, Barry Song wrote:
> > On Mon, Nov 25, 2024 at 2:19 PM chenridong <chenridong@huawei.com> wrote:
> >>
> >>
> >>
> >> On 2024/11/18 12:21, Matthew Wilcox wrote:
> >>> On Mon, Nov 18, 2024 at 05:14:14PM +1300, Barry Song wrote:
> >>>> On Mon, Nov 18, 2024 at 5:03 PM Matthew Wilcox <willy@infradead.org> wrote:
> >>>>>
> >>>>> On Sat, Nov 16, 2024 at 09:16:58AM +0000, Chen Ridong wrote:
> >>>>>> 2. In the shrink_page_list function, if folioN is a THP (2M), it may be
> >>>>>>    split and added to the swap cache folio by folio. After a folio is
> >>>>>>    added to the swap cache, I/O is submitted to write it back to swap,
> >>>>>>    which is asynchronous. When shrink_page_list finishes, the list of
> >>>>>>    isolated folios is moved back to the head of the inactive LRU. The
> >>>>>>    inactive LRU may then look like this, with 512 folios moved to its head.
> >>>>>
> >>>>> I was hoping that we'd be able to stop splitting the folio when adding
> >>>>> to the swap cache.  Ideally, we'd add the whole 2MB and write it back
> >>>>> as a single unit.
> >>>>
> >>>> This is already the case: adding to the swapcache doesn’t require splitting
> >>>> THPs, but failing to allocate 2MB of contiguous swap slots will.
> >>>
> >>> Agreed we need to understand why this is happening.  As I've said a few
> >>> times now, we need to stop requiring contiguity.  Real filesystems don't
> >>> need the contiguity (they become less efficient, but they can scatter a
> >>> single 2MB folio to multiple places).
> >>>
> >>> Maybe Chris has a solution to this in the works?
> >>>
> >>
> >> Hi, Chris, do you have a better idea to solve this issue?
> >
> > Not Chris, but reading the code again, we already have the code below in
> > evict_folios() to fix up the "missed folio_rotate_reclaimable()" issue:
> >
> >                 /* retry folios that may have missed folio_rotate_reclaimable() */
> >                 list_move(&folio->lru, &clean);
> >
> > Doesn't it work for you?
> >
> > commit 359a5e1416caaf9ce28396a65ed3e386cc5de663
> > Author: Yu Zhao <yuzhao@google.com>
> > Date:   Tue Nov 15 18:38:07 2022 -0700
> >     mm: multi-gen LRU: retry folios written back while isolated
> >
> >     The page reclaim isolates a batch of folios from the tail of one of the
> >     LRU lists and works on those folios one by one.  For a suitable
> >     swap-backed folio, if the swap device is async, it queues that folio for
> >     writeback.  After the page reclaim finishes an entire batch, it puts back
> >     the folios it queued for writeback to the head of the original LRU list.
> >
> >     In the meantime, the page writeback flushes the queued folios also by
> >     batches.  Its batching logic is independent from that of the page reclaim.
> >     For each of the folios it writes back, the page writeback calls
> >     folio_rotate_reclaimable() which tries to rotate a folio to the tail.
> >
> >     folio_rotate_reclaimable() only works for a folio after the page reclaim
> >     has put it back.  If an async swap device is fast enough, the page
> >     writeback can finish with that folio while the page reclaim is still
> >     working on the rest of the batch containing it.  In this case, that folio
> >     will remain at the head and the page reclaim will not retry it before
> >     reaching there.
> >
> >     This patch adds a retry to evict_folios().  After evict_folios() has
> >     finished an entire batch and before it puts back folios it cannot free
> >     immediately, it retries those that may have missed the rotation.
> >     Before this patch, ~60% of folios swapped to an Intel Optane missed
> >     folio_rotate_reclaimable().  After this patch, ~99% of missed folios were
> >     reclaimed upon retry.
> >
> >     This problem affects relatively slow async swap devices like Samsung 980
> >     Pro much less and does not affect sync swap devices like zram or zswap at
> >     all.
> >
> >>
> >> Best regards,
> >> Ridong
> >
> > Thanks
> > Barry
>
> Thank you for your reply, Barry.
> I found this issue on version 5.10. I also reproduced it on the next
> version with the CONFIG_LRU_GEN_ENABLED kconfig disabled. When I tested
> again with CONFIG_LRU_GEN_ENABLED enabled, the issue was fixed.
>
> IIUC, commit 359a5e1416caaf9ce28396a65ed3e386cc5de663 only takes effect
> when CONFIG_LRU_GEN_ENABLED is enabled, but this issue also exists when
> CONFIG_LRU_GEN_ENABLED is disabled, and it should be fixed there too.
>
> I read the code of commit 359a5e1416caaf9ce28396a65ed3e386cc5de663. It
> finds folios that missed rotation in a more complicated way, but it
> makes it much clearer what is being done. Should I implement it in Yu
> Zhao's way?

Yes, this is exactly the same issue.
Yu fixed it only in MGLRU, and you are still using the active/inactive LRU,
so the same fix should be applied to the active/inactive LRU as well.
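
To make the idea concrete, here is a rough userspace model of the retry
(not kernel code, and not the actual patch). Every name in it
(model_folio, model_end_writeback, model_try_reclaim, model_shrink_batch,
the fast_device and retry flags) is made up for illustration. It only
assumes what the commit message above describes: writeback completion can
rotate a folio only if reclaim has already put it back, so a fast async
device makes the rotation miss, and a second pass over the batch picks up
those folios before they are put back.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_folio {
	bool dirty;     /* must be written back before it can be freed */
	bool writeback; /* async I/O is in flight */
	bool on_lru;    /* false while the folio is isolated by reclaim */
	bool rotated;   /* writeback completion moved it to the LRU tail */
	bool freed;     /* reclaimed */
};

/* Models writeback completion: rotation works only for folios on the LRU. */
static void model_end_writeback(struct model_folio *f)
{
	f->writeback = false;
	f->dirty = false;
	if (f->on_lru)
		f->rotated = true;
	/* otherwise the rotation is missed: reclaim still holds the folio */
}

/* One reclaim attempt on an isolated folio; returns true if it was freed. */
static bool model_try_reclaim(struct model_folio *f)
{
	if (f->writeback)
		return false;
	if (f->dirty) {
		f->writeback = true; /* queue the folio for async writeback */
		return false;
	}
	f->freed = true;
	return true;
}

/*
 * Shrinks one isolated batch and returns the number of folios freed.
 * fast_device models I/O that completes while the batch is still isolated.
 * retry enables the second pass over folios that missed the rotation.
 */
static size_t model_shrink_batch(struct model_folio *batch, size_t n,
				 bool fast_device, bool retry)
{
	size_t i, freed = 0;

	assert(batch != NULL || n == 0);

	for (i = 0; i < n; i++)
		batch[i].on_lru = false; /* isolate the whole batch */

	for (i = 0; i < n; i++) {
		if (model_try_reclaim(&batch[i]))
			freed++;
		else if (fast_device && batch[i].writeback)
			model_end_writeback(&batch[i]);
	}

	if (retry) {
		for (i = 0; i < n; i++) {
			struct model_folio *f = &batch[i];

			/* clean and idle again: writeback finished meanwhile */
			if (!f->freed && !f->writeback && !f->dirty &&
			    model_try_reclaim(f))
				freed++;
		}
	}

	for (i = 0; i < n; i++)
		if (!batch[i].freed)
			batch[i].on_lru = true; /* put back, at the head */

	return freed;
}
```

In this model, a batch of three dirty folios and one clean folio on a fast
device frees only the clean one without the retry, and the other three go
back to the head unrotated. With the retry enabled, all four are freed. On
a slow device the retry changes nothing, since the folios are still under
writeback and the later completion can rotate them normally.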


>
> Best regards,
> Ridong

thanks
barry

