linux-mm.kvack.org archive mirror
From: Barry Song <21cnbao@gmail.com>
To: Yosry Ahmed <yosryahmed@google.com>
Cc: Usama Arif <usamaarif642@gmail.com>,
	Nhat Pham <nphamcs@gmail.com>,
	akpm@linux-foundation.org,  linux-mm@kvack.org,
	linux-kernel@vger.kernel.org,  Barry Song <v-songbaohua@oppo.com>,
	Chengming Zhou <chengming.zhou@linux.dev>,
	 Johannes Weiner <hannes@cmpxchg.org>,
	David Hildenbrand <david@redhat.com>,
	Hugh Dickins <hughd@google.com>,
	 Matthew Wilcox <willy@infradead.org>,
	Shakeel Butt <shakeel.butt@linux.dev>,
	 Andi Kleen <ak@linux.intel.com>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	 Chris Li <chrisl@kernel.org>,
	"Huang, Ying" <ying.huang@intel.com>,
	 Kairui Song <kasong@tencent.com>,
	Ryan Roberts <ryan.roberts@arm.com>,
	joshua.hahnjy@gmail.com
Subject: Re: [PATCH RFC] mm: count zeromap read and set for swapout and swapin
Date: Tue, 29 Oct 2024 06:51:57 +0800
Message-ID: <CAGsJ_4yxoBVEY-Zpp3YNbiCCwbKO+v3-9R984uGVRHAtMSLDLQ@mail.gmail.com>
In-Reply-To: <CAJD7tkYPB=2c23LMi1+=qrPO+rcr5zJB4+2TPrcjAZHhsm=Vsw@mail.gmail.com>

On Tue, Oct 29, 2024 at 6:33 AM Yosry Ahmed <yosryahmed@google.com> wrote:
>
> [..]
> > > > By the way, I recently had an idea: if we can conduct the zeromap check
> > > > earlier - for example - before allocating swap slots and pageout(), could
> > > > we completely eliminate swap slot occupation and allocation/release
> > > > for zeromap data? For example, we could use a special swap
> > > > entry value in the PTE to indicate zero content and directly fill it with
> > > > zeros when swapping back. We've observed that swap slot allocation and
> > > > freeing can consume a lot of CPU and slow down functions like
> > > > zap_pte_range and swap-in. If we can entirely skip these steps, it
> > > > could improve performance. However, I'm uncertain about the benefits we
> > > > would gain if we only have 1-2% zeromap data.
> > >
> > > If I remember correctly this was one of the ideas floated around in the
> > > initial version of the zeromap series, but it was evaluated as a lot more
> > > complicated to do than what the current zeromap code looks like. But I
> > > think it's definitely worth looking into!
>
> Yup, I did suggest this on the first version:
> https://lore.kernel.org/linux-mm/CAJD7tkYcTV_GOZV3qR6uxgFEvYXw1rP-h7WQjDnsdwM=g9cpAw@mail.gmail.com/
>
> and Usama took a stab at implementing it in the second version:
> https://lore.kernel.org/linux-mm/20240604105950.1134192-1-usamaarif642@gmail.com/
>
> David and Shakeel pointed out a few problems. I think they are
> fixable, but the complexity/benefit tradeoff was getting unclear at
> that point.
>
> If we can make it work without too much complexity, that would be
> great of course.
>
> >
> > Sorry for the noise. I didn't review the initial discussion. But my feeling
> > is that it might be valuable considering the report from Zhiguo:
> >
> > https://lore.kernel.org/linux-mm/20240805153639.1057-1-justinjiang@vivo.com/
> >
> > In fact, our recent benchmark also indicates that swap free could account
> > for a significant portion of the time spent in do_swap_page().
>
> As Shakeel mentioned in a reply to Usama's patch mentioned above, we
> would need to check the contents of the page after it's unmapped. So
> likely we need to allocate a swap slot, walk the rmap and unmap, check
> contents, walk the rmap again and update the PTEs, free the swap slot.
>

So the issue is that we can't check the content before allocating slots and
unmapping during reclamation? If we find the content is zero, can we skip
all slot operations and go directly to rmap/unmap by using a special PTE?
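
To make the question concrete, here is a rough userspace sketch of the
ordering I have in mind. It is only a model: ZERO_MARKER, swap_out(),
swap_in() and the two counters are made-up names for illustration, not
kernel APIs or a real swap entry encoding, and it deliberately ignores the
requirement to re-check the content after unmapping.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SIZE 4096u
/* Hypothetical marker value; not a real kernel swap entry encoding. */
#define ZERO_MARKER UINT64_MAX

static unsigned long slot_allocs; /* simulated swap slot allocations */
static unsigned long slot_frees;  /* simulated swap slot frees */

/* Return true if every byte of the page is zero. */
static bool page_is_zero(const unsigned char *page)
{
	for (size_t i = 0; i < PAGE_SIZE; i++)
		if (page[i] != 0)
			return false;
	return true;
}

/* Simulated swap-out: check the content first, allocate a slot only if needed. */
static uint64_t swap_out(const unsigned char *page)
{
	if (page_is_zero(page))
		return ZERO_MARKER;         /* no slot is touched */
	return (uint64_t)slot_allocs++;     /* pretend slot number */
}

/* Simulated swap-in: zero-fill for the marker, otherwise release the slot. */
static void swap_in(uint64_t entry, unsigned char *page)
{
	if (entry == ZERO_MARKER) {
		memset(page, 0, PAGE_SIZE);
		return;
	}
	/* A real path would read the data back here; this model only counts. */
	assert(entry < slot_allocs);
	slot_frees++;
}
```

In this model a zero-filled page goes through swap_out() and swap_in()
without changing either counter, which is the saving being asked about.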

> So the swap free will be essentially moved from the fault path to the
> reclaim path, not eliminated. It may still be worth it, not sure. We
> also need to make sure we keep the rmap intact after the first walk
> and unmap in case we need to go back and update the PTEs again.
>
> Overall, I think the complexity is unlikely to be low.
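
For comparison, the sequence described in the quoted text above (allocate a
slot, unmap, check contents, update the PTEs, free the slot) can be modelled
with a few counters. This is only a sketch under stated assumptions: struct
path_stats, reclaim_page() and fault_page() are hypothetical names, and the
model assumes the slot is freed at swap-in for ordinary pages, which the real
code does not always do.

```c
#include <assert.h>
#include <stdbool.h>

struct path_stats {
	unsigned long slot_allocs;
	unsigned long slot_frees;
	unsigned long rmap_walks;
};

static struct path_stats reclaim_stats, fault_stats;

/*
 * Reclaim one page in the quoted order. Returns true if the page ended up
 * represented by a zero marker instead of a swap slot.
 */
static bool reclaim_page(bool content_is_zero_after_unmap)
{
	reclaim_stats.slot_allocs++;  /* allocate a swap slot */
	reclaim_stats.rmap_walks++;   /* walk the rmap and unmap */
	if (!content_is_zero_after_unmap)
		return false;         /* ordinary swap-out keeps the slot */
	reclaim_stats.rmap_walks++;   /* walk again to install the marker */
	/* A slot must be outstanding before it can be freed. */
	assert(reclaim_stats.slot_allocs > reclaim_stats.slot_frees);
	reclaim_stats.slot_frees++;   /* the free now happens in reclaim */
	return true;
}

/* Fault one page back in; uses_marker is the value reclaim_page() returned. */
static void fault_page(bool uses_marker)
{
	if (!uses_marker)
		fault_stats.slot_frees++; /* slot freed on the fault path */
}
```

For a zero page the model records one allocation, one free and two rmap
walks in reclaim_stats and nothing in fault_stats, so the free is moved to
the reclaim path rather than eliminated.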

