From: Alexander Duyck <alexander.duyck@gmail.com>
To: Prathu Baronia <prathu.baronia@oneplus.com>
Cc: Chintan Pandya <chintan.pandya@oneplus.com>,
	"Huang, Ying" <ying.huang@intel.com>,
	 Michal Hocko <mhocko@suse.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	 "linux-mm@kvack.org" <linux-mm@kvack.org>,
	 "gregkh@linuxfoundation.org" <gregkh@linuxfoundation.org>,
	"gthelen@google.com" <gthelen@google.com>,
	 "jack@suse.cz" <jack@suse.cz>, Ken Lin <ken.lin@oneplus.com>,
	Gasine Xu <Gasine.Xu@oneplus.com>
Subject: Re: [RFC] mm/memory.c: Optimizing THP zeroing routine for !HIGHMEM cases
Date: Mon, 13 Apr 2020 09:24:29 -0700
Message-ID: <CAKgT0UcDG09F3pDV65Xx_5OUa8dcc6m-u+BNLvS0wzq=Gztj6w@mail.gmail.com>
In-Reply-To: <20200413153351.GB13136@oneplus.com>

On Mon, Apr 13, 2020 at 8:34 AM Prathu Baronia
<prathu.baronia@oneplus.com> wrote:
>
> On 04/11/2020 13:47, Alexander Duyck wrote:
> >
> > This is an interesting data point. So running things in reverse seems
> > much more expensive than running them forward. As such I would imagine
> > process_huge_page is going to be significantly more expensive on
> > ARM64, then, since it will wind through the pages in reverse order from
> > the end of the page all the way down to wherever the page was accessed.
> >
> > I wonder if we couldn't simply modify process_huge_page to process
> > pages in two passes: the first from addr_hint + some offset to the
> > end of the page, then looping back around to the start of the page
> > for the second pass and processing up to where the first pass began.
> > The idea is that the offset would be large enough that the 4K that
> > was accessed, plus some range before and after the address, is
> > hopefully still in the L1 cache after we are done.
> That's a great idea; we were working on a similar approach for the v2
> patch, and your suggestion reassures us that we are on the right track.
> This will incorporate the benefits of the optimized memset and will
> keep the cache hot around the faulting address.
>
> Earlier we had taken this offset as 0.5 MB, and after your response we
> have set it to 32 KB. Since we understand there is a trade-off in
> making this value too high, we would really appreciate it if you could
> suggest a method for deriving an appropriate value for this offset
> from the L1 cache size.

I mentioned 32KB since that happens to be a common L1 cache size on
both the ARM64 processor mentioned and most modern x86 CPUs. As far as
deriving it goes, I don't know if there is a good way to do that; I
suspect it is something that would need to be architecture-specific. If
nothing else, you might be able to define it similar to how
L1_CACHE_SHIFT/BYTES is defined in cache.h for most architectures. We
probably also want to experiment with the value a bit, as I suspect
there is some room to either increase or decrease it depending on the
cost of cold accesses versus being able to process memory
initialization in larger batches.
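
To make that concrete, here is a minimal userspace sketch of the
two-pass ordering (CLEAR_HINT_OFFSET, the HPAGE_SIZE value used here,
and clear_huge_page_two_pass are made-up names for illustration; an
actual patch would operate on struct page and go through the usual
highmem-safe helpers):

#include <stddef.h>
#include <string.h>

#define HPAGE_SIZE        (2UL * 1024 * 1024)  /* 2MB THP */
#define CLEAR_HINT_OFFSET (32UL * 1024)        /* hypothetical: ~L1 size */

/*
 * Clear from just past the faulting offset (plus the hint offset) to
 * the end of the huge page first, then wrap around and clear from the
 * start of the page up to where the first pass began. The bytes around
 * fault_off are written last, so they should still be in L1 when the
 * faulting task resumes.
 */
static void clear_huge_page_two_pass(void *hpage, size_t fault_off)
{
	size_t split = fault_off + CLEAR_HINT_OFFSET;

	if (split > HPAGE_SIZE)
		split = HPAGE_SIZE;

	/* Pass 1: [split, HPAGE_SIZE) -- one big forward memset. */
	memset((char *)hpage + split, 0, HPAGE_SIZE - split);

	/* Pass 2: [0, split) -- ends at split, leaving the region just
	 * below it, which contains fault_off, cache-hot. */
	memset(hpage, 0, split);
}

Note that each pass is a single forward memset, which is what lets an
architecture's optimized clearing routine run over large contiguous
ranges instead of being invoked in 4K chunks.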

> > An additional thing I was wondering is whether this impacts the copy
> > operations as well. Looking through the code, the two big users of
> > process_huge_page are clear_huge_page and copy_user_huge_page. One
> > thing that might make more sense than just splitting the code at a
> > high level would be to look at refactoring process_huge_page and its
> > users.
> You are right; we hadn't considered refactoring process_huge_page
> earlier. We will incorporate this in the soon-to-be-sent v2 patch.
>
> Thanks a lot for the interesting insights!
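
The refactor I had in mind would keep a single function deciding the
traversal order and have clear_huge_page and copy_user_huge_page just
supply the per-subpage operation, so both the zeroing and the copy
paths pick up the new ordering. Roughly like this (a sketch only; the
names and constants here are illustrative, not the actual mm/memory.c
code):

#include <stddef.h>

#define SUBPAGES_PER_HPAGE 512UL  /* 2MB huge page / 4KB subpages */
#define WARM_SUBPAGES      8UL    /* hypothetical: 32KB / 4KB */

/*
 * Walk the subpages of a huge page in two passes, calling the supplied
 * operation (zeroing or copying) on each. The subpages around
 * fault_idx are processed last so they stay cache-hot.
 */
static void process_huge_page_two_pass(size_t fault_idx,
				       void (*process_subpage)(size_t idx,
							       void *arg),
				       void *arg)
{
	size_t split = fault_idx + WARM_SUBPAGES;
	size_t i;

	if (split > SUBPAGES_PER_HPAGE)
		split = SUBPAGES_PER_HPAGE;

	/* Pass 1: subpages past the keep-warm window, forward order. */
	for (i = split; i < SUBPAGES_PER_HPAGE; i++)
		process_subpage(i, arg);

	/* Pass 2: wrap around; subpages around fault_idx come last. */
	for (i = 0; i < split; i++)
		process_subpage(i, arg);
}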

Sounds good. I'll look forward to v2.

Thanks.

- Alex

