From: Joel Fernandes <joel@joelfernandes.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Andrew Morton <akpm@linux-foundation.org>,
Linux-MM <linux-mm@kvack.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Naresh Kamboju <naresh.kamboju@linaro.org>,
William Kucharski <william.kucharski@oracle.com>
Subject: Re: [PATCHv2] mm: Fix warning in move_normal_pmd()
Date: Wed, 15 Jul 2020 16:54:28 -0400 [thread overview]
Message-ID: <20200715205428.GA201569@google.com> (raw)
In-Reply-To: <CAHk-=wh4pB-jRngOjcGxpc8=NPds3jWqJwDMUWC3-OEo4dRiKg@mail.gmail.com>
Hi Linus,
On Wed, Jul 15, 2020 at 11:36:59AM -0700, Linus Torvalds wrote:
> On Wed, Jul 15, 2020 at 6:50 AM Kirill A. Shutemov
> <kirill.shutemov@linux.intel.com> wrote:
> >
> > mremap(2) does not allow source and destination regions to overlap, but
> > shift_arg_pages() calls move_page_tables() directly and in this case the
> > source and destination overlap often. It
>
> Actually, before we do this patch (which I think is a good idea), I'd
> like Naresh to test the attached patch.
>
> And Kirill, Joel, mind looking it over too.
>
> IT IS ENTIRELY UNTESTED.
Seems a clever idea and was something I wanted as well. Thanks. Some comments
below:
> But I hope the concept (and the code) is fairly obvious: it makes
> move_page_tables() try to align the range to be moved, if possible.
>
> That *should* actually remove the warning that Naresh sees, for the
> simple reason that now the PMD moving case will *always* trigger when
> the stack movement is PMD-aligned, and you never see the "first do a
> few small pages, then do the PMD optimization" case.
>
> And that should mean that we don't ever hit the "oh, we already have a
> target pmd" thing, because the "move the whole pmd" case will clear
> the ones it has already taken care of, and you never have that "oh,
> there's an empty pmd where the pages were moved away one by one".
>
> And that means that for this case, it's _ok_ to overlap (as long as we
> copy downwards).
>
> What do you think?
I might not have fully understand the code but I get the basic idea it is
aiming for. Basically you want to align the cases where the address space is
free from both sides so that you can trigger the PMD-moving optimization.
Regarding the ADDR_AFTER_NEXT checks, shouldn't you check for:
if (ADDR_AFTER_NEXT(ALIGN(*old_addr + *len, PMD_SIZE), old))
return;
and for the len calculation, I did not follow what you did, but I think you
meant something like this? Does the following reduce to what you did? At
least this is a bit more readable I think:
*len += (ALIGN(*new_addr + *len, PMD_SIZE) - (*new_addr + *len));
which reduces to:
*len = ALIGN(*new_addr + *len, PMD_SIZE) - *new_addr;
Also you did "len +=", it should be "*len +=" in this function.
In try_to_align_start(), everything looks correct to me.
> Ok, so I could easily have screwed up the patch, even if it's
> conceptually fairly simple. The details are small, but they needed
> some care. The thing _looks_ fine to me, both on a source level and
> when looking at the generated code, and I made sure to try to use a
> lot of small helper functions and couple of helper macros to make each
> individual piece look clear and obvious.
>
> But obviously a single "Oh, that test is the wrong way around", or a
> missing mask inversion, or whatever, could completely break the code.
> So no guarantees, but it looks fairly straightforward to me.
>
> Naresh - this patch does *NOT* make sense to test together with
> Kirill's (or Joel's) patch that says "don't do the PMD optimization at
> all for overlapping cases". Those patches will hide the issue, and
> make the new code only work for mremap(), not for the initial stack
> setup that caused the original warning.
>
> Joel - I think this patch makes sense _regardless_ of the special
> execve() usage case, in that it extends your "let's just copy things
> at a whole PMD level" logic to even the case where the beginning
> wasn't PMD-aligned (or the length wasn't sufficient).
>
> Obviously it only triggers when the source and destination are
> mutually aligned, and if there are no other vma's around those
> first/last PMD entries. Which might mean that it almost never triggers
> in practice, but looking at the generated code, it sure looks like
> it's cheap enough to test.
Oh ok, we had some requirements in my old job about moving large address
spaces which were aligned so it triggered a lot in those. Also folks in the
ChromeOS team tell me it is helping them.
> Didn't you have some test-cases for your pmd optimized movement cases,
> at least for timing?
Yes, I had a simple mremap() test which was moving a 1GB map and measuring
performance. Once I get a chance I can try it out with this optimization and
trigger it with unaligned addresses as well.
thanks,
- Joel
next prev parent reply other threads:[~2020-07-15 20:54 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-07-15 13:50 Kirill A. Shutemov
2020-07-15 18:36 ` Linus Torvalds
2020-07-15 20:54 ` Joel Fernandes [this message]
2020-07-15 21:13 ` Kirill A. Shutemov
2020-07-15 21:31 ` Linus Torvalds
2020-07-15 21:43 ` Linus Torvalds
2020-07-15 22:22 ` Kirill A. Shutemov
2020-07-15 22:36 ` Linus Torvalds
2020-07-15 22:57 ` Linus Torvalds
2020-07-15 23:04 ` Linus Torvalds
2020-07-15 23:18 ` Linus Torvalds
2020-07-16 6:37 ` Naresh Kamboju
2020-07-16 7:23 ` Naresh Kamboju
2020-07-16 8:46 ` Kirill A. Shutemov
2020-07-16 8:32 ` Naresh Kamboju
2020-07-16 13:16 ` Kirill A. Shutemov
2020-07-16 17:54 ` Linus Torvalds
2020-07-16 18:47 ` Joel Fernandes
2020-07-15 20:55 ` Kirill A. Shutemov
2020-07-15 21:35 ` Linus Torvalds
2020-07-15 21:51 ` Linus Torvalds
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200715205428.GA201569@google.com \
--to=joel@joelfernandes.org \
--cc=akpm@linux-foundation.org \
--cc=kirill.shutemov@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=naresh.kamboju@linaro.org \
--cc=torvalds@linux-foundation.org \
--cc=william.kucharski@oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox