From: Linus Torvalds <torvalds@linux-foundation.org>
To: Joel Fernandes <joel@joelfernandes.org>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>,
Michal Hocko <mhocko@suse.com>,
Naresh Kamboju <naresh.kamboju@linaro.org>,
Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, LKML <linux-kernel@vger.kernel.org>
Subject: Re: WARN_ON in move_normal_pmd
Date: Sat, 25 Mar 2023 10:26:06 -0700
Message-ID: <CAHk-=wjsrGG3-gvrpPs9f56PVvwmZViPKm++TuSsHeyTQ+tRmQ@mail.gmail.com>
In-Reply-To: <CAHk-=whd7msp8reJPfeGNyt0LiySMT0egExx3TVZSX3Ok6X=9g@mail.gmail.com>
On Sat, Mar 25, 2023 at 10:06 AM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> So what I'm saying is that *if* we start out with that situation, and
> we have that
>
> old = 0x1fff000
> new = 0x1dff000
> len = 0x201000
>
> we could easily decide "let's just move the whole PMD", and expand the
> move to be
>
> old = 0x1e00000
> new = 0x1c00000
> len = 0x400000
>
> instead. And then instead of moving PTE's around at first, we'd move
> PMD's around *all* the time, and turn this into that "simple case
> (a)".
>
> NOTE! For this to work, there must be no mapping right below 'old' or
> 'new', of course. But during the execve() startup, that should be
> trivially true.
>
> See what I'm saying?
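
The expansion in the quoted text is plain rounding arithmetic. A minimal sketch of it, assuming a 2MB PMD size (x86-64 with 4K pages): `struct move` and `expand_to_pmd()` are hypothetical names made up for illustration, not kernel functions, and the "nothing mapped right below 'old' or 'new'" check is deliberately left out.

```c
#include <stdbool.h>

#define PMD_SIZE (1ul << 21)		/* assumed: 2MB PMDs, 4K pages */
#define PMD_MASK (~(PMD_SIZE - 1))

/* Hypothetical description of a move; not a kernel structure */
struct move {
	unsigned long old_addr, new_addr, len;
};

/*
 * Widen a move so that both the start and the end land on PMD
 * boundaries. This only works when old and new sit at the same
 * offset inside their PMD. The caller would still have to verify
 * that nothing else is mapped in the added range (not shown).
 */
static bool expand_to_pmd(struct move *m)
{
	unsigned long off = m->old_addr & ~PMD_MASK;
	unsigned long end;

	if ((m->new_addr & ~PMD_MASK) != off)
		return false;

	/* Round the end of the old range up to the next PMD boundary */
	end = (m->old_addr + m->len + PMD_SIZE - 1) & PMD_MASK;

	/* Pull both starts down to their PMD boundary */
	m->old_addr -= off;
	m->new_addr -= off;
	m->len = end - m->old_addr;
	return true;
}
```

Feeding it old = 0x1fff000, new = 0x1dff000, len = 0x201000 gives old = 0x1e00000, new = 0x1c00000, len = 0x400000, matching the numbers quoted above.
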
Also note that my comments about "this can be tested with mremap()"
are because the above optimization works and is valid even when old
and new are not originally overlapping, but they overlap after the
expansion.
IOW, imagine that you have a 2MB mapping, but it is not 2MB-aligned
virtually, and you want to move that mapping down by 2MB.

Now, because that 2MB mapping is *not* 2MB-aligned, it actually takes
up *two* PMD entries. But if that mapping is the only thing that
exists in those two PMD entries, and the PMD entry below it is clear
(because there is no mapping right below the new address), then we can
still do that unaligned 2MB mapping move entirely at the PMD level.

So instead of wasting time to move it one page at a time (until it is
2MB aligned), we could just move two PMD entries around.
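
To put rough numbers on that, here is a small sketch that counts how many PMD entries and how many page table entries a range covers. It assumes 4K pages and 2MB PMDs (as on x86-64); `PG_SIZE`, `pmd_entries()` and `pte_entries()` are hypothetical names for illustration, not kernel interfaces.

```c
#define PG_SIZE  (1ul << 12)	/* assumed: 4K pages */
#define PMD_SIZE (1ul << 21)	/* assumed: 2MB covered by one PMD entry */

/* Number of PMD entries touched by [start, start + len); len must be nonzero */
static unsigned long pmd_entries(unsigned long start, unsigned long len)
{
	unsigned long first = start / PMD_SIZE;
	unsigned long last = (start + len - 1) / PMD_SIZE;

	return last - first + 1;
}

/* Number of page table entries needed for a page-aligned length */
static unsigned long pte_entries(unsigned long len)
{
	return len / PG_SIZE;
}
```

For a 2MB range that starts 1MB past a 2MB boundary, `pmd_entries()` returns 2 and `pte_entries()` returns 512, so the difference is 512 page-at-a-time moves versus 2 PMD entry moves.
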
Here's a (UNTESTED! It compiles, but that's it) user test-case for
this situation:

#define _GNU_SOURCE
#include <sys/mman.h>
#include <string.h>

/* Pick some random 2MB-aligned address that isn't near anything else */
#define MB (1ul << 20)
#define VA ((char *)(128 * MB))
#define old (VA+MB)
#define new (VA-MB)
#define len (2*MB)

int main(int argc, char **argv)
{
	void *addr;

	/* 2MB mapping at a 1MB offset, so it straddles two PMD entries */
	addr = mmap(old, len,
		PROT_READ | PROT_WRITE,
		MAP_ANONYMOUS | MAP_PRIVATE | MAP_FIXED,
		-1, 0);
	if (addr == MAP_FAILED)
		return 1;

	/* Touch every page so the page tables are actually populated */
	memset(addr, 0xff, len);

	/* Move it down by 2MB; old and new share the same offset in the PMD */
	if (mremap(old, len, len,
		MREMAP_MAYMOVE | MREMAP_FIXED, new) == MAP_FAILED)
		return 1;
	return 0;
}

and I claim that that mremap() right now ends up doing the whole 2MB
page table move one page at a time, but it *should* be doable as just
two PMD entry moves.
See?
Linus