From: Suren Baghdasaryan <surenb@google.com>
To: Yang Shi <yang@os.amperecomputing.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
hughd@google.com, willy@infradead.org, mhocko@suse.com,
vbabka@suse.cz, osalvador@suse.de, aquini@redhat.com,
kirill@shutemov.name, rientjes@google.com, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, stable@vger.kernel.org
Subject: Re: [PATCH] mm: mempolicy: keep VMA walk if both MPOL_MF_STRICT and MPOL_MF_MOVE are specified
Date: Wed, 27 Sep 2023 14:39:21 -0700
Message-ID: <CAJuCfpExMWXHfZjgZ=UKf4k=zxrNOLx2R-a_wQdZ3O_+JTOq4w@mail.gmail.com>
In-Reply-To: <90fc0e8d-f378-4d6f-5f52-c14583200a2e@os.amperecomputing.com>
On Mon, Sep 25, 2023 at 10:16 AM Yang Shi <yang@os.amperecomputing.com> wrote:
>
>
>
> On 9/25/23 8:48 AM, Andrew Morton wrote:
> > On Wed, 20 Sep 2023 15:32:42 -0700 Yang Shi <yang@os.amperecomputing.com> wrote:
> >
> >> When calling mbind() with MPOL_MF_{MOVE|MOVEALL} | MPOL_MF_STRICT,
> >> the kernel should attempt to migrate all existing pages, and return -EIO if
> >> there is a misplaced or unmovable page. Then commit 6f4576e3687b
> >> ("mempolicy: apply page table walker on queue_pages_range()") messed up
> >> the return value and no longer broke the VMA scan early when MPOL_MF_STRICT
> >> was specified alone. The return value problem was fixed by commit a7f40cfe3b7a
> >> ("mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified"),
> >> but it broke the VMA walk early if an unmovable page was met, which may cause
> >> some pages not to be migrated as expected.
> > So I'm thinking that a7f40cfe3b7a is the suitable Fixes: target?
>
> Yes, thanks. My follow-up email also added this.
>
> >
> >> The code should conceptually do:
> >>
> >>     if (MPOL_MF_MOVE|MOVEALL)
> >>         scan all vmas
> >>         try to migrate the existing pages
> >>         return success
> >>     else if (MPOL_MF_MOVE* | MPOL_MF_STRICT)
> >>         scan all vmas
> >>         try to migrate the existing pages
> >>         return -EIO if unmovable or migration failed
> >>     else /* MPOL_MF_STRICT alone */
> >>         break early if meets unmovable and don't call mbind_range() at all
> >>     else /* none of those flags */
> >>         check the ranges in test_walk, EFAULT without mbind_range() if discontig.

With this change I think my temporary fix at
https://lore.kernel.org/all/20230918211608.3580629-1-surenb@google.com/
can be removed because we either scan all vmas (which means we locked
them all) or we break early and do not call mbind_range() at all (in
which case we don't need vmas to be locked).

> >>
> >> Fixed the behavior.
> >>
>
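
To make the locking argument above concrete, here is a minimal userspace
sketch in C. It is not kernel code: the sk_* names, the flag values and the
per-VMA "has_unmovable" and "locked" fields are all hypothetical, chosen only
to model the pseudocode quoted in this thread. The discontiguous-range
(EFAULT) case is left out. The sketch assumes, following the reasoning above,
that the policy is applied only when the walk runs to completion.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the mbind() mode flags, local to this sketch. */
enum {
    SK_STRICT   = 1 << 0,
    SK_MOVE     = 1 << 1,
    SK_MOVE_ALL = 1 << 2,
};

struct sk_vma {
    bool has_unmovable;  /* VMA holds a page that cannot be migrated */
    bool locked;         /* set when the walk visits (and locks) the VMA */
};

struct sk_result {
    int err;             /* 0 or -EIO */
    size_t scanned;      /* number of VMAs the walk visited */
    bool policy_applied; /* models whether mbind_range() would run */
};

static struct sk_result sk_walk(struct sk_vma *vmas, size_t n, unsigned flags)
{
    struct sk_result r = { 0, 0, false };
    bool move = (flags & (SK_MOVE | SK_MOVE_ALL)) != 0;
    bool strict = (flags & SK_STRICT) != 0;
    bool saw_unmovable = false;

    for (size_t i = 0; i < n; i++) {
        vmas[i].locked = true;
        r.scanned++;

        if (!vmas[i].has_unmovable)
            continue;
        saw_unmovable = true;

        /* STRICT alone: stop at the first unmovable page, apply nothing. */
        if (strict && !move) {
            r.err = -EIO;
            return r;
        }
        /* With a MOVE flag set, keep walking so later VMAs are still handled. */
    }

    /* MOVE* | STRICT: the walk finished, but report the unmovable page. */
    if (move && strict && saw_unmovable)
        r.err = -EIO;

    /* The policy is applied only after every VMA was visited and locked. */
    assert(r.scanned == n);
    r.policy_applied = true;
    return r;
}
```

In this model the two outcomes are mutually exclusive: either policy_applied
is true and every VMA is marked locked, or the walk stopped early and
policy_applied stays false, so no unlocked VMA is ever handed to the
policy-applying step.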