From: Yang Shi <shy828301@gmail.com>
To: John Hubbard <jhubbard@nvidia.com>, Christoph Lameter <cl@linux.com>
Cc: Felix Abecassis <fabecassis@nvidia.com>,
Linux MM <linux-mm@kvack.org>,
Andrew Morton <akpm@linux-foundation.org>
Subject: Re: bug: move_pages(2) does not udpate "status" if no pages are moved
Date: Wed, 4 Dec 2019 16:43:29 -0800 [thread overview]
Message-ID: <CAHbLzkprcO50g-EqL3t0-A3fGnUU0A6rps+4q8TYM+eqeKiaAw@mail.gmail.com> (raw)
In-Reply-To: <894a7d96-b715-bec5-2f72-1552891672ff@nvidia.com>
On Wed, Dec 4, 2019 at 3:45 PM John Hubbard <jhubbard@nvidia.com> wrote:
>
> On 12/4/19 12:04 PM, Yang Shi wrote:
> > On Wed, Dec 4, 2019 at 11:21 AM John Hubbard <jhubbard@nvidia.com> wrote:
> >>
> >> On 12/4/19 11:01 AM, Felix Abecassis wrote:
> >>> Hello all,
> >>>
> >>
> >> Hi Felix,
> >>
> >> Thanks for writing up a very clear description of the problem.
> >>
> >>> On kernel 5.3, when using the move_pages syscall (wrapped by libnuma) and all
> >>> pages happen to be on the right node already, this function returns 0 but the
> >>> "status" array is not updated. This array potentially contains garbage values
> >>> (e.g. from malloc(3)), and I don't see a way to detect this.
> >>
> >>
> >> The way to detect this case would be to zero the array before calling move_pages().
> >> Then, if move_pages() returns 0, and the array remains full of zeroes, you can
> >> conclude that move_pages() "succeeded", and that there were no errors for any
> >> of the pages. So the pages are where you requested them to end up.
> >
> > I don't think we can just simply return all zeros here. It looks the
> > status should contain error code or the target node id if the page is
> > moved to that node successfully. So, if the page is already on the
> > requested node, the status should contain the current node id, but the
> > current node maybe not 0.
> >
> > So, IMHO it sounds like a valid bug.
> >
>
> Yes, you're right, a more precise reading of the man page does support that:
> if move_pages() returns 0, then the status array *must* contain valid node IDs.
> I see. (Felix also mentioned the same thing, in a side note.)
>
> Looking some more at both the man page and Felix's report (and the kernel
> implementation), it seems like there are maybe two bugs here:
>
> 1) Not setting the status array in some cases, if some pages were not moved for
> "non fatal" reasons, and
>
> 2) Returning success if no pages were moved. The "ERRORS" section of the man
> page seems to require that ENOENT be returned in that case. Although, you could
> perhaps argue that this statement is only unidirectional. In other words, maybe
> ENOENT happens, but it doesn't *have* to happen, if all pages were already on the
> target node.
>
> ERRORS
>
> ENOENT No pages were found that require moving. All pages are either already
> on the target node, not present, had an invalid address or could not
> be moved because they were mapped by multiple processes.
Thanks for pointing this out. Yes, the man page says so, but the code
doesn't do so at all. I dig into the very first commit
742755a1d8ce2b548428f7aacf1758b4bba50080 ("[PATCH] page migration:
sys_move_pages(): support moving of individual pages"), however, it
didn't return -ENOENT if the pages are already on the target node if I
read the code correctly.
Added Chris in this thread who is the original author of this API.
>
>
> thanks,
> --
> John Hubbard
> NVIDIA
next prev parent reply other threads:[~2019-12-05 0:43 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-12-04 19:01 Felix Abecassis
2019-12-04 19:21 ` John Hubbard
2019-12-04 20:04 ` Yang Shi
2019-12-04 23:45 ` John Hubbard
2019-12-05 0:43 ` Yang Shi [this message]
2019-12-04 20:17 ` Yang Shi
2019-12-04 22:54 ` Felix Abecassis
2019-12-04 23:37 ` Yang Shi
2019-12-05 0:03 ` John Hubbard
2019-12-05 0:40 ` Yang Shi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAHbLzkprcO50g-EqL3t0-A3fGnUU0A6rps+4q8TYM+eqeKiaAw@mail.gmail.com \
--to=shy828301@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=cl@linux.com \
--cc=fabecassis@nvidia.com \
--cc=jhubbard@nvidia.com \
--cc=linux-mm@kvack.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox