From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-vk0-f71.google.com (mail-vk0-f71.google.com [209.85.213.71]) by kanga.kvack.org (Postfix) with ESMTP id C2BDD6B0005 for ; Tue, 17 Apr 2018 10:28:34 -0400 (EDT) Received: by mail-vk0-f71.google.com with SMTP id z78so284518vkd.14 for ; Tue, 17 Apr 2018 07:28:34 -0700 (PDT) Received: from mail-sor-f41.google.com (mail-sor-f41.google.com. [209.85.220.41]) by mx.google.com with SMTPS id 3sor3065813uah.141.2018.04.17.07.28.33 for (Google Transport Security); Tue, 17 Apr 2018 07:28:33 -0700 (PDT) MIME-Version: 1.0 In-Reply-To: <20180417141442.GG17484@dhcp22.suse.cz> References: <20180417110615.16043-1-liwang@redhat.com> <20180417130300.GF17484@dhcp22.suse.cz> <20180417141442.GG17484@dhcp22.suse.cz> From: Li Wang Date: Tue, 17 Apr 2018 22:28:33 +0800 Message-ID: Subject: Re: [RFC PATCH] mm: correct status code which move_pages() returns for zero page Content-Type: multipart/alternative; boundary="089e08240fd0390290056a0c273b" Sender: owner-linux-mm@kvack.org List-ID: To: Michal Hocko Cc: linux-mm@kvack.org, ltp@lists.linux.it, "Kirill A . Shutemov" , Zi Yan --089e08240fd0390290056a0c273b Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Tue, Apr 17, 2018 at 10:14 PM, Michal Hocko wrote: > On Tue 17-04-18 15:03:00, Michal Hocko wrote: > > On Tue 17-04-18 19:06:15, Li Wang wrote: > > [...] > > > diff --git a/mm/migrate.c b/mm/migrate.c > > > index f65dd69..2b315fc 100644 > > > --- a/mm/migrate.c > > > +++ b/mm/migrate.c > > > @@ -1608,7 +1608,7 @@ static int do_pages_move(struct mm_struct *mm, > nodemask_t task_nodes, > > > continue; > > > > > > err =3D store_status(status, i, err, 1); > > > - if (err) > > > + if (!err) > > > goto out_flush; > > > > This change just doesn't make any sense to me. Why should we bail out i= f > > the store_status is successul? I am trying to wrap my head around the > > test case. 6b9d757ecafc ("mm, numa: rework do_pages_move") tried to > > explain that move_pages has some semantic issues and the new > > implementation might be not 100% replacement. Anyway I am studying the > > test case to come up with a proper fix. > > OK, I get what the test cases does. I've failed to see the subtle > difference between alloc_pages_on_node and numa_alloc_onnode. The later > doesn't faul in anything. > > Why are we getting EPERM is quite not yet clear to me. > add_page_for_migration uses FOLL_DUMP which should return EFAULT on > zero pages (no_page_table()). > > err =3D PTR_ERR(page); > if (IS_ERR(page)) > goto out; > > therefore bails out from add_page_for_migration and store_status should > store that value. There shouldn't be any EPERM on the way. > Yes, I print the the return value and confirmed the add_page_for_migration()=E2=80=8B do right things for zero page. and after store_status(...) the status saves -EFAULT. So I did the change above. > > Let me try to reproduce and see what is going on. Btw. which kernel do > you try this on? > =E2=80=8BThe latest mainline kernel-4.17-rc1. --=20 Li Wang liwang@redhat.com --089e08240fd0390290056a0c273b Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable


On Tue, Apr 17, 2018 at 10:14 PM, Michal Hocko <mhocko@suse.com= > wrote:
O= n Tue 17-04-18 15:03:00, Michal Hocko wrote:
> On Tue 17-04-18 19:06:15, Li Wang wrote:
> [...]
> > diff --git a/mm/migrate.c b/mm/migrate.c
> > index f65dd69..2b315fc 100644
> > --- a/mm/migrate.c
> > +++ b/mm/migrate.c
> > @@ -1608,7 +1608,7 @@ static int do_pages_move(struct mm_struct *= mm, nodemask_t task_nodes,
> >=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0continue;
> >=C2=A0
> >=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0err =3D store_stat= us(status, i, err, 1);
> > -=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0if (err)
> > +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0if (!err)
> >=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0goto out_flush;
>
> This change just doesn't make any sense to me. Why should we bail = out if
> the store_status is successul? I am trying to wrap my head around the<= br> > test case. 6b9d757ecafc ("mm, numa: rework do_pages_move") t= ried to
> explain that move_pages has some semantic issues and the new
> implementation might be not 100% replacement. Anyway I am studying the=
> test case to come up with a proper fix.

OK, I get what the test cases does. I've failed to see the subtl= e
difference between alloc_pages_on_node and numa_alloc_onnode. The later
doesn't faul in anything.

Why are we getting EPERM is quite not yet clear to me.
add_page_for_migration uses FOLL_DUMP which should return EFAULT on
zero pages (no_page_table()).

=C2=A0 =C2=A0 =C2=A0 =C2=A0 err =3D PTR_ERR(page);
=C2=A0 =C2=A0 =C2=A0 =C2=A0 if (IS_ERR(page))
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 goto out;

therefore bails out from add_page_for_migration and store_status should
store that value. There shouldn't be any EPERM on the way.

Yes, I print the the return value and confirmed = the add_page_for_migration()=E2=80=8B
do right things for zero page= . and after store_status(...) the status saves -EFAULT.
So I did th= e change above.

=C2=A0

Let me try to reproduce and see what is going on. Btw. which kernel do
you try this on?

<= /div>
=E2=80=8BThe latest mainline kernel-4.17-rc1.



--
--089e08240fd0390290056a0c273b--