From: David Hildenbrand <david@redhat.com>
To: Alistair Popple <apopple@nvidia.com>,
Ryan Roberts <ryan.roberts@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, mhocko@suse.com, jhubbard@nvidia.com,
ying.huang@intel.com, osalvador@suse.de,
baolin.wang@linux.alibaba.com, ziy@nvidia.com,
shy828301@gmail.com
Subject: Re: [PATCH 2/2] selftests/migration: Disable NUMA balancing and check migration status
Date: Wed, 9 Aug 2023 11:34:47 +0200 [thread overview]
Message-ID: <f477c278-79bb-a685-96c3-3d50770e3268@redhat.com> (raw)
In-Reply-To: <87fs4sal9n.fsf@nvdebian.thelocal>
On 09.08.23 06:21, Alistair Popple wrote:
>
> Ryan Roberts <ryan.roberts@arm.com> writes:
>
>> On 07/08/2023 13:41, Alistair Popple wrote:
>>>
>>> Ryan Roberts <ryan.roberts@arm.com> writes:
>>>
>>>> On 07/08/2023 07:39, Alistair Popple wrote:
>>>>> The migration selftest was only checking the return code and not the
>>>>> status array for migration success/failure. Update the test to check
>>>>> both. This uncovered a bug in the return code handling of
>>>>> do_pages_move().
>>>>>
>>>>> Also disable NUMA balancing as that can lead to unexpected migration
>>>>> failures.
>>>>>
>>>>> Signed-off-by: Alistair Popple <apopple@nvidia.com>
>>>>> Suggested-by: Ryan Roberts <ryan.roberts@arm.com>
>>>>> ---
>>>>>
>>>>> Ryan, this will still cause the test to fail if a migration failed. I
>>>>> was unable to reproduce a migration failure for any cases on my system
>>>>> once I disabled NUMA balancing though so I'd be curious if you are
>>>>> still seeing failures with this patch applied. AFAIK there shouldn't
>>>>> be anything else that would be causing migration failure so would like
>>>>> to know what is causing failures. Thanks!
>>>>
>>>>
>>>> Hi Alistair,
>>>>
>>>> Afraid I'm still seeing unmigrated pages when running with these 2 patches:
>>>>
>>>>
>>>> # RUN migration.shared_anon ...
>>>> Didn't migrate 1 pages
>>>> # migration.c:183:shared_anon:Expected migrate(ptr, self->n1, self->n2) (-2) == 0 (0)
>>>> # shared_anon: Test terminated by assertion
>>>> # FAIL migration.shared_anon
>>>> not ok 2 migration.shared_anon
>>>>
>>>>
>>>> I added some instrumentation; it usually fails on the second time
>>>> through the loop in migrate() but I've also seen it fail the first
>>>> time. Never seen it get though 2 iterations successfully though.
>>>
>>> Interesting. I guess migration failure is always possible for various
>>> reasons so I will update the test to report the number of failed
>>> migrations rather than making it a test failure.
>>
>> I find it odd that migration always succeeds for the private_anon and
>> private_anon_thp cases, but fails for the fork case. I guess there is something
>> about the page being RO (for CoW) or having higher mapcount/refcount that causes
>> the migration to fail?
>
> But the fork case uses a shared mapping, so there shouldn't be any
> RO/CoW AFAIK. A higher refcount than expected would cause migration
> failure, but there's nothing I can think of that would be causing that
> now NUMA balancing is disabled in the test (that caused migration
> failures for me in the private cases due to the concurrent migration
> attempts).
Yes, we do have MAP_SHARED | MAP_ANONYMOUS, which is just shmem with
extra-steps.
In add_page_for_migration(), we do have
if (page_mapcount(page) > 1 && !migrate_all)
but the test seems to always specify MPOL_MF_MOVE_ALL when calling
move_pages(), so it should be ignored.
It would be helpful to dump the page when migration fails in that case.
Note that in add_page_for_migration(), we perform a follow_page() that
used to accidentally honor NUMA hinting faults. Fixed by
commit 3c471b04db7604974e3bea45e2d9c7c27a905536
Author: David Hildenbrand <david@redhat.com>
Date: Thu Aug 3 16:32:02 2023 +0200
mm/gup: reintroduce FOLL_NUMA as FOLL_HONOR_NUMA_FAULT
in mm-unstable.
Maybe that fix does no longer require you to disable NUMA hinting for
this test case. Maybe you're lucky and it resolves the shared_anon case
as well, but I somewhat doubt it :)
It doesn't really explain why the shared case would fail, though...
--
Cheers,
David / dhildenb
next prev parent reply other threads:[~2023-08-09 9:34 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-08-07 6:39 [PATCH 1/2] mm/migrate.c: Fix return code when migration fails Alistair Popple
2023-08-07 6:39 ` [PATCH 2/2] selftests/migration: Disable NUMA balancing and check migration status Alistair Popple
2023-08-07 9:15 ` Ryan Roberts
2023-08-07 12:41 ` Alistair Popple
2023-08-07 13:42 ` Ryan Roberts
2023-08-09 4:21 ` Alistair Popple
2023-08-09 9:34 ` David Hildenbrand [this message]
2023-08-09 13:39 ` Ryan Roberts
2023-08-10 8:23 ` Alistair Popple
2023-08-09 9:35 ` David Hildenbrand
2023-08-09 10:46 ` Alistair Popple
2023-08-07 8:39 ` [PATCH 1/2] mm/migrate.c: Fix return code when migration fails Michal Hocko
2023-08-07 12:31 ` Alistair Popple
2023-08-07 13:07 ` Michal Hocko
2023-08-09 4:10 ` Alistair Popple
2023-08-11 7:37 ` Huang, Ying
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=f477c278-79bb-a685-96c3-3d50770e3268@redhat.com \
--to=david@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=apopple@nvidia.com \
--cc=baolin.wang@linux.alibaba.com \
--cc=jhubbard@nvidia.com \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.com \
--cc=osalvador@suse.de \
--cc=ryan.roberts@arm.com \
--cc=shy828301@gmail.com \
--cc=ying.huang@intel.com \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox