From: Alistair Popple <apopple@nvidia.com>
To: Ryan Roberts <ryan.roberts@arm.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
linux-mm@kvack.org, mhocko@suse.com, jhubbard@nvidia.com,
ying.huang@intel.com, osalvador@suse.de,
baolin.wang@linux.alibaba.com, ziy@nvidia.com,
shy828301@gmail.com
Subject: Re: [PATCH 2/2] selftests/migration: Disable NUMA balancing and check migration status
Date: Mon, 07 Aug 2023 22:41:40 +1000 [thread overview]
Message-ID: <87leen2f4y.fsf@nvdebian.thelocal> (raw)
In-Reply-To: <9de42ace-dab1-5f60-af8a-26045ada27b9@arm.com>
Ryan Roberts <ryan.roberts@arm.com> writes:
> On 07/08/2023 07:39, Alistair Popple wrote:
>> The migration selftest was only checking the return code and not the
>> status array for migration success/failure. Update the test to check
>> both. This uncovered a bug in the return code handling of
>> do_pages_move().
>>
>> Also disable NUMA balancing as that can lead to unexpected migration
>> failures.
>>
>> Signed-off-by: Alistair Popple <apopple@nvidia.com>
>> Suggested-by: Ryan Roberts <ryan.roberts@arm.com>
>> ---
>>
>> Ryan, this will still cause the test to fail if a migration failed. I
>> was unable to reproduce a migration failure for any cases on my system
>> once I disabled NUMA balancing though so I'd be curious if you are
>> still seeing failures with this patch applied. AFAIK there shouldn't
>> be anything else that would be causing migration failure so would like
>> to know what is causing failures. Thanks!
>
>
> Hi Alistair,
>
> Afraid I'm still seeing unmigrated pages when running with these 2 patches:
>
>
> # RUN migration.shared_anon ...
> Didn't migrate 1 pages
> # migration.c:183:shared_anon:Expected migrate(ptr, self->n1, self->n2) (-2) == 0 (0)
> # shared_anon: Test terminated by assertion
> # FAIL migration.shared_anon
> not ok 2 migration.shared_anon
>
>
> I added some instrumentation; it usually fails on the second time
> through the loop in migrate() but I've also seen it fail the first
> time. Never seen it get though 2 iterations successfully though.
Interesting. I guess migration failure is always possible for various
reasons so I will update the test to report the number of failed
migrations rather than making it a test failure. I was mostly just
curious as to what would be causing the occasional failures for my own
understanding, but the failures themselves are unimportant.
> I did also try just this patch without the error handling update in the kernel, but it still fails in the same way.
>
> I'm running on arm64 in case that wasn't clear. Let me know if there is anything I can do to help debug.
Thanks! Unless you're concerned about the failures I am happy to ignore
them. Pages can fail to migrate for all sorts of reasons although I'm a
little suprised anonymous migrations are failing so frequently for you.
> Thanks,
> Ryan
>
>
>>
>> tools/testing/selftests/mm/migration.c | 18 +++++++++++++++++-
>> 1 file changed, 17 insertions(+), 1 deletion(-)
>>
>> diff --git a/tools/testing/selftests/mm/migration.c b/tools/testing/selftests/mm/migration.c
>> index 379581567f27..cf079af5799b 100644
>> --- a/tools/testing/selftests/mm/migration.c
>> +++ b/tools/testing/selftests/mm/migration.c
>> @@ -51,6 +51,12 @@ FIXTURE_SETUP(migration)
>> ASSERT_NE(self->threads, NULL);
>> self->pids = malloc(self->nthreads * sizeof(*self->pids));
>> ASSERT_NE(self->pids, NULL);
>> +
>> + /*
>> + * Disable NUMA balancing which can cause migration
>> + * failures.
>> + */
>> + numa_set_membind(numa_all_nodes_ptr);
>> };
>>
>> FIXTURE_TEARDOWN(migration)
>> @@ -62,13 +68,14 @@ FIXTURE_TEARDOWN(migration)
>> int migrate(uint64_t *ptr, int n1, int n2)
>> {
>> int ret, tmp;
>> - int status = 0;
>> struct timespec ts1, ts2;
>>
>> if (clock_gettime(CLOCK_MONOTONIC, &ts1))
>> return -1;
>>
>> while (1) {
>> + int status = NUMA_NUM_NODES + 1;
>> +
>> if (clock_gettime(CLOCK_MONOTONIC, &ts2))
>> return -1;
>>
>> @@ -85,6 +92,15 @@ int migrate(uint64_t *ptr, int n1, int n2)
>> return -2;
>> }
>>
>> + /*
>> + * Note we should never see this because move_pages() should
>> + * have indicated a page couldn't migrate above.
>> + */
>> + if (status < 0) {
>> + printf("Page didn't migrate, error %d\n", status);
>> + return -2;
>> + }
>> +
>> tmp = n2;
>> n2 = n1;
>> n1 = tmp;
next prev parent reply other threads:[~2023-08-07 12:47 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-08-07 6:39 [PATCH 1/2] mm/migrate.c: Fix return code when migration fails Alistair Popple
2023-08-07 6:39 ` [PATCH 2/2] selftests/migration: Disable NUMA balancing and check migration status Alistair Popple
2023-08-07 9:15 ` Ryan Roberts
2023-08-07 12:41 ` Alistair Popple [this message]
2023-08-07 13:42 ` Ryan Roberts
2023-08-09 4:21 ` Alistair Popple
2023-08-09 9:34 ` David Hildenbrand
2023-08-09 13:39 ` Ryan Roberts
2023-08-10 8:23 ` Alistair Popple
2023-08-09 9:35 ` David Hildenbrand
2023-08-09 10:46 ` Alistair Popple
2023-08-07 8:39 ` [PATCH 1/2] mm/migrate.c: Fix return code when migration fails Michal Hocko
2023-08-07 12:31 ` Alistair Popple
2023-08-07 13:07 ` Michal Hocko
2023-08-09 4:10 ` Alistair Popple
2023-08-11 7:37 ` Huang, Ying
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87leen2f4y.fsf@nvdebian.thelocal \
--to=apopple@nvidia.com \
--cc=akpm@linux-foundation.org \
--cc=baolin.wang@linux.alibaba.com \
--cc=jhubbard@nvidia.com \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.com \
--cc=osalvador@suse.de \
--cc=ryan.roberts@arm.com \
--cc=shy828301@gmail.com \
--cc=ying.huang@intel.com \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox