linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: "Ryan Roberts" <ryan.roberts@arm.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Shuah Khan" <shuah@kernel.org>,
	"Jérôme Glisse" <jglisse@redhat.com>,
	"Mark Brown" <broonie@kernel.org>,
	"John Hubbard" <jhubbard@nvidia.com>,
	"Florent Revest" <revest@chromium.org>,
	"Peter Xu" <peterx@redhat.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-kselftest@vger.kernel.org
Subject: Re: [PATCH v2 6/8] selftests/mm: Make migration test robust to failure
Date: Tue, 18 Jul 2023 13:23:36 +0200	[thread overview]
Message-ID: <1314fe0e-dd32-bf10-0a33-2b571dad70bd@redhat.com> (raw)
In-Reply-To: <2e5baba4-c8ef-9da4-a2d6-3cf383ed05bd@arm.com>

On 18.07.23 12:49, Ryan Roberts wrote:
> On 17/07/2023 18:40, David Hildenbrand wrote:
>> On 17.07.23 12:31, Ryan Roberts wrote:
>>> The `migration` test currently has a number of robustness problems that
>>> cause it to hang and leak resources.
>>>
>>> Timeout: There are 3 tests, which each previously ran for 60 seconds.
>>> However, the timeout in mm/settings for a single test binary was set to
>>> 45 seconds. So when run using run_kselftest.sh, the top level timeout
>>> would trigger before the test binary was finished. Solve this by meeting
>>> in the middle; each of the 3 tests now runs for 20 seconds (for a total
>>> of 60), and the top level timeout is set to 90 seconds.
>>>
>>> Leaking child processes: the `shared_anon` test fork()s some children
>>> but then an ASSERT() fires before the test kills those children. The
>>> assert causes immediate exit of the parent and leaking of the children.
>>> Furthermore, if run using the run_kselftest.sh wrapper, the wrapper
>>> would get stuck waiting for those children to exit, which never happens.
>>> Solve this by deferring any asserts until after the children are killed.
>>> The same pattern is used for the threaded tests for uniformity.
>>>
>>> With these changes, the test binary now runs to completion on arm64,
>>> with 2 tests passing and the `shared_anon` test failing.
>>>
>>> Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
>>> ---
>>>    tools/testing/selftests/mm/migration.c | 14 ++++++++++----
>>>    tools/testing/selftests/mm/settings    |  2 +-
>>>    2 files changed, 11 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/tools/testing/selftests/mm/migration.c
>>> b/tools/testing/selftests/mm/migration.c
>>> index 379581567f27..189d7d9070e8 100644
>>> --- a/tools/testing/selftests/mm/migration.c
>>> +++ b/tools/testing/selftests/mm/migration.c
>>> @@ -15,7 +15,7 @@
>>>    #include <time.h>
>>>      #define TWOMEG (2<<20)
>>> -#define RUNTIME (60)
>>> +#define RUNTIME (20)
>>>      #define ALIGN(x, a) (((x) + (a - 1)) & (~((a) - 1)))
>>>    @@ -118,6 +118,7 @@ TEST_F_TIMEOUT(migration, private_anon, 2*RUNTIME)
>>>    {
>>>        uint64_t *ptr;
>>>        int i;
>>> +    int ret;
>>>          if (self->nthreads < 2 || self->n1 < 0 || self->n2 < 0)
>>>            SKIP(return, "Not enough threads or NUMA nodes available");
>>> @@ -131,9 +132,10 @@ TEST_F_TIMEOUT(migration, private_anon, 2*RUNTIME)
>>>            if (pthread_create(&self->threads[i], NULL, access_mem, ptr))
>>>                perror("Couldn't create thread");
>>>    -    ASSERT_EQ(migrate(ptr, self->n1, self->n2), 0);
>>> +    ret = migrate(ptr, self->n1, self->n2);
>>>        for (i = 0; i < self->nthreads - 1; i++)
>>>            ASSERT_EQ(pthread_cancel(self->threads[i]), 0);
>>> +    ASSERT_EQ(ret, 0);
>>
>> Why is that required? This does not involve fork.
> 
> It's not required. I was just trying to keep everything aligned to the same pattern.
> 
>>
>>>    }
>>>      /*
>>> @@ -144,6 +146,7 @@ TEST_F_TIMEOUT(migration, shared_anon, 2*RUNTIME)
>>>        pid_t pid;
>>>        uint64_t *ptr;
>>>        int i;
>>> +    int ret;
>>>          if (self->nthreads < 2 || self->n1 < 0 || self->n2 < 0)
>>>            SKIP(return, "Not enough threads or NUMA nodes available");
>>> @@ -161,9 +164,10 @@ TEST_F_TIMEOUT(migration, shared_anon, 2*RUNTIME)
>>>                self->pids[i] = pid;
>>>        }
>>>    -    ASSERT_EQ(migrate(ptr, self->n1, self->n2), 0);
>>> +    ret = migrate(ptr, self->n1, self->n2);
>>>        for (i = 0; i < self->nthreads - 1; i++)
>>>            ASSERT_EQ(kill(self->pids[i], SIGTERM), 0);
>>> +    ASSERT_EQ(ret, 0);
>>
>>
>> Might be cleaner to also:
> 
> Or instead of? I agree this is neater, so will undo the moving of the ASSERT()
> and rely on this prctl.

I was thinking about possible races when our parent process already 
quits before our child managed to set the prctl. prctl() won't do 
anything in that case, hmmmm.

But similarly, existing code might already trigger the migrate() + kill 
before the child processes even started to access_mem().

Racy :)

-- 
Cheers,

David / dhildenb



  reply	other threads:[~2023-07-18 11:23 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-07-17 10:31 [PATCH v2 0/8] selftests/mm fixes for arm64 Ryan Roberts
2023-07-17 10:31 ` [PATCH v2 1/8] selftests: Line buffer test program's stdout Ryan Roberts
2023-07-17 17:22   ` Mark Brown
2023-07-17 10:31 ` [PATCH v2 2/8] selftests/mm: Skip soft-dirty tests on arm64 Ryan Roberts
2023-07-17 17:20   ` David Hildenbrand
2023-07-17 10:31 ` [PATCH v2 3/8] selftests/mm: Enable mrelease_test for arm64 Ryan Roberts
2023-07-17 10:31 ` [PATCH v2 4/8] selftests/mm: Fix thuge-gen test bugs Ryan Roberts
2023-07-17 17:22   ` David Hildenbrand
2023-07-17 10:31 ` [PATCH v2 5/8] selftests/mm: va_high_addr_switch should skip unsupported arm64 configs Ryan Roberts
2023-07-17 10:31 ` [PATCH v2 6/8] selftests/mm: Make migration test robust to failure Ryan Roberts
2023-07-17 17:40   ` David Hildenbrand
2023-07-18 10:49     ` Ryan Roberts
2023-07-18 11:23       ` David Hildenbrand [this message]
2023-07-18 11:24         ` David Hildenbrand
2023-07-18 12:42           ` Ryan Roberts
2023-07-18 15:24             ` David Hildenbrand
2023-07-17 10:31 ` [PATCH v2 7/8] selftests/mm: Optionally pass duration to transhuge-stress Ryan Roberts
2023-07-17 17:24   ` David Hildenbrand
2023-07-17 10:31 ` [PATCH v2 8/8] selftests/mm: Run all tests from run_vmtests.sh Ryan Roberts
2023-07-17 17:27   ` David Hildenbrand
2023-07-19 20:45     ` Peter Xu
2023-07-20  8:14       ` Ryan Roberts

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1314fe0e-dd32-bf10-0a33-2b571dad70bd@redhat.com \
    --to=david@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=broonie@kernel.org \
    --cc=jglisse@redhat.com \
    --cc=jhubbard@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=peterx@redhat.com \
    --cc=revest@chromium.org \
    --cc=ryan.roberts@arm.com \
    --cc=shuah@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox