From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <9de42ace-dab1-5f60-af8a-26045ada27b9@arm.com>
Date: Mon, 7 Aug 2023 10:15:03 +0100
Subject: Re: [PATCH 2/2] selftests/migration: Disable NUMA balancing and check migration status
To: Alistair Popple, Andrew Morton
Cc: linux-mm@kvack.org, mhocko@suse.com, jhubbard@nvidia.com, ying.huang@intel.com, osalvador@suse.de, baolin.wang@linux.alibaba.com, ziy@nvidia.com, shy828301@gmail.com
References: <20230807063945.911582-1-apopple@nvidia.com> <20230807063945.911582-2-apopple@nvidia.com>
From: Ryan Roberts
In-Reply-To: <20230807063945.911582-2-apopple@nvidia.com>

On 07/08/2023 07:39, Alistair Popple wrote:
> The migration selftest was only checking the return code and not the
> status array for migration success/failure. Update the test to check
> both. This uncovered a bug in the return code handling of
> do_pages_move().
>
> Also disable NUMA balancing as that can lead to unexpected migration
> failures.
>
> Signed-off-by: Alistair Popple
> Suggested-by: Ryan Roberts
> ---
>
> Ryan, this will still cause the test to fail if a migration failed.
> I was unable to reproduce a migration failure for any cases on my system
> once I disabled NUMA balancing though so I'd be curious if you are
> still seeing failures with this patch applied. AFAIK there shouldn't
> be anything else that would be causing migration failure so would like
> to know what is causing failures. Thanks!

Hi Alistair,

Afraid I'm still seeing unmigrated pages when running with these 2 patches:

# RUN migration.shared_anon ...
Didn't migrate 1 pages
# migration.c:183:shared_anon:Expected migrate(ptr, self->n1, self->n2) (-2) == 0 (0)
# shared_anon: Test terminated by assertion
# FAIL migration.shared_anon
not ok 2 migration.shared_anon

I added some instrumentation; it usually fails on the second time through the
loop in migrate() but I've also seen it fail the first time. I've never seen it
get through 2 iterations successfully though.

I did also try just this patch without the error handling update in the kernel,
but it still fails in the same way.

I'm running on arm64 in case that wasn't clear. Let me know if there is anything
I can do to help debug.

Thanks,
Ryan

>
>  tools/testing/selftests/mm/migration.c | 18 +++++++++++++++++-
>  1 file changed, 17 insertions(+), 1 deletion(-)
>
> diff --git a/tools/testing/selftests/mm/migration.c b/tools/testing/selftests/mm/migration.c
> index 379581567f27..cf079af5799b 100644
> --- a/tools/testing/selftests/mm/migration.c
> +++ b/tools/testing/selftests/mm/migration.c
> @@ -51,6 +51,12 @@ FIXTURE_SETUP(migration)
>  	ASSERT_NE(self->threads, NULL);
>  	self->pids = malloc(self->nthreads * sizeof(*self->pids));
>  	ASSERT_NE(self->pids, NULL);
> +
> +	/*
> +	 * Disable NUMA balancing which can cause migration
> +	 * failures.
> +	 */
> +	numa_set_membind(numa_all_nodes_ptr);
>  };
>
>  FIXTURE_TEARDOWN(migration)
> @@ -62,13 +68,14 @@ FIXTURE_TEARDOWN(migration)
>  int migrate(uint64_t *ptr, int n1, int n2)
>  {
>  	int ret, tmp;
> -	int status = 0;
>  	struct timespec ts1, ts2;
>
>  	if (clock_gettime(CLOCK_MONOTONIC, &ts1))
>  		return -1;
>
>  	while (1) {
> +		int status = NUMA_NUM_NODES + 1;
> +
>  		if (clock_gettime(CLOCK_MONOTONIC, &ts2))
>  			return -1;
>
> @@ -85,6 +92,15 @@ int migrate(uint64_t *ptr, int n1, int n2)
>  			return -2;
>  		}
>
> +		/*
> +		 * Note we should never see this because move_pages() should
> +		 * have indicated a page couldn't migrate above.
> +		 */
> +		if (status < 0) {
> +			printf("Page didn't migrate, error %d\n", status);
> +			return -2;
> +		}
> +
>  		tmp = n2;
>  		n2 = n1;
>  		n1 = tmp;
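
For reference, the per-page status check discussed above can be sketched in
isolation. This is only a minimal sketch and not code from the patch:
count_unmigrated() and its target_node parameter are hypothetical names, and it
assumes the status array follows the move_pages(2) convention, where each entry
is either the node the page now resides on or a negative errno value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (not part of the patch): count the entries of a
 * move_pages()-style status array that did not end up on target_node.
 * Assumption: each entry is a node number (>= 0) or a negative errno.
 */
static size_t count_unmigrated(const int *status, size_t n, int target_node)
{
	size_t bad = 0;
	size_t i;

	assert(status != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (status[i] < 0) {
			/* The kernel reported a per-page error. */
			fprintf(stderr, "page %zu: error %d\n", i, status[i]);
			bad++;
		} else if (status[i] != target_node) {
			/* No error, but the page sits on another node. */
			fprintf(stderr, "page %zu: on node %d, expected %d\n",
				i, status[i], target_node);
			bad++;
		}
	}

	return bad;
}
```

A caller would treat a non-zero result as a failed migration even when the
move_pages() return code was 0, which is the case the selftest was previously
missing.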