From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <079335ab-190f-41f7-b832-6ffe7528fd8b@collabora.com>
Date: Wed, 10 Jan 2024 15:15:34 +0500
From: Muhammad Usama Anjum
To: Jiaqi Yan, Sidhartha Kumar
Cc: Muhammad Usama Anjum, linmiaohe@huawei.com, mike.kravetz@oracle.com,
 naoya.horiguchi@nec.com, akpm@linux-foundation.org,
 songmuchun@bytedance.com, shy828301@gmail.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, jthoughton@google.com,
 "kernel@collabora.com", "Matthew Wilcox (Oracle)", Mike Kravetz,
 Muchun Song
Subject: Re: [PATCH v4 4/4] selftests/mm: add tests for HWPOISON hugetlbfs read
References: <20230713001833.3778937-1-jiaqiyan@google.com>
 <20230713001833.3778937-5-jiaqiyan@google.com>
Content-Type: text/plain; charset=UTF-8

On 1/10/24 11:49 AM, Muhammad Usama Anjum wrote:
> On 1/6/24 2:13 AM, Jiaqi Yan wrote:
>> On Thu, Jan 4, 2024 at 10:27 PM Muhammad Usama Anjum wrote:
>>>
>>> Hi,
>>>
>>> I'm trying to convert this test to TAP as I think the failures sometimes go
>>> unnoticed on CI systems if we only depend on the return value of the
>>> application. I've enabled the following configurations which aren't already
>>> present in tools/testing/selftests/mm/config:
>>> CONFIG_MEMORY_FAILURE=y
>>> CONFIG_HWPOISON_INJECT=m
>>>
>>> I'll send a patch to add these configs later. Right now I'm trying to
>>> investigate the failure when we are trying to inject the poison page by
>>> madvise(MADV_HWPOISON). I'm getting "Device or resource busy" every single
>>> time. The test fails as it doesn't expect the hugetlb memory to be busy.
>>> I'm not sure if the poison handling code has issues or the test isn't
>>> robust enough.
>>>
>>> ./hugetlb-read-hwpoison
>>> Write/read chunk size=0x800
>>> ... HugeTLB read regression test...
>>> ... ... expect to read 0x200000 bytes of data in total
>>> ... ... actually read 0x200000 bytes of data in total
>>> ... HugeTLB read regression test...TEST_PASSED
>>> ... HugeTLB read HWPOISON test...
>>> [ 9.280854] Injecting memory failure for pfn 0x102f01 at process virtual
>>> address 0x7f28ec101000
>>> [ 9.282029] Memory failure: 0x102f01: huge page still referenced by 511
>>> users
>>> [ 9.282987] Memory failure: 0x102f01: recovery action for huge page: Failed
>>> ... !!! MADV_HWPOISON failed: Device or resource busy
>>> ... HugeTLB read HWPOISON test...TEST_FAILED
>>>
>>> I'm testing on v6.7-rc8. Not sure if this was working previously or not.
>>
>> Thanks for reporting this, Usama!
>>
>> I am also able to repro MADV_HWPOISON failure at "501a06fe8e4c
>> (akpm/mm-stable, mm-stable) zswap: memcontrol: implement zswap
>> writeback disabling."
>>
>> Then I checked out the earliest commit "ba91e7e5d15a (HEAD -> Base)
>> selftests/mm: add tests for HWPOISON hugetlbfs read". The
>> MADV_HWPOISON injection works and the test passes:
>>
>> ... HugeTLB read HWPOISON test...
>> ... ... expect to read 0x101000 bytes of data in total
>> ... !!! read failed: Input/output error
>> ... ... actually read 0x101000 bytes of data in total
>> ... HugeTLB read HWPOISON test...TEST_PASSED
>> ... HugeTLB seek then read HWPOISON test...
>> ... ... init val=4 with offset=0x102000
>> ... ... expect to read 0xfe000 bytes of data in total
>> ... ... actually read 0xfe000 bytes of data in total
>> ... HugeTLB seek then read HWPOISON test...TEST_PASSED
>> ...
>>
>> [ 2109.209225] Injecting memory failure for pfn 0x3190d01 at process
>> virtual address 0x7f75e3101000
>> [ 2109.209438] Memory failure: 0x3190d01: recovery action for huge
>> page: Recovered
>> ...
>>
>> I think something in between broke MADV_HWPOISON on hugetlbfs, and we
>> should be able to figure it out via bisection (and of course by
>> reading delta commits between them, probably related to page
>> refcount).
> Thank you for this information.
>
>>
>> That being said, I will be on vacation from tomorrow until the end of
>> next week. So I will get back to this after next weekend. Meanwhile if
>> you want to go ahead and bisect the problematic commit, that will be
>> very much appreciated.
> I'll try to bisect and post here if I find something.

Found the culprit commit by bisection:

a08c7193e4f18dc8508f2d07d0de2c5b94cb39a3
mm/filemap: remove hugetlb special casing in filemap.c

hugetlb-read-hwpoison started failing with this patch. I've added the author
of this patch to this bug report.

>
>>
>> Thanks,
>> Jiaqi
>>
>>
>>>
>>> Regards,
>>> Usama
>>>
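
For reference, the failing step and the TAP-style reporting discussed above
can be sketched roughly as follows. This is only an illustration, not code
from the selftest: the names inject_outcome, classify_inject(),
format_tap_line() and inject_and_report() are hypothetical, and the sketch
does not use the kselftest helpers. It assumes that EPERM (no CAP_SYS_ADMIN)
and EINVAL (kernel built without CONFIG_MEMORY_FAILURE) mean the environment
cannot run the test, while any other error, including the EBUSY seen in the
log, is a real failure.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MADV_HWPOISON
#define MADV_HWPOISON 100	/* value from the Linux UAPI headers */
#endif

/* Hypothetical outcome codes, not taken from the selftest. */
enum inject_outcome { INJECT_OK, INJECT_SKIP, INJECT_FAIL };

/*
 * Map a madvise(MADV_HWPOISON) return value and errno to an outcome.
 * Assumption: EPERM and EINVAL mean the test cannot run here; every
 * other error, EBUSY included, counts as a failure.
 */
enum inject_outcome classify_inject(int ret, int err)
{
	if (ret == 0)
		return INJECT_OK;
	if (err == EPERM || err == EINVAL)
		return INJECT_SKIP;
	return INJECT_FAIL;
}

/* Format one TAP result line into buf; returns snprintf()'s result. */
int format_tap_line(char *buf, size_t len, unsigned int num,
		    enum inject_outcome outcome, int err)
{
	const char *name = "MADV_HWPOISON injection";

	switch (outcome) {
	case INJECT_OK:
		return snprintf(buf, len, "ok %u %s", num, name);
	case INJECT_SKIP:
		return snprintf(buf, len, "ok %u %s # SKIP %s",
				num, name, strerror(err));
	default:
		return snprintf(buf, len, "not ok %u %s: %s",
				num, name, strerror(err));
	}
}

/* Poison len bytes at addr and print the TAP line for test number num. */
enum inject_outcome inject_and_report(void *addr, size_t len,
				      unsigned int num)
{
	char line[160];
	enum inject_outcome outcome;
	int ret, err;

	assert(addr != NULL);
	ret = madvise(addr, len, MADV_HWPOISON);
	err = ret ? errno : 0;	/* errno is only meaningful on failure */
	outcome = classify_inject(ret, err);
	format_tap_line(line, sizeof(line), num, outcome, err);
	puts(line);
	return outcome;
}
```

Note that inject_and_report() really poisons the page when it succeeds, so it
should only be called as root on a test machine, on memory the test owns.
With the failure above it would print a "not ok" line carrying the EBUSY
message, so the result is visible in the output and not only in the exit code.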