From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <8d5c6392-1a9b-44a5-a1bf-413277260a3b@linux.ibm.com>
Date: Fri, 20 Dec 2024 10:00:25 +0530
Subject: Re: [PATCH] mm: migration :shared anonymous migration test is failing
From: Donet Tom
To: Baolin Wang, Andrew Morton, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: Ritesh Harjani, "Aneesh Kumar K . V", Zi Yan, David Hildenbrand,
 shuah Khan, Dev Jain
In-Reply-To: <4d76321e-7905-46e6-8105-f09afde516ff@linux.alibaba.com>
References: <20241219124717.4907-1-donettom@linux.ibm.com>
 <36f9ab13-e057-40a0-8d0b-9939df056fc6@linux.ibm.com>
 <4d76321e-7905-46e6-8105-f09afde516ff@linux.alibaba.com>
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit

On 12/20/24 09:02, Baolin Wang wrote:
>
>
> On 2024/12/20 11:12, Donet Tom wrote:
>>
>> On 12/20/24 08:01, Baolin Wang wrote:
>>>
>>>
>>> On 2024/12/19 20:47, Donet Tom wrote:
>>>> The migration selftest is currently failing for shared anonymous
>>>> mappings due to a race condition.
>>>>
>>>> During migration, the source folio's PTE is unmapped by nuking the
>>>> PTE, flushing the TLB, and then marking the page for migration
>>>> (by creating the swap entries).
>>>> The issue arises when, immediately
>>>> after the PTE is nuked and the TLB is flushed, but before the page
>>>> is marked for migration, another thread accesses the page. This
>>>> triggers a page fault, and the page fault handler invokes
>>>> do_pte_missing() instead of do_swap_page(), as the page is not yet
>>>> marked for migration.
>>>>
>>>> In the fault handling path, do_pte_missing() calls __do_fault()
>>>> -> shmem_fault() -> shmem_get_folio_gfp() -> filemap_get_entry().
>>>> This eventually calls folio_try_get(), incrementing the reference
>>>> count of the folio undergoing migration. The thread then blocks
>>>> on folio_lock(), as the migration path holds the lock. This
>>>> results in the migration failing in __migrate_folio(), which expects
>>>> the folio's reference count to be 2. However, the reference count is
>>>> incremented by the fault handler, leading to the failure.
>>>>
>>>> The issue arises because, after nuking the PTE and before marking the
>>>> page for migration, the page is accessed. To address this, we have
>>>> updated the logic to first nuke the PTE, then mark the page for
>>>> migration, and only then flush the TLB. With this patch, if the
>>>> page is accessed immediately after nuking the PTE, the TLB entry is
>>>> still valid, so no fault occurs. After marking the page for migration,
>>>
>>> IMO, I don't think this assumption is correct. At this point, the
>>> TLB entry might also be evicted, so a page fault could still occur.
>>> It's just a matter of probability.
>> In this patch, we mark the page for migration before flushing the TLB.
>> This ensures that if someone accesses the page after the TLB flush,
>> a page fault will occur and the page fault handler will wait for the
>> migration to complete, so the migration will not fail.
>>
>> Without this patch, if someone accesses the page after the TLB flush
>> but before it is marked for migration, the migration will fail.
>
> Actually my concern is the same as David's (I did not see David's
> reply before sending my comments), which is that your patch does not
> "rules out all cases".
>
>>> Additionally, IIUC, if another thread is accessing the shmem folio
>>> causing the migration to fail, I think this is expected, and
>>> migration failure is not a vital issue?
>>>
>> In my case, the shmem migration test is always failing,
>> even after retries. Would it be correct to consider this
>> as expected behavior?
>
> IMHO I think your test case is too aggressive and unlikely to occur in
> real-world scenarios. Additionally, as I mentioned, migration failure
> is not a vital issue in the system, and some temporary refcnt can also
> lead to migration failure if you want to create such test cases. So
> personally, I don't think it is worth doing.

Sure. Thank you Baolin.
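
For reference, the ordering argument in this thread can be sketched as a
small user-space model. This is not kernel code and is only an
illustration under stated assumptions: the names pte_state, step,
old_order, new_order and migration_succeeds() are hypothetical, the
expected reference count of 2 is taken from the patch description above,
and the fault path is reduced to "take one extra reference if the access
faults while the PTE is empty".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model, not kernel code. */
enum pte_state { PTE_PRESENT, PTE_NONE, PTE_MIGRATION };
enum step { STEP_CLEAR_PTE, STEP_FLUSH_TLB, STEP_SET_MIGRATION };

/* Reference count the migration path expects, per the patch description. */
#define EXPECTED_REFS 2

/* Current order: nuke PTE, flush TLB, install migration entry. */
static const enum step old_order[3] = {
	STEP_CLEAR_PTE, STEP_FLUSH_TLB, STEP_SET_MIGRATION
};

/* Proposed order: nuke PTE, install migration entry, flush TLB. */
static const enum step new_order[3] = {
	STEP_CLEAR_PTE, STEP_SET_MIGRATION, STEP_FLUSH_TLB
};

/*
 * Another thread touches the page once, after 'access_after' unmap steps
 * have completed (1..3). 'tlb_evicted' models the TLB entry being gone
 * before any explicit flush. Returns true if the reference count still
 * matches what the migration path expects.
 */
bool migration_succeeds(const enum step order[3], int access_after,
			bool tlb_evicted)
{
	enum pte_state pte = PTE_PRESENT;
	bool tlb_valid = !tlb_evicted;
	int refs = EXPECTED_REFS;

	assert(access_after >= 1 && access_after <= 3);

	for (int i = 0; i < 3; i++) {
		switch (order[i]) {
		case STEP_CLEAR_PTE:
			pte = PTE_NONE;
			break;
		case STEP_FLUSH_TLB:
			tlb_valid = false;
			break;
		case STEP_SET_MIGRATION:
			pte = PTE_MIGRATION;
			break;
		}

		/*
		 * A fault on an empty PTE stands in for the do_pte_missing()
		 * path, which takes an extra folio reference. A fault on a
		 * migration entry stands in for do_swap_page(), which waits
		 * and takes no extra reference in this model.
		 */
		if (i + 1 == access_after && !tlb_valid && pte == PTE_NONE)
			refs++;
	}

	return refs == EXPECTED_REFS;
}
```

In this model, migration_succeeds(old_order, 2, false) is false (the race
described in the patch), migration_succeeds(new_order, 2, false) is true
(the case the patch targets), and migration_succeeds(new_order, 1, true)
is false, which corresponds to the objection that an early TLB eviction
can still lead to a fault on the empty PTE.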