Message-ID: <786c83e0-d69f-4fa3-a39c-94c4dfc08a20@arm.com>
Date: Mon, 30 Jun 2025 13:25:14 +0530
Subject: Re: [PATCH] khugepaged: Reduce race probability between migration and khugepaged
To: Dev Jain, akpm@linux-foundation.org, david@redhat.com
Cc: ziy@nvidia.com, baolin.wang@linux.alibaba.com, lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com, npache@redhat.com, ryan.roberts@arm.com, baohua@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20250630044837.4675-1-dev.jain@arm.com>
From: Anshuman Khandual
In-Reply-To: <20250630044837.4675-1-dev.jain@arm.com>

On 30/06/25 10:18 AM, Dev Jain wrote:
> Suppose a folio is under migration, and khugepaged is also trying to
> collapse it. collapse_pte_mapped_thp() will retrieve the folio from the
> page cache via filemap_lock_folio(), thus taking a reference on the folio
> and sleeping on the folio lock, since the lock is held by the migration
> path. Migration will then fail in
> __folio_migrate_mapping -> folio_ref_freeze.
> Reduce the probability of
> such a race happening (leading to migration failure) by bailing out
> if we detect a PMD is marked with a migration entry.

Could the migration be re-attempted after such a failure? It seems the
migration failure here is being traded for a scan failure instead.

>
> This fixes the migration-shared-anon-thp testcase failure on Apple M3.

Could you please provide some more context on why this test case was
failing earlier, and how this change fixes the problem?

>
> Note that, this is not a "fix" since it only reduces the chance of
> interference of khugepaged with migration, wherein both the kernel
> functionalities are deemed "best-effort".
>
> Signed-off-by: Dev Jain
> ---
>
> This patch was part of
> https://lore.kernel.org/all/20250625055806.82645-1-dev.jain@arm.com/
> but I have sent it separately on suggestion of Lorenzo, and also because
> I plan to send the first two patches after David Hildenbrand's
> folio_pte_batch series gets merged.
>
>  mm/khugepaged.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 1aa7ca67c756..99977bb9bf6a 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -31,6 +31,7 @@ enum scan_result {
>  	SCAN_FAIL,
>  	SCAN_SUCCEED,
>  	SCAN_PMD_NULL,
> +	SCAN_PMD_MIGRATION,
>  	SCAN_PMD_NONE,
>  	SCAN_PMD_MAPPED,
>  	SCAN_EXCEED_NONE_PTE,
> @@ -941,6 +942,8 @@ static inline int check_pmd_state(pmd_t *pmd)
>  
>  	if (pmd_none(pmde))
>  		return SCAN_PMD_NONE;
> +	if (is_pmd_migration_entry(pmde))
> +		return SCAN_PMD_MIGRATION;
>  	if (!pmd_present(pmde))
>  		return SCAN_PMD_NULL;
>  	if (pmd_trans_huge(pmde))
> @@ -1502,9 +1505,12 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
>  	    !range_in_vma(vma, haddr, haddr + HPAGE_PMD_SIZE))
>  		return SCAN_VMA_CHECK;
>  
> -	/* Fast check before locking page if already PMD-mapped */
> +	/*
> +	 * Fast check before locking folio if already PMD-mapped, or if the
> +	 * folio is under migration
> +	 */
>  	result = find_pmd_or_thp_or_none(mm, haddr, &pmd);
> -	if (result == SCAN_PMD_MAPPED)
> +	if (result == SCAN_PMD_MAPPED || result == SCAN_PMD_MIGRATION)

Should a mapped PMD and a migrating PMD be treated equally while scanning?

>  		return result;
>  
>  	/*
> @@ -2716,6 +2722,7 @@ static int madvise_collapse_errno(enum scan_result r)
>  	case SCAN_PAGE_LRU:
>  	case SCAN_DEL_PAGE_LRU:
>  	case SCAN_PAGE_FILLED:
> +	case SCAN_PMD_MIGRATION:
>  		return -EAGAIN;
>  	/*
>  	 * Other: Trying again likely not to succeed / error intrinsic to
> @@ -2802,6 +2809,7 @@ int madvise_collapse(struct vm_area_struct *vma, unsigned long start,
>  			goto handle_result;
>  		/* Whitelisted set of results where continuing OK */
>  		case SCAN_PMD_NULL:
> +		case SCAN_PMD_MIGRATION:
>  		case SCAN_PTE_NON_PRESENT:
>  		case SCAN_PTE_UFFD_WP:
>  		case SCAN_PAGE_RO:
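
For context on the failure mode in the commit message, here is a minimal userspace sketch of why an extra folio reference makes the freeze step fail. It is only a toy model under stated assumptions: struct toy_folio, toy_folio_get(), toy_ref_freeze() and toy_migration_would_succeed() are hypothetical names, the baseline count of 2 is arbitrary, and the real folio_ref_freeze() and its callers do considerably more than this.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a folio: only the reference count is modelled. */
struct toy_folio {
	atomic_int refcount;
};

/* Take one extra reference, as a page cache lookup would in this model. */
static void toy_folio_get(struct toy_folio *f)
{
	atomic_fetch_add(&f->refcount, 1);
}

/*
 * Freeze the count: succeeds only if the current count equals 'expected',
 * in which case it is atomically replaced by 0. Any unexpected extra
 * reference makes the compare-and-exchange fail.
 */
static bool toy_ref_freeze(struct toy_folio *f, int expected)
{
	return atomic_compare_exchange_strong(&f->refcount, &expected, 0);
}

/*
 * Model one migration attempt while 'extra_refs' references are held by
 * another path (for example, a collapse path sleeping on the folio lock).
 */
static bool toy_migration_would_succeed(int extra_refs)
{
	const int expected = 2;	/* arbitrary baseline count for this model */
	struct toy_folio f;

	atomic_init(&f.refcount, expected);
	for (int i = 0; i < extra_refs; i++)
		toy_folio_get(&f);

	return toy_ref_freeze(&f, expected);
}
```

In this model, toy_migration_would_succeed(0) returns true and toy_migration_would_succeed(1) returns false. The second case corresponds to the reference taken through filemap_lock_folio() in the description above, which the proposed early bail-out tries to avoid taking in the first place.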