Date: Thu, 1 Aug 2024 15:08:30 +0530
Subject: Re: Race condition observed between page migration and page fault
 handling on arm64 machines
To: David Hildenbrand, akpm@linux-foundation.org, willy@infradead.org
Cc: ryan.roberts@arm.com, anshuman.khandual@arm.com, catalin.marinas@arm.com,
 cl@gentwo.org, vbabka@suse.cz, mhocko@suse.com, apopple@nvidia.com,
 osalvador@suse.de, baolin.wang@linux.alibaba.com,
 dave.hansen@linux.intel.com, will@kernel.org, baohua@kernel.org,
 ioworker0@gmail.com, gshan@redhat.com, mark.rutland@arm.com,
 kirill.shutemov@linux.intel.com, hughd@google.com, aneesh.kumar@kernel.org,
 yang@os.amperecomputing.com, peterx@redhat.com, broonie@kernel.org,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org
References: <20240801081657.1386743-1-dev.jain@arm.com>
 <3b82e195-5871-4880-9ce5-d01bb751f471@redhat.com>
From: Dev Jain
In-Reply-To: <3b82e195-5871-4880-9ce5-d01bb751f471@redhat.com>

On 8/1/24 14:12, David Hildenbrand wrote:
> On 01.08.24 10:16, Dev Jain wrote:
>> I and Ryan had a discussion and we thought it would be best to get
>> feedback from the community.
>>
>> The migration mm selftest currently fails on arm64 for shared anon
>> mappings, due to the following race:
>
> Do you mean MAP_SHARED|MAP_ANON or MAP_PRIVATE|MAP_ANON_fork? Because
> you note shmem below, I assume you mean MAP_SHARED|MAP_ANON

Yes.

>
>>
>> Migration:                     Page fault:
>> try_to_migrate_one():          handle_pte_fault():
>> 1. Nuke the PTE                PTE has been deleted => do_pte_missing()
>> 2. Mark the PTE for migration  PTE has not been deleted but is just
>>                                not "present" => do_swap_page()
>>
>
> In filemap_fault_recheck_pte_none() we recheck under PTL to make sure
> that a temporary pte_none() really was persistent pte_none() and not a
> temporary pte_none() under PTL.
>
> Should we do something similar in do_fault()? I see that we already do
> something like that on the "!vma->vm_ops->fault" path.
>
> But of course, there is a tradeoff between letting migration
> (temporarily) fail and grabbing the PTL during page faults.

To dampen the tradeoff, we could do this in shmem_fault() instead? But
then, this would mean that we do this in all kinds of vma->vm_ops->fault,
only when we discover another reference count race condition :) Doing
this in do_fault() should solve this once and for all. In fact,
do_pte_missing() may call do_anonymous_page() or do_fault(), and I just
noticed that the former already checks this using vmf_pte_changed().
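
To make the recheck idea concrete, here is a small userspace toy model of
the window described above. It is not kernel code and not a proposed patch:
every toy_* name is made up for illustration, a pthread mutex stands in for
the PTL, and an integer stands in for the PTE. The only point it tries to
show is that a fault path which saw "none" without the lock can take the
lock, look again, and back off if the entry has meanwhile become a
migration marker.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Toy userspace model, NOT kernel code. Every toy_* name is hypothetical. */
enum toy_pte_state { TOY_PTE_NONE, TOY_PTE_PRESENT, TOY_PTE_MIGRATION };
enum toy_action { TOY_NOTHING, TOY_DO_FAULT, TOY_DO_SWAP_PAGE, TOY_RETRY };

struct toy_pte {
    pthread_mutex_t ptl;  /* stands in for the page table lock */
    atomic_int state;     /* holds one of enum toy_pte_state */
};

static void toy_pte_init(struct toy_pte *pte, enum toy_pte_state initial)
{
    pthread_mutex_init(&pte->ptl, NULL);
    atomic_init(&pte->state, initial);
}

/* What the fault path sees when it peeks at the entry without the lock. */
static int toy_pte_read_lockless(struct toy_pte *pte)
{
    return atomic_load(&pte->state);
}

/* Migration step 1: take the lock and clear the entry. */
static void toy_migrate_begin(struct toy_pte *pte)
{
    pthread_mutex_lock(&pte->ptl);
    atomic_store(&pte->state, TOY_PTE_NONE);
}

/* Migration step 2: install the migration marker, then drop the lock. */
static void toy_migrate_finish(struct toy_pte *pte)
{
    assert(atomic_load(&pte->state) == TOY_PTE_NONE);
    atomic_store(&pte->state, TOY_PTE_MIGRATION);
    pthread_mutex_unlock(&pte->ptl);
}

/*
 * Fault-side decision based on an earlier lockless snapshot. A "none"
 * snapshot is only trusted after it has been confirmed under the lock.
 */
static enum toy_action toy_fault_decide(struct toy_pte *pte, int snapshot)
{
    enum toy_action action;

    if (snapshot == TOY_PTE_PRESENT)
        return TOY_NOTHING;
    if (snapshot == TOY_PTE_MIGRATION)
        return TOY_DO_SWAP_PAGE;

    pthread_mutex_lock(&pte->ptl);
    action = (atomic_load(&pte->state) == TOY_PTE_NONE) ? TOY_DO_FAULT
                                                        : TOY_RETRY;
    pthread_mutex_unlock(&pte->ptl);
    return action;
}
```

In this model, a snapshot taken between toy_migrate_begin() and
toy_migrate_finish() reads as "none", but the locked recheck turns that into
a retry instead of a missing-page fault. The cost is the one David mentions:
the lock is taken on the "none" path.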