From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id A08C9C54E67 for ; Tue, 26 Mar 2024 17:27:46 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 303116B0085; Tue, 26 Mar 2024 13:27:46 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 2B1A56B0087; Tue, 26 Mar 2024 13:27:46 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1A01D6B0088; Tue, 26 Mar 2024 13:27:46 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 0AE606B0085 for ; Tue, 26 Mar 2024 13:27:46 -0400 (EDT) Received: from smtpin22.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id C7883C0CC1 for ; Tue, 26 Mar 2024 17:27:45 +0000 (UTC) X-FDA: 81939872490.22.3DD451B Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by imf06.hostedemail.com (Postfix) with ESMTP id B2075180007 for ; Tue, 26 Mar 2024 17:27:43 +0000 (UTC) Authentication-Results: imf06.hostedemail.com; dkim=none; spf=pass (imf06.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com; dmarc=pass (policy=none) header.from=arm.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1711474064; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=kwPpcBjQsEB5o4fq0p2YPXvBHLDIJewzYc9VQ3lUgps=; b=XlC4bqQQH+8sa+/XIoGqf7+mEgZhdnvwaaG0Re+ekzQIALqHCeSvEtmUxRPxNVrrnCZny5 csllTzjuRqOUHYN+eMCSrbwEVQGqgcWspNNQWiwrJqxKlgor2bweskrRgNu0+seO3YBmql vRIbW+lNMdiArbb5CJGg4Sun6YDFmho= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1711474064; a=rsa-sha256; cv=none; b=A0MeqXTwPhZ0XdiE5v51ZchwNrVSYD5GwzDVPZ3HLfGBmDrAL7ReTrKsb5kraeKzes3Kjv ruBuRdDjmUJzJW1ebGa5Wp+qj1K0xrxP4c3tSxmuHQ2sX3iy5vPiI7Ga4hw6IwWzGBMjZg CFdrPtj+LkcHB6AtRLrg4NrINGNt9gk= ARC-Authentication-Results: i=1; imf06.hostedemail.com; dkim=none; spf=pass (imf06.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com; dmarc=pass (policy=none) header.from=arm.com Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 5EB002F4; Tue, 26 Mar 2024 10:28:16 -0700 (PDT) Received: from [10.1.29.179] (XHFQ2J9959.cambridge.arm.com [10.1.29.179]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id C68693F64C; Tue, 26 Mar 2024 10:27:40 -0700 (PDT) Message-ID: Date: Tue, 26 Mar 2024 17:27:39 +0000 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [RFC PATCH v1 3/4] mm/memory: Use ptep_get_lockless_norecency() for orig_pte Content-Language: en-GB To: David Hildenbrand , Mark Rutland , Catalin Marinas , Will Deacon , Alexander Shishkin , Jiri Olsa , Ian Rogers , Adrian Hunter , Andrew Morton , Muchun Song Cc: linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org References: <20240215121756.2734131-1-ryan.roberts@arm.com> <20240215121756.2734131-4-ryan.roberts@arm.com> From: Ryan Roberts In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Rspamd-Queue-Id: B2075180007 X-Rspam-User: X-Rspamd-Server: rspam11 X-Stat-Signature: spm56f7te39zz9t1y5ujt67j8xqpf1if X-HE-Tag: 1711474063-216789 X-HE-Meta: U2FsdGVkX1/9o70SU9LNoSH4R3b3CmaYD1TM92dICcNuEKN/4YuYkB88EXWE+/EeNFkY2QJJ1D8PI5+UAYH84YaQovLszKu3SIsY+aigfrXj85qo+YPRCkiYb0RxODt+5NriuBS0rS37hWThpzGxWDUgxLIgvuGra+++ym4blB9c0lc9/onb9Y/Ngz8p0c5jBATJd07w/1S88H8O/Oe5bN/J3vA3QNbgJhyg9xVVIKOpTzCSyQnGGPo+k/J8yz2pUnxUt7LlJb+mb3DBE6U1fS8MvRyULhh1Ns/2M6m37Vt9IRF63JrPM/Zz/OD1TFOEZ6FJRlcYvQF3a/5gJR4bJydYQ/O1qEF7+0av8HDKsD7iFGpxlxNrn4fvxTikNh+JEpskK4IhNff77A56G5uXQNVO8SpisvGWYz292jYBuouuSvnbKjqq25lUaCfa2UpiTQeEfh1XSbOYBgb5ZVrJ9evsy4TTx52BJhGfOLD4OXmal6NHLWfu/WWQqmIBvuFEQg1yMqgcjr5DLqXMo+XyyiWh3RfY2x4MJZXIOCJiCW9bKR3ovc1zAGlWBM2Vm8AhxoLE1fd+gHiS0UW5zaTXgsTRW0ZP5ta9nMJgoLzAVboZn8s+rth3pNYxvuUt1rMqncy5ZsWqGQC5XvYYbenfQnCAh/mTbMjmRThOPhM/q8peWRX0iUxqHsRF8przP0vU7q2HtY1JkRuvApPjER+8Hl8C6X8dKjQ7DjJebesYTSGiR20uDJCVpYtTe65H4ss/90MAsROU3UhMH2Q7bif7xj+7Sc2auikeE5Y5eTA6RSfqKU9exL6RmX8rvvI0hoFYzcKbXFtGwTk/MTuH9PYB7f9/u5XGydKiK3zb1bSjtWRfXpjydvXVyzw0I8PrukadTTst6XfsPD7GcHIo+xRhzQWezis1WvrdX+iU5z749UguIqyN2QtZpi+mvey2E7EbRG9Gae5YQIkMHYhS1Zn BymUQnpA cxbCB65uCJ7pg+OxLKc2zRZYkTA9j9xlop+zgzGdyW30E/oa52QNOLRCYL1oyJ3amqJn/Qg8/q5RD42QyFRhqIpYU7ShOfKN7Gk+pasvnd4KtuXqpJ65pM5s7s2gu2ZvTqun44tPXgff+V2EG4aWb3VTYcFfSgsGACdIblASXgS+YIKcEO7lT5Slg6zZuLw+NaGqOp84ENeWuHrRBnOloXy7BKQ== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On 26/03/2024 17:02, David Hildenbrand wrote: > On 15.02.24 13:17, Ryan Roberts wrote: >> Let's convert handle_pte_fault()'s use of ptep_get_lockless() to >> ptep_get_lockless_norecency() to save orig_pte. >> >> There are a number of places that follow this model: >> >>      orig_pte = ptep_get_lockless(ptep) >>      ... >>      >>      if (!pte_same(orig_pte, ptep_get(ptep))) >>              // RACE! >>      ... >>      >> >> So we need to be careful to convert all of those to use >> pte_same_norecency() so that the access and dirty bits are excluded from >> the comparison. >> >> Additionally there are a couple of places that genuinely rely on the >> access and dirty bits of orig_pte, but with some careful refactoring, we >> can use ptep_get() once we are holding the lock to achieve equivalent >> logic. > > We really should document that changed behavior somewhere where it can be easily > found: that orig_pte might have incomplete/stale accessed/dirty information. I could add it to the orig_pte definition in the `struct vm_fault`? > > >> @@ -5343,7 +5356,7 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf) >>                            vmf->address, &vmf->ptl); >>           if (unlikely(!vmf->pte)) >>               return 0; >> -        vmf->orig_pte = ptep_get_lockless(vmf->pte); >> +        vmf->orig_pte = ptep_get_lockless_norecency(vmf->pte); >>           vmf->flags |= FAULT_FLAG_ORIG_PTE_VALID; >> >>           if (pte_none(vmf->orig_pte)) { >> @@ -5363,7 +5376,7 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf) >> >>       spin_lock(vmf->ptl); >>       entry = vmf->orig_pte; >> -    if (unlikely(!pte_same(ptep_get(vmf->pte), entry))) { >> +    if (unlikely(!pte_same_norecency(ptep_get(vmf->pte), entry))) { >>           update_mmu_tlb(vmf->vma, vmf->address, vmf->pte); >>           goto unlock; > > I was wondering about the following: > > Assume the PTE is not dirty. > > Thread 1 does Sorry not sure what threads have to do with this? How is the vmf shared between threads? What have I misunderstood... > > vmf->orig_pte = ptep_get_lockless_norecency(vmf->pte) > /* not dirty */ > > /* Now, thread 2 ends up setting the PTE dirty under PT lock. */ > > spin_lock(vmf->ptl); > entry = vmf->orig_pte; > if (unlikely(!pte_same(ptep_get(vmf->pte), entry))) { >     ... > } > ... > entry = pte_mkyoung(entry); Do you mean pte_mkdirty() here? You're talking about dirty everywhere else. > if (ptep_set_access_flags(vmf->vma, ...) > ... > pte_unmap_unlock(vmf->pte, vmf->ptl); > > > Generic ptep_set_access_flags() will do another pte_same() check and realize > "hey, there was a change!" let's update the PTE! > > set_pte_at(vma->vm_mm, address, ptep, entry); This is called from the generic ptep_set_access_flags() in your example, right? > > would overwrite the dirty bit set by thread 2. I'm not really sure what you are getting at... Is your concern that there is a race where the page could become dirty in the meantime and it now gets lost? I think that's why arm64 overrides ptep_set_access_flags(); since the hw can update access/dirty we have to deal with the races.