From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id C4FC7C77B73 for ; Wed, 19 Apr 2023 11:13:14 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 5D0F98E0002; Wed, 19 Apr 2023 07:13:14 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 5805C8E0001; Wed, 19 Apr 2023 07:13:14 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 46ED48E0002; Wed, 19 Apr 2023 07:13:14 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 3761B8E0001 for ; Wed, 19 Apr 2023 07:13:14 -0400 (EDT) Received: from smtpin26.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 00A621C617B for ; Wed, 19 Apr 2023 11:13:13 +0000 (UTC) X-FDA: 80697879108.26.A162BAC Received: from foss.arm.com (foss.arm.com [217.140.110.172]) by imf12.hostedemail.com (Postfix) with ESMTP id 0084340009 for ; Wed, 19 Apr 2023 11:13:11 +0000 (UTC) Authentication-Results: imf12.hostedemail.com; dkim=none; spf=pass (imf12.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com; dmarc=pass (policy=none) header.from=arm.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1681902792; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=051T3GjlbDHx25jj+/dIg4l/jKSVJ+z0SDsgxdj4kDI=; b=vyILT2DZUIdxf4uGDSOQIwzM4BTQws5UVsLkoUzfcu+kFBQI+CKRe9FJq6PwzEmsStqcbl IOqNgTRU0AtknvoOAB8Lv5r786Pw7cTukz96oKQGu9sWWF6YcciQzaR4O9+UhKMgb8+iwT L3FjXZQ4p/BK/nfYuuf5+8fXXlhEhqU= ARC-Authentication-Results: i=1; imf12.hostedemail.com; dkim=none; spf=pass (imf12.hostedemail.com: domain of ryan.roberts@arm.com designates 217.140.110.172 as permitted sender) smtp.mailfrom=ryan.roberts@arm.com; dmarc=pass (policy=none) header.from=arm.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1681902792; a=rsa-sha256; cv=none; b=wX+dFCdx7roB2fNypNbAtyi/XAcseNHqkvpRfNiejwum/6mBQeX9LTUDP8e2Je6qSpRVxM NeNJwGIT+00ndC1K49NA8QYjoy9GwCdxwR8TCTezMaVi8sIwTNO/VtccheqeXUVrAPclgs 6VJJ/Q65kb+WSziQLpYlqjmWirwfx1w= Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 9C2EB4B3; Wed, 19 Apr 2023 04:13:54 -0700 (PDT) Received: from [10.57.68.215] (unknown [10.57.68.215]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id A50E93F6C4; Wed, 19 Apr 2023 04:13:09 -0700 (PDT) Message-ID: <0c9a18d6-334f-c9e4-c2b1-8db856960ff7@arm.com> Date: Wed, 19 Apr 2023 12:13:07 +0100 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:102.0) Gecko/20100101 Thunderbird/102.9.1 Subject: Re: [RFC v2 PATCH 00/17] variable-order, large folios for anonymous memory Content-Language: en-US To: David Hildenbrand , Andrew Morton , "Matthew Wilcox (Oracle)" , Yu Zhao , "Yin, Fengwei" Cc: linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org References: <20230414130303.2345383-1-ryan.roberts@arm.com> <13969045-4e47-ae5d-73f4-dad40fe631be@arm.com> <568b5b73-f0e9-c385-f628-93e45825fb7b@redhat.com> <5b6fe242-a19e-70bf-adba-240f2d5b8548@arm.com> <87ad8d4b-b117-0c7a-3d0b-723ad59a0405@redhat.com> From: Ryan Roberts In-Reply-To: <87ad8d4b-b117-0c7a-3d0b-723ad59a0405@redhat.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Stat-Signature: 6r5dosstsfukx4puc66dutgaaj1ypfn9 X-Rspam-User: X-Rspamd-Queue-Id: 0084340009 X-Rspamd-Server: rspam06 X-HE-Tag: 1681902791-438109 X-HE-Meta: U2FsdGVkX19rg7xiUzAxNe/0R8hStGenLH8kfBf2mif9wlrls6JsKz4dLWzHCgq0BUXUdanXFt3oysSoxKDxoRCnpDyAwaKy4+qgIP0Ip5k/MgF6vZHEdJYg+K738zeH/gJjTsZw5vRGEzent0Yt0i9Utuw5q5mTDRzAsjCSEFrJiVPixuEGiJJ+s3rKJWEtADo4lTFFH1FMoPHgTjT/YakSos5PCyQ1cNscduz4xsRm9G9axF9bxP/J6X2DpY9F3xpJ3QSY/IVOcPvNvDfIc3Sqw2BSLbSIIG5UoGs46W7h/5+PoKN61uxSBNVwfDdu0piRrasxmPt/4WHDYja7dBBHBEOnUXK7WNMGlkR5Mzi/zRVqz6n4p9kZrDKzPMYN6TlsthOARvZzA6p1C/8RDDlOkJB4+vdbwhLor7NgYMhfGxMDDLlmHU47gHdP/hF3ZyiAlOg9EgxdAtHU7Ckz3H5bMFFDK0CxVUOTdBOb94jLBZ2GRPasM36rc5plvlO8ud7d0RctBQNOluK1aUJMqIZRpAdBJPrAGeYw5LrjQjukcNOz9/W12h8GYG/T0RwqMlWoQcHob+krQZfPo56nhJ7x9wgnoPTYxmTIdvDhzDIdMSgZ5n6fXnNc6d7d507tCpT0RylR8/GqUVtWK8LEXYFcREkcIGkopgy0TPQ/19La/IDVEb2vzBbGJivWqinpIeP+IwzqbBPZ5FH4oHGFLmh18PlP/OdT3p/22egeQVBoaTTA5BOAB29yTg28BgTZqkSm9C5JgWdhSrOEMCIHXfZokxoMtfh7TIRnWzPYB9/rA3qxnXm1dGCh/2BqJLsiS+r+uG6+J0fXvrWtpv/5swa+ia/p9BeHVJAjVIZAhYX51ajAWpjttjmvln9A5zCPM+5+aDgXSpnbSb3sv8gckR4AHfwqDkwfcR+z3GM4pQve0rjSt43C9SLaqqz08k70aanaKTej9iFL/u02vGT L3fXKzUs 72av1HvdyAgWjTLAMebXZH5YC1ZviYKR7VtM7x8Y8lPICc2eV4w33kDgb3NInF7ACQU4vhOBtrIo8BrtUg+hLuaOY9iVA1dHo1OUuQYKRFDa16fkOoSnvwnK3bGTZ7FLBppNG X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On 19/04/2023 11:51, David Hildenbrand wrote: >> I'm looking to fix this problem in my code, but am struggling to see how the >> current code is safe. I'm thinking about the following scenario: >> > > Let's see :) > >>   - A page is CoW mapped into processes A and B. >>   - The page takes a fault in process A, and do_wp_page() determines that it is >>     "maybe-shared" and therefore must copy. So drops the PTL and calls >>     wp_page_copy(). > > Note that before calling wp_page_copy(), we do a folio_get(folio); Further, the > page table reference is only dropped once we actually replace the page in the > page table. So while in wp_page_copy(), the folio should have at least 2 > references if the page is still mapped. Ahh, yes, of course! > >>   - Process B exits. >>   - Another thread in process A faults on the page. This time dw_wp_page() >>     determines that the page is exclusive (due to the ref count), and reuses it, >>     marking it exclusive along the way. > > The refcount should not be 1 (other reference from the wp_page_copy() caller), > so A won't be able to reuse it, and ... > >>   - wp_page_copy() from the original thread in process A retakes the PTL and >>     copies the _now exclusive_ page. >> >> Having typed it up, I guess this can't happen, because wp_page_copy() will only >> do the copy if the PTE hasn't changed and it will have changed because it is now >> writable? So this is safe? > > this applies as well. If the pte changed (when reusing due to a write failt it's > now writable, or someone else broke COW), we back off. For FAULT_FLAG_UNSHARE, > however, the PTE may not change. But the additional reference should make it work. > > I think it works as intended. It would be clearer if we'd also recheck in > wp_page_copy() whether we still don't have an exclusive anon page under PT lock > --  and if we would, back off. Yes, agreed. I now have a fix for my code that adds a check during the PTE scan that each page is non-exclusive. And the scan is (already) done once without the PTL to determine the size of destination folio, and then again with the lock to ensure there was no race. I'm not seeing any change in the relative counts of folio orders (which is expected due to the case being rare). Thanks!