From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id EA0E1C25B08 for ; Mon, 8 Aug 2022 16:05:34 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 6CCD28E0002; Mon, 8 Aug 2022 12:05:34 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 6551E6B0072; Mon, 8 Aug 2022 12:05:34 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4F64B8E0002; Mon, 8 Aug 2022 12:05:34 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id 3A3A46B0071 for ; Mon, 8 Aug 2022 12:05:34 -0400 (EDT) Received: from smtpin10.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id A40D916102A for ; Mon, 8 Aug 2022 16:05:33 +0000 (UTC) X-FDA: 79776900546.10.0074AFD Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf26.hostedemail.com (Postfix) with ESMTP id CB4DD140173 for ; Mon, 8 Aug 2022 16:05:31 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1659974731; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=pr6iF/TtZ25Rn6ZY2ogRbqXiMq4oLu2Igv/PRch7AW4=; b=aMJl4hKtoqTDivK1bqpbgxGDJXkn0Ph/yhZLxifDhZSSLpZwk5SqGb0enODTjnnmzk6+zS LBfSN3sSlOTikajKqH0CY7jOYegcyW1QjjzUzFQdHJXHF36on6qmap67ZEUwO0P0wgn11K UV5k4YOFkmavgQiarqRN0mYRzSZaMhg= Received: from mail-io1-f72.google.com (mail-io1-f72.google.com [209.85.166.72]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-484-1XY9gCQbNYS_mndxp3ZuHA-1; Mon, 08 Aug 2022 12:05:20 -0400 X-MC-Unique: 1XY9gCQbNYS_mndxp3ZuHA-1 Received: by mail-io1-f72.google.com with SMTP id s16-20020a5eaa10000000b006826060b7e4so4801505ioe.3 for ; Mon, 08 Aug 2022 09:05:19 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc; bh=pr6iF/TtZ25Rn6ZY2ogRbqXiMq4oLu2Igv/PRch7AW4=; b=Mh7JADMS9Jj2bx61UZak2cPh5kie2uCHqAFHTBaGFW0lLU8R9+LySh/JpbK5uBZ2pv 0pndLkYYdmDOKuP2tpHHhwBgtVt6Vfs5SxZAH+2PmqdhFowzsdU2elaICN1GcsKymrcu 7F6BpFEteu0DljgalWyrC/pSWdbjBv3Pu4OOxf5QYcNAXcSBab36vFPLun4F4Z63SvYe F5eJ02HNZQiZ6/Wnfi65D4f7A0Z/zFTKPKUy3Bgbu6HJB0u8HP2F5PfQK+MSYCjB+37h NCeYQtY9X3V/UZkMJkhHfkAGno9o6kM38dEF797BUDYvuujKGbBHFsgkJGziw+GG/7vV WwfA== X-Gm-Message-State: ACgBeo2exT6z7PGwMBqEoNnQJ91cbL6as3CNFNwHouKCoCSd+5laccrN RCGIL+qGJVfE+zbyeHkNCNiEOfb4eHnAfJtvI4F0FwocAVAeB8mSJYlW5lzqMKSVfSQ2z+Sp75O RTGIm2COWinA= X-Received: by 2002:a05:6e02:1d8a:b0:2de:aa0d:b99e with SMTP id h10-20020a056e021d8a00b002deaa0db99emr8615595ila.138.1659974718328; Mon, 08 Aug 2022 09:05:18 -0700 (PDT) X-Google-Smtp-Source: AA6agR5hxO2cm5lfMWEyGlowPa05C8Tvdxxfr5EXfxxt7IQET3PvVfQIjV9iMQ5zAEYyH9ZaAzBR2A== X-Received: by 2002:a05:6e02:1d8a:b0:2de:aa0d:b99e with SMTP id h10-20020a056e021d8a00b002deaa0db99emr8615579ila.138.1659974718091; Mon, 08 Aug 2022 09:05:18 -0700 (PDT) Received: from xz-m1.local (bras-base-aurron9127w-grc-35-70-27-3-10.dsl.bell.ca. [70.27.3.10]) by smtp.gmail.com with ESMTPSA id x10-20020a92b00a000000b002e0f21f0be4sm30529ilh.27.2022.08.08.09.05.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 08 Aug 2022 09:05:17 -0700 (PDT) Date: Mon, 8 Aug 2022 12:05:15 -0400 From: Peter Xu To: David Hildenbrand Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Andrew Morton , Mike Kravetz , Muchun Song , Peter Feiner , "Kirill A . Shutemov" Subject: Re: [PATCH v1 2/2] mm/hugetlb: support write-faults in shared mappings Message-ID: References: <20220805110329.80540-1-david@redhat.com> <20220805110329.80540-3-david@redhat.com> <4f644ac5-c40b-32d4-3234-c1dac3d09f83@redhat.com> MIME-Version: 1.0 In-Reply-To: <4f644ac5-c40b-32d4-3234-c1dac3d09f83@redhat.com> X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8 Content-Disposition: inline ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1659974731; a=rsa-sha256; cv=none; b=xrKly4cpQlMG+5K7l5T/1v7Ppo4mW7122KXEQ9cvqos00CxsWGFK/OBgPGmub1IkZZtMqQ uu7SlI9JJ3KfrkqXoVdP53L1gX+vSA18XcgK0HPLL6jcf2HIogeZfE/rbbmEaST0VyOj1n StUMbL7Vf2QYMX7NVejxC2HH/vwKeQ8= ARC-Authentication-Results: i=1; imf26.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=aMJl4hKt; spf=pass (imf26.hostedemail.com: domain of peterx@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1659974731; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=pr6iF/TtZ25Rn6ZY2ogRbqXiMq4oLu2Igv/PRch7AW4=; b=R7/SI5C/v11UwppaZChcKYbprVyyZyv24ZdER/NUxz9Q9GCp1Iz0ygRhrnn6csbKUnOR8o FUGziC1LlepQWXAKCv+cauEUzLb/OJNj1+DPQYewVB8DKdX+v/YpJOh0m+uku7oumuys5U JOBBv0bvOsS/TGoO9eEi3stWoWP76Gw= X-Rspamd-Queue-Id: CB4DD140173 Authentication-Results: imf26.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=aMJl4hKt; spf=pass (imf26.hostedemail.com: domain of peterx@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (policy=none) header.from=redhat.com X-Rspamd-Server: rspam09 X-Rspam-User: X-Stat-Signature: cp37i5dmzs4goz7swmae48qixot73rd7 X-HE-Tag: 1659974731-51265 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Fri, Aug 05, 2022 at 08:20:52PM +0200, David Hildenbrand wrote: > On 05.08.22 20:12, Peter Xu wrote: > > On Fri, Aug 05, 2022 at 01:03:29PM +0200, David Hildenbrand wrote: > >> Let's add a safety net if we ever get (again) a write-fault on a R/O-mapped > >> page in a shared mapping, in which case we simply have to map the > >> page writable. > >> > >> VM_MAYSHARE handling in hugetlb_fault() for FAULT_FLAG_WRITE > >> indicates that this was at least envisioned, but could never have worked > >> as expected. This theoretically paves the way for softdirty tracking > >> support in hugetlb. > >> > >> Tested without the fix for softdirty tracking. > >> > >> Note that there is no need to do any kind of reservation in hugetlb_fault() > >> in this case ... because we already have a hugetlb page mapped R/O > >> that we will simply map writable and we are not dealing with COW/unsharing. > >> > >> Signed-off-by: David Hildenbrand > >> --- > >> mm/hugetlb.c | 21 ++++++++++++++------- > >> 1 file changed, 14 insertions(+), 7 deletions(-) > >> > >> diff --git a/mm/hugetlb.c b/mm/hugetlb.c > >> index a18c071c294e..bbab7aa9d8f8 100644 > >> --- a/mm/hugetlb.c > >> +++ b/mm/hugetlb.c > >> @@ -5233,6 +5233,16 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma, > >> VM_BUG_ON(unshare && (flags & FOLL_WRITE)); > >> VM_BUG_ON(!unshare && !(flags & FOLL_WRITE)); > >> > >> + /* Let's take out shared mappings first, this should be a rare event. */ > >> + if (unlikely(vma->vm_flags & VM_MAYSHARE)) { > > > > Should we check VM_SHARED instead? > > Relying on VM_SHARED to detect MAP_PRIVATE vs. MAP_SHARED is > unfortunately wrong. > > If you're curious, take a look at f83a275dbc5c ("mm: account for > MAP_SHARED mappings using VM_MAYSHARE and not VM_SHARED in hugetlbfs") > and mmap() code. > > Long story short: if the file is read-only, we only have VM_MAYSHARE but > not VM_SHARED (and consequently also not VM_MAYWRITE). To ask in another way: if file is RO but mapped RW (mmap() will have VM_SHARED cleared but VM_MAYSHARE set), then if we check VM_MAYSHARE here won't we grant write bit errornously while we shouldn't? As the user doesn't really have write permission to the file. > > > > >> + if (unshare) > >> + return 0; > > > > Curious when will this happen especially if we switch to VM_SHARED above. > > Shouldn't "unshare" not happen at all on a shared region? > > FAULT_FLAG_UNSHARE is documented to behave like: > > "FAULT_FLAG_UNSHARE is ignored and treated like an ordinary read fault > when no existing R/O-mapped anonymous page is encountered." > > It should currently not happen. Focus on should ;) OK. :) Then does it also mean that it should be better to turn into WARN_ON_ONCE()? It's just that it looks like a valid path if without it. > > > > >> + if (WARN_ON_ONCE(!(vma->vm_flags & VM_WRITE))) > >> + return VM_FAULT_SIGSEGV; > > > > I had a feeling that you just want to double check we have write > > permission, but IIUC this should be checked far earlier or we'll have > > problem. No strong opinion if so, but I'd suggest dropping this one, > > otherwise we could add tons of WARN_ON_ONCE() in anywhere in the page fault > > stack and they mostly won't trigger at all. > > Not quite. We usually (!hugetlb) have maybe_mkwrite() all over the > place. This is just an indication that we don't have maybe semantics > here. But as we also don't have it for hugetlb anon code below, maybe I > can just drop it. (or check it for both call paths) Hmm, this reminded me to wonder how hugetlb handles FOLL_FORCE|FOLL_WRITE on RO regions. Maybe that check is needed, but however instead of warning and sigbus, we need to handle it? I'll need to read more on this later, but raise this up. > > > > >> + set_huge_ptep_writable(vma, haddr, ptep); > > > > Do we wanna set dirty bits too? > > set_huge_ptep_writable() handles that. Right, I noticed it right after I sent too.. Thanks, -- Peter Xu