Date: Thu, 8 Aug 2024 16:25:07 -0400
From: Peter Xu <peterx@redhat.com>
To: Thomas Gleixner
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	"Aneesh Kumar K . V", Michael Ellerman, Oscar Salvador,
	Dan Williams, James Houghton, Matthew Wilcox, Nicholas Piggin,
	Rik van Riel, Dave Jiang, Andrew Morton, x86@kernel.org,
	Ingo Molnar, Rick P Edgecombe, "Kirill A. Shutemov",
	linuxppc-dev@lists.ozlabs.org, Mel Gorman, Hugh Dickins,
	Borislav Petkov, David Hildenbrand, Vlastimil Babka,
	Dave Hansen, Christophe Leroy, Huang Ying
Subject: Re: [PATCH v4 6/7] mm/x86: Add missing pud helpers
References: <20240807194812.819412-1-peterx@redhat.com>
	<20240807194812.819412-7-peterx@redhat.com> <875xsc0xjy.ffs@tglx>
In-Reply-To: <875xsc0xjy.ffs@tglx>

On Thu, Aug 08, 2024 at 12:37:21AM +0200, Thomas Gleixner wrote:
> On Wed, Aug 07 2024 at 15:48, Peter Xu wrote:
> > These new helpers will be needed for pud entry updates soon.  Introduce
> > these helpers by referencing the pmd ones.  Namely:
> >
> > - pudp_invalidate()
> > - pud_modify()
>
> Zero content about what these helpers do and why they are needed. That's
> not how it works, really.

I hope that what Dave suggested, adding "by referencing the pmd ones",
would be helpful, because they are exactly referencing those.  But sure..
I rewrote the commit message as follows:

    mm/x86: Add missing pud helpers

    Some new helpers will be needed for pud entry updates soon.  Introduce
    these helpers by referencing the pmd ones.  Namely:

    - pudp_invalidate(): this helper invalidates a huge pud before a split
      happens, so that the invalidated pud entry makes sure no race will
      happen (either with software, like a concurrent zap, or hardware,
      like an a/d bit being lost).

    - pud_modify(): this helper applies a new pgprot to an existing huge
      pud mapping.

    For more information on why we need these two helpers, please refer to
    the corresponding pmd helpers in the mprotect() code path.

    While at it, simplify the pud_modify()/pmd_modify() comments on shadow
    stack pgtable entries to reference pte_modify(), to avoid duplicating
    the whole paragraph three times.

Please let me know if this works for you.
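For reference, here is roughly how I expect the two helpers to pair up on
the mprotect() path, mirroring what change_huge_pmd() does for pmds today.
This is only a sketch: the function name is made up, and the locking plus
the NUMA-hinting details of the real pmd path are omitted.

/*
 * Hypothetical caller, modeled on change_huge_pmd().  Invalidate the
 * huge pud first so nothing can race the update, then install the
 * entry carrying the new protection bits.
 */
static void change_huge_pud_sketch(struct vm_area_struct *vma, pud_t *pudp,
				   unsigned long addr, pgprot_t newprot)
{
	pud_t oldpud, entry;

	/* Clears _PAGE_PRESENT and flushes the TLB for the huge range */
	oldpud = pudp_invalidate(vma, addr, pudp);
	/* Apply newprot on top of the old entry's pfn and flags */
	entry = pud_modify(oldpud, newprot);
	set_pud_at(vma->vm_mm, addr, pudp, entry);
}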
> 
> > +static inline pud_t pud_mkinvalid(pud_t pud)
> > +{
> > +	return pfn_pud(pud_pfn(pud),
> > +		       __pgprot(pud_flags(pud) & ~(_PAGE_PRESENT|_PAGE_PROTNONE)));
>
> 100 characters...

Waiting for an answer on this one, so I'll ignore the "100 char" comments
for now, and will address them when I get a clue on what is happening..

> > +}
> > +
> >  static inline u64 flip_protnone_guard(u64 oldval, u64 val, u64 mask);
> >
> >  static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
> > @@ -834,14 +840,8 @@ static inline pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot)
> >  	pmd_result = __pmd(val);
> >
> >  	/*
> > -	 * To avoid creating Write=0,Dirty=1 PMDs, pte_modify() needs to avoid:
> > -	 * 1. Marking Write=0 PMDs Dirty=1
> > -	 * 2. Marking Dirty=1 PMDs Write=0
> > -	 *
> > -	 * The first case cannot happen because the _PAGE_CHG_MASK will filter
> > -	 * out any Dirty bit passed in newprot.  Handle the second case by
> > -	 * going through the mksaveddirty exercise.  Only do this if the old
> > -	 * value was Write=1 to avoid doing this on Shadow Stack PTEs.
> > +	 * Avoid creating shadow stack PMD by accident.  See comment in
> > +	 * pte_modify().
>
> The changelog is utterly silent about this comment update.

Updated my commit message altogether above; I hope it works.

> >  	 */
> >  	if (oldval & _PAGE_RW)
> >  		pmd_result = pmd_mksaveddirty(pmd_result);
> > @@ -851,6 +851,29 @@ static inline pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot)
> >  	return pmd_result;
> >  }
> >
> > +static inline pud_t pud_modify(pud_t pud, pgprot_t newprot)
> > +{
> > +	pudval_t val = pud_val(pud), oldval = val;
> > +	pud_t pud_result;
> > +
> > +	val &= _HPAGE_CHG_MASK;
> > +	val |= check_pgprot(newprot) & ~_HPAGE_CHG_MASK;
> > +	val = flip_protnone_guard(oldval, val, PHYSICAL_PUD_PAGE_MASK);
> > +
> > +	pud_result = __pud(val);
> > +
> > +	/*
> > +	 * Avoid creating shadow stack PUD by accident.  See comment in
> > +	 * pte_modify().
> > +	 */
> > +	if (oldval & _PAGE_RW)
> > +		pud_result = pud_mksaveddirty(pud_result);
> > +	else
> > +		pud_result = pud_clear_saveddirty(pud_result);
> > +
> > +	return pud_result;
> > +}
> > +
> >  /*
> >   * mprotect needs to preserve PAT and encryption bits when updating
> >   * vm_page_prot
> > @@ -1389,10 +1412,26 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma,
> >  }
> >  #endif
> >
> > +static inline pud_t pudp_establish(struct vm_area_struct *vma,
> > +		unsigned long address, pud_t *pudp, pud_t pud)
>
> Random line break alignment.... See documentation.

This is exactly what we do with pmdp_establish() right above.  Would you
be fine if I keep this, just to make pmd/pud look the same?

> > +{
> > +	page_table_check_pud_set(vma->vm_mm, pudp, pud);
> > +	if (IS_ENABLED(CONFIG_SMP)) {
> > +		return xchg(pudp, pud);
> > +	} else {
> > +		pud_t old = *pudp;
> > +		WRITE_ONCE(*pudp, pud);
>
> Lacks a newline between variable declaration and code.
>
> But seriously, why optimizing for !SMP? That's a pointless exercise and
> a guarantee for bitrot.

So far it looks still reasonable to me if it is there in pmd.  If I remove
it, people can complain again: "hey, we have this /optimization/ in pmd,
why did you remove it in pud?".  No end..

So again.. it's the same as the pmd ones.  Either we change nothing, or we
change both.  I don't want to expand this series to more than it wants to
do.. would you mind if I keep it until someone justifies changing both?
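To spell out my understanding of why the two branches exist at all: below
is the same function as in the hunk above with only explanatory comments
added.  The reasoning in the comments is my reading of the pmdp_establish()
precedent, not new behavior.

static inline pud_t pudp_establish(struct vm_area_struct *vma,
		unsigned long address, pud_t *pudp, pud_t pud)
{
	page_table_check_pud_set(vma->vm_mm, pudp, pud);
	if (IS_ENABLED(CONFIG_SMP)) {
		/*
		 * A hardware walker on another CPU may set the accessed
		 * or dirty bit at any time, so the old entry must be read
		 * and replaced in one atomic operation, otherwise those
		 * bit updates could be lost.
		 */
		return xchg(pudp, pud);
	} else {
		pud_t old = *pudp;

		/*
		 * On UP no other CPU can update the entry concurrently
		 * while we hold the page table lock, so a plain store is
		 * enough and the locked xchg is avoided.
		 */
		WRITE_ONCE(*pudp, pud);
		return old;
	}
}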
> > +		return old;
> > +	}
> > +}
> > +
> >  #define __HAVE_ARCH_PMDP_INVALIDATE_AD
> >  extern pmd_t pmdp_invalidate_ad(struct vm_area_struct *vma,
> >  				unsigned long address, pmd_t *pmdp);
> >
> > +pud_t pudp_invalidate(struct vm_area_struct *vma, unsigned long address,
> > +		      pud_t *pudp);
>
> While 'extern' is not required, please keep the file style consistent
> and use the 100 characters...
>
> > --- a/arch/x86/mm/pgtable.c
> > +++ b/arch/x86/mm/pgtable.c
> > @@ -641,6 +641,18 @@ pmd_t pmdp_invalidate_ad(struct vm_area_struct *vma, unsigned long address,
> >  }
> >  #endif
> >
> > +#if defined(CONFIG_TRANSPARENT_HUGEPAGE) && \
> > +	defined(CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD)
> > +pud_t pudp_invalidate(struct vm_area_struct *vma, unsigned long address,
> > +		      pud_t *pudp)
> > +{
> > +	VM_WARN_ON_ONCE(!pud_present(*pudp));
> > +	pud_t old = pudp_establish(vma, address, pudp, pud_mkinvalid(*pudp));
> > +	flush_pud_tlb_range(vma, address, address + HPAGE_PUD_SIZE);
> > +	return old;
>
> Your keyboard clearly lacks a newline key ...

This is also the same: pmdp_invalidate() coded it exactly like that.

In general, I prefer keeping the pmd/pud helpers looking the same all over
the place, if you won't disagree for now.

I know it might even be better to clean up everything first, get the pmd
changes reviewed, and then clone that to pud.  That might indeed be
cleaner.  But it adds much fuss all over the place, and even with this
series I got stuck for months.. and so far I haven't yet posted what I
really wanted to post (huge pfnmaps).

I sincerely hope we can move forward with this and clean things up later
(none of these are major issues; all trivial details so far, IMHO, so
nothing I see going severely wrong yet), and then each cleanup can be
justified one by one, on each spot.

Thanks,

-- 
Peter Xu