From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <6db7e7e0-4db6-f742-436b-1f4d8ae4e490@bytedance.com>
Date: Fri, 22 Sep 2023 15:54:23 +0800
Subject: Re: [PATCH v1 8/8] arm64: hugetlb: Fix set_huge_pte_at() to work with all swap entries
To: Ryan Roberts
Cc: Catalin Marinas, Will Deacon, "James E.J. Bottomley", Helge Deller,
 Nicholas Piggin, Christophe Leroy, Paul Walmsley, Palmer Dabbelt,
 Albert Ou, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
 Christian Borntraeger, Sven Schnelle, Gerald Schaefer, "David S. Miller",
 Arnd Bergmann, Mike Kravetz, Muchun Song, SeongJae Park, Andrew Morton,
 Uladzislau Rezki, Christoph Hellwig, Lorenzo Stoakes, Anshuman Khandual,
 Peter Xu, Axel Rasmussen, linux-arm-kernel@lists.infradead.org,
 linux-kernel@vger.kernel.org, linux-parisc@vger.kernel.org,
 linuxppc-dev@lists.ozlabs.org, linux-riscv@lists.infradead.org,
 linux-s390@vger.kernel.org, sparclinux@vger.kernel.org,
 linux-mm@kvack.org, stable@vger.kernel.org
References: <20230921162007.1630149-1-ryan.roberts@arm.com>
 <20230921162007.1630149-9-ryan.roberts@arm.com>
 <217bb956-b9f6-1057-914b-436d4c775a8b@bytedance.com>
 <3358e732-8df9-4408-8249-384b102f5d75@arm.com>
From: Qi Zheng
In-Reply-To: <3358e732-8df9-4408-8249-384b102f5d75@arm.com>

Hi Ryan,

On 2023/9/22 15:40, Ryan Roberts wrote:
> On 22/09/2023 03:54, Qi Zheng wrote:
>> Hi Ryan,
>>
>> On 2023/9/22 00:20, Ryan Roberts wrote:
>>> When called with a swap entry that does not embed a PFN (e.g.
>>> PTE_MARKER_POISONED or PTE_MARKER_UFFD_WP), the previous implementation
>>> of set_huge_pte_at() would either cause a BUG() to fire (if
>>> CONFIG_DEBUG_VM is enabled) or cause a dereference of an invalid address
>>> and subsequent panic.
>>>
>>> arm64's huge pte implementation supports multiple huge page sizes, some
>>> of which are implemented in the page table with contiguous mappings. So
>>> set_huge_pte_at() needs to work out how big the logical pte is, so that
>>> it can also work out how many physical ptes (or pmds) need to be
>>> written. It does this by grabbing the folio out of the pte and querying
>>> its size.
>>>
>>> However, there are cases when the pte being set is actually a swap
>>> entry. But this also used to work fine, because for huge ptes, we only
>>> ever saw migration entries and hwpoison entries. And both of these types
>>> of swap entries have a PFN embedded, so the code would grab that and
>>> everything still worked out.
>>>
>>> But over time, more calls to set_huge_pte_at() have been added that set
>>> swap entry types that do not embed a PFN. And this causes the code to go
>>> bang.
>>> The triggering case is for the uffd poison test, commit
>>> 99aa77215ad0 ("selftests/mm: add uffd unit test for UFFDIO_POISON"),
>>> which sets a PTE_MARKER_POISONED swap entry. But review shows there are
>>> other places too (PTE_MARKER_UFFD_WP).
>>>
>>> So the root cause is due to commit 18f3962953e4 ("mm: hugetlb: kill
>>> set_huge_swap_pte_at()"), which aimed to simplify the interface to the
>>> core code by removing set_huge_swap_pte_at() (which took a page size
>>> parameter) and replacing it with calls to set_huge_pte_at() where
>>> the size was inferred from the folio, as described above. While that
>>> commit didn't break anything at the time,
>>
>> If it didn't break anything at that time, then shouldn't the Fixes tag
>> be added to this commit?
>>
>>> it did break the interface
>>> because it couldn't handle swap entries without PFNs. And since then new
>>> callers have come along which rely on this working.
>>
>> So the Fixes tag should be added only to the commit that introduces the
>> first new callers?
>
> Well I guess it's a matter of point of view; my view is that 18f3962953e4 is the
> buggy change because it broke the interface to not be able to handle swap
> entries which do not contain PFNs. The fact that there were no callers that used
> the interface in this way at the time of the commit is irrelevant in my view.

I understand your point of view. But IIUC, the Fixes tag is used to
indicate which versions need the backport, and the version that
contains only commit 18f3962953e4 does not need this bugfix
backported.

> But I already added 2 fixes tags; one for the buggy commit, and the other for
> the commit containing the new user of the interface.

I think two Fixes tags will cause inconvenience to the maintainers.

Thanks,
Qi

>
>>
>> Other than that, LGTM.
>
> Thanks!
>
>>
>> Thanks,
>> Qi
>>
>>>
>>> Now that we have modified the set_huge_pte_at() interface to pass the
>>> vma, we can extract the huge page size from it and fix this issue.
>>>
>>> I'm tagging the commit that added the uffd poison feature, since that is
>>> what exposed the problem, as well as the original change that broke the
>>> interface. Hopefully this is valuable for people doing bisect.
>>>
>>> Signed-off-by: Ryan Roberts
>>> Fixes: 18f3962953e4 ("mm: hugetlb: kill set_huge_swap_pte_at()")
>>> Fixes: 8a13897fb0da ("mm: userfaultfd: support UFFDIO_POISON for hugetlbfs")
>>> ---
>>>  arch/arm64/mm/hugetlbpage.c | 17 +++--------------
>>>  1 file changed, 3 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
>>> index 844832511c1e..a08601a14689 100644
>>> --- a/arch/arm64/mm/hugetlbpage.c
>>> +++ b/arch/arm64/mm/hugetlbpage.c
>>> @@ -241,13 +241,6 @@ static void clear_flush(struct mm_struct *mm,
>>>      flush_tlb_range(&vma, saddr, addr);
>>>  }
>>>
>>> -static inline struct folio *hugetlb_swap_entry_to_folio(swp_entry_t entry)
>>> -{
>>> -    VM_BUG_ON(!is_migration_entry(entry) && !is_hwpoison_entry(entry));
>>> -
>>> -    return page_folio(pfn_to_page(swp_offset_pfn(entry)));
>>> -}
>>> -
>>>  void set_huge_pte_at(struct vm_area_struct *vma, unsigned long addr,
>>>                  pte_t *ptep, pte_t pte)
>>>  {
>>> @@ -258,13 +251,10 @@ void set_huge_pte_at(struct vm_area_struct *vma, unsigned long addr,
>>>      unsigned long pfn, dpfn;
>>>      pgprot_t hugeprot;
>>>
>>> -    if (!pte_present(pte)) {
>>> -        struct folio *folio;
>>> -
>>> -        folio = hugetlb_swap_entry_to_folio(pte_to_swp_entry(pte));
>>> -        ncontig = num_contig_ptes(folio_size(folio), &pgsize);
>>> +    ncontig = num_contig_ptes(huge_page_size(hstate_vma(vma)), &pgsize);
>>>
>>> -        for (i = 0; i < ncontig; i++, ptep++)
>>> +    if (!pte_present(pte)) {
>>> +        for (i = 0; i < ncontig; i++, ptep++, addr += pgsize)
>>>              set_pte_at(mm, addr, ptep, pte);
>>>          return;
>>>      }
>>> @@ -274,7 +264,6 @@ void set_huge_pte_at(struct vm_area_struct *vma, unsigned long addr,
>>>          return;
>>>      }
>>>
>>> -    ncontig = find_num_contig(mm, addr, ptep, &pgsize);
>>>      pfn = pte_pfn(pte);
>>>      dpfn = pgsize >> PAGE_SHIFT;
>>>      hugeprot = pte_pgprot(pte);
>
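
For readers following the thread, the idea behind the fix can be sketched in userspace C: the number of page-table entries to write is derived from the huge page size of the mapping alone, so the value being written never has to be decoded, and swap entries without a PFN are handled like any other. This is only a rough model, not the kernel code. Every `model_`/`MODEL_` name is hypothetical, and the constants assume an arm64 4K-granule configuration (16 contiguous entries per contiguous range), which real kernels derive from their build configuration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed geometry for a 4K-granule arm64 build (illustrative only). */
#define MODEL_PAGE_SIZE      (1UL << 12)
#define MODEL_CONT_PTES      16
#define MODEL_CONT_PTE_SIZE  (MODEL_CONT_PTES * MODEL_PAGE_SIZE)
#define MODEL_PMD_SIZE       (1UL << 21)
#define MODEL_CONT_PMDS      16
#define MODEL_CONT_PMD_SIZE  (MODEL_CONT_PMDS * MODEL_PMD_SIZE)
#define MODEL_PUD_SIZE       (1UL << 30)

/*
 * Hypothetical model of num_contig_ptes(): map a huge page size to the
 * number of entries to write and the size covered by each entry.
 * Returns 0 for a size this model does not know about.
 */
int model_num_contig_ptes(unsigned long size, size_t *pgsize)
{
    *pgsize = size;

    switch (size) {
    case MODEL_PUD_SIZE:
    case MODEL_PMD_SIZE:
        return 1;                       /* single block entry */
    case MODEL_CONT_PMD_SIZE:
        *pgsize = MODEL_PMD_SIZE;
        return MODEL_CONT_PMDS;         /* run of contiguous pmds */
    case MODEL_CONT_PTE_SIZE:
        *pgsize = MODEL_PAGE_SIZE;
        return MODEL_CONT_PTES;         /* run of contiguous ptes */
    default:
        return 0;
    }
}

/*
 * Hypothetical model of the non-present path: copy the same swap-style
 * value into every entry. The count comes from the mapping's huge page
 * size, so 'value' is treated as opaque and need not embed a PFN.
 * Returns the number of entries written.
 */
int model_set_huge_swap(unsigned long *table, unsigned long hugepage_size,
                        unsigned long value)
{
    size_t pgsize;
    int i, ncontig;

    assert(table != NULL);

    ncontig = model_num_contig_ptes(hugepage_size, &pgsize);
    for (i = 0; i < ncontig; i++)
        table[i] = value;

    return ncontig;
}
```

Calling `model_set_huge_swap()` with `MODEL_CONT_PTE_SIZE` fills 16 slots whatever the value is, which mirrors why passing the vma (and hence the hstate size) removes the dependency on a PFN-bearing entry.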