Date: Fri, 6 Dec 2024 06:27:09 +0100
From: Guillaume Morin
To: Nathan Chancellor
Cc: Guillaume Morin, Heiko Carstens, Vasily Gorbik, Alexander Gordeev,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org, Muchun Song,
    Andrew Morton, Peter Xu, David Hildenbrand, Eric Hagberg,
    linux-s390@vger.kernel.org
Subject: Re: [PATCH v3] mm/hugetlb: support FOLL_FORCE|FOLL_WRITE
In-Reply-To: <20241206045019.GA2215843@thelio-3990X>

On 05 Dec 21:50, Nathan Chancellor wrote:
>
> Hi Guillaume and s390 folks,
>
> On Thu, Dec 05, 2024 at 03:02:26AM +0100, Guillaume Morin wrote:
> >
> > Eric reported that PTRACE_POKETEXT fails when applications use hugetlb
> > for mapping text using huge pages. Before commit 1d8d14641fd9
> > ("mm/hugetlb: support write-faults in shared mappings"), PTRACE_POKETEXT
> > worked by accident, but it was buggy and silently ended up mapping pages
> > writable into the page tables even though VM_WRITE was not set.
> >
> > In general, FOLL_FORCE|FOLL_WRITE does currently not work with hugetlb.
> > Let's implement FOLL_FORCE|FOLL_WRITE properly for hugetlb, such that
> > what used to work in the past by accident now properly works, allowing
> > applications using hugetlb for text etc. to get properly debugged.
> >
> > This change might also be required to implement uprobes support for
> > hugetlb [1].
> >
> > [1] https://lore.kernel.org/lkml/ZiK50qob9yl5e0Xz@bender.morinfr.org/
> >
> > Cc: Muchun Song
> > Cc: Andrew Morton
> > Cc: Peter Xu
> > Cc: David Hildenbrand
> > Cc: Eric Hagberg
> > Signed-off-by: Guillaume Morin
> > ---
> > Changes in v2:
> > - Improved commit message
> > Changes in v3:
> > - Fix potential uninitialized mem access in follow_huge_pud
> > - define pud_soft_dirty when soft dirty is not enabled
> >
> >  include/linux/pgtable.h |  5 +++
> >  mm/gup.c                | 99 +++++++++++++++++++++--------------------
> >  mm/hugetlb.c            | 20 +++++----
> >  3 files changed, 66 insertions(+), 58 deletions(-)
> >
> > diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
> > index adef9d6e9b1b..9335d7c82d20 100644
> > --- a/include/linux/pgtable.h
> > +++ b/include/linux/pgtable.h
> > @@ -1422,6 +1422,11 @@ static inline int pmd_soft_dirty(pmd_t pmd)
> >  	return 0;
> >  }
> >
> > +static inline int pud_soft_dirty(pud_t pud)
> > +{
> > +	return 0;
> > +}
> > +
> >  static inline pte_t pte_mksoft_dirty(pte_t pte)
> >  {
> >  	return pte;
> > diff --git a/mm/gup.c b/mm/gup.c
> > index 746070a1d8bf..cc3eae458013 100644
> > --- a/mm/gup.c
> > +++ b/mm/gup.c
> > @@ -587,6 +587,33 @@ static struct folio *try_grab_folio_fast(struct page *page, int refs,
> >  }
> >  #endif /* CONFIG_HAVE_GUP_FAST */
> >
> > +/* Common code for can_follow_write_* */
> > +static inline bool can_follow_write_common(struct page *page,
> > +		struct vm_area_struct *vma, unsigned int flags)
> > +{
> > +	/* Maybe FOLL_FORCE is set to override it? */
> > +	if (!(flags & FOLL_FORCE))
> > +		return false;
> > +
> > +	/* But FOLL_FORCE has no effect on shared mappings */
> > +	if (vma->vm_flags & (VM_MAYSHARE | VM_SHARED))
> > +		return false;
> > +
> > +	/* ... or read-only private ones */
> > +	if (!(vma->vm_flags & VM_MAYWRITE))
> > +		return false;
> > +
> > +	/* ... or already writable ones that just need to take a write fault */
> > +	if (vma->vm_flags & VM_WRITE)
> > +		return false;
> > +
> > +	/*
> > +	 * See can_change_pte_writable(): we broke COW and could map the page
> > +	 * writable if we have an exclusive anonymous page ...
> > +	 */
> > +	return page && PageAnon(page) && PageAnonExclusive(page);
> > +}
> > +
> >  static struct page *no_page_table(struct vm_area_struct *vma,
> >  				  unsigned int flags, unsigned long address)
> >  {
> > @@ -613,6 +640,22 @@ static struct page *no_page_table(struct vm_area_struct *vma,
> >  }
> >
> >  #ifdef CONFIG_PGTABLE_HAS_HUGE_LEAVES
> > +/* FOLL_FORCE can write to even unwritable PUDs in COW mappings. */
> > +static inline bool can_follow_write_pud(pud_t pud, struct page *page,
> > +					struct vm_area_struct *vma,
> > +					unsigned int flags)
> > +{
> > +	/* If the pud is writable, we can write to the page. */
> > +	if (pud_write(pud))
> > +		return true;
> > +
> > +	if (!can_follow_write_common(page, vma, flags))
> > +		return false;
> > +
> > +	/* ... and a write-fault isn't required for other reasons. */
> > +	return !vma_soft_dirty_enabled(vma) || pud_soft_dirty(pud);
>
> This looks to be one of the first uses of pud_soft_dirty() in a generic
> part of the tree from what I can tell, which shows that s390 is lacking
> it despite setting CONFIG_HAVE_ARCH_SOFT_DIRTY:
>
> $ make -skj"$(nproc)" ARCH=s390 CROSS_COMPILE=s390-linux- mrproper defconfig mm/gup.o
> mm/gup.c: In function 'can_follow_write_pud':
> mm/gup.c:665:48: error: implicit declaration of function 'pud_soft_dirty'; did you mean 'pmd_soft_dirty'? [-Wimplicit-function-declaration]
>   665 |         return !vma_soft_dirty_enabled(vma) || pud_soft_dirty(pud);
>       |                                                ^~~~~~~~~~~~~~
>       |                                                pmd_soft_dirty
>
> Is this expected?

Yikes! It does look like an oversight in the s390 code since, as you
said, it has CONFIG_HAVE_ARCH_SOFT_DIRTY and pud_mkdirty() seems to be
setting _REGION3_ENTRY_SOFT_DIRTY. But I'll let the s390 folks opine.
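
To make the gap concrete, here is a minimal userspace sketch (not
kernel code and not a proposed patch) of what a pud_soft_dirty() helper
could look like if it simply tested the bit that pud_mkdirty() seems to
set, together with the final check from can_follow_write_pud(). The
model_* names, the pud layout and the numeric bit values are all
assumptions made up for illustration; the real definitions live in the
s390 headers and may well differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit positions; NOT the real s390 region-third-table layout. */
#define MODEL_REGION3_ENTRY_DIRTY      (UINT64_C(1) << 0)
#define MODEL_REGION3_ENTRY_SOFT_DIRTY (UINT64_C(1) << 1)

/* Hypothetical stand-in for the kernel's pud_t. */
typedef struct { uint64_t val; } model_pud_t;

/* Models a pud_mkdirty() that sets both the dirty and soft-dirty bits. */
static model_pud_t model_pud_mkdirty(model_pud_t pud)
{
	pud.val |= MODEL_REGION3_ENTRY_DIRTY | MODEL_REGION3_ENTRY_SOFT_DIRTY;
	return pud;
}

/* The missing piece: report whether the soft-dirty bit is set. */
static int model_pud_soft_dirty(model_pud_t pud)
{
	return (pud.val & MODEL_REGION3_ENTRY_SOFT_DIRTY) != 0;
}

/*
 * Mirrors the last line of can_follow_write_pud() in the quoted patch:
 * no extra write fault is needed if soft-dirty tracking is off for the
 * VMA, or if the entry is already marked soft-dirty.
 */
static bool model_write_fault_not_needed(bool vma_soft_dirty_enabled,
					 model_pud_t pud)
{
	return !vma_soft_dirty_enabled || model_pud_soft_dirty(pud);
}

/* Small self-check exercising the three interesting cases. */
static void model_selftest(void)
{
	model_pud_t clean = { 0 };
	model_pud_t dirty = model_pud_mkdirty(clean);

	assert(model_write_fault_not_needed(false, clean));
	assert(!model_write_fault_not_needed(true, clean));
	assert(model_write_fault_not_needed(true, dirty));
}
```

With soft-dirty tracking enabled, only an entry that went through the
mkdirty path skips the write fault, which is why the generic code needs
a working pud_soft_dirty() on every architecture that selects
CONFIG_HAVE_ARCH_SOFT_DIRTY.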

I don't mind dropping the pud part of the change (even if that's a bit
of a shame) if it's causing too many issues.

-- 
Guillaume Morin