From: James Houghton
Date: Tue, 21 Feb 2023 08:36:08 -0800
Subject: Re: [PATCH v2 11/46] hugetlb: add hugetlb_pte to track HugeTLB page table entries
To: Mina Almasry
Cc: Mike Kravetz, Muchun Song, Peter Xu, Andrew Morton, David Hildenbrand,
 David Rientjes, Axel Rasmussen, "Zach O'Keefe", Manish Mishra,
 Naoya Horiguchi, "Dr . David Alan Gilbert", "Matthew Wilcox (Oracle)",
 Vlastimil Babka, Baolin Wang, Miaohe Lin, Yang Shi, Frank van der Linden,
 Jiaqi Yan, linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20230218002819.1486479-1-jthoughton@google.com>
 <20230218002819.1486479-12-jthoughton@google.com>

On Fri, Feb 17, 2023 at 9:24 PM Mina Almasry wrote:
>
> On Fri, Feb 17, 2023 at
4:28 PM James Houghton wrote:
> >
> > After high-granularity mapping, page table entries for HugeTLB pages can
> > be of any size/type. (For example, we can have a 1G page mapped with a
> > mix of PMDs and PTEs.) This struct is to help keep track of a HugeTLB
> > PTE after we have done a page table walk.
> >
> > Without this, we'd have to pass around the "size" of the PTE everywhere.
> > We effectively did this before; it could be fetched from the hstate,
> > which we pass around pretty much everywhere.
> >
> > hugetlb_pte_present_leaf is included here as a helper function that will
> > be used frequently later on.
> >
> > Signed-off-by: James Houghton
> >
>
> Only nits.
>
> Reviewed-by: Mina Almasry

Thanks Mina!

>
> > diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> > index a1ceb9417f01..eeacadf3272b 100644
> > --- a/include/linux/hugetlb.h
> > +++ b/include/linux/hugetlb.h
> > @@ -26,6 +26,25 @@ typedef struct { unsigned long pd; } hugepd_t;
> >  #define __hugepd(x) ((hugepd_t) { (x) })
> >  #endif
> >
> > +enum hugetlb_level {
> > +        HUGETLB_LEVEL_PTE = 1,
> > +        /*
> > +         * We always include PMD, PUD, and P4D in this enum definition so that,
> > +         * when logged as an integer, we can easily tell which level it is.
> > +         */
> > +        HUGETLB_LEVEL_PMD,
> > +        HUGETLB_LEVEL_PUD,
> > +        HUGETLB_LEVEL_P4D,
> > +        HUGETLB_LEVEL_PGD,
> > +};
> > +
> > +struct hugetlb_pte {
> > +        pte_t *ptep;
> > +        unsigned int shift;
> > +        enum hugetlb_level level;
> > +        spinlock_t *ptl;
> > +};
> > +
> >  #ifdef CONFIG_HUGETLB_PAGE
> >
> >  #include
> > @@ -39,6 +58,20 @@ typedef struct { unsigned long pd; } hugepd_t;
> >   */
> >  #define __NR_USED_SUBPAGE 3
> >
> > +static inline
> > +unsigned long hugetlb_pte_size(const struct hugetlb_pte *hpte)
> > +{
> > +        return 1UL << hpte->shift;
> > +}
> > +
> > +static inline
> > +unsigned long hugetlb_pte_mask(const struct hugetlb_pte *hpte)
> > +{
> > +        return ~(hugetlb_pte_size(hpte) - 1);
> > +}
> > +
> > +bool hugetlb_pte_present_leaf(const struct hugetlb_pte *hpte, pte_t pte);
> > +
> >  struct hugepage_subpool {
> >          spinlock_t lock;
> >          long count;
> > @@ -1234,6 +1267,45 @@ static inline spinlock_t *huge_pte_lock(struct hstate *h,
> >          return ptl;
> >  }
> >
> > +static inline
> > +spinlock_t *hugetlb_pte_lockptr(struct hugetlb_pte *hpte)
> > +{
> > +        return hpte->ptl;
> > +}
>
> I find this helper unnecessary. I would remove it.

Ok. Will do.

>
> > +
> > +static inline
> > +spinlock_t *hugetlb_pte_lock(struct hugetlb_pte *hpte)
> > +{
> > +        spinlock_t *ptl = hugetlb_pte_lockptr(hpte);
> > +
> > +        spin_lock(ptl);
>
> Here 'spin_lock(hpte->ptl)' would be more immediately understandable
> IMO, for example.
>
> > +        return ptl;
> > +}
> > +
> > +static inline
> > +void __hugetlb_pte_init(struct hugetlb_pte *hpte, pte_t *ptep,
> > +                        unsigned int shift, enum hugetlb_level level,
> > +                        spinlock_t *ptl)
> > +{
> > +        /*
> > +         * If 'shift' indicates that this PTE is contiguous, then @ptep must
> > +         * be the first pte of the contiguous bunch.
> > +         */
>
> I would move the comment to above the function as a pseudo doc. It
> seems to instruct the user of the function of how to use it.

Right. Will do.
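For illustration only, here is a minimal userspace sketch of how the two nits above might look once applied: hugetlb_pte_lockptr() is dropped in favour of using hpte->ptl directly, and the contiguity comment sits above __hugetlb_pte_init() as a pseudo doc. This is not the code from any later revision of the series. The spinlock_t, pte_t, spin_lock() and spin_unlock() definitions are single-threaded stand-ins assumed here so the snippet compiles outside the kernel.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-ins (assumptions), not the kernel's real types. */
typedef struct { bool locked; } spinlock_t;
typedef struct { unsigned long pte; } pte_t;

static inline void spin_lock(spinlock_t *lock)
{
	assert(!lock->locked);	/* a single-threaded mock cannot spin */
	lock->locked = true;
}

static inline void spin_unlock(spinlock_t *lock)
{
	lock->locked = false;
}

enum hugetlb_level {
	HUGETLB_LEVEL_PTE = 1,
	HUGETLB_LEVEL_PMD,
	HUGETLB_LEVEL_PUD,
	HUGETLB_LEVEL_P4D,
	HUGETLB_LEVEL_PGD,
};

struct hugetlb_pte {
	pte_t *ptep;
	unsigned int shift;
	enum hugetlb_level level;
	spinlock_t *ptl;
};

/* No hugetlb_pte_lockptr(): the lock pointer is read directly. */
static inline spinlock_t *hugetlb_pte_lock(struct hugetlb_pte *hpte)
{
	spin_lock(hpte->ptl);
	return hpte->ptl;
}

/*
 * If 'shift' indicates that this PTE is contiguous, then @ptep must
 * be the first pte of the contiguous bunch.
 */
static inline void __hugetlb_pte_init(struct hugetlb_pte *hpte, pte_t *ptep,
				      unsigned int shift,
				      enum hugetlb_level level,
				      spinlock_t *ptl)
{
	hpte->ptl = ptl;
	hpte->ptep = ptep;
	hpte->shift = shift;
	hpte->level = level;
}
```

A caller would initialise a struct hugetlb_pte once after the page table walk, take the lock with hugetlb_pte_lock(), and release the returned pointer with spin_unlock().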

>
> > +        hpte->ptl = ptl;
> > +        hpte->ptep = ptep;
> > +        hpte->shift = shift;
> > +        hpte->level = level;
> > +}
> > +
> > +static inline
> > +void hugetlb_pte_init(struct mm_struct *mm, struct hugetlb_pte *hpte,
> > +                      pte_t *ptep, unsigned int shift,
> > +                      enum hugetlb_level level)
> > +{
> > +        __hugetlb_pte_init(hpte, ptep, shift, level,
> > +                           huge_pte_lockptr(shift, mm, ptep));
> > +}
> > +
> >  #if defined(CONFIG_HUGETLB_PAGE) && defined(CONFIG_CMA)
> >  extern void __init hugetlb_cma_reserve(int order);
> >  #else
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index 5ca9eae0ac42..6c74adff43b6 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -1269,6 +1269,35 @@ static bool vma_has_reserves(struct vm_area_struct *vma, long chg)
> >          return false;
> >  }
> >
> > +bool hugetlb_pte_present_leaf(const struct hugetlb_pte *hpte, pte_t pte)
> > +{
> > +        pgd_t pgd;
> > +        p4d_t p4d;
> > +        pud_t pud;
> > +        pmd_t pmd;
> > +
> > +        switch (hpte->level) {
> > +        case HUGETLB_LEVEL_PGD:
> > +                pgd = __pgd(pte_val(pte));
> > +                return pgd_present(pgd) && pgd_leaf(pgd);
> > +        case HUGETLB_LEVEL_P4D:
> > +                p4d = __p4d(pte_val(pte));
> > +                return p4d_present(p4d) && p4d_leaf(p4d);
> > +        case HUGETLB_LEVEL_PUD:
> > +                pud = __pud(pte_val(pte));
> > +                return pud_present(pud) && pud_leaf(pud);
> > +        case HUGETLB_LEVEL_PMD:
> > +                pmd = __pmd(pte_val(pte));
> > +                return pmd_present(pmd) && pmd_leaf(pmd);
> > +        case HUGETLB_LEVEL_PTE:
> > +                return pte_present(pte);
> > +        default:
> > +                WARN_ON_ONCE(1);
> > +                return false;
> > +        }
> > +}
> > +
> > +
> >  static void enqueue_hugetlb_folio(struct hstate *h, struct folio *folio)
> >  {
> >          int nid = folio_nid(folio);
> > --
> > 2.39.2.637.g21b0678d19-goog
> >
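
As a worked illustration of the commit message's point that the "size" of an entry now travels with the hugetlb_pte instead of coming from the hstate, the size and mask helpers from the patch can be exercised in userspace. This is a minimal sketch with assumptions: the struct is trimmed to the one field the arithmetic needs, hugetlb_pte_base() is a hypothetical helper that is not part of the patch, and the shift values in the usage note (12, 21, 30) are the x86-64 ones.

```c
#include <assert.h>
#include <limits.h>

/* Trimmed stand-in for the patch's struct; only 'shift' matters here. */
struct hugetlb_pte {
	unsigned int shift;
};

/* Bytes mapped by one entry at this level, as in the patch. */
static inline unsigned long hugetlb_pte_size(const struct hugetlb_pte *hpte)
{
	/* Guard added for this sketch: shifting by the full width is undefined. */
	assert(hpte->shift < sizeof(unsigned long) * CHAR_BIT);
	return 1UL << hpte->shift;
}

/* Mask that clears the offset bits within the entry, as in the patch. */
static inline unsigned long hugetlb_pte_mask(const struct hugetlb_pte *hpte)
{
	return ~(hugetlb_pte_size(hpte) - 1);
}

/*
 * Hypothetical helper (not in the patch): round an address down to
 * the start of the region that this entry maps.
 */
static inline unsigned long hugetlb_pte_base(const struct hugetlb_pte *hpte,
					     unsigned long addr)
{
	return addr & hugetlb_pte_mask(hpte);
}
```

With shift = 21 the size is 2M and address 0x40212345 rounds down to 0x40200000; with shift = 12 the same address rounds down to 0x40212000. A 1G region (shift = 30) mapped entirely at the PMD level would need 512 such entries, which is why the per-entry shift has to be carried around once a single HugeTLB page can be mapped at mixed levels.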