From: "Li Xinhai" <lixinhai.lxh@gmail.com>
To: "Jason Gunthorpe" <jgg@mellanox.com>
Cc: "Mike Kravetz" <mike.kravetz@oracle.com>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
akpm <akpm@linux-foundation.org>,
"Punit Agrawal" <punit.agrawal@arm.com>,
Longpeng <longpeng2@huawei.com>
Subject: Re: [PATCH] mm/hugetlb: avoid unnecessary check on pud and pmd entry in huge_pte_offset
Date: Fri, 24 Apr 2020 21:33:40 +0800
Message-ID: <2020042421333861801820@gmail.com>
In-Reply-To: <20200424125753.GK13640@mellanox.com>
On 2020-04-24 at 20:57 Jason Gunthorpe wrote:
>On Fri, Apr 24, 2020 at 12:07:50PM +0800, Li Xinhai wrote:
>> On 2020-04-24 at 02:38 Jason Gunthorpe wrote:
>> >On Thu, Apr 23, 2020 at 11:14:28AM -0700, Mike Kravetz wrote:
>> >> Cc a few people who have looked at huge_pte_offset() recently.
>> >>
>> >> On 4/23/20 5:49 AM, Li Xinhai wrote:
>> >> > When huge_pte_offset() is called, the parameter sz can only be PUD_SIZE
>> >> > or PMD_SIZE.
>> >> > If sz is PUD_SIZE and code can reach pud, then *pud must be none, or
>> >> > normal hugetlb entry, or non-present (migration or hwpoisoned) hugetlb
>> >> > entry, and we can directly return pud.
>> >> > When sz is PMD_SIZE, pud must be none or present, and if code can reach
>> >> > pmd, we can directly return pmd.
>> >> >
>> >> > So, after this patch, the code is simplified by first check on the
>> >> > parameter sz, and avoid unnecessary checks in current code.
>> >> >
>> >> > Signed-off-by: Li Xinhai <lixinhai.lxh@gmail.com>
>> >> > Cc: Mike Kravetz <mike.kravetz@oracle.com>
>> >> > Cc: Andrew Morton <akpm@linux-foundation.org>
>> >> > mm/hugetlb.c | 24 +++++++++---------------
>> >> > 1 file changed, 9 insertions(+), 15 deletions(-)
>> >> >
>> >> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> >> > index bcabbe0..e1424f5 100644
>> >> > +++ b/mm/hugetlb.c
>> >> > @@ -5365,8 +5365,8 @@ pte_t *huge_pte_offset(struct mm_struct *mm,
>> >> > {
>> >> > pgd_t *pgd;
>> >> > p4d_t *p4d;
>> >> > - pud_t *pud, pud_entry;
>> >> > - pmd_t *pmd, pmd_entry;
>> >> > + pud_t *pud;
>> >> > + pmd_t *pmd;
>> >> >
>> >> > pgd = pgd_offset(mm, addr);
>> >> > if (!pgd_present(*pgd))
>> >> > @@ -5376,22 +5376,16 @@ pte_t *huge_pte_offset(struct mm_struct *mm,
>> >> > return NULL;
>> >> >
>> >> > pud = pud_offset(p4d, addr);
>> >> > - pud_entry = READ_ONCE(*pud);
>> >> > - if (sz != PUD_SIZE && pud_none(pud_entry))
>> >> > - return NULL;
>> >> > - /* hugepage or swap? */
>> >> > - if (pud_huge(pud_entry) || !pud_present(pud_entry))
>> >> > + if (sz == PUD_SIZE)
>> >> > + /* must be pud_huge or pud_none */
>> >> > return (pte_t *)pud;
>> >> > -
>> >> > - pmd = pmd_offset(pud, addr);
>> >> > - pmd_entry = READ_ONCE(*pmd);
>> >> > - if (sz != PMD_SIZE && pmd_none(pmd_entry))
>> >> > + if (!pud_present(*pud))
>> >> > return NULL;
>> >> > - /* hugepage or swap? */
>> >> > - if (pmd_huge(pmd_entry) || !pmd_present(pmd_entry))
>> >> > - return (pte_t *)pmd;
>> >> > + /* must have a valid entry and size to go further */
>> >> >
>> >> > - return NULL;
>> >> > + pmd = pmd_offset(pud, addr);
>> >>
>> >> Can we get here with sz = PMD_SIZE and pud_none(*pud)? Would that be
>> >> an issue for the pmd_offset() call?
>> >
>> >Certainly pmd_offset() must only be called if the PUD entry is
>> >pointing at a pmd level.
>> >
>> >AFAIK this means it should not be called on pud_none(), pud_huge() or
>> >!pud_present() cases.
>>
>> The test of !pud_present(*pud) also block pud_none(*pud)
>
>Sure
>
>> , so when sz == PMD_SIZE, pmd_offset() only called with a valid PUD
>> entry which point to PMD page table.
>
>But what prevents pud_huge?
>
If sz == PUD_SIZE, the 'return (pte_t *)pud' already ends the function. That covers
both pud_huge() and pud_none(), because the mapping is for a PUD_SIZE huge page.
So there is no possibility of pmd_offset() being called with an invalid pud entry.
Below is the code I used for testing, with BUG_ON checks added; it should give a
clearer idea of the semantics of each code path:
...
	pud = pud_offset(p4d, addr);
	if (sz == PUD_SIZE) {
		/* must be pud_huge or pud_none */
		BUG_ON(!pud_huge(*pud) && !pud_none(*pud));
		return (pte_t *)pud; // note: returns a valid pointer for the pud_none() case,
				     // instead of NULL; same semantics as the existing code.
	}
	if (!pud_present(*pud))
		return NULL; // note: only returns NULL when the pud is not present,
			     // same semantics as the existing code.
	/* must have a valid entry and size to go further */
	BUG_ON(sz != PMD_SIZE);
	pmd = pmd_offset(pud, addr);
	/* must be pmd_huge or pmd_none */
	BUG_ON(!pmd_huge(*pmd) && !pmd_none(*pmd));
	return (pte_t *)pmd; // note: returns a valid pointer for the pmd_none() case,
			     // instead of NULL; same semantics as the existing code.
...
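For readers who want to step through the code path above outside the kernel, here is a minimal userspace sketch of the same control flow. It is not kernel code: enum entry_kind, struct model_pud, struct model_pmd and model_huge_pte_offset() are hypothetical stand-ins, the pmd "table" is reduced to a single entry, and non-present (migration or hwpoisoned) entries are not modelled. The assert() calls play the role of the BUG_ON checks and encode the assumption argued in this mail, not a proven property of the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for page-table entry states; not kernel API. */
enum entry_kind {
	ENTRY_NONE,	/* models pud_none() / pmd_none() */
	ENTRY_HUGE,	/* models pud_huge() / pmd_huge() */
	ENTRY_TABLE,	/* entry points at a lower-level table */
};

struct model_pmd {
	enum entry_kind kind;
};

struct model_pud {
	enum entry_kind kind;
	struct model_pmd *pmd;	/* valid only when kind == ENTRY_TABLE */
};

enum model_size { MODEL_PMD_SIZE, MODEL_PUD_SIZE };

/* Returns the entry a caller would operate on, or NULL. */
static void *model_huge_pte_offset(struct model_pud *pud, enum model_size sz)
{
	if (sz == MODEL_PUD_SIZE) {
		/* Assumed invariant: a PUD_SIZE mapping has no pmd table here. */
		assert(pud->kind == ENTRY_HUGE || pud->kind == ENTRY_NONE);
		return pud;	/* non-NULL even for ENTRY_NONE */
	}

	if (pud->kind == ENTRY_NONE)	/* models !pud_present(*pud) */
		return NULL;

	/* Assumed invariant: a PMD_SIZE mapping never sees a huge pud. */
	assert(pud->kind == ENTRY_TABLE);
	assert(pud->pmd->kind == ENTRY_HUGE || pud->pmd->kind == ENTRY_NONE);
	return pud->pmd;	/* non-NULL even for ENTRY_NONE */
}
```

In this model the only NULL return is the PMD_SIZE lookup under an empty pud; every other reachable case returns a pointer to the entry itself, including the "none" cases.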
>This API seems kind of strange to be honest.. Should it be two
>functions instead of a sz parameter?
>
>huge_pud_offset() and huge_pmd_offset() ?
I think checking the huge page size and then calling one of these two functions
at each call site would involve a lot of redundant code; doing the branching
inside one function is better.
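A rough userspace sketch of that trade-off follows. Everything in it is hypothetical and for illustration only: struct model_walk, enum lookup_size and all function names are invented here, and the two split helpers merely stand in for the huge_pud_offset()/huge_pmd_offset() idea without implementing a real page-table walk.

```c
#include <stddef.h>

/* Hypothetical size selector; not the kernel's PUD_SIZE/PMD_SIZE values. */
enum lookup_size { LOOKUP_PMD, LOOKUP_PUD };

/* Hypothetical result of a walk: the entry found at each level. */
struct model_walk {
	void *pud_entry;	/* what a PUD-level lookup would return */
	void *pmd_entry;	/* what a PMD-level lookup would return */
};

/* Stand-ins for the suggested two-function split. */
static void *split_pud_lookup(const struct model_walk *w)
{
	return w->pud_entry;
}

static void *split_pmd_lookup(const struct model_walk *w)
{
	return w->pmd_entry;
}

/* With the split, a branch like this would be repeated at each call site. */
static void *caller_side_lookup(const struct model_walk *w, enum lookup_size sz)
{
	if (sz == LOOKUP_PUD)
		return split_pud_lookup(w);
	return split_pmd_lookup(w);
}

/* With one function taking sz, the branch lives in a single place. */
static void *single_entry_lookup(const struct model_walk *w, enum lookup_size sz)
{
	return sz == LOOKUP_PUD ? w->pud_entry : w->pmd_entry;
}
```

Both variants return the same entry for the same inputs; the difference is only where the size check lives.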
>
>Jason