From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wi0-f176.google.com (mail-wi0-f176.google.com [209.85.212.176]) by kanga.kvack.org (Postfix) with ESMTP id 4D8EE6B0253 for ; Tue, 13 Oct 2015 12:07:22 -0400 (EDT) Received: by wijq8 with SMTP id q8so38868016wij.0 for ; Tue, 13 Oct 2015 09:07:21 -0700 (PDT) Received: from mail-wi0-f180.google.com (mail-wi0-f180.google.com. [209.85.212.180]) by mx.google.com with ESMTPS id gl16si4794409wjc.187.2015.10.13.09.06.58 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Tue, 13 Oct 2015 09:06:58 -0700 (PDT) Received: by wicge5 with SMTP id ge5so199418263wic.0 for ; Tue, 13 Oct 2015 09:06:58 -0700 (PDT) Date: Tue, 13 Oct 2015 19:06:56 +0300 From: "Kirill A. Shutemov" Subject: Re: pmd_modify() semantics Message-ID: <20151013160656.GA14071@node> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Sender: owner-linux-mm@kvack.org List-ID: To: Vineet Gupta Cc: "Kirill A. Shutemov" , Andrew Morton , lkml , "linux-mm@kvack.org" , Minchan Kim On Tue, Oct 13, 2015 at 01:58:39PM +0000, Vineet Gupta wrote: > Hi Kirill, > > I'm running LTP tests on the new ARC THP code and thp03 seems to be triggering mm > spew. > > --------------->8--------------------- > [ARCLinux]# ./ltp-thp03-extract > PID 60 > bad pmd bf1c4600 be600231 > ../mm/pgtable-generic.c:34: bad pgd be600231. > bad pmd bf1c4604 bd800231 > ../mm/pgtable-generic.c:34: bad pgd bd800231. > BUG: Bad rss-counter state mm:bf12e900 idx:1 val:512 > BUG: non-zero nr_ptes on freeing mm: 2 > --------------->8--------------------- > > I know what exactly is happening and the likely fix, but would want to get some > thoughts from you if possible. > > background: ARC is software page walked with PGD -> PTE -> page for normal and PMD > -> page for THP case. A vanilla PGD doesn't have any flags - only pointer to PTE > > A reduced version of thp03 allocates a THP, dirties it, followed by > mprotect(PROT_NONE). > At the time of mprotect() -> change_huge_pmd() -> pmd_modify() needs to change > some of the bits. > > The issue is ARC implementation of pmd_modify() based on pte variant, which > retains the soft pte bits (dirty and accessed). > > static inline pmd_t pmd_modify(pmd_t pmd, pgprot_t newprot) > { > return pte_pmd(pte_modify(pmd_pte(pmd), newprot)); > } > > Obvious fix is to rewrite pmd_modify() so that it clears out all pte type flags > but that assumes PMD is becoming PGD (a vanilla PGD on ARC doesn't have any > flags). Can we have pmd_modify() ever be called for NOT splitting pmd e.g. > mprotect Write to Read which won't split the THP like it does now and simply > changes the prot flags. My proposed version of pmd_modify() will loose the dirty bit. Hm? pmd_modify() is nothing to do with splitting. The mprotect() codepath you've mentioned above calls pmd_modify() only if the THP is fully in mprotect range. > In short, what are the semantics of pmd_modify() - essentially does it imply pmd > is being split so are free to make it like PGD. No, pmd_modify() cannot make such assumption. That's just not true -- we don't split PMD in such codepath. And even if we do, we construct new PMD entry from scratch instead of modifying existing one. So the semantics of pmd_modify(): you can assume that the entry is pmd_large(), going to stay this way and you need to touch only protection-related bit. -- Kirill A. Shutemov -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: email@kvack.org