Re: [PATCH, RFC] x86/mm/pat: Restore large pages after fragmentation

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Peter Zijlstra <peterz@infradead.org>
To: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Andy Lutomirski <luto@kernel.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	"H. Peter Anvin" <hpa@zytor.com>,
	x86@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH, RFC] x86/mm/pat: Restore large pages after fragmentation
Date: Fri, 17 Apr 2020 17:17:07 +0200	[thread overview]
Message-ID: <20200417151707.GG20730@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <20200417144244.gdoblao656l6ktkr@box>

On Fri, Apr 17, 2020 at 05:42:44PM +0300, Kirill A. Shutemov wrote:
> On Fri, Apr 17, 2020 at 02:18:28PM +0200, Peter Zijlstra wrote:
> Do you mean that it is not aligned to '(' on the previous line?
> 
> Okay, I'll fix it up (and change my vim setup). But I find indenting with
> spaces disgusting.

set cino=:0(0


> > >  static void cpa_flush(struct cpa_data *data, int cache)
> > >  {
> > > +	LIST_HEAD(pgtables);
> > > +	struct page *page, *tmp;
> > 
> > xmas fail
> 
> Emm. What are rules here?
> 
> > >  	struct cpa_data *cpa = data;
> > >  	unsigned int i;

Basically we (tip) strive for the inverse xmas tree thing, so here that
would be:

	struct cpa_data *cpa = data;
+	struct page *page, *tmp;
+	LIST_HEAD(pgtables);
	unsigned int i;


> > > +	pr_debug("2M restored at %#lx\n", addr);
> > 
> > While I appreciate it's usefulness while writing this, I do think we can
> > do without that print once we know it works.
> 
> Even with pr_debug()? I shouldn't be noisy for most users.

I always have debug on; also there is no counterpart on split either.

> > > +/*
> > > + * Restore PMD and PUD pages in the kernel mapping around the address where
> > > + * possible.
> > > + *
> > > + * Caller must flush TLB and free page tables queued on the list before
> > > + * touching the new entries. CPU must not see TLB entries of different size
> > > + * with different attributes.
> > > + */
> > > +static void restore_large_pages(unsigned long addr, struct list_head *pgtables)
> > > +{
> > > +	pgd_t *pgd;
> > > +	p4d_t *p4d;
> > > +	pud_t *pud;
> > > +
> > > +	addr &= PUD_MASK;
> > > +
> > > +	spin_lock(&pgd_lock);
> > > +	pgd = pgd_offset_k(addr);
> > > +	if (pgd_none(*pgd))
> > > +		goto out;
> > > +	p4d = p4d_offset(pgd, addr);
> > > +	if (p4d_none(*p4d))
> > > +		goto out;
> > > +	pud = pud_offset(p4d, addr);
> > > +	if (!pud_present(*pud) || pud_large(*pud))
> > > +		goto out;
> > > +
> > > +	restore_pud_page(pud, addr, pgtables);
> > > +out:
> > > +	spin_unlock(&pgd_lock);
> > > +}
> > 
> > I find this odd, at best. AFAICT this does not attempt to reconstruct a
> > PMD around @addr when possible. When the first PMD of the PUD can't be
> > reconstructed, we give up entirely.
> 
> No, you misread. If the first PMD is not suitable, we give up
> reconstructing PUD page, but we still look at all PMD of the PUD range.

Aah, indeed. I got my brain in a twist reading that pud function.

> But this might be excessive. What you are proposing below should be fine
> and more efficient. If we do everything right the rest of PMDs in the PUD
> have to already large or not suitable for reconstructing.

Just so.

> We might still look at the rest of PMDs for CONFIG_CPA_DEBUG, just to make
> sure we don't miss some corner case where we didn't restore a PMD.
> 
> (Also I think about s/restore/reconstruct/g)

Right, and WARN if they do succeed collapsing ;-)

next prev parent reply	other threads:[~2020-04-17 15:17 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-16 21:32 Kirill A. Shutemov
2020-04-16 22:03 ` Dave Hansen
2020-04-16 22:12   ` Kirill A. Shutemov
2020-04-16 22:38     ` Dave Hansen
2020-04-17  0:52 ` Mika Penttilä
2020-04-17 11:42   ` Kirill A. Shutemov
2020-04-17 12:05     ` Mika Penttilä
2020-04-17 12:18 ` Peter Zijlstra
2020-04-17 14:42   ` Kirill A. Shutemov
2020-04-17 15:17     ` Peter Zijlstra [this message]
2020-04-17 15:47 ` Peter Zijlstra
2020-04-17 16:54   ` Kirill A. Shutemov
2020-04-25 11:43 ` [x86/mm/pat] ae64ac1a83: BUG:Bad_page_state_in_process kernel test robot

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200417151707.GG20730@hirez.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=bp@alien8.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=hpa@zytor.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=kirill@shutemov.name \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@kernel.org \
    --cc=mingo@redhat.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox