linux-mm.kvack.org archive mirror
From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
To: Dave Hansen <dave.hansen@intel.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: Re: [PATCH -mm] mincore: apply page table walker on do_mincore() (Re: [PATCH 00/10] mm: pagewalk: huge page cleanups and VMA passing)
Date: Tue,  3 Jun 2014 16:01:17 -0400	[thread overview]
Message-ID: <538e2996.cd7ae00a.1e64.6067SMTPIN_ADDED_BROKEN@mx.google.com> (raw)
In-Reply-To: <538DEFD8.4050506@intel.com>

On Tue, Jun 03, 2014 at 08:55:04AM -0700, Dave Hansen wrote:
> On 06/02/2014 11:18 PM, Naoya Horiguchi wrote:
> > And for patch 8, 9, and 10, I don't think it's a good idea to add a new callback
> > which can handle both pmd and pte (because they are essentially different things).
> > But the underneath idea of doing pmd_trans_huge_lock() in the common code in
> > walk_single_entry_locked() looks nice to me. So it would be great if we can do
> > the same thing in walk_pmd_range() (of linux-mm) to reduce code in callbacks.
> 
> You think they are different, I think they're the same. :)
> 
> What the walkers *really* care about is getting a leaf node in the page
> tables.  They generally don't *care* whether it is a pmd or pte, they
> just want to know what its value is and how large it is.

OK, I see your idea, so I think we could move toward unifying all
p(gd|ud|md|te)_entry() callbacks.
And when we find a leaf entry at any level, we call the common entry
handler on it, right?
It would take some time and effort to make all existing users fit this new
scheme, so my suggestion is:
 1. move pmd locking to walk_pmd_range() (then, your locked_single_entry()
    callback is equal to pmd_entry())
 2. let each existing user have its common entry handler, and connect it to
    its pmd_entry() and/or pte_entry() to keep compatibility
 3. apply the page table walker to potential new users.
    I'd like to keep pmd/pte_entry() until we complete phase 2,
    because we may find something that requires changing the core code,
 4. and finally replace all p(gd|ud|md|te)_entry() with a unified callback.

Could you give me a few days to work on step 1?
I think that would effectively merge your patch 8 on top of mine,
so your current problem would be solved.

> I'd argue that they don't really ever need to actually know at which
> level they are in the page tables, just if they are at the bottom or
> not.  Note that *NOBODY* sets a pud or pgd entry.  That's because the
> walkers are 100% concerned about leaf nodes (pte's) at this point.

Yes. BTW, do you think we should remove pud_entry() and pgd_entry()
immediately? We could do that, and it would skip some trivial checks,
so it's a small optimization.

> Take a look at my version of gather_stats_locked():
> 
> >  static int gather_stats_locked(pte_t *pte, unsigned long addr,
> >                 unsigned long size, struct mm_walk *walk)
> >  {
> >         struct numa_maps *md = walk->private;
> >         struct page *page = can_gather_numa_stats(*pte, walk->vma, addr);
> >  
> >         if (page)
> >                 gather_stats(page, md, pte_dirty(*pte), size/PAGE_SIZE);
> >  
> >         return 0;
> >  }
> 
> The mmotm version looks _very_ similar to that, *BUT* the mmotm version
> needs to have an entire *EXTRA* 22-line gather_pmd_stats() dealing with
> THP locking, while mine doesn't.

OK, my objection was just to adding a new callback, because it introduces
some duplication (3 callbacks for 2 types of entries). But I agree with
doing the pmd locking in the common code. That should make both of us
happy :)

Thanks,
Naoya Horiguchi


Thread overview: 20+ messages
2014-06-02 21:36 [PATCH 00/10] mm: pagewalk: huge page cleanups and VMA passing Dave Hansen
2014-06-02 21:36 ` [PATCH 01/10] mm: pagewalk: consolidate vma->vm_start checks Dave Hansen
2014-06-02 21:36 ` [PATCH 02/10] mm: pagewalk: always skip hugetlbfs except when explicitly handled Dave Hansen
2014-06-02 21:36 ` [PATCH 03/10] mm: pagewalk: have generic code keep track of VMA Dave Hansen
2014-06-02 21:36 ` [PATCH 04/10] mm: pagewalk: add page walker for mincore() Dave Hansen
2014-06-02 21:36 ` [PATCH 05/10] mm: mincore: clean up hugetlbfs handling (part 1) Dave Hansen
2014-06-02 21:36 ` [PATCH 06/10] mm: mincore: clean up hugetlbfs handler (part 2) Dave Hansen
2014-06-02 21:36 ` [PATCH 07/10] mm: pagewalk: kill check for hugetlbfs inside /proc pagemap code Dave Hansen
2014-06-02 21:36 ` [PATCH 08/10] mm: pagewalk: add locked pte walker Dave Hansen
2014-06-02 21:36 ` [PATCH 09/10] mm: pagewalk: use new locked walker for /proc/pid/smaps Dave Hansen
2014-06-02 21:36 ` [PATCH 10/10] mm: pagewalk: use locked walker for /proc/pid/numa_maps Dave Hansen
2014-06-02 21:52 ` [PATCH 00/10] mm: pagewalk: huge page cleanups and VMA passing Naoya Horiguchi
     [not found] ` <1401745925-l651h3s9@n-horiguchi@ah.jp.nec.com>
2014-06-02 21:53   ` Dave Hansen
2014-06-03  6:18     ` [PATCH -mm] mincore: apply page table walker on do_mincore() (Re: [PATCH 00/10] mm: pagewalk: huge page cleanups and VMA passing) Naoya Horiguchi
     [not found]     ` <1401776292-dn0fof8e@n-horiguchi@ah.jp.nec.com>
2014-06-03 15:55       ` Dave Hansen
2014-06-03 20:01         ` Naoya Horiguchi [this message]
2014-06-03 20:08           ` Naoya Horiguchi
     [not found]         ` <1401825676-8py0r32h@n-horiguchi@ah.jp.nec.com>
2014-06-03 20:33           ` Dave Hansen
2014-06-03 15:59       ` Dave Hansen
2014-06-03 16:22         ` Naoya Horiguchi
