From: Peter Xu <peterx@redhat.com>
To: Mike Kravetz <mike.kravetz@oracle.com>
Cc: linux-mm@kvack.org, Naoya Horiguchi <naoya.horiguchi@linux.dev>,
	David Rientjes <rientjes@google.com>,
	Michal Hocko <mhocko@suse.com>,
	Matthew Wilcox <willy@infradead.org>,
	David Hildenbrand <david@redhat.com>,
	James Houghton <jthoughton@google.com>,
	Muchun Song <songmuchun@bytedance.com>
Subject: Re: A mapcount riddle
Date: Tue, 24 Jan 2023 18:00:36 -0500
Message-ID: <Y9BjFCnk31YjhUnY@x1n>
In-Reply-To: <Y9BF+OCdWnCSilEu@monkey>

On Tue, Jan 24, 2023 at 12:56:24PM -0800, Mike Kravetz wrote:
> Q How can a page be mapped into multiple processes and have a
>   mapcount of 1?
> 
> A It is a hugetlb page referenced by a shared PMD.
> 
> I was looking to expose some basic information about PMD sharing via
> /proc/smaps.  After adding the code, I started a couple processes
> sharing a large hugetlb mapping that would result in the use of
> shared PMDs.  When I looked at the output of /proc/smaps, I saw
> my new metric counting the number of shared PMDs.  However, what
> stood out was that the entire mapping was listed as Private_Hugetlb.
> WTH???  It certainly was shared!  The routine smaps_hugetlb_range
> decides between Private_Hugetlb and Shared_Hugetlb with this code:
> 
> 	if (page) {
> 		int mapcount = page_mapcount(page);
> 
> 		if (mapcount >= 2)
> 			mss->shared_hugetlb += huge_page_size(hstate_vma(vma));
> 		else
> 			mss->private_hugetlb += huge_page_size(hstate_vma(vma));
> 	}

This is definitely unfortunate..
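
Just for reference, a rough repro sketch could look like below (untested,
and only a sketch: it assumes enough 2MB hugetlb pages are reserved via
vm.nr_hugepages, and whether the pmds really end up shared still depends
on the mapping covering a PUD-aligned 1GB range, which is up to where the
kernel places it):

	#define _GNU_SOURCE
	#include <signal.h>
	#include <stdio.h>
	#include <stdlib.h>
	#include <string.h>
	#include <sys/mman.h>
	#include <sys/wait.h>
	#include <unistd.h>

	#define MAP_SIZE	(1UL << 30)	/* one PUD's worth of 2MB pmds */

	int main(void)
	{
		/* shared hugetlb mapping: the case pmd sharing applies to */
		char *p = mmap(NULL, MAP_SIZE, PROT_READ | PROT_WRITE,
			       MAP_SHARED | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
		pid_t pid;

		if (p == MAP_FAILED) {
			perror("mmap");
			return 1;
		}

		pid = fork();
		if (pid == 0) {
			/* child faults the whole range in first */
			memset(p, 1, MAP_SIZE);
			pause();
			_exit(0);
		}

		/* parent faults the same range; with pmd sharing it reuses
		 * the child's pmd page instead of bumping the mapcounts */
		sleep(1);
		memset(p, 1, MAP_SIZE);

		/* the whole mapping still shows up as Private_Hugetlb */
		system("grep -i hugetlb /proc/self/smaps");

		kill(pid, SIGKILL);
		wait(NULL);
		return 0;
	}

Even though both processes have the pages mapped, page_mapcount() on each
page stays at 1, so smaps_hugetlb_range() above accounts all of it as
private.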

> 
> After spending some time looking for issues in the page_mapcount code,
> I came to the realization that the mapcount of hugetlb pages only
> referenced by a shared PMD would be 1 no matter how many processes had
> mapped the page.  When the page is first faulted, its mapcount is set to 1.
> When it is later faulted in other processes, the already-populated shared
> PMD is simply installed in their page tables, so the mapcount is never
> incremented.
> 
> At first thought, this seems bad.  However, I believe this has been the
> behavior since hugetlb PMD sharing was introduced in 2006, and I am
> unaware of any reported issues.  I did an audit of code that looks at
> mapcount.  In addition to the above issue with smaps, there appears
> to be an issue with 'migrate_pages' where shared pages could be migrated
> without appropriate privilege.
> 
> 	/* With MPOL_MF_MOVE, we migrate only unshared hugepage. */
> 	if (flags & (MPOL_MF_MOVE_ALL) ||
> 	    (flags & MPOL_MF_MOVE && page_mapcount(page) == 1)) {
> 		if (isolate_hugetlb(page, qp->pagelist) &&
> 			(flags & MPOL_MF_STRICT))
> 			/*
> 			 * Failed to isolate page but allow migrating pages
> 			 * which have been queued.
> 			 */
> 			ret = 1;
> 	}
> 
> I will prepare fixes for both of these.  However, I wanted to ask if
> anyone has ideas about other potential issues with this?

This makes me wonder whether things should be checked before the sharing
happens in the first place.  E.g. when trying to share a pmd, would it make
sense to check the vma mempolicy first?

Then the question is: if pmd sharing only happens between vmas that share
the same memory policy, would the above mapcount==1 check be acceptable
even though the page is shared by multiple processes?
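
If we go that route, a rough sketch of the check (only a sketch; assuming
it would sit in the huge_pmd_share() candidate loop in mm/hugetlb.c, next
to the existing page_table_shareable() check) could be:

	/*
	 * While walking the i_mmap interval tree for a candidate svma to
	 * share page tables with: skip candidates whose memory policy
	 * differs from ours, so the later mapcount==1 heuristics at least
	 * stay meaningful across the vmas that do share a pmd.
	 */
	if (!mpol_equal(vma_policy(vma), vma_policy(svma)))
		continue;

It wouldn't fix the smaps accounting by itself, but it would keep
mbind()/migrate_pages() style decisions consistent across the vmas that
ended up sharing a pmd.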

Besides, I'm also curious about the planned fixes for the two issues
mentioned.

Thanks,

> 
> Since COW is mostly relevant to private mappings, shared PMDs generally
> do not apply.  Nothing stood out in a quick audit of code.
> -- 
> Mike Kravetz
> 

-- 
Peter Xu



Thread overview: 22+ messages
2023-01-24 20:56 Mike Kravetz
2023-01-24 23:00 ` Peter Xu [this message]
2023-01-24 23:29   ` Yang Shi
2023-01-25 16:02     ` Peter Xu
2023-01-25 18:26       ` Yang Shi
2023-01-24 23:35   ` Mike Kravetz
2023-01-25 16:46     ` Peter Xu
2023-01-25 18:16       ` Mike Kravetz
2023-01-25 20:13         ` Peter Xu
2023-01-25  8:24 ` Michal Hocko
2023-01-25 17:59   ` Mike Kravetz
2023-01-26  9:16     ` Michal Hocko
2023-01-26 17:51       ` Mike Kravetz
2023-01-27  9:56         ` Michal Hocko
2023-01-25  9:09 ` David Hildenbrand
2023-01-25 15:26 ` James Houghton
2023-01-25 15:54   ` Peter Xu
2023-01-25 16:22     ` James Houghton
2023-01-25 19:26       ` Vishal Moola
2023-01-26  9:15       ` David Hildenbrand
2023-01-26 18:22         ` Yang Shi
2023-01-26  9:10   ` David Hildenbrand
