From: Michal Hocko <mhocko@kernel.org>
To: "Huang, Ying" <ying.huang@intel.com>
Cc: "Andrew Morton" <akpm@linux-foundation.org>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
"Zi Yan" <ziy@nvidia.com>,
"Andrea Arcangeli" <aarcange@redhat.com>,
"Kirill A . Shutemov" <kirill.shutemov@linux.intel.com>,
"Vlastimil Babka" <vbabka@suse.cz>,
"Alexey Dobriyan" <adobriyan@gmail.com>,
"Konstantin Khlebnikov" <khlebnikov@yandex-team.ru>,
"Jérôme Glisse" <jglisse@redhat.com>,
"Yang Shi" <yang.shi@linux.alibaba.com>
Subject: Re: [PATCH -V3] /proc/PID/smaps: Add PMD migration entry parsing
Date: Fri, 3 Apr 2020 15:11:23 +0200 [thread overview]
Message-ID: <20200403131123.GD22681@dhcp22.suse.cz> (raw)
In-Reply-To: <20200403123059.1846960-1-ying.huang@intel.com>
On Fri 03-04-20 20:30:59, Huang, Ying wrote:
> From: Huang Ying <ying.huang@intel.com>
>
> Currently, when /proc/PID/smaps is read, PMD migration entries in the
> page table are simply ignored. To improve the accuracy of
> /proc/PID/smaps, add parsing and processing for them.
>
> To test the patch, we run pmbench in the background to consume 400 MB of
> memory, then run /usr/bin/migratepages and `cat /proc/PID/smaps` every
> second. The issue described below can be reproduced within 60 seconds.
>
> Before the patch, for a fully populated 400 MB anonymous VMA, some THP
> pages under migration may be lost from the accounting, as below.
>
> 7f3f6a7e5000-7f3f837e5000 rw-p 00000000 00:00 0
> Size: 409600 kB
> KernelPageSize: 4 kB
> MMUPageSize: 4 kB
> Rss: 407552 kB
> Pss: 407552 kB
> Shared_Clean: 0 kB
> Shared_Dirty: 0 kB
> Private_Clean: 0 kB
> Private_Dirty: 407552 kB
> Referenced: 301056 kB
> Anonymous: 407552 kB
> LazyFree: 0 kB
> AnonHugePages: 405504 kB
> ShmemPmdMapped: 0 kB
> FilePmdMapped: 0 kB
> Shared_Hugetlb: 0 kB
> Private_Hugetlb: 0 kB
> Swap: 0 kB
> SwapPss: 0 kB
> Locked: 0 kB
> THPeligible: 1
> VmFlags: rd wr mr mw me ac
>
> After the patch, the output is always:
>
> 7f3f6a7e5000-7f3f837e5000 rw-p 00000000 00:00 0
> Size: 409600 kB
> KernelPageSize: 4 kB
> MMUPageSize: 4 kB
> Rss: 409600 kB
> Pss: 409600 kB
> Shared_Clean: 0 kB
> Shared_Dirty: 0 kB
> Private_Clean: 0 kB
> Private_Dirty: 409600 kB
> Referenced: 294912 kB
> Anonymous: 409600 kB
> LazyFree: 0 kB
> AnonHugePages: 407552 kB
> ShmemPmdMapped: 0 kB
> FilePmdMapped: 0 kB
> Shared_Hugetlb: 0 kB
> Private_Hugetlb: 0 kB
> Swap: 0 kB
> SwapPss: 0 kB
> Locked: 0 kB
> THPeligible: 1
> VmFlags: rd wr mr mw me ac
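
The delta between the two dumps (2048 kB missing from Rss/Pss/AnonHugePages,
i.e. exactly one 2 MB THP that was under migration) is easy to check
mechanically. A minimal Python sketch that parses smaps-format text and
computes the shortfall; the sample entry is abridged from the dump above,
and the parser is illustrative only, not part of the patch:

```python
def parse_smaps_entry(text):
    """Parse one smaps VMA entry (header line followed by 'Key: N kB'
    lines) into a dict mapping field name -> size in kB."""
    fields = {}
    for line in text.strip().splitlines()[1:]:  # skip the address/perms header
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[-1] == "kB":
            fields[key.strip()] = int(parts[0])
    return fields

# Abridged from the "before the patch" dump above.
SAMPLE = """\
7f3f6a7e5000-7f3f837e5000 rw-p 00000000 00:00 0
Size:             409600 kB
Rss:              407552 kB
AnonHugePages:    405504 kB
"""

fields = parse_smaps_entry(SAMPLE)
missing = fields["Size"] - fields["Rss"]
print(missing)  # 2048 -- one 2 MB THP under migration is unaccounted
```

For a fully populated anonymous VMA, a nonzero shortfall while
migratepages is running is the symptom the patch fixes.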
>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Reviewed-by: Zi Yan <ziy@nvidia.com>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Alexey Dobriyan <adobriyan@gmail.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> Cc: "Jérôme Glisse" <jglisse@redhat.com>
> Cc: Yang Shi <yang.shi@linux.alibaba.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Thanks!
> ---
>
> v3:
>
> - Revised the patch description and removed VM_WARN_ON_ONCE() per Michal's comments
>
> v2:
>
> - Use thp_migration_supported() in the condition to reduce code size when
>   THP migration isn't enabled.
>
> - Replace VM_BUG_ON() with VM_WARN_ON_ONCE(); it's not necessary to nuke
>   the kernel for this.
>
> ---
> fs/proc/task_mmu.c | 16 +++++++++++-----
> 1 file changed, 11 insertions(+), 5 deletions(-)
>
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index 8d382d4ec067..36dc7417c0df 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -546,10 +546,17 @@ static void smaps_pmd_entry(pmd_t *pmd, unsigned long addr,
> struct mem_size_stats *mss = walk->private;
> struct vm_area_struct *vma = walk->vma;
> bool locked = !!(vma->vm_flags & VM_LOCKED);
> - struct page *page;
> + struct page *page = NULL;
> +
> + if (pmd_present(*pmd)) {
> + /* FOLL_DUMP will return -EFAULT on huge zero page */
> + page = follow_trans_huge_pmd(vma, addr, pmd, FOLL_DUMP);
> + } else if (unlikely(thp_migration_supported() && is_swap_pmd(*pmd))) {
> + swp_entry_t entry = pmd_to_swp_entry(*pmd);
>
> - /* FOLL_DUMP will return -EFAULT on huge zero page */
> - page = follow_trans_huge_pmd(vma, addr, pmd, FOLL_DUMP);
> + if (is_migration_entry(entry))
> + page = migration_entry_to_page(entry);
> + }
> if (IS_ERR_OR_NULL(page))
> return;
> if (PageAnon(page))
> @@ -578,8 +585,7 @@ static int smaps_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
>
> ptl = pmd_trans_huge_lock(pmd, vma);
> if (ptl) {
> - if (pmd_present(*pmd))
> - smaps_pmd_entry(pmd, addr, walk);
> + smaps_pmd_entry(pmd, addr, walk);
> spin_unlock(ptl);
> goto out;
> }
> --
> 2.25.0
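
To restate the control-flow change in the hunks above: smaps_pmd_entry()
now handles both a present huge PMD and a PMD migration entry itself, so
smaps_pte_range() can call it unconditionally under the PMD lock. A toy
user-space model of the new dispatch, with stand-in dicts in place of
pmd_t/swp_entry_t/struct page (a sketch of the logic, not kernel code):

```python
THP_MIGRATION_SUPPORTED = True  # stand-in for thp_migration_supported()

def smaps_pmd_entry(pmd):
    """Toy model of the dispatch the patch adds (not kernel code)."""
    page = None
    if pmd.get("present"):
        # kernel: follow_trans_huge_pmd(vma, addr, pmd, FOLL_DUMP)
        page = pmd.get("page")
    elif THP_MIGRATION_SUPPORTED and pmd.get("is_swap"):
        # kernel: pmd_to_swp_entry(*pmd), then is_migration_entry()
        entry = pmd["swp_entry"]
        if entry.get("is_migration"):
            # kernel: migration_entry_to_page(entry)
            page = entry.get("page")
    return page  # caller skips accounting when no page is found

print(smaps_pmd_entry({"present": True, "page": "THP"}))  # THP
print(smaps_pmd_entry({"is_swap": True,
                       "swp_entry": {"is_migration": True,
                                     "page": "THP-under-migration"}}))
```

Before the patch, the second case returned nothing at all, which is why
the 2 MB page under migration vanished from the Rss/Pss totals.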
--
Michal Hocko
SUSE Labs
Thread overview: 4+ messages
2020-04-03 12:30 Huang, Ying
2020-04-03 13:11 ` Michal Hocko [this message]
2020-04-03 14:26 ` Kirill A. Shutemov
2020-04-06 8:19 ` Vlastimil Babka