From: Hugh Dickins <hughd@google.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Andrea Arcangeli <aarcange@redhat.com>,
Andres Lagar-Cavilla <andreslc@google.com>,
Yang Shi <yang.shi@linaro.org>, Ning Qu <quning@gmail.com>,
Vladimir Davydov <vdavydov@virtuozzo.com>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH 08/31] huge tmpfs: try_to_unmap_one use page_check_address_transhuge
Date: Tue, 5 Apr 2016 14:23:04 -0700 (PDT)
Message-ID: <alpine.LSU.2.11.1604051421260.5965@eggly.anvils>
In-Reply-To: <alpine.LSU.2.11.1604051403210.5965@eggly.anvils>

Anon THP's huge pages are split for reclaim in add_to_swap(), before they
reach try_to_unmap(); migrate_misplaced_transhuge_page() does its own pmd
remapping, instead of needing try_to_unmap(); migratable hugetlbfs pages
masquerade as pte-mapped in page_check_address(). So try_to_unmap_one()
did not need to handle transparent pmd mappings as page_referenced_one()
does (beyond the TTU_SPLIT_HUGE_PMD case; though what about TTU_MUNLOCK?).

But tmpfs huge pages are split a little later in the reclaim sequence,
when pageout() calls shmem_writepage(): so try_to_unmap_one() now needs
to handle pmd-mapped pages by using page_check_address_transhuge(), and
a function unmap_team_by_pmd() that a later patch will place in
huge_memory.c; just a stub for it here for now.
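
For context (not part of this patch), a much simplified sketch of the
relevant ordering in mm/vmscan.c shrink_page_list(), with error handling
and most other cases omitted:

	if (PageAnon(page) && !PageSwapCache(page))
		add_to_swap(page, page_list);	/* splits an anon THP before unmap */

	if (page_mapped(page) && mapping)
		try_to_unmap(page, ttu_flags);	/* huge tmpfs page may still be pmd-mapped here */

	if (PageDirty(page))
		pageout(page, mapping, sc);	/* shmem_writepage() splits the team only now */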

Refine the lookup in page_check_address_transhuge() slightly, to match
what mm_find_pmd() does, and what we've been using for a year: take a
pmdval snapshot of *pmd first, so that the pmd_page check can be done
before taking pmd_lock, with a retry if *pmd changes in between. Was the
code wrong before? I don't think it was, but I am more comfortable with
how it is now.
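
In outline, the lookup now follows this pattern (condensed from the
mm/rmap.c hunk below):

again:
	pmdval = *pmd;
	barrier();
	if (!pmd_present(pmdval))
		return false;
	if (pmd_trans_huge(pmdval)) {
		if (pmd_page(pmdval) != page)	/* checked without pmd_lock */
			return false;
		ptl = pmd_lock(mm, pmd);
		if (unlikely(!pmd_same(*pmd, pmdval))) {
			spin_unlock(ptl);
			goto again;	/* *pmd changed under us: retry */
		}
		pte = NULL;
		goto found;
	}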

Change its check on hpage_nr_pages() to use compound_order() instead,
for two reasons: one being that there's now a case in anon THP splitting
where the new call to page_check_address_transhuge() may be on a
PageTail, which hits the VM_BUG_ON in PageTransHuge() called from
hpage_nr_pages(); the other being that hpage_nr_pages() on PageTeam gets
more interesting in a later patch, and would no longer be appropriate
here.
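
For reference (not part of this patch), roughly how those helpers stand
at this point, paraphrased from include/linux/page-flags.h,
include/linux/huge_mm.h and include/linux/mm.h:

	static inline int PageTransHuge(struct page *page)
	{
		VM_BUG_ON_PAGE(PageTail(page), page);	/* what a PageTail caller hits */
		return PageHead(page);
	}

	static inline int hpage_nr_pages(struct page *page)
	{
		if (unlikely(PageTransHuge(page)))
			return HPAGE_PMD_NR;
		return 1;
	}

	static inline unsigned int compound_order(struct page *page)
	{
		if (!PageHead(page))
			return 0;	/* safe on a tail page: no VM_BUG_ON */
		return page[1].compound_order;
	}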
Say "pmdval" as usual, instead of the "pmde" I made up for mm_find_pmd()
before. Update the comment in mm_find_pmd() to generalise it away from
just the anon_vma lock.
Signed-off-by: Hugh Dickins <hughd@google.com>
---
 include/linux/pageteam.h |  6 +++
 mm/rmap.c                | 65 +++++++++++++++++++++----------------
 2 files changed, 43 insertions(+), 28 deletions(-)
--- a/include/linux/pageteam.h
+++ b/include/linux/pageteam.h
@@ -29,4 +29,10 @@ static inline struct page *team_head(str
return head;
}
+/* Temporary stub for mm/rmap.c until implemented in mm/huge_memory.c */
+static inline void unmap_team_by_pmd(struct vm_area_struct *vma,
+ unsigned long addr, pmd_t *pmd, struct page *page)
+{
+}
+
#endif /* _LINUX_PAGETEAM_H */
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -47,6 +47,7 @@
#include <linux/mm.h>
#include <linux/pagemap.h>
+#include <linux/pageteam.h>
#include <linux/swap.h>
#include <linux/swapops.h>
#include <linux/slab.h>
@@ -687,7 +688,7 @@ pmd_t *mm_find_pmd(struct mm_struct *mm,
pgd_t *pgd;
pud_t *pud;
pmd_t *pmd = NULL;
- pmd_t pmde;
+ pmd_t pmdval;
pgd = pgd_offset(mm, address);
if (!pgd_present(*pgd))
@@ -700,12 +701,12 @@ pmd_t *mm_find_pmd(struct mm_struct *mm,
pmd = pmd_offset(pud, address);
/*
* Some THP functions use the sequence pmdp_huge_clear_flush(), set_pmd_at()
- * without holding anon_vma lock for write. So when looking for a
- * genuine pmde (in which to find pte), test present and !THP together.
+ * without locking out concurrent rmap lookups. So when looking for a
+ * pmd entry, in which to find a pte, test present and !THP together.
*/
- pmde = *pmd;
+ pmdval = *pmd;
barrier();
- if (!pmd_present(pmde) || pmd_trans_huge(pmde))
+ if (!pmd_present(pmdval) || pmd_trans_huge(pmdval))
pmd = NULL;
out:
return pmd;
@@ -800,6 +801,7 @@ bool page_check_address_transhuge(struct
pgd_t *pgd;
pud_t *pud;
pmd_t *pmd;
+ pmd_t pmdval;
pte_t *pte;
spinlock_t *ptl;
@@ -821,32 +823,24 @@ bool page_check_address_transhuge(struct
if (!pud_present(*pud))
return false;
pmd = pmd_offset(pud, address);
+again:
+ pmdval = *pmd;
+ barrier();
+ if (!pmd_present(pmdval))
+ return false;
- if (pmd_trans_huge(*pmd)) {
+ if (pmd_trans_huge(pmdval)) {
+ if (pmd_page(pmdval) != page)
+ return false;
ptl = pmd_lock(mm, pmd);
- if (!pmd_present(*pmd))
- goto unlock_pmd;
- if (unlikely(!pmd_trans_huge(*pmd))) {
+ if (unlikely(!pmd_same(*pmd, pmdval))) {
spin_unlock(ptl);
- goto map_pte;
+ goto again;
}
-
- if (pmd_page(*pmd) != page)
- goto unlock_pmd;
-
pte = NULL;
goto found;
-unlock_pmd:
- spin_unlock(ptl);
- return false;
- } else {
- pmd_t pmde = *pmd;
-
- barrier();
- if (!pmd_present(pmde) || pmd_trans_huge(pmde))
- return false;
}
-map_pte:
+
pte = pte_offset_map(pmd, address);
if (!pte_present(*pte)) {
pte_unmap(pte);
@@ -863,7 +857,7 @@ check_pte:
}
/* THP can be referenced by any subpage */
- if (pte_pfn(*pte) - page_to_pfn(page) >= hpage_nr_pages(page)) {
+ if (pte_pfn(*pte) - page_to_pfn(page) >= (1 << compound_order(page))) {
pte_unmap_unlock(pte, ptl);
return false;
}
@@ -1404,6 +1398,7 @@ static int try_to_unmap_one(struct page
unsigned long address, void *arg)
{
struct mm_struct *mm = vma->vm_mm;
+ pmd_t *pmd;
pte_t *pte;
pte_t pteval;
spinlock_t *ptl;
@@ -1423,8 +1418,7 @@ static int try_to_unmap_one(struct page
goto out;
}
- pte = page_check_address(page, mm, address, &ptl, 0);
- if (!pte)
+ if (!page_check_address_transhuge(page, mm, address, &pmd, &pte, &ptl))
goto out;
/*
@@ -1442,6 +1436,19 @@ static int try_to_unmap_one(struct page
if (flags & TTU_MUNLOCK)
goto out_unmap;
}
+
+ if (!pte) {
+ if (!(flags & TTU_IGNORE_ACCESS) &&
+ IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
+ pmdp_clear_flush_young_notify(vma, address, pmd)) {
+ ret = SWAP_FAIL;
+ goto out_unmap;
+ }
+ spin_unlock(ptl);
+ unmap_team_by_pmd(vma, address, pmd, page);
+ goto out;
+ }
+
if (!(flags & TTU_IGNORE_ACCESS)) {
if (ptep_clear_flush_young_notify(vma, address, pte)) {
ret = SWAP_FAIL;
@@ -1542,7 +1549,9 @@ discard:
put_page(page);
out_unmap:
- pte_unmap_unlock(pte, ptl);
+ spin_unlock(ptl);
+ if (pte)
+ pte_unmap(pte);
if (ret != SWAP_FAIL && ret != SWAP_MLOCK && !(flags & TTU_MUNLOCK))
mmu_notifier_invalidate_page(mm, address);
out:
--