From: Minchan Kim <minchan@kernel.org>
To: zhong jiang <zhongjiang@huawei.com>
Cc: Michal Hocko <mhocko@kernel.org>,
akpm@linux-foundation.org, hannes@cmpxchg.org,
ktkhai@virtuozzo.com, linux-mm@kvack.org
Subject: Re: [PATCH] mm: fix unevictable page reclaim when calling madvise_pageout
Date: Wed, 30 Oct 2019 09:52:39 -0700 [thread overview]
Message-ID: <20191030165239.GA167773@google.com> (raw)
In-Reply-To: <5DB81838.6020208@huawei.com>
On Tue, Oct 29, 2019 at 06:45:12PM +0800, zhong jiang wrote:
> On 2019/10/29 17:40, Michal Hocko wrote:
> > On Tue 29-10-19 17:30:57, zhong jiang wrote:
> >> On 2019/10/29 16:11, Michal Hocko wrote:
> >>> [Cc Minchan]
> > [...]
> >>> Removing a long existing BUG_ON begs for a much better explanation.
> >>> shrink_page_list is not a trivial piece of code, but I _suspect_ that
> >>> removing it should be ok for mapped pages at least (try_to_unmap), though I
> >>> am not so sure, off the top of my head, how unmapped unevictable pages are
> >>> handled.
> >> As for unmapped unevictable pages, shrink_page_list already takes that into account:
> >>
> >> shrink_page_list
> >>   page_evictable --> filters out unevictable pages and puts them back on their LRU.
> > Ohh, it is right there at the top. Missed it. The check was added
> > by Nick along with the BUG_ON, so it sounds more like a "this
> > shouldn't happen" BUG_ON. I wouldn't mind removing it with that
> > justification.
> As you said, Minchan fixed the same kind of bug by checking PageUnevictable (I did not notice that before).
> Let's wait for Minchan to see whether he has a better reason. Thanks,
madvise_pageout can work on a shared page, and one of the vmas among the processes
sharing it could have been mlocked, so unevictable LRU pages can be passed into
shrink_page_list. It's pointless to try to reclaim unevictable pages in the first place,
so I want to fix madvise_pageout by introducing an only_evictable flag into the
isolate_lru_page API, with madvise_pageout passing it as "true".
If we want to remove the PageUnevictable VM_BUG_ON_PAGE in shrink_page_list,
I want to see a stronger reason for why it happens and why the caller couldn't
filter those pages out from the beginning.
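
For illustration, here is a rough, hypothetical userspace sketch of the scenario
(untested, not part of the patch): one process mlocks a shared mapping while
another runs MADV_PAGEOUT on the same pages, so the pages madvise_pageout
isolates can already be unevictable.

#define _GNU_SOURCE
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21		/* new in v5.4 */
#endif

int main(void)
{
	size_t len = 4UL << 20;	/* 4MB shared anonymous mapping */
	char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	pid_t pid;

	if (buf == MAP_FAILED) {
		perror("mmap");
		return 1;
	}
	memset(buf, 0xaa, len);		/* fault the pages in */

	pid = fork();
	if (pid == 0) {
		/* child: pins the shared pages on the unevictable LRU */
		if (mlock(buf, len))
			perror("mlock");
		pause();
		_exit(0);
	}

	sleep(1);	/* crude: give the child time to mlock */

	/*
	 * parent: its vma is not VM_LOCKED, so MADV_PAGEOUT happily
	 * isolates pages that the child's mlock made unevictable.
	 */
	if (madvise(buf, len, MADV_PAGEOUT))
		perror("madvise(MADV_PAGEOUT)");

	kill(pid, SIGTERM);
	wait(NULL);
	return 0;
}

With the change below, isolate_lru_page(page, true) would skip such pages under
the lru_lock, so madvise_pageout never hands them to shrink_page_list.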
diff --git a/mm/gup.c b/mm/gup.c
index 8f236a335ae9..d1ad1c3ec596 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -1468,7 +1468,7 @@ static long check_and_migrate_cma_pages(struct task_struct *tsk,
drain_allow = false;
}
- if (!isolate_lru_page(head)) {
+ if (!isolate_lru_page(head, false)) {
list_add_tail(&head->lru, &cma_page_list);
mod_node_page_state(page_pgdat(head),
NR_ISOLATED_ANON +
diff --git a/mm/internal.h b/mm/internal.h
index 0d5f720c75ab..13319612bef0 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -85,7 +85,7 @@ extern unsigned long highest_memmap_pfn;
/*
* in mm/vmscan.c:
*/
-extern int isolate_lru_page(struct page *page);
+extern int isolate_lru_page(struct page *page, bool only_evictable);
extern void putback_lru_page(struct page *page);
/*
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 0a1b4b484ac5..095560f7f8ec 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -609,7 +609,7 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
* Isolate the page to avoid collapsing an hugepage
* currently in use by the VM.
*/
- if (isolate_lru_page(page)) {
+ if (isolate_lru_page(page, false)) {
unlock_page(page);
result = SCAN_DEL_PAGE_LRU;
goto out;
@@ -1642,7 +1642,7 @@ static void collapse_file(struct mm_struct *mm,
goto out_unlock;
}
- if (isolate_lru_page(page)) {
+ if (isolate_lru_page(page, false)) {
result = SCAN_DEL_PAGE_LRU;
goto out_unlock;
}
diff --git a/mm/madvise.c b/mm/madvise.c
index 2be9f3fdb05e..2639de560a0b 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -363,7 +363,7 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
ClearPageReferenced(page);
test_and_clear_page_young(page);
if (pageout) {
- if (!isolate_lru_page(page))
+ if (!isolate_lru_page(page, true))
list_add(&page->lru, &page_list);
} else
deactivate_page(page);
@@ -441,7 +441,7 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
ClearPageReferenced(page);
test_and_clear_page_young(page);
if (pageout) {
- if (!isolate_lru_page(page))
+ if (!isolate_lru_page(page, true))
list_add(&page->lru, &page_list);
} else
deactivate_page(page);
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 363106578876..6d913215b074 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5847,7 +5847,7 @@ static int mem_cgroup_move_charge_pte_range(pmd_t *pmd,
target_type = get_mctgt_type_thp(vma, addr, *pmd, &target);
if (target_type == MC_TARGET_PAGE) {
page = target.page;
- if (!isolate_lru_page(page)) {
+ if (!isolate_lru_page(page, false)) {
if (!mem_cgroup_move_account(page, true,
mc.from, mc.to)) {
mc.precharge -= HPAGE_PMD_NR;
@@ -5895,7 +5895,7 @@ static int mem_cgroup_move_charge_pte_range(pmd_t *pmd,
*/
if (PageTransCompound(page))
goto put;
- if (!device && isolate_lru_page(page))
+ if (!device && isolate_lru_page(page, false))
goto put;
if (!mem_cgroup_move_account(page, false,
mc.from, mc.to)) {
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 3151c87dff73..ef37c67a7bab 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -567,7 +567,7 @@ static const char * const action_page_types[] = {
*/
static int delete_from_lru_cache(struct page *p)
{
- if (!isolate_lru_page(p)) {
+ if (!isolate_lru_page(p, false)) {
/*
* Clear sensible page flags, so that the buddy system won't
* complain when the page is unpoison-and-freed.
@@ -1782,7 +1782,7 @@ static int __soft_offline_page(struct page *page, int flags)
* handles a large number of cases for us.
*/
if (PageLRU(page))
- ret = isolate_lru_page(page);
+ ret = isolate_lru_page(page, false);
else
ret = isolate_movable_page(page, ISOLATE_UNEVICTABLE);
/*
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index df570e5c71cc..8ba483d3d8cd 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1314,7 +1314,7 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
*/
if (PageHWPoison(page)) {
if (WARN_ON(PageLRU(page)))
- isolate_lru_page(page);
+ isolate_lru_page(page, false);
if (page_mapped(page))
try_to_unmap(page, TTU_IGNORE_MLOCK | TTU_IGNORE_ACCESS);
continue;
@@ -1327,7 +1327,7 @@ do_migrate_range(unsigned long start_pfn, unsigned long end_pfn)
* LRU and non-lru movable pages.
*/
if (PageLRU(page))
- ret = isolate_lru_page(page);
+ ret = isolate_lru_page(page, false);
else
ret = isolate_movable_page(page, ISOLATE_UNEVICTABLE);
if (!ret) { /* Success */
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 4ae967bcf954..585e5845f071 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -974,7 +974,7 @@ static int migrate_page_add(struct page *page, struct list_head *pagelist,
* Avoid migrating a page that is shared with others.
*/
if ((flags & MPOL_MF_MOVE_ALL) || page_mapcount(head) == 1) {
- if (!isolate_lru_page(head)) {
+ if (!isolate_lru_page(head, false)) {
list_add_tail(&head->lru, pagelist);
mod_node_page_state(page_pgdat(head),
NR_ISOLATED_ANON + page_is_file_cache(head),
diff --git a/mm/migrate.c b/mm/migrate.c
index 4fe45d1428c8..710e00317a8f 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1563,7 +1563,7 @@ static int add_page_for_migration(struct mm_struct *mm, unsigned long addr,
struct page *head;
head = compound_head(page);
- err = isolate_lru_page(head);
+ err = isolate_lru_page(head, false);
if (err)
goto out_putpage;
@@ -1895,7 +1895,7 @@ static int numamigrate_isolate_page(pg_data_t *pgdat, struct page *page)
if (!migrate_balanced_pgdat(pgdat, compound_nr(page)))
return 0;
- if (isolate_lru_page(page))
+ if (isolate_lru_page(page, false))
return 0;
/*
@@ -2450,7 +2450,7 @@ static void migrate_vma_prepare(struct migrate_vma *migrate)
allow_drain = false;
}
- if (isolate_lru_page(page)) {
+ if (isolate_lru_page(page, false)) {
if (remap) {
migrate->src[i] &= ~MIGRATE_PFN_MIGRATE;
migrate->cpages--;
diff --git a/mm/mlock.c b/mm/mlock.c
index a72c1eeded77..307e340fe2e0 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -70,7 +70,7 @@ void clear_page_mlock(struct page *page)
*
* See __pagevec_lru_add_fn for more explanation.
*/
- if (!isolate_lru_page(page)) {
+ if (!isolate_lru_page(page, false)) {
putback_lru_page(page);
} else {
/*
@@ -97,7 +97,7 @@ void mlock_vma_page(struct page *page)
mod_zone_page_state(page_zone(page), NR_MLOCK,
hpage_nr_pages(page));
count_vm_event(UNEVICTABLE_PGMLOCKED);
- if (!isolate_lru_page(page))
+ if (!isolate_lru_page(page, false))
putback_lru_page(page);
}
}
diff --git a/mm/vmscan.c b/mm/vmscan.c
index ee4eecc7e1c2..c44fb52c745f 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1793,7 +1793,7 @@ static unsigned long isolate_lru_pages(unsigned long nr_to_scan,
* (2) the lru_lock must not be held.
* (3) interrupts must be enabled.
*/
-int isolate_lru_page(struct page *page)
+int isolate_lru_page(struct page *page, bool only_evictable)
{
int ret = -EBUSY;
@@ -1805,6 +1805,8 @@ int isolate_lru_page(struct page *page)
struct lruvec *lruvec;
spin_lock_irq(&pgdat->lru_lock);
+ if (only_evictable && PageUnevictable(page))
+ goto out;
lruvec = mem_cgroup_page_lruvec(page, pgdat);
if (PageLRU(page)) {
int lru = page_lru(page);
@@ -1813,6 +1815,7 @@ int isolate_lru_page(struct page *page)
del_page_from_lru_list(page, lruvec, lru);
ret = 0;
}
+out:
spin_unlock_irq(&pgdat->lru_lock);
}
return ret;