From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: Hugh Dickins <hughd@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
Yang Shi <shy828301@gmail.com>,
Wang Yugui <wangyugui@e16-tech.com>,
Matthew Wilcox <willy@infradead.org>,
Naoya Horiguchi <naoya.horiguchi@nec.com>,
Alistair Popple <apopple@nvidia.com>,
Ralph Campbell <rcampbell@nvidia.com>, Zi Yan <ziy@nvidia.com>,
Miaohe Lin <linmiaohe@huawei.com>,
Minchan Kim <minchan@kernel.org>, Jue Wang <juew@google.com>,
Peter Xu <peterx@redhat.com>, Jan Kara <jack@suse.cz>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 6/7] mm/thp: unmap_mapping_page() to fix THP truncate_cleanup_page()
Date: Fri, 4 Jun 2021 19:39:33 +0300 [thread overview]
Message-ID: <20210604163933.h6dj6cgr6tudpprd@box.shutemov.name> (raw)
In-Reply-To: <alpine.LSU.2.11.2106011413280.2148@eggly.anvils>
On Tue, Jun 01, 2021 at 02:15:35PM -0700, Hugh Dickins wrote:
> There is a race between THP unmapping and truncation, when truncate sees
> pmd_none() and skips the entry, after munmap's zap_huge_pmd() cleared it,
> but before its page_remove_rmap() gets to decrement compound_mapcount:
> generating false "BUG: Bad page cache" reports that the page is still
> mapped when deleted. This commit fixes that, but not in the way I hoped.
>
> The first attempt used try_to_unmap(page, TTU_SYNC|TTU_IGNORE_MLOCK)
> instead of unmap_mapping_range() in truncate_cleanup_page(): it has often
> been an annoyance that we usually call unmap_mapping_range() with no pages
> locked, but there apply it to a single locked page. try_to_unmap() looks
> more suitable for a single locked page.
>
> However, try_to_unmap_one() contains a VM_BUG_ON_PAGE(!pvmw.pte,page):
> it is used to insert THP migration entries, but not used to unmap THPs.
> Copy zap_huge_pmd() and add THP handling now? Perhaps, but their TLB
> needs are different, I'm too ignorant of the DAX cases, and couldn't
> decide how far to go for anon+swap. Set that aside.
>
> The second attempt took a different tack: make no change in truncate.c,
> but modify zap_huge_pmd() to insert an invalidated huge pmd instead of
> clearing it initially, then pmd_clear() between page_remove_rmap() and
> unlocking at the end. Nice. But powerpc blows that approach out of the
> water, with its serialize_against_pte_lookup(), and interesting pgtable
> usage. It would need serious help to get working on powerpc (with a
> minor optimization issue on s390 too). Set that aside.
>
> Just add an "if (page_mapped(page)) synchronize_rcu();" or other such
> delay, after unmapping in truncate_cleanup_page()? Perhaps, but though
> that's likely to reduce or eliminate the number of incidents, it would
> give less assurance of whether we had identified the problem correctly.
>
> This successful iteration introduces "unmap_mapping_page(page)" instead
> of try_to_unmap(), and goes the usual unmap_mapping_range_tree() route,
> with an addition to details. Then zap_pmd_range() watches for this case,
> and does spin_unlock(pmd_lock) if so - just like page_vma_mapped_walk()
> now does in the PVMW_SYNC case. Not pretty, but safe.
>
> Note that unmap_mapping_page() is doing a VM_BUG_ON(!PageLocked) to
> assert its interface; but currently that's only used to make sure that
> page->mapping is stable, and zap_pmd_range() doesn't care if the page is
> locked or not. Along these lines, in invalidate_inode_pages2_range()
> move the initial unmap_mapping_range() out from under page lock, before
> then calling unmap_mapping_page() under page lock if still mapped.
>
> Fixes: fc127da085c2 ("truncate: handle file thp")
> Signed-off-by: Hugh Dickins <hughd@google.com>
> Cc: <stable@vger.kernel.org>
I think adding support for THP in try_to_unmap_one() is the most
future-proof way. We would need it there eventually.
But this works too:
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
--
Kirill A. Shutemov
next prev parent reply other threads:[~2021-06-05 7:25 UTC|newest]
Thread overview: 43+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-06-01 21:03 [PATCH 0/7] mm/thp: fix THP splitting unmap BUGs and related Hugh Dickins
2021-06-01 21:05 ` [PATCH 1/7] mm/thp: fix __split_huge_pmd_locked() on shmem migration entry Hugh Dickins
2021-06-03 21:26 ` Yang Shi
2021-06-04 2:22 ` Hugh Dickins
2021-06-04 18:03 ` Yang Shi
2021-06-04 21:52 ` Hugh Dickins
2021-06-04 15:34 ` Kirill A. Shutemov
2021-06-04 21:29 ` Hugh Dickins
2021-06-01 21:07 ` [PATCH 2/7] mm/thp: try_to_unmap() use TTU_SYNC for safe DEBUG_VM splitting Hugh Dickins
2021-06-02 1:59 ` Alistair Popple
2021-06-03 21:45 ` Yang Shi
2021-06-04 2:45 ` Hugh Dickins
2021-06-04 18:24 ` Yang Shi
2021-06-03 21:48 ` Peter Xu
2021-06-04 2:54 ` Hugh Dickins
2021-06-04 14:48 ` Peter Xu
2021-06-04 22:26 ` Hugh Dickins
2021-06-04 15:47 ` Kirill A. Shutemov
2021-06-01 21:09 ` [PATCH 3/7] mm/thp: fix vma_address() if virtual address below file offset Hugh Dickins
2021-06-01 21:30 ` Matthew Wilcox
2021-06-03 21:36 ` Hugh Dickins
2021-06-03 21:40 ` [PATCH v2 " Hugh Dickins
2021-06-04 15:53 ` Kirill A. Shutemov
2021-06-04 17:36 ` Matthew Wilcox
2021-06-04 22:35 ` Hugh Dickins
2021-06-01 21:11 ` [PATCH 4/7] mm/thp: fix page_address_in_vma() on file THP tails Hugh Dickins
2021-06-01 21:32 ` Matthew Wilcox
2021-06-03 22:06 ` Yang Shi
2021-06-04 15:54 ` Kirill A. Shutemov
2021-06-01 21:13 ` [PATCH 5/7] mm/thp: fix page_vma_mapped_walk() if huge page mapped by ptes Hugh Dickins
2021-06-04 16:24 ` Kirill A. Shutemov
2021-06-04 17:42 ` Matthew Wilcox
2021-06-04 22:56 ` Hugh Dickins
2021-06-01 21:15 ` [PATCH 6/7] mm/thp: unmap_mapping_page() to fix THP truncate_cleanup_page() Hugh Dickins
2021-06-04 16:39 ` Kirill A. Shutemov [this message]
2021-06-04 23:07 ` Hugh Dickins
2021-06-01 21:17 ` [PATCH 7/7] mm/thp: remap_page() is only needed on anonymous THP Hugh Dickins
2021-06-03 22:09 ` Yang Shi
2021-06-04 16:41 ` Kirill A. Shutemov
2021-06-02 2:07 ` [PATCH 0/7] mm/thp: fix THP splitting unmap BUGs and related Alistair Popple
2021-06-03 22:21 ` Hugh Dickins
2021-06-03 23:03 ` Andrew Morton
2021-06-03 22:26 ` [PATCH 6.1/7] mm: thp: replace DEBUG_VM BUG with VM_WARN when unmap fails for split Hugh Dickins
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20210604163933.h6dj6cgr6tudpprd@box.shutemov.name \
--to=kirill@shutemov.name \
--cc=akpm@linux-foundation.org \
--cc=apopple@nvidia.com \
--cc=hughd@google.com \
--cc=jack@suse.cz \
--cc=juew@google.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=linmiaohe@huawei.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=minchan@kernel.org \
--cc=naoya.horiguchi@nec.com \
--cc=peterx@redhat.com \
--cc=rcampbell@nvidia.com \
--cc=shy828301@gmail.com \
--cc=wangyugui@e16-tech.com \
--cc=willy@infradead.org \
--cc=ziy@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox