From: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
To: Hugh Dickins <hughd@google.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>,
Andrew Morton <akpm@linux-foundation.org>,
"Kirill A. Shutemov" <kirill@shutemov.name>,
Suleiman Souhlal <suleiman@google.com>,
Matthew Wilcox <willy@infradead.org>,
Andrea Arcangeli <aarcange@redhat.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: madvise(MADV_REMOVE) deadlocks on shmem THP
Date: Thu, 14 Jan 2021 14:38:58 +0900
Message-ID: <X//Y8iRUfuH8WDg2@jagdpanzerIV.localdomain>
In-Reply-To: <alpine.LSU.2.11.2101132000500.4777@eggly.anvils>

On (21/01/13 20:31), Hugh Dickins wrote:
> > We are running into lockups during the memory pressure tests on our
> > boards, which essentially NMI panic them. In short the test case is
> >
> > - THP shmem
> >
> >     echo advise > /sys/kernel/mm/transparent_hugepage/shmem_enabled
> >
> > - And a user-space process doing madvise(MADV_HUGEPAGE) on new mappings,
> >   and madvise(MADV_REMOVE) when it wants to remove the page range
> >
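The user-space half of the test case above could look roughly like the following. This is only a sketch, not the test actually run on the boards: the function name `hugepage_then_remove` and the 4 MiB length are made up for illustration, and the calls do not generate memory pressure, so they will not trigger the lockup on their own. It relies on Python's `mmap.mmap(-1, length)` creating a shared anonymous (shmem-backed) mapping on Linux.

```python
import mmap


def hugepage_then_remove(length=4 * 1024 * 1024):
    """Map shmem-backed memory, advise huge pages, then punch it out.

    Returns True if the range reads back as zeros after MADV_REMOVE,
    False if it does not, and None if this system rejected the call
    or lacks the constants (hypothetical helper, for illustration only).
    """
    # fileno=-1 gives an anonymous mapping; Python's default flag on
    # Unix is MAP_SHARED, so on Linux the pages are backed by shmem.
    region = mmap.mmap(-1, length)
    try:
        try:
            # Only a hint: it may be refused when THP is not available.
            region.madvise(mmap.MADV_HUGEPAGE)
        except OSError:
            pass
        # Touch every page so there is something to remove.
        region[:] = b"\xaa" * length
        # Free the backing pages for the range (a hole punch on shmem).
        region.madvise(mmap.MADV_REMOVE, 0, length)
        return region[:length] == bytes(length)
    except (OSError, AttributeError):
        # MADV_REMOVE unsupported here, or constants missing (non-Linux).
        return None
    finally:
        region.close()
```

On a kernel with shmem THP set to `advise`, the MADV_REMOVE call is the one that ends up in shmem_fallocate() and unmap_mapping_range(), i.e. the CPU1 column of the trace in this report.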
> > The problem boils down to the reverse locking chain:
> >
> > kswapd does
> >
> >     lock_page(page) -> down_read(page->mapping->i_mmap_rwsem)
> >
> > madvise() process does
> >
> >     down_write(page->mapping->i_mmap_rwsem) -> lock_page(page)
> >
> > CPU0                                                      CPU1
> >
> > kswapd                                                    vfs_fallocate()
> > shrink_node()                                             shmem_fallocate()
> > shrink_active_list()                                      unmap_mapping_range()
> > page_referenced() << lock page:PG_locked >>               unmap_mapping_pages() << down_write(mapping->i_mmap_rwsem) >>
> > rmap_walk_file()                                          zap_page_range_single()
> > down_read(mapping->i_mmap_rwsem) << W-locked on CPU1 >>   unmap_page_range()
> > rwsem_down_read_failed()                                  __split_huge_pmd()
> > __rwsem_down_read_failed_common()                         __lock_page() << PG_locked on CPU0 >>
> > schedule()                                                wait_on_page_bit_common()
> >                                                           io_schedule()
>
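The reverse locking chain in the report quoted above is a classic ABBA inversion, and it can be modelled in user space. The sketch below is an illustration under loud assumptions: two plain `threading.Lock` objects stand in for PG_locked and mapping->i_mmap_rwsem (the real rwsem is a reader/writer lock), the names `run_inverted_order`, `kswapd` and `madvise` are labels chosen here, and a timeout replaces the indefinite sleep so the demo terminates instead of hanging.

```python
import threading


def run_inverted_order(timeout=0.2):
    """Take two locks in opposite orders on two threads.

    Returns a dict mapping each thread label to whether it managed to
    take its second lock before the timeout expired.
    """
    page_lock = threading.Lock()      # stand-in for PG_locked
    i_mmap_rwsem = threading.Lock()   # stand-in for mapping->i_mmap_rwsem
    holding_first = threading.Barrier(2)
    attempts_done = threading.Barrier(2)
    got_second = {}

    def worker(name, first, second):
        with first:
            # Wait until both threads hold their first lock.
            holding_first.wait()
            # The kernel would sleep here forever; we give up instead.
            acquired = second.acquire(timeout=timeout)
            got_second[name] = acquired
            if acquired:
                second.release()
            # Keep `first` held until the peer has finished its attempt.
            attempts_done.wait()

    threads = [
        # kswapd order: page lock, then i_mmap_rwsem.
        threading.Thread(target=worker,
                         args=("kswapd", page_lock, i_mmap_rwsem)),
        # madvise() order: i_mmap_rwsem, then page lock.
        threading.Thread(target=worker,
                         args=("madvise", i_mmap_rwsem, page_lock)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return got_second
```

Calling `run_inverted_order()` returns `False` for both labels: each thread times out waiting for the lock the other one holds. Without the timeout neither would ever make progress, which is the state the trace shows for CPU0 and CPU1.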
> Very interesting, Sergey: many thanks for this report.

Thanks for the quick feedback.

> There is no doubt that kswapd is right in its lock ordering:
> __split_huge_pmd() is in the wrong to be attempting lock_page().
>
> Which used not to be done, but was added in 5.8's c444eb564fb1 ("mm:
> thp: make the THP mapcount atomic against __split_huge_pmd_locked()").

Hugh, I forgot to mention: we are facing these issues on 4.19.
Let me check whether we have cherry-picked c444eb564fb1.

	-ss