* [PATCH] mm: fix livelock caused by iterating multi order entry
@ 2019-07-19 0:12 Liu Bo
2019-07-19 2:53 ` Dan Williams
0 siblings, 1 reply; 4+ messages in thread
From: Liu Bo @ 2019-07-19 0:12 UTC (permalink / raw)
To: stable
Cc: linux-mm, linux-fsdevel, Jan Kara, Dan Williams, Matthew Wilcox,
Peng Tao
The livelock can be triggerred in the following pattern,
while (index < end && pagevec_lookup_entries(&pvec, mapping, index,
min(end - index, (pgoff_t)PAGEVEC_SIZE),
indices)) {
...
for (i = 0; i < pagevec_count(&pvec); i++) {
index = indices[i];
...
}
index++; /* BUG */
}
multi order exceptional entry is not specially considered in
invalidate_inode_pages2_range() and it ended up with a livelock because
both index 0 and index 1 finds the same pmd, but this pmd is binded to
index 0, so index is set to 0 again.
This introduces a helper to take the pmd entry's length into account when
deciding the next index.
Note that there're other users of the above pattern which doesn't need to
fix,
- dax_layout_busy_page
It's been fixed in commit d7782145e1ad
("filesystem-dax: Fix dax_layout_busy_page() livelock")
- truncate_inode_pages_range
This won't loop forever since the exceptional entries are immediately
removed from radix tree after the search.
Fixes: 642261a ("dax: add struct iomap based DAX PMD support")
Cc: <stable@vger.kernel.org> since 4.9 to 4.19
Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
---
The problem is gone after commit f280bf092d48 ("page cache: Convert
find_get_entries to XArray"), but since xarray seems too new to backport
to 4.19, I made this fix based on radix tree implementation.
fs/dax.c | 19 +++++++++++++++++++
include/linux/dax.h | 8 ++++++++
mm/truncate.c | 26 ++++++++++++++++++++++++--
3 files changed, 51 insertions(+), 2 deletions(-)
diff --git a/fs/dax.c b/fs/dax.c
index ac334bc..cd05337 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -764,6 +764,25 @@ int dax_invalidate_mapping_entry_sync(struct address_space *mapping,
return __dax_invalidate_mapping_entry(mapping, index, false);
}
+pgoff_t dax_get_multi_order(struct address_space *mapping, pgoff_t index,
+ void *entry)
+{
+ struct radix_tree_root *pages = &mapping->i_pages;
+ pgoff_t nr_pages = 1;
+
+ if (!dax_mapping(mapping))
+ return nr_pages;
+
+ xa_lock_irq(pages);
+ entry = get_unlocked_mapping_entry(mapping, index, NULL);
+ if (entry)
+ nr_pages = 1UL << dax_radix_order(entry);
+ put_unlocked_mapping_entry(mapping, index, entry);
+ xa_unlock_irq(pages);
+
+ return nr_pages;
+}
+
static int copy_user_dax(struct block_device *bdev, struct dax_device *dax_dev,
sector_t sector, size_t size, struct page *to,
unsigned long vaddr)
diff --git a/include/linux/dax.h b/include/linux/dax.h
index a846184..f3c95c6 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -91,6 +91,8 @@ int dax_writeback_mapping_range(struct address_space *mapping,
struct page *dax_layout_busy_page(struct address_space *mapping);
bool dax_lock_mapping_entry(struct page *page);
void dax_unlock_mapping_entry(struct page *page);
+pgoff_t dax_get_multi_order(struct address_space *mapping, pgoff_t index,
+ void *entry);
#else
static inline bool bdev_dax_supported(struct block_device *bdev,
int blocksize)
@@ -134,6 +136,12 @@ static inline bool dax_lock_mapping_entry(struct page *page)
static inline void dax_unlock_mapping_entry(struct page *page)
{
}
+
+static inline pgoff_t dax_get_multi_order(struct address_space *mapping,
+ pgoff_t index, void *entry)
+{
+ return 1;
+}
#endif
int dax_read_lock(void);
diff --git a/mm/truncate.c b/mm/truncate.c
index 71b65aa..835911f 100644
--- a/mm/truncate.c
+++ b/mm/truncate.c
@@ -557,6 +557,8 @@ unsigned long invalidate_mapping_pages(struct address_space *mapping,
while (index <= end && pagevec_lookup_entries(&pvec, mapping, index,
min(end - index, (pgoff_t)PAGEVEC_SIZE - 1) + 1,
indices)) {
+ pgoff_t nr_pages = 1;
+
for (i = 0; i < pagevec_count(&pvec); i++) {
struct page *page = pvec.pages[i];
@@ -568,6 +570,15 @@ unsigned long invalidate_mapping_pages(struct address_space *mapping,
if (radix_tree_exceptional_entry(page)) {
invalidate_exceptional_entry(mapping, index,
page);
+ /*
+ * Account for multi-order entries at
+ * the end of the pagevec.
+ */
+ if (i < pagevec_count(&pvec) - 1)
+ continue;
+
+ nr_pages = dax_get_multi_order(mapping, index,
+ page);
continue;
}
@@ -607,7 +618,7 @@ unsigned long invalidate_mapping_pages(struct address_space *mapping,
pagevec_remove_exceptionals(&pvec);
pagevec_release(&pvec);
cond_resched();
- index++;
+ index += nr_pages;
}
return count;
}
@@ -688,6 +699,8 @@ int invalidate_inode_pages2_range(struct address_space *mapping,
while (index <= end && pagevec_lookup_entries(&pvec, mapping, index,
min(end - index, (pgoff_t)PAGEVEC_SIZE - 1) + 1,
indices)) {
+ pgoff_t nr_pages = 1;
+
for (i = 0; i < pagevec_count(&pvec); i++) {
struct page *page = pvec.pages[i];
@@ -700,6 +713,15 @@ int invalidate_inode_pages2_range(struct address_space *mapping,
if (!invalidate_exceptional_entry2(mapping,
index, page))
ret = -EBUSY;
+ /*
+ * Account for multi-order entries at
+ * the end of the pagevec.
+ */
+ if (i < pagevec_count(&pvec) - 1)
+ continue;
+
+ nr_pages = dax_get_multi_order(mapping, index,
+ page);
continue;
}
@@ -739,7 +761,7 @@ int invalidate_inode_pages2_range(struct address_space *mapping,
pagevec_remove_exceptionals(&pvec);
pagevec_release(&pvec);
cond_resched();
- index++;
+ index += nr_pages;
}
/*
* For DAX we invalidate page tables after invalidating radix tree. We
--
1.8.3.1
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] mm: fix livelock caused by iterating multi order entry
2019-07-19 0:12 [PATCH] mm: fix livelock caused by iterating multi order entry Liu Bo
@ 2019-07-19 2:53 ` Dan Williams
2019-07-19 3:43 ` Greg KH
2019-07-19 21:43 ` Liu Bo
0 siblings, 2 replies; 4+ messages in thread
From: Dan Williams @ 2019-07-19 2:53 UTC (permalink / raw)
To: Liu Bo
Cc: stable, Linux MM, linux-fsdevel, Jan Kara, Matthew Wilcox,
Peng Tao, Sasha Levin
[ add Sasha for -stable advice ]
On Thu, Jul 18, 2019 at 5:13 PM Liu Bo <bo.liu@linux.alibaba.com> wrote:
>
> The livelock can be triggerred in the following pattern,
>
> while (index < end && pagevec_lookup_entries(&pvec, mapping, index,
> min(end - index, (pgoff_t)PAGEVEC_SIZE),
> indices)) {
> ...
> for (i = 0; i < pagevec_count(&pvec); i++) {
> index = indices[i];
> ...
> }
> index++; /* BUG */
> }
>
> multi order exceptional entry is not specially considered in
> invalidate_inode_pages2_range() and it ended up with a livelock because
> both index 0 and index 1 finds the same pmd, but this pmd is binded to
> index 0, so index is set to 0 again.
>
> This introduces a helper to take the pmd entry's length into account when
> deciding the next index.
>
> Note that there're other users of the above pattern which doesn't need to
> fix,
>
> - dax_layout_busy_page
> It's been fixed in commit d7782145e1ad
> ("filesystem-dax: Fix dax_layout_busy_page() livelock")
>
> - truncate_inode_pages_range
> This won't loop forever since the exceptional entries are immediately
> removed from radix tree after the search.
>
> Fixes: 642261a ("dax: add struct iomap based DAX PMD support")
> Cc: <stable@vger.kernel.org> since 4.9 to 4.19
> Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
> ---
>
> The problem is gone after commit f280bf092d48 ("page cache: Convert
> find_get_entries to XArray"), but since xarray seems too new to backport
> to 4.19, I made this fix based on radix tree implementation.
I think in this situation, since mainline does not need this change
and the bug has been buried under a major refactoring, is to send a
backport directly against the v4.19 kernel. Include notes about how it
replaces the fix that was inadvertently contained in f280bf092d48
("page cache: Convert find_get_entries to XArray"). Do you have a test
case that you can include in the changelog?
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] mm: fix livelock caused by iterating multi order entry
2019-07-19 2:53 ` Dan Williams
@ 2019-07-19 3:43 ` Greg KH
2019-07-19 21:43 ` Liu Bo
1 sibling, 0 replies; 4+ messages in thread
From: Greg KH @ 2019-07-19 3:43 UTC (permalink / raw)
To: Dan Williams
Cc: Liu Bo, stable, Linux MM, linux-fsdevel, Jan Kara,
Matthew Wilcox, Peng Tao, Sasha Levin
On Thu, Jul 18, 2019 at 07:53:42PM -0700, Dan Williams wrote:
> [ add Sasha for -stable advice ]
>
> On Thu, Jul 18, 2019 at 5:13 PM Liu Bo <bo.liu@linux.alibaba.com> wrote:
> >
> > The livelock can be triggerred in the following pattern,
> >
> > while (index < end && pagevec_lookup_entries(&pvec, mapping, index,
> > min(end - index, (pgoff_t)PAGEVEC_SIZE),
> > indices)) {
> > ...
> > for (i = 0; i < pagevec_count(&pvec); i++) {
> > index = indices[i];
> > ...
> > }
> > index++; /* BUG */
> > }
> >
> > multi order exceptional entry is not specially considered in
> > invalidate_inode_pages2_range() and it ended up with a livelock because
> > both index 0 and index 1 finds the same pmd, but this pmd is binded to
> > index 0, so index is set to 0 again.
> >
> > This introduces a helper to take the pmd entry's length into account when
> > deciding the next index.
> >
> > Note that there're other users of the above pattern which doesn't need to
> > fix,
> >
> > - dax_layout_busy_page
> > It's been fixed in commit d7782145e1ad
> > ("filesystem-dax: Fix dax_layout_busy_page() livelock")
> >
> > - truncate_inode_pages_range
> > This won't loop forever since the exceptional entries are immediately
> > removed from radix tree after the search.
> >
> > Fixes: 642261a ("dax: add struct iomap based DAX PMD support")
> > Cc: <stable@vger.kernel.org> since 4.9 to 4.19
> > Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
> > ---
> >
> > The problem is gone after commit f280bf092d48 ("page cache: Convert
> > find_get_entries to XArray"), but since xarray seems too new to backport
> > to 4.19, I made this fix based on radix tree implementation.
>
> I think in this situation, since mainline does not need this change
> and the bug has been buried under a major refactoring, is to send a
> backport directly against the v4.19 kernel. Include notes about how it
> replaces the fix that was inadvertently contained in f280bf092d48
> ("page cache: Convert find_get_entries to XArray"). Do you have a test
> case that you can include in the changelog?
Yes, I need a _TON_ of documentation, and signed off by from all of the
developers involved in this part of the kernel, before I can take this
not-in-mainline patch.
thanks,
greg k-h
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH] mm: fix livelock caused by iterating multi order entry
2019-07-19 2:53 ` Dan Williams
2019-07-19 3:43 ` Greg KH
@ 2019-07-19 21:43 ` Liu Bo
1 sibling, 0 replies; 4+ messages in thread
From: Liu Bo @ 2019-07-19 21:43 UTC (permalink / raw)
To: Dan Williams
Cc: stable, Linux MM, linux-fsdevel, Jan Kara, Matthew Wilcox,
Peng Tao, Sasha Levin
On Thu, Jul 18, 2019 at 07:53:42PM -0700, Dan Williams wrote:
> [ add Sasha for -stable advice ]
>
> On Thu, Jul 18, 2019 at 5:13 PM Liu Bo <bo.liu@linux.alibaba.com> wrote:
> >
> > The livelock can be triggerred in the following pattern,
> >
> > while (index < end && pagevec_lookup_entries(&pvec, mapping, index,
> > min(end - index, (pgoff_t)PAGEVEC_SIZE),
> > indices)) {
> > ...
> > for (i = 0; i < pagevec_count(&pvec); i++) {
> > index = indices[i];
> > ...
> > }
> > index++; /* BUG */
> > }
> >
> > multi order exceptional entry is not specially considered in
> > invalidate_inode_pages2_range() and it ended up with a livelock because
> > both index 0 and index 1 finds the same pmd, but this pmd is binded to
> > index 0, so index is set to 0 again.
> >
> > This introduces a helper to take the pmd entry's length into account when
> > deciding the next index.
> >
> > Note that there're other users of the above pattern which doesn't need to
> > fix,
> >
> > - dax_layout_busy_page
> > It's been fixed in commit d7782145e1ad
> > ("filesystem-dax: Fix dax_layout_busy_page() livelock")
> >
> > - truncate_inode_pages_range
> > This won't loop forever since the exceptional entries are immediately
> > removed from radix tree after the search.
> >
> > Fixes: 642261a ("dax: add struct iomap based DAX PMD support")
> > Cc: <stable@vger.kernel.org> since 4.9 to 4.19
> > Signed-off-by: Liu Bo <bo.liu@linux.alibaba.com>
> > ---
> >
> > The problem is gone after commit f280bf092d48 ("page cache: Convert
> > find_get_entries to XArray"), but since xarray seems too new to backport
> > to 4.19, I made this fix based on radix tree implementation.
>
> I think in this situation, since mainline does not need this change
> and the bug has been buried under a major refactoring, is to send a
> backport directly against the v4.19 kernel. Include notes about how it
> replaces the fix that was inadvertently contained in f280bf092d48
> ("page cache: Convert find_get_entries to XArray"). Do you have a test
> case that you can include in the changelog?
The root cause behind the bug is exactly same as what commit
d7782145e1ad ("filesystem-dax: Fix dax_layout_busy_page() livelock")
does.
For test case, I have a not 100% reproducible one based on ltp's
rwtest[1] and virtiofs.
[1]:
$mount -t virtio_fs -o tag=alwaysdax -o rootmode=040000,user_id=0,group_id=0,dax,default_permissions,allow_other alwaysdax /mnt/virtio-fs/
$cat test.txt
rwtest01 export LTPROOT; rwtest -N rwtest01 -c -q -i 60s -f sync 10%25000:$TMPDIR/rw-sync-$$
$runltp -d /mnt/virtio-fs -f test.txt
thanks,
-liubo
^ permalink raw reply [flat|nested] 4+ messages in thread
end of thread, other threads:[~2019-07-19 21:43 UTC | newest]
Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-19 0:12 [PATCH] mm: fix livelock caused by iterating multi order entry Liu Bo
2019-07-19 2:53 ` Dan Williams
2019-07-19 3:43 ` Greg KH
2019-07-19 21:43 ` Liu Bo
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox