* [PATCH 0/4] mm: remove page_maybe_dma_pinned() and page_mkclean()
@ 2024-06-04 11:48 Kefeng Wang
2024-06-04 11:48 ` [PATCH 1/4] fs/proc/task_mmu: use folio API in pte_is_pinned() Kefeng Wang
` (3 more replies)
0 siblings, 4 replies; 12+ messages in thread
From: Kefeng Wang @ 2024-06-04 11:48 UTC (permalink / raw)
To: Andrew Morton
Cc: Helge Deller, Daniel Vetter, David Hildenbrand, Matthew Wilcox,
linux-mm, Jonathan Corbet, Kefeng Wang
Most page_maybe_dma_pinned() and page_mkclean() callers have already
been converted to the folio equivalents; after two more conversions,
remove both helpers and update the comments and documentation.
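The conversion pattern is mechanical throughout the series; a minimal
sketch of the before/after shape (illustrative only, not taken from
the patches):

/* Before: each page-based wrapper resolves the folio internally. */
page_mkclean(page);
if (page_maybe_dma_pinned(page))
	/* ... back off ... */;

/* After: resolve the folio once, then stay with the folio API. */
struct folio *folio = page_folio(page);

folio_mkclean(folio);
if (folio_maybe_dma_pinned(folio))
	/* ... back off ... */;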
Kefeng Wang (4):
fs/proc/task_mmu: use folio API in pte_is_pinned()
mm: remove page_maybe_dma_pinned()
fb_defio: use a folio in fb_deferred_io_work()
mm: remove page_mkclean()
Documentation/core-api/pin_user_pages.rst | 16 ++++++++--------
drivers/video/fbdev/core/fb_defio.c | 13 +++++++------
fs/proc/task_mmu.c | 8 ++++----
include/linux/mm.h | 11 +++--------
include/linux/rmap.h | 4 ----
mm/gup.c | 2 +-
mm/mremap.c | 2 +-
7 files changed, 24 insertions(+), 32 deletions(-)
--
2.27.0
* [PATCH 1/4] fs/proc/task_mmu: use folio API in pte_is_pinned()
2024-06-04 11:48 [PATCH 0/4] mm: remove page_maybe_dma_pinned() and page_mkclean() Kefeng Wang
@ 2024-06-04 11:48 ` Kefeng Wang
2024-06-04 11:51 ` David Hildenbrand
2024-06-04 11:48 ` [PATCH 2/4] mm: remove page_maybe_dma_pinned() Kefeng Wang
` (2 subsequent siblings)
3 siblings, 1 reply; 12+ messages in thread
From: Kefeng Wang @ 2024-06-04 11:48 UTC (permalink / raw)
To: Andrew Morton
Cc: Helge Deller, Daniel Vetter, David Hildenbrand, Matthew Wilcox,
linux-mm, Jonathan Corbet, Kefeng Wang
Convert to use vm_normal_folio() and folio_maybe_dma_pinned() API,
which helps to remove page_maybe_dma_pinned() in the subsequent change.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
fs/proc/task_mmu.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index f8d35f993fe5..5aceb3db7565 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -1088,7 +1088,7 @@ struct clear_refs_private {
static inline bool pte_is_pinned(struct vm_area_struct *vma, unsigned long addr, pte_t pte)
{
- struct page *page;
+ struct folio *folio;
if (!pte_write(pte))
return false;
@@ -1096,10 +1096,10 @@ static inline bool pte_is_pinned(struct vm_area_struct *vma, unsigned long addr,
return false;
if (likely(!test_bit(MMF_HAS_PINNED, &vma->vm_mm->flags)))
return false;
- page = vm_normal_page(vma, addr, pte);
- if (!page)
+ folio = vm_normal_folio(vma, addr, pte);
+ if (!folio)
return false;
- return page_maybe_dma_pinned(page);
+ return folio_maybe_dma_pinned(folio);
}
static inline void clear_soft_dirty(struct vm_area_struct *vma,
--
2.27.0
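For reference, vm_normal_folio() is essentially a folio-returning
wrapper around vm_normal_page(); roughly (a sketch along the lines of
mm/memory.c, not necessarily verbatim):

struct folio *vm_normal_folio(struct vm_area_struct *vma,
			      unsigned long addr, pte_t pte)
{
	struct page *page = vm_normal_page(vma, addr, pte);

	/* NULL for special mappings, matching vm_normal_page(). */
	if (page)
		return page_folio(page);
	return NULL;
}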
* [PATCH 2/4] mm: remove page_maybe_dma_pinned()
2024-06-04 11:48 [PATCH 0/4] mm: remove page_maybe_dma_pinned() and page_mkclean() Kefeng Wang
2024-06-04 11:48 ` [PATCH 1/4] fs/proc/task_mmu: use folio API in pte_is_pinned() Kefeng Wang
@ 2024-06-04 11:48 ` Kefeng Wang
2024-06-04 11:48 ` [PATCH 3/4] fb_defio: use a folio in fb_deferred_io_work() Kefeng Wang
2024-06-04 11:48 ` [PATCH 4/4] mm: remove page_mkclean() Kefeng Wang
3 siblings, 0 replies; 12+ messages in thread
From: Kefeng Wang @ 2024-06-04 11:48 UTC (permalink / raw)
To: Andrew Morton
Cc: Helge Deller, Daniel Vetter, David Hildenbrand, Matthew Wilcox,
linux-mm, Jonathan Corbet, Kefeng Wang
After the last user of page_maybe_dma_pinned() is converted to
folio_maybe_dma_pinned(), remove page_maybe_dma_pinned() and update
the documentation and comments.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
Documentation/core-api/pin_user_pages.rst | 10 +++++-----
include/linux/mm.h | 9 ++-------
2 files changed, 7 insertions(+), 12 deletions(-)
diff --git a/Documentation/core-api/pin_user_pages.rst b/Documentation/core-api/pin_user_pages.rst
index 6b5f7e6e7155..532ba8e8381b 100644
--- a/Documentation/core-api/pin_user_pages.rst
+++ b/Documentation/core-api/pin_user_pages.rst
@@ -196,20 +196,20 @@ INCORRECT (uses FOLL_GET calls):
write to the data within the pages
put_page()
-page_maybe_dma_pinned(): the whole point of pinning
+folio_maybe_dma_pinned(): the whole point of pinning
===================================================
-The whole point of marking pages as "DMA-pinned" or "gup-pinned" is to be able
-to query, "is this page DMA-pinned?" That allows code such as page_mkclean()
+The whole point of marking folios as "DMA-pinned" or "gup-pinned" is to be able
+to query, "is this folio DMA-pinned?" That allows code such as page_mkclean()
(and file system writeback code in general) to make informed decisions about
-what to do when a page cannot be unmapped due to such pins.
+what to do when a folio cannot be unmapped due to such pins.
What to do in those cases is the subject of a years-long series of discussions
and debates (see the References at the end of this document). It's a TODO item
here: fill in the details once that's worked out. Meanwhile, it's safe to say
that having this available: ::
- static inline bool page_maybe_dma_pinned(struct page *page)
+ static inline bool folio_maybe_dma_pinned(struct folio *folio)
...is a prerequisite to solving the long-running gup+DMA problem.
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 28c40f68762e..d42497e25d43 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1953,8 +1953,8 @@ static inline struct folio *pfn_folio(unsigned long pfn)
*
* For more information, please see Documentation/core-api/pin_user_pages.rst.
*
- * Return: True, if it is likely that the page has been "dma-pinned".
- * False, if the page is definitely not dma-pinned.
+ * Return: True, if it is likely that the folio has been "dma-pinned".
+ * False, if the folio is definitely not dma-pinned.
*/
static inline bool folio_maybe_dma_pinned(struct folio *folio)
{
@@ -1973,11 +1973,6 @@ static inline bool folio_maybe_dma_pinned(struct folio *folio)
GUP_PIN_COUNTING_BIAS;
}
-static inline bool page_maybe_dma_pinned(struct page *page)
-{
- return folio_maybe_dma_pinned(page_folio(page));
-}
-
/*
* This should most likely only be called during fork() to see whether we
* should break the cow immediately for an anon page on the src mm.
--
2.27.0
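The "maybe" in folio_maybe_dma_pinned() comes from how pins are
counted; roughly (a simplified sketch of the helper shown in the hunk
above, not the verbatim kernel code):

static inline bool folio_maybe_dma_pinned(struct folio *folio)
{
	/* Large folios track pins exactly, in a dedicated counter. */
	if (folio_test_large(folio))
		return atomic_read(&folio->_pincount) > 0;

	/*
	 * Small folios count each FOLL_PIN as GUP_PIN_COUNTING_BIAS
	 * (1 << 10) on the refcount, so a folio with ~1024 ordinary
	 * references can report a false positive: hence "maybe".
	 */
	return (unsigned int)folio_ref_count(folio) >=
		GUP_PIN_COUNTING_BIAS;
}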
* [PATCH 3/4] fb_defio: use a folio in fb_deferred_io_work()
2024-06-04 11:48 [PATCH 0/4] mm: remove page_maybe_dma_pinned() and page_mkclean() Kefeng Wang
2024-06-04 11:48 ` [PATCH 1/4] fs/proc/task_mmu: use folio API in pte_is_pinned() Kefeng Wang
2024-06-04 11:48 ` [PATCH 2/4] mm: remove page_maybe_dma_pinned() Kefeng Wang
@ 2024-06-04 11:48 ` Kefeng Wang
2024-06-04 11:53 ` David Hildenbrand
2024-06-04 11:48 ` [PATCH 4/4] mm: remove page_mkclean() Kefeng Wang
3 siblings, 1 reply; 12+ messages in thread
From: Kefeng Wang @ 2024-06-04 11:48 UTC (permalink / raw)
To: Andrew Morton
Cc: Helge Deller, Daniel Vetter, David Hildenbrand, Matthew Wilcox,
linux-mm, Jonathan Corbet, Kefeng Wang
Replace three calls to compound_head() with one, which removes the
last caller of page_mkclean().
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
drivers/video/fbdev/core/fb_defio.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/video/fbdev/core/fb_defio.c b/drivers/video/fbdev/core/fb_defio.c
index 806ecd32219b..c9c8e294b7e7 100644
--- a/drivers/video/fbdev/core/fb_defio.c
+++ b/drivers/video/fbdev/core/fb_defio.c
@@ -244,10 +244,11 @@ static void fb_deferred_io_work(struct work_struct *work)
/* here we mkclean the pages, then do all deferred IO */
mutex_lock(&fbdefio->lock);
list_for_each_entry(pageref, &fbdefio->pagereflist, list) {
- struct page *cur = pageref->page;
- lock_page(cur);
- page_mkclean(cur);
- unlock_page(cur);
+ struct folio *folio = page_folio(pageref->page);
+
+ folio_lock(folio);
+ folio_mkclean(folio);
+ folio_unlock(folio);
}
/* driver's callback with pagereflist */
--
2.27.0
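The three compound_head() calls come from the page-based wrappers,
each of which resolves the head page on its own; roughly (a sketch,
not the exact kernel definitions):

static inline void lock_page(struct page *page)
{
	folio_lock(page_folio(page));		/* compound_head() #1 */
}

static inline int page_mkclean(struct page *page)
{
	return folio_mkclean(page_folio(page));	/* compound_head() #2 */
}

static inline void unlock_page(struct page *page)
{
	folio_unlock(page_folio(page));		/* compound_head() #3 */
}

Calling page_folio() once in fb_deferred_io_work() leaves a single
lookup for all three operations.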
* [PATCH 4/4] mm: remove page_mkclean()
2024-06-04 11:48 [PATCH 0/4] mm: remove page_maybe_dma_pinned() and page_mkclean() Kefeng Wang
` (2 preceding siblings ...)
2024-06-04 11:48 ` [PATCH 3/4] fb_defio: use a folio in fb_deferred_io_work() Kefeng Wang
@ 2024-06-04 11:48 ` Kefeng Wang
2024-06-07 6:46 ` [PATCH] pin_user_pages: fix underline length Kefeng Wang
3 siblings, 1 reply; 12+ messages in thread
From: Kefeng Wang @ 2024-06-04 11:48 UTC (permalink / raw)
To: Andrew Morton
Cc: Helge Deller, Daniel Vetter, David Hildenbrand, Matthew Wilcox,
linux-mm, Jonathan Corbet, Kefeng Wang
There are no more users of page_mkclean(); remove it and update the
documentation and comments.
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
Documentation/core-api/pin_user_pages.rst | 8 ++++----
drivers/video/fbdev/core/fb_defio.c | 4 ++--
include/linux/mm.h | 2 +-
include/linux/rmap.h | 4 ----
mm/gup.c | 2 +-
mm/mremap.c | 2 +-
6 files changed, 9 insertions(+), 13 deletions(-)
diff --git a/Documentation/core-api/pin_user_pages.rst b/Documentation/core-api/pin_user_pages.rst
index 532ba8e8381b..4e997b0b8487 100644
--- a/Documentation/core-api/pin_user_pages.rst
+++ b/Documentation/core-api/pin_user_pages.rst
@@ -132,7 +132,7 @@ CASE 1: Direct IO (DIO)
-----------------------
There are GUP references to pages that are serving
as DIO buffers. These buffers are needed for a relatively short time (so they
-are not "long term"). No special synchronization with page_mkclean() or
+are not "long term"). No special synchronization with folio_mkclean() or
munmap() is provided. Therefore, flags to set at the call site are: ::
FOLL_PIN
@@ -144,7 +144,7 @@ CASE 2: RDMA
------------
There are GUP references to pages that are serving as DMA
buffers. These buffers are needed for a long time ("long term"). No special
-synchronization with page_mkclean() or munmap() is provided. Therefore, flags
+synchronization with folio_mkclean() or munmap() is provided. Therefore, flags
to set at the call site are: ::
FOLL_PIN | FOLL_LONGTERM
@@ -170,7 +170,7 @@ callback, simply remove the range from the device's page tables.
Either way, as long as the driver unpins the pages upon mmu notifier callback,
then there is proper synchronization with both filesystem and mm
-(page_mkclean(), munmap(), etc). Therefore, neither flag needs to be set.
+(folio_mkclean(), munmap(), etc). Therefore, neither flag needs to be set.
CASE 4: Pinning for struct page manipulation only
-------------------------------------------------
@@ -200,7 +200,7 @@ folio_maybe_dma_pinned(): the whole point of pinning
===================================================
The whole point of marking folios as "DMA-pinned" or "gup-pinned" is to be able
-to query, "is this folio DMA-pinned?" That allows code such as page_mkclean()
+to query, "is this folio DMA-pinned?" That allows code such as folio_mkclean()
(and file system writeback code in general) to make informed decisions about
what to do when a folio cannot be unmapped due to such pins.
diff --git a/drivers/video/fbdev/core/fb_defio.c b/drivers/video/fbdev/core/fb_defio.c
index c9c8e294b7e7..d38998714215 100644
--- a/drivers/video/fbdev/core/fb_defio.c
+++ b/drivers/video/fbdev/core/fb_defio.c
@@ -113,7 +113,7 @@ static vm_fault_t fb_deferred_io_fault(struct vm_fault *vmf)
printk(KERN_ERR "no mapping available\n");
BUG_ON(!page->mapping);
- page->index = vmf->pgoff; /* for page_mkclean() */
+ page->index = vmf->pgoff; /* for folio_mkclean() */
vmf->page = page;
return 0;
@@ -161,7 +161,7 @@ static vm_fault_t fb_deferred_io_track_page(struct fb_info *info, unsigned long
/*
* We want the page to remain locked from ->page_mkwrite until
- * the PTE is marked dirty to avoid page_mkclean() being called
+ * the PTE is marked dirty to avoid folio_mkclean() being called
* before the PTE is updated, which would leave the page ignored
* by defio.
* Do this by locking the page here and informing the caller
diff --git a/include/linux/mm.h b/include/linux/mm.h
index d42497e25d43..cc21b2f0cdf8 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1610,7 +1610,7 @@ static inline void put_page(struct page *page)
* issue.
*
* Locking: the lockless algorithm described in folio_try_get_rcu()
- * provides safe operation for get_user_pages(), page_mkclean() and
+ * provides safe operation for get_user_pages(), folio_mkclean() and
* other calls that race to set up page table entries.
*/
#define GUP_PIN_COUNTING_BIAS (1U << 10)
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 0a5d8c7f0690..216ddd369e8f 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -793,8 +793,4 @@ static inline int folio_mkclean(struct folio *folio)
}
#endif /* CONFIG_MMU */
-static inline int page_mkclean(struct page *page)
-{
- return folio_mkclean(page_folio(page));
-}
#endif /* _LINUX_RMAP_H */
diff --git a/mm/gup.c b/mm/gup.c
index e17466fd62bb..83e279731d1b 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -393,7 +393,7 @@ void unpin_user_pages_dirty_lock(struct page **pages, unsigned long npages,
* 1) This code sees the page as already dirty, so it
* skips the call to set_page_dirty(). That could happen
* because clear_page_dirty_for_io() called
- * page_mkclean(), followed by set_page_dirty().
+ * folio_mkclean(), followed by set_page_dirty().
* However, now the page is going to get written back,
* which meets the original intention of setting it
* dirty, so all is well: clear_page_dirty_for_io() goes
diff --git a/mm/mremap.c b/mm/mremap.c
index 5f96bc5ee918..e7ae140fc640 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -198,7 +198,7 @@ static int move_ptes(struct vm_area_struct *vma, pmd_t *old_pmd,
* PTE.
*
* NOTE! Both old and new PTL matter: the old one
- * for racing with page_mkclean(), the new one to
+ * for racing with folio_mkclean(), the new one to
* make sure the physical page stays valid until
* the TLB entry for the old mapping has been
* flushed.
--
2.27.0
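For reference, folio_mkclean() itself, which all callers now use
directly, has roughly the following shape; a sketch following
mm/rmap.c, where the internal helper names may differ:

int folio_mkclean(struct folio *folio)
{
	int cleaned = 0;
	struct address_space *mapping;
	struct rmap_walk_control rwc = {
		.arg = (void *)&cleaned,
		.rmap_one = page_mkclean_one,
		.invalid_vma = invalid_mkclean_vma,
	};

	BUG_ON(!folio_test_locked(folio));

	if (!folio_mapped(folio))
		return 0;

	mapping = folio_mapping(folio);
	if (!mapping)
		return 0;

	/*
	 * Write-protect and clean every PTE mapping this folio, so the
	 * next write faults and writeback sees the folio dirty again.
	 */
	rmap_walk(folio, &rwc);

	return cleaned;
}

This is also why fb_defio sets page->index in its fault handler: the
file rmap walk locates the mappings via folio->mapping and
folio->index.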
* Re: [PATCH 1/4] fs/proc/task_mmu: use folio API in pte_is_pinned()
2024-06-04 11:48 ` [PATCH 1/4] fs/proc/task_mmu: use folio API in pte_is_pinned() Kefeng Wang
@ 2024-06-04 11:51 ` David Hildenbrand
2024-06-04 14:26 ` Kefeng Wang
0 siblings, 1 reply; 12+ messages in thread
From: David Hildenbrand @ 2024-06-04 11:51 UTC (permalink / raw)
To: Kefeng Wang, Andrew Morton
Cc: Helge Deller, Daniel Vetter, Matthew Wilcox, linux-mm, Jonathan Corbet
On 04.06.24 13:48, Kefeng Wang wrote:
> Convert to use vm_normal_folio() and folio_maybe_dma_pinned() API,
> which helps to remove page_maybe_dma_pinned() in the subsequent change.
>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
> fs/proc/task_mmu.c | 8 ++++----
> 1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> index f8d35f993fe5..5aceb3db7565 100644
> --- a/fs/proc/task_mmu.c
> +++ b/fs/proc/task_mmu.c
> @@ -1088,7 +1088,7 @@ struct clear_refs_private {
>
> static inline bool pte_is_pinned(struct vm_area_struct *vma, unsigned long addr, pte_t pte)
> {
> - struct page *page;
> + struct folio *folio;
>
> if (!pte_write(pte))
> return false;
> @@ -1096,10 +1096,10 @@ static inline bool pte_is_pinned(struct vm_area_struct *vma, unsigned long addr,
> return false;
> if (likely(!test_bit(MMF_HAS_PINNED, &vma->vm_mm->flags)))
> return false;
> - page = vm_normal_page(vma, addr, pte);
> - if (!page)
> + folio = vm_normal_folio(vma, addr, pte);
> + if (!folio)
> return false;
> - return page_maybe_dma_pinned(page);
> + return folio_maybe_dma_pinned(folio);
> }
>
> static inline void clear_soft_dirty(struct vm_area_struct *vma,
Likely we should just get rid of the pte_is_pinned() check completely
now. We don't perform the same for PMDs, we don't sync against GUP-fast,
and the original COW vs. GUP issue was resolved.
--
Cheers,
David / dhildenb
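As a rough illustration of that suggestion, assuming clear_soft_dirty()
in fs/proc/task_mmu.c is the sole caller of pte_is_pinned(), dropping
the check could look like the sketch below (illustrative only, not a
patch from this thread):

static inline void clear_soft_dirty(struct vm_area_struct *vma,
		unsigned long addr, pte_t *pte)
{
	pte_t ptent = ptep_get(pte);

	if (pte_present(ptent)) {
		pte_t old_pte;

		/* The pte_is_pinned() bail-out previously lived here. */
		old_pte = ptep_modify_prot_start(vma, addr, pte);
		ptent = pte_wrprotect(old_pte);
		ptent = pte_clear_soft_dirty(ptent);
		ptep_modify_prot_commit(vma, addr, pte, old_pte, ptent);
	} else if (is_swap_pte(ptent)) {
		ptent = pte_swp_clear_soft_dirty(ptent);
		set_pte_at(vma->vm_mm, addr, pte, ptent);
	}
}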
* Re: [PATCH 3/4] fb_defio: use a folio in fb_deferred_io_work()
2024-06-04 11:48 ` [PATCH 3/4] fb_defio: use a folio in fb_deferred_io_work() Kefeng Wang
@ 2024-06-04 11:53 ` David Hildenbrand
0 siblings, 0 replies; 12+ messages in thread
From: David Hildenbrand @ 2024-06-04 11:53 UTC (permalink / raw)
To: Kefeng Wang, Andrew Morton
Cc: Helge Deller, Daniel Vetter, Matthew Wilcox, linux-mm, Jonathan Corbet
On 04.06.24 13:48, Kefeng Wang wrote:
> Replace three calls to compound_head() with one, which removes the
> last caller of page_mkclean().
>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
> drivers/video/fbdev/core/fb_defio.c | 9 +++++----
> 1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/video/fbdev/core/fb_defio.c b/drivers/video/fbdev/core/fb_defio.c
> index 806ecd32219b..c9c8e294b7e7 100644
> --- a/drivers/video/fbdev/core/fb_defio.c
> +++ b/drivers/video/fbdev/core/fb_defio.c
> @@ -244,10 +244,11 @@ static void fb_deferred_io_work(struct work_struct *work)
> /* here we mkclean the pages, then do all deferred IO */
> mutex_lock(&fbdefio->lock);
> list_for_each_entry(pageref, &fbdefio->pagereflist, list) {
> - struct page *cur = pageref->page;
> - lock_page(cur);
> - page_mkclean(cur);
> - unlock_page(cur);
> + struct folio *folio = page_folio(pageref->page);
> +
> + folio_lock(folio);
> + folio_mkclean(folio);
> + folio_unlock(folio);
> }
>
> /* driver's callback with pagereflist */
Acked-by: David Hildenbrand <david@redhat.com>
--
Cheers,
David / dhildenb
* Re: [PATCH 1/4] fs/proc/task_mmu: use folio API in pte_is_pinned()
2024-06-04 11:51 ` David Hildenbrand
@ 2024-06-04 14:26 ` Kefeng Wang
[not found] ` <f466aae2-9a3e-4081-b2d8-fa5930b8bc29@redhat.com>
0 siblings, 1 reply; 12+ messages in thread
From: Kefeng Wang @ 2024-06-04 14:26 UTC (permalink / raw)
To: David Hildenbrand, Andrew Morton
Cc: Helge Deller, Daniel Vetter, Matthew Wilcox, linux-mm, Jonathan Corbet
On 2024/6/4 19:51, David Hildenbrand wrote:
> On 04.06.24 13:48, Kefeng Wang wrote:
>> Convert to use vm_normal_folio() and folio_maybe_dma_pinned() API,
>> which helps to remove page_maybe_dma_pinned() in the subsequent change.
>>
>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>> ---
>> fs/proc/task_mmu.c | 8 ++++----
>> 1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
>> index f8d35f993fe5..5aceb3db7565 100644
>> --- a/fs/proc/task_mmu.c
>> +++ b/fs/proc/task_mmu.c
>> @@ -1088,7 +1088,7 @@ struct clear_refs_private {
>> static inline bool pte_is_pinned(struct vm_area_struct *vma,
>> unsigned long addr, pte_t pte)
>> {
>> - struct page *page;
>> + struct folio *folio;
>> if (!pte_write(pte))
>> return false;
>> @@ -1096,10 +1096,10 @@ static inline bool pte_is_pinned(struct
>> vm_area_struct *vma, unsigned long addr,
>> return false;
>> if (likely(!test_bit(MMF_HAS_PINNED, &vma->vm_mm->flags)))
>> return false;
>> - page = vm_normal_page(vma, addr, pte);
>> - if (!page)
>> + folio = vm_normal_folio(vma, addr, pte);
>> + if (!folio)
>> return false;
>> - return page_maybe_dma_pinned(page);
>> + return folio_maybe_dma_pinned(folio);
>> }
>> static inline void clear_soft_dirty(struct vm_area_struct *vma,
>
> Likely we should just get rid of the pte_is_pinned() check completely
> now. We don't perform the same for PMDs, we don't sync against GUP-fast,
Yes, there is no such check when clearing PMDs.
> and the original COW vs. GUP issue was resolved.
Agree, but I'm not sure about removing it; from commit 9348b73c2e1bf:

"The whole "track page soft dirty" state doesn't work with pinned pages
anyway, since the page might be dirtied by the pinning entity without
ever being noticed in the page tables."

The issue is between the pin mechanism and "track page soft dirty": if
the page is pinned, the pinning entity (DMA?) could change the page but
the PTE dirty bit won't be set, so maybe we still need it, and even need
to add something similar for PMDs? Correct me if I'm wrong, thanks.
* Re: [PATCH 1/4] fs/proc/task_mmu: use folio API in pte_is_pinned()
[not found] ` <f466aae2-9a3e-4081-b2d8-fa5930b8bc29@redhat.com>
@ 2024-06-05 1:30 ` Kefeng Wang
2024-06-25 22:38 ` Andrew Morton
0 siblings, 1 reply; 12+ messages in thread
From: Kefeng Wang @ 2024-06-05 1:30 UTC (permalink / raw)
To: David Hildenbrand, Andrew Morton
Cc: Helge Deller, Daniel Vetter, Matthew Wilcox, linux-mm, Jonathan Corbet
On 2024/6/4 23:47, David Hildenbrand wrote:
> On 04.06.24 16:26, Kefeng Wang wrote:
>>
>>
>> On 2024/6/4 19:51, David Hildenbrand wrote:
>>> On 04.06.24 13:48, Kefeng Wang wrote:
>>>> Convert to use vm_normal_folio() and folio_maybe_dma_pinned() API,
>>>> which helps to remove page_maybe_dma_pinned() in the subsequent change.
>>>>
>>>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>>>> ---
>>>> fs/proc/task_mmu.c | 8 ++++----
>>>> 1 file changed, 4 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
>>>> index f8d35f993fe5..5aceb3db7565 100644
>>>> --- a/fs/proc/task_mmu.c
>>>> +++ b/fs/proc/task_mmu.c
>>>> @@ -1088,7 +1088,7 @@ struct clear_refs_private {
>>>> static inline bool pte_is_pinned(struct vm_area_struct *vma,
>>>> unsigned long addr, pte_t pte)
>>>> {
>>>> - struct page *page;
>>>> + struct folio *folio;
>>>> if (!pte_write(pte))
>>>> return false;
>>>> @@ -1096,10 +1096,10 @@ static inline bool pte_is_pinned(struct
>>>> vm_area_struct *vma, unsigned long addr,
>>>> return false;
>>>> if (likely(!test_bit(MMF_HAS_PINNED, &vma->vm_mm->flags)))
>>>> return false;
>>>> - page = vm_normal_page(vma, addr, pte);
>>>> - if (!page)
>>>> + folio = vm_normal_folio(vma, addr, pte);
>>>> + if (!folio)
>>>> return false;
>>>> - return page_maybe_dma_pinned(page);
>>>> + return folio_maybe_dma_pinned(folio);
>>>> }
>>>> static inline void clear_soft_dirty(struct vm_area_struct *vma,
>>>
>>> Likely we should just get rid of the pte_is_pinned() check completely
>>> now. We don't perform the same for PMDs, we don't sync against GUP-fast,
>>
>> Yes, there is no such check when clearing PMDs.
>>
>>> and the original COW vs. GUP issue was resolved.
>>
>> Agree, but I'm not sure about removing it; from commit 9348b73c2e1bf:
>>
>> "The whole "track page soft dirty" state doesn't work with pinned pages
>> anyway, since the page might be dirtied by the pinning entity without
>> ever being noticed in the page tables."
>>
>> The issue is between the pin mechanism and "track page soft dirty": if
>> the page is pinned, the pinning entity (DMA?) could change the page but
>> the PTE dirty bit won't be set, so maybe we still need it, and even need
>> to add something similar for PMDs? Correct me if I'm wrong, thanks.
>
> Yes, but it doesn't work with any mechanism that write-protects PTEs,
> including mprotect() and uffd-wp.
>
> Then, we never synced against concurrent GUP-fast, concurrent O_DIRECT
> that might still use !FOLL_PIN, never handled PMD ... so it's neither
> consistent nor really helpful nowadays.
>
> ... and if you have read-only pinned pages (which we cannot distinguish)
>
OK, many cases need to be addressed.
> I have had a proper patch lying around here for quite a while:
>
> commit 9ef578b7aba8bba626b904fe90e5be0690842fd3
> Author: David Hildenbrand <david@redhat.com>
> Date:   Wed Feb 16 20:39:43 2022 +0100
>
> fs/proc/task_mmu: allow setting pinned pages R/O for softdirty tracking
>
> Before we had PG_anon_exclusive, our new COW logic would end up
> replacing a pinned page in the page tables, detecting it as possibly
> shared. Consequently, we'd lose synchronicity between the page mapped
> into the page table and the page pin.
>
> We tried preventing mapping pinned pages R/O, however, history told us
> that that is impossible -- and we added PG_anon_exclusive to have it all
> working reliably again.
>
> Now that we have PG_anon_exclusive in place, let's get rid of the check
> for pinned pages and revert it to the old state.
>
> Yes, we won't be able to detect the following scenario correctly:
>
> (1) R/W-pin a page.
> (2) Clear softdirty.
> (3) Modify the page via the R/W-pin.
>
> However, that isn't working reliably right now either way, because
>
> * The current check is racy, because we can race with GUP-fast taking a
>   R/W pin while we're clearing the softdirty marker and mapping the
>   page R/O.
> * The current check cannot handle FOLL_GET R/W references, as used for
>   O_DIRECT. So if there is concurrent I/O the PTE will get marked as
>   !softdirty, yet, direct I/O will modify page content afterwards.
> * We check for pins only when handling PTEs, but not for PMDs etc.
>
> Also, this handling is in no way different to other mechanisms (mprotect,
> uffd-wp) that map pages R/O to catch successive write access via the
> page table -- because accesses via the page pin no longer go logically
> via the page table, the page table is bypassed.
>
> With this change, the interface now works again as expected when we have
> R/O pins, and behaves just like any other mechanism that uses write
> protection to catch successive writes (mprotect, uffd-wp) -- and they
> all face the same issue regarding R/W access via GUP (FOLL_PIN and
> FOLL_GET).
>
> User space better be aware that using write-protection to catch writes
> to a page can miss writes via GUP. Softdirty tracking cannot reliably
> catch modifications via GUP after clearing softdirty and returning to
> user space.
>
> But I understand if you want to be careful :) So I might send that
> patch out at some point myself ...
>
Thanks for your detailed explanation; let's wait it out.
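To make the scenario concrete, a hypothetical userspace demo (not from
this thread) of the soft-dirty interface being discussed; it assumes
the documented behaviour that writing "4" to /proc/self/clear_refs
clears the soft-dirty bits and that pagemap bit 55 reports them:

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static int soft_dirty(void *addr)
{
	uint64_t ent = 0;
	int fd = open("/proc/self/pagemap", O_RDONLY);

	/* One 64-bit pagemap entry per virtual page. */
	pread(fd, &ent, sizeof(ent),
	      (uintptr_t)addr / getpagesize() * sizeof(ent));
	close(fd);
	return (ent >> 55) & 1;		/* bit 55: PTE is soft-dirty */
}

int main(void)
{
	size_t pagesz = getpagesize();
	char *buf = aligned_alloc(pagesz, pagesz);
	int fd;

	memset(buf, 1, pagesz);	/* CPU write: the PTE becomes soft-dirty */
	printf("after write:      soft-dirty=%d\n", soft_dirty(buf));

	fd = open("/proc/self/clear_refs", O_WRONLY);
	write(fd, "4", 1);	/* "4" clears the soft-dirty bits */
	close(fd);
	printf("after clear_refs: soft-dirty=%d\n", soft_dirty(buf));

	/*
	 * Had buf been pinned for DMA before the clear (e.g. by an
	 * in-flight O_DIRECT read targeting it), the device could still
	 * write into it here without taking a fault, so soft-dirty
	 * would stay 0: the miss discussed above.
	 */
	free(buf);
	return 0;
}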
* [PATCH] pin_user_pages: fix underline length
2024-06-04 11:48 ` [PATCH 4/4] mm: remove page_mkclean() Kefeng Wang
@ 2024-06-07 6:46 ` Kefeng Wang
0 siblings, 0 replies; 12+ messages in thread
From: Kefeng Wang @ 2024-06-07 6:46 UTC (permalink / raw)
To: Andrew Morton; +Cc: linux-mm, Jonathan Corbet, sfr, Kefeng Wang
Documentation/core-api/pin_user_pages.rst:200: WARNING: Title underline too short.
folio_maybe_dma_pinned(): the whole point of pinning
===================================================
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
Andrew, please help to squash this into "mm: remove page_mkclean()",
thanks.
Documentation/core-api/pin_user_pages.rst | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Documentation/core-api/pin_user_pages.rst b/Documentation/core-api/pin_user_pages.rst
index 4e997b0b8487..c16ca163b55e 100644
--- a/Documentation/core-api/pin_user_pages.rst
+++ b/Documentation/core-api/pin_user_pages.rst
@@ -197,7 +197,7 @@ INCORRECT (uses FOLL_GET calls):
put_page()
folio_maybe_dma_pinned(): the whole point of pinning
-===================================================
+====================================================
The whole point of marking folios as "DMA-pinned" or "gup-pinned" is to be able
to query, "is this folio DMA-pinned?" That allows code such as folio_mkclean()
--
2.27.0
* Re: [PATCH 1/4] fs/proc/task_mmu: use folio API in pte_is_pinned()
2024-06-05 1:30 ` Kefeng Wang
@ 2024-06-25 22:38 ` Andrew Morton
2024-06-26 4:57 ` David Hildenbrand
0 siblings, 1 reply; 12+ messages in thread
From: Andrew Morton @ 2024-06-25 22:38 UTC (permalink / raw)
To: Kefeng Wang
Cc: David Hildenbrand, Helge Deller, Daniel Vetter, Matthew Wilcox,
linux-mm, Jonathan Corbet
On Wed, 5 Jun 2024 09:30:59 +0800 Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
> > protection to catch successive writes (mprotect, uffd-wp) -- and they
> > all face the same issue regarding R/W access via GUP (FOLL_PIN and
> > FOLL_GET).
> >
> > User space better be aware that using write-protection to catch writes
> > to a page can miss writes via GUP. Softdirty tracking cannot reliably
> > catch modifications via GUP after clearing softdirty and returning to
> > user space.
> >
> > But I understand if you want to be careful :) So I might send that
> > patch out at some point myself ...
> >
>
> Thanks for your detailed explanation; let's wait it out.
Did we wait long enough? Should we move ahead with this change
or should we drop it and have another run at it?
* Re: [PATCH 1/4] fs/proc/task_mmu: use folio API in pte_is_pinned()
2024-06-25 22:38 ` Andrew Morton
@ 2024-06-26 4:57 ` David Hildenbrand
0 siblings, 0 replies; 12+ messages in thread
From: David Hildenbrand @ 2024-06-26 4:57 UTC (permalink / raw)
To: Andrew Morton, Kefeng Wang
Cc: Helge Deller, Daniel Vetter, Matthew Wilcox, linux-mm, Jonathan Corbet
On 26.06.24 00:38, Andrew Morton wrote:
> On Wed, 5 Jun 2024 09:30:59 +0800 Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>
>>> protection to catch successive writes (mprotect, uffd-wp) -- and they
>>> all face the same issue regarding R/W access via GUP (FOLL_PIN and
>>> FOLL_GET).
>>>
>>> User space better be aware that using write-protection to catch writes
>>> to a page can miss writes via GUP. Softdirty tracking cannot reliably
>>> catch modifications via GUP after clearing softdirty and returning to
>>> user space.
>>>
>>> But I understand if you want to be careful :) So I might send that
>>> patch out at some point myself ...
>>>
>>
>> Thanks for your detailed explanation; let's wait it out.
>
> Did we wait long enough? Should we move ahead with this change
> or should we drop it and have another run at it?
The change is fine; we will revisit removing this code (instead of
cleaning it up) separately.
--
Cheers,
David / dhildenb