* [PATCH -V2 0/5] swap: cleanup get/put_swap_device() usage
@ 2023-05-22 7:09 Huang Ying
2023-05-22 7:09 ` [PATCH -V2 1/5] swap: Remove get/put_swap_device() in __swap_count() Huang Ying
` (4 more replies)
0 siblings, 5 replies; 21+ messages in thread
From: Huang Ying @ 2023-05-22 7:09 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, Huang Ying, David Hildenbrand,
Hugh Dickins, Johannes Weiner, Matthew Wilcox, Michal Hocko,
Minchan Kim, Tim Chen, Yang Shi, Yu Zhao
The general rule to use a swap entry is as follows.
When we get a swap entry, if there isn't some other way to prevent
swapoff, such as page lock for swap cache, page table lock, etc., the
swap entry may become invalid because of swapoff. Then, we need to
enclose all swap related functions with get_swap_device() and
put_swap_device(), unless the swap functions call
get/put_swap_device() by themselves.
Based on the above rule, all get/put_swap_device() usage are checked
and cleaned up if necessary.
Changelogs:
v2:
- Split patch per David's comments. Thanks!
Best Regards,
Huang, Ying
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH -V2 1/5] swap: Remove get/put_swap_device() in __swap_count()
2023-05-22 7:09 [PATCH -V2 0/5] swap: cleanup get/put_swap_device() usage Huang Ying
@ 2023-05-22 7:09 ` Huang Ying
2023-05-22 11:54 ` David Hildenbrand
2023-05-23 1:22 ` Yosry Ahmed
2023-05-22 7:09 ` [PATCH -V2 2/5] swap, __read_swap_cache_async(): enlarge get/put_swap_device protection range Huang Ying
` (3 subsequent siblings)
4 siblings, 2 replies; 21+ messages in thread
From: Huang Ying @ 2023-05-22 7:09 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, Huang Ying, David Hildenbrand,
Hugh Dickins, Johannes Weiner, Matthew Wilcox, Michal Hocko,
Minchan Kim, Tim Chen, Yang Shi, Yu Zhao
__swap_count() is called in do_swap_page() only, which encloses the
call site with get/put_swap_device() already.
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Yu Zhao <yuzhao@google.com>
---
mm/swapfile.c | 10 ++--------
1 file changed, 2 insertions(+), 8 deletions(-)
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 274bbf797480..8419cba9c192 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1432,16 +1432,10 @@ void swapcache_free_entries(swp_entry_t *entries, int n)
int __swap_count(swp_entry_t entry)
{
- struct swap_info_struct *si;
+ struct swap_info_struct *si = swp_swap_info(entry);
pgoff_t offset = swp_offset(entry);
- int count = 0;
- si = get_swap_device(entry);
- if (si) {
- count = swap_count(si->swap_map[offset]);
- put_swap_device(si);
- }
- return count;
+ return swap_count(si->swap_map[offset]);
}
/*
--
2.39.2
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH -V2 2/5] swap, __read_swap_cache_async(): enlarge get/put_swap_device protection range
2023-05-22 7:09 [PATCH -V2 0/5] swap: cleanup get/put_swap_device() usage Huang Ying
2023-05-22 7:09 ` [PATCH -V2 1/5] swap: Remove get/put_swap_device() in __swap_count() Huang Ying
@ 2023-05-22 7:09 ` Huang Ying
2023-05-22 12:01 ` David Hildenbrand
2023-05-22 7:09 ` [PATCH -V2 3/5] swap: remove __swp_swapcount() Huang Ying
` (2 subsequent siblings)
4 siblings, 1 reply; 21+ messages in thread
From: Huang Ying @ 2023-05-22 7:09 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, Huang Ying, David Hildenbrand,
Hugh Dickins, Johannes Weiner, Matthew Wilcox, Michal Hocko,
Minchan Kim, Tim Chen, Yang Shi, Yu Zhao
This makes the function a little easier to be understood because we
don't need to consider swapoff. And this makes it possible to remove
get/put_swap_device() calling in some functions called by
__read_swap_cache_async().
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Yu Zhao <yuzhao@google.com>
---
mm/swap_state.c | 33 ++++++++++++++++++++++-----------
1 file changed, 22 insertions(+), 11 deletions(-)
diff --git a/mm/swap_state.c b/mm/swap_state.c
index b76a65ac28b3..a1028fe7214e 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -417,9 +417,13 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
{
struct swap_info_struct *si;
struct folio *folio;
+ struct page *page;
void *shadow = NULL;
*new_page_allocated = false;
+ si = get_swap_device(entry);
+ if (!si)
+ return NULL;
for (;;) {
int err;
@@ -428,14 +432,12 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
* called after swap_cache_get_folio() failed, re-calling
* that would confuse statistics.
*/
- si = get_swap_device(entry);
- if (!si)
- return NULL;
folio = filemap_get_folio(swap_address_space(entry),
swp_offset(entry));
- put_swap_device(si);
- if (!IS_ERR(folio))
- return folio_file_page(folio, swp_offset(entry));
+ if (!IS_ERR(folio)) {
+ page = folio_file_page(folio, swp_offset(entry));
+ goto got_page;
+ }
/*
* Just skip read ahead for unused swap slot.
@@ -445,8 +447,8 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
* as SWAP_HAS_CACHE. That's done in later part of code or
* else swap_off will be aborted if we return NULL.
*/
- if (!__swp_swapcount(entry) && swap_slot_cache_enabled)
- return NULL;
+ if (!swap_swapcount(si, entry) && swap_slot_cache_enabled)
+ goto fail;
/*
* Get a new page to read into from swap. Allocate it now,
@@ -455,7 +457,7 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
*/
folio = vma_alloc_folio(gfp_mask, 0, vma, addr, false);
if (!folio)
- return NULL;
+ goto fail;
/*
* Swap entry may have been freed since our caller observed it.
@@ -466,7 +468,7 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
folio_put(folio);
if (err != -EEXIST)
- return NULL;
+ goto fail;
/*
* We might race against __delete_from_swap_cache(), and
@@ -500,12 +502,17 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
/* Caller will initiate read into locked folio */
folio_add_lru(folio);
*new_page_allocated = true;
- return &folio->page;
+ page = &folio->page;
+got_page:
+ put_swap_device(si);
+ return page;
fail_unlock:
put_swap_folio(folio, entry);
folio_unlock(folio);
folio_put(folio);
+fail:
+ put_swap_device(si);
return NULL;
}
@@ -514,6 +521,10 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
* and reading the disk if it is not already cached.
* A failure return means that either the page allocation failed or that
* the swap entry is no longer in use.
+ *
+ * get/put_swap_device() aren't needed to call this function, because
+ * __read_swap_cache_async() call them and swap_readpage() holds the
+ * swap cache folio lock.
*/
struct page *read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
struct vm_area_struct *vma,
--
2.39.2
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH -V2 3/5] swap: remove __swp_swapcount()
2023-05-22 7:09 [PATCH -V2 0/5] swap: cleanup get/put_swap_device() usage Huang Ying
2023-05-22 7:09 ` [PATCH -V2 1/5] swap: Remove get/put_swap_device() in __swap_count() Huang Ying
2023-05-22 7:09 ` [PATCH -V2 2/5] swap, __read_swap_cache_async(): enlarge get/put_swap_device protection range Huang Ying
@ 2023-05-22 7:09 ` Huang Ying
2023-05-22 12:03 ` David Hildenbrand
2023-05-22 7:09 ` [PATCH -V2 4/5] swap: remove get/put_swap_device() in __swap_duplicate() Huang Ying
2023-05-22 7:09 ` [PATCH -V2 5/5] swap: comments get_swap_device() with usage rule Huang Ying
4 siblings, 1 reply; 21+ messages in thread
From: Huang Ying @ 2023-05-22 7:09 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, Huang Ying, David Hildenbrand,
Hugh Dickins, Johannes Weiner, Matthew Wilcox, Michal Hocko,
Minchan Kim, Tim Chen, Yang Shi, Yu Zhao
__swp_swapcount() just encloses the calling to swap_swapcount() with
get/put_swap_device(). It is called in __read_swap_cache_async()
only, which encloses the calling with get/put_swap_device() already.
So, __read_swap_cache_async() can call swap_swapcount() directly.
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Yu Zhao <yuzhao@google.com>
---
include/linux/swap.h | 4 ++--
mm/swapfile.c | 20 +-------------------
2 files changed, 3 insertions(+), 21 deletions(-)
diff --git a/include/linux/swap.h b/include/linux/swap.h
index 3c69cb653cb9..f6bd51aa05ea 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -512,7 +512,7 @@ int find_first_swap(dev_t *device);
extern unsigned int count_swap_pages(int, int);
extern sector_t swapdev_block(int, pgoff_t);
extern int __swap_count(swp_entry_t entry);
-extern int __swp_swapcount(swp_entry_t entry);
+extern int swap_swapcount(struct swap_info_struct *si, swp_entry_t entry);
extern int swp_swapcount(swp_entry_t entry);
extern struct swap_info_struct *page_swap_info(struct page *);
extern struct swap_info_struct *swp_swap_info(swp_entry_t entry);
@@ -590,7 +590,7 @@ static inline int __swap_count(swp_entry_t entry)
return 0;
}
-static inline int __swp_swapcount(swp_entry_t entry)
+static inline int swap_swapcount(struct swap_info_struct *si, swp_entry_t entry)
{
return 0;
}
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 8419cba9c192..e9cce775fb25 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1443,7 +1443,7 @@ int __swap_count(swp_entry_t entry)
* This does not give an exact answer when swap count is continued,
* but does include the high COUNT_CONTINUED flag to allow for that.
*/
-static int swap_swapcount(struct swap_info_struct *si, swp_entry_t entry)
+int swap_swapcount(struct swap_info_struct *si, swp_entry_t entry)
{
pgoff_t offset = swp_offset(entry);
struct swap_cluster_info *ci;
@@ -1455,24 +1455,6 @@ static int swap_swapcount(struct swap_info_struct *si, swp_entry_t entry)
return count;
}
-/*
- * How many references to @entry are currently swapped out?
- * This does not give an exact answer when swap count is continued,
- * but does include the high COUNT_CONTINUED flag to allow for that.
- */
-int __swp_swapcount(swp_entry_t entry)
-{
- int count = 0;
- struct swap_info_struct *si;
-
- si = get_swap_device(entry);
- if (si) {
- count = swap_swapcount(si, entry);
- put_swap_device(si);
- }
- return count;
-}
-
/*
* How many references to @entry are currently swapped out?
* This considers COUNT_CONTINUED so it returns exact answer.
--
2.39.2
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH -V2 4/5] swap: remove get/put_swap_device() in __swap_duplicate()
2023-05-22 7:09 [PATCH -V2 0/5] swap: cleanup get/put_swap_device() usage Huang Ying
` (2 preceding siblings ...)
2023-05-22 7:09 ` [PATCH -V2 3/5] swap: remove __swp_swapcount() Huang Ying
@ 2023-05-22 7:09 ` Huang Ying
2023-05-22 12:06 ` David Hildenbrand
2023-05-23 1:39 ` Yosry Ahmed
2023-05-22 7:09 ` [PATCH -V2 5/5] swap: comments get_swap_device() with usage rule Huang Ying
4 siblings, 2 replies; 21+ messages in thread
From: Huang Ying @ 2023-05-22 7:09 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, Huang Ying, David Hildenbrand,
Hugh Dickins, Johannes Weiner, Matthew Wilcox, Michal Hocko,
Minchan Kim, Tim Chen, Yang Shi, Yu Zhao
__swap_duplicate() is called by
- swap_shmem_alloc(): the page lock of the swap cache is held.
- copy_nonpresent_pte() -> swap_duplicate() and try_to_unmap_one() ->
swap_duplicate(): the page table lock is held.
- __read_swap_cache_async() -> swapcache_prepare(): enclosed with
get/put_swap_device() in __read_swap_cache_async() already.
So, it's safe to remove get/put_swap_device() in __swap_duplicate().
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Yu Zhao <yuzhao@google.com>
---
mm/swapfile.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
diff --git a/mm/swapfile.c b/mm/swapfile.c
index e9cce775fb25..4dbaea64635d 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -3264,9 +3264,7 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage)
unsigned char has_cache;
int err;
- p = get_swap_device(entry);
- if (!p)
- return -EINVAL;
+ p = swp_swap_info(entry);
offset = swp_offset(entry);
ci = lock_cluster_or_swap_info(p, offset);
@@ -3313,7 +3311,6 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage)
unlock_out:
unlock_cluster_or_swap_info(p, ci);
- put_swap_device(p);
return err;
}
--
2.39.2
^ permalink raw reply [flat|nested] 21+ messages in thread
* [PATCH -V2 5/5] swap: comments get_swap_device() with usage rule
2023-05-22 7:09 [PATCH -V2 0/5] swap: cleanup get/put_swap_device() usage Huang Ying
` (3 preceding siblings ...)
2023-05-22 7:09 ` [PATCH -V2 4/5] swap: remove get/put_swap_device() in __swap_duplicate() Huang Ying
@ 2023-05-22 7:09 ` Huang Ying
2023-05-22 12:07 ` David Hildenbrand
2023-05-23 1:37 ` Yosry Ahmed
4 siblings, 2 replies; 21+ messages in thread
From: Huang Ying @ 2023-05-22 7:09 UTC (permalink / raw)
To: Andrew Morton
Cc: linux-mm, linux-kernel, Huang Ying, David Hildenbrand,
Hugh Dickins, Johannes Weiner, Matthew Wilcox, Michal Hocko,
Minchan Kim, Tim Chen, Yang Shi, Yu Zhao
The general rule to use a swap entry is as follows.
When we get a swap entry, if there isn't some other way to prevent
swapoff, such as page lock for swap cache, page table lock, etc., the
swap entry may become invalid because of swapoff. Then, we need to
enclose all swap related functions with get_swap_device() and
put_swap_device(), unless the swap functions call
get/put_swap_device() by themselves.
Add the rule as comments of get_swap_device().
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Tim Chen <tim.c.chen@linux.intel.com>
Cc: Yang Shi <shy828301@gmail.com>
Cc: Yu Zhao <yuzhao@google.com>
---
mm/swapfile.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 4dbaea64635d..0c1cb935b2eb 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1219,6 +1219,13 @@ static unsigned char __swap_entry_free_locked(struct swap_info_struct *p,
}
/*
+ * When we get a swap entry, if there isn't some other way to prevent
+ * swapoff, such as page lock for swap cache, page table lock, etc.,
+ * the swap entry may become invalid because of swapoff. Then, we
+ * need to enclose all swap related functions with get_swap_device()
+ * and put_swap_device(), unless the swap functions call
+ * get/put_swap_device() by themselves.
+ *
* Check whether swap entry is valid in the swap device. If so,
* return pointer to swap_info_struct, and keep the swap entry valid
* via preventing the swap device from being swapoff, until
@@ -1227,9 +1234,8 @@ static unsigned char __swap_entry_free_locked(struct swap_info_struct *p,
* Notice that swapoff or swapoff+swapon can still happen before the
* percpu_ref_tryget_live() in get_swap_device() or after the
* percpu_ref_put() in put_swap_device() if there isn't any other way
- * to prevent swapoff, such as page lock, page table lock, etc. The
- * caller must be prepared for that. For example, the following
- * situation is possible.
+ * to prevent swapoff. The caller must be prepared for that. For
+ * example, the following situation is possible.
*
* CPU1 CPU2
* do_swap_page()
--
2.39.2
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH -V2 1/5] swap: Remove get/put_swap_device() in __swap_count()
2023-05-22 7:09 ` [PATCH -V2 1/5] swap: Remove get/put_swap_device() in __swap_count() Huang Ying
@ 2023-05-22 11:54 ` David Hildenbrand
2023-05-23 1:22 ` Yosry Ahmed
1 sibling, 0 replies; 21+ messages in thread
From: David Hildenbrand @ 2023-05-22 11:54 UTC (permalink / raw)
To: Huang Ying, Andrew Morton
Cc: linux-mm, linux-kernel, Hugh Dickins, Johannes Weiner,
Matthew Wilcox, Michal Hocko, Minchan Kim, Tim Chen, Yang Shi,
Yu Zhao
On 22.05.23 09:09, Huang Ying wrote:
> __swap_count() is called in do_swap_page() only, which encloses the
> call site with get/put_swap_device() already.
>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Tim Chen <tim.c.chen@linux.intel.com>
> Cc: Yang Shi <shy828301@gmail.com>
> Cc: Yu Zhao <yuzhao@google.com>
> ---
> mm/swapfile.c | 10 ++--------
> 1 file changed, 2 insertions(+), 8 deletions(-)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 274bbf797480..8419cba9c192 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1432,16 +1432,10 @@ void swapcache_free_entries(swp_entry_t *entries, int n)
>
> int __swap_count(swp_entry_t entry)
> {
> - struct swap_info_struct *si;
> + struct swap_info_struct *si = swp_swap_info(entry);
> pgoff_t offset = swp_offset(entry);
> - int count = 0;
>
> - si = get_swap_device(entry);
> - if (si) {
> - count = swap_count(si->swap_map[offset]);
> - put_swap_device(si);
> - }
> - return count;
> + return swap_count(si->swap_map[offset]);
> }
>
> /*
That locking was added in eb085574a752 ("mm, swap: fix race between
swapoff and some swap operations"). Before 2799e77529c ("swap: fix
do_swap_page() race with swapoff") added the get_swap_device() to
do_swap_page().
Reviewed-by: David Hildenbrand <david@redhat.com>
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH -V2 2/5] swap, __read_swap_cache_async(): enlarge get/put_swap_device protection range
2023-05-22 7:09 ` [PATCH -V2 2/5] swap, __read_swap_cache_async(): enlarge get/put_swap_device protection range Huang Ying
@ 2023-05-22 12:01 ` David Hildenbrand
2023-05-23 0:43 ` Huang, Ying
0 siblings, 1 reply; 21+ messages in thread
From: David Hildenbrand @ 2023-05-22 12:01 UTC (permalink / raw)
To: Huang Ying, Andrew Morton
Cc: linux-mm, linux-kernel, Hugh Dickins, Johannes Weiner,
Matthew Wilcox, Michal Hocko, Minchan Kim, Tim Chen, Yang Shi,
Yu Zhao
On 22.05.23 09:09, Huang Ying wrote:
> This makes the function a little easier to be understood because we
> don't need to consider swapoff. And this makes it possible to remove
> get/put_swap_device() calling in some functions called by
> __read_swap_cache_async().
>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Tim Chen <tim.c.chen@linux.intel.com>
> Cc: Yang Shi <shy828301@gmail.com>
> Cc: Yu Zhao <yuzhao@google.com>
> ---
> mm/swap_state.c | 33 ++++++++++++++++++++++-----------
> 1 file changed, 22 insertions(+), 11 deletions(-)
>
> diff --git a/mm/swap_state.c b/mm/swap_state.c
> index b76a65ac28b3..a1028fe7214e 100644
> --- a/mm/swap_state.c
> +++ b/mm/swap_state.c
> @@ -417,9 +417,13 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
> {
> struct swap_info_struct *si;
> struct folio *folio;
> + struct page *page;
> void *shadow = NULL;
>
> *new_page_allocated = false;
> + si = get_swap_device(entry);
> + if (!si)
> + return NULL;
>
> for (;;) {
> int err;
> @@ -428,14 +432,12 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
> * called after swap_cache_get_folio() failed, re-calling
> * that would confuse statistics.
> */
> - si = get_swap_device(entry);
> - if (!si)
> - return NULL;
> folio = filemap_get_folio(swap_address_space(entry),
> swp_offset(entry));
> - put_swap_device(si);
> - if (!IS_ERR(folio))
> - return folio_file_page(folio, swp_offset(entry));
> + if (!IS_ERR(folio)) {
> + page = folio_file_page(folio, swp_offset(entry));
> + goto got_page;
> + }
>
> /*
> * Just skip read ahead for unused swap slot.
> @@ -445,8 +447,8 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
> * as SWAP_HAS_CACHE. That's done in later part of code or
> * else swap_off will be aborted if we return NULL.
> */
> - if (!__swp_swapcount(entry) && swap_slot_cache_enabled)
> - return NULL;
> + if (!swap_swapcount(si, entry) && swap_slot_cache_enabled)
> + goto fail;
>
> /*
> * Get a new page to read into from swap. Allocate it now,
> @@ -455,7 +457,7 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
> */
> folio = vma_alloc_folio(gfp_mask, 0, vma, addr, false);
> if (!folio)
> - return NULL;
> + goto fail;
>
> /*
> * Swap entry may have been freed since our caller observed it.
> @@ -466,7 +468,7 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
>
> folio_put(folio);
> if (err != -EEXIST)
> - return NULL;
> + goto fail;
>
> /*
> * We might race against __delete_from_swap_cache(), and
> @@ -500,12 +502,17 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
> /* Caller will initiate read into locked folio */
> folio_add_lru(folio);
> *new_page_allocated = true;
> - return &folio->page;
> + page = &folio->page;
> +got_page:
> + put_swap_device(si);
> + return page;
>
> fail_unlock:
> put_swap_folio(folio, entry);
> folio_unlock(folio);
> folio_put(folio);
> +fail:
Maybe better "fail_put_swap".
We now hold the swap device ref longer than we used to, prevent swapoff
over the whole operation. I guess there is no way we can deadlock this way?
In general, looks good to me.
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH -V2 3/5] swap: remove __swp_swapcount()
2023-05-22 7:09 ` [PATCH -V2 3/5] swap: remove __swp_swapcount() Huang Ying
@ 2023-05-22 12:03 ` David Hildenbrand
2023-05-23 0:50 ` Huang, Ying
0 siblings, 1 reply; 21+ messages in thread
From: David Hildenbrand @ 2023-05-22 12:03 UTC (permalink / raw)
To: Huang Ying, Andrew Morton
Cc: linux-mm, linux-kernel, Hugh Dickins, Johannes Weiner,
Matthew Wilcox, Michal Hocko, Minchan Kim, Tim Chen, Yang Shi,
Yu Zhao
On 22.05.23 09:09, Huang Ying wrote:
> __swp_swapcount() just encloses the calling to swap_swapcount() with
> get/put_swap_device(). It is called in __read_swap_cache_async()
> only, which encloses the calling with get/put_swap_device() already.
> So, __read_swap_cache_async() can call swap_swapcount() directly.
The previous patch contained the hunk
- if (!__swp_swapcount(entry) && swap_slot_cache_enabled)
- return NULL;
+ if (!swap_swapcount(si, entry) && swap_slot_cache_enabled)
+ goto fail;
So something is a bit off here. Either that hunk should go here, or
this patch description has to be adjusted.
But I guess patch #2 doesn't compile on its own because this patch
here adds swap_swapcount() to include/linux/swap.h ?
>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Tim Chen <tim.c.chen@linux.intel.com>
> Cc: Yang Shi <shy828301@gmail.com>
> Cc: Yu Zhao <yuzhao@google.com>
> ---
> include/linux/swap.h | 4 ++--
> mm/swapfile.c | 20 +-------------------
> 2 files changed, 3 insertions(+), 21 deletions(-)
>
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index 3c69cb653cb9..f6bd51aa05ea 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -512,7 +512,7 @@ int find_first_swap(dev_t *device);
> extern unsigned int count_swap_pages(int, int);
> extern sector_t swapdev_block(int, pgoff_t);
> extern int __swap_count(swp_entry_t entry);
> -extern int __swp_swapcount(swp_entry_t entry);
> +extern int swap_swapcount(struct swap_info_struct *si, swp_entry_t entry);
> extern int swp_swapcount(swp_entry_t entry);
> extern struct swap_info_struct *page_swap_info(struct page *);
> extern struct swap_info_struct *swp_swap_info(swp_entry_t entry);
> @@ -590,7 +590,7 @@ static inline int __swap_count(swp_entry_t entry)
> return 0;
> }
>
> -static inline int __swp_swapcount(swp_entry_t entry)
> +static inline int swap_swapcount(struct swap_info_struct *si, swp_entry_t entry)
> {
> return 0;
> }
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 8419cba9c192..e9cce775fb25 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1443,7 +1443,7 @@ int __swap_count(swp_entry_t entry)
> * This does not give an exact answer when swap count is continued,
> * but does include the high COUNT_CONTINUED flag to allow for that.
> */
> -static int swap_swapcount(struct swap_info_struct *si, swp_entry_t entry)
> +int swap_swapcount(struct swap_info_struct *si, swp_entry_t entry)
> {
> pgoff_t offset = swp_offset(entry);
> struct swap_cluster_info *ci;
> @@ -1455,24 +1455,6 @@ static int swap_swapcount(struct swap_info_struct *si, swp_entry_t entry)
> return count;
> }
>
> -/*
> - * How many references to @entry are currently swapped out?
> - * This does not give an exact answer when swap count is continued,
> - * but does include the high COUNT_CONTINUED flag to allow for that.
> - */
> -int __swp_swapcount(swp_entry_t entry)
> -{
> - int count = 0;
> - struct swap_info_struct *si;
> -
> - si = get_swap_device(entry);
> - if (si) {
> - count = swap_swapcount(si, entry);
> - put_swap_device(si);
> - }
> - return count;
> -}
> -
> /*
> * How many references to @entry are currently swapped out?
> * This considers COUNT_CONTINUED so it returns exact answer.
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH -V2 4/5] swap: remove get/put_swap_device() in __swap_duplicate()
2023-05-22 7:09 ` [PATCH -V2 4/5] swap: remove get/put_swap_device() in __swap_duplicate() Huang Ying
@ 2023-05-22 12:06 ` David Hildenbrand
2023-05-23 0:56 ` Huang, Ying
2023-05-23 1:39 ` Yosry Ahmed
1 sibling, 1 reply; 21+ messages in thread
From: David Hildenbrand @ 2023-05-22 12:06 UTC (permalink / raw)
To: Huang Ying, Andrew Morton
Cc: linux-mm, linux-kernel, Hugh Dickins, Johannes Weiner,
Matthew Wilcox, Michal Hocko, Minchan Kim, Tim Chen, Yang Shi,
Yu Zhao
On 22.05.23 09:09, Huang Ying wrote:
> __swap_duplicate() is called by
>
> - swap_shmem_alloc(): the page lock of the swap cache is held.
page lock of the swap cache? Did you really mean to say that or am I
confused?
"Page lock of the page that is in the swap cache?"
>
> - copy_nonpresent_pte() -> swap_duplicate() and try_to_unmap_one() ->
> swap_duplicate(): the page table lock is held.
>
> - __read_swap_cache_async() -> swapcache_prepare(): enclosed with
> get/put_swap_device() in __read_swap_cache_async() already.
>
> So, it's safe to remove get/put_swap_device() in __swap_duplicate().
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH -V2 5/5] swap: comments get_swap_device() with usage rule
2023-05-22 7:09 ` [PATCH -V2 5/5] swap: comments get_swap_device() with usage rule Huang Ying
@ 2023-05-22 12:07 ` David Hildenbrand
2023-05-23 1:00 ` Huang, Ying
2023-05-23 1:37 ` Yosry Ahmed
1 sibling, 1 reply; 21+ messages in thread
From: David Hildenbrand @ 2023-05-22 12:07 UTC (permalink / raw)
To: Huang Ying, Andrew Morton
Cc: linux-mm, linux-kernel, Hugh Dickins, Johannes Weiner,
Matthew Wilcox, Michal Hocko, Minchan Kim, Tim Chen, Yang Shi,
Yu Zhao
On 22.05.23 09:09, Huang Ying wrote:
> The general rule to use a swap entry is as follows.
>
> When we get a swap entry, if there isn't some other way to prevent
> swapoff, such as page lock for swap cache, page table lock, etc., the
> swap entry may become invalid because of swapoff. Then, we need to
> enclose all swap related functions with get_swap_device() and
> put_swap_device(), unless the swap functions call
> get/put_swap_device() by themselves.
>
> Add the rule as comments of get_swap_device().
>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Tim Chen <tim.c.chen@linux.intel.com>
> Cc: Yang Shi <shy828301@gmail.com>
> Cc: Yu Zhao <yuzhao@google.com>
> ---
> mm/swapfile.c | 12 +++++++++---
> 1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 4dbaea64635d..0c1cb935b2eb 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1219,6 +1219,13 @@ static unsigned char __swap_entry_free_locked(struct swap_info_struct *p,
> }
>
> /*
> + * When we get a swap entry, if there isn't some other way to prevent
> + * swapoff, such as page lock for swap cache, page table lock, etc.,
Again "page lock for swap cache" might be imprecise.
> + * the swap entry may become invalid because of swapoff. Then, we
> + * need to enclose all swap related functions with get_swap_device()
> + * and put_swap_device(), unless the swap functions call
> + * get/put_swap_device() by themselves.
> + *
> * Check whether swap entry is valid in the swap device. If so,
> * return pointer to swap_info_struct, and keep the swap entry valid
> * via preventing the swap device from being swapoff, until
> @@ -1227,9 +1234,8 @@ static unsigned char __swap_entry_free_locked(struct swap_info_struct *p,
> * Notice that swapoff or swapoff+swapon can still happen before the
> * percpu_ref_tryget_live() in get_swap_device() or after the
> * percpu_ref_put() in put_swap_device() if there isn't any other way
> - * to prevent swapoff, such as page lock, page table lock, etc. The
> - * caller must be prepared for that. For example, the following
> - * situation is possible.
> + * to prevent swapoff. The caller must be prepared for that. For
> + * example, the following situation is possible.
> *
> * CPU1 CPU2
> * do_swap_page()
Reviewed-by: David Hildenbrand <david@redhat.com>
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH -V2 2/5] swap, __read_swap_cache_async(): enlarge get/put_swap_device protection range
2023-05-22 12:01 ` David Hildenbrand
@ 2023-05-23 0:43 ` Huang, Ying
0 siblings, 0 replies; 21+ messages in thread
From: Huang, Ying @ 2023-05-23 0:43 UTC (permalink / raw)
To: David Hildenbrand
Cc: Andrew Morton, linux-mm, linux-kernel, Hugh Dickins,
Johannes Weiner, Matthew Wilcox, Michal Hocko, Minchan Kim,
Tim Chen, Yang Shi, Yu Zhao
David Hildenbrand <david@redhat.com> writes:
> On 22.05.23 09:09, Huang Ying wrote:
>> This makes the function a little easier to be understood because we
>> don't need to consider swapoff. And this makes it possible to remove
>> get/put_swap_device() calling in some functions called by
>> __read_swap_cache_async().
>> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>> Cc: David Hildenbrand <david@redhat.com>
>> Cc: Hugh Dickins <hughd@google.com>
>> Cc: Johannes Weiner <hannes@cmpxchg.org>
>> Cc: Matthew Wilcox <willy@infradead.org>
>> Cc: Michal Hocko <mhocko@suse.com>
>> Cc: Minchan Kim <minchan@kernel.org>
>> Cc: Tim Chen <tim.c.chen@linux.intel.com>
>> Cc: Yang Shi <shy828301@gmail.com>
>> Cc: Yu Zhao <yuzhao@google.com>
>> ---
>> mm/swap_state.c | 33 ++++++++++++++++++++++-----------
>> 1 file changed, 22 insertions(+), 11 deletions(-)
>> diff --git a/mm/swap_state.c b/mm/swap_state.c
>> index b76a65ac28b3..a1028fe7214e 100644
>> --- a/mm/swap_state.c
>> +++ b/mm/swap_state.c
>> @@ -417,9 +417,13 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
>> {
>> struct swap_info_struct *si;
>> struct folio *folio;
>> + struct page *page;
>> void *shadow = NULL;
>> *new_page_allocated = false;
>> + si = get_swap_device(entry);
>> + if (!si)
>> + return NULL;
>> for (;;) {
>> int err;
>> @@ -428,14 +432,12 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
>> * called after swap_cache_get_folio() failed, re-calling
>> * that would confuse statistics.
>> */
>> - si = get_swap_device(entry);
>> - if (!si)
>> - return NULL;
>> folio = filemap_get_folio(swap_address_space(entry),
>> swp_offset(entry));
>> - put_swap_device(si);
>> - if (!IS_ERR(folio))
>> - return folio_file_page(folio, swp_offset(entry));
>> + if (!IS_ERR(folio)) {
>> + page = folio_file_page(folio, swp_offset(entry));
>> + goto got_page;
>> + }
>> /*
>> * Just skip read ahead for unused swap slot.
>> @@ -445,8 +447,8 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
>> * as SWAP_HAS_CACHE. That's done in later part of code or
>> * else swap_off will be aborted if we return NULL.
>> */
>> - if (!__swp_swapcount(entry) && swap_slot_cache_enabled)
>> - return NULL;
>> + if (!swap_swapcount(si, entry) && swap_slot_cache_enabled)
>> + goto fail;
>> /*
>> * Get a new page to read into from swap. Allocate it now,
>> @@ -455,7 +457,7 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
>> */
>> folio = vma_alloc_folio(gfp_mask, 0, vma, addr, false);
>> if (!folio)
>> - return NULL;
>> + goto fail;
>> /*
>> * Swap entry may have been freed since our caller observed it.
>> @@ -466,7 +468,7 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
>> folio_put(folio);
>> if (err != -EEXIST)
>> - return NULL;
>> + goto fail;
>> /*
>> * We might race against __delete_from_swap_cache(), and
>> @@ -500,12 +502,17 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
>> /* Caller will initiate read into locked folio */
>> folio_add_lru(folio);
>> *new_page_allocated = true;
>> - return &folio->page;
>> + page = &folio->page;
>> +got_page:
>> + put_swap_device(si);
>> + return page;
>> fail_unlock:
>> put_swap_folio(folio, entry);
>> folio_unlock(folio);
>> folio_put(folio);
>> +fail:
>
> Maybe better "fail_put_swap".
>
> We now hold the swap device ref longer than we used to, prevent
> swapoff over the whole operation. I guess there is no way we can
> deadlock this way?
I think that we are safe. In swapoff() syscall, we call
percpu_ref_kill() after all pages are swapped in (via try_to_unuse()).
> In general, looks good to me.
Thanks!
Best Regards,
Huang, Ying
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH -V2 3/5] swap: remove __swp_swapcount()
2023-05-22 12:03 ` David Hildenbrand
@ 2023-05-23 0:50 ` Huang, Ying
0 siblings, 0 replies; 21+ messages in thread
From: Huang, Ying @ 2023-05-23 0:50 UTC (permalink / raw)
To: David Hildenbrand
Cc: Andrew Morton, linux-mm, linux-kernel, Hugh Dickins,
Johannes Weiner, Matthew Wilcox, Michal Hocko, Minchan Kim,
Tim Chen, Yang Shi, Yu Zhao
David Hildenbrand <david@redhat.com> writes:
> On 22.05.23 09:09, Huang Ying wrote:
>> __swp_swapcount() just encloses the calling to swap_swapcount() with
>> get/put_swap_device(). It is called in __read_swap_cache_async()
>> only, which encloses the calling with get/put_swap_device() already.
>> So, __read_swap_cache_async() can call swap_swapcount() directly.
>
> The previous patch contained the hunk
>
> - if (!__swp_swapcount(entry) && swap_slot_cache_enabled)
> - return NULL;
> + if (!swap_swapcount(si, entry) && swap_slot_cache_enabled)
> + goto fail;
>
> So something is a bit off here. Either that hunk should go here, or
> this patch description has to be adjusted.
>
>
> But I guess patch #2 doesn't compile on its own because this patch
> here adds swap_swapcount() to include/linux/swap.h ?
Good catch! Will change this in the next version.
Best Regards,
Huang, Ying
>> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>> Cc: David Hildenbrand <david@redhat.com>
>> Cc: Hugh Dickins <hughd@google.com>
>> Cc: Johannes Weiner <hannes@cmpxchg.org>
>> Cc: Matthew Wilcox <willy@infradead.org>
>> Cc: Michal Hocko <mhocko@suse.com>
>> Cc: Minchan Kim <minchan@kernel.org>
>> Cc: Tim Chen <tim.c.chen@linux.intel.com>
>> Cc: Yang Shi <shy828301@gmail.com>
>> Cc: Yu Zhao <yuzhao@google.com>
>> ---
>> include/linux/swap.h | 4 ++--
>> mm/swapfile.c | 20 +-------------------
>> 2 files changed, 3 insertions(+), 21 deletions(-)
>> diff --git a/include/linux/swap.h b/include/linux/swap.h
>> index 3c69cb653cb9..f6bd51aa05ea 100644
>> --- a/include/linux/swap.h
>> +++ b/include/linux/swap.h
>> @@ -512,7 +512,7 @@ int find_first_swap(dev_t *device);
>> extern unsigned int count_swap_pages(int, int);
>> extern sector_t swapdev_block(int, pgoff_t);
>> extern int __swap_count(swp_entry_t entry);
>> -extern int __swp_swapcount(swp_entry_t entry);
>> +extern int swap_swapcount(struct swap_info_struct *si, swp_entry_t entry);
>> extern int swp_swapcount(swp_entry_t entry);
>> extern struct swap_info_struct *page_swap_info(struct page *);
>> extern struct swap_info_struct *swp_swap_info(swp_entry_t entry);
>> @@ -590,7 +590,7 @@ static inline int __swap_count(swp_entry_t entry)
>> return 0;
>> }
>> -static inline int __swp_swapcount(swp_entry_t entry)
>> +static inline int swap_swapcount(struct swap_info_struct *si, swp_entry_t entry)
>> {
>> return 0;
>> }
>> diff --git a/mm/swapfile.c b/mm/swapfile.c
>> index 8419cba9c192..e9cce775fb25 100644
>> --- a/mm/swapfile.c
>> +++ b/mm/swapfile.c
>> @@ -1443,7 +1443,7 @@ int __swap_count(swp_entry_t entry)
>> * This does not give an exact answer when swap count is continued,
>> * but does include the high COUNT_CONTINUED flag to allow for that.
>> */
>> -static int swap_swapcount(struct swap_info_struct *si, swp_entry_t entry)
>> +int swap_swapcount(struct swap_info_struct *si, swp_entry_t entry)
>> {
>> pgoff_t offset = swp_offset(entry);
>> struct swap_cluster_info *ci;
>> @@ -1455,24 +1455,6 @@ static int swap_swapcount(struct swap_info_struct *si, swp_entry_t entry)
>> return count;
>> }
>> -/*
>> - * How many references to @entry are currently swapped out?
>> - * This does not give an exact answer when swap count is continued,
>> - * but does include the high COUNT_CONTINUED flag to allow for that.
>> - */
>> -int __swp_swapcount(swp_entry_t entry)
>> -{
>> - int count = 0;
>> - struct swap_info_struct *si;
>> -
>> - si = get_swap_device(entry);
>> - if (si) {
>> - count = swap_swapcount(si, entry);
>> - put_swap_device(si);
>> - }
>> - return count;
>> -}
>> -
>> /*
>> * How many references to @entry are currently swapped out?
>> * This considers COUNT_CONTINUED so it returns exact answer.
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH -V2 4/5] swap: remove get/put_swap_device() in __swap_duplicate()
2023-05-22 12:06 ` David Hildenbrand
@ 2023-05-23 0:56 ` Huang, Ying
2023-05-23 7:59 ` David Hildenbrand
0 siblings, 1 reply; 21+ messages in thread
From: Huang, Ying @ 2023-05-23 0:56 UTC (permalink / raw)
To: David Hildenbrand
Cc: Andrew Morton, linux-mm, linux-kernel, Hugh Dickins,
Johannes Weiner, Matthew Wilcox, Michal Hocko, Minchan Kim,
Tim Chen, Yang Shi, Yu Zhao
David Hildenbrand <david@redhat.com> writes:
> On 22.05.23 09:09, Huang Ying wrote:
>> __swap_duplicate() is called by
>> - swap_shmem_alloc(): the page lock of the swap cache is held.
>
> page lock of the swap cache? Did you really mean to say that or am I
> confused?
>
> "Page lock of the page that is in the swap cache?"
Sorry for my poor English. Or make it shorter?
"the folio in the swap cache is locked"
Best Regards,
Huang, Ying
>> - copy_nonpresent_pte() -> swap_duplicate() and try_to_unmap_one()
>> ->
>> swap_duplicate(): the page table lock is held.
>> - __read_swap_cache_async() -> swapcache_prepare(): enclosed with
>> get/put_swap_device() in __read_swap_cache_async() already.
>> So, it's safe to remove get/put_swap_device() in __swap_duplicate().
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH -V2 5/5] swap: comments get_swap_device() with usage rule
2023-05-22 12:07 ` David Hildenbrand
@ 2023-05-23 1:00 ` Huang, Ying
0 siblings, 0 replies; 21+ messages in thread
From: Huang, Ying @ 2023-05-23 1:00 UTC (permalink / raw)
To: David Hildenbrand
Cc: Andrew Morton, linux-mm, linux-kernel, Hugh Dickins,
Johannes Weiner, Matthew Wilcox, Michal Hocko, Minchan Kim,
Tim Chen, Yang Shi, Yu Zhao
David Hildenbrand <david@redhat.com> writes:
> On 22.05.23 09:09, Huang Ying wrote:
>> The general rule to use a swap entry is as follows.
>> When we get a swap entry, if there isn't some other way to prevent
>> swapoff, such as page lock for swap cache, page table lock, etc., the
>> swap entry may become invalid because of swapoff. Then, we need to
>> enclose all swap related functions with get_swap_device() and
>> put_swap_device(), unless the swap functions call
>> get/put_swap_device() by themselves.
>> Add the rule as comments of get_swap_device().
>> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>> Cc: David Hildenbrand <david@redhat.com>
>> Cc: Hugh Dickins <hughd@google.com>
>> Cc: Johannes Weiner <hannes@cmpxchg.org>
>> Cc: Matthew Wilcox <willy@infradead.org>
>> Cc: Michal Hocko <mhocko@suse.com>
>> Cc: Minchan Kim <minchan@kernel.org>
>> Cc: Tim Chen <tim.c.chen@linux.intel.com>
>> Cc: Yang Shi <shy828301@gmail.com>
>> Cc: Yu Zhao <yuzhao@google.com>
>> ---
>> mm/swapfile.c | 12 +++++++++---
>> 1 file changed, 9 insertions(+), 3 deletions(-)
>> diff --git a/mm/swapfile.c b/mm/swapfile.c
>> index 4dbaea64635d..0c1cb935b2eb 100644
>> --- a/mm/swapfile.c
>> +++ b/mm/swapfile.c
>> @@ -1219,6 +1219,13 @@ static unsigned char __swap_entry_free_locked(struct swap_info_struct *p,
>> }
>> /*
>> + * When we get a swap entry, if there isn't some other way to prevent
>> + * swapoff, such as page lock for swap cache, page table lock, etc.,
>
> Again "page lock for swap cache" might be imprecise.
Sure. Will revise this.
>> + * the swap entry may become invalid because of swapoff. Then, we
>> + * need to enclose all swap related functions with get_swap_device()
>> + * and put_swap_device(), unless the swap functions call
>> + * get/put_swap_device() by themselves.
>> + *
>> * Check whether swap entry is valid in the swap device. If so,
>> * return pointer to swap_info_struct, and keep the swap entry valid
>> * via preventing the swap device from being swapoff, until
>> @@ -1227,9 +1234,8 @@ static unsigned char __swap_entry_free_locked(struct swap_info_struct *p,
>> * Notice that swapoff or swapoff+swapon can still happen before the
>> * percpu_ref_tryget_live() in get_swap_device() or after the
>> * percpu_ref_put() in put_swap_device() if there isn't any other way
>> - * to prevent swapoff, such as page lock, page table lock, etc. The
>> - * caller must be prepared for that. For example, the following
>> - * situation is possible.
>> + * to prevent swapoff. The caller must be prepared for that. For
>> + * example, the following situation is possible.
>> *
>> * CPU1 CPU2
>> * do_swap_page()
>
> Reviewed-by: David Hildenbrand <david@redhat.com>
Thanks!
Best Regards,
Huang, Ying
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH -V2 1/5] swap: Remove get/put_swap_device() in __swap_count()
2023-05-22 7:09 ` [PATCH -V2 1/5] swap: Remove get/put_swap_device() in __swap_count() Huang Ying
2023-05-22 11:54 ` David Hildenbrand
@ 2023-05-23 1:22 ` Yosry Ahmed
2023-05-23 1:47 ` Huang, Ying
1 sibling, 1 reply; 21+ messages in thread
From: Yosry Ahmed @ 2023-05-23 1:22 UTC (permalink / raw)
To: Huang Ying
Cc: Andrew Morton, linux-mm, linux-kernel, David Hildenbrand,
Hugh Dickins, Johannes Weiner, Matthew Wilcox, Michal Hocko,
Minchan Kim, Tim Chen, Yang Shi, Yu Zhao
On Mon, May 22, 2023 at 12:09 AM Huang Ying <ying.huang@intel.com> wrote:
>
> __swap_count() is called in do_swap_page() only, which encloses the
> call site with get/put_swap_device() already.
>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Tim Chen <tim.c.chen@linux.intel.com>
> Cc: Yang Shi <shy828301@gmail.com>
> Cc: Yu Zhao <yuzhao@google.com>
> ---
> mm/swapfile.c | 10 ++--------
> 1 file changed, 2 insertions(+), 8 deletions(-)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 274bbf797480..8419cba9c192 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1432,16 +1432,10 @@ void swapcache_free_entries(swp_entry_t *entries, int n)
>
nit: I would add a comment here that the caller needs get/put_swap_device().
Reviewed-by: Yosry Ahmed <yosryahmed@google.com>
> int __swap_count(swp_entry_t entry)
> {
> - struct swap_info_struct *si;
> + struct swap_info_struct *si = swp_swap_info(entry);
> pgoff_t offset = swp_offset(entry);
> - int count = 0;
>
> - si = get_swap_device(entry);
> - if (si) {
> - count = swap_count(si->swap_map[offset]);
> - put_swap_device(si);
> - }
> - return count;
> + return swap_count(si->swap_map[offset]);
> }
>
> /*
> --
> 2.39.2
>
>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH -V2 5/5] swap: comments get_swap_device() with usage rule
2023-05-22 7:09 ` [PATCH -V2 5/5] swap: comments get_swap_device() with usage rule Huang Ying
2023-05-22 12:07 ` David Hildenbrand
@ 2023-05-23 1:37 ` Yosry Ahmed
1 sibling, 0 replies; 21+ messages in thread
From: Yosry Ahmed @ 2023-05-23 1:37 UTC (permalink / raw)
To: Huang Ying
Cc: Andrew Morton, linux-mm, linux-kernel, David Hildenbrand,
Hugh Dickins, Johannes Weiner, Matthew Wilcox, Michal Hocko,
Minchan Kim, Tim Chen, Yang Shi, Yu Zhao
On Mon, May 22, 2023 at 12:09 AM Huang Ying <ying.huang@intel.com> wrote:
>
> The general rule to use a swap entry is as follows.
>
> When we get a swap entry, if there isn't some other way to prevent
> swapoff, such as page lock for swap cache, page table lock, etc., the
> swap entry may become invalid because of swapoff. Then, we need to
> enclose all swap related functions with get_swap_device() and
> put_swap_device(), unless the swap functions call
> get/put_swap_device() by themselves.
>
> Add the rule as comments of get_swap_device().
>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Tim Chen <tim.c.chen@linux.intel.com>
> Cc: Yang Shi <shy828301@gmail.com>
> Cc: Yu Zhao <yuzhao@google.com>
> ---
> mm/swapfile.c | 12 +++++++++---
> 1 file changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 4dbaea64635d..0c1cb935b2eb 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -1219,6 +1219,13 @@ static unsigned char __swap_entry_free_locked(struct swap_info_struct *p,
> }
>
> /*
> + * When we get a swap entry, if there isn't some other way to prevent
> + * swapoff, such as page lock for swap cache, page table lock, etc.,
> + * the swap entry may become invalid because of swapoff. Then, we
> + * need to enclose all swap related functions with get_swap_device()
> + * and put_swap_device(), unless the swap functions call
> + * get/put_swap_device() by themselves.
> + *
> * Check whether swap entry is valid in the swap device. If so,
> * return pointer to swap_info_struct, and keep the swap entry valid
> * via preventing the swap device from being swapoff, until
> @@ -1227,9 +1234,8 @@ static unsigned char __swap_entry_free_locked(struct swap_info_struct *p,
> * Notice that swapoff or swapoff+swapon can still happen before the
> * percpu_ref_tryget_live() in get_swap_device() or after the
> * percpu_ref_put() in put_swap_device() if there isn't any other way
> - * to prevent swapoff, such as page lock, page table lock, etc. The
> - * caller must be prepared for that. For example, the following
> - * situation is possible.
> + * to prevent swapoff. The caller must be prepared for that. For
> + * example, the following situation is possible.
> *
> * CPU1 CPU2
> * do_swap_page()
> --
> 2.39.2
>
>
Thanks for clarifying the code!
With David's comments:
Reviewed-by: Yosry Ahmed <yosryahmed@google.com>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH -V2 4/5] swap: remove get/put_swap_device() in __swap_duplicate()
2023-05-22 7:09 ` [PATCH -V2 4/5] swap: remove get/put_swap_device() in __swap_duplicate() Huang Ying
2023-05-22 12:06 ` David Hildenbrand
@ 2023-05-23 1:39 ` Yosry Ahmed
1 sibling, 0 replies; 21+ messages in thread
From: Yosry Ahmed @ 2023-05-23 1:39 UTC (permalink / raw)
To: Huang Ying
Cc: Andrew Morton, linux-mm, linux-kernel, David Hildenbrand,
Hugh Dickins, Johannes Weiner, Matthew Wilcox, Michal Hocko,
Minchan Kim, Tim Chen, Yang Shi, Yu Zhao
On Mon, May 22, 2023 at 12:09 AM Huang Ying <ying.huang@intel.com> wrote:
>
> __swap_duplicate() is called by
>
> - swap_shmem_alloc(): the page lock of the swap cache is held.
>
> - copy_nonpresent_pte() -> swap_duplicate() and try_to_unmap_one() ->
> swap_duplicate(): the page table lock is held.
>
> - __read_swap_cache_async() -> swapcache_prepare(): enclosed with
> get/put_swap_device() in __read_swap_cache_async() already.
>
> So, it's safe to remove get/put_swap_device() in __swap_duplicate().
>
> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Tim Chen <tim.c.chen@linux.intel.com>
> Cc: Yang Shi <shy828301@gmail.com>
> Cc: Yu Zhao <yuzhao@google.com>
> ---
> mm/swapfile.c | 5 +----
> 1 file changed, 1 insertion(+), 4 deletions(-)
>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index e9cce775fb25..4dbaea64635d 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -3264,9 +3264,7 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage)
I would add a comment above this function stating that the caller
needs to provide protection against swapoff, and refer to the comment
above get_swap_device().
Otherwise, LGTM with David's comment.
Reviewed-by: Yosry Ahmed <yosryahmed@google.com>
> unsigned char has_cache;
> int err;
>
> - p = get_swap_device(entry);
> - if (!p)
> - return -EINVAL;
> + p = swp_swap_info(entry);
>
> offset = swp_offset(entry);
> ci = lock_cluster_or_swap_info(p, offset);
> @@ -3313,7 +3311,6 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage)
>
> unlock_out:
> unlock_cluster_or_swap_info(p, ci);
> - put_swap_device(p);
> return err;
> }
>
> --
> 2.39.2
>
>
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH -V2 1/5] swap: Remove get/put_swap_device() in __swap_count()
2023-05-23 1:22 ` Yosry Ahmed
@ 2023-05-23 1:47 ` Huang, Ying
2023-05-23 1:51 ` Yosry Ahmed
0 siblings, 1 reply; 21+ messages in thread
From: Huang, Ying @ 2023-05-23 1:47 UTC (permalink / raw)
To: Yosry Ahmed
Cc: Andrew Morton, linux-mm, linux-kernel, David Hildenbrand,
Hugh Dickins, Johannes Weiner, Matthew Wilcox, Michal Hocko,
Minchan Kim, Tim Chen, Yang Shi, Yu Zhao
Yosry Ahmed <yosryahmed@google.com> writes:
> On Mon, May 22, 2023 at 12:09 AM Huang Ying <ying.huang@intel.com> wrote:
>>
>> __swap_count() is called in do_swap_page() only, which encloses the
>> call site with get/put_swap_device() already.
>>
>> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
>> Cc: David Hildenbrand <david@redhat.com>
>> Cc: Hugh Dickins <hughd@google.com>
>> Cc: Johannes Weiner <hannes@cmpxchg.org>
>> Cc: Matthew Wilcox <willy@infradead.org>
>> Cc: Michal Hocko <mhocko@suse.com>
>> Cc: Minchan Kim <minchan@kernel.org>
>> Cc: Tim Chen <tim.c.chen@linux.intel.com>
>> Cc: Yang Shi <shy828301@gmail.com>
>> Cc: Yu Zhao <yuzhao@google.com>
>> ---
>> mm/swapfile.c | 10 ++--------
>> 1 file changed, 2 insertions(+), 8 deletions(-)
>>
>> diff --git a/mm/swapfile.c b/mm/swapfile.c
>> index 274bbf797480..8419cba9c192 100644
>> --- a/mm/swapfile.c
>> +++ b/mm/swapfile.c
>> @@ -1432,16 +1432,10 @@ void swapcache_free_entries(swp_entry_t *entries, int n)
>>
>
> nit: I would add a comment here that the caller needs get/put_swap_device().
It's default behavior to call get/put_swap_device() in the caller for
all almost all swap functions. I would rather comment the swap
functions needn't to do that, as the comments for
read_swap_cache_async() in [2/5].
> Reviewed-by: Yosry Ahmed <yosryahmed@google.com>
Thanks!
>> int __swap_count(swp_entry_t entry)
>> {
>> - struct swap_info_struct *si;
>> + struct swap_info_struct *si = swp_swap_info(entry);
>> pgoff_t offset = swp_offset(entry);
>> - int count = 0;
>>
>> - si = get_swap_device(entry);
>> - if (si) {
>> - count = swap_count(si->swap_map[offset]);
>> - put_swap_device(si);
>> - }
>> - return count;
>> + return swap_count(si->swap_map[offset]);
>> }
>>
>> /*
Best Regards,
Huang, Ying
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH -V2 1/5] swap: Remove get/put_swap_device() in __swap_count()
2023-05-23 1:47 ` Huang, Ying
@ 2023-05-23 1:51 ` Yosry Ahmed
0 siblings, 0 replies; 21+ messages in thread
From: Yosry Ahmed @ 2023-05-23 1:51 UTC (permalink / raw)
To: Huang, Ying
Cc: Andrew Morton, linux-mm, linux-kernel, David Hildenbrand,
Hugh Dickins, Johannes Weiner, Matthew Wilcox, Michal Hocko,
Minchan Kim, Tim Chen, Yang Shi, Yu Zhao
On Mon, May 22, 2023 at 6:48 PM Huang, Ying <ying.huang@intel.com> wrote:
>
> Yosry Ahmed <yosryahmed@google.com> writes:
>
> > On Mon, May 22, 2023 at 12:09 AM Huang Ying <ying.huang@intel.com> wrote:
> >>
> >> __swap_count() is called in do_swap_page() only, which encloses the
> >> call site with get/put_swap_device() already.
> >>
> >> Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
> >> Cc: David Hildenbrand <david@redhat.com>
> >> Cc: Hugh Dickins <hughd@google.com>
> >> Cc: Johannes Weiner <hannes@cmpxchg.org>
> >> Cc: Matthew Wilcox <willy@infradead.org>
> >> Cc: Michal Hocko <mhocko@suse.com>
> >> Cc: Minchan Kim <minchan@kernel.org>
> >> Cc: Tim Chen <tim.c.chen@linux.intel.com>
> >> Cc: Yang Shi <shy828301@gmail.com>
> >> Cc: Yu Zhao <yuzhao@google.com>
> >> ---
> >> mm/swapfile.c | 10 ++--------
> >> 1 file changed, 2 insertions(+), 8 deletions(-)
> >>
> >> diff --git a/mm/swapfile.c b/mm/swapfile.c
> >> index 274bbf797480..8419cba9c192 100644
> >> --- a/mm/swapfile.c
> >> +++ b/mm/swapfile.c
> >> @@ -1432,16 +1432,10 @@ void swapcache_free_entries(swp_entry_t *entries, int n)
> >>
> >
> > nit: I would add a comment here that the caller needs get/put_swap_device().
>
> It's default behavior to call get/put_swap_device() in the caller for
> all almost all swap functions. I would rather comment the swap
> functions needn't to do that, as the comments for
> read_swap_cache_async() in [2/5].
Fair enough. The comment that you added above get_swap_device() states
that all swap-related functions should have some sort of protection
against swapoff, so I guess that's sufficient.
My concern is that sometimes people don't know where to look, and
having a comment above the function might be helpful. I do agree
though that having a comment above ~all swap-related functions
pointing to the comment above get_swap_device() is too much.
>
> > Reviewed-by: Yosry Ahmed <yosryahmed@google.com>
>
> Thanks!
>
> >> int __swap_count(swp_entry_t entry)
> >> {
> >> - struct swap_info_struct *si;
> >> + struct swap_info_struct *si = swp_swap_info(entry);
> >> pgoff_t offset = swp_offset(entry);
> >> - int count = 0;
> >>
> >> - si = get_swap_device(entry);
> >> - if (si) {
> >> - count = swap_count(si->swap_map[offset]);
> >> - put_swap_device(si);
> >> - }
> >> - return count;
> >> + return swap_count(si->swap_map[offset]);
> >> }
> >>
> >> /*
>
> Best Regards,
> Huang, Ying
^ permalink raw reply [flat|nested] 21+ messages in thread
* Re: [PATCH -V2 4/5] swap: remove get/put_swap_device() in __swap_duplicate()
2023-05-23 0:56 ` Huang, Ying
@ 2023-05-23 7:59 ` David Hildenbrand
0 siblings, 0 replies; 21+ messages in thread
From: David Hildenbrand @ 2023-05-23 7:59 UTC (permalink / raw)
To: Huang, Ying
Cc: Andrew Morton, linux-mm, linux-kernel, Hugh Dickins,
Johannes Weiner, Matthew Wilcox, Michal Hocko, Minchan Kim,
Tim Chen, Yang Shi, Yu Zhao
On 23.05.23 02:56, Huang, Ying wrote:
> David Hildenbrand <david@redhat.com> writes:
>
>> On 22.05.23 09:09, Huang Ying wrote:
>>> __swap_duplicate() is called by
>>> - swap_shmem_alloc(): the page lock of the swap cache is held.
>>
>> page lock of the swap cache? Did you really mean to say that or am I
>> confused?
>>
>> "Page lock of the page that is in the swap cache?"
>
> Sorry for my poor English. Or make it shorter?
>
> "the folio in the swap cache is locked"
Much clearer, thanks.
--
Thanks,
David / dhildenb
^ permalink raw reply [flat|nested] 21+ messages in thread
end of thread, other threads:[~2023-05-23 8:00 UTC | newest]
Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-05-22 7:09 [PATCH -V2 0/5] swap: cleanup get/put_swap_device() usage Huang Ying
2023-05-22 7:09 ` [PATCH -V2 1/5] swap: Remove get/put_swap_device() in __swap_count() Huang Ying
2023-05-22 11:54 ` David Hildenbrand
2023-05-23 1:22 ` Yosry Ahmed
2023-05-23 1:47 ` Huang, Ying
2023-05-23 1:51 ` Yosry Ahmed
2023-05-22 7:09 ` [PATCH -V2 2/5] swap, __read_swap_cache_async(): enlarge get/put_swap_device protection range Huang Ying
2023-05-22 12:01 ` David Hildenbrand
2023-05-23 0:43 ` Huang, Ying
2023-05-22 7:09 ` [PATCH -V2 3/5] swap: remove __swp_swapcount() Huang Ying
2023-05-22 12:03 ` David Hildenbrand
2023-05-23 0:50 ` Huang, Ying
2023-05-22 7:09 ` [PATCH -V2 4/5] swap: remove get/put_swap_device() in __swap_duplicate() Huang Ying
2023-05-22 12:06 ` David Hildenbrand
2023-05-23 0:56 ` Huang, Ying
2023-05-23 7:59 ` David Hildenbrand
2023-05-23 1:39 ` Yosry Ahmed
2023-05-22 7:09 ` [PATCH -V2 5/5] swap: comments get_swap_device() with usage rule Huang Ying
2023-05-22 12:07 ` David Hildenbrand
2023-05-23 1:00 ` Huang, Ying
2023-05-23 1:37 ` Yosry Ahmed
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox