* [PATCH V3] mm/gup: Clear the LRU flag of a page before adding to LRU batch
From: yangge1116 @ 2024-07-03 12:02 UTC
To: akpm
Cc: linux-mm, linux-kernel, stable, 21cnbao, david, baolin.wang,
aneesh.kumar, liuzixing, yangge
From: yangge <yangge1116@126.com>
If a large amount of CMA memory is configured in the system (for
example, CMA memory accounts for 50% of total system memory), starting a
virtual machine with device passthrough will call
pin_user_pages_remote(..., FOLL_LONGTERM, ...) to pin memory. Normally,
if a page is present and resides in a CMA area, pin_user_pages_remote()
will migrate it from the CMA area to a non-CMA area because of the
FOLL_LONGTERM flag. But the current code can make this migration fail
due to an unexpected page refcount, which eventually causes the virtual
machine to fail to start.
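For illustration only (this sketch is not part of the patch, and the exact
signature of pin_user_pages_remote() varies between kernel versions), the
long-term pin on the passthrough path is taken roughly like this:

static long pin_guest_range(struct mm_struct *mm, unsigned long uaddr,
			    unsigned long npages, struct page **pages)
{
	long pinned;
	int locked = 1;

	mmap_read_lock(mm);
	/* FOLL_LONGTERM forces CMA/movable pages to be migrated first. */
	pinned = pin_user_pages_remote(mm, uaddr, npages,
				       FOLL_WRITE | FOLL_LONGTERM,
				       pages, &locked);
	if (locked)
		mmap_read_unlock(mm);
	return pinned;
}

If that internal migration fails, the pin fails and the VM cannot start.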
When a page is added to an LRU batch, its refcount is increased by one;
removing it from the batch decreases the refcount by one. Page migration
requires that the page not be referenced by anything other than its page
mapping. Before migrating a page, we should therefore try to drain it
from the LRU batch in case it is held there; however, folio_test_lru()
is not sufficient to tell whether the page is in an LRU batch. If the
page is still in an LRU batch, the migration will fail.
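For reference, the FOLL_LONGTERM path only drains the per-CPU batches when
folio_test_lru() is clear, roughly like the simplified sketch below (not
verbatim kernel code); before this patch a folio could keep PG_lru set
while still holding an extra reference from a batch, so the drain was
skipped and migration later failed on the unexpected refcount:

static void maybe_drain_for_migration(struct folio *folio, bool *drain_allow)
{
	/*
	 * A clear LRU flag is taken as a hint that the folio may be
	 * sitting in a per-CPU LRU batch; drain once and retry.
	 */
	if (!folio_test_lru(folio) && *drain_allow) {
		lru_add_drain_all();
		*drain_allow = false;
	}
}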
To solve the problem above, change the logic of adding a page to an LRU
batch: clear the LRU flag of the page before adding it to the batch, so
that folio_test_lru() can be used to check whether the page is currently
in an LRU batch. Keeping the LRU flag invisible for a longer time seems
harmless: a new page allocated from the buddy allocator and added to an
LRU batch likewise has its LRU flag invisible for a long time.
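Each call site then follows the same pattern; a condensed view, mirroring
the resulting folio_activate() in the diff below:

void folio_activate(struct folio *folio)
{
	if (!folio_test_active(folio) && !folio_test_unevictable(folio)) {
		struct folio_batch *fbatch;

		folio_get(folio);
		/* Claim the folio by clearing PG_lru; back off if we lose. */
		if (!folio_test_clear_lru(folio)) {
			folio_put(folio);
			return;
		}

		local_lock(&cpu_fbatches.lock);
		fbatch = this_cpu_ptr(&cpu_fbatches.activate);
		folio_batch_add_and_move(fbatch, folio, folio_activate_fn);
		local_unlock(&cpu_fbatches.lock);
	}
}

With this in place, a folio that still has its LRU flag set cannot be
hiding in a per-CPU batch, so folio_test_lru() becomes a reliable check.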
Fixes: 9a4e9f3b2d73 ("mm: update get_user_pages_longterm to migrate pages allocated from CMA region")
Cc: <stable@vger.kernel.org>
Signed-off-by: yangge <yangge1116@126.com>
---
mm/swap.c | 43 +++++++++++++++++++++++++++++++------------
1 file changed, 31 insertions(+), 12 deletions(-)
V3:
Add fixes tag
V2:
Adjust code and commit message according to David's comments
diff --git a/mm/swap.c b/mm/swap.c
index dc205bd..9caf6b0 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -211,10 +211,6 @@ static void folio_batch_move_lru(struct folio_batch *fbatch, move_fn_t move_fn)
 	for (i = 0; i < folio_batch_count(fbatch); i++) {
 		struct folio *folio = fbatch->folios[i];
-		/* block memcg migration while the folio moves between lru */
-		if (move_fn != lru_add_fn && !folio_test_clear_lru(folio))
-			continue;
-
 		folio_lruvec_relock_irqsave(folio, &lruvec, &flags);
 		move_fn(lruvec, folio);
@@ -255,11 +251,16 @@ static void lru_move_tail_fn(struct lruvec *lruvec, struct folio *folio)
 void folio_rotate_reclaimable(struct folio *folio)
 {
 	if (!folio_test_locked(folio) && !folio_test_dirty(folio) &&
-	    !folio_test_unevictable(folio) && folio_test_lru(folio)) {
+	    !folio_test_unevictable(folio)) {
 		struct folio_batch *fbatch;
 		unsigned long flags;
 		folio_get(folio);
+		if (!folio_test_clear_lru(folio)) {
+			folio_put(folio);
+			return;
+		}
+
 		local_lock_irqsave(&lru_rotate.lock, flags);
 		fbatch = this_cpu_ptr(&lru_rotate.fbatch);
 		folio_batch_add_and_move(fbatch, folio, lru_move_tail_fn);
@@ -352,11 +353,15 @@ static void folio_activate_drain(int cpu)
 void folio_activate(struct folio *folio)
 {
-	if (folio_test_lru(folio) && !folio_test_active(folio) &&
-	    !folio_test_unevictable(folio)) {
+	if (!folio_test_active(folio) && !folio_test_unevictable(folio)) {
 		struct folio_batch *fbatch;
 		folio_get(folio);
+		if (!folio_test_clear_lru(folio)) {
+			folio_put(folio);
+			return;
+		}
+
 		local_lock(&cpu_fbatches.lock);
 		fbatch = this_cpu_ptr(&cpu_fbatches.activate);
 		folio_batch_add_and_move(fbatch, folio, folio_activate_fn);
@@ -700,6 +705,11 @@ void deactivate_file_folio(struct folio *folio)
 		return;
 	folio_get(folio);
+	if (!folio_test_clear_lru(folio)) {
+		folio_put(folio);
+		return;
+	}
+
 	local_lock(&cpu_fbatches.lock);
 	fbatch = this_cpu_ptr(&cpu_fbatches.lru_deactivate_file);
 	folio_batch_add_and_move(fbatch, folio, lru_deactivate_file_fn);
@@ -716,11 +726,16 @@ void deactivate_file_folio(struct folio *folio)
  */
 void folio_deactivate(struct folio *folio)
 {
-	if (folio_test_lru(folio) && !folio_test_unevictable(folio) &&
-	    (folio_test_active(folio) || lru_gen_enabled())) {
+	if (!folio_test_unevictable(folio) && (folio_test_active(folio) ||
+	    lru_gen_enabled())) {
 		struct folio_batch *fbatch;
 		folio_get(folio);
+		if (!folio_test_clear_lru(folio)) {
+			folio_put(folio);
+			return;
+		}
+
 		local_lock(&cpu_fbatches.lock);
 		fbatch = this_cpu_ptr(&cpu_fbatches.lru_deactivate);
 		folio_batch_add_and_move(fbatch, folio, lru_deactivate_fn);
@@ -737,12 +752,16 @@ void folio_deactivate(struct folio *folio)
  */
 void folio_mark_lazyfree(struct folio *folio)
 {
-	if (folio_test_lru(folio) && folio_test_anon(folio) &&
-	    folio_test_swapbacked(folio) && !folio_test_swapcache(folio) &&
-	    !folio_test_unevictable(folio)) {
+	if (folio_test_anon(folio) && folio_test_swapbacked(folio) &&
+	    !folio_test_swapcache(folio) && !folio_test_unevictable(folio)) {
 		struct folio_batch *fbatch;
 		folio_get(folio);
+		if (!folio_test_clear_lru(folio)) {
+			folio_put(folio);
+			return;
+		}
+
 		local_lock(&cpu_fbatches.lock);
 		fbatch = this_cpu_ptr(&cpu_fbatches.lru_lazyfree);
 		folio_batch_add_and_move(fbatch, folio, lru_lazyfree_fn);
--
2.7.4
* Re: [PATCH V3] mm/gup: Clear the LRU flag of a page before adding to LRU batch
From: Andrew Morton @ 2024-07-03 20:08 UTC
To: yangge1116
Cc: linux-mm, linux-kernel, stable, 21cnbao, david, baolin.wang,
aneesh.kumar, liuzixing
On Wed, 3 Jul 2024 20:02:33 +0800 yangge1116@126.com wrote:
> From: yangge <yangge1116@126.com>
>
> If a large amount of CMA memory is configured in the system (for
> example, CMA memory accounts for 50% of total system memory), starting a
> virtual machine with device passthrough will call
> pin_user_pages_remote(..., FOLL_LONGTERM, ...) to pin memory. Normally,
> if a page is present and resides in a CMA area, pin_user_pages_remote()
> will migrate it from the CMA area to a non-CMA area because of the
> FOLL_LONGTERM flag. But the current code can make this migration fail
> due to an unexpected page refcount, which eventually causes the virtual
> machine to fail to start.
>
> When a page is added to an LRU batch, its refcount is increased by one;
> removing it from the batch decreases the refcount by one. Page migration
> requires that the page not be referenced by anything other than its page
> mapping. Before migrating a page, we should therefore try to drain it
> from the LRU batch in case it is held there; however, folio_test_lru()
> is not sufficient to tell whether the page is in an LRU batch. If the
> page is still in an LRU batch, the migration will fail.
>
> To solve the problem above, change the logic of adding a page to an LRU
> batch: clear the LRU flag of the page before adding it to the batch, so
> that folio_test_lru() can be used to check whether the page is currently
> in an LRU batch. Keeping the LRU flag invisible for a longer time seems
> harmless: a new page allocated from the buddy allocator and added to an
> LRU batch likewise has its LRU flag invisible for a long time.
>
Thanks.
I'll add this to the mm-hotfixes branch for additional testing. Please
continue to work with David on the changelog enhancements.
In mm-hotfixes I'd expect to send it to Linus next week. I could move
it into mm-unstable (then mm-stable) for merging into 6.11-rc1. This
is for additional testing time - it will still be backported into
earlier kernels. We can do this with any patch.
* Re: [PATCH V3] mm/gup: Clear the LRU flag of a page before adding to LRU batch
From: Ge Yang @ 2024-07-04 0:57 UTC
To: Andrew Morton
Cc: linux-mm, linux-kernel, stable, 21cnbao, david, baolin.wang,
aneesh.kumar, liuzixing
On 2024/7/4 4:08, Andrew Morton wrote:
> On Wed, 3 Jul 2024 20:02:33 +0800 yangge1116@126.com wrote:
>
>> From: yangge <yangge1116@126.com>
>>
>> If a large amount of CMA memory is configured in the system (for
>> example, CMA memory accounts for 50% of total system memory), starting a
>> virtual machine with device passthrough will call
>> pin_user_pages_remote(..., FOLL_LONGTERM, ...) to pin memory. Normally,
>> if a page is present and resides in a CMA area, pin_user_pages_remote()
>> will migrate it from the CMA area to a non-CMA area because of the
>> FOLL_LONGTERM flag. But the current code can make this migration fail
>> due to an unexpected page refcount, which eventually causes the virtual
>> machine to fail to start.
>>
>> When a page is added to an LRU batch, its refcount is increased by one;
>> removing it from the batch decreases the refcount by one. Page migration
>> requires that the page not be referenced by anything other than its page
>> mapping. Before migrating a page, we should therefore try to drain it
>> from the LRU batch in case it is held there; however, folio_test_lru()
>> is not sufficient to tell whether the page is in an LRU batch. If the
>> page is still in an LRU batch, the migration will fail.
>>
>> To solve the problem above, change the logic of adding a page to an LRU
>> batch: clear the LRU flag of the page before adding it to the batch, so
>> that folio_test_lru() can be used to check whether the page is currently
>> in an LRU batch. Keeping the LRU flag invisible for a longer time seems
>> harmless: a new page allocated from the buddy allocator and added to an
>> LRU batch likewise has its LRU flag invisible for a long time.
>>
>
> Thanks.
>
> I'll add this to the mm-hotfixes branch for additional testing. Please
> continue to work with David on the changelog enhancements.
>
> In mm-hotfixes I'd expect to send it to Linus next week. I could move
> it into mm-unstable (then mm-stable) for merging into 6.11-rc1. This
> is for additional testing time - it will still be backported into
> earlier kernels. We can do this with any patch.
Ok, thanks.