linux-mm.kvack.org archive mirror
From: Mel Gorman <mgorman@techsingularity.net>
To: Hugh Dickins <hughd@google.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Nicolas Saenz Julienne <nsaenzju@redhat.com>,
	Marcelo Tosatti <mtosatti@redhat.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Michal Hocko <mhocko@kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux-MM <linux-mm@kvack.org>
Subject: Re: [PATCH 5/6] mm/page_alloc: Protect PCP lists with a spinlock
Date: Tue, 24 May 2022 13:12:24 +0100	[thread overview]
Message-ID: <20220524121224.GY3441@techsingularity.net> (raw)
In-Reply-To: <554f4cdf-e4d9-f547-d3bb-1bcc1c9eb1@google.com>

On Sat, May 21, 2022 at 07:49:10PM -0700, Hugh Dickins wrote:
> On Mon, 9 May 2022, Mel Gorman wrote:
> 
> > Currently the PCP lists are protected by using local_lock_irqsave to
> > prevent migration and IRQ reentrancy but this is inconvenient. Remote
> > draining of the lists is impossible and a workqueue is required and
> > every task allocation/free must disable then enable interrupts which is
> > expensive.
> > 
> > As preparation for dealing with both of those problems, protect the
> > lists with a spinlock. The IRQ-unsafe version of the lock is used
> > because IRQs are already disabled by local_lock_irqsave. spin_trylock
> > is used in preparation for a time when local_lock could be used instead
> > of local_lock_irqsave.
> 
> 8c580f60a145 ("mm/page_alloc: protect PCP lists with a spinlock")
> in next-20220520: I haven't looked up whether that comes from a
> stable or unstable suburb of akpm's tree.
> 
> Mel, the VM_BUG_ON(in_hardirq()) which this adds to free_unref_page_list() 
> is not valid.  I have no appreciation of how important it is to the whole
> scheme, but as it stands, it crashes; and when I change it to a warning
> 

Thanks Hugh. Sorry for the delay in responding, I was offline for a few
days. The IRQ context that free_unref_page_list is called from is safe
and the VM_BUG_ON can be removed.

--8<--
mm/page_alloc: Protect PCP lists with a spinlock -fix

Hugh Dickins reported the following problem:

[  256.167040] WARNING: CPU: 0 PID: 9842 at mm/page_alloc.c:3478 free_unref_page_list+0x92/0x343
[  256.170031] CPU: 0 PID: 9842 Comm: cc1 Not tainted 5.18.0-rc7-n20 #3
[  256.171285] Hardware name: LENOVO 20HQS0EG02/20HQS0EG02, BIOS N1MET54W (1.39 ) 04/16/2019
[  256.172555] RIP: 0010:free_unref_page_list+0x92/0x343
[  256.173820] Code: ff ff 49 8b 44 24 08 4d 89 e0 4c 8d 60 f8 eb b6 48 8b 03 48 39 c3 0f 84 af 02 00 00 65 8b 05 72 7f df 7e a9 00 00 0f 00 74 02 <0f> 0b 9c 41 5d fa 41 0f ba e5 09 73 05 e8 1f 0a f9 ff e8 46 90 7b
[  256.175289] RSP: 0018:ffff88803ec07c80 EFLAGS: 00010006
[  256.176683] RAX: 0000000080010000 RBX: ffff88803ec07cf8 RCX: 000000000000002c
[  256.178122] RDX: 0000000000000000 RSI: ffff88803ec29d28 RDI: 0000000000000040
[  256.179580] RBP: ffff88803ec07cc0 R08: ffff88803ec07cf0 R09: 00000000000a401d
[  256.181031] R10: 0000000000000000 R11: ffff8880101891b8 R12: ffff88803f6dd600
[  256.182501] R13: ffff88803ec07cf8 R14: 000000000000000f R15: 0000000000000000
[  256.183957] FS:  00007ffff7fcfac0(0000) GS:ffff88803ec00000(0000) knlGS:0000000000000000
[  256.185419] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  256.186911] CR2: 0000555555710cdc CR3: 00000000240b4004 CR4: 00000000003706f0
[  256.188395] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  256.189888] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  256.191390] Call Trace:
[  256.192844]  <IRQ>
[  256.194253]  ? __mem_cgroup_uncharge_list+0x4e/0x57
[  256.195715]  release_pages+0x26f/0x27e
[  256.197150]  ? list_add_tail+0x39/0x39
[  256.198603]  pagevec_lru_move_fn+0x95/0xa4

The VM_BUG_ON was added in preparation for a time when the PCP lock could
be acquired as an IRQ-unsafe spinlock. The fundamental limitation is that
free_unref_page_list() must not be called with the PCP lock held as a
plain spinlock while an IRQ can be delivered. At the moment this cannot
happen and, even if the PCP lock were IRQ-unsafe, free_unref_page_list()
is not called from page allocator context in an unsafe manner. Remove
the VM_BUG_ON.

This is a fix for the mmotm patch
mm-page_alloc-protect-pcp-lists-with-a-spinlock.patch

Reported-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Mel Gorman <mgorman@techsingularity.net>
---
 mm/page_alloc.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 0d169aeeac6f..4c1e2a773e47 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3522,8 +3522,6 @@ void free_unref_page_list(struct list_head *list)
 	if (list_empty(list))
 		return;
 
-	VM_BUG_ON(in_hardirq());
-
 	page = lru_to_page(list);
 	locked_zone = page_zone(page);
 	pcp = pcp_spin_lock(locked_zone->per_cpu_pageset);




Thread overview: 27+ messages
2022-05-09 13:07 [RFC PATCH 0/6] Drain remote per-cpu directly v2 Mel Gorman
2022-05-09 13:08 ` [PATCH 1/6] mm/page_alloc: Add page->buddy_list and page->pcp_list Mel Gorman
2022-05-13  8:41   ` Muchun Song
2022-05-26 10:14     ` Mel Gorman
2022-05-09 13:08 ` [PATCH 2/6] mm/page_alloc: Use only one PCP list for THP-sized allocations Mel Gorman
2022-05-09 13:08 ` [PATCH 3/6] mm/page_alloc: Split out buddy removal code from rmqueue into separate helper Mel Gorman
2022-05-09 13:08 ` [PATCH 4/6] mm/page_alloc: Remove unnecessary page == NULL check in rmqueue Mel Gorman
2022-05-09 13:08 ` [PATCH 5/6] mm/page_alloc: Protect PCP lists with a spinlock Mel Gorman
2022-05-22  2:49   ` Hugh Dickins
2022-05-24 12:12     ` Mel Gorman [this message]
2022-05-24 12:19       ` Mel Gorman
2022-05-09 13:08 ` [PATCH 6/6] mm/page_alloc: Remotely drain per-cpu lists Mel Gorman
2022-05-09 15:58 ` [RFC PATCH 0/6] Drain remote per-cpu directly v2 Minchan Kim
2022-05-10  9:27   ` Mel Gorman
2022-05-10 18:13     ` Minchan Kim
2022-05-11 12:47       ` Mel Gorman
2022-05-11 17:20         ` Minchan Kim
  -- strict thread matches above, loose matches on Subject: below --
2022-05-12  8:50 [PATCH 0/6] Drain remote per-cpu directly v3 Mel Gorman
2022-05-12  8:50 ` [PATCH 5/6] mm/page_alloc: Protect PCP lists with a spinlock Mel Gorman
2022-05-13 12:22   ` Nicolas Saenz Julienne
2022-04-20  9:59 [RFC PATCH 0/6] Drain remote per-cpu directly Mel Gorman
2022-04-20  9:59 ` [PATCH 5/6] mm/page_alloc: Protect PCP lists with a spinlock Mel Gorman
2022-04-20 14:02   ` Hillf Danton
2022-04-20 14:35     ` Nicolas Saenz Julienne
2022-04-26 16:42   ` Nicolas Saenz Julienne
2022-04-26 16:48     ` Vlastimil Babka
2022-04-29  9:13     ` Mel Gorman
2022-04-26 19:24   ` Minchan Kim
2022-04-29  9:05     ` Mel Gorman
