Re: [PATCH] mm/memory_hotplug.c: don't fail hot unplug quite so eagerly

From: David Hildenbrand <david@redhat.com>
To: John Hubbard <jhubbard@nvidia.com>, Oscar Salvador <osalvador@suse.de>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org
Subject: Re: [PATCH] mm/memory_hotplug.c: don't fail hot unplug quite so eagerly
Date: Tue, 20 Jun 2023 09:12:51 +0200
Message-ID: <ed83df65-f785-7077-ddd0-4e53d6fa6056@redhat.com>
In-Reply-To: <20230620011719.155379-1-jhubbard@nvidia.com>

On 20.06.23 03:17, John Hubbard wrote:
> mm/memory_hotplug.c: don't fail hot unplug quite so eagerly
> 
> Some device drivers add memory to the system via memory hotplug. When
> the driver is unloaded, that memory is hot-unplugged.

Which interfaces are they using to add/remove memory?

> 
> However, memory hot unplug can fail. And these days, it fails a little
> too easily, with respect to the above case. Specifically, if a signal is
> pending on the process, hot unplug fails. This leads directly to: the
> user must reboot the machine in order to unload the driver, and
> therefore the device is unusable until the machine is rebooted.

Why can't they retry in user space when offlining fails with -EINTR, or 
re-trigger driver unloading?
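
A minimal sketch of such a user-space retry, assuming offlining is
triggered by writing "offline" to a memory block's sysfs state file
(a driver may use a different interface, in which case this does not
apply). The helper name offline_block_retry and its parameters are
hypothetical:

```shell
#!/bin/sh
# Hypothetical helper, not an existing tool: retry offlining one memory block.
#   $1  path to the block's sysfs state file
#       (e.g. /sys/devices/system/memory/memoryN/state)
#   $2  maximum number of attempts (default: 5)
#   $3  delay in seconds between attempts (default: 1)
offline_block_retry() {
    state_file=$1
    max_tries=${2:-5}
    delay=${3:-1}
    try=1
    while [ "$try" -le "$max_tries" ]; do
        # The write fails with a non-zero status when the kernel rejects
        # the request, for example with -EINTR or -EBUSY.
        if { echo offline > "$state_file"; } 2>/dev/null; then
            return 0
        fi
        try=$((try + 1))
        sleep "$delay"
    done
    return 1
}
```

The attempt limit matters: a block that holds unmovable pages may never
go offline, so an unbounded loop would just spin.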

> 
> During teardown paths in the kernel, a higher tolerance for failures or
> imperfections is often best. That is, it is often better to continue
> with the teardown, than to error out too early.
> 
> So in this case, other things (unmovable pages, un-splittable huge
> pages) can also cause the above problem. However, those are demonstrably
> less common than simply having a pending signal. I've got bug reports
> from users who can trivially reproduce this by killing their process
> with a "kill -9", for example.
> 
> Fix this by soldiering on with memory hot plug, even in the presence of
> pending signals.
> 
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>   mm/memory_hotplug.c | 6 ------
>   1 file changed, 6 deletions(-)
> 
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 8e0fa209d533..57a46620a667 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1879,12 +1879,6 @@ int __ref offline_pages(unsigned long start_pfn, unsigned long nr_pages,
>   	do {
>   		pfn = start_pfn;
>   		do {
> -			if (signal_pending(current)) {
> -				ret = -EINTR;
> -				reason = "signal backoff";
> -				goto failed_removal_isolated;
> -			}
> -
>   			cond_resched();
>   
>   			ret = scan_movable_pages(pfn, end_pfn, &pfn);

No, we can't remove that. It's documented behavior, and it exists 
precisely so that offlining can be interrupted:

https://docs.kernel.org/admin-guide/mm/memory-hotplug.html#id21

"
When offlining is triggered from user space, the offlining context can 
be terminated by sending a fatal signal. A timeout based offlining can 
easily be implemented via:

% timeout $TIMEOUT offline_block | failure_handling
"

Otherwise, there is no way to stop a userspace-triggered offline 
operation that loops forever in the kernel.
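
The documented timeout approach could be sketched as follows, assuming
GNU timeout(1) is available and that the block is offlined through its
sysfs state file. The wrapper name offline_with_timeout is
hypothetical:

```shell
#!/bin/sh
# Hypothetical wrapper around the documented "timeout" trick.
#   $1  path to the block's sysfs state file
#       (e.g. /sys/devices/system/memory/memoryN/state)
#   $2  time limit in seconds (default: 10)
# Returns the status of timeout(1): 0 if the write succeeded, 124 if the
# limit expired, another non-zero value if the write itself failed.
offline_with_timeout() {
    state_file=$1
    limit=${2:-10}
    # timeout(1) sends SIGTERM by default, which is fatal unless handled,
    # so the in-kernel offlining loop sees a pending signal and backs off.
    timeout "$limit" sh -c 'echo offline > "$1"' sh "$state_file"
}
```

A caller can then branch on the exit status to tell a timeout apart from
other failures.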

I guess switching to fatal_signal_pending() might help to some degree, 
and it should keep the timeout trick working.

But it wouldn't help in your case, where root kills arbitrary 
processes. I'm not sure if that is something we should be paying 
attention to.


-- 
Cheers,

David / dhildenb
