From: David Hildenbrand <david@redhat.com>
To: Pavel Tatashin <pasha.tatashin@soleen.com>,
jmorris@namei.org, sashal@kernel.org,
linux-kernel@vger.kernel.org, linux-mm@kvack.org,
linux-nvdimm@lists.01.org, akpm@linux-foundation.org,
mhocko@suse.com, dave.hansen@linux.intel.com,
dan.j.williams@intel.com, keith.busch@intel.com,
vishal.l.verma@intel.com, dave.jiang@intel.com,
zwisler@kernel.org, thomas.lendacky@amd.com,
ying.huang@intel.com, fengguang.wu@intel.com, bp@suse.de,
bhelgaas@google.com, baiyaowei@cmss.chinamobile.com,
tiwai@suse.de, jglisse@redhat.com
Subject: Re: [v5 2/3] mm/hotplug: make remove_memory() interface useable
Date: Fri, 3 May 2019 12:06:19 +0200
Message-ID: <cfd599a7-ed05-fa5a-93a0-397fb9de72e4@redhat.com>
In-Reply-To: <20190502184337.20538-3-pasha.tatashin@soleen.com>
On 02.05.19 20:43, Pavel Tatashin wrote:
> As of right now remove_memory() interface is inherently broken. It tries
> to remove memory but panics if some memory is not offline. The problem
> is that it is impossible to ensure that all memory blocks are offline as
> this function also takes lock_device_hotplug that is required to
> change memory state via sysfs.
>
The existing interface can actually be made to work today: register a
memory hotplug notifier and reject any onlining attempts. But I agree that
the interface becomes more usable this way.
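
To illustrate the notifier approach, here is a simplified userspace model
of a notifier chain in which one callback vetoes "going online" events for
a guarded range. It is only a sketch: every name in it (register_notifier,
online_range, guard_notifier, the guarded range itself) is hypothetical
and not the kernel API. In the kernel, the corresponding pieces would be
register_memory_notifier() and the MEM_GOING_ONLINE event.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace model; none of these names are the kernel's API. */
enum mem_event { EV_GOING_ONLINE, EV_GOING_OFFLINE };

struct mem_range {
	unsigned long start_pfn;
	unsigned long nr_pages;
};

typedef int (*mem_notifier_fn)(enum mem_event ev, const struct mem_range *r);

#define MAX_NOTIFIERS 4
static mem_notifier_fn notifiers[MAX_NOTIFIERS];
static size_t nr_notifiers;

/* Append a callback to the (model) notifier chain. */
static int register_notifier(mem_notifier_fn fn)
{
	if (nr_notifiers == MAX_NOTIFIERS)
		return -ENOSPC;
	notifiers[nr_notifiers++] = fn;
	return 0;
}

/* Range a driver wants to keep offline until it has been removed (made up). */
static struct mem_range guarded = { .start_pfn = 0x1000, .nr_pages = 0x800 };

static bool overlaps(const struct mem_range *a, const struct mem_range *b)
{
	return a->start_pfn < b->start_pfn + b->nr_pages &&
	       b->start_pfn < a->start_pfn + a->nr_pages;
}

/* Veto any attempt to online memory that touches the guarded range. */
static int guard_notifier(enum mem_event ev, const struct mem_range *r)
{
	if (ev == EV_GOING_ONLINE && overlaps(r, &guarded))
		return -EBUSY;
	return 0;
}

/* Model of an online request: the first notifier that objects wins. */
static int online_range(const struct mem_range *r)
{
	for (size_t i = 0; i < nr_notifiers; i++) {
		int rc = notifiers[i](EV_GOING_ONLINE, r);

		if (rc)
			return rc;
	}
	return 0;
}
```

With guard_notifier registered, an online request that overlaps the guarded
range fails with -EBUSY, while requests elsewhere still succeed. That
keeps the blocks offline up to the point where the memory is removed.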
> So, between calling this function and offlining all memory blocks there
> is always a window when lock_device_hotplug is released, and therefore,
> there is always a chance for a panic during this window.
>
> Make this interface return an error if memory removal fails. This way
> it is safe to call this function without panicking the machine, and it
> also becomes symmetric to add_memory(), which already returns an error.
>
> Signed-off-by: Pavel Tatashin <pasha.tatashin@soleen.com>
> ---
> include/linux/memory_hotplug.h | 8 +++--
> mm/memory_hotplug.c | 61 ++++++++++++++++++++++------------
> 2 files changed, 46 insertions(+), 23 deletions(-)
>
> diff --git a/include/linux/memory_hotplug.h b/include/linux/memory_hotplug.h
> index 8ade08c50d26..5438a2d92560 100644
> --- a/include/linux/memory_hotplug.h
> +++ b/include/linux/memory_hotplug.h
> @@ -304,7 +304,7 @@ static inline void pgdat_resize_init(struct pglist_data *pgdat) {}
> extern bool is_mem_section_removable(unsigned long pfn, unsigned long nr_pages);
> extern void try_offline_node(int nid);
> extern int offline_pages(unsigned long start_pfn, unsigned long nr_pages);
> -extern void remove_memory(int nid, u64 start, u64 size);
> +extern int remove_memory(int nid, u64 start, u64 size);
> extern void __remove_memory(int nid, u64 start, u64 size);
>
> #else
> @@ -321,7 +321,11 @@ static inline int offline_pages(unsigned long start_pfn, unsigned long nr_pages)
> return -EINVAL;
> }
>
> -static inline void remove_memory(int nid, u64 start, u64 size) {}
> +static inline int remove_memory(int nid, u64 start, u64 size)
> +{
> + return -EBUSY;
> +}
> +
> static inline void __remove_memory(int nid, u64 start, u64 size) {}
> #endif /* CONFIG_MEMORY_HOTREMOVE */
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 8c454e82d4f6..a826aededa1a 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1778,9 +1778,10 @@ static int check_memblock_offlined_cb(struct memory_block *mem, void *arg)
> endpa = PFN_PHYS(section_nr_to_pfn(mem->end_section_nr + 1))-1;
> pr_warn("removing memory fails, because memory [%pa-%pa] is onlined\n",
> &beginpa, &endpa);
> - }
>
> - return ret;
> + return -EBUSY;
> + }
> + return 0;
> }
>
> static int check_cpu_on_node(pg_data_t *pgdat)
> @@ -1843,19 +1844,9 @@ void try_offline_node(int nid)
> }
> EXPORT_SYMBOL(try_offline_node);
>
> -/**
> - * remove_memory
> - * @nid: the node ID
> - * @start: physical address of the region to remove
> - * @size: size of the region to remove
> - *
> - * NOTE: The caller must call lock_device_hotplug() to serialize hotplug
> - * and online/offline operations before this call, as required by
> - * try_offline_node().
> - */
> -void __ref __remove_memory(int nid, u64 start, u64 size)
> +static int __ref try_remove_memory(int nid, u64 start, u64 size)
> {
> - int ret;
> + int rc = 0;
>
> BUG_ON(check_hotplug_memory_range(start, size));
>
> @@ -1863,13 +1854,13 @@ void __ref __remove_memory(int nid, u64 start, u64 size)
>
> /*
> * All memory blocks must be offlined before removing memory. Check
> - * whether all memory blocks in question are offline and trigger a BUG()
> + * whether all memory blocks in question are offline and return error
> * if this is not the case.
> */
> - ret = walk_memory_range(PFN_DOWN(start), PFN_UP(start + size - 1), NULL,
> - check_memblock_offlined_cb);
> - if (ret)
> - BUG();
> + rc = walk_memory_range(PFN_DOWN(start), PFN_UP(start + size - 1), NULL,
> + check_memblock_offlined_cb);
> + if (rc)
> + goto done;
>
> /* remove memmap entry */
> firmware_map_remove(start, start + size, "System RAM");
> @@ -1879,14 +1870,42 @@ void __ref __remove_memory(int nid, u64 start, u64 size)
>
> try_offline_node(nid);
>
> +done:
> mem_hotplug_done();
> + return rc;
> }
>
> -void remove_memory(int nid, u64 start, u64 size)
> +/**
> + * remove_memory
> + * @nid: the node ID
> + * @start: physical address of the region to remove
> + * @size: size of the region to remove
> + *
> + * NOTE: The caller must call lock_device_hotplug() to serialize hotplug
> + * and online/offline operations before this call, as required by
> + * try_offline_node().
> + */
> +void __remove_memory(int nid, u64 start, u64 size)
> {
> +
> + /*
> + * trigger BUG() if some memory is not offlined prior to calling this
> + * function
> + */
> + if (try_remove_memory(nid, start, size))
> + BUG();
> +}
> +
> +/* Remove memory if every memory block is offline, otherwise return false */
The comment is wrong: the function returns an int error code, not "false".
> +int remove_memory(int nid, u64 start, u64 size)
> +{
> + int rc;
> +
> lock_device_hotplug();
> - __remove_memory(nid, start, size);
> + rc = try_remove_memory(nid, start, size);
> unlock_device_hotplug();
> +
> + return rc;
> }
> EXPORT_SYMBOL_GPL(remove_memory);
> #endif /* CONFIG_MEMORY_HOTREMOVE */
>
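
The split between a fallible helper, a wrapper that treats failure as
fatal, and a caller that handles the error can be sketched in a small
userspace model. This is an illustration under stated assumptions, not the
kernel code: the block arrays and all model_* names are hypothetical,
abort() stands in for BUG(), and the -EINVAL range check is a convenience
of the model only.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

/* Simplified userspace model; block layout and model_* names are made up. */
#define NR_BLOCKS 8
static bool block_online[NR_BLOCKS];
static bool block_present[NR_BLOCKS] = {
	true, true, true, true, true, true, true, true
};

/* Remove blocks [first, first + count) only if every one is offline. */
static int model_try_remove(unsigned int first, unsigned int count)
{
	unsigned int i;

	if (first >= NR_BLOCKS || count == 0 || count > NR_BLOCKS - first)
		return -EINVAL;

	/* Check everything first so a failure leaves the state untouched. */
	for (i = first; i < first + count; i++)
		if (block_online[i])
			return -EBUSY;

	for (i = first; i < first + count; i++)
		block_present[i] = false;
	return 0;
}

/*
 * Counterpart of __remove_memory(): the caller promises that the range
 * is offline, so a failure is treated as fatal.
 */
static void model_remove_or_die(unsigned int first, unsigned int count)
{
	if (model_try_remove(first, count))
		abort();
}

/*
 * Counterpart of a remove_memory() caller: propagate the error so the
 * removal can be retried once the blocks have been offlined.
 */
static int model_driver_release(unsigned int first, unsigned int count)
{
	int rc = model_try_remove(first, count);

	if (rc)
		return rc;	/* nothing was removed, driver state is kept */
	return 0;
}
```

A caller of model_driver_release() sees -EBUSY while any block in the
range is online and 0 once all of them are offline, which mirrors how
the new remove_memory() return value is meant to be consumed.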
Looks sane to me
Reviewed-by: David Hildenbrand <david@redhat.com>
--
Thanks,
David / dhildenb