From: Vlastimil Babka <vbabka@suse.cz>
To: "Liam R. Howlett" <Liam.Howlett@Oracle.com>,
	Peng Zhang <zhangpeng.00@bytedance.com>,
	Hyeonggon Yoo <42.hyeyoo@gmail.com>,
	Joonsoo Kim <iamjoonsoo.kim@lge.com>,
	Pekka Enberg <penberg@kernel.org>,
	David Rientjes <rientjes@google.com>,
	Christoph Lameter <cl@linux.com>,
	Roman Gushchin <roman.gushchin@linux.dev>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	patches@lists.linux.dev, Matthew Wilcox <willy@infradead.org>
Subject: Re: [RFC v1 4/5] maple_tree: avoid bulk alloc/free to use percpu array more
Date: Tue, 8 Aug 2023 17:08:50 +0200
Message-ID: <e4344207-4236-aeb7-5d51-91e3c65451d8@suse.cz>
In-Reply-To: <20230808142945.tulcze5bjg5ciftk@revolver>

On 8/8/23 16:29, Liam R. Howlett wrote:
> * Peng Zhang <zhangpeng.00@bytedance.com> [230808 07:17]:
>> 
>> 
>> On 2023/8/8 17:53, Vlastimil Babka wrote:
>> > Using bulk alloc/free on a cache with percpu array should not be
>> > necessary and the bulk alloc actually bypasses the array (the prefill
>> > functionality currently relies on this).
>> > 
>> > The simplest change is just to convert the respective maple tree
>> > wrappers to do a loop of normal alloc/free.
>> > ---
>> >   lib/maple_tree.c | 11 +++++++++--
>> >   1 file changed, 9 insertions(+), 2 deletions(-)
>> > 
>> > diff --git a/lib/maple_tree.c b/lib/maple_tree.c
>> > index 1196d0a17f03..7a8e7c467d7c 100644
>> > --- a/lib/maple_tree.c
>> > +++ b/lib/maple_tree.c
>> > @@ -161,12 +161,19 @@ static inline struct maple_node *mt_alloc_one(gfp_t gfp)
>> >   static inline int mt_alloc_bulk(gfp_t gfp, size_t size, void **nodes)
>> >   {
>> > -	return kmem_cache_alloc_bulk(maple_node_cache, gfp, size, nodes);
>> > +	int allocated = 0;
>> > +	for (size_t i = 0; i < size; i++) {
>> > +		nodes[i] = kmem_cache_alloc(maple_node_cache, gfp);
>> > +		if (nodes[i])
>> If the i-th allocation fails, nodes[i] will be NULL. This is wrong. We'd
>> better guarantee that mt_alloc_bulk() either allocates everything
>> successfully or returns 0. The following case is not allowed:
>> nodes: [addr1][addr2][NULL][addr3].

Thanks, indeed. I guess it should just break; in case of failure and return
how many allocations succeeded so far.
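
Something like this (untested sketch, keeping the nodes array dense by
stopping at the first failure):

static inline int mt_alloc_bulk(gfp_t gfp, size_t size, void **nodes)
{
	int allocated = 0;

	for (size_t i = 0; i < size; i++) {
		nodes[i] = kmem_cache_alloc(maple_node_cache, gfp);
		if (!nodes[i])
			break;	/* never leave a NULL hole in the array */
		allocated++;
	}
	return allocated;
}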

But note this is really a quick RFC proof-of-concept hack. I'd expect that if
the whole idea is deemed good, the maple tree node handling could be
redesigned (simplified?) around it, and maybe there's no mt_alloc_bulk()
anymore as a result?

> Thanks for pointing this out Peng.
> 
> We can handle a lower number than requested being returned, but we
> cannot handle the sparse data.
> 
> The kmem_cache_alloc_bulk() can return a failure today - leaving the
> array to be cleaned by the caller, so if this is changed to a full
> success or full fail, then we will also have to change the caller to
> handle whatever state is returned if it differs from
> kmem_cache_alloc_bulk().
> 
> It might be best to return the size already allocated when a failure is
> encountered. This will make the caller, mas_alloc_nodes(), request more
> nodes.  Only in the case of zero allocations would this be seen as an
> OOM event.
> 
> Vlastimil, Is the first kmem_cache_alloc() call failing a possibility?

Sure, if there's no memory, it can fail. In practice, if the gfp is one that
allows reclaim, it will ultimately become a "too small to fail" allocation at
the page allocator level. But there are exceptions, like the task having
received a fatal signal, IIRC :)

> If so, what should be the corrective action?

Depends on your context: whether you can pass -ENOMEM on to the caller, or
need to succeed.
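
E.g. a hypothetical caller taking the first option (helper name made up,
just to illustrate propagating the error):

static int prefill_one_node(gfp_t gfp, struct maple_node **out)
{
	struct maple_node *node = mt_alloc_one(gfp);

	if (!node)
		return -ENOMEM;	/* let our caller decide how to recover */

	*out = node;
	return 0;
}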

>> > +			allocated++;
>> > +	}
>> > +	return allocated;
>> >   }
>> >   static inline void mt_free_bulk(size_t size, void __rcu **nodes)
>> >   {
>> > -	kmem_cache_free_bulk(maple_node_cache, size, (void **)nodes);
>> > +	for (size_t i = 0; i < size; i++)
>> > +		kmem_cache_free(maple_node_cache, nodes[i]);
>> >   }
>> >   static void mt_free_rcu(struct rcu_head *head)


