linux-mm.kvack.org archive mirror
From: Vlastimil Babka <vbabka@suse.cz>
To: Hyeonggon Yoo <42.hyeyoo@gmail.com>
Cc: Matthew Wilcox <willy@infradead.org>,
	Hugh Dickins <hughd@google.com>,
	David Laight <David.Laight@aculab.com>,
	Joel Fernandes <joel@joelfernandes.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	rcu@vger.kernel.org, "Paul E . McKenney" <paulmck@kernel.org>
Subject: Re: amusing SLUB compaction bug when CC_OPTIMIZE_FOR_SIZE
Date: Tue, 25 Oct 2022 16:08:33 +0200
Message-ID: <ad3fae63-984b-3a4e-4bfc-a09db0ad35b0@suse.cz>
In-Reply-To: <Y1fpABCR3/Vs/u0H@hyeyoo>

On 10/25/22 15:47, Hyeonggon Yoo wrote:
> On Mon, Oct 24, 2022 at 04:35:04PM +0200, Vlastimil Babka wrote:
> 
> [...]
> 
>> diff --git a/mm/slab.c b/mm/slab.c
>> index 59c8e28f7b6a..219beb48588e 100644
>> --- a/mm/slab.c
>> +++ b/mm/slab.c
>> @@ -1370,6 +1370,8 @@ static struct slab *kmem_getpages(struct kmem_cache *cachep, gfp_t flags,
>>  
>>  	account_slab(slab, cachep->gfporder, cachep, flags);
>>  	__folio_set_slab(folio);
>> +	/* Make the flag visible before any changes to folio->mapping */
>> +	smp_wmb();
>>  	/* Record if ALLOC_NO_WATERMARKS was set when allocating the slab */
>>  	if (sk_memalloc_socks() && page_is_pfmemalloc(folio_page(folio, 0)))
>>  		slab_set_pfmemalloc(slab);
>> @@ -1387,9 +1389,11 @@ static void kmem_freepages(struct kmem_cache *cachep, struct slab *slab)
>>  
>>  	BUG_ON(!folio_test_slab(folio));
>>  	__slab_clear_pfmemalloc(slab);
>> -	__folio_clear_slab(folio);
>>  	page_mapcount_reset(folio_page(folio, 0));
>>  	folio->mapping = NULL;
>> +	/* Make the mapping reset visible before clearing the flag */
>> +	smp_wmb();
>> +	__folio_clear_slab(folio);
>>  
>>  	if (current->reclaim_state)
>>  		current->reclaim_state->reclaimed_slab += 1 << order;
>> diff --git a/mm/slub.c b/mm/slub.c
>> index 157527d7101b..6dc17cb915c5 100644
>> --- a/mm/slub.c
>> +++ b/mm/slub.c
>> @@ -1800,6 +1800,8 @@ static inline struct slab *alloc_slab_page(gfp_t flags, int node,
>>  
>>  	slab = folio_slab(folio);
>>  	__folio_set_slab(folio);
>> +	/* Make the flag visible before any changes to folio->mapping */
>> +	smp_wmb();
>>  	if (page_is_pfmemalloc(folio_page(folio, 0)))
>>  		slab_set_pfmemalloc(slab);
>>  
>> @@ -2008,8 +2010,10 @@ static void __free_slab(struct kmem_cache *s, struct slab *slab)
>>  	}
>>  
>>  	__slab_clear_pfmemalloc(slab);
>> -	__folio_clear_slab(folio);
>>  	folio->mapping = NULL;
>> +	/* Make the mapping reset visible before clearing the flag */
>> +	smp_wmb();
>> +	__folio_clear_slab(folio);
>>  	if (current->reclaim_state)
>>  		current->reclaim_state->reclaimed_slab += pages;
>>  	unaccount_slab(slab, order, s);
>> -- 
>> 2.38.0
> 
> Do we need to try this with memory barriers before the frozen refcount lands?

IIRC there was an unresolved issue with the frozen refcount tripping the page
isolation code, so I didn't want to depend on that.

> It's quite complicated and IIUC there is still a theoretical race:
> 
> At isolate_movable_page:        At slab alloc:                          At slab free:
>                                 folio = alloc_pages(flags, order)
> 
> folio_try_get()
> folio_test_slab() == false
>                                 __folio_set_slab(folio)
>                                 smp_wmb()
> 
>                                                                         call_rcu(&slab->rcu_head, rcu_free_slab);
> 
> 
> smp_rmb()
> __folio_test_movable() == true
> 
>                                                                         folio->mapping = NULL;
>                                                                         smp_wmb()
>                                                                         __folio_clear_slab(folio);
> smp_rmb()
> folio_test_slab() == false
> 
> folio_trylock()

There's also this, between the above and the below:

if (!PageMovable(page) || PageIsolated(page))
	goto out_no_isolated;

mops = page_movable_ops(page);

If we put another smp_rmb() before the PageMovable test, would that help?
Would it ensure that we observe the folio->mapping = NULL store from the
"slab free" side?
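
To make the ordering easier to reason about, here is a rough userspace model
of the write side from the patch and the check sequence from the diagram
above. It is only a sketch under assumptions: struct fake_folio, the fake_*
helpers and FAKE_MAPPING_MOVABLE are made up for illustration, and C11
release/acquire fences merely stand in for smp_wmb()/smp_rmb(), which they
approximate but do not match exactly.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative tag bit kept in the low bits of the mapping pointer (assumption). */
#define FAKE_MAPPING_MOVABLE ((uintptr_t)0x2)

/* The tag only works if the pointee is aligned beyond the tag bit. */
static_assert(FAKE_MAPPING_MOVABLE < _Alignof(long),
	      "tag bit needs sufficiently aligned pointers");

/* Hypothetical stand-in for the two folio fields discussed in this thread. */
struct fake_folio {
	atomic_bool slab;		/* plays the role of the slab flag */
	_Atomic(void *) mapping;	/* plays the role of folio->mapping */
};

/* "Slab alloc" side: publish the flag before anything touches mapping. */
static void fake_slab_alloc(struct fake_folio *f, void *slab_data)
{
	atomic_store_explicit(&f->slab, true, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);	/* stand-in for smp_wmb() */
	atomic_store_explicit(&f->mapping, slab_data, memory_order_relaxed);
}

/* "Slab free" side: reset mapping before the flag is cleared. */
static void fake_slab_free(struct fake_folio *f)
{
	atomic_store_explicit(&f->mapping, NULL, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);	/* stand-in for smp_wmb() */
	atomic_store_explicit(&f->slab, false, memory_order_relaxed);
}

/* Reader: flag test, movable test, flag test again, as in the diagram. */
static bool fake_looks_movable(struct fake_folio *f)
{
	if (atomic_load_explicit(&f->slab, memory_order_relaxed))
		return false;
	atomic_thread_fence(memory_order_acquire);	/* stand-in for smp_rmb() */

	void *m = atomic_load_explicit(&f->mapping, memory_order_relaxed);
	bool movable = ((uintptr_t)m & FAKE_MAPPING_MOVABLE) != 0;
	atomic_thread_fence(memory_order_acquire);	/* stand-in for smp_rmb() */

	if (atomic_load_explicit(&f->slab, memory_order_relaxed))
		return false;
	return movable;
}
```

The model only shows which store is ordered against which load. It does not
close the window from the diagram: a reader can still pass all three checks
and have the folio change underneath it afterwards.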

But yeah, it's getting ridiculous. Maybe there's a simpler way to check two
bits in two different bytes atomically. Or maybe it's just an impossible
task, I feel I just dunno computers at this point.
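
For comparison, this is what the "simpler way" would look like if both bits
happened to share one naturally aligned word, so that a single load snapshots
them together. This is a hypothetical layout, not how the slab flag and the
movable bit are stored today; FAKE_BIT_SLAB, FAKE_BIT_MOVABLE and the helper
name are assumptions made up for the sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Two flags in two different bytes of the same 32-bit word (assumed layout). */
#define FAKE_BIT_SLAB		(UINT32_C(1) << 0)	/* byte 0 */
#define FAKE_BIT_MOVABLE	(UINT32_C(1) << 8)	/* byte 1 */

static_assert((FAKE_BIT_SLAB & 0x00ff) && (FAKE_BIT_MOVABLE & 0xff00),
	      "the two bits are meant to sit in different bytes");

/*
 * One atomic load yields a consistent snapshot of both bits, so no barrier
 * pairing between two separate tests is needed for this particular check.
 */
static bool fake_movable_and_not_slab(_Atomic uint32_t *word)
{
	uint32_t snap = atomic_load_explicit(word, memory_order_acquire);

	return (snap & (FAKE_BIT_SLAB | FAKE_BIT_MOVABLE)) == FAKE_BIT_MOVABLE;
}
```

Even then the snapshot can be stale by the time the caller acts on it, so it
would only simplify the check itself, not remove the need for a reference or
lock that keeps the state stable afterwards.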

> mops->isolate_page() (*crash*)
> 
> 
> Please let me know if I'm missing something ;-)
> Thanks!
> 


