linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Michal Hocko <mhocko@suse.com>, Mel Gorman <mgorman@suse.de>
Cc: Patrick Daly <quic_pdaly@quicinc.com>,
	linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org,
	Juergen Gross <jgross@suse.com>
Subject: Re: Race condition in build_all_zonelists() when offlining movable zone
Date: Tue, 23 Aug 2022 15:50:23 +0200	[thread overview]
Message-ID: <37031749-ff57-2f90-5c90-f16473f31e37@redhat.com> (raw)
In-Reply-To: <YwTVMdH4RgSrUNUd@dhcp22.suse.cz>

On 23.08.22 15:25, Michal Hocko wrote:
> On Tue 23-08-22 13:58:50, Mel Gorman wrote:
>> On Tue, Aug 23, 2022 at 02:18:27PM +0200, Michal Hocko wrote:
>>> On Tue 23-08-22 12:09:46, Mel Gorman wrote:
>>>> On Tue, Aug 23, 2022 at 12:34:09PM +0200, David Hildenbrand wrote:
>>>>>> @@ -6553,7 +6576,7 @@ static void __build_all_zonelists(void *data)
>>>>>>  #endif
>>>>>>  	}
>>>>>>  
>>>>>> -	spin_unlock(&lock);
>>>>>> +	write_sequnlock(&zonelist_update_seq);
>>>>>>  }
>>>>>>  
>>>>>>  static noinline void __init
>>>>>>
>>>>>
>>>>> LGTM. The "retry_cpuset" label might deserve a better name now.
>>>>>
>>>>
>>>> Good point ...  "restart"?
>>>>
>>>>> Would
>>>>>
>>>>> Fixes: 6aa303defb74 ("mm, vmscan: only allocate and reclaim from zones
>>>>> with pages managed by the buddy allocator")
>>>>>
>>>>> be correct?
>>>>>
>>>>
>>>> Not specifically because the bug is due to a zone being completely removed
>>>> resulting in a rebuild. This race probably existed ever since memory
>>>> hotremove could theoritically remove a complete zone. A Cc: Stable would
>>>> be appropriate as it'll apply with fuzz back to at least 5.4.210 but beyond
>>>> that, it should be driven by a specific bug report showing that hot-remove
>>>> of a full zone was possible and triggered the race.
>>>
>>> I do not think so. 6aa303defb74 has changed the zonelist building and
>>> changed the check from pfn range (populated) to managed (with a memory).
>>
>> I'm not 100% convinced. The present_pages should have been the spanned range
>> minus any holes that exist in the zone. If the zone is completely removed,
>> the span should be zero meaning present and managed are both zero. No? 
> 
> IIRC, and David will correct me if I am mixing this up. The difference
> is that zonelists are rebuilt during memory offlining and that is when
> managed pages are removed from the allocator. Zone itself still has that
> physical range populated and so this patch would have made a difference.

To recap, memory offlining adjusts managed+present pages of the zone
essentially in one go. If after the adjustments, the zone is no longer
populated (present==0), we rebuild the zone lists.

Once that's done, we try shrinking the zone (start+spanned pages) --
which results in zone_start_pfn == 0 if there are no more pages. That
happens *after* rebuilding the zonelists via remove_pfn_range_from_zone().


Note that populated_zone() checks for present_pages. The actual zone
span (e.g., spanned_pages) is a different story and not of interest when
building zones or wanting to allocate memory.

> 
> Now, you are right that this is likely possible even without that commit
> but it is highly unlikely because physical hotremove is a very rare
> operation and the race window would be so large that it would be likely
> unfeasible.

I think I agree that 6aa303defb74 is most likely not the origin of this.
It could only have been the origin in weird corner cases where we
actually succeed offlining one memory block (adjust present+managed) and
end up with managed=0 and present!=0 -- which barely happens in
practice: especially for ZONE_MOVABLE. (yeah, there is memory ballooning
that adjusts managed pages dynamically and might provoke such a
situation on ZONE_MOVABLE)

-- 
Thanks,

David / dhildenb



  reply	other threads:[~2022-08-23 13:50 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-08-17  3:42 Patrick Daly
     [not found] ` <YvyM1AWeJRt6PF9B@dhcp22.suse.cz>
     [not found]   ` <YvyRvxO+FHMyuGn3@dhcp22.suse.cz>
     [not found]     ` <20220817104028.uin7cmkb4qlpgfbi@suse.de>
     [not found]       ` <YvzI0PHW6uojk+N1@dhcp22.suse.cz>
     [not found]         ` <20220817112647.z7wenwjpyt3hphtk@suse.de>
2022-08-19  2:11           ` Patrick Daly
2022-08-22 20:18           ` Patrick Daly
2022-08-23  6:36       ` David Hildenbrand
     [not found]         ` <20220823083349.5c2aolc6xgfhp3k7@suse.de>
2022-08-23  8:52           ` David Hildenbrand
2022-08-23  9:49             ` Mel Gorman
2022-08-23 10:34               ` David Hildenbrand
     [not found]                 ` <20220823110946.o3eawk3kghaykcim@suse.de>
2022-08-23 12:18                   ` Michal Hocko
     [not found]                     ` <20220823125850.o3nhkmikmv7vyxq4@suse.de>
2022-08-23 13:25                       ` Michal Hocko
2022-08-23 13:50                         ` David Hildenbrand [this message]
2022-08-23 13:57                           ` Michal Hocko
2022-08-23 15:14                             ` Mel Gorman
2022-08-23 15:38                               ` Michal Hocko
2022-08-23 15:51                                 ` David Hildenbrand
2022-08-24  9:45                                   ` Mel Gorman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=37031749-ff57-2f90-5c90-f16473f31e37@redhat.com \
    --to=david@redhat.com \
    --cc=jgross@suse.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=mhocko@suse.com \
    --cc=quic_pdaly@quicinc.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox