linux-mm.kvack.org archive mirror
From: David Hildenbrand <david@redhat.com>
To: Oscar Salvador <osalvador@suse.de>
Cc: Michal Hocko <mhocko@suse.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Pavel Tatashin <pasha.tatashin@soleen.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 1/3] mm,page_alloc: Use {get,put}_online_mems() to get stable zone's values
Date: Wed, 9 Jun 2021 11:42:14 +0200
Message-ID: <a1fe74b9-5531-5cbb-f24a-437ac7534904@redhat.com>
In-Reply-To: <0eadea9c-5af0-d7e6-071e-898b04294dd3@redhat.com>

On 08.06.21 17:00, David Hildenbrand wrote:
> On 07.06.21 12:23, Oscar Salvador wrote:
>> On Mon, Jun 07, 2021 at 10:49:01AM +0200, David Hildenbrand wrote:
>>> I'd like to point out that I think the seqlock is not in place to
>>> synchronize with actual growing/shrinking but to get consistent zone ranges
>>> -- like using atomics, but we have two inter-dependent values here.
>>
>> I guess so, at least that's what it should do.
>> But the way it is placed right now is misleading.
>>
>> If we really want to get consistent zone ranges, we should start using
>> zone's seqlock where it matters and that is pretty much all those
>> places that use zone_spans_pfn().
> 
> Right, or even only zone_end_pfn() to get a consistent value.
> 
>> Otherwise there is no way you can be sure the pfn you're checking is
>> within the limits. Moreover, as Michal pointed out earlier, if we really
>> want to go down that road the locking should be done in the caller
>> involving the operation, otherwise things might change once the lock
>> is dropped and you're working with a wrong assumption.
>>
>> I can see arguments for both ripping it out and doing it right (but none for
>> the way it is right now).
>> For ripping it out, one could say that those races might not be fatal,
>> as usually the pfn you're working with (the one you want to check falls
>> within a certain range) you know is valid, so the worst that can happen is
>> you get false positives/negatives and that might or might not be detected
>> further down. How bad false positives/negatives are, I guess, depends on the
>> situation, but we already do that right now.
>> The zone_spans_pfn() from page_outside_zone_boundaries() is the only one using
>> locking right now, so well, if we survived this long without locks in other places
>> using zone_spans_pfn(), it makes one wonder if it is that bad.
>>
>> On the other hand, one could argue that for correctness' sake, we should be holding
>> zone's seqlock whenever checking zone_spans_pfn() to avoid any inconsistency.
>>
>>
> 
> IMHO, as we know the race exists and we have a tool to handle it in
> place, we should maybe fix the obvious cases if possible.
> 
> Code that uses zone->zone_start_pfn directly is unlikely to be broken on
> most architectures. We will usually read/write via single instruction
> and won't get inconsistencies, for example, when shrinking or growing
> the zone. We most probably don't want to use an atomic for that right now.
> 
> Code that uses zone->spanned_pages to detect the zone end, however, is
> more likely to be broken. I don't think we have any relevant users around
> anymore. Everything was converted to zone_end_pfn().
> 
> I feel like we should just make zone_end_pfn() take the seqlock in read.
> Then, we at least get a consistent value, for example, while growing a zone.
> 
> Just imagine the following case when we grow a section to the front when
> onlining memory:
> 
> 	zone->zone_start_pfn -= new_pages;
> 	zone->spanned_pages += new_pages;
> 
> Note that compilers/CPUs might reshuffle as they like. If someone (e.g.,
> zone_spans_pfn()) races with that code, it might get new
> zone->zone_start_pfn but old zone->spanned_pages. zone_end_pfn() will
> report a "too small zone" and trigger false negatives in zone_spans_pfn().
> 
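
To make the quoted scenario concrete, here is a rough userspace sketch of
what "zone_end_pfn() takes the seqlock in read" could look like. It is not
kernel code: struct zone_model and the span_read_*() helpers are made-up
stand-ins for struct zone and the zone_span_seqbegin()/zone_span_seqretry()
pair, built on C11 atomics, and it assumes a single writer (the kernel
seqlock additionally serializes writers with a spinlock).

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the span fields of struct zone. */
struct zone_model {
	atomic_uint span_seq;		/* even: stable, odd: update in progress */
	atomic_ulong zone_start_pfn;
	atomic_ulong spanned_pages;
};

void zone_model_init(struct zone_model *z, unsigned long start,
		     unsigned long spanned)
{
	atomic_init(&z->span_seq, 0);
	atomic_init(&z->zone_start_pfn, start);
	atomic_init(&z->spanned_pages, spanned);
}

/* Writer: grow the zone to the front. Assumes exactly one writer. */
void zone_grow_front(struct zone_model *z, unsigned long new_pages)
{
	unsigned int seq = atomic_load_explicit(&z->span_seq,
						memory_order_relaxed);

	/* Mark the update as in progress (odd sequence value). */
	atomic_store_explicit(&z->span_seq, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);

	/* The two inter-dependent updates from the example above. */
	atomic_fetch_sub_explicit(&z->zone_start_pfn, new_pages,
				  memory_order_relaxed);
	atomic_fetch_add_explicit(&z->spanned_pages, new_pages,
				  memory_order_relaxed);

	/* Publish: the sequence value is even again. */
	atomic_store_explicit(&z->span_seq, seq + 2, memory_order_release);
}

/* Reader: wait until no update is in progress, return the sequence. */
unsigned int span_read_begin(struct zone_model *z)
{
	unsigned int seq;

	do {
		seq = atomic_load_explicit(&z->span_seq, memory_order_acquire);
	} while (seq & 1);
	return seq;
}

/* Reader: true if a writer ran since span_read_begin() returned seq. */
bool span_read_retry(struct zone_model *z, unsigned int seq)
{
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&z->span_seq, memory_order_relaxed) != seq;
}

/* Consistent zone end: retry until both fields come from one snapshot. */
unsigned long zone_end_pfn_stable(struct zone_model *z)
{
	unsigned long start, spanned;
	unsigned int seq;

	do {
		seq = span_read_begin(z);
		start = atomic_load_explicit(&z->zone_start_pfn,
					     memory_order_relaxed);
		spanned = atomic_load_explicit(&z->spanned_pages,
					       memory_order_relaxed);
	} while (span_read_retry(z, seq));

	return start + spanned;
}
```

With a zone starting at pfn 1000 and spanning 500 pages, a reader that picks
up the old spanned_pages and the new zone_start_pfn around a 200-page grow
would compute an end of 1300 instead of 1500. In this model that reader sees
span_read_retry() return true and tries again. As Michal noted, this only
gives a consistent value; it says nothing about the range staying put after
the read section ends.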

Thinking again, we could of course also simply convert to
zone->zone_start_pfn + zone->zone_end_pfn. Places that need
spanned_pages() would have the same issue, but I think they are rather a
corner case.

-- 
Thanks,

David / dhildenb


