linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Mel Gorman <mgorman@techsingularity.net>,
	Alexander Duyck <alexander.h.duyck@linux.intel.com>
Cc: Michal Hocko <mhocko@kernel.org>,
	akpm@linux-foundation.org, aarcange@redhat.com,
	dan.j.williams@intel.com, dave.hansen@intel.com,
	konrad.wilk@oracle.com, lcapitulino@redhat.com,
	mm-commits@vger.kernel.org, mst@redhat.com, osalvador@suse.de,
	pagupta@redhat.com, pbonzini@redhat.com, riel@surriel.com,
	vbabka@suse.cz, wei.w.wang@intel.com, willy@infradead.org,
	yang.zhang.wz@gmail.com, linux-mm@kvack.org
Subject: Re: + mm-introduce-reported-pages.patch added to -mm tree
Date: Thu, 7 Nov 2019 00:38:42 +0100	[thread overview]
Message-ID: <cff63b5c-0e1c-3da9-23c9-2409daecde87@redhat.com> (raw)
In-Reply-To: <20191106221150.GR3016@techsingularity.net>

>>> I definitely do not intent to nack this work, I just have maintainability
>>> concerns and considering there is an alternative approach that does not
>>> require to touch page allocator internals and which we need to compare
>>> against then I do not really think there is any need to push something
>>> in right away. Or is there any pressing reason to have this merged right
>>> now?
>>
>> The alternative approach doesn't touch the page allocator, however it
>> still has essentially the same changes to __free_one_page. I suspect the
>> performance issue seen is mostly due to the fact that because it doesn't
>> touch the page allocator it is taking the zone lock and probing the page
>> for each set bit to see if the page is still free. As such the performance
>> regression seen gets worse the lower the order used for reporting.
>>
> 
> What confused me quite a lot is that this is enabled at compile time
> and then incurs a performance hit whether there is a hypervisor that
> even cares is involved or not. So I don't think the performance angle
> justifying this approach is a good one because this implementation has
> issues of its own. Previously I said
> 
> 	I worry that poking too much into the internal state of the
> 	allocator will be fragile long-term. There is the arch alloc/free
> 	hooks but they are typically about protections only and does not
> 	interfere with the internal state of the allocator. Compaction
> 	pokes in as well but once the page is off the free list, the page
> 	allocator no longer cares so again there is on interference with
> 	the internal state. If the state is interefered with externally,
> 	it becomes unclear what happens if things like page merging is
> 	deferred in a way the allocator cannot control as high-order
> 	allocation requests may fail for example.
> 
> Adding an API for compaction does not get away from the problem that
> it'll be fragile to depend on the internal state of the allocator for
> correctness. Existing users that poke into the state do so as an
> optimistic shortcut but if it fails, nothing actually breaks. The free
> list reporting stuff might and will not be routinely tested.
> 
> Take compaction as an example, the initial implementation of it was dumb
> as rocks and only started maintaining additional state and later poking
> into the page allocator when there was empirical evidence it was necessary.
> 
> The initial implementation of page reporting should be the same, it
> should do no harm at all to users that don't care (hiding behind
> kconfig is not good enough, use static branches) and it should not
> depend on the internal state of the page allocator and ordering of free
> lists for correctness until it's shown it's absolutely necessary.
> 
> You say that the zone lock has to be taken in the alternative
> implementation to check if it's still free and sure, that would cost
> but unless you are isolating that page immediately then you are racing
> once the lock is released. If you are isolating immediately, then isolate
> pages in batches to amortise the loock costs.  The details of this could
> be really hard but this approach is essentially saying "everything,
> everywhere should take a small hit so the overhead is not noticeable for
> virtio users" which is a curious choice for a new feature.
> 
> Regardless of the details of any implementation, the first one should be
> basic, do no harm and be relatively simple giving just a bare interface
> to virtio/qemu/etc. Then optimise it until such point as there is no
> chance but to poke into the core.

I second that. If somebody would ask me, I'd want to see a simple, 
maintainable design that provides a net benefit, does not harm !virt and 
possibly reuses existing core functionality (e.g., page isolation). We 
can work from there to optimize.

-- 

Thanks,

David / dhildenb



  reply	other threads:[~2019-11-06 23:39 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20191106000547.juQRi83gi%akpm@linux-foundation.org>
2019-11-06 12:16 ` Michal Hocko
2019-11-06 14:09   ` David Hildenbrand
2019-11-06 16:35     ` Alexander Duyck
2019-11-06 16:54       ` Michal Hocko
2019-11-06 17:48         ` Alexander Duyck
2019-11-06 22:11           ` Mel Gorman
2019-11-06 23:38             ` David Hildenbrand [this message]
2019-11-07  0:20             ` Alexander Duyck
2019-11-07 10:20               ` Mel Gorman
2019-11-07 16:07                 ` Alexander Duyck
2019-11-08  9:43                   ` Mel Gorman
2019-11-08 16:17                     ` Alexander Duyck
2019-11-08 18:41                       ` Mel Gorman
2019-11-08 20:29                         ` Alexander Duyck
2019-11-09 14:57                           ` Mel Gorman
2019-11-10 18:03                             ` Alexander Duyck
2019-11-06 23:33           ` David Hildenbrand
2019-11-07  0:20             ` Dave Hansen
2019-11-07  0:52               ` David Hildenbrand
2019-11-07 17:12                 ` Dave Hansen
2019-11-07 17:46                   ` Michal Hocko
2019-11-07 18:08                     ` Dave Hansen
2019-11-07 18:12                     ` Alexander Duyck
2019-11-08  9:57                       ` Michal Hocko
2019-11-08 16:43                         ` Alexander Duyck
2019-11-07 18:46                   ` Qian Cai
2019-11-07 18:02             ` Alexander Duyck
2019-11-07 19:37               ` Nitesh Narayan Lal
2019-11-07 22:46                 ` Alexander Duyck
2019-11-07 22:43               ` David Hildenbrand
2019-11-08  0:42                 ` Alexander Duyck
2019-11-08  7:06                   ` David Hildenbrand
2019-11-08 17:18                     ` Alexander Duyck
2019-11-12 13:04                       ` David Hildenbrand
2019-11-12 18:34                         ` Alexander Duyck
2019-11-12 21:05                           ` David Hildenbrand
2019-11-12 22:17                             ` David Hildenbrand
2019-11-12 22:19                             ` Alexander Duyck
2019-11-12 23:10                               ` David Hildenbrand
2019-11-13  0:31                                 ` Alexander Duyck
2019-11-13 18:51                           ` Nitesh Narayan Lal
2019-11-06 16:49   ` Nitesh Narayan Lal
2019-11-11 18:52   ` Nitesh Narayan Lal
2019-11-11 22:00     ` Alexander Duyck
2019-11-12 15:19       ` Nitesh Narayan Lal
2019-11-12 16:18         ` Alexander Duyck
2019-11-13 18:39           ` Nitesh Narayan Lal

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cff63b5c-0e1c-3da9-23c9-2409daecde87@redhat.com \
    --to=david@redhat.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=alexander.h.duyck@linux.intel.com \
    --cc=dan.j.williams@intel.com \
    --cc=dave.hansen@intel.com \
    --cc=konrad.wilk@oracle.com \
    --cc=lcapitulino@redhat.com \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@kernel.org \
    --cc=mm-commits@vger.kernel.org \
    --cc=mst@redhat.com \
    --cc=osalvador@suse.de \
    --cc=pagupta@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=riel@surriel.com \
    --cc=vbabka@suse.cz \
    --cc=wei.w.wang@intel.com \
    --cc=willy@infradead.org \
    --cc=yang.zhang.wz@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox