Re: [linus:master] [mm] f822a9a81a: stress-ng.bigheap.realloc_calls_per_sec 37.3% regression

From: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
To: David Hildenbrand <david@redhat.com>
Cc: Jann Horn <jannh@google.com>,
	kernel test robot <oliver.sang@intel.com>,
	Dev Jain <dev.jain@arm.com>,
	oe-lkp@lists.linux.dev, lkp@intel.com,
	linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Barry Song <baohua@kernel.org>, Pedro Falcato <pfalcato@suse.de>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	Bang Li <libang.li@antgroup.com>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	bibo mao <maobibo@loongson.cn>, Hugh Dickins <hughd@google.com>,
	Ingo Molnar <mingo@kernel.org>, Lance Yang <ioworker0@gmail.com>,
	Liam Howlett <liam.howlett@oracle.com>,
	Matthew Wilcox <willy@infradead.org>,
	Peter Xu <peterx@redhat.com>,
	Qi Zheng <zhengqi.arch@bytedance.com>,
	Ryan Roberts <ryan.roberts@arm.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Yang Shi <yang@os.amperecomputing.com>, Zi Yan <ziy@nvidia.com>,
	linux-mm@kvack.org
Subject: Re: [linus:master] [mm] f822a9a81a: stress-ng.bigheap.realloc_calls_per_sec 37.3% regression
Date: Thu, 7 Aug 2025 20:52:45 +0100
Message-ID: <c170282b-be66-4eb0-91bf-17614acf3321@lucifer.local>
In-Reply-To: <d8e0f3b3-6ea7-492b-bb94-4f5d1ab28ef2@redhat.com>

On Thu, Aug 07, 2025 at 08:31:18PM +0200, David Hildenbrand wrote:
> On 07.08.25 20:07, Jann Horn wrote:
> > On Thu, Aug 7, 2025 at 8:02 PM David Hildenbrand <david@redhat.com> wrote:
> > > Sure, we could use pte_batch_hint(), but I'm curious if x86 would also
> > > benefit with larger folios (e.g., 64K, 128K) with this patch.
> >
> > Where would you expect such a benefit to come from? This function is
> > more or less a memcpy(), except it has to read PTEs with xchg(), write
> > them atomically, and set softdirty flags. For x86, what the associated
> > folios look like and whether the PTEs are contiguous shouldn't matter.
> >
>
> Good point, I was assuming TLB flushing as well, but that doesn't really
> apply here because we are already batching that.

Ah good point, but indeed, while we force a TLB flush if we discover a
present pte, we do so only _after_ we have finished processing entries in
the PTE table, and we would only batch up to, at most, the end of the PTE
table, so we have zero possible delta here on that.
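
To make the flush accounting concrete, here is a userspace toy model, not
kernel code. Every name in it (toy_move_ptes(), TOY_PTE_PRESENT, struct
toy_move_stats) is made up for illustration, and a plain load/store pair
stands in for the real get-and-clear. It only shows that the number of
flushes per PTE table does not depend on the batch width:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PTRS_PER_PTE 512	/* entries per PTE table, x86-64 with 4K pages */
#define TOY_PTE_PRESENT  0x1u	/* hypothetical "present" bit for this toy */

struct toy_move_stats {
	size_t moved;	/* entries copied from src to dst */
	size_t flushes;	/* TLB flushes that would have been issued */
};

/*
 * Move entries [start, end) of one toy PTE table to another, `batch`
 * entries per step. The flush decision is made once, after the loop.
 */
static struct toy_move_stats toy_move_ptes(uint64_t *src, uint64_t *dst,
					   size_t start, size_t end,
					   size_t batch)
{
	struct toy_move_stats st = { 0, 0 };
	bool need_flush = false;

	assert(batch > 0 && start <= end && end <= TOY_PTRS_PER_PTE);

	for (size_t i = start; i < end; ) {
		size_t n = batch;

		/* A batch never extends past the end of the table. */
		if (n > end - i)
			n = end - i;

		for (size_t j = 0; j < n; j++) {
			uint64_t pte = src[i + j];

			src[i + j] = 0;		/* stand-in for get-and-clear */
			if (pte & TOY_PTE_PRESENT)
				need_flush = true;
			dst[i + j] = pte;
		}
		st.moved += n;
		i += n;
	}

	/* One flush per table at most, however the loop was batched. */
	if (need_flush)
		st.flushes++;

	return st;
}
```

Calling this with batch = 1 and batch = 16 over the same table gives the
same flush count, which is the "zero possible delta" above.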

I did wonder if _somehow_ we'd get some benefit by grouping operations
(yes, this was a handwavey thought).

But Jann's point puts that to bed...

I really feel like this is a super arch-specific feature that maybe we need
to go around and make arm64-only, or predicate on something effectively
equivalent to the contpte hint check.
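
Roughly the shape I have in mind, as a userspace toy rather than a patch.
The names (toy_batch_hint(), toy_nr_to_move(), TOY_CONT_PTES) are invented
for this sketch, and the block size of 16 assumes arm64 contpte with 4K
pages. The idea is only that we skip the batching work entirely unless the
hint says there is more than one entry to gain:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_CONT_PTES 16	/* assumed: arm64 contpte block size, 4K pages */

/*
 * Toy analogue of a batch hint: how many entries, starting at idx, lie
 * within the same contiguous block. Returns 1 if the entry is not marked
 * contiguous, which is always the case on an arch without such a bit.
 */
static size_t toy_batch_hint(size_t idx, bool cont)
{
	if (!cont)
		return 1;
	return TOY_CONT_PTES - (idx & (TOY_CONT_PTES - 1));
}

/*
 * Number of entries to move in one step. With a hint of 1 we stay on the
 * plain per-PTE path and do no extra work; otherwise we batch up to the
 * hint, capped by max_nr (e.g. the entries left in the PTE table).
 */
static size_t toy_nr_to_move(size_t idx, bool cont, size_t max_nr)
{
	size_t hint;

	assert(max_nr > 0);

	hint = toy_batch_hint(idx, cont);
	if (hint == 1)
		return 1;

	return hint < max_nr ? hint : max_nr;
}
```

With this gating, an arch that never sets the contiguous bit always takes
the one-entry path, so it should see no change from the batching logic.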

Because my whole basis for accepting this on other arches was that there'd
be little to no impact, and now we have seen a huge impact, which is worrying.

>
> --
> Cheers,
>
> David / dhildenb
>

Cheers, Lorenzo
