linux-mm.kvack.org archive mirror
From: Dev Jain <dev.jain@arm.com>
To: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: David Hildenbrand <david@redhat.com>,
	kernel test robot <oliver.sang@intel.com>,
	oe-lkp@lists.linux.dev, lkp@intel.com,
	linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Barry Song <baohua@kernel.org>, Pedro Falcato <pfalcato@suse.de>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	Bang Li <libang.li@antgroup.com>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	bibo mao <maobibo@loongson.cn>, Hugh Dickins <hughd@google.com>,
	Ingo Molnar <mingo@kernel.org>, Jann Horn <jannh@google.com>,
	Lance Yang <ioworker0@gmail.com>,
	Liam Howlett <liam.howlett@oracle.com>,
	Matthew Wilcox <willy@infradead.org>,
	Peter Xu <peterx@redhat.com>,
	Qi Zheng <zhengqi.arch@bytedance.com>,
	Ryan Roberts <ryan.roberts@arm.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	Yang Shi <yang@os.amperecomputing.com>, Zi Yan <ziy@nvidia.com>,
	linux-mm@kvack.org
Subject: Re: [linus:master] [mm] f822a9a81a: stress-ng.bigheap.realloc_calls_per_sec 37.3% regression
Date: Thu, 7 Aug 2025 22:41:21 +0530
Message-ID: <7b4e14b1-c212-4207-aa4d-aa5610148abd@arm.com>
In-Reply-To: <404ae0cb-9c70-4aa1-99ef-b5e90c500140@lucifer.local>


On 07/08/25 10:37 pm, Lorenzo Stoakes wrote:
> On Thu, Aug 07, 2025 at 10:34:43PM +0530, Dev Jain wrote:
>> On 07/08/25 9:46 pm, Lorenzo Stoakes wrote:
>>> On Thu, Aug 07, 2025 at 05:10:17PM +0100, Lorenzo Stoakes wrote:
>>>> On Thu, Aug 07, 2025 at 09:36:38PM +0530, Dev Jain wrote:
>>>>
>>>>>>>> commit:
>>>>>>>>      94dab12d86 ("mm: call pointers to ptes as ptep")
>>>>>>>>      f822a9a81a ("mm: optimize mremap() by PTE batching")
>>>>>>>>
>>>>>>>> 94dab12d86cf77ff f822a9a81a31311d67f260aea96
>>>>>>>> ---------------- ---------------------------
>>>>>>>>             %stddev     %change         %stddev
>>>>>>>>                 \          |                \
>>>>>>>>         13777 ± 37%     +45.0%      19979 ± 27% numa-vmstat.node1.nr_slab_reclaimable
>>>>>>>>        367205            +2.3%     375703 vmstat.system.in
>>>>>>>>         55106 ± 37%     +45.1%      79971 ± 27% numa-meminfo.node1.KReclaimable
>>>>>>>>         55106 ± 37%     +45.1%      79971 ± 27% numa-meminfo.node1.SReclaimable
>>>>>>>>        559381           -37.3%     350757 stress-ng.bigheap.realloc_calls_per_sec
>>>>>>>>         11468            +1.2%      11603 stress-ng.time.system_time
>>>>>>>>        296.25            +4.5%     309.70 stress-ng.time.user_time
>>>>>>>>          0.81 ±187%    -100.0%       0.00 perf-sched.sch_delay.avg.ms.__cond_resched.zap_pte_range.zap_pmd_range.isra.0
>>>>>>>>          9.36 ±165%    -100.0%       0.00 perf-sched.sch_delay.max.ms.__cond_resched.zap_pte_range.zap_pmd_range.isra.0
>>>>>>>>          0.81 ±187%    -100.0%       0.00 perf-sched.wait_time.avg.ms.__cond_resched.zap_pte_range.zap_pmd_range.isra.0
>>>>>>>>          9.36 ±165%    -100.0%       0.00 perf-sched.wait_time.max.ms.__cond_resched.zap_pte_range.zap_pmd_range.isra.0
>>>> Hm, is the lack of zap some kind of clue here?
>>>>
>>>>>>>>          5.50 ± 17%    +390.9%      27.00 ± 56% perf-c2c.DRAM.local
>>>>>>>>        388.50 ± 10%    +114.7%     834.17 ± 33% perf-c2c.DRAM.remote
>>>>>>>>          1214 ± 13%    +107.3%       2517 ± 31% perf-c2c.HITM.local
>>>>>>>>        135.00 ± 19%    +130.9%     311.67 ± 32% perf-c2c.HITM.remote
>>>>>>>>          1349 ± 13%    +109.6%       2829 ± 31% perf-c2c.HITM.total
>>>>>>> Yeah this also looks pretty consistent too...
>>>>>> It almost looks like some kind of NUMA effects?
>>>>>>
>>>>>> I would have expected that it's the overhead of the vm_normal_folio(),
>>>>>> but not sure how that corresponds to the SLAB + local vs. remote stats.
>>>>>> Maybe they are just noise?
>>>>> Is there any way of making the robot test again? As you said, the only
>>>>> suspect is vm_normal_folio(), nothing seems to pop up...
>>>>>
>>>> Not sure there's much point in that. These tests are run repeatedly and
>>>> statistical analysis is taken from them, so what would another run accomplish,
>>>> unless there's something very consistently wrong with the box that happens
>>>> only to trigger at your commit?
>>>>
>>>> Cheers, Lorenzo
>>> Let me play around on my test box roughly and see if I can repro
>> So I tested with
>> ./stress-ng --timeout 1 --times --verify --metrics --no-rand-seed --oom-avoid --bigheap 20
>> extracted the number from the output line containing "realloc calls per sec", and
>> computed the average and standard deviation over 20 runs. Before the patch:
>>
>> Average realloc calls/sec: 196907.380000
>> Standard deviation        : 12685.721021
>>
>> After the patch:
>>
>> Average realloc calls/sec: 187894.300500
>> Standard deviation        : 12494.153533
>>
>> which is a drop of approx. 5% (about 4.6%), i.e. less than one standard deviation.
>>
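The measurement quoted above (pull the "realloc calls per sec" figure out of each run, then take the average and standard deviation over the runs) can be sketched roughly as below. This is only an illustration, not the script that was actually used. The regular expression assumes the number sits directly before the phrase on the metric line, the helper names are hypothetical, and the thread does not say whether a sample or population standard deviation was taken, so `statistics.stdev` (sample) is an assumption.

```python
import re
import statistics

# Assumption: the rate is the number immediately before this phrase.
# Only the phrase itself is taken from the thread.
METRIC_RE = re.compile(r"([0-9]+(?:\.[0-9]+)?)\s+realloc calls per sec")


def extract_rate(output: str) -> float:
    """Return the number preceding 'realloc calls per sec' in one run's output."""
    match = METRIC_RE.search(output)
    if match is None:
        raise ValueError("no 'realloc calls per sec' line found")
    return float(match.group(1))


def summarize(rates):
    """Mean and sample standard deviation (assumed) over per-run rates."""
    return statistics.mean(rates), statistics.stdev(rates)


def percent_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100.0


# Averages quoted in the thread: the local 20-run test and the robot report.
local = percent_change(196907.38, 187894.3005)
robot = percent_change(559381, 350757)
print(f"local: {local:.1f}%  robot: {robot:.1f}%")
```

Feeding each run's captured output through `extract_rate()` and the resulting list through `summarize()` gives the two figures reported per kernel. `percent_change()` makes the gap between the local result and the robot's result explicit.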
> Are you testing that on x86-64 bare metal?

QEMU VM on x86-64.

>
> Anyway this is _not_ what I get.
>
> I am testing on my test box, and seeing a _very significant_ regression as reported.
>
> I am narrowing down the exact cause and will report back. Non-NUMA box, recent
> uArch, dedicated machine.

Oops. Thanks for testing. Lemme stare at my patch for some more time :)

>
> Cheers, Lorenzo


