From: Jeff Layton <jlayton@kernel.org>
To: kernel test robot <oliver.sang@intel.com>
Cc: oe-lkp@lists.linux.dev, lkp@intel.com,
Linux Memory Management List <linux-mm@kvack.org>,
Chuck Lever <chuck.lever@oracle.com>,
linux-nfs@vger.kernel.org, ying.huang@intel.com,
feng.tang@intel.com, fengwei.yin@intel.com
Subject: Re: [linux-next:master] [nfsd] 8cb33389f6: fsmark.app_overhead 81.6% regression
Date: Tue, 01 Oct 2024 08:44:43 -0400
Message-ID: <70ce1781c9fb39d974d38b7426194ead97926ed3.camel@kernel.org>
In-Reply-To: <202409161645.d44bced5-oliver.sang@intel.com>

On Mon, 2024-09-16 at 16:41 +0800, kernel test robot wrote:
>
> Hello,
>
> kernel test robot noticed an 81.6% regression of fsmark.app_overhead on:
>
>
> commit: 8cb33389f66441dc4e54b28fe0d9bd4bcd9b796d ("nfsd: implement OPEN_ARGS_SHARE_ACCESS_WANT_OPEN_XOR_DELEGATION")
> https://git.kernel.org/cgit/linux/kernel/git/next/linux-next.git master
>
> testcase: fsmark
> test machine: 128 threads 2 sockets Intel(R) Xeon(R) Platinum 8358 CPU @ 2.60GHz (Ice Lake) with 128G memory
> parameters:
>
> iterations: 1x
> nr_threads: 1t
> disk: 1HDD
> fs: btrfs
> fs2: nfsv4
> filesize: 4K
> test_size: 40M
> sync_method: fsyncBeforeClose
> nr_files_per_directory: 1fpd
> cpufreq_governor: performance
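
(For anyone trying to reproduce this by hand: my best guess at mapping
the lkp parameters above onto a standalone fs_mark invocation, assuming
the usual flag meanings and a made-up mount point, would be something
like:

    fs_mark -d /mnt/nfs/testdir -s 4096 -n 10240 -t 1 -N 1 -S 1 -L 1

i.e. 10240 4k files to cover the 40M test size, one thread, one file
per directory, fsync before close.)
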
>
>
>
>
> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> the same patch/commit), kindly add the following tags
> | Reported-by: kernel test robot <oliver.sang@intel.com>
> | Closes: https://lore.kernel.org/oe-lkp/202409161645.d44bced5-oliver.sang@intel.com
>
>
> Details are as below:
> -------------------------------------------------------------------------------------------------->
>
>
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20240916/202409161645.d44bced5-oliver.sang@intel.com
>
> =========================================================================================
> compiler/cpufreq_governor/disk/filesize/fs2/fs/iterations/kconfig/nr_files_per_directory/nr_threads/rootfs/sync_method/tbox_group/test_size/testcase:
> gcc-12/performance/1HDD/4K/nfsv4/btrfs/1x/x86_64-rhel-8.3/1fpd/1t/debian-12-x86_64-20240206.cgz/fsyncBeforeClose/lkp-icl-2sp6/40M/fsmark
>
> commit:
> e29c78a693 ("nfsd: add support for FATTR4_OPEN_ARGUMENTS")
> 8cb33389f6 ("nfsd: implement OPEN_ARGS_SHARE_ACCESS_WANT_OPEN_XOR_DELEGATION")
>
> e29c78a6936e7422 8cb33389f66441dc4e54b28fe0d
> ---------------- ---------------------------
>          %stddev     %change         %stddev
>              \          |                \
> 24388 ± 20% -32.8% 16400 ± 18% numa-vmstat.node0.nr_slab_reclaimable
> 61.50 ± 4% -10.6% 55.00 ± 6% perf-c2c.HITM.local
> 0.20 ± 3% +23.0% 0.24 ± 13% perf-sched.sch_delay.max.ms.__cond_resched.__wait_for_common.affine_move_task.__set_cpus_allowed_ptr.__sched_setaffinity
> 2977 -6.1% 2796 vmstat.system.cs
> 2132466 ± 2% +81.6% 3871852 fsmark.app_overhead

I have been unable to reproduce this result with fs_mark. I've run a
number of repeated tests, and I can create files just as fast with or
without this patch (roughly 46 files/s on my test machine).

I'm particularly suspicious of the fsmark.app_overhead value above. The
fsmark output says:

# App overhead is time in microseconds spent in the test not doing file writing related system calls.

That seems outside the purview of anything we're altering here, so I
have to wonder whether something else is going on. Oliver, can you
rerun this test and see whether this regression is reproducible?
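
For reference, here is how I understand that metric to be derived. This
is just a sketch of the measurement, not the actual fs_mark source, and
every name in it is made up:

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

/* Microseconds elapsed between two timestamps. */
static unsigned long long usec_delta(const struct timeval *a,
				     const struct timeval *b)
{
	return (b->tv_sec - a->tv_sec) * 1000000ULL +
	       (b->tv_usec - a->tv_usec);
}

int main(void)
{
	struct timeval run_start, run_end, op_start, op_end;
	unsigned long long syscall_us = 0, total_us;
	char buf[4096], name[32];
	int i, fd;

	memset(buf, 'x', sizeof(buf));
	gettimeofday(&run_start, NULL);
	for (i = 0; i < 100; i++) {
		snprintf(name, sizeof(name), "file-%d", i);

		/* Time only the file-writing syscalls for this file. */
		gettimeofday(&op_start, NULL);
		fd = open(name, O_CREAT | O_WRONLY | O_TRUNC, 0644);
		if (fd >= 0) {
			write(fd, buf, sizeof(buf)); /* one 4k file, as in the job */
			fsync(fd);                   /* fsyncBeforeClose */
			close(fd);
		}
		gettimeofday(&op_end, NULL);
		syscall_us += usec_delta(&op_start, &op_end);
	}
	gettimeofday(&run_end, NULL);
	total_us = usec_delta(&run_start, &run_end);

	/* Whatever wall time was not spent in those syscalls. */
	printf("app overhead: %llu usec\n", total_us - syscall_us);
	return 0;
}

If that reading is right, this patch should only be able to move the
timed syscall portion, not the leftover bucket that app_overhead
reports.
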
Thanks,
> 53442 -17.3% 44172 fsmark.time.voluntary_context_switches
> 2907 -5.7% 2742 perf-stat.i.context-switches
> 2902 -5.7% 2737 perf-stat.ps.context-switches
> 1724787 -1.0% 1706808 proc-vmstat.numa_hit
> 1592345 -1.1% 1574310 proc-vmstat.numa_local
> 24.87 ± 33% -38.9% 15.20 ± 12% sched_debug.cpu.nr_uninterruptible.max
> 4.36 ± 9% -17.1% 3.61 ± 10% sched_debug.cpu.nr_uninterruptible.stddev
> 97541 ± 20% -32.7% 65610 ± 18% numa-meminfo.node0.KReclaimable
> 97541 ± 20% -32.7% 65610 ± 18% numa-meminfo.node0.SReclaimable
> 256796 ± 9% -18.7% 208805 ± 13% numa-meminfo.node0.Slab
> 2307911 ± 52% +68.5% 3888971 ± 5% numa-meminfo.node1.MemUsed
> 193326 ± 12% +24.7% 241049 ± 12% numa-meminfo.node1.Slab
> 0.90 ± 27% -0.5 0.36 ±103% perf-profile.calltrace.cycles-pp.evsel__read_counter.read_counters.process_interval.dispatch_events.cmd_stat
> 0.36 ± 70% +0.2 0.58 ± 3% perf-profile.calltrace.cycles-pp.btrfs_commit_transaction.btrfs_sync_file.btrfs_do_write_iter.do_iter_readv_writev.vfs_iter_write
> 0.52 ± 47% +0.3 0.78 ± 8% perf-profile.calltrace.cycles-pp.vfs_write.ksys_write.do_syscall_64.entry_SYSCALL_64_after_hwframe.write
> 1.62 ± 12% +0.3 1.93 ± 9% perf-profile.calltrace.cycles-pp.__handle_mm_fault.handle_mm_fault.do_user_addr_fault.exc_page_fault.asm_exc_page_fault
> 1.22 ± 21% -0.3 0.89 ± 10% perf-profile.children.cycles-pp.readn
> 0.46 ± 32% -0.2 0.24 ± 34% perf-profile.children.cycles-pp.__close
> 0.45 ± 32% -0.2 0.22 ± 15% perf-profile.children.cycles-pp.__x64_sys_close
> 0.40 ± 29% -0.2 0.18 ± 38% perf-profile.children.cycles-pp.__fput
> 0.31 ± 23% -0.2 0.16 ± 33% perf-profile.children.cycles-pp.irq_work_tick
> 0.17 ± 51% -0.1 0.03 ±111% perf-profile.children.cycles-pp.nfs_file_release
> 0.16 ± 43% -0.1 0.03 ±111% perf-profile.children.cycles-pp.__put_nfs_open_context
> 0.26 ± 18% -0.1 0.15 ± 34% perf-profile.children.cycles-pp.perf_event_task_tick
> 0.15 ± 41% -0.1 0.03 ±108% perf-profile.children.cycles-pp.get_free_pages_noprof
> 0.18 ± 55% -0.1 0.06 ± 32% perf-profile.children.cycles-pp.native_apic_mem_eoi
> 0.18 ± 32% -0.1 0.07 ± 81% perf-profile.children.cycles-pp.flush_end_io
> 0.17 ± 41% -0.1 0.07 ± 93% perf-profile.children.cycles-pp.mas_store_gfp
> 0.52 ± 5% +0.1 0.58 ± 3% perf-profile.children.cycles-pp.btrfs_commit_transaction
> 0.02 ±141% +0.1 0.08 ± 42% perf-profile.children.cycles-pp.uptime_proc_show
> 0.02 ±141% +0.1 0.08 ± 44% perf-profile.children.cycles-pp.get_zeroed_page_noprof
> 0.02 ±141% +0.1 0.09 ± 35% perf-profile.children.cycles-pp.__rmqueue_pcplist
> 0.14 ± 12% +0.1 0.28 ± 29% perf-profile.children.cycles-pp.hrtimer_next_event_without
> 0.47 ± 27% +0.2 0.67 ± 19% perf-profile.children.cycles-pp.__mmap
> 0.70 ± 21% +0.2 0.91 ± 7% perf-profile.children.cycles-pp.vfs_write
> 0.74 ± 20% +0.2 0.96 ± 9% perf-profile.children.cycles-pp.ksys_write
> 0.73 ± 21% +0.3 1.00 ± 7% perf-profile.children.cycles-pp.copy_process
> 1.05 ± 13% +0.3 1.38 ± 10% perf-profile.children.cycles-pp.kernel_clone
> 0.28 ± 22% -0.1 0.13 ± 35% perf-profile.self.cycles-pp.irq_work_tick
> 0.18 ± 55% -0.1 0.06 ± 32% perf-profile.self.cycles-pp.native_apic_mem_eoi
>
>
>
>
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
>
>
--
Jeff Layton <jlayton@kernel.org>