linux-mm.kvack.org archive mirror
From: Mateusz Guzik <mjguzik@gmail.com>
To: Yu Zhao <yuzhao@google.com>
Cc: Bharata B Rao <bharata@amd.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	 nikunj@amd.com, "Upadhyay, Neeraj" <Neeraj.Upadhyay@amd.com>,
	 Andrew Morton <akpm@linux-foundation.org>,
	David Hildenbrand <david@redhat.com>,
	willy@infradead.org,  vbabka@suse.cz, kinseyho@google.com,
	Mel Gorman <mgorman@suse.de>
Subject: Re: Hard and soft lockups with FIO and LTP runs on a large system
Date: Sat, 20 Jul 2024 09:57:56 +0200
Message-ID: <CAGudoHEd9FL+Mhk8GdFm+3gvBk35ho-BTX-f7jn=O5Lz2mij-Q@mail.gmail.com>
In-Reply-To: <CAOUHufbkhMZYz20aM_3rHZ3OcK4m2puji2FGpUpn_-DevGk3Kg@mail.gmail.com>

On Fri, Jul 19, 2024 at 10:21 PM Yu Zhao <yuzhao@google.com> wrote:
> I can't come up with any reasonable band-aid at this moment, i.e.,
> something not too ugly to work around a more fundamental scalability
> problem.
>
> Before I give up: what type of dirty data was written back to the nvme
> device? Was it page cache or swap?
>

With my corporate employee hat on, I would like to note a few
things.

1. there are definitely bugs here and someone(tm) should sort them out(R)

however....

2. the real goal is presumably to beat the kernel into shape where
production kernels no longer suffer lockups running this workload on
this hardware
3. the flamegraph (to be found in [1]) shows expensive debug options
enabled, notably for the preemption count (search for preempt_count_sub
to see it)
4. I'm told the lruvec problem is being worked on (but no ETA) and I
don't think the above justifies considering any hacks or otherwise
putting more pressure on it

It is plausible that disabling the aforementioned debug options will be
good enough.

Apart from that, I note percpu_counter_add_batch (+ irq debug) accounts
for 5.8% of cpu time. This will of course go down if irq tracing is
disabled, but it so happens I optimized this routine to be faster
single-threaded (in particular by dodging the interrupt trip). The
patch is hanging out in the mm tree [2] and is trivially applicable
for testing.

Even if none of the debug options can be changed, the patch should drop
percpu_counter_add_batch to 1.5% or so, which may or may not have a
side effect of avoiding the lockup problem.
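
For anyone not familiar with the batching idea behind
percpu_counter_add_batch, here is a rough user-space sketch of it. It
is not the kernel code and not the patch in [2]: the names
(batched_counter, counter_add_batch, counter_sum, NSLOTS) are made up
for illustration, C11 atomics stand in for per-cpu operations, and a
plain atomic add stands in for the locked slow path. The point it tries
to show is that small updates stay in a local slot, updated with a
compare-and-exchange loop, and the shared total is only touched once
the local delta reaches the batch size.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

#define NSLOTS 4	/* hypothetical stand-in for the number of CPUs */

struct batched_counter {
	atomic_long count;		/* shared total, contended */
	atomic_long local[NSLOTS];	/* per-slot pending deltas */
};

/* Add 'amount' via slot 'slot', folding into the total at 'batch'. */
void counter_add_batch(struct batched_counter *c, int slot,
		       long amount, long batch)
{
	long old, new;

	assert(slot >= 0 && slot < NSLOTS);
	old = atomic_load_explicit(&c->local[slot], memory_order_relaxed);

	/* Fast path: keep the delta local while it stays below 'batch'. */
	for (;;) {
		new = old + amount;
		if (labs(new) >= batch)
			break;
		/* On failure 'old' is reloaded and the loop retries. */
		if (atomic_compare_exchange_weak_explicit(&c->local[slot],
				&old, new, memory_order_relaxed,
				memory_order_relaxed))
			return;
	}

	/* Slow path: drain the local delta into the shared total. */
	old = atomic_exchange_explicit(&c->local[slot], 0,
				       memory_order_relaxed);
	atomic_fetch_add(&c->count, old + amount);
}

/* Precise read: shared total plus every pending local delta. */
long counter_sum(struct batched_counter *c)
{
	long sum = atomic_load(&c->count);

	for (int i = 0; i < NSLOTS; i++)
		sum += atomic_load_explicit(&c->local[i],
					    memory_order_relaxed);
	return sum;
}
```

With a batch of 4, ten increments of 1 through a single slot touch the
shared total only twice, and counter_sum() still reports 10.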

[1]: https://lore.kernel.org/lkml/584ecb5e-b1fc-4b43-ba36-ad396d379fad@amd.com/
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/commit/?h=mm-everything&id=51d821654be4286b005ad2b7dc8b973d5008a2ec

-- 
Mateusz Guzik <mjguzik gmail.com>


