From: Marcelo Tosatti <mtosatti@redhat.com>
To: Hillf Danton <hdanton@sina.com>
Cc: Leonardo Bras <leobras@redhat.com>,
Michal Hocko <mhocko@kernel.org>,
Roman Gushchin <roman.gushchin@linux.dev>,
linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [RFC PATCH v1 0/4] Introduce QPW for per-cpu operations
Date: Wed, 11 Sep 2024 00:04:46 -0300
Message-ID: <ZuEIzngSx36Gx8l/@tpad>
In-Reply-To: <20240905221908.1960-1-hdanton@sina.com>

On Fri, Sep 06, 2024 at 06:19:08AM +0800, Hillf Danton wrote:
> On Tue, 23 Jul 2024 14:14:34 -0300 Marcelo Tosatti <mtosatti@redhat.com>
> > On Sat, Jun 22, 2024 at 12:58:08AM -0300, Leonardo Bras wrote:
> > > The problem:
> > > Some places in the kernel implement a parallel programming strategy
> > > consisting on local_locks() for most of the work, and some rare remote
> > > operations are scheduled on target cpu. This keeps cache bouncing low since
> > > cacheline tends to be mostly local, and avoids the cost of locks in non-RT
> > > kernels, even though the very few remote operations will be expensive due
> > > to scheduling overhead.
> > >
> > > On the other hand, for RT workloads this can represent a problem: getting
> > > an important workload scheduled out to deal with remote requests is
> > > sure to introduce unexpected deadline misses.
> >
> > Another hang with a busy polling workload (kernel update hangs on
> > grub2-probe):
> >
> > [342431.665417] INFO: task grub2-probe:24484 blocked for more than 622 seconds.
> > [342431.665458] Tainted: G W X ------- --- 5.14.0-438.el9s.x86_64+rt #1
> > [342431.665488] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [342431.665515] task:grub2-probe state:D stack:0 pid:24484 ppid:24455 flags:0x00004002
> > [342431.665523] Call Trace:
> > [342431.665525] <TASK>
> > [342431.665527] __schedule+0x22a/0x580
> > [342431.665537] schedule+0x30/0x80
> > [342431.665539] schedule_timeout+0x153/0x190
> > [342431.665543] ? preempt_schedule_thunk+0x16/0x30
> > [342431.665548] ? preempt_count_add+0x70/0xa0
> > [342431.665554] __wait_for_common+0x8b/0x1c0
> > [342431.665557] ? __pfx_schedule_timeout+0x10/0x10
> > [342431.665560] __flush_work.isra.0+0x15b/0x220
>
> The fresh new flush_percpu_work() is nop with CONFIG_PREEMPT_RT enabled, why
> are you testing it with 5.14.0-438.el9s.x86_64+rt instead of mainline? Or what
> are you testing?

I am demonstrating a type of bug that can happen without Leo's patch.

> BTW the hang fails to show the unexpected deadline misses.

Yes, because in this case the realtime app with FIFO priority never
stops running; grub2-probe therefore hangs and is unable to execute:

> > [342431.665417] INFO: task grub2-probe:24484 blocked for more than 622 seconds
>
> > [342431.665565] ? __pfx_wq_barrier_func+0x10/0x10
> > [342431.665570] __lru_add_drain_all+0x17d/0x220
> > [342431.665576] invalidate_bdev+0x28/0x40
> > [342431.665583] blkdev_common_ioctl+0x714/0xa30
> > [342431.665588] ? bucket_table_alloc.isra.0+0x1/0x150
> > [342431.665593] ? cp_new_stat+0xbb/0x180
> > [342431.665599] blkdev_ioctl+0x112/0x270
> > [342431.665603] ? security_file_ioctl+0x2f/0x50
> > [342431.665609] __x64_sys_ioctl+0x87/0xc0

Does that make sense now?

Thanks!
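The hang discussed in this message can be illustrated with a small toy model.
This is a sketch in Python, not kernel code: `Cpu`, `drain_all_by_queueing`
and `drain_all_remotely` are hypothetical names invented for the illustration.
The first function mimics queueing per-CPU work and waiting for it, as
__lru_add_drain_all does in the trace. The second is only a rough analogy for
the direction of the series (the requester does the work itself under a
per-CPU lock) and makes no claim about how the patches implement it.

```python
import threading
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Cpu:
    """Toy CPU: per-cpu work runs only when no FIFO task is spinning."""
    fifo_task_spinning: bool = False
    pending: List[Callable[[], None]] = field(default_factory=list)
    lock: threading.Lock = field(default_factory=threading.Lock)

    def tick(self) -> None:
        # A spinning FIFO task outranks the kworker, so queued work starves.
        if self.fifo_task_spinning:
            return
        while self.pending:
            self.pending.pop(0)()


def drain_all_by_queueing(cpus: List[Cpu], drain: Callable[[int], None],
                          timeout_ticks: int) -> Optional[int]:
    """Queue `drain` on every CPU, then wait for all items to finish.

    Returns the number of ticks waited, or None on timeout. The timeout
    exists only so the model terminates; a flush_work-style wait blocks
    until the work item completes.
    """
    done = set()
    for idx, cpu in enumerate(cpus):
        def work(idx: int = idx) -> None:
            drain(idx)
            done.add(idx)
        cpu.pending.append(work)
    for tick in range(1, timeout_ticks + 1):
        for cpu in cpus:
            cpu.tick()
        if len(done) == len(cpus):
            return tick
    return None


def drain_all_remotely(cpus: List[Cpu], drain: Callable[[int], None]) -> None:
    """Requester performs each CPU's drain itself, under that CPU's lock."""
    for idx, cpu in enumerate(cpus):
        with cpu.lock:
            drain(idx)
```

With one CPU marked `fifo_task_spinning=True`, `drain_all_by_queueing` never
completes in this model and returns None, which corresponds to grub2-probe
sitting in state D. `drain_all_remotely` finishes regardless, because it never
needs the busy CPU to schedule anything.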