From: Jesper Dangaard Brouer <jbrouer@redhat.com>
To: "Toke Høiland-Jørgensen" <toke@redhat.com>,
"Jakub Kicinski" <kuba@kernel.org>
Cc: brouer@redhat.com, Ilias Apalodimas <ilias.apalodimas@linaro.org>,
netdev@vger.kernel.org, Eric Dumazet <eric.dumazet@gmail.com>,
linux-mm@kvack.org, Mel Gorman <mgorman@techsingularity.net>,
lorenzo@kernel.org, linyunsheng@huawei.com, bpf@vger.kernel.org,
"David S. Miller" <davem@davemloft.net>,
Paolo Abeni <pabeni@redhat.com>,
Andrew Morton <akpm@linux-foundation.org>,
willy@infradead.org
Subject: Re: [PATCH RFC net-next/mm V3 1/2] page_pool: Remove workqueue in new shutdown scheme
Date: Wed, 3 May 2023 17:49:34 +0200
Message-ID: <3a5a28c4-01a3-793c-6969-475aba3ff3b5@redhat.com>
In-Reply-To: <87ednxbr3c.fsf@toke.dk>
On 03/05/2023 13.18, Toke Høiland-Jørgensen wrote:
> Jakub Kicinski <kuba@kernel.org> writes:
>
>> On Fri, 28 Apr 2023 18:16:19 +0200 Jesper Dangaard Brouer wrote:
>>> This removes the workqueue scheme that periodically tests whether
>>> inflight has reached zero, such that page_pool memory can be freed.
>>>
>>> This change adds fast-path free checking for a shutdown flag
>>> bit after returning PP pages.
>>
>> We can remove the warning without removing the entire delayed freeing
>> scheme. I definitely like the SHUTDOWN flag and patch 2 but I'm a bit
>> less clear on why the complexity of datapath freeing is justified.
>> Can you explain?
>
> You mean just let the workqueue keep rescheduling itself every minute
> for the (potentially) hours that skbs will stick around? Seems a bit
> wasteful, doesn't it? :)
I agree that this workqueue that keeps rescheduling is wasteful.
It actually reschedules every second, which is even more wasteful.
NIC drivers will have many HW RX-queues, with separate PP instances,
each of which can start a workqueue that reschedules every second.
Eric has convinced me that SKBs can "stick around" for longer than the
assumptions in PP. The old PP assumptions came from the XDP-return path.
It is time to clean up.
>
> We did see an issue where creating and tearing down lots of page pools
> in a short period of time caused significant slowdowns due to the
> workqueue mechanism. Lots being "thousands per second". This is possible
> using the live packet mode of bpf_prog_run() for XDP, which will set up
> and destroy a page pool for each syscall...
Yes, the XDP live packet mode of bpf_prog_run is IMHO abusing the
page_pool API. We should fix that somehow, at least the case where live
packet mode injects only a single packet but still creates a PP
instance. The PP in live packet mode IMHO only makes sense when
repeatedly sending packets that get recycled and are pre-inited by PP.
This use of PP does exemplify why it is problematic to keep the workqueue.
I have considered (and could be convinced of) delaying the free via
call_rcu, but that also creates an unfortunate backlog of work in the
case of live packet mode of bpf_prog_run.
--Jesper
Thread overview: 13+ messages
2023-04-28 16:16 [PATCH RFC net-next/mm V3 0/2] page_pool: new approach for leak detection and shutdown phase Jesper Dangaard Brouer
2023-04-28 16:16 ` [PATCH RFC net-next/mm V3 1/2] page_pool: Remove workqueue in new shutdown scheme Jesper Dangaard Brouer
2023-04-28 21:38 ` Toke Høiland-Jørgensen
2023-05-03 15:21 ` Jesper Dangaard Brouer
2023-05-03 2:33 ` Jakub Kicinski
2023-05-03 11:18 ` Toke Høiland-Jørgensen
2023-05-03 15:49 ` Jesper Dangaard Brouer [this message]
2023-05-04 1:47 ` Jakub Kicinski
2023-05-04 2:42 ` Yunsheng Lin
2023-05-04 13:48 ` Jesper Dangaard Brouer
2023-05-05 0:54 ` Yunsheng Lin
2023-05-06 13:11 ` Yunsheng Lin
2023-04-28 16:16 ` [PATCH RFC net-next/mm V3 2/2] mm/page_pool: catch page_pool memory leaks Jesper Dangaard Brouer