From: Vlastimil Babka <vbabka@suse.cz>
To: Hannes Reinecke <hare@suse.de>, Hannes Reinecke <hare@suse.com>,
Matthew Wilcox <willy@infradead.org>
Cc: Boris Pismenny <borisp@nvidia.com>,
John Fastabend <john.fastabend@gmail.com>,
Jakub Kicinski <kuba@kernel.org>,
Sagi Grimberg <sagi@grimberg.me>,
"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
linux-mm@kvack.org, Harry Yoo <harry.yoo@oracle.com>,
"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
David Howells <dhowells@redhat.com>
Subject: Re: Kernel oops with 6.14 when enabling TLS
Date: Thu, 6 Mar 2025 10:15:12 +0100
Message-ID: <c9110425-b584-4be6-af89-542d97859250@suse.cz>
In-Reply-To: <7439cb2f-6a97-494b-aa10-e9bebb218b58@suse.de>
On 3/5/25 12:43, Hannes Reinecke wrote:
> On 3/5/25 09:58, Vlastimil Babka wrote:
>> On 3/5/25 09:20, Hannes Reinecke wrote:
>>> On 3/4/25 20:44, Vlastimil Babka wrote:
>>>> On 3/4/25 20:39, Hannes Reinecke wrote:
>>> [ .. ]
>>>>>
>>>>> Good news and bad news ...
>>>>> Good news: TLS works again!
>>>>> Bad news: no errors.
>>>>
>>>> Wait, did you add a WARN_ON_ONCE() to the put_page() as I suggested? If yes
>>>> and there was no error, it would have to be leaking the page. Or the path
>>>> uses folio_put() and we'd need to put the warning there.
>>>>
>>> That triggers:
>> ...
>>> Not surprisingly, though, as the original code did a get_page(), so
>>> there had to be a corresponding put_page() somewhere.
>>
>> Is it this one? If there's no more warning afterwards, that should be it.
>>
>> diff --git a/net/core/skmsg.c b/net/core/skmsg.c
>> index 61f3f3d4e528..b37d99cec069 100644
>> --- a/net/core/skmsg.c
>> +++ b/net/core/skmsg.c
>> @@ -182,9 +182,14 @@ static int sk_msg_free_elem(struct sock *sk, struct sk_msg *msg, u32 i,
>>
>>  	/* When the skb owns the memory we free it from consume_skb path. */
>>  	if (!msg->skb) {
>> +		struct folio *folio;
>> +
>>  		if (charge)
>>  			sk_mem_uncharge(sk, len);
>> -		put_page(sg_page(sge));
>> +
>> +		folio = page_folio(sg_page(sge));
>> +		if (!folio_test_slab(folio))
>> +			folio_put(folio);
>>  	}
>>  	memset(sge, 0, sizeof(*sge));
>>  	return len;
>>
>>
> Oh, sure. But what annoys me: why do we have to care?
>
> When doing I/O _all_ data is stuffed into bvecs via
> bio_add_page(), and after that information about the
> origin is lost; any iteration on the bio will be a bvec
> iteration.
> Previously we could just do a bvec iteration, get a reference
> for each page, and start processing.

AFAIU there's BIO_PAGE_PINNED that controls whether the pages are pinned, as
there are use cases where it makes sense to do that (userspace pages?). And
__bio_release_pages() can be removing the last pin and freeing the pages.
But this is a case where the buffer is a kmalloc() allocation, so somebody
has to do the corresponding kfree() when the messages are processed. A pin
on the slab folio where the kmalloc() object resides doesn't help anything,
and as willy says it's just unnecessary overhead of atomic operations.

> Now suddenly the caller has to check if it's a slab page and don't
> get a reference for that. Not only that, he also has to remember
> to _not_ drop the reference when he's done.

The caller did kmalloc() and will have to do kfree(). I guess it's about
telling the intermediate layers, via something similar to BIO_PAGE_PINNED,
whether the pages should be pinned or not.

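To make that idea concrete, here is a rough userspace model, not kernel
code and not a proposed patch. All names (model_page, model_msg,
MODEL_PAGES_PINNED, model_msg_pin(), model_msg_release()) are hypothetical
and only mimic the spirit of BIO_PAGE_PINNED: the submitter sets a flag, and
the intermediate layer takes and drops references only when the flag is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_page {
	int refcount;	/* stands in for the page/folio reference count */
	bool is_slab;	/* true if the memory backs a kmalloc()-like object */
};

struct model_vec {
	struct model_page *page;
	size_t len;
};

/* Hypothetical flag, analogous in spirit to BIO_PAGE_PINNED. */
#define MODEL_PAGES_PINNED 0x1u

struct model_msg {
	unsigned int flags;
	size_t nr_vecs;
	struct model_vec vecs[4];
};

/* Intermediate layer: take references only if the submitter asked for it. */
static void model_msg_pin(struct model_msg *msg)
{
	if (!(msg->flags & MODEL_PAGES_PINNED))
		return;
	for (size_t i = 0; i < msg->nr_vecs; i++)
		msg->vecs[i].page->refcount++;
}

/* Intermediate layer: drop exactly the references model_msg_pin() took. */
static void model_msg_release(struct model_msg *msg)
{
	if (!(msg->flags & MODEL_PAGES_PINNED))
		return;
	for (size_t i = 0; i < msg->nr_vecs; i++) {
		assert(msg->vecs[i].page->refcount > 0);
		msg->vecs[i].page->refcount--;
	}
}
```

In this model a kmalloc()-backed message is submitted with flags == 0, so its
refcount is never touched and its lifetime stays with whoever does the
kfree(). The intermediate layer never has to ask whether a page is a slab page.
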
> And, of course, tracing get_page() and the corresponding put_page()
> calls through all the layers.
> Really?
>
> Cheers,
>
> Hannes