Re: Kernel oops with 6.14 when enabling TLS

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Vlastimil Babka <vbabka@suse.cz>
To: Hannes Reinecke <hare@suse.de>, Hannes Reinecke <hare@suse.com>,
	Matthew Wilcox <willy@infradead.org>
Cc: Boris Pismenny <borisp@nvidia.com>,
	John Fastabend <john.fastabend@gmail.com>,
	Jakub Kicinski <kuba@kernel.org>,
	Sagi Grimberg <sagi@grimberg.me>,
	"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
	"linux-block@vger.kernel.org" <linux-block@vger.kernel.org>,
	linux-mm@kvack.org, Harry Yoo <harry.yoo@oracle.com>,
	"netdev@vger.kernel.org" <netdev@vger.kernel.org>,
	David Howells <dhowells@redhat.com>
Subject: Re: Kernel oops with 6.14 when enabling TLS
Date: Thu, 6 Mar 2025 10:15:12 +0100	[thread overview]
Message-ID: <c9110425-b584-4be6-af89-542d97859250@suse.cz> (raw)
In-Reply-To: <7439cb2f-6a97-494b-aa10-e9bebb218b58@suse.de>

On 3/5/25 12:43, Hannes Reinecke wrote:
> On 3/5/25 09:58, Vlastimil Babka wrote:
>> On 3/5/25 09:20, Hannes Reinecke wrote:
>>> On 3/4/25 20:44, Vlastimil Babka wrote:
>>>> On 3/4/25 20:39, Hannes Reinecke wrote:
>>> [ .. ]
>>>>>
>>>>> Good news and bad news ...
>>>>> Good news: TLS works again!
>>>>> Bad news: no errors.
>>>>
>>>> Wait, did you add a WARN_ON_ONCE() to the put_page() as I suggested? If yes
>>>> and there was no error, it would have to be leaking the page. Or the path
>>>> uses folio_put() and we'd need to put the warning there.
>>>>
>>> That triggers:
>> ...
>>> Not surprisingly, though, as the original code did a get_page(), so
>>> there had to be a corresponding put_page() somewhere.
>> 
>> Is is this one? If there's no more warning afterwards, that should be it.
>> 
>> diff --git a/net/core/skmsg.c b/net/core/skmsg.c
>> index 61f3f3d4e528..b37d99cec069 100644
>> --- a/net/core/skmsg.c
>> +++ b/net/core/skmsg.c
>> @@ -182,9 +182,14 @@ static int sk_msg_free_elem(struct sock *sk, struct sk_msg *msg, u32 i,
>>   
>>          /* When the skb owns the memory we free it from consume_skb path. */
>>          if (!msg->skb) {
>> +               struct folio *folio;
>> +
>>                  if (charge)
>>                          sk_mem_uncharge(sk, len);
>> -               put_page(sg_page(sge));
>> +
>> +               folio = page_folio(sg_page(sge));
>> +               if (!folio_test_slab(folio))
>> +                       folio_put(folio);
>>          }
>>          memset(sge, 0, sizeof(*sge));
>>          return len;
>> 
>> 
> Oh, sure. But what annoys me: why do we have to care?
> 
> When doing I/O _all_ data is stuffed into bvecs via
> bio_add_page(), and after that information about the
> origin is lost; any iteration on the bio will be a bvec
> iteration.
> Previously we could just do a bvec iteration, get a reference
> for each page, and start processing.

AFAIU there's BIO_PAGE_PINNED that controls whether the pages are pinned, as
there are usecases where it makes sense to do that (userspace pages?). And
__bio_release_pages() can be removing the last pin and freeing the pages.

But this is a case where the buffer is a kmalloc() allocation, so somebody
has to do the corresponding kfree() when the messages are processed. A pin
on the slab folio where the kmalloc() resides helps nothing and as willy
says it's just unnecessary overhead of atomic allocations.

> Now suddenly the caller has to check if it's a slab page and don't
> get a reference for that. Not only that, he also has to remember
> to _not_ drop the reference when he's done.

The caller did kmalloc() and will have to do kfree(). I guess it's about
telling the intermediate layers via something similar like BIO_PAGE_PINNED
whether the pages should be pinned or not.

> And, of course, tracing get_page() and the corresponding put_page()
> calls through all the layers.
> Really?
> 
> Cheers,
> 
> Hannes

     prev parent reply	other threads:[~2025-03-06  9:15 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <08c29e4b-2f71-4b6d-8046-27e407214d8c@suse.com>
2025-03-03  7:48 ` Hannes Reinecke
2025-03-03 11:06   ` Hannes Reinecke
2025-03-03 12:57     ` Hannes Reinecke
2025-03-03 13:57     ` Matthew Wilcox
2025-03-03 14:05       ` Hannes Reinecke
2025-03-03 14:27   ` Matthew Wilcox
2025-03-03 14:42     ` Matthew Wilcox
2025-03-03 15:12       ` Vlastimil Babka
2025-03-03 15:39       ` Hannes Reinecke
2025-03-03 15:48         ` Matthew Wilcox
2025-03-03 16:15           ` Vlastimil Babka
2025-03-03 22:02             ` Vlastimil Babka
2025-03-04  7:58               ` Hannes Reinecke
2025-03-04  8:18                 ` Vlastimil Babka
2025-03-04 10:20                   ` Hannes Reinecke
2025-03-04 10:26                     ` Vlastimil Babka
2025-03-04 15:11                       ` Hannes Reinecke
2025-03-04 15:29                       ` Vlastimil Babka
2025-03-04 16:20                         ` Hannes Reinecke
2025-03-04 16:14                       ` Matthew Wilcox
2025-03-04 16:32                         ` Hannes Reinecke
2025-03-04 16:53                           ` Matthew Wilcox
2025-03-04 18:05                             ` Matthew Wilcox
2025-03-04 18:31                               ` Vlastimil Babka
2025-03-04 19:39                               ` Hannes Reinecke
2025-03-04 19:44                                 ` Vlastimil Babka
2025-03-05  7:14                                   ` Hannes Reinecke
2025-03-05  8:20                                   ` Hannes Reinecke
2025-03-05  8:58                                     ` Vlastimil Babka
2025-03-05 11:43                                       ` Hannes Reinecke
2025-03-05 18:11                                         ` Networking people smell funny and make poor life choices Matthew Wilcox
2025-03-06  0:46                                           ` Cong Wang
2025-03-12 15:09                                           ` Christoph Hellwig
2025-03-12 18:28                                             ` James R. Bergsten
2025-03-13  9:43                                           ` David Laight
2025-03-06  9:15                                         ` Vlastimil Babka [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c9110425-b584-4be6-af89-542d97859250@suse.cz \
    --to=vbabka@suse.cz \
    --cc=borisp@nvidia.com \
    --cc=dhowells@redhat.com \
    --cc=hare@suse.com \
    --cc=hare@suse.de \
    --cc=harry.yoo@oracle.com \
    --cc=john.fastabend@gmail.com \
    --cc=kuba@kernel.org \
    --cc=linux-block@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nvme@lists.infradead.org \
    --cc=netdev@vger.kernel.org \
    --cc=sagi@grimberg.me \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox