Date: Tue, 4 Mar 2025 16:53:09 +0000
From: Matthew Wilcox
To: Hannes Reinecke
Cc: Vlastimil Babka, Boris Pismenny, John Fastabend, Jakub Kicinski,
    Sagi Grimberg, "linux-nvme@lists.infradead.org",
    "linux-block@vger.kernel.org", linux-mm@kvack.org, Harry Yoo,
    "netdev@vger.kernel.org"
Subject: Re: Kernel oops with 6.14 when enabling TLS
References: <15be2446-f096-45b9-aaf3-b371a694049d@suse.com>
    <95b0b93b-3b27-4482-8965-01963cc8beb8@suse.cz>
    <6877dfb1-9f44-4023-bb6d-e7530d03e33c@suse.com>

On Tue, Mar 04, 2025 at 05:32:32PM +0100, Hannes Reinecke wrote:
> On 3/4/25 17:14, Matthew Wilcox wrote:
> > I thought we'd done all the work needed to get rid of these pointless
> > refcount bumps.  Turns out that's only on the block side (eg commit
> > e4cc64657bec).  So what does networking need in order to understand
> > that some iovecs do not need to mess with the refcount?
>
> The network stack needs to get hold of the page while transmission is
> ongoing, as there is potentially rather deep queueing involved,
> requiring several calls to sendmsg() and friends before the page is finally
> transmitted. And maybe some post-processing (checksums,
> digests, you name it), too, all of which require the page to be there.
>
> It's all so jumbled up ... personally, I would _love_ to do away with
> __iov_iter_get_pages_alloc(). Allocating a page array? Seriously?
>
> And the problem with that is that it always takes a page(!) reference,
> completely oblivious to the fact whether you even _can_ take a page
> reference (eg for tail pages); we've hit this problem several times now
> (check for sendpage_ok() ...).

Calling get_page() / put_page() on a tail page is fine -- that just
redirects to the head page.  But calling it on a slab never made any
sense; at best it gets you the equivalent of TYPESAFE_BY_RCU -- that is,
the object can be freed and reallocated, but the underlying slab will not
be reallocated to some other purpose.

> But that's not the real issue; the real issue is that the page reference is
> taken down in the very bowels of __iov_iter_get_pages_alloc(), but needs
> to be undone by the _caller_. Who might (or might not) have an idea
> that he needs to drop the reference here.
> That's why there is no straightforward conversion; you need to audit
> each and every caller and try to find out where the page reference (if any)
> is dropped.
> Bah.
>
> Can't we (at the very least) leave it to the caller of
> __iov_iter_get_pages() to get a page reference (he has access to the page
> array, after all ...)? That would make the interface slightly
> better, and it'll be far more obvious to the caller what needs
> to be done.

Right, that's what happened in the block layer.  We mark the bio with
BIO_PAGE_PINNED if the pincount needs to be dropped.  For a transitional
period, we had BIO_PAGE_REFFED, which indicated that the page refcount
needed to be dropped.  Perhaps there's something similar that networking
could be doing.
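
To make the flag idea concrete, here is a minimal userspace sketch of the
pattern, not kernel code.  Everything in it is hypothetical and only there
for illustration: struct model_page stands in for struct page, struct
model_msg for whatever container networking would queue pages on, and
MODEL_MSG_PAGE_REFFED plays the role that BIO_PAGE_REFFED played for bios.
The point is only that the decision to take references is recorded on the
container, so the release path knows what to undo without auditing callers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct page: only a refcount. */
struct model_page {
	int refcount;
};

/* Hypothetical flag, playing the role BIO_PAGE_REFFED played for bios. */
#define MODEL_MSG_PAGE_REFFED 0x1u
#define MODEL_MSG_MAX_PAGES 8

/* Hypothetical container for pages queued for transmission. */
struct model_msg {
	unsigned int flags;
	size_t nr_pages;
	struct model_page *pages[MODEL_MSG_MAX_PAGES];
};

/* The caller decides up front whether this message holds page references. */
static void model_msg_init(struct model_msg *msg, bool take_refs)
{
	msg->flags = take_refs ? MODEL_MSG_PAGE_REFFED : 0;
	msg->nr_pages = 0;
}

/*
 * Queue a page; bump its refcount only if the message is marked as reffed.
 * Returns false when the message is full.
 */
static bool model_msg_add_page(struct model_msg *msg, struct model_page *page)
{
	if (msg->nr_pages == MODEL_MSG_MAX_PAGES)
		return false;
	if (msg->flags & MODEL_MSG_PAGE_REFFED)
		page->refcount++;
	msg->pages[msg->nr_pages++] = page;
	return true;
}

/* Completion path: the flag on the message says whether refs must be dropped. */
static void model_msg_release(struct model_msg *msg)
{
	size_t i;

	if (msg->flags & MODEL_MSG_PAGE_REFFED)
		for (i = 0; i < msg->nr_pages; i++)
			msg->pages[i]->refcount--;
	msg->nr_pages = 0;
}
```

A message initialised with take_refs = false (say, for slab-backed
buffers) never touches the refcount on either path, while one initialised
with take_refs = true gets a matching drop in model_msg_release().  How
this would map onto real networking structures is left open here.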