Date: Tue, 4 Mar 2025 16:14:29 +0000
From: Matthew Wilcox
To: Vlastimil Babka
Cc: Hannes Reinecke, Boris Pismenny, John Fastabend, Jakub Kicinski, Sagi Grimberg, "linux-nvme@lists.infradead.org", "linux-block@vger.kernel.org", linux-mm@kvack.org, Harry Yoo, "netdev@vger.kernel.org"
Subject: Re: Kernel oops with 6.14 when enabling TLS
References: <15be2446-f096-45b9-aaf3-b371a694049d@suse.com> <95b0b93b-3b27-4482-8965-01963cc8beb8@suse.cz> <6877dfb1-9f44-4023-bb6d-e7530d03e33c@suse.com>

On Tue, Mar 04, 2025 at 11:26:07AM +0100, Vlastimil Babka wrote:
> +Cc NETWORKING [TLS]
> maintainers and netdev for input, thanks.
>
> The full error is here:
> https://lore.kernel.org/all/fcfa11c6-2738-4a2e-baa8-09fa8f79cbf3@suse.de/
>
> On 3/4/25 11:20, Hannes Reinecke wrote:
> > On 3/4/25 09:18, Vlastimil Babka wrote:
> >> On 3/4/25 08:58, Hannes Reinecke wrote:
> >>> On 3/3/25 23:02, Vlastimil Babka wrote:
> >>>> Also make sure you have CONFIG_DEBUG_VM please.
> >>>>
> >>> Here you go:
> >>>
> >>> [ 134.506802] page: refcount:0 mapcount:0 mapping:0000000000000000
> >>> index:0x0 pfn:0x101ef8
> >>> [ 134.509253] head: order:3 mapcount:0 entire_mapcount:0
> >>> nr_pages_mapped:0 pincount:0
> >>> [ 134.511594] flags:
> >>> 0x17ffffc0000040(head|node=0|zone=2|lastcpupid=0x1fffff)
> >>> [ 134.513556] page_type: f5(slab)
> >>> [ 134.513563] raw: 0017ffffc0000040 ffff888100041b00 ffffea0004a90810
> >>> ffff8881000402f0
> >>> [ 134.513568] raw: 0000000000000000 00000000000a000a 00000000f5000000
> >>> 0000000000000000
> >>> [ 134.513572] head: 0017ffffc0000040 ffff888100041b00 ffffea0004a90810
> >>> ffff8881000402f0
> >>> [ 134.513575] head: 0000000000000000 00000000000a000a 00000000f5000000
> >>> 0000000000000000
> >>> [ 134.513579] head: 0017ffffc0000003 ffffea000407be01 ffffffffffffffff
> >>> 0000000000000000
> >>> [ 134.513583] head: 0000000000000008 0000000000000000 00000000ffffffff
> >>> 0000000000000000
> >>> [ 134.513585] page dumped because: VM_BUG_ON_FOLIO(((unsigned int)
> >>> folio_ref_count(folio) + 127u <= 127u))
> >>> [ 134.513615] ------------[ cut here ]------------
> >>> [ 134.529822] kernel BUG at ./include/linux/mm.h:1455!
> >>
> >> Yeah, just as I suspected, folio_get() says the refcount is 0.

... and it has a page_type of f5 (slab)

> >>> [ 134.554509] Call Trace:
> >>> [ 134.580282] iov_iter_get_pages2+0x19/0x30
> >>
> >> Presumably that's __iov_iter_get_pages_alloc() doing get_page() either in
> >> the " if (iov_iter_is_bvec(i)) " branch or via iter_folioq_get_pages()?
It's the bvec path:

        iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bvec, 1, len);

> >> Which doesn't work for a sub-size kmalloc() from a slab folio, which after
> >> the frozen refcount conversion no longer supports get_page().
> >>
> >> The question is if this is a mistake specific for this path that's easy to
> >> fix or there are more paths that do this. At the very least the pinning of
> >> page through a kmalloc() allocation from it is useless - the object itself
> >> has to be kfree()'d and that would never happen through a put_page()
> >> reaching zero.
> >>
> > Looks like a specific mistake.
> > tls_sw is the only user of sk_msg_zerocopy_from_iter()
> > (which is calling into __iov_iter_get_pages_alloc()).
> >
> > And, more to the point, tls_sw messes up iov pacing coming in from
> > the upper layers.
> > So even if the upper layers send individual iovs (where each iov might
> > contain different allocation types), tls_sw is packing them together
> > into full records. So it might end up with iovs having _different_
> > allocations.
> > Which would explain why we only see it with TLS, but not with normal
> > connections.

I thought we'd done all the work needed to get rid of these pointless
refcount bumps.  Turns out that's only on the block side (eg commit
e4cc64657bec).  So what does networking need in order to understand
that some iovecs do not need to mess with the refcount?
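
For anyone following along without the source open, here is a rough
user-space sketch of why the check in the dump fires and what "not
messing with the refcount" could look like.  The condition is copied
from the VM_BUG_ON_FOLIO() line above; everything else (struct
toy_folio, toy_folio_get(), toy_pin_for_io(), the is_slab flag) is a
hypothetical stand-in made up for illustration.  It is not kernel code
and not a proposed patch.

```c
#include <stdbool.h>

/* Toy stand-in for a folio; the real struct folio is far richer. */
struct toy_folio {
	int refcount;	/* 0 models the frozen refcount of a slab folio */
	bool is_slab;	/* models page_type f5 (slab) from the dump */
};

/*
 * Same arithmetic as the condition printed in the dump.  The unsigned
 * addition wraps, so this is true for refcounts in [-127, 0]: zero, or
 * a small negative value left behind by an underflow.
 */
static bool ref_zero_or_close_to_overflow(int refcount)
{
	return (unsigned int)refcount + 127u <= 127u;
}

/* Models folio_get(): returns false where a DEBUG_VM kernel would BUG. */
static bool toy_folio_get(struct toy_folio *f)
{
	if (ref_zero_or_close_to_overflow(f->refcount))
		return false;
	f->refcount++;
	return true;
}

/*
 * Hypothetical policy (an assumption, not an existing interface): only
 * take a reference on memory whose lifetime is managed by the refcount.
 * A kmalloc() object stays alive until kfree(), so bumping the count
 * of the slab folio behind it buys nothing.
 */
static bool toy_pin_for_io(struct toy_folio *f)
{
	if (f->is_slab)
		return true;	/* caller owns the object; no refcount change */
	return toy_folio_get(f);
}
```

With a slab-backed toy_folio (refcount 0), toy_folio_get() fails just
like the oops, while toy_pin_for_io() succeeds and leaves the count
untouched.  An ordinary folio with refcount 1 goes to 2 as usual.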