From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8E590EB8FDB for ; Wed, 6 Sep 2023 14:35:34 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 19C1428001C; Wed, 6 Sep 2023 10:35:34 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 14C28280017; Wed, 6 Sep 2023 10:35:34 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 0133E28001C; Wed, 6 Sep 2023 10:35:33 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id E4E6F280017 for ; Wed, 6 Sep 2023 10:35:33 -0400 (EDT) Received: from smtpin11.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id A7ED4120B33 for ; Wed, 6 Sep 2023 14:35:33 +0000 (UTC) X-FDA: 81206420946.11.C9CE511 Received: from mail-il1-f173.google.com (mail-il1-f173.google.com [209.85.166.173]) by imf28.hostedemail.com (Postfix) with ESMTP id CDAC3C0024 for ; Wed, 6 Sep 2023 14:35:31 +0000 (UTC) Authentication-Results: imf28.hostedemail.com; dkim=pass header.d=joelfernandes.org header.s=google header.b=YxPUc1Mm; spf=pass (imf28.hostedemail.com: domain of joel@joelfernandes.org designates 209.85.166.173 as permitted sender) smtp.mailfrom=joel@joelfernandes.org; dmarc=none ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1694010931; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=bGlUYGV6FKGTKrHXEc/8wfyJkTzTkG8Ak90V0Yv/IDA=; b=0xp/uMclI2TrENYYgpwe69Vk0AVyw0LVARKPW75M4aswVnlthYfDevEMwxkkP0kJr5WrDK 7ONlOH5cv8vy2WsAsjOS/rBvGj4ORlQXJyzwDsyDgXWVADBiBXtVlpP5cnDCVmoujVMKe4 +1XXUy/QTNHTNVpWoz09qBjTtLRbPCw= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1694010931; a=rsa-sha256; cv=none; b=Tk4gbsUMQI1tlpyVkznxzwsqLK1YxtdVHKufr7DpA9itEr4GIedQlb9YndSDZBGwSeSYjQ O6STZCvqZXbUP24r2aW+wDqwrJquUs3X/65rUlhMA+JRUMu2Ecx/UV1uCuo8kOKOAJLro3 tNHoAKpE53KrOJCMjA6n7D8t7iXAdTI= ARC-Authentication-Results: i=1; imf28.hostedemail.com; dkim=pass header.d=joelfernandes.org header.s=google header.b=YxPUc1Mm; spf=pass (imf28.hostedemail.com: domain of joel@joelfernandes.org designates 209.85.166.173 as permitted sender) smtp.mailfrom=joel@joelfernandes.org; dmarc=none Received: by mail-il1-f173.google.com with SMTP id e9e14a558f8ab-34f2cb006f9so5538375ab.0 for ; Wed, 06 Sep 2023 07:35:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=joelfernandes.org; s=google; t=1694010931; x=1694615731; darn=kvack.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=bGlUYGV6FKGTKrHXEc/8wfyJkTzTkG8Ak90V0Yv/IDA=; b=YxPUc1MmcZKXa9p/qjqiWAmagISBKWaLoye0FTLP0bSG2cidDkdiNfcOZE/anjbs+Z AGudcjXoXoOGwNkYV3f1J/lqrmQDycM5NcZ5aNqQOYkf9XxJxEffVMhxR0gZNWkkNgQQ OOoPEcEZRkoP3GNAFsZwNVBuDeQuMTRug57ao= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1694010931; x=1694615731; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=bGlUYGV6FKGTKrHXEc/8wfyJkTzTkG8Ak90V0Yv/IDA=; b=UZHn+4/ff/zX3Xvn8HKDuvmfRlYiYjprxQ9DieBNCQBRmlyHzQd1wSGH5cevr6buPD sOHMIwAHRO8pXzCkFzAsRV16qRCBhFuxcPgdy5EZCQEG+0pxSDpS9Ob0mOSiVXFHF5TB sH8sIJ9JTS3y+PRE4pEoOLAmt2oOL2dIr8F46n1xPPPPPh1clshFKwD3iyyP1kvmHAL8 NloFG3tNiTORfj5M2lwAgqj/qGupIRkOrtLLZaTlkxwO/ROL/Ba5dcyoxWuAvzAxq+ql Mq51aDXH0rzHwU9/U3psP6EcwGqV39weTE9Ji1huLa65g0mBf3EAbB/CnZF0MvZN+VEe 2OXA== X-Gm-Message-State: AOJu0YzmwbZdQbcNAZo3ndI4ITWdS3jvhQ/GWuWycGjr9vSnKmwd6zSF Ly2+1nhAs63tL4zOr7ePlJJ16i47eIIL0NdrREM= X-Google-Smtp-Source: AGHT+IHsCY3MvLSvr8Ngr8FPPEj7HK/rV7+hrLWxXlyYEoQzLi6qAJ0h6qWiBxXIAAnxESC2vbm02Q== X-Received: by 2002:a05:6e02:1d11:b0:349:86f6:e94a with SMTP id i17-20020a056e021d1100b0034986f6e94amr20928744ila.3.1694010930854; Wed, 06 Sep 2023 07:35:30 -0700 (PDT) Received: from localhost (74.120.171.34.bc.googleusercontent.com. [34.171.120.74]) by smtp.gmail.com with ESMTPSA id b17-20020a923411000000b0034e1acce730sm3879620ila.42.2023.09.06.07.35.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Sep 2023 07:35:29 -0700 (PDT) Date: Wed, 6 Sep 2023 14:35:29 +0000 From: Joel Fernandes To: Catalin Marinas Cc: Christoph Paasch , Andrew Morton , linux-mm@kvack.org, MPTCP Upstream , rcu@vger.kernel.org Subject: Re: kmemleak handling of kfree_rcu Message-ID: <20230906143529.GB1127143@google.com> References: <20230905111725.GA3737639@google.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Rspamd-Queue-Id: CDAC3C0024 X-Rspam-User: X-Stat-Signature: 16y6captfpw4mzm61zxrsdtewy5do1ra X-Rspamd-Server: rspam03 X-HE-Tag: 1694010931-283318 X-HE-Meta: U2FsdGVkX1+xUqEKv/ZlvZ7NsQkE6ucD/Nih80F7V5ZUQ9+41ILsDBHKicL97UIQMjY86o6+U2EHOpe6s4+HFPMPtmp3OspHNfWviKZylD9+IXqxpjNWKOordkxSIo4iUE3he06+8SGvRSxoxhR7OcLpHl4t7Zw3pVWcmh6cEIpAIYo3Zi4+N8ja5yhp8oWGkBz8coTnAeLfwlI4fM4rgpua6N5G7o7Ogmuk9227GKncFjqCy2JozyW0iqz3gCwEfaVJfC99w8IzEflp+D/smONwagpDEoUZDVdhLF7DB9nSxJDGOblAEW8gUQoGZihCDxMzbfw/QgzmWSccM80ek3QP1Vguk0YL+V/vT8AxzkkH0RFxzVpeHjgaLHg5Xvntvbk53qCk0iU9KJ7ooCUhti0S30ZTES/I0a+fNF2nxExdcExF5Qhx7yCF7fM8Emtf6hZAB9a+c6twPZVnkFP1hiYLEAzv+th8N425JpfVEueJOAsofzascq7GGJiqz+rm9lUM82+8HNEU0q2I6jI6pTYvKWBs0H9O4lgUvRqqgTTbNGDyX8hk79thTmtX4QLpTxFkKdGMmirEzRpAmv8LVGTHhxbCY55FfMFF/L9Y6njUyFBuQZAYDzf346eEcvoI2Xuch3Rcz4KfoUis6NdZfB6DUcneGWkTG4SGaSCUBkcA1jewMJB9AlWfzRYX2elwvRpk+z34H7YMHnZ6T3/WhX9PL3m/XjTPdi5ycVpM6oRoamOqae4uKICbDO8bkSLTkb8q8fwawUzpGeNCuUnDmFfVZB2cjFk9Er+lFe6vbTnq3q1ErfLDaUThtU2OhyomOTQVZMboHq55oNitpINprRnnHlIPSD2Zqa65zQ3X0QcJrJwOeQOGBR6biUp7z9EnPcsoastFxi6OjuDFjnQ/MgDcC+yy93j840ssmLYVSvKKwULIjiCfcE3fi0Owyf5lVR7TMis1SrEM6Q4EvuU CA4jI0uv HnqjpOTVjXHWi8OfTQ5s2pPtVGGYTA5jfsunedLPnS5UbAyJbCbl+fv05/ocKFT/NKO+9IMIIBNfCxeRz2MHyqxt4Ume1c3KO1mh15t9Bb4d/I+nuqb3zuyh3mLNeWSOnjDoHT8roLFF4uL0uwrn4F1dLiSC+waPlyEuomIaqUj8b2xkKJkZKcw1TM9tkgEn1ukyT79Gy2Kl/5KqGZG6oAuOXNVwf1WnyKNv/gwezZ+ZtfhhXSzrMKw0yHZRXETAvL2nSM45rtXjFLODJgiHno0Sw9GVafeMhYROrncUzV1Je0CiYW5AzNc5CTTaag2obeO0x9u5YhPuG/f5nAa8qijkdaTskk0NyAIX0AaZjFO9L5zvoZhsMMCWPrGHOi7kYkaHM X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Hi Catalin, On Tue, Sep 05, 2023 at 03:41:32PM +0100, Catalin Marinas wrote: > On Tue, Sep 05, 2023 at 11:17:25AM +0000, Joel Fernandes wrote: > > On Mon, Sep 04, 2023 at 10:22:56PM +0100, Catalin Marinas wrote: > > > On Wed, Aug 30, 2023 at 09:37:23AM -0700, Christoph Paasch wrote: > > > > for the MPTCP upstream project, we are running syzkaller [1] continuously to > > > > qualify our kernel changes. > > > > > > > > We found one issue with kmemleak and its handling of kfree_rcu. > > > > > > > > Specifically, it looks like kmemleak falsely reports memory-leaks when the > > > > object is being freed by kfree_rcu after a certain grace-period. > > > > > > > > For example, https://github.com/multipath-tcp/mptcp_net-next/issues/398# > > > > issuecomment-1584819482 shows how the syzkaller program reliably produces a > > > > kmemleak report, although the object is not leaked (we confirmed that by simply > > > > increasing MSECS_MIN_AGE to something larger than the grace-period). > > > > > > Not sure which RCU variant you are using but most likely the false > > > positives are caused by the original reference to the object being lost > > > and the pointer added to a new location that kmemleak does not track > > > (e.g. bnode->records[] in the tree-based variant). > > > > > > A quick attempt (untested, not even compiled): > > > > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c > > > > I am not sure if that works. Correct me if I'm wrong but the issue is that > > the allocated pointer being RCU-freed is no longer reachable by kmemleak (it > > is scanned but not reachable), however it might still be reachable via an > > RCU reader. In such a situation, it is a false-positive. > > Yes, with a small correction that the object being RCU-freed is not > scanned if it cannot be reached by kmemleak (unless explicitly marked as > not a leak). Yes I used wrong wording. I meant it is added to the kmemleak rbtree as something that needs to be reachable, but cannot be reached. Hopefully that makes sense. > > The correct fix then should probably be to mark the object as > > kmemleak_not_leak() until a grace period elapses. This will cause the object > > to not be reported but still be scanned until eventually the lower layers > > will remove the object from kmemleak-tracking after the grace period. Per the > > docs also, that API is used to prevent false-positives. > > This should work as well but I'd use kmemleak_ignore() instead of > kmemleak_not_leak(). The former, apart from masking the false positive, > also tells kmemleak not to scan the object. After a kvfree_rcu(), the > object shouldn't have any valid references to other objects, so not > worth scanning. Yes I am also OK with that, however to me I consider the object as alive as long as the grace period does not end. But I agree with you and it may not be worth tracking them or scanning them. > > Instead what the below diff appears to do is to mark the bnode cache as a > > kmemleak object itself, which smells a bit wrong. The bnode is not an > > allocated object in the traditional sense, it is simple an internal data > > structure. > > We do this in other places for objects containing pointers and which > aren't tracked by kmemleak (it doesn't track page allocations as there > would be too many leading to lots of false positives or overlapping > objects - the slab itself uses the page allocator). An example where we > do something similar is alloc_large_system_hash(). We could add a > wrapper API if kmemleak_alloc() feels wrong but underlying in kmemleak > it would use the same mechanism for recording and scanning the bnode. Ah I see what you did, you basically made the bnode as a kmemleak object in its own right, and perhaps the kmemleak detector can reach the object you added. That actually (though sounding like a little of a hack) would work too. I say hack because the object you added is not an allocated object in the traditional sense :-D. But as you said, there is precendent for that. > > That may not solve the issue as the false-positive unreachable > > object is still unreachable. > > It should solve the issue as long as the bnode is reachable from the > root objects (e.g. the data/bss sections; I think it is via the > per_cpu_ptr(&krc, cpu)->bkvcache llist_head). When the bnode is scanned, > it will find the pointer to the RCU-freed object and mark it as > referenced (not a leak). Right! That's what I was missing. > Anyway, as we trust the RCU freeing implementation not to leak data, we > can go with your simpler suggestion of a kmemleak_ignore() call on the > kvfree_rcu() path. Probably even better as the object pending to be > freed will no longer be scanned. Sounds good and thanks a lot Catalin! Unless you beat me to it, I'll send a patch out along those lines by the weekend and CC you with your suggested-by and the reported-by from the reporter ;-). Let me know if you have a preference. thanks, - Joel