From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id B0EAAC352A1 for ; Wed, 7 Dec 2022 01:05:21 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 55CAD8E0005; Tue, 6 Dec 2022 20:05:21 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 50D248E0001; Tue, 6 Dec 2022 20:05:21 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 3D5DD8E0005; Tue, 6 Dec 2022 20:05:21 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 2AD5F8E0001 for ; Tue, 6 Dec 2022 20:05:21 -0500 (EST) Received: from smtpin18.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id E59CF14041A for ; Wed, 7 Dec 2022 01:05:20 +0000 (UTC) X-FDA: 80213716800.18.5D0C7A6 Received: from mail-pj1-f42.google.com (mail-pj1-f42.google.com [209.85.216.42]) by imf30.hostedemail.com (Postfix) with ESMTP id 948808000F for ; Wed, 7 Dec 2022 01:05:20 +0000 (UTC) Authentication-Results: imf30.hostedemail.com; dkim=none; dmarc=fail reason="SPF not aligned (relaxed), No valid DKIM" header.from=kernel.org (policy=none); spf=pass (imf30.hostedemail.com: domain of dennisszhou@gmail.com designates 209.85.216.42 as permitted sender) smtp.mailfrom=dennisszhou@gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1670375120; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=kDHXNdP6MSMMsA/w0febLQLYepjyZbaxc8wqoj5tAiM=; b=mUZd6F7vf2HkPFbmFiJMJb7skS/UDHnn+Z4CEy+IMqjnLstrg0GIwqb8mAHSQvbjk1HziY pjaGp4fCXppQRpmvcwoJ36OBVbsvt5La5AWCopWWOyb75fw4R31USi756Yn9CCuaioWPub DacysieQp/YW9rogksWVBGD13wT+3FU= ARC-Authentication-Results: i=1; imf30.hostedemail.com; dkim=none; dmarc=fail reason="SPF not aligned (relaxed), No valid DKIM" header.from=kernel.org (policy=none); spf=pass (imf30.hostedemail.com: domain of dennisszhou@gmail.com designates 209.85.216.42 as permitted sender) smtp.mailfrom=dennisszhou@gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1670375120; a=rsa-sha256; cv=none; b=8QC41bbUqeVtlS6RB0EdHe9UlK5JapIR+BefPU1H26Mg9/6NiB1Y4jN56PVP/2LOt0ZiwC 8A7aHrFCboQ9bnIuH0XRQyaxMsUyvPRnE+0WsXCGEU9rVZLJV/Mru+TX8ElgkEv4/I4rJ+ njoLR3x2S1pfdNtLQQkIOhk1oSzhriA= Received: by mail-pj1-f42.google.com with SMTP id u15-20020a17090a3fcf00b002191825cf02so26322pjm.2 for ; Tue, 06 Dec 2022 17:05:20 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=kDHXNdP6MSMMsA/w0febLQLYepjyZbaxc8wqoj5tAiM=; b=madpb1V+DbimF4JPmiUyYP7JSHYqEoeFTx9V2qFmEj7r9x9uIdxqZE++uE9M1xEVKE eJY3wJW/uoUEo6mcoL0LdfaIu68mpk4VIeG8p26vIqYymBESKJQzdf5LnMvm5JjLQEXd ZpHjDIQ9H1iS467q7nymHSRPw7nIay6DlyrMCitzfMULFLHyk3GrJFKE3BTEsYXrifmN ORQmQjUiObzS8ahckwv808HQrdV4sXS+tP4Tm5j66l0fAkUURpvXYixOGOE+BVBRqzi6 6kE43rpd0BUY0ynx2v73ZD+75jfi4oC7iW7m1kb5Ab6Jt4z0Sf9+DGsaFMrPLetHbwL/ OEhQ== X-Gm-Message-State: ANoB5pl8d1Z3RHcjJY8mcoMqEhyeACE29RiIuGHhdhg4G6jRgFPRx+25 G3GGpPhdM2U8sL8iGfDA1h8= X-Google-Smtp-Source: AA0mqf5yQUGmm/AHSLd619YeNEcEdOyfosSo9fU5Ru7AmzyQYKBEqV96pi5U077piuNPUeIQDNWIrA== X-Received: by 2002:a17:90b:3c4e:b0:219:e628:acf0 with SMTP id pm14-20020a17090b3c4e00b00219e628acf0mr9508654pjb.149.1670375119479; Tue, 06 Dec 2022 17:05:19 -0800 (PST) Received: from fedora (148-64-108-50.PUBLIC.monkeybrains.net. [148.64.108.50]) by smtp.gmail.com with ESMTPSA id j10-20020a170902c3ca00b001869f2120absm13197201plj.294.2022.12.06.17.05.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 06 Dec 2022 17:05:19 -0800 (PST) Date: Tue, 6 Dec 2022 17:05:16 -0800 From: Dennis Zhou To: Zhong Jinghua Cc: tj@kernel.org, cl@linux.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, yi.zhang@huawei.com, yukuai3@huawei.com Subject: Re: [PATCH-next] block: fix null-deref in percpu_ref_put Message-ID: References: <20221206090939.871239-1-zhongjinghua@huawei.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20221206090939.871239-1-zhongjinghua@huawei.com> X-Rspamd-Server: rspam07 X-Rspamd-Queue-Id: 948808000F X-Rspam-User: X-Stat-Signature: 5si7ufqbhjitgbdyoy38j891hny3ai1a X-Spamd-Result: default: False [-1.00 / 9.00]; BAYES_HAM(-6.00)[100.00%]; SORBS_IRL_BL(3.00)[209.85.216.42:from]; SUBJECT_HAS_UNDERSCORES(1.00)[]; MID_RHS_NOT_FQDN(0.50)[]; FORGED_SENDER(0.30)[dennis@kernel.org,dennisszhou@gmail.com]; DMARC_POLICY_SOFTFAIL(0.10)[kernel.org : SPF not aligned (relaxed), No valid DKIM,none]; RCVD_NO_TLS_LAST(0.10)[]; MIME_GOOD(-0.10)[text/plain]; BAD_REP_POLICIES(0.10)[]; ARC_SIGNED(0.00)[hostedemail.com:s=arc-20220608:i=1]; R_DKIM_NA(0.00)[]; FROM_HAS_DN(0.00)[]; RCVD_VIA_SMTP_AUTH(0.00)[]; FROM_NEQ_ENVFROM(0.00)[dennis@kernel.org,dennisszhou@gmail.com]; PREVIOUSLY_DELIVERED(0.00)[linux-mm@kvack.org]; TO_DN_SOME(0.00)[]; RCPT_COUNT_SEVEN(0.00)[7]; R_SPF_ALLOW(0.00)[+ip4:209.85.128.0/17:c]; MIME_TRACE(0.00)[0:+]; RCVD_COUNT_THREE(0.00)[3]; TO_MATCH_ENVRCPT_SOME(0.00)[]; ARC_NA(0.00)[] X-HE-Tag: 1670375120-603440 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Hello, On Tue, Dec 06, 2022 at 05:09:39PM +0800, Zhong Jinghua wrote: > A problem was find in stable 5.10 and the root cause of it like below. > > In the use of q_usage_counter of request_queue, blk_cleanup_queue using > "wait_event(q->mq_freeze_wq, percpu_ref_is_zero(&q->q_usage_counter))" > to wait q_usage_counter becoming zero. however, if the q_usage_counter > becoming zero quickly, and percpu_ref_exit will execute and ref->data > will be freed, maybe another process will cause a null-defef problem > like below: > > CPU0 CPU1 > blk_mq_destroy_queue > blk_freeze_queue > blk_mq_freeze_queue_wait > scsi_end_request > percpu_ref_get > ... > percpu_ref_put > atomic_long_sub_and_test > blk_put_queue > kobject_put > kref_put > blk_release_queue > percpu_ref_exit > ref->data -> NULL > ref->data->release(ref) -> null-deref > I remember thinking about this a while ago. I don't think this fix works as nicely as it may seem. Please correct me if I'm wrong. q->q_usage_counter has the oddity that the lifetime of the percpu_ref object isn't managed by the release function. The freeing is handled by a separate path where it depends on the percpu_ref hitting 0. So here we have 2 concurrent paths racing to run with 1 destroying the object. We probably need blk_release_queue() to wait on percpu_ref's release finishing, not starting. I think the above works in this specific case because there is a call_rcu() in blk_release_queue(). If there wasn't a call_rcu(), then by the same logic we could delay ref->data->release(ref) further and that could potentially lead to a use after free. Ideally, I think fixing the race in q->q_usage_counter's pattern is better than masking it here as I think we're being saved by the call_rcu() call further down the object release path. Thanks, Dennis > As suggested by Ming Lei, fix it by getting the release method before > the referebce count is minus 0. > > Suggested-by: Ming Lei > Signed-off-by: Zhong Jinghua > --- > include/linux/percpu-refcount.h | 7 +++++-- > 1 file changed, 5 insertions(+), 2 deletions(-) > > diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h > index d73a1c08c3e3..11e717c95acb 100644 > --- a/include/linux/percpu-refcount.h > +++ b/include/linux/percpu-refcount.h > @@ -331,8 +331,11 @@ static inline void percpu_ref_put_many(struct percpu_ref *ref, unsigned long nr) > > if (__ref_is_percpu(ref, &percpu_count)) > this_cpu_sub(*percpu_count, nr); > - else if (unlikely(atomic_long_sub_and_test(nr, &ref->data->count))) > - ref->data->release(ref); > + else { > + percpu_ref_func_t *release = ref->data->release; > + if (unlikely(atomic_long_sub_and_test(nr, &ref->data->count))) > + release(ref); > + } > > rcu_read_unlock(); > } > -- > 2.31.1 >