Subject: Re: [PATCH-next] block: fix null-deref in percpu_ref_put
To: Dennis Zhou, Zhong Jinghua, Ming Lei
Cc: tj@kernel.org, cl@linux.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, yi.zhang@huawei.com, "yukuai (C)"
References: <20221206090939.871239-1-zhongjinghua@huawei.com>
From: Yu Kuai
Message-ID: <4b826950-52a5-b50b-1086-c14422ca3039@huaweicloud.com>
Date: Wed, 7 Dec 2022 21:10:33 +0800

Hi,

On 2022/12/07 9:05, Dennis Zhou wrote:
> Hello,
>
> On Tue, Dec 06, 2022 at 05:09:39PM +0800, Zhong Jinghua wrote:
>> A problem was found in stable 5.10 and the root cause of it is like below.
>>
>> In the use of q_usage_counter of request_queue, blk_cleanup_queue uses
>> "wait_event(q->mq_freeze_wq, percpu_ref_is_zero(&q->q_usage_counter))"
>> to wait for q_usage_counter to become zero. However, if q_usage_counter
>> becomes zero quickly, percpu_ref_exit will execute and ref->data
>> will be freed, and another process may then hit a null-deref problem
>> like below:
>>
>> CPU0                                 CPU1
>> blk_mq_destroy_queue
>>  blk_freeze_queue
>>   blk_mq_freeze_queue_wait
>>                                      scsi_end_request
>>                                       percpu_ref_get
>>                                       ...
>>                                       percpu_ref_put
>>                                        atomic_long_sub_and_test
>>  blk_put_queue
>>   kobject_put
>>    kref_put
>>     blk_release_queue
>>      percpu_ref_exit
>>       ref->data -> NULL
>>                                        ref->data->release(ref) -> null-deref
>>
>
> I remember thinking about this a while ago. I don't think this fix works
> as nicely as it may seem. Please correct me if I'm wrong.
>
> q->q_usage_counter has the oddity that the lifetime of the percpu_ref
> object isn't managed by the release function. The freeing is handled by
> a separate path where it depends on the percpu_ref hitting 0. So here we
> have 2 concurrent paths racing to run with 1 destroying the object. We
> probably need blk_release_queue() to wait on percpu_ref's release
> finishing, not starting.
>
> I think the above works in this specific case because there is a
> call_rcu() in blk_release_queue(). If there wasn't a call_rcu(),
> then by the same logic we could delay ref->data->release(ref) further
> and that could potentially lead to a use after free.
>
> Ideally, I think fixing the race in q->q_usage_counter's pattern is
> better than masking it here as I think we're being saved by the
> call_rcu() call further down the object release path.

Agreed.

BTW, Wensheng previously sent a patch to fix this in the block layer:

https://www.spinics.net/lists/kernel/msg4615696.html

Thanks,
Kuai
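
For reference, the idea of waiting for the release callback to finish, rather than for the counter to reach zero, might be sketched in userspace roughly as below. This is only an illustration under stated assumptions, not kernel code and not the patch linked above: struct queue, struct ref_data, ref_put(), queue_release(), queue_teardown() and demo() are all hypothetical names, a plain C11 atomic stands in for the percpu_ref, and a pthread condition variable stands in for the wait queue.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the data that ref->data points to. */
struct ref_data {
    void (*release)(void *owner);
};

/* Hypothetical stand-in for the queue; NOT struct request_queue. */
struct queue {
    atomic_long usage;        /* plays the role of q_usage_counter */
    struct ref_data *data;    /* plays the role of ref->data */
    pthread_mutex_t lock;
    pthread_cond_t cond;
    int release_done;         /* set as the last step of release */
    int release_calls;
};

/* Release callback: signals that release has run to completion. */
static void queue_release(void *owner)
{
    struct queue *q = owner;

    pthread_mutex_lock(&q->lock);
    q->release_calls++;
    q->release_done = 1;
    pthread_cond_signal(&q->cond);
    pthread_mutex_unlock(&q->lock);
}

/* Drop one reference; the last put dereferences q->data. */
static void ref_put(struct queue *q)
{
    if (atomic_fetch_sub(&q->usage, 1) == 1)
        q->data->release(q);  /* q->data must still be valid here */
}

/* Plays the role of CPU1 in the trace above. */
static void *io_completion(void *arg)
{
    ref_put(arg);
    return NULL;
}

/*
 * Plays the role of CPU0. It waits until release() has finished, not
 * merely until the counter reads zero, and only then frees q->data
 * (the step analogous to percpu_ref_exit()).
 */
static void queue_teardown(struct queue *q)
{
    pthread_mutex_lock(&q->lock);
    while (!q->release_done)
        pthread_cond_wait(&q->cond, &q->lock);
    pthread_mutex_unlock(&q->lock);

    free(q->data);
    q->data = NULL;
}

/* Returns 1 if release ran exactly once and the data was freed after it. */
static int demo(void)
{
    struct queue q;
    pthread_t t;
    int rc;

    atomic_init(&q.usage, 1);
    q.data = malloc(sizeof(*q.data));
    assert(q.data != NULL);
    q.data->release = queue_release;
    pthread_mutex_init(&q.lock, NULL);
    pthread_cond_init(&q.cond, NULL);
    q.release_done = 0;
    q.release_calls = 0;

    rc = pthread_create(&t, NULL, io_completion, &q);
    assert(rc == 0);
    queue_teardown(&q);
    pthread_join(t, NULL);

    pthread_cond_destroy(&q.cond);
    pthread_mutex_destroy(&q.lock);
    return q.release_calls == 1 && q.data == NULL;
}
```

If queue_teardown() instead polled atomic_load(&q->usage) == 0, it could free q->data in the window between atomic_fetch_sub() and the q->data->release(q) call in ref_put(), which is the same window shown in the CPU0/CPU1 trace.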