From mboxrd@z Thu Jan 1 00:00:00 1970
From: "GuoRui.Yu"
To: hch@lst.de, m.szyprowski@samsung.com
Cc: robin.murphy@arm.com, iommu@lists.linux.dev, linux-kernel@vger.kernel.org, GuoRui.Yu@linux.alibaba.com, linux-mm@kvack.org
Subject: [PATCH] swiotlb: fix the deadlock in swiotlb_do_find_slots
Date: Thu, 23 Feb 2023 00:52:51 +0800
Message-Id: <20230222165251.88700-1-GuoRui.Yu@linux.alibaba.com>
X-Mailer: git-send-email 2.29.2.540.g3cf59784d4
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

When the swiotlb pool has enough free slots, the
index = wrap_area_index(mem, index + 1) step is harmless: the search
quickly takes a slot and releases area->lock. But when the pool is
exhausted and the device has min_align_mask requirements, as NVMe does,
index may never become equal to wrap again, so the loop never exits.
The CPU stuck in the loop keeps holding area->lock, so other kernel
threads cannot acquire the lock to release their slots, and the result
is a deadlock.

Because wrap_area_index() does not perform a modulo operation, adjusting
wrap so that the loop is guaranteed to end is not trivial. Instead,
introduce the index_nowrap variable, which counts the slots the search
has stepped over without wrapping, and exit the loop once the whole
area has been traversed.

Backtraces:

Other CPUs are waiting for this core to exit the swiotlb_do_find_slots
loop.

[10199.924391] RIP: 0010:swiotlb_do_find_slots+0x1fe/0x3e0
[10199.924403] Call Trace:
[10199.924404]
[10199.924405] swiotlb_tbl_map_single+0xec/0x1f0
[10199.924407] swiotlb_map+0x5c/0x260
[10199.924409] ? nvme_pci_setup_prps+0x1ed/0x340
[10199.924411] dma_direct_map_page+0x12e/0x1c0
[10199.924413] nvme_map_data+0x304/0x370
[10199.924415] nvme_prep_rq.part.0+0x31/0x120
[10199.924417] nvme_queue_rq+0x77/0x1f0
...

[ 9639.596311] NMI backtrace for cpu 48
[ 9639.596336] Call Trace:
[ 9639.596337]
[ 9639.596338] _raw_spin_lock_irqsave+0x37/0x40
[ 9639.596341] swiotlb_do_find_slots+0xef/0x3e0
[ 9639.596344] swiotlb_tbl_map_single+0xec/0x1f0
[ 9639.596347] swiotlb_map+0x5c/0x260
[ 9639.596349] dma_direct_map_sg+0x7a/0x280
[ 9639.596352] __dma_map_sg_attrs+0x30/0x70
[ 9639.596355] dma_map_sgtable+0x1d/0x30
[ 9639.596356] nvme_map_data+0xce/0x370
...

[ 9639.595665] NMI backtrace for cpu 50
[ 9639.595682] Call Trace:
[ 9639.595682]
[ 9639.595683] _raw_spin_lock_irqsave+0x37/0x40
[ 9639.595686] swiotlb_release_slots.isra.0+0x86/0x180
[ 9639.595688] dma_direct_unmap_sg+0xcf/0x1a0
[ 9639.595690] nvme_unmap_data.part.0+0x43/0xc0

Fixes: 1f221a0d0dbf ("swiotlb: respect min_align_mask")
Signed-off-by: GuoRui.Yu
Signed-off-by: Xiaokang Hu
---
 kernel/dma/swiotlb.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index a34c38bbe28f..638ba3ea94f4 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -632,7 +632,7 @@ static int swiotlb_do_find_slots(struct device *dev, int area_index,
 	unsigned int iotlb_align_mask =
 		dma_get_min_align_mask(dev) & ~(IO_TLB_SIZE - 1);
 	unsigned int nslots = nr_slots(alloc_size), stride;
-	unsigned int index, wrap, count = 0, i;
+	unsigned int index, index_nowrap = 0, wrap, count = 0, i;
 	unsigned int offset = swiotlb_align_offset(dev, orig_addr);
 	unsigned long flags;
 	unsigned int slot_base;
@@ -665,6 +665,7 @@ static int swiotlb_do_find_slots(struct device *dev, int area_index,
 		    (slot_addr(tbl_dma_addr, slot_index) &
 		     iotlb_align_mask) != (orig_addr & iotlb_align_mask)) {
 			index = wrap_area_index(mem, index + 1);
+			index_nowrap++;
 			continue;
 		}
 
@@ -680,7 +681,8 @@ static int swiotlb_do_find_slots(struct device *dev, int area_index,
 				goto found;
 		}
 		index = wrap_area_index(mem, index + stride);
-	} while (index != wrap);
+		index_nowrap += stride;
+	} while (index_nowrap < mem->area_nslabs);
 
 not_found:
 	spin_unlock_irqrestore(&area->lock, flags);
-- 
2.31.1
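
For readers who want to see the two exit conditions side by side, here is a minimal user-space sketch. It is not the kernel code: the 16-slot area, the stride of 4, the rule that only slots with index % 4 == 1 satisfy the alignment mask, and the ITER_CAP guard are all assumptions chosen for illustration, and the pool is modelled as fully exhausted so no slot is ever taken. wrap_area_index() is modelled as a reset to 0 with no modulo, as the commit message describes.

```c
#include <stdbool.h>

#define AREA_NSLABS 16u   /* hypothetical area size */
#define STRIDE      4u    /* hypothetical search stride */
#define ITER_CAP    1000u /* guard so the old loop can be observed safely */

/* Non-modulo wrap: any index past the end of the area goes back to 0. */
static unsigned int wrap_area_index(unsigned int index)
{
	return index >= AREA_NSLABS ? 0 : index;
}

/* Hypothetical alignment rule: only slots 1, 5, 9, 13 match the mask. */
static bool slot_is_aligned(unsigned int index)
{
	return index % STRIDE == 1;
}

/*
 * Old exit condition (index != wrap). Returns the number of iterations,
 * or ITER_CAP if the loop had to be abandoned.
 */
unsigned int walk_old_exit(unsigned int start)
{
	unsigned int index = start, wrap = start, iters = 0;

	do {
		if (++iters >= ITER_CAP)
			break;
		if (!slot_is_aligned(index)) {
			index = wrap_area_index(index + 1);
			continue;
		}
		/* Aligned, but the pool is exhausted: nothing to take. */
		index = wrap_area_index(index + STRIDE);
	} while (index != wrap);

	return iters;
}

/* New exit condition: stop once index_nowrap has covered the area. */
unsigned int walk_new_exit(unsigned int start)
{
	unsigned int index = start, index_nowrap = 0, iters = 0;

	do {
		iters++;
		if (!slot_is_aligned(index)) {
			index = wrap_area_index(index + 1);
			index_nowrap++;
			continue;
		}
		index = wrap_area_index(index + STRIDE);
		index_nowrap += STRIDE;
	} while (index_nowrap < AREA_NSLABS);

	return iters;
}
```

Starting from slot 4 under these assumptions, the old loop visits 4, 5, 9, 13, 0, 1, 5, ... and never lands on 4 again, so walk_old_exit(4) only returns because of the guard. walk_new_exit(4) stops as soon as index_nowrap reaches the area size, whatever index happens to be at that point.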