From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id C1B6CE7716D for ; Thu, 5 Dec 2024 15:19:08 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 47B466B00AD; Thu, 5 Dec 2024 10:19:08 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 41E396B00A9; Thu, 5 Dec 2024 10:19:08 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1F9DE6B00A9; Thu, 5 Dec 2024 10:19:08 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 7E9626B00DB for ; Sun, 8 Sep 2024 14:53:28 -0400 (EDT) Received: from smtpin20.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 20AAC1C4579 for ; Sun, 8 Sep 2024 18:53:28 +0000 (UTC) X-FDA: 82542469296.20.9B7F922 Received: from mail-pl1-f177.google.com (mail-pl1-f177.google.com [209.85.214.177]) by imf24.hostedemail.com (Postfix) with ESMTP id 35057180007 for ; Sun, 8 Sep 2024 18:53:26 +0000 (UTC) Authentication-Results: imf24.hostedemail.com; dkim=none; spf=pass (imf24.hostedemail.com: domain of dennisszhou@gmail.com designates 209.85.214.177 as permitted sender) smtp.mailfrom=dennisszhou@gmail.com; dmarc=fail reason="SPF not aligned (relaxed), No valid DKIM" header.from=kernel.org (policy=quarantine) ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1725821505; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=6uCJ9AvJDNH55nzQc9GN1xu818GG/S5c50EOZ6rrXCc=; b=211kNGaKGDnD3E98ywwUms7mTmbO4otq9a863l+5cpDqNqQmcOtaabGpraeggLBgIfM4j1 tZrTIu4tWZOTrB/WtQHrTRWJD2bbhEZnT+mTdfPaSgae/NsirHuzs/4c9rtm46N8h4y3zv 5u7eQkucmw3XGLsBoGLJyWILfhD+Ino= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1725821505; a=rsa-sha256; cv=none; b=BtK49E3lfSfNcqgWnNAV6S7Dt+xTKhd4MHHS8NYWP2xUn/U7j4KobxWHohykpceCXruojx rZhheicErMfmw6p2phT2PpLEjXefLiHSuH3NVBy1XlIDuGD85enfJTLB7XNOGvzFkEsKw/ xXLDEpPHMMgMLXiTtzQDppXeL1Y0egA= ARC-Authentication-Results: i=1; imf24.hostedemail.com; dkim=none; spf=pass (imf24.hostedemail.com: domain of dennisszhou@gmail.com designates 209.85.214.177 as permitted sender) smtp.mailfrom=dennisszhou@gmail.com; dmarc=fail reason="SPF not aligned (relaxed), No valid DKIM" header.from=kernel.org (policy=quarantine) Received: by mail-pl1-f177.google.com with SMTP id d9443c01a7336-206aee40676so29811205ad.0 for ; Sun, 08 Sep 2024 11:53:25 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725821605; x=1726426405; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=6uCJ9AvJDNH55nzQc9GN1xu818GG/S5c50EOZ6rrXCc=; b=v4uXQ+M/WJIdvRXsWQQnlI/3ySwNVIm34SgVCUZRF2B7iXJLz3PQrAetS83l6T0x0E SePvgbK8qTwDdDLz/qpittJlUcmDTsOaC2s7BTEpz0bLrWJCGtQhe80xAI+NBcJq8UjK gzWWberXsEBxDXAKHfbJLrqfXwJ7OyGcC+nT5jBopemzhR27PXFzHW1Lz140L7C86Rq3 kPkvY7DBRfdFNotw7lVHWLgLZNCmFUcputXf9ao/ytv2Dg3seXcWMeC7mxDWMkF2dBE/ JiAvmCUM3UsdPGnwjfLfgVa0G+GfJUWX2urDPz0Z9hnNULHR/tJqVo9eMCPswqe3xZJa b+Xg== X-Forwarded-Encrypted: i=1; AJvYcCWiyGi5LAcTKi3yqQWkMKmNEyAu5cY1jGAotYPLTNPj7dYNj/1lR3+AMT+97As5uDfkLrL5vcrmZw==@kvack.org X-Gm-Message-State: AOJu0YyfRIvKMIMCe+W/IYauRpdnre3fK+OsXtheP2RewP3HLPcC5LkY h6kmFan0Z48vJ/s9hcBFYUewyG2CpkxRdcJZwEZcZ8CPSD+EIK12ALh3oBoQ X-Google-Smtp-Source: AGHT+IHGXoFhM3YhJqWziJWfozX3naXDrko1aq8/x/3ZldaDxH8TQZBA47/qpJJIFMKBgXqm8GR6ww== X-Received: by 2002:a17:902:c946:b0:206:c8dc:e334 with SMTP id d9443c01a7336-20706fb929emr49132025ad.39.1725821604582; Sun, 08 Sep 2024 11:53:24 -0700 (PDT) Received: from V92F7Y9K0C.attlocal.net (99-124-153-174.lightspeed.sntcca.sbcglobal.net. [99.124.153.174]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-20714b31ba5sm21092205ad.248.2024.09.08.11.53.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 08 Sep 2024 11:53:24 -0700 (PDT) Date: Sun, 8 Sep 2024 11:53:21 -0700 From: Dennis Zhou To: Jeongjun Park Cc: dennis@kernel.org, tj@kernel.org, cl@linux.com, akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, syzbot Subject: Re: [PATCH] percpu: fix data race in pcpu_alloc_noprof() and extend spinlock protection area Message-ID: References: <20240908135209.15159-1-aha310510@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20240908135209.15159-1-aha310510@gmail.com> X-Rspam-User: X-Rspamd-Queue-Id: 35057180007 X-Rspamd-Server: rspam04 X-Stat-Signature: ujxtr7xw8bh8eqj55p6y7piexff4ntp9 X-Rspamd-Pre-Result: action=add header; module=dmarc; Action set by DMARC X-Rspam: Yes X-HE-Tag: 1725821606-56136 X-HE-Meta: U2FsdGVkX18rpvskdi9x9YKInpvVN1FdjgE9g58UyC88/OEP1gxfGtkvawlPECkRT1Wj22Rye7dIfUYhbdE4T4JJDkbypzWFxWyd51Dlcs3swnzZsT9D7RvA6yiOlV0oeWwF+m9RTwwEa+he/iXgsuYmoVSny/1QwmcrnweSFvnbX2FYPmeABnWcahTNsLaNhmejVFDbjrGDLtX7/mZo0qTp/bXnAD7bdrMrbpTecU2gHVlMvlY6JzZQJNqpe5SUw5gCsZnCt7lFtkrAyM5GJNEbrpjuzw3FIZppB49LwmC7PsYutX3oMFbuX0i98gv8AIDvcmMVvZCUPS+YcWyVjs5SGa9mHhr5uqZzcbvG9H/hRliPzRVMo4HjVje1zfd3PL3MKdDpw1x/HosnG9rll5ZpTCj2Q7x6yI+3a6fBR3sKx7zG/7b83DGrQlO9Zc6QRaSwKCLcj71T37ZCN1tVuiGTkYsHBr09Sil5XYWpXPfC2k/0VEGPG/wXTAnvlmuecN9xVltW/W3kZfMi99NAAcop4WCVBTxMqCBrKj/v2y2S6FYqQ16u1s+zQb5S68vIMj/dyzMloJrnuGy8wp70aMNbct+4oanqxSaqWDBoe553OybzX6k4Ai0nb7Lzb9kIllAiYgAkZ9dgGxDWfgw547WSHNbWh0G17nXPWvLiPGXu9FUvF5Osw6GHIapBBz5/wfPecJdWwYOgWfKF7HuGsG6BxkquAzlshfGzQWQmn1KoztFpGCg2YhtIEFdNwqh8+LaHaGbjPMUOpTqce2iUHgL9Vkj/EjAaMRJnJD1TZrXOpatmwbMXw8w/lo4Qhlo+weY8yW2iqTRAFa6rG4AjOoSW4kf2FiTyaRpwzIl/wtXEn9MlSWa3Cv09nRNj9Ogmd3AJrxQ/ylQpJ0MZX5MZPCZQB3AWgy+1x1cTy2bo7hEk9XWQGwpnuQijAIkiKqpSNvPcs99Ij8CKeS8ojq2 qEw2GAHB yk/St2uFhRKBC78RxCkVwkiJmo0I8EG0d5Q+oD7/GEH/J5+FknIaKH9d7SvQmdLQ/WjzO05mBkHstQM0azS1RCAmYNVcHneT18bPFqfz73ffK9ZEMce5EHGwS7n4cSEsYtyvNLv5y2vm9fXeb+Oe5CfNsw/ud2NUeOvMcoOTpdmJ7lcidxRyX6aRbiLjq7FVf+3WgIRg/Y05crGIN5n25LUfetWWiE/x3SErbIka8/XWYMx3qdO7L4h7MC5oTTA6iquiIIXjpmqZdgEPo9I/9H7TPAXz8Cw8xq1Z6mRZb8rx3q+P42HeO/Aab5/1Lf/pQECZ5wCPjtEOjlsPVeT2hTZXRwvyLkDX0GojeFc15eWLeb46iAZWkPmW3tEPok7xUyvzSeL7XSaiZlak+Ryx/NIBH0f/4u24z0vjlc8bTvf18h2U6mETDiBIe7P7ibZXfWZe0svVz6z2gjVDUZMHUGGnvy9WveQ01hJJyc+XqvO+7JWIgAV86Mw7QQsWlzx+urkHJ4ILwtGsu+KEGJgbVrETJqIWtq+Mzf8go3IDap81pmMJG4rqpVQbDsXPZBB+CwY8txtjWlNmddsz6QBNe2Bx+V2tDVxnwOAkXJGpDpFt8tU/xNptnk9fO4g== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Hello, On Sun, Sep 08, 2024 at 10:52:10PM +0900, Jeongjun Park wrote: > I got the following KCSAN report during syzbot testing: > > ================================================================== > BUG: KCSAN: data-race in pcpu_alloc_noprof / pcpu_free_area > > read-write to 0xffffffff883f872c of 4 bytes by task 3378 on cpu 0: > pcpu_update_empty_pages mm/percpu.c:602 [inline] > pcpu_block_update_hint_free mm/percpu.c:1044 [inline] > pcpu_free_area+0x4dc/0x570 mm/percpu.c:1302 > free_percpu+0x1c6/0xb30 mm/percpu.c:2277 > xt_percpu_counter_free+0x63/0x80 net/netfilter/x_tables.c:1951 > cleanup_entry+0x195/0x1c0 net/ipv6/netfilter/ip6_tables.c:671 > __do_replace+0x470/0x580 net/ipv6/netfilter/ip6_tables.c:1099 > do_replace net/ipv6/netfilter/ip6_tables.c:1158 [inline] > do_ip6t_set_ctl+0x820/0x8c0 net/ipv6/netfilter/ip6_tables.c:1644 > nf_setsockopt+0x195/0x1b0 net/netfilter/nf_sockopt.c:101 > ipv6_setsockopt+0x126/0x140 net/ipv6/ipv6_sockglue.c:998 > tcp_setsockopt+0x93/0xb0 net/ipv4/tcp.c:3768 > sock_common_setsockopt+0x64/0x80 net/core/sock.c:3735 > do_sock_setsockopt net/socket.c:2324 [inline] > __sys_setsockopt+0x1d8/0x250 net/socket.c:2347 > __do_sys_setsockopt net/socket.c:2356 [inline] > __se_sys_setsockopt net/socket.c:2353 [inline] > __x64_sys_setsockopt+0x66/0x80 net/socket.c:2353 > x64_sys_call+0x278d/0x2d60 arch/x86/include/generated/asm/syscalls_64.h:55 > do_syscall_x64 arch/x86/entry/common.c:52 [inline] > do_syscall_64+0x54/0x120 arch/x86/entry/common.c:83 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > read to 0xffffffff883f872c of 4 bytes by task 3374 on cpu 1: > pcpu_alloc_noprof+0x9a5/0x10c0 mm/percpu.c:1894 > xt_percpu_counter_alloc+0x79/0x110 net/netfilter/x_tables.c:1931 > find_check_entry net/ipv4/netfilter/ip_tables.c:526 [inline] > translate_table+0x921/0xf70 net/ipv4/netfilter/ip_tables.c:716 > do_replace net/ipv4/netfilter/ip_tables.c:1137 [inline] > do_ipt_set_ctl+0x7bd/0x8b0 net/ipv4/netfilter/ip_tables.c:1635 > nf_setsockopt+0x195/0x1b0 net/netfilter/nf_sockopt.c:101 > ip_setsockopt+0xea/0x100 net/ipv4/ip_sockglue.c:1424 > tcp_setsockopt+0x93/0xb0 net/ipv4/tcp.c:3768 > sock_common_setsockopt+0x64/0x80 net/core/sock.c:3735 > do_sock_setsockopt net/socket.c:2324 [inline] > __sys_setsockopt+0x1d8/0x250 net/socket.c:2347 > __do_sys_setsockopt net/socket.c:2356 [inline] > __se_sys_setsockopt net/socket.c:2353 [inline] > __x64_sys_setsockopt+0x66/0x80 net/socket.c:2353 > x64_sys_call+0x278d/0x2d60 arch/x86/include/generated/asm/syscalls_64.h:55 > do_syscall_x64 arch/x86/entry/common.c:52 [inline] > do_syscall_64+0x54/0x120 arch/x86/entry/common.c:83 > entry_SYSCALL_64_after_hwframe+0x76/0x7e > > value changed: 0x00000005 -> 0x00000006 > > Reported by Kernel Concurrency Sanitizer on: > CPU: 1 UID: 0 PID: 3374 Comm: syz-executor.3 Not tainted 6.11.0-rc6-syzkaller-00326-gd1f2d51b711a-dirty #0 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024 > ================================================================== > > The global variable pcpu_nr_empty_pop_pages can be protected by pcpu_lock, > but since pcpu_alloc_noprof reads outside the spinlock protection section, > a data race may occur and the branch of the conditional statement may change. > Therefore, the reading of pcpu_nr_empty_pop_pages should be modified to be > performed within the spinlock protection section. > > However, the for_each_clear_bitrange_from loop requires and uses a spinlock, > but it repeatedly locks and unlocks the spinlock unnecessarily. > > Therefore, I think it is appropriate to remove the repeated spin_lock and > spin_unlock in for_each_clear_bitrange_from and perform the operation of > reading pcpu_nr_empty_pop_pages and then perform spin_unlock to postpone > the point in time when the spin_unlock is performed. > > Reported-by: syzbot > Fixes: e04d320838f5 ("percpu: indent the population block in pcpu_alloc()") > Signed-off-by: Jeongjun Park > --- > mm/percpu.c | 5 ++--- > 1 files changed, 2 insertions(+), 3 deletions(-) > > diff --git a/mm/percpu.c b/mm/percpu.c > index 20d91af8c033..5c958a54da51 100644 > --- a/mm/percpu.c > +++ b/mm/percpu.c > @@ -1864,7 +1864,6 @@ void __percpu *pcpu_alloc_noprof(size_t size, size_t align, bool reserved, > > area_found: > pcpu_stats_area_alloc(chunk, size); > - spin_unlock_irqrestore(&pcpu_lock, flags); > > /* populate if not all pages are already there */ > if (!is_atomic) { > @@ -1878,14 +1877,12 @@ void __percpu *pcpu_alloc_noprof(size_t size, size_t align, bool reserved, > > ret = pcpu_populate_chunk(chunk, rs, re, pcpu_gfp); > > - spin_lock_irqsave(&pcpu_lock, flags); > if (ret) { > pcpu_free_area(chunk, off); > err = "failed to populate"; > goto fail_unlock; > } > pcpu_chunk_populated(chunk, rs, re); > - spin_unlock_irqrestore(&pcpu_lock, flags); > } We don't want to do this because pcpu_populate_chunk() calls alloc_pages_node() which can block. > > mutex_unlock(&pcpu_alloc_mutex); > @@ -1894,6 +1891,8 @@ void __percpu *pcpu_alloc_noprof(size_t size, size_t align, bool reserved, > if (pcpu_nr_empty_pop_pages < PCPU_EMPTY_POP_PAGES_LOW) > pcpu_schedule_balance_work(); > > + spin_unlock_irqrestore(&pcpu_lock, flags); > + > /* clear the areas and return address relative to base address */ > for_each_possible_cpu(cpu) > memset((void *)pcpu_chunk_addr(chunk, cpu, 0) + off, 0, size); > -- I sent out [1] which is a more appropriate fix. I'll merge it later today. Thanks, Dennis [1] https://lore.kernel.org/lkml/20240906031151.80719-1-dennis@kernel.org/