Date: Fri, 26 Jul 2024 20:15:07 -0700
From: Dennis Zhou
To: Boqun Feng
Cc: Tejun Heo, kernel test robot, Suren Baghdasaryan,
	oe-lkp@lists.linux.dev, lkp@intel.com, linux-kernel@vger.kernel.org,
	Andrew Morton, Kent Overstreet, Kees Cook, Alexander Viro,
	Alex Gaynor, Alice Ryhl, Andreas Hindborg, Benno Lossin,
	Björn Roy Baron, Christoph Lameter, Gary Guo, Miguel Ojeda,
	Pasha Tatashin, Peter Zijlstra, Vlastimil Babka,
	Wedson Almeida Filho, linux-mm@kvack.org, lkmm@lists.linux.dev
Subject: Re: [linus:master] [mm] 24e44cc22a: BUG:KCSAN:data-race_in_pcpu_alloc_noprof/pcpu_block_update_hint_alloc
References: <202407191651.f24e499d-oliver.sang@intel.com>

On Tue, Jul 23, 2024 at 02:14:00PM -0700, Boqun Feng wrote:
> On Mon, Jul 22, 2024 at 10:50:53PM -0700, Dennis Zhou wrote:
> > On Mon, Jul 22, 2024 at 01:53:52PM -0700, Boqun Feng wrote:
> > > On Mon, Jul 22, 2024 at 11:27:48AM -0700, Dennis Zhou wrote:
> > > > Hello,
> > > >
> > > > On Mon, Jul 22, 2024 at 11:03:00AM -0700, Boqun Feng wrote:
> > > > > On Mon, Jul 22, 2024 at 07:52:22AM -1000, Tejun Heo wrote:
> > > > > > On Mon, Jul 22, 2024 at 10:47:30AM -0700, Boqun Feng wrote:
> > > > > > > This looks like a data race because we read pcpu_nr_empty_pop_pages out
> > > > > > > of the lock for a best effort checking, @Tejun, maybe you could confirm
> > > > > > > on this?
> > > > > >
> > > > > > That does sound plausible.
> > > > > >
> > > > > > > -	if (pcpu_nr_empty_pop_pages < PCPU_EMPTY_POP_PAGES_LOW)
> > > > > > > +	/*
> > > > > > > +	 * Checks pcpu_nr_empty_pop_pages out of the pcpu_lock, data races may
> > > > > > > +	 * occur but this is just a best-effort checking, everything is synced
> > > > > > > +	 * in pcpu_balance_work.
> > > > > > > +	 */
> > > > > > > +	if (data_race(pcpu_nr_empty_pop_pages) < PCPU_EMPTY_POP_PAGES_LOW)
> > > > > > >  		pcpu_schedule_balance_work();
> > > > > >
> > > > > > Would it be better to use READ/WRITE_ONCE() for the variable?
> > > > > >
> > > > >
> > > > > For READ/WRITE_ONCE(), we will need to replace all write accesses and
> > > > > all out-of-lock read accesses to pcpu_nr_empty_pop_pages, like below.
> > > > > It's better in the sense that it doesn't rely on compiler behaviors on
> > > > > data races, not sure about the performance impact though.
> > > > >
> > > >
> > > > I think a better alternative is we can move it up into the lock under
> > > > area_found. The value gets updated as part of pcpu_alloc_area() as the
> > > > code above populates percpu memory that is already allocated.
> > > >
> > >
> > > Not sure I followed what exactly you suggested here because I'm not
> > > familiar with the logic, but a simpler version would be:
> > >
> > >
> >
> > I believe that's the only naked access of pcpu_nr_empty_pop_pages. So
> > I was thinking this'll fix this problem.
> >
> > I also don't know how to rerun this CI tho..
> >
> > ---
> > diff --git a/mm/percpu.c b/mm/percpu.c
> > index 20d91af8c033..325fb8412e90 100644
> > --- a/mm/percpu.c
> > +++ b/mm/percpu.c
> > @@ -1864,6 +1864,10 @@ void __percpu *pcpu_alloc_noprof(size_t size, size_t align, bool reserved,
> >
> >  area_found:
> >  	pcpu_stats_area_alloc(chunk, size);
> > +
> > +	if (pcpu_nr_empty_pop_pages < PCPU_EMPTY_POP_PAGES_LOW)
> > +		pcpu_schedule_balance_work();
> > +
>
> But the pcpu_chunk_populated() afterwards could modify the
> pcpu_nr_empty_pop_pages again, wouldn't this be a behavior changing?
>

It does, but really at this point it's a mixed bag because the lock
isn't permanently held at all while we do all these operations. The
value is read at best effort.

Ultimately the code below is populating backing pages for non-atomic
allocations. At this point the ideal situation is we're using an already
populated page. There are caveats but I can't say the prior is any
better than this version.

The code you mentioned pairs with the comment on line 916 below.

	/*
	 * If the allocation is not atomic, some blocks may not be
	 * populated with pages, while we account it here. The number
	 * of pages will be added back with pcpu_chunk_populated()
	 * when populating pages.
	 */

Thanks,
Dennis

> Regards,
> Boqun
>
> >  	spin_unlock_irqrestore(&pcpu_lock, flags);
> >
> >  	/* populate if not all pages are already there */
> > @@ -1891,9 +1895,6 @@ void __percpu *pcpu_alloc_noprof(size_t size, size_t align, bool reserved,
> >  		mutex_unlock(&pcpu_alloc_mutex);
> >  	}
> >
> > -	if (pcpu_nr_empty_pop_pages < PCPU_EMPTY_POP_PAGES_LOW)
> > -		pcpu_schedule_balance_work();
> > -
> >  	/* clear the areas and return address relative to base address */
> >  	for_each_possible_cpu(cpu)
> >  		memset((void *)pcpu_chunk_addr(chunk, cpu, 0) + off, 0, size);
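
For readers outside mm/percpu.c, the idea behind the patch above (do the low-watermark check before dropping the lock, rather than reading the counter unlocked afterwards) can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel code: pool_lock, nr_empty_pages, EMPTY_PAGES_LOW, balance_requests and take_pages() are all hypothetical stand-ins for pcpu_lock, pcpu_nr_empty_pop_pages, PCPU_EMPTY_POP_PAGES_LOW, the call to pcpu_schedule_balance_work() and the allocation path.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for PCPU_EMPTY_POP_PAGES_LOW. */
#define EMPTY_PAGES_LOW 2

/* Hypothetical stand-in for pcpu_lock. */
static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical stand-in for pcpu_nr_empty_pop_pages; protected by pool_lock. */
static int nr_empty_pages = 4;

/* Counts how often the (hypothetical) balance work would be scheduled. */
static int balance_requests;

/*
 * Take `pages` pages from the pool. Returns true on success.
 * The low-watermark check runs while pool_lock is still held, so the
 * read of nr_empty_pages cannot race with a concurrent writer.
 */
static bool take_pages(int pages)
{
	bool ok = false;

	assert(pages > 0);

	pthread_mutex_lock(&pool_lock);
	if (nr_empty_pages >= pages) {
		nr_empty_pages -= pages;
		ok = true;
	}
	/* Check moved inside the critical section, as in the patch. */
	if (nr_empty_pages < EMPTY_PAGES_LOW)
		balance_requests++;
	pthread_mutex_unlock(&pool_lock);

	return ok;
}
```

The trade-off is the one raised in the thread: the check sees the counter as it is at unlock time, so any later adjustment (the analogue of pcpu_chunk_populated()) is not reflected in the decision. The alternatives discussed, data_race() or READ_ONCE()/WRITE_ONCE(), keep the read outside the lock and mark it instead.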