Date: Mon, 22 Jul 2024 23:13:10 -0700
From: Dennis Zhou
To: Oliver Sang
Cc: Boqun Feng, Tejun Heo, Suren Baghdasaryan, oe-lkp@lists.linux.dev,
 lkp@intel.com, linux-kernel@vger.kernel.org, Andrew Morton,
 Kent Overstreet, Kees Cook, Alexander Viro, Alex Gaynor, Alice Ryhl,
 Andreas Hindborg, Benno Lossin, Björn Roy Baron, Christoph Lameter,
 Gary Guo, Miguel Ojeda, Pasha Tatashin, Peter Zijlstra, Vlastimil Babka,
 Wedson Almeida Filho, linux-mm@kvack.org, lkmm@lists.linux.dev
Subject: Re: [linus:master] [mm] 24e44cc22a: BUG:KCSAN:data-race_in_pcpu_alloc_noprof/pcpu_block_update_hint_alloc
References: <202407191651.f24e499d-oliver.sang@intel.com>

Hi Oliver,

On Tue, Jul 23, 2024 at 02:09:38PM +0800, Oliver Sang wrote:
> hi, Dennis Zhou,
>
> On Mon, Jul 22, 2024 at 10:50:53PM -0700, Dennis Zhou wrote:
> > On Mon, Jul 22, 2024 at 01:53:52PM -0700, Boqun Feng wrote:
> > > On Mon, Jul 22, 2024 at 11:27:48AM -0700, Dennis Zhou wrote:
> > > > Hello,
> > > >
> > > > On Mon, Jul 22, 2024 at 11:03:00AM -0700, Boqun Feng wrote:
> > > > > On Mon, Jul 22, 2024 at 07:52:22AM -1000, Tejun Heo wrote:
> > > > > > On Mon, Jul 22, 2024 at 10:47:30AM -0700, Boqun Feng wrote:
> > > > > > > This looks like a data race because we read pcpu_nr_empty_pop_pages out
> > > > > > > of the lock for a best effort checking, @Tejun, maybe you could confirm
> > > > > > > on this?
> > > > > >
> > > > > > That does sound plausible.
> > > > > >
> > > > > > > -	if (pcpu_nr_empty_pop_pages < PCPU_EMPTY_POP_PAGES_LOW)
> > > > > > > +	/*
> > > > > > > +	 * Checks pcpu_nr_empty_pop_pages out of the pcpu_lock, data races may
> > > > > > > +	 * occur but this is just a best-effort checking, everything is synced
> > > > > > > +	 * in pcpu_balance_work.
> > > > > > > +	 */
> > > > > > > +	if (data_race(pcpu_nr_empty_pop_pages) < PCPU_EMPTY_POP_PAGES_LOW)
> > > > > > >  		pcpu_schedule_balance_work();
> > > > > >
> > > > > > Would it be better to use READ/WRITE_ONCE() for the variable?
> > > > > >
> > > > >
> > > > > For READ/WRITE_ONCE(), we will need to replace all write accesses and
> > > > > all out-of-lock read accesses to pcpu_nr_empty_pop_pages, like below.
> > > > > It's better in the sense that it doesn't rely on compiler behaviors on
> > > > > data races, not sure about the performance impact though.
> > > > >
> > > >
> > > > I think a better alternative is we can move it up into the lock under
> > > > area_found. The value gets updated as part of pcpu_alloc_area() as the
> > > > code above populates percpu memory that is already allocated.
> > > >
> > > >
> > > Not sure I followed what exactly you suggested here because I'm not
> > > familiar with the logic, but a simpler version would be:
> > >
> > >
> > >
> > I believe that's the only naked access of pcpu_nr_empty_pop_pages. So
> > I was thinking this'll fix this problem.
> >
> > I also don't know how to rerun this CI tho..
>
> we could test this patch. what's the base? could we apply it directly upon
> 24e44cc22a?
>
> BTW, our bot is not so clever so far to auto test fix patches, so this is kind
> of manual effort. due to resource constraint, it will be hard for us to test
> each patch (we saw several patches in this thread already) or test very fast.
>

Ah yeah that makes sense. If you don't mind testing the last one I sent,
the one below, that applies cleanly to 24e44cc22a.

Thanks,
Dennis

> >
> > ---
> > diff --git a/mm/percpu.c b/mm/percpu.c
> > index 20d91af8c033..325fb8412e90 100644
> > --- a/mm/percpu.c
> > +++ b/mm/percpu.c
> > @@ -1864,6 +1864,10 @@ void __percpu *pcpu_alloc_noprof(size_t size, size_t align, bool reserved,
> >  
> >  area_found:
> >  	pcpu_stats_area_alloc(chunk, size);
> > +
> > +	if (pcpu_nr_empty_pop_pages < PCPU_EMPTY_POP_PAGES_LOW)
> > +		pcpu_schedule_balance_work();
> > +
> >  	spin_unlock_irqrestore(&pcpu_lock, flags);
> >  
> >  	/* populate if not all pages are already there */
> > @@ -1891,9 +1895,6 @@ void __percpu *pcpu_alloc_noprof(size_t size, size_t align, bool reserved,
> >  		mutex_unlock(&pcpu_alloc_mutex);
> >  	}
> >  
> > -	if (pcpu_nr_empty_pop_pages < PCPU_EMPTY_POP_PAGES_LOW)
> > -		pcpu_schedule_balance_work();
> > -
> >  	/* clear the areas and return address relative to base address */
> >  	for_each_possible_cpu(cpu)
> >  		memset((void *)pcpu_chunk_addr(chunk, cpu, 0) + off, 0, size);
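
For anyone skimming the thread, the idea behind the diff can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kernel code: every name here (pool_lock, nr_empty_pop_pages, EMPTY_POP_PAGES_LOW, schedule_balance_work, alloc_pages_checked) is a hypothetical stand-in, and a C11 atomic_flag spinlock plays the role of pcpu_lock. The point is simply that the low-watermark check runs before the lock is dropped, so the counter is never read unlocked.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these names exist in mm/percpu.c. */
#define EMPTY_POP_PAGES_LOW 2

static atomic_flag pool_lock = ATOMIC_FLAG_INIT; /* plays the role of pcpu_lock */
static int nr_empty_pop_pages = 4;               /* only touched with pool_lock held */
static atomic_int balance_requests;              /* how often balance work was requested */

static void pool_lock_acquire(void)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&pool_lock, memory_order_acquire))
		;
}

static void pool_lock_release(void)
{
	atomic_flag_clear_explicit(&pool_lock, memory_order_release);
}

/* Stand-in for pcpu_schedule_balance_work(): only records the request. */
static void schedule_balance_work(void)
{
	atomic_fetch_add(&balance_requests, 1);
}

/*
 * Take `pages` from the pool. The low-watermark check happens while the
 * lock is still held, so the read of nr_empty_pop_pages cannot race with
 * a concurrent writer.
 */
static bool alloc_pages_checked(int pages)
{
	bool ok = false;

	assert(pages > 0);

	pool_lock_acquire();
	if (nr_empty_pop_pages >= pages) {
		nr_empty_pop_pages -= pages;
		ok = true;
		if (nr_empty_pop_pages < EMPTY_POP_PAGES_LOW)
			schedule_balance_work();
	}
	pool_lock_release();

	return ok;
}
```

The alternatives raised earlier in the thread (data_race() or READ_ONCE()/WRITE_ONCE()) would instead keep the check outside the lock and mark the accesses; this sketch does not model those.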