From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-0.8 required=3.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=no autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AE9DEC2BA1A for ; Tue, 7 Apr 2020 16:04:03 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 749012075E for ; Tue, 7 Apr 2020 16:04:03 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 749012075E Authentication-Results: mail.kernel.org; dmarc=none (p=none dis=none) header.from=chris-wilson.co.uk Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id E9F498E0015; Tue, 7 Apr 2020 12:04:02 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id E29038E0011; Tue, 7 Apr 2020 12:04:02 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D3EEF8E0015; Tue, 7 Apr 2020 12:04:02 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0219.hostedemail.com [216.40.44.219]) by kanga.kvack.org (Postfix) with ESMTP id B5AB98E0011 for ; Tue, 7 Apr 2020 12:04:02 -0400 (EDT) Received: from smtpin09.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 7AFBBAC11 for ; Tue, 7 Apr 2020 16:04:02 +0000 (UTC) X-FDA: 76681530324.09.end11_404daed12962e X-HE-Tag: end11_404daed12962e X-Filterd-Recvd-Size: 3820 Received: from fireflyinternet.com (mail.fireflyinternet.com [109.228.58.192]) by imf08.hostedemail.com (Postfix) with ESMTP for ; Tue, 7 Apr 2020 16:04:01 +0000 (UTC) X-Default-Received-SPF: pass (skip=forwardok (res=PASS)) x-ip-name=78.156.65.138; Received: from localhost (unverified [78.156.65.138]) by fireflyinternet.com (Firefly Internet (M1)) with ESMTP (TLS) id 20829903-1500050 for multiple; Tue, 07 Apr 2020 17:03:20 +0100 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable In-Reply-To: References: <20200406200254.a69ebd9e08c4074e41ddebaf@linux-foundation.org> <20200407031042.8o-fYMox-%akpm@linux-foundation.org> From: Chris Wilson Subject: Re: [patch 125/166] lib/list: prevent compiler reloads inside 'safe' list iteration To: Andrew Morton , Linus Torvalds Cc: David Laight , Marco Elver , Linux-MM , Mark Rutland , mm-commits@vger.kernel.org, "Paul E. McKenney" , Randy Dunlap , stable Message-ID: <158627540139.8918.10102358634447361335@build.alporthouse.com> User-Agent: alot/0.8.1 Date: Tue, 07 Apr 2020 17:03:21 +0100 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Quoting Linus Torvalds (2020-04-07 16:44:35) > On Mon, Apr 6, 2020 at 8:10 PM Andrew Morton = wrote: > > > > From: Chris Wilson > > Subject: lib/list: prevent compiler reloads inside 'safe' list iteration > > > > Instruct the compiler to read the next element in the list iteration > > once, and that it is not allowed to reload the value from the stale > > element later. This is important as during the course of the safe > > iteration, the stale element may be poisoned (unbeknownst to the > > compiler). >=20 > Andrew, Chris, this one looks rather questionable to me. >=20 > How the heck would the ->next pointer be changed without the compiler > being aware of it? That implies a bug to begin with - possibly an > inline asm that changes kernel memory without having a memory clobber. >=20 > Quite fundamentally, the READ_ONCE() doesn't seem to fix anything. If > something else can change the list _concurrently_, it's still > completely broken, and hiding the KASAN report is just hiding a bug. >=20 > What and where was the actual KASAN issue? The commit log doesn't say... It'll take some time to reconstruct the original report, but the case in question was in removing the last element of the list of the last list, switch to a global lock over all such lists to park the HW, which in doing so added one more element to the original list. [If we happen to be retiring along the kernel timeline in the first place.] list->next changed from pointing to list_head, to point to the new element instead. However, we don't have to check the next element yet and want to terminate the list iteration. For reference, drivers/gpu/drm/i915/gt/intel_engine_pm.c::__engine_park() Global activity is serialised by engine->wakeref.mutex; every active timeline is required to hold an engine wakeref, but retiring is local to timelines and serialised by their own timeline->mutex. lock(&timeline->lock) list_for_each_safe(&timeline->requests) \-> i915_request_retire [list_del(&timeline->requests)] \-> intel_timeline_exit \-> lock(&engine->wakeref.mutex) engine_park [list_add_tail(&engine->kernel_context->timeline->reque= sts)] -Chris