From: Joel Fernandes
Subject: Re: [PATCH 0/2] fix vma->anon_vma check for per-VMA locking; fix anon_vma memory ordering
Date: Thu, 27 Jul 2023 12:34:44 -0400
To: Will Deacon
Cc: Jann Horn, paulmck@kernel.org, Andrew Morton, Linus Torvalds, Peter Zijlstra, Suren Baghdasaryan, Matthew Wilcox, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Alan Stern, Andrea Parri, Boqun Feng, Nicholas Piggin, David Howells, Jade Alglave, Luc Maranget, Akira Yokosawa, Daniel Lustig
> On Jul 27, 2023, at 10:57 AM, Will Deacon wrote:
>
> On Thu, Jul 27, 2023 at 04:39:34PM +0200, Jann Horn wrote:
>> On Thu, Jul 27, 2023 at 1:19 AM Paul E. McKenney wrote:
>>> On Wed, Jul 26, 2023 at 11:41:01PM +0200, Jann Horn wrote:
>>>> Hi!
>>>>
>>>> Patch 1 here is a straightforward fix for a race in per-VMA locking code
>>>> that can lead to use-after-free; I hope we can get this one into
>>>> mainline and stable quickly.
>>>> Patch 2 is a fix for what I believe is a longstanding memory ordering
>>>> issue in how vma->anon_vma is used across the MM subsystem; I expect
>>>> that this one will have to go through a few iterations of review and
>>>> potentially rewrites, because memory ordering is tricky.
>>>> (If someone else wants to take over patch 2, I would be very happy.)
>>>>
>>>> These patches don't really belong together all that much, I'm just
>>>> sending them as a series because they'd otherwise conflict.
>>>>
>>>> I am CCing:
>>>> - Suren because patch 1 touches his code
>>>> - Matthew Wilcox because he is also currently working on per-VMA
>>>>   locking stuff
>>>> - all the maintainers/reviewers for the Kernel Memory Consistency Model
>>>>   so they can help figure out the READ_ONCE() vs smp_load_acquire()
>>>>   thing
>>>
>>> READ_ONCE() has weaker ordering properties than smp_load_acquire().
>>> For example, given a pointer gp:
>>>
>>>	p = whichever(gp);
>>>	a = 1;
>>>	r1 = p->b;
>>>	if ((uintptr_t)p & 0x1)
>>>		WRITE_ONCE(b, 1);
>>>	WRITE_ONCE(c, 1);
>>>
>>> Leaving aside the "&" needed by smp_load_acquire(), if "whichever" is
>>> "READ_ONCE", then the load from p->b and the WRITE_ONCE() to "b" are
>>> ordered after the load from gp (the former due to an address dependency
>>> and the latter due to a (fragile) control dependency). The compiler
>>> is within its rights to reorder the store to "a" to precede the load
>>> from gp. The compiler is forbidden from reordering the store to "c"
>>> with the load from gp (because both are volatile accesses), but the CPU
>>> is completely within its rights to do this reordering.
>>> But if "whichever" is "smp_load_acquire()", all four of the subsequent
>>> memory accesses are ordered after the load from gp.
>>>
>>> Similarly, for WRITE_ONCE() and smp_store_release():
>>>
>>>	p = READ_ONCE(gp);
>>>	r1 = READ_ONCE(gi);
>>>	r2 = READ_ONCE(gj);
>>>	a = 1;
>>>	WRITE_ONCE(b, 1);
>>>	if (r1 & 0x1)
>>>		whichever(p->q, r2);
>>>
>>> Again leaving aside the "&" needed by smp_store_release(), if "whichever"
>>> is WRITE_ONCE(), then the load from gp, the load from gi, and the load
>>> from gj are all ordered before the store to p->q (by address dependency,
>>> control dependency, and data dependency, respectively). The store to "a"
>>> can be reordered with the store to p->q by the compiler. The store to
>>> "b" cannot be reordered with the store to p->q by the compiler (again,
>>> both are volatile), but the CPU is free to reorder them, especially when
>>> whichever() is implemented as a conditional store.
>>>
>>> But if "whichever" is "smp_store_release()", all five of the earlier
>>> memory accesses are ordered before the store to p->q.
>>>
>>> Does that help, or am I missing the point of your question?
>>
>> My main question is how permissible/ugly you think the following use
>> of READ_ONCE() would be, and whether you think it ought to be an
>> smp_load_acquire() instead.
>>
>> Assume that we are holding some kind of lock that ensures that the
>> only possible concurrent update to "vma->anon_vma" is that it changes
>> from a NULL pointer to a non-NULL pointer (using smp_store_release()).
>>
>>	if (READ_ONCE(vma->anon_vma) != NULL) {
>>		// we now know that vma->anon_vma cannot change anymore
>>
>>		// access the same memory location again with a plain load
>>		struct anon_vma *a = vma->anon_vma;
>>
>>		// this needs to be address-dependency-ordered against one of
>>		// the loads from vma->anon_vma
>>		struct anon_vma *root = a->root;
>>	}
>>
>> Is this fine?
>> If it is not fine just because the compiler might
>> reorder the plain load of vma->anon_vma before the READ_ONCE() load,
>> would it be fine after adding a barrier() directly after the
>> READ_ONCE()?
>
> I'm _very_ wary of mixing READ_ONCE() and plain loads to the same variable,
> as I've run into cases where you have sequences such as:
>
>	// Assume *ptr is initially 0 and somebody else writes it to 1
>	// concurrently
>
>	foo = *ptr;
>	bar = READ_ONCE(*ptr);
>	baz = *ptr;
>
> and you can get foo == baz == 0 but bar == 1 because the compiler only
> ends up reading from memory twice.
>
> That was the root cause behind f069faba6887 ("arm64: mm: Use READ_ONCE
> when dereferencing pointer to pte table"), which was very unpleasant to
> debug.

Will,

Unless I am missing something fundamental, this case is different though. This case does not care about fewer reads. As long as the first read is volatile, the subsequent loads (even plain ones) should work fine, no? I am not seeing how the compiler can screw that up, so please do enlighten :). Also, RCU read dereference does a similar pattern (as Alan also pointed out).

Cheers,

- Joel

>
> Will