From: Suren Baghdasaryan <surenb@google.com>
Date: Fri, 17 Feb 2023 07:54:04 -0800
Subject: Re: [PATCH v3 21/35] mm/mmap: write-lock adjacent VMAs if they can grow into unmapped area
To: "Liam R. Howlett", Suren Baghdasaryan, akpm@linux-foundation.org,
	michel@lespinasse.org, jglisse@google.com, mhocko@suse.com, vbabka@suse.cz,
	hannes@cmpxchg.org, mgorman@techsingularity.net, dave@stgolabs.net,
	willy@infradead.org, peterz@infradead.org, ldufour@linux.ibm.com,
	paulmck@kernel.org, mingo@redhat.com, will@kernel.org, luto@kernel.org,
	songliubraving@fb.com, peterx@redhat.com, david@redhat.com,
	dhowells@redhat.com, hughd@google.com, bigeasy@linutronix.de,
	kent.overstreet@linux.dev, punit.agrawal@bytedance.com, lstoakes@gmail.com,
	peterjung1337@gmail.com, rientjes@google.com, chriscli@google.com,
	axelrasmussen@google.com, joelaf@google.com, minchan@google.com,
	rppt@kernel.org, jannh@google.com, shakeelb@google.com, tatashin@google.com,
	edumazet@google.com, gthelen@google.com, gurua@google.com,
	arjunroy@google.com, soheil@google.com, leewalsh@google.com, posk@google.com,
	michalechner92@googlemail.com, linux-mm@kvack.org,
	linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org,
	x86@kernel.org, linux-kernel@vger.kernel.org, kernel-team@android.com
In-Reply-To: <20230217145052.y526nmjudi6t2ael@revolver>
References: <20230216051750.3125598-1-surenb@google.com>
	<20230216051750.3125598-22-surenb@google.com>
	<20230216153405.zo4l2lqpnc2agdzg@revolver>
	<20230217145052.y526nmjudi6t2ael@revolver>

On Fri, Feb 17, 2023 at 6:51 AM Liam R. Howlett wrote:
>
> * Suren Baghdasaryan [230216 14:36]:
> > On Thu, Feb 16, 2023 at 7:34 AM Liam R. Howlett wrote:
> > >
> > > First, sorry I didn't see this before v3..
> >
> > Feedback at any time is highly appreciated!
> >
> > >
> > > * Suren Baghdasaryan [230216 00:18]:
> > > > While unmapping VMAs, adjacent VMAs might be able to grow into the area
> > > > being unmapped. In such cases write-lock adjacent VMAs to prevent this
> > > > growth.
> > > >
> > > > Signed-off-by: Suren Baghdasaryan
> > > > ---
> > > >  mm/mmap.c | 8 +++++---
> > > >  1 file changed, 5 insertions(+), 3 deletions(-)
> > > >
> > > > diff --git a/mm/mmap.c b/mm/mmap.c
> > > > index 118b2246bba9..00f8c5798936 100644
> > > > --- a/mm/mmap.c
> > > > +++ b/mm/mmap.c
> > > > @@ -2399,11 +2399,13 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
> > > >  	 * down_read(mmap_lock) and collide with the VMA we are about to unmap.
> > > >  	 */
> > > >  	if (downgrade) {
> > > > -		if (next && (next->vm_flags & VM_GROWSDOWN))
> > > > +		if (next && (next->vm_flags & VM_GROWSDOWN)) {
> > > > +			vma_start_write(next);
> > > >  			downgrade = false;
> > >
> > > If the mmap write lock is insufficient to protect us from next/prev
> > > modifications then we need to move *most* of this block above the maple
> > > tree write operation, otherwise we have a race here. When I say most, I
> > > mean everything besides the call to mmap_write_downgrade() needs to be
> > > moved.
> >
> > Which prior maple tree write operation are you referring to? I see
> > __split_vma() and munmap_sidetree() which both already lock the VMAs
> > they operate on, so page faults can't happen in those VMAs.
>
> The write that removes the VMAs from the maple tree a few lines above..
> /* Point of no return */
>
> If the mmap lock is not sufficient, then we need to move the
> vma_start_write() of prev/next to above the call to
> vma_iter_clear_gfp() in do_vmi_align_munmap().
>
> But I still think it IS enough.
>
> >
> > >
> > > If the mmap write lock is sufficient to protect us from next/prev
> > > modifications then we don't need to write lock the vmas themselves.
> >
> > mmap write lock is not sufficient because with per-VMA locks we do not
> > take mmap lock at all.
>
> Understood, but it also does not expand VMAs.
>
> > >
> > > I believe this is for expand_stack() protection, so I believe it's okay
> > > to not vma write lock these vmas.. I don't think there are other areas
> > > where we can modify the vmas without holding the mmap lock, but others
> > > on the CC list please chime in if I've forgotten something.
> > >
> > > So, if I am correct, then you shouldn't lock next/prev and allow the
> > > vma locking fault method on these vmas. This will work because
> > > lock_vma_under_rcu() uses mas_walk() on the faulting address. That is,
> > > your lock_vma_under_rcu() will fail to find anything that needs to be
> > > grown and go back to mmap lock protection. As it is written today, the
> > > vma locking fault handler will fail and we will wait for the mmap lock
> > > to be released even when the vma isn't going to expand.
> >
> > So, let's consider a case when the next VMA is not being removed (so
> > it was neither removed nor locked by munmap_sidetree()) and it is
> > found by lock_vma_under_rcu() in the page fault handling path.
>
> By this point next VMA is either NULL or outside the munmap area, so
> what you said here is always true.
>
> > Page fault handler can now expand it and push into the area we are
> > unmapping in unmap_region(). That is the race I'm trying to prevent
> > here by locking the next/prev VMAs which can be expanded before
> > unmap_region() unmaps them. Am I missing something?
>
> Yes, I think the part you are missing (or I am missing..) is that
> expand_stack() will never be called without the mmap lock. We don't use
> the vma locking to expand the stack.

Ah, yes, you are absolutely right.
I missed that when the VMA expands
as a result of a page fault, lock_vma_under_rcu() can't find the
faulting VMA (the fault is outside of the area and hence the need to
expand) and will fall back to mmap read locking. Since
do_vmi_align_munmap() holds the mmap write lock and does not downgrade
it, the race will be avoided and expansion will wait until we drop the
mmap write lock.

Good catch, Liam! We can drop this patch completely from the series.
Thanks,
Suren.

> ...