Date: Thu, 19 Dec 2024 12:20:11 +0100
From: Peter Zijlstra
To: Suren Baghdasaryan
Cc: "Liam R. Howlett", akpm@linux-foundation.org, willy@infradead.org,
    lorenzo.stoakes@oracle.com, mhocko@suse.com, vbabka@suse.cz,
    hannes@cmpxchg.org, mjguzik@gmail.com, oliver.sang@intel.com,
    mgorman@techsingularity.net, david@redhat.com, peterx@redhat.com,
    oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org,
    brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com,
    hughd@google.com, lokeshgidra@google.com, minchan@google.com,
    jannh@google.com, shakeel.butt@linux.dev, souravpanda@google.com,
    pasha.tatashin@soleen.com, klarasmodin@gmail.com, corbet@lwn.net,
    linux-doc@vger.kernel.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org, kernel-team@android.com
Subject: Re: [PATCH v6 10/16] mm: replace vm_lock and detached flag with a reference count
Message-ID: <20241219112011.GA34942@noisy.programming.kicks-ass.net>
References: <20241218174428.GQ2354@noisy.programming.kicks-ass.net>
    <20241219091334.GC26551@noisy.programming.kicks-ass.net>
In-Reply-To: <20241219091334.GC26551@noisy.programming.kicks-ass.net>

On Thu, Dec 19, 2024 at 10:13:34AM +0100, Peter Zijlstra wrote:
> On Wed, Dec 18, 2024 at 01:53:17PM -0800, Suren Baghdasaryan wrote:
>
> > Ah, ok I see now. I completely misunderstood what for_each_vma_range()
> > was doing.
> >
> > Then I think vma_start_write() should remain inside
> > vms_gather_munmap_vmas() and all vmas in mas_detach should be
>
> No, it must not. You really are not modifying anything yet (except the
> split, which we've already noted mark write themselves).
>
> > write-locked, even the ones we are not modifying.
> > Otherwise what would prevent the race I mentioned before?
> >
> > __mmap_region
> >     __mmap_prepare
> >         vms_gather_munmap_vmas // adds vmas to be unmapped into mas_detach,
> >                                // some locked by __split_vma(), some not locked
> >
> >                                      lock_vma_under_rcu()
> >                                          vma = mas_walk // finds unlocked vma also in mas_detach
> >                                          vma_start_read(vma) // succeeds since vma is not locked
> >                                          // vma->detached, vm_start, vm_end checks pass
> >                                      // vma is successfully read-locked
> >
> >        vms_clean_up_area(mas_detach)
> >             vms_clear_ptes
> >                                      // steps on a cleared PTE
>
> So here we have the added complexity that the vma is not unhooked at
> all. Is there anything that would prevent a concurrent gup_fast() from
> doing the same -- touch a cleared PTE?
>
> AFAICT two threads, one doing overlapping mmap() and the other doing
> gup_fast() can result in exactly this scenario.
>
> If we don't care about the GUP case, when I'm thinking we should not
> care about the lockless RCU case either.

Also, at this point we'll just fail to find a page, and that is nothing
special. The problem with accessing an unmapped VMA is that the
page-table walk will instantiate page-tables.

Given this is an overlapping mmap -- we're going to need those
page-tables anyway, so no harm done.

Only after the VMA is unlinked must we ensure we don't accidentally
re-instantiate page-tables.
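
To make the ordering argument concrete, here is a toy userspace model in C. It is only a sketch under loud assumptions, not kernel code: struct toy_vma, toy_walk(), toy_clear_ptes() and toy_unlink() are made-up names, and three single-threaded flags stand in for the VMA tree, the page-tables and the PTE. It shows just the rule stated above: a walk that hits a cleared PTE while the VMA is still linked fails to find a page and may re-create page-tables harmlessly, whereas a walk after the unlink must back off without instantiating anything.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: every name below is hypothetical, none is a kernel API. */
struct toy_vma {
	bool linked;         /* still reachable from the (toy) VMA tree */
	bool has_page_table; /* page-tables exist for this range */
	bool pte_present;    /* the PTE currently maps a page */
};

enum toy_result { TOY_PAGE_FOUND, TOY_NO_PAGE, TOY_RETRY };

/* Reader side: stands in for the lockless walk racing with mmap(). */
static enum toy_result toy_walk(struct toy_vma *vma)
{
	assert(vma != NULL);

	/* Once unlinked, the walk must not re-instantiate page-tables. */
	if (!vma->linked)
		return TOY_RETRY;

	/*
	 * Still linked: instantiating page-tables is harmless, the
	 * overlapping mmap() will need them for the same range anyway.
	 */
	if (!vma->has_page_table)
		vma->has_page_table = true;

	/* A cleared PTE just means "no page here", nothing special. */
	return vma->pte_present ? TOY_PAGE_FOUND : TOY_NO_PAGE;
}

/* Writer side: PTEs cleared and tables torn down, VMA still linked. */
static void toy_clear_ptes(struct toy_vma *vma)
{
	vma->pte_present = false;
	vma->has_page_table = false;
}

/* Writer side: the VMA is removed from the (toy) tree. */
static void toy_unlink(struct toy_vma *vma)
{
	vma->linked = false;
	vma->has_page_table = false;
}
```

In this model the interesting interleaving is toy_clear_ptes() followed by toy_walk() before toy_unlink(): the walk returns TOY_NO_PAGE and leaves page-tables behind, which is the "no harm done" case. The same walk after toy_unlink() returns TOY_RETRY and leaves has_page_table untouched.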