From: Suren Baghdasaryan <surenb@google.com>
Date: Tue, 10 Dec 2024 08:20:48 -0800
Subject: Re: [PATCH v5 4/6] mm: make vma cache SLAB_TYPESAFE_BY_RCU
To: Vlastimil Babka
Cc: akpm@linux-foundation.org, willy@infradead.org, liam.howlett@oracle.com, lorenzo.stoakes@oracle.com, mhocko@suse.com, hannes@cmpxchg.org, mjguzik@gmail.com, oliver.sang@intel.com, mgorman@techsingularity.net, david@redhat.com, peterx@redhat.com, oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org, brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com, hughd@google.com, minchan@google.com, jannh@google.com, shakeel.butt@linux.dev, souravpanda@google.com, pasha.tatashin@soleen.com, corbet@lwn.net, linux-doc@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@android.com
In-Reply-To: <6b29b2a5-c244-4930-a5a0-1a24a04e7e35@suse.cz>
References: <20241206225204.4008261-1-surenb@google.com> <20241206225204.4008261-5-surenb@google.com> <6b29b2a5-c244-4930-a5a0-1a24a04e7e35@suse.cz>

On Tue, Dec 10, 2024 at 6:21 AM Vlastimil Babka wrote:
>
> On 12/6/24 23:52, Suren Baghdasaryan wrote:
> > To enable SLAB_TYPESAFE_BY_RCU for vma cache we need to ensure that
> > object reuse before RCU grace period is over will be detected inside
> > lock_vma_under_rcu().
> > lock_vma_under_rcu() enters RCU read section, finds the vma at the
> > given address, locks the vma and checks if it got detached or remapped
> > to cover a different address range. These last checks are there
> > to ensure that the vma was not modified after we found it but before
> > locking it.
> > vma reuse introduces several new possibilities:
> > 1. vma can be reused after it was found but before it is locked;
> > 2. vma can be reused and reinitialized (including changing its vm_mm)
> > while being locked in vma_start_read();
> > 3. vma can be reused and reinitialized after it was found but before
> > it is locked, then attached at a new address or to a new mm while
> > read-locked;
> > For case #1 current checks will help detecting cases when:
> > - vma was reused but not yet added into the tree (detached check)
> > - vma was reused at a different address range (address check);
> > We are missing the check for vm_mm to ensure the reused vma was not
> > attached to a different mm. This patch adds the missing check.
> > For case #2, we pass mm to vma_start_read() to prevent access to
> > unstable vma->vm_mm. This might lead to vma_start_read() returning
> > a false locked result but that's not critical if it's rare because
> > it will only lead to a retry under mmap_lock.
> > For case #3, we ensure the order in which vma->detached flag and
> > vm_start/vm_end/vm_mm are set and checked. vma gets attached after
> > vm_start/vm_end/vm_mm were set and lock_vma_under_rcu() should check
> > vma->detached before checking vm_start/vm_end/vm_mm. This is required
> > because attaching vma happens without vma write-lock, as opposed to
> > vma detaching, which requires vma write-lock. This patch adds memory
> > barriers inside is_vma_detached() and vma_mark_attached() needed to
> > order reads and writes to vma->detached vs vm_start/vm_end/vm_mm.
> > After these provisions, SLAB_TYPESAFE_BY_RCU is added to vm_area_cachep.
> > This will facilitate vm_area_struct reuse and will minimize the number
> > of call_rcu() calls.
> >
> > Signed-off-by: Suren Baghdasaryan
>
> I'm wondering about the vma freeing path. Consider vma_complete():
>
> vma_mark_detached(vp->remove);
>   vma->detached = true; - plain write
> vm_area_free(vp->remove);
>   vma->vm_lock_seq = UINT_MAX; - plain write
>   kmem_cache_free(vm_area_cachep)
> ...
> potential reallocation
>
> against:
>
> lock_vma_under_rcu()
> - mas_walk finds a stale vma due to race
> vma_start_read()
>   if (READ_ONCE(vma->vm_lock_seq) == READ_ONCE(mm->mm_lock_seq.sequence))
>   - can be false, the vma was not being locked on the freeing side?
>   down_read_trylock(&vma->vm_lock.lock) - succeeds, wasn't locked
>     this is acquire, but was there any release?

Yes, there was a release. I think what you missed is that
vma_mark_detached(), which is called from vma_complete(), requires the
VMA to be write-locked (see vma_assert_write_locked() in
vma_mark_detached()). The rule is that a VMA can be attached without
write-locking but only a write-locked VMA can be detached. So, after
vma_mark_detached() and before down_read_trylock(&vma->vm_lock.lock)
in vma_start_read(), the VMA write-lock should have been released by
mmap_write_unlock() and therefore vma->detached=true should be visible
to the reader when it executes lock_vma_under_rcu().
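
To make that pairing concrete, here is a reduced userspace model of the
rule being relied on; struct obj, detach_object() and try_use_object()
are illustrative stand-ins rather than the kernel's vma code, and the
rwlock stands in for the vma lock:

/*
 * Reduced model, not kernel code: the unlock on the detach side acts as
 * a RELEASE and a later successful read-lock of the same lock is the
 * matching ACQUIRE, so the plain write to ->detached must be visible.
 * Build with: cc -pthread model.c
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

struct obj {
	pthread_rwlock_t lock;          /* stand-in for the vma lock */
	bool detached;
};

static void detach_object(struct obj *o)
{
	pthread_rwlock_wrlock(&o->lock);  /* "write-locked" while detaching */
	o->detached = true;               /* plain store */
	pthread_rwlock_unlock(&o->lock);  /* release: publishes the store */
}

static bool try_use_object(struct obj *o)
{
	if (pthread_rwlock_tryrdlock(&o->lock) != 0)
		return false;             /* writer still holds it: retry path */
	/* acquire: stores made before the writer's unlock are visible here */
	bool usable = !o->detached;
	pthread_rwlock_unlock(&o->lock);
	return usable;                    /* false means "fall back / retry" */
}

int main(void)
{
	struct obj o = { .detached = false };
	pthread_rwlock_init(&o.lock, NULL);
	detach_object(&o);
	printf("usable after detach: %s\n", try_use_object(&o) ? "yes" : "no");
	pthread_rwlock_destroy(&o.lock);
	return 0;
}

The model only shows the generic rule: stores made before an unlock are
guaranteed to be observed by a reader whose later acquisition of the same
lock succeeds.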
> is_vma_detached() - false negative as the write above didn't propagate
>   here yet; a read barrier but where is the write barrier?
> checks for vma->vm_mm, vm_start, vm_end - nobody reset them yet so false
>   positive, or they got reset on reallocation but writes didn't propagate
>
> Am I missing something that would prevent lock_vma_under_rcu() falsely
> succeeding here?
>
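
For reference, the attach-side ordering that the quoted patch description
aims at (initialize vm_start/vm_end/vm_mm first, mark the vma attached
with a release, check the detached flag with an acquire before trusting
the fields) can be sketched with C11 atomics roughly as follows; the names
are illustrative and this is not the actual vma_mark_attached() or
is_vma_detached() implementation:

/*
 * Reduced model of the case #3 ordering described in the patch text:
 * fields are initialized before the object is marked attached
 * (store-release), and a reader checks the detached flag (load-acquire)
 * before trusting the fields. Illustrative names, not kernel code.
 */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

struct obj {
	unsigned long start, end;       /* stand-ins for vm_start/vm_end */
	void *mm;                       /* stand-in for vm_mm */
	atomic_bool detached;
};

static void mark_attached(struct obj *o, unsigned long start,
			  unsigned long end, void *mm)
{
	o->start = start;               /* plain initialization ... */
	o->end = end;
	o->mm = mm;
	/* ... published by the release store below */
	atomic_store_explicit(&o->detached, false, memory_order_release);
}

static bool lookup_matches(struct obj *o, unsigned long addr, void *mm)
{
	/* acquire: if detached reads as false, the field writes made
	 * before the attach are also visible; a detached object is
	 * rejected before its fields are even looked at. */
	if (atomic_load_explicit(&o->detached, memory_order_acquire))
		return false;
	return o->mm == mm && addr >= o->start && addr < o->end;
}

int main(void)
{
	static int fake_mm;
	struct obj o;
	atomic_init(&o.detached, true);  /* starts out detached */
	mark_attached(&o, 0x1000, 0x2000, &fake_mm);
	printf("match: %d\n", lookup_matches(&o, 0x1800, &fake_mm)); /* 1 */
	return 0;
}

In this model, a reader that observes detached == false is guaranteed to
also observe the field values written before the attach.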