From: Suren Baghdasaryan
Date: Tue, 17 Jan 2023 17:06:57 -0800
Subject: Re: [PATCH 28/41] mm: introduce lock_vma_under_rcu to be used from arch-specific code
To: Michal Hocko
Cc: akpm@linux-foundation.org, michel@lespinasse.org, jglisse@google.com,
 vbabka@suse.cz, hannes@cmpxchg.org, mgorman@techsingularity.net,
 dave@stgolabs.net, willy@infradead.org, liam.howlett@oracle.com,
 peterz@infradead.org, ldufour@linux.ibm.com, laurent.dufour@fr.ibm.com,
 paulmck@kernel.org, luto@kernel.org, songliubraving@fb.com,
 peterx@redhat.com, david@redhat.com, dhowells@redhat.com,
 hughd@google.com, bigeasy@linutronix.de, kent.overstreet@linux.dev,
 punit.agrawal@bytedance.com, lstoakes@gmail.com, peterjung1337@gmail.com,
 rientjes@google.com, axelrasmussen@google.com, joelaf@google.com,
 minchan@google.com, jannh@google.com, shakeelb@google.com,
 tatashin@google.com, edumazet@google.com, gthelen@google.com,
 gurua@google.com, arjunroy@google.com, soheil@google.com,
 hughlynch@google.com, leewalsh@google.com, posk@google.com,
 linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org,
 linuxppc-dev@lists.ozlabs.org, x86@kernel.org,
 linux-kernel@vger.kernel.org, kernel-team@android.com
References: <20230109205336.3665937-1-surenb@google.com> <20230109205336.3665937-29-surenb@google.com>

On Tue, Jan 17, 2023 at 7:47 AM Michal Hocko wrote:
>
> On Mon 09-01-23 12:53:23, Suren Baghdasaryan wrote:
> > Introduce lock_vma_under_rcu function to lookup and lock a VMA during
> > page fault handling. When VMA is not found, can't be locked or changes
> > after being locked, the function returns NULL. The lookup is performed
> > under RCU protection to prevent the found VMA from being destroyed before
> > the VMA lock is acquired. VMA lock statistics are updated according to
> > the results.
> > For now only anonymous VMAs can be searched this way. In other cases the
> > function returns NULL.
>
> Could you describe why only anonymous vmas are handled at this stage and
> what (roughly) has to be done to support other vmas? lock_vma_under_rcu
> doesn't seem to have any anonymous vma specific requirements AFAICS.

TBH I haven't spent much time looking into file-backed page faults
yet, but a couple of tasks I can think of are:
- Ensure that none of the vma->vm_ops->fault() handlers rely on
mmap_lock being read-locked;
- Freeing vma->vm_file, like freeing the VMA itself, will need to be
done after an RCU grace period, since page fault handlers use it. This
will require some caution, because simply adding it to __vm_area_free(),
which is called via call_rcu(), will cause the corresponding
fops->release() to be called asynchronously. I had to solve this issue
in the out-of-tree SPF implementation, where an asynchronously called
snd_pcm_release() was problematic.
I'm sure I'm missing more potential issues. Maybe Matthew and Michel
can pinpoint more things to resolve here?

>
> Also isn't lock_vma_under_rcu effectively find_read_lock_vma? Not that
> the naming is really the most important part but the rcu locking is
> internal to the function so why should we spread this implementation
> detail to the world...

I wanted the name to indicate that the lookup is done with no locks
held. But I'm open to suggestions.

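To make the vma->vm_file concern above more concrete, here is a rough userspace sketch in C. It is not kernel code: defer_call() and grace_period_elapsed() are hypothetical stand-ins for call_rcu() and the end of a grace period, and struct file_obj is an invented placeholder for a file with a release hook. It only shows that a release hook placed in a deferred callback has not run yet when the unmapping path returns.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEFERRED 8

/* Hypothetical stand-in for a file whose release hook has side effects. */
struct file_obj {
	bool released;	/* set once the release hook has run */
};

typedef void (*deferred_fn)(void *);

/* Hypothetical model of a call_rcu()-style queue of deferred callbacks. */
struct deferred_queue {
	deferred_fn fn[MAX_DEFERRED];
	void *arg[MAX_DEFERRED];
	size_t count;
};

/* Queue a callback to run later; returns false if the queue is full. */
static bool defer_call(struct deferred_queue *q, deferred_fn fn, void *arg)
{
	if (q->count == MAX_DEFERRED)
		return false;
	q->fn[q->count] = fn;
	q->arg[q->count] = arg;
	q->count++;
	return true;
}

/* Model the end of a grace period: run and drop every queued callback. */
void grace_period_elapsed(struct deferred_queue *q)
{
	for (size_t i = 0; i < q->count; i++) {
		assert(q->fn[i] != NULL);
		q->fn[i](q->arg[i]);
	}
	q->count = 0;
}

/* Model of a release hook such as fops->release(). */
static void release_file(void *arg)
{
	struct file_obj *f = arg;

	f->released = true;
}

/*
 * Naive variant: the release hook itself is deferred, so the caller
 * returns before the hook has run. This is the asynchronous behaviour
 * described above.
 */
bool unmap_deferred_release(struct deferred_queue *q, struct file_obj *f)
{
	return defer_call(q, release_file, f);
}
```

After unmap_deferred_release() returns, f->released is still false and only becomes true once grace_period_elapsed() runs. Any caller that expects the release to have completed by the time the unmap returns would observe the difference.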
>
> > Signed-off-by: Suren Baghdasaryan
> > ---
> >  include/linux/mm.h |  3 +++
> >  mm/memory.c        | 51 ++++++++++++++++++++++++++++++++++++++++++++++
> >  2 files changed, 54 insertions(+)
> >
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index c464fc8a514c..d0fddf6a1de9 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -687,6 +687,9 @@ static inline void vma_assert_no_reader(struct vm_area_struct *vma)
> >  		      vma);
> >  }
> >
> > +struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
> > +					  unsigned long address);
> > +
> >  #else /* CONFIG_PER_VMA_LOCK */
> >
> >  static inline void vma_init_lock(struct vm_area_struct *vma) {}
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 9ece18548db1..a658e26d965d 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -5242,6 +5242,57 @@ vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
> >  }
> >  EXPORT_SYMBOL_GPL(handle_mm_fault);
> >
> > +#ifdef CONFIG_PER_VMA_LOCK
> > +/*
> > + * Lookup and lock a VMA under RCU protection. Returned VMA is guaranteed to be
> > + * stable and not isolated. If the VMA is not found or is being modified the
> > + * function returns NULL.
> > + */
> > +struct vm_area_struct *lock_vma_under_rcu(struct mm_struct *mm,
> > +					  unsigned long address)
> > +{
> > +	MA_STATE(mas, &mm->mm_mt, address, address);
> > +	struct vm_area_struct *vma, *validate;
> > +
> > +	rcu_read_lock();
> > +	vma = mas_walk(&mas);
> > +retry:
> > +	if (!vma)
> > +		goto inval;
> > +
> > +	/* Only anonymous vmas are supported for now */
> > +	if (!vma_is_anonymous(vma))
> > +		goto inval;
> > +
> > +	if (!vma_read_trylock(vma))
> > +		goto inval;
> > +
> > +	/* Check since vm_start/vm_end might change before we lock the VMA */
> > +	if (unlikely(address < vma->vm_start || address >= vma->vm_end)) {
> > +		vma_read_unlock(vma);
> > +		goto inval;
> > +	}
> > +
> > +	/* Check if the VMA got isolated after we found it */
> > +	mas.index = address;
> > +	validate = mas_walk(&mas);
> > +	if (validate != vma) {
> > +		vma_read_unlock(vma);
> > +		count_vm_vma_lock_event(VMA_LOCK_MISS);
> > +		/* The area was replaced with another one. */
> > +		vma = validate;
> > +		goto retry;
> > +	}
> > +
> > +	rcu_read_unlock();
> > +	return vma;
> > +inval:
> > +	rcu_read_unlock();
> > +	count_vm_vma_lock_event(VMA_LOCK_ABORT);
> > +	return NULL;
> > +}
> > +#endif /* CONFIG_PER_VMA_LOCK */
> > +
> >  #ifndef __PAGETABLE_P4D_FOLDED
> >  /*
> >   * Allocate p4d page table.
> > --
> > 2.39.0
>
> --
> Michal Hocko
> SUSE Labs
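For readers who want the calling pattern spelled out, here is a minimal userspace sketch in C of how a fault handler might use a function shaped like lock_vma_under_rcu(): try the optimistic per-VMA path first, and fall back to the global-lock path whenever it returns NULL. Everything in it is an assumption for illustration: struct region, lock_region_optimistic(), handle_fault() and the fault_path enum are invented names, the lookup is a linear scan rather than a maple tree walk, and there is no RCU or real concurrency, so the post-lock range check is trivially true and the second tree walk from the patch is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a VMA; not the kernel's struct. */
struct region {
	unsigned long start;	/* inclusive */
	unsigned long end;	/* exclusive */
	bool anon;		/* models vma_is_anonymous() */
	bool write_locked;	/* models a writer holding the per-VMA lock */
	int readers;		/* models current read-lock holders */
};

enum fault_path { PATH_PER_REGION_LOCK, PATH_GLOBAL_LOCK, PATH_BAD_ADDRESS };

/* Linear lookup; the patch walks the maple tree instead. */
static struct region *find_region(struct region *tbl, size_t n,
				  unsigned long addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr >= tbl[i].start && addr < tbl[i].end)
			return &tbl[i];
	return NULL;
}

/* Never blocks: fails immediately if a writer holds the lock. */
static bool region_read_trylock(struct region *r)
{
	if (r->write_locked)
		return false;
	r->readers++;
	return true;
}

static void region_read_unlock(struct region *r)
{
	assert(r->readers > 0);
	r->readers--;
}

/* Same shape as lock_vma_under_rcu(): NULL means "use the slow path". */
static struct region *lock_region_optimistic(struct region *tbl, size_t n,
					     unsigned long addr)
{
	struct region *r = find_region(tbl, n, addr);

	if (!r || !r->anon)		/* only anonymous regions for now */
		return NULL;
	if (!region_read_trylock(r))
		return NULL;
	/* Re-check the range after locking, mirroring the patch. */
	if (addr < r->start || addr >= r->end) {
		region_read_unlock(r);
		return NULL;
	}
	return r;
}

enum fault_path handle_fault(struct region *tbl, size_t n, unsigned long addr)
{
	struct region *r = lock_region_optimistic(tbl, n, addr);

	if (r) {
		/* Fast path: fault handled under the per-region lock only. */
		region_read_unlock(r);
		return PATH_PER_REGION_LOCK;
	}
	/* Slow path: models taking the global lock and looking up again. */
	r = find_region(tbl, n, addr);
	return r ? PATH_GLOBAL_LOCK : PATH_BAD_ADDRESS;
}
```

In this model a fault in an unlocked anonymous region takes the fast path, while a file-backed region or a region held by a writer falls through to the global-lock path. The point of returning NULL rather than blocking is that the caller always has a correct fallback available.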