References: <20240122071324.2099712-3-surenb@google.com> <20240123053629.365673-1-sj@kernel.org>
From: Suren Baghdasaryan
Date: Tue, 23 Jan 2024 15:12:44 -0800
Subject: Re: [PATCH 3/3] mm/maps: read proc/pid/maps under RCU
To: SeongJae Park
Cc: akpm@linux-foundation.org, viro@zeniv.linux.org.uk, brauner@kernel.org,
	jack@suse.cz, dchinner@redhat.com, casey@schaufler-ca.com,
	ben.wolsieffer@hefring.com, paulmck@kernel.org, david@redhat.com,
	avagin@google.com, usama.anjum@collabora.com, peterx@redhat.com,
	hughd@google.com, ryan.roberts@arm.com, wangkefeng.wang@huawei.com,
	Liam.Howlett@oracle.com, yuzhao@google.com, axelrasmussen@google.com,
	lstoakes@gmail.com, talumbau@google.com, willy@infradead.org,
	vbabka@suse.cz, mgorman@techsingularity.net, jhubbard@nvidia.com,
	vishal.moola@gmail.com, mathieu.desnoyers@efficios.com,
	dhowells@redhat.com, jgg@ziepe.ca, sidhartha.kumar@oracle.com,
	andriy.shevchenko@linux.intel.com, yangxingui@huawei.com,
	keescook@chromium.org, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	kernel-team@android.com

On Mon, Jan 22, 2024 at 10:07 PM Suren Baghdasaryan wrote:
>
> On Mon, Jan 22, 2024 at 9:36 PM SeongJae Park wrote:
> >
> > Hi Suren,
> >
> > On Sun, 21 Jan 2024 23:13:24 -0800 Suren Baghdasaryan wrote:
> >
> > > With maple_tree supporting vma tree traversal under RCU and per-vma locks
> > > making vma access RCU-safe, /proc/pid/maps can be read under RCU and
> > > without the need to read-lock mmap_lock. However vma content can change
> > > from under us, therefore we make a copy of the vma and we pin pointer
> > > fields used when generating the output (currently only vm_file and
> > > anon_name). Afterwards we check for concurrent address space
> > > modifications, wait for them to end and retry. That last check is needed
> > > to avoid possibility of missing a vma during concurrent maple_tree
> > > node replacement, which might report a NULL when a vma is replaced
> > > with another one. While we take the mmap_lock for reading during such
> > > contention, we do that momentarily only to record new mm_wr_seq counter.
> > > This change is designed to reduce mmap_lock contention and prevent a
> > > process reading /proc/pid/maps files (often a low priority task, such as
> > > monitoring/data collection services) from blocking address space updates.
> > >
> > > Note that this change has a userspace visible disadvantage: it allows for
> > > sub-page data tearing as opposed to the previous mechanism where data
> > > tearing could happen only between pages of generated output data.
> > > Since current userspace considers data tearing between pages to be
> > > acceptable, we assume it will be able to handle sub-page data tearing
> > > as well.
> > >
> > > Signed-off-by: Suren Baghdasaryan
> > > ---
> > >  fs/proc/internal.h |   2 +
> > >  fs/proc/task_mmu.c | 114 ++++++++++++++++++++++++++++++++++++++++++---
> > >  2 files changed, 109 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/fs/proc/internal.h b/fs/proc/internal.h
> > > index a71ac5379584..e0247225bb68 100644
> > > --- a/fs/proc/internal.h
> > > +++ b/fs/proc/internal.h
> > > @@ -290,6 +290,8 @@ struct proc_maps_private {
> > >  	struct task_struct *task;
> > >  	struct mm_struct *mm;
> > >  	struct vma_iterator iter;
> > > +	unsigned long mm_wr_seq;
> > > +	struct vm_area_struct vma_copy;
> > >  #ifdef CONFIG_NUMA
> > >  	struct mempolicy *task_mempolicy;
> > >  #endif
> > > diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
> > > index 3f78ebbb795f..3886d04afc01 100644
> > > --- a/fs/proc/task_mmu.c
> > > +++ b/fs/proc/task_mmu.c
> > > @@ -126,11 +126,96 @@ static void release_task_mempolicy(struct proc_maps_private *priv)
> > >  }
> > >  #endif
> > >
> > > -static struct vm_area_struct *proc_get_vma(struct proc_maps_private *priv,
> > > -					   loff_t *ppos)
> > > +#ifdef CONFIG_PER_VMA_LOCK
> > > +
> > > +static const struct seq_operations proc_pid_maps_op;
> > > +/*
> > > + * Take VMA snapshot and pin vm_file and anon_name as they are used by
> > > + * show_map_vma.
> > > + */
> > > +static int get_vma_snapshow(struct proc_maps_private *priv, struct vm_area_struct *vma)
> > >  {
> > > +	struct vm_area_struct *copy = &priv->vma_copy;
> > > +	int ret = -EAGAIN;
> > > +
> > > +	memcpy(copy, vma, sizeof(*vma));
> > > +	if (copy->vm_file && !get_file_rcu(&copy->vm_file))
> > > +		goto out;
> > > +
> > > +	if (copy->anon_name && !anon_vma_name_get_rcu(copy))
> > > +		goto put_file;
> >
> > With today's updated mm-unstable, which contains this patch, I'm getting the
> > build error below when CONFIG_ANON_VMA_NAME is not set. It seems this patch
> > needs to handle that case?
>
> Hi SeongJae,
> Thanks for reporting!
> I'll post an updated version fixing this config.

Fix is posted at
https://lore.kernel.org/all/20240123231014.3801041-3-surenb@google.com/
as part of v2 of this patchset.
Thanks,
Suren.

> Suren.
> >
> >
> >     .../linux/fs/proc/task_mmu.c: In function ‘get_vma_snapshow’:
> >     .../linux/fs/proc/task_mmu.c:145:19: error: ‘struct vm_area_struct’ has no member named ‘anon_name’; did you mean ‘anon_vma’?
> >       145 |         if (copy->anon_name && !anon_vma_name_get_rcu(copy))
> >           |                   ^~~~~~~~~
> >           |                   anon_vma
> >     .../linux/fs/proc/task_mmu.c:161:19: error: ‘struct vm_area_struct’ has no member named ‘anon_name’; did you mean ‘anon_vma’?
> >       161 |         if (copy->anon_name)
> >           |                   ^~~~~~~~~
> >           |                   anon_vma
> >     .../linux/fs/proc/task_mmu.c:162:41: error: ‘struct vm_area_struct’ has no member named ‘anon_name’; did you mean ‘anon_vma’?
> >       162 |                 anon_vma_name_put(copy->anon_name);
> >           |                                         ^~~~~~~~~
> >           |                                         anon_vma
> >     .../linux/fs/proc/task_mmu.c: In function ‘put_vma_snapshot’:
> >     .../linux/fs/proc/task_mmu.c:174:18: error: ‘struct vm_area_struct’ has no member named ‘anon_name’; did you mean ‘anon_vma’?
> >       174 |         if (vma->anon_name)
> >           |                  ^~~~~~~~~
> >           |                  anon_vma
> >     .../linux/fs/proc/task_mmu.c:175:40: error: ‘struct vm_area_struct’ has no member named ‘anon_name’; did you mean ‘anon_vma’?
> >       175 |                 anon_vma_name_put(vma->anon_name);
> >           |                                        ^~~~~~~~~
> >           |                                        anon_vma
> >
> > [...]
> >
> >
> > Thanks,
> > SJ
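
The build errors quoted above arise because struct vm_area_struct has no anon_name member when CONFIG_ANON_VMA_NAME is not set. One common way to handle an optional member like this is to hide the access behind helpers that become no-ops when the option is off. What follows is a minimal userspace sketch of that pattern and not necessarily how v2 fixes it: struct vma, struct name_ref, name_get(), name_put(), take_snapshot() and drop_snapshot() are hypothetical names made up for illustration, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Uncomment to mimic a build with the option enabled. */
/* #define CONFIG_ANON_VMA_NAME 1 */

/* Hypothetical refcounted name object (a stand-in, not a kernel type). */
struct name_ref {
	int refcount;
	const char *name;
};

/* Hypothetical, heavily reduced stand-in for a VMA. */
struct vma {
	unsigned long start;
	unsigned long end;
#ifdef CONFIG_ANON_VMA_NAME
	struct name_ref *anon_name;	/* member exists only with the option */
#endif
};

#ifdef CONFIG_ANON_VMA_NAME
/* Pin the name; returns 1 on success (this toy version cannot fail). */
static int name_get(struct vma *v)
{
	if (v->anon_name)
		v->anon_name->refcount++;
	return 1;
}

/* Drop the reference taken by name_get(). */
static void name_put(struct vma *v)
{
	if (v->anon_name)
		v->anon_name->refcount--;
}
#else
/* Option off: there is no member to touch, so the helpers do nothing. */
static int name_get(struct vma *v) { (void)v; return 1; }
static void name_put(struct vma *v) { (void)v; }
#endif

/* Copy src into copy and pin what the copy points to; 0 on success. */
static int take_snapshot(struct vma *copy, const struct vma *src)
{
	assert(copy != NULL && src != NULL);
	memcpy(copy, src, sizeof(*copy));
	return name_get(copy) ? 0 : -1;
}

/* Release whatever take_snapshot() pinned. */
static void drop_snapshot(struct vma *copy)
{
	name_put(copy);
}
```

With this layout the callers, take_snapshot() and drop_snapshot(), contain no #ifdef and never name the optional member directly, so the same code builds whether or not the option is defined.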