From: Suren Baghdasaryan
Date: Wed, 16 Jul 2025 07:29:41 -0700
Subject: Re: [PATCH v7 7/7] fs/proc/task_mmu: read proc/pid/maps under per-vma lock
To: Vlastimil Babka
Cc: akpm@linux-foundation.org, Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com,
 david@redhat.com, peterx@redhat.com, jannh@google.com, hannes@cmpxchg.org,
 mhocko@kernel.org, paulmck@kernel.org, shuah@kernel.org, adobriyan@gmail.com,
 brauner@kernel.org, josef@toxicpanda.com, yebin10@huawei.com, linux@weissschuh.net,
 willy@infradead.org, osalvador@suse.de, andrii@kernel.org, ryan.roberts@arm.com,
 christophe.leroy@csgroup.eu, tjmercier@google.com, kaleshsingh@google.com,
 aha310510@gmail.com, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 linux-mm@kvack.org, linux-kselftest@vger.kernel.org
References: <20250716030557.1547501-1-surenb@google.com> <20250716030557.1547501-8-surenb@google.com>

On Wed, Jul 16, 2025 at 6:57 AM Vlastimil Babka wrote:
>
> On 7/16/25 05:05, Suren Baghdasaryan wrote:
> > With maple_tree supporting vma tree traversal under RCU and per-vma
> > locks, /proc/pid/maps can be read while holding individual vma locks
> > instead of locking the entire address space.
> > A completely lockless approach (walking vma tree under RCU) would be
> > quite complex with the main issue being get_vma_name() using callbacks
> > which might not work correctly with a stable vma copy, requiring
> > original (unstable) vma - see special_mapping_name() for example.
> >
> > When per-vma lock acquisition fails, we take the mmap_lock for reading,
> > lock the vma, release the mmap_lock and continue. This fallback to mmap
> > read lock guarantees the reader to make forward progress even during
> > lock contention. This will interfere with the writer but for a very
> > short time while we are acquiring the per-vma lock and only when there
> > was contention on the vma reader is interested in.
> >
> > We shouldn't see a repeated fallback to mmap read locks in practice, as
> > this requires a very unlikely series of lock contentions (for instance
> > due to repeated vma split operations). However even if this did somehow
> > happen, we would still progress.
> >
> > One case requiring special handling is when a vma changes between the
> > time it was found and the time it got locked. A problematic case would
> > be if a vma got shrunk so that its vm_start moved higher in the address
> > space and a new vma was installed at the beginning:
> >
> > reader found:               |--------VMA A--------|
> > VMA is modified:            |-VMA B-|----VMA A----|
> > reader locks modified VMA A
> > reader reports VMA A:       |  gap  |----VMA A----|
> >
> > This would result in reporting a gap in the address space that does not
> > exist. To prevent this we retry the lookup after locking the vma, however
> > we do that only when we identify a gap and detect that the address space
> > was changed after we found the vma.
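
For anyone following along, the retry rule described just above can be modelled in a few lines of userspace C. This is only a sketch under loud assumptions: struct addr_space, find_vma_from(), split_front(), racing_split() and next_vma() are invented for illustration and are not the functions in the patch, a plain counter stands in for the mm sequence count, and the per-vma lock step is reduced to a comment.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_VMAS 8

struct vma { unsigned long start, end; };   /* covers [start, end) */

/* Toy address space: unsorted vma array plus a change counter. */
struct addr_space {
    struct vma vmas[MAX_VMAS];
    size_t nr;
    unsigned long seq;                      /* bumped on every modification */
};

/* Lowest vma that ends above addr, or NULL if there is none. */
struct vma *find_vma_from(struct addr_space *as, unsigned long addr)
{
    struct vma *best = NULL;

    for (size_t i = 0; i < as->nr; i++) {
        struct vma *v = &as->vmas[i];

        if (v->end > addr && (!best || v->start < best->start))
            best = v;
    }
    return best;
}

/* Writer: shrink v so it starts at mid and install a new vma in front. */
void split_front(struct addr_space *as, struct vma *v, unsigned long mid)
{
    assert(as->nr < MAX_VMAS && mid > v->start && mid < v->end);
    as->vmas[as->nr++] = (struct vma){ v->start, mid };
    v->start = mid;
    as->seq++;
}

/* Simulated concurrent writer: splits the vma the reader just found. */
void racing_split(struct addr_space *as, struct vma *v)
{
    split_front(as, v, (v->start + v->end) / 2);
}

typedef void (*race_fn)(struct addr_space *, struct vma *);

/*
 * Reader: return the next vma to report after last_end. If race is not
 * NULL it runs once, between the lookup and the (imaginary) lock.
 */
struct vma *next_vma(struct addr_space *as, unsigned long last_end,
                     race_fn race)
{
    for (;;) {
        unsigned long seq = as->seq;
        struct vma *v = find_vma_from(as, last_end);

        if (!v)
            return NULL;
        if (race) {
            race(as, v);
            race = NULL;
        }
        /* the per-vma lock would be taken here */
        if (v->start > last_end && as->seq != seq)
            continue;   /* gap plus a change: redo the lookup */
        return v;
    }
}
```

With a single vma covering 0x1000-0x5000 and racing_split() passed as the race hook, the first call returns the newly installed front vma rather than the shrunk original, so no phantom gap is reported. A gap that exists while the counter is unchanged is reported as is.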
> >
> > This change is designed to reduce mmap_lock contention and prevent a
> > process reading /proc/pid/maps files (often a low priority task, such
> > as monitoring/data collection services) from blocking address space
> > updates. Note that this change has a userspace visible disadvantage:
> > it allows for sub-page data tearing as opposed to the previous mechanism
> > where data tearing could happen only between pages of generated output
> > data. Since current userspace considers data tearing between pages to be
> > acceptable, we assume it will be able to handle sub-page data tearing
> > as well.
> >
> > Signed-off-by: Suren Baghdasaryan
>
> Reviewed-by: Vlastimil Babka
>
> Nit: the previous patch changed lines with e.g. -2UL to -2 and this seems
> to be changing the same lines to add a comment e.g. *ppos = -2; /* -2 indicates
> gate vma */
>
> That comment could have been added in the previous patch already. Also if
> you feel the need to add the comments, maybe it's time to just name those
> special values with a #define or something :)

Good point. I'll see if I can fit that into the next version.
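
A minimal sketch of what naming those values might look like, written as standalone C so it can be compiled outside the kernel. All names here (pos_t, MAPS_POS_END, MAPS_POS_GATE and the two helpers) are placeholders made up for this example, not what the patch uses. The meaning of -2 comes from the comment quoted above; treating -1 as the end-of-iteration marker is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the kernel's signed loff_t; hypothetical for this sketch. */
typedef long long pos_t;

/* Assumption: -1 marks the end of the maps iteration. */
#define MAPS_POS_END   ((pos_t)-1)
/* From the thread: -2 indicates the gate vma. */
#define MAPS_POS_GATE  ((pos_t)-2)

static_assert(MAPS_POS_END != MAPS_POS_GATE, "sentinels must be distinct");

/* True if pos is one of the special values rather than a real address. */
bool maps_pos_is_sentinel(pos_t pos)
{
    return pos == MAPS_POS_END || pos == MAPS_POS_GATE;
}

/* Replaces a bare "*ppos = -2;" so no explanatory comment is needed. */
void maps_pos_set_gate(pos_t *ppos)
{
    *ppos = MAPS_POS_GATE;
}
```

With named values the assignment documents itself, and a later change to the encoding would only touch the two definitions.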