From: Vlastimil Babka <vbabka@suse.cz>
Date: Wed, 16 Jul 2025 15:57:06 +0200
Subject: Re: [PATCH v7 7/7] fs/proc/task_mmu: read proc/pid/maps under per-vma lock
To: Suren Baghdasaryan, akpm@linux-foundation.org
Cc: Liam.Howlett@oracle.com, lorenzo.stoakes@oracle.com, david@redhat.com,
 peterx@redhat.com, jannh@google.com, hannes@cmpxchg.org, mhocko@kernel.org,
 paulmck@kernel.org, shuah@kernel.org, adobriyan@gmail.com,
 brauner@kernel.org, josef@toxicpanda.com, yebin10@huawei.com,
 linux@weissschuh.net, willy@infradead.org, osalvador@suse.de,
 andrii@kernel.org, ryan.roberts@arm.com, christophe.leroy@csgroup.eu,
 tjmercier@google.com, kaleshsingh@google.com, aha310510@gmail.com,
 linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 linux-mm@kvack.org, linux-kselftest@vger.kernel.org
References: <20250716030557.1547501-1-surenb@google.com>
 <20250716030557.1547501-8-surenb@google.com>
In-Reply-To: <20250716030557.1547501-8-surenb@google.com>

On 7/16/25 05:05, Suren Baghdasaryan wrote:
> With maple_tree supporting vma tree traversal under RCU and per-vma
> locks, /proc/pid/maps can be read while holding individual vma locks
> instead of locking the
> entire address space.
> A completely lockless approach (walking vma tree under RCU) would be
> quite complex with the main issue being get_vma_name() using callbacks
> which might not work correctly with a stable vma copy, requiring
> original (unstable) vma - see special_mapping_name() for example.
>
> When per-vma lock acquisition fails, we take the mmap_lock for reading,
> lock the vma, release the mmap_lock and continue. This fallback to mmap
> read lock guarantees the reader to make forward progress even during
> lock contention. This will interfere with the writer but for a very
> short time while we are acquiring the per-vma lock and only when there
> was contention on the vma the reader is interested in.
>
> We shouldn't see a repeated fallback to mmap read locks in practice, as
> this requires a very unlikely series of lock contentions (for instance
> due to repeated vma split operations). However even if this did somehow
> happen, we would still progress.
>
> One case requiring special handling is when a vma changes between the
> time it was found and the time it got locked. A problematic case would
> be if a vma got shrunk so that its vm_start moved higher in the address
> space and a new vma was installed at the beginning:
>
> reader found:               |--------VMA A--------|
> VMA is modified:            |-VMA B-|----VMA A----|
> reader locks modified VMA A
> reader reports VMA A:       |  gap  |----VMA A----|
>
> This would result in reporting a gap in the address space that does not
> exist. To prevent this we retry the lookup after locking the vma, however
> we do that only when we identify a gap and detect that the address space
> was changed after we found the vma.
>
> This change is designed to reduce mmap_lock contention and prevent a
> process reading /proc/pid/maps files (often a low priority task, such
> as monitoring/data collection services) from blocking address space
> updates.
> Note that this change has a userspace visible disadvantage:
> it allows for sub-page data tearing as opposed to the previous mechanism
> where data tearing could happen only between pages of generated output
> data. Since current userspace considers data tearing between pages to be
> acceptable, we assume it will be able to handle sub-page data tearing
> as well.
>
> Signed-off-by: Suren Baghdasaryan

Reviewed-by: Vlastimil Babka <vbabka@suse.cz>

Nit: the previous patch changed lines with e.g. -2UL to -2, and this one
seems to change the same lines again to add a comment, e.g.

    *ppos = -2; /* -2 indicates gate vma */

That comment could have been added in the previous patch already. Also, if
you feel the need to add the comments, maybe it's time to just name those
special values with a #define or something :)
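
To make the last suggestion a bit more concrete, here is a rough userspace
sketch of what naming the value could look like. The macro and helper names
are made up for illustration and are not from the series; the only thing
taken from the patch is that -2 marks the gate vma. pos_t is a stand-in for
the kernel's loff_t so the snippet compiles on its own.

```c
#include <stdbool.h>

/* Stand-in for the kernel's loff_t (assumption, userspace sketch only). */
typedef long long pos_t;

/*
 * Hypothetical name for the special position value. Per the comment in
 * the patch, -2 indicates the gate vma.
 */
#define MAPS_POS_GATE_VMA ((pos_t)-2)

/* Record that the next entry to report is the gate vma. */
static inline void maps_set_gate_pos(pos_t *ppos)
{
	*ppos = MAPS_POS_GATE_VMA;
}

/* Check whether a saved position refers to the gate vma. */
static inline bool maps_pos_is_gate(pos_t pos)
{
	return pos == MAPS_POS_GATE_VMA;
}
```

With something like this, the call sites would read
"*ppos = MAPS_POS_GATE_VMA;" and would not need the trailing comment at all.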