From: Suren Baghdasaryan
Date: Thu, 24 Apr 2025 08:20:40 -0700
Subject: Re: [PATCH v3 7/8] mm/maps: read proc/pid/maps under RCU
To: "Liam R. Howlett", Andrii Nakryiko, Suren Baghdasaryan,
 akpm@linux-foundation.org, lorenzo.stoakes@oracle.com, david@redhat.com,
 vbabka@suse.cz, peterx@redhat.com, jannh@google.com, hannes@cmpxchg.org,
 mhocko@kernel.org, paulmck@kernel.org, shuah@kernel.org,
 adobriyan@gmail.com, brauner@kernel.org, josef@toxicpanda.com,
 yebin10@huawei.com, linux@weissschuh.net, willy@infradead.org,
 osalvador@suse.de, andrii@kernel.org, ryan.roberts@arm.com,
 christophe.leroy@csgroup.eu, linux-kernel@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
 linux-kselftest@vger.kernel.org
References: <20250418174959.1431962-1-surenb@google.com>
 <20250418174959.1431962-8-surenb@google.com>
 <6ay37xorr35nw4ljtptnfqchuaozu73ffvjpmwopat42n4t6vr@qnr6xvralx2o>
In-Reply-To: <6ay37xorr35nw4ljtptnfqchuaozu73ffvjpmwopat42n4t6vr@qnr6xvralx2o>

On Wed, Apr 23, 2025 at 5:24 PM Liam R. Howlett wrote:
>
> * Andrii Nakryiko [250423 18:06]:
> > On Wed, Apr 23, 2025 at 2:49 PM Suren Baghdasaryan wrote:
> > >
> > > On Tue, Apr 22, 2025 at 3:49 PM Andrii Nakryiko
> > > wrote:
> > > >
> > > > On Fri, Apr 18, 2025 at 10:50 AM Suren Baghdasaryan wrote:
> > > > >
> > > > > With maple_tree supporting vma tree traversal under RCU and vma and
> > > > > its important members being RCU-safe, /proc/pid/maps can be read under
> > > > > RCU and without the need to read-lock mmap_lock. However vma content
> > > > > can change from under us, therefore we make a copy of the vma and we
> > > > > pin pointer fields used when generating the output (currently only
> > > > > vm_file and anon_name). Afterwards we check for concurrent address
> > > > > space modifications, wait for them to end and retry. While we take
> > > > > the mmap_lock for reading during such contention, we do that momentarily
> > > > > only to record new mm_wr_seq counter. This change is designed to reduce
> > > >
> > > > This is probably a stupid question, but why do we need to take a lock
> > > > just to record this counter? uprobes get away without taking mmap_lock
> > > > even for reads, and still record this seq counter. And then detect
> > > > whether there were any modifications in between.
> > > > Why does this change
> > > > need more heavy-weight mmap_read_lock to do speculative reads?
> > >
> > > Not a stupid question. mmap_read_lock() is used to wait for the writer
> > > to finish what it's doing and then we continue by recording a new
> > > sequence counter value and call mmap_read_unlock. This is what
> > > get_vma_snapshot() does. But your question made me realize that we can
> > > optimize m_start() further by not taking mmap_read_lock at all.
> > > Instead of taking mmap_read_lock then doing drop_mmap_lock() we can
> > > try mmap_lock_speculate_try_begin() and only if it fails do the same
> > > dance we do in the get_vma_snapshot(). I think that should work.
> >
> > Ok, yeah, it would be great to avoid taking a lock in a common case!
>
> We can check this counter once per 4k block and maintain the same
> 'tearing' that exists today instead of per-vma. Not that anyone said
> they had an issue with changing it, but since we're on this road anyways
> I'd thought I'd point out where we could end up.

We would need to run that check on the last call to show_map() right
before seq_file detects the overflow and flushes the page. On
contention we will also be throwing away more prepared data (up to a
page worth of records) vs only the last record. All in all I'm not
convinced this is worth doing unless increased chances of data tearing
is identified as a problem.

>
> I am concerned about live locking in either scenario, but I haven't
> looked too deep into this pattern.
>
> I also don't love (as usual) the lack of ensured forward progress.

Hmm. Maybe we should add a retry limit on
mmap_lock_speculate_try_begin() and once the limit is hit we just take
the mmap_read_lock and proceed with it? That would prevent a
hyperactive writer from blocking the reader's forward progress
indefinitely.

>
> It seems like we have four cases for the vm area state now:
> 1. we want to read a stable vma or set of vmas (per-vma locking)
> 2. we want to read a stable mm state for reading (the very short named
> mmap_lock_speculate_try_begin)

and we don't mind retrying on contention. This one should be done
under RCU protection.

> 3. we ensure a stable vma/mm state for reading (mmap read lock)
> 4. we are writing - get out of my way (mmap write lock).

I wouldn't call #2 a vma state. More of a usecase when we want to read
vma under RCU (valid but can change from under us) and then retry if
it might have been modified from under us.

>
> Cheers,
> Liam
>
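
To make the retry-limit idea discussed above concrete, here is a rough
userspace sketch, not kernel code. Everything named toy_* is hypothetical
and invented for illustration: an atomic counter stands in for the mm
sequence count (odd while a writer is active), a spinning atomic_flag
stands in for mmap_lock (exclusive only, for brevity), and a single
atomic int stands in for the vma fields being copied. The reader
speculates up to max_attempts times and then falls back to taking the
lock, so its progress no longer depends on the writer going quiet.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for an mm; not a kernel structure. */
struct toy_mm {
    atomic_uint seq;   /* odd while a writer is mid-update, even otherwise */
    atomic_flag lock;  /* stand-in for mmap_lock (exclusive only) */
    atomic_int data;   /* stand-in for the vma fields we snapshot */
};

static void toy_mm_init(struct toy_mm *mm)
{
    atomic_init(&mm->seq, 0);
    atomic_init(&mm->data, 0);
    atomic_flag_clear(&mm->lock);
}

static void toy_lock(struct toy_mm *mm)
{
    while (atomic_flag_test_and_set(&mm->lock))
        ; /* spin until the holder releases the lock */
}

static void toy_unlock(struct toy_mm *mm)
{
    atomic_flag_clear(&mm->lock);
}

/* Writer: bump the counter to odd, modify, bump it back to even. */
static void toy_write(struct toy_mm *mm, int value)
{
    toy_lock(mm);
    atomic_fetch_add(&mm->seq, 1);
    atomic_store(&mm->data, value);
    atomic_fetch_add(&mm->seq, 1);
    toy_unlock(mm);
}

/* Record the counter; fail right away if a writer is active. */
static bool toy_speculate_try_begin(struct toy_mm *mm, unsigned int *seq)
{
    *seq = atomic_load(&mm->seq);
    return (*seq & 1) == 0;
}

/* True if the counter moved since toy_speculate_try_begin(). */
static bool toy_speculate_retry(struct toy_mm *mm, unsigned int seq)
{
    return atomic_load(&mm->seq) != seq;
}

/*
 * Reader: speculate at most max_attempts times, then fall back to the
 * lock. *used_lock reports which path produced the returned snapshot.
 */
static int toy_read(struct toy_mm *mm, int max_attempts, bool *used_lock)
{
    int snapshot;

    assert(max_attempts >= 0);
    *used_lock = false;
    for (int attempt = 0; attempt < max_attempts; attempt++) {
        unsigned int seq;

        if (!toy_speculate_try_begin(mm, &seq))
            continue;               /* writer active, try again */
        snapshot = atomic_load(&mm->data);
        if (!toy_speculate_retry(mm, seq))
            return snapshot;        /* no writer interfered */
    }

    /* Retry budget exhausted: wait for the writer by taking the lock. */
    toy_lock(mm);
    snapshot = atomic_load(&mm->data);
    toy_unlock(mm);
    *used_lock = true;
    return snapshot;
}
```

With a quiet writer the reader never touches the lock. With a writer
that keeps the counter moving, the cost is bounded at max_attempts
wasted snapshots plus one lock acquisition. The right limit is a tuning
question this sketch does not try to answer.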