From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 54B8EC369D3 for ; Fri, 25 Apr 2025 15:33:04 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 9CBAF6B000A; Fri, 25 Apr 2025 11:33:01 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 9549D6B000D; Fri, 25 Apr 2025 11:33:01 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7D0BD6B000E; Fri, 25 Apr 2025 11:33:01 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 5CDF66B000A for ; Fri, 25 Apr 2025 11:33:01 -0400 (EDT) Received: from smtpin16.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id E8C03C71F6 for ; Fri, 25 Apr 2025 15:33:02 +0000 (UTC) X-FDA: 83372959404.16.CBDB816 Received: from mail-qt1-f181.google.com (mail-qt1-f181.google.com [209.85.160.181]) by imf25.hostedemail.com (Postfix) with ESMTP id 1A4A7A0015 for ; Fri, 25 Apr 2025 15:33:00 +0000 (UTC) Authentication-Results: imf25.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=0x4aQQKB; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf25.hostedemail.com: domain of surenb@google.com designates 209.85.160.181 as permitted sender) smtp.mailfrom=surenb@google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1745595181; a=rsa-sha256; cv=none; b=pmJpadiOlFVrPZmWHkMYSr/0UmelNH8/+T9ClHv0PWPs+JqM8edg3ad7+Rnf9AzIknt8A2 XgBzvDxSTeEr9fN8Y3TDnRZVBkDGIoJ4ziSXW3eS+gPK7kboQDl5pzBPxAp5alZRhYyqBq rtRsPGf4/5v6hxhJgeAgyTYS6EhJU0s= ARC-Authentication-Results: i=1; imf25.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=0x4aQQKB; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf25.hostedemail.com: domain of surenb@google.com designates 209.85.160.181 as permitted sender) smtp.mailfrom=surenb@google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1745595181; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=pP6z3CE+fLIrfWHZmKWlxUhDiYFfaprQFcGgrLTshIA=; b=WlrfS51wRjRMTmbZV3KdE49o99XqoVKo/2/MVbZlpXtJHIe3Ip8+7I+tVJmvbASS2Jgub/ 0sGgM4FTASmbKl0G9sgnaD5Y5ZEjgGs58mNalG8bw0kOwSmlOdzfTN8KwQoR3QNgzQLfYA yRUNMf2b4SlIgylHw+XyKlF5QWKNto4= Received: by mail-qt1-f181.google.com with SMTP id d75a77b69052e-47e9fea29easo338181cf.1 for ; Fri, 25 Apr 2025 08:33:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1745595180; x=1746199980; darn=kvack.org; h=content-transfer-encoding:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=pP6z3CE+fLIrfWHZmKWlxUhDiYFfaprQFcGgrLTshIA=; b=0x4aQQKBQ4bnJc2Iabj6Oz9tGvAUOfZ8z5+U2LJaE1NtVz/VDP7Q1QoHa1muCOEeHF TPTIyJaQVibZ0tei1m3brGZg0aU3QepRMBPru7P+Mk3+b6ETNM69WfTXur0OAUWqgVCI igwe+GNu4Nrl2WhTQ3IiUUTkUuY/vu/KZjRcfOU/6YSZhF7M/psrnOjJAnhp1X1j9rOV OzjyA5mrdQnOLQQDa61ZVG8o/z7M3FfsxoUUOq/ULughfxL0qhAxamXlU+b+lp7681gM x0+dWpQzWHwWC6y/tnIO+20NlT5q9pA/hvzTT+Pkd0k37HZOpbKvysoSMTsEpZpvRUM9 5smw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1745595180; x=1746199980; h=content-transfer-encoding:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=pP6z3CE+fLIrfWHZmKWlxUhDiYFfaprQFcGgrLTshIA=; b=DanZ5bnMD1zwSst1XPTJN4XadugqSmTBhCrojNAHSCM6/BB7Q1kFJNzqW5ClsHeXfB OU6PTb/L3IbcR3wE47CmlelPfXk0illHmc+NMRhfquE6hu4MY667R6aIrL1/kMcXsX2L i94mlerej+yTbLwR43Tlu1a0wJCsFjdfFTGx35F5EoB+zlys5O1L+UrPYzak7PJ4L7V2 NDUxgSftnFfbFvv/HqNose2ER3/4O6SemK1Roa/xRS/M8cPI7mt6RsfGh46qsRPVVa0w ReNkSJvnobTYBqWSwK5H/ILA5Vf+gfRUi3bSB5OVLoV0tkAMz5UwPsKXAmBWbEDp6Kpz lofQ== X-Forwarded-Encrypted: i=1; AJvYcCUCMEg+I5ftZjCC8J1Z4pRpRPhwpxy6lKiicJWeQFLcI84FoIxPiZrPnAIJc/4fLRRqtPqLqujNcA==@kvack.org X-Gm-Message-State: AOJu0YyrO7YRPULFMrf39YHw5emuV3HRAcNsFyMiGW1ncSn7bbr1DI+B edbGbMNn8IZWcYlrEuedIO5oI1GvUX0LfdgpQDJdNqRmHR+vfo0HMsrnd9WOvcqmAAjqI9Ab53s MjUV+Mii8FA9n0oVxkbq1Q0TXsUWKpGA1IJ+2 X-Gm-Gg: ASbGnctTQyxNVkwTLKiZ1vnmKtU4HKqpkj+7IfKJgNUkdHciDwbEnBiVuoli7l99NCR a8v0YtV0nubIvXgr1aJnoFcBBcePY8k4R0ny6n6AibGMDdwWr5xfoL8m++x3yf1B18fIlQmtaZi x5baOV1Cxwe67jId0AdopT X-Google-Smtp-Source: AGHT+IGI+8KofZYr/rnLzwwlH5lcjuihGFz+/pVqSeDJMF8TV6u79oDolvsETvevEUoiykM1utbOMi84hjY6taQXhiM= X-Received: by 2002:a05:622a:2517:b0:477:c4f:ee58 with SMTP id d75a77b69052e-47fe1ed5f87mr4183671cf.24.1745595179801; Fri, 25 Apr 2025 08:32:59 -0700 (PDT) MIME-Version: 1.0 References: <0f848d59f3eea3dd0c0cdc3920644222c40cffe6.1745528282.git.lorenzo.stoakes@oracle.com> <51903B43-2BFC-4BA6-9D74-63F79CF890B7@kernel.org> <7212f5f4-f12b-4b94-834f-b392601360a3@lucifer.local> In-Reply-To: From: Suren Baghdasaryan Date: Fri, 25 Apr 2025 08:32:48 -0700 X-Gm-Features: ATxdqUH-7fcjY1Nm52E2GQYuVXFxGwTyQxbInQgwFOhzX15Q8W5Vpu5wImOKUvg Message-ID: Subject: Re: [PATCH 2/4] mm: perform VMA allocation, freeing, duplication in mm To: "Liam R. Howlett" , Lorenzo Stoakes , Kees Cook , Andrew Morton , Vlastimil Babka , Jann Horn , Pedro Falcato , David Hildenbrand , Alexander Viro , Christian Brauner , Jan Kara , Suren Baghdasaryan , linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Queue-Id: 1A4A7A0015 X-Rspam-User: X-Rspamd-Server: rspam07 X-Stat-Signature: 164s6naew7amjhaoeopwnif6gks5s4th X-HE-Tag: 1745595180-474480 X-HE-Meta: U2FsdGVkX18/ij2GzK1JRw8ErFueZr/VKgp5dqgoJ+DBkVFMPV9mqBLu7VliK/Pq4BAB+bvmURtOMSSJYOvjo68Zj2iJ808MO2raRh8dwqN+9nOZaTAQUrfcFecTkHSC92Kvo33/yIyiVoQESdiceJfGWapQGB6OrHfnRi8r0lc7oOEYEgpjvu6Pmsn19O8uODeElB5ZZAM8cAkYVh0ChMO9lARb32qPv3VymuGCo8JFN8Q8FYvIvseEADZkw6WvUgT1/nJwBrK3cdqCy7Y2228sdhSnVdnQ2x2CkUgchyOjwgg+ZBB5DPGeQ+5a8XlM4+aOTLYf7Rr5qRtDdfIUkpHzeiO7iS+aqx6LVn+91kE3CH9iuT5PVhYnntN9rejYxAMFM8x1jJKa45yc5WCQliJuMxx+aXI9wrYx2Z0MIobEiQ7IAIeAiBsRjepKNWsnN5YnFEaFV622PShYv2l6bzEI2761DJwEE/cngUbN+5It7rDBEVoIG5akK7yipKx5Vo4T+XnAw8HOPp/Ygv2vFQ+liTWe9b3A3CvjEuwtJXXXVb2aGtGAlOYf+pgFaVosqfXoegDqE86T6Cg6gDOjiiNp31oNbjyqD2Bc3kb4btcB514g3IlHNtm8EPDxZzt67bw2kv/ARvRvv/5fKw/EB7zBMB9TckVAE5xl4mgY8V5961XL2GFT6XTulvifCDtXQK9pBNfRIZS6UU0dKIQJEHivAStDYRizX5jdAuBLlL1Arj6Nn3RPJQsWmAr9/mie3gFujVAfwkKBNi3HqmRQrFxbj7L7TOKLyL42i/mxZXvJHmqRmh6WFAsHlLBzVYB4LZPQ7To4UluX5gFzUSvw9EMSvur54ER+8PUrG7hA5RIPs7Dw9i8toq/xMQCjuLYAsvuqsBSCOBMcEAsRkJJxMUr3C/VMIduAG3hiJGa/OxNcNwXWTIX/1qYneIcXRee56kVfWAl1cN+/5JMXieq NtuuG4DE tmJaGMC3+xzajs94o1cmd7k7YYFYLvvQg2GZfy1IG8S8J6HazeeOeyuU4FMCJ3sVBStugtS8GUpp+7vLY1TxBqb5xqz4X2v+H7nXTreEmh6LSGz8j8odcGlRGbHMHE0TlTaAl+8N6pNgGf9ctIVDxrWszj3zCEl9XQO3Fwm0ErOYu7gblWtTbnI8LJUfUDU5p5Qh+6r7kFHc9tsgLI7MmUFQdmeoNmSANT7BFZUBzFJtleTzOeqywWQhLbQClEVH5ZivEwCWqaJJGHyQ= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Fri, Apr 25, 2025 at 6:55=E2=80=AFAM Liam R. Howlett wrote: > > * Lorenzo Stoakes [250425 06:40]: > > On Thu, Apr 24, 2025 at 08:15:26PM -0700, Kees Cook wrote: > > > > > > > > > On April 24, 2025 2:15:27 PM PDT, Lorenzo Stoakes wrote: > > > >+static void vm_area_init_from(const struct vm_area_struct *src, > > > >+ struct vm_area_struct *dest) > > > >+{ > > > >+ dest->vm_mm =3D src->vm_mm; > > > >+ dest->vm_ops =3D src->vm_ops; > > > >+ dest->vm_start =3D src->vm_start; > > > >+ dest->vm_end =3D src->vm_end; > > > >+ dest->anon_vma =3D src->anon_vma; > > > >+ dest->vm_pgoff =3D src->vm_pgoff; > > > >+ dest->vm_file =3D src->vm_file; > > > >+ dest->vm_private_data =3D src->vm_private_data; > > > >+ vm_flags_init(dest, src->vm_flags); > > > >+ memcpy(&dest->vm_page_prot, &src->vm_page_prot, > > > >+ sizeof(dest->vm_page_prot)); > > > >+ /* > > > >+ * src->shared.rb may be modified concurrently when called from > > > >+ * dup_mmap(), but the clone will reinitialize it. > > > >+ */ > > > >+ data_race(memcpy(&dest->shared, &src->shared, sizeof(dest->shared= ))); > > > >+ memcpy(&dest->vm_userfaultfd_ctx, &src->vm_userfaultfd_ctx, > > > >+ sizeof(dest->vm_userfaultfd_ctx)); > > > >+#ifdef CONFIG_ANON_VMA_NAME > > > >+ dest->anon_name =3D src->anon_name; > > > >+#endif > > > >+#ifdef CONFIG_SWAP > > > >+ memcpy(&dest->swap_readahead_info, &src->swap_readahead_info, > > > >+ sizeof(dest->swap_readahead_info)); > > > >+#endif > > > >+#ifdef CONFIG_NUMA > > > >+ dest->vm_policy =3D src->vm_policy; > > > >+#endif > > > >+} > > > > > > I know you're doing a big cut/paste here, but why in the world is thi= s function written this way? Why not just: > > > > > > *dest =3D *src; > > > > > > And then do any one-off cleanups? > > > > Yup I find it odd, and error prone to be honest. We'll end up with unin= itialised > > state for some fields if we miss them here, seems unwise... > > > > Presumably for performance? > > > > This is, as you say, me simply propagating what exists, but I do wonder= . > > Two things come to mind: > > 1. How ctors are done. (v3 of Suren's RCU safe patch series, willy made > a comment.. I think) > > 2. Some race that Vlastimil came up with the copy and the RCU safeness. > IIRC it had to do with the ordering of the setting of things? > > Also, looking at it again... > > How is it safe to do dest->anon_name =3D src->anon_name? Isn't that ref > counted? dest->anon_name =3D src->anon_name is fine here because right after vm_area_init_from() we call dup_anon_vma_name() which will bump up the refcount. I don't recall why this is done this way but now looking at it I wonder if I could call dup_anon_vma_name() directly instead of this assignment. Might be just an overlooked legacy from the time we memcpy'd the entire structure. I'll need to double-check. > > Pretty sure it's okay, but Suren would know for sure on all of this. > > Suren, maybe you could send a patch with comments on this stuff? Yeah, I think I need to add some comments in this code for clarification. We do not copy the entire vm_area_struct because we have to preserve vma->vm_refcnt field of the dest vma. Since these structures are allocated from a cache with SLAB_TYPESAFE_BY_RCU, another thread might be concurrently checking the state of the dest object by reading dest->vm_refcnt. Therefore it's important here not to override the vm_refcnt. Changelog in https://lore.kernel.org/all/20250213224655.1680278-18-surenb@google.com/ touches on it but a comment in the code would be indeed helpful. Will add it but will wait for Lorenzo's refactoring to land into linux-mm first to avoid adding merge conflicts. > > Thanks, > Liam