From: Barry Song <21cnbao@gmail.com>
Date: Tue, 3 Jun 2025 19:51:45 +1200
Subject: Re: [PATCH RFC v2] mm: use per_vma lock for MADV_DONTNEED
To: Jann Horn
Cc: akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 Barry Song, "Liam R. Howlett", Lorenzo Stoakes, David Hildenbrand,
 Vlastimil Babka, Suren Baghdasaryan, Lokesh Gidra, Tangquan Zheng
References: <20250530104439.64841-1-21cnbao@gmail.com>

On Tue, Jun 3, 2025 at
2:56 AM Jann Horn wrote:
>
> On Sat, May 31, 2025 at 12:00 AM Barry Song <21cnbao@gmail.com> wrote:
> > On Fri, May 30, 2025 at 10:07 PM Jann Horn wrote:
> > > On Fri, May 30, 2025 at 12:44 PM Barry Song <21cnbao@gmail.com> wrote:
> > > > @@ -1714,19 +1770,24 @@ static int madvise_do_behavior(struct mm_struct *mm,
> > > >                 unsigned long start, size_t len_in,
> > > >                 struct madvise_behavior *madv_behavior)
> > > >  {
> > > > +       struct vm_area_struct *vma = madv_behavior->vma;
> > > >         int behavior = madv_behavior->behavior;
> > > > +
> > > >         struct blk_plug plug;
> > > >         unsigned long end;
> > > >         int error;
> > > >
> > > >         if (is_memory_failure(behavior))
> > > >                 return madvise_inject_error(behavior, start, start + len_in);
> > > > -       start = untagged_addr_remote(mm, start);
> > > > +       start = untagged_addr(start);
> > >
> > > Why is this okay? I see that X86's untagged_addr_remote() asserts that
> > > the mmap lock is held, which is no longer the case here with your
> > > patch, but untagged_addr() seems wrong here, since we can be operating
> > > on another process. I think especially on X86 with 5-level paging and
> > > LAM, there can probably be cases where address bits are used for part
> > > of the virtual address in one task while they need to be masked off in
> > > another task?
> > >
> > > I wonder if you'll have to refactor X86 and Risc-V first to make this
> > > work... ideally by making sure that their address tagging state
> > > updates are atomic and untagged_area_remote() works locklessly.
> >
> > If possible, can we try to avoid this at least for this stage? We all
> > agree that
> > a per-VMA lock for DONTNEED is long overdue. The main goal of the patch
> > is to drop the mmap_lock for high-frequency madvise operations like
> > MADV_DONTNEED and potentially MADV_FREE. For these two cases, it's highly
> > unlikely that one process would be managing the memory of another. In v2,
> > we're modifying common code, which is why we ended up here.
> >
> > We could consider doing:
> >
> > if (current->mm == mm)
> >        untagged_addr(start);
> > else
> >        untagged_addr_remote(mm, start);
>
> Ah, in other words, basically make it so that for now we don't use VMA
> locking on remote processes, and then we can have two different paths
> here for "local operation" and "MM-locked operation"? That's not very
> pretty but from a correctness perspective I'm fine with that.

Right, that's exactly what I mean: we may hold off on remote `madvise` for
now unless we can find a straightforward way to fix up the architecture code,
especially since the tagging implementations in x86 and RISC-V are quite
confusing.

It's particularly tricky for RISC-V, which supports two different PMLEN
values simultaneously. Resolving the untagging issue will likely require
extensive discussions with both the x86 and RISC-V architecture teams.

long set_tagged_addr_ctrl(struct task_struct *task, unsigned long arg)
{
        ...
        /*
         * Prefer the smallest PMLEN that satisfies the user's request,
         * in case choosing a larger PMLEN has a performance impact.
         */
        pmlen = FIELD_GET(PR_PMLEN_MASK, arg);
        if (pmlen == PMLEN_0) {
                pmm = ENVCFG_PMM_PMLEN_0;
        } else if (pmlen <= PMLEN_7 && have_user_pmlen_7) {
                pmlen = PMLEN_7;
                pmm = ENVCFG_PMM_PMLEN_7;
        } else if (pmlen <= PMLEN_16 && have_user_pmlen_16) {
                pmlen = PMLEN_16;
                pmm = ENVCFG_PMM_PMLEN_16;
        } else {
                return -EINVAL;
        }
        ...
}

It's strange that x86's LAM U48 was rejected, while RISC-V's PMLEN values
of 7 and 16 were both accepted.

Another reason is that we're not too concerned about remote `madvise` at
this stage, as our current focus is on high-frequency cases like
`MADV_DONTNEED`, and possibly `MADV_FREE`.

Thanks
Barry
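
Illustration only (a userspace model, not kernel code and not part of the
patch): the sketch below tries to show why the untagging mask has to come
from the target address space when caller and target use different tag
widths. The names `mm_model`, `untag_with` and `untag_for_madvise` are
hypothetical, and clearing the tag bits to zero is a simplification of what
the real architectures do.

```c
#include <stdint.h>

/*
 * Hypothetical model of an address space: it only records how many of the
 * top address bits are treated as a tag (e.g. 0, 7 or 16, mirroring the
 * PMLEN values discussed above).
 */
struct mm_model {
	unsigned int tag_bits;
};

/*
 * Clear the top tag_bits bits of addr using the mask of the given mm.
 * Assumption: zeroing is enough for this model; real kernels also have to
 * deal with sign extension and kernel addresses.
 */
static uint64_t untag_with(const struct mm_model *mm, uint64_t addr)
{
	if (mm->tag_bits == 0)
		return addr;
	return addr & (UINT64_MAX >> mm->tag_bits);
}

/*
 * Two-path scheme from the thread, reduced to the choice of mask:
 * - local operation: the caller's own mask applies;
 * - remote operation: the target's mask applies (in the kernel this is the
 *   path that would still take the mmap lock).
 */
static uint64_t untag_for_madvise(const struct mm_model *current_mm,
				  const struct mm_model *target_mm,
				  uint64_t addr)
{
	if (current_mm == target_mm)
		return untag_with(current_mm, addr);
	return untag_with(target_mm, addr);
}
```

With a caller that uses 7 tag bits and a target that uses 16, an address
with bits set in the range 48..55 is left unchanged by the caller's mask but
has those bits cleared by the target's mask, which is the mismatch Jann
describes.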