Date: Thu, 30 Jan 2025 11:43:25 +0100
From: Simona Vetter
To: Alistair Popple
Cc: David Hildenbrand, linux-kernel@vger.kernel.org,
	linux-doc@vger.kernel.org, dri-devel@lists.freedesktop.org,
	linux-mm@kvack.org, nouveau@lists.freedesktop.org, Andrew Morton,
	Jérôme Glisse, Jonathan Corbet, Alex Shi, Yanteng Si, Karol Herbst,
	Lyude Paul, Danilo Krummrich, David Airlie, Simona Vetter,
	"Liam R. Howlett", Lorenzo Stoakes, Vlastimil Babka, Jann Horn,
	Pasha Tatashin, Peter Xu, Jason Gunthorpe
Subject: Re: [PATCH v1 4/4] mm/memory: document restore_exclusive_pte()
References: <20250129115803.2084769-1-david@redhat.com>
	<20250129115803.2084769-5-david@redhat.com>
	<7vejbjs7btkof4iguvn3nqvozxqpnzbymxbumd7pant4zi4ac4@3ozuzfzsm5tp>
In-Reply-To: <7vejbjs7btkof4iguvn3nqvozxqpnzbymxbumd7pant4zi4ac4@3ozuzfzsm5tp>

On Thu, Jan 30, 2025 at 11:27:37AM +1100, Alistair Popple wrote:
> On Wed, Jan 29, 2025 at 12:58:02PM +0100, David Hildenbrand wrote:
> > Let's document how this function is to be used, and why the requirement
> > for the folio lock might maybe be dropped in the future.
>
> Sorry, only just catching up on your other thread. The folio lock was to ensure
> the GPU got a chance to make forward progress by mapping the page.
> Without it the CPU could immediately invalidate the entry before the GPU had a
> chance to retry the fault.
>
> Obviously performance wise having such thrashing is terrible, so should
> really be avoided by userspace, but the lock at least allowed such programs
> to complete.

Imo this is not a legit use-case. If userspace concurrently (instead of
clearly alternating) uses the same 4k page for gpu atomics and on the cpu,
it just gets to keep the fallout.

Plus there's no guarantee that we hold the folio_lock long enough for the
gpu to actually complete the atomic, so this isn't even really helping with
forward progress, even if this somehow were a legit use-case.

But this is also why thp is potentially an issue: if thp constantly creates
pmd entries, that potentially causes false sharing, and then we do have
thrashing that shouldn't happen.
-Sima

> > Signed-off-by: David Hildenbrand
> > ---
> >  mm/memory.c | 25 +++++++++++++++++++++++++
> >  1 file changed, 25 insertions(+)
> >
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 46956994aaff..caaae8df11a9 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -718,6 +718,31 @@ struct folio *vm_normal_folio_pmd(struct vm_area_struct *vma,
> >  }
> >  #endif
> >
> > +/**
> > + * restore_exclusive_pte - Restore a device-exclusive entry
> > + * @vma: VMA covering @address
> > + * @folio: the mapped folio
> > + * @page: the mapped folio page
> > + * @address: the virtual address
> > + * @ptep: PTE pointer into the locked page table mapping the folio page
> > + * @orig_pte: PTE value at @ptep
> > + *
> > + * Restore a device-exclusive non-swap entry to an ordinary present PTE.
> > + *
> > + * The folio and the page table must be locked, and MMU notifiers must have
> > + * been called to invalidate any (exclusive) device mappings. In case of
> > + * fork(), MMU_NOTIFY_PROTECTION_PAGE is triggered, and in case of a page
> > + * fault MMU_NOTIFY_EXCLUSIVE is triggered.
> > + *
> > + * Locking the folio makes sure that anybody who just converted the PTE to
> > + * a device-private entry can map it into the device, before unlocking it; so
> > + * the folio lock prevents concurrent conversion to device-exclusive.
>
> I don't quite follow this - a concurrent conversion would already fail
> because the GUP in make_device_exclusive_range() would most likely cause
> an unexpected reference during the migration. And if a migration entry
> has already been installed for the device private PTE conversion then
> make_device_exclusive_range() will skip it as a non-present entry anyway.
>
> However s/device-private/device-exclusive/ makes sense - the intent was to allow
> the device to map it before a call to restore_exclusive_pte() (ie. a CPU fault)
> could convert it back to a normal PTE.
>
> > + * TODO: the folio lock does not protect against all cases of concurrent
> > + * page table modifications (e.g., MADV_DONTNEED, mprotect), so device drivers
> > + * must already use MMU notifiers to sync against any concurrent changes
>
> Right. It's expected drivers are using MMU notifiers to keep page tables in
> sync, same as for hmm_range_fault().
>
> > + * Maybe the requirement for the folio lock can be dropped in the future.
> > + */
> >  static void restore_exclusive_pte(struct vm_area_struct *vma,
> >  		struct folio *folio, struct page *page, unsigned long address,
> >  		pte_t *ptep, pte_t orig_pte)
> > --
> > 2.48.1
> >

-- 
Simona Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch