Date: Mon, 22 Jul 2024 09:52:08 -0400
From: Peter Xu
To: Yan Zhao
Cc: David Hildenbrand, "linux-kernel@vger.kernel.org", "linux-mm@kvack.org",
    Andrew Morton, Alex Williamson, Jason Gunthorpe, Al Viro, Dave Hansen,
    Andy Lutomirski, Peter Zijlstra, Thomas Gleixner, Ingo Molnar,
    Borislav Petkov, "Kirill A . Shutemov", "x86@kernel.org", "Tian, Kevin",
    Pei Li, David Wang <00107082@163.com>, Bert Karwatzki, Sergey Senozhatsky
Subject: Re: [PATCH] mm/x86/pat: Only untrack the pfn range if unmap region
References: <1a0884cb-39ed-455e-a505-7c1b2a0e5225@redhat.com>

On Mon, Jul 22, 2024 at 02:49:12PM +0800, Yan Zhao wrote:
> On Fri, Jul 19, 2024 at 10:13:33AM -0400, Peter Xu wrote:
> > On Fri, Jul 19, 2024 at 10:28:09AM +0200, David Hildenbrand wrote:
> > > On 19.07.24 01:18, Yan Zhao wrote:
> > > > On Thu, Jul 18, 2024 at 10:03:01AM -0400, Peter Xu wrote:
> > > > > On Thu, Jul 18, 2024 at 09:50:31AM +0800, Yan Zhao wrote:
> > > > > > Ok. Then if we have two sets of pfns, then we can
> > > > > > 1. Call remap_pfn_range() in mmap() for pfn set 1.
> > > > >
> > > > > I don't think this will work.. At least from the current implementation,
> > > > > remap_pfn_range() will only reserve the memtype if the range covers the
> > > > > whole vma.
> > > > Hmm, by referring to pfn set 1 and pfn set 2, I mean that they're both
> > > > covering the entire vma, but at different times.
> > > >
> > > > To make it more accurately:
> > > >
> > > > Consider this hypothetical scenario (not the same as what's implemented in
> > > > vfio-pci, but seems plausible):
> > > >
> > > > Suppose we have a vma covering only one page, then
> > > > (1) Initially, the vma is mapped to pfn1, with remap_pfn_range().
> > > > (2) Subsequently, unmap_single_vma() is invoked to unmap the entire VMA.
> > > > (3) The driver then maps the entire vma to pfn2 in fault handler
> > > >
> > > > Given this context, my questions are:
> > > > 1. How can we reserve the memory type for pfn2? Should we call
> > > >    track_pfn_remap() in mmap() in advance?
> > > > 2. How do we untrack the memory type for pfn1 and pfn2, considering they
> > > >    belong to the same VMA but mutual exclusively and not concurrently?
> > >
> > > Do we really have to support such changing PFNs in a VMA? Are there existing
> > > use cases that would rely on that?
> >
> > I share the same question with David. I don't think we support that, and I
> > don't know whether we should, either.
> >
> > Such flexibility already will break with current PAT design. See:
> Previously with remap_pfn_range() being able to be called in fault handlers,
> this flexibility is doable. i.e. reserve in the fault handler and untrack
> in unmap_single_vma().

AFAICT, remap_pfn_range() should never have been allowed in a fault
handler.  So IMO it's not that "it was allowed before"; rather, we were
using it wrong all along in the fault path: remap_pfn_range() has changed
VMA flags since day one, and that requires holding the lock for write,
while fault paths only hold it for read.

I think it's just that the per-vma lock was added a few years ago (along
with some lock assertions on vma lock vs. vma flag changes), and only then
did we find this issue.

-- 
Peter Xu
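
To illustrate the locking point above, here is a toy user-space C model. It
is not kernel code: every toy_* name, the lock_mode enum and the TOY_VM_*
flag values are made up for this sketch. It only assumes what the reply
states, namely that remap_pfn_range() updates VMA flags and therefore needs
the lock held for write, while a fault path holds it for read. The
fault-time insert is loosely modeled on helpers such as vmf_insert_pfn(),
which is my assumption and not something named in this thread.

```c
#include <stdbool.h>

/* Toy stand-ins for kernel concepts; none of these are real kernel APIs. */
enum lock_mode { UNLOCKED, READ_LOCKED, WRITE_LOCKED };

#define TOY_VM_PFNMAP 0x1u
#define TOY_VM_IO     0x2u

struct toy_vma {
	enum lock_mode lock;   /* how the caller currently holds the lock */
	unsigned int flags;    /* models the VMA flags */
	unsigned long pfn;     /* 0 means "nothing mapped" */
};

/* Models a VMA flag update, which is only legal under the write lock. */
static bool toy_set_flags(struct toy_vma *vma, unsigned int flags)
{
	if (vma->lock != WRITE_LOCKED)
		return false;
	vma->flags |= flags;
	return true;
}

/* Models remap_pfn_range(): it changes flags first, then installs the PFN. */
static bool toy_remap_pfn_range(struct toy_vma *vma, unsigned long pfn)
{
	if (!toy_set_flags(vma, TOY_VM_PFNMAP | TOY_VM_IO))
		return false;
	vma->pfn = pfn;
	return true;
}

/*
 * Models a fault-time insert: it never touches flags, so holding the lock
 * for read is enough, but the flags must already have been set in mmap().
 */
static bool toy_fault_insert_pfn(struct toy_vma *vma, unsigned long pfn)
{
	if (vma->lock == UNLOCKED || !(vma->flags & TOY_VM_PFNMAP))
		return false;
	vma->pfn = pfn;
	return true;
}
```

In this model, toy_remap_pfn_range() succeeds from an mmap()-like caller
holding WRITE_LOCKED and fails from a fault-like caller holding READ_LOCKED,
whereas toy_fault_insert_pfn() works under READ_LOCKED once the flags are in
place.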