From: David Hildenbrand
Organization: Red Hat
Date: Wed, 17 Jul 2024 16:14:15 +0200
Subject: Re: [PATCH] mm/x86/pat: Only untrack the pfn range if unmap region
To: Peter Xu
Cc: David Wang <00107082@163.com>, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, Andrew Morton, Alex Williamson, Jason Gunthorpe,
 Al Viro, Dave Hansen, Andy Lutomirski, Peter Zijlstra, Thomas Gleixner,
 Ingo Molnar, Borislav Petkov, "Kirill A . Shutemov", x86@kernel.org,
 Yan Zhao, Kevin Tian, Pei Li, Bert Karwatzki, Sergey Senozhatsky
Message-ID: <2c6ec60e-1eff-417e-aed2-4554ea9a86eb@redhat.com>
References: <20240712144244.3090089-1-peterx@redhat.com>
 <1182a459.1e35.190b0e61754.Coremail.00107082@163.com>
 <8da2b3bf-b9bf-44e3-88ff-750dc91c2388@redhat.com>

[catching up on mails]

>> indicates that file truncation seems to end up messing with a PFNMAP mapping
>> that has PAT set. That is ... weird. I would have thought that PFNMAP would
>> never really happen with file truncation.
>>
>> Does this only happen with an OOT driver, that seems to do weird truncate
>> stuff on files that have a PFNMAP mapping?
>>
>> [1]
>> https://lore.kernel.org/all/3879ee72-84de-4d2a-93a8-c0b3dc3f0a4c@redhat.com/
>
> Ohhh.. I guess this will also stop working in VFIO, but I think it's fine
> for now because as Yan pointed out VFIO PCI doesn't register those regions
> now so VM_PAT is not yet set..

Interesting, I was assuming that VFIO might be relying on that.

>
> And one thing I said wrong in the previous reply to Yan is, obviously
> memtype_check_insert() can work with >1 owners as long as the memtype
> matches.. and that's how fork() works where VM_PAT needs to be duplicated.
> But this whole thing is a bit confusing to me.. As I think it also means
> when fork the track_pfn_copy() will call memtype_kernel_map_sync one more
> time even if we're 100% sure the pgprot will be the same for the kernel
> mappings..

I consider the VM_PAT code quite ugly and I wish we could just get rid of
it (especially, the automatic "entire VMA covered" handling thingy).

>
> I wonder whether there's some way that untrack pfn framework doesn't need
> to rely on the pgtable to fetch the pfn, because VFIO MMIO region
> protection will also do that in the near future, AFAICT. The pgprot part
> should be easy there to fetch: get_pat_info() should fallback to vma's
> pgprot if no mapping found; the only outlier should be CoW pages in
> reality. The pfn is the real issue so far, so that either track_pfn_copy()
> or untrack_pfn() may need to know the pfn to untrack, even if it only has
> the vma information.

I had a prototype to store that information per VMA to avoid the page
table lookup. VMA splitting was a bit of an "added complication", but I
got it to work. (maybe I can still find it if there is demand)

The downside was having to consume more memory for all VMAs in the system
(even if only 8 bytes) simply because a handful of VMAs in the system
could be VM_PAT. I decided that's not what we want. I managed to not
consume memory in some configurations, but not in all, so I discarded
that approach.

I did not explore storing that information in some auxiliary data
structure.

IMHO the whole VM_PAT model is weird:

1) mmap()
2) remap_pfn_range(): if it covers the whole VMA, apply some magic
   reservation.
3) munmap(): we unmap *all* PFNs and, therefore, clean up VM_PAT

(VMA splitting makes the whole model weirder, but it works, because we
never merge these VMAs)

This model cannot properly work if we get partial page table zapping via
truncation/MADV_DONTNEED or similar things after 2). And likely we also
shouldn't be doing it that way. We should forbid any partial unmappings
in that model, just like we already disallow MADV_DONTNEED as you note.

As you mention in your other comment, maybe relevant/all? callers should
just manage the PAT side independently. So maybe we can move to a
different model.

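To make the three steps above a bit more concrete, here is a toy userspace
sketch of the model. It is not kernel code: struct toy_vma, TOY_NPAGES and
all toy_*() helpers are made-up names, and looking only at the first page to
find the PFN is an assumption of the sketch. It only illustrates why a
partial zap after step 2) leaves the cleanup in step 3) without a PFN to
untrack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES 8 /* pages per toy VMA (hypothetical size) */

/* Toy stand-in for a VMA; NOT the kernel's struct vm_area_struct. */
struct toy_vma {
	bool vm_pat;                   /* models VM_PAT: whole-VMA reservation */
	bool mapped[TOY_NPAGES];       /* models the page table entries */
	unsigned long pfn[TOY_NPAGES]; /* PFN behind each mapped page */
};

/* Step 2): map a PFN range; reserve only if the whole VMA is covered. */
static void toy_remap_pfn_range(struct toy_vma *vma, size_t first, size_t n,
				unsigned long pfn)
{
	assert(first + n <= TOY_NPAGES);
	for (size_t i = 0; i < n; i++) {
		vma->mapped[first + i] = true;
		vma->pfn[first + i] = pfn + i;
	}
	if (first == 0 && n == TOY_NPAGES)
		vma->vm_pat = true; /* the "magic reservation" */
}

/* Partial zap (think truncation): entries go away, the flag stays. */
static void toy_zap_range(struct toy_vma *vma, size_t first, size_t n)
{
	assert(first + n <= TOY_NPAGES);
	for (size_t i = 0; i < n; i++)
		vma->mapped[first + i] = false;
}

/* Cleanup depends on the page table to learn which PFN to untrack. */
static bool toy_lookup_pfn(const struct toy_vma *vma, unsigned long *pfn)
{
	if (!vma->mapped[0])
		return false; /* nothing mapped: PFN is unknown */
	*pfn = vma->pfn[0];
	return true;
}

/*
 * Step 3): unmap everything. Returns true if the reservation (if any)
 * could be released, false if the PFN lookup failed.
 */
static bool toy_munmap(struct toy_vma *vma)
{
	bool ok = true;

	if (vma->vm_pat) {
		unsigned long pfn;

		ok = toy_lookup_pfn(vma, &pfn);
		if (ok)
			vma->vm_pat = false; /* reservation released */
	}
	toy_zap_range(vma, 0, TOY_NPAGES);
	return ok;
}
```

In this toy, mapping the whole VMA and then calling toy_munmap() succeeds,
while calling toy_zap_range() on the first pages in between makes
toy_munmap() fail with the flag still set.
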
--
Cheers,

David / dhildenb