Date: Fri, 16 Aug 2024 10:21:17 -0400
From: Peter Xu
To: David Hildenbrand
Cc: Jason Gunthorpe, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    Sean Christopherson, Oscar Salvador, Axel Rasmussen,
    linux-arm-kernel@lists.infradead.org, x86@kernel.org, Will Deacon,
    Gavin Shan, Paolo Bonzini, Zi Yan, Andrew Morton, Catalin Marinas,
    Ingo Molnar, Alistair Popple, Borislav Petkov, Thomas Gleixner,
    kvm@vger.kernel.org, Dave Hansen, Alex Williamson, Yan Zhao
Subject: Re: [PATCH 06/19] mm/pagewalk: Check pfnmap early for folio_walk_start()
References: <20240809160909.1023470-1-peterx@redhat.com>
    <20240809160909.1023470-7-peterx@redhat.com>
    <20240814130525.GH2032816@nvidia.com>
    <81080764-7c94-463f-80d3-e3b2968ddf5f@redhat.com>
In-Reply-To: <81080764-7c94-463f-80d3-e3b2968ddf5f@redhat.com>

On Fri, Aug 16, 2024 at 11:30:31AM +0200, David Hildenbrand wrote:
> On 14.08.24 15:05, Jason Gunthorpe wrote:
> > On Fri, Aug 09, 2024 at 07:25:36PM +0200, David Hildenbrand wrote:
> >
> > > > > That is in general not what we want, and we still have some places that
> > > > > wrongly hard-code that behavior.
> > > > >
> > > > > In a MAP_PRIVATE mapping you might have anon pages that we can happily walk.
> > > > >
> > > > > vm_normal_page() / vm_normal_page_pmd() [and as commented as a TODO,
> > > > > vm_normal_page_pud()] should be able to identify PFN maps and reject them,
> > > > > no?
> > > >
> > > > Yep, I think we can also rely on special bit.
> >
> > It is more than just relying on the special bit..
> >
> > VM_PFNMAP/VM_MIXEDMAP should really only be used inside
> > vm_normal_page() because they are, effectively, support for a limited
> > emulation of the special bit on arches that don't have them. There are
> > a bunch of weird rules that are used to try and make that work
> > properly that have to be followed.
> >
> > On arches with the special bit they should possibly never be checked
> > since the special bit does everything you need.
> >
> > Arguably any place reading those flags outside of vm_normal_page/etc
> > is suspect.
>
> IIUC, your opinion matches mine: VM_PFNMAP/VM_MIXEDMAP and pte_special()/...
> usage should be limited to vm_normal_page/vm_normal_page_pmd/ ... of course,
> GUP-fast is special (one of the reason for "pte_special()" and friends after
> all).

The issue is that GUP, at least, currently doesn't work with pfnmaps, while
there are potentially users who want to work on both page + !page use
cases.  Besides access_process_vm(), KVM also uses a similar thing, and
maybe more; these all seem to be valid use cases for referencing the vma
flags for PFNMAP and such, so they can identify "it's pfnmap" or more
generic issues like "permission check error on pgtable".

The whole private mapping thing definitely made it complicated.

>
> >
> > > > Here I chose to follow gup-slow, and I suppose you meant that's also wrong?
> > >
> > > I assume just nobody really noticed, just like nobody noticed that
> > > walk_page_test() skips VM_PFNMAP (but not VM_IO :) ).
> >
> > Like here..
> >
> > > > And, just curious: is there any use case you're aware of that can benefit
> > > > from caring PRIVATE pfnmaps yet so far, especially in this path?
> > >
> > > In general MAP_PRIVATE pfnmaps is not really useful on things like MMIO.
> > >
> > > There was a discussion (in VM_PAT) some time ago whether we could remove
> > > MAP_PRIVATE PFNMAPs completely [1]. At least some users still use COW
> > > mappings on /dev/mem, although not many (and they might not actually write
> > > to these areas).
> >
> > I've squashed many bugs where kernel drivers don't demand userspace
> > use MAP_SHARED when asking for a PFNMAP, and of course userspace has
> > gained the wrong flags too. I don't know if anyone needs this, but it
> > has crept wrongly into the API.
> >
> > Maybe an interesting place to start is a warning printk about using an
> > obsolete feature and see where things go from there??
>
> Maybe we should start with some way to pr_warn_ONCE() whenever we get a
> COW/unshare-fault in such a MAP_PRIVATE mapping, and essentially populate
> the fresh anon folio.
>
> Then we don't only know who mmaps() something like that, but who actually
> relies on getting anon folios in there.

Sounds useful to me, if nobody yet has a solid understanding of those
private mappings and we want to collect some info.  My gut feeling is we'll
see some valid uses of them, but I hope I'm wrong..  I hope we can still
leave that as a separate thing so we focus on large mappings in this series.

And yes, I'll stick with special bits here to not add one more flag
reference.

Thanks,

-- 
Peter Xu