From: Peter Xu
Date: Thu, 19 Jun 2025 10:55:02 -0400
To: Jason Gunthorpe
Cc: "Liam R. Howlett", Lorenzo Stoakes, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, kvm@vger.kernel.org, Andrew Morton,
    Alex Williamson, Zi Yan, Alex Mastro, David Hildenbrand, Nico Pache
Subject: Re: [PATCH 5/5] vfio-pci: Best-effort huge pfnmaps with !MAP_FIXED mappings
References: <20250613231657.GO1174925@nvidia.com>
    <20250616230011.GS1174925@nvidia.com>
    <20250617231807.GD1575786@nvidia.com>
    <20250618174641.GB1629589@nvidia.com>
    <20250619135852.GC1643312@nvidia.com>
In-Reply-To: <20250619135852.GC1643312@nvidia.com>

On Thu, Jun 19, 2025 at 10:58:52AM -0300, Jason Gunthorpe wrote:
> On Wed, Jun 18, 2025 at 03:15:50PM -0400, Peter Xu wrote:
> > > > So I changed my mind, slightly. I can still have the "order" parameter to
> > > > make the API cleaner (even if it'll be a pure overhead.. because all
> > > > existing caller will pass in PUD_SIZE as of now),
> > >
> > > That doesn't seem right, the callers should report the real value not
> > > artificially cap it.. Like ARM does have page sizes greater than PUD
> > > that might be interesting to enable someday for PFN users.
> >
> > It needs to pass in PUD_SIZE to match what vfio-pci currently supports in
> > its huge_fault().
>
> Hm, OK that does make sense. I would add a small comment though as it
> is not so intuitive and may not apply to something using ioremap..

Sure, I'll remember to add a comment if I go back to the old interface.
I hope that won't happen..

>
> > So this will introduce a new file operation that will only be used so far
> > in VFIO, playing similar role until we start to convert many
> > get_unmapped_area() to this one.
>
> Yes, if someone wants to do a project here you can markup
> memfds/shmem/hugetlbfs/etc/etc to define their internal folio orders
> and hopefully ultimately remove some of that alignment logic from the
> arch code.

I'm a bit reluctant to touch all of those files just for this, but I can
definitely add a very verbose explanation to the commit log when I
introduce the new API, covering not only its relationship to the old APIs
but also possible future work.

Besides the get_unmapped_area() -> new API conversions, which are
arch-independent in most cases, it would indeed be great to reduce the
per-arch alignment requirements as much as possible. At least that should
apply to hugetlbfs, which shouldn't be arch-dependent. I am not sure about
the rest, though. For example, I see that archs may treat PF_RANDOMIZE
differently. There might be a lot of trivial details to look at.

OTOH, one other thought (which may not require looking at all archs) is
that it does look confusing to have two layers of alignment operations,
which is at least the case for THP right now. So it might be good to at
least punch it through to use vm_unmapped_area_info.align_mask etc. if
possible, to avoid double padding: after all, unmapped_area() also does
alignment padding. It smells like something we overlooked when THP
support was initially added.

Thanks,

-- 
Peter Xu
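
For illustration only, the two alignment layers mentioned above can be
modelled in user space roughly as below. This is a toy sketch, not kernel
code: all four helper names are hypothetical, and it only assumes that the
upper (THP-style) layer pads the request by the full alignment size and
fixes up the returned address, while the lower layer pads by align_mask
and shifts the gap start.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: every helper here is hypothetical, not a kernel API. */

/*
 * Lower layer: move a gap start up so that (addr & align_mask) matches
 * (align_offset & align_mask). This mirrors the kind of fix-up that an
 * align_mask / align_offset pair describes.
 */
static uint64_t gap_align(uint64_t gap_start, uint64_t align_offset,
                          uint64_t align_mask)
{
    return gap_start + ((align_offset - gap_start) & align_mask);
}

/*
 * Upper layer: the caller has already asked for (len + size) bytes and
 * now aligns the returned address itself. size must be a power of two.
 */
static uint64_t caller_pad_align(uint64_t ret, uint64_t off, uint64_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return ret + ((off - ret) & (size - 1));
}

/* Search length when only the lower layer pads for alignment. */
static uint64_t search_len_one_layer(uint64_t len, uint64_t size)
{
    return len + (size - 1);
}

/* Search length when both layers pad for the same alignment. */
static uint64_t search_len_two_layers(uint64_t len, uint64_t size)
{
    return (len + size) + (size - 1);
}
```

In this model both fix-ups compute the same aligned address, so the extra
padding requested by the upper layer buys nothing if the lower layer is
also asked to align; it only makes the gap search stricter.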