Date: Tue, 17 Jun 2025 19:36:08 -0400
From: Peter Xu
To: Jason Gunthorpe
Cc: "Liam R. Howlett", Lorenzo Stoakes, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, kvm@vger.kernel.org, Andrew Morton, Alex Williamson,
    Zi Yan, Alex Mastro, David Hildenbrand, Nico Pache
Subject: Re: [PATCH 5/5] vfio-pci: Best-effort huge pfnmaps with !MAP_FIXED mappings
References: <20250613134111.469884-6-peterx@redhat.com>
    <20250613142903.GL1174925@nvidia.com>
    <20250613160956.GN1174925@nvidia.com>
    <20250613231657.GO1174925@nvidia.com>
    <20250616230011.GS1174925@nvidia.com>
    <20250617231807.GD1575786@nvidia.com>
In-Reply-To: <20250617231807.GD1575786@nvidia.com>

On Tue, Jun 17, 2025 at 08:18:07PM -0300, Jason Gunthorpe wrote:
> On Tue, Jun 17, 2025 at 04:56:13PM -0400, Peter Xu wrote:
> > On Mon, Jun 16, 2025 at 08:00:11PM -0300, Jason Gunthorpe wrote:
> > > On Mon, Jun 16, 2025 at 06:06:23PM -0400, Peter Xu wrote:
> > >
> > > > Can I understand it as a suggestion to pass in a bitmask into the core mm
> > > > API (e.g. keep the name of mm_get_unmapped_area_aligned()), instead of a
> > > > constant "align", so that core mm would try to allocate from the largest
> > > > size to smaller until it finds some working VA to use?
> > >
> > > I don't think you need a bitmask.
> > >
> > > Split the concerns, the caller knows what is inside it's FD. It only
> > > needs to provide the highest pgoff aligned folio/pfn within the FD.
> >
> > Ultimately I even dropped this hint. I found that it's not really
> > get_unmapped_area()'s job to detect over-sized pgoffs. It's mmap()'s job.
> > So I decided to avoid this parameter as of now.
>
> Well, the point of the pgoff is only what you said earlier, to adjust
> the starting alignment so the pgoff aligned high order folios/pfns
> line up properly.

I meant that what I dropped is the "highest pgoff" hint. We definitely need
the pgoff itself to make this work. I dropped the "highest pgoff" passed from
the caller because I decided to leave that check to the mmap() hook later.
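
[Illustrative aside, not part of the patch series.] The role of the pgoff
described above can be sketched in user-space C. The helper below picks the
lowest address at or above a hint whose offset within an aligned block matches
that of the file offset, so that pgoff-aligned huge pfns end up on aligned
virtual addresses. The name sketch_align_for_pgoff(), the SKETCH_PAGE_SHIFT
constant, and the 4 KiB base page size are assumptions made for this example;
this is not the kernel's implementation.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12 /* assumption: 4 KiB base pages */

/*
 * Hypothetical helper: return the lowest address >= hint such that the
 * address and the file offset (pgoff << SKETCH_PAGE_SHIFT) are congruent
 * modulo align. align must be a power of two and at least one page.
 */
static uint64_t sketch_align_for_pgoff(uint64_t hint, uint64_t pgoff,
                                       uint64_t align)
{
    assert(align >= (1ULL << SKETCH_PAGE_SHIFT));
    assert((align & (align - 1)) == 0);

    uint64_t mask = align - 1;
    /* Offset of this file position within an align-sized block. */
    uint64_t want = (pgoff << SKETCH_PAGE_SHIFT) & mask;
    /* Round the hint down to a block boundary, then add that offset. */
    uint64_t addr = (hint & ~mask) + want;

    /* If rounding down moved us below the hint, use the next block. */
    if (addr < hint)
        addr += align;
    return addr;
}
```

For instance, with a 2 MiB alignment, a hint of 0x7f0000123000 and pgoff 0x300
(a 3 MiB file offset), the helper returns 0x7f0000300000, so the 4 MiB file
offset lands on the 2 MiB-aligned address 0x7f0000400000.
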
>
> > > The mm knows what leaf page tables options exist. It should try to
> > > align to the closest leaf page table size that is <= the FD's max
> > > aligned folio.
> >
> > So again IMHO this is also not per-FD information, but needs to be passed
> > over from the driver for each call.
>
> It is per-FD in the sense that each FD is unique and each range of
> pgoff could have a unique maximum.
>
> > Likely the "order" parameter appeared in other discussions to imply a
> > maximum supported size from the driver side (or, for a folio, but that is
> > definitely another user after this series can land).
>
> Yes, it is the only information the driver can actually provide and
> comes directly from what it will install in the VMA.
>
> > So far I didn't yet add the "order", because currently VFIO definitely
> > supports all max orders the system supports. Maybe we can add the order
> > when there's a real need, but maybe it won't happen in the near
> > future?
>
> The purpose of the order is to prevent over alignment and waste of
> VMA. Your technique to use the length to limit alignment instead is
> good enough for VFIO but not very general.

Yes, that's also something I didn't like. I think I'll just go ahead and add
the order parameter, then use it in the previous patch too. I'll wait a bit
longer for others' input before a respin, though.

Thanks,

>
> The VFIO part looks pretty good, I still don't really understand why
> you'd have CONFIG_ARCH_SUPPORTS_HUGE_PFNMAP though. The inline
> fallback you have for it seems good enough and we don't care if things
> are overaligned for ioremap.
>
> Jason
>

-- 
Peter Xu
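
[Illustrative aside, not part of the patch series.] The "order" idea discussed
above, where the mm aligns to the closest leaf page table size that does not
exceed what the driver will install, could look roughly like the sketch below.
The leaf orders listed (18 for a 1 GiB PUD, 9 for a 2 MiB PMD, 0 for a PTE)
assume x86-64 with 4 KiB base pages, and sketch_pick_leaf_order() is a
hypothetical name, not an existing kernel function.

```c
#include <stddef.h>

/*
 * Assumed leaf mapping orders, largest first: PUD (1 GiB), PMD (2 MiB),
 * PTE (4 KiB), for x86-64 with 4 KiB base pages.
 */
static const unsigned int sketch_leaf_orders[] = { 18, 9, 0 };

/*
 * Hypothetical helper: return the largest leaf order that does not exceed
 * max_order, the maximum order the driver says it will install.
 */
static unsigned int sketch_pick_leaf_order(unsigned int max_order)
{
    size_t n = sizeof(sketch_leaf_orders) / sizeof(sketch_leaf_orders[0]);

    for (size_t i = 0; i < n; i++) {
        if (sketch_leaf_orders[i] <= max_order)
            return sketch_leaf_orders[i];
    }
    return 0; /* not reached while the table ends with order 0 */
}
```

With this table, a driver reporting a maximum order of 10 would get PMD
alignment (order 9) rather than being over-aligned to 1 GiB, which is the
waste of VMA space the order is meant to prevent.
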