Date: Fri, 13 Jun 2025 11:26:40 -0400
From: Peter Xu
To: Jason Gunthorpe
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, kvm@vger.kernel.org,
 Andrew Morton, Alex Williamson, Zi Yan, Alex Mastro, David Hildenbrand,
 Nico Pache
Subject: Re: [PATCH 5/5] vfio-pci: Best-effort huge pfnmaps with !MAP_FIXED mappings
References: <20250613134111.469884-1-peterx@redhat.com> <20250613134111.469884-6-peterx@redhat.com> <20250613142903.GL1174925@nvidia.com>
In-Reply-To: <20250613142903.GL1174925@nvidia.com>

On Fri, Jun 13, 2025 at 11:29:03AM -0300, Jason Gunthorpe wrote:
> On Fri, Jun 13, 2025 at 09:41:11AM -0400, Peter Xu wrote:
>
> > +	/* Choose the alignment */
> > +	if (IS_ENABLED(CONFIG_ARCH_SUPPORTS_PUD_PFNMAP) && phys_len >= PUD_SIZE) {
> > +		ret = mm_get_unmapped_area_aligned(file, addr, len, phys_addr,
> > +						   flags, PUD_SIZE, 0);
> > +		if (ret)
> > +			return ret;
> > +	}
> > +
> > +	if (phys_len >= PMD_SIZE) {
> > +		ret = mm_get_unmapped_area_aligned(file, addr, len, phys_addr,
> > +						   flags, PMD_SIZE, 0);
> > +		if (ret)
> > +			return ret;
> > +	}
>
> Hurm, we have contiguous pages now, so PMD_SIZE is not so great, eg on
> 4k ARM with we can have a 16*2M=32MB contiguity, and 16k ARM uses
> contiguity to get a 32*16k=1GB option.
>
> Forcing to only align to the PMD or PUD seems suboptimal..

Right, but aren't cont-pte / cont-pmd still unsupported for huge pfnmaps in
general?  It would definitely be nice if someone could look at that from the
ARM perspective, then add support for both in one shot.

>
> > +fallback:
> > +	return mm_get_unmapped_area(current->mm, file, addr, len, pgoff, flags);
>
> Why not put this into mm_get_unmapped_area_vmflags() and get rid of
> thp_get_unmapped_area_vmflags() too?
>
> Is there any reason the caller should have to do a retry?

We would still need thp_get_unmapped_area_vmflags() because it encodes
PMD_SIZE for THPs, whereas a generic helper needs the flexibility to take any
alignment size.

But I get your point.  For example, mm_get_unmapped_area_aligned() could
still fall back to mm_get_unmapped_area_vmflags() automatically.

That would be OK, but it loses some flexibility when the caller wants to try
different alignments, exactly as above: the code currently attempts a
PUD-aligned mapping first, then falls back to PMD if that fails.

Indeed, I don't know whether such a fallback would help in our unit tests.
But logically speaking, we would need to look into every arch's VA allocator
to know when it might fail with bigger allocations, and if PUD fails it is
still sensible to retry with PMD if available.  From that POV, we don't want
to fall back to 4K immediately if 1G fails.

Thanks,

-- 
Peter Xu
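
For illustration only, the retry order argued for above (try 1G, then 2M, and only then 4K) can be modelled in userspace. This is a minimal sketch, not kernel code: `struct va_space`, `try_aligned()` and `pick_addr()` are hypothetical names, the "allocator" is a single free range, and matching the virtual address to the physical offset within the alignment is an assumption about what an aligned helper would do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_4K (1ULL << 12)
#define SZ_2M (1ULL << 21)
#define SZ_1G (1ULL << 30)

/* Hypothetical model of a free virtual range [lo, hi); not a kernel type. */
struct va_space {
	uint64_t lo;
	uint64_t hi;
};

/*
 * Return the lowest address in vs that fits len bytes and has the same
 * offset within `align` as `phys`, or 0 if nothing fits.
 * `align` must be a power of two.
 */
static uint64_t try_aligned(const struct va_space *vs, uint64_t len,
			    uint64_t phys, uint64_t align)
{
	uint64_t off, addr;

	assert(align && (align & (align - 1)) == 0);
	off = phys & (align - 1);

	if (vs->lo <= off)
		addr = off;
	else
		addr = (((vs->lo - off) + align - 1) & ~(align - 1)) + off;

	/* Reject wrap-around and ranges that do not fit below hi. */
	if (addr < vs->lo || addr > vs->hi || len > vs->hi - addr)
		return 0;
	return addr;
}

/*
 * Try the largest alignment first and step down one tier at a time, so a
 * failed 1G attempt still gets a 2M attempt before the 4K fallback.
 */
static uint64_t pick_addr(const struct va_space *vs, uint64_t len,
			  uint64_t phys, uint64_t *used_align)
{
	static const uint64_t tiers[] = { SZ_1G, SZ_2M };
	size_t i;

	for (i = 0; i < sizeof(tiers) / sizeof(tiers[0]); i++) {
		uint64_t addr;

		if (len < tiers[i])
			continue;	/* mapping too small for this tier */
		addr = try_aligned(vs, len, phys, tiers[i]);
		if (addr) {
			*used_align = tiers[i];
			return addr;
		}
	}

	*used_align = SZ_4K;
	return try_aligned(vs, len, phys, SZ_4K);
}
```

With a free range that is slightly larger than 1G but starts just past a 1G boundary, the 1G attempt fails and the 2M attempt succeeds, which is the case where falling straight back to 4K would lose the huge mapping.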