Date: Thu, 24 Oct 2024 11:06:24 -0600
From: Alex Williamson
To: Qinyun Tan
Cc: Andrew Morton, linux-mm@kvack.org, kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v1: vfio: avoid unnecessary pin memory when dma map io address space 0/2]
Message-ID: <20241024110624.63871cfa.alex.williamson@redhat.com>

On Thu, 24 Oct 2024 17:34:42 +0800 Qinyun Tan wrote:

> When user application call ioctl(VFIO_IOMMU_MAP_DMA) to map a dma address,
> the general handler 'vfio_pin_map_dma' attempts to pin the memory and
> then create the mapping in the iommu.
>
> However, some mappings aren't backed by a struct page, for example an
> mmap'd MMIO range for our own or another device. In this scenario, a vma
> with flag VM_IO | VM_PFNMAP, the pin operation will fail. Moreover, the
> pin operation incurs a large overhead which will result in a longer
> startup time for the VM. We don't actually need a pin in this scenario.
>
> To address this issue, we introduce a new DMA MAP flag
> 'VFIO_DMA_MAP_FLAG_MMIO_DONT_PIN' to skip the 'vfio_pin_pages_remote'
> operation in the DMA map process for mmio memory. Additionally, we add
> the 'VM_PGOFF_IS_PFN' flag for vfio_pci_mmap address, ensuring that we can
> directly obtain the pfn through vma->vm_pgoff.
>
> This approach allows us to avoid unnecessary memory pinning operations,
> which would otherwise introduce additional overhead during DMA mapping.
>
> In my tests, using vfio to pass through an 8-card AMD GPU which with a
> large bar size (128GB*8), the time mapping the 192GB*8 bar was reduced
> from about 50.79s to 1.57s.

If the vma has a flag to indicate pfnmap, why does the user need to
provide a mapping flag to indicate not to pin?  We generally cannot
trust such a user directive anyway, nor do we in this series, so it all
seems rather redundant.

What about simply improving the batching of pfnmap ranges rather than
imposing any sort of mm or uapi changes?  Or perhaps, since we're now
using huge_fault to populate the vma, maybe we can iterate at PMD or PUD
granularity rather than PAGE_SIZE?  Seems like we have plenty of
optimizations to pursue that could be done transparently to the user.
Thanks,

Alex
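
To make the PMD/PUD-granularity idea concrete, here is a rough userspace
sketch, not code from this series or from vfio. It only counts how many
loop iterations a range walk would take when each step uses the largest
naturally aligned size that still fits. The SKETCH_* sizes assume x86-64
with 4 KiB base pages, and the function names are hypothetical. A real
implementation would also have to check the level at which the range is
actually mapped and the alignment of the backing pfn, which this sketch
ignores.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes for x86-64 with 4 KiB base pages; other configs differ. */
#define SKETCH_PAGE_SIZE (1ULL << 12)
#define SKETCH_PMD_SIZE  (1ULL << 21)
#define SKETCH_PUD_SIZE  (1ULL << 30)

/*
 * Hypothetical helper: pick the largest step such that addr is aligned
 * to it and it does not exceed the remaining length.
 */
static uint64_t sketch_step(uint64_t addr, uint64_t remaining)
{
	if (!(addr & (SKETCH_PUD_SIZE - 1)) && remaining >= SKETCH_PUD_SIZE)
		return SKETCH_PUD_SIZE;
	if (!(addr & (SKETCH_PMD_SIZE - 1)) && remaining >= SKETCH_PMD_SIZE)
		return SKETCH_PMD_SIZE;
	return SKETCH_PAGE_SIZE;
}

/*
 * Hypothetical helper: count the iterations needed to cover
 * [start, start + len). Both values must be page aligned.
 */
static uint64_t sketch_count_steps(uint64_t start, uint64_t len)
{
	uint64_t steps = 0;

	assert(!(start & (SKETCH_PAGE_SIZE - 1)));
	assert(!(len & (SKETCH_PAGE_SIZE - 1)));

	while (len) {
		uint64_t step = sketch_step(start, len);

		start += step;
		len -= step;
		steps++;
	}
	return steps;
}
```

Under these assumptions, a 1 GiB-aligned 128 GiB range (the per-BAR size
quoted above) is covered in 128 PUD-sized steps, where a PAGE_SIZE walk
would take 33,554,432 iterations.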