From: Daniel Jordan
To: Jason Gunthorpe
Cc: Pavel Tatashin, Alex Williamson, LKML, linux-mm, Andrew Morton,
    Vlastimil Babka, Michal Hocko, David Hildenbrand, Oscar Salvador,
    Dan Williams, Sasha Levin, Tyler Hicks, Joonsoo Kim,
    mike.kravetz@oracle.com, Steven Rostedt, Ingo Molnar, Peter Zijlstra,
    Mel Gorman, Matthew Wilcox, David Rientjes, John Hubbard
Subject: Re: [PATCH 6/6] mm/gup: migrate pinned pages out of movable zone
In-Reply-To: <20201204205233.GF5487@ziepe.ca>
References: <20201202052330.474592-1-pasha.tatashin@soleen.com>
    <20201202052330.474592-7-pasha.tatashin@soleen.com>
    <20201202163507.GL5487@ziepe.ca> <20201203010809.GQ5487@ziepe.ca>
    <20201203141729.GS5487@ziepe.ca> <87360lnxph.fsf@oracle.com>
    <20201204205233.GF5487@ziepe.ca>
Date: Mon, 07 Dec 2020 21:48:48 -0500
Message-ID: <87k0ttrp0v.fsf@oracle.com>

Jason Gunthorpe writes:

> On Fri, Dec 04, 2020 at 03:05:46PM -0500, Daniel Jordan wrote:
>> Well Alex can correct me, but I went digging and a comment from the
>> first type1 vfio commit says the iommu API didn't promise to unmap
>> subpages of previous
>> mappings, so doing page at a time gave flexibility
>> at the cost of inefficiency.
>
> iommu restrictions are not related to gup. vfio needs to get the
> page list from the page tables as efficiently as possible, then you
> break it up into what you want to feed into the IOMMU how the iommu
> wants.
>
> vfio must maintain a page list to call unpin_user_pages() anyhow, so

It does in some cases but not others, namely the expensive
VFIO_IOMMU_MAP_DMA/UNMAP_DMA path where the iommu page tables are used to
find the pfns when unpinning.

I don't see why vfio couldn't do as you say, though, and the worst case
memory overhead of using scatterlist to remember the pfns of a 300G VM
backed by huge but physically discontiguous pages is only a few meg, not
bad at all.

> it makes a lot of sense to assemble the page list up front, then do the
> iommu, instead of trying to do both things page at a time.
>
> It would be smart to rebuild vfio to use scatter lists to store the
> page list and then break the sgl into pages for iommu
> configuration. SGLs will consume a lot less memory for the usual case
> of THPs backing the VFIO registrations.
>
> ib_umem_get() has some example of how to code this, I've been thinking
> we could make this some common API, and it could be further optimized.

Agreed, great suggestions, both above and in the rest of your response.

>> Yesterday I tried optimizing vfio to skip gup calls for tail pages after
>> Matthew pointed out this same issue to me by coincidence last week.
>
> Please don't just hack up vfio like this.

Yeah, you've cured me of that idea.

I'll see where I get experimenting with some of this stuff.
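
For a rough sense of the scatterlist numbers above, here is a minimal userspace sketch in Python, not kernel code. It models merging an ordered pfn list into contiguous runs (roughly what an sgl would store) and estimates the tracking overhead for a 300 GiB VM. Everything named here is an assumption for illustration: the helper names coalesce_pfns() and tracking_overhead(), the 4 KiB and 2 MiB page sizes, the 32-byte scatterlist entry, the 8-byte page pointer, and reading "300G" as 300 GiB.

```python
# Hypothetical userspace model, not kernel code. All names and sizes
# below are assumptions chosen for illustration.

PAGE_SIZE = 4096                # assumed base page size (4 KiB)
HUGE_PAGE_SIZE = 2 * 1024**2    # assumed huge page size (2 MiB)
SG_ENTRY_BYTES = 32             # assumed size of one scatterlist entry
PTR_BYTES = 8                   # assumed size of one page pointer


def coalesce_pfns(pfns):
    """Merge an ordered pfn list into (start_pfn, npages) runs."""
    runs = []
    for pfn in pfns:
        if runs and runs[-1][0] + runs[-1][1] == pfn:
            # pfn directly follows the last run, so extend that run.
            start, npages = runs[-1]
            runs[-1] = (start, npages + 1)
        else:
            # Physically discontiguous, so start a new run.
            runs.append((pfn, 1))
    return runs


def tracking_overhead(vm_bytes):
    """Return (per-page pointer array bytes, per-huge-page run list bytes)."""
    per_page = (vm_bytes // PAGE_SIZE) * PTR_BYTES
    # Worst case from the text: every huge page is its own run.
    per_run = (vm_bytes // HUGE_PAGE_SIZE) * SG_ENTRY_BYTES
    return per_page, per_run


if __name__ == "__main__":
    # Two physically discontiguous huge pages, 512 base pages each.
    pfns = list(range(0x1000, 0x1200)) + list(range(0x8000, 0x8200))
    print(coalesce_pfns(pfns))

    per_page, per_run = tracking_overhead(300 * 1024**3)
    print(per_page / 1024**2, "MiB per-page array")
    print(per_run / 1024**2, "MiB run list")
```

Under those assumptions the run list comes to about 4.7 MiB, against 600 MiB for one pointer per 4 KiB page, which is consistent with the "few meg" estimate.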