Date: Wed, 13 Jan 2021 16:56:38 -0500
From: Jerome Glisse
To: Jason Gunthorpe
Cc: Andrea Arcangeli, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Yu Zhao, Andy Lutomirski, Peter Xu, Pavel Emelyanov, Mike Kravetz, Mike Rapoport, Minchan Kim, Will Deacon, Peter Zijlstra, Linus Torvalds, Hugh Dickins, "Kirill A. Shutemov", Matthew Wilcox, Oleg Nesterov, Jann Horn, Kees Cook, John Hubbard, Leon Romanovsky, Jan Kara, Kirill Tkhai
Subject: Re: [PATCH 0/2] page_count can't be used to decide when wp_page_copy
Message-ID: <20210113215638.GA528828@redhat.com>
References: <20210107200402.31095-1-aarcange@redhat.com> <20210107202525.GD504133@ziepe.ca> <20210108133649.GE504133@ziepe.ca> <20210108181945.GF504133@ziepe.ca> <20210109004255.GG504133@ziepe.ca>
In-Reply-To: <20210109004255.GG504133@ziepe.ca>

On Fri, Jan 08, 2021 at 08:42:55PM -0400, Jason Gunthorpe wrote:
> On Fri, Jan 08, 2021 at 05:43:56PM -0500, Andrea Arcangeli wrote:
> > On Fri, Jan 08, 2021 at 02:19:45PM -0400, Jason Gunthorpe wrote:
> > > On Fri, Jan 08, 2021 at 12:00:36PM -0500, Andrea Arcangeli wrote:
> > > > > The majority cannot be converted to notifiers because they are DMA
> > > > > based. Every one of those is an ABI for something, and does not expect
> > > > > extra privilege to function. It would be a major breaking change to
> > > > > have pin_user_pages require some cap.
> > > >
> > > > ... what makes them safe is to be transient GUP pin and not long
> > > > term.
> > > >
> > > > Please note the "long term" in the underlined line.
> > >
> > > Many of them are long term, though only 50 or so have been marked
> > > specifically with FOLL_LONGTERM. I don't see how we can make such a
> > > major ABI break.
> >
> > io_uring is one of those indeed and I already flagged it.
> >
> > This isn't a black and white issue, kernel memory is also pinned but
> > it's not in movable pageblocks...
> > How do you tell the VM in GUP to
> > migrate memory to a non movable pageblock before pinning it? Because
> > that's what it should do to create less breakage.
>
> There is already a patch series floating about to do exactly that for
> FOLL_LONGTERM pins based on the existing code in GUP for CMA migration
>
> > For example iommu obviously need to be privileged, if your argument
> > that it's enough to use the right API to take long term pins
> > unconstrained, that's not the case. Pins are pins and prevent moving
> > or freeing the memory, their effect is the same and again worse than
> > mlock on many levels.
>
> The ship sailed on this a decade ago, it is completely infeasible to
> go back now, it would completely break widely used things like GPU,
> RDMA and more.
>

I am late to this, but GPU should not be used as an excuse for GUP. GUP is
a broken model, and the way GPUs use GUP is less broken than the way RDMA
does. In GPU drivers, the GUP contract with userspace is that the data the
GPU can access is a snapshot of what the process memory was at the time you
asked for the GUP. The process can start using different pages right after.
There is no constant coherency contract (i.e. the CPU and GPU can be working
on different pages).

If you want coherency, i.e. always have the CPU and GPU work on the same
page, then you need to use mmu notifiers and avoid pinning pages. Anything
that does not abide by mmu notifiers is broken and cannot be fixed.

Cheers,
Jérôme
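
To make the snapshot versus coherency distinction above concrete, here is a small user-space toy model. It is only a sketch and not kernel code: Page, ToyMM, PinnedDevice and NotifierDevice are hypothetical names invented for this illustration, and the callback list merely stands in for the idea of an mmu notifier invalidation. It assumes a single address whose backing page gets replaced (for example by copy-on-write) after the device has set up its access.

```python
class Page:
    """Toy stand-in for a physical page: just a mutable payload."""

    def __init__(self, data):
        self.data = data


class ToyMM:
    """Toy address space: one virtual address mapped to one page."""

    def __init__(self, data):
        self.page = Page(data)
        self.notifiers = []  # hypothetical invalidation callbacks

    def replace_page(self):
        """Model the process switching to a different page (e.g. after COW)."""
        self.page = Page(self.page.data)
        for callback in self.notifiers:
            callback()  # notifier-style users learn the mapping changed

    def cpu_write(self, data):
        """CPU writes always land in the currently mapped page."""
        self.page.data = data


class PinnedDevice:
    """Pin-style user: keeps whatever page was mapped at pin time."""

    def __init__(self, mm):
        self.page = mm.page

    def read(self):
        return self.page.data


class NotifierDevice:
    """Notifier-style user: drops its mapping on invalidation, re-resolves on use."""

    def __init__(self, mm):
        self.mm = mm
        self.page = None
        mm.notifiers.append(self.invalidate)

    def invalidate(self):
        self.page = None

    def read(self):
        if self.page is None:
            self.page = self.mm.page  # "re-fault" through the address space
        return self.page.data


mm = ToyMM("v1")
pinned = PinnedDevice(mm)
coherent = NotifierDevice(mm)
coherent.read()      # populate the device mapping before the change
mm.replace_page()    # the process starts using a different page
mm.cpu_write("v2")   # the CPU now writes to the new page

print(pinned.read())    # snapshot: contents of the old page
print(coherent.read())  # coherent: follows the CPU to the new page
```

In this model the pinned device keeps reading the page that was mapped when it took its reference, while the notifier-style device is invalidated and picks up the page the CPU is now using. Real drivers have to deal with locking, races and teardown that this sketch ignores entirely.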