Message-ID: <93f2614e-4521-8bc8-2eca-e7ad03e7e399@redhat.com>
Date: Wed, 12 Apr 2023 10:17:18 +0200
From: David Hildenbrand
Organization: Red Hat
To: "Teterevkov, Ivan", "linux-mm@kvack.org", "jhubbard@nvidia.com", "jack@suse.cz", "rppt@linux.ibm.com", "jglisse@redhat.com", "ira.weiny@intel.com", "linux-kernel@vger.kernel.org", David Howells, Christoph Hellwig, Matthew Wilcox
Subject: Re: find_get_page() VS pin_user_pages()

On 11.04.23 21:43, Teterevkov, Ivan wrote:
> Hello folks,
>
> I work with an application which aims to share memory in the userspace and
> interact with the NIC DMA.
> The memory allocation workflow begins in the
> userspace, which creates a new file backed by 2MiB hugepages with
> memfd_create(MFD_HUGETLB, MFD_HUGE_2MB) and fallocate(). Then the userspace
> makes an IOCTL to the kernel module with the file descriptor and size so that
> the kernel module can get the struct page with find_get_page(). Then the kernel
> module calls dma_map_single(page_address(page)) for NIC, which concludes the
> datapath. The allocated memory may (significantly) outlive the originating
> userspace application. The hugepages stay mapped with NIC, and the kernel
> module wants to continue using them and map to other applications that come and
> go with vm_mmap().
>
> I am studying the pin_user_pages*() family of functions, and I wonder if the
> outlined workflow requires it. The hugepages do not page out, but they can move
> as they may be allocated with GFP_HIGHUSER_MOVABLE. However, find_get_page()
> must increment the page reference counter without mapping and prevent it from
> moving. In particular, https://docs.kernel.org/mm/page_migration.html:

I suspect that find_get_page() is not the kind of interface you want to
use for the purpose you describe. find_get_page() is a wrapper around
pagecache_get_page() and seems more like a helper for implementing an fs
(looking at its users and the fact that it only considers pages that are
in the pagecache).

Instead, you might want to mmap the memfd and pass the user space
address range to the ioctl. There, you'd call pin_user_pages_*().

In general, for long-term pinning of a page (possibly keeping the page
pinned forever, controlled by user space, which seems to be what you are
doing) you want to use pin_user_pages() with FOLL_LONGTERM. That will try
migrating the page off of, e.g., ZONE_MOVABLE or MIGRATE_CMA, where
movability has to be guaranteed.

But I am no fs expert, so I'll cc some people who might know better
whether this would be an abuse of find_get_page().

-- 
Thanks,

David / dhildenb
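
To illustrate the user space half of the suggestion above (mmap the memfd and hand the kernel module an address range rather than a file descriptor), here is a minimal sketch. The names struct pin_range_req, map_for_pinning() and the memfd name "dma-buf-sketch" are hypothetical and only serve the example. The ioctl itself is not shown because the module's interface is not known. The sketch uses ftruncate() to size the file, which is an assumption; the original workflow used fallocate().

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallbacks in case the libc headers do not expose the hugetlb memfd flags. */
#ifndef MFD_HUGETLB
#define MFD_HUGETLB 0x0004U
#endif
#ifndef MFD_HUGE_2MB
#define MFD_HUGE_2MB (21U << 26)
#endif

#define HUGE_2MB (2UL << 20)

/* Hypothetical ioctl payload: a user address range instead of an fd. */
struct pin_range_req {
	uint64_t addr;	/* start of the mmap()ed range */
	uint64_t len;	/* length in bytes */
};

/*
 * Create a memfd of `len` bytes (hugetlb-backed if `want_huge` is set),
 * map it shared and describe the mapping in `req`.
 * Returns the memfd on success, -1 on error.
 *
 * The kernel module receiving `req` would then pin [addr, addr + len)
 * with pin_user_pages*() and FOLL_LONGTERM; that part is not shown.
 */
int map_for_pinning(size_t len, int want_huge, struct pin_range_req *req)
{
	unsigned int flags = MFD_CLOEXEC;
	void *p;
	int fd;

	assert(req != NULL);

	/* hugetlb mappings must be a multiple of the huge page size */
	if (len == 0 || (want_huge && len % HUGE_2MB != 0))
		return -1;
	if (want_huge)
		flags |= MFD_HUGETLB | MFD_HUGE_2MB;

	fd = memfd_create("dma-buf-sketch", flags);
	if (fd < 0)
		return -1;

	if (ftruncate(fd, (off_t)len) < 0) {
		close(fd);
		return -1;
	}

	p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (p == MAP_FAILED) {
		close(fd);
		return -1;
	}

	req->addr = (uint64_t)(uintptr_t)p;
	req->len = len;
	return fd;
}
```

A caller would invoke map_for_pinning(len, 1, &req) for the 2MiB hugepage case and then pass req to the module's ioctl. Whether the module keeps a reference to the file as well, so that it can later map the memory into other processes, is a separate design question that this sketch does not address.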