From: Mina Almasry
Date: Tue, 23 Nov 2021 13:47:33 -0800
Subject: Re: [PATCH v7] mm: Add PM_THP_MAPPED to /proc/pid/pagemap
To: Matthew Wilcox
Cc: Jonathan Corbet, David Hildenbrand, "Paul E. McKenney", Yu Zhao, Andrew Morton, Peter Xu, Ivan Teterevkov, Florian Schmidt, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-doc@vger.kernel.org
References: <20211123000102.4052105-1-almasrymina@google.com>
Content-Type: text/plain; charset="UTF-8"

On Tue, Nov 23, 2021 at 1:30 PM Matthew Wilcox wrote:
>
> On Tue, Nov 23, 2021 at 01:10:37PM -0800, Mina Almasry wrote:
> > On Tue, Nov 23, 2021 at 12:51 PM Matthew Wilcox wrote:
> > >
> > > On Mon, Nov 22, 2021 at 04:01:02PM -0800, Mina Almasry wrote:
> > > > Add PM_THP_MAPPED MAPPING to allow userspace to detect whether a given virt
> > > > address is currently mapped by a transparent huge page or not. Example
> > > > use case is a process requesting THPs from the kernel (via a huge tmpfs
> > > > mount for example), for a performance critical region of memory. The
> > > > userspace may want to query whether the kernel is actually backing this
> > > > memory by hugepages or not.
> > >
> > > So you want this bit to be clear if the memory is backed by a hugetlb
> > > page?
> > >
> >
> > Yes I believe so. I do not see value in telling the userspace that the
> > virt address is backed by a hugetlb page, since if the memory is
> > mapped by MAP_HUGETLB or is backed by a hugetlb file then the memory
> > is backed by hugetlb pages and there is no vagueness from the kernel
> > here.
> >
> > Additionally hugetlb interfaces are more size based rather than PMD or
> > not. arm64 for example supports 64K, 2MB, 32MB and 1G 'huge' pages and
> > it's an implementation detail that those sizes are mapped CONTIG PTE,
> > PMD, CONITG PMD, and PUD respectively, and the specific mapping
> > mechanism is typically not exposed to the userspace and might not be
> > stable. Assuming pagemap_hugetlb_range() == PMD_MAPPED would not
> > technically be correct.
>
> What I've been trying to communicate over the N reviews of this
> patch series is that *the same thing is about to happen to THPs*.
> Only more so. THPs are going to be of arbitrary power-of-two size, not
> necessarily sizes supported by the hardware. That means that we need to
> be extremely precise about what we mean by "is this a THP?" Do we just
> mean "This is a compound page?" Do we mean "this is mapped by a PMD?"
> Or do we mean something else? And I feel like I haven't been able to
> get that information out of you.
>

Yes, I'm very sorry for the trouble, but I'm also confused about what
the disconnect is. To allocate hugepages I can do either of the
following:

mount -t tmpfs -o huge=always tmpfs /mnt/mytmpfs

or

madvise(..., MADV_HUGEPAGE)

Note that I don't ask the kernel for a specific size or a specific
mapping mechanism (PMD/contig PTE/contig PMD/PUD); I just ask the
kernel for 'huge' pages. I would like to know whether the kernel was
successful in allocating a hugepage or not. Today a THP hugepage AFAICT
is PMD mapped + is_transparent_hugepage(), which is the check I have
here. In the future, THP may become an arbitrary power-of-two size, and
I think I'll need to update this querying interface if and when that
gets merged into the kernel. I.e., if in the future I allocate pages by
using:

mount -t tmpfs -o huge=2MB tmpfs /mnt/mytmpfs

I need the kernel to tell me whether the mapping is 2MB size or not.
If I allocate pages by using:

mount -t tmpfs -o huge=pmd tmpfs /mnt/mytmpfs

then I need the kernel to tell me whether the pages are PMD mapped or
not, as I'm doing here. The current implementation is based on how THP
is implemented in the kernel today, and depending on future changes to
THP I may need to update it.

Does that make sense?
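
For illustration only, here is a minimal userspace sketch of the kind of query discussed above: look up the 64-bit /proc/<pid>/pagemap entry for a virtual address and test a flag bit in it. The present and swap bit positions are the ones documented for pagemap. The bit position of the proposed PM_THP_MAPPED flag is not stated in this message, so it is left as a caller-supplied parameter (thp_bit), and the helper names (pagemap_offset, pm_test_bit, read_pagemap_entry, is_thp_mapped) are hypothetical and not part of the patch.

```c
#define _XOPEN_SOURCE 700 /* for pread() under strict -std= modes */
#include <assert.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

/* Flag bits documented for /proc/<pid>/pagemap entries. */
#define PM_PRESENT_BIT 63 /* page is present in RAM */
#define PM_SWAP_BIT    62 /* page is swapped out */

/* Byte offset of the 64-bit pagemap entry that describes vaddr. */
off_t pagemap_offset(uintptr_t vaddr, long page_size)
{
    assert(page_size > 0);
    return (off_t)(vaddr / (uintptr_t)page_size) * (off_t)sizeof(uint64_t);
}

/* Return 1 if the given bit (0..63) is set in a pagemap entry, else 0. */
int pm_test_bit(uint64_t entry, unsigned int bit)
{
    assert(bit < 64);
    return (int)((entry >> bit) & 1u);
}

/*
 * Read the pagemap entry for vaddr from an fd opened on /proc/<pid>/pagemap.
 * Returns 0 on success, -1 on error or short read.
 */
int read_pagemap_entry(int fd, uintptr_t vaddr, long page_size, uint64_t *entry)
{
    ssize_t n = pread(fd, entry, sizeof(*entry),
                      pagemap_offset(vaddr, page_size));
    return n == (ssize_t)sizeof(*entry) ? 0 : -1;
}

/*
 * HYPOTHETICAL helper: thp_bit is an ASSUMPTION supplied by the caller,
 * since the bit used for PM_THP_MAPPED is not given in this thread.
 * Returns 1 if the page is present and thp_bit is set, 0 if not,
 * and -1 if the entry could not be read.
 */
int is_thp_mapped(int fd, uintptr_t vaddr, long page_size, unsigned int thp_bit)
{
    uint64_t entry;

    if (read_pagemap_entry(fd, vaddr, page_size, &entry) != 0)
        return -1;
    return pm_test_bit(entry, PM_PRESENT_BIT) && pm_test_bit(entry, thp_bit);
}
```

A caller would open /proc/self/pagemap read-only, take page_size from sysconf(_SC_PAGESIZE), touch the memory first so that it is actually faulted in, and then call is_thp_mapped() on each base-page-sized step of the region of interest. Because the check is expressed per pagemap entry, it reports only what the kernel chooses to expose in that bit, which is exactly the definitional question raised in the quoted review.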