Date: Sun, 9 Jun 2024 21:08:34 +0100
From: Matthew Wilcox
To: David Hildenbrand
Cc: Khalid Aziz, Peter Xu, Vishal Moola, Jane Chu, Muchun Song, linux-mm@kvack.org
Subject: Re: Unifying page table walkers
References: <502bb09f-ea09-451b-8473-48b14dd2f554@redhat.com>
In-Reply-To: <502bb09f-ea09-451b-8473-48b14dd2f554@redhat.com>

On Fri, Jun 07, 2024 at 08:59:17AM +0200, David Hildenbrand wrote:
> On 06.06.24 20:29, Matthew Wilcox wrote:
> > One of the things we discussed at LSFMM was unifying the hugetlb and
> > THP page table walkers. I've been looking into it some more recently;
> > I've found a problem and I think a solution.
> >
> > The reason we have a separate hugetlb_entry from pmd_entry and pud_entry
> > is that it has a different locking context. It is called with the
> > hugetlb_vma_lock held for read (nb: this is not the same as the vma
> > lock; see walk_hugetlb_range()). Why do we need this? Because of page
> > table sharing.
> >
> > In a completely separate discussion, I was talking with Khalid about
> > mshare() support for hugetlbfs, and I suggested that we permit hugetlbfs
> > pages to be mapped by a VMA which does not have the VM_HUGETLB flag set.
> > If we do that, the page tables would not be permitted to be shared with
> > other users of that hugetlbfs file. But we want to eliminate support
> > for that anyway, so that's more of a feature than a bug.
>
> I am not sure why hugetlb support in mshare would require that (we don't
> need partial mappings and all of that to support mshare+hugetlb).

You're absolutely right. My motivation is the other way around.

A large part of "hugetlbfs is special" is tied to the sharing of page
tables. That's why we have the hugetlb_vma_lock. If we're already
sharing page tables with mshare, I assert that it is not necessary to
also share page tables with other hugetlb users.

So as part of including hugetlb support in mshare, we should drop that
support, and handle hugetlb-mapped-with-mshare similarly to THP.
Possibly not the mapcount parts so that we preserve the HVO.
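
To make the locking-context difference concrete, here is a rough userspace
model of it. This is not kernel code: struct toy_vma, struct toy_rwlock,
toy_walk() and the counting callbacks are all hypothetical names invented
for illustration, and the "lock" is a plain reader counter that only works
single-threaded. It shows only the shape of the problem: the hugetlb_entry
callback runs with a per-VMA lock held for read, while pmd_entry does not.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_STEP (2UL << 20)   /* pretend PMD-sized step (2 MiB); an assumption */

/* Toy stand-in for the hugetlb_vma_lock: just counts readers. */
struct toy_rwlock { int readers; };
static void toy_read_lock(struct toy_rwlock *l)   { l->readers++; }
static void toy_read_unlock(struct toy_rwlock *l) { l->readers--; }

struct toy_vma {
    int is_hugetlb;                  /* stands in for VM_HUGETLB */
    struct toy_rwlock hugetlb_lock;  /* stands in for the hugetlb_vma_lock */
};

/* Two callbacks, mirroring the pmd_entry / hugetlb_entry split. */
struct toy_walk_ops {
    int (*pmd_entry)(struct toy_vma *vma, unsigned long addr, void *priv);
    int (*hugetlb_entry)(struct toy_vma *vma, unsigned long addr, void *priv);
};

struct toy_stats { int pmd_calls; int hugetlb_calls; int calls_under_lock; };

static int count_pmd(struct toy_vma *vma, unsigned long addr, void *priv)
{
    struct toy_stats *s = priv;
    (void)addr;
    s->pmd_calls++;
    s->calls_under_lock += vma->hugetlb_lock.readers > 0;
    return 0;
}

static int count_hugetlb(struct toy_vma *vma, unsigned long addr, void *priv)
{
    struct toy_stats *s = priv;
    (void)addr;
    s->hugetlb_calls++;
    s->calls_under_lock += vma->hugetlb_lock.readers > 0;
    return 0;
}

/* Hugetlb VMAs are walked with the lock held for read; others are not. */
static int toy_walk(struct toy_vma *vma, const struct toy_walk_ops *ops,
                    unsigned long start, unsigned long end, void *priv)
{
    int err = 0;

    assert(start <= end);
    if (vma->is_hugetlb)
        toy_read_lock(&vma->hugetlb_lock);
    for (unsigned long addr = start; addr < end && !err; addr += TOY_STEP) {
        if (vma->is_hugetlb)
            err = ops->hugetlb_entry ? ops->hugetlb_entry(vma, addr, priv) : 0;
        else
            err = ops->pmd_entry ? ops->pmd_entry(vma, addr, priv) : 0;
    }
    if (vma->is_hugetlb)
        toy_read_unlock(&vma->hugetlb_lock);
    return err;
}

/* Walk four steps of a toy VMA and report which callbacks ran, and how. */
static struct toy_stats toy_demo(int is_hugetlb)
{
    struct toy_vma vma = { .is_hugetlb = is_hugetlb };
    struct toy_stats stats = { 0 };
    const struct toy_walk_ops ops = {
        .pmd_entry = count_pmd,
        .hugetlb_entry = count_hugetlb,
    };

    toy_walk(&vma, &ops, 0, 4 * TOY_STEP, &stats);
    return stats;
}
```

In this model, toy_demo(1) reaches only the hugetlb callback, always under
the lock, and toy_demo(0) reaches only the pmd callback, never under it.
A hugetlbfs page mapped by a VMA without VM_HUGETLB would take the second
path, which is why dropping page table sharing there would let it be
handled like THP.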