Date: Mon, 31 Jul 2023 17:38:50 +0100
From: Matthew Wilcox
To: David Hildenbrand
Cc: Rongwei Wang, linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, "xuyu@linux.alibaba.com"
Subject: Re: [PATCH RFC v2 0/4] Add support for sharing page tables across processes (Previously mshare)
References: <74fe50d9-9be9-cc97-e550-3ca30aebfd13@linux.alibaba.com> <9faea1cf-d3da-47ff-eb41-adc5bd73e5ca@linux.alibaba.com>

On Mon, Jul 31, 2023 at 06:30:22PM +0200, David Hildenbrand wrote:
> Assume we do do the page table sharing at mmap time, if the flags are right.
> Let's focus on the most common:
>
> mmap(memfd, PROT_READ | PROT_WRITE, MAP_SHARED)
>
> And doing the same in each and every process.

That may be the most common in your usage, but for a database, you're looking at two usage scenarios.
Postgres calls mmap() on the database file itself so that all processes share the kernel page cache. Some Commercial Databases call mmap() on a hugetlbfs file so that all processes share the same userspace buffer cache. Other Commercial Databases call shmget() / shmat() with SHM_HUGETLB for the exact same reason.

This is why I proposed mshare(). Anyone can use it for anything. We have such a diverse set of users who want to do stuff with shared page tables that we should not be tying it to memfd or any other filesystem. Not to mention that it's more flexible; you can map individual 4kB files into it and still get page table sharing.