Date: Tue, 1 Aug 2023 14:53:02 +0800
Subject: Re: [PATCH RFC v2 0/4] Add support for sharing page tables across processes (Previously mshare)
From: Rongwei Wang
To: Matthew Wilcox
Cc: linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, "xuyu@linux.alibaba.com"
References: <74fe50d9-9be9-cc97-e550-3ca30aebfd13@linux.alibaba.com> <9faea1cf-d3da-47ff-eb41-adc5bd73e5ca@linux.alibaba.com>

On 2023/8/1 00:38, Matthew Wilcox wrote:
> On Mon, Jul 31, 2023 at 06:30:22PM +0200, David Hildenbrand wrote:
>> Assume we do do the page table sharing at mmap time, if the flags are right.
>> Let's focus on the most common:
>>
>> mmap(memfd, PROT_READ | PROT_WRITE, MAP_SHARED)
>>
>> And doing the same in each and every process.
> That may be the most common in your usage, but for a database, you're
> looking at two usage scenarios. Postgres calls mmap() on the database
> file itself so that all processes share the kernel page cache.
> Some Commercial Databases call mmap() on a hugetlbfs file so that all
> processes share the same userspace buffer cache.
> Other Commercial Databases call shmget() / shmat() with SHM_HUGETLB for
> the exact same reason.
>
> This is why I proposed mshare(). Anyone can use it for anything.

Hi Matthew,

I'm a little confused about this mshare(). Which mshare() are you
referring to here: the previous filesystem-based mshare(), or this
RFC v2 posted by Khalid? IMHO, the previous mshare() and the current
MAP_SHARED_PT differ quite a lot.

> We have such a diverse set of users who want to do stuff with shared
> page tables that we should not be tying it to memfd or any other
> filesystem. Not to mention that it's more flexible; you can map
> individual 4kB files into it and still get page table sharing.
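
For reference, here is a minimal sketch of the plain
mmap(memfd, PROT_READ | PROT_WRITE, MAP_SHARED) pattern quoted above
from David's mail, with each process mapping the same memfd. It uses
only existing interfaces (memfd_create(), mmap()); it does not use
mshare() or MAP_SHARED_PT, and whether page tables get shared for such
mappings is exactly what this thread is discussing. The function name,
region size, and test string are illustrative assumptions, not taken
from any patch.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* for memfd_create() in <sys/mman.h> */
#endif
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Same prototype as glibc's; repeated so the sketch still compiles if
 * _GNU_SOURCE is defined too late to take effect. */
int memfd_create(const char *name, unsigned int flags);

#define REGION_SIZE 4096       /* illustrative size, one small page */

/* Hypothetical helper: returns 0 if data written through the child's
 * MAP_SHARED mapping is visible through the parent's own mapping. */
int shared_memfd_demo(void)
{
    int fd = memfd_create("shared-demo", 0);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, REGION_SIZE) != 0) {
        close(fd);
        return -1;
    }

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: map the inherited memfd and write into it. */
        char *p = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);
        if (p == MAP_FAILED)
            _exit(1);
        strcpy(p, "hello");
        _exit(0);
    }

    /* Parent: wait for the child, then create its own mapping. */
    int status;
    if (waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        close(fd);
        return -1;
    }
    char *q = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    close(fd);                 /* the mapping keeps the memfd alive */
    if (q == MAP_FAILED)
        return -1;

    int ok = (strcmp(q, "hello") == 0);
    munmap(q, REGION_SIZE);
    return ok ? 0 : -1;
}
```

Calling shared_memfd_demo() from main() and checking for a return
value of 0 is enough to try it. The two processes share the underlying
pages here, but each one builds its own page table entries for the
region when it maps and touches it.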