From: "Christoph Lameter (Ampere)"
Date: Tue, 14 May 2024 11:21:37 -0700 (PDT)
To: Khalid Aziz
cc: lsf-pc@lists.linux-foundation.org, "linux-mm@kvack.org"
Subject: Re: [LSF/MM/BPF TOPIC] Sharing page tables across processes (mshare)
Message-ID: <1ee0e6e0-f82a-5c90-0bca-5c5b93aeadc9@linux.com>

> 1. Amount of memory required for PTEs to map physical pages stays low
> even when large number of threads share the same pages since PTEs are
> shared across threads.
>
> 2. Page protection attributes are shared across threads and a change
> of attributes applies immediately to every thread without any overhead
> of coordinating protection bit changes across threads.
>
> These advantages no longer apply when unrelated processes share pages.
> Large database applications can easily comprise of 1000s of processes
> that share 100s of GB of pages.
> In cases like this, amount of memory
> consumed by page tables can exceed the size of actual shared data.
> On a database server with 300GB SGA, a system crash was seen with
> out-of-memory condition when 1500+ clients tried to share this SGA even
> though the system had 512GB of memory. On this server, in the worst case
> scenario of all 1500 processes mapping every page from SGA would have
> required 878GB+ for just the PTEs.

OK, then use 1GB or larger pages for a shared mapping of huge pages. I am
not sure why there is a need for sharing page tables here.

I just listened to your talk at LSF/MM and noted some things.

It may be best to follow established shared memory approaches, for
example the one already implemented in shmem.

If you want to do it with actual page table sharing semantics, then the
proper implementation using shmem would perhaps be to add an additional
flag. Let's call this O_SHARED_PAGE_TABLE for now.

Then you would do

    fd = shm_open("shared_pagetable_segment", O_CREAT|O_RDWR|O_SHARED_PAGE_TABLE, 0666);

The remaining handling is straightforward, and the shmem subsystem already
provides consistent handling of shared memory segments.

What you would have to do is sort out the kernel-internal problems
created by sharing page table sections when using SHM vmas. But with that,
only limited changes are required, confined to special types of vma and
the shmem subsystem. So the impact on the kernel overall is limited and
you are following an established method of managing shared memory.

I actually need something like shared page tables for another in-kernel
page table use case as well, in order to define sections in kernel virtual
memory that are special for CPUs or nodes. Some abstracted functions to
manage page tables that share pgd, pud, pmd would be good to have in the
kernel, if you don't mind.

But for this use case I'd suggest using gigabyte shmem mappings and being
done with it.

https://lwn.net/Articles/375098/
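
For reference, the 878GB+ figure quoted above and the effect of switching
to 1GB pages can be reproduced with a little arithmetic. The sketch below
is only a back-of-the-envelope model: it assumes 8-byte page table entries
(as on x86-64), counts only the last-level entries, and ignores the upper
levels of the tree. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes of last-level page table entries one process needs to map
 * mapping_bytes using pages of page_bytes.
 * Assumption: 8-byte entries; upper page table levels are ignored. */
uint64_t pte_bytes_per_process(uint64_t mapping_bytes, uint64_t page_bytes)
{
    const uint64_t entry_bytes = 8; /* assumed entry size */

    assert(page_bytes > 0);
    /* Round up so a partially used last page still gets an entry. */
    uint64_t entries = (mapping_bytes + page_bytes - 1) / page_bytes;
    return entries * entry_bytes;
}

/* Same, summed over nprocs processes that each have private page tables. */
uint64_t pte_bytes_total(uint64_t mapping_bytes, uint64_t page_bytes,
                         uint64_t nprocs)
{
    return pte_bytes_per_process(mapping_bytes, page_bytes) * nprocs;
}
```

Under these assumptions, a 300GB mapping with 4KB pages costs 600MB of
entries per process, or about 878.9GB across 1500 processes. With 1GB
pages the same mapping needs 300 entries (2400 bytes) per process, about
3.6MB in total.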
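
To make the suggested flow concrete, here is a minimal userspace sketch
built on POSIX shm_open(), which is the closest existing call to the one
suggested above. O_SHARED_PAGE_TABLE is hypothetical and does not exist in
any kernel; it is defined as 0 here so that the code builds and behaves
like ordinary POSIX shared memory. The helper names and the leading slash
in the segment name are assumptions of this sketch, and no page tables are
actually shared by it.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* HYPOTHETICAL flag from the proposal above. Not a real kernel flag;
 * defined as 0 so this sketch runs as plain POSIX shared memory. */
#ifndef O_SHARED_PAGE_TABLE
#define O_SHARED_PAGE_TABLE 0
#endif

/* Create (or open) a POSIX shm segment of len bytes and map it shared.
 * Returns the mapping address, or NULL on failure. */
void *map_shared_segment(const char *name, size_t len)
{
    /* POSIX recommends names that start with a slash. */
    assert(name != NULL && name[0] == '/');

    int fd = shm_open(name, O_CREAT | O_RDWR | O_SHARED_PAGE_TABLE, 0666);
    if (fd < 0)
        return NULL;

    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping remains valid after the descriptor is closed */
    return p == MAP_FAILED ? NULL : p;
}

/* Unmap the segment and remove its name. Returns 0 on success. */
int drop_shared_segment(const char *name, void *p, size_t len)
{
    int rc = munmap(p, len);

    if (shm_unlink(name) != 0)
        rc = -1;
    return rc;
}
```

Two calls to map_shared_segment() with the same name, whether from one
process or from unrelated ones, return mappings backed by the same pages.
On older glibc versions this may need to be linked with -lrt.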