Date: Wed, 4 Dec 2024 14:49:18 -0800
From: Andrew Morton
To: Qi Zheng
Cc: david@redhat.com, jannh@google.com, hughd@google.com, willy@infradead.org,
 muchun.song@linux.dev, vbabka@kernel.org, peterx@redhat.com, mgorman@suse.de,
 catalin.marinas@arm.com, will@kernel.org, dave.hansen@linux.intel.com,
 luto@kernel.org, peterz@infradead.org, x86@kernel.org,
 lorenzo.stoakes@oracle.com, zokeefe@google.com, rientjes@google.com,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v4 00/11] synchronously scan and reclaim empty user PTE pages
Message-Id:
 <20241204144918.b08dbdd99903d3e18a27eb44@linux-foundation.org>

On Wed, 4 Dec 2024 19:09:40 +0800 Qi Zheng wrote:

>
> ...
>
> Previously, we tried to use a completely asynchronous method to reclaim empty
> user PTE pages [1]. After discussing with David Hildenbrand, we decided to
> implement synchronous reclamation in the case of madvise(MADV_DONTNEED) as the
> first step.

Please help us understand what the other steps are, because we don't
want to commit to a particular partial implementation only to later
discover that completing that implementation causes us problems.

> So this series aims to synchronously free the empty PTE pages in
> madvise(MADV_DONTNEED) case. We will detect and free empty PTE pages in
> zap_pte_range(), and will add zap_details.reclaim_pt to exclude cases other than
> madvise(MADV_DONTNEED).
>
> In zap_pte_range(), mmu_gather is used to perform batch tlb flushing and page
> freeing operations. Therefore, if we want to free the empty PTE page in this
> path, the most natural way is to add it to mmu_gather as well. Now, if
> CONFIG_MMU_GATHER_RCU_TABLE_FREE is selected, mmu_gather will free page table
> pages by semi RCU:
>
> - batch table freeing: asynchronous free by RCU
> - single table freeing: IPI + synchronous free
>
> But this is not enough to free the empty PTE page table pages in paths other
> than munmap and exit_mmap path, because IPI cannot be synchronized with
> rcu_read_lock() in pte_offset_map{_lock}(). So we should let single table also
> be freed by RCU like batch table freeing.
>
> As a first step, we supported this feature on x86_64 and selected the newly
> introduced CONFIG_ARCH_SUPPORTS_PT_RECLAIM.
>
> For other cases such as madvise(MADV_FREE), consider scanning and freeing empty
> PTE pages asynchronously in the future.

Handling MADV_FREE sounds fairly straightforward?
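
For anyone following along, the madvise(MADV_DONTNEED) case under
discussion can be exercised from userspace roughly as below. This is
only a minimal sketch and is not taken from the series: the helper name
touch_then_dontneed() is made up for illustration, and it assumes Linux
with a private anonymous mapping. It shows the user-visible behaviour
(the range reads back as zeros after the call). Whether the kernel also
frees the emptied PTE pages is not observable from this code.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 *
 * Map `len` bytes of private anonymous memory, write to every page so
 * that PTEs get populated, then drop the whole range with
 * madvise(MADV_DONTNEED).  Right after the madvise() call the PTEs
 * covering the range have been zapped, which is the point at which a
 * PTE page can become empty.
 *
 * Returns 0 if every page reads back as zero afterwards, -1 on error.
 * Note that the read-back itself faults pages in again.
 */
static int touch_then_dontneed(size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || len == 0)
        return -1;

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* Populate one PTE per page. */
    for (size_t off = 0; off < len; off += (size_t)page)
        p[off] = 0xaa;

    int ret = -1;
    if (madvise(p, len, MADV_DONTNEED) == 0) {
        ret = 0;
        /* Private anonymous memory is zero-filled on the next access. */
        for (size_t off = 0; off < len; off += (size_t)page)
            if (p[off] != 0)
                ret = -1;
    }

    munmap(p, len);
    return ret;
}
```

Calling touch_then_dontneed(4UL << 20) covers two 2MB-sized PTE tables'
worth of address space on x86_64 with 4K pages, although the mapping is
not guaranteed to be 2MB-aligned.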