Date: Thu, 30 May 2024 21:04:16 +0100
From: Matthew Wilcox
To: Yosry Ahmed
Cc: Johannes Weiner, Usama Arif, akpm@linux-foundation.org,
 nphamcs@gmail.com, chengming.zhou@linux.dev, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, kernel-team@meta.com, Hugh Dickins,
 Huang Ying
Subject: Re: [PATCH 1/2] mm: store zero pages to be swapped out in a bitmap
References: <20240530102126.357438-1-usamaarif642@gmail.com>
 <20240530102126.357438-2-usamaarif642@gmail.com>
 <20240530122715.GB1222079@cmpxchg.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Thu, May 30, 2024 at 09:24:20AM -0700, Yosry Ahmed wrote:
> I am wondering if it's even possible to take this one step further and
> avoid reclaiming zero-filled pages in the first place.
> Can we just unmap them and let the first read fault allocate a zero'd
> page like uninitialized memory, or point them at the zero page and
> make them read-only, or something? Then we could free them directly
> without going into the swap code to begin with.

I was having similar thoughts.  You can see in do_anonymous_page() that
we simply map the shared zero page when we take a read fault on
unallocated anon memory.

So my question is where are all these zero pages coming from in the Meta
fleet?  Obviously we never try to swap out the shared zero page (it's
not on any LRU list).  So I see three possibilities:

 - Userspace wrote to it, but it wrote zeroes.  Then we did a memcmp(),
   discovered it was zeroes and fell into this path.  It would be safe
   to just discard this page.
 - We allocated it as part of a THP.  We never wrote to this particular
   page of the THP, so it's zero-filled.  While it's safe to just
   discard this page, we might want to write it for better swap-in
   performance.
 - Userspace wrote non-zeroes to it, then wrote zeroes to it before
   abandoning use of this page, and so it eventually got swapped out.
   Perhaps we could teach userspace to MADV_DONTNEED the page instead?

Has any data been gathered on this?  Maybe there are other sources of
zeroed pages that I'm missing.  I do remember a presentation at LSFMM
in 2022 from Google about very sparsely used THPs.
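
To make the third possibility concrete, here is a rough userspace sketch
(not kernel code, and not taken from the patch under discussion).  It
assumes Linux and Python 3.8 or later, where mmap objects expose
madvise().  The helper names is_zero_filled(), zero_by_writing(),
zero_by_discarding() and demo() are invented for illustration.  It
contrasts overwriting a used page with zeroes against dropping it with
MADV_DONTNEED.  For a private anonymous mapping, both should read back
as zeroes afterwards, but only the first leaves a dirty page behind for
reclaim to deal with.

```python
import mmap

PAGE_SIZE = mmap.PAGESIZE


def is_zero_filled(page: bytes) -> bool:
    """Whole-buffer comparison against zeroes, in the spirit of the
    memcmp() check mentioned above (hypothetical helper)."""
    return page == bytes(len(page))


def zero_by_writing(mapping: mmap.mmap, offset: int) -> None:
    """Overwrite one page with zeroes. The page stays allocated and
    dirty, so it can still end up as a zero-filled swap candidate."""
    mapping[offset:offset + PAGE_SIZE] = bytes(PAGE_SIZE)


def zero_by_discarding(mapping: mmap.mmap, offset: int) -> None:
    """Tell the kernel the page is no longer needed. On Linux, a later
    access to private anonymous memory is zero-filled on demand."""
    mapping.madvise(mmap.MADV_DONTNEED, offset, PAGE_SIZE)


def demo() -> tuple:
    # Private anonymous mapping of two pages (assumption: Linux).
    mapping = mmap.mmap(
        -1,
        2 * PAGE_SIZE,
        flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
        prot=mmap.PROT_READ | mmap.PROT_WRITE,
    )
    try:
        # Dirty both pages with non-zero data first.
        mapping[:] = b"\xab" * (2 * PAGE_SIZE)
        zero_by_writing(mapping, 0)
        zero_by_discarding(mapping, PAGE_SIZE)
        return (
            is_zero_filled(mapping[0:PAGE_SIZE]),
            is_zero_filled(mapping[PAGE_SIZE:2 * PAGE_SIZE]),
        )
    finally:
        mapping.close()


if __name__ == "__main__":
    print(demo())
```

Whether real workloads can be changed this way is exactly the open
question above; the sketch only shows that the two approaches look the
same to a reader of the memory while differing in what the kernel has
to keep around.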