Message-ID: <434aacbe-e32d-468f-8135-bd550847c267@gmail.com>
Date: Fri, 31 May 2024 19:18:41 +0100
From: Usama Arif
Subject: Re: [PATCH 1/2] mm: store zero pages to be swapped out in a bitmap
To: Matthew Wilcox, Yosry Ahmed
Cc: Johannes Weiner, akpm@linux-foundation.org, nphamcs@gmail.com,
 chengming.zhou@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 kernel-team@meta.com, Hugh Dickins, Huang Ying
References: <20240530102126.357438-1-usamaarif642@gmail.com>
 <20240530102126.357438-2-usamaarif642@gmail.com>
 <20240530122715.GB1222079@cmpxchg.org>

On 30/05/2024 21:04, Matthew Wilcox wrote:
> On Thu, May 30, 2024 at 09:24:20AM -0700, Yosry Ahmed wrote:
>> I am wondering if it's even possible to take this one step further and
>> avoid reclaiming zero-filled pages in the first place. Can we just
>> unmap them and let the first read fault allocate a zero'd page like
>> uninitialized memory, or point them at the zero page and make them
>> read-only, or something? Then we could free them directly without
>> going into the swap code to begin with.
> I was having similar thoughts. You can see in do_anonymous_page() that
> we simply map the shared zero page when we take a read fault on
> unallocated anon memory.

Thanks Yosry and Matthew. I'm currently prototyping this to see how it
might look, and should hopefully have an update next week.

> So my question is where are all these zero pages coming from in the Meta
> fleet? Obviously we never try to swap out the shared zero page (it's
> not on any LRU list). So I see three possibilities:
>
> - Userspace wrote to it, but it wrote zeroes. Then we did a memcmp(),
>   discovered it was zeroes and fall into this path. It would be safe
>   to just discard this page.
> - We allocated it as part of a THP. We never wrote to this particular
>   page of the THP, so it's zero-filled. While it's safe to just
>   discard this page, we might want to write it for better swap-in
>   performance.

It's mostly THP. Alex presented the numbers well in his THP series:
https://lore.kernel.org/lkml/cover.1661461643.git.alexlzhu@fb.com/

> - Userspace wrote non-zeroes to it, then wrote zeroes to it before
>   abandoning use of this page, and so it eventually got swapped out.
>   Perhaps we could teach userspace to MADV_DONTNEED the page instead?
>
> Has any data been gathered on this? Maybe there are other sources of
> zeroed pages that I'm missing. I do remember a presentation at LSFMM
> in 2022 from Google about very sparsely used THPs.
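
For illustration only, and not taken from the patch or the prototype mentioned above, here is a minimal userspace sketch of the third possibility in the list. It dirties one private anonymous page, drops it with MADV_DONTNEED instead of writing zeroes over it, and checks that the next read sees zero-fill. Both function names, buf_is_zero_filled() and dontneed_reads_back_zero(), are made up for this example, and the byte-wise scan is only a stand-in for whatever zero check the kernel side ends up using.

```c
#define _DEFAULT_SOURCE		/* for MAP_ANONYMOUS and madvise() on glibc */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: true if all len bytes at buf are zero. */
bool buf_is_zero_filled(const unsigned char *buf, size_t len)
{
	assert(buf != NULL || len == 0);

	for (size_t i = 0; i < len; i++) {
		if (buf[i] != 0)
			return false;
	}
	return true;
}

/*
 * Hypothetical demo: write non-zero data to one private anonymous page,
 * discard it with MADV_DONTNEED, then read it back.
 * Returns 1 if the page reads back zero-filled, 0 if not, -1 on error.
 */
int dontneed_reads_back_zero(void)
{
	long page_size = sysconf(_SC_PAGESIZE);
	unsigned char *p;
	size_t len;
	int ret;

	if (page_size <= 0)
		return -1;
	len = (size_t)page_size;

	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	/* Dirty the page so it is backed by a real, non-zero page. */
	memset(p, 0xaa, len);

	/* Tell the kernel the contents are no longer needed. */
	if (madvise(p, len, MADV_DONTNEED) != 0) {
		munmap(p, len);
		return -1;
	}

	/* For private anonymous memory the next access sees zero-fill. */
	ret = buf_is_zero_filled(p, len) ? 1 : 0;

	munmap(p, len);
	return ret;
}
```

On Linux, dontneed_reads_back_zero() is expected to return 1, since madvise(2) documents zero-fill-on-demand for private anonymous mappings after MADV_DONTNEED. An application that does this never leaves a resident zero-filled page behind for reclaim to find.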