From: Yosry Ahmed
Date: Thu, 30 May 2024 13:16:04 -0700
Subject: Re: [PATCH 1/2] mm: store zero pages to be swapped out in a bitmap
To: Matthew Wilcox
Cc: Johannes Weiner, Usama Arif, akpm@linux-foundation.org, nphamcs@gmail.com, chengming.zhou@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@meta.com, Hugh Dickins, Huang Ying
References: <20240530102126.357438-1-usamaarif642@gmail.com> <20240530102126.357438-2-usamaarif642@gmail.com> <20240530122715.GB1222079@cmpxchg.org>

On Thu, May 30, 2024 at 1:04 PM Matthew Wilcox wrote:
>
> On Thu, May 30, 2024 at 09:24:20AM -0700, Yosry Ahmed wrote:
> > I am wondering if it's even possible to take this one step further and
> > avoid reclaiming zero-filled pages in the first place. Can we just
> > unmap them and let the first read fault allocate a zero'd page like
> > uninitialized memory, or point them at the zero page and make them
> > read-only, or something? Then we could free them directly without
> > going into the swap code to begin with.
>
> I was having similar thoughts.
> You can see in do_anonymous_page() that
> we simply map the shared zero page when we take a read fault on
> unallocated anon memory.
>
> So my question is where are all these zero pages coming from in the Meta
> fleet? Obviously we never try to swap out the shared zero page (it's
> not on any LRU list). So I see three possibilities:
>
>  - Userspace wrote to it, but it wrote zeroes. Then we did a memcmp(),
>    discovered it was zeroes and fall into this path. It would be safe
>    to just discard this page.
>  - We allocated it as part of a THP. We never wrote to this particular
>    page of the THP, so it's zero-filled. While it's safe to just
>    discard this page, we might want to write it for better swap-in
>    performance.

My understanding is that here we check if the entire folio is
zero-filled. If the THP is still intact as a single folio, we will
only apply the optimization if the entire THP is zero-filled. If we
are checking a page that used to be part of a THP, then I think the
THP is already split and swap-in performance would not be affected.

Did I miss anything here?

>  - Userspace wrote non-zeroes to it, then wrote zeroes to it before
>    abandoning use of this page, and so it eventually got swapped out.
>    Perhaps we could teach userspace to MADV_DONTNEED the page instead?

Why wouldn't it be safe to discard the page in this case as well?

>
> Has any data been gathered on this? Maybe there are other sources of
> zeroed pages that I'm missing. I do remember a presentation at LSFMM
> in 2022 from Google about very sparsely used THPs.

Apart from that, we may also want to think about shmem if we want a
general approach to avoid swapping out zero pages.
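
To make the folio-level check discussed above concrete, here is a rough
user-space sketch. It is not the code from the patch and not a kernel API:
SKETCH_PAGE_SIZE and all sketch_* names are hypothetical, a flat byte buffer
stands in for a folio, and the memcmp() trick is just one simple way to test
for zeroes. The point it illustrates is that the optimization only applies
when every page in the folio is zero-filled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Assumed page size, for illustration only. */
#define SKETCH_PAGE_SIZE 4096UL

/*
 * Return true if one page-sized buffer holds only zero bytes.
 * If the first byte is zero and every byte equals its successor,
 * then all bytes are zero.
 */
bool sketch_page_is_zero(const unsigned char *page)
{
	return page[0] == 0 &&
	       memcmp(page, page + 1, SKETCH_PAGE_SIZE - 1) == 0;
}

/*
 * Return true only if all nr_pages pages of the (hypothetical) folio
 * are zero-filled. A single non-zero byte anywhere disqualifies the
 * whole folio.
 */
bool sketch_folio_is_zero(const unsigned char *folio, size_t nr_pages)
{
	for (size_t i = 0; i < nr_pages; i++) {
		if (!sketch_page_is_zero(folio + i * SKETCH_PAGE_SIZE))
			return false;
	}
	return true;
}

/* Small self-check: a 4-page "folio" with one dirty byte in page 3. */
void sketch_demo(void)
{
	static unsigned char folio[4 * SKETCH_PAGE_SIZE]; /* starts zeroed */

	assert(sketch_folio_is_zero(folio, 4));

	folio[3 * SKETCH_PAGE_SIZE + 17] = 1;
	assert(!sketch_folio_is_zero(folio, 4)); /* whole folio rejected */
	assert(sketch_folio_is_zero(folio, 3));  /* first 3 pages still zero */

	folio[3 * SKETCH_PAGE_SIZE + 17] = 0;    /* restore for repeat calls */
}
```

In this model, a THP that is still one folio would be passed with its full
page count, while a page split off from a THP would be checked as a folio of
one page.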