References: <20240821074541.516249-1-hanchuanhua@oppo.com>
 <20240821074541.516249-3-hanchuanhua@oppo.com>
 <20240903130757.f584c73f356c03617a2c8804@linux-foundation.org>
 <94eb70cd-b508-42ef-b5d2-acc29e22eb0e@gmail.com>
From: Barry Song <21cnbao@gmail.com>
Date: Thu, 5 Sep 2024 11:27:18 +1200
Subject: Re: [PATCH v7 2/2] mm: support large folios swap-in for sync io devices
To: Usama Arif
Cc: Yosry Ahmed, Andrew Morton, Kairui Song, hanchuanhua@oppo.com,
 linux-mm@kvack.org, baolin.wang@linux.alibaba.com, chrisl@kernel.org,
 david@redhat.com, hannes@cmpxchg.org, hughd@google.com,
 kaleshsingh@google.com, linux-kernel@vger.kernel.org, mhocko@suse.com,
 minchan@kernel.org, nphamcs@gmail.com, ryan.roberts@arm.com,
 senozhatsky@chromium.org, shakeel.butt@linux.dev, shy828301@gmail.com,
 surenb@google.com, v-songbaohua@oppo.com, willy@infradead.org,
 xiang@kernel.org, ying.huang@intel.com, hch@infradead.org

On Thu, Sep 5, 2024 at 11:23 AM Usama Arif wrote:
>
>
>
> On 05/09/2024 00:10, Barry Song wrote:
> > On Thu, Sep 5, 2024 at 9:30 AM Usama Arif wrote:
> >>
> >>
> >>
> >> On 03/09/2024 23:05, Yosry Ahmed wrote:
> >>> On Tue, Sep 3, 2024 at 2:36 PM Barry Song
> >>> <21cnbao@gmail.com> wrote:
> >>>>
> >>>> On Wed, Sep 4, 2024 at 8:08 AM Andrew Morton wrote:
> >>>>>
> >>>>> On Tue, 3 Sep 2024 11:38:37 -0700 Yosry Ahmed wrote:
> >>>>>
> >>>>>>> [ 39.157954] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000007
> >>>>>>> [ 39.158288] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
> >>>>>>> [ 39.158634] R13: 0000000000002b9a R14: 0000000000000000 R15: 00007ffd619d5518
> >>>>>>> [ 39.158998]
> >>>>>>> [ 39.159226] ---[ end trace 0000000000000000 ]---
> >>>>>>>
> >>>>>>> After reverting this or Usama's "mm: store zero pages to be swapped
> >>>>>>> out in a bitmap", the problem is gone. I think these two patches may
> >>>>>>> have some conflict that needs to be resolved.
> >>>>>>
> >>>>>> Yup. I saw this conflict coming and specifically asked for this
> >>>>>> warning to be added in Usama's patch to catch it [1]. It served its
> >>>>>> purpose.
> >>>>>>
> >>>>>> Usama's patch does not handle large folio swapin, because at the time
> >>>>>> it was written we didn't have it. We expected Usama's series to land
> >>>>>> sooner than this one, so the warning was to make sure that this series
> >>>>>> handles large folio swapin in the zeromap code. Now that they are both
> >>>>>> in mm-unstable, we are gonna have to figure this out.
> >>>>>>
> >>>>>> I suspect Usama's patches are closer to land so it's better to handle
> >>>>>> this in this series, but I will leave it up to Usama and
> >>>>>> Chuanhua/Barry to figure this out :)
> >>>>
> >>>> I believe handling this in swap-in might violate layer separation.
> >>>> `swap_read_folio()` should be a reliable API to call, regardless of
> >>>> whether `zeromap` is present. Therefore, the fix should likely be
> >>>> within `zeromap` but not this `swap-in`. I’ll take a look at this with
> >>>> Usama :-)
> >>>
> >>> I meant handling it within this series to avoid blocking Usama
> >>> patches, not within this code.
> >>> Thanks for taking a look, I am sure you
> >>> and Usama will figure out the best way forward :)
> >>
> >> Hi Barry and Yosry,
> >>
> >> Is the best (and quickest) way forward to have a v8 of this with
> >> https://lore.kernel.org/all/20240904055522.2376-1-21cnbao@gmail.com/
> >> as the first patch, and using swap_zeromap_entries_count in alloc_swap_folio
> >> in this support large folios swap-in patch?
> >
> > Yes, Usama. i can actually do a check:
> >
> > zeromap_cnt = swap_zeromap_entries_count(entry, nr);
> >
> > /* swap_read_folio() can handle inconsistent zeromap in multiple entries */
> > if (zeromap_cnt > 0 && zeromap_cnt < nr)
> >         try next order;
> >
> > On the other hand, if you read the code of zRAM, you will find zRAM has
> > exactly the same mechanism as zeromap but zRAM can even do more
> > by same_pages filled. since zRAM does the job in swapfile layer, there
> > is no this kind of consistency issue like zeromap.
> >
> > So I feel for zRAM case, we don't need zeromap at all as there are duplicated
> > efforts while I really appreciate your job which can benefit all swapfiles.
> > i mean, zRAM has the ability to check "zero"(and also non-zero but same
> > content). after zeromap checks zeromap, zRAM will check again:
> >
>
> Yes, so there is a reason for having the zeromap patches, which I have outlined
> in the coverletter.
>
> https://lore.kernel.org/all/20240627105730.3110705-1-usamaarif642@gmail.com/
>
> There are usecases where zswap/zram might not be used in production.
> We can reduce I/O and flash wear in those cases by a large amount.
>
> Also running in Meta production, we found that the number of non-zero filled
> complete pages were less than 1%, so essentially its only the zero-filled pages
> that matter.

I don't have data on Android phones. I'd like to see whether phones show
the same ratio, i.e. whether non-zero same-filled pages are just as rare
there.
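
To make the order-fallback check quoted above concrete, here is a minimal
userspace sketch, not the kernel code. It assumes a plain byte-array bitmap
with one bit per swap slot; the names slot_is_zero(), zeromap_count() and
pick_swapin_order() are hypothetical and only model the idea. The real
helper named in the thread is swap_zeromap_entries_count(), whose
implementation is not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical model: one bit per swap slot, set when the page stored
 * in that slot was zero-filled at swap-out time.
 */
static bool slot_is_zero(const uint8_t *zeromap, size_t slot)
{
	return (zeromap[slot / 8] >> (slot % 8)) & 1u;
}

/* Count how many of the nr slots starting at 'start' are marked zero. */
static size_t zeromap_count(const uint8_t *zeromap, size_t start, size_t nr)
{
	size_t cnt = 0;

	assert(zeromap != NULL);
	for (size_t i = 0; i < nr; i++)
		cnt += slot_is_zero(zeromap, start + i);
	return cnt;
}

/*
 * Walk from max_order down to order 1 and return the first order whose
 * naturally aligned slot range is uniform: either no slot or every slot
 * is marked zero. A mixed range (0 < cnt < nr) is skipped, mirroring the
 * "try next order" step. Order 0 (a single page) is the final fallback.
 */
static unsigned int pick_swapin_order(const uint8_t *zeromap, size_t slot,
				      unsigned int max_order)
{
	for (unsigned int order = max_order; order > 0; order--) {
		size_t nr = (size_t)1 << order;
		size_t start = slot & ~(nr - 1);	/* align down to nr */
		size_t cnt = zeromap_count(zeromap, start, nr);

		if (cnt == 0 || cnt == nr)
			return order;
	}
	return 0;
}
```

With slots 0-3 marked zero and slots 4-15 not, a fault on slot 5 with
max_order 4 skips the mixed 16-slot and 8-slot ranges and settles on
order 2 (slots 4-7), so a single read never has to mix zero-filled and
backed entries.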
>
> I believe after zeromap, it might be a good idea to remove the page_same_filled
> check from zram code? Its not really a problem if its kept as well as I dont
> believe any zero-filled pages should reach zram_write_page?
>
> > static int zram_write_page(struct zram *zram, struct page *page, u32 index)
> > {
> >         ...
> >
> >         if (page_same_filled(mem, &element)) {
> >                 kunmap_local(mem);
> >                 /* Free memory associated with this sector now. */
> >                 flags = ZRAM_SAME;
> >                 atomic64_inc(&zram->stats.same_pages);
> >                 goto out;
> >         }
> >         ...
> > }
> >
> > So it seems that zeromap might slightly impact my zRAM use case. I'm not
> > blaming you, just pointing out that there might be some overlap in effort
> > here :-)
> >
> >>
> >> Thanks,
> >> Usama
>
>

Thanks
Barry
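
For reference, the overlap discussed above (zeromap catching zero-filled
pages, zRAM's ZRAM_SAME path catching any same-filled page) can be
illustrated with a small userspace sketch. This is an assumption-laden
model, not the zram source: same_filled() and zero_filled() are
hypothetical names, SKETCH_PAGE_SIZE is fixed at 4096 for illustration,
and the buffer is assumed to be aligned for unsigned long.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096	/* assumed page size, for illustration only */

/*
 * Return true if every machine word in the page holds the same value,
 * and report that value through *element. This models the kind of test
 * a same-filled check performs: such a page can be represented by one
 * word instead of being compressed and stored.
 */
static bool same_filled(const void *page, unsigned long *element)
{
	const unsigned long *words = page;
	size_t n = SKETCH_PAGE_SIZE / sizeof(*words);
	unsigned long val = words[0];

	for (size_t i = 1; i < n; i++)
		if (words[i] != val)
			return false;

	*element = val;
	return true;
}

/* The zero-filled case is the subset where the repeated word is 0. */
static bool zero_filled(const void *page)
{
	unsigned long element;

	return same_filled(page, &element) && element == 0;
}
```

In this model every zero-filled page is also same-filled, which is the
duplicated work mentioned in the thread; only pages filled with a
non-zero repeated word are caught by the same-filled test alone.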