From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id C4E81C52D70 for ; Fri, 2 Aug 2024 07:18:42 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 468926B007B; Fri, 2 Aug 2024 03:18:42 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 417FE6B0083; Fri, 2 Aug 2024 03:18:42 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 2E0D96B0085; Fri, 2 Aug 2024 03:18:42 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id 0FA396B007B for ; Fri, 2 Aug 2024 03:18:42 -0400 (EDT) Received: from smtpin28.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 81E7D12112A for ; Fri, 2 Aug 2024 07:18:41 +0000 (UTC) X-FDA: 82406452842.28.F3EEDF2 Received: from mail-pl1-f182.google.com (mail-pl1-f182.google.com [209.85.214.182]) by imf13.hostedemail.com (Postfix) with ESMTP id 9C71820027 for ; Fri, 2 Aug 2024 07:18:39 +0000 (UTC) Authentication-Results: imf13.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b="Xwg//biI"; spf=pass (imf13.hostedemail.com: domain of 21cnbao@gmail.com designates 209.85.214.182 as permitted sender) smtp.mailfrom=21cnbao@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1722583090; a=rsa-sha256; cv=none; b=l+Vd5jZDpf2PaZCCJqdmQvjbF6tZ2RgmrGZPQCV3qEAxnlkt9VCgaxdviM04a2yrmaQi+W bBREJbp1HbOPDUPkCbEEA9ztUV9MayjPbSJWqk1VS2rHJ88QaJaPrq46Oh9TaZIP5FbTmm S+QjDtp1AlH2w6itVhGRQBWIVakzrT0= ARC-Authentication-Results: i=1; imf13.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b="Xwg//biI"; spf=pass (imf13.hostedemail.com: domain of 21cnbao@gmail.com designates 209.85.214.182 as permitted sender) smtp.mailfrom=21cnbao@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1722583090; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=k78JGF02XTaSpSY2F19gccm9MTwGNhAILnPDJY5TGS4=; b=HJWI2MfylHX2yAa7PCD2UPbtRgwIT060UNX+3Wzf8otTWu90qddMVLlvPIH90Te/JWLVgd zgmgZsJDM1InPLV9R6bjYiYI1Xg5LUQGKgEQ7th+BE0avQpqQt5EAr0dHizKqsejnrgDkA sUKgqlQhciKxSZ6q5q42810opqojmnY= Received: by mail-pl1-f182.google.com with SMTP id d9443c01a7336-1fc65329979so67448115ad.0 for ; Fri, 02 Aug 2024 00:18:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1722583118; x=1723187918; darn=kvack.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=k78JGF02XTaSpSY2F19gccm9MTwGNhAILnPDJY5TGS4=; b=Xwg//biIDOO5lEoxuxaza9rhrNPqt7eVC5lt+Za3Cb3XplWhxWMHnUwgWZUKv5IMzQ J/VxDwGKnMryUKQh+CYrneN7V0XFk6e+J3qDl5vO9fOxgXF1qp6hQq8PEhHLvOMjmVWX AHM3xBa0J2YpVuvBEz5uep9RXWiU2SmidtQfJxtvytnDIpEl7DcBHDGetenOJxtP/4aU SR4eKEYmCDy1GWxZtthkrZ4cARlOdI4+LUMFky48X/ydZENpTjgFy2yio0OLzMApDkyb ubD7aKSw97y9Am3yU53pAjpABNl7H2DE3X6M9UQBKOUK/aj5yolwIWBdhbJoVF9zp01N C9eQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1722583118; x=1723187918; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=k78JGF02XTaSpSY2F19gccm9MTwGNhAILnPDJY5TGS4=; b=ry7coubJZdYl+z8aFU4huHc2rdpaX23GaWe8AULoHjS+n26FinmtGaKabgYXOCH6Ai gJMO3LgMev0WQ5cxc0LSbEm+QitEMz8FF5CvBHbst/U9JNUajoZQvyhiZFg8pYpt4gBC YlkJNbsVw9kr0f+IHyEsAREPPwdvdxJsbBi///5lIUfk2k2ExGf809s7QKytLuUmMuhj 6fuFKCgmo4oHH5QeKTHRtzlN0TK67DqBu74RO4JoiznEgIE1UYqUmvqXmI3+l670GBln DjldMofkQr9yYwC/KfEFcZw9Apo7X9ifL25f0+zuEQchD2//WqIxYigaNdPeub+DwAbE nByA== X-Forwarded-Encrypted: i=1; AJvYcCXyKIKVuqEpDct/qYMzySYJPq/UZmNicCxnlQrvjUtKNI6AzHolaeCJT0T6f5Wx1hp90poXbe4zgszXcxEy271XS1k= X-Gm-Message-State: AOJu0YwZU8a7rfh709ni9C4edeDqtmrggbApCpwLEXbrQCWisbau0FSK WfmNBbfIQLM7rffuSEHNAoLUbLKL8SV6IbTjhbl/SCOLFdNsLHdW X-Google-Smtp-Source: AGHT+IHCBWMSEl6WKmbUX247lj6zM93iaVkd2vbq5KQvq05U1WFX8o+U3IBR2/Le70WBHfcgpmwgWA== X-Received: by 2002:a17:902:dac2:b0:1fb:2e9a:beea with SMTP id d9443c01a7336-1ff570d8163mr43037285ad.0.1722583118068; Fri, 02 Aug 2024 00:18:38 -0700 (PDT) Received: from localhost.localdomain ([2407:7000:8942:5500:aaa1:59ff:fe57:eb97]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-1ff590608casm10208605ad.159.2024.08.02.00.18.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 02 Aug 2024 00:18:37 -0700 (PDT) From: Barry Song <21cnbao@gmail.com> To: ying.huang@intel.com Cc: akpm@linux-foundation.org, baolin.wang@linux.alibaba.com, chrisl@kernel.org, david@redhat.com, hannes@cmpxchg.org, hughd@google.com, kaleshsingh@google.com, kasong@tencent.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, mhocko@suse.com, minchan@kernel.org, nphamcs@gmail.com, ryan.roberts@arm.com, senozhatsky@chromium.org, shakeel.butt@linux.dev, shy828301@gmail.com, surenb@google.com, v-songbaohua@oppo.com, willy@infradead.org, xiang@kernel.org, yosryahmed@google.com Subject: Re: [PATCH 1/1] mm: swap: add nr argument in swapcache_prepare and swapcache_clear to support large folios Date: Fri, 2 Aug 2024 19:18:17 +1200 Message-Id: <20240802071817.47081-1-21cnbao@gmail.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <87sevpj6xp.fsf@yhuang6-desk2.ccr.corp.intel.com> References: <87sevpj6xp.fsf@yhuang6-desk2.ccr.corp.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-Stat-Signature: 74uy6zznktb9pokqf9cz7s7siwg8ppen X-Rspamd-Queue-Id: 9C71820027 X-Rspam-User: X-Rspamd-Server: rspam10 X-HE-Tag: 1722583119-835962 X-HE-Meta: U2FsdGVkX1+NT2o2WvxjCeUkrIMYj6q8m1hgPsaAe8PSAq0btzgaUmSX3nZHOsV+kFZiSydnIxbP+ewreKqKIqJnvDyFK/BQ93YxBb+jMzv7nmHw/ihLQ8+bf9YC8wzV1xmZsq2uL0lf3OUMvnU0G1A+6QGwpqOSg8L1JozK2kqyOUTXEnvkayV7KLvG0njQj9qlVj6WbcE4wLvq5Q7a56hEWhq9cH+c+N4uxbDNZYN7Uah8VG+iLDgFWrAkkiSBuSSi8HQlYxjN/34FlmjHFv5DNbU/sXo1AO2TDbKD3SahlISq9A/moKP4NAqZ5EW37VUoTt+Uzg6gEk9P/QXC03aiwiuD3hmvX+nA4CYd1lVTz0nzHCRvTsSeFJsvjwMIvzPX3pSQO8juHbbaXCe34CQLn5dUQrcj5l3cjiCobO1heZ+qTlj+3xAgoLQyxpGcuSSeiyovUSu/HixqZwl4KDD25UT0JYXz556xqZPQ30tANDji8Pih3m1Z5x/TIm8FAupUOr0fWiNM6yilHYhA15LCxEYR55cW3CeXOuDXUT6K/ilhSk/oqiKsQGbh+sifs53IQph/1ULWW3dyYiuPEJmkJkXrj8xbJtNilPC9QpQpjxFOYSB7C7Mk04CB6V6Mehfb9Jcilrut2TH36BRAhdKWcJyJ74F9AtUEPJQsbEn6k8frHyyiEfcODbmadnC2Q60kwvmqZkLUasIsTkufpRLj1G6j0YkgBpt/Xte013ANi2wJmTt43CXfTPr21r+EAVU+DIJbUx+gQYOgvfuSvdXlESzYBhRei6h/2iMfb5302vc7IrCV45Tki4BdIMVcvPE8MfYdAuvmG5enhFEU6WyiL9Hl3JqqgOspxDONt6qyE5tEAp7YRBPdj0UK+NBfMs/gVT0D7KCO942M5XQnp+r8+uN4C8rAVhrWAS48c3Akco9YIONwwbzFfqeprnpxpmOhZooPU3FsvcjYD27 B5PXchfP m/IfJcQnfPrK1t05clgTikZar14acKZB50RNO3ZvsW1gAqHdYSLOhjDRha0gXj9WLXbxOh18bt0Yjts2Gimlmy1gziISEmDJX/VItjEmL+bGWrnGt3UP7QCoxZAWANmI8q5RqLt7V+z8ogDvdSTUZztzrPwwM4UFiU1kl60ROTWEz98QuCeJW+kaR/iADNI53cKUXDyDdi1s3h60BZlU5S8cfxDEk++0FM62aUytrAFSYxGBVhISuiD6SwAwqFP1U60lH/8eOsnkWiB7YcPXlKYpYjbnFRZCPUtx+BZUBIbt2SC9fVePAN5lhYRd7stV85IqXVqGbvp1HQw0lEt+ldrO3liParYmxEmvXHx7Tl/S/HsaWCU5v79hwuPHF63+z3Jk7bjKdqH9UepGiA2CtfF1cvFHC0qUzjcUcrb+IhxNuuzgT5t9kxYXOoNRjhXqDfIXu3qKDvnL42WuRrSp0eZp6PhPII3Mb+MtQDqhnKbJwynmO31S+q43km7U5fJsDZQb54P3sepiRK/rMzXQmK0QipmFFXrMA8vFT X-Bogosity: Ham, tests=bogofilter, spamicity=0.000142, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Fri, Aug 2, 2024 at 1:50 PM Huang, Ying wrote: > > > > Right. I believe the change below can help improve readability and also > > clarify the swap_count_continued() case. > > > > diff --git a/mm/swapfile.c b/mm/swapfile.c > > index 2fa29bdec171..75a89ce18edc 100644 > > --- a/mm/swapfile.c > > +++ b/mm/swapfile.c > > @@ -3538,6 +3538,7 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage, int nr) > > > >       offset = swp_offset(entry); > >       VM_WARN_ON(nr > SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER); > > +     VM_WARN_ON(usage == 1 && nr > 1); > >       ci = lock_cluster_or_swap_info(p, offset); > > > >       err = 0; > > @@ -3564,17 +3565,9 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage, int nr) > >                               err = -EEXIST; > >                       else                            /* no users remaining */ > >                               err = -ENOENT; > > - > >               } else if (count || has_cache) { > > - > > -                     if ((count & ~COUNT_CONTINUED) < SWAP_MAP_MAX) > > -                             continue; > > -                     else if ((count & ~COUNT_CONTINUED) > SWAP_MAP_MAX) > > +                     if ((count & ~COUNT_CONTINUED) > SWAP_MAP_MAX) > >                               err = -EINVAL; > > -                     else if (swap_count_continued(p, offset + i, count)) > > -                             continue; > > -                     else > > -                             err = -ENOMEM; > >               } else > >                       err = -ENOENT;                  /* unused swap entry */ > > > > @@ -3591,8 +3584,12 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage, int nr) > >                       has_cache = SWAP_HAS_CACHE; > >               else if ((count & ~COUNT_CONTINUED) < SWAP_MAP_MAX) > >                       count += usage; > > -             else > > +             else if (swap_count_continued(p, offset + i, count)) > >                       count = COUNT_CONTINUED; > > +             else { > > +                     err = -ENOMEM; > > +                     goto unlock_out; > > +             } > > > >               WRITE_ONCE(p->swap_map[offset + i], count | has_cache); > >       } > > > > This makes the two iterations become: > > > >       for (i = 0; i < nr; i++) { > >               count = p->swap_map[offset + i]; > > > >               /* > >                * swapin_readahead() doesn't check if a swap entry is valid, so the > >                * swap entry could be SWAP_MAP_BAD. Check here with lock held. > >                */ > >               if (unlikely(swap_count(count) == SWAP_MAP_BAD)) { > >                       err = -ENOENT; > >                       goto unlock_out; > >               } > > > >               has_cache = count & SWAP_HAS_CACHE; > >               count &= ~SWAP_HAS_CACHE; > > > >               if (usage == SWAP_HAS_CACHE) { > >                       /* set SWAP_HAS_CACHE if there is no cache and entry is used */ > > The comments doen't apply now, we don't "set" here. > > >                       if (!has_cache && count) > >                               continue; > >                       else if (has_cache)             /* someone else added cache */ > >                               err = -EEXIST; > >                       else                            /* no users remaining */ > >                               err = -ENOENT; > >               } else if (count || has_cache) { > >                       if ((count & ~COUNT_CONTINUED) > SWAP_MAP_MAX) > >                               err = -EINVAL; > >               } else > >                       err = -ENOENT;                  /* unused swap entry */ > > It seems that this can be simplified to: > >                 if (!count && !has_cache) { >                         err = -ENOENT; >                 } else if (usage == SWAP_HAS_CACHE) { >                         if (has_cache) >                                 err = -EEXIST; >                 } else if ((count & ~COUNT_CONTINUED) > SWAP_MAP_MAX) { >                         err = -EINVAL; >                 } > > >               if (err) > >                       goto unlock_out; > >       } > > > >       for (i = 0; i < nr; i++) { > >               count = p->swap_map[offset + i]; > >               has_cache = count & SWAP_HAS_CACHE; > >               count &= ~SWAP_HAS_CACHE; > > > >               if (usage == SWAP_HAS_CACHE) > >                       has_cache = SWAP_HAS_CACHE; > >               else if ((count & ~COUNT_CONTINUED) < SWAP_MAP_MAX) > >                       count += usage; > >               else if (swap_count_continued(p, offset + i, count)) > >                       count = COUNT_CONTINUED; > >               else { > > Better to add some comments here, > >                         /* >                          * Don't need to rollback changes, because if >                          * usage == 1, there must be nr == 1. >                          */ > > >                       err = -ENOMEM; > >                       goto unlock_out; > >               } > > > >               WRITE_ONCE(p->swap_map[offset + i], count | has_cache); > >       } > > > > Ying, do you feel more satisfied with the version above > > compared to the code in mm-unstable? > > This looks good to me except some minor comments above.  Thanks! Thanks very much, Ying. Hi Andrew, Could you please help squash the below change? >From 17cbd696ecd37bc199b6e87c0c837d1a1ae9ac8d Mon Sep 17 00:00:00 2001 From: Barry Song Date: Fri, 2 Aug 2024 17:52:43 +1200 Subject: [PATCH] mm: clarify swap_count_continued and improve readability for __swap_duplicate when usage=1 and swapcount is very large, the situation can become quite complex. The first entry might succeed with swap_count_continued(), but the second entry may require extending to an additional continued page. Rolling back these changes can be extremely challenging. Therefore, anyone using usage==1 for batched __swap_duplicate() operations should proceed with caution. Additionally, we have moved swap_count_continued() to the second iteration to enhance readability, as this function may modify data. Suggested-by: "Huang, Ying" Signed-off-by: Barry Song --- mm/swapfile.c | 38 +++++++++++++++++--------------------- 1 file changed, 17 insertions(+), 21 deletions(-) diff --git a/mm/swapfile.c b/mm/swapfile.c index d9cf31b04db3..ea023fc25d08 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -3540,6 +3540,7 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage, int nr) offset = swp_offset(entry); VM_WARN_ON(nr > SWAPFILE_CLUSTER - offset % SWAPFILE_CLUSTER); + VM_WARN_ON(usage == 1 && nr > 1); ci = lock_cluster_or_swap_info(p, offset); err = 0; @@ -3558,27 +3559,14 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage, int nr) has_cache = count & SWAP_HAS_CACHE; count &= ~SWAP_HAS_CACHE; - if (usage == SWAP_HAS_CACHE) { - /* set SWAP_HAS_CACHE if there is no cache and entry is used */ - if (!has_cache && count) - continue; - else if (has_cache) /* someone else added cache */ + if (!count && !has_cache) { + err = -ENOENT; + } else if (usage == SWAP_HAS_CACHE) { + if (has_cache) err = -EEXIST; - else /* no users remaining */ - err = -ENOENT; - - } else if (count || has_cache) { - - if ((count & ~COUNT_CONTINUED) < SWAP_MAP_MAX) - continue; - else if ((count & ~COUNT_CONTINUED) > SWAP_MAP_MAX) - err = -EINVAL; - else if (swap_count_continued(p, offset + i, count)) - continue; - else - err = -ENOMEM; - } else - err = -ENOENT; /* unused swap entry */ + } else if ((count & ~COUNT_CONTINUED) > SWAP_MAP_MAX) { + err = -EINVAL; + } if (err) goto unlock_out; @@ -3593,8 +3581,16 @@ static int __swap_duplicate(swp_entry_t entry, unsigned char usage, int nr) has_cache = SWAP_HAS_CACHE; else if ((count & ~COUNT_CONTINUED) < SWAP_MAP_MAX) count += usage; - else + else if (swap_count_continued(p, offset + i, count)) count = COUNT_CONTINUED; + else { + /* + * Don't need to rollback changes, because if + * usage == 1, there must be nr == 1. + */ + err = -ENOMEM; + goto unlock_out; + } WRITE_ONCE(p->swap_map[offset + i], count | has_cache); } -- 2.34.1 > >> >> >> >> >> Best Regards, > >> >> >> >> >> Huang, Ying > >> >> >> >> > > >> >> > > > Thanks Barry