From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id AEC9BCF9C71 for ; Wed, 25 Sep 2024 06:27:08 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 23EF26B007B; Wed, 25 Sep 2024 02:27:08 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 1C9106B0082; Wed, 25 Sep 2024 02:27:08 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 0687A6B0083; Wed, 25 Sep 2024 02:27:08 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id D82D56B007B for ; Wed, 25 Sep 2024 02:27:07 -0400 (EDT) Received: from smtpin05.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 85CE71208E0 for ; Wed, 25 Sep 2024 06:27:07 +0000 (UTC) X-FDA: 82602278094.05.E8E2AAC Received: from mail-ua1-f44.google.com (mail-ua1-f44.google.com [209.85.222.44]) by imf02.hostedemail.com (Postfix) with ESMTP id C160D8000D for ; Wed, 25 Sep 2024 06:27:05 +0000 (UTC) Authentication-Results: imf02.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=ZKfihQvw; spf=pass (imf02.hostedemail.com: domain of 21cnbao@gmail.com designates 209.85.222.44 as permitted sender) smtp.mailfrom=21cnbao@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1727245506; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=iw2KumctdM5J/ktaLv2+p8EP/6GhqY7xK+IMjtOjZwk=; b=31Un84UPRJvOa9OLT/WHc3CPrl87YURlm5BKDu4oBtVjn97Q3iqfK9aHFAQFwu2CVEdeRD Ibexk4ZUHLv7O7VNTOV9fVGrff+rqKpppAf9mtBclbZTD3BIw6RjK+bJ+DoiQSxNq7cHf9 8nWBV6SCrc0bYlbYzV1cjtQJiKcaDWI= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1727245506; a=rsa-sha256; cv=none; b=D6FLsfhPAZs3eh20yOq2gbFtz4J6gD4D0tuAJ3cB625WLnPsnHY6cqJDbfVQtJjHQZsC9u bubChcgGxoUfiLXNX5FBwbp5RG7FvyQJh+5ocBOabVgjaHYlKmPFB7kGv7uxnMd0n/mm1B GG/zkq1cWyKz4Neq1y20LTyMVD/LseA= ARC-Authentication-Results: i=1; imf02.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=ZKfihQvw; spf=pass (imf02.hostedemail.com: domain of 21cnbao@gmail.com designates 209.85.222.44 as permitted sender) smtp.mailfrom=21cnbao@gmail.com; dmarc=pass (policy=none) header.from=gmail.com Received: by mail-ua1-f44.google.com with SMTP id a1e0cc1a2514c-84e8ca005caso62625241.1 for ; Tue, 24 Sep 2024 23:27:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1727245625; x=1727850425; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=iw2KumctdM5J/ktaLv2+p8EP/6GhqY7xK+IMjtOjZwk=; b=ZKfihQvwugAgTlxkJ4AcEIffYCFPU6QRegYNcmEpaymfe9cdk3B2iTDH5c9Zgh8pb+ i9+SI4GAypd5A25gLRotHlrci8mNlBqDYYCd9hqJl5CwEMouyg4briGLS+BzWGKWqOwq qpFGOFT7A93SItaWeD0ApwC+KgqqjRjdHa2ZgqBxfxcucCUhTQMep+3xp5HaGbwhHnpM c2zSRQEo5Ax+1l4dyOeI3G+MPDXiSRev1XvhMeOpCQGYDqR6flZcKdnbfJ0oFVdgck9n +LDvjaR0PVoz3/i8hfmLgdCTyYswS5VZ9RJnILTLCG1pUZuIy2J4ye3bwr32WfsjxCJH swTw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1727245625; x=1727850425; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=iw2KumctdM5J/ktaLv2+p8EP/6GhqY7xK+IMjtOjZwk=; b=Hf18pVVP5QY/Z4PREE7sry6Ntd5ppqQI6JHVs7VAe+cvjm6shpe0+weZ2FrPcGhOIx YrCwOEf3Wwujwf7J0sJ3sse3fCVwegNFSrMRjZCwZitk1m1StGb2xTjiaPu7t5WdEH9q RMO/qSQ0dFhqDU1sjXLDSMKCBfphyQI4Gcz351alreyiCmrkqGiRK+qpkozHcNJbVBGx E/MxPy8C8UKKoXsyumbkQvlOO1mLSGd7leZlZusJLmQp/0xbk+IRGoRTdg0481RoWdcr FUr15HZteCjTQlIilnP35WvvKegy5yEw5SBa9D41yMIHDMq5Bx8P5m2P8R7VDE3xJ8e5 w7+w== X-Forwarded-Encrypted: i=1; AJvYcCVAoOGLlU0xQdP6glsRjVqheJEE1yzwX0hPR0eda/Zo/kt+OdtReUCEFptdwSpZV002TBu9D7uJKw==@kvack.org X-Gm-Message-State: AOJu0YwGszEx2vS7OEHPxy1g/zlREy6QeVoSJfTmyCdVJR6EuN2QRjNn EOTVTePQxlFSmkpWUVW98iEL4XQdDO8MXEDUXFLHgLd4ehPhinG+WQBYS920Nr208aHUrRZZ6wh ub9x227ZdZh3tnKzqe8LjxuzPsdw= X-Google-Smtp-Source: AGHT+IGvdSE72ZOSM1zF21YJkmzPxuDbV43RU1LlEgRSjp1vtfjmV6s9Rr9DU8IyTbWXn6xb1YS4oQINdfs8ya2Ados= X-Received: by 2002:a05:6102:5492:b0:49b:f8f4:6a77 with SMTP id ada2fe7eead31-4a14e957527mr4697244137.5.1727245624726; Tue, 24 Sep 2024 23:27:04 -0700 (PDT) MIME-Version: 1.0 References: <20240923231142.4155415-1-nphamcs@gmail.com> <4d38c65d-760c-43a5-bb47-8e0235c13a51@linux.alibaba.com> <9a110f20-42ad-468b-96c6-683e162452a9@linux.alibaba.com> In-Reply-To: From: Barry Song <21cnbao@gmail.com> Date: Wed, 25 Sep 2024 14:26:52 +0800 Message-ID: Subject: Re: [RFC PATCH 0/2] remove SWAP_MAP_SHMEM To: Yosry Ahmed Cc: Nhat Pham , Baolin Wang , akpm@linux-foundation.org, hannes@cmpxchg.org, hughd@google.com, shakeel.butt@linux.dev, ryan.roberts@arm.com, ying.huang@intel.com, chrisl@kernel.org, david@redhat.com, kasong@tencent.com, willy@infradead.org, viro@zeniv.linux.org.uk, chengming.zhou@linux.dev, linux-mm@kvack.org, kernel-team@meta.com, linux-kernel@vger.kernel.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspamd-Queue-Id: C160D8000D X-Stat-Signature: borp3abadynjcnip1txfg5eqrdyi8bpm X-Rspamd-Server: rspam09 X-Rspam-User: X-HE-Tag: 1727245625-343480 X-HE-Meta: U2FsdGVkX1/EfQAEXXi13+kXqphcFSRfNOKg+k/X0SdCrVQ+vwgZL6Dw3KU2aUhKRdtvgycEHpdQUfrJ4e2UeQJxHm5FAcPQJLbR5s4MaWCrFXYk9VHk/tG6xmZs70SH+QcWN5L6xZ6ofGkXhksQy6W8aAKT1Y2wGKO0UNi4OF+N/8X4qA99hNhVvPFN8fxYfcsEXAP/TBs7/Ji8xu3yfiI+F7cVXt2tXfKSbbkQp6/VnPeZgMxmjIZiHEapiU/kWWVc4gqA1ED3rgqwhzZUwgIvyZZWtZZapjVyvUz6Lt4czP190nucpa9IJcdk8FKXkzDfuYOfzksNsZbzKiNkrOZW7zZMZ3T74p72NcQNGxXmViqPoKNo3IvibH2iy1CD2+q+kbEjVuCGSHIrL7yRx7CQzSwj7rfj5jb8iaRdLZl7ptvSv5ThxyYYGBvoZ6wZ1n64q69JtggIeow5MEbklcqrz4+BFVpl+Qgys32sOhKQ0Cfzmbn7MdkTlOypQUwzMrPzJDxarYe2Rxg0Vg9mHbL87NjEv3WDdoC8BVuxosH1TvOU8sCNEwVMApoEpVzdOjGqY9jYgfziNXFWJKwVOkBTJVe+alTmZInyDVO0K0L9mqbkZZyoiWJ0TvtnW25hKUCy6JXbqCJun1xu7WAoqEguBmM/MMHXat9tE0uYbBjA0+v3S/gfDNPfAU6OpWhnpstTiMKmr7Uz7C5R0/8GpTKjewzWCJOrHPcz91vYDnT2CaIs9ZeirF37LzDYec5saEr8m/YDXABwvUaitHE+V3sYiMB7CcHDQCLhDR85105Jrx80MjVEBQI+8+dLHdJ4JOpSRaKg6eJfR/zba60835cb/6Zuqq+JkNsHpPfzUmm3M+3vbFPHTCzkWQUOORCJaUfSbSEsr7FjyfMciAxPUk8sO0F7hleGOfXrdTHUa/skQbsF5iY/AkzP/fyAsbtBDhgUXTWonh3pSAJsHO2 Q3RD/Wnj HLwdKh2vJUDhSWzjtAUXA5XZv+7CcWuvqqhNEDZV2frcPNfXDgwY9XpK5nnKw6If1zhvGMC+uWaTJdGpxPT168IM2ywLV2kt72fPaqTPHV/KOhDwQSvTdBKLB8c/MGbkcfwEB1bcEJuBne0CkqALbFSuytWgV3/kaEyb9zFD+v90ZzND/+sCF1gC8MzJC+fDgRWJFBboBeL39602QzWWcB7iGf9X/PoMwv3mRsvKtahyM57w7uoVAeu+BSJ9dhWpA6u4SipvW9cL76UAnoirXJ3JXjwFlfx04AIp1nrx2UFT4lfnJrMrKqIVC4RiYKEnsA78kvy0/AEFOCqYx711mhzmCdQ1SiYQUdUUqtDCbtl+9fyag5nICuOhgSJfL8g+9LE1dq1uJiKFxF8M= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Wed, Sep 25, 2024 at 2:12=E2=80=AFAM Yosry Ahmed = wrote: > > [..] > > > > > > Apparently __swap_duplicate() does not currently handle increas= ing the > > > > > > swap count for multiple swap entries by 1 (i.e. usage =3D=3D 1)= because it > > > > > > does not handle rolling back count increases when > > > > > > swap_count_continued() fails. > > > > > > > > > > > > I guess this voids my Reviewed-by until we sort this out. Techn= ically > > > > > > swap_count_continued() won't ever be called for shmem because w= e only > > > > > > ever increment the count by 1, but there is no way to know this= in > > > > > > __swap_duplicate() without SWAP_HAS_SHMEM. > > > > > > > > Ah this is my bad. I compiled with CONFIG_THP_SWAP, but forgot to > > > > remove the swapfile check (that's another can of worms, but I need > > > > data before submitting the patch to remove it...) > > > > > > > > One thing we can do is instead of warning here, we can handle it in > > > > the for loop check, where we have access to count - that's the poin= t > > > > of having that for-loop check anyway? :) > > > > > > > > There's a couple of ways to go about it: > > > > > > > > 1. VM_WARN_ON(usage =3D=3D 1 && nr > 1 && count !=3D 0 ); > > > > > > Hmm that should work, although it's a bit complicated tbh. > > > > > > > (or more accurately, (count & ~COUNT_CONTINUED) >=3D SWAP_MAP_MAX)) > > > > > > I think this will make the warning very hard to hit if there's a > > > misuse of __swap_duplicate(). It will only be hit when an entry needs > > > count continuation, which I am not sure is very common. If there's a > > > bug, the warning will potentially catch it too late, if ever. > > > > > > The side effect here is failing to decrement the swap count of some > > > swap entries which will lead to them never being freed, essentially > > > leaking swap capacity slowly over time. I am not sure if there are > > > more detrimental effects. > > > > > > > > > > > 2. Alternatively, instead of warning here, we can simply return > > > > -ENOMEM. Then, at shmem callsite, have a VM_WARN_ON/VM_BUG_ON(), si= nce > > > > this MUST succeed. > > > > > > We still fail to rollback incremented counts though when we return > > > -ENOMEM, right? Maybe I didn't get what you mean. > > > > My understanding now is that there are two for loops. One for loop > > that checks the entry's states, and one for loop that does the actual > > incrementing work (or state modification). > > > > We can check in the first for loop, if it is safe to proceed: > > > > if (!count && !has_cache) { > > err =3D -ENOENT; > > } else if (usage =3D=3D SWAP_HAS_CACHE) { > > if (has_cache) > > err =3D -EEXIST; > > } else if ((count & ~COUNT_CONTINUED) > SWAP_MAP_MAX) { > > err =3D -EINVAL; > > } else if (usage =3D=3D 1 && nr > 1 && (count & ~COUNT_CONTINUED) >=3D > > SWAP_MAP_MAX)) { > > /* the batched variants currently do not support rollback */ > > err =3D -ENOMEM; > > } > > Hmm yeah I think something like this should work and is arguably > better than just warning, although this needs cleaning up: > - We already know usage !=3D SWAP_HAS_CACHE, so no need to check if usage= =3D=3D 1. > - We already know (count & ~COUNT_CONTINUED) is larger than > SWAP_MAP_MAX, so we should check if it's equal to SWAP_MAP_MAX. > - We should probably just calculate count & ~COUNT_CONTINUED above the > if conditions at this point. > > I would also like to hear what Barry thinks since he added this (and I > just realized he is not CC'd). I am perfectly fine with the approach, in the first loop, if we find all en= tries don't need CONTINUED, we can run the 2nd loop even for usage=3D=3D1 and nr > 1. this is almost always true for a real product where anon folios are unlikely to be fork-shared by so many processes. but we need fall back to iterating nr times if this really happens: int swap_duplicate_nr(swp_entry_t entry, int nr) { .... if (nr > 1 and ENOMEM) { for(nr entries) { __swap_duplicate(entry, 1, 1); entry =3D next_entry; } } seems a bit ugly? maybe we can keep the swap_duplicate(swp_entry_t entry) there? then avoid __swap_duplicate(entry, 1, 1);? Thanks Barry