From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 589CEC19F32 for ; Thu, 27 Feb 2025 18:02:05 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id D05906B0083; Thu, 27 Feb 2025 13:02:04 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id CB5FF6B0085; Thu, 27 Feb 2025 13:02:04 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B7D5D6B0088; Thu, 27 Feb 2025 13:02:04 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 99FC46B0083 for ; Thu, 27 Feb 2025 13:02:04 -0500 (EST) Received: from smtpin10.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 7FB6F1CB775 for ; Thu, 27 Feb 2025 18:02:03 +0000 (UTC) X-FDA: 83166493326.10.BF07A0C Received: from out-185.mta0.migadu.com (out-185.mta0.migadu.com [91.218.175.185]) by imf05.hostedemail.com (Postfix) with ESMTP id 9AFC6100043 for ; Thu, 27 Feb 2025 18:01:40 +0000 (UTC) Authentication-Results: imf05.hostedemail.com; dkim=pass header.d=linux.dev header.s=key1 header.b=MeFjGhMz; dmarc=pass (policy=none) header.from=linux.dev; spf=pass (imf05.hostedemail.com: domain of yosry.ahmed@linux.dev designates 91.218.175.185 as permitted sender) smtp.mailfrom=yosry.ahmed@linux.dev ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1740679301; a=rsa-sha256; cv=none; b=8TZQBgSpD9/igq6oWvN1p79VXb7nTGyMyzScAUm3uVLgkd7U+DJig/hL0v3YY06kpffNu8 6VHMjD8/+5MPpR8/Hk0MJd3HjhCjWCQ/GhiEG5faTtIKEmMvg+YQ8K1BpoAw5gTh7rliDb wc12VA8+P6XQ565kHWvZvctjINttIOM= ARC-Authentication-Results: i=1; imf05.hostedemail.com; dkim=pass header.d=linux.dev header.s=key1 header.b=MeFjGhMz; dmarc=pass (policy=none) header.from=linux.dev; spf=pass (imf05.hostedemail.com: domain of yosry.ahmed@linux.dev designates 91.218.175.185 as permitted sender) smtp.mailfrom=yosry.ahmed@linux.dev ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1740679301; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=pwBsgEVKDn0GJpd6GPiEBOIf24UR9yocMEJl9yXiBk0=; b=aXbkpEWxj/cLIqA8BOygMaKctu1HfjJY/N5g/qscRsMbkkCTKnGQUqDpTHvmjn0ABncR4b XN48CNHg7jYOaJFA8j5+77BfOgC1SJcnTmkIf8+ae1KhWfTuNHV+m9BXuQ0LjZEidFLifZ Lg7xevAlY089V45LTo/ookXi1cTmT0I= Date: Thu, 27 Feb 2025 18:01:32 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1740679297; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=pwBsgEVKDn0GJpd6GPiEBOIf24UR9yocMEJl9yXiBk0=; b=MeFjGhMzpxmQ03GvSJsJ5lNFLjVz4HpQbrClhJHaCKGp8ddDwBcERXWF3HNRafAQ5MBYMb bnIegsQ+iXQ+CFuamWJZbR506NAyFv0F3d8ySCAXOdnFLQ9+14yDnyh4VQQQEGSEzGia3v bwnpjFf9pV/QPBwifkhlPwvrRZ5tuuk= X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Yosry Ahmed To: Johannes Weiner Cc: Nhat Pham , akpm@linux-foundation.org, chengming.zhou@linux.dev, linux-mm@kvack.org, kernel-team@meta.com, linux-kernel@vger.kernel.org Subject: Re: [PATCH v2] zswap: do not crash the kernel on decompression failure Message-ID: References: <20250227001445.1099203-1-nphamcs@gmail.com> <20250227043141.GB110982@cmpxchg.org> <20250227061616.GD110982@cmpxchg.org> <20250227160528.GF110982@cmpxchg.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20250227160528.GF110982@cmpxchg.org> X-Migadu-Flow: FLOW_OUT X-Rspam-User: X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 9AFC6100043 X-Stat-Signature: zw93ecjgsz4o7pxj56wd6jssou34mnf3 X-HE-Tag: 1740679300-952254 X-HE-Meta: U2FsdGVkX19BsOXck+EkTNHo+UvFZoEZZ/oxt/HbdoNVUM3whEWZMXlbIRKEH8KXIGCv7xcEz9vKKLc4pgmeBxx67Qgqw9vAaIHtOZy7+y2V8JVUEo9YwwAorsbMhIWcpPzP8gM1sLvNxesh2a9Ro0CM7+MS1BeQPt/WlXWouXOcQWrO585DVXWzq5zsUSAXLhwKNx35PjeJUPb7ijC285X/0831+lbpZZZrbN0YyX4ofsY3sWUjaOykcHniXLXr2K1h7CkFSrdmhWJGfygT3k87QB+k0wWjB+A/awMGNs7t+AFnErZlOg6sBTTS0ryIa37drGRX9+FGyckGM3XMnlXo4DAMXrO/PrUpfYGXsfC7u5egc3NzUrB6Qg18HGcmU77yU+UviwJ3O6QXgthiYiIRzZ6VeKjE/iKJa3ERzpWqiVZwrRO35Qncd4R4eKn3vAvShEhc1YwljYRDPlQsBI475FfuHNJl+snFvb59aSmiLyo9avlDJt4Hpe8gxWGW9oGDhPfHton06we7X/0Kwyapi1eW3MCI3/wZSt5xcdIx9upxl0jCk8rvbF4wvqU4On9o05QeChy7KIObu7QIeDb71UvUF0i4y/HxsY2dHxzDF0cr0tdtSZRkXn5xlI+DsENFFi0+2pUKwSIp0Ct1OICOz+n67w1tEYxhmwprFHy8hbO0BRSGwSoAqrYHsBrinhibpZCaAYYhe5t/G7FoEtM3sYwSNzTtwUcC3Hs4laethp7Y9gG1VpFgBcNfZ7nbgr48HGM3WwnxcKJdtHU18rwQK8QrVyFLdl+4PKB9c1q1RtX45qFiVCWtSWOwZZ5O53KIbst7qCQQOhaAqVO11d9cT0E+QlnKxms92gvAN0UqrI2unNKH6ZDv9d1qeyStro6yTGoNHISZSKj2y21OpK6FYs96DmvcDXbqOJEd3ndG8riRVvEmkgY3uDSTE9cS37eHT+5bSqeHZTjrus+ GqsM1gSc t1+bVEUPbh7ra60pN40mJ5YENChdSIhgWJOl0OjKxiraRDbGrSUmmTI4QbMh8+ZkIpXkZHXK3MpkFmnk95XbFBRFewofGyckzx/6JlqT4ZFxpa3O/wtdcJ83h/2f741gnjd+wEUFji8GsS/BKNUabHw1Co1K1YcYGVrMT6Pq6QDMLZ+UtjMP6KlwpbFZXQRE/oHORZtslcgsGB5zKPQeNIC74cQ== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Thu, Feb 27, 2025 at 11:05:28AM -0500, Johannes Weiner wrote: > On Thu, Feb 27, 2025 at 07:29:45AM +0000, Yosry Ahmed wrote: > > Maybe we can do something like this: > > > > /* Returns true if the folio was in the zeromap or zswap */ > > bool swap_read_folio_in_memory(struct folio *folio) > > { > > int ret; > > > > ret = swap_read_folio_zeromap(folio); > > if (ret == -ENOENT) > > ret = zswap_load(folio); > > > > if (ret == 0) { > > folio_mark_uptodate(folio); > > folio_unlock(folio); > > return true; > > } else if (ret != -ENOENT) { > > folio_unlock(folio); > > return true; > > } else { > > return false; > > } > > } > > Eh, I think we're getting colder again. > > This looks repetitive, zswap_load() is kind of awkward in that error > leg, and combining the two into one function is a bit of a stretch. > > There is also something to be said about folio_mark_uptodate() and > folio_unlock() ususally being done by the backend implementation - in > what the page cache would call the "filler" method - to signal when > it's done reading, and what the outcome was. > > E.g. for fs it's always in the specific ->read implementation: > > static int simple_read_folio(struct file *file, struct folio *folio) > { > folio_zero_range(folio, 0, folio_size(folio)); > flush_dcache_folio(folio); > folio_mark_uptodate(folio); > folio_unlock(folio); > return 0; > } > > and not in the generic manifold: > > $ grep -c folio_mark_uptodate mm/filemap.c > 0 > > I'd actually rather push those down into zeromap and zswap as well to > follow that pattern more closely: Hmm yeah for the sake of consistency I think we can do that. My main concern was the comments clarifying the 'true' return value without marking uptodate is a fail in a lot of places in zswap and zeromap code. However, I think returning an error code as you suggested makes it more obvious and reduces the need for comments. So yes I think your proposed approach is better (with a few comments). > > diff --git a/mm/page_io.c b/mm/page_io.c > index 9b983de351f9..1fb5ce1884bd 100644 > --- a/mm/page_io.c > +++ b/mm/page_io.c > @@ -538,6 +538,7 @@ static bool swap_read_folio_zeromap(struct folio *folio) > > folio_zero_range(folio, 0, folio_size(folio)); > folio_mark_uptodate(folio); > + folio_unlock(folio); > return true; So I think we should double down and follow the same pattern for swap_read_folio_zeromap() too. Return an error code too to clarify the skip uptodate case. > } > > @@ -635,13 +636,11 @@ void swap_read_folio(struct folio *folio, struct swap_iocb **plug) > } > delayacct_swapin_start(); > > - if (swap_read_folio_zeromap(folio)) { > - folio_unlock(folio); > + if (swap_read_folio_zeromap(folio)) > goto finish; and this ends up being closer to the zswap pattern: if (swap_read_folio_zeromap(folio) != -ENOENT) goto finish; > - } else if (zswap_load(folio)) { > - folio_unlock(folio); > + > + if (zswap_load(folio) != -ENOENT) > goto finish; > - } > > /* We have to read from slower devices. Increase zswap protection. */ > zswap_folio_swapin(folio); > diff --git a/mm/zswap.c b/mm/zswap.c > index 6dbf31bd2218..76b2a964b0cd 100644 > --- a/mm/zswap.c > +++ b/mm/zswap.c > @@ -1620,7 +1620,7 @@ bool zswap_store(struct folio *folio) > return ret; > } > And here we should explain the different return codes and when we return with the folio locked/unlocked or marked uptodate. > -bool zswap_load(struct folio *folio) > +int zswap_load(struct folio *folio) > { > swp_entry_t swp = folio->swap; > pgoff_t offset = swp_offset(swp); > @@ -1631,7 +1631,7 @@ bool zswap_load(struct folio *folio) > VM_WARN_ON_ONCE(!folio_test_locked(folio)); > > if (zswap_never_enabled()) > - return false; > + return -ENOENT; > > /* > * Large folios should not be swapped in while zswap is being used, as > @@ -1641,8 +1641,25 @@ bool zswap_load(struct folio *folio) > * Return true without marking the folio uptodate so that an IO error is > * emitted (e.g. do_swap_page() will sigbus). > */ > - if (WARN_ON_ONCE(folio_test_large(folio))) > - return true; > + if (WARN_ON_ONCE(folio_test_large(folio))) { > + folio_unlock(folio); > + return -EINVAL; > + } > + > + entry = xa_load(tree, offset); > + if (!entry) > + return -ENOENT; > + > + if (!zswap_decompress(entry, folio)) { > + folio_unlock(folio); > + return -EIO; > + } > + > + folio_mark_uptodate(folio); > + > + count_vm_event(ZSWPIN); > + if (entry->objcg) > + count_objcg_events(entry->objcg, ZSWPIN, 1); > > /* > * When reading into the swapcache, invalidate our entry. The > @@ -1656,27 +1673,14 @@ bool zswap_load(struct folio *folio) > * files, which reads into a private page and may free it if > * the fault fails. We remain the primary owner of the entry.) > */ > - if (swapcache) > - entry = xa_erase(tree, offset); > - else > - entry = xa_load(tree, offset); > - > - if (!entry) > - return false; > - > - zswap_decompress(entry, folio); > - > - count_vm_event(ZSWPIN); > - if (entry->objcg) > - count_objcg_events(entry->objcg, ZSWPIN, 1); > - > if (swapcache) { > - zswap_entry_free(entry); > folio_mark_dirty(folio); > + xa_erase(tree, offset); > + zswap_entry_free(entry); > } > > - folio_mark_uptodate(folio); > - return true; > + folio_unlock(folio); > + return 0; > } > > void zswap_invalidate(swp_entry_t swp)