Date: Thu, 27 Feb 2025 00:00:27 +0000
From: Yosry Ahmed
To: Nhat Pham
Cc: Johannes Weiner, akpm@linux-foundation.org, chengming.zhou@linux.dev,
    linux-mm@kvack.org, kernel-team@meta.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] zswap: do not crash the kernel on decompression failure
References: <20250225213200.729056-1-nphamcs@gmail.com>
    <20250226045727.GB1775487@cmpxchg.org>

On Wed, Feb 26, 2025 at 03:20:13PM -0800, Nhat Pham wrote:
> On Wed, Feb 26, 2025 at 7:33 AM Yosry Ahmed wrote:
> >
> > On Tue, Feb 25, 2025 at 11:57:27PM -0500, Johannes Weiner wrote:
> > > On Wed, Feb 26, 2025 at 03:12:35AM +0000, Yosry Ahmed wrote:
> > > > On Tue, Feb 25, 2025 at 01:32:00PM -0800, Nhat Pham wrote:
> > > > > Currently, we crash the kernel when a decompression failure occurs in
> > > > > zswap (either because of memory corruption, or a bug in the compression
> > > > > algorithm). This is overkill. We should only SIGBUS the unfortunate
> > > > > process asking for the zswap entry on zswap load, and skip the corrupted
> > > > > entry in zswap writeback.
> > > >
> > > > Some relevant observations/questions, but not really actionable for this
> > > > patch, perhaps some future work, or more likely some incoherent
> > > > illogical thoughts :
> > > >
> > > > (1) It seems like not making the folio uptodate will cause shmem faults
> > > > to mark the swap entry as hwpoisoned, but I don't see similar handling
> > > > for do_swap_page(). So it seems like even if we SIGBUS the process,
> > > > other processes mapping the same page could follow in the same
> > > > footsteps.
> > >
> > > It's analogous to what __end_swap_bio_read() does for block backends,
> > > so it's hitchhiking on the standard swap protocol for read failures.
> >
> > Right, that's also how I got the idea when I did the same for large
> > folios handling.
>
> And your handling of the large folio (along with the comment in the
> other thread) was how I got the idea for this patch :)
>
> > >
> > > The page sticks around if there are other users. It can get reclaimed,
> > > but since it's not marked dirty, it won't get overwritten. Another
> > > access will either find it in the swapcache and die on !uptodate; if
> > > it was reclaimed, it will attempt another decompression. If all
> > > references have been killed, zswap_invalidate() will finally drop it.
> > >
> > > Swapoff actually poisons the page table as well (unuse_pte).
> >
> > Right. My question was basically why don't we also poison the page table
> > in do_swap_page() in this case. It's like that we never swapoff.
>
> That would require a rmap walk right? To also poison the other PTEs
> that point to the faulty (z)swap entry?
>
> Or am I misunderstanding your point :)

Oh I meant why not just mark the entry where the fault happened as
poisoned at least. Finding other PTEs that point to the swap entry is a
different story. I don't think we can even use the rmap here.
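
For reference, the read-failure protocol Johannes describes above can be
sketched as a toy user-space model. This is only an illustration, not kernel
code: every type and function name below (toy_entry, toy_swap_fault, and so
on) is hypothetical, and the model assumes a corrupted entry fails
decompression on every attempt.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy user-space model; none of these names are real kernel APIs. */
enum fault_result { FAULT_OK, FAULT_SIGBUS };

struct toy_entry {
	bool corrupted;          /* assumption: decompression always fails */
	bool present;            /* entry still stored in the compressed pool */
	bool in_swapcache;       /* a folio for this entry is cached */
	bool uptodate;           /* the cached folio holds valid data */
	int refs;                /* swap references still pointing at the entry */
	int decompress_attempts; /* how often decompression was tried */
};

/* Stand-in for the decompression step; fails for corrupted entries. */
bool toy_decompress(struct toy_entry *e)
{
	e->decompress_attempts++;
	return !e->corrupted;
}

/*
 * Fault on the entry: a swapcache hit reuses the cached folio without
 * decompressing again; a miss decompresses. A folio that is not uptodate
 * kills only the faulting task.
 */
enum fault_result toy_swap_fault(struct toy_entry *e)
{
	if (!e->in_swapcache) {
		e->in_swapcache = true;
		e->uptodate = toy_decompress(e);
	}
	return e->uptodate ? FAULT_OK : FAULT_SIGBUS;
}

/* Reclaim drops the cached folio; the compressed entry itself stays. */
void toy_reclaim(struct toy_entry *e)
{
	e->in_swapcache = false;
	e->uptodate = false;
}

/* Drop one reference; the last one models the entry being invalidated. */
void toy_kill_ref(struct toy_entry *e)
{
	assert(e->refs > 0);
	if (--e->refs == 0) {
		e->present = false;
		e->in_swapcache = false;
	}
}
```

In this model a second fault on a corrupted entry gets SIGBUS from the
swapcache without another decompression, a fault after toy_reclaim() retries
the decompression, and the entry only goes away once the last reference is
dropped. Nothing in it poisons the faulting PTE, which is the gap I am asking
about.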