Date: Fri, 15 Oct 2021 13:28:00 -0700
From: Andrew Morton
To: Yang Shi
Cc: naoya.horiguchi@nec.com, hughd@google.com, kirill.shutemov@linux.intel.com,
 willy@infradead.org, peterx@redhat.com, osalvador@suse.de, linux-mm@kvack.org,
 linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [RFC v4 PATCH 0/6] Solve silent data loss caused by poisoned page cache (shmem/tmpfs)
Message-Id: <20211015132800.357d891d0b3ad34adb9c7383@linux-foundation.org>
In-Reply-To: <20211014191615.6674-1-shy828301@gmail.com>
References: <20211014191615.6674-1-shy828301@gmail.com>

On Thu, 14 Oct 2021 12:16:09 -0700 Yang Shi wrote:

> When discussing the patch that splits page cache THP in order to offline the
> poisoned page, Noaya mentioned there is a bigger problem [1] that prevents this
> from working since the page cache page will be truncated if uncorrectable
> errors happen. By looking this deeper it turns out this approach (truncating
> poisoned page) may incur silent data loss for all non-readonly filesystems if
> the page is dirty. It may be worse for in-memory filesystem, e.g. shmem/tmpfs
> since the data blocks are actually gone.
>
> To solve this problem we could keep the poisoned dirty page in page cache then
> notify the users on any later access, e.g. page fault, read/write, etc. The
> clean page could be truncated as is since they can be reread from disk later on.
>
> The consequence is the filesystems may find poisoned page and manipulate it as
> healthy page since all the filesystems actually don't check if the page is
> poisoned or not in all the relevant paths except page fault. In general, we
> need make the filesystems be aware of poisoned page before we could keep the
> poisoned page in page cache in order to solve the data loss problem.

Is the "RFC" still accurate, or might it be an accidental leftover?

I grabbed this series as-is for some testing, but I do think it would
be better if it was delivered as two separate series - one series for
the -stable material and one series for the 5.16-rc1 material.