Date: Thu, 12 Aug 2021 16:07:32 +0100
From: Matthew Wilcox
To: Huang Ying, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Data corruption problem with swapfiles and THP

There is an assumption in the swap writepage path that a THP is
physically contiguous on swap:

	bio->bi_iter.bi_sector = swap_page_sector(page);
	bio->bi_opf = REQ_OP_WRITE | REQ_SWAP | wbc_to_write_flags(wbc);
	bio->bi_end_io = end_write_func;
	bio_add_page(bio, page, thp_size(page), 0);

As far as I can tell, this is not necessarily true.
If a swapfile is not contiguous on storage, we can have an extent which
is 1MB long followed by an extent somewhere else on storage that's also
1MB long. When we try to write a 2MB page to swap, the bio covers 2MB
starting at the first extent's sector, so we overwrite whatever's on
the block device after that first 1MB extent.

(Came across this by code examination while looking at getting rid of
the bio path entirely; no attempt has been made to produce this problem;
something else may prevent it from actually happening.)
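
To make the scenario concrete, here is a minimal userspace sketch of the arithmetic, not kernel code. Everything in it is hypothetical and chosen for illustration: the `extent_sketch` structure, the helper names, the sector numbers, and the assumption of 4KB pages and 512-byte sectors (so a 2MB THP is 512 pages). It models a swapfile made of two 1MB extents and checks whether a run of pages really occupies consecutive sectors, which is what a single bio starting at the first page's sector would assume.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumptions for this sketch: 4KB pages, 512-byte sectors. */
#define SKETCH_PAGE_SHIFT   12
#define SKETCH_SECTOR_SHIFT 9
#define SECTORS_PER_PAGE    (1u << (SKETCH_PAGE_SHIFT - SKETCH_SECTOR_SHIFT))

/* Hypothetical: one physically contiguous run of a swapfile. */
struct extent_sketch {
	uint64_t start_page;   /* first page offset within the swapfile */
	uint64_t nr_pages;     /* length of the run, in pages */
	uint64_t start_sector; /* where the run begins on the block device */
};

/* Illustrative layout: two 1MB extents (256 pages each), far apart. */
static const struct extent_sketch two_extents[] = {
	{ .start_page = 0,   .nr_pages = 256, .start_sector = 2048 },
	{ .start_page = 256, .nr_pages = 256, .start_sector = 1000000 },
};

/* Translate a swapfile page offset to a device sector via the extents. */
static bool page_to_sector(const struct extent_sketch *ext, size_t n,
			   uint64_t page_off, uint64_t *sector)
{
	assert(sector != NULL);
	for (size_t i = 0; i < n; i++) {
		if (page_off >= ext[i].start_page &&
		    page_off - ext[i].start_page < ext[i].nr_pages) {
			*sector = ext[i].start_sector +
				  (page_off - ext[i].start_page) * SECTORS_PER_PAGE;
			return true;
		}
	}
	return false; /* offset not backed by any extent */
}

/*
 * Return true only if nr_pages pages starting at page_off sit on
 * consecutive sectors, i.e. a single write of that length beginning at
 * the first page's sector would land only on the pages it is meant for.
 */
static bool range_is_contiguous(const struct extent_sketch *ext, size_t n,
				uint64_t page_off, uint64_t nr_pages)
{
	uint64_t first, cur;

	if (nr_pages == 0 || !page_to_sector(ext, n, page_off, &first))
		return false;
	for (uint64_t i = 1; i < nr_pages; i++) {
		if (!page_to_sector(ext, n, page_off + i, &cur))
			return false;
		if (cur != first + i * SECTORS_PER_PAGE)
			return false;
	}
	return true;
}
```

With this layout, `range_is_contiguous(two_extents, 2, 0, 256)` holds for the first 1MB, but the same call with 512 pages does not: page 256 lives at sector 1000000, whereas a write that simply continues from sector 2048 would place it at sector 4096, on top of whatever happens to be stored there.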