Date: Thu, 21 Oct 2021 10:03:04 +0200
From: Jan Kara
To: Zhengyuan Liu
Cc: Jan Kara, viro@zeniv.linux.org.uk, Andrew Morton, tytso@mit.edu,
    linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
    linux-ext4@vger.kernel.org, 刘云, Zhengyuan Liu
Subject: Re: Problem with direct IO
Message-ID: <20211021080304.GB5784@quack2.suse.cz>
References: <20211020173729.GF16460@quack2.suse.cz>

On Thu 21-10-21 10:21:55, Zhengyuan Liu wrote:
> On Thu, Oct 21, 2021 at 1:37 AM Jan Kara wrote:
> > On Wed 13-10-21 09:46:46, Zhengyuan Liu wrote:
> > > we are encountering the following MySQL crash while importing tables:
> > >
> > > 2021-09-26T11:22:17.825250Z 0 [ERROR] [MY-013622] [InnoDB] [FATAL]
> > > fsync() returned EIO, aborting.
> > > 2021-09-26T11:22:17.825315Z 0 [ERROR] [MY-013183] [InnoDB]
> > > Assertion failure: ut0ut.cc:555 thread 281472996733168
> > >
> > > At the same time, we found the following message in dmesg:
> > >
> > > [ 4328.838972] Page cache invalidation failure on direct I/O.
> > > Possible data corruption due to collision with buffered I/O!
> > > [ 4328.850234] File: /data/mysql/data/sysbench/sbtest53.ibd PID:
> > > 625 Comm: kworker/42:1
> > >
> > > At first we suspected that MySQL was interleaving direct IO and
> > > buffered IO on the file, but after some checking we found that it
> > > only does direct IO, using aio. The problem comes from the direct IO
> > > interface (__generic_file_write_iter) itself.
> > >
> > > ssize_t __generic_file_write_iter()
> > > {
> > > ...
> > >         if (iocb->ki_flags & IOCB_DIRECT) {
> > >                 loff_t pos, endbyte;
> > >
> > >                 written = generic_file_direct_write(iocb, from);
> > >                 /*
> > >                  * If the write stopped short of completing, fall back to
> > >                  * buffered writes. Some filesystems do this for writes to
> > >                  * holes, for example. For DAX files, a buffered write will
> > >                  * not succeed (even if it did, DAX does not handle dirty
> > >                  * page-cache pages correctly).
> > >                  */
> > >                 if (written < 0 || !iov_iter_count(from) || IS_DAX(inode))
> > >                         goto out;
> > >
> > >                 status = generic_perform_write(file, from, pos = iocb->ki_pos);
> > > ...
> > > }
> > >
> > > From the code snippet above we can see that direct IO can fall back
> > > to buffered IO under certain conditions, so even though MySQL only
> > > did direct IO, it could interleave with buffered IO when the fallback
> > > occurred. I currently have no idea why the FS (ext3) failed the
> > > direct IO, but it is strange that __generic_file_write_iter makes
> > > direct IO fall back to buffered IO; it seems to break the semantics
> > > of direct IO.
> > >
> > > The reproduction environment is:
> > > Platform: Kunpeng 920 (arm64)
> > > Kernel: V5.15-rc
> > > PAGESIZE: 64K
> > > Mysql: V8.0
> > > Innodb_page_size: default(16K)
> >
> > Thanks for the report. I agree this should not happen. How hard is this
> > to reproduce? Any idea whether the fallback to buffered IO happens
> > because iomap_dio_rw() returns -ENOTBLK or because it returns a short
> > write?
>
> It is easy to reproduce in my test environment. As I said in my previous
> reply to Andrew, this problem is related to the kernel page size.

Ok, can you share a reproducer?

> > Can you post output of "dumpe2fs -h " for the filesystem where the
> > problem happens? Thanks!
>
> Sure, the output is:
>
> # dumpe2fs -h /dev/sda3
> dumpe2fs 1.45.3 (14-Jul-2019)
> Filesystem volume name:
> Last mounted on: /data
> Filesystem UUID: 09a51146-b325-48bb-be63-c9df539a90a1
> Filesystem magic number: 0xEF53
> Filesystem revision #: 1 (dynamic)
> Filesystem features: has_journal ext_attr resize_inode dir_index
> filetype needs_recovery sparse_super large_file

Thanks for the data. OK, a filesystem without extents. Does your test by
any chance try to do direct IO to a hole in a file? Because that is not
(and never was) supported without extents. Also the fact that you don't see
the problem with ext4 (which means extents support) would be pointing in
that direction.

								Honza
-- 
Jan Kara
SUSE Labs, CR
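
One way to approach the question about holes is to check whether the data file is sparse before the direct IO writes are issued. The following is only a rough sketch under a few assumptions: it relies on the Linux lseek() SEEK_HOLE extension, file_has_hole() is a made-up helper name (not a kernel or libc function), and a filesystem that does not track holes reports the whole file as data, so a result of 0 is not conclusive there.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* glibc exposes SEEK_HOLE under this macro */
#endif
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SEEK_HOLE
#define SEEK_HOLE 4		/* assumption: value used by Linux */
#endif

/*
 * file_has_hole() is a hypothetical helper for illustration.
 *
 * Returns 1 if the file open on fd has a hole before EOF, 0 if no hole
 * is reported, and -1 on error (errno is set by lseek).
 * Note that the calls below move the file offset of fd.
 */
static int file_has_hole(int fd)
{
	off_t size, hole;

	assert(fd >= 0);

	size = lseek(fd, 0, SEEK_END);
	if (size < 0)
		return -1;
	if (size == 0)
		return 0;	/* SEEK_HOLE at or past EOF fails with ENXIO */

	/* Every file has an implicit hole at EOF, so hole == size means none. */
	hole = lseek(fd, 0, SEEK_HOLE);
	if (hole < 0)
		return -1;

	return hole < size;
}
```

A caller would open() the file in question (for example the .ibd file named in the dmesg output) and pass the descriptor to file_has_hole(). A return value of 1 would mean that at least part of the file is unallocated, which is the case the direct IO path falls back on according to the comment in the quoted snippet.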