From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9DB5FC83F11 for ; Sun, 27 Aug 2023 19:42:00 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id E9A968E000E; Sun, 27 Aug 2023 15:41:59 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id E4A438E0001; Sun, 27 Aug 2023 15:41:59 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D130A8E000E; Sun, 27 Aug 2023 15:41:59 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id C43F78E0001 for ; Sun, 27 Aug 2023 15:41:59 -0400 (EDT) Received: from smtpin05.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id 5FFAC140124 for ; Sun, 27 Aug 2023 19:41:59 +0000 (UTC) X-FDA: 81170905158.05.8D1FD6C Received: from zeniv.linux.org.uk (zeniv.linux.org.uk [62.89.141.173]) by imf12.hostedemail.com (Postfix) with ESMTP id 534F94000F for ; Sun, 27 Aug 2023 19:41:57 +0000 (UTC) Authentication-Results: imf12.hostedemail.com; dkim=pass header.d=linux.org.uk header.s=zeniv-20220401 header.b=MC2inegb; dmarc=pass (policy=none) header.from=zeniv.linux.org.uk; spf=none (imf12.hostedemail.com: domain of viro@ftp.linux.org.uk has no SPF policy when checking 62.89.141.173) smtp.mailfrom=viro@ftp.linux.org.uk ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1693165317; a=rsa-sha256; cv=none; b=p/cG9wg4rtx+Di24IOLwTS856e9J+Xg3ktc7obsqFKS/OlmY5PTe1NwX5RiGgh91E475BO Tfxb3UdAeZzD7aeYSfbfq9+nrslI14ySGfw4fKNJ4F2OIE3EPI49oQT6fdytMfG46had+g BxK4KmiYNkqtGNOxgzi8JU1dywNeCxQ= ARC-Authentication-Results: i=1; imf12.hostedemail.com; dkim=pass header.d=linux.org.uk header.s=zeniv-20220401 header.b=MC2inegb; dmarc=pass (policy=none) header.from=zeniv.linux.org.uk; spf=none (imf12.hostedemail.com: domain of viro@ftp.linux.org.uk has no SPF policy when checking 62.89.141.173) smtp.mailfrom=viro@ftp.linux.org.uk ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1693165317; h=from:from:sender:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=684ZmibV0iRA7W1nKgqEgsL3aMIsGcfp6Zpl3WdJ4DU=; b=h+39JsDeR9mmYd9xIaWuEzAMgR3Xgx518myp+yq2AkBEaznAEofDyGNq05M3Fltfjw5MEp YdMPahi18AT0IcNtXf0r2sD0E9RgkDpGPmUxvEDW2tTnbFjbvlf2fKHUru56ZwuHhdWcO/ iUN6pQyWIZrI2GGnqm5hI+xu0wH0OSY= DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=linux.org.uk; s=zeniv-20220401; h=Sender:In-Reply-To:Content-Type: MIME-Version:References:Message-ID:Subject:Cc:To:From:Date:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=684ZmibV0iRA7W1nKgqEgsL3aMIsGcfp6Zpl3WdJ4DU=; b=MC2inegbdk323giCAEH+EfdYar 4kXwfSwyy2upf/dK+53+T+qrKyewF5s7dkck4Fy8HiN43ivhIJelmFUnVBY8gTnpZi1xWKTWuXIKw PMxisq0e3KNF97DWVe6B6NWvtgyUOe2N5H03vNuXGmN7U3J3s/iHPYzfZnHTYEtijbR38gB7y6yM+ f7ACCYsTkBZjsnjijsZxvBLbNGKP/mTU09fHGjxl78/It44EuuASOQ2dnSIavl4Uy9jNvEkCB6Mv6 wSB2x3czu7lrhAFvslc5eM2xpF0ak84dfAdNaJZPWXabaeFkiWOblNYIYyhnd8Q9EJERXc2PyYNn0 cSxlywzw==; Received: from viro by zeniv.linux.org.uk with local (Exim 4.96 #2 (Red Hat Linux)) id 1qaLdS-001NqG-1T; Sun, 27 Aug 2023 19:41:22 +0000 Date: Sun, 27 Aug 2023 20:41:22 +0100 From: Al Viro To: Christoph Hellwig Cc: Matthew Wilcox , Jens Axboe , Xiubo Li , Ilya Dryomov , Christian Brauner , Theodore Ts'o , Jaegeuk Kim , Chao Yu , Miklos Szeredi , Andreas Gruenbacher , "Darrick J. Wong" , Trond Myklebust , Anna Schumaker , Damien Le Moal , Andrew Morton , linux-block@vger.kernel.org, ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-ext4@vger.kernel.org, linux-f2fs-devel@lists.sourceforge.net, cluster-devel@redhat.com, linux-xfs@vger.kernel.org, linux-nfs@vger.kernel.org, linux-mm@kvack.org, Hannes Reinecke Subject: Re: [PATCH 03/12] filemap: update ki_pos in generic_perform_write Message-ID: <20230827194122.GA325446@ZenIV> References: <20230601145904.1385409-1-hch@lst.de> <20230601145904.1385409-4-hch@lst.de> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20230601145904.1385409-4-hch@lst.de> X-Rspam-User: X-Rspamd-Server: rspam06 X-Rspamd-Queue-Id: 534F94000F X-Stat-Signature: oxdkyog4bnzzer1dqcy4cezxfcfhnpp7 X-HE-Tag: 1693165317-314721 X-HE-Meta: U2FsdGVkX1/Pdqzve5lKhn5wCjPZfGDFy1TdzEwxha4UiPrAFhseYgjw6tUihgakY3lHPJZbTKcRBeaEXoUmOgOh1CY4cTQaRZQ0I7iBCRDb3PiZ+yNtTyhOaTcnLqm2OXzAkwM0oxj2OnnNhUeYsMEvBRE3siEAmf6CH/t6UIlta5b7DCzedwRnW5Rn5PSoUxu4n9Mp9laa/dmnrkwQSDJj2cP81mUY/KsjTD+Litjf2sD5AQyGFklv7FDpUF+GX6pJPMNVhjJmLx56veEAboBTnB8xbN9es/eWxuca8uwOzZWfztNrNmRY1K3g4BIakwUrBLCnmKAkE9mAa3htPiVNVlW9RjFrBnGeuh3xHV3IY2gXCA763XeJI+5KCJ8UZtuyTc9G3yoa0oRQB55vzbv6lXdvYTvYPn762F5oFTvxFR+pyf9SYaRiSWYwTOsOYFKoTrWpRQphLhui8Pz3WzVv55Ccgrn1pns8barI4XGDp6dva6MSYBD/jcNToVp5JrSlYaTQL9KxjVnhZTp9rwXCRZ3uxaGNdlp47ug3/wXT7k/IZQRtk4646pdnI+xy2BOjT192mjiD0ZDZZDXr4TPJU5IuYJkaXmxNuGUywQId9Vh20Bemz+9p8QLwWCRLzFSDigBaqijXQuq+Z5yl5leJsa4TDSKt5gJoTfuLnhRh6p2GEbyfJTCd9oRgNh3fulk39sZzpOizMNtZ5h/cZAPjsTUS7vM655Zb552EkvMCZxAlpBE+AwT4ehqS/fJMRDxyyhccXK0oRTeb5IzQD5sYQrl+YzSpwYlyXI0NJeBJQh5o21oxhD7qD4Ijd+Nf8dPE5kDNSTkSz49YtpHKRFqz0jmtDNl2ZE5HOAqn9fQ6AS6nrYoDRu56XIowLYG7SV5JIV8Pm7Hrrr+esWVnT6Z5b+3iYzoHEWLVAzGKNAEAr1e0rrFYowyLvuVJS1ud2zErj35QKR3akIceCEY GY3oEzws 1+oZTE93ADIVs5mE78LCcL6m/wV9WpYcHEiTV9OnFUEKwthHAM+Afut1rO93o+fMuuo/Kno1uYSTWX1MlriksDp+V2ceUicoukdAMyZIHI7lBVlMSjNqQQJusXrfP2y9qLLTXyPCQaVoEbIfccXBcjkFntT1uFQzFNnKoknbt1aVfOZNphzJ893B5xWPPtUBA6q5Veqin2/VCrtZOu/0bx9/mdoQ65v235Y2+0IaMbgfVXM3u91JcyHyFTeyfU61POmCrtBVVfXCt/vQtSlY+zknDwa/WSg1MH/RKC8HWTRaIjaS5xRtadIuXzsDZ5762+TjBr4RzHG+afsrvxY0C76wCVms8oAcKKPx4pRFFCmUlkFJKjwQSPcS+YwmsfHlv8s1cESBtheDomiT68pjlh2G7DEC0NHXdAkO1wp7X7qwfyrg3PPTURSJQDvpHWyreQ1lvYryoI1Cd+l4Dcl2mSCRXxxlu+0SpJr2w/Ct/OKBwFgU= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Thu, Jun 01, 2023 at 04:58:55PM +0200, Christoph Hellwig wrote: > All callers of generic_perform_write need to updated ki_pos, move it into > common code. > @@ -4034,7 +4037,6 @@ ssize_t __generic_file_write_iter(struct kiocb *iocb, struct iov_iter *from) > endbyte = pos + status - 1; > err = filemap_write_and_wait_range(mapping, pos, endbyte); > if (err == 0) { > - iocb->ki_pos = endbyte + 1; > written += status; > invalidate_mapping_pages(mapping, > pos >> PAGE_SHIFT, > @@ -4047,8 +4049,6 @@ ssize_t __generic_file_write_iter(struct kiocb *iocb, struct iov_iter *from) > } > } else { > written = generic_perform_write(iocb, from); > - if (likely(written > 0)) > - iocb->ki_pos += written; > } > out: > return written ? written : err; [another late reply, sorry] That part is somewhat fishy - there's a case where you return a positive value and advance ->ki_pos by more than that amount. I really wonder if all callers of ->write_iter() are OK with that. Consider e.g. this: ssize_t ksys_write(unsigned int fd, const char __user *buf, size_t count) { struct fd f = fdget_pos(fd); ssize_t ret = -EBADF; if (f.file) { loff_t pos, *ppos = file_ppos(f.file); if (ppos) { pos = *ppos; ppos = &pos; } ret = vfs_write(f.file, buf, count, ppos); if (ret >= 0 && ppos) f.file->f_pos = pos; fdput_pos(f); } return ret; } ssize_t vfs_write(struct file *file, const char __user *buf, size_t count, loff_t *pos) { ssize_t ret; if (!(file->f_mode & FMODE_WRITE)) return -EBADF; if (!(file->f_mode & FMODE_CAN_WRITE)) return -EINVAL; if (unlikely(!access_ok(buf, count))) return -EFAULT; ret = rw_verify_area(WRITE, file, pos, count); if (ret) return ret; if (count > MAX_RW_COUNT) count = MAX_RW_COUNT; file_start_write(file); if (file->f_op->write) ret = file->f_op->write(file, buf, count, pos); else if (file->f_op->write_iter) ret = new_sync_write(file, buf, count, pos); else ret = -EINVAL; if (ret > 0) { fsnotify_modify(file); add_wchar(current, ret); } inc_syscw(current); file_end_write(file); return ret; } static ssize_t new_sync_write(struct file *filp, const char __user *buf, size_t len, loff_t *ppos) { struct kiocb kiocb; struct iov_iter iter; ssize_t ret; init_sync_kiocb(&kiocb, filp); kiocb.ki_pos = (ppos ? *ppos : 0); iov_iter_ubuf(&iter, ITER_SOURCE, (void __user *)buf, len); ret = call_write_iter(filp, &kiocb, &iter); BUG_ON(ret == -EIOCBQUEUED); if (ret > 0 && ppos) *ppos = kiocb.ki_pos; return ret; } Suppose ->write_iter() ends up doing returning a positive value smaller than the increment of kiocb.ki_pos. What do we get? ret is positive, so kiocb.ki_pos gets copied into *ppos, which is ksys_write's pos and there we copy it into file->f_pos. Is it really OK to have write() return 4096 and advance the file position by 16K? AFAICS, userland wouldn't get any indication of something odd going on - just a short write to a regular file, with followup write of remaining 12K getting quietly written in the range 16K..28K. I don't remember what POSIX says about that, but it would qualify as nasty surprise for any userland program - sure, one can check fsync() results before closing the sucker and see if everything looks fine, but the way it's usually discussed could easily lead to assumption that (synchronous) O_DIRECT writes would not be affected by anything of that sort.