From: Jens Axboe
To: Ming Lei , Pavel Begunkov
Cc: io-uring@vger.kernel.org, Conrad Meyer , linux-block@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [RFC 5/5] block: implement io_uring discard cmd
Date: Thu, 15 Aug 2024 19:24:16 -0600
Message-ID: <4d016a30-d258-4d0e-b3bc-18bf0bd48e32@kernel.dk>
References: <6ecd7ab3386f63f1656dc766c1b5b038ff5353c2.1723601134.git.asml.silence@gmail.com>

On 8/15/24 5:44 PM, Ming Lei wrote:
> On Thu, Aug 15, 2024 at 06:11:13PM +0100, Pavel Begunkov wrote:
>> On 8/15/24 15:33, Jens Axboe wrote:
>>> On 8/14/24 7:42 PM, Ming Lei wrote:
>>>> On Wed, Aug 14, 2024 at 6:46 PM Pavel Begunkov wrote:
>>>>>
>>>>> Add ->uring_cmd
>>>>> callback for block device files and use it to implement
>>>>> asynchronous discard. Normally, it first tries to execute the command
>>>>> from non-blocking context, which we limit to a single bio because
>>>>> otherwise one of sub-bios may need to wait for other bios, and we don't
>>>>> want to deal with partial IO. If non-blocking attempt fails, we'll retry
>>>>> it in a blocking context.
>>>>>
>>>>> Suggested-by: Conrad Meyer
>>>>> Signed-off-by: Pavel Begunkov
>>>>> ---
>>>>>  block/blk.h             |  1 +
>>>>>  block/fops.c            |  2 +
>>>>>  block/ioctl.c           | 94 +++++++++++++++++++++++++++++++++++++++++
>>>>>  include/uapi/linux/fs.h |  2 +
>>>>>  4 files changed, 99 insertions(+)
>>>>>
>>>>> diff --git a/block/blk.h b/block/blk.h
>>>>> index e180863f918b..5178c5ba6852 100644
>>>>> --- a/block/blk.h
>>>>> +++ b/block/blk.h
>>>>> @@ -571,6 +571,7 @@ blk_mode_t file_to_blk_mode(struct file *file);
>>>>>  int truncate_bdev_range(struct block_device *bdev, blk_mode_t mode,
>>>>>                  loff_t lstart, loff_t lend);
>>>>>  long blkdev_ioctl(struct file *file, unsigned cmd, unsigned long arg);
>>>>> +int blkdev_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags);
>>>>>  long compat_blkdev_ioctl(struct file *file, unsigned cmd, unsigned long arg);
>>>>>
>>>>>  extern const struct address_space_operations def_blk_aops;
>>>>> diff --git a/block/fops.c b/block/fops.c
>>>>> index 9825c1713a49..8154b10b5abf 100644
>>>>> --- a/block/fops.c
>>>>> +++ b/block/fops.c
>>>>> @@ -17,6 +17,7 @@
>>>>>  #include
>>>>>  #include
>>>>>  #include
>>>>> +#include
>>>>>  #include "blk.h"
>>>>>
>>>>>  static inline struct inode *bdev_file_inode(struct file *file)
>>>>> @@ -873,6 +874,7 @@ const struct file_operations def_blk_fops = {
>>>>>          .splice_read    = filemap_splice_read,
>>>>>          .splice_write   = iter_file_splice_write,
>>>>>          .fallocate      = blkdev_fallocate,
>>>>> +        .uring_cmd      = blkdev_uring_cmd,
>>>>
>>>> Just be curious, we have IORING_OP_FALLOCATE already for sending
>>>> discard to block device, why is .uring_cmd
>>>> added for this purpose?
>>
>> Which is a good question, I haven't thought about it, but I tend to
>> agree with Jens. Because vfs_fallocate is created synchronous
>> IORING_OP_FALLOCATE is slow for anything but pretty large requests.
>> Probably can be patched up, which would involve changing the
>> fops->fallocate protot, but I'm not sure async there makes sense
>> outside of bdev (?), and cmd approach is simpler, can be made
>> somewhat more efficient (1 less layer in the way), and it's not
>> really something completely new since we have it in ioctl.
>
> Yeah, we have ioctl(DISCARD), which acquires filemap_invalidate_lock,
> same with blkdev_fallocate().
>
> But this patch drops this exclusive lock, so it becomes async friendly,
> but may cause stale page cache. However, if the lock is required, it can't
> be efficient anymore and io-wq may be inevitable, :-)

If you want to grab the lock, you can still opportunistically grab it.
For (by far) the common case, you'll get it, and you can still do it
inline. Really not that unusual.

-- 
Jens Axboe
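
The opportunistic locking suggested above can be illustrated with a minimal
userspace sketch. It is not code from the patch: invalidate_lock,
issue_discard() and discard_range() are hypothetical names, and a pthread
mutex stands in for the kernel's filemap invalidate lock. The non-blocking
path takes the lock only if it is free; otherwise it returns -EAGAIN so the
caller can retry from a context that is allowed to sleep.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the mapping's invalidate lock. */
pthread_mutex_t invalidate_lock = PTHREAD_MUTEX_INITIALIZER;

/* Counts how many discards were issued, for illustration only. */
int discards_issued;

/* Hypothetical stand-in for building and submitting the discard bio. */
void issue_discard(void)
{
	discards_issued++;
}

/*
 * Issue a discard under the lock.
 *
 * nonblock == true:  take the lock only if it is uncontended, otherwise
 *                    return -EAGAIN without sleeping.
 * nonblock == false: wait for the lock (the blocking retry path).
 */
int discard_range(bool nonblock)
{
	int rc;

	if (nonblock) {
		/* Common case: the lock is free and we proceed inline. */
		if (pthread_mutex_trylock(&invalidate_lock) != 0)
			return -EAGAIN;
	} else {
		pthread_mutex_lock(&invalidate_lock);
	}

	issue_discard();

	rc = pthread_mutex_unlock(&invalidate_lock);
	assert(rc == 0);
	return 0;
}
```

A caller would first try discard_range(true) and, only on -EAGAIN, repeat the
call as discard_range(false) from a context that may block, which mirrors the
"try inline, fall back if contended" flow described in the thread.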