From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7187BEDE9B4 for ; Tue, 10 Sep 2024 20:22:16 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 0853E8D00C1; Tue, 10 Sep 2024 16:22:16 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 034878D0056; Tue, 10 Sep 2024 16:22:15 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E3E248D00C1; Tue, 10 Sep 2024 16:22:15 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id C776A8D0056 for ; Tue, 10 Sep 2024 16:22:15 -0400 (EDT) Received: from smtpin13.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 4E6A4A889C for ; Tue, 10 Sep 2024 20:22:15 +0000 (UTC) X-FDA: 82549950630.13.789BCBB Received: from mail-wm1-f52.google.com (mail-wm1-f52.google.com [209.85.128.52]) by imf23.hostedemail.com (Postfix) with ESMTP id 5746B14000E for ; Tue, 10 Sep 2024 20:22:12 +0000 (UTC) Authentication-Results: imf23.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b="L/hOOqbM"; spf=pass (imf23.hostedemail.com: domain of asml.silence@gmail.com designates 209.85.128.52 as permitted sender) smtp.mailfrom=asml.silence@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1725999595; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=MDjydcKVUDScuDqfe8vsl4HZhSr4A4L5web6qrYiGpI=; b=ZFNcHMBMaI5vLOz7zcjSmdkuCLfgJTwHCK9zk5Vv0RczIsDNfthydDoWZM786QkMTEsNqB xf1mjzzO8RoiaC10sE46ATQQGl4MPs6kZe6PqcOr0QhF3cSEJHZh50pY4HgSMEpaAbKfKX AnC2CRpCehXbSC0gFGs6KCLSce/8JsQ= ARC-Authentication-Results: i=1; imf23.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b="L/hOOqbM"; spf=pass (imf23.hostedemail.com: domain of asml.silence@gmail.com designates 209.85.128.52 as permitted sender) smtp.mailfrom=asml.silence@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1725999595; a=rsa-sha256; cv=none; b=mr7I3LJ7saQAphj/45aFzARg+SSc9A+foJsr7YtSoDlEWt73mil3lLqWUnHk72LnDudTPc EO9OanKYqGX0KbW9IvYTT250/UjkiF65Ztia2nmnwfNUjkpH7WwGzfQCQR0o6Q9gFLtQND wzliuEqQm0xyhvnLK+XUzvLYn0lmTno= Received: by mail-wm1-f52.google.com with SMTP id 5b1f17b1804b1-42cc8782869so6847715e9.2 for ; Tue, 10 Sep 2024 13:22:12 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1725999731; x=1726604531; darn=kvack.org; h=content-transfer-encoding:in-reply-to:from:content-language :references:cc:to:subject:user-agent:mime-version:date:message-id :from:to:cc:subject:date:message-id:reply-to; bh=MDjydcKVUDScuDqfe8vsl4HZhSr4A4L5web6qrYiGpI=; b=L/hOOqbM+wtPpcBnR6204Y3Pt28WTFQqwwBZfB0e7DVWU6kTsTvmDC03r7OybUN8W4 CCLKezMbXK3vxiIieQgSgNtvgktmqYWUYu+c240E/wfyUiAVBXzP/GRg5lBJ8w2bmaJU 4PHVzIXPh/H7/+5g4nH4Rfp/jKEaiGkAxbys6bbvlO73p+SK1gFCu8nA5+/BEL5fu20N g1WSkGjgEuSKeWqkH8G3SlpegTIkH2TflcVdKVBkEhcNWk4FxGrXF2n6LE4cnmdPNClm VkCa7iKYQvDNUGtGOLbr1HLRx+kcoskX7LjxpluPm9Ptrcile1J7SGcFMMzL2JAVuxTz GBFg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1725999731; x=1726604531; h=content-transfer-encoding:in-reply-to:from:content-language :references:cc:to:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=MDjydcKVUDScuDqfe8vsl4HZhSr4A4L5web6qrYiGpI=; b=TrgRlFkKk5ZcRDdIkkj7DUTdPyGLrnPpto7dZ9xKloQ8T0yDRVM+0EcH4Xiy4VlbyJ 8dAabxkbwkkQySRaySzbXE1FCh0LJZTuLLg/DGGgZ8hyQLwcVD+XJqwJSLxuaz+29KqB Q2CE0S8qtUnmvpbVg1kcjPAq4Q+IPK9P1//HvTmkzyXTwES1SlJWFeADFVxAOGpGzXlI VOJMEmceDgnH8zLuxzr9CYtYvXRiADp4YtcMJp7fY5/rWhJXM5LQV573SbgJ4ZjOs3iN z3tkAkVZHzt2TD04CG4ojO/pAWpQlo5FP415uiAclUKk5zTion19VAsxFyEXgwBcY2vO BDXw== X-Forwarded-Encrypted: i=1; AJvYcCWhg4Lrf3cJcVa4/2mbYd/fV01GxLd/QzbxP2d37F4GO+sU4unZef1+iU37c2RzFls2vUtqgc2Q9g==@kvack.org X-Gm-Message-State: AOJu0YwnQc2kCH2eE+vspHFODL2t0AMpFHUL0ahROIMZE/iLWH7DwPMK RYYKI+ubFKvVZhhTEwI59sYb4AKilS6lqITQ7/YjIWtTNm73VrpHYk6RxapUcO0= X-Google-Smtp-Source: AGHT+IEXI7/Uw0lWdkbc1U6kE4SSL+rt3p7Kxas3S5dCTBcXcLo8UyFbt9cKKwqFtKZcpt9g6bDf7Q== X-Received: by 2002:a05:600c:3b0f:b0:428:23c8:1e54 with SMTP id 5b1f17b1804b1-42ccd32deb6mr6702565e9.18.1725999730311; Tue, 10 Sep 2024 13:22:10 -0700 (PDT) Received: from [192.168.42.24] ([185.69.144.178]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-378956653c6sm9757407f8f.32.2024.09.10.13.22.09 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 10 Sep 2024 13:22:09 -0700 (PDT) Message-ID: Date: Tue, 10 Sep 2024 21:22:37 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH v4 5/8] block: implement async discard as io_uring cmd To: Christoph Hellwig Cc: io-uring@vger.kernel.org, Jens Axboe , Conrad Meyer , linux-block@vger.kernel.org, linux-mm@kvack.org References: <7fc0a61ae29190a42e958eddfefd6d44cdf372ad.1725621577.git.asml.silence@gmail.com> <430ca5b3-6ee1-463b-9e4e-5d0b934578cc@gmail.com> Content-Language: en-US From: Pavel Begunkov In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Rspamd-Server: rspam06 X-Rspamd-Queue-Id: 5746B14000E X-Stat-Signature: smqak7fhnh5neu3njk9qnbbxb8rssz3d X-Rspam-User: X-HE-Tag: 1725999732-911849 X-HE-Meta: U2FsdGVkX18KE61Fn3nhFxre4BOC+vLW8Ow+XuwKzMA/lLYMPTguNpqK0WkjyuoZXYRyFhRqYbMbR3H47hX5z6MnsEwHA5m+5QGFSn4IgloW7x36O4iiNhnB4+bDCU/csaojPmyFyCaPtB7ooo78Z6JE9blq+zqBuRlrQqBrzYgpQ0K4pgn6fMM+RPrgbOQVfSYuXq3P1JGt2MrSrUZqW9o7uXwO5q/lHFA8F+BXGMyvPIpWrz7+jxkMOcL4wt/x41xWMJv1QreFPMYQkJQRLxWx3mrhyrHVxSWbxlGYD7ZHt0Fv2nq/X5ljKumGBBJQvuq9drBby7e4hWypRZcmljUXLHdF3uvZIviwZoUCF2j3d87PL1yJjn8GpuszenAOyv8ojkfHXI09V8INrrDDtE5dYARXiccC/CT5euRYns4/XHUkf2X9j8yb7NPOOtO9WaDtD4be4f/gmGnchwh50Qn8Ar3nYyD9y6DE0IKokVC955QpEoAvA88ugdCCJV3NnfTmyMFVYDk24qWVndAnWHy4p3YW7VMzXAWOpJvcpP0EbNM2WZyxCbjnXtO6sEkoo9CKhB6NOeyaAl9msbw6kGCSk6wg6WFFZ8kc52VFb1EbYUkyT9C9SQhaWdjJ/bQmojYcmDgne2u7GapYzxg045sSrbcb+P6TTQNTgMGNtUQgB85YyS9MJhK8UKJLNXys5PYY+txxD9EPeUP9LWW4WocTVEv+xjkvZSbBSTWkUN4Jppc5/VafWzVJO0T6fiXVyN/tEWdTh45uNdJg1uSW++5xIUaJM7gnDZbQIoF0Z1plaGhEhRxtIBZa0xAYoTZifYHCHJF5SGG0G0p2OjV1MuXTHR7+rhJBN4Ch4KNa+8No87NWfHWxJQzMS1x195MXET3uVGC0KQdZSU088Lvp79AN6Yz5TyJMdeScYqZPPMdzj3gBuddNCDbDmpAxewMeblEkHNwhbrA8lUrn3RT utfOdjPo 0MlRCF9GUKk8kX0Eld+H092nAwT7XsKXJk1VHEID+Y/9FAQaNTN07rDdPmvtX62VKQl4cwOs+3XWJle9/kyNmGpyBiuAGW19CEnoK7Vl8C3f5bjrpFwd3LZwjGAAdYC3UgjNljA5VL/W65lHDlfAImcgrR7K8LapCDwJyUcMt1ybvo8YbCJgkT4MiElhOZ+/kI65PuVMoaP2MlOdTxQhHjlncQQn8G29pB+rUPhh9wdrME68QoYrYX/KJMWq9HZQpp3dZHn/8bClu/Z1C9UOe8Hclz8NRC/0X5llAWOTyKxPyDLwvPQNqyHAb/Wm6oORAabshAJW4gaVLzTVt158DQA0bnYq1rdv1eSUBGlv8BzULS20rtmwGukCZ1IKgfk4QisJ21eXgpyvfXYEp1YDIVmQ9l0pisqQWsE/wPfKTfQ9ReaQ7qDRdafVEWlr4pDaF0IUZ1uFyICVWqTIJu89wYnRiuo5ovS+R68ct X-Bogosity: Ham, tests=bogofilter, spamicity=0.000134, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On 9/10/24 15:17, Christoph Hellwig wrote: > On Tue, Sep 10, 2024 at 11:58:23AM +0100, Pavel Begunkov wrote: >>>> + >>>> + err = filemap_invalidate_pages(bdev->bd_mapping, start, >>>> + start + len - 1, nowait); >>>> + if (err) >>>> + return err; >>>> + >>>> + while ((bio = blk_alloc_discard_bio(bdev, §or, &nr_sects, gfp))) { >>>> + if (nowait) >>>> + bio->bi_opf |= REQ_NOWAIT; >>>> + prev = bio_chain_and_submit(prev, bio); >>>> + } >>>> + if (!prev) >>>> + return -EAGAIN; >>> >>> If a user changes the max_discard value between the check above and >>> the loop here this is racy. >> >> If the driver randomly changes it, it's racy either way. What do >> you want to protect against? > > The discard limit shrinking and now this successfully returning while > not actually discarding the range. The fix is pretty simple in that If it's shrinking then bios initialised and submitted with that initial larger limit should fail, e.g. by the disk or driver, which would be caught by bio_cmd_bio_end_io(). If nobody fails bios, then nothing ever will help here, you can always first queue up bios and change the limit afterwards while they're still in flight. > the nowait case should simply break out of the loop after the first bio. while ((bio = blk_alloc_discard_bio(bdev, §or, &nr_sects, gfp))) { if (nowait) bio->bi_opf |= REQ_NOWAIT; prev = bio_chain_and_submit(prev, bio); if (nowait) break; } Like this? I need to add nr_sects==0 post loop checking either way, but I don't see how this break would be better any better than bio_put before the submit from v2. >>>> +sector_t bio_discard_limit(struct block_device *bdev, sector_t sector); >>> >>> And to be honest, I'd really prefer to not have bio_discard_limit >>> exposed. Certainly not outside a header private to block/. >> >> Which is the other reason why first versions were putting down >> a bio seeing that there is more to be done for nowait, which >> you didn't like. I can return back to it or narrow the scopre. > > The above should also take care of that. > >> >>> Also why start at 137? A comment >>> would generally be pretty useful as well. >> >> There is a comment, 2 lines above the new define. >> >> /* >> * A jump here: 130-136 are reserved for zoned block devices >> * (see uapi/linux/blkzoned.h) >> */ >> >> Is that your concern? > > But those are ioctls, this is not an ioctl and uses a different > number space. Take a look at e.g. nvme uring cmds which also > don't try to use the same number space as the ioctl. As far as I see nvme cmds are just dropped onto the 0x80- range. Not continuing ioctls, right, but nevertheless random and undocumented. And if we're arguing that IOC helps preventing people issuing ioctls to a wrong file type, we can easily extend it to "what if someone passes BLK* ioctl number to io_uring or vise versa? Not to mention that most of the IOC selling points have zero sense for io_uring like struct size and struct copy direction. >>> Also can we have a include/uapi/linux/blkdev.h for this instead of >>> bloating fs.h that gets included just about everywhere? >> I don't think it belongs to this series. > > How would adding a proper header instead of bloating fs.h not be > part of the series adding the first ever block layer uring_cmds? Because, apparently, no one have ever gave a damn about it. I'll add it for you, but with header probing instead of a simple ifdef I'd call it a usability downgrade. > Just in case I wasn't clear - this isn't asking you to move anything > existing as we can't do that without breaking existing applications. We can, by including blkdev.h into fs.h, but that's a different kind of a structure. > It is about adding the new command to the proper place. -- Pavel Begunkov