From: Gao Xiang
Date: Wed, 1 Mar 2023 12:49:10 +0800
Subject: Re: [LSF/MM/BPF TOPIC] Cloud storage optimizations
To: Matthew Wilcox, Theodore Ts'o
Cc: lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
    linux-mm@kvack.org, linux-block@vger.kernel.org

Hi Matthew!

On 2023/3/1 12:35, Matthew Wilcox wrote:
> On Tue, Feb 28, 2023 at 10:52:15PM -0500, Theodore Ts'o wrote:
>> For example, most cloud storage devices are doing read-ahead to try to
>> anticipate read requests from the VM.  This can interfere with the
>> read-ahead being done by the guest kernel.  So being able to tell
>> cloud storage device whether a particular read request is stemming
>> from a read-ahead or not.  At the moment, as Matthew Wilcox has
>> pointed out, we currently use the read-ahead code path for synchronous
>> buffered reads.  So plumbing this information so it can passed through
>> multiple levels of the mm, fs, and block layers will probably be
>> needed.
>
> This shouldn't be _too_ painful.
> For example, the NVMe driver already does the right thing:
>
>         if (req->cmd_flags & (REQ_FAILFAST_DEV | REQ_RAHEAD))
>                 control |= NVME_RW_LR;
>
>         if (req->cmd_flags & REQ_RAHEAD)
>                 dsmgmt |= NVME_RW_DSM_FREQ_PREFETCH;
>
> (LR is Limited Retry; FREQ_PREFETCH is "Speculative read.  The command
> is part of a prefetch operation")
>
> The only problem is that the readahead code doesn't tell the filesystem
> whether the request is sync or async.  This should be a simple matter
> of adding a new 'bool async' to the readahead_control and then setting
> REQ_RAHEAD based on that, rather than on whether the request came in
> through readahead() or read_folio() (eg see mpage_readahead()).

Great!  In addition to that, and somewhat off topic: if we get a
"bool async" now, I think it will immediately have some users (such as
EROFS).  We'd like to do post-processing (such as decompression)
immediately in the same context for sync readahead (triggered by
missing pages), and leave it to another kworker for async readahead.
I think it's almost the same for decryption and verification.

So "bool async" would be quite useful on my side if it could be passed
to the fs side.  I'd like to raise my hand in favor of having it.

Thanks,
Gao Xiang

>
> Another thing to fix is that SCSI doesn't do anything with the REQ_RAHEAD
> flag, so I presume T10 has some work to do (maybe they could borrow the
> Access Frequency field from NVMe, since that was what the drive vendors
> told us they wanted; maybe they changed their minds since).
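
To make the "bool async" idea above a bit more concrete, here is a
minimal userspace sketch, not kernel code.  Every name prefixed with
sketch_/SKETCH_ is hypothetical and only stands in for the real
readahead_control and REQ_RAHEAD; the flag value is made up.  It shows
the two decisions that could hang off such a field: whether to tag the
I/O as readahead, and whether a filesystem like EROFS would
post-process inline or hand off to a worker.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the block layer's REQ_RAHEAD flag.
 * The value here is arbitrary and does not match the kernel's. */
#define SKETCH_REQ_RAHEAD (1u << 0)

/* Toy model of readahead_control carrying the proposed field
 * (assumption: the field is a plain bool named 'async'). */
struct sketch_readahead_control {
	bool async;	/* true only for speculative (async) readahead */
};

/* Where a filesystem would run decompression, decryption or
 * verification once the read completes. */
enum sketch_postproc {
	SKETCH_POSTPROC_INLINE,		/* same context as the reader */
	SKETCH_POSTPROC_DEFERRED,	/* handed to another kworker */
};

/* Derive the request flags from 'async' instead of from which entry
 * point (readahead() vs. read_folio()) the request came through. */
static unsigned int
sketch_bio_flags(const struct sketch_readahead_control *rac)
{
	assert(rac != NULL);
	return rac->async ? SKETCH_REQ_RAHEAD : 0;
}

/* Sync readahead has a reader waiting on missing pages, so finish the
 * post-processing right away; async readahead can be deferred. */
static enum sketch_postproc
sketch_postproc_context(const struct sketch_readahead_control *rac)
{
	assert(rac != NULL);
	return rac->async ? SKETCH_POSTPROC_DEFERRED : SKETCH_POSTPROC_INLINE;
}
```

With this model a sync request (async == false) gets no readahead flag
and is post-processed inline, while an async one is tagged with
SKETCH_REQ_RAHEAD and deferred.  How the real field would be set and
plumbed through mm/fs/block is exactly what is still to be discussed.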