Message-ID: <565f2ceb-6e6b-8775-4446-5aefcd377e7d@linux.alibaba.com>
Date: Wed, 1 Mar 2023 13:19:19 +0800
Subject: Re: [LSF/MM/BPF TOPIC] Cloud storage optimizations
From: Gao Xiang
To: Matthew Wilcox
Cc: Theodore Ts'o, lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-block@vger.kernel.org
References: <49b6d3de-e5c7-73fc-fa43-5c068426619b@linux.alibaba.com>
In-Reply-To: <49b6d3de-e5c7-73fc-fa43-5c068426619b@linux.alibaba.com>

On 2023/3/1 13:09, Gao Xiang wrote:
>
>
> On 2023/3/1 13:01, Matthew Wilcox wrote:
>> On Wed, Mar 01, 2023 at 12:49:10PM +0800, Gao Xiang wrote:
>>>> The only problem is that the readahead code doesn't tell the filesystem
>>>> whether the request is sync or async.  This should be a simple matter
>>>> of adding a new 'bool async' to the readahead_control and then setting
>>>> REQ_RAHEAD based on that, rather than on whether the request came in
>>>> through readahead() or read_folio() (eg see mpage_readahead()).
>>>
>>> Great!  In addition to that, just (somewhat) off topic, if we have a
>>> "bool async" now, I think it will immediately have some users (such as
>>> EROFS), since we'd like to do post-processing (such as decompression)
>>> immediately in the same context with sync readahead (due to missing
>>> pages) and leave it to another kworker for async readahead (I think
>>> it's almost same for decryption and verification).
>>>
>>> So "bool async" is quite useful on my side if it could be possible
>>> passed to fs side.  I'd like to raise my hands to have it.
>>
>> That's a really interesting use-case; thanks for bringing it up.
>>
>> Ideally, we'd have the waiting task do the
>> decompression/decryption/verification for proper accounting of CPU.
>> Unfortunately, if the folio isn't uptodate, the task doesn't even hold
>> a reference to the folio while it waits, so there's no way to wake the
>> task and let it know that it has work to do.  At least not at the moment
>> ... let me think about that a bit (and if you see a way to do it, feel
>> free to propose it)
>
> Honestly, I'd like to take the folio lock until all post-processing is
> done and make it uptodate and unlock so that only we need is to pass
> locked-folios requests to kworkers for async way or sync handling in
> the original context.
>
> If we unlocked these folios in advance without uptodate, which means
> we have to lock it again (which could have more lock contention) and
> need to have a way to trace I/Oed but not post-processed stuff in
> addition to no I/Oed stuff.

I'm not sure which way is better for proper CPU accounting, but I think
an individual fs knows more than mm about how its post-processing is
handled, so I think we could just provide some accounting APIs to
filesystems for this.  For now, I think core MM just needs to expose
the "async" bool in rac.

Also, EROFS currently just does sync decompression for <= 4 pages in
z_erofs_readahead(), and I think it can be done better, see:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/fs/erofs/zdata.c?h=v6.2#n832

Thanks,
Gao Xiang

>
> Thanks,
> Gao Xiang
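
To make the difference concrete, here is a rough userspace sketch (not
kernel code) of how a filesystem might pick the post-processing context.
Everything in it is hypothetical: struct rac_model is only a stand-in for
struct readahead_control, the "async" member is the proposed field that
does not exist in v6.2, and the function and enum names are made up for
illustration.  The 4-page cutoff is the figure mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for struct readahead_control, reduced to the
 * two fields this sketch needs.  'async' is the PROPOSED member.
 */
struct rac_model {
	unsigned int nr_pages;	/* pages covered by this request */
	bool async;		/* proposed: true if no task waits on the pages */
};

/* Where decompression/decryption/verification would run (made-up names). */
enum postproc_ctx {
	POSTPROC_INLINE,	/* in the context of the task that issued the read */
	POSTPROC_KWORKER,	/* locked folios handed off to a kworker */
};

/* Assumed cutoff, taken from the "<= 4 pages" figure in this mail. */
#define SYNC_DECOMPRESS_MAX_PAGES 4

/* Size-only heuristic: request size is the only signal available. */
enum postproc_ctx pick_ctx_by_size(const struct rac_model *rac)
{
	assert(rac != NULL);
	return rac->nr_pages <= SYNC_DECOMPRESS_MAX_PAGES ?
		POSTPROC_INLINE : POSTPROC_KWORKER;
}

/* With the proposed flag: a waiting reader always gets inline handling. */
enum postproc_ctx pick_ctx_by_flag(const struct rac_model *rac)
{
	assert(rac != NULL);
	return rac->async ? POSTPROC_KWORKER : POSTPROC_INLINE;
}
```

With a size-only rule, a 32-page sync request (a task is blocked on
missing pages) would go to a kworker and a 2-page async request would be
handled inline; with the flag, both would go the other way, which is the
behaviour I'd like to have.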