Message-ID: <8c4e5b16-fb7f-ebd1-60d5-0bb9718fd8dd@linux.alibaba.com>
Date: Wed, 1 Mar 2023 12:59:17 +0800
Subject: Re: [LSF/MM/BPF TOPIC] Cloud storage optimizations
From: Gao Xiang
To: Matthew Wilcox
Cc: Theodore Ts'o, lsf-pc@lists.linux-foundation.org,
    linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
    linux-block@vger.kernel.org
References: <7c111304-b56b-167f-bced-9e06e44241cd@linux.alibaba.com>

On 2023/3/1 12:40, Matthew Wilcox wrote:
> On Wed, Mar 01, 2023 at 12:18:30PM +0800, Gao Xiang wrote:
>>> For example, most cloud storage devices are doing read-ahead to try to
>>> anticipate read requests from the VM.  This can interfere with the
>>> read-ahead being done by the guest kernel.  So being able to tell
>>> cloud storage device whether a particular read request is stemming
>>> from a read-ahead or not.  At the moment, as Matthew Wilcox has
>>> pointed out, we currently use the read-ahead code path for synchronous
>>> buffered reads.
>>> So plumbing this information so it can passed through
>>> multiple levels of the mm, fs, and block layers will probably be
>>> needed.
>>
>> It seems that is also useful as well, yet if my understanding is correct,
>> it's somewhat unclear for me if we could do more and have a better form
>> compared with the current REQ_RAHEAD (currently REQ_RAHEAD use cases and
>> impacts are quite limited.)
>
> I'm pretty sure the Linux readahead algorithms could do with some serious
> tuning (as opposed to the hacks the Android device vendors are doing).
> Outside my current level of enthusiasm / knowledge, alas.  And it's
> hard because while we no longer care about performance on floppies,
> we do care about performance from CompactFlash to 8GB/s NVMe drives.
> I had one person recently complain that 200Gbps ethernet was too slow
> for their storage, so there's a faster usecase to care about.

Yes, we might have a chance to revisit the current readahead algorithm
for modern storage devices.  I understand how the current readahead
works, but I don't have enough time to analyse the workloads and
investigate further; such heuristics also always have pros and cons.
As a public cloud vendor, it becomes vital for us to change this, since
some users care specifically about how the corner cases compare with
our competitors.

Thanks,
Gao Xiang