From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 19 Jun 2020 13:12:16 -0700
From: Matthew Wilcox
To: Chaitanya Kulkarni
Cc: "linux-fsdevel@vger.kernel.org", "linux-mm@kvack.org",
	"agruenba@redhat.com", "linux-kernel@vger.kernel.org"
Subject: Re: [RFC] Bypass filesystems for reading cached pages
Message-ID: <20200619201216.GA8681@bombadil.infradead.org>
References: <20200619155036.GZ8681@bombadil.infradead.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:

On Fri, Jun 19, 2020 at 07:06:19PM +0000, Chaitanya Kulkarni wrote:
> On 6/19/20 8:50 AM, Matthew Wilcox wrote:
> > This patch lifts the IOCB_CACHED idea expressed by Andreas to the VFS.
> > The advantage of this patch is that we can avoid taking any filesystem
> > lock, as long as the pages being accessed are in the cache (and we don't
> > need to readahead any pages into the cache).  We also avoid an indirect
> > function call in these cases.
>
> I did testing with the NVMeOF target file backend with buffered I/O
> enabled, with your patch, setting IOCB_CACHED for each I/O, OR'd ('|')
> with IOCB_NOWAIT, and calling call_read_iter_cached() [1].
>
> (The name was changed from call_read_iter() -> call_read_iter_cached() [2].)
>
> For the file system I've used XFS, and the device was null_blk with
> memory backing, so the entire file was cached in DRAM.

Thanks for testing!  Can you elaborate a little more on what the test
does?  Are there many threads or tasks?  What is the I/O path -- XFS on
an NVMeOF device, talking over loopback to localhost with null_blk as
the server?  The null_blk device will have all the data in its page
cache, but each XFS file will have an empty page cache initially.  It
will then be populated by the test, so does the I/O pattern revisit
previously accessed data at all?

> Following are the performance numbers :-
>
> IOPS/Bandwidth :-
>
> default-page-cache:     read: IOPS=1389k, BW=5424MiB/s (5688MB/s)
> default-page-cache:     read: IOPS=1381k, BW=5395MiB/s (5657MB/s)
> default-page-cache:     read: IOPS=1391k, BW=5432MiB/s (5696MB/s)
> iocb-cached-page-cache: read: IOPS=1403k, BW=5481MiB/s (5747MB/s)
> iocb-cached-page-cache: read: IOPS=1393k, BW=5439MiB/s (5704MB/s)
> iocb-cached-page-cache: read: IOPS=1399k, BW=5465MiB/s (5731MB/s)

That doesn't look bad at all ... about a 0.8% increase in IOPS.
> Submission lat :-
>
> default-page-cache:     slat (usec): min=2, max=1076, avg= 3.71,
> default-page-cache:     slat (usec): min=2, max=489,  avg= 3.72,
> default-page-cache:     slat (usec): min=2, max=1078, avg= 3.70,
> iocb-cached-page-cache: slat (usec): min=2, max=1731, avg= 3.70,
> iocb-cached-page-cache: slat (usec): min=2, max=2115, avg= 3.69,
> iocb-cached-page-cache: slat (usec): min=2, max=3055, avg= 3.70,

Average latency unchanged, max latency up a little ... makes sense,
since we'll do a little more work in the worst case.

> @@ -264,7 +267,8 @@ static void nvmet_file_execute_rw(struct nvmet_req *req)
>
>  	if (req->ns->buffered_io) {
>  		if (likely(!req->f.mpool_alloc) &&
> -		    nvmet_file_execute_io(req, IOCB_NOWAIT))
> +		    nvmet_file_execute_io(req,
> +					  IOCB_NOWAIT | IOCB_CACHED))
>  			return;
>  		nvmet_file_submit_buffered_io(req);

You'll need a fallback path here, right?  IOCB_CACHED can get part-way
through doing a request, and then need to be finished off after taking
the mutex.