From: Andreas Gruenbacher
Date: Tue, 23 Jun 2020 09:41:13 +0200
Subject: Re: [RFC] Bypass filesystems for reading cached pages
To: Dave Chinner
Cc: Matthew Wilcox, linux-fsdevel, Linux-MM, LKML
References: <20200619155036.GZ8681@bombadil.infradead.org> <20200622003215.GC2040@dread.disaster.area> <20200623005218.GF2040@dread.disaster.area>
In-Reply-To: <20200623005218.GF2040@dread.disaster.area>

On Tue, Jun 23, 2020 at 2:52 AM Dave Chinner wrote:
> On Mon, Jun 22, 2020 at 04:35:05PM +0200, Andreas Gruenbacher wrote:
> > On Mon, Jun 22, 2020 at 2:32 AM Dave Chinner wrote:
> > > On Fri, Jun 19, 2020 at 08:50:36AM -0700, Matthew Wilcox wrote:
> > > >
> > > > This patch lifts the IOCB_CACHED idea expressed by Andreas to the VFS.
> > > > The advantage of this patch is that we can avoid taking any filesystem
> > > > lock, as long as the pages being accessed are in the cache (and we don't
> > > > need to readahead any pages into the cache). We also avoid an indirect
> > > > function call in these cases.
> > >
> > > What does this micro-optimisation actually gain us except for more
> > > complexity in the IO path?
> > >
> > > i.e.
> > > if a filesystem lock has such massive overhead that it slows
> > > down the cached readahead path in production workloads, then that's
> > > something the filesystem needs to address, not unconditionally
> > > bypass the filesystem before the IO gets anywhere near it.
> >
> > I'm fine with not moving that functionality into the VFS. The problem
> > I have in gfs2 is that taking glocks is really expensive. Part of that
> > overhead is accidental, but we definitely won't be able to fix it in
> > the short term. So something like the IOCB_CACHED flag that prevents
> > generic_file_read_iter from issuing readahead I/O would save the day
> > for us. Does that idea stand a chance?
>
> I have no problem with a "NOREADAHEAD" flag being passed to
> generic_file_read_iter(). It's not a "already cached" flag though,
> it's a "don't start any IO" directive, just like the NOWAIT flag is
> a "don't block on locks or IO in progress" directive and not an
> "already cached" flag. Readahead is something we should be doing,
> unless a filesystem has a very good reason not to, such as the gfs2
> locking case here...

The requests coming in can have the IOCB_NOWAIT flag set or cleared.
The idea was to have an additional flag that implies IOCB_NOWAIT so
that you can do:

    iocb->ki_flags |= IOCB_NOIO;
    generic_file_read_iter()
    if ("failed because of IOCB_NOIO") {
        if ("failed because of IOCB_NOWAIT")
            return -EAGAIN;
        iocb->ki_flags &= ~IOCB_NOIO;
        "locking"
        generic_file_read_iter()
        "unlocking"
    }

without having to save iocb->ki_flags. The alternative would be:

    int flags = iocb->ki_flags;
    iocb->ki_flags |= IOCB_NOIO | IOCB_NOWAIT;
    ret = generic_file_read_iter()
    if ("failed because of IOCB_NOIO or IOCB_NOWAIT") {
        if ("failed because of IOCB_NOWAIT" && (flags & IOCB_NOWAIT))
            return -EAGAIN;
        iocb->ki_flags &= ~IOCB_NOIO;
        "locking"
        generic_file_read_iter()
        "unlocking"
    }

Andreas
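
For illustration only, the first variant above can be modelled in
userspace roughly as follows. This is a sketch under stated assumptions,
not kernel code: the SK_* flag values, struct sk_iocb, sk_generic_read()
and sk_fs_read() are all hypothetical stand-ins, and the "cached"
argument simply simulates whether the pages are already in the page
cache. The sketch also clears the no-I/O bit on every path so the
caller's flags come back unchanged, which is a choice made here rather
than something the pseudocode spells out.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flag bits for this sketch; not the kernel's values. */
#define SK_NOWAIT (1u << 0) /* caller asked not to block */
#define SK_NOIO   (1u << 1) /* do not start any I/O, e.g. readahead */

struct sk_iocb {
    unsigned int flags;
};

/* Counts how often the (simulated) expensive filesystem lock is taken. */
int sk_lock_count;

/*
 * Stand-in for generic_file_read_iter(): succeeds if the data is
 * "cached", otherwise fails with -EAGAIN when either flag forbids
 * starting or waiting for I/O.
 */
int sk_generic_read(const struct sk_iocb *iocb, bool cached)
{
    if (cached)
        return 1; /* pretend one page was copied from the cache */
    if (iocb->flags & (SK_NOIO | SK_NOWAIT))
        return -EAGAIN;
    return 1; /* pretend the I/O was issued and completed */
}

/* Filesystem read path: lockless attempt first, locked retry second. */
int sk_fs_read(struct sk_iocb *iocb, bool cached)
{
    int ret;

    /* Fast path: no lock taken, no I/O allowed. */
    iocb->flags |= SK_NOIO;
    ret = sk_generic_read(iocb, cached);
    iocb->flags &= ~SK_NOIO; /* restore the caller's flags without saving them */

    if (ret != -EAGAIN)
        return ret;

    /* The caller itself asked for non-blocking behaviour: give up. */
    if (iocb->flags & SK_NOWAIT)
        return -EAGAIN;

    /* Slow path: take the lock, then allow I/O. */
    sk_lock_count++; /* "locking" */
    ret = sk_generic_read(iocb, cached);
    /* "unlocking" would go here */
    return ret;
}
```

Because the no-I/O bit is separate from the caller's own NOWAIT bit,
clearing it is enough to get the original flags back, which is the
property the first variant relies on.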