From: Xiubo Li
Date: Fri, 19 May 2023 16:40:26 +0800
Subject: Re: [PATCH v20 13/32] ceph: Provide a splice-read stub
To: David Howells, Jens Axboe, Al Viro, Christoph Hellwig
Cc: Matthew Wilcox, Jan Kara, Jeff Layton, David Hildenbrand,
 Jason Gunthorpe, Logan Gunthorpe, Hillf Danton, Christian Brauner,
 Linus Torvalds, linux-fsdevel@vger.kernel.org,
 linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, Christoph Hellwig, Ilya Dryomov,
 ceph-devel@vger.kernel.org
References: <20230519074047.1739879-1-dhowells@redhat.com>
 <20230519074047.1739879-14-dhowells@redhat.com>
In-Reply-To: <20230519074047.1739879-14-dhowells@redhat.com>

On 5/19/23 15:40, David Howells wrote:
> Provide a splice_read stub for Ceph.  This does the inode shutdown check
> before proceeding and jumps to direct_splice_read() if O_DIRECT is set, the
> file has inline data or is a synchronous file.
>
> We try and get FILE_RD and either FILE_CACHE and/or FILE_LAZYIO caps and
> hold them across filemap_splice_read().  If we fail to get FILE_CACHE or
> FILE_LAZYIO capabilities, we use direct_splice_read() instead.
>
> Signed-off-by: David Howells
> cc: Christoph Hellwig
> cc: Al Viro
> cc: Jens Axboe
> cc: Xiubo Li
> cc: Ilya Dryomov
> cc: Jeff Layton
> cc: ceph-devel@vger.kernel.org
> cc: linux-fsdevel@vger.kernel.org
> cc: linux-block@vger.kernel.org
> cc: linux-mm@kvack.org
> ---
>  fs/ceph/file.c | 66 +++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 65 insertions(+), 1 deletion(-)
>
> diff --git a/fs/ceph/file.c b/fs/ceph/file.c
> index f4d8bf7dec88..382dd5901748 100644
> --- a/fs/ceph/file.c
> +++ b/fs/ceph/file.c
> @@ -1745,6 +1745,70 @@ static ssize_t ceph_read_iter(struct kiocb *iocb, struct iov_iter *to)
>  	return ret;
>  }
>
> +/*
> + * Wrap filemap_splice_read with checks for cap bits on the inode.
> + * Atomically grab references, so that those bits are not released
> + * back to the MDS mid-read.
> + */
> +static ssize_t ceph_splice_read(struct file *in, loff_t *ppos,
> +				struct pipe_inode_info *pipe,
> +				size_t len, unsigned int flags)
> +{
> +	struct ceph_file_info *fi = in->private_data;
> +	struct inode *inode = file_inode(in);
> +	struct ceph_inode_info *ci = ceph_inode(inode);
> +	ssize_t ret;
> +	int want = 0, got = 0;
> +	CEPH_DEFINE_RW_CONTEXT(rw_ctx, 0);
> +
> +	dout("splice_read %p %llx.%llx %llu~%zu trying to get caps on %p\n",
> +	     inode, ceph_vinop(inode), *ppos, len, inode);
> +
> +	if (ceph_inode_is_shutdown(inode))
> +		return -ESTALE;
> +
> +	if ((in->f_flags & O_DIRECT) ||
> +	    ceph_has_inline_data(ci) ||
> +	    (fi->flags & CEPH_F_SYNC))
> +		return direct_splice_read(in, ppos, pipe, len, flags);
> +
> +	ceph_start_io_read(inode);
> +
> +	want = CEPH_CAP_FILE_CACHE;
> +	if (fi->fmode & CEPH_FILE_MODE_LAZY)
> +		want |= CEPH_CAP_FILE_LAZYIO;
> +
> +	ret = ceph_get_caps(in, CEPH_CAP_FILE_RD, want, -1, &got);
> +	if (ret < 0) {
> +		ceph_end_io_read(inode);
> +		return ret;
> +	}
> +
> +	if ((got & (CEPH_CAP_FILE_CACHE | CEPH_CAP_FILE_LAZYIO)) == 0) {
> +		dout("splice_read/sync %p %llx.%llx %llu~%zu got cap refs on %s\n",
> +		     inode, ceph_vinop(inode), *ppos, len,
> +		     ceph_cap_string(got));
> +
> +		ceph_end_io_read(inode);
> +		return direct_splice_read(in, ppos, pipe, len, flags);

Shouldn't we release the cap refs before returning here? ceph_get_caps()
succeeded above, so this path still holds the references in 'got', while
the normal path further down drops them with ceph_put_cap_refs(ci, got).

Thanks

- Xiubo

> +	}
> +
> +	dout("splice_read %p %llx.%llx %llu~%zu got cap refs on %s\n",
> +	     inode, ceph_vinop(inode), *ppos, len, ceph_cap_string(got));
> +
> +	rw_ctx.caps = got;
> +	ceph_add_rw_context(fi, &rw_ctx);
> +	ret = filemap_splice_read(in, ppos, pipe, len, flags);
> +	ceph_del_rw_context(fi, &rw_ctx);
> +
> +	dout("splice_read %p %llx.%llx dropping cap refs on %s = %zd\n",
> +	     inode, ceph_vinop(inode), ceph_cap_string(got), ret);
> +	ceph_put_cap_refs(ci, got);
> +
> +	ceph_end_io_read(inode);
> +	return ret;
> +}
> +
>  /*
>   * Take cap references to avoid releasing caps to MDS mid-write.
>   *
> @@ -2593,7 +2657,7 @@ const struct file_operations ceph_file_fops = {
>  	.lock = ceph_lock,
>  	.setlease = simple_nosetlease,
>  	.flock = ceph_flock,
> -	.splice_read = generic_file_splice_read,
> +	.splice_read = ceph_splice_read,
>  	.splice_write = iter_file_splice_write,
>  	.unlocked_ioctl = ceph_ioctl,
>  	.compat_ioctl = compat_ptr_ioctl,
>
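
To make the question about the direct_splice_read() fallback concrete,
here is a small userspace model of the reference accounting. It is only
a sketch and not kernel code: the model_* functions, the CAP_* values and
the held_refs counter are all hypothetical stand-ins invented for
illustration, and the real ceph_get_caps()/ceph_put_cap_refs() do far
more than bump a counter. The model only assumes that a successful get
takes a reference that must be paired with a put on every exit path.

```c
#include <assert.h>

/* Hypothetical stand-ins for the Ceph cap bits; the values are arbitrary. */
#define CAP_FILE_RD     0x1
#define CAP_FILE_CACHE  0x2
#define CAP_FILE_LAZYIO 0x4

/* Number of cap references the model currently holds. */
static int held_refs;

/*
 * Model of a successful ceph_get_caps(): always grants 'need', grants the
 * parts of 'want' that are present in 'issued', and takes one reference.
 */
static int model_get_caps(int need, int want, int issued, int *got)
{
	*got = need | (want & issued);
	held_refs++;
	return 0;
}

/* Model of ceph_put_cap_refs(): drops the reference taken above. */
static void model_put_cap_refs(int got)
{
	(void)got;
	held_refs--;
}

/*
 * Model of the control flow in the quoted function.
 * Returns 1 if the direct fallback was taken, 0 for the pagecache path.
 * 'release_on_fallback' selects whether the fallback path drops the refs.
 */
static int model_splice_read(int issued, int release_on_fallback)
{
	int got = 0;
	int want = CAP_FILE_CACHE;

	if (model_get_caps(CAP_FILE_RD, want, issued, &got) < 0)
		return -1;

	if ((got & (CAP_FILE_CACHE | CAP_FILE_LAZYIO)) == 0) {
		if (release_on_fallback)
			model_put_cap_refs(got);
		return 1;	/* direct_splice_read() would run here */
	}

	/* filemap_splice_read() would run here */
	model_put_cap_refs(got);
	return 0;
}
```

In this model, calling model_splice_read(0, 0) (no CACHE cap issued,
fallback taken as in the quoted hunk) leaves held_refs at 1, while
model_splice_read(0, 1) leaves it at 0. Whether the put should go before
or after the direct read in the real function is a separate question
that the model does not try to answer.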