From: Andrii Nakryiko <andrii.nakryiko@gmail.com>
To: Jann Horn <jannh@google.com>
Cc: Shakeel Butt <shakeel.butt@linux.dev>,
Andrii Nakryiko <andrii@kernel.org>,
bpf@vger.kernel.org, linux-mm@kvack.org,
akpm@linux-foundation.org, adobriyan@gmail.com,
hannes@cmpxchg.org, ak@linux.intel.com, osandov@osandov.com,
song@kernel.org, linux-fsdevel@vger.kernel.org,
willy@infradead.org, Omar Sandoval <osandov@fb.com>
Subject: Re: [PATCH v4 bpf-next 06/10] lib/buildid: implement sleepable build_id_parse() API
Date: Thu, 8 Aug 2024 14:23:59 -0700
Message-ID: <CAEf4BzZY7asE51qDVWMuvQiocaxkMNvRKy555-S+asVksDeTKQ@mail.gmail.com>
In-Reply-To: <CAG48ez1SkqF7q+FydGcUunYMriG+rt8eWyJuSH8meaDAUJbECw@mail.gmail.com>
On Thu, Aug 8, 2024 at 1:58 PM Jann Horn <jannh@google.com> wrote:
>
> On Thu, Aug 8, 2024 at 10:16 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> > On Thu, Aug 8, 2024 at 11:40 AM Shakeel Butt <shakeel.butt@linux.dev> wrote:
> > >
> > > On Wed, Aug 07, 2024 at 04:40:25PM GMT, Andrii Nakryiko wrote:
> > > > Extend freader with a flag specifying whether it's OK to cause page
> > > > fault to fetch file data that is not already physically present in
> > > > memory. With this, it's now easy to wait for data if the caller is
> > > > running in sleepable (faultable) context.
> > > >
> > > > We utilize read_cache_folio() to bring the desired folio into page
> > > > cache, after which the rest of the logic works just the same at folio level.
> > > >
> > > > Suggested-by: Omar Sandoval <osandov@fb.com>
> > > > Cc: Shakeel Butt <shakeel.butt@linux.dev>
> > > > Cc: Johannes Weiner <hannes@cmpxchg.org>
> > > > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > > > ---
> > > > lib/buildid.c | 44 ++++++++++++++++++++++++++++----------------
> > > > 1 file changed, 28 insertions(+), 16 deletions(-)
> > > >
> > > > diff --git a/lib/buildid.c b/lib/buildid.c
> > > > index 5e6f842f56f0..e1c01b23efd8 100644
> > > > --- a/lib/buildid.c
> > > > +++ b/lib/buildid.c
> > > > @@ -20,6 +20,7 @@ struct freader {
> > > > struct folio *folio;
> > > > void *addr;
> > > > loff_t folio_off;
> > > > + bool may_fault;
> > > > };
> > > > struct {
> > > > const char *data;
> > > > @@ -29,12 +30,13 @@ struct freader {
> > > > };
> > > >
> > > > static void freader_init_from_file(struct freader *r, void *buf, u32 buf_sz,
> > > > - struct address_space *mapping)
> > > > + struct address_space *mapping, bool may_fault)
> > > > {
> > > > memset(r, 0, sizeof(*r));
> > > > r->buf = buf;
> > > > r->buf_sz = buf_sz;
> > > > r->mapping = mapping;
> > > > + r->may_fault = may_fault;
> > > > }
> > > >
> > > > static void freader_init_from_mem(struct freader *r, const char *data, u64 data_sz)
> > > > @@ -63,6 +65,11 @@ static int freader_get_folio(struct freader *r, loff_t file_off)
> > > > >  	freader_put_folio(r);
> > > > >
> > > > >  	r->folio = filemap_get_folio(r->mapping, file_off >> PAGE_SHIFT);
> > > > > +
> > > > > +	/* if sleeping is allowed, wait for the page, if necessary */
> > > > > +	if (r->may_fault && (IS_ERR(r->folio) || !folio_test_uptodate(r->folio)))
> > > > > +		r->folio = read_cache_folio(r->mapping, file_off >> PAGE_SHIFT, NULL, NULL);
> > >
> > > Willy's network fs comment is bugging me. If we pass NULL for filler,
> > > the kernel is going to use fs's read_folio() callback. I have checked
> > > read_folio() for fuse and nfs and it seems like for at least these two
> > > filesystems the callback is accessing file->private_data. So, if the elf
> > > file is on these filesystems, we might see null accesses.
> > >
> >
> > Isn't that just a huge problem with the read_cache_folio() interface
> > then? That file is optional, in general, but for some specific FS
> > types it's not. How is generic code supposed to know this?
>
> I think you have to think about it the other way around. The file is
> required, unless you know the filler function that will be used
> doesn't use the file. Which you don't know when you're coming from
> generic code, so generic code has to pass in a file.
Fair enough:
> @file: Passed to filler function, may be NULL if not required.
But then you look at mapping_read_folio_gfp() which *always*
unconditionally passes NULL for filler and file, and that makes you
think that file is some special *extra* parameter.
But regardless, as you pointed out, I won't have to take extra ref, so
my concerns about performance are wrong. I'll pass the file.
>
> As far as I can tell, most of the callers of read_cache_folio() (via
> read_mapping_folio()) are inside filesystem implementations, not
> generic code, so they know what the filler function will do. You're
> generic code, so I think you have to pass in a file.
>
Yep, I guess this is a bit of a trailblazing use case. I was confused by
some other helpers passing NULL for file unconditionally, which made
me think that NULL is a supported default use case. Clearly I was
wrong.
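To make the point above concrete, here is a minimal userspace sketch, not
kernel code. Every name in it (mock_file, mock_folio, local_fs_read_folio,
net_fs_read_folio, generic_read) is hypothetical and only loosely modeled on
the read_folio()-style callback being discussed. It assumes one filler that
ignores the file and one that needs file->private_data, and shows why a
generic caller, which cannot tell the two apart, should always forward the
file.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; none of these are the kernel's types. */
struct mock_file {
	void *private_data;	/* per-open state that some filesystems need */
};

struct mock_folio {
	int uptodate;
};

/* Signature loosely modeled on a read_folio()-style callback. */
typedef int (*mock_read_folio_t)(struct mock_file *file,
				 struct mock_folio *folio);

/* A filler that never looks at the file, so file == NULL is harmless. */
static int local_fs_read_folio(struct mock_file *file, struct mock_folio *folio)
{
	(void)file;
	folio->uptodate = 1;
	return 0;
}

/*
 * A filler that needs per-open state. The callbacks discussed in this
 * thread would dereference file->private_data; this mock returns an error
 * instead of crashing so that the failure is observable.
 */
static int net_fs_read_folio(struct mock_file *file, struct mock_folio *folio)
{
	if (!file || !file->private_data)
		return -1;
	folio->uptodate = 1;
	return 0;
}

/*
 * Generic caller: it cannot know which filler it will end up with, so it
 * always forwards the file it was given.
 */
static int generic_read(mock_read_folio_t filler, struct mock_file *file,
			struct mock_folio *folio)
{
	assert(filler != NULL && folio != NULL);
	return filler(file, folio);
}
```

In this sketch, passing NULL works only for the filler that happens not to
need the file, which is knowledge a filesystem has about itself but generic
code does not.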
> > Or maybe it's a bug with the nfs_read_folio() and fuse_read_folio()
> > implementation that they can't handle NULL file argument?
> > netfs_read_folio(), for example, seems to be working with file == NULL
> > just fine.
> >
> > Matthew, can you please advise what's the right approach here? I can,
> > of course, always get file refcount, but most of the time it will be
> > just an unnecessary overhead, so ideally I'd like to avoid that. But
> > if I have to check each read_folio callback implementation to know
> > whether it's required or not, then that's not great...
>
> Why would you need to increment the file refcount? As far as I can
> tell, all your accesses to the file would happen under
> __build_id_parse(), which is borrowing the refcounted reference from
> vma->vm_file; the file can't go away as long as your caller is holding
> the mmap lock.
Yep, agreed.
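As a rough illustration of the borrowing argument, here is another userspace
sketch with hypothetical names (borrowed_file, mock_vma, mock_mm,
parse_borrowed). It assumes the lock is modeled as a plain flag and the
reference count as an integer, which is far simpler than the real mmap lock
and file refcounting. The only point it shows is that a reader that runs
entirely while the caller holds the lock can use the VMA's existing
reference without taking its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a refcounted file, a VMA, and an mm. */
struct borrowed_file {
	int refcount;	/* references held; the VMA owns one */
	int reads;	/* how many times the file was read */
};

struct mock_vma {
	struct borrowed_file *vm_file;
};

struct mock_mm {
	int mmap_locked;	/* 1 while the caller holds the lock */
	struct mock_vma vma;
};

/* Stand-in for reading file data; it only counts the access. */
static void mock_read(struct borrowed_file *file)
{
	file->reads++;
}

/*
 * Uses the VMA's file only while the lock is held. It never touches the
 * refcount, because the VMA's own reference keeps the file alive for as
 * long as the lock prevents the VMA from going away.
 */
static int parse_borrowed(struct mock_mm *mm)
{
	struct borrowed_file *file;

	if (!mm->mmap_locked)
		return -1;	/* borrowing is only safe under the lock */

	file = mm->vma.vm_file;
	if (!file)
		return -1;

	mock_read(file);
	return 0;
}
```

The reference count is the same before and after the call, so in this model
passing the file costs nothing extra.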