Re: [PATCH v5 bpf-next 01/10] lib/buildid: harden build ID parsing logic

From: Jann Horn <jannh@google.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Andrii Nakryiko <andrii@kernel.org>,
	bpf@vger.kernel.org, linux-mm@kvack.org,
	 akpm@linux-foundation.org, adobriyan@gmail.com,
	shakeel.butt@linux.dev,  hannes@cmpxchg.org, ak@linux.intel.com,
	osandov@osandov.com, song@kernel.org,
	 linux-fsdevel@vger.kernel.org, willy@infradead.org,
	stable@vger.kernel.org
Subject: Re: [PATCH v5 bpf-next 01/10] lib/buildid: harden build ID parsing logic
Date: Wed, 14 Aug 2024 18:13:59 +0200
Message-ID: <CAG48ez0QdmjJua8V4RPhs2WmuGGhD++H-e2vacfP1=2jVgCy+w@mail.gmail.com>
In-Reply-To: <CAEf4BzZa9Rkm=MAOOF58K444NAfiRry2Y1DDgPYaB48x6yEdbw@mail.gmail.com>

On Wed, Aug 14, 2024 at 1:21 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
> On Tue, Aug 13, 2024 at 1:59 PM Jann Horn <jannh@google.com> wrote:
> >
> > On Tue, Aug 13, 2024 at 2:29 AM Andrii Nakryiko <andrii@kernel.org> wrote:
> > > Harden build ID parsing logic, adding explicit READ_ONCE() where it's
> > > important to have a consistent value read and validated just once.
> > >
> > > Also, as pointed out by Andi Kleen, we need to make sure that entire ELF
> > > note is within a page bounds, so move the overflow check up and add an
> > > extra note_size boundaries validation.
> > >
> > > Fixes tag below points to the code that moved this code into
> > > lib/buildid.c, and then subsequently was used in perf subsystem, making
> > > this code exposed to perf_event_open() users in v5.12+.
> >
> > Sorry, I missed some things in previous review rounds:
> >
> > [...]
> > > @@ -18,31 +18,37 @@ static int parse_build_id_buf(unsigned char *build_id,
> > [...]
> > >                 if (nhdr->n_type == BUILD_ID &&
> > > -                   nhdr->n_namesz == sizeof("GNU") &&
> > > -                   !strcmp((char *)(nhdr + 1), "GNU") &&
> > > -                   nhdr->n_descsz > 0 &&
> > > -                   nhdr->n_descsz <= BUILD_ID_SIZE_MAX) {
> > > -                       memcpy(build_id,
> > > -                              note_start + note_offs +
> > > -                              ALIGN(sizeof("GNU"), 4) + sizeof(Elf32_Nhdr),
> > > -                              nhdr->n_descsz);
> > > -                       memset(build_id + nhdr->n_descsz, 0,
> > > -                              BUILD_ID_SIZE_MAX - nhdr->n_descsz);
> > > +                   name_sz == note_name_sz &&
> > > +                   strcmp((char *)(nhdr + 1), note_name) == 0 &&
> >
> > Please change this to something like "memcmp((char *)(nhdr + 1),
> > note_name, note_name_sz) == 0" to ensure that we can't run off the end
> > of the page if there are no null bytes in the rest of the page.
>
> I did switch this to strncmp() at some earlier point, but then
> realized that there is no point because note_name is controlled by us
> and will ensure there is a zero at byte (note_name_sz - 1). So I don't
> think memcmp() buys us anything.

There are two reasons why using strcmp() here makes me uneasy.


First: We're still operating on shared memory that can concurrently change.

Let's say strcmp() is implemented like this (this is the generic C
implementation in the kernel, which I think is the implementation
that's used for x86-64):

int strcmp(const char *cs, const char *ct)
{
        unsigned char c1, c2;

        while (1) {
                c1 = *cs++;
                c2 = *ct++;
                if (c1 != c2)
                        return c1 < c2 ? -1 : 1;
                if (!c1)
                        break;
        }
        return 0;
}

No READ_ONCE() or anything like that - it's not designed for being
used on concurrently changing memory.

Now suppose you call it as strcmp(<shared memory>, "GNU") and we are in
the fourth iteration, where c2 is the null terminator of "GNU". If the
compiler decides to re-fetch the value of "c1" from memory for each of
the two conditions, then the "if (c1 != c2)" check could see c1='\0'
and c2='\0' and evaluate as false; but by the time we reach
"if (!c1)", the value in memory has changed and we see c1='A'. So in
the next round we will be accessing out-of-bounds memory past the end
of the 4-byte string constant "GNU".

So I don't think strcmp() on memory that can concurrently change is allowed.

(It actually seems like the generic memcmp() is also implemented
without READ_ONCE(); maybe we should change that...)
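
As an illustration only (this is not code from the patch or from the
kernel), a comparison that tolerates concurrently changing memory could
load each shared byte exactly once and stop after a fixed length. The
sketch below is user-space C; READ_ONCE_U8() is a made-up stand-in for
the kernel's READ_ONCE(), and shared_name_eq() is a hypothetical helper
name.

```c
#include <stddef.h>

/*
 * Assumption: user-space stand-in for the kernel's READ_ONCE() on one
 * byte. The volatile access asks the compiler to perform the load as
 * written instead of re-fetching the byte later.
 */
#define READ_ONCE_U8(p) (*(const volatile unsigned char *)(p))

/*
 * Hypothetical helper: returns 1 if the name_sz bytes at 'shared' equal
 * the name_sz bytes at 'name', 0 otherwise. Each shared byte is loaded
 * into a local once, and at most name_sz bytes of 'shared' are read.
 */
static int shared_name_eq(const void *shared, const char *name,
			  size_t name_sz)
{
	const unsigned char *s = shared;
	size_t i;

	for (i = 0; i < name_sz; i++) {
		unsigned char c = READ_ONCE_U8(s + i); /* single load */

		if (c != (unsigned char)name[i])
			return 0;
	}
	return 1;
}
```

Because the loop is bounded by name_sz rather than by a terminator found
in the shared buffer, a byte that changes after it has been compared
cannot push the loop past the end of "GNU".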


Second: You are assuming that if one side of the strcmp() is at most
four bytes long (including null terminator), then strcmp() also won't
access more than 4 bytes of the other string, even if that string does
not have a null terminator within its first four bytes. I don't think
that's part of the normal strcmp() API contract.
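
To make the contract point concrete, here is a rough user-space sketch
of a length-bounded check in the spirit of the memcmp() suggestion
quoted above. The function name and the buf/buf_len/off parameters are
assumptions made for this example and do not match the patch. The only
guarantee relied on is the documented one: memcmp() compares exactly
the number of bytes it is given.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if the name_sz bytes at buf + off equal
 * 'name', 0 otherwise. The range check runs first, so the comparison
 * never reads beyond buf_len even if the buffer holds no null byte.
 */
static int note_name_matches(const unsigned char *buf, size_t buf_len,
			     size_t off, const char *name, size_t name_sz)
{
	/* Reject ranges that would extend past the end of the buffer. */
	if (off > buf_len || name_sz > buf_len - off)
		return 0;

	/* memcmp() reads exactly name_sz bytes from each side. */
	return memcmp(buf + off, name, name_sz) == 0;
}
```

With name = "GNU" and name_sz = 4, the terminator is compared as an
ordinary byte, so "GNUX" does not match and no scan for a terminator in
the buffer is needed. This sketch does not address the concurrent
modification concern from the first point.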
