From: Jann Horn <jannh@google.com>
To: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Andrii Nakryiko <andrii@kernel.org>,
bpf@vger.kernel.org, linux-mm@kvack.org,
akpm@linux-foundation.org, adobriyan@gmail.com,
shakeel.butt@linux.dev, hannes@cmpxchg.org, ak@linux.intel.com,
osandov@osandov.com, song@kernel.org,
linux-fsdevel@vger.kernel.org, willy@infradead.org,
stable@vger.kernel.org
Subject: Re: [PATCH v5 bpf-next 01/10] lib/buildid: harden build ID parsing logic
Date: Wed, 14 Aug 2024 18:13:59 +0200
Message-ID: <CAG48ez0QdmjJua8V4RPhs2WmuGGhD++H-e2vacfP1=2jVgCy+w@mail.gmail.com>
In-Reply-To: <CAEf4BzZa9Rkm=MAOOF58K444NAfiRry2Y1DDgPYaB48x6yEdbw@mail.gmail.com>
On Wed, Aug 14, 2024 at 1:21 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
> On Tue, Aug 13, 2024 at 1:59 PM Jann Horn <jannh@google.com> wrote:
> >
> > On Tue, Aug 13, 2024 at 2:29 AM Andrii Nakryiko <andrii@kernel.org> wrote:
> > > Harden build ID parsing logic, adding explicit READ_ONCE() where it's
> > > important to have a consistent value read and validated just once.
> > >
> > > Also, as pointed out by Andi Kleen, we need to make sure that entire ELF
> > > note is within a page bounds, so move the overflow check up and add an
> > > extra note_size boundaries validation.
> > >
> > > Fixes tag below points to the code that moved this code into
> > > lib/buildid.c, and then subsequently was used in perf subsystem, making
> > > this code exposed to perf_event_open() users in v5.12+.
> >
> > Sorry, I missed some things in previous review rounds:
> >
> > [...]
> > > @@ -18,31 +18,37 @@ static int parse_build_id_buf(unsigned char *build_id,
> > [...]
> > > if (nhdr->n_type == BUILD_ID &&
> > > - nhdr->n_namesz == sizeof("GNU") &&
> > > - !strcmp((char *)(nhdr + 1), "GNU") &&
> > > - nhdr->n_descsz > 0 &&
> > > - nhdr->n_descsz <= BUILD_ID_SIZE_MAX) {
> > > - memcpy(build_id,
> > > - note_start + note_offs +
> > > - ALIGN(sizeof("GNU"), 4) + sizeof(Elf32_Nhdr),
> > > - nhdr->n_descsz);
> > > - memset(build_id + nhdr->n_descsz, 0,
> > > - BUILD_ID_SIZE_MAX - nhdr->n_descsz);
> > > + name_sz == note_name_sz &&
> > > + strcmp((char *)(nhdr + 1), note_name) == 0 &&
> >
> > Please change this to something like "memcmp((char *)(nhdr + 1),
> > note_name, note_name_sz) == 0" to ensure that we can't run off the end
> > of the page if there are no null bytes in the rest of the page.
>
> I did switch this to strncmp() at some earlier point, but then
> realized that there is no point because note_name is controlled by us
> and will ensure there is a zero at byte (note_name_sz - 1). So I don't
> think memcmp() buys us anything.

There are two reasons why using strcmp() here makes me uneasy.

First: We're still operating on shared memory that can concurrently change.

Let's say strcmp() is implemented like this (this is the generic C
implementation in the kernel, which I think is the one used on x86-64):

int strcmp(const char *cs, const char *ct)
{
	unsigned char c1, c2;

	while (1) {
		c1 = *cs++;
		c2 = *ct++;
		if (c1 != c2)
			return c1 < c2 ? -1 : 1;
		if (!c1)
			break;
	}
	return 0;
}

No READ_ONCE() or anything like that - it's not designed for being
used on concurrently changing memory.

And let's say you call it like strcmp(<shared memory>, "GNU"), and
we're now in the fourth iteration. If the compiler decides to re-fetch
the value of "c1" from memory for each of the two conditions, then it
could be that the "if (c1 != c2)" sees c1='\0' and c2='\0', so the
condition evaluates as false; but then at the "if (!c1)", the value in
memory changed, and we see c1='A'. So now in the next round, we'll be
accessing out-of-bounds memory behind the 4-byte string constant
"GNU".

So I don't think strcmp() on memory that can concurrently change is allowed.

(It actually seems like the generic memcmp() is also implemented
without READ_ONCE(), maybe we should change that...)
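
To make the re-fetch concern concrete, here is a minimal userspace sketch of
one way to avoid it: load each shared byte exactly once into a private
buffer, then compare only the private copy. READ_ONCE_U8(), NAME_MAX_SZ and
shared_name_matches() are hypothetical names made up for this illustration;
the macro is only a volatile-cast stand-in for the kernel's READ_ONCE(), and
none of this is the code from the patch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-in for the kernel's READ_ONCE(): one load per use. */
#define READ_ONCE_U8(p) (*(const volatile unsigned char *)(p))

#define NAME_MAX_SZ 16	/* hypothetical upper bound for this sketch */

/*
 * Compare name_sz bytes at 'shared' (which may change concurrently) with
 * the trusted constant 'name'. Every shared byte is loaded once into a
 * private snapshot, so all later decisions are made on the same values.
 */
static bool shared_name_matches(const void *shared, size_t avail,
				const char *name, size_t name_sz)
{
	unsigned char snap[NAME_MAX_SZ];
	size_t i;

	/* Refuse anything that would overrun the snapshot or the source. */
	if (name_sz > sizeof(snap) || name_sz > avail)
		return false;

	for (i = 0; i < name_sz; i++)
		snap[i] = READ_ONCE_U8((const unsigned char *)shared + i);

	/* memcmp() is fine here: both operands are private memory now. */
	return memcmp(snap, name, name_sz) == 0;
}
```

The comparison may still observe a mix of old and new bytes if the memory
changes mid-copy, but the result is then merely a mismatch or a match on
values that were really seen, never an out-of-bounds access.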

Second: You are assuming that if one side of the strcmp() is at most
four bytes long (including null terminator), then strcmp() also won't
access more than 4 bytes of the other string, even if that string does
not have a null terminator at index 3. I don't think that's part of
the normal strcmp() API contract.
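
As an illustration of the alternative, a length-bounded comparison states
the access limit in the call itself instead of relying on where a NUL byte
happens to be. The sketch below is a hedged userspace example;
name_in_bounds_and_equal() and its parameters are hypothetical and do not
appear in lib/buildid.c. It only addresses the bounds question, not the
concurrent-modification one discussed above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: check that the name_sz bytes starting at name_off
 * lie inside buf[0..buf_sz) and are equal to 'want'. memcmp() is given the
 * length explicitly, so it never depends on finding a NUL byte in buf.
 */
static bool name_in_bounds_and_equal(const unsigned char *buf, size_t buf_sz,
				     size_t name_off, size_t name_sz,
				     const char *want, size_t want_sz)
{
	if (name_sz != want_sz)
		return false;

	/* Written with a subtraction so the check itself cannot overflow. */
	if (name_off > buf_sz || name_sz > buf_sz - name_off)
		return false;

	return memcmp(buf + name_off, want, want_sz) == 0;
}
```

With a name that ends exactly at the end of the buffer and has no
terminator, this returns false after touching only want_sz bytes.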