From: Alexander Potapenko <glider@google.com>
To: Joey Jiao <quic_jiangenj@quicinc.com>
Cc: Marco Elver <elver@google.com>,
Dmitry Vyukov <dvyukov@google.com>,
Andrey Konovalov <andreyknvl@gmail.com>,
Jonathan Corbet <corbet@lwn.net>,
Andrew Morton <akpm@linux-foundation.org>,
Dennis Zhou <dennis@kernel.org>, Tejun Heo <tj@kernel.org>,
Christoph Lameter <cl@linux.com>,
Catalin Marinas <catalin.marinas@arm.com>,
Will Deacon <will@kernel.org>,
kasan-dev@googlegroups.com, linux-kernel@vger.kernel.org,
workflows@vger.kernel.org, linux-doc@vger.kernel.org,
linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org,
kernel@quicinc.com
Subject: Re: [PATCH 0/7] kcov: Introduce New Unique PC|EDGE|CMP Modes
Date: Wed, 15 Jan 2025 16:16:57 +0100
Message-ID: <CAG_fn=XFkNVkT3EmB99SdEBAwkGq3EUdM9xR4rzH_HatrJw8rQ@mail.gmail.com>
In-Reply-To: <Z4ZfzoqhrJA0jeQI@hu-jiangenj-sha.qualcomm.com>
On Tue, Jan 14, 2025 at 2:00 PM Joey Jiao <quic_jiangenj@quicinc.com> wrote:
>
> On Tue, Jan 14, 2025 at 11:43:08AM +0100, Marco Elver wrote:
> > On Tue, 14 Jan 2025 at 06:35, Jiao, Joey <quic_jiangenj@quicinc.com> wrote:
> > >
> > > Hi,
> > >
> > > This patch series introduces new kcov unique modes:
> > > `KCOV_TRACE_UNIQ_[PC|EDGE|CMP]`, which are used to collect unique PC, EDGE,
> > > CMP information.
> > >
> > > Background
> > > ----------
> > >
> > > In the current kcov implementation, when `__sanitizer_cov_trace_pc` is hit,
> > > the instruction pointer (IP) is stored sequentially in an area. Userspace
> > > programs then read this area to record covered PCs and calculate covered
> > > edges. However, recent syzkaller runs show that many syscalls likely have
> > > `pos > t->kcov_size`, leading to kcov overflow. To address this issue, we
> > > introduce new kcov unique modes.
Hi Joey,
Sorry for not responding earlier. I had hoped to come back with a working
proposal, but it is taking a while.
You are right that kcov is prone to overflows, and we might be missing
interesting coverage because of that.
Recently we've been discussing the applicability of
-fsanitize-coverage=trace-pc-guard to this problem, and it is almost
working already.
The idea is as follows:
- -fsanitize-coverage=trace-pc-guard instruments basic blocks with
calls to `__sanitizer_cov_trace_pc_guard(u32 *guard)`, each taking a
unique 32-bit global in the __sancov_guards section;
- these globals are zero-initialized, but upon the first call to
__sanitizer_cov_trace_pc_guard() from each callsite, the corresponding
global will receive a unique consecutive number;
- now we have a mapping of PCs into indices, which we can use to
deduplicate the coverage:
-- storing PCs by their index taken from *guard directly in the
user-supplied buffer (whose size will not exceed several megabytes in
practice);
-- using a per-task bitmap (at most hundreds of kilobytes) to mark
visited basic blocks, and appending newly encountered PCs to the
user-supplied buffer like it's done now.
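The bitmap variant above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the kcov prototype: `trace_pc_guard()`, `MAX_GUARDS`, `visited`, `trace` and `next_index` are hypothetical names, the PC is passed explicitly because there is no return address to read here, and a real kernel implementation would have to allocate indices atomically.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_GUARDS 1024                       /* hypothetical cap on instrumented blocks */

static uint32_t next_index = 1;               /* 0 is reserved for "guard not initialized" */
static uint8_t visited[MAX_GUARDS / 8 + 1];   /* stand-in for the per-task bitmap */
static uintptr_t trace[MAX_GUARDS];           /* stand-in for the user-supplied buffer */
static size_t trace_len;

/*
 * Models a call to __sanitizer_cov_trace_pc_guard(guard) made from
 * address pc. Each PC is appended to trace[] at most once.
 */
static void trace_pc_guard(uint32_t *guard, uintptr_t pc)
{
	uint32_t idx;

	if (*guard == 0) {
		/* First hit from this callsite: hand out the next index. */
		if (next_index >= MAX_GUARDS)
			return;               /* out of indices, drop the sample */
		*guard = next_index++;        /* not atomic: sketch only */
	}

	idx = *guard;
	if (visited[idx / 8] & (1u << (idx % 8)))
		return;                       /* block already recorded */

	visited[idx / 8] |= 1u << (idx % 8);
	trace[trace_len++] = pc;              /* append newly seen PC */
}
```

Calling `trace_pc_guard()` three times with two distinct guards (one of them twice) leaves two entries in `trace[]`. The first variant from the list would instead write the PC to `trace[idx]` and need no bitmap.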
I think this approach is more promising than using hashmaps in kcov:
- direct mapping should be way faster than a hashmap (and the overhead
of index allocation is amortized, because they are persistent between
program runs);
- there cannot be collisions;
- no additional complexity from pool allocations or RCU synchronization.
The above approach will naturally break edge coverage, as there will
be no notion of a program trace anymore.
But it is still a question whether edges are helping the fuzzer, and
correctly deduplicating them may not be worth the effort.
If you don't object, I would like to finish prototyping coverage
guards for kcov before proceeding with this review.
Alex
> > > 2. [P 2-3] Introduce `KCOV_TRACE_UNIQ_EDGE` Mode:
> > > - Save `prev_pc` to calculate edges with the current IP.
> > > - Add unique edges to the hashmap.
> > > - Use a lower 12-bit mask to make hash independent of module offsets.
Note that on ARM64 this will effectively use only bits 11:2 (instructions
are 4-byte aligned, so bits 1:0 are always zero), so, if I understand
correctly, more than a million coverage callbacks will be mapped into
just 1024 buckets.
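The arithmetic behind the 1024 figure can be checked with a small userspace sketch. It models only the 12-bit mask described in the cover letter, not the hashing in the actual patch; `LOW_MASK` and `distinct_keys()` are hypothetical names, and the base address is an arbitrary 4-byte-aligned value.

```c
#include <stddef.h>
#include <stdint.h>

#define LOW_MASK 0xfffu   /* lower 12 bits, as described in the cover letter */

/*
 * Counts how many distinct masked values appear among n consecutive
 * 4-byte-aligned addresses starting at base.
 */
static size_t distinct_keys(uint64_t base, size_t n)
{
	uint8_t seen[LOW_MASK + 1] = { 0 };
	size_t count = 0;

	for (size_t i = 0; i < n; i++) {
		uint64_t pc = base + 4 * i;       /* A64 instructions are 4 bytes wide */
		uint32_t key = pc & LOW_MASK;     /* bits 1:0 are always zero here */

		if (!seen[key]) {
			seen[key] = 1;
			count++;
		}
	}
	return count;
}
```

With a million addresses the count stops at 1024, because only bits 11:2 of the masked value ever vary.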