From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id E6D30C61DF7 for ; Thu, 23 Nov 2023 14:59:34 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 48C6C6B06C1; Thu, 23 Nov 2023 09:59:34 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 43C546B06C3; Thu, 23 Nov 2023 09:59:34 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 2DC706B06C4; Thu, 23 Nov 2023 09:59:34 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 1C41D6B06C1 for ; Thu, 23 Nov 2023 09:59:34 -0500 (EST) Received: from smtpin27.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id DB02B160FC2 for ; Thu, 23 Nov 2023 14:59:33 +0000 (UTC) X-FDA: 81489527826.27.324F569 Received: from mail-lf1-f41.google.com (mail-lf1-f41.google.com [209.85.167.41]) by imf11.hostedemail.com (Postfix) with ESMTP id 071B84000B for ; Thu, 23 Nov 2023 14:59:31 +0000 (UTC) Authentication-Results: imf11.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=WTO9G6sG; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf11.hostedemail.com: domain of dvyukov@google.com designates 209.85.167.41 as permitted sender) smtp.mailfrom=dvyukov@google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1700751572; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=rjsWntyf3ek48ZMQ86Tl47VHUhxQBToAw1VMl26tGwI=; b=tu72J5EYEiNG4D9FqXBDiJIafap8qQj3XSbcKOphuLFFKRxhuxn2T0ODzncM0qlO6YHYib Si+frotko4oajg4FQwEutVpfPHVtU40gO+RgTyLbCEoGisXQZt/fUFxP3ATZnnhD4itOl4 xWAzoAXJ4TmDphh1Ww1WqMwzh6gFmys= ARC-Authentication-Results: i=1; imf11.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=WTO9G6sG; dmarc=pass (policy=reject) header.from=google.com; spf=pass (imf11.hostedemail.com: domain of dvyukov@google.com designates 209.85.167.41 as permitted sender) smtp.mailfrom=dvyukov@google.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1700751572; a=rsa-sha256; cv=none; b=FzO9AYafu6bPERyFkNayATbonAi+8/2FsgLujfKbmb+odlnffTIwX2zggJfWOmlM+Rr7RG 87b83YfmKsElQZBzSMr2wvB+cAKan3yefh7CuPd0JgrvZ6qOC5X3iX3O+hWzYrDY2z1ftN 4xMz7qHq8ghYn02ERnHnTS+ose3e3Yw= Received: by mail-lf1-f41.google.com with SMTP id 2adb3069b0e04-50aa698b384so3621e87.0 for ; Thu, 23 Nov 2023 06:59:31 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1700751570; x=1701356370; darn=kvack.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=rjsWntyf3ek48ZMQ86Tl47VHUhxQBToAw1VMl26tGwI=; b=WTO9G6sGURIlSqQq4/jesOy1FWpicuTbLjBnUTwJa5VgSJNXcJ1w8WOuPG7h7gVZPF IaeyIeAnFMn4B/8ZfSE/A8AhRFKTam1bivDQD44C9vBxVOpdnwCtjoM5R2s+idu0bUxe PM6B63VBzitqXXPvhOwXy9UYhzTcbJ+Iuv+9GCruTsrsKieZ9jBBfbzGnXF6djo0Rx7M itDB18QmMvSO3CgsxoSoZJmeNsP+yhNwcw/SIfJxU9qBIqziTuvs4B+Dfan1tsGVmF4r mo1zZs9J+/Soy+bAxRVmM/Dsz8A416dUIs4KrebG6ynYO41Pdlp1MLLNmsg7EbwBjbV5 g3og== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1700751570; x=1701356370; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=rjsWntyf3ek48ZMQ86Tl47VHUhxQBToAw1VMl26tGwI=; b=iqmt8+37FqflIUij7Uc8fHjF4gbAcsuXMnhSMTzYsRrzB02d/x3ooepzxUiMdxDdVM 6y+cgZUH+hz4VVTEC9L999Pimdw38XExORp1KfCBZ72QwOVyAJxH63lck8lcMErn8qNG ht9PLSNWJdmqt/tqqEaoBUF2wL+TWGOuyAdzmWtm3Vfbjeko2Ef1GpF0RTuuPB6pw+4M vy9cz+S4TI1PD2OQiO0stkqX6h1/Fs6nssvr4xhD3guPA7PpdU9xb6ZTD9XmKmQ3AFgM X+eNH8teIsBZVRKgtsFTMCWmuP2nOpN4wBiZdIBEcJ2JY1Cpgr/3VUNI5oa2hgEOb1s3 itJQ== X-Gm-Message-State: AOJu0YxElNAMB+ziOiTiPuYjyJtEU3IocCI9m/FFX/4/9MpJOz261Wxo D40yMs0sGGggzAMEJEGl+fdTQ68RbJNa2i88dtagIQ== X-Google-Smtp-Source: AGHT+IHIl/WBZ2iDyzl0IR+ziEaFAWIlCfOa3lmlYRCwgZw978+0QVt/5bBPs7XrHN1JL0NxP4OnuWDIrySOcoIbrbs= X-Received: by 2002:a19:ad43:0:b0:504:7b50:ec9a with SMTP id s3-20020a19ad43000000b005047b50ec9amr144412lfd.1.1700751569826; Thu, 23 Nov 2023 06:59:29 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: Dmitry Vyukov Date: Thu, 23 Nov 2023 15:59:17 +0100 Message-ID: Subject: Re: [RFC PATCH] kasan: Record and report more information To: Juntong Deng , Marco Elver Cc: ryabinin.a.a@gmail.com, glider@google.com, andreyknvl@gmail.com, vincenzo.frascino@arm.com, akpm@linux-foundation.org, kasan-dev@googlegroups.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-kernel-mentees@lists.linuxfoundation.org Content-Type: text/plain; charset="UTF-8" X-Rspam-User: X-Stat-Signature: gmk16z6wkigchzreq5wti7y5i4yj6go6 X-Rspamd-Server: rspam07 X-Rspamd-Queue-Id: 071B84000B X-HE-Tag: 1700751571-210205 X-HE-Meta: U2FsdGVkX18+8S0wFZeJ5puvOfW9L4U6X6wULHsWJTQk5KqpB2G86M0Yo9704P5wd1Exy/NkaASFoS+zEc39ILSd1+lQbTlbK8BvneyqkkNsYYaBc9/uwzpkM30IqgyvCG+MYYqihdEBYTdirGlA46kwvFJGS4U0nAPWvQtenhTE8fcmIfH/z0zB7L26hECANbW21ooFSbLUvKwcueBHlmcRQIm/e9IvkXLHfdWbEUfL+Z1lTiYeE7U5GxrJek2fezk4WCRGUQlmCumTtwm0dWtA93kdrubbbLhgEM8ZGOKemUzEcML9wZhfOqOYT9ItLcJa1O53FyVGctl7OBWAMxBQqT1iFV9zcYQvo66SW+xHzxKHoogi5AbCLOOSr/rkXQqQUiqgQ7By7J4H+ZrOkSYX2N9yOM23bIMfFL4aXkR5muXmubt/c+64HprfM25A0+ZTkMABpFu5142sWA9l9PawqhFRKo1ikv4Xai7FzSEkf73CV7pcWwWMLmYkHrKogs6jnzNfPdTKeUUuZwXLRn0bXzLQ3w4vhHkByfZyyFQHL6BIdIkrYyGMY1SzghJy1wdiJJo5uehXF4dHl23xXlI9Tma5lsF5Ow1feV5fEd2QaPS1xtuHVajLwrrskm10YNpM3k2CN8Zv6RlnHB52xnX0CivhuO7CiGO7R1/mq6JUsm8/59kQbsdbk+RizSA6ciI9Sj+kPOEQtv4QsCPBscgw5wCbSpUtTiYgsUAQdzQ35oFLjoNkH8fbaeNm0/cG8VRV5qOiTGtBWyvyMWuUSADv/K0quv7MFtKx2z6tBNlFMShRZOokPdLJM7Ubo20pHU7ANWnKlIo1rSxWy1++z83DrgTBBgGFbT4RHWHMrM8woLVxqxY3cbgzE5V2dKjTVkoU1I5TnjE/P2z1d6y6RdCOMCPNeh/ER1tPc/JQnHEkSi9YjB6kW2G8YSiybXePObM+b4ZjHxyu1iEZe4a ASsKYAXp /+ujCOh7Vp/Eb0FmLU888Z3LEBeVq3LqQF1OwPxauV7XFhxHL0JplFQIA1+0DMibn9byuoBS1ho71pEw58guK7r6mopHEL0PKQPBnzh/dQim4EqHzF7PQSWZbCLxNwSL6Ktp2uMUDuxrVa7K41pIzf0E9ampdwLxwhQk9VysSSE77ugo8AcEMwRJgPaORmYAVJEt0anHZGdtXLA5gcaYca1yJ8hCANhDONIDz2yucCnEXJS1/nvHUAFHeSngi8sOIftEP9GSl13v3kSlcV6oLN6OZW/iPupV/F7Bn8nhj2V3Mk4fGXaGQBs+axLgjgECNSqIaDROOhQ2FhD9UwAAv9QVcaO1K3JN5bV/6 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Thu, 23 Nov 2023 at 11:06, Juntong Deng wrote: > > On 2023/11/20 19:06, Dmitry Vyukov wrote: > > On Wed, 15 Nov 2023 at 22:53, Juntong Deng wrote: > >> > >> On 2023/11/14 15:27, Dmitry Vyukov wrote: > >>> On Mon, 13 Nov 2023 at 22:17, Juntong Deng wrote: > >>>> > >>>> Record and report more information to help us find the cause of > >>>> the bug (for example, bugs caused by subtle race condition). > >>>> > >>>> This patch adds recording and showing CPU number and timestamp at > >>>> allocation and free (controlled by CONFIG_KASAN_EXTRA_INFO), and > >>>> adds recording and showing timestamp at error occurrence (CPU number > >>>> is already shown by dump_stack_lvl). The timestamps in the report use > >>>> the same format and source as printk. > >>>> > >>>> In order to record CPU number and timestamp at allocation and free, > >>>> corresponding members need to be added to the relevant data structures, > >>>> which may lead to increased memory consumption. > >>>> > >>>> In Generic KASAN, members are added to struct kasan_track. Since in > >>>> most cases, alloc meta is stored in the redzone and free meta is > >>>> stored in the object or the redzone, memory consumption will not > >>>> increase much. > >>>> > >>>> In SW_TAGS KASAN and HW_TAGS KASAN, members are added to > >>>> struct kasan_stack_ring_entry. Memory consumption increases as the > >>>> size of struct kasan_stack_ring_entry increases (this part of the > >>>> memory is allocated by memblock), but since this is configurable, > >>>> it is up to the user to choose. > >>>> > >>>> Signed-off-by: Juntong Deng > >>>> --- > >>>> lib/Kconfig.kasan | 11 +++++++++++ > >>>> mm/kasan/common.c | 5 +++++ > >>>> mm/kasan/kasan.h | 9 +++++++++ > >>>> mm/kasan/report.c | 28 ++++++++++++++++++++++------ > >>>> mm/kasan/report_tags.c | 18 ++++++++++++++++++ > >>>> mm/kasan/tags.c | 15 +++++++++++++++ > >>>> 6 files changed, 80 insertions(+), 6 deletions(-) > >>>> > >>>> diff --git a/lib/Kconfig.kasan b/lib/Kconfig.kasan > >>>> index fdca89c05745..d9611564b339 100644 > >>>> --- a/lib/Kconfig.kasan > >>>> +++ b/lib/Kconfig.kasan > >>>> @@ -207,4 +207,15 @@ config KASAN_MODULE_TEST > >>>> A part of the KASAN test suite that is not integrated with KUnit. > >>>> Incompatible with Hardware Tag-Based KASAN. > >>>> > >>>> +config KASAN_EXTRA_INFO > >>>> + bool "Record and report more information" > >>>> + depends on KASAN > >>>> + help > >>>> + Record and report more information to help us find the cause of > >>>> + the bug. The trade-off is potentially increased memory consumption > >>>> + (to record more information). > >>>> + > >>>> + Currently the CPU number and timestamp are additionally recorded > >>>> + at allocation and free. > >>> > >>> Hi Juntong, > >>> > >>> Thanks for working on this. > >>> > >> > >> > >> Thanks for your reply! > >> > >> > >>> As a KASAN developer I understand what this is doing, but I am trying > >>> to think from a position of a user that does not know details of KASAN > >>> implementation. From this position it may be useful to say somewhere > >>> that information is recorded "per heap allocation". Perhaps something > >>> like: > >>> > >>> "Currently the CPU number and timestamp are additionally recorded for > >>> each heap block at allocation and free time". > >> > >> > >> Yes, I agree, that is a better expression. > >> > >> > >>> > >>> Also it's unclear what the memory consumption increase is. You say > >>> "potentially|, so may it not increase at all? If it increases, by how > >>> much? I obviously want more information, if I can afford it, but I > >>> can't understand if I can or not based on this description. I would > >>> assume that this may be a problem only for small/embedded devices. > >>> Can we provide some ballpark estimation of the memory consumption > >>> increase? And somehow say that's probably not an issue for larger > >>> machines? > >>> > >> > >> > >> How about this expression? > >> > >> Currently, in order to record CPU number and timestamp, the data > >> structure to record allocation and free information will increase > >> by 12 bytes. > >> > >> In Generic KASAN, this affects all allocations less than 32 bytes. > >> In SW_TAGS KASAN and HW_TAGS KASAN, depending on the stack_ring_size > >> boot parameter increases the memory consumption by > >> 12 * stack_ring_size bytes. > > > > Let's go with this version. > > > > > >>>> endif # KASAN > >>>> diff --git a/mm/kasan/common.c b/mm/kasan/common.c > >>>> index 256930da578a..7a81566d9d66 100644 > >>>> --- a/mm/kasan/common.c > >>>> +++ b/mm/kasan/common.c > >>>> @@ -20,6 +20,7 @@ > >>>> #include > >>>> #include > >>>> #include > >>>> +#include > >>>> #include > >>>> #include > >>>> #include > >>>> @@ -50,6 +51,10 @@ void kasan_set_track(struct kasan_track *track, gfp_t flags) > >>>> { > >>>> track->pid = current->pid; > >>>> track->stack = kasan_save_stack(flags, true); > >>>> +#ifdef CONFIG_KASAN_EXTRA_INFO > >>>> + track->cpu = raw_smp_processor_id(); > >>>> + track->ts_nsec = local_clock(); > >>> > >>> What does "local_" mean? Is this clock value comparable across CPUs? > >>> > >> > >> > >> No, local_clock is the local CPU clock and cannot be used for comparison > >> across CPUs, I made a mistake here. > >> > >> I delved into the clock subsystem of the Linux kernel today and I found > >> that we have two choices. > >> > >> - sched_clock(): based on jiffies, high performance, but on some > >> hardware, it will drift between CPUs. > >> > >> - ktime_get_boot_fast_ns(): based on clocksource, highly accurate, > >> can be compared between CPUs, but performance is worse (seqlock). > >> > >> I tested 100000 calls respectively on my laptop, the average of > >> sched_clock() is 17ns and the average of ktime_get_boot_fast_ns() > >> is 25ns. > >> > >> ktime_get_boot_fast_ns() takes about 1.5 times as long as sched_clock(). > >> > >> With Generic KASAN enabled, the average of one memory allocation is > >> 3512ns on my laptop. > >> > >> Personally, I prefer ktime_get_boot_fast_ns() because it is more > >> accurate and the extra time is insignificant for the time required for > >> one memory allocation with Generic KASAN enabled. > >> > >> But maybe using ktime_get_boot_fast_ns() would have a more serious > >> impact on small/embedded devices. > >> > >> Which do you think is the better choice? > > > > I don't have a strong preference. > > > > Re drift of sched_clock(), do you mean unsynchronized RDTSC on > > different cores? I had the impression that RDTSC is synchronized > > across cores on all recent CPUs/systems. > > > > > After discussions with Marco Elver, I now think that perhaps continuing > to use local_clock() is a better option. > > The full discussion with Marco Elver can be found at > https://groups.google.com/g/kasan-dev/c/zmxwYv8wZTg > > Because local_clock() is the clock source used by printk and is the > default clock source for ftrace. > > Using local_clock() as the clock source for KASAN reports makes it > easier to correlate the timestamps of allocations, frees, and errors > with the timestamps of other events in the system. > > This is perhaps more important than being able to accurately compare > across CPUs. > > What do you think? If other similar tools (kfence, ftrace, printk) use local_clock(), then it's reasonable to use it in kasan as well.