From: Marco Elver
Date: Fri, 11 Sep 2020 18:33:52 +0200
Subject: Re: [PATCH RFC 00/10] KFENCE: A low-overhead sampling-based memory safety error detector
To: Dmitry Vyukov
Cc: Vlastimil Babka, Dave Hansen, Alexander Potapenko, Andrew Morton,
 Catalin Marinas, Christoph Lameter, David Rientjes, Joonsoo Kim,
 Mark Rutland, Pekka Enberg, "H. Peter Anvin", "Paul E. McKenney",
 Andrey Konovalov, Andrey Ryabinin, Andy Lutomirski, Borislav Petkov,
 Dave Hansen, Eric Dumazet, Greg Kroah-Hartman, Ingo Molnar, Jann Horn,
 Jonathan Corbet, Kees Cook, Peter Zijlstra, Qian Cai, Thomas Gleixner,
 Will Deacon, "the arch/x86 maintainers", "open list:DOCUMENTATION",
 LKML, kasan-dev, Linux ARM, Linux-MM
References: <20200907134055.2878499-1-elver@google.com>
 <20200908153102.GB61807@elver.google.com>
 <20200908155631.GC61807@elver.google.com>

On Fri, 11 Sep 2020 at 15:33, Marco Elver wrote:
> On Fri, 11 Sep 2020 at 15:10, Dmitry Vyukov wrote:
> > On Fri, Sep 11, 2020 at 2:03 PM Marco Elver wrote:
> > > On Fri, 11 Sep 2020 at 09:36, Dmitry Vyukov wrote:
> [...]
> > > > By "reasonable" I mean if the pool will last long enough to still
> > > > sample something after hours/days? Have you tried any experiments with
> > > > some workload (both short-lived processes and long-lived
> > > > processes/namespaces) capturing state of the pool? It can make sense
> > > > to do to better understand dynamics. I suspect that the rate may need
> > > > to be orders of magnitude lower.
> > >
> > > Yes, the current default sample interval is a lower bound, and is also
> > > a reasonable default for testing.
> > > I expect real deployments to use
> > > much higher sample intervals (lower rate).
> > >
> > > So here's some data (with CONFIG_KFENCE_NUM_OBJECTS=1000, so that
> > > allocated KFENCE objects isn't artificially capped):
> > >
> > > -- With a mostly vanilla config + KFENCE (sample interval 100 ms),
> > > after ~40 min uptime (only boot, then idle) I see ~60 KFENCE objects
> > > (total allocations >600). Those aren't always the same objects, with
> > > roughly ~2 allocations/frees per second.
> > >
> > > -- Then running sysbench I/O benchmark, KFENCE objects allocated peak
> > > at 82. During the benchmark, allocations/frees per second are closer
> > > to 10-15. After the benchmark, the KFENCE objects allocated remain at
> > > 82, and allocations/frees per second fall back to ~2.
> > >
> > > -- For the same system, changing the sample interval to 1 ms (echo 1
> > > > /sys/module/kfence/parameters/sample_interval), and re-running the
> > > benchmark gives me: KFENCE objects allocated peak at exactly 500, with
> > > ~500 allocations/frees per second. After that, allocated KFENCE
> > > objects dropped a little to 496, and allocations/frees per second fell
> > > back to ~2.
> > >
> > > -- The long-lived objects are due to caches, and just running 'echo 1
> > > > /proc/sys/vm/drop_caches' reduced allocated KFENCE objects back to
> > > 45.
> >
> > Interesting. What type of caches is this? If there is some type of
> > cache that caches particularly lots of sampled objects, we could
> > potentially change the cache to release sampled objects eagerly.
>
> The 2 major users of KFENCE objects for that workload are
> 'buffer_head' and 'bio-0'.
>
> If we want to deal with those, I guess there are 2 options:
>
> 1. More complex, but more precise: make the users of them check
> is_kfence_address() and release their buffers earlier.
>
> 2. Simpler, generic solution: make KFENCE stop return allocations for
> non-kmalloc_caches memcaches after more than ~90% of the pool is
> exhausted.
> This assumes that creators of long-lived objects usually
> set up their own memcaches.
>
> I'm currently inclined to go for (2).

Ok, after some offline chat, we determined that (2) would be premature:
we can't really say whether kmalloc should have precedence once we reach
some usage threshold. So for now, let's just leave it as-is and start
with the recommendation to monitor and adjust based on usage, fleet
size, etc.

Thanks,
-- Marco
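
For readers following the thread: option (2) was not adopted, but a toy model may help show what such a threshold policy would mean in practice. The sketch below is plain Python, not kernel code. The names PoolModel, may_sample and CUTOFF_PERCENT are hypothetical and do not appear in the KFENCE patches, and the exact 90% cut-off is an assumption standing in for the "~90%" mentioned above.

```python
from dataclasses import dataclass

# Assumption: the "~90%" cut-off from the thread, taken as an exact value.
CUTOFF_PERCENT = 90


@dataclass
class PoolModel:
    """Toy model of the KFENCE object pool (hypothetical, not kernel code)."""

    num_objects: int    # pool capacity, cf. CONFIG_KFENCE_NUM_OBJECTS
    allocated: int = 0  # objects currently handed out

    def may_sample(self, is_kmalloc_cache: bool) -> bool:
        """Return True if a sampled allocation may be served from the pool."""
        if self.allocated >= self.num_objects:
            return False  # pool exhausted: no cache gets a KFENCE object
        if is_kmalloc_cache:
            return True   # kmalloc caches may use the pool until it is full
        # Dedicated (non-kmalloc) caches are refused once usage exceeds the
        # cut-off, so long-lived objects cannot fill the remaining slots.
        return self.allocated * 100 <= self.num_objects * CUTOFF_PERCENT


# 496 of 1000 objects in use, as after the 1 ms benchmark run quoted above.
pool = PoolModel(num_objects=1000, allocated=496)
print(pool.may_sample(is_kmalloc_cache=False))  # True

# Just past the cut-off: dedicated caches are refused, kmalloc is not.
pool.allocated = 901
print(pool.may_sample(is_kmalloc_cache=False))  # False
print(pool.may_sample(is_kmalloc_cache=True))   # True
```

The model also makes the open question visible: whether kmalloc should be the class that keeps access above the threshold is exactly the policy decision the thread leaves undecided.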