From: Tianchen Ding
Date: Fri, 4 Mar 2022 10:24:55 +0800
Subject: Re: [RFC PATCH 0/2] Alloc kfence_pool after system startup
To: Marco Elver, Alexander Potapenko
Cc: Dmitry Vyukov, Andrew Morton, kasan-dev, Linux Memory Management List, LKML
Message-ID: <7c14bb40-1e7b-9819-1634-e9e9051726fa@linux.alibaba.com>
References: <20220303031505.28495-1-dtcccc@linux.alibaba.com>

On 2022/3/3 17:30, Marco Elver wrote:

Thanks for your replies.
I do see that setting a large sample_interval means almost disabling KFENCE.
In fact, my point is to provide a more “flexible” way, since some ops
may prefer something like an on/off switch to a 10000ms interval. :-)

> On Thu, 3 Mar 2022 at 10:05, Alexander Potapenko wrote:
>
> I share Alex's concerns.
>
>> On Thu, Mar 3, 2022 at 4:15 AM Tianchen Ding wrote:
>>>
>>> KFENCE aims at production environments, but it does not allow enabling
>>> after system startup because kfence_pool only alloc pages from memblock.
>>> Consider the following production scene:
>>> At first, for performance considerations, production machines do not
>>> enable KFENCE.
>>
>> What are the performance considerations you have in mind? Are you running KFENCE with a very aggressive sampling rate?
>
> Indeed, what is wrong with simply starting up KFENCE with a sample
> interval of 10000? However, I very much doubt that you'll notice any
> performance issues above 500ms.
>
> Do let us know what performance issues you have seen. It may be
> related to an earlier version of KFENCE but has since been fixed (see
> log).
>
>>> However, after running for a while, the kernel is suspected to have
>>> memory errors. (e.g., a sibling machine crashed.)
>>
>> I have doubts regarding this setup. It might be faster (although one can tune KFENCE to have nearly zero performance impact), but is harder to maintain.
>> It will also catch fewer errors than if you just had KFENCE on from the very beginning:
>>   - sibling machines may behave differently, and a certain bug may only occur once - in that case the secondary instances won't notice it, even with KFENCE;
>>   - KFENCE also catches non-lethal corruptions (e.g. OOB reads), which may stay under radar for a very long time.
>>
>>>
>>> So other production machines need to enable KFENCE, but it's hard for
>>> them to reboot.
>>>
>>> The 1st patch allows re-enabling KFENCE if the pool is already
>>> allocated from memblock.
>
> Patch 1/2 might be ok by itself, but I still don't see the point
> because you should just leave KFENCE enabled. There should be no
> reason to have to turn it off. If anything, you can increase the
> sample interval to something very large if needed.
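
For illustration only, the sample-interval tuning discussed above could be
scripted roughly as follows. This is a minimal sketch, not part of the patch
series: it assumes the parameter is exposed at
/sys/module/kfence/parameters/sample_interval (the path given in the KFENCE
documentation, not in this thread), and the helper names are hypothetical.
Whether a particular kernel accepts a given runtime change is not checked here.

```python
from pathlib import Path

# Assumed sysfs location of the KFENCE sample interval (in milliseconds).
# Taken from the KFENCE documentation, not from this thread.
KFENCE_SAMPLE_INTERVAL = Path("/sys/module/kfence/parameters/sample_interval")


def read_sample_interval(param: Path = KFENCE_SAMPLE_INTERVAL) -> int:
    """Return the current sample interval in milliseconds."""
    return int(param.read_text().strip())


def write_sample_interval(interval_ms: int,
                          param: Path = KFENCE_SAMPLE_INTERVAL) -> None:
    """Write a new sample interval in milliseconds.

    On a real system this needs root, and the kernel may reject the write.
    """
    if interval_ms < 0:
        raise ValueError("sample interval must be non-negative")
    param.write_text(f"{interval_ms}\n")
```

On a kernel built with KFENCE, write_sample_interval(10000) would correspond
to the 10000ms interval mentioned above; the documentation describes a value
of 0 as disabling KFENCE.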