Date: Tue, 14 Mar 2023 13:26:43 +0800
Subject: Re: [PATCH v6] mm,kfence: decouple kfence from page granularity mapping judgement
To: Kefeng Wang, Marco Elver
References: <1678708637-8669-1-git-send-email-quic_zhenhuah@quicinc.com> <41a98759-1626-5e8f-3b1b-d038ef1925a7@huawei.com>
From: Zhenhua Huang
In-Reply-To: <41a98759-1626-5e8f-3b1b-d038ef1925a7@huawei.com>

On 2023/3/13 22:42, Kefeng Wang wrote:
>
>
> On 2023/3/13 21:00, Marco Elver wrote:
>> On Mon, 13 Mar 2023 at 12:57, Zhenhua Huang wrote:
>>>
>>> Kfence only needs its pool to be mapped as page granularity, if it is
>>> inited early. Previous judgement was a bit over protected.
>>> From [1], Mark
>>> suggested to "just map the KFENCE region a page granularity". So I
>>> decouple it from judgement and do page granularity mapping for kfence
>>> pool only. Need to be noticed that late init of kfence pool still requires
>>> page granularity mapping.
>>>
>>> Page granularity mapping in theory cost more(2M per 1GB) memory on arm64
>>> platform. Like what I've tested on QEMU(emulated 1GB RAM) with
>>> gki_defconfig, also turning off rodata protection:
>>> Before:
>>> [root@liebao ]# cat /proc/meminfo
>>> MemTotal:         999484 kB
>>> After:
>>> [root@liebao ]# cat /proc/meminfo
>>> MemTotal:        1001480 kB
>>>
>>> To implement this, also relocate the kfence pool allocation before the
>>> linear mapping setting up, arm64_kfence_alloc_pool is to allocate phys
>>> addr, __kfence_pool is to be set after linear mapping set up.
>>>
>>> LINK: [1] https://lore.kernel.org/linux-arm-kernel/Y+IsdrvDNILA59UN@FVFF77S0Q05N/
>>> Suggested-by: Mark Rutland
>>> Signed-off-by: Zhenhua Huang
>>> ---
>>>   arch/arm64/mm/mmu.c      | 42 ++++++++++++++++++++++++++++++++++++++++++
>>>   arch/arm64/mm/pageattr.c |  8 ++++++--
>>>   include/linux/kfence.h   | 10 ++++++++++
>>>   mm/kfence/core.c         |  9 +++++++++
>>>   4 files changed, 67 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>>> index 6f9d889..ca5c932 100644
>>> --- a/arch/arm64/mm/mmu.c
>>> +++ b/arch/arm64/mm/mmu.c
>>> @@ -24,6 +24,7 @@
>>>   #include
>>>   #include
>>>   #include
>>> +#include
>>>
>>>   #include
>>>   #include
>>> @@ -525,6 +526,31 @@ static int __init enable_crash_mem_map(char *arg)
>>>   }
>>>   early_param("crashkernel", enable_crash_mem_map);
>>>
>>> +#ifdef CONFIG_KFENCE
>>> +
>>> +static phys_addr_t arm64_kfence_alloc_pool(void)
>>> +{
>>> +       phys_addr_t kfence_pool;
>>> +
>>> +       if (!kfence_sample_interval)
>>> +               return 0;
>>> +
>>> +       kfence_pool = memblock_phys_alloc(KFENCE_POOL_SIZE,
>>> +                                         PAGE_SIZE);
>>> +       if (!kfence_pool)
>>> +               pr_err("failed to allocate kfence pool\n");
>>> +
>>> +       return kfence_pool;
>>> +}
>>> +
>>> +#else
>>> +
>>> +static phys_addr_t arm64_kfence_alloc_pool(void)
>>> +{
>>> +       return 0;
>>> +}
>>> +
>>> +#endif
>>> +
>>>   static void __init map_mem(pgd_t *pgdp)
>>>   {
>>>          static const u64 direct_map_end = _PAGE_END(VA_BITS_MIN);
>>> @@ -532,6 +558,7 @@ static void __init map_mem(pgd_t *pgdp)
>>>          phys_addr_t kernel_end = __pa_symbol(__init_begin);
>>>          phys_addr_t start, end;
>>>          int flags = NO_EXEC_MAPPINGS;
>>> +       phys_addr_t kfence_pool;
>>>          u64 i;
>>>
>>>          /*
>>> @@ -564,6 +591,10 @@ static void __init map_mem(pgd_t *pgdp)
>>>          }
>>>   #endif
>>>
>>> +       kfence_pool = arm64_kfence_alloc_pool();
>>> +       if (kfence_pool)
>>> +               memblock_mark_nomap(kfence_pool, KFENCE_POOL_SIZE);
>>> +
>>>          /* map all the memory banks */
>>>          for_each_mem_range(i, &start, &end) {
>>>                  if (start >= end)
>>> @@ -608,6 +639,17 @@ static void __init map_mem(pgd_t *pgdp)
>>>                  }
>>>          }
>>>   #endif
>>> +
>>> +       /* Kfence pool needs page-level mapping */
>>> +       if (kfence_pool) {
>>> +               __map_memblock(pgdp, kfence_pool,
>>> +                       kfence_pool + KFENCE_POOL_SIZE,
>>> +                       pgprot_tagged(PAGE_KERNEL),
>>> +                       NO_BLOCK_MAPPINGS | NO_CONT_MAPPINGS);
>>> +               memblock_clear_nomap(kfence_pool, KFENCE_POOL_SIZE);
>>> +               /* kfence_pool really mapped now */
>>> +               kfence_set_pool(kfence_pool);
>>> +       }
>>>   }
>>>
>>>   void mark_rodata_ro(void)
>>> diff --git a/arch/arm64/mm/pageattr.c b/arch/arm64/mm/pageattr.c
>>> index 79dd201..25e4a983 100644
>>> --- a/arch/arm64/mm/pageattr.c
>>> +++ b/arch/arm64/mm/pageattr.c
>>> @@ -7,6 +7,7 @@
>>>   #include
>>>   #include
>>>   #include
>>> +#include
>>>
>>>   #include
>>>   #include
>>> @@ -22,12 +23,15 @@ bool rodata_full __ro_after_init = IS_ENABLED(CONFIG_RODATA_FULL_DEFAULT_ENABLED
>>>   bool can_set_direct_map(void)
>>>   {
>>>          /*
>>> -        * rodata_full, DEBUG_PAGEALLOC and KFENCE require linear map to be
>>> +        * rodata_full and DEBUG_PAGEALLOC require linear map to be
>>>           * mapped at page granularity, so that it is possible to
>>>           * protect/unprotect single pages.
>>> +        *
>>> +        * Kfence pool requires page granularity mapping also if we init it
>>> +        * late.
>>>           */
>>>          return (rodata_enabled && rodata_full) || debug_pagealloc_enabled() ||
>>> -               IS_ENABLED(CONFIG_KFENCE);
>>> +           (IS_ENABLED(CONFIG_KFENCE) && !kfence_sample_interval);
>>
>> If you're struggling with kfence_sample_interval not existing if
>> !CONFIG_KFENCE, this is one of the occasions where it'd be perfectly
>> fine to write:
>>
>> bool can_set_direct_map(void) {
>> #ifdef CONFIG_KFENCE
>>      /* ... your comment here ...*/
>>      if (!kfence_sample_interval)
>>          return true;
>> #endif
>>       return .........
>> }
>>
>>>   }
>>>
> The can_set_direct_map() could be called anytime, eg, memory add,
> vmalloc, and this will make different state of can_set_direct_map()
> if kfence is re-enabled, I think that we need a new value to check
> whether or not the early kfence_pool is initialized.

Many thanks, Kefeng and Marco, for your careful review. Agreed:
kfence_sample_interval can be modified in a few ways, so we can't use
it in can_set_direct_map(). To be honest, I previously wanted to always
allocate the kfence pool early, but that seems to break the flexibility
that b33f778bba5e ("kfence: alloc kfence_pool after system startup")
introduced.
Now I prefer to introduce a global variable, early_kfence_pool, to
indicate whether kfence_pool was initialized early. Then
can_set_direct_map() becomes easy and clear to handle: just add
"(IS_ENABLED(CONFIG_KFENCE) && !early_kfence_pool)" to cover the case
where the kfence pool may be initialized later. The name
early_kfence_pool also expresses well what we're doing :)

What do you think? I will send an updated patchset.
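
For illustration, the proposal can be modeled in plain user-space C. This
is only a rough sketch under stated assumptions, not the patch itself:
early_kfence_pool and the "(IS_ENABLED(CONFIG_KFENCE) && !early_kfence_pool)"
term come from the discussion above, while KFENCE_BUILT_IN,
model_early_init() and the plain static variables standing in for
rodata_enabled, rodata_full, debug_pagealloc_enabled() and
kfence_sample_interval are hypothetical stand-ins so that the file
compiles on its own.

```c
#include <stdbool.h>

/* Hypothetical stand-in for IS_ENABLED(CONFIG_KFENCE). */
#define KFENCE_BUILT_IN 1

/* Hypothetical stand-ins for kernel state, not the real definitions. */
static bool rodata_enabled = true;
static bool rodata_full;                            /* false in this model */
static bool debug_pagealloc;                        /* models debug_pagealloc_enabled() */
static unsigned long kfence_sample_interval = 100;  /* may change after boot */

/* Proposed flag: decided once during early boot and never changed later. */
static bool early_kfence_pool;

/*
 * Hypothetical model of the decision taken in map_mem(): the pool counts
 * as "early" only if KFENCE was enabled at boot and the allocation worked.
 */
static void model_early_init(bool alloc_ok)
{
	early_kfence_pool = kfence_sample_interval != 0 && alloc_ok;
}

/*
 * Sketch of the proposed check. It reads the boot-time flag instead of
 * kfence_sample_interval, so later changes to the interval cannot flip
 * the answer.
 */
static bool can_set_direct_map(void)
{
	return (rodata_enabled && rodata_full) || debug_pagealloc ||
	       (KFENCE_BUILT_IN && !early_kfence_pool);
}
```

In this model, once model_early_init() has run, writing a new value to
kfence_sample_interval leaves the result of can_set_direct_map()
unchanged, which is the property Kefeng asked for.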