From: Song Liu
To: Peter Zijlstra
CC: Song Liu, bpf, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    linux-modules@vger.kernel.org, mcgrof@kernel.org, rostedt@goodmis.org,
    tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, mhiramat@kernel.org,
    naveen.n.rao@linux.ibm.com, davem@davemloft.net,
    anil.s.keshavamurthy@intel.com, keescook@chromium.org, hch@infradead.org,
    dave@stgolabs.net, daniel@iogearbox.net, Kernel Team, x86@kernel.org,
    dave.hansen@linux.intel.com, rick.p.edgecombe@intel.com,
    akpm@linux-foundation.org
Subject: Re: [PATCH bpf-next 1/3] mm/vmalloc: introduce vmalloc_exec which allocates RO+X memory
Date: Fri, 5 Aug 2022 05:29:51 +0000
Message-ID: <14D6DBA0-0572-44FB-A566-464B1FF541E0@fb.com>
References: <20220713071846.3286727-1-song@kernel.org> <20220713071846.3286727-2-song@kernel.org>

Hi Peter,

> On Jul 13, 2022, at 3:20 AM, Peter Zijlstra wrote:
>
[...]
>
> So how about instead we separate them? Then much of the problem goes
> away, you don't need to track these 2M chunks at all.
>
> Start by adding VM_TOPDOWN_VMAP, which instead of returning the lowest
> (leftmost) vmap_area that fits, picks the highest (rightmost).
>
> Then add module_alloc_data() that uses VM_TOPDOWN_VMAP and make
> ARCH_WANTS_MODULE_DATA_IN_VMALLOC use that instead of vmalloc (with a
> weak function doing the vmalloc).
>
> This gets you bottom of module range is RO+X only, top is shattered
> between different !X types.
>
> Then track the boundary between X and !X and ensure module_alloc_data()
> and module_alloc() never cross over and stay strictly separated.
>
> Then change all module_alloc() users to expect RO+X memory, instead of
> RW.
>
> Then make sure any extension of the X range is 2M aligned.
>
> And presto, *everybody* always uses 2M TLB for text, modules, bpf,
> ftrace, the lot and nobody is tracking chunks.
>
> Maybe migration can be eased by instead providing module_alloc_text()
> and ARCH_WANTS_MODULE_ALLOC_TEXT.

I finally got some time to look into the code.

A few questions:

1. AFAICT, the vmap_area tree only works with PAGE_SIZE-aligned addresses.
   For the sharing to be more efficient, I think we need to go with a
   smaller granularity. Will this work? Shall we pick a smaller
   granularity, say 64 bytes? Or shall we go all the way to 1 byte?

2. I think we will need multiple vmap_area's sharing the same vm_struct.
   Do we need to add a refcount to vm_struct?

Thanks,
Song
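
For reference while discussing the questions above, here is a rough userspace sketch of the layout as I understand the proposal: text is handed out bottom-up, data top-down, and the X/!X boundary only moves in 2M steps. This is not kernel code and not how vmalloc works today. struct mod_range, alloc_text(), alloc_data() and the granule constants are all hypothetical names made up for illustration, and the 64-byte text granule is just the value floated in question 1.

```c
#include <assert.h>
#include <stdint.h>

/* All names and constants below are hypothetical, for illustration only. */
#define TEXT_GRANULE ((uintptr_t)64)       /* sub-page granule from question 1 */
#define DATA_GRANULE ((uintptr_t)4096)     /* assumed PAGE_SIZE */
#define X_ALIGN      ((uintptr_t)2 << 20)  /* X range grows in 2M steps */

struct mod_range {
	uintptr_t start;     /* bottom of the module range (2M aligned) */
	uintptr_t end;       /* top of the module range (2M aligned) */
	uintptr_t text_next; /* next free byte for RO+X allocations */
	uintptr_t x_limit;   /* current X/!X boundary, always 2M aligned */
	uintptr_t data_low;  /* lowest address handed out for data so far */
};

/* Round v up to a power-of-two alignment. */
static uintptr_t round_up_pow2(uintptr_t v, uintptr_t align)
{
	return (v + align - 1) & ~(align - 1);
}

static void mod_range_init(struct mod_range *r, uintptr_t start, uintptr_t end)
{
	assert(start != 0 && start < end);
	assert(start % X_ALIGN == 0 && end % X_ALIGN == 0);
	r->start = start;
	r->end = end;
	r->text_next = start;
	r->x_limit = start;  /* no executable memory reserved yet */
	r->data_low = end;
}

/* Bottom-up text allocation; returns 0 on failure. */
static uintptr_t alloc_text(struct mod_range *r, uintptr_t size)
{
	uintptr_t addr, new_limit;

	if (size == 0 || size > r->data_low - r->text_next)
		return 0;
	size = round_up_pow2(size, TEXT_GRANULE);
	if (size > r->data_low - r->text_next)
		return 0;

	/* Extend the X range in whole 2M steps, never into data. */
	if (r->text_next + size > r->x_limit) {
		new_limit = round_up_pow2(r->text_next + size, X_ALIGN);
		if (new_limit > r->data_low)
			return 0;
		r->x_limit = new_limit;
	}

	addr = r->text_next;
	r->text_next += size;
	return addr;
}

/* Top-down data allocation; returns 0 on failure. */
static uintptr_t alloc_data(struct mod_range *r, uintptr_t size)
{
	if (size == 0 || size > r->data_low - r->x_limit)
		return 0;
	size = round_up_pow2(size, DATA_GRANULE);
	if (size > r->data_low - r->x_limit)
		return 0;  /* would cross the X/!X boundary */

	r->data_low -= size;
	return r->data_low;
}
```

With this model, two small text allocations share the first 2M chunk, and data can never land below x_limit. The sketch never frees anything, so it says nothing about how freed sub-page text regions would be tracked, which is what questions 1 and 2 are really about.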