From: Song Liu
To: Peter Zijlstra
CC: Song Liu, bpf, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    linux-modules@vger.kernel.org, mcgrof@kernel.org, rostedt@goodmis.org,
    tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, mhiramat@kernel.org,
    naveen.n.rao@linux.ibm.com, davem@davemloft.net,
    anil.s.keshavamurthy@intel.com, keescook@chromium.org, hch@infradead.org,
    dave@stgolabs.net, daniel@iogearbox.net, Kernel Team, x86@kernel.org,
    dave.hansen@linux.intel.com, rick.p.edgecombe@intel.com,
    akpm@linux-foundation.org
Subject: Re: [PATCH bpf-next 1/3] mm/vmalloc: introduce vmalloc_exec which allocates RO+X memory
Date: Fri, 5 Aug 2022 05:29:51 +0000
References: <20220713071846.3286727-1-song@kernel.org> <20220713071846.3286727-2-song@kernel.org>

Hi Peter,

> On Jul 13, 2022, at 3:20 AM, Peter Zijlstra wrote:
>
[...]
>
> So how about instead we separate them? Then much of the problem goes
> away, you don't need to track these 2M chunks at all.
>
> Start by adding VM_TOPDOWN_VMAP, which instead of returning the lowest
> (leftmost) vmap_area that fits, picks the highest (rightmost).
>
> Then add module_alloc_data() that uses VM_TOPDOWN_VMAP and make
> ARCH_WANTS_MODULE_DATA_IN_VMALLOC use that instead of vmalloc (with a
> weak function doing the vmalloc).
>
> This gets you bottom of module range is RO+X only, top is shattered
> between different !X types.
>
> Then track the boundary between X and !X and ensure module_alloc_data()
> and module_alloc() never cross over and stay strictly separated.
>
> Then change all module_alloc() users to expect RO+X memory, instead of
> RW.
>
> Then make sure any extension of the X range is 2M aligned.
>
> And presto, *everybody* always uses 2M TLB for text, modules, bpf,
> ftrace, the lot and nobody is tracking chunks.
>
> Maybe migration can be eased by instead providing module_alloc_text()
> and ARCH_WANTS_MODULE_ALLOC_TEXT.

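To make the layout in the quoted proposal concrete, here is a rough userspace model of it. This is only a sketch, not kernel code: struct mod_range, alloc_text() and alloc_data() are made-up names, and the real thing would go through the vmap_area tree rather than a pair of bump pointers. It shows text handed out bottom-up with the X region only ever extended in 2M steps, data handed out top-down, and the two sides never crossing the tracked boundary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SZ_2M            (2UL << 20)
/* Alignment helpers; 'a' must be a power of two. */
#define ALIGN_UP(x, a)   (((x) + (a) - 1) & ~((uintptr_t)(a) - 1))
#define ALIGN_DOWN(x, a) ((x) & ~((uintptr_t)(a) - 1))

/* Hypothetical model of the module address range [start, end). */
struct mod_range {
	uintptr_t start;     /* lowest address of the range */
	uintptr_t end;       /* one past the highest address */
	uintptr_t text_used; /* next free byte for RO+X text (grows up) */
	uintptr_t text_top;  /* X / !X boundary, always 2M aligned */
	uintptr_t data_low;  /* lowest !X address handed out (grows down) */
};

static void mod_range_init(struct mod_range *r, uintptr_t start, uintptr_t end)
{
	/* 0 is used as the failure value below, so start must be nonzero. */
	assert(start != 0 && start < end);
	assert(start % SZ_2M == 0 && end % SZ_2M == 0);
	r->start = start;
	r->end = end;
	r->text_used = start;
	r->text_top = start;
	r->data_low = end;
}

/* Bottom-up text allocation. Returns 0 on failure. */
static uintptr_t alloc_text(struct mod_range *r, size_t size, size_t align)
{
	uintptr_t addr = ALIGN_UP(r->text_used, align);
	uintptr_t new_top;

	if (addr > r->end || size > r->end - addr)
		return 0;

	/* Any extension of the X region is rounded up to 2M. */
	new_top = ALIGN_UP(addr + size, SZ_2M);
	if (new_top < r->text_top)
		new_top = r->text_top;
	if (new_top > r->data_low)
		return 0; /* would cross into !X territory */

	r->text_used = addr + size;
	r->text_top = new_top;
	return addr;
}

/* Top-down data allocation. Returns 0 on failure. */
static uintptr_t alloc_data(struct mod_range *r, size_t size, size_t align)
{
	uintptr_t addr;

	if (size > r->data_low - r->text_top)
		return 0;
	addr = ALIGN_DOWN(r->data_low - size, align);
	if (addr < r->text_top)
		return 0; /* would cross into the X region */

	r->data_low = addr;
	return addr;
}
```

In this model, small text allocations share the 2M-aligned X region without anyone tracking individual chunks; the only shared state is the boundary itself.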
I finally got some time to look into the code. A few questions:

1. AFAICT, the vmap_area tree only works with PAGE_SIZE-aligned addresses.
   For the sharing to be more efficient, I think we need to go with a
   smaller granularity. Will this work? Shall we pick a smaller
   granularity, say 64 bytes? Or shall we go all the way to 1 byte?

2. I think we will need multiple vmap_areas sharing the same vm_struct.
   Do we need to add a refcount to vm_struct?

Thanks,
Song
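For illustration only, the two questions above can be modelled in userspace as below. None of this is the kernel's vmap_area or vm_struct code: struct shared_chunk, chunk_alloc() and chunk_free() are hypothetical names, the 64-byte granule is just the example value from question 1, and a single 4096-byte chunk stands in for the shared backing area. The sketch shows several sub-page allocations sharing one backing object, with a refcount saying when the backing can be released.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GRANULE  64u                   /* assumed sub-page granularity */
#define CHUNK_SZ 4096u                 /* one page, for illustration */
#define NSLOTS   (CHUNK_SZ / GRANULE)  /* 64 slots, fits one uint64_t */

/* Hypothetical stand-in for a backing area shared by many allocations. */
struct shared_chunk {
	uint64_t used;     /* bit i set => slot i is allocated */
	unsigned refcount; /* number of live sub-allocations */
};

/* Bitmask with the low n bits set, 1 <= n <= NSLOTS. */
static uint64_t slot_mask(unsigned n)
{
	return n == NSLOTS ? ~0ULL : (1ULL << n) - 1;
}

/* First-fit allocation; returns a byte offset into the chunk or -1. */
static int chunk_alloc(struct shared_chunk *c, size_t size)
{
	unsigned n, i;

	assert(size > 0);
	if (size > CHUNK_SZ)
		return -1;

	n = (unsigned)((size + GRANULE - 1) / GRANULE);
	for (i = 0; i + n <= NSLOTS; i++) {
		uint64_t mask = slot_mask(n) << i;

		if (!(c->used & mask)) {
			c->used |= mask;
			c->refcount++; /* one more user of the shared backing */
			return (int)(i * GRANULE);
		}
	}
	return -1;
}

/* Frees one allocation; returns true once the last user is gone. */
static bool chunk_free(struct shared_chunk *c, int off, size_t size)
{
	unsigned n = (unsigned)((size + GRANULE - 1) / GRANULE);
	uint64_t mask = slot_mask(n) << ((unsigned)off / GRANULE);

	assert((c->used & mask) == mask && c->refcount > 0);
	c->used &= ~mask;
	return --c->refcount == 0;
}
```

With a 1-byte granule the same scheme would need a different free-space structure than a per-slot bitmap, which is part of the trade-off behind question 1.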