References: <20221007234315.2877365-1-song@kernel.org> <20221007234315.2877365-4-song@kernel.org>
From: Song Liu
Date: Thu, 13 Oct 2022 23:07:49 -0700
Subject: Re: [RFC v2 3/4] modules, x86: use vmalloc_exec for module core
To: Aaron Lu
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, akpm@linux-foundation.org, x86@kernel.org, peterz@infradead.org, hch@lst.de, kernel-team@fb.com, rick.p.edgecombe@intel.com, dave.hansen@intel.com, urezki@gmail.com

On Thu, Oct 13, 2022 at 8:49 PM Aaron Lu wrote:
>
> On Fri, Oct 07, 2022 at 04:43:14PM -0700, Song Liu wrote:
> > This is a prototype that allows modules to share 2MB text pages with other
> > modules and BPF programs.
> >
> > Current version only covers core_layout.
> > ---
> >  arch/x86/Kconfig              |  1 +
> >  arch/x86/kernel/alternative.c | 30 ++++++++++++++++++++++++------
> >  arch/x86/kernel/module.c      |  1 +
> >  kernel/module/main.c          | 23 +++++++++++++----------
> >  kernel/module/strict_rwx.c    |  3 ---
> >  kernel/trace/ftrace.c         |  3 ++-
> >  6 files changed, 41 insertions(+), 20 deletions(-)
> >
> > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > index f9920f1341c8..0b1ea05a1da6 100644
> > --- a/arch/x86/Kconfig
> > +++ b/arch/x86/Kconfig
> > @@ -91,6 +91,7 @@ config X86
> >  	select ARCH_HAS_SET_DIRECT_MAP
> >  	select ARCH_HAS_STRICT_KERNEL_RWX
> >  	select ARCH_HAS_STRICT_MODULE_RWX
> > +	select ARCH_WANTS_MODULES_DATA_IN_VMALLOC	if X86_64
> >  	select ARCH_HAS_SYNC_CORE_BEFORE_USERMODE
> >  	select ARCH_HAS_SYSCALL_WRAPPER
> >  	select ARCH_HAS_UBSAN_SANITIZE_ALL
> > diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
> > index 4f3204364caa..0e47a558c5bc 100644
> > --- a/arch/x86/kernel/alternative.c
> > +++ b/arch/x86/kernel/alternative.c
> > @@ -332,7 +332,13 @@ void __init_or_module noinline apply_alternatives(struct alt_instr *start,
> >
> >  		DUMP_BYTES(insn_buff, insn_buff_sz, "%px: final_insn: ", instr);
> >
> > -		text_poke_early(instr, insn_buff, insn_buff_sz);
> > +		if (system_state < SYSTEM_RUNNING) {
> > +			text_poke_early(instr, insn_buff, insn_buff_sz);
> > +		} else {
> > +			mutex_lock(&text_mutex);
> > +			text_poke(instr, insn_buff, insn_buff_sz);
> > +			mutex_unlock(&text_mutex);
> > +		}
> >
> >  next:
> >  		optimize_nops(instr, a->instrlen);
> > @@ -503,7 +509,13 @@ void __init_or_module noinline apply_retpolines(s32 *start, s32 *end)
> >  			optimize_nops(bytes, len);
> >  			DUMP_BYTES(((u8*)addr), len, "%px: orig: ", addr);
> >  			DUMP_BYTES(((u8*)bytes), len, "%px: repl: ", addr);
> > -			text_poke_early(addr, bytes, len);
> > +			if (system_state == SYSTEM_BOOTING) {
> > +				text_poke_early(addr, bytes, len);
> > +			} else {
> > +				mutex_lock(&text_mutex);
> > +				text_poke(addr, bytes, len);
> > +				mutex_unlock(&text_mutex);
> > +			}
> >  		}
> >  	}
> >  }
> > @@ -568,7 +580,13 @@ void __init_or_module noinline apply_returns(s32 *start, s32 *end)
> >  		if (len == insn.length) {
> >  			DUMP_BYTES(((u8*)addr), len, "%px: orig: ", addr);
> >  			DUMP_BYTES(((u8*)bytes), len, "%px: repl: ", addr);
> > -			text_poke_early(addr, bytes, len);
> > +			if (unlikely(system_state == SYSTEM_BOOTING)) {
> > +				text_poke_early(addr, bytes, len);
> > +			} else {
> > +				mutex_lock(&text_mutex);
> > +				text_poke(addr, bytes, len);
> > +				mutex_unlock(&text_mutex);
> > +			}
> >  		}
> >  	}
> >  }
> > @@ -609,7 +627,7 @@ void __init_or_module noinline apply_ibt_endbr(s32 *start, s32 *end)
> >  		 */
> >  		DUMP_BYTES(((u8*)addr), 4, "%px: orig: ", addr);
> >  		DUMP_BYTES(((u8*)&poison), 4, "%px: repl: ", addr);
> > -		text_poke_early(addr, &poison, 4);
> > +		text_poke(addr, &poison, 4);
> >  	}
> >  }
> >
> > @@ -791,7 +809,7 @@ void __init_or_module apply_paravirt(struct paravirt_patch_site *start,
> >
> >  		/* Pad the rest with nops */
> >  		add_nops(insn_buff + used, p->len - used);
> > -		text_poke_early(p->instr, insn_buff, p->len);
> > +		text_poke(p->instr, insn_buff, p->len);
>
> Got below warning when booting a VM:
>
> [    0.190098] ------------[ cut here ]------------
> [    0.190377] WARNING: CPU: 0 PID: 0 at /home/aaron/linux/src/arch/x86/kernel/alternative.c:1224 text_poke+0x53/0x60
> [    0.191083] Modules linked in:
> [    0.191269] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.0.0-00004-gc49d19177d78 #5
> [    0.191721] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
> [    0.192083] RIP: 0010:text_poke+0x53/0x60
> [    0.192326] Code: c7 c7 20 e7 02 81 5b 5d e9 2a f8 ff ff be ff ff ff ff 48 c7 c7 b0 6d 06 83 48 89 14 24 e8 75 fd bf 00 85 c0 48 8b 14 24 75 c8 <0f> 0b eb c4 66 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 57 41 56
> [    0.193083] RSP: 0000:ffffffff83003d60 EFLAGS: 00010246
> [    0.194083] RAX: 0000000000000000 RBX: ffffffff810295b7 RCX: 0000000000000001
> [    0.194506] RDX: 0000000000000006 RSI: ffffffff828b01c5 RDI: ffffffff8293898e
> [    0.195083] RBP: ffffffff83003d82 R08: ffffffff82206520 R09: 0000000000000001
> [    0.195506] R10: 0000000000000000 R11: 0000000000000001 R12: ffffffff8a9949c0
> [    0.195929] R13: ffffffff8a95f400 R14: 00000000ffffffff R15: 00000000ffffffff
> [    0.196083] FS: 0000000000000000(0000) GS:ffff88842de00000(0000) knlGS:0000000000000000
> [    0.196562] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [    0.197083] CR2: ffff88843ffff000 CR3: 0000000003012001 CR4: 0000000000770ef0
> [    0.197508] PKRU: 55555554
> [    0.197673] Call Trace:
> [    0.197822]
> [    0.198084]  apply_paravirt+0xaf/0x150
> [    0.198313]  ? __might_resched+0x3f/0x280
> [    0.198557]  ? synchronize_rcu+0xe0/0x1c0
> [    0.198799]  ? lock_release+0x230/0x450
> [    0.199030]  ? _raw_spin_unlock_irqrestore+0x30/0x60
> [    0.199083]  ? lockdep_hardirqs_on+0x79/0x100
> [    0.199345]  ? _raw_spin_unlock_irqrestore+0x3b/0x60
> [    0.199643]  ? atomic_notifier_chain_unregister+0x51/0x80
> [    0.200084]  alternative_instructions+0x27/0xfa
> [    0.200357]  check_bugs+0xe08/0xe82
> [    0.200570]  start_kernel+0x692/0x6cc
> [    0.200797]  secondary_startup_64_no_verify+0xe0/0xeb
> [    0.201088]
> [    0.201223] irq event stamp: 13575
> [    0.201428] hardirqs last enabled at (13583): [] __up_console_sem+0x52/0x60
> [    0.202083] hardirqs last disabled at (13592): [] __up_console_sem+0x37/0x60
> [    0.202594] softirqs last enabled at (12762): [] cgroup_idr_alloc.constprop.60+0x59/0x100
> [    0.203083] softirqs last disabled at (12750): [] cgroup_idr_alloc.constprop.60+0x2d/0x100
> [    0.203665] ---[ end trace 0000000000000000 ]---
>
> Looks like it is also necessary to differentiate system_state in
> apply_paravirt() like you did in the other apply_XXX() functions.

Thanks for the report! Somehow I didn't see this in my qemu vm.

Song
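
For illustration only, the system_state check suggested above for apply_paravirt() can be modeled in plain userspace C. This is a rough sketch and not kernel code: the enum is a subset of the kernel's system states, text_mutex is modeled as a simple flag, text_poke_early() and text_poke() are stand-ins that just copy bytes and count calls, and poke_for_state() is a hypothetical helper name. The "< SYSTEM_RUNNING" condition is an assumption borrowed from the apply_alternatives() hunk; the quoted patch uses different conditions in different functions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for a subset of the kernel's system states (assumption). */
enum system_states {
	SYSTEM_BOOTING,
	SYSTEM_SCHEDULING,
	SYSTEM_FREEING_INITMEM,
	SYSTEM_RUNNING,
};

static enum system_states system_state = SYSTEM_BOOTING;

/* text_mutex modeled as a flag; enough for a single-threaded sketch. */
static int text_mutex_held;
static int early_pokes, locked_pokes, unlocked_pokes;

static void text_mutex_lock(void)
{
	assert(!text_mutex_held);
	text_mutex_held = 1;
}

static void text_mutex_unlock(void)
{
	assert(text_mutex_held);
	text_mutex_held = 0;
}

/* Stand-in: patch the bytes directly, as is done early in boot. */
static void text_poke_early(void *addr, const void *opcode, size_t len)
{
	memcpy(addr, opcode, len);
	early_pokes++;
}

/*
 * Stand-in: records whether the caller held the lock. The real
 * function is expected to complain when text_mutex is not held.
 */
static void text_poke(void *addr, const void *opcode, size_t len)
{
	if (text_mutex_held)
		locked_pokes++;
	else
		unlocked_pokes++;
	memcpy(addr, opcode, len);
}

/* Hypothetical helper: pick the poke variant based on system_state. */
static void poke_for_state(void *addr, const void *opcode, size_t len)
{
	if (system_state < SYSTEM_RUNNING) {
		text_poke_early(addr, opcode, len);
	} else {
		text_mutex_lock();
		text_poke(addr, opcode, len);
		text_mutex_unlock();
	}
}
```

With system_state set to SYSTEM_BOOTING, poke_for_state() takes the early path and never touches the lock; once system_state reaches SYSTEM_RUNNING it takes the lock around text_poke(), so unlocked_pokes stays at zero in both cases.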