From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 43497C433F5 for ; Mon, 3 Oct 2022 22:14:28 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id A09CB6B0074; Mon, 3 Oct 2022 18:14:27 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 9BA3E6B0075; Mon, 3 Oct 2022 18:14:27 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 835276B0078; Mon, 3 Oct 2022 18:14:27 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 6B4766B0074 for ; Mon, 3 Oct 2022 18:14:27 -0400 (EDT) Received: from smtpin07.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 3F96FAB1C4 for ; Mon, 3 Oct 2022 22:14:27 +0000 (UTC) X-FDA: 79981042974.07.2FC5B22 Received: from mga18.intel.com (mga18.intel.com [134.134.136.126]) by imf04.hostedemail.com (Postfix) with ESMTP id BA93440016 for ; Mon, 3 Oct 2022 22:14:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1664835266; x=1696371266; h=message-id:date:mime-version:subject:to:cc:references: from:in-reply-to:content-transfer-encoding; bh=hzc2df+MLVVbc4Zq72Mgkf8FxKiTS251nfNBExvVIT8=; b=R3dYD8OA/FGVFG1N7pveV4s8dQ/2gNcYx6z+LJL5JtAMJLfKcimnjkhe +HLDqy6Y0KFFVmrCntoT7ZJkAXoomO/5JwHY+GMfP/KSgPcxBKrXtrqKL BCHSQuDrVf/ufy1mkYDynnpKOkQCxdYlVMZtOrjTYfFtaSxmLfMh3MRXU 4jhZzgeVNeOYy8oevvCMVjit8/AXpR7KkoSdS7iG3ndL5KNiflzeFLSwW 5aiAGHk5lLSbPEq5P+OmGHltCR5QAivQdCF9ydALTugmUSLN+G1C2iwC6 f5KpLqyYHIAcAEiEJ5DhysHfcYPZAAmQxXb3DjWiV4OWyVmfhXGydCQO2 g==; X-IronPort-AV: E=McAfee;i="6500,9779,10489"; a="285964120" X-IronPort-AV: E=Sophos;i="5.93,366,1654585200"; d="scan'208";a="285964120" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by orsmga106.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 03 Oct 2022 15:14:25 -0700 X-IronPort-AV: E=McAfee;i="6500,9779,10489"; a="625963353" X-IronPort-AV: E=Sophos;i="5.93,366,1654585200"; d="scan'208";a="625963353" Received: from akashred-mobl.amr.corp.intel.com (HELO [10.212.139.217]) ([10.212.139.217]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 03 Oct 2022 15:14:23 -0700 Message-ID: Date: Mon, 3 Oct 2022 15:14:22 -0700 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.11.0 Subject: Re: [PATCH v2 10/39] x86/mm: Introduce _PAGE_COW Content-Language: en-US To: "Edgecombe, Rick P" , "kirill.shutemov@linux.intel.com" Cc: "bsingharora@gmail.com" , "hpa@zytor.com" , "Syromiatnikov, Eugene" , "peterz@infradead.org" , "rdunlap@infradead.org" , "keescook@chromium.org" , "Yu, Yu-cheng" , "dave.hansen@linux.intel.com" , "Eranian, Stephane" , "linux-mm@kvack.org" , "fweimer@redhat.com" , "nadav.amit@gmail.com" , "jannh@google.com" , "dethoma@microsoft.com" , "linux-arch@vger.kernel.org" , "kcc@google.com" , "bp@alien8.de" , "oleg@redhat.com" , "hjl.tools@gmail.com" , "Yang, Weijiang" , "Lutomirski, Andy" , "pavel@ucw.cz" , "arnd@arndb.de" , "Moreira, Joao" , "tglx@linutronix.de" , "mike.kravetz@oracle.com" , "x86@kernel.org" , "linux-doc@vger.kernel.org" , "jamorris@linux.microsoft.com" , "john.allen@amd.com" , "rppt@kernel.org" , "mingo@redhat.com" , "Shankar, Ravi V" , "corbet@lwn.net" , "linux-kernel@vger.kernel.org" , "linux-api@vger.kernel.org" , "gorcunov@gmail.com" References: <20220929222936.14584-1-rick.p.edgecombe@intel.com> <20220929222936.14584-11-rick.p.edgecombe@intel.com> <20221003162613.2yvhvb6hmnae2awz@box.shutemov.name> <9e9f2ce8193ea2e86474ab999ad2a034c49d8b22.camel@intel.com> From: Dave Hansen In-Reply-To: <9e9f2ce8193ea2e86474ab999ad2a034c49d8b22.camel@intel.com> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1664835266; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=q6N9rKPYPYriUU+VwmPOyGBuBHI0Ij2BQ/rkaL7PP0Q=; b=zM0pXGhUu4WZH1Mhe1RTpGMbzW7yNf1SxY15n9g3Aj9ljrh3UIUpWPN9bY6unyycwRIpmm lTDPjGl2M61/7AuG7kMrY2PKB2bQ6Shn26P0hrDSDPsHJHqE4IqEVmOOwMiE/iUsq6U9Vo qZt5t00xvyxAFkj3ouhi3+RWg+7D6XE= ARC-Authentication-Results: i=1; imf04.hostedemail.com; dkim=none ("invalid DKIM record") header.d=intel.com header.s=Intel header.b=R3dYD8OA; spf=pass (imf04.hostedemail.com: domain of dave.hansen@intel.com designates 134.134.136.126 as permitted sender) smtp.mailfrom=dave.hansen@intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1664835266; a=rsa-sha256; cv=none; b=W5zdz93Xn0n30HWuCILuYLR64Y8PZT1mHnnLjwSVC5Nuso72VF3IiKxjdL3RyoDDO96t1t AAGV+m6jBhWAMKyzMPvXAc6QPVYrk10OSGURRysQmZbX4qtrZRVgZdEOEJluQnZewswuDz M02QPaQcbIcHuc52snGErbUQJiMya/w= X-Rspamd-Queue-Id: BA93440016 Authentication-Results: imf04.hostedemail.com; dkim=none ("invalid DKIM record") header.d=intel.com header.s=Intel header.b=R3dYD8OA; spf=pass (imf04.hostedemail.com: domain of dave.hansen@intel.com designates 134.134.136.126 as permitted sender) smtp.mailfrom=dave.hansen@intel.com; dmarc=pass (policy=none) header.from=intel.com X-Rspamd-Server: rspam06 X-Rspam-User: X-Stat-Signature: girshbaxezd6gy4oni3p3x6ij5tad3if X-HE-Tag: 1664835266-594925 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On 10/3/22 14:36, Edgecombe, Rick P wrote: >>> +static inline pte_t pte_clear_cow(pte_t pte) >>> +{ >>> + /* >>> + * _PAGE_COW is unnecessary on !X86_FEATURE_SHSTK kernels. >>> + * See the _PAGE_COW definition for more details. >>> + */ >>> + if (!cpu_feature_enabled(X86_FEATURE_SHSTK)) >>> + return pte; >>> + >>> + /* >>> + * PTE is getting copied-on-write, so it will be dirtied >>> + * if writable, or made shadow stack if shadow stack and >>> + * being copied on access. Set they dirty bit for both >>> + * cases. >>> + */ >>> + pte = pte_set_flags(pte, _PAGE_DIRTY); >>> + return pte_clear_flags(pte, _PAGE_COW); >>> +} >> These X86_FEATURE_SHSTK checks make me uneasy. Maybe use the >> _PAGE_COW >> logic for all machines with 64-bit entries. It will get you much more >> coverage and more universal rules. > Yes, I didn't like them either at first. The reasoning originally was > that _PAGE_COW is a bit more work and it might show up for some > benchmark. > > Looking at this again though, it is just a few more operations on > memory that is already getting touched either way. It must be a very > tiny amount of impact if any. I'm fine removing them. Having just one > set of logic around this would make it easier to reason about. > > Dave, any thoughts on this? The cpu_feature_enabled(X86_FEATURE_SHSTK) checks enable both compile-time and runtime optimization. What makes this even more fun is: +#ifdef CONFIG_X86_SHADOW_STACK +#define _PAGE_COW (_AT(pteval_t, 1) << _PAGE_BIT_COW) +#else +#define _PAGE_COW (_AT(pteval_t, 0)) +#endif which I think means that the pte_clear_flags() goes away if CONFIG_X86_SHADOW_STACK is disabled. So, what Rick posted here ends up doing the following with: | X86_FEATURE_SHSTK=1 | X86_FEATURE_SHSTK=0 ==========+=====================+======================== CONFIG=n | compiled out | compiled out CONFIG=y | set/clear | boot-time patched out If we pull the cpu_feature_enabled() out, I think we end up getting behavior like this: | X86_FEATURE_SHSTK=1 | X86_FEATURE_SHSTK=0 ==========+=====================+======================== CONFIG=n | set _PAGE_DIRTY | set _PAGE_DIRTY CONFIG=y | set/clear | set/clear It ends up adding instruction overhead (set _PAGE_DIRTY) to two cases where it completely compiled out before. It also adds runtime overhead (the "tiny amount of impact" you mentioned) to set/clear where it would have runtime patched out before. None of this is a deal breaker in terms of runtime overhead. But, I do think the benefits of the cpu_feature_enabled() are worth it, even if it's just an optimization. You could move it to the end of the series and we can debate it on its own merits if you want.