Subject: Re: [RFC PATCH v2 16/27] mm: Modify can_follow_write_pte/pmd for shadow stack
From: Dave Hansen
Date: Wed, 18 Jul 2018 17:06:33 -0700
To: Yu-cheng Yu, x86@kernel.org, "H. Peter Anvin", Thomas Gleixner,
    Ingo Molnar, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
    linux-mm@kvack.org, linux-arch@vger.kernel.org,
    linux-api@vger.kernel.org, Arnd Bergmann, Andy Lutomirski,
    Balbir Singh, Cyrill Gorcunov, Florian Weimer, "H.J. Lu", Jann Horn,
    Jonathan Corbet, Kees Cook, Mike Kravetz, Nadav Amit, Oleg Nesterov,
    Pavel Machek, Peter Zijlstra, "Ravi V. Shankar", Vedvyas Shanbhogue
In-Reply-To: <1531955428.12385.30.camel@intel.com>
References: <20180710222639.8241-1-yu-cheng.yu@intel.com>
    <20180710222639.8241-17-yu-cheng.yu@intel.com>
    <1531328731.15351.3.camel@intel.com>
    <45a85b01-e005-8cb6-af96-b23ce9b5fca7@linux.intel.com>
    <1531868610.3541.21.camel@intel.com>
    <1531944882.10738.1.camel@intel.com>
    <3f158401-f0b6-7bf7-48ab-2958354b28ad@linux.intel.com>
    <1531955428.12385.30.camel@intel.com>

>>> -static inline bool can_follow_write_pte(pte_t pte, unsigned int flags)
>>> +static inline bool can_follow_write(pte_t pte, unsigned int flags,
>>> +				    struct vm_area_struct *vma)
>>>  {
>>> -	return pte_write(pte) ||
>>> -		((flags & FOLL_FORCE) && (flags & FOLL_COW) && pte_dirty(pte));
>>> +	if (!is_shstk_mapping(vma->vm_flags)) {
>>> +		if (pte_write(pte))
>>> +			return true;
>> Let me see if I can say this another way.
>>
>> The bigger issue is that these patches change the semantics of
>> pte_write().  Before these patches, it meant that you *MUST* have this
>> bit set to write to the page controlled by the PTE.  Now, it means: you
>> can write if this bit is set *OR* the shadowstack bit combination is set.
>
> Here, we only figure out (1) if the page is pointed by a writable PTE; or
> (2) if the page is pointed by a RO PTE (data or SHSTK) and it has been
> copied and it still exists.  We are not trying to determine if the
> SHSTK PTE is writable (we know it is not).

Please think about the big picture.  I'm not just talking about this
patch, but about every use of pte_write() in the kernel.

>> That's the fundamental problem.  We need some code in the kernel that
>> logically represents the concept of "is this PTE a shadowstack PTE or a
>> PTE with the write bit set", and we will call that pte_write(), or maybe
>> pte_writable().
>>
>> You *have* to somehow rectify this situation.  We can absolutely not
>> leave pte_write() in its current, ambiguous state where it has no real
>> meaning or where it is used to mean _both_ things depending on context.
>
> True, the processor can always write to a page through a shadow stack
> PTE, but it must do that with a CALL instruction.  Can we define a
> write operation as: MOV r1, *(r2).  Then we don't have any doubt on
> pte_write() any more.

No, we can't just move the target.
:)

You can define it this way, but then you also need to go to every spot
in the kernel that calls pte_write() (and _PAGE_RW, in fact) and audit
it to ensure it means "mov ..." and not push.
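
To make the two meanings concrete, here is a minimal userspace sketch of
what separate predicates could look like.  Everything in it is
hypothetical: the mock_* names, the mock_pte_t type and the flag macros
are stand-ins, not kernel API.  It also assumes the "shadowstack bit
combination" is write bit clear with dirty bit set, which this message
does not spell out, and it ignores the fact that an ordinary read-only
page can be dirty too.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for pte_t and the PTE flag bits.  The bit
 * positions mirror x86 (write is bit 1, dirty is bit 6), but this is a
 * userspace model, not kernel code. */
typedef struct { uint64_t val; } mock_pte_t;

#define MOCK_PAGE_RW    (UINT64_C(1) << 1)
#define MOCK_PAGE_DIRTY (UINT64_C(1) << 6)

/* Strict meaning: the write bit itself is set, so "mov" can store. */
static inline bool mock_pte_write(mock_pte_t pte)
{
	return (pte.val & MOCK_PAGE_RW) != 0;
}

/* Assumed shadow stack encoding: write bit clear, dirty bit set. */
static inline bool mock_pte_shstk(mock_pte_t pte)
{
	return (pte.val & (MOCK_PAGE_RW | MOCK_PAGE_DIRTY)) == MOCK_PAGE_DIRTY;
}

/* Broad meaning: the CPU can modify the page through this PTE, either
 * with an ordinary store or with a shadow stack operation. */
static inline bool mock_pte_writable(mock_pte_t pte)
{
	return mock_pte_write(pte) || mock_pte_shstk(pte);
}

/* Example: a shadow stack PTE is not "write" but is "writable". */
static inline void mock_example(void)
{
	mock_pte_t shstk = { MOCK_PAGE_DIRTY };

	assert(!mock_pte_write(shstk));
	assert(mock_pte_writable(shstk));
}
```

With two names, each call site has to say which question it is asking,
which is the audit described above.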