From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-pf1-f198.google.com (mail-pf1-f198.google.com [209.85.210.198])
  by kanga.kvack.org (Postfix) with ESMTP id D51AF6B57C1
  for ; Fri, 31 Aug 2018 11:58:41 -0400 (EDT)
Received: by mail-pf1-f198.google.com with SMTP id j15-v6so7052354pfi.10
  for ; Fri, 31 Aug 2018 08:58:41 -0700 (PDT)
Received: from mga04.intel.com (mga04.intel.com. [192.55.52.120])
  by mx.google.com with ESMTPS id l3-v6si10060949pga.137.2018.08.31.08.58.40
  for (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
  Fri, 31 Aug 2018 08:58:40 -0700 (PDT)
Subject: Re: [RFC PATCH v3 12/24] x86/mm: Modify ptep_set_wrprotect and
  pmdp_set_wrprotect for _PAGE_DIRTY_SW
References: <1535649960.26689.15.camel@intel.com>
  <33d45a12-513c-eba2-a2de-3d6b630e928e@linux.intel.com>
  <1535651666.27823.6.camel@intel.com>
  <1535660494.28258.36.camel@intel.com>
  <1535662366.28781.6.camel@intel.com>
  <20180831095300.GF24124@hirez.programming.kicks-ass.net>
  <1535726032.32537.0.camel@intel.com>
  <1535730524.501.13.camel@intel.com>
From: Dave Hansen
Message-ID: <6d31bd30-6d5b-bbde-1e97-1d8255eff76d@linux.intel.com>
Date: Fri, 31 Aug 2018 08:58:39 -0700
MIME-Version: 1.0
In-Reply-To: <1535730524.501.13.camel@intel.com>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 8bit
Sender: owner-linux-mm@kvack.org
List-ID:
To: Yu-cheng Yu , Peter Zijlstra , Jann Horn
Cc: the arch/x86 maintainers , "H . Peter Anvin" , Thomas Gleixner ,
  Ingo Molnar , kernel list , linux-doc@vger.kernel.org, Linux-MM ,
  linux-arch , Linux API , Arnd Bergmann , Andy Lutomirski ,
  Balbir Singh , Cyrill Gorcunov , Florian Weimer , hjl.tools@gmail.com,
  Jonathan Corbet , keescook@chromiun.org, Mike Kravetz , Nadav Amit ,
  Oleg Nesterov , Pavel Machek , ravi.v.shankar@intel.com,
  vedvyas.shanbhogue@intel.com

On 08/31/2018 08:48 AM, Yu-cheng Yu wrote:
> To trigger a race in ptep_set_wrprotect(), we need to fork from one of
> three pthread siblings.
>
> Or do we measure only how much this affects fork?
> If there is no racing, the effect should be minimal.

We don't need a race.

I think the cmpxchg will be slower, even without a race, than the code
that was there before.  The cmpxchg is a simple, straightforward
solution, but we're putting it in place of a plain memory write, which
is suboptimal.

But, before I nitpick the performance, I wanted to see if we could even
detect a delta.
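
For anyone who wants a rough feel for the uncontended cost before
measuring fork itself, here is a minimal userspace sketch. It is not the
kernel code and not the patch under review: FAKE_PAGE_RW,
wrprotect_plain(), wrprotect_cmpxchg() and time_loop() are hypothetical
names made up for illustration, and the "plain" variant is assumed to be
a simple load followed by a store. It only compares clearing one bit in
a 64-bit word with a plain store against doing the same with a
compare-exchange loop, single-threaded, so there is no race involved.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for a PTE write-permission bit (not the kernel's define). */
#define FAKE_PAGE_RW ((uint64_t)1 << 1)

/* Assumed "plain" variant: load, clear the bit, store. No atomic RMW. */
static void wrprotect_plain(_Atomic uint64_t *pte)
{
    uint64_t old = atomic_load_explicit(pte, memory_order_relaxed);
    atomic_store_explicit(pte, old & ~FAKE_PAGE_RW, memory_order_relaxed);
}

/* Compare-exchange variant: retry until the swap succeeds. */
static void wrprotect_cmpxchg(_Atomic uint64_t *pte)
{
    uint64_t old = atomic_load_explicit(pte, memory_order_relaxed);
    uint64_t val;

    do {
        val = old & ~FAKE_PAGE_RW;
        /* On failure, 'old' is refreshed with the current value. */
    } while (!atomic_compare_exchange_weak(pte, &old, val));
}

/* Average nanoseconds per iteration of "set the bit, then clear it with fn". */
static double time_loop(void (*fn)(_Atomic uint64_t *), long iters)
{
    _Atomic uint64_t pte;
    struct timespec t0, t1;

    atomic_init(&pte, 0);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (long i = 0; i < iters; i++) {
        atomic_store_explicit(&pte, FAKE_PAGE_RW | 0x1000, memory_order_relaxed);
        fn(&pte);
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);

    return ((double)(t1.tv_sec - t0.tv_sec) * 1e9 +
            (double)(t1.tv_nsec - t0.tv_nsec)) / (double)iters;
}

/* Print both averages so the delta (if any) can be eyeballed. */
static void report_delta(long iters)
{
    printf("plain:   %.2f ns/op\n", time_loop(wrprotect_plain, iters));
    printf("cmpxchg: %.2f ns/op\n", time_loop(wrprotect_cmpxchg, iters));
}
```

Calling report_delta() with a large iteration count from main() prints
the two averages. Numbers from a loop like this say nothing about fork
as a whole; they only hint at whether the per-PTE difference is large
enough to be worth looking for in a real fork benchmark.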