linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Ryan Roberts <ryan.roberts@arm.com>,
	Mark Rutland <mark.rutland@arm.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Will Deacon <will@kernel.org>,
	Alexander Shishkin <alexander.shishkin@linux.intel.com>,
	Jiri Olsa <jolsa@kernel.org>, Ian Rogers <irogers@google.com>,
	Adrian Hunter <adrian.hunter@intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Muchun Song <muchun.song@linux.dev>
Cc: linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [RFC PATCH v1 0/4] Reduce cost of ptep_get_lockless on arm64
Date: Wed, 27 Mar 2024 10:34:30 +0100	[thread overview]
Message-ID: <35236bbf-3d9a-40e9-84b5-e10e10295c0c@redhat.com> (raw)
In-Reply-To: <86680856-2532-495b-951a-ea7b2b93872f@arm.com>

On 26.03.24 18:51, Ryan Roberts wrote:
> On 26/03/2024 17:39, David Hildenbrand wrote:
>> On 26.03.24 18:32, Ryan Roberts wrote:
>>> On 26/03/2024 17:04, David Hildenbrand wrote:
>>>>>>>>
>>>>>>>> Likely, we just want to read "the real deal" on both sides of the pte_same()
>>>>>>>> handling.
>>>>>>>
>>>>>>> Sorry I'm not sure I understand? You mean read the full pte including
>>>>>>> access/dirty? That's the same as dropping the patch, right? Of course if
>>>>>>> we do
>>>>>>> that, we still have to keep pte_get_lockless() around for this case. In an
>>>>>>> ideal
>>>>>>> world we would convert everything over to ptep_get_lockless_norecency() and
>>>>>>> delete ptep_get_lockless() to remove the ugliness from arm64.
>>>>>>
>>>>>> Yes, agreed. Patch #3 does not look too crazy and it wouldn't really affect
>>>>>> any
>>>>>> architecture.
>>>>>>
>>>>>> I do wonder if pte_same_norecency() should be defined per architecture and the
>>>>>> default would be pte_same(). So we could avoid the mkold etc on all other
>>>>>> architectures.
>>>>>
>>>>> Wouldn't that break it's semantics? The "norecency" of
>>>>> ptep_get_lockless_norecency() means "recency information in the returned pte
>>>>> may
>>>>> be incorrect". But the "norecency" of pte_same_norecency() means "ignore the
>>>>> access and dirty bits when you do the comparison".
>>>>
>>>> My idea was that ptep_get_lockless_norecency() would return the actual result on
>>>> these architectures. So e.g., on x86, there would be no actual change in
>>>> generated code.
>>>
>>> I think this is a bad plan... You'll end up with subtle differences between
>>> architectures.
>>>
>>>>
>>>> But yes, the documentation of these functions would have to be improved.
>>>>
>>>> Now I wonder if ptep_get_lockless_norecency() should actively clear
>>>> dirty/accessed bits to more easily find any actual issues where the bits still
>>>> matter ...
>>>
>>> I did a version that took that approach. Decided it was not as good as this way
>>> though. Now for the life of me, I can't remember my reasoning.
>>
>> Maybe because there are some code paths that check accessed/dirty without
>> "correctness" implications? For example, if the PTE is already dirty, no need to
>> set it dirty etc?
> 
> I think I decided I was penalizing the architectures that don't care because all
> their ptep_get_norecency() and ptep_get_lockless_norecency() need to explicitly
> clear access/dirty. And I would have needed ptep_get_norecency() from day 1 so
> that I could feed its result into pte_same().

True. With ptep_get_norecency() you're also penalizing other 
architectures. Therefore my original thought about making the behavior 
arch-specific, but the arch has to make sure to get the combination of 
ptep_get_lockless_norecency()+ptep_same_norecency() is right.

So if an arch decide to ignore bits in ptep_get_lockless_norecency(), it 
must make sure to also ignore them in ptep_same_norecency(), and must be 
able to handle access/dirty bit changes differently.

Maybe one could have one variant for "hw-managed access/dirty" vs. "sw 
managed accessed or dirty". Only the former would end up ignoring stuff 
here, the latter would not.

But again, just some random thoughts how this affects other 
architectures and how we could avoid it. The issue I describe in patch 
#3 would be gone if ptep_same_norecency() would just do a ptep_same() 
check on other architectures -- and would make it easier to sell :)

-- 
Cheers,

David / dhildenb



  reply	other threads:[~2024-03-27  9:34 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-15 12:17 Ryan Roberts
2024-02-15 12:17 ` [RFC PATCH v1 1/4] mm: Introduce ptep_get_lockless_norecency() Ryan Roberts
     [not found]   ` <7aefa967-43aa-490b-ae0d-7d1455402e89@redhat.com>
2024-03-26 16:39     ` Ryan Roberts
2024-03-27  9:28       ` David Hildenbrand
2024-03-27  9:57         ` Ryan Roberts
2024-03-27 17:02           ` David Hildenbrand
2024-02-15 12:17 ` [RFC PATCH v1 2/4] mm/gup: Use ptep_get_lockless_norecency() Ryan Roberts
2024-03-26 16:30   ` David Hildenbrand
2024-03-26 16:48     ` Ryan Roberts
2024-02-15 12:17 ` [RFC PATCH v1 3/4] mm/memory: Use ptep_get_lockless_norecency() for orig_pte Ryan Roberts
2024-03-26 17:02   ` David Hildenbrand
2024-03-26 17:27     ` Ryan Roberts
2024-03-26 17:38       ` David Hildenbrand
2024-03-26 17:48         ` Ryan Roberts
2024-03-26 17:58           ` David Hildenbrand
2024-03-27  9:51             ` Ryan Roberts
2024-03-27 17:05               ` David Hildenbrand
2024-02-15 12:17 ` [RFC PATCH v1 4/4] arm64/mm: Override ptep_get_lockless_norecency() Ryan Roberts
2024-03-26 16:35   ` David Hildenbrand
2024-03-26 16:17 ` [RFC PATCH v1 0/4] Reduce cost of ptep_get_lockless on arm64 David Hildenbrand
2024-03-26 16:31   ` Ryan Roberts
     [not found]     ` <de143212-49ce-4c30-8bfa-4c0ff613f107@redhat.com>
2024-03-26 16:53       ` Ryan Roberts
2024-03-26 17:04         ` David Hildenbrand
2024-03-26 17:32           ` Ryan Roberts
2024-03-26 17:39             ` David Hildenbrand
2024-03-26 17:51               ` Ryan Roberts
2024-03-27  9:34                 ` David Hildenbrand [this message]
2024-03-27 10:01                   ` Ryan Roberts
2024-04-03 12:59                   ` Ryan Roberts
2024-04-08  8:36                     ` David Hildenbrand
2024-04-09 16:35                       ` Ryan Roberts
2024-04-10 20:09                         ` David Hildenbrand
2024-04-11  9:45                           ` Ryan Roberts
     [not found]                             ` <70a36403-aefd-4311-b612-84e602465689@redhat.com>
2024-04-15  9:28                               ` Ryan Roberts
     [not found]                                 ` <3e50030d-2289-4470-a727-a293baa21618@redhat.com>
2024-04-15 13:30                                   ` Ryan Roberts
     [not found]                                     ` <969dc6c3-2764-4a35-9fa6-7596832fb2a3@redhat.com>
2024-04-15 14:34                                       ` Ryan Roberts
     [not found]                                         ` <11b1c25b-3e20-4acf-9be5-57b508266c5b@redhat.com>
2024-04-15 15:17                                           ` Ryan Roberts
2024-04-15 15:22                                             ` David Hildenbrand
2024-04-15 15:53                                               ` Ryan Roberts
2024-04-15 16:02                                                 ` David Hildenbrand
2024-04-23 10:15                                                   ` Ryan Roberts
2024-04-23 10:18                                                     ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=35236bbf-3d9a-40e9-84b5-e10e10295c0c@redhat.com \
    --to=david@redhat.com \
    --cc=adrian.hunter@intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=alexander.shishkin@linux.intel.com \
    --cc=catalin.marinas@arm.com \
    --cc=irogers@google.com \
    --cc=jolsa@kernel.org \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mark.rutland@arm.com \
    --cc=muchun.song@linux.dev \
    --cc=ryan.roberts@arm.com \
    --cc=will@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox