linux-mm.kvack.org archive mirror
From: David Hildenbrand <david@redhat.com>
To: Barry Song <21cnbao@gmail.com>
Cc: Lance Yang <ioworker0@gmail.com>, Linux-MM <linux-mm@kvack.org>,
	Ryan Roberts <ryan.roberts@arm.com>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: All MADV_FREE mTHPs are fully subjected to deferred_split_folio()
Date: Mon, 30 Dec 2024 20:32:37 +0100
Message-ID: <a1a8c02a-c9cf-4d75-9420-b329660d06ba@redhat.com>
In-Reply-To: <CAGsJ_4z94HqGt8mVMYABnMQ5jOhNyztmqB5bOqqE6MSNx6vgAA@mail.gmail.com>

>> goto discard;
>>
> 
> I agree that this is necessary, but I'm not sure it addresses my
> concerns. MADV_FREE'ed mTHPs are still being added to `deferred_split`,
> and this does not resolve the issue of them being partially unmapped
> though it is definitely better than the existing code, at least folios are
> not moved back to swap-backed.
> 
> On the other hand, users might rely on the `deferred_split` counter to
> assess how aggressively userspace is performing address/size unaligned
> operations like MADV_DONTNEED or unmapped behavior. However, our
> debugging shows
> that the majority of `deferred_split` counter increments result from
> aligned MADV_FREE operations. This diminishes the counter's usefulness
> in reflecting unaligned userspace behavior.

Optimizing that is certainly something to look into, but the bigger 
issue you describe arises from bad handling of speculative references.

Just imagine you indeed have a partially-mapped anon folio and the 
remaining pages are MADV_FREE'ed. The problem with the speculative 
reference would still apply.
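
To make the speculative-reference problem concrete, here is a minimal 
userspace model, not kernel code. It assumes the discard decision for a 
lazily freed folio compares the reference count against the mapcount plus 
one reference held by the caller that isolated the folio. All names 
(model_folio_state, model_try_discard_lazyfree, MODEL_DISCARD, MODEL_ABORT) 
are hypothetical and only illustrate the idea.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified view of the folio state the check looks at. */
struct model_folio_state {
	int ref_count;	/* all references, including transient ones */
	int map_count;	/* number of page table mappings */
	bool dirty;	/* written to after MADV_FREE */
};

enum model_unmap_result {
	MODEL_DISCARD,	/* folio can be dropped without swap-out */
	MODEL_ABORT,	/* unmap is aborted, folio stays mapped for now */
};

/*
 * Assumed rule: the only expected references are one per mapping plus
 * one from isolation. Any extra reference, even a short-lived
 * speculative one, makes the check fail although nothing was written.
 */
static enum model_unmap_result
model_try_discard_lazyfree(const struct model_folio_state *f)
{
	assert(f->map_count >= 0 && f->ref_count >= f->map_count);

	if (f->ref_count == 1 + f->map_count && !f->dirty)
		return MODEL_DISCARD;
	return MODEL_ABORT;
}
```

In this model a clean folio with map_count 1 and ref_count 2 is discarded, 
while the same folio with one transient extra reference (ref_count 3) is 
not, regardless of whether it is fully or partially mapped.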

> 
> If possible, I am still looking for some approach to entirely avoid
> adding the folio to deferred_split and partially being unmapped.
> 
> Could the concept be something like this?

Very likely it's wrong, because you really have to ensure that the 
folio range is mapped here.

Proper folio PTE batching should be applied here -- folio_pte_batch() etc.

That can please the counters in many, but not all, cases. Again, maybe 
the deferred-split handling should be done differently, and not 
synchronously from rmap code.
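
The PTE batching idea mentioned above can be sketched as a small userspace 
model. This is not the kernel's folio_pte_batch() and does not reproduce 
its signature; the types and helpers (model_pte, model_folio, 
model_pte_batch, model_range_fully_mapped) are hypothetical. The sketch 
assumes a folio is a contiguous PFN range and that a batch is a run of 
present entries mapping consecutive PFNs inside that range.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a page table entry. */
struct model_pte {
	bool present;		/* entry maps a page */
	unsigned long pfn;	/* page frame number it maps */
};

/* Hypothetical large folio: PFNs [first_pfn, first_pfn + nr_pages). */
struct model_folio {
	unsigned long first_pfn;
	unsigned long nr_pages;
};

/*
 * Count how many consecutive entries, starting at ptes[0], are present
 * and map consecutive PFNs that all lie inside the folio.
 */
static size_t model_pte_batch(const struct model_folio *folio,
			      const struct model_pte *ptes, size_t max_nr)
{
	unsigned long folio_end = folio->first_pfn + folio->nr_pages;
	size_t nr = 0;

	while (nr < max_nr) {
		const struct model_pte *pte = &ptes[nr];

		if (!pte->present)
			break;
		if (pte->pfn < folio->first_pfn || pte->pfn >= folio_end)
			break;
		if (nr > 0 && pte->pfn != ptes[0].pfn + nr)
			break;
		nr++;
	}
	return nr;
}

/* True only if one batch starting at ptes[0] covers the whole folio. */
static bool model_range_fully_mapped(const struct model_folio *folio,
				     const struct model_pte *ptes,
				     size_t max_nr)
{
	assert(folio->nr_pages > 0);

	return max_nr >= folio->nr_pages &&
	       ptes[0].present && ptes[0].pfn == folio->first_pfn &&
	       model_pte_batch(folio, ptes, folio->nr_pages) == folio->nr_pages;
}
```

With a check like model_range_fully_mapped(), a caller could treat an 
aligned, fully mapped range as one unit instead of unmapping page by page, 
which is the case where the folio would not need to look partially mapped 
along the way.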

I see 3 different work items:

1) Fix mis-handling of speculative references

2) Perform proper PTE batching during unmap/migration. Will improve
    performance in any case.

3) Try moving deferred-split handling out of rmap code into reclaim/
    access-bit handling.

-- 
Cheers,

David / dhildenb


