Re: [PATCH v5 1/2] mm/madvise: optimize lazyfreeing with mTHP in madvise_free

linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed

From: Ryan Roberts <ryan.roberts@arm.com>
To: Lance Yang <ioworker0@gmail.com>
Cc: akpm@linux-foundation.org, david@redhat.com, 21cnbao@gmail.com,
	mhocko@suse.com, fengwei.yin@intel.com, zokeefe@google.com,
	shy828301@gmail.com, xiehuan09@gmail.com,
	wangkefeng.wang@huawei.com, songmuchun@bytedance.com,
	peterx@redhat.com, minchan@kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v5 1/2] mm/madvise: optimize lazyfreeing with mTHP in madvise_free
Date: Thu, 11 Apr 2024 15:39:28 +0100	[thread overview]
Message-ID: <8d674b15-ef74-4d96-bc27-8794f744517c@arm.com> (raw)
In-Reply-To: <CAK1f24k6mhQZwws7fjvL0ynme4FtjqBM3T6ZYuFPytH0fG=v6w@mail.gmail.com>

On 11/04/2024 15:07, Lance Yang wrote:
> On Thu, Apr 11, 2024 at 9:48 PM Ryan Roberts <ryan.roberts@arm.com> wrote:
>>
>> [...]
>>
>>>>> +
>>>>> +             if (!folio_trylock(folio))
>>>>> +                     continue;
>>>>
>>>> This is still wrong. This should all be protected by the "if
>>>> (folio_test_swapcache(folio) || folio_test_dirty(folio))" as it was previously
>>>> so that you only call folio_trylock() if that condition is true. You are
>>>> unconditionally locking here, then unlocking, then relocking below if the
>>>> condition is met. Just put everything inside the condition and lock once.
>>>
>>> I'm not sure if it's safe to call folio_mapcount() without holding the
>>> folio lock.
>>>
>>> As mentioned earlier by David in the v2[1]
>>>> What could work for large folios is making sure that #ptes that map the
>>>> folio here correspond to the folio_mapcount(). And folio_mapcount()
>>>> should be called under folio lock, to avoid racing with swapout/migration.
>>>
>>> [1] https://lore.kernel.org/all/5cc05529-eb80-410e-bc26-233b0ba0b21f@redhat.com/
>>
>> But I'm not suggesting that you should call folio_mapcount() without the lock.
>> I'm proposing this:
>>
>>                 if (folio_test_swapcache(folio) || folio_test_dirty(folio)) {
>>                         if (!folio_trylock(folio))
>>                                 continue;
>>                         /*
>> -                        * If folio is shared with others, we mustn't clear
>> -                        * the folio's dirty flag.
>> +                        * If we have a large folio at this point, we know it is
>> +                        * fully mapped so if its mapcount is the same as its
>> +                        * number of pages, it must be exclusive.
>>                          */
>> -                       if (folio_mapcount(folio) != 1) {
>> +                       if (folio_mapcount(folio) != folio_nr_pages(folio)) {
>>                                 folio_unlock(folio);
>>                                 continue;
>>                         }
> 
> IIUC, if the folio is clean and not in the swapcache, we still need to
> compare the number of batched PTEs against folio_mapcount().

Why? That's not how the old code worked. In fact the comment says that the
reason for the exclusive check is to avoid marking a dirty *folio* as clean if
shared; that would be bad because we could throw away data that others relied
upon. It's perfectly safe to clear the dirty flag from the *pte* even if it is
shared; the ptes are private to the process so that won't affect sharers.

You should just follow the pattern already estabilished by the original code.
The only difference is that because the folio is now (potentially) large, you
have to change the way to detect exclusivity.

> 
> Thanks,
> Lance
> 
>>
>> What am I missing?
>>

next prev parent reply	other threads:[~2024-04-11 14:39 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-04-08  4:24 [PATCH v5 0/2] mm/madvise: enhance " Lance Yang
2024-04-08  4:24 ` [PATCH v5 1/2] mm/madvise: optimize " Lance Yang
2024-04-11 11:11   ` Ryan Roberts
2024-04-11 11:20     ` David Hildenbrand
2024-04-11 11:27       ` Ryan Roberts
2024-04-11 12:23         ` Lance Yang
2024-04-11 13:51           ` Ryan Roberts
2024-04-11 13:55             ` David Hildenbrand
2024-04-11 12:46     ` Lance Yang
2024-04-11 13:48       ` Ryan Roberts
2024-04-11 14:07         ` Lance Yang
2024-04-11 14:39           ` Ryan Roberts [this message]
2024-04-11 14:42             ` David Hildenbrand
2024-04-12  1:48             ` Lance Yang
2024-04-08  4:24 ` [PATCH v5 2/2] mm/arm64: override mkold_clean_ptes() batch helper Lance Yang
2024-04-11 13:17   ` Ryan Roberts
2024-04-12  2:09     ` Lance Yang
2024-04-12 11:21       ` Ryan Roberts
2024-04-10 21:50 ` [PATCH v5 0/2] mm/madvise: enhance lazyfreeing with mTHP in madvise_free Andrew Morton
2024-04-11  5:01   ` Lance Yang
2024-04-11 10:29     ` Ryan Roberts

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8d674b15-ef74-4d96-bc27-8794f744517c@arm.com \
    --to=ryan.roberts@arm.com \
    --cc=21cnbao@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=david@redhat.com \
    --cc=fengwei.yin@intel.com \
    --cc=ioworker0@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=minchan@kernel.org \
    --cc=peterx@redhat.com \
    --cc=shy828301@gmail.com \
    --cc=songmuchun@bytedance.com \
    --cc=wangkefeng.wang@huawei.com \
    --cc=xiehuan09@gmail.com \
    --cc=zokeefe@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox