* [PATCH v5] mm: shrink skip folio mapped by an exiting process
@ 2024-07-08 9:04 Zhiguo Jiang
2024-07-08 9:36 ` David Hildenbrand
2024-07-08 11:02 ` Barry Song
0 siblings, 2 replies; 12+ messages in thread
From: Zhiguo Jiang @ 2024-07-08 9:04 UTC (permalink / raw)
To: Andrew Morton, linux-mm, linux-kernel, Barry Song
Cc: opensource.kernel, Zhiguo Jiang
Releasing a non-shared anonymous folio mapped solely by an exiting
process may go through two flows: 1) the anonymous folio is first
swapped out to swap space and turned into a swp_entry in
shrink_folio_list; 2) the swp_entry is then released in the process-exit
flow. This increases the CPU load of releasing such a folio, because the
folio goes through swap-out and the swap space and swp_entry then have
to be released as well.

When the system is low on memory this is more likely to occur, because
more background applications will be killed.

The modification is that shrink skips the non-shared anonymous folio
mapped solely by an exiting process; the folio is then released directly
in the process-exit flow only, which saves the swap-out time and reduces
the load of the exiting process.
Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com>
---
Change log:
v4->v5:
1. Modify to skip non-shared anonymous folios only.
2. Update the comment for pra->referenced = -1.
v3->v4:
1. Modify so that unshared folios mapped only by the exiting task are skipped.
v2->v3:
Nothing.
v1->v2:
1. The VM_EXITING flag added in the v1 patch is removed, because it fails
to compile on 32-bit systems.
 mm/rmap.c   | 13 +++++++++++++
 mm/vmscan.c |  7 ++++++-
 2 files changed, 19 insertions(+), 1 deletion(-)
diff --git a/mm/rmap.c b/mm/rmap.c
index 26806b49a86f..5b5281d71dbb
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -843,6 +843,19 @@ static bool folio_referenced_one(struct folio *folio,
 	int referenced = 0;
 	unsigned long start = address, ptes = 0;
 
+	/*
+	 * Skip the non-shared anonymous folio mapped solely by
+	 * the single exiting process, and release it directly
+	 * in the process exiting.
+	 */
+	if ((!atomic_read(&vma->vm_mm->mm_users) ||
+	    test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags)) &&
+	    folio_test_anon(folio) && folio_test_swapbacked(folio) &&
+	    !folio_likely_mapped_shared(folio)) {
+		pra->referenced = -1;
+		return false;
+	}
+
 	while (page_vma_mapped_walk(&pvmw)) {
 		address = pvmw.address;
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 0761f91b407f..bae7a8bf6b3d
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -863,7 +863,12 @@ static enum folio_references folio_check_references(struct folio *folio,
 	if (vm_flags & VM_LOCKED)
 		return FOLIOREF_ACTIVATE;
 
-	/* rmap lock contention: rotate */
+	/*
+	 * There are two cases to consider.
+	 * 1) Rmap lock contention: rotate.
+	 * 2) Skip the non-shared anonymous folio mapped solely by
+	 *    the single exiting process.
+	 */
 	if (referenced_ptes == -1)
 		return FOLIOREF_KEEP;
--
2.39.0
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v5] mm: shrink skip folio mapped by an exiting process
2024-07-08 9:04 [PATCH v5] mm: shrink skip folio mapped by an exiting process Zhiguo Jiang
@ 2024-07-08 9:36 ` David Hildenbrand
2024-07-08 9:46 ` Barry Song
2024-07-08 11:02 ` Barry Song
1 sibling, 1 reply; 12+ messages in thread
From: David Hildenbrand @ 2024-07-08 9:36 UTC (permalink / raw)
To: Zhiguo Jiang, Andrew Morton, linux-mm, linux-kernel, Barry Song
Cc: opensource.kernel
On 08.07.24 11:04, Zhiguo Jiang wrote:
> The releasing process of the non-shared anonymous folio mapped solely by
> an exiting process may go through two flows: 1) the anonymous folio is
> firstly is swaped-out into swapspace and transformed into a swp_entry
> in shrink_folio_list; 2) then the swp_entry is released in the process
> exiting flow. This will increase the cpu load of releasing a non-shared
> anonymous folio mapped solely by an exiting process, because the folio
> go through swap-out and the releasing the swapspace and swp_entry.
>
> When system is low memory, it is more likely to occur, because more
> backend applidatuions will be killed.
>
> The modification is that shrink skips the non-shared anonymous folio
> solely mapped by an exting process and the folio is only released
> directly in the process exiting flow, which will save swap-out time
> and alleviate the load of the process exiting.
>
> Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com>
> ---
>
> Change log:
> v4->v5:
> 1.Modify to skip non-shared anonymous folio only.
> 2.Update comments for pra->referenced = -1.
> v3->v4:
> 1.Modify that the unshared folios mapped only in exiting task are skip.
> v2->v3:
> Nothing.
> v1->v2:
> 1.The VM_EXITING added in v1 patch is removed, because it will fail
> to compile in 32-bit system.
>
> mm/rmap.c | 13 +++++++++++++
> mm/vmscan.c | 7 ++++++-
> 2 files changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 26806b49a86f..5b5281d71dbb
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -843,6 +843,19 @@ static bool folio_referenced_one(struct folio *folio,
> int referenced = 0;
> unsigned long start = address, ptes = 0;
>
> + /*
> + * Skip the non-shared anonymous folio mapped solely by
> + * the single exiting process, and release it directly
> + * in the process exiting.
> + */
> + if ((!atomic_read(&vma->vm_mm->mm_users) ||
> + test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags)) &&
> + folio_test_anon(folio) && folio_test_swapbacked(folio) &&
> + !folio_likely_mapped_shared(folio)) {
I'm currently working on moving all folio_likely_mapped_shared() under
the PTL, where we are then sure that the folio is actually mapped by
this process (e.g., no concurrent unmapping possible).
Can we do the same here directly?
--
Cheers,
David / dhildenb
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v5] mm: shrink skip folio mapped by an exiting process
2024-07-08 9:36 ` David Hildenbrand
@ 2024-07-08 9:46 ` Barry Song
2024-07-08 9:49 ` David Hildenbrand
0 siblings, 1 reply; 12+ messages in thread
From: Barry Song @ 2024-07-08 9:46 UTC (permalink / raw)
To: David Hildenbrand
Cc: Zhiguo Jiang, Andrew Morton, linux-mm, linux-kernel, opensource.kernel
On Mon, Jul 8, 2024 at 9:36 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 08.07.24 11:04, Zhiguo Jiang wrote:
> > The releasing process of the non-shared anonymous folio mapped solely by
> > an exiting process may go through two flows: 1) the anonymous folio is
> > firstly is swaped-out into swapspace and transformed into a swp_entry
> > in shrink_folio_list; 2) then the swp_entry is released in the process
> > exiting flow. This will increase the cpu load of releasing a non-shared
> > anonymous folio mapped solely by an exiting process, because the folio
> > go through swap-out and the releasing the swapspace and swp_entry.
> >
> > When system is low memory, it is more likely to occur, because more
> > backend applidatuions will be killed.
> >
> > The modification is that shrink skips the non-shared anonymous folio
> > solely mapped by an exting process and the folio is only released
> > directly in the process exiting flow, which will save swap-out time
> > and alleviate the load of the process exiting.
> >
> > Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com>
> > ---
> >
> > Change log:
> > v4->v5:
> > 1.Modify to skip non-shared anonymous folio only.
> > 2.Update comments for pra->referenced = -1.
> > v3->v4:
> > 1.Modify that the unshared folios mapped only in exiting task are skip.
> > v2->v3:
> > Nothing.
> > v1->v2:
> > 1.The VM_EXITING added in v1 patch is removed, because it will fail
> > to compile in 32-bit system.
> >
> > mm/rmap.c | 13 +++++++++++++
> > mm/vmscan.c | 7 ++++++-
> > 2 files changed, 19 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/rmap.c b/mm/rmap.c
> > index 26806b49a86f..5b5281d71dbb
> > --- a/mm/rmap.c
> > +++ b/mm/rmap.c
> > @@ -843,6 +843,19 @@ static bool folio_referenced_one(struct folio *folio,
> > int referenced = 0;
> > unsigned long start = address, ptes = 0;
> >
> > + /*
> > + * Skip the non-shared anonymous folio mapped solely by
> > + * the single exiting process, and release it directly
> > + * in the process exiting.
> > + */
> > + if ((!atomic_read(&vma->vm_mm->mm_users) ||
> > + test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags)) &&
> > + folio_test_anon(folio) && folio_test_swapbacked(folio) &&
> > + !folio_likely_mapped_shared(folio)) {
>
> I'm currently working on moving all folio_likely_mapped_shared() under
> the PTL, where we are then sure that the folio is actually mapped by
> this process (e.g., no concurrent unmapping poisslbe).
>
> Can we do the same here directly?
Implementing this is challenging because page_vma_mapped_walk() is
responsible for traversing the page table to acquire and release the PTL.
This becomes particularly complex with mTHP, as we may need to interrupt
the page_vma_mapped_walk() loop at the first PTE.
>
> --
> Cheers,
>
> David / dhildenb
>
Thanks
Barry
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v5] mm: shrink skip folio mapped by an exiting process
2024-07-08 9:46 ` Barry Song
@ 2024-07-08 9:49 ` David Hildenbrand
2024-07-08 10:05 ` Barry Song
0 siblings, 1 reply; 12+ messages in thread
From: David Hildenbrand @ 2024-07-08 9:49 UTC (permalink / raw)
To: Barry Song
Cc: Zhiguo Jiang, Andrew Morton, linux-mm, linux-kernel, opensource.kernel
On 08.07.24 11:46, Barry Song wrote:
> On Mon, Jul 8, 2024 at 9:36 PM David Hildenbrand <david@redhat.com> wrote:
>>
>> On 08.07.24 11:04, Zhiguo Jiang wrote:
>>> The releasing process of the non-shared anonymous folio mapped solely by
>>> an exiting process may go through two flows: 1) the anonymous folio is
>>> firstly is swaped-out into swapspace and transformed into a swp_entry
>>> in shrink_folio_list; 2) then the swp_entry is released in the process
>>> exiting flow. This will increase the cpu load of releasing a non-shared
>>> anonymous folio mapped solely by an exiting process, because the folio
>>> go through swap-out and the releasing the swapspace and swp_entry.
>>>
>>> When system is low memory, it is more likely to occur, because more
>>> backend applidatuions will be killed.
>>>
>>> The modification is that shrink skips the non-shared anonymous folio
>>> solely mapped by an exting process and the folio is only released
>>> directly in the process exiting flow, which will save swap-out time
>>> and alleviate the load of the process exiting.
>>>
>>> Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com>
>>> ---
>>>
>>> Change log:
>>> v4->v5:
>>> 1.Modify to skip non-shared anonymous folio only.
>>> 2.Update comments for pra->referenced = -1.
>>> v3->v4:
>>> 1.Modify that the unshared folios mapped only in exiting task are skip.
>>> v2->v3:
>>> Nothing.
>>> v1->v2:
>>> 1.The VM_EXITING added in v1 patch is removed, because it will fail
>>> to compile in 32-bit system.
>>>
>>> mm/rmap.c | 13 +++++++++++++
>>> mm/vmscan.c | 7 ++++++-
>>> 2 files changed, 19 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/mm/rmap.c b/mm/rmap.c
>>> index 26806b49a86f..5b5281d71dbb
>>> --- a/mm/rmap.c
>>> +++ b/mm/rmap.c
>>> @@ -843,6 +843,19 @@ static bool folio_referenced_one(struct folio *folio,
>>> int referenced = 0;
>>> unsigned long start = address, ptes = 0;
>>>
>>> + /*
>>> + * Skip the non-shared anonymous folio mapped solely by
>>> + * the single exiting process, and release it directly
>>> + * in the process exiting.
>>> + */
>>> + if ((!atomic_read(&vma->vm_mm->mm_users) ||
>>> + test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags)) &&
>>> + folio_test_anon(folio) && folio_test_swapbacked(folio) &&
>>> + !folio_likely_mapped_shared(folio)) {
>>
>> I'm currently working on moving all folio_likely_mapped_shared() under
>> the PTL, where we are then sure that the folio is actually mapped by
>> this process (e.g., no concurrent unmapping poisslbe).
>>
>> Can we do the same here directly?
>
> Implementing this is challenging because page_vma_mapped_walk() is
> responsible for
> traversing the page table to acquire and release the PTL. This becomes
> particularly
> complex with mTHP, as we may need to interrupt the page_vma_mapped_walk
> loop at the first PTE.
Why can't we perform the check under the PTL and bail out? I'm missing
something important.
--
Cheers,
David / dhildenb
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v5] mm: shrink skip folio mapped by an exiting process
2024-07-08 9:49 ` David Hildenbrand
@ 2024-07-08 10:05 ` Barry Song
0 siblings, 0 replies; 12+ messages in thread
From: Barry Song @ 2024-07-08 10:05 UTC (permalink / raw)
To: David Hildenbrand
Cc: Zhiguo Jiang, Andrew Morton, linux-mm, linux-kernel, opensource.kernel
On Mon, Jul 8, 2024 at 9:49 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 08.07.24 11:46, Barry Song wrote:
> > On Mon, Jul 8, 2024 at 9:36 PM David Hildenbrand <david@redhat.com> wrote:
> >>
> >> On 08.07.24 11:04, Zhiguo Jiang wrote:
> >>> The releasing process of the non-shared anonymous folio mapped solely by
> >>> an exiting process may go through two flows: 1) the anonymous folio is
> >>> firstly is swaped-out into swapspace and transformed into a swp_entry
> >>> in shrink_folio_list; 2) then the swp_entry is released in the process
> >>> exiting flow. This will increase the cpu load of releasing a non-shared
> >>> anonymous folio mapped solely by an exiting process, because the folio
> >>> go through swap-out and the releasing the swapspace and swp_entry.
> >>>
> >>> When system is low memory, it is more likely to occur, because more
> >>> backend applidatuions will be killed.
> >>>
> >>> The modification is that shrink skips the non-shared anonymous folio
> >>> solely mapped by an exting process and the folio is only released
> >>> directly in the process exiting flow, which will save swap-out time
> >>> and alleviate the load of the process exiting.
> >>>
> >>> Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com>
> >>> ---
> >>>
> >>> Change log:
> >>> v4->v5:
> >>> 1.Modify to skip non-shared anonymous folio only.
> >>> 2.Update comments for pra->referenced = -1.
> >>> v3->v4:
> >>> 1.Modify that the unshared folios mapped only in exiting task are skip.
> >>> v2->v3:
> >>> Nothing.
> >>> v1->v2:
> >>> 1.The VM_EXITING added in v1 patch is removed, because it will fail
> >>> to compile in 32-bit system.
> >>>
> >>> mm/rmap.c | 13 +++++++++++++
> >>> mm/vmscan.c | 7 ++++++-
> >>> 2 files changed, 19 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/mm/rmap.c b/mm/rmap.c
> >>> index 26806b49a86f..5b5281d71dbb
> >>> --- a/mm/rmap.c
> >>> +++ b/mm/rmap.c
> >>> @@ -843,6 +843,19 @@ static bool folio_referenced_one(struct folio *folio,
> >>> int referenced = 0;
> >>> unsigned long start = address, ptes = 0;
> >>>
> >>> + /*
> >>> + * Skip the non-shared anonymous folio mapped solely by
> >>> + * the single exiting process, and release it directly
> >>> + * in the process exiting.
> >>> + */
> >>> + if ((!atomic_read(&vma->vm_mm->mm_users) ||
> >>> + test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags)) &&
> >>> + folio_test_anon(folio) && folio_test_swapbacked(folio) &&
> >>> + !folio_likely_mapped_shared(folio)) {
> >>
> >> I'm currently working on moving all folio_likely_mapped_shared() under
> >> the PTL, where we are then sure that the folio is actually mapped by
> >> this process (e.g., no concurrent unmapping poisslbe).
> >>
> >> Can we do the same here directly?
> >
> > Implementing this is challenging because page_vma_mapped_walk() is
> > responsible for
> > traversing the page table to acquire and release the PTL. This becomes
> > particularly
> > complex with mTHP, as we may need to interrupt the page_vma_mapped_walk
> > loop at the first PTE.
>
> Why can't we perform the check under the PTL and bail out? I'm missing
> something important.
You are right. We might use page_vma_mapped_walk_done() to bail out.
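A rough sketch of that shape, reusing the skip condition from the patch
above (the concrete hunk is spelled out in the next message): do the
check inside the page_vma_mapped_walk() loop, where the walk holds the
PTL, and bail out via page_vma_mapped_walk_done():

	while (page_vma_mapped_walk(&pvmw)) {
		/* the PTL taken by the walk is held at this point */
		if ((!atomic_read(&vma->vm_mm->mm_users) ||
		    test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags)) &&
		    folio_test_anon(folio) && folio_test_swapbacked(folio) &&
		    !folio_likely_mapped_shared(folio)) {
			pra->referenced = -1;
			page_vma_mapped_walk_done(&pvmw);
			return false;
		}
		/* ... the existing reference checks continue below ... */
	}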
>
> --
> Cheers,
>
> David / dhildenb
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v5] mm: shrink skip folio mapped by an exiting process
2024-07-08 9:04 [PATCH v5] mm: shrink skip folio mapped by an exiting process Zhiguo Jiang
2024-07-08 9:36 ` David Hildenbrand
@ 2024-07-08 11:02 ` Barry Song
2024-07-08 12:17 ` zhiguojiang
1 sibling, 1 reply; 12+ messages in thread
From: Barry Song @ 2024-07-08 11:02 UTC (permalink / raw)
To: Zhiguo Jiang; +Cc: Andrew Morton, linux-mm, linux-kernel, opensource.kernel
On Mon, Jul 8, 2024 at 9:04 PM Zhiguo Jiang <justinjiang@vivo.com> wrote:
>
> The releasing process of the non-shared anonymous folio mapped solely by
> an exiting process may go through two flows: 1) the anonymous folio is
> firstly is swaped-out into swapspace and transformed into a swp_entry
> in shrink_folio_list; 2) then the swp_entry is released in the process
> exiting flow. This will increase the cpu load of releasing a non-shared
> anonymous folio mapped solely by an exiting process, because the folio
> go through swap-out and the releasing the swapspace and swp_entry.
>
> When system is low memory, it is more likely to occur, because more
> backend applidatuions will be killed.
>
> The modification is that shrink skips the non-shared anonymous folio
> solely mapped by an exting process and the folio is only released
> directly in the process exiting flow, which will save swap-out time
> and alleviate the load of the process exiting.
>
> Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com>
> ---
>
> Change log:
> v4->v5:
> 1.Modify to skip non-shared anonymous folio only.
> 2.Update comments for pra->referenced = -1.
> v3->v4:
> 1.Modify that the unshared folios mapped only in exiting task are skip.
> v2->v3:
> Nothing.
> v1->v2:
> 1.The VM_EXITING added in v1 patch is removed, because it will fail
> to compile in 32-bit system.
>
> mm/rmap.c | 13 +++++++++++++
> mm/vmscan.c | 7 ++++++-
> 2 files changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 26806b49a86f..5b5281d71dbb
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -843,6 +843,19 @@ static bool folio_referenced_one(struct folio *folio,
> int referenced = 0;
> unsigned long start = address, ptes = 0;
>
> + /*
> + * Skip the non-shared anonymous folio mapped solely by
> + * the single exiting process, and release it directly
> + * in the process exiting.
> + */
> + if ((!atomic_read(&vma->vm_mm->mm_users) ||
> + test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags)) &&
> + folio_test_anon(folio) && folio_test_swapbacked(folio) &&
> + !folio_likely_mapped_shared(folio)) {
> + pra->referenced = -1;
> + return false;
> + }
> +
> while (page_vma_mapped_walk(&pvmw)) {
> address = pvmw.address;
As David suggested, what about the below?
@@ -883,6 +870,21 @@ static bool folio_referenced_one(struct folio *folio,
 			continue;
 		}
 
+		/*
+		 * Skip the non-shared anonymous folio mapped solely by
+		 * the single exiting process, and release it directly
+		 * in the process exiting.
+		 */
+		if ((!atomic_read(&vma->vm_mm->mm_users) ||
+		    test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags)) &&
+		    folio_test_anon(folio) && folio_test_swapbacked(folio) &&
+		    !folio_likely_mapped_shared(folio)) {
+			pra->referenced = -1;
+			page_vma_mapped_walk_done(&pvmw);
+			return false;
+		}
+
 		if (pvmw.pte) {
 			if (lru_gen_enabled() &&
 			    pte_young(ptep_get(pvmw.pte))) {
By the way, I am not convinced that using test_bit(MMF_OOM_SKIP,
&vma->vm_mm->flags) is correct (I think it is wrong). For example,
global_init can directly have it:

	if (is_global_init(p)) {
		can_oom_reap = false;
		set_bit(MMF_OOM_SKIP, &mm->flags);
		pr_info("oom killer %d (%s) has mm pinned by %d (%s)\n",
				task_pid_nr(victim), victim->comm,
				task_pid_nr(p), p->comm);
		continue;
	}

And exit_mmap() automatically has MMF_OOM_SKIP.

What is the purpose of this check? Is there a better way to determine
if a process is an OOM target? What about check_stable_address_space()?
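For reference, check_stable_address_space() (include/linux/oom.h) is, in
kernels of this era, roughly a one-line test of MMF_UNSTABLE, which the
OOM reaper sets once it has started tearing down the victim's address
space:

	static inline int check_stable_address_space(struct mm_struct *mm)
	{
		/* MMF_UNSTABLE is set by the OOM reaper before it unmaps the mm */
		if (unlikely(test_bit(MMF_UNSTABLE, &mm->flags)))
			return -EFAULT;
		return 0;
	}

So on its own it covers the OOM-reaped case rather than a plain exit,
which is what the rest of the thread works through.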
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 0761f91b407f..bae7a8bf6b3d
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -863,7 +863,12 @@ static enum folio_references folio_check_references(struct folio *folio,
> if (vm_flags & VM_LOCKED)
> return FOLIOREF_ACTIVATE;
>
> - /* rmap lock contention: rotate */
> + /*
> + * There are two cases to consider.
> + * 1) Rmap lock contention: rotate.
> + * 2) Skip the non-shared anonymous folio mapped solely by
> + * the single exiting process.
> + */
> if (referenced_ptes == -1)
> return FOLIOREF_KEEP;
>
> --
> 2.39.0
>
Thanks
Barry
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v5] mm: shrink skip folio mapped by an exiting process
2024-07-08 11:02 ` Barry Song
@ 2024-07-08 12:17 ` zhiguojiang
2024-07-08 12:25 ` zhiguojiang
0 siblings, 1 reply; 12+ messages in thread
From: zhiguojiang @ 2024-07-08 12:17 UTC (permalink / raw)
To: Barry Song; +Cc: Andrew Morton, linux-mm, linux-kernel, opensource.kernel
On 2024/7/8 19:02, Barry Song wrote:
> On Mon, Jul 8, 2024 at 9:04 PM Zhiguo Jiang <justinjiang@vivo.com> wrote:
>> The releasing process of the non-shared anonymous folio mapped solely by
>> an exiting process may go through two flows: 1) the anonymous folio is
>> firstly is swaped-out into swapspace and transformed into a swp_entry
>> in shrink_folio_list; 2) then the swp_entry is released in the process
>> exiting flow. This will increase the cpu load of releasing a non-shared
>> anonymous folio mapped solely by an exiting process, because the folio
>> go through swap-out and the releasing the swapspace and swp_entry.
>>
>> When system is low memory, it is more likely to occur, because more
>> backend applidatuions will be killed.
>>
>> The modification is that shrink skips the non-shared anonymous folio
>> solely mapped by an exting process and the folio is only released
>> directly in the process exiting flow, which will save swap-out time
>> and alleviate the load of the process exiting.
>>
>> Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com>
>> ---
>>
>> Change log:
>> v4->v5:
>> 1.Modify to skip non-shared anonymous folio only.
>> 2.Update comments for pra->referenced = -1.
>> v3->v4:
>> 1.Modify that the unshared folios mapped only in exiting task are skip.
>> v2->v3:
>> Nothing.
>> v1->v2:
>> 1.The VM_EXITING added in v1 patch is removed, because it will fail
>> to compile in 32-bit system.
>>
>> mm/rmap.c | 13 +++++++++++++
>> mm/vmscan.c | 7 ++++++-
>> 2 files changed, 19 insertions(+), 1 deletion(-)
>>
>> diff --git a/mm/rmap.c b/mm/rmap.c
>> index 26806b49a86f..5b5281d71dbb
>> --- a/mm/rmap.c
>> +++ b/mm/rmap.c
>> @@ -843,6 +843,19 @@ static bool folio_referenced_one(struct folio *folio,
>> int referenced = 0;
>> unsigned long start = address, ptes = 0;
>>
>> + /*
>> + * Skip the non-shared anonymous folio mapped solely by
>> + * the single exiting process, and release it directly
>> + * in the process exiting.
>> + */
>> + if ((!atomic_read(&vma->vm_mm->mm_users) ||
>> + test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags)) &&
>> + folio_test_anon(folio) && folio_test_swapbacked(folio) &&
>> + !folio_likely_mapped_shared(folio)) {
>> + pra->referenced = -1;
>> + return false;
>> + }
>> +
>> while (page_vma_mapped_walk(&pvmw)) {
>> address = pvmw.address;
Sure, I agree with your modification suggestions. This way, using the PTL
indeed ensures that the folio is mapped by this process.

Thanks
> As David suggested, what about the below?
>
> @@ -883,6 +870,21 @@ static bool folio_referenced_one(struct folio *folio,
> continue;
> }
>
> + /*
> + * Skip the non-shared anonymous folio mapped solely by
> + * the single exiting process, and release it directly
> + * in the process exiting.
> + */
> + if ((!atomic_read(&vma->vm_mm->mm_users) ||
> + test_bit(MMF_OOM_SKIP,
> &vma->vm_mm->flags)) &&
> + folio_test_anon(folio) &&
> folio_test_swapbacked(folio) &&
> + !folio_likely_mapped_shared(folio)) {
> + pra->referenced = -1;
> + page_vma_mapped_walk_done(&pvmw);
> + return false;
> + }
> +
> if (pvmw.pte) {
> if (lru_gen_enabled() &&
> pte_young(ptep_get(pvmw.pte))) {
>
>
> By the way, I am not convinced that using test_bit(MMF_OOM_SKIP,
> &vma->vm_mm->flags) is
> correct (I think it is wrong). For example, global_init can directly have it:
> if (is_global_init(p)) {
> can_oom_reap = false;
> set_bit(MMF_OOM_SKIP, &mm->flags);
> pr_info("oom killer %d (%s) has mm pinned by %d (%s)\n",
> task_pid_nr(victim), victim->comm,
> task_pid_nr(p), p->comm);
> continue;
> }
>
> And exit_mmap() automatically has MMF_OOM_SKIP.
>
> What is the purpose of this check? Is there a better way to determine
> if a process is an
> OOM target? What about check_stable_address_space() ?
1. Sorry, I overlooked the situation with if (is_global_init(p)), so
MMF_OOM_SKIP is indeed not suitable.

2. check_stable_address_space() can indicate the oom_reaper, but it seems
unable to identify the situation where the process exits normally. What
about task_is_dying()?

	static inline bool task_is_dying(void)
	{
		return tsk_is_oom_victim(current) || fatal_signal_pending(current) ||
			(current->flags & PF_EXITING);
	}

Thanks
>
>
>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>> index 0761f91b407f..bae7a8bf6b3d
>> --- a/mm/vmscan.c
>> +++ b/mm/vmscan.c
>> @@ -863,7 +863,12 @@ static enum folio_references folio_check_references(struct folio *folio,
>> if (vm_flags & VM_LOCKED)
>> return FOLIOREF_ACTIVATE;
>>
>> - /* rmap lock contention: rotate */
>> + /*
>> + * There are two cases to consider.
>> + * 1) Rmap lock contention: rotate.
>> + * 2) Skip the non-shared anonymous folio mapped solely by
>> + * the single exiting process.
>> + */
>> if (referenced_ptes == -1)
>> return FOLIOREF_KEEP;
>>
>> --
>> 2.39.0
>>
> Thanks
> Barry
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v5] mm: shrink skip folio mapped by an exiting process
2024-07-08 12:17 ` zhiguojiang
@ 2024-07-08 12:25 ` zhiguojiang
2024-07-08 12:41 ` Barry Song
0 siblings, 1 reply; 12+ messages in thread
From: zhiguojiang @ 2024-07-08 12:25 UTC (permalink / raw)
To: Barry Song; +Cc: Andrew Morton, linux-mm, linux-kernel, opensource.kernel
On 2024/7/8 20:17, zhiguojiang wrote:
>
>
> On 2024/7/8 19:02, Barry Song wrote:
>> On Mon, Jul 8, 2024 at 9:04 PM Zhiguo Jiang <justinjiang@vivo.com>
>> wrote:
>>> The releasing process of the non-shared anonymous folio mapped
>>> solely by
>>> an exiting process may go through two flows: 1) the anonymous folio is
>>> firstly is swaped-out into swapspace and transformed into a swp_entry
>>> in shrink_folio_list; 2) then the swp_entry is released in the process
>>> exiting flow. This will increase the cpu load of releasing a non-shared
>>> anonymous folio mapped solely by an exiting process, because the folio
>>> go through swap-out and the releasing the swapspace and swp_entry.
>>>
>>> When system is low memory, it is more likely to occur, because more
>>> backend applidatuions will be killed.
>>>
>>> The modification is that shrink skips the non-shared anonymous folio
>>> solely mapped by an exting process and the folio is only released
>>> directly in the process exiting flow, which will save swap-out time
>>> and alleviate the load of the process exiting.
>>>
>>> Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com>
>>> ---
>>>
>>> Change log:
>>> v4->v5:
>>> 1.Modify to skip non-shared anonymous folio only.
>>> 2.Update comments for pra->referenced = -1.
>>> v3->v4:
>>> 1.Modify that the unshared folios mapped only in exiting task are skip.
>>> v2->v3:
>>> Nothing.
>>> v1->v2:
>>> 1.The VM_EXITING added in v1 patch is removed, because it will fail
>>> to compile in 32-bit system.
>>>
>>> mm/rmap.c | 13 +++++++++++++
>>> mm/vmscan.c | 7 ++++++-
>>> 2 files changed, 19 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/mm/rmap.c b/mm/rmap.c
>>> index 26806b49a86f..5b5281d71dbb
>>> --- a/mm/rmap.c
>>> +++ b/mm/rmap.c
>>> @@ -843,6 +843,19 @@ static bool folio_referenced_one(struct folio
>>> *folio,
>>> int referenced = 0;
>>> unsigned long start = address, ptes = 0;
>>>
>>> + /*
>>> + * Skip the non-shared anonymous folio mapped solely by
>>> + * the single exiting process, and release it directly
>>> + * in the process exiting.
>>> + */
>>> + if ((!atomic_read(&vma->vm_mm->mm_users) ||
>>> + test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags)) &&
>>> + folio_test_anon(folio) &&
>>> folio_test_swapbacked(folio) &&
>>> + !folio_likely_mapped_shared(folio)) {
>>> + pra->referenced = -1;
>>> + return false;
>>> + }
>>> +
>>> while (page_vma_mapped_walk(&pvmw)) {
>>> address = pvmw.address;
> Sure, I agree with your modification suggestions. This way, using PTL
> indeed sure
> that the folio is mapped by this process.
> Thanks
>> As David suggested, what about the below?
>>
>> @@ -883,6 +870,21 @@ static bool folio_referenced_one(struct folio
>> *folio,
>> continue;
>> }
>>
>> + /*
>> + * Skip the non-shared anonymous folio mapped solely by
>> + * the single exiting process, and release it directly
>> + * in the process exiting.
>> + */
>> + if ((!atomic_read(&vma->vm_mm->mm_users) ||
>> + test_bit(MMF_OOM_SKIP,
>> &vma->vm_mm->flags)) &&
>> + folio_test_anon(folio) &&
>> folio_test_swapbacked(folio) &&
>> + !folio_likely_mapped_shared(folio)) {
>> + pra->referenced = -1;
>> + page_vma_mapped_walk_done(&pvmw);
>> + return false;
>> + }
>> +
>> if (pvmw.pte) {
>> if (lru_gen_enabled() &&
>> pte_young(ptep_get(pvmw.pte))) {
>>
>>
>> By the way, I am not convinced that using test_bit(MMF_OOM_SKIP,
>> &vma->vm_mm->flags) is
>> correct (I think it is wrong). For example, global_init can
>> directly have it:
>> if (is_global_init(p)) {
>> can_oom_reap = false;
>> set_bit(MMF_OOM_SKIP, &mm->flags);
>> pr_info("oom killer %d (%s) has mm pinned by
>> %d (%s)\n",
>> task_pid_nr(victim),
>> victim->comm,
>> task_pid_nr(p), p->comm);
>> continue;
>> }
>>
>> And exit_mmap() automatically has MMF_OOM_SKIP.
>>
>> What is the purpose of this check? Is there a better way to determine
>> if a process is an
>> OOM target? What about check_stable_address_space() ?
> 1.Sorry, I overlook the situation with if (is_global_init(p)),
> MMF_OOM_SKIP is indeed not suitable.
>
> 2.check_stable_address_space() can indicate oom_reaper, but it seems
> unable to identify the situation where the process exits normally.
> What about task_is_dying()? static inline bool task_is_dying(void) {
> return tsk_is_oom_victim(current) || fatal_signal_pending(current) ||
> (current->flags & PF_EXITING); } Thanks
We can migrate task_is_dying() from mm/memcontrol.c to include/linux/oom.h:
> static inline bool task_is_dying(void)
> {
> return tsk_is_oom_victim(current) || fatal_signal_pending(current) ||
> (current->flags & PF_EXITING);
> }
>>
>>
>>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>> index 0761f91b407f..bae7a8bf6b3d
>>> --- a/mm/vmscan.c
>>> +++ b/mm/vmscan.c
>>> @@ -863,7 +863,12 @@ static enum folio_references
>>> folio_check_references(struct folio *folio,
>>> if (vm_flags & VM_LOCKED)
>>> return FOLIOREF_ACTIVATE;
>>>
>>> - /* rmap lock contention: rotate */
>>> + /*
>>> + * There are two cases to consider.
>>> + * 1) Rmap lock contention: rotate.
>>> + * 2) Skip the non-shared anonymous folio mapped solely by
>>> + * the single exiting process.
>>> + */
>>> if (referenced_ptes == -1)
>>> return FOLIOREF_KEEP;
>>>
>>> --
>>> 2.39.0
>>>
>> Thanks
>> Barry
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v5] mm: shrink skip folio mapped by an exiting process
2024-07-08 12:25 ` zhiguojiang
@ 2024-07-08 12:41 ` Barry Song
2024-07-08 13:11 ` zhiguojiang
0 siblings, 1 reply; 12+ messages in thread
From: Barry Song @ 2024-07-08 12:41 UTC (permalink / raw)
To: zhiguojiang; +Cc: Andrew Morton, linux-mm, LKML, opensource.kernel
On Tue, Jul 9, 2024 at 00:25, zhiguojiang <justinjiang@vivo.com> wrote:
>
>
> On 2024/7/8 20:17, zhiguojiang wrote:
> >
> >
> > On 2024/7/8 19:02, Barry Song wrote:
> >> On Mon, Jul 8, 2024 at 9:04 PM Zhiguo Jiang <justinjiang@vivo.com>
> >> wrote:
> >>> The releasing process of the non-shared anonymous folio mapped
> >>> solely by
> >>> an exiting process may go through two flows: 1) the anonymous folio is
> >>> firstly is swaped-out into swapspace and transformed into a swp_entry
> >>> in shrink_folio_list; 2) then the swp_entry is released in the process
> >>> exiting flow. This will increase the cpu load of releasing a non-shared
> >>> anonymous folio mapped solely by an exiting process, because the folio
> >>> go through swap-out and the releasing the swapspace and swp_entry.
> >>>
> >>> When system is low memory, it is more likely to occur, because more
> >>> backend applidatuions will be killed.
> >>>
> >>> The modification is that shrink skips the non-shared anonymous folio
> >>> solely mapped by an exting process and the folio is only released
> >>> directly in the process exiting flow, which will save swap-out time
> >>> and alleviate the load of the process exiting.
> >>>
> >>> Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com>
> >>> ---
> >>>
> >>> Change log:
> >>> v4->v5:
> >>> 1.Modify to skip non-shared anonymous folio only.
> >>> 2.Update comments for pra->referenced = -1.
> >>> v3->v4:
> >>> 1.Modify that the unshared folios mapped only in exiting task are skip.
> >>> v2->v3:
> >>> Nothing.
> >>> v1->v2:
> >>> 1.The VM_EXITING added in v1 patch is removed, because it will fail
> >>> to compile in 32-bit system.
> >>>
> >>> mm/rmap.c | 13 +++++++++++++
> >>> mm/vmscan.c | 7 ++++++-
> >>> 2 files changed, 19 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/mm/rmap.c b/mm/rmap.c
> >>> index 26806b49a86f..5b5281d71dbb
> >>> --- a/mm/rmap.c
> >>> +++ b/mm/rmap.c
> >>> @@ -843,6 +843,19 @@ static bool folio_referenced_one(struct folio
> >>> *folio,
> >>> int referenced = 0;
> >>> unsigned long start = address, ptes = 0;
> >>>
> >>> + /*
> >>> + * Skip the non-shared anonymous folio mapped solely by
> >>> + * the single exiting process, and release it directly
> >>> + * in the process exiting.
> >>> + */
> >>> + if ((!atomic_read(&vma->vm_mm->mm_users) ||
> >>> + test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags)) &&
> >>> + folio_test_anon(folio) &&
> >>> folio_test_swapbacked(folio) &&
> >>> + !folio_likely_mapped_shared(folio)) {
> >>> + pra->referenced = -1;
> >>> + return false;
> >>> + }
> >>> +
> >>> while (page_vma_mapped_walk(&pvmw)) {
> >>> address = pvmw.address;
> > Sure, I agree with your modification suggestions. This way, using PTL
> > indeed sure
> > that the folio is mapped by this process.
> > Thanks
> >> As David suggested, what about the below?
> >>
> >> @@ -883,6 +870,21 @@ static bool folio_referenced_one(struct folio
> >> *folio,
> >> continue;
> >> }
> >>
> >> + /*
> >> + * Skip the non-shared anonymous folio mapped solely by
> >> + * the single exiting process, and release it directly
> >> + * in the process exiting.
> >> + */
> >> + if ((!atomic_read(&vma->vm_mm->mm_users) ||
> >> + test_bit(MMF_OOM_SKIP,
> >> &vma->vm_mm->flags)) &&
> >> + folio_test_anon(folio) &&
> >> folio_test_swapbacked(folio) &&
> >> + !folio_likely_mapped_shared(folio)) {
> >> + pra->referenced = -1;
> >> + page_vma_mapped_walk_done(&pvmw);
> >> + return false;
> >> + }
> >> +
> >> if (pvmw.pte) {
> >> if (lru_gen_enabled() &&
> >> pte_young(ptep_get(pvmw.pte))) {
> >>
> >>
> >> By the way, I am not convinced that using test_bit(MMF_OOM_SKIP,
> >> &vma->vm_mm->flags) is
> >> correct (I think it is wrong). For example, global_init can
> >> directly have it:
> >> if (is_global_init(p)) {
> >> can_oom_reap = false;
> >> set_bit(MMF_OOM_SKIP, &mm->flags);
> >> pr_info("oom killer %d (%s) has mm pinned by
> >> %d (%s)\n",
> >> task_pid_nr(victim),
> >> victim->comm,
> >> task_pid_nr(p), p->comm);
> >> continue;
> >> }
> >>
> >> And exit_mmap() automatically has MMF_OOM_SKIP.
> >>
> >> What is the purpose of this check? Is there a better way to determine
> >> if a process is an
> >> OOM target? What about check_stable_address_space() ?
> > 1.Sorry, I overlook the situation with if (is_global_init(p)),
> > MMF_OOM_SKIP is indeed not suitable.
> >
> > 2.check_stable_address_space() can indicate oom_reaper, but it seems
> > unable to identify the situation where the process exits normally.
> > What about task_is_dying()? static inline bool task_is_dying(void) {
> > return tsk_is_oom_victim(current) || fatal_signal_pending(current) ||
> > (current->flags & PF_EXITING); } Thanks
> We can migrate task_is_dying() from mm/memcontrol.c to include/linux/oom.h
> > static inline bool task_is_dying(void)
> > {
> > return tsk_is_oom_victim(current) || fatal_signal_pending(current) ||
> > (current->flags & PF_EXITING);
> > }
>
No. current here is kswapd, not the exiting process that owns the mm, so
task_is_dying() would test the wrong task.
> >>
> >>
> >>> diff --git a/mm/vmscan.c b/mm/vmscan.c
> >>> index 0761f91b407f..bae7a8bf6b3d
> >>> --- a/mm/vmscan.c
> >>> +++ b/mm/vmscan.c
> >>> @@ -863,7 +863,12 @@ static enum folio_references
> >>> folio_check_references(struct folio *folio,
> >>> if (vm_flags & VM_LOCKED)
> >>> return FOLIOREF_ACTIVATE;
> >>>
> >>> - /* rmap lock contention: rotate */
> >>> + /*
> >>> + * There are two cases to consider.
> >>> + * 1) Rmap lock contention: rotate.
> >>> + * 2) Skip the non-shared anonymous folio mapped solely by
> >>> + * the single exiting process.
> >>> + */
> >>> if (referenced_ptes == -1)
> >>> return FOLIOREF_KEEP;
> >>>
> >>> --
> >>> 2.39.0
> >>>
> >> Thanks
> >> Barry
> >
>
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v5] mm: shrink skip folio mapped by an exiting process
2024-07-08 12:41 ` Barry Song
@ 2024-07-08 13:11 ` zhiguojiang
2024-07-08 21:34 ` Barry Song
0 siblings, 1 reply; 12+ messages in thread
From: zhiguojiang @ 2024-07-08 13:11 UTC (permalink / raw)
To: Barry Song; +Cc: Andrew Morton, linux-mm, LKML, opensource.kernel
On 2024/7/8 20:41, Barry Song wrote:
>
>
> On Tue, Jul 9, 2024 at 00:25, zhiguojiang <justinjiang@vivo.com> wrote:
>
>
>
> On 2024/7/8 20:17, zhiguojiang wrote:
> >
> >
> > On 2024/7/8 19:02, Barry Song wrote:
> >> On Mon, Jul 8, 2024 at 9:04 PM Zhiguo Jiang <justinjiang@vivo.com>
> >> wrote:
> >>> The releasing process of the non-shared anonymous folio mapped
> >>> solely by
> >>> an exiting process may go through two flows: 1) the anonymous
> folio is
> >>> firstly is swaped-out into swapspace and transformed into a
> swp_entry
> >>> in shrink_folio_list; 2) then the swp_entry is released in the
> process
> >>> exiting flow. This will increase the cpu load of releasing a
> non-shared
> >>> anonymous folio mapped solely by an exiting process, because
> the folio
> >>> go through swap-out and the releasing the swapspace and swp_entry.
> >>>
> >>> When system is low memory, it is more likely to occur, because
> more
> >>> backend applidatuions will be killed.
> >>>
> >>> The modification is that shrink skips the non-shared anonymous
> folio
> >>> solely mapped by an exting process and the folio is only released
> >>> directly in the process exiting flow, which will save swap-out
> time
> >>> and alleviate the load of the process exiting.
> >>>
> >>> Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com>
> >>> ---
> >>>
> >>> Change log:
> >>> v4->v5:
> >>> 1.Modify to skip non-shared anonymous folio only.
> >>> 2.Update comments for pra->referenced = -1.
> >>> v3->v4:
> >>> 1.Modify that the unshared folios mapped only in exiting task
> are skip.
> >>> v2->v3:
> >>> Nothing.
> >>> v1->v2:
> >>> 1.The VM_EXITING added in v1 patch is removed, because it will
> fail
> >>> to compile in 32-bit system.
> >>>
> >>> mm/rmap.c | 13 +++++++++++++
> >>> mm/vmscan.c | 7 ++++++-
> >>> 2 files changed, 19 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/mm/rmap.c b/mm/rmap.c
> >>> index 26806b49a86f..5b5281d71dbb
> >>> --- a/mm/rmap.c
> >>> +++ b/mm/rmap.c
> >>> @@ -843,6 +843,19 @@ static bool folio_referenced_one(struct
> folio
> >>> *folio,
> >>> int referenced = 0;
> >>> unsigned long start = address, ptes = 0;
> >>>
> >>> + /*
> >>> + * Skip the non-shared anonymous folio mapped solely by
> >>> + * the single exiting process, and release it directly
> >>> + * in the process exiting.
> >>> + */
> >>> + if ((!atomic_read(&vma->vm_mm->mm_users) ||
> >>> + test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags)) &&
> >>> + folio_test_anon(folio) &&
> >>> folio_test_swapbacked(folio) &&
> >>> + !folio_likely_mapped_shared(folio)) {
> >>> + pra->referenced = -1;
> >>> + return false;
> >>> + }
> >>> +
> >>> while (page_vma_mapped_walk(&pvmw)) {
> >>> address = pvmw.address;
> > Sure, I agree with your modification suggestions. This way,
> using PTL
> > indeed sure
> > that the folio is mapped by this process.
> > Thanks
> >> As David suggested, what about the below?
> >>
> >> @@ -883,6 +870,21 @@ static bool folio_referenced_one(struct folio
> >> *folio,
> >> continue;
> >> }
> >>
> >> + /*
> >> + * Skip the non-shared anonymous folio mapped
> solely by
> >> + * the single exiting process, and release it
> directly
> >> + * in the process exiting.
> >> + */
> >> + if ((!atomic_read(&vma->vm_mm->mm_users) ||
> >> + test_bit(MMF_OOM_SKIP,
> >> &vma->vm_mm->flags)) &&
> >> + folio_test_anon(folio) &&
> >> folio_test_swapbacked(folio) &&
> >> + !folio_likely_mapped_shared(folio)) {
> >> + pra->referenced = -1;
> >> + page_vma_mapped_walk_done(&pvmw);
> >> + return false;
> >> + }
> >> +
> >> if (pvmw.pte) {
> >> if (lru_gen_enabled() &&
> >> pte_young(ptep_get(pvmw.pte))) {
> >>
> >>
> >> By the way, I am not convinced that using test_bit(MMF_OOM_SKIP,
> >> &vma->vm_mm->flags) is
> >> correct (I think it is wrong). For example, global_init can
> >> directly have it:
> >> if (is_global_init(p)) {
> >> can_oom_reap = false;
> >> set_bit(MMF_OOM_SKIP, &mm->flags);
> >> pr_info("oom killer %d (%s) has mm
> pinned by
> >> %d (%s)\n",
> >> task_pid_nr(victim),
> >> victim->comm,
> >> task_pid_nr(p), p->comm);
> >> continue;
> >> }
> >>
> >> And exit_mmap() automatically has MMF_OOM_SKIP.
> >>
> >> What is the purpose of this check? Is there a better way to
> determine
> >> if a process is an
> >> OOM target? What about check_stable_address_space() ?
> > 1.Sorry, I overlook the situation with if (is_global_init(p)),
> > MMF_OOM_SKIP is indeed not suitable.
> >
> > 2.check_stable_address_space() can indicate oom_reaper, but it
> seems
> > unable to identify the situation where the process exits normally.
> > What about task_is_dying()? static inline bool
> task_is_dying(void) {
> > return tsk_is_oom_victim(current) ||
> fatal_signal_pending(current) ||
> > (current->flags & PF_EXITING); } Thanks
> We can migrate task_is_dying() from mm/memcontrol.c to
> include/linux/oom.h
> > static inline bool task_is_dying(void)
> > {
> > return tsk_is_oom_victim(current) ||
> fatal_signal_pending(current) ||
> > (current->flags & PF_EXITING);
> > }
>
>
> no. current is kswapd.
Hi Barry,
It seems feasible to replace MMF_OOM_SKIP with check_stable_address_space():
check_stable_address_space() can indicate an OOM kill, and
!atomic_read(&vma->vm_mm->mm_users) can indicate a normal process exit.

	/*
	 * Skip the non-shared anonymous folio mapped solely by
	 * the single exiting process, and release it directly
	 * in the process exiting.
	 */
	if ((!atomic_read(&vma->vm_mm->mm_users) ||
	    check_stable_address_space(vma->vm_mm)) &&
	    folio_test_anon(folio) && folio_test_swapbacked(folio) &&
	    !folio_likely_mapped_shared(folio)) {
		pra->referenced = -1;
		page_vma_mapped_walk_done(&pvmw);
		return false;
	}
Thanks
Zhiguo
>
>
> >>
> >>
> >>> diff --git a/mm/vmscan.c b/mm/vmscan.c
> >>> index 0761f91b407f..bae7a8bf6b3d
> >>> --- a/mm/vmscan.c
> >>> +++ b/mm/vmscan.c
> >>> @@ -863,7 +863,12 @@ static enum folio_references
> >>> folio_check_references(struct folio *folio,
> >>> if (vm_flags & VM_LOCKED)
> >>> return FOLIOREF_ACTIVATE;
> >>>
> >>> - /* rmap lock contention: rotate */
> >>> + /*
> >>> + * There are two cases to consider.
> >>> + * 1) Rmap lock contention: rotate.
> >>> + * 2) Skip the non-shared anonymous folio mapped solely by
> >>> + * the single exiting process.
> >>> + */
> >>> if (referenced_ptes == -1)
> >>> return FOLIOREF_KEEP;
> >>>
> >>> --
> >>> 2.39.0
> >>>
> >> Thanks
> >> Barry
> >
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v5] mm: shrink skip folio mapped by an exiting process
2024-07-08 13:11 ` zhiguojiang
@ 2024-07-08 21:34 ` Barry Song
2024-07-09 4:23 ` zhiguojiang
0 siblings, 1 reply; 12+ messages in thread
From: Barry Song @ 2024-07-08 21:34 UTC (permalink / raw)
To: zhiguojiang
Cc: Andrew Morton, linux-mm, LKML, opensource.kernel,
David Hildenbrand, Matthew Wilcox
On Tue, Jul 9, 2024 at 1:11 AM zhiguojiang <justinjiang@vivo.com> wrote:
>
>
>
> On 2024/7/8 20:41, Barry Song wrote:
> >
> >
> > On Tue, Jul 9, 2024 at 00:25, zhiguojiang <justinjiang@vivo.com> wrote:
> >
> >
> >
> > On 2024/7/8 20:17, zhiguojiang wrote:
> > >
> > >
> > > On 2024/7/8 19:02, Barry Song wrote:
> > >> On Mon, Jul 8, 2024 at 9:04 PM Zhiguo Jiang <justinjiang@vivo.com>
> > >> wrote:
> > >>> The releasing process of the non-shared anonymous folio mapped
> > >>> solely by
> > >>> an exiting process may go through two flows: 1) the anonymous
> > folio is
> > >>> firstly is swaped-out into swapspace and transformed into a
> > swp_entry
> > >>> in shrink_folio_list; 2) then the swp_entry is released in the
> > process
> > >>> exiting flow. This will increase the cpu load of releasing a
> > non-shared
> > >>> anonymous folio mapped solely by an exiting process, because
> > the folio
> > >>> go through swap-out and the releasing the swapspace and swp_entry.
> > >>>
> > >>> When system is low memory, it is more likely to occur, because
> > more
> > >>> backend applidatuions will be killed.
> > >>>
> > >>> The modification is that shrink skips the non-shared anonymous
> > folio
> > >>> solely mapped by an exting process and the folio is only released
> > >>> directly in the process exiting flow, which will save swap-out
> > time
> > >>> and alleviate the load of the process exiting.
> > >>>
> > >>> Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com>
> > >>> ---
> > >>>
> > >>> Change log:
> > >>> v4->v5:
> > >>> 1.Modify to skip non-shared anonymous folio only.
> > >>> 2.Update comments for pra->referenced = -1.
> > >>> v3->v4:
> > >>> 1.Modify that the unshared folios mapped only in exiting task
> > are skip.
> > >>> v2->v3:
> > >>> Nothing.
> > >>> v1->v2:
> > >>> 1.The VM_EXITING added in v1 patch is removed, because it will
> > fail
> > >>> to compile in 32-bit system.
> > >>>
> > >>> mm/rmap.c | 13 +++++++++++++
> > >>> mm/vmscan.c | 7 ++++++-
> > >>> 2 files changed, 19 insertions(+), 1 deletion(-)
> > >>>
> > >>> diff --git a/mm/rmap.c b/mm/rmap.c
> > >>> index 26806b49a86f..5b5281d71dbb
> > >>> --- a/mm/rmap.c
> > >>> +++ b/mm/rmap.c
> > >>> @@ -843,6 +843,19 @@ static bool folio_referenced_one(struct
> > folio
> > >>> *folio,
> > >>> int referenced = 0;
> > >>> unsigned long start = address, ptes = 0;
> > >>>
> > >>> + /*
> > >>> + * Skip the non-shared anonymous folio mapped solely by
> > >>> + * the single exiting process, and release it directly
> > >>> + * in the process exiting.
> > >>> + */
> > >>> + if ((!atomic_read(&vma->vm_mm->mm_users) ||
> > >>> + test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags)) &&
> > >>> + folio_test_anon(folio) &&
> > >>> folio_test_swapbacked(folio) &&
> > >>> + !folio_likely_mapped_shared(folio)) {
> > >>> + pra->referenced = -1;
> > >>> + return false;
> > >>> + }
> > >>> +
> > >>> while (page_vma_mapped_walk(&pvmw)) {
> > >>> address = pvmw.address;
> > > Sure, I agree with your modification suggestions. This way,
> > using PTL
> > > indeed sure
> > > that the folio is mapped by this process.
> > > Thanks
> > >> As David suggested, what about the below?
> > >>
> > >> @@ -883,6 +870,21 @@ static bool folio_referenced_one(struct folio
> > >> *folio,
> > >> continue;
> > >> }
> > >>
> > >> + /*
> > >> + * Skip the non-shared anonymous folio mapped
> > solely by
> > >> + * the single exiting process, and release it
> > directly
> > >> + * in the process exiting.
> > >> + */
> > >> + if ((!atomic_read(&vma->vm_mm->mm_users) ||
> > >> + test_bit(MMF_OOM_SKIP,
> > >> &vma->vm_mm->flags)) &&
> > >> + folio_test_anon(folio) &&
> > >> folio_test_swapbacked(folio) &&
> > >> + !folio_likely_mapped_shared(folio)) {
> > >> + pra->referenced = -1;
> > >> + page_vma_mapped_walk_done(&pvmw);
> > >> + return false;
> > >> + }
> > >> +
> > >> if (pvmw.pte) {
> > >> if (lru_gen_enabled() &&
> > >> pte_young(ptep_get(pvmw.pte))) {
> > >>
> > >>
> > >> By the way, I am not convinced that using test_bit(MMF_OOM_SKIP,
> > >> &vma->vm_mm->flags) is
> > >> correct (I think it is wrong). For example, global_init can
> > >> directly have it:
> > >> if (is_global_init(p)) {
> > >> can_oom_reap = false;
> > >> set_bit(MMF_OOM_SKIP, &mm->flags);
> > >> pr_info("oom killer %d (%s) has mm
> > pinned by
> > >> %d (%s)\n",
> > >> task_pid_nr(victim),
> > >> victim->comm,
> > >> task_pid_nr(p), p->comm);
> > >> continue;
> > >> }
> > >>
> > >> And exit_mmap() automatically has MMF_OOM_SKIP.
> > >>
> > >> What is the purpose of this check? Is there a better way to
> > determine
> > >> if a process is an
> > >> OOM target? What about check_stable_address_space() ?
> > > 1.Sorry, I overlook the situation with if (is_global_init(p)),
> > > MMF_OOM_SKIP is indeed not suitable.
> > >
> > > 2.check_stable_address_space() can indicate oom_reaper, but it
> > seems
> > > unable to identify the situation where the process exits normally.
> > > What about task_is_dying()? static inline bool
> > task_is_dying(void) {
> > > return tsk_is_oom_victim(current) ||
> > fatal_signal_pending(current) ||
> > > (current->flags & PF_EXITING); } Thanks
> > We can migrate task_is_dying() from mm/memcontrol.c to
> > include/linux/oom.h
> > > static inline bool task_is_dying(void)
> > > {
> > > return tsk_is_oom_victim(current) ||
> > fatal_signal_pending(current) ||
> > > (current->flags & PF_EXITING);
> > > }
> >
> >
> > no. current is kswapd.
> Hi Barry,
>
> It seems feasible for check_stable_address_space() replacing MMF_OOM_SKIP.
> check_stable_address_space() can indicate oom kill, and
> !atomic_read(&vma->vm_mm->mm_users)
> can indicate the normal process exiting.
>
> /*
> * Skip the non-shared anonymous folio mapped solely by
> * the single exiting process, and release it directly
> * in the process exiting.
> */
> if ((!atomic_read(&vma->vm_mm->mm_users) ||
> check_stable_address_space(vma->vm_mm)) &&
> folio_test_anon(folio) && folio_test_swapbacked(folio) &&
> !folio_likely_mapped_shared(folio)) {
> pra->referenced = -1;
> page_vma_mapped_walk_done(&pvmw);
> return false;
> }
>
Yes. + David, Willy (when you send a new version, please CC the people
who have participated and describe how you have addressed their
comments.)

I also think we can actually remove "folio_test_anon(folio)".

So it could be:
@@ -883,6 +871,21 @@ static bool folio_referenced_one(struct folio *folio,
 			continue;
 		}
 
+		/*
+		 * Skip the non-shared swapbacked folio mapped solely by
+		 * the exiting or OOM-reaped process. This avoids redundant
+		 * swap-out followed by an immediate unmap.
+		 */
+		if ((!atomic_read(&vma->vm_mm->mm_users) ||
+		    check_stable_address_space(vma->vm_mm)) &&
+		    folio_test_swapbacked(folio) &&
+		    !folio_likely_mapped_shared(folio)) {
+			pra->referenced = -1;
+			page_vma_mapped_walk_done(&pvmw);
+			return false;
+		}
+
 		if (pvmw.pte) {
 			if (lru_gen_enabled() &&
 			    pte_young(ptep_get(pvmw.pte))) {
> Thanks
> Zhiguo
> >
> >
> > >>
> > >>
> > >>> diff --git a/mm/vmscan.c b/mm/vmscan.c
> > >>> index 0761f91b407f..bae7a8bf6b3d
> > >>> --- a/mm/vmscan.c
> > >>> +++ b/mm/vmscan.c
> > >>> @@ -863,7 +863,12 @@ static enum folio_references
> > >>> folio_check_references(struct folio *folio,
> > >>> if (vm_flags & VM_LOCKED)
> > >>> return FOLIOREF_ACTIVATE;
> > >>>
> > >>> - /* rmap lock contention: rotate */
> > >>> + /*
> > >>> + * There are two cases to consider.
> > >>> + * 1) Rmap lock contention: rotate.
> > >>> + * 2) Skip the non-shared anonymous folio mapped solely by
> > >>> + * the single exiting process.
> > >>> + */
> > >>> if (referenced_ptes == -1)
> > >>> return FOLIOREF_KEEP;
> > >>>
> > >>> --
> > >>> 2.39.0
> > >>>
> > >> Thanks
> > >> Barry
> > >
> >
>
Thanks
Barry
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v5] mm: shrink skip folio mapped by an exiting process
2024-07-08 21:34 ` Barry Song
@ 2024-07-09 4:23 ` zhiguojiang
0 siblings, 0 replies; 12+ messages in thread
From: zhiguojiang @ 2024-07-09 4:23 UTC (permalink / raw)
To: Barry Song
Cc: Andrew Morton, linux-mm, LKML, opensource.kernel,
David Hildenbrand, Matthew Wilcox
On 2024/7/9 5:34, Barry Song wrote:
> On Tue, Jul 9, 2024 at 1:11 AM zhiguojiang <justinjiang@vivo.com> wrote:
>>
>>
>> On 2024/7/8 20:41, Barry Song wrote:
>>>
>>> On Tue, Jul 9, 2024 at 00:25, zhiguojiang <justinjiang@vivo.com> wrote:
>>>
>>>
>>>
>>> On 2024/7/8 20:17, zhiguojiang wrote:
>>> >
>>> >
>>> > On 2024/7/8 19:02, Barry Song wrote:
>>> >> On Mon, Jul 8, 2024 at 9:04 PM Zhiguo Jiang <justinjiang@vivo.com>
>>> >> wrote:
>>> >>> The releasing process of the non-shared anonymous folio mapped
>>> >>> solely by
>>> >>> an exiting process may go through two flows: 1) the anonymous
>>> folio is
>>> >>> firstly is swaped-out into swapspace and transformed into a
>>> swp_entry
>>> >>> in shrink_folio_list; 2) then the swp_entry is released in the
>>> process
>>> >>> exiting flow. This will increase the cpu load of releasing a
>>> non-shared
>>> >>> anonymous folio mapped solely by an exiting process, because
>>> the folio
>>> >>> go through swap-out and the releasing the swapspace and swp_entry.
>>> >>>
>>> >>> When system is low memory, it is more likely to occur, because
>>> more
>>> >>> backend applidatuions will be killed.
>>> >>>
>>> >>> The modification is that shrink skips the non-shared anonymous
>>> folio
>>> >>> solely mapped by an exting process and the folio is only released
>>> >>> directly in the process exiting flow, which will save swap-out
>>> time
>>> >>> and alleviate the load of the process exiting.
>>> >>>
>>> >>> Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com>
>>> >>> ---
>>> >>>
>>> >>> Change log:
>>> >>> v4->v5:
>>> >>> 1.Modify to skip non-shared anonymous folio only.
>>> >>> 2.Update comments for pra->referenced = -1.
>>> >>> v3->v4:
>>> >>> 1.Modify that the unshared folios mapped only in exiting task
>>> are skip.
>>> >>> v2->v3:
>>> >>> Nothing.
>>> >>> v1->v2:
>>> >>> 1.The VM_EXITING added in v1 patch is removed, because it will
>>> fail
>>> >>> to compile in 32-bit system.
>>> >>>
>>> >>> mm/rmap.c | 13 +++++++++++++
>>> >>> mm/vmscan.c | 7 ++++++-
>>> >>> 2 files changed, 19 insertions(+), 1 deletion(-)
>>> >>>
>>> >>> diff --git a/mm/rmap.c b/mm/rmap.c
>>> >>> index 26806b49a86f..5b5281d71dbb
>>> >>> --- a/mm/rmap.c
>>> >>> +++ b/mm/rmap.c
>>> >>> @@ -843,6 +843,19 @@ static bool folio_referenced_one(struct
>>> folio
>>> >>> *folio,
>>> >>> int referenced = 0;
>>> >>> unsigned long start = address, ptes = 0;
>>> >>>
>>> >>> + /*
>>> >>> + * Skip the non-shared anonymous folio mapped solely by
>>> >>> + * the single exiting process, and release it directly
>>> >>> + * in the process exiting.
>>> >>> + */
>>> >>> + if ((!atomic_read(&vma->vm_mm->mm_users) ||
>>> >>> + test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags)) &&
>>> >>> + folio_test_anon(folio) &&
>>> >>> folio_test_swapbacked(folio) &&
>>> >>> + !folio_likely_mapped_shared(folio)) {
>>> >>> + pra->referenced = -1;
>>> >>> + return false;
>>> >>> + }
>>> >>> +
>>> >>> while (page_vma_mapped_walk(&pvmw)) {
>>> >>> address = pvmw.address;
>>> > Sure, I agree with your modification suggestions. This way,
>>> using PTL
>>> > indeed sure
>>> > that the folio is mapped by this process.
>>> > Thanks
>>> >> As David suggested, what about the below?
>>> >>
>>> >> @@ -883,6 +870,21 @@ static bool folio_referenced_one(struct folio
>>> >> *folio,
>>> >> continue;
>>> >> }
>>> >>
>>> >> + /*
>>> >> + * Skip the non-shared anonymous folio mapped
>>> solely by
>>> >> + * the single exiting process, and release it
>>> directly
>>> >> + * in the process exiting.
>>> >> + */
>>> >> + if ((!atomic_read(&vma->vm_mm->mm_users) ||
>>> >> + test_bit(MMF_OOM_SKIP,
>>> >> &vma->vm_mm->flags)) &&
>>> >> + folio_test_anon(folio) &&
>>> >> folio_test_swapbacked(folio) &&
>>> >> + !folio_likely_mapped_shared(folio)) {
>>> >> + pra->referenced = -1;
>>> >> + page_vma_mapped_walk_done(&pvmw);
>>> >> + return false;
>>> >> + }
>>> >> +
>>> >> if (pvmw.pte) {
>>> >> if (lru_gen_enabled() &&
>>> >> pte_young(ptep_get(pvmw.pte))) {
>>> >>
>>> >>
>>> >> By the way, I am not convinced that using test_bit(MMF_OOM_SKIP,
>>> >> &vma->vm_mm->flags) is
>>> >> correct (I think it is wrong). For example, global_init can
>>> >> directly have it:
>>> >> if (is_global_init(p)) {
>>> >> can_oom_reap = false;
>>> >> set_bit(MMF_OOM_SKIP, &mm->flags);
>>> >> pr_info("oom killer %d (%s) has mm
>>> pinned by
>>> >> %d (%s)\n",
>>> >> task_pid_nr(victim),
>>> >> victim->comm,
>>> >> task_pid_nr(p), p->comm);
>>> >> continue;
>>> >> }
>>> >>
>>> >> And exit_mmap() automatically has MMF_OOM_SKIP.
>>> >>
>>> >> What is the purpose of this check? Is there a better way to
>>> determine
>>> >> if a process is an
>>> >> OOM target? What about check_stable_address_space() ?
>>> > 1.Sorry, I overlook the situation with if (is_global_init(p)),
>>> > MMF_OOM_SKIP is indeed not suitable.
>>> >
>>> > 2.check_stable_address_space() can indicate oom_reaper, but it
>>> seems
>>> > unable to identify the situation where the process exits normally.
>>> > What about task_is_dying()? static inline bool
>>> task_is_dying(void) {
>>> > return tsk_is_oom_victim(current) ||
>>> fatal_signal_pending(current) ||
>>> > (current->flags & PF_EXITING); } Thanks
>>> We can migrate task_is_dying() from mm/memcontrol.c to
>>> include/linux/oom.h
>>> > static inline bool task_is_dying(void)
>>> > {
>>> > return tsk_is_oom_victim(current) ||
>>> fatal_signal_pending(current) ||
>>> > (current->flags & PF_EXITING);
>>> > }
>>>
>>>
>>> no. current is kswapd.
>> Hi Barry,
>>
>> It seems feasible for check_stable_address_space() replacing MMF_OOM_SKIP.
>> check_stable_address_space() can indicate oom kill, and
>> !atomic_read(&vma->vm_mm->mm_users)
>> can indicate the normal process exiting.
>>
>> /*
>> * Skip the non-shared anonymous folio mapped solely by
>> * the single exiting process, and release it directly
>> * in the process exiting.
>> */
>> if ((!atomic_read(&vma->vm_mm->mm_users) ||
>> check_stable_address_space(vma->vm_mm)) &&
>> folio_test_anon(folio) && folio_test_swapbacked(folio) &&
>> !folio_likely_mapped_shared(folio)) {
>> pra->referenced = -1;
>> page_vma_mapped_walk_done(&pvmw);
>> return false;
>> }
>>
> Yes, + David, Willy (when you send a new version, please CC people who have
> participated and describe how you have addressed comments from those
> people.)
>
> I also think we actually can remove "folio_test_anon(folio)".
>
> So It could be,
>
> @@ -883,6 +871,21 @@ static bool folio_referenced_one(struct folio *folio,
> continue;
> }
>
> + /*
> + * Skip the non-shared swapbacked folio mapped solely by
> + * the exiting or OOM-reaped process. This avoids redundant
> + * swap-out followed by an immediate unmap.
> + */
> + if ((!atomic_read(&vma->vm_mm->mm_users) ||
> + check_stable_address_space(vma->vm_mm)) &&
> + folio_test_swapbacked(folio) &&
> + !folio_likely_mapped_shared(folio)) {
> + pra->referenced = -1;
> + page_vma_mapped_walk_done(&pvmw);
> + return false;
> + }
> +
> if (pvmw.pte) {
> if (lru_gen_enabled() &&
> pte_young(ptep_get(pvmw.pte))) {
OK, updated in patch v6:
https://lore.kernel.org/linux-mm/20240709042122.631-1-justinjiang@vivo.com/
Thanks
>
>> Thanks
>> Zhiguo
>>>
>>> >>
>>> >>
>>> >>> diff --git a/mm/vmscan.c b/mm/vmscan.c
>>> >>> index 0761f91b407f..bae7a8bf6b3d
>>> >>> --- a/mm/vmscan.c
>>> >>> +++ b/mm/vmscan.c
>>> >>> @@ -863,7 +863,12 @@ static enum folio_references
>>> >>> folio_check_references(struct folio *folio,
>>> >>> if (vm_flags & VM_LOCKED)
>>> >>> return FOLIOREF_ACTIVATE;
>>> >>>
>>> >>> - /* rmap lock contention: rotate */
>>> >>> + /*
>>> >>> + * There are two cases to consider.
>>> >>> + * 1) Rmap lock contention: rotate.
>>> >>> + * 2) Skip the non-shared anonymous folio mapped solely by
>>> >>> + * the single exiting process.
>>> >>> + */
>>> >>> if (referenced_ptes == -1)
>>> >>> return FOLIOREF_KEEP;
>>> >>>
>>> >>> --
>>> >>> 2.39.0
>>> >>>
>>> >> Thanks
>>> >> Barry
>>> >
>>>
> Thanks
> Barry
^ permalink raw reply [flat|nested] 12+ messages in thread
Thread overview: 12+ messages
2024-07-08 9:04 [PATCH v5] mm: shrink skip folio mapped by an exiting process Zhiguo Jiang
2024-07-08 9:36 ` David Hildenbrand
2024-07-08 9:46 ` Barry Song
2024-07-08 9:49 ` David Hildenbrand
2024-07-08 10:05 ` Barry Song
2024-07-08 11:02 ` Barry Song
2024-07-08 12:17 ` zhiguojiang
2024-07-08 12:25 ` zhiguojiang
2024-07-08 12:41 ` Barry Song
2024-07-08 13:11 ` zhiguojiang
2024-07-08 21:34 ` Barry Song
2024-07-09 4:23 ` zhiguojiang