* [PATCH v5] mm: shrink skip folio mapped by an exiting process
@ 2024-07-08 9:04 Zhiguo Jiang
  2024-07-08 9:36 ` David Hildenbrand
  2024-07-08 11:02 ` Barry Song
  0 siblings, 2 replies; 12+ messages in thread

From: Zhiguo Jiang @ 2024-07-08 9:04 UTC (permalink / raw)
  To: Andrew Morton, linux-mm, linux-kernel, Barry Song
  Cc: opensource.kernel, Zhiguo Jiang

Releasing a non-shared anonymous folio mapped solely by an exiting
process may go through two flows: 1) the anonymous folio is first
swapped out to swap space and turned into a swp_entry in
shrink_folio_list(); 2) the swp_entry is then released in the process
exit path. This increases the CPU load of releasing such a folio,
because it goes through a swap-out followed almost immediately by
freeing the swap space and the swp_entry.

When the system is low on memory this is more likely to occur, because
more background applications will be killed.

With this change, shrink skips the non-shared anonymous folio solely
mapped by an exiting process, and the folio is released directly in the
process exit path instead, which saves the swap-out work and reduces
the load of process exit.

Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com>
---

Change log:
v4->v5:
1. Modify to skip non-shared anonymous folios only.
2. Update the comment for pra->referenced = -1.
v3->v4:
1. Modify so that unshared folios mapped only in the exiting task are skipped.
v2->v3:
Nothing.
v1->v2:
1. The VM_EXITING added in the v1 patch is removed, because it fails
to compile on 32-bit systems.

 mm/rmap.c   | 13 +++++++++++++
 mm/vmscan.c |  7 ++++++-
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index 26806b49a86f..5b5281d71dbb
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -843,6 +843,19 @@ static bool folio_referenced_one(struct folio *folio,
 	int referenced = 0;
 	unsigned long start = address, ptes = 0;
 
+	/*
+	 * Skip the non-shared anonymous folio mapped solely by
+	 * the single exiting process, and release it directly
+	 * in the process exiting.
+	 */
+	if ((!atomic_read(&vma->vm_mm->mm_users) ||
+	    test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags)) &&
+	    folio_test_anon(folio) && folio_test_swapbacked(folio) &&
+	    !folio_likely_mapped_shared(folio)) {
+		pra->referenced = -1;
+		return false;
+	}
+
 	while (page_vma_mapped_walk(&pvmw)) {
 		address = pvmw.address;

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 0761f91b407f..bae7a8bf6b3d
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -863,7 +863,12 @@ static enum folio_references folio_check_references(struct folio *folio,
 	if (vm_flags & VM_LOCKED)
 		return FOLIOREF_ACTIVATE;
 
-	/* rmap lock contention: rotate */
+	/*
+	 * There are two cases to consider.
+	 * 1) Rmap lock contention: rotate.
+	 * 2) Skip the non-shared anonymous folio mapped solely by
+	 *    the single exiting process.
+	 */
 	if (referenced_ptes == -1)
 		return FOLIOREF_KEEP;
 
-- 
2.39.0

^ permalink raw reply	[flat|nested] 12+ messages in thread
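For readers tracing the flow end to end: the pra->referenced = -1 sentinel set in folio_referenced_one() propagates through folio_referenced() into folio_check_references(), where the vmscan.c hunk above turns it into FOLIOREF_KEEP, and shrink_folio_list() then leaves the folio on the LRU instead of unmapping and swapping it out; the exiting process's own teardown frees it. A simplified paraphrase of that consumer branch in mm/vmscan.c (not verbatim kernel source; details vary slightly between versions):

	references = folio_check_references(folio, sc);
	switch (references) {
	case FOLIOREF_ACTIVATE:
		goto activate_locked;		/* back to the active LRU */
	case FOLIOREF_KEEP:			/* rmap lock contention, or the new skip */
		stat->nr_ref_keep += nr_pages;
		goto keep_locked;		/* stays on the LRU; no swap-out, no I/O */
	case FOLIOREF_RECLAIM:
	case FOLIOREF_RECLAIM_CLEAN:
		;				/* fall through to try_to_unmap()/pageout() */
	}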
* Re: [PATCH v5] mm: shrink skip folio mapped by an exiting process 2024-07-08 9:04 [PATCH v5] mm: shrink skip folio mapped by an exiting process Zhiguo Jiang @ 2024-07-08 9:36 ` David Hildenbrand 2024-07-08 9:46 ` Barry Song 2024-07-08 11:02 ` Barry Song 1 sibling, 1 reply; 12+ messages in thread From: David Hildenbrand @ 2024-07-08 9:36 UTC (permalink / raw) To: Zhiguo Jiang, Andrew Morton, linux-mm, linux-kernel, Barry Song Cc: opensource.kernel On 08.07.24 11:04, Zhiguo Jiang wrote: > The releasing process of the non-shared anonymous folio mapped solely by > an exiting process may go through two flows: 1) the anonymous folio is > firstly is swaped-out into swapspace and transformed into a swp_entry > in shrink_folio_list; 2) then the swp_entry is released in the process > exiting flow. This will increase the cpu load of releasing a non-shared > anonymous folio mapped solely by an exiting process, because the folio > go through swap-out and the releasing the swapspace and swp_entry. > > When system is low memory, it is more likely to occur, because more > backend applidatuions will be killed. > > The modification is that shrink skips the non-shared anonymous folio > solely mapped by an exting process and the folio is only released > directly in the process exiting flow, which will save swap-out time > and alleviate the load of the process exiting. > > Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com> > --- > > Change log: > v4->v5: > 1.Modify to skip non-shared anonymous folio only. > 2.Update comments for pra->referenced = -1. > v3->v4: > 1.Modify that the unshared folios mapped only in exiting task are skip. > v2->v3: > Nothing. > v1->v2: > 1.The VM_EXITING added in v1 patch is removed, because it will fail > to compile in 32-bit system. > > mm/rmap.c | 13 +++++++++++++ > mm/vmscan.c | 7 ++++++- > 2 files changed, 19 insertions(+), 1 deletion(-) > > diff --git a/mm/rmap.c b/mm/rmap.c > index 26806b49a86f..5b5281d71dbb > --- a/mm/rmap.c > +++ b/mm/rmap.c > @@ -843,6 +843,19 @@ static bool folio_referenced_one(struct folio *folio, > int referenced = 0; > unsigned long start = address, ptes = 0; > > + /* > + * Skip the non-shared anonymous folio mapped solely by > + * the single exiting process, and release it directly > + * in the process exiting. > + */ > + if ((!atomic_read(&vma->vm_mm->mm_users) || > + test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags)) && > + folio_test_anon(folio) && folio_test_swapbacked(folio) && > + !folio_likely_mapped_shared(folio)) { I'm currently working on moving all folio_likely_mapped_shared() under the PTL, where we are then sure that the folio is actually mapped by this process (e.g., no concurrent unmapping poisslbe). Can we do the same here directly? -- Cheers, David / dhildenb ^ permalink raw reply [flat|nested] 12+ messages in thread
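For context on the helper David wants moved under the PTL: folio_likely_mapped_shared() is a mapcount-based heuristic in mm/internal.h that guesses whether the folio is mapped into more than one address space. Roughly, as of this time frame (a paraphrase, not the exact source):

	static inline bool folio_likely_mapped_shared(struct folio *folio)
	{
		int mapcount = folio_mapcount(folio);

		/* Small folios and hugetlb: more than one mapping means shared. */
		if (!folio_test_large(folio) || unlikely(folio_test_hugetlb(folio)))
			return mapcount > 1;

		/* Large pte-mapped folios: guess from the total mapcount. */
		if (mapcount <= 1)
			return false;
		if (folio_entire_mapcount(folio) || mapcount > folio_nr_pages(folio))
			return true;

		/* Otherwise guess based on the first page of the folio. */
		return atomic_read(&folio->_mapcount) > 0;
	}

Because it is only a guess based on mapcounts, its result is meaningful only while the folio is guaranteed to still be mapped by this mm, which is exactly what holding the PTL during the rmap walk provides.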
* Re: [PATCH v5] mm: shrink skip folio mapped by an exiting process 2024-07-08 9:36 ` David Hildenbrand @ 2024-07-08 9:46 ` Barry Song 2024-07-08 9:49 ` David Hildenbrand 0 siblings, 1 reply; 12+ messages in thread From: Barry Song @ 2024-07-08 9:46 UTC (permalink / raw) To: David Hildenbrand Cc: Zhiguo Jiang, Andrew Morton, linux-mm, linux-kernel, opensource.kernel On Mon, Jul 8, 2024 at 9:36 PM David Hildenbrand <david@redhat.com> wrote: > > On 08.07.24 11:04, Zhiguo Jiang wrote: > > The releasing process of the non-shared anonymous folio mapped solely by > > an exiting process may go through two flows: 1) the anonymous folio is > > firstly is swaped-out into swapspace and transformed into a swp_entry > > in shrink_folio_list; 2) then the swp_entry is released in the process > > exiting flow. This will increase the cpu load of releasing a non-shared > > anonymous folio mapped solely by an exiting process, because the folio > > go through swap-out and the releasing the swapspace and swp_entry. > > > > When system is low memory, it is more likely to occur, because more > > backend applidatuions will be killed. > > > > The modification is that shrink skips the non-shared anonymous folio > > solely mapped by an exting process and the folio is only released > > directly in the process exiting flow, which will save swap-out time > > and alleviate the load of the process exiting. > > > > Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com> > > --- > > > > Change log: > > v4->v5: > > 1.Modify to skip non-shared anonymous folio only. > > 2.Update comments for pra->referenced = -1. > > v3->v4: > > 1.Modify that the unshared folios mapped only in exiting task are skip. > > v2->v3: > > Nothing. > > v1->v2: > > 1.The VM_EXITING added in v1 patch is removed, because it will fail > > to compile in 32-bit system. > > > > mm/rmap.c | 13 +++++++++++++ > > mm/vmscan.c | 7 ++++++- > > 2 files changed, 19 insertions(+), 1 deletion(-) > > > > diff --git a/mm/rmap.c b/mm/rmap.c > > index 26806b49a86f..5b5281d71dbb > > --- a/mm/rmap.c > > +++ b/mm/rmap.c > > @@ -843,6 +843,19 @@ static bool folio_referenced_one(struct folio *folio, > > int referenced = 0; > > unsigned long start = address, ptes = 0; > > > > + /* > > + * Skip the non-shared anonymous folio mapped solely by > > + * the single exiting process, and release it directly > > + * in the process exiting. > > + */ > > + if ((!atomic_read(&vma->vm_mm->mm_users) || > > + test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags)) && > > + folio_test_anon(folio) && folio_test_swapbacked(folio) && > > + !folio_likely_mapped_shared(folio)) { > > I'm currently working on moving all folio_likely_mapped_shared() under > the PTL, where we are then sure that the folio is actually mapped by > this process (e.g., no concurrent unmapping poisslbe). > > Can we do the same here directly? Implementing this is challenging because page_vma_mapped_walk() is responsible for traversing the page table to acquire and release the PTL. This becomes particularly complex with mTHP, as we may need to interrupt the page_vma_mapped_walk loop at the first PTE. > > -- > Cheers, > > David / dhildenb > Thanks Barry ^ permalink raw reply [flat|nested] 12+ messages in thread
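For reference, the locking contract Barry describes: page_vma_mapped_walk() takes the relevant page table lock before reporting each mapping and keeps it held while the caller inspects pvmw.pte/pvmw.pmd; page_vma_mapped_walk_done() releases it if the caller stops early. Schematically (names as in mm/rmap.c, per-PTE body elided):

	DEFINE_FOLIO_VMA_WALK(pvmw, folio, vma, address, 0);

	while (page_vma_mapped_walk(&pvmw)) {
		/*
		 * pvmw.ptl is held here, so checks such as "is this folio
		 * still mapped by this mm?" are stable against concurrent
		 * unmapping. The lock is dropped when the walk moves on,
		 * or by page_vma_mapped_walk_done() on an early exit.
		 */
		/* ... per-PTE work ... */
	}

The mTHP complication Barry mentions is that a large folio can be mapped by several PTEs in the same VMA, so an early exit has to happen at the first reported PTE rather than after the loop completes.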
* Re: [PATCH v5] mm: shrink skip folio mapped by an exiting process 2024-07-08 9:46 ` Barry Song @ 2024-07-08 9:49 ` David Hildenbrand 2024-07-08 10:05 ` Barry Song 0 siblings, 1 reply; 12+ messages in thread From: David Hildenbrand @ 2024-07-08 9:49 UTC (permalink / raw) To: Barry Song Cc: Zhiguo Jiang, Andrew Morton, linux-mm, linux-kernel, opensource.kernel On 08.07.24 11:46, Barry Song wrote: > On Mon, Jul 8, 2024 at 9:36 PM David Hildenbrand <david@redhat.com> wrote: >> >> On 08.07.24 11:04, Zhiguo Jiang wrote: >>> The releasing process of the non-shared anonymous folio mapped solely by >>> an exiting process may go through two flows: 1) the anonymous folio is >>> firstly is swaped-out into swapspace and transformed into a swp_entry >>> in shrink_folio_list; 2) then the swp_entry is released in the process >>> exiting flow. This will increase the cpu load of releasing a non-shared >>> anonymous folio mapped solely by an exiting process, because the folio >>> go through swap-out and the releasing the swapspace and swp_entry. >>> >>> When system is low memory, it is more likely to occur, because more >>> backend applidatuions will be killed. >>> >>> The modification is that shrink skips the non-shared anonymous folio >>> solely mapped by an exting process and the folio is only released >>> directly in the process exiting flow, which will save swap-out time >>> and alleviate the load of the process exiting. >>> >>> Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com> >>> --- >>> >>> Change log: >>> v4->v5: >>> 1.Modify to skip non-shared anonymous folio only. >>> 2.Update comments for pra->referenced = -1. >>> v3->v4: >>> 1.Modify that the unshared folios mapped only in exiting task are skip. >>> v2->v3: >>> Nothing. >>> v1->v2: >>> 1.The VM_EXITING added in v1 patch is removed, because it will fail >>> to compile in 32-bit system. >>> >>> mm/rmap.c | 13 +++++++++++++ >>> mm/vmscan.c | 7 ++++++- >>> 2 files changed, 19 insertions(+), 1 deletion(-) >>> >>> diff --git a/mm/rmap.c b/mm/rmap.c >>> index 26806b49a86f..5b5281d71dbb >>> --- a/mm/rmap.c >>> +++ b/mm/rmap.c >>> @@ -843,6 +843,19 @@ static bool folio_referenced_one(struct folio *folio, >>> int referenced = 0; >>> unsigned long start = address, ptes = 0; >>> >>> + /* >>> + * Skip the non-shared anonymous folio mapped solely by >>> + * the single exiting process, and release it directly >>> + * in the process exiting. >>> + */ >>> + if ((!atomic_read(&vma->vm_mm->mm_users) || >>> + test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags)) && >>> + folio_test_anon(folio) && folio_test_swapbacked(folio) && >>> + !folio_likely_mapped_shared(folio)) { >> >> I'm currently working on moving all folio_likely_mapped_shared() under >> the PTL, where we are then sure that the folio is actually mapped by >> this process (e.g., no concurrent unmapping poisslbe). >> >> Can we do the same here directly? > > Implementing this is challenging because page_vma_mapped_walk() is > responsible for > traversing the page table to acquire and release the PTL. This becomes > particularly > complex with mTHP, as we may need to interrupt the page_vma_mapped_walk > loop at the first PTE. Why can't we perform the check under the PTL and bail out? I'm missing something important. -- Cheers, David / dhildenb ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v5] mm: shrink skip folio mapped by an exiting process 2024-07-08 9:49 ` David Hildenbrand @ 2024-07-08 10:05 ` Barry Song 0 siblings, 0 replies; 12+ messages in thread From: Barry Song @ 2024-07-08 10:05 UTC (permalink / raw) To: David Hildenbrand Cc: Zhiguo Jiang, Andrew Morton, linux-mm, linux-kernel, opensource.kernel On Mon, Jul 8, 2024 at 9:49 PM David Hildenbrand <david@redhat.com> wrote: > > On 08.07.24 11:46, Barry Song wrote: > > On Mon, Jul 8, 2024 at 9:36 PM David Hildenbrand <david@redhat.com> wrote: > >> > >> On 08.07.24 11:04, Zhiguo Jiang wrote: > >>> The releasing process of the non-shared anonymous folio mapped solely by > >>> an exiting process may go through two flows: 1) the anonymous folio is > >>> firstly is swaped-out into swapspace and transformed into a swp_entry > >>> in shrink_folio_list; 2) then the swp_entry is released in the process > >>> exiting flow. This will increase the cpu load of releasing a non-shared > >>> anonymous folio mapped solely by an exiting process, because the folio > >>> go through swap-out and the releasing the swapspace and swp_entry. > >>> > >>> When system is low memory, it is more likely to occur, because more > >>> backend applidatuions will be killed. > >>> > >>> The modification is that shrink skips the non-shared anonymous folio > >>> solely mapped by an exting process and the folio is only released > >>> directly in the process exiting flow, which will save swap-out time > >>> and alleviate the load of the process exiting. > >>> > >>> Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com> > >>> --- > >>> > >>> Change log: > >>> v4->v5: > >>> 1.Modify to skip non-shared anonymous folio only. > >>> 2.Update comments for pra->referenced = -1. > >>> v3->v4: > >>> 1.Modify that the unshared folios mapped only in exiting task are skip. > >>> v2->v3: > >>> Nothing. > >>> v1->v2: > >>> 1.The VM_EXITING added in v1 patch is removed, because it will fail > >>> to compile in 32-bit system. > >>> > >>> mm/rmap.c | 13 +++++++++++++ > >>> mm/vmscan.c | 7 ++++++- > >>> 2 files changed, 19 insertions(+), 1 deletion(-) > >>> > >>> diff --git a/mm/rmap.c b/mm/rmap.c > >>> index 26806b49a86f..5b5281d71dbb > >>> --- a/mm/rmap.c > >>> +++ b/mm/rmap.c > >>> @@ -843,6 +843,19 @@ static bool folio_referenced_one(struct folio *folio, > >>> int referenced = 0; > >>> unsigned long start = address, ptes = 0; > >>> > >>> + /* > >>> + * Skip the non-shared anonymous folio mapped solely by > >>> + * the single exiting process, and release it directly > >>> + * in the process exiting. > >>> + */ > >>> + if ((!atomic_read(&vma->vm_mm->mm_users) || > >>> + test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags)) && > >>> + folio_test_anon(folio) && folio_test_swapbacked(folio) && > >>> + !folio_likely_mapped_shared(folio)) { > >> > >> I'm currently working on moving all folio_likely_mapped_shared() under > >> the PTL, where we are then sure that the folio is actually mapped by > >> this process (e.g., no concurrent unmapping poisslbe). > >> > >> Can we do the same here directly? > > > > Implementing this is challenging because page_vma_mapped_walk() is > > responsible for > > traversing the page table to acquire and release the PTL. This becomes > > particularly > > complex with mTHP, as we may need to interrupt the page_vma_mapped_walk > > loop at the first PTE. > > Why can't we perform the check under the PTL and bail out? I'm missing > something important. You are right. we might use page_vma_mapped_walk_done() to bail out. 
> > -- > Cheers, > > David / dhildenb > ^ permalink raw reply [flat|nested] 12+ messages in thread
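There is already precedent for this kind of early exit in folio_referenced_one() itself: the VM_LOCKED handling stops the rmap walk from inside the loop in the same way. Roughly (paraphrased, surrounding large-folio bookkeeping omitted):

	if ((vma->vm_flags & VM_LOCKED) &&
	    (!folio_test_large(folio) || !pvmw.pte)) {
		/* Restore the mlock which got missed */
		mlock_vma_folio(folio, vma);
		page_vma_mapped_walk_done(&pvmw);
		pra->vm_flags |= VM_LOCKED;
		return false;	/* To break the loop */
	}

So moving the exiting-process check inside the loop and bailing out with page_vma_mapped_walk_done() follows an established pattern.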
* Re: [PATCH v5] mm: shrink skip folio mapped by an exiting process 2024-07-08 9:04 [PATCH v5] mm: shrink skip folio mapped by an exiting process Zhiguo Jiang 2024-07-08 9:36 ` David Hildenbrand @ 2024-07-08 11:02 ` Barry Song 2024-07-08 12:17 ` zhiguojiang 1 sibling, 1 reply; 12+ messages in thread From: Barry Song @ 2024-07-08 11:02 UTC (permalink / raw) To: Zhiguo Jiang; +Cc: Andrew Morton, linux-mm, linux-kernel, opensource.kernel On Mon, Jul 8, 2024 at 9:04 PM Zhiguo Jiang <justinjiang@vivo.com> wrote: > > The releasing process of the non-shared anonymous folio mapped solely by > an exiting process may go through two flows: 1) the anonymous folio is > firstly is swaped-out into swapspace and transformed into a swp_entry > in shrink_folio_list; 2) then the swp_entry is released in the process > exiting flow. This will increase the cpu load of releasing a non-shared > anonymous folio mapped solely by an exiting process, because the folio > go through swap-out and the releasing the swapspace and swp_entry. > > When system is low memory, it is more likely to occur, because more > backend applidatuions will be killed. > > The modification is that shrink skips the non-shared anonymous folio > solely mapped by an exting process and the folio is only released > directly in the process exiting flow, which will save swap-out time > and alleviate the load of the process exiting. > > Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com> > --- > > Change log: > v4->v5: > 1.Modify to skip non-shared anonymous folio only. > 2.Update comments for pra->referenced = -1. > v3->v4: > 1.Modify that the unshared folios mapped only in exiting task are skip. > v2->v3: > Nothing. > v1->v2: > 1.The VM_EXITING added in v1 patch is removed, because it will fail > to compile in 32-bit system. > > mm/rmap.c | 13 +++++++++++++ > mm/vmscan.c | 7 ++++++- > 2 files changed, 19 insertions(+), 1 deletion(-) > > diff --git a/mm/rmap.c b/mm/rmap.c > index 26806b49a86f..5b5281d71dbb > --- a/mm/rmap.c > +++ b/mm/rmap.c > @@ -843,6 +843,19 @@ static bool folio_referenced_one(struct folio *folio, > int referenced = 0; > unsigned long start = address, ptes = 0; > > + /* > + * Skip the non-shared anonymous folio mapped solely by > + * the single exiting process, and release it directly > + * in the process exiting. > + */ > + if ((!atomic_read(&vma->vm_mm->mm_users) || > + test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags)) && > + folio_test_anon(folio) && folio_test_swapbacked(folio) && > + !folio_likely_mapped_shared(folio)) { > + pra->referenced = -1; > + return false; > + } > + > while (page_vma_mapped_walk(&pvmw)) { > address = pvmw.address; As David suggested, what about the below? @@ -883,6 +870,21 @@ static bool folio_referenced_one(struct folio *folio, continue; } + /* + * Skip the non-shared anonymous folio mapped solely by + * the single exiting process, and release it directly + * in the process exiting. + */ + if ((!atomic_read(&vma->vm_mm->mm_users) || + test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags)) && + folio_test_anon(folio) && folio_test_swapbacked(folio) && + !folio_likely_mapped_shared(folio)) { + pra->referenced = -1; + page_vma_mapped_walk_done(&pvmw); + return false; + } + if (pvmw.pte) { if (lru_gen_enabled() && pte_young(ptep_get(pvmw.pte))) { By the way, I am not convinced that using test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags) is correct (I think it is wrong). 
For example, global_init can directly have it: if (is_global_init(p)) { can_oom_reap = false; set_bit(MMF_OOM_SKIP, &mm->flags); pr_info("oom killer %d (%s) has mm pinned by %d (%s)\n", task_pid_nr(victim), victim->comm, task_pid_nr(p), p->comm); continue; } And exit_mmap() automatically has MMF_OOM_SKIP. What is the purpose of this check? Is there a better way to determine if a process is an OOM target? What about check_stable_address_space() ? > > diff --git a/mm/vmscan.c b/mm/vmscan.c > index 0761f91b407f..bae7a8bf6b3d > --- a/mm/vmscan.c > +++ b/mm/vmscan.c > @@ -863,7 +863,12 @@ static enum folio_references folio_check_references(struct folio *folio, > if (vm_flags & VM_LOCKED) > return FOLIOREF_ACTIVATE; > > - /* rmap lock contention: rotate */ > + /* > + * There are two cases to consider. > + * 1) Rmap lock contention: rotate. > + * 2) Skip the non-shared anonymous folio mapped solely by > + * the single exiting process. > + */ > if (referenced_ptes == -1) > return FOLIOREF_KEEP; > > -- > 2.39.0 > Thanks Barry ^ permalink raw reply [flat|nested] 12+ messages in thread
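For reference, check_stable_address_space() is a small helper in include/linux/oom.h that reports whether the OOM reaper may already be tearing the address space down. Roughly (paraphrased, not verbatim):

	static inline vm_fault_t check_stable_address_space(struct mm_struct *mm)
	{
		/*
		 * MMF_UNSTABLE is set by __oom_reap_task_mm() once the OOM
		 * reaper starts unmapping this mm, so unshared anonymous
		 * content may already be gone.
		 */
		if (unlikely(test_bit(MMF_UNSTABLE, &mm->flags)))
			return VM_FAULT_SIGBUS;
		return 0;
	}

Unlike MMF_OOM_SKIP, MMF_UNSTABLE is only set once reaping actually starts, which avoids the is_global_init() false positive pointed out above.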
* Re: [PATCH v5] mm: shrink skip folio mapped by an exiting process 2024-07-08 11:02 ` Barry Song @ 2024-07-08 12:17 ` zhiguojiang 2024-07-08 12:25 ` zhiguojiang 0 siblings, 1 reply; 12+ messages in thread From: zhiguojiang @ 2024-07-08 12:17 UTC (permalink / raw) To: Barry Song; +Cc: Andrew Morton, linux-mm, linux-kernel, opensource.kernel 在 2024/7/8 19:02, Barry Song 写道: > On Mon, Jul 8, 2024 at 9:04 PM Zhiguo Jiang <justinjiang@vivo.com> wrote: >> The releasing process of the non-shared anonymous folio mapped solely by >> an exiting process may go through two flows: 1) the anonymous folio is >> firstly is swaped-out into swapspace and transformed into a swp_entry >> in shrink_folio_list; 2) then the swp_entry is released in the process >> exiting flow. This will increase the cpu load of releasing a non-shared >> anonymous folio mapped solely by an exiting process, because the folio >> go through swap-out and the releasing the swapspace and swp_entry. >> >> When system is low memory, it is more likely to occur, because more >> backend applidatuions will be killed. >> >> The modification is that shrink skips the non-shared anonymous folio >> solely mapped by an exting process and the folio is only released >> directly in the process exiting flow, which will save swap-out time >> and alleviate the load of the process exiting. >> >> Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com> >> --- >> >> Change log: >> v4->v5: >> 1.Modify to skip non-shared anonymous folio only. >> 2.Update comments for pra->referenced = -1. >> v3->v4: >> 1.Modify that the unshared folios mapped only in exiting task are skip. >> v2->v3: >> Nothing. >> v1->v2: >> 1.The VM_EXITING added in v1 patch is removed, because it will fail >> to compile in 32-bit system. >> >> mm/rmap.c | 13 +++++++++++++ >> mm/vmscan.c | 7 ++++++- >> 2 files changed, 19 insertions(+), 1 deletion(-) >> >> diff --git a/mm/rmap.c b/mm/rmap.c >> index 26806b49a86f..5b5281d71dbb >> --- a/mm/rmap.c >> +++ b/mm/rmap.c >> @@ -843,6 +843,19 @@ static bool folio_referenced_one(struct folio *folio, >> int referenced = 0; >> unsigned long start = address, ptes = 0; >> >> + /* >> + * Skip the non-shared anonymous folio mapped solely by >> + * the single exiting process, and release it directly >> + * in the process exiting. >> + */ >> + if ((!atomic_read(&vma->vm_mm->mm_users) || >> + test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags)) && >> + folio_test_anon(folio) && folio_test_swapbacked(folio) && >> + !folio_likely_mapped_shared(folio)) { >> + pra->referenced = -1; >> + return false; >> + } >> + >> while (page_vma_mapped_walk(&pvmw)) { >> address = pvmw.address; Sure, I agree with your modification suggestions. This way, using PTL indeed sure that the folio is mapped by this process. Thanks > As David suggested, what about the below? > > @@ -883,6 +870,21 @@ static bool folio_referenced_one(struct folio *folio, > continue; > } > > + /* > + * Skip the non-shared anonymous folio mapped solely by > + * the single exiting process, and release it directly > + * in the process exiting. 
> + */ > + if ((!atomic_read(&vma->vm_mm->mm_users) || > + test_bit(MMF_OOM_SKIP, > &vma->vm_mm->flags)) && > + folio_test_anon(folio) && > folio_test_swapbacked(folio) && > + !folio_likely_mapped_shared(folio)) { > + pra->referenced = -1; > + page_vma_mapped_walk_done(&pvmw); > + return false; > + } > + > if (pvmw.pte) { > if (lru_gen_enabled() && > pte_young(ptep_get(pvmw.pte))) { > > > By the way, I am not convinced that using test_bit(MMF_OOM_SKIP, > &vma->vm_mm->flags) is > correct (I think it is wrong). For example, global_init can directly have it: > if (is_global_init(p)) { > can_oom_reap = false; > set_bit(MMF_OOM_SKIP, &mm->flags); > pr_info("oom killer %d (%s) has mm pinned by %d (%s)\n", > task_pid_nr(victim), victim->comm, > task_pid_nr(p), p->comm); > continue; > } > > And exit_mmap() automatically has MMF_OOM_SKIP. > > What is the purpose of this check? Is there a better way to determine > if a process is an > OOM target? What about check_stable_address_space() ? 1.Sorry, I overlook the situation with if (is_global_init(p)), MMF_OOM_SKIP is indeed not suitable. 2.check_stable_address_space() can indicate oom_reaper, but it seems unable to identify the situation where the process exits normally. What about task_is_dying()? static inline bool task_is_dying(void) { return tsk_is_oom_victim(current) || fatal_signal_pending(current) || (current->flags & PF_EXITING); } Thanks > > >> diff --git a/mm/vmscan.c b/mm/vmscan.c >> index 0761f91b407f..bae7a8bf6b3d >> --- a/mm/vmscan.c >> +++ b/mm/vmscan.c >> @@ -863,7 +863,12 @@ static enum folio_references folio_check_references(struct folio *folio, >> if (vm_flags & VM_LOCKED) >> return FOLIOREF_ACTIVATE; >> >> - /* rmap lock contention: rotate */ >> + /* >> + * There are two cases to consider. >> + * 1) Rmap lock contention: rotate. >> + * 2) Skip the non-shared anonymous folio mapped solely by >> + * the single exiting process. >> + */ >> if (referenced_ptes == -1) >> return FOLIOREF_KEEP; >> >> -- >> 2.39.0 >> > Thanks > Barry ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v5] mm: shrink skip folio mapped by an exiting process 2024-07-08 12:17 ` zhiguojiang @ 2024-07-08 12:25 ` zhiguojiang 2024-07-08 12:41 ` Barry Song 0 siblings, 1 reply; 12+ messages in thread From: zhiguojiang @ 2024-07-08 12:25 UTC (permalink / raw) To: Barry Song; +Cc: Andrew Morton, linux-mm, linux-kernel, opensource.kernel 在 2024/7/8 20:17, zhiguojiang 写道: > > > 在 2024/7/8 19:02, Barry Song 写道: >> On Mon, Jul 8, 2024 at 9:04 PM Zhiguo Jiang <justinjiang@vivo.com> >> wrote: >>> The releasing process of the non-shared anonymous folio mapped >>> solely by >>> an exiting process may go through two flows: 1) the anonymous folio is >>> firstly is swaped-out into swapspace and transformed into a swp_entry >>> in shrink_folio_list; 2) then the swp_entry is released in the process >>> exiting flow. This will increase the cpu load of releasing a non-shared >>> anonymous folio mapped solely by an exiting process, because the folio >>> go through swap-out and the releasing the swapspace and swp_entry. >>> >>> When system is low memory, it is more likely to occur, because more >>> backend applidatuions will be killed. >>> >>> The modification is that shrink skips the non-shared anonymous folio >>> solely mapped by an exting process and the folio is only released >>> directly in the process exiting flow, which will save swap-out time >>> and alleviate the load of the process exiting. >>> >>> Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com> >>> --- >>> >>> Change log: >>> v4->v5: >>> 1.Modify to skip non-shared anonymous folio only. >>> 2.Update comments for pra->referenced = -1. >>> v3->v4: >>> 1.Modify that the unshared folios mapped only in exiting task are skip. >>> v2->v3: >>> Nothing. >>> v1->v2: >>> 1.The VM_EXITING added in v1 patch is removed, because it will fail >>> to compile in 32-bit system. >>> >>> mm/rmap.c | 13 +++++++++++++ >>> mm/vmscan.c | 7 ++++++- >>> 2 files changed, 19 insertions(+), 1 deletion(-) >>> >>> diff --git a/mm/rmap.c b/mm/rmap.c >>> index 26806b49a86f..5b5281d71dbb >>> --- a/mm/rmap.c >>> +++ b/mm/rmap.c >>> @@ -843,6 +843,19 @@ static bool folio_referenced_one(struct folio >>> *folio, >>> int referenced = 0; >>> unsigned long start = address, ptes = 0; >>> >>> + /* >>> + * Skip the non-shared anonymous folio mapped solely by >>> + * the single exiting process, and release it directly >>> + * in the process exiting. >>> + */ >>> + if ((!atomic_read(&vma->vm_mm->mm_users) || >>> + test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags)) && >>> + folio_test_anon(folio) && >>> folio_test_swapbacked(folio) && >>> + !folio_likely_mapped_shared(folio)) { >>> + pra->referenced = -1; >>> + return false; >>> + } >>> + >>> while (page_vma_mapped_walk(&pvmw)) { >>> address = pvmw.address; > Sure, I agree with your modification suggestions. This way, using PTL > indeed sure > that the folio is mapped by this process. > Thanks >> As David suggested, what about the below? >> >> @@ -883,6 +870,21 @@ static bool folio_referenced_one(struct folio >> *folio, >> continue; >> } >> >> + /* >> + * Skip the non-shared anonymous folio mapped solely by >> + * the single exiting process, and release it directly >> + * in the process exiting. 
>> + */ >> + if ((!atomic_read(&vma->vm_mm->mm_users) || >> + test_bit(MMF_OOM_SKIP, >> &vma->vm_mm->flags)) && >> + folio_test_anon(folio) && >> folio_test_swapbacked(folio) && >> + !folio_likely_mapped_shared(folio)) { >> + pra->referenced = -1; >> + page_vma_mapped_walk_done(&pvmw); >> + return false; >> + } >> + >> if (pvmw.pte) { >> if (lru_gen_enabled() && >> pte_young(ptep_get(pvmw.pte))) { >> >> >> By the way, I am not convinced that using test_bit(MMF_OOM_SKIP, >> &vma->vm_mm->flags) is >> correct (I think it is wrong). For example, global_init can >> directly have it: >> if (is_global_init(p)) { >> can_oom_reap = false; >> set_bit(MMF_OOM_SKIP, &mm->flags); >> pr_info("oom killer %d (%s) has mm pinned by >> %d (%s)\n", >> task_pid_nr(victim), >> victim->comm, >> task_pid_nr(p), p->comm); >> continue; >> } >> >> And exit_mmap() automatically has MMF_OOM_SKIP. >> >> What is the purpose of this check? Is there a better way to determine >> if a process is an >> OOM target? What about check_stable_address_space() ? > 1.Sorry, I overlook the situation with if (is_global_init(p)), > MMF_OOM_SKIP is indeed not suitable. > > 2.check_stable_address_space() can indicate oom_reaper, but it seems > unable to identify the situation where the process exits normally. > What about task_is_dying()? static inline bool task_is_dying(void) { > return tsk_is_oom_victim(current) || fatal_signal_pending(current) || > (current->flags & PF_EXITING); } Thanks We can migrate task_is_dying() from mm/memcontrol.c to include/linux/oom.h > static inline bool task_is_dying(void) > { > return tsk_is_oom_victim(current) || fatal_signal_pending(current) || > (current->flags & PF_EXITING); > } >> >> >>> diff --git a/mm/vmscan.c b/mm/vmscan.c >>> index 0761f91b407f..bae7a8bf6b3d >>> --- a/mm/vmscan.c >>> +++ b/mm/vmscan.c >>> @@ -863,7 +863,12 @@ static enum folio_references >>> folio_check_references(struct folio *folio, >>> if (vm_flags & VM_LOCKED) >>> return FOLIOREF_ACTIVATE; >>> >>> - /* rmap lock contention: rotate */ >>> + /* >>> + * There are two cases to consider. >>> + * 1) Rmap lock contention: rotate. >>> + * 2) Skip the non-shared anonymous folio mapped solely by >>> + * the single exiting process. >>> + */ >>> if (referenced_ptes == -1) >>> return FOLIOREF_KEEP; >>> >>> -- >>> 2.39.0 >>> >> Thanks >> Barry > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v5] mm: shrink skip folio mapped by an exiting process 2024-07-08 12:25 ` zhiguojiang @ 2024-07-08 12:41 ` Barry Song 2024-07-08 13:11 ` zhiguojiang 0 siblings, 1 reply; 12+ messages in thread From: Barry Song @ 2024-07-08 12:41 UTC (permalink / raw) To: zhiguojiang; +Cc: Andrew Morton, linux-mm, LKML, opensource.kernel [-- Attachment #1: Type: text/plain, Size: 6663 bytes --] zhiguojiang <justinjiang@vivo.com> 于 2024年7月9日周二 00:25写道: > > > 在 2024/7/8 20:17, zhiguojiang 写道: > > > > > > 在 2024/7/8 19:02, Barry Song 写道: > >> On Mon, Jul 8, 2024 at 9:04 PM Zhiguo Jiang <justinjiang@vivo.com> > >> wrote: > >>> The releasing process of the non-shared anonymous folio mapped > >>> solely by > >>> an exiting process may go through two flows: 1) the anonymous folio is > >>> firstly is swaped-out into swapspace and transformed into a swp_entry > >>> in shrink_folio_list; 2) then the swp_entry is released in the process > >>> exiting flow. This will increase the cpu load of releasing a non-shared > >>> anonymous folio mapped solely by an exiting process, because the folio > >>> go through swap-out and the releasing the swapspace and swp_entry. > >>> > >>> When system is low memory, it is more likely to occur, because more > >>> backend applidatuions will be killed. > >>> > >>> The modification is that shrink skips the non-shared anonymous folio > >>> solely mapped by an exting process and the folio is only released > >>> directly in the process exiting flow, which will save swap-out time > >>> and alleviate the load of the process exiting. > >>> > >>> Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com> > >>> --- > >>> > >>> Change log: > >>> v4->v5: > >>> 1.Modify to skip non-shared anonymous folio only. > >>> 2.Update comments for pra->referenced = -1. > >>> v3->v4: > >>> 1.Modify that the unshared folios mapped only in exiting task are skip. > >>> v2->v3: > >>> Nothing. > >>> v1->v2: > >>> 1.The VM_EXITING added in v1 patch is removed, because it will fail > >>> to compile in 32-bit system. > >>> > >>> mm/rmap.c | 13 +++++++++++++ > >>> mm/vmscan.c | 7 ++++++- > >>> 2 files changed, 19 insertions(+), 1 deletion(-) > >>> > >>> diff --git a/mm/rmap.c b/mm/rmap.c > >>> index 26806b49a86f..5b5281d71dbb > >>> --- a/mm/rmap.c > >>> +++ b/mm/rmap.c > >>> @@ -843,6 +843,19 @@ static bool folio_referenced_one(struct folio > >>> *folio, > >>> int referenced = 0; > >>> unsigned long start = address, ptes = 0; > >>> > >>> + /* > >>> + * Skip the non-shared anonymous folio mapped solely by > >>> + * the single exiting process, and release it directly > >>> + * in the process exiting. > >>> + */ > >>> + if ((!atomic_read(&vma->vm_mm->mm_users) || > >>> + test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags)) && > >>> + folio_test_anon(folio) && > >>> folio_test_swapbacked(folio) && > >>> + !folio_likely_mapped_shared(folio)) { > >>> + pra->referenced = -1; > >>> + return false; > >>> + } > >>> + > >>> while (page_vma_mapped_walk(&pvmw)) { > >>> address = pvmw.address; > > Sure, I agree with your modification suggestions. This way, using PTL > > indeed sure > > that the folio is mapped by this process. > > Thanks > >> As David suggested, what about the below? > >> > >> @@ -883,6 +870,21 @@ static bool folio_referenced_one(struct folio > >> *folio, > >> continue; > >> } > >> > >> + /* > >> + * Skip the non-shared anonymous folio mapped solely by > >> + * the single exiting process, and release it directly > >> + * in the process exiting. 
> >> + */ > >> + if ((!atomic_read(&vma->vm_mm->mm_users) || > >> + test_bit(MMF_OOM_SKIP, > >> &vma->vm_mm->flags)) && > >> + folio_test_anon(folio) && > >> folio_test_swapbacked(folio) && > >> + !folio_likely_mapped_shared(folio)) { > >> + pra->referenced = -1; > >> + page_vma_mapped_walk_done(&pvmw); > >> + return false; > >> + } > >> + > >> if (pvmw.pte) { > >> if (lru_gen_enabled() && > >> pte_young(ptep_get(pvmw.pte))) { > >> > >> > >> By the way, I am not convinced that using test_bit(MMF_OOM_SKIP, > >> &vma->vm_mm->flags) is > >> correct (I think it is wrong). For example, global_init can > >> directly have it: > >> if (is_global_init(p)) { > >> can_oom_reap = false; > >> set_bit(MMF_OOM_SKIP, &mm->flags); > >> pr_info("oom killer %d (%s) has mm pinned by > >> %d (%s)\n", > >> task_pid_nr(victim), > >> victim->comm, > >> task_pid_nr(p), p->comm); > >> continue; > >> } > >> > >> And exit_mmap() automatically has MMF_OOM_SKIP. > >> > >> What is the purpose of this check? Is there a better way to determine > >> if a process is an > >> OOM target? What about check_stable_address_space() ? > > 1.Sorry, I overlook the situation with if (is_global_init(p)), > > MMF_OOM_SKIP is indeed not suitable. > > > > 2.check_stable_address_space() can indicate oom_reaper, but it seems > > unable to identify the situation where the process exits normally. > > What about task_is_dying()? static inline bool task_is_dying(void) { > > return tsk_is_oom_victim(current) || fatal_signal_pending(current) || > > (current->flags & PF_EXITING); } Thanks > We can migrate task_is_dying() from mm/memcontrol.c to include/linux/oom.h > > static inline bool task_is_dying(void) > > { > > return tsk_is_oom_victim(current) || fatal_signal_pending(current) || > > (current->flags & PF_EXITING); > > } > no. current is kswapd. > >> > >> > >>> diff --git a/mm/vmscan.c b/mm/vmscan.c > >>> index 0761f91b407f..bae7a8bf6b3d > >>> --- a/mm/vmscan.c > >>> +++ b/mm/vmscan.c > >>> @@ -863,7 +863,12 @@ static enum folio_references > >>> folio_check_references(struct folio *folio, > >>> if (vm_flags & VM_LOCKED) > >>> return FOLIOREF_ACTIVATE; > >>> > >>> - /* rmap lock contention: rotate */ > >>> + /* > >>> + * There are two cases to consider. > >>> + * 1) Rmap lock contention: rotate. > >>> + * 2) Skip the non-shared anonymous folio mapped solely by > >>> + * the single exiting process. > >>> + */ > >>> if (referenced_ptes == -1) > >>> return FOLIOREF_KEEP; > >>> > >>> -- > >>> 2.39.0 > >>> > >> Thanks > >> Barry > > > > [-- Attachment #2: Type: text/html, Size: 9814 bytes --] ^ permalink raw reply [flat|nested] 12+ messages in thread
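The short answer above is the key constraint: folio_referenced_one() runs in reclaim context, so current is kswapd or a direct reclaimer, not the task that owns the mapping; any "is the owner going away?" test has to be phrased against vma->vm_mm. Schematically (the local names here are illustrative only):

	struct mm_struct *mm = vma->vm_mm;

	/* exit_mmap() is underway or imminent: no user references remain. */
	bool mm_exiting = !atomic_read(&mm->mm_users);

	/* The OOM reaper has started unmapping this mm. */
	bool mm_reaped = test_bit(MMF_UNSTABLE, &mm->flags);

This is why the discussion below converges on vma->vm_mm based checks such as check_stable_address_space().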
* Re: [PATCH v5] mm: shrink skip folio mapped by an exiting process 2024-07-08 12:41 ` Barry Song @ 2024-07-08 13:11 ` zhiguojiang 2024-07-08 21:34 ` Barry Song 0 siblings, 1 reply; 12+ messages in thread From: zhiguojiang @ 2024-07-08 13:11 UTC (permalink / raw) To: Barry Song; +Cc: Andrew Morton, linux-mm, LKML, opensource.kernel 在 2024/7/8 20:41, Barry Song 写道: > > > zhiguojiang <justinjiang@vivo.com> 于 2024年7月9日周二 00:25写道: > > > > 在 2024/7/8 20:17, zhiguojiang 写道: > > > > > > 在 2024/7/8 19:02, Barry Song 写道: > >> On Mon, Jul 8, 2024 at 9:04 PM Zhiguo Jiang <justinjiang@vivo.com> > >> wrote: > >>> The releasing process of the non-shared anonymous folio mapped > >>> solely by > >>> an exiting process may go through two flows: 1) the anonymous > folio is > >>> firstly is swaped-out into swapspace and transformed into a > swp_entry > >>> in shrink_folio_list; 2) then the swp_entry is released in the > process > >>> exiting flow. This will increase the cpu load of releasing a > non-shared > >>> anonymous folio mapped solely by an exiting process, because > the folio > >>> go through swap-out and the releasing the swapspace and swp_entry. > >>> > >>> When system is low memory, it is more likely to occur, because > more > >>> backend applidatuions will be killed. > >>> > >>> The modification is that shrink skips the non-shared anonymous > folio > >>> solely mapped by an exting process and the folio is only released > >>> directly in the process exiting flow, which will save swap-out > time > >>> and alleviate the load of the process exiting. > >>> > >>> Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com> > >>> --- > >>> > >>> Change log: > >>> v4->v5: > >>> 1.Modify to skip non-shared anonymous folio only. > >>> 2.Update comments for pra->referenced = -1. > >>> v3->v4: > >>> 1.Modify that the unshared folios mapped only in exiting task > are skip. > >>> v2->v3: > >>> Nothing. > >>> v1->v2: > >>> 1.The VM_EXITING added in v1 patch is removed, because it will > fail > >>> to compile in 32-bit system. > >>> > >>> mm/rmap.c | 13 +++++++++++++ > >>> mm/vmscan.c | 7 ++++++- > >>> 2 files changed, 19 insertions(+), 1 deletion(-) > >>> > >>> diff --git a/mm/rmap.c b/mm/rmap.c > >>> index 26806b49a86f..5b5281d71dbb > >>> --- a/mm/rmap.c > >>> +++ b/mm/rmap.c > >>> @@ -843,6 +843,19 @@ static bool folio_referenced_one(struct > folio > >>> *folio, > >>> int referenced = 0; > >>> unsigned long start = address, ptes = 0; > >>> > >>> + /* > >>> + * Skip the non-shared anonymous folio mapped solely by > >>> + * the single exiting process, and release it directly > >>> + * in the process exiting. > >>> + */ > >>> + if ((!atomic_read(&vma->vm_mm->mm_users) || > >>> + test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags)) && > >>> + folio_test_anon(folio) && > >>> folio_test_swapbacked(folio) && > >>> + !folio_likely_mapped_shared(folio)) { > >>> + pra->referenced = -1; > >>> + return false; > >>> + } > >>> + > >>> while (page_vma_mapped_walk(&pvmw)) { > >>> address = pvmw.address; > > Sure, I agree with your modification suggestions. This way, > using PTL > > indeed sure > > that the folio is mapped by this process. > > Thanks > >> As David suggested, what about the below? > >> > >> @@ -883,6 +870,21 @@ static bool folio_referenced_one(struct folio > >> *folio, > >> continue; > >> } > >> > >> + /* > >> + * Skip the non-shared anonymous folio mapped > solely by > >> + * the single exiting process, and release it > directly > >> + * in the process exiting. 
> >> + */ > >> + if ((!atomic_read(&vma->vm_mm->mm_users) || > >> + test_bit(MMF_OOM_SKIP, > >> &vma->vm_mm->flags)) && > >> + folio_test_anon(folio) && > >> folio_test_swapbacked(folio) && > >> + !folio_likely_mapped_shared(folio)) { > >> + pra->referenced = -1; > >> + page_vma_mapped_walk_done(&pvmw); > >> + return false; > >> + } > >> + > >> if (pvmw.pte) { > >> if (lru_gen_enabled() && > >> pte_young(ptep_get(pvmw.pte))) { > >> > >> > >> By the way, I am not convinced that using test_bit(MMF_OOM_SKIP, > >> &vma->vm_mm->flags) is > >> correct (I think it is wrong). For example, global_init can > >> directly have it: > >> if (is_global_init(p)) { > >> can_oom_reap = false; > >> set_bit(MMF_OOM_SKIP, &mm->flags); > >> pr_info("oom killer %d (%s) has mm > pinned by > >> %d (%s)\n", > >> task_pid_nr(victim), > >> victim->comm, > >> task_pid_nr(p), p->comm); > >> continue; > >> } > >> > >> And exit_mmap() automatically has MMF_OOM_SKIP. > >> > >> What is the purpose of this check? Is there a better way to > determine > >> if a process is an > >> OOM target? What about check_stable_address_space() ? > > 1.Sorry, I overlook the situation with if (is_global_init(p)), > > MMF_OOM_SKIP is indeed not suitable. > > > > 2.check_stable_address_space() can indicate oom_reaper, but it > seems > > unable to identify the situation where the process exits normally. > > What about task_is_dying()? static inline bool > task_is_dying(void) { > > return tsk_is_oom_victim(current) || > fatal_signal_pending(current) || > > (current->flags & PF_EXITING); } Thanks > We can migrate task_is_dying() from mm/memcontrol.c to > include/linux/oom.h > > static inline bool task_is_dying(void) > > { > > return tsk_is_oom_victim(current) || > fatal_signal_pending(current) || > > (current->flags & PF_EXITING); > > } > > > no. current is kswapd. Hi Barry, It seems feasible for check_stable_address_space() replacing MMF_OOM_SKIP. check_stable_address_space() can indicate oom kill, and !atomic_read(&vma->vm_mm->mm_users) can indicate the normal process exiting. /* * Skip the non-shared anonymous folio mapped solely by * the single exiting process, and release it directly * in the process exiting. */ if ((!atomic_read(&vma->vm_mm->mm_users) || check_stable_address_space(vma->vm_mm)) && folio_test_anon(folio) && folio_test_swapbacked(folio) && !folio_likely_mapped_shared(folio)) { pra->referenced = -1; page_vma_mapped_walk_done(&pvmw); return false; } Thanks Zhiguo > > > >> > >> > >>> diff --git a/mm/vmscan.c b/mm/vmscan.c > >>> index 0761f91b407f..bae7a8bf6b3d > >>> --- a/mm/vmscan.c > >>> +++ b/mm/vmscan.c > >>> @@ -863,7 +863,12 @@ static enum folio_references > >>> folio_check_references(struct folio *folio, > >>> if (vm_flags & VM_LOCKED) > >>> return FOLIOREF_ACTIVATE; > >>> > >>> - /* rmap lock contention: rotate */ > >>> + /* > >>> + * There are two cases to consider. > >>> + * 1) Rmap lock contention: rotate. > >>> + * 2) Skip the non-shared anonymous folio mapped solely by > >>> + * the single exiting process. > >>> + */ > >>> if (referenced_ptes == -1) > >>> return FOLIOREF_KEEP; > >>> > >>> -- > >>> 2.39.0 > >>> > >> Thanks > >> Barry > > > ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v5] mm: shrink skip folio mapped by an exiting process 2024-07-08 13:11 ` zhiguojiang @ 2024-07-08 21:34 ` Barry Song 2024-07-09 4:23 ` zhiguojiang 0 siblings, 1 reply; 12+ messages in thread From: Barry Song @ 2024-07-08 21:34 UTC (permalink / raw) To: zhiguojiang Cc: Andrew Morton, linux-mm, LKML, opensource.kernel, David Hildenbrand, Matthew Wilcox On Tue, Jul 9, 2024 at 1:11 AM zhiguojiang <justinjiang@vivo.com> wrote: > > > > 在 2024/7/8 20:41, Barry Song 写道: > > > > > > zhiguojiang <justinjiang@vivo.com> 于 2024年7月9日周二 00:25写道: > > > > > > > > 在 2024/7/8 20:17, zhiguojiang 写道: > > > > > > > > > 在 2024/7/8 19:02, Barry Song 写道: > > >> On Mon, Jul 8, 2024 at 9:04 PM Zhiguo Jiang <justinjiang@vivo.com> > > >> wrote: > > >>> The releasing process of the non-shared anonymous folio mapped > > >>> solely by > > >>> an exiting process may go through two flows: 1) the anonymous > > folio is > > >>> firstly is swaped-out into swapspace and transformed into a > > swp_entry > > >>> in shrink_folio_list; 2) then the swp_entry is released in the > > process > > >>> exiting flow. This will increase the cpu load of releasing a > > non-shared > > >>> anonymous folio mapped solely by an exiting process, because > > the folio > > >>> go through swap-out and the releasing the swapspace and swp_entry. > > >>> > > >>> When system is low memory, it is more likely to occur, because > > more > > >>> backend applidatuions will be killed. > > >>> > > >>> The modification is that shrink skips the non-shared anonymous > > folio > > >>> solely mapped by an exting process and the folio is only released > > >>> directly in the process exiting flow, which will save swap-out > > time > > >>> and alleviate the load of the process exiting. > > >>> > > >>> Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com> > > >>> --- > > >>> > > >>> Change log: > > >>> v4->v5: > > >>> 1.Modify to skip non-shared anonymous folio only. > > >>> 2.Update comments for pra->referenced = -1. > > >>> v3->v4: > > >>> 1.Modify that the unshared folios mapped only in exiting task > > are skip. > > >>> v2->v3: > > >>> Nothing. > > >>> v1->v2: > > >>> 1.The VM_EXITING added in v1 patch is removed, because it will > > fail > > >>> to compile in 32-bit system. > > >>> > > >>> mm/rmap.c | 13 +++++++++++++ > > >>> mm/vmscan.c | 7 ++++++- > > >>> 2 files changed, 19 insertions(+), 1 deletion(-) > > >>> > > >>> diff --git a/mm/rmap.c b/mm/rmap.c > > >>> index 26806b49a86f..5b5281d71dbb > > >>> --- a/mm/rmap.c > > >>> +++ b/mm/rmap.c > > >>> @@ -843,6 +843,19 @@ static bool folio_referenced_one(struct > > folio > > >>> *folio, > > >>> int referenced = 0; > > >>> unsigned long start = address, ptes = 0; > > >>> > > >>> + /* > > >>> + * Skip the non-shared anonymous folio mapped solely by > > >>> + * the single exiting process, and release it directly > > >>> + * in the process exiting. > > >>> + */ > > >>> + if ((!atomic_read(&vma->vm_mm->mm_users) || > > >>> + test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags)) && > > >>> + folio_test_anon(folio) && > > >>> folio_test_swapbacked(folio) && > > >>> + !folio_likely_mapped_shared(folio)) { > > >>> + pra->referenced = -1; > > >>> + return false; > > >>> + } > > >>> + > > >>> while (page_vma_mapped_walk(&pvmw)) { > > >>> address = pvmw.address; > > > Sure, I agree with your modification suggestions. This way, > > using PTL > > > indeed sure > > > that the folio is mapped by this process. > > > Thanks > > >> As David suggested, what about the below? 
> > >> > > >> @@ -883,6 +870,21 @@ static bool folio_referenced_one(struct folio > > >> *folio, > > >> continue; > > >> } > > >> > > >> + /* > > >> + * Skip the non-shared anonymous folio mapped > > solely by > > >> + * the single exiting process, and release it > > directly > > >> + * in the process exiting. > > >> + */ > > >> + if ((!atomic_read(&vma->vm_mm->mm_users) || > > >> + test_bit(MMF_OOM_SKIP, > > >> &vma->vm_mm->flags)) && > > >> + folio_test_anon(folio) && > > >> folio_test_swapbacked(folio) && > > >> + !folio_likely_mapped_shared(folio)) { > > >> + pra->referenced = -1; > > >> + page_vma_mapped_walk_done(&pvmw); > > >> + return false; > > >> + } > > >> + > > >> if (pvmw.pte) { > > >> if (lru_gen_enabled() && > > >> pte_young(ptep_get(pvmw.pte))) { > > >> > > >> > > >> By the way, I am not convinced that using test_bit(MMF_OOM_SKIP, > > >> &vma->vm_mm->flags) is > > >> correct (I think it is wrong). For example, global_init can > > >> directly have it: > > >> if (is_global_init(p)) { > > >> can_oom_reap = false; > > >> set_bit(MMF_OOM_SKIP, &mm->flags); > > >> pr_info("oom killer %d (%s) has mm > > pinned by > > >> %d (%s)\n", > > >> task_pid_nr(victim), > > >> victim->comm, > > >> task_pid_nr(p), p->comm); > > >> continue; > > >> } > > >> > > >> And exit_mmap() automatically has MMF_OOM_SKIP. > > >> > > >> What is the purpose of this check? Is there a better way to > > determine > > >> if a process is an > > >> OOM target? What about check_stable_address_space() ? > > > 1.Sorry, I overlook the situation with if (is_global_init(p)), > > > MMF_OOM_SKIP is indeed not suitable. > > > > > > 2.check_stable_address_space() can indicate oom_reaper, but it > > seems > > > unable to identify the situation where the process exits normally. > > > What about task_is_dying()? static inline bool > > task_is_dying(void) { > > > return tsk_is_oom_victim(current) || > > fatal_signal_pending(current) || > > > (current->flags & PF_EXITING); } Thanks > > We can migrate task_is_dying() from mm/memcontrol.c to > > include/linux/oom.h > > > static inline bool task_is_dying(void) > > > { > > > return tsk_is_oom_victim(current) || > > fatal_signal_pending(current) || > > > (current->flags & PF_EXITING); > > > } > > > > > > no. current is kswapd. > Hi Barry, > > It seems feasible for check_stable_address_space() replacing MMF_OOM_SKIP. > check_stable_address_space() can indicate oom kill, and > !atomic_read(&vma->vm_mm->mm_users) > can indicate the normal process exiting. > > /* > * Skip the non-shared anonymous folio mapped solely by > * the single exiting process, and release it directly > * in the process exiting. > */ > if ((!atomic_read(&vma->vm_mm->mm_users) || > check_stable_address_space(vma->vm_mm)) && > folio_test_anon(folio) && folio_test_swapbacked(folio) && > !folio_likely_mapped_shared(folio)) { > pra->referenced = -1; > page_vma_mapped_walk_done(&pvmw); > return false; > } > Yes, + David, Willy (when you send a new version, please CC people who have participated and describe how you have addressed comments from those people.) I also think we actually can remove "folio_test_anon(folio)". So It could be, @@ -883,6 +871,21 @@ static bool folio_referenced_one(struct folio *folio, continue; } + /* + * Skip the non-shared swapbacked folio mapped solely by + * the exiting or OOM-reaped process. This avoids redundant + * swap-out followed by an immediate unmap. 
+ */ + if ((!atomic_read(&vma->vm_mm->mm_users) || + check_stable_address_space(vma->vm_mm)) && + folio_test_swapbacked(folio) && + !folio_likely_mapped_shared(folio)) { + pra->referenced = -1; + page_vma_mapped_walk_done(&pvmw); + return false; + } + if (pvmw.pte) { if (lru_gen_enabled() && pte_young(ptep_get(pvmw.pte))) { > Thanks > Zhiguo > > > > > > >> > > >> > > >>> diff --git a/mm/vmscan.c b/mm/vmscan.c > > >>> index 0761f91b407f..bae7a8bf6b3d > > >>> --- a/mm/vmscan.c > > >>> +++ b/mm/vmscan.c > > >>> @@ -863,7 +863,12 @@ static enum folio_references > > >>> folio_check_references(struct folio *folio, > > >>> if (vm_flags & VM_LOCKED) > > >>> return FOLIOREF_ACTIVATE; > > >>> > > >>> - /* rmap lock contention: rotate */ > > >>> + /* > > >>> + * There are two cases to consider. > > >>> + * 1) Rmap lock contention: rotate. > > >>> + * 2) Skip the non-shared anonymous folio mapped solely by > > >>> + * the single exiting process. > > >>> + */ > > >>> if (referenced_ptes == -1) > > >>> return FOLIOREF_KEEP; > > >>> > > >>> -- > > >>> 2.39.0 > > >>> > > >> Thanks > > >> Barry > > > > > > Thanks Barry ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v5] mm: shrink skip folio mapped by an exiting process 2024-07-08 21:34 ` Barry Song @ 2024-07-09 4:23 ` zhiguojiang 0 siblings, 0 replies; 12+ messages in thread From: zhiguojiang @ 2024-07-09 4:23 UTC (permalink / raw) To: Barry Song Cc: Andrew Morton, linux-mm, LKML, opensource.kernel, David Hildenbrand, Matthew Wilcox 在 2024/7/9 5:34, Barry Song 写道: > On Tue, Jul 9, 2024 at 1:11 AM zhiguojiang <justinjiang@vivo.com> wrote: >> >> >> 在 2024/7/8 20:41, Barry Song 写道: >>> >>> zhiguojiang <justinjiang@vivo.com> 于 2024年7月9日周二 00:25写道: >>> >>> >>> >>> 在 2024/7/8 20:17, zhiguojiang 写道: >>> > >>> > >>> > 在 2024/7/8 19:02, Barry Song 写道: >>> >> On Mon, Jul 8, 2024 at 9:04 PM Zhiguo Jiang <justinjiang@vivo.com> >>> >> wrote: >>> >>> The releasing process of the non-shared anonymous folio mapped >>> >>> solely by >>> >>> an exiting process may go through two flows: 1) the anonymous >>> folio is >>> >>> firstly is swaped-out into swapspace and transformed into a >>> swp_entry >>> >>> in shrink_folio_list; 2) then the swp_entry is released in the >>> process >>> >>> exiting flow. This will increase the cpu load of releasing a >>> non-shared >>> >>> anonymous folio mapped solely by an exiting process, because >>> the folio >>> >>> go through swap-out and the releasing the swapspace and swp_entry. >>> >>> >>> >>> When system is low memory, it is more likely to occur, because >>> more >>> >>> backend applidatuions will be killed. >>> >>> >>> >>> The modification is that shrink skips the non-shared anonymous >>> folio >>> >>> solely mapped by an exting process and the folio is only released >>> >>> directly in the process exiting flow, which will save swap-out >>> time >>> >>> and alleviate the load of the process exiting. >>> >>> >>> >>> Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com> >>> >>> --- >>> >>> >>> >>> Change log: >>> >>> v4->v5: >>> >>> 1.Modify to skip non-shared anonymous folio only. >>> >>> 2.Update comments for pra->referenced = -1. >>> >>> v3->v4: >>> >>> 1.Modify that the unshared folios mapped only in exiting task >>> are skip. >>> >>> v2->v3: >>> >>> Nothing. >>> >>> v1->v2: >>> >>> 1.The VM_EXITING added in v1 patch is removed, because it will >>> fail >>> >>> to compile in 32-bit system. >>> >>> >>> >>> mm/rmap.c | 13 +++++++++++++ >>> >>> mm/vmscan.c | 7 ++++++- >>> >>> 2 files changed, 19 insertions(+), 1 deletion(-) >>> >>> >>> >>> diff --git a/mm/rmap.c b/mm/rmap.c >>> >>> index 26806b49a86f..5b5281d71dbb >>> >>> --- a/mm/rmap.c >>> >>> +++ b/mm/rmap.c >>> >>> @@ -843,6 +843,19 @@ static bool folio_referenced_one(struct >>> folio >>> >>> *folio, >>> >>> int referenced = 0; >>> >>> unsigned long start = address, ptes = 0; >>> >>> >>> >>> + /* >>> >>> + * Skip the non-shared anonymous folio mapped solely by >>> >>> + * the single exiting process, and release it directly >>> >>> + * in the process exiting. >>> >>> + */ >>> >>> + if ((!atomic_read(&vma->vm_mm->mm_users) || >>> >>> + test_bit(MMF_OOM_SKIP, &vma->vm_mm->flags)) && >>> >>> + folio_test_anon(folio) && >>> >>> folio_test_swapbacked(folio) && >>> >>> + !folio_likely_mapped_shared(folio)) { >>> >>> + pra->referenced = -1; >>> >>> + return false; >>> >>> + } >>> >>> + >>> >>> while (page_vma_mapped_walk(&pvmw)) { >>> >>> address = pvmw.address; >>> > Sure, I agree with your modification suggestions. This way, >>> using PTL >>> > indeed sure >>> > that the folio is mapped by this process. >>> > Thanks >>> >> As David suggested, what about the below? 
>>> >> >>> >> @@ -883,6 +870,21 @@ static bool folio_referenced_one(struct folio >>> >> *folio, >>> >> continue; >>> >> } >>> >> >>> >> + /* >>> >> + * Skip the non-shared anonymous folio mapped >>> solely by >>> >> + * the single exiting process, and release it >>> directly >>> >> + * in the process exiting. >>> >> + */ >>> >> + if ((!atomic_read(&vma->vm_mm->mm_users) || >>> >> + test_bit(MMF_OOM_SKIP, >>> >> &vma->vm_mm->flags)) && >>> >> + folio_test_anon(folio) && >>> >> folio_test_swapbacked(folio) && >>> >> + !folio_likely_mapped_shared(folio)) { >>> >> + pra->referenced = -1; >>> >> + page_vma_mapped_walk_done(&pvmw); >>> >> + return false; >>> >> + } >>> >> + >>> >> if (pvmw.pte) { >>> >> if (lru_gen_enabled() && >>> >> pte_young(ptep_get(pvmw.pte))) { >>> >> >>> >> >>> >> By the way, I am not convinced that using test_bit(MMF_OOM_SKIP, >>> >> &vma->vm_mm->flags) is >>> >> correct (I think it is wrong). For example, global_init can >>> >> directly have it: >>> >> if (is_global_init(p)) { >>> >> can_oom_reap = false; >>> >> set_bit(MMF_OOM_SKIP, &mm->flags); >>> >> pr_info("oom killer %d (%s) has mm >>> pinned by >>> >> %d (%s)\n", >>> >> task_pid_nr(victim), >>> >> victim->comm, >>> >> task_pid_nr(p), p->comm); >>> >> continue; >>> >> } >>> >> >>> >> And exit_mmap() automatically has MMF_OOM_SKIP. >>> >> >>> >> What is the purpose of this check? Is there a better way to >>> determine >>> >> if a process is an >>> >> OOM target? What about check_stable_address_space() ? >>> > 1.Sorry, I overlook the situation with if (is_global_init(p)), >>> > MMF_OOM_SKIP is indeed not suitable. >>> > >>> > 2.check_stable_address_space() can indicate oom_reaper, but it >>> seems >>> > unable to identify the situation where the process exits normally. >>> > What about task_is_dying()? static inline bool >>> task_is_dying(void) { >>> > return tsk_is_oom_victim(current) || >>> fatal_signal_pending(current) || >>> > (current->flags & PF_EXITING); } Thanks >>> We can migrate task_is_dying() from mm/memcontrol.c to >>> include/linux/oom.h >>> > static inline bool task_is_dying(void) >>> > { >>> > return tsk_is_oom_victim(current) || >>> fatal_signal_pending(current) || >>> > (current->flags & PF_EXITING); >>> > } >>> >>> >>> no. current is kswapd. >> Hi Barry, >> >> It seems feasible for check_stable_address_space() replacing MMF_OOM_SKIP. >> check_stable_address_space() can indicate oom kill, and >> !atomic_read(&vma->vm_mm->mm_users) >> can indicate the normal process exiting. >> >> /* >> * Skip the non-shared anonymous folio mapped solely by >> * the single exiting process, and release it directly >> * in the process exiting. >> */ >> if ((!atomic_read(&vma->vm_mm->mm_users) || >> check_stable_address_space(vma->vm_mm)) && >> folio_test_anon(folio) && folio_test_swapbacked(folio) && >> !folio_likely_mapped_shared(folio)) { >> pra->referenced = -1; >> page_vma_mapped_walk_done(&pvmw); >> return false; >> } >> > Yes, + David, Willy (when you send a new version, please CC people who have > participated and describe how you have addressed comments from those > people.) > > I also think we actually can remove "folio_test_anon(folio)". > > So It could be, > > @@ -883,6 +871,21 @@ static bool folio_referenced_one(struct folio *folio, > continue; > } > > + /* > + * Skip the non-shared swapbacked folio mapped solely by > + * the exiting or OOM-reaped process. This avoids redundant > + * swap-out followed by an immediate unmap. 
> + */ > + if ((!atomic_read(&vma->vm_mm->mm_users) || > + check_stable_address_space(vma->vm_mm)) && > + folio_test_swapbacked(folio) && > + !folio_likely_mapped_shared(folio)) { > + pra->referenced = -1; > + page_vma_mapped_walk_done(&pvmw); > + return false; > + } > + > if (pvmw.pte) { > if (lru_gen_enabled() && > pte_young(ptep_get(pvmw.pte))) { Ok, update in patch v6: https://lore.kernel.org/linux-mm/20240709042122.631-1-justinjiang@vivo.com/ Thanks > >> Thanks >> Zhiguo >>> >>> >> >>> >> >>> >>> diff --git a/mm/vmscan.c b/mm/vmscan.c >>> >>> index 0761f91b407f..bae7a8bf6b3d >>> >>> --- a/mm/vmscan.c >>> >>> +++ b/mm/vmscan.c >>> >>> @@ -863,7 +863,12 @@ static enum folio_references >>> >>> folio_check_references(struct folio *folio, >>> >>> if (vm_flags & VM_LOCKED) >>> >>> return FOLIOREF_ACTIVATE; >>> >>> >>> >>> - /* rmap lock contention: rotate */ >>> >>> + /* >>> >>> + * There are two cases to consider. >>> >>> + * 1) Rmap lock contention: rotate. >>> >>> + * 2) Skip the non-shared anonymous folio mapped solely by >>> >>> + * the single exiting process. >>> >>> + */ >>> >>> if (referenced_ptes == -1) >>> >>> return FOLIOREF_KEEP; >>> >>> >>> >>> -- >>> >>> 2.39.0 >>> >>> >>> >> Thanks >>> >> Barry >>> > >>> > Thanks > Barry ^ permalink raw reply [flat|nested] 12+ messages in thread