linux-mm.kvack.org archive mirror
* [RFC PATCH] mm: don't promote exclusive file folios of dying processes
@ 2025-04-12  8:58 Barry Song
  2025-04-12 15:48 ` Matthew Wilcox
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Barry Song @ 2025-04-12  8:58 UTC (permalink / raw)
  To: akpm, linux-mm
  Cc: linux-kernel, Barry Song, Baolin Wang, David Hildenbrand,
	Johannes Weiner, Matthew Wilcox, Oscar Salvador, Ryan Roberts,
	Zi Yan

From: Barry Song <v-songbaohua@oppo.com>

Promoting exclusive file folios of a dying process is unnecessary and
harmful. For example, when Firefox is killed and LibreOffice is
launched, activating Firefox's young file-backed folios makes it
harder to reclaim memory that LibreOffice doesn't use at all.

An exiting process is unlikely to be restarted right away; it has
either been terminated by the user or killed by the OOM handler.

Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
---
 mm/huge_memory.c |  4 ++--
 mm/internal.h    | 19 +++++++++++++++++++
 mm/memory.c      |  9 ++++++++-
 3 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index e97a97586478..05b83d2fcbb6 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2264,8 +2264,8 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
 			 * Use flush_needed to indicate whether the PMD entry
 			 * is present, instead of checking pmd_present() again.
 			 */
-			if (flush_needed && pmd_young(orig_pmd) &&
-			    likely(vma_has_recency(vma)))
+			if (!exclusive_folio_of_dying_process(folio, vma) && flush_needed &&
+			    pmd_young(orig_pmd) && likely(vma_has_recency(vma)))
 				folio_mark_accessed(folio);
 		}
 
diff --git a/mm/internal.h b/mm/internal.h
index 4e0ea83aaf1c..666de96a293d 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -11,6 +11,7 @@
 #include <linux/khugepaged.h>
 #include <linux/mm.h>
 #include <linux/mm_inline.h>
+#include <linux/oom.h>
 #include <linux/pagemap.h>
 #include <linux/pagewalk.h>
 #include <linux/rmap.h>
@@ -130,6 +131,24 @@ static inline int folio_nr_pages_mapped(const struct folio *folio)
 	return atomic_read(&folio->_nr_pages_mapped) & FOLIO_PAGES_MAPPED;
 }
 
+/*
+ * Return true if a folio is exclusive and belongs to an exiting or
+ * oom-reaped process; otherwise, return false.
+ */
+static inline bool exclusive_folio_of_dying_process(struct folio *folio,
+		struct vm_area_struct *vma)
+{
+	if (folio_maybe_mapped_shared(folio))
+		return false;
+
+	if (!atomic_read(&vma->vm_mm->mm_users))
+		return true;
+	if (check_stable_address_space(vma->vm_mm))
+		return true;
+
+	return false;
+}
+
 /*
  * Retrieve the first entry of a folio based on a provided entry within the
  * folio. We cannot rely on folio->swap as there is no guarantee that it has
diff --git a/mm/memory.c b/mm/memory.c
index b9e8443aaa86..cab69275e473 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1515,7 +1515,14 @@ static __always_inline void zap_present_folio_ptes(struct mmu_gather *tlb,
 				*force_flush = true;
 			}
 		}
-		if (pte_young(ptent) && likely(vma_has_recency(vma)))
+
+		/*
+		 * Skip marking exclusive file folios as accessed for processes that are
+		 * exiting or have been reaped due to OOM. This prevents unnecessary
+		 * promotion of folios that won't benefit the new process being launched.
+		 */
+		if (!exclusive_folio_of_dying_process(folio, vma) && pte_young(ptent) &&
+				likely(vma_has_recency(vma)))
 			folio_mark_accessed(folio);
 		rss[mm_counter(folio)] -= nr;
 	} else {
-- 
2.39.3 (Apple Git-146)




* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
  2025-04-12  8:58 [RFC PATCH] mm: don't promote exclusive file folios of dying processes Barry Song
@ 2025-04-12 15:48 ` Matthew Wilcox
  2025-04-12 16:31 ` Zi Yan
  2025-04-16  8:32 ` David Hildenbrand
  2 siblings, 0 replies; 19+ messages in thread
From: Matthew Wilcox @ 2025-04-12 15:48 UTC (permalink / raw)
  To: Barry Song
  Cc: akpm, linux-mm, linux-kernel, Barry Song, Baolin Wang,
	David Hildenbrand, Johannes Weiner, Oscar Salvador, Ryan Roberts,
	Zi Yan

On Sat, Apr 12, 2025 at 08:58:52PM +1200, Barry Song wrote:
> +		/*
> +		 * Skip marking exclusive file folios as accessed for processes that are
> +		 * exiting or have been reaped due to OOM. This prevents unnecessary
> +		 * promotion of folios that won't benefit the new process being launched.
> +		 */

Please wrap at 80 columns.

One easy way to achieve this is to pipe it through 'fmt -p \*'



* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
  2025-04-12  8:58 [RFC PATCH] mm: don't promote exclusive file folios of dying processes Barry Song
  2025-04-12 15:48 ` Matthew Wilcox
@ 2025-04-12 16:31 ` Zi Yan
  2025-04-16  7:48   ` Barry Song
  2025-04-16  8:32 ` David Hildenbrand
  2 siblings, 1 reply; 19+ messages in thread
From: Zi Yan @ 2025-04-12 16:31 UTC (permalink / raw)
  To: Barry Song
  Cc: akpm, linux-mm, linux-kernel, Barry Song, Baolin Wang,
	David Hildenbrand, Johannes Weiner, Matthew Wilcox,
	Oscar Salvador, Ryan Roberts

On 12 Apr 2025, at 4:58, Barry Song wrote:

> From: Barry Song <v-songbaohua@oppo.com>
>
> Promoting exclusive file folios of a dying process is unnecessary and
> harmful. For example, while Firefox is killed and LibreOffice is
> launched, activating Firefox's young file-backed folios makes it
> harder to reclaim memory that LibreOffice doesn't use at all.
>
> An exiting process is unlikely to be restarted right away—it's
> either terminated by the user or killed by the OOM handler.

The proposal looks reasonable to me. Do you have any performance number
about the improvement?

>
> Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
> Cc: David Hildenbrand <david@redhat.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Ryan Roberts <ryan.roberts@arm.com>
> Cc: Zi Yan <ziy@nvidia.com>
> Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> ---
>  mm/huge_memory.c |  4 ++--
>  mm/internal.h    | 19 +++++++++++++++++++
>  mm/memory.c      |  9 ++++++++-
>  3 files changed, 29 insertions(+), 3 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index e97a97586478..05b83d2fcbb6 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -2264,8 +2264,8 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>  			 * Use flush_needed to indicate whether the PMD entry
>  			 * is present, instead of checking pmd_present() again.
>  			 */
> -			if (flush_needed && pmd_young(orig_pmd) &&
> -			    likely(vma_has_recency(vma)))
> +			if (!exclusive_folio_of_dying_process(folio, vma) && flush_needed &&
> +			    pmd_young(orig_pmd) && likely(vma_has_recency(vma)))
>  				folio_mark_accessed(folio);
>  		}
>
> diff --git a/mm/internal.h b/mm/internal.h
> index 4e0ea83aaf1c..666de96a293d 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -11,6 +11,7 @@
>  #include <linux/khugepaged.h>
>  #include <linux/mm.h>
>  #include <linux/mm_inline.h>
> +#include <linux/oom.h>
>  #include <linux/pagemap.h>
>  #include <linux/pagewalk.h>
>  #include <linux/rmap.h>
> @@ -130,6 +131,24 @@ static inline int folio_nr_pages_mapped(const struct folio *folio)
>  	return atomic_read(&folio->_nr_pages_mapped) & FOLIO_PAGES_MAPPED;
>  }
>
> +/*
> + * Return true if a folio is exclusive and belongs to an exiting or
> + * oom-reaped process; otherwise, return false.
> + */
> +static inline bool exclusive_folio_of_dying_process(struct folio *folio,
> +		struct vm_area_struct *vma)
> +{
> +	if (folio_maybe_mapped_shared(folio))
> +		return false;
> +
> +	if (!atomic_read(&vma->vm_mm->mm_users))
> +		return true;
> +	if (check_stable_address_space(vma->vm_mm))
> +		return true;
> +
> +	return false;
> +}
> +
>  /*
>   * Retrieve the first entry of a folio based on a provided entry within the
>   * folio. We cannot rely on folio->swap as there is no guarantee that it has
> diff --git a/mm/memory.c b/mm/memory.c
> index b9e8443aaa86..cab69275e473 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -1515,7 +1515,14 @@ static __always_inline void zap_present_folio_ptes(struct mmu_gather *tlb,
>  				*force_flush = true;
>  			}
>  		}
> -		if (pte_young(ptent) && likely(vma_has_recency(vma)))
> +
> +		/*
> +		 * Skip marking exclusive file folios as accessed for processes that are
> +		 * exiting or have been reaped due to OOM. This prevents unnecessary
> +		 * promotion of folios that won't benefit the new process being launched.
> +		 */
> +		if (!exclusive_folio_of_dying_process(folio, vma) && pte_young(ptent) &&
> +				likely(vma_has_recency(vma)))
>  			folio_mark_accessed(folio);
>  		rss[mm_counter(folio)] -= nr;
>  	} else {
> -- 
> 2.39.3 (Apple Git-146)


--
Best Regards,
Yan, Zi



* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
  2025-04-12 16:31 ` Zi Yan
@ 2025-04-16  7:48   ` Barry Song
  2025-04-16  8:24     ` Baolin Wang
  0 siblings, 1 reply; 19+ messages in thread
From: Barry Song @ 2025-04-16  7:48 UTC (permalink / raw)
  To: Zi Yan, Tangquan Zheng
  Cc: akpm, linux-mm, linux-kernel, Barry Song, Baolin Wang,
	David Hildenbrand, Johannes Weiner, Matthew Wilcox,
	Oscar Salvador, Ryan Roberts

On Sun, Apr 13, 2025 at 12:31 AM Zi Yan <ziy@nvidia.com> wrote:
>
> On 12 Apr 2025, at 4:58, Barry Song wrote:
>
> > From: Barry Song <v-songbaohua@oppo.com>
> >
> > Promoting exclusive file folios of a dying process is unnecessary and
> > harmful. For example, while Firefox is killed and LibreOffice is
> > launched, activating Firefox's young file-backed folios makes it
> > harder to reclaim memory that LibreOffice doesn't use at all.
> >
> > An exiting process is unlikely to be restarted right away—it's
> > either terminated by the user or killed by the OOM handler.
>
> The proposal looks reasonable to me. Do you have any performance number
> about the improvement?

Tangquan ran the test on Android phones and saw about a 3% improvement in
the refault/thrashing counters:

                           w/o patch  w/ patch   improvement
workingset_refault_anon      2215933   2146602         3.13%
workingset_refault_file      9859208   9646518         2.16%
pswpin                       2411086   2337790         3.04%
pswpout                      6482838   6264865         3.36%

Going further and demoting exclusive file folios could improve this more,
but might be controversial. It could be a separate patch later.

>
> >
> > Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
> > Cc: David Hildenbrand <david@redhat.com>
> > Cc: Johannes Weiner <hannes@cmpxchg.org>
> > Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
> > Cc: Oscar Salvador <osalvador@suse.de>
> > Cc: Ryan Roberts <ryan.roberts@arm.com>
> > Cc: Zi Yan <ziy@nvidia.com>
> > Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> > ---
> >  mm/huge_memory.c |  4 ++--
> >  mm/internal.h    | 19 +++++++++++++++++++
> >  mm/memory.c      |  9 ++++++++-
> >  3 files changed, 29 insertions(+), 3 deletions(-)
> >
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index e97a97586478..05b83d2fcbb6 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -2264,8 +2264,8 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
> >                        * Use flush_needed to indicate whether the PMD entry
> >                        * is present, instead of checking pmd_present() again.
> >                        */
> > -                     if (flush_needed && pmd_young(orig_pmd) &&
> > -                         likely(vma_has_recency(vma)))
> > +                     if (!exclusive_folio_of_dying_process(folio, vma) && flush_needed &&
> > +                         pmd_young(orig_pmd) && likely(vma_has_recency(vma)))
> >                               folio_mark_accessed(folio);
> >               }
> >
> > diff --git a/mm/internal.h b/mm/internal.h
> > index 4e0ea83aaf1c..666de96a293d 100644
> > --- a/mm/internal.h
> > +++ b/mm/internal.h
> > @@ -11,6 +11,7 @@
> >  #include <linux/khugepaged.h>
> >  #include <linux/mm.h>
> >  #include <linux/mm_inline.h>
> > +#include <linux/oom.h>
> >  #include <linux/pagemap.h>
> >  #include <linux/pagewalk.h>
> >  #include <linux/rmap.h>
> > @@ -130,6 +131,24 @@ static inline int folio_nr_pages_mapped(const struct folio *folio)
> >       return atomic_read(&folio->_nr_pages_mapped) & FOLIO_PAGES_MAPPED;
> >  }
> >
> > +/*
> > + * Return true if a folio is exclusive and belongs to an exiting or
> > + * oom-reaped process; otherwise, return false.
> > + */
> > +static inline bool exclusive_folio_of_dying_process(struct folio *folio,
> > +             struct vm_area_struct *vma)
> > +{
> > +     if (folio_maybe_mapped_shared(folio))
> > +             return false;
> > +
> > +     if (!atomic_read(&vma->vm_mm->mm_users))
> > +             return true;
> > +     if (check_stable_address_space(vma->vm_mm))
> > +             return true;
> > +
> > +     return false;
> > +}
> > +
> >  /*
> >   * Retrieve the first entry of a folio based on a provided entry within the
> >   * folio. We cannot rely on folio->swap as there is no guarantee that it has
> > diff --git a/mm/memory.c b/mm/memory.c
> > index b9e8443aaa86..cab69275e473 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -1515,7 +1515,14 @@ static __always_inline void zap_present_folio_ptes(struct mmu_gather *tlb,
> >                               *force_flush = true;
> >                       }
> >               }
> > -             if (pte_young(ptent) && likely(vma_has_recency(vma)))
> > +
> > +             /*
> > +              * Skip marking exclusive file folios as accessed for processes that are
> > +              * exiting or have been reaped due to OOM. This prevents unnecessary
> > +              * promotion of folios that won't benefit the new process being launched.
> > +              */
> > +             if (!exclusive_folio_of_dying_process(folio, vma) && pte_young(ptent) &&
> > +                             likely(vma_has_recency(vma)))
> >                       folio_mark_accessed(folio);
> >               rss[mm_counter(folio)] -= nr;
> >       } else {
> > --
> > 2.39.3 (Apple Git-146)
>
>
> --
> Best Regards,
> Yan, Zi

Thanks
Barry



* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
  2025-04-16  7:48   ` Barry Song
@ 2025-04-16  8:24     ` Baolin Wang
  0 siblings, 0 replies; 19+ messages in thread
From: Baolin Wang @ 2025-04-16  8:24 UTC (permalink / raw)
  To: Barry Song, Zi Yan, Tangquan Zheng
  Cc: akpm, linux-mm, linux-kernel, Barry Song, David Hildenbrand,
	Johannes Weiner, Matthew Wilcox, Oscar Salvador, Ryan Roberts



On 2025/4/16 15:48, Barry Song wrote:
> On Sun, Apr 13, 2025 at 12:31 AM Zi Yan <ziy@nvidia.com> wrote:
>>
>> On 12 Apr 2025, at 4:58, Barry Song wrote:
>>
>>> From: Barry Song <v-songbaohua@oppo.com>
>>>
>>> Promoting exclusive file folios of a dying process is unnecessary and
>>> harmful. For example, while Firefox is killed and LibreOffice is
>>> launched, activating Firefox's young file-backed folios makes it
>>> harder to reclaim memory that LibreOffice doesn't use at all.
>>>
>>> An exiting process is unlikely to be restarted right away—it's
>>> either terminated by the user or killed by the OOM handler.
>>
>> The proposal looks reasonable to me. Do you have any performance number
>> about the improvement?
> 
> Tangquan ran the test on Android phones and saw 3% improvement on
> refault/thrashing things:

Good.

>                            w/o patch  w/ patch   improvement
> workingset_refault_anon      2215933   2146602         3.13%
> workingset_refault_file      9859208   9646518         2.16%
> pswpin                       2411086   2337790         3.04%
> pswpout                      6482838   6264865         3.36%
> 
> A further demotion of exclusive file folios can improvement more, but
> might be controversial. it could be a separate patch later.
> 
>>
>>>
>>> Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
>>> Cc: David Hildenbrand <david@redhat.com>
>>> Cc: Johannes Weiner <hannes@cmpxchg.org>
>>> Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
>>> Cc: Oscar Salvador <osalvador@suse.de>
>>> Cc: Ryan Roberts <ryan.roberts@arm.com>
>>> Cc: Zi Yan <ziy@nvidia.com>
>>> Signed-off-by: Barry Song <v-songbaohua@oppo.com>
>>> ---
>>>   mm/huge_memory.c |  4 ++--
>>>   mm/internal.h    | 19 +++++++++++++++++++
>>>   mm/memory.c      |  9 ++++++++-
>>>   3 files changed, 29 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>> index e97a97586478..05b83d2fcbb6 100644
>>> --- a/mm/huge_memory.c
>>> +++ b/mm/huge_memory.c
>>> @@ -2264,8 +2264,8 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>>>                         * Use flush_needed to indicate whether the PMD entry
>>>                         * is present, instead of checking pmd_present() again.
>>>                         */
>>> -                     if (flush_needed && pmd_young(orig_pmd) &&
>>> -                         likely(vma_has_recency(vma)))
>>> +                     if (!exclusive_folio_of_dying_process(folio, vma) && flush_needed &&

Nit: I prefer to check 'flush_needed' first to make sure it is a present 
pte. Otherwise look good to me.

>>> +                         pmd_young(orig_pmd) && likely(vma_has_recency(vma)))
>>>                                folio_mark_accessed(folio);
>>>                }
>>>
>>> diff --git a/mm/internal.h b/mm/internal.h
>>> index 4e0ea83aaf1c..666de96a293d 100644
>>> --- a/mm/internal.h
>>> +++ b/mm/internal.h
>>> @@ -11,6 +11,7 @@
>>>   #include <linux/khugepaged.h>
>>>   #include <linux/mm.h>
>>>   #include <linux/mm_inline.h>
>>> +#include <linux/oom.h>
>>>   #include <linux/pagemap.h>
>>>   #include <linux/pagewalk.h>
>>>   #include <linux/rmap.h>
>>> @@ -130,6 +131,24 @@ static inline int folio_nr_pages_mapped(const struct folio *folio)
>>>        return atomic_read(&folio->_nr_pages_mapped) & FOLIO_PAGES_MAPPED;
>>>   }
>>>
>>> +/*
>>> + * Return true if a folio is exclusive and belongs to an exiting or
>>> + * oom-reaped process; otherwise, return false.
>>> + */
>>> +static inline bool exclusive_folio_of_dying_process(struct folio *folio,
>>> +             struct vm_area_struct *vma)
>>> +{
>>> +     if (folio_maybe_mapped_shared(folio))
>>> +             return false;
>>> +
>>> +     if (!atomic_read(&vma->vm_mm->mm_users))
>>> +             return true;
>>> +     if (check_stable_address_space(vma->vm_mm))
>>> +             return true;
>>> +
>>> +     return false;
>>> +}
>>> +
>>>   /*
>>>    * Retrieve the first entry of a folio based on a provided entry within the
>>>    * folio. We cannot rely on folio->swap as there is no guarantee that it has
>>> diff --git a/mm/memory.c b/mm/memory.c
>>> index b9e8443aaa86..cab69275e473 100644
>>> --- a/mm/memory.c
>>> +++ b/mm/memory.c
>>> @@ -1515,7 +1515,14 @@ static __always_inline void zap_present_folio_ptes(struct mmu_gather *tlb,
>>>                                *force_flush = true;
>>>                        }
>>>                }
>>> -             if (pte_young(ptent) && likely(vma_has_recency(vma)))
>>> +
>>> +             /*
>>> +              * Skip marking exclusive file folios as accessed for processes that are
>>> +              * exiting or have been reaped due to OOM. This prevents unnecessary
>>> +              * promotion of folios that won't benefit the new process being launched.
>>> +              */
>>> +             if (!exclusive_folio_of_dying_process(folio, vma) && pte_young(ptent) &&
>>> +                             likely(vma_has_recency(vma)))
>>>                        folio_mark_accessed(folio);
>>>                rss[mm_counter(folio)] -= nr;
>>>        } else {
>>> --
>>> 2.39.3 (Apple Git-146)
>>
>>
>> --
>> Best Regards,
>> Yan, Zi
> 
> Thanks
> Barry



* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
  2025-04-12  8:58 [RFC PATCH] mm: don't promote exclusive file folios of dying processes Barry Song
  2025-04-12 15:48 ` Matthew Wilcox
  2025-04-12 16:31 ` Zi Yan
@ 2025-04-16  8:32 ` David Hildenbrand
  2025-04-16  9:24   ` Barry Song
  2 siblings, 1 reply; 19+ messages in thread
From: David Hildenbrand @ 2025-04-16  8:32 UTC (permalink / raw)
  To: Barry Song, akpm, linux-mm
  Cc: linux-kernel, Barry Song, Baolin Wang, Johannes Weiner,
	Matthew Wilcox, Oscar Salvador, Ryan Roberts, Zi Yan

On 12.04.25 10:58, Barry Song wrote:
> From: Barry Song <v-songbaohua@oppo.com>
> 
> Promoting exclusive file folios of a dying process is unnecessary and
> harmful. For example, while Firefox is killed and LibreOffice is
> launched, activating Firefox's young file-backed folios makes it
> harder to reclaim memory that LibreOffice doesn't use at all.

Do we know when it is reasonable to promote any folios of a dying process?

Assume you restart Firefox, would it really matter to promote them when 
unmapping? New Firefox would fault-in / touch the ones it really needs 
immediately afterwards?

-- 
Cheers,

David / dhildenb




* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
  2025-04-16  8:32 ` David Hildenbrand
@ 2025-04-16  9:24   ` Barry Song
  2025-04-16  9:32     ` David Hildenbrand
  0 siblings, 1 reply; 19+ messages in thread
From: Barry Song @ 2025-04-16  9:24 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: akpm, linux-mm, linux-kernel, Barry Song, Baolin Wang,
	Johannes Weiner, Matthew Wilcox, Oscar Salvador, Ryan Roberts,
	Zi Yan

On Wed, Apr 16, 2025 at 4:32 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 12.04.25 10:58, Barry Song wrote:
> > From: Barry Song <v-songbaohua@oppo.com>
> >
> > Promoting exclusive file folios of a dying process is unnecessary and
> > harmful. For example, while Firefox is killed and LibreOffice is
> > launched, activating Firefox's young file-backed folios makes it
> > harder to reclaim memory that LibreOffice doesn't use at all.
>
> Do we know when it is reasonable to promote any folios of a dying process?
>

I don't know; it doesn't seem reasonable at all. If a service crashes due to
a software bug, systemd will restart it immediately, and that may be a case
where promoting folios helps. But that is really a bug in the service, not
the normal case.

> Assume you restart Firefox, would it really matter to promote them when
> unmapping? New Firefox would fault-in / touch the ones it really needs
> immediately afterwards?

Usually users kill Firefox in order to start other applications (they intend
to free memory for the new applications). On Android, an app might be killed
because it has been sitting inactive in the background for a while.

On the other hand, even if users restart Firefox immediately, its folios are
probably still on the LRU and will be hit.

>
> --
> Cheers,
>
> David / dhildenb
>
Thanks
Barry



* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
  2025-04-16  9:24   ` Barry Song
@ 2025-04-16  9:32     ` David Hildenbrand
  2025-04-16  9:38       ` Barry Song
  0 siblings, 1 reply; 19+ messages in thread
From: David Hildenbrand @ 2025-04-16  9:32 UTC (permalink / raw)
  To: Barry Song
  Cc: akpm, linux-mm, linux-kernel, Barry Song, Baolin Wang,
	Johannes Weiner, Matthew Wilcox, Oscar Salvador, Ryan Roberts,
	Zi Yan

On 16.04.25 11:24, Barry Song wrote:
> On Wed, Apr 16, 2025 at 4:32 PM David Hildenbrand <david@redhat.com> wrote:
>>
>> On 12.04.25 10:58, Barry Song wrote:
>>> From: Barry Song <v-songbaohua@oppo.com>
>>>
>>> Promoting exclusive file folios of a dying process is unnecessary and
>>> harmful. For example, while Firefox is killed and LibreOffice is
>>> launched, activating Firefox's young file-backed folios makes it
>>> harder to reclaim memory that LibreOffice doesn't use at all.
>>
>> Do we know when it is reasonable to promote any folios of a dying process?
>>
> 
> I don't know. It seems not reasonable at all. if one service crashes due to
> SW bug, systemd will restart it immediately. this might be the case promoting
> folios might be good. but it is really a bug of the service, not a normal case.
> 
>> Assume you restart Firefox, would it really matter to promote them when
>> unmapping? New Firefox would fault-in / touch the ones it really needs
>> immediately afterwards?
> 
> Usually users kill firefox to start other applications (users intend
> to free memory for new applications). For Android, an app might be killed
> because it has been staying in the background inactively for a while.

> On the other hand, even if users restart firefox immediately, their folios are
> probably still in LRU to hit.

Right, that's what I'm thinking.

So I wonder if we could just say "the whole process is going down; even 
if we had some recency information, that could only affect some other 
process, where we would have to guess if it really matters".

If the data is important, one would assume that another process would 
soon access it either way, and as you say, likely it will still be on 
the LRU to hit.

-- 
Cheers,

David / dhildenb




* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
  2025-04-16  9:32     ` David Hildenbrand
@ 2025-04-16  9:38       ` Barry Song
  2025-04-16  9:40         ` David Hildenbrand
  0 siblings, 1 reply; 19+ messages in thread
From: Barry Song @ 2025-04-16  9:38 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: akpm, linux-mm, linux-kernel, Barry Song, Baolin Wang,
	Johannes Weiner, Matthew Wilcox, Oscar Salvador, Ryan Roberts,
	Zi Yan

On Wed, Apr 16, 2025 at 5:32 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 16.04.25 11:24, Barry Song wrote:
> > On Wed, Apr 16, 2025 at 4:32 PM David Hildenbrand <david@redhat.com> wrote:
> >>
> >> On 12.04.25 10:58, Barry Song wrote:
> >>> From: Barry Song <v-songbaohua@oppo.com>
> >>>
> >>> Promoting exclusive file folios of a dying process is unnecessary and
> >>> harmful. For example, while Firefox is killed and LibreOffice is
> >>> launched, activating Firefox's young file-backed folios makes it
> >>> harder to reclaim memory that LibreOffice doesn't use at all.
> >>
> >> Do we know when it is reasonable to promote any folios of a dying process?
> >>
> >
> > I don't know. It seems not reasonable at all. if one service crashes due to
> > SW bug, systemd will restart it immediately. this might be the case promoting
> > folios might be good. but it is really a bug of the service, not a normal case.
> >
> >> Assume you restart Firefox, would it really matter to promote them when
> >> unmapping? New Firefox would fault-in / touch the ones it really needs
> >> immediately afterwards?
> >
> > Usually users kill firefox to start other applications (users intend
> > to free memory for new applications). For Android, an app might be killed
> > because it has been staying in the background inactively for a while.
>
> > On the other hand, even if users restart firefox immediately, their folios are
> > probably still in LRU to hit.
>
> Right, that's what I'm thinking.
>
> So I wonder if we could just say "the whole process is going down; even
> if we had some recency information, that could only affect some other
> process, where we would have to guess if it really matters".
>
> If the data is important, one would assume that another process would
> soon access it either way, and as you say, likely it will still be on
> the LRU to hit.

I'll include this additional information in v2 of the patch, since you
think it would be helpful.

Regarding the exclusive flag - I'm wondering whether we actually need to
distinguish between exclusive and shared folios in this case. The current
patch uses the exclusive flag mainly to reduce controversy, but even for
shared folios: does the recency from a dying process matter? The
recency information only reflects the dying process's usage pattern, which
will soon be irrelevant.

>
> --
> Cheers,
>
> David / dhildenb
>

Thanks
Barry



* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
  2025-04-16  9:38       ` Barry Song
@ 2025-04-16  9:40         ` David Hildenbrand
  2025-04-16 14:15           ` Johannes Weiner
  0 siblings, 1 reply; 19+ messages in thread
From: David Hildenbrand @ 2025-04-16  9:40 UTC (permalink / raw)
  To: Barry Song
  Cc: akpm, linux-mm, linux-kernel, Barry Song, Baolin Wang,
	Johannes Weiner, Matthew Wilcox, Oscar Salvador, Ryan Roberts,
	Zi Yan

On 16.04.25 11:38, Barry Song wrote:
> On Wed, Apr 16, 2025 at 5:32 PM David Hildenbrand <david@redhat.com> wrote:
>>
>> On 16.04.25 11:24, Barry Song wrote:
>>> On Wed, Apr 16, 2025 at 4:32 PM David Hildenbrand <david@redhat.com> wrote:
>>>>
>>>> On 12.04.25 10:58, Barry Song wrote:
>>>>> From: Barry Song <v-songbaohua@oppo.com>
>>>>>
>>>>> Promoting exclusive file folios of a dying process is unnecessary and
>>>>> harmful. For example, while Firefox is killed and LibreOffice is
>>>>> launched, activating Firefox's young file-backed folios makes it
>>>>> harder to reclaim memory that LibreOffice doesn't use at all.
>>>>
>>>> Do we know when it is reasonable to promote any folios of a dying process?
>>>>
>>>
>>> I don't know. It seems not reasonable at all. if one service crashes due to
>>> SW bug, systemd will restart it immediately. this might be the case promoting
>>> folios might be good. but it is really a bug of the service, not a normal case.
>>>
>>>> Assume you restart Firefox, would it really matter to promote them when
>>>> unmapping? New Firefox would fault-in / touch the ones it really needs
>>>> immediately afterwards?
>>>
>>> Usually users kill firefox to start other applications (users intend
>>> to free memory for new applications). For Android, an app might be killed
>>> because it has been staying in the background inactively for a while.
>>
>>> On the other hand, even if users restart firefox immediately, their folios are
>>> probably still in LRU to hit.
>>
>> Right, that's what I'm thinking.
>>
>> So I wonder if we could just say "the whole process is going down; even
>> if we had some recency information, that could only affect some other
>> process, where we would have to guess if it really matters".
>>
>> If the data is important, one would assume that another process would
>> soon access it either way, and as you say, likely it will still be on
>> the LRU to hit.
> 
> I'll include this additional information in the v2 version of the patch since
> you think it would be helpful.
> 
> Regarding the exclusive flag - I'm wondering whether we actually need to
> distinguish between exclusive and shared folios in this case. The current
> patch uses the exclusive flag mainly to reduce controversy, but even for
> shared folios: does the recency from a dying process matter? The
> recency information only reflects the dying process's usage pattern, which
> will soon be irrelevant.

Exactly my thoughts. So if we can simplify -- ignore it completely -- 
that would certainly be nice.

-- 
Cheers,

David / dhildenb




* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
  2025-04-16  9:40         ` David Hildenbrand
@ 2025-04-16 14:15           ` Johannes Weiner
  2025-04-16 15:59             ` David Hildenbrand
  0 siblings, 1 reply; 19+ messages in thread
From: Johannes Weiner @ 2025-04-16 14:15 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Barry Song, akpm, linux-mm, linux-kernel, Barry Song,
	Baolin Wang, Matthew Wilcox, Oscar Salvador, Ryan Roberts,
	Zi Yan

On Wed, Apr 16, 2025 at 11:40:31AM +0200, David Hildenbrand wrote:
> On 16.04.25 11:38, Barry Song wrote:
> > On Wed, Apr 16, 2025 at 5:32 PM David Hildenbrand <david@redhat.com> wrote:
> >>
> >> On 16.04.25 11:24, Barry Song wrote:
> >>> On Wed, Apr 16, 2025 at 4:32 PM David Hildenbrand <david@redhat.com> wrote:
> >>>>
> >>>> On 12.04.25 10:58, Barry Song wrote:
> >>>>> From: Barry Song <v-songbaohua@oppo.com>
> >>>>>
> >>>>> Promoting exclusive file folios of a dying process is unnecessary and
> >>>>> harmful. For example, while Firefox is killed and LibreOffice is
> >>>>> launched, activating Firefox's young file-backed folios makes it
> >>>>> harder to reclaim memory that LibreOffice doesn't use at all.
> >>>>
> >>>> Do we know when it is reasonable to promote any folios of a dying process?
> >>>>
> >>>
> >>> I don't know. It seems not reasonable at all. if one service crashes due to
> >>> SW bug, systemd will restart it immediately. this might be the case promoting
> >>> folios might be good. but it is really a bug of the service, not a normal case.
> >>>
> >>>> Assume you restart Firefox, would it really matter to promote them when
> >>>> unmapping? New Firefox would fault-in / touch the ones it really needs
> >>>> immediately afterwards?
> >>>
> >>> Usually users kill firefox to start other applications (users intend
> >>> to free memory
> >>> for new applications). For Android, an app might be killed because it has been
> >>> staying in the background inactively for a while.
> >>
> >>> On the other hand, even if users restart firefox immediately, their folios are
> >>> probably still in LRU to hit.
> >>
> >> Right, that's what I'm thinking.
> >>
> >> So I wonder if we could just say "the whole process is going down; even
> >> if we had some recency information, that could only affect some other
> >> process, where we would have to guess if it really matters".
> >>
> >> If the data is important, one would assume that another process would
> >> soon access it either way, and as you say, likely it will still be on
> >> the LRU to hit.
> > 
> > I'll include this additional information in the v2 version of the patch since
> > you think it would be helpful.
> > 
> > Regarding the exclusive flag - I'm wondering whether we actually need to
> > distinguish between exclusive and shared folios in this case. The current
> > patch uses the exclusive flag mainly to reduce controversy, but even for
> > shared folios: does the recency from a dying process matter? The
> > recency information only reflects the dying process's usage pattern, which
> > will soon be irrelevant.
> 
> Exactly my thoughts. So if we can simplify -- ignore it completely -- 
> that would certainly be nice.

This doesn't sound right to me.

Remembering the accesses of an exiting task is very much the point of
this. Consider executables and shared libraries repeatedly referenced
by short-lived jobs, like shell scripts, compiles etc.

MADV_COLD and MADV_PAGEOUT were specifically added for this Android
usecase - the rare situation where you *know* those pages are done.



* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
  2025-04-16 14:15           ` Johannes Weiner
@ 2025-04-16 15:59             ` David Hildenbrand
  2025-04-16 18:18               ` Johannes Weiner
  0 siblings, 1 reply; 19+ messages in thread
From: David Hildenbrand @ 2025-04-16 15:59 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Barry Song, akpm, linux-mm, linux-kernel, Barry Song,
	Baolin Wang, Matthew Wilcox, Oscar Salvador, Ryan Roberts,
	Zi Yan

On 16.04.25 16:15, Johannes Weiner wrote:
> On Wed, Apr 16, 2025 at 11:40:31AM +0200, David Hildenbrand wrote:
>> On 16.04.25 11:38, Barry Song wrote:
>>> On Wed, Apr 16, 2025 at 5:32 PM David Hildenbrand <david@redhat.com> wrote:
>>>>
>>>> On 16.04.25 11:24, Barry Song wrote:
>>>>> On Wed, Apr 16, 2025 at 4:32 PM David Hildenbrand <david@redhat.com> wrote:
>>>>>>
>>>>>> On 12.04.25 10:58, Barry Song wrote:
>>>>>>> From: Barry Song <v-songbaohua@oppo.com>
>>>>>>>
>>>>>>> Promoting exclusive file folios of a dying process is unnecessary and
>>>>>>> harmful. For example, while Firefox is killed and LibreOffice is
>>>>>>> launched, activating Firefox's young file-backed folios makes it
>>>>>>> harder to reclaim memory that LibreOffice doesn't use at all.
>>>>>>
>>>>>> Do we know when it is reasonable to promote any folios of a dying process?
>>>>>>
>>>>>
>>>>> I don't know. It seems not reasonable at all. if one service crashes due to
>>>>> SW bug, systemd will restart it immediately. this might be the case promoting
>>>>> folios might be good. but it is really a bug of the service, not a normal case.
>>>>>
>>>>>> Assume you restart Firefox, would it really matter to promote them when
>>>>>> unmapping? New Firefox would fault-in / touch the ones it really needs
>>>>>> immediately afterwards?
>>>>>
>>>>> Usually users kill firefox to start other applications (users intend
>>>>> to free memory
>>>>> for new applications). For Android, an app might be killed because it has been
>>>>> staying in the background inactively for a while.
>>>>
>>>>> On the other hand, even if users restart firefox immediately, their folios are
>>>>> probably still in LRU to hit.
>>>>
>>>> Right, that's what I'm thinking.
>>>>
>>>> So I wonder if we could just say "the whole process is going down; even
>>>> if we had some recency information, that could only affect some other
>>>> process, where we would have to guess if it really matters".
>>>>
>>>> If the data is important, one would assume that another process would
>>>> soon access it either way, and as you say, likely it will still be on
>>>> the LRU to hit.
>>>
>>> I'll include this additional information in the v2 version of the patch since
>>> you think it would be helpful.
>>>
>>> Regarding the exclusive flag - I'm wondering whether we actually need to
>>> distinguish between exclusive and shared folios in this case. The current
>>> patch uses the exclusive flag mainly to reduce controversy, but even for
>>> shared folios: does the recency from a dying process matter? The
>>> recency information only reflects the dying process's usage pattern, which
>>> will soon be irrelevant.
>>
>> Exactly my thoughts. So if we can simplify -- ignore it completely --
>> that would certainly be nice.
> 
> This doesn't sound right to me.
> 
> Remembering the accesses of an exiting task is very much the point of
> this. Consider executables and shared libraries repeatedly referenced
> by short-lived jobs, like shell scripts, compiles etc.

For these always-mmaped / never read/write files I tend to agree.

But, is it really a good indication whether a folio is exclusive to this 
process or not?

I mean, if a bash script executes the same executable repeatedly, but
never multiple copies at the same time, we would also not be tracking the
access with this patch.

Similarly with an app that mmaps() a large data set (DB, VM, ML, ..) 
exclusively. Re-starting the app would not track recency with this patch.
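
To make the two cases above concrete, here is a minimal userspace model of the check, assuming the helper from the RFC boils down to "process is dying and the file folio is mapped only by it". The struct and function names are hypothetical and are not kernel code; the real helper's body may differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one file-backed mapping at unmap time. */
struct unmap_ctx {
	bool pte_young;        /* accessed bit was set in the mapping */
	bool vma_has_recency;  /* mapping takes part in recency tracking */
	bool process_dying;    /* owning process is exiting */
	bool folio_exclusive;  /* folio is mapped by this process only */
};

/* Behaviour without the RFC: a young mapping keeps its recency. */
static bool promote_today(const struct unmap_ctx *c)
{
	return c->pte_young && c->vma_has_recency;
}

/* Behaviour with the RFC's extra check placed in front (assumed form). */
static bool promote_with_rfc(const struct unmap_ctx *c)
{
	if (c->process_dying && c->folio_exclusive)
		return false;
	return promote_today(c);
}

/* Single-instance restart loop: the binary is hot, yet mapped only once. */
static void single_instance_example(void)
{
	struct unmap_ctx c = { true, true, true, true };

	assert(promote_today(&c));      /* access is remembered today */
	assert(!promote_with_rfc(&c));  /* access is dropped with the RFC */
}
```

In this model, a second concurrent instance (folio_exclusive == false) is the only thing that preserves the access, which is the arbitrariness being questioned.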

But I guess there is no right or wrong ...

-- 
Cheers,

David / dhildenb




* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
  2025-04-16 15:59             ` David Hildenbrand
@ 2025-04-16 18:18               ` Johannes Weiner
  2025-04-16 21:54                 ` Barry Song
  0 siblings, 1 reply; 19+ messages in thread
From: Johannes Weiner @ 2025-04-16 18:18 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Barry Song, akpm, linux-mm, linux-kernel, Barry Song,
	Baolin Wang, Matthew Wilcox, Oscar Salvador, Ryan Roberts,
	Zi Yan

On Wed, Apr 16, 2025 at 05:59:12PM +0200, David Hildenbrand wrote:
> On 16.04.25 16:15, Johannes Weiner wrote:
> > On Wed, Apr 16, 2025 at 11:40:31AM +0200, David Hildenbrand wrote:
> >> On 16.04.25 11:38, Barry Song wrote:
> >>> On Wed, Apr 16, 2025 at 5:32 PM David Hildenbrand <david@redhat.com> wrote:
> >>>>
> >>>> On 16.04.25 11:24, Barry Song wrote:
> >>>>> On Wed, Apr 16, 2025 at 4:32 PM David Hildenbrand <david@redhat.com> wrote:
> >>>>>>
> >>>>>> On 12.04.25 10:58, Barry Song wrote:
> >>>>>>> From: Barry Song <v-songbaohua@oppo.com>
> >>>>>>>
> >>>>>>> Promoting exclusive file folios of a dying process is unnecessary and
> >>>>>>> harmful. For example, while Firefox is killed and LibreOffice is
> >>>>>>> launched, activating Firefox's young file-backed folios makes it
> >>>>>>> harder to reclaim memory that LibreOffice doesn't use at all.
> >>>>>>
> >>>>>> Do we know when it is reasonable to promote any folios of a dying process?
> >>>>>>
> >>>>>
> >>>>> I don't know. It seems not reasonable at all. if one service crashes due to
> >>>>> SW bug, systemd will restart it immediately. this might be the case promoting
> >>>>> folios might be good. but it is really a bug of the service, not a normal case.
> >>>>>
> >>>>>> Assume you restart Firefox, would it really matter to promote them when
> >>>>>> unmapping? New Firefox would fault-in / touch the ones it really needs
> >>>>>> immediately afterwards?
> >>>>>
> >>>>> Usually users kill firefox to start other applications (users intend
> >>>>> to free memory
> >>>>> for new applications). For Android, an app might be killed because it has been
> >>>>> staying in the background inactively for a while.
> >>>>
> >>>>> On the other hand, even if users restart firefox immediately, their folios are
> >>>>> probably still in LRU to hit.
> >>>>
> >>>> Right, that's what I'm thinking.
> >>>>
> >>>> So I wonder if we could just say "the whole process is going down; even
> >>>> if we had some recency information, that could only affect some other
> >>>> process, where we would have to guess if it really matters".
> >>>>
> >>>> If the data is important, one would assume that another process would
> >>>> soon access it either way, and as you say, likely it will still be on
> >>>> the LRU to hit.
> >>>
> >>> I'll include this additional information in the v2 version of the patch since
> >>> you think it would be helpful.
> >>>
> >>> Regarding the exclusive flag - I'm wondering whether we actually need to
> >>> distinguish between exclusive and shared folios in this case. The current
> >>> patch uses the exclusive flag mainly to reduce controversy, but even for
> >>> shared folios: does the recency from a dying process matter? The
> >>> recency information only reflects the dying process's usage pattern, which
> >>> will soon be irrelevant.
> >>
> >> Exactly my thoughts. So if we can simplify -- ignore it completely --
> >> that would certainly be nice.
> > 
> > This doesn't sound right to me.
> > 
> > Remembering the accesses of an exiting task is very much the point of
> > this. Consider executables and shared libraries repeatedly referenced
> > by short-lived jobs, like shell scripts, compiles etc.
> 
> For these always-mmaped / never read/write files I tend to agree.
> 
> But, is it really a good indication whether a folio is exclusive to this 
> process or not?
>
> I mean, if a bash script executes the same executable repeatedly, but
> never multiple copies at the same time, we would also not be tracking the
> access with this patch.
> 
> Similarly with an app that mmaps() a large data set (DB, VM, ML, ..) 
> exclusively. Re-starting the app would not track recency with this patch.
> 
> But I guess there is no right or wrong ...

Right, I'm more broadly objecting to the patch and its premise, but
thought the exclusive filtering would at least mitigate its downsides
somewhat. You raise good points that it's not as clear cut.

IMO this is too subtle and unpredictable for everybody else. The
kernel can't see the future, but access locality and recent use is a
proven predictor. We generally don't discard access information,
unless the user asks us to, and that's what the madvise calls are for.
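
For reference, a minimal sketch of what "the user asks us to" looks like from userspace. MADV_COLD and MADV_PAGEOUT are real madvise() advice values (Linux 5.4 and later); the helper name is hypothetical, and the fallback defines assume the asm-generic uapi values in case the libc headers predate them.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Fallbacks for old libc headers (asm-generic uapi values, assumed). */
#ifndef MADV_COLD
#define MADV_COLD 20
#endif
#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21
#endif

/*
 * Hypothetical helper: tell the kernel that [addr, addr + len) is done.
 * pageout == 0 deactivates the pages (MADV_COLD); pageout != 0 asks for
 * immediate reclaim (MADV_PAGEOUT). Returns 0 or a negative errno.
 */
static int mark_range_cold(void *addr, size_t len, int pageout)
{
	assert(addr != NULL && len > 0);

	if (madvise(addr, len, pageout ? MADV_PAGEOUT : MADV_COLD) != 0)
		return -errno;
	return 0;
}
```

The caller has to know that the range is done, which is exactly the knowledge the kernel lacks at exit time.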



* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
  2025-04-16 18:18               ` Johannes Weiner
@ 2025-04-16 21:54                 ` Barry Song
  2025-04-16 23:58                   ` Johannes Weiner
  0 siblings, 1 reply; 19+ messages in thread
From: Barry Song @ 2025-04-16 21:54 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: David Hildenbrand, akpm, linux-mm, linux-kernel, Barry Song,
	Baolin Wang, Matthew Wilcox, Oscar Salvador, Ryan Roberts,
	Zi Yan

On Thu, Apr 17, 2025 at 2:18 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
>
> On Wed, Apr 16, 2025 at 05:59:12PM +0200, David Hildenbrand wrote:
> > On 16.04.25 16:15, Johannes Weiner wrote:
> > > On Wed, Apr 16, 2025 at 11:40:31AM +0200, David Hildenbrand wrote:
> > >> On 16.04.25 11:38, Barry Song wrote:
> > >>> On Wed, Apr 16, 2025 at 5:32 PM David Hildenbrand <david@redhat.com> wrote:
> > >>>>
> > >>>> On 16.04.25 11:24, Barry Song wrote:
> > >>>>> On Wed, Apr 16, 2025 at 4:32 PM David Hildenbrand <david@redhat.com> wrote:
> > >>>>>>
> > >>>>>> On 12.04.25 10:58, Barry Song wrote:
> > >>>>>>> From: Barry Song <v-songbaohua@oppo.com>
> > >>>>>>>
> > >>>>>>> Promoting exclusive file folios of a dying process is unnecessary and
> > >>>>>>> harmful. For example, while Firefox is killed and LibreOffice is
> > >>>>>>> launched, activating Firefox's young file-backed folios makes it
> > >>>>>>> harder to reclaim memory that LibreOffice doesn't use at all.
> > >>>>>>
> > >>>>>> Do we know when it is reasonable to promote any folios of a dying process?
> > >>>>>>
> > >>>>>
> > >>>>> I don't know. It seems not reasonable at all. if one service crashes due to
> > >>>>> SW bug, systemd will restart it immediately. this might be the case promoting
> > >>>>> folios might be good. but it is really a bug of the service, not a normal case.
> > >>>>>
> > >>>>>> Assume you restart Firefox, would it really matter to promote them when
> > >>>>>> unmapping? New Firefox would fault-in / touch the ones it really needs
> > >>>>>> immediately afterwards?
> > >>>>>
> > >>>>> Usually users kill firefox to start other applications (users intend
> > >>>>> to free memory
> > >>>>> for new applications). For Android, an app might be killed because it has been
> > >>>>> staying in the background inactively for a while.
> > >>>>
> > >>>>> On the other hand, even if users restart firefox immediately, their folios are
> > >>>>> probably still in LRU to hit.
> > >>>>
> > >>>> Right, that's what I'm thinking.
> > >>>>
> > >>>> So I wonder if we could just say "the whole process is going down; even
> > >>>> if we had some recency information, that could only affect some other
> > >>>> process, where we would have to guess if it really matters".
> > >>>>
> > >>>> If the data is important, one would assume that another process would
> > >>>> soon access it either way, and as you say, likely it will still be on
> > >>>> the LRU to hit.
> > >>>
> > >>> I'll include this additional information in the v2 version of the patch since
> > >>> you think it would be helpful.
> > >>>
> > >>> Regarding the exclusive flag - I'm wondering whether we actually need to
> > >>> distinguish between exclusive and shared folios in this case. The current
> > >>> patch uses the exclusive flag mainly to reduce controversy, but even for
> > >>> shared folios: does the recency from a dying process matter? The
> > >>> recency information only reflects the dying process's usage pattern, which
> > >>> will soon be irrelevant.
> > >>
> > >> Exactly my thoughts. So if we can simplify -- ignore it completely --
> > >> that would certainly be nice.
> > >
> > > This doesn't sound right to me.
> > >
> > > Remembering the accesses of an exiting task is very much the point of
> > > this. Consider executables and shared libraries repeatedly referenced
> > > by short-lived jobs, like shell scripts, compiles etc.
> >
> > For these always-mmaped / never read/write files I tend to agree.
> >
> > But, is it really a good indication whether a folio is exclusive to this
> > process or not?
> >
> > I mean, if a bash script executes the same executable repeatedly, but
> > never multiple copies at the same time, we would also not be tracking the
> > access with this patch.
> >
> > Similarly with an app that mmaps() a large data set (DB, VM, ML, ..)
> > exclusively. Re-starting the app would not track recency with this patch.
> >
> > But I guess there is no right or wrong ...
>
> Right, I'm more broadly objecting to the patch and its premise, but
> thought the exclusive filtering would at least mitigate its downsides
> somewhat. You raise good points that it's not as clear cut.
>
> IMO this is too subtle and unpredictable for everybody else. The
> kernel can't see the future, but access locality and recent use is a
> proven predictor. We generally don't discard access information,
> unless the user asks us to, and that's what the madvise calls are for.

David pointed out some exceptions - the recency of dying processes might
still be useful to new processes, particularly in cases like:

  while true; do app; done

Here, 'app' is repeatedly restarted but always maintains a single running
instance. I agree this seems correct.

However, we can also find many cases where a dying process means its folios
instantly become cold. For example:
- If someone enjoys watching his/her TV (not shared with family) and then
  passes away, the TV's folios become instantly cold.
- Even if the TV is shared with family but only that person actively used
  it, the folios still become cold. If other users access this TV too,
  shouldn't their PTEs reflect that it's still young?

I agree that "access locality and recent use" is generally a good heuristic,
but it must have some correlation (strong or weak) with the process lifecycle.
Implementing 'madv_cold' on a dying process seems impractical, as dying
means 'cold' in many cases. Also, it is really not doable to execute
madv_cold on a dying process.

Thanks
Barry



* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
  2025-04-16 21:54                 ` Barry Song
@ 2025-04-16 23:58                   ` Johannes Weiner
  2025-04-17  2:43                     ` Barry Song
  0 siblings, 1 reply; 19+ messages in thread
From: Johannes Weiner @ 2025-04-16 23:58 UTC (permalink / raw)
  To: Barry Song
  Cc: David Hildenbrand, akpm, linux-mm, linux-kernel, Barry Song,
	Baolin Wang, Matthew Wilcox, Oscar Salvador, Ryan Roberts,
	Zi Yan

On Thu, Apr 17, 2025 at 05:54:57AM +0800, Barry Song wrote:
> On Thu, Apr 17, 2025 at 2:18 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
> > Right, I'm more broadly objecting to the patch and its premise, but
> > thought the exclusive filtering would at least mitigate its downsides
> > somewhat. You raise good points that it's not as clear cut.
> >
> > IMO this is too subtle and unpredictable for everybody else. The
> > kernel can't see the future, but access locality and recent use is a
> > proven predictor. We generally don't discard access information,
> > unless the user asks us to, and that's what the madvise calls are for.
> 
> David pointed out some exceptions - the recency of dying processes might
> still be useful to new processes, particularly in cases like:
> 
>   while true; do app; done
> 
> Here, 'app' is repeatedly restarted but always maintains a single running
> instance. I agree this seems correct.
> 
> However, we can also find many cases where a dying process means its folios
> instantly become cold. For example:

Of course, there are many of them. Just like any access could be the
last one to that page for the next hour. But you don't know which ones
they are. Just like you don't know if I'm shutting down firefox
because that's enough internet for one day, or if I'm just restarting
it to clear out the 107 tabs I've lost track of.

> I agree that "access locality and recent use" is generally a good heuristic,
> but it must have some correlation (strong or weak) with the process lifecycle.

I don't agree. It's a cache shared between past, present and future
processes. The lifecycle of an individual process is not saying much.

Unless you know something about userspace, and the exact data at hand,
that the kernel doesn't, which is why the Android usecase of MADV_COLD
or PAGEOUT for background apps makes sense to me, but generally tying
it to a process death does not.



* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
  2025-04-16 23:58                   ` Johannes Weiner
@ 2025-04-17  2:43                     ` Barry Song
  2025-04-17 12:17                       ` Johannes Weiner
  0 siblings, 1 reply; 19+ messages in thread
From: Barry Song @ 2025-04-17  2:43 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: David Hildenbrand, akpm, linux-mm, linux-kernel, Barry Song,
	Baolin Wang, Matthew Wilcox, Oscar Salvador, Ryan Roberts,
	Zi Yan

On Thu, Apr 17, 2025 at 7:58 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
>
> On Thu, Apr 17, 2025 at 05:54:57AM +0800, Barry Song wrote:
> > On Thu, Apr 17, 2025 at 2:18 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
> > > Right, I'm more broadly objecting to the patch and its premise, but
> > > thought the exclusive filtering would at least mitigate its downsides
> > > somewhat. You raise good points that it's not as clear cut.
> > >
> > > IMO this is too subtle and unpredictable for everybody else. The
> > > kernel can't see the future, but access locality and recent use is a
> > > proven predictor. We generally don't discard access information,
> > > unless the user asks us to, and that's what the madvise calls are for.
> >
> > David pointed out some exceptions - the recency of dying processes might
> > still be useful to new processes, particularly in cases like:
> >
> >   while true; do app; done
> >
> > Here, 'app' is repeatedly restarted but always maintains a single running
> > instance. I agree this seems correct.
> >
> > However, we can also find many cases where a dying process means its folios
> > instantly become cold. For example:
>
> Of course, there are many of them. Just like any access could be the
> last one to that page for the next hour. But you don't know which ones
> they are. Just like you don't know if I'm shutting down firefox
> because that's enough internet for one day, or if I'm just restarting
> it to clear out the 107 tabs I've lost track of.

Typically, we focus on scenarios where multiple applications switch
seamlessly, for instance on a phone when transitioning between
different apps. The smoothness of these transitions matters most.
Immediately restarting a just-terminated app isn't problematic, since
its memory footprint often persists for a while before being reclaimed.

>
> > I agree that "access locality and recent use" is generally a good heuristic,
> > but it must have some correlation (strong or weak) with the process lifecycle.
>
> I don't agree. It's a cache shared between past, present and future
> processes. The lifecycle of an individual process is not saying much.
>
> Unless you know something about userspace, and the exact data at hand,
> that the kernel doesn't, which is why the Android usecase of MADV_COLD
> or PAGEOUT for background apps makes sense to me, but generally tying
> it to a process death does not.

I agree that MADV_COLD or PAGEOUT makes sense for background apps,
but I still believe process death is somewhat underestimated by you :-) In
Android, process death is actually a strong signal that an app is inactive and
consuming much memory—leading to its termination by either userspace or
the kernel's OOM mechanism.

We actually took a more aggressive approach by implementing a hook to demote
exclusive folios of dying apps, which yielded good results—reducing kswapd
overhead, refaults, and thrashing. Of course, it is even much more controversial
than this patch.

While I acknowledge that counter-examples to my described pattern can always
be found, our observations clearly show that process death is a big event - far
from being just a trivial unmap operation.

Anyway, not trying to push the patch as obviously it seems quite hard :-)

Thanks
Barry



* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
  2025-04-17  2:43                     ` Barry Song
@ 2025-04-17 12:17                       ` Johannes Weiner
  2025-04-17 12:57                         ` David Hildenbrand
  0 siblings, 1 reply; 19+ messages in thread
From: Johannes Weiner @ 2025-04-17 12:17 UTC (permalink / raw)
  To: Barry Song
  Cc: David Hildenbrand, akpm, linux-mm, linux-kernel, Barry Song,
	Baolin Wang, Matthew Wilcox, Oscar Salvador, Ryan Roberts,
	Zi Yan

Hi Barry,

On Thu, Apr 17, 2025 at 10:43:20AM +0800, Barry Song wrote:
> On Thu, Apr 17, 2025 at 7:58 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
> > On Thu, Apr 17, 2025 at 05:54:57AM +0800, Barry Song wrote:
> > > I agree that "access locality and recent use" is generally a good heuristic,
> > > but it must have some correlation (strong or weak) with the process lifecycle.
> >
> > I don't agree. It's a cache shared between past, present and future
> > processes. The lifecycle of an individual process is not saying much.
> >
> > Unless you know something about userspace, and the exact data at hand,
> > that the kernel doesn't, which is why the Android usecase of MADV_COLD
> > or PAGEOUT for background apps makes sense to me, but generally tying
> > it to a process death does not.
> 
> I agree that MADV_COLD or PAGEOUT makes sense for background apps,
> but I still believe process death is somewhat underestimated by you :-) In
> Android, process death is actually a strong signal that an app is inactive and
> consuming much memory—leading to its termination by either userspace or
> the kernel's OOM mechanism.

That's exactly what I'm saying, though. You know something about
userspace that the kernel doesn't, which results from the unique way
in which app scheduling and killing works on Android: you have recent
foreground apps and idle background apps that you can kill, and
switching back to them later transparently restarts them and shows the
user a fresh instance. But you have to admit that this is a unique
microcosm modeled on top of a conventional Unix process model.

So this doesn't necessarily translate to other Linux systems, like
servers or desktops. There is much higher concurrency, workingsets are
more static, there is no systematic distinction between foreground and
background apps (not in the Android sense), OOM killing is a rare
corner case most setups usually try hard to avoid, etc.

But surely even Android has system management components, daemons
etc. that fit more into that second category?

> We actually took a more aggressive approach by implementing a hook to demote
> exclusive folios of dying apps, which yielded good results—reducing kswapd
> overhead, refaults, and thrashing. Of course, it is even much more controversial
> than this patch.

That doesn't sound wrong to me for Android apps.

How about a prctl() to request the behavior for those specific app
processes where you have clear usage signal? And then by all means,
outright demote the pages, or even invalidate the cache.
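
A rough sketch of what such an opt-in could look like from the app side. The option name and number below are HYPOTHETICAL and invented for illustration only: no such prctl exists in mainline, and a real interface might take different arguments. On a current kernel the call should simply be rejected (typically with EINVAL).

```c
#include <assert.h>
#include <errno.h>
#include <sys/prctl.h>

/*
 * HYPOTHETICAL option number, made up for this sketch. It is chosen
 * only so that it should not collide with an existing prctl option.
 */
#define PR_HYPOTHETICAL_COLD_ON_EXIT 0x7fff0001

/*
 * Ask the kernel to treat this process's file folios as cold when it
 * exits. Returns 0 if the request was accepted, a negative errno if not.
 */
static int request_cold_on_exit(void)
{
	if (prctl(PR_HYPOTHETICAL_COLD_ON_EXIT, 1UL, 0UL, 0UL, 0UL) != 0) {
		assert(errno != 0);
		return -errno;
	}
	return 0;
}
```

The point of the per-process opt-in is that the userspace manager (which knows the app is an idle background instance) makes the call, instead of the kernel guessing from process death alone.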

Delete the files, discard the flash blocks! (Ok that was a joke).



* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
  2025-04-17 12:17                       ` Johannes Weiner
@ 2025-04-17 12:57                         ` David Hildenbrand
  2025-04-18  0:16                           ` Barry Song
  0 siblings, 1 reply; 19+ messages in thread
From: David Hildenbrand @ 2025-04-17 12:57 UTC (permalink / raw)
  To: Johannes Weiner, Barry Song
  Cc: akpm, linux-mm, linux-kernel, Barry Song, Baolin Wang,
	Matthew Wilcox, Oscar Salvador, Ryan Roberts, Zi Yan

>> We actually took a more aggressive approach by implementing a hook to demote
>> exclusive folios of dying apps, which yielded good results—reducing kswapd
>> overhead, refaults, and thrashing. Of course, it is even much more controversial
>> than this patch.
> 
> That doesn't sound wrong to me for Android apps.
> 
> How about a prctl() to request the behavior for those specific app
> processes where you have clear usage signal?

I was thinking about the same, so likely that might be a viable solution.

-- 
Cheers,

David / dhildenb




* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
  2025-04-17 12:57                         ` David Hildenbrand
@ 2025-04-18  0:16                           ` Barry Song
  0 siblings, 0 replies; 19+ messages in thread
From: Barry Song @ 2025-04-18  0:16 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Johannes Weiner, akpm, linux-mm, linux-kernel, Barry Song,
	Baolin Wang, Matthew Wilcox, Oscar Salvador, Ryan Roberts,
	Zi Yan

On Thu, Apr 17, 2025 at 8:57 PM David Hildenbrand <david@redhat.com> wrote:
>
> >> We actually took a more aggressive approach by implementing a hook to demote
> >> exclusive folios of dying apps, which yielded good results—reducing kswapd
> >> overhead, refaults, and thrashing. Of course, it is even much more controversial
> >> than this patch.
> >
> > That doesn't sound wrong to me for Android apps.
> >
> > How about a prctl() to request the behavior for those specific app
> > processes where you have clear usage signal?
>
> I was thinking about the same, so likely that might be a viable solution.

Many thanks to both Johannes and David for the suggestion. I’d be delighted to
take a look at this.

>
> --
> Cheers,
>
> David / dhildenb
>

Thanks
Barry



Thread overview: 19+ messages
2025-04-12  8:58 [RFC PATCH] mm: don't promote exclusive file folios of dying processes Barry Song
2025-04-12 15:48 ` Matthew Wilcox
2025-04-12 16:31 ` Zi Yan
2025-04-16  7:48   ` Barry Song
2025-04-16  8:24     ` Baolin Wang
2025-04-16  8:32 ` David Hildenbrand
2025-04-16  9:24   ` Barry Song
2025-04-16  9:32     ` David Hildenbrand
2025-04-16  9:38       ` Barry Song
2025-04-16  9:40         ` David Hildenbrand
2025-04-16 14:15           ` Johannes Weiner
2025-04-16 15:59             ` David Hildenbrand
2025-04-16 18:18               ` Johannes Weiner
2025-04-16 21:54                 ` Barry Song
2025-04-16 23:58                   ` Johannes Weiner
2025-04-17  2:43                     ` Barry Song
2025-04-17 12:17                       ` Johannes Weiner
2025-04-17 12:57                         ` David Hildenbrand
2025-04-18  0:16                           ` Barry Song
