* [RFC PATCH] mm: don't promote exclusive file folios of dying processes
@ 2025-04-12 8:58 Barry Song
2025-04-12 15:48 ` Matthew Wilcox
` (2 more replies)
0 siblings, 3 replies; 19+ messages in thread
From: Barry Song @ 2025-04-12 8:58 UTC
To: akpm, linux-mm
Cc: linux-kernel, Barry Song, Baolin Wang, David Hildenbrand,
Johannes Weiner, Matthew Wilcox, Oscar Salvador, Ryan Roberts,
Zi Yan
From: Barry Song <v-songbaohua@oppo.com>
Promoting exclusive file folios of a dying process is unnecessary and
harmful. For example, while Firefox is killed and LibreOffice is
launched, activating Firefox's young file-backed folios makes it
harder to reclaim memory that LibreOffice doesn't use at all.
An exiting process is unlikely to be restarted right away—it's
either terminated by the user or killed by the OOM handler.
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Zi Yan <ziy@nvidia.com>
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
---
mm/huge_memory.c | 4 ++--
mm/internal.h | 19 +++++++++++++++++++
mm/memory.c | 9 ++++++++-
3 files changed, 29 insertions(+), 3 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index e97a97586478..05b83d2fcbb6 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2264,8 +2264,8 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
* Use flush_needed to indicate whether the PMD entry
* is present, instead of checking pmd_present() again.
*/
- if (flush_needed && pmd_young(orig_pmd) &&
- likely(vma_has_recency(vma)))
+ if (!exclusive_folio_of_dying_process(folio, vma) && flush_needed &&
+ pmd_young(orig_pmd) && likely(vma_has_recency(vma)))
folio_mark_accessed(folio);
}
diff --git a/mm/internal.h b/mm/internal.h
index 4e0ea83aaf1c..666de96a293d 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -11,6 +11,7 @@
#include <linux/khugepaged.h>
#include <linux/mm.h>
#include <linux/mm_inline.h>
+#include <linux/oom.h>
#include <linux/pagemap.h>
#include <linux/pagewalk.h>
#include <linux/rmap.h>
@@ -130,6 +131,24 @@ static inline int folio_nr_pages_mapped(const struct folio *folio)
return atomic_read(&folio->_nr_pages_mapped) & FOLIO_PAGES_MAPPED;
}
+/*
+ * Return true if a folio is exclusive and belongs to an exiting or
+ * oom-reaped process; otherwise, return false.
+ */
+static inline bool exclusive_folio_of_dying_process(struct folio *folio,
+ struct vm_area_struct *vma)
+{
+ if (folio_maybe_mapped_shared(folio))
+ return false;
+
+ if (!atomic_read(&vma->vm_mm->mm_users))
+ return true;
+ if (check_stable_address_space(vma->vm_mm))
+ return true;
+
+ return false;
+}
+
/*
* Retrieve the first entry of a folio based on a provided entry within the
* folio. We cannot rely on folio->swap as there is no guarantee that it has
diff --git a/mm/memory.c b/mm/memory.c
index b9e8443aaa86..cab69275e473 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1515,7 +1515,14 @@ static __always_inline void zap_present_folio_ptes(struct mmu_gather *tlb,
*force_flush = true;
}
}
- if (pte_young(ptent) && likely(vma_has_recency(vma)))
+
+ /*
+ * Skip marking exclusive file folios as accessed for processes that are
+ * exiting or have been reaped due to OOM. This prevents unnecessary
+ * promotion of folios that won't benefit the new process being launched.
+ */
+ if (!exclusive_folio_of_dying_process(folio, vma) && pte_young(ptent) &&
+ likely(vma_has_recency(vma)))
folio_mark_accessed(folio);
rss[mm_counter(folio)] -= nr;
} else {
--
2.39.3 (Apple Git-146)
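The decision made by exclusive_folio_of_dying_process() can be tried out in userspace with a small model. This is only a sketch, not kernel code: struct mock_mm, struct mock_folio and both mock_* functions are hypothetical stand-ins for mm_users, check_stable_address_space() and folio_maybe_mapped_shared(), and the model ignores locking and races entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the mm_struct state the helper looks at. */
struct mock_mm {
	int mm_users;	/* models atomic_read(&mm->mm_users) */
	bool unstable;	/* models check_stable_address_space() != 0 */
};

/* Hypothetical stand-in for the folio state the helper looks at. */
struct mock_folio {
	bool maybe_mapped_shared;	/* models folio_maybe_mapped_shared() */
};

/* Mirrors the control flow of the helper added to mm/internal.h. */
bool mock_exclusive_folio_of_dying_process(const struct mock_folio *folio,
					   const struct mock_mm *mm)
{
	if (folio->maybe_mapped_shared)
		return false;
	if (mm->mm_users == 0)
		return true;	/* the process is exiting */
	if (mm->unstable)
		return true;	/* the address space was OOM-reaped */
	return false;
}

/* Models the patched zap path: would folio_mark_accessed() be called? */
bool mock_should_mark_accessed(const struct mock_folio *folio,
			       const struct mock_mm *mm,
			       bool young, bool vma_has_recency)
{
	return !mock_exclusive_folio_of_dying_process(folio, mm) &&
	       young && vma_has_recency;
}
```

In this model a young, exclusive folio is still marked accessed while its process is alive, but not once mm_users has dropped to zero or the mm has been flagged unstable; a folio that may be mapped shared keeps the old behaviour in every case.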
* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
2025-04-12 8:58 [RFC PATCH] mm: don't promote exclusive file folios of dying processes Barry Song
@ 2025-04-12 15:48 ` Matthew Wilcox
2025-04-12 16:31 ` Zi Yan
2025-04-16 8:32 ` David Hildenbrand
2 siblings, 0 replies; 19+ messages in thread
From: Matthew Wilcox @ 2025-04-12 15:48 UTC
To: Barry Song
Cc: akpm, linux-mm, linux-kernel, Barry Song, Baolin Wang,
David Hildenbrand, Johannes Weiner, Oscar Salvador, Ryan Roberts,
Zi Yan

On Sat, Apr 12, 2025 at 08:58:52PM +1200, Barry Song wrote:
> + /*
> + * Skip marking exclusive file folios as accessed for processes that are
> + * exiting or have been reaped due to OOM. This prevents unnecessary
> + * promotion of folios that won't benefit the new process being launched.
> + */

Please wrap at 80 columns. One easy way to achieve this is to pipe it
through 'fmt -p \*'
* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
2025-04-12 8:58 [RFC PATCH] mm: don't promote exclusive file folios of dying processes Barry Song
2025-04-12 15:48 ` Matthew Wilcox
@ 2025-04-12 16:31 ` Zi Yan
2025-04-16 7:48 ` Barry Song
2025-04-16 8:32 ` David Hildenbrand
2 siblings, 1 reply; 19+ messages in thread
From: Zi Yan @ 2025-04-12 16:31 UTC
To: Barry Song
Cc: akpm, linux-mm, linux-kernel, Barry Song, Baolin Wang,
David Hildenbrand, Johannes Weiner, Matthew Wilcox, Oscar Salvador,
Ryan Roberts

On 12 Apr 2025, at 4:58, Barry Song wrote:
> From: Barry Song <v-songbaohua@oppo.com>
>
> Promoting exclusive file folios of a dying process is unnecessary and
> harmful. For example, while Firefox is killed and LibreOffice is
> launched, activating Firefox's young file-backed folios makes it
> harder to reclaim memory that LibreOffice doesn't use at all.
>
> [...]

The proposal looks reasonable to me. Do you have any performance numbers
for the improvement?

> [...]

--
Best Regards,
Yan, Zi
* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
2025-04-12 16:31 ` Zi Yan
@ 2025-04-16 7:48 ` Barry Song
2025-04-16 8:24 ` Baolin Wang
0 siblings, 1 reply; 19+ messages in thread
From: Barry Song @ 2025-04-16 7:48 UTC
To: Zi Yan, Tangquan Zheng
Cc: akpm, linux-mm, linux-kernel, Barry Song, Baolin Wang,
David Hildenbrand, Johannes Weiner, Matthew Wilcox, Oscar Salvador,
Ryan Roberts

On Sun, Apr 13, 2025 at 12:31 AM Zi Yan <ziy@nvidia.com> wrote:
> [...]
> The proposal looks reasonable to me. Do you have any performance numbers
> for the improvement?

Tangquan ran the test on Android phones and saw about a 3% improvement in
the refault/thrashing counters:

                          w/o patch   w/ patch   reduction
workingset_refault_anon     2215933    2146602       3.13%
workingset_refault_file     9859208    9646518       2.16%
pswpin                      2411086    2337790       3.04%
pswpout                     6482838    6264865       3.36%

Further demoting exclusive file folios could improve this more, but that
might be controversial; it could be a separate patch later.

> [...]

Thanks
Barry
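The percentages reported for the Android test can be re-derived from the raw counters as (without - with) / without. The helper below is an illustrative sketch; the struct and function names are made up for this example, and only the counter values come from the message above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* One vmstat counter sampled without and with the patch applied. */
struct counter_sample {
	const char *name;
	unsigned long without_patch;
	unsigned long with_patch;
};

/* Relative reduction of a counter, in percent of the unpatched value. */
double reduction_percent(unsigned long before, unsigned long after)
{
	return 100.0 * ((double)before - (double)after) / (double)before;
}

/* Counter values as reported in the thread. */
const struct counter_sample samples[] = {
	{ "workingset_refault_anon", 2215933UL, 2146602UL },
	{ "workingset_refault_file", 9859208UL, 9646518UL },
	{ "pswpin",                  2411086UL, 2337790UL },
	{ "pswpout",                 6482838UL, 6264865UL },
};

/* Print one line per counter with its reduction to two decimals. */
void print_reductions(void)
{
	for (size_t i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
		printf("%-24s %.2f%%\n", samples[i].name,
		       reduction_percent(samples[i].without_patch,
					 samples[i].with_patch));
}
```

All four counters drop by between roughly 2% and 3.4%, which matches the "about 3%" summary.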
* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
2025-04-16 7:48 ` Barry Song
@ 2025-04-16 8:24 ` Baolin Wang
0 siblings, 0 replies; 19+ messages in thread
From: Baolin Wang @ 2025-04-16 8:24 UTC
To: Barry Song, Zi Yan, Tangquan Zheng
Cc: akpm, linux-mm, linux-kernel, Barry Song, David Hildenbrand,
Johannes Weiner, Matthew Wilcox, Oscar Salvador, Ryan Roberts

On 2025/4/16 15:48, Barry Song wrote:
> On Sun, Apr 13, 2025 at 12:31 AM Zi Yan <ziy@nvidia.com> wrote:
>> [...]
>> The proposal looks reasonable to me. Do you have any performance numbers
>> for the improvement?
>
> Tangquan ran the test on Android phones and saw about a 3% improvement in
> the refault/thrashing counters:

Good.

>                           w/o patch   w/ patch   reduction
> workingset_refault_anon     2215933    2146602       3.13%
> workingset_refault_file     9859208    9646518       2.16%
> pswpin                      2411086    2337790       3.04%
> pswpout                     6482838    6264865       3.36%
>
> Further demoting exclusive file folios could improve this more, but that
> might be controversial; it could be a separate patch later.
>
>>> [...]
>>> --- a/mm/huge_memory.c
>>> +++ b/mm/huge_memory.c
>>> @@ -2264,8 +2264,8 @@ int zap_huge_pmd(struct mmu_gather *tlb, struct vm_area_struct *vma,
>>> * Use flush_needed to indicate whether the PMD entry
>>> * is present, instead of checking pmd_present() again.
>>> */
>>> - if (flush_needed && pmd_young(orig_pmd) &&
>>> - likely(vma_has_recency(vma)))
>>> + if (!exclusive_folio_of_dying_process(folio, vma) && flush_needed &&

Nit: I prefer to check 'flush_needed' first to make sure it is a present
pte. Otherwise looks good to me.

>>> + pmd_young(orig_pmd) && likely(vma_has_recency(vma)))
>>> folio_mark_accessed(folio);
>>> }
>>> [...]
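The ordering nit relies on C's short-circuit evaluation: operands of && are evaluated left to right, so whatever comes first gates everything after it. The userspace sketch below compares the ordering posted in the RFC with one possible reordering that tests flush_needed first. All names are hypothetical mocks rather than kernel functions, and the exact position of the helper call in the second variant is an assumption, since the review only asks for flush_needed to come first.

```c
#include <assert.h>
#include <stdbool.h>

/* Counts how often the mock helper is evaluated. */
int helper_calls;

/* Mock for exclusive_folio_of_dying_process(); returns its argument. */
bool mock_dying_exclusive(bool dying_exclusive)
{
	helper_calls++;
	return dying_exclusive;
}

/* Ordering as posted in the RFC: the helper is evaluated first. */
bool mark_accessed_rfc_order(bool flush_needed, bool young,
			     bool recency, bool dying_exclusive)
{
	return !mock_dying_exclusive(dying_exclusive) && flush_needed &&
	       young && recency;
}

/* One possible reordering: flush_needed gates every other check. */
bool mark_accessed_review_order(bool flush_needed, bool young,
				bool recency, bool dying_exclusive)
{
	return flush_needed && !mock_dying_exclusive(dying_exclusive) &&
	       young && recency;
}
```

Both variants return the same result for every input; they differ only in whether the helper runs when flush_needed is false, which is the readability point raised in the review.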
* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
2025-04-12 8:58 [RFC PATCH] mm: don't promote exclusive file folios of dying processes Barry Song
2025-04-12 15:48 ` Matthew Wilcox
2025-04-12 16:31 ` Zi Yan
@ 2025-04-16 8:32 ` David Hildenbrand
2025-04-16 9:24 ` Barry Song
2 siblings, 1 reply; 19+ messages in thread
From: David Hildenbrand @ 2025-04-16 8:32 UTC
To: Barry Song, akpm, linux-mm
Cc: linux-kernel, Barry Song, Baolin Wang, Johannes Weiner,
Matthew Wilcox, Oscar Salvador, Ryan Roberts, Zi Yan

On 12.04.25 10:58, Barry Song wrote:
> From: Barry Song <v-songbaohua@oppo.com>
>
> Promoting exclusive file folios of a dying process is unnecessary and
> harmful. For example, while Firefox is killed and LibreOffice is
> launched, activating Firefox's young file-backed folios makes it
> harder to reclaim memory that LibreOffice doesn't use at all.

Do we know when it is reasonable to promote any folios of a dying process?

Assume you restart Firefox, would it really matter to promote them when
unmapping? New Firefox would fault-in / touch the ones it really needs
immediately afterwards?

--
Cheers,

David / dhildenb
* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
2025-04-16 8:32 ` David Hildenbrand
@ 2025-04-16 9:24 ` Barry Song
2025-04-16 9:32 ` David Hildenbrand
0 siblings, 1 reply; 19+ messages in thread
From: Barry Song @ 2025-04-16 9:24 UTC
To: David Hildenbrand
Cc: akpm, linux-mm, linux-kernel, Barry Song, Baolin Wang,
Johannes Weiner, Matthew Wilcox, Oscar Salvador, Ryan Roberts,
Zi Yan

On Wed, Apr 16, 2025 at 4:32 PM David Hildenbrand <david@redhat.com> wrote:
>
> On 12.04.25 10:58, Barry Song wrote:
> > From: Barry Song <v-songbaohua@oppo.com>
> >
> > Promoting exclusive file folios of a dying process is unnecessary and
> > harmful. For example, while Firefox is killed and LibreOffice is
> > launched, activating Firefox's young file-backed folios makes it
> > harder to reclaim memory that LibreOffice doesn't use at all.
>
> Do we know when it is reasonable to promote any folios of a dying process?
>

I don't know. It doesn't seem reasonable at all. If a service crashes due to
a software bug, systemd will restart it immediately; that might be a case
where promoting folios helps. But that is really a bug in the service, not
the normal case.

> Assume you restart Firefox, would it really matter to promote them when
> unmapping? New Firefox would fault-in / touch the ones it really needs
> immediately afterwards?

Usually users kill Firefox to start other applications (they intend to free
memory for the new applications). On Android, an app might be killed because
it has been sitting inactive in the background for a while.

On the other hand, even if users restart Firefox immediately, its folios are
probably still on the LRU to hit.

>
> --
> Cheers,
>
> David / dhildenb
>

Thanks
Barry
* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
2025-04-16 9:24 ` Barry Song
@ 2025-04-16 9:32 ` David Hildenbrand
2025-04-16 9:38 ` Barry Song
0 siblings, 1 reply; 19+ messages in thread
From: David Hildenbrand @ 2025-04-16 9:32 UTC
To: Barry Song
Cc: akpm, linux-mm, linux-kernel, Barry Song, Baolin Wang,
Johannes Weiner, Matthew Wilcox, Oscar Salvador, Ryan Roberts,
Zi Yan

On 16.04.25 11:24, Barry Song wrote:
> On Wed, Apr 16, 2025 at 4:32 PM David Hildenbrand <david@redhat.com> wrote:
>> [...]
>> Do we know when it is reasonable to promote any folios of a dying process?
>>
>
> I don't know. It doesn't seem reasonable at all. If a service crashes due to
> a software bug, systemd will restart it immediately; that might be a case
> where promoting folios helps. But that is really a bug in the service, not
> the normal case.
>
>> Assume you restart Firefox, would it really matter to promote them when
>> unmapping? New Firefox would fault-in / touch the ones it really needs
>> immediately afterwards?
>
> Usually users kill Firefox to start other applications (they intend to free
> memory for the new applications). On Android, an app might be killed because
> it has been sitting inactive in the background for a while.
>
> On the other hand, even if users restart Firefox immediately, its folios are
> probably still on the LRU to hit.

Right, that's what I'm thinking.

So I wonder if we could just say "the whole process is going down; even
if we had some recency information, that could only affect some other
process, where we would have to guess if it really matters".

If the data is important, one would assume that another process would
soon access it either way, and as you say, likely it will still be on
the LRU to hit.

--
Cheers,

David / dhildenb
* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
2025-04-16 9:32 ` David Hildenbrand
@ 2025-04-16 9:38 ` Barry Song
2025-04-16 9:40 ` David Hildenbrand
0 siblings, 1 reply; 19+ messages in thread
From: Barry Song @ 2025-04-16 9:38 UTC
To: David Hildenbrand
Cc: akpm, linux-mm, linux-kernel, Barry Song, Baolin Wang,
Johannes Weiner, Matthew Wilcox, Oscar Salvador, Ryan Roberts,
Zi Yan

On Wed, Apr 16, 2025 at 5:32 PM David Hildenbrand <david@redhat.com> wrote:
> [...]
> Right, that's what I'm thinking.
>
> So I wonder if we could just say "the whole process is going down; even
> if we had some recency information, that could only affect some other
> process, where we would have to guess if it really matters".
>
> If the data is important, one would assume that another process would
> soon access it either way, and as you say, likely it will still be on
> the LRU to hit.

I'll include this additional information in the v2 version of the patch since
you think it would be helpful.

Regarding the exclusive flag - I'm wondering whether we actually need to
distinguish between exclusive and shared folios in this case. The current
patch uses the exclusive flag mainly to reduce controversy, but even for
shared folios: does the recency from a dying process matter? The
recency information only reflects the dying process's usage pattern, which
will soon be irrelevant.

>
> --
> Cheers,
>
> David / dhildenb
>

Thanks
Barry
* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
2025-04-16 9:38 ` Barry Song
@ 2025-04-16 9:40 ` David Hildenbrand
2025-04-16 14:15 ` Johannes Weiner
0 siblings, 1 reply; 19+ messages in thread
From: David Hildenbrand @ 2025-04-16 9:40 UTC
To: Barry Song
Cc: akpm, linux-mm, linux-kernel, Barry Song, Baolin Wang,
Johannes Weiner, Matthew Wilcox, Oscar Salvador, Ryan Roberts,
Zi Yan

On 16.04.25 11:38, Barry Song wrote:
> [...]
> I'll include this additional information in the v2 version of the patch since
> you think it would be helpful.
>
> Regarding the exclusive flag - I'm wondering whether we actually need to
> distinguish between exclusive and shared folios in this case. The current
> patch uses the exclusive flag mainly to reduce controversy, but even for
> shared folios: does the recency from a dying process matter? The
> recency information only reflects the dying process's usage pattern, which
> will soon be irrelevant.

Exactly my thoughts. So if we can simplify -- ignore it completely --
that would certainly be nice.

--
Cheers,

David / dhildenb
* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
2025-04-16 9:40 ` David Hildenbrand
@ 2025-04-16 14:15 ` Johannes Weiner
2025-04-16 15:59 ` David Hildenbrand
0 siblings, 1 reply; 19+ messages in thread
From: Johannes Weiner @ 2025-04-16 14:15 UTC
To: David Hildenbrand
Cc: Barry Song, akpm, linux-mm, linux-kernel, Barry Song, Baolin Wang,
Matthew Wilcox, Oscar Salvador, Ryan Roberts, Zi Yan

On Wed, Apr 16, 2025 at 11:40:31AM +0200, David Hildenbrand wrote:
> On 16.04.25 11:38, Barry Song wrote:
> > [...]
> > Regarding the exclusive flag - I'm wondering whether we actually need to
> > distinguish between exclusive and shared folios in this case. The current
> > patch uses the exclusive flag mainly to reduce controversy, but even for
> > shared folios: does the recency from a dying process matter? The
> > recency information only reflects the dying process's usage pattern, which
> > will soon be irrelevant.
>
> Exactly my thoughts. So if we can simplify -- ignore it completely --
> that would certainly be nice.

This doesn't sound right to me.

Remembering the accesses of an exiting task is very much the point of
this. Consider executables and shared libraries repeatedly referenced
by short-lived jobs, like shell scripts, compiles etc.

MADV_COLD and MADV_PAGEOUT were specifically added for this Android
usecase - the rare situation where you *know* those pages are done.
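For reference, a process that knows a range is no longer needed can say so explicitly with madvise(). The sketch below assumes Linux 5.4 or later, where MADV_COLD is available; hint_region_cold() is a made-up name for this example, and the fallback #define only covers libc headers that predate the constant. MADV_PAGEOUT can be passed in the same way to request immediate reclaim rather than deactivation.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MADV_COLD
#define MADV_COLD 20	/* value from the Linux UAPI headers (since 5.4) */
#endif

/*
 * Map and touch an anonymous region, then hint that it is cold so that
 * reclaim may consider it early. Returns 0 on success, otherwise the
 * errno of the call that failed (for example EINVAL on older kernels).
 */
int hint_region_cold(size_t len)
{
	int err = 0;
	char *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (buf == MAP_FAILED)
		return errno;

	memset(buf, 0xaa, len);		/* fault the pages in */

	if (madvise(buf, len, MADV_COLD) != 0)
		err = errno;

	munmap(buf, len);
	return err;
}
```

The hint is advisory and applies to the caller's own mappings, which is what makes it an opt-in alternative to inferring "these pages are done" at process exit.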
* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
2025-04-16 14:15 ` Johannes Weiner
@ 2025-04-16 15:59 ` David Hildenbrand
2025-04-16 18:18 ` Johannes Weiner
0 siblings, 1 reply; 19+ messages in thread
From: David Hildenbrand @ 2025-04-16 15:59 UTC (permalink / raw)
To: Johannes Weiner
Cc: Barry Song, akpm, linux-mm, linux-kernel, Barry Song, Baolin Wang,
Matthew Wilcox, Oscar Salvador, Ryan Roberts, Zi Yan

On 16.04.25 16:15, Johannes Weiner wrote:
> On Wed, Apr 16, 2025 at 11:40:31AM +0200, David Hildenbrand wrote:
>> On 16.04.25 11:38, Barry Song wrote:
>>> On Wed, Apr 16, 2025 at 5:32 PM David Hildenbrand <david@redhat.com> wrote:
>>>>
>>>> On 16.04.25 11:24, Barry Song wrote:
>>>>> On Wed, Apr 16, 2025 at 4:32 PM David Hildenbrand <david@redhat.com> wrote:
>>>>>>
>>>>>> On 12.04.25 10:58, Barry Song wrote:
>>>>>>> From: Barry Song <v-songbaohua@oppo.com>
>>>>>>>
>>>>>>> Promoting exclusive file folios of a dying process is unnecessary and
>>>>>>> harmful. For example, while Firefox is killed and LibreOffice is
>>>>>>> launched, activating Firefox's young file-backed folios makes it
>>>>>>> harder to reclaim memory that LibreOffice doesn't use at all.
>>>>>>
>>>>>> Do we know when it is reasonable to promote any folios of a dying process?
>>>>>>
>>>>>
>>>>> I don't know. It seems not reasonable at all. if one service crashes due to
>>>>> SW bug, systemd will restart it immediately. this might be the case promoting
>>>>> folios might be good. but it is really a bug of the service, not a normal case.
>>>>>
>>>>>> Assume you restart Firefox, would it really matter to promote them when
>>>>>> unmapping? New Firefox would fault-in / touch the ones it really needs
>>>>>> immediately afterwards?
>>>>>
>>>>> Usually users kill firefox to start other applications (users intend
>>>>> to free memory for new applications). For Android, an app might be
>>>>> killed because it has been staying in the background inactively for
>>>>> a while.
>>>>
>>>>> On the other hand, even if users restart firefox immediately, their folios are
>>>>> probably still in LRU to hit.
>>>>
>>>> Right, that's what I'm thinking.
>>>>
>>>> So I wonder if we could just say "the whole process is going down; even
>>>> if we had some recency information, that could only affect some other
>>>> process, where we would have to guess if it really matters".
>>>>
>>>> If the data is important, one would assume that another process would
>>>> soon access it either way, and as you say, likely it will still be on
>>>> the LRU to hit.
>>>
>>> I'll include this additional information in the v2 version of the patch since
>>> you think it would be helpful.
>>>
>>> Regarding the exclusive flag - I'm wondering whether we actually need to
>>> distinguish between exclusive and shared folios in this case. The current
>>> patch uses the exclusive flag mainly to reduce controversy, but even for
>>> shared folios: does the recency from a dying process matter? The
>>> recency information only reflects the dying process's usage pattern, which
>>> will soon be irrelevant.
>>
>> Exactly my thoughts. So if we can simplify -- ignore it completely --
>> that would certainly be nice.
>
> This doesn't sound right to me.
>
> Remembering the accesses of an exiting task is very much the point of
> this. Consider executables and shared libraries repeatedly referenced
> by short-lived jobs, like shell scripts, compiles etc.

For these always-mmaped / never read/write files I tend to agree.

But, is it really a good indication whether a folio is exclusive to this
process or not?

I mean, if a bash script executes the same executable repeatedly, but
never multiple copies at the same time, we would also not be tracking the
access with this patch.

Similarly with an app that mmaps() a large data set (DB, VM, ML, ..)
exclusively. Re-starting the app would not track recency with this patch.

But I guess there is no right or wrong ...

--
Cheers,

David / dhildenb
* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
2025-04-16 15:59 ` David Hildenbrand
@ 2025-04-16 18:18 ` Johannes Weiner
2025-04-16 21:54 ` Barry Song
0 siblings, 1 reply; 19+ messages in thread
From: Johannes Weiner @ 2025-04-16 18:18 UTC (permalink / raw)
To: David Hildenbrand
Cc: Barry Song, akpm, linux-mm, linux-kernel, Barry Song, Baolin Wang,
Matthew Wilcox, Oscar Salvador, Ryan Roberts, Zi Yan

On Wed, Apr 16, 2025 at 05:59:12PM +0200, David Hildenbrand wrote:
> On 16.04.25 16:15, Johannes Weiner wrote:
> > On Wed, Apr 16, 2025 at 11:40:31AM +0200, David Hildenbrand wrote:
> >> On 16.04.25 11:38, Barry Song wrote:
> >>> On Wed, Apr 16, 2025 at 5:32 PM David Hildenbrand <david@redhat.com> wrote:
> >>>>
> >>>> On 16.04.25 11:24, Barry Song wrote:
> >>>>> On Wed, Apr 16, 2025 at 4:32 PM David Hildenbrand <david@redhat.com> wrote:
> >>>>>>
> >>>>>> On 12.04.25 10:58, Barry Song wrote:
> >>>>>>> From: Barry Song <v-songbaohua@oppo.com>
> >>>>>>>
> >>>>>>> Promoting exclusive file folios of a dying process is unnecessary and
> >>>>>>> harmful. For example, while Firefox is killed and LibreOffice is
> >>>>>>> launched, activating Firefox's young file-backed folios makes it
> >>>>>>> harder to reclaim memory that LibreOffice doesn't use at all.
> >>>>>>
> >>>>>> Do we know when it is reasonable to promote any folios of a dying process?
> >>>>>>
> >>>>>
> >>>>> I don't know. It seems not reasonable at all. if one service crashes due to
> >>>>> SW bug, systemd will restart it immediately. this might be the case promoting
> >>>>> folios might be good. but it is really a bug of the service, not a normal case.
> >>>>>
> >>>>>> Assume you restart Firefox, would it really matter to promote them when
> >>>>>> unmapping? New Firefox would fault-in / touch the ones it really needs
> >>>>>> immediately afterwards?
> >>>>>
> >>>>> Usually users kill firefox to start other applications (users intend
> >>>>> to free memory for new applications). For Android, an app might be
> >>>>> killed because it has been staying in the background inactively for
> >>>>> a while.
> >>>>
> >>>>> On the other hand, even if users restart firefox immediately, their folios are
> >>>>> probably still in LRU to hit.
> >>>>
> >>>> Right, that's what I'm thinking.
> >>>>
> >>>> So I wonder if we could just say "the whole process is going down; even
> >>>> if we had some recency information, that could only affect some other
> >>>> process, where we would have to guess if it really matters".
> >>>>
> >>>> If the data is important, one would assume that another process would
> >>>> soon access it either way, and as you say, likely it will still be on
> >>>> the LRU to hit.
> >>>
> >>> I'll include this additional information in the v2 version of the patch since
> >>> you think it would be helpful.
> >>>
> >>> Regarding the exclusive flag - I'm wondering whether we actually need to
> >>> distinguish between exclusive and shared folios in this case. The current
> >>> patch uses the exclusive flag mainly to reduce controversy, but even for
> >>> shared folios: does the recency from a dying process matter? The
> >>> recency information only reflects the dying process's usage pattern, which
> >>> will soon be irrelevant.
> >>
> >> Exactly my thoughts. So if we can simplify -- ignore it completely --
> >> that would certainly be nice.
> >
> > This doesn't sound right to me.
> >
> > Remembering the accesses of an exiting task is very much the point of
> > this. Consider executables and shared libraries repeatedly referenced
> > by short-lived jobs, like shell scripts, compiles etc.
>
> For these always-mmaped / never read/write files I tend to agree.
>
> But, is it really a good indication whether a folio is exclusive to this
> process or not?
>
> I mean, if a bash script executes the same executable repeatedly, but
> never multiple copies at the same time, we would also not be tracking the
> access with this patch.
>
> Similarly with an app that mmaps() a large data set (DB, VM, ML, ..)
> exclusively. Re-starting the app would not track recency with this patch.
>
> But I guess there is no right or wrong ...

Right, I'm more broadly objecting to the patch and its premise, but
thought the exclusive filtering would at least mitigate its downsides
somewhat. You raise good points that it's not as clear cut.

IMO this is too subtle and unpredictable for everybody else. The
kernel can't see the future, but access locality and recent use is a
proven predictor. We generally don't discard access information,
unless the user asks us to, and that's what the madvise calls are for.
* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
2025-04-16 18:18 ` Johannes Weiner
@ 2025-04-16 21:54 ` Barry Song
2025-04-16 23:58 ` Johannes Weiner
0 siblings, 1 reply; 19+ messages in thread
From: Barry Song @ 2025-04-16 21:54 UTC (permalink / raw)
To: Johannes Weiner
Cc: David Hildenbrand, akpm, linux-mm, linux-kernel, Barry Song, Baolin Wang,
Matthew Wilcox, Oscar Salvador, Ryan Roberts, Zi Yan

On Thu, Apr 17, 2025 at 2:18 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
>
> On Wed, Apr 16, 2025 at 05:59:12PM +0200, David Hildenbrand wrote:
> > On 16.04.25 16:15, Johannes Weiner wrote:
> > > On Wed, Apr 16, 2025 at 11:40:31AM +0200, David Hildenbrand wrote:
> > >> On 16.04.25 11:38, Barry Song wrote:
> > >>> On Wed, Apr 16, 2025 at 5:32 PM David Hildenbrand <david@redhat.com> wrote:
> > >>>>
> > >>>> On 16.04.25 11:24, Barry Song wrote:
> > >>>>> On Wed, Apr 16, 2025 at 4:32 PM David Hildenbrand <david@redhat.com> wrote:
> > >>>>>>
> > >>>>>> On 12.04.25 10:58, Barry Song wrote:
> > >>>>>>> From: Barry Song <v-songbaohua@oppo.com>
> > >>>>>>>
> > >>>>>>> Promoting exclusive file folios of a dying process is unnecessary and
> > >>>>>>> harmful. For example, while Firefox is killed and LibreOffice is
> > >>>>>>> launched, activating Firefox's young file-backed folios makes it
> > >>>>>>> harder to reclaim memory that LibreOffice doesn't use at all.
> > >>>>>>
> > >>>>>> Do we know when it is reasonable to promote any folios of a dying process?
> > >>>>>>
> > >>>>>
> > >>>>> I don't know. It seems not reasonable at all. if one service crashes due to
> > >>>>> SW bug, systemd will restart it immediately. this might be the case promoting
> > >>>>> folios might be good. but it is really a bug of the service, not a normal case.
> > >>>>>
> > >>>>>> Assume you restart Firefox, would it really matter to promote them when
> > >>>>>> unmapping? New Firefox would fault-in / touch the ones it really needs
> > >>>>>> immediately afterwards?
> > >>>>>
> > >>>>> Usually users kill firefox to start other applications (users intend
> > >>>>> to free memory for new applications). For Android, an app might be
> > >>>>> killed because it has been staying in the background inactively for
> > >>>>> a while.
> > >>>>
> > >>>>> On the other hand, even if users restart firefox immediately, their folios are
> > >>>>> probably still in LRU to hit.
> > >>>>
> > >>>> Right, that's what I'm thinking.
> > >>>>
> > >>>> So I wonder if we could just say "the whole process is going down; even
> > >>>> if we had some recency information, that could only affect some other
> > >>>> process, where we would have to guess if it really matters".
> > >>>>
> > >>>> If the data is important, one would assume that another process would
> > >>>> soon access it either way, and as you say, likely it will still be on
> > >>>> the LRU to hit.
> > >>>
> > >>> I'll include this additional information in the v2 version of the patch since
> > >>> you think it would be helpful.
> > >>>
> > >>> Regarding the exclusive flag - I'm wondering whether we actually need to
> > >>> distinguish between exclusive and shared folios in this case. The current
> > >>> patch uses the exclusive flag mainly to reduce controversy, but even for
> > >>> shared folios: does the recency from a dying process matter? The
> > >>> recency information only reflects the dying process's usage pattern, which
> > >>> will soon be irrelevant.
> > >>
> > >> Exactly my thoughts. So if we can simplify -- ignore it completely --
> > >> that would certainly be nice.
> > >
> > > This doesn't sound right to me.
> > >
> > > Remembering the accesses of an exiting task is very much the point of
> > > this. Consider executables and shared libraries repeatedly referenced
> > > by short-lived jobs, like shell scripts, compiles etc.
> >
> > For these always-mmaped / never read/write files I tend to agree.
> >
> > But, is it really a good indication whether a folio is exclusive to this
> > process or not?
> >
> > I mean, if a bash script executes the same executable repeatedly, but
> > never multiple copies at the same time, we would also not be tracking the
> > access with this patch.
> >
> > Similarly with an app that mmaps() a large data set (DB, VM, ML, ..)
> > exclusively. Re-starting the app would not track recency with this patch.
> >
> > But I guess there is no right or wrong ...
>
> Right, I'm more broadly objecting to the patch and its premise, but
> thought the exclusive filtering would at least mitigate its downsides
> somewhat. You raise good points that it's not as clear cut.
>
> IMO this is too subtle and unpredictable for everybody else. The
> kernel can't see the future, but access locality and recent use is a
> proven predictor. We generally don't discard access information,
> unless the user asks us to, and that's what the madvise calls are for.

David pointed out some exceptions - the recency of dying processes might
still be useful to new processes, particularly in cases like:

while true; do app; done

Here, 'app' is repeatedly restarted but always maintains a single running
instance. I agree this seems correct.

However, we can also find many cases where a dying process means its folios
instantly become cold. For example:

- If someone enjoys watching his/her TV (not shared with family) and then
  passes away, the TV's folios become instantly cold.
- Even if the TV is shared with family but only that person actively used
  it, the folios still become cold. If other users access this TV too,
  shouldn't their PTEs reflect that it's still young?

I agree that "access locality and recent use" is generally a good heuristic,
but it must have some correlation (strong or weak) with the process lifecycle.

Implementing 'madv_cold' on a dying process seems impractical as dying means
'cold' for many cases. Also, it is really not doable to execute madv_cold on
a dying process.

Thanks
Barry
* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
2025-04-16 21:54 ` Barry Song
@ 2025-04-16 23:58 ` Johannes Weiner
2025-04-17 2:43 ` Barry Song
0 siblings, 1 reply; 19+ messages in thread
From: Johannes Weiner @ 2025-04-16 23:58 UTC (permalink / raw)
To: Barry Song
Cc: David Hildenbrand, akpm, linux-mm, linux-kernel, Barry Song, Baolin Wang,
Matthew Wilcox, Oscar Salvador, Ryan Roberts, Zi Yan

On Thu, Apr 17, 2025 at 05:54:57AM +0800, Barry Song wrote:
> On Thu, Apr 17, 2025 at 2:18 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
> > Right, I'm more broadly objecting to the patch and its premise, but
> > thought the exclusive filtering would at least mitigate its downsides
> > somewhat. You raise good points that it's not as clear cut.
> >
> > IMO this is too subtle and unpredictable for everybody else. The
> > kernel can't see the future, but access locality and recent use is a
> > proven predictor. We generally don't discard access information,
> > unless the user asks us to, and that's what the madvise calls are for.
>
> David pointed out some exceptions - the recency of dying processes might
> still be useful to new processes, particularly in cases like:
>
> while true; do app; done
>
> Here, 'app' is repeatedly restarted but always maintains a single running
> instance. I agree this seems correct.
>
> However, we can also find many cases where a dying process means its folios
> instantly become cold. For example:

Of course, there are many of them. Just like any access could be the
last one to that page for the next hour. But you don't know which ones
they are. Just like you don't know if I'm shutting down firefox
because that's enough internet for one day, or if I'm just restarting
it to clear out the 107 tabs I've lost track of.

> I agree that "access locality and recent use" is generally a good heuristic,
> but it must have some correlation (strong or weak) with the process lifecycle.

I don't agree. It's a cache shared between past, present and future
processes. The lifecycle of an individual process is not saying much.

Unless you know something about userspace, and the exact data at hand,
that the kernel doesn't, which is why the Android usecase of MADV_COLD
or PAGEOUT for background apps makes sense to me, but generally tying
it to a process death does not.
* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
2025-04-16 23:58 ` Johannes Weiner
@ 2025-04-17 2:43 ` Barry Song
2025-04-17 12:17 ` Johannes Weiner
0 siblings, 1 reply; 19+ messages in thread
From: Barry Song @ 2025-04-17 2:43 UTC (permalink / raw)
To: Johannes Weiner
Cc: David Hildenbrand, akpm, linux-mm, linux-kernel, Barry Song, Baolin Wang,
Matthew Wilcox, Oscar Salvador, Ryan Roberts, Zi Yan

On Thu, Apr 17, 2025 at 7:58 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
>
> On Thu, Apr 17, 2025 at 05:54:57AM +0800, Barry Song wrote:
> > On Thu, Apr 17, 2025 at 2:18 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
> > > Right, I'm more broadly objecting to the patch and its premise, but
> > > thought the exclusive filtering would at least mitigate its downsides
> > > somewhat. You raise good points that it's not as clear cut.
> > >
> > > IMO this is too subtle and unpredictable for everybody else. The
> > > kernel can't see the future, but access locality and recent use is a
> > > proven predictor. We generally don't discard access information,
> > > unless the user asks us to, and that's what the madvise calls are for.
> >
> > David pointed out some exceptions - the recency of dying processes might
> > still be useful to new processes, particularly in cases like:
> >
> > while true; do app; done
> >
> > Here, 'app' is repeatedly restarted but always maintains a single running
> > instance. I agree this seems correct.
> >
> > However, we can also find many cases where a dying process means its folios
> > instantly become cold. For example:
>
> Of course, there are many of them. Just like any access could be the
> last one to that page for the next hour. But you don't know which ones
> they are. Just like you don't know if I'm shutting down firefox
> because that's enough internet for one day, or if I'm just restarting
> it to clear out the 107 tabs I've lost track of.

Typically, we focus on scenarios where multiple applications switch
seamlessly - for instance, on a phone, when transitioning between different
apps. The smoothness of these transitions matters most. Immediately
restarting a just-terminated app isn't problematic since its memory
footprint often persists before being reclaimed.

>
> > I agree that "access locality and recent use" is generally a good heuristic,
> > but it must have some correlation (strong or weak) with the process lifecycle.
>
> I don't agree. It's a cache shared between past, present and future
> processes. The lifecycle of an individual process is not saying much.
>
> Unless you know something about userspace, and the exact data at hand,
> that the kernel doesn't, which is why the Android usecase of MADV_COLD
> or PAGEOUT for background apps makes sense to me, but generally tying
> it to a process death does not.

I agree that MADV_COLD or PAGEOUT makes sense for background apps,
but I still believe process death is somewhat underestimated by you :-) In
Android, process death is actually a strong signal that an app is inactive and
consuming much memory - leading to its termination by either userspace or
the kernel's OOM mechanism.

We actually took a more aggressive approach by implementing a hook to demote
exclusive folios of dying apps, which yielded good results - reducing kswapd
overhead, refaults, and thrashing. Of course, it is even much more controversial
than this patch.

While I acknowledge that counter-examples to my described pattern can always
be found, our observations clearly show that process death is a big event -
far from being just a trivial unmap operation.

Anyway, not trying to push the patch as obviously it seems quite hard :-)

Thanks
Barry
* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
2025-04-17 2:43 ` Barry Song
@ 2025-04-17 12:17 ` Johannes Weiner
2025-04-17 12:57 ` David Hildenbrand
0 siblings, 1 reply; 19+ messages in thread
From: Johannes Weiner @ 2025-04-17 12:17 UTC (permalink / raw)
To: Barry Song
Cc: David Hildenbrand, akpm, linux-mm, linux-kernel, Barry Song, Baolin Wang,
Matthew Wilcox, Oscar Salvador, Ryan Roberts, Zi Yan

Hi Barry,

On Thu, Apr 17, 2025 at 10:43:20AM +0800, Barry Song wrote:
> On Thu, Apr 17, 2025 at 7:58 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
> > On Thu, Apr 17, 2025 at 05:54:57AM +0800, Barry Song wrote:
> > > I agree that "access locality and recent use" is generally a good heuristic,
> > > but it must have some correlation (strong or weak) with the process lifecycle.
> >
> > I don't agree. It's a cache shared between past, present and future
> > processes. The lifecycle of an individual process is not saying much.
> >
> > Unless you know something about userspace, and the exact data at hand,
> > that the kernel doesn't, which is why the Android usecase of MADV_COLD
> > or PAGEOUT for background apps makes sense to me, but generally tying
> > it to a process death does not.
>
> I agree that MADV_COLD or PAGEOUT makes sense for background apps,
> but I still believe process death is somewhat underestimated by you :-) In
> Android, process death is actually a strong signal that an app is inactive and
> consuming much memory - leading to its termination by either userspace or
> the kernel's OOM mechanism.

That's exactly what I'm saying, though. You know something about
userspace that the kernel doesn't, which results from the unique way
in which app scheduling and killing works on Android, where you have
recent foreground apps and idle background apps that you can kill,
and switching back to them later transparently restarts them and
shows the user a fresh instance.

But you have to admit that this is a unique microcosm modeled on top
of a conventional Unix process model. So this doesn't necessarily
translate to other Linux systems, like servers or desktops. There is
much higher concurrency, workingsets are more static, there is no
systematic distinction between foreground and background apps (not in
the Android sense), OOM killing is a rare corner case most setups
usually try hard to avoid, etc.

But surely even Android has system management components, daemons
etc. that fit more into that second category?

> We actually took a more aggressive approach by implementing a hook to demote
> exclusive folios of dying apps, which yielded good results - reducing kswapd
> overhead, refaults, and thrashing. Of course, it is even much more controversial
> than this patch.

That doesn't sound wrong to me for Android apps.

How about a prctl() to request the behavior for those specific app
processes where you have a clear usage signal?

And then by all means, outright demote the pages, or even invalidate
the cache. Delete the files, discard the flash blocks! (Ok that was a
joke).
* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
2025-04-17 12:17 ` Johannes Weiner
@ 2025-04-17 12:57 ` David Hildenbrand
2025-04-18 0:16 ` Barry Song
0 siblings, 1 reply; 19+ messages in thread
From: David Hildenbrand @ 2025-04-17 12:57 UTC (permalink / raw)
To: Johannes Weiner, Barry Song
Cc: akpm, linux-mm, linux-kernel, Barry Song, Baolin Wang,
Matthew Wilcox, Oscar Salvador, Ryan Roberts, Zi Yan

>> We actually took a more aggressive approach by implementing a hook to demote
>> exclusive folios of dying apps, which yielded good results - reducing kswapd
>> overhead, refaults, and thrashing. Of course, it is even much more controversial
>> than this patch.
>
> That doesn't sound wrong to me for Android apps.
>
> How about a prctl() to request the behavior for those specific app
> processes where you have clear usage signal?

I was thinking about the same, so likely that might be a viable solution.

--
Cheers,

David / dhildenb
* Re: [RFC PATCH] mm: don't promote exclusive file folios of dying processes
2025-04-17 12:57 ` David Hildenbrand
@ 2025-04-18 0:16 ` Barry Song
0 siblings, 0 replies; 19+ messages in thread
From: Barry Song @ 2025-04-18 0:16 UTC (permalink / raw)
To: David Hildenbrand
Cc: Johannes Weiner, akpm, linux-mm, linux-kernel, Barry Song, Baolin Wang,
Matthew Wilcox, Oscar Salvador, Ryan Roberts, Zi Yan

On Thu, Apr 17, 2025 at 8:57 PM David Hildenbrand <david@redhat.com> wrote:
>
> >> We actually took a more aggressive approach by implementing a hook to demote
> >> exclusive folios of dying apps, which yielded good results - reducing kswapd
> >> overhead, refaults, and thrashing. Of course, it is even much more controversial
> >> than this patch.
> >
> > That doesn't sound wrong to me for Android apps.
> >
> > How about a prctl() to request the behavior for those specific app
> > processes where you have clear usage signal?
>
> I was thinking about the same, so likely that might be a viable solution.

Many thanks to both Johannes and David for the suggestion. I'd be delighted
to take a look at this.

>
> --
> Cheers,
>
> David / dhildenb
>

Thanks
Barry
end of thread, other threads:[~2025-04-18 0:16 UTC | newest]

Thread overview: 19+ messages
2025-04-12 8:58 [RFC PATCH] mm: don't promote exclusive file folios of dying processes Barry Song
2025-04-12 15:48 ` Matthew Wilcox
2025-04-12 16:31 ` Zi Yan
2025-04-16 7:48 ` Barry Song
2025-04-16 8:24 ` Baolin Wang
2025-04-16 8:32 ` David Hildenbrand
2025-04-16 9:24 ` Barry Song
2025-04-16 9:32 ` David Hildenbrand
2025-04-16 9:38 ` Barry Song
2025-04-16 9:40 ` David Hildenbrand
2025-04-16 14:15 ` Johannes Weiner
2025-04-16 15:59 ` David Hildenbrand
2025-04-16 18:18 ` Johannes Weiner
2025-04-16 21:54 ` Barry Song
2025-04-16 23:58 ` Johannes Weiner
2025-04-17 2:43 ` Barry Song
2025-04-17 12:17 ` Johannes Weiner
2025-04-17 12:57 ` David Hildenbrand
2025-04-18 0:16 ` Barry Song