* [PATCH] mm: shmem: fix too little space for tmpfs only fallback 4KB
@ 2025-09-08 12:31 Vernon Yang
From: Vernon Yang @ 2025-09-08 12:31 UTC
To: hughd, baolin.wang, akpm, da.gomez; +Cc: linux-mm, linux-kernel, Vernon Yang
From: Vernon Yang <yanglincheng@kylinos.cn>

When system memory is sufficient, memory allocation always succeeds,
but when the tmpfs size is low (e.g. 1MB), allocation falls back
directly from 2MB to 4KB, and the intermediate sizes (8KB ~ 1024KB)
are not tried.

Therefore, add a check for whether the remaining tmpfs space is
sufficient for the allocation. If there is too little space left, try
a smaller large folio.

Fixes: acd7ccb284b8 ("mm: shmem: add large folio support for tmpfs")
Signed-off-by: Vernon Yang <yanglincheng@kylinos.cn>
---
mm/shmem.c | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/mm/shmem.c b/mm/shmem.c
index 8c592c6db2a0..b20affd57b23 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1820,6 +1820,7 @@ static unsigned long shmem_suitable_orders(struct inode *inode, struct vm_fault
 					   unsigned long orders)
 {
 	struct vm_area_struct *vma = vmf ? vmf->vma : NULL;
+	struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
 	pgoff_t aligned_index;
 	unsigned long pages;
 	int order;
@@ -1835,6 +1836,18 @@ static unsigned long shmem_suitable_orders(struct inode *inode, struct vm_fault
 	while (orders) {
 		pages = 1UL << order;
 		aligned_index = round_down(index, pages);
+
+		/*
+		 * Check whether the remaining space of tmpfs is sufficient for
+		 * allocation. If there is too little space left, try smaller
+		 * large folio.
+		 */
+		if (sbinfo->max_blocks && percpu_counter_read(&sbinfo->used_blocks)
+						+ pages > sbinfo->max_blocks) {
+			order = next_order(&orders, order);
+			continue;
+		}
+
 		/*
 		 * Check for conflict before waiting on a huge allocation.
 		 * Conflict might be that a huge page has just been allocated
--
2.51.0
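
Editorial sketch (not part of the posted patch): the space check added above can be modelled in userspace C to see which order it would select. Everything here is an assumption for illustration: `model_highest_order()` and `model_pick_order()` are hypothetical names, the order bitmask stands in for the kernel's `orders` argument, and an exact counter value replaces `percpu_counter_read()`, which is only approximate in the kernel. The alignment, i_size and page-cache conflict checks that the real function performs are omitted.

```c
#include <assert.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */

/* Index of the highest set bit in a non-zero order bitmask. */
static int model_highest_order(unsigned long orders)
{
	int order = -1;

	assert(orders != 0);
	while (orders) {
		orders >>= 1;
		order++;
	}
	return order;
}

/*
 * Return the largest order in @orders whose page count still fits in the
 * remaining blocks, or 0 if no candidate order fits. max_blocks == 0
 * means the mount has no size limit, mirroring the check in the patch.
 */
static int model_pick_order(unsigned long orders, unsigned long used_blocks,
			    unsigned long max_blocks)
{
	while (orders) {
		int order = model_highest_order(orders);
		unsigned long pages = 1UL << order;

		if (!max_blocks || used_blocks + pages <= max_blocks)
			return order;

		/* Too little space left: drop this order, try a smaller one. */
		orders &= ~(1UL << order);
	}
	return 0;
}
```

Under these assumptions, with a 1MB mount (256 blocks of 4KB), one block already used and orders 1 to 9 allowed, `model_pick_order(0x3FE, 1, 256)` selects order 7 (512KB), because order 8 would need 1 + 256 = 257 blocks.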

* Re: [PATCH] mm: shmem: fix too little space for tmpfs only fallback 4KB
@ 2025-09-08 23:22 Andrew Morton
From: Andrew Morton @ 2025-09-08 23:22 UTC
To: Vernon Yang; +Cc: hughd, baolin.wang, da.gomez, linux-mm, linux-kernel, Vernon Yang

On Mon, 8 Sep 2025 20:31:28 +0800 Vernon Yang <vernon2gm@gmail.com> wrote:

> From: Vernon Yang <yanglincheng@kylinos.cn>
>
> When the system memory is sufficient, allocating memory is always
> successful, but when tmpfs size is low (e.g. 1MB), it falls back
> directly from 2MB to 4KB, and other small granularity (8KB ~ 1024KB)
> will not be tried.
>
> Therefore add check whether the remaining space of tmpfs is sufficient
> for allocation. If there is too little space left, try smaller large
> folio.

Thanks.

What are the effects of this change? I'm assuming it's an
*improvement*, rather than a fix for some misbehavior?

* Re: [PATCH] mm: shmem: fix too little space for tmpfs only fallback 4KB
@ 2025-09-09 14:16 Vernon Yang
From: Vernon Yang @ 2025-09-09 14:16 UTC
To: Andrew Morton; +Cc: Vernon Yang, hughd, baolin.wang, da.gomez, linux-mm, linux-kernel, Vernon Yang

> On Sep 9, 2025, at 07:22, Andrew Morton <akpm@linux-foundation.org> wrote:
>
> On Mon, 8 Sep 2025 20:31:28 +0800 Vernon Yang <vernon2gm@gmail.com> wrote:
>
>> From: Vernon Yang <yanglincheng@kylinos.cn>
>>
>> When the system memory is sufficient, allocating memory is always
>> successful, but when tmpfs size is low (e.g. 1MB), it falls back
>> directly from 2MB to 4KB, and other small granularity (8KB ~ 1024KB)
>> will not be tried.
>>
>> Therefore add check whether the remaining space of tmpfs is sufficient
>> for allocation. If there is too little space left, try smaller large
>> folio.
>
> Thanks.
>
> What are the effects of this change? I'm assuming it's an
> *improvement*, rather than a fix for some misbehavior?
>

When we use tmpfs and the tmpfs space is getting smaller and smaller
(e.g. less than 2MB), it can still allocate 8KB~1MB large folio.

Thank you for your feedback.

* Re: [PATCH] mm: shmem: fix too little space for tmpfs only fallback 4KB
@ 2025-09-09 5:58 Baolin Wang
From: Baolin Wang @ 2025-09-09 5:58 UTC
To: Vernon Yang, hughd, akpm, da.gomez; +Cc: linux-mm, linux-kernel, Vernon Yang

On 2025/9/8 20:31, Vernon Yang wrote:
> From: Vernon Yang <yanglincheng@kylinos.cn>
>
> When the system memory is sufficient, allocating memory is always
> successful, but when tmpfs size is low (e.g. 1MB), it falls back
> directly from 2MB to 4KB, and other small granularity (8KB ~ 1024KB)
> will not be tried.
>
> Therefore add check whether the remaining space of tmpfs is sufficient
> for allocation. If there is too little space left, try smaller large
> folio.

I don't think so.

For a tmpfs mount with 'huge=within_size' and 'size=1M', if you try to write 1M data, it will allocate an order 8 large folio and will not fallback to order 0.

For a tmpfs mount with 'huge=always' and 'size=1M', if you try to write 1M data, it will not completely fallback to order 0 either, instead, it will still allocate some order 1 to order 7 large folios.

I'm not sure if this is your actual user scenario. If your files are small and you are concerned about not getting large folio allocations, I recommend using the 'huge=within_size' mount option.

> Fixes: acd7ccb284b8 ("mm: shmem: add large folio support for tmpfs")

No, this doesn't fix anything.

> Signed-off-by: Vernon Yang <yanglincheng@kylinos.cn>
> ---
> mm/shmem.c | 13 +++++++++++++
> 1 file changed, 13 insertions(+)
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 8c592c6db2a0..b20affd57b23 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -1820,6 +1820,7 @@ static unsigned long shmem_suitable_orders(struct inode *inode, struct vm_fault
>  					   unsigned long orders)
>  {
>  	struct vm_area_struct *vma = vmf ? vmf->vma : NULL;
> +	struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
>  	pgoff_t aligned_index;
>  	unsigned long pages;
>  	int order;
> @@ -1835,6 +1836,18 @@ static unsigned long shmem_suitable_orders(struct inode *inode, struct vm_fault
>  	while (orders) {
>  		pages = 1UL << order;
>  		aligned_index = round_down(index, pages);
> +
> +		/*
> +		 * Check whether the remaining space of tmpfs is sufficient for
> +		 * allocation. If there is too little space left, try smaller
> +		 * large folio.
> +		 */
> +		if (sbinfo->max_blocks && percpu_counter_read(&sbinfo->used_blocks)
> +						+ pages > sbinfo->max_blocks) {
> +			order = next_order(&orders, order);
> +			continue;
> +		}
> +
>  		/*
>  		 * Check for conflict before waiting on a huge allocation.
>  		 * Conflict might be that a huge page has just been allocated

* Re: [PATCH] mm: shmem: fix too little space for tmpfs only fallback 4KB
@ 2025-09-09 12:29 Vernon Yang
From: Vernon Yang @ 2025-09-09 12:29 UTC
To: Baolin Wang; +Cc: Vernon Yang, hughd, akpm, da.gomez, linux-mm, linux-kernel, Vernon Yang

> On Sep 9, 2025, at 13:58, Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
>
>
>
> On 2025/9/8 20:31, Vernon Yang wrote:
>> From: Vernon Yang <yanglincheng@kylinos.cn>
>> When the system memory is sufficient, allocating memory is always
>> successful, but when tmpfs size is low (e.g. 1MB), it falls back
>> directly from 2MB to 4KB, and other small granularity (8KB ~ 1024KB)
>> will not be tried.
>> Therefore add check whether the remaining space of tmpfs is sufficient
>> for allocation. If there is too little space left, try smaller large
>> folio.
>
> I don't think so.
>
> For a tmpfs mount with 'huge=within_size' and 'size=1M', if you try to write 1M data, it will allocate an order 8 large folio and will not fallback to order 0.
>
> For a tmpfs mount with 'huge=always' and 'size=1M', if you try to write 1M data, it will not completely fallback to order 0 either, instead, it will still allocate some order 1 to order 7 large folios.
>
> I'm not sure if this is your actual user scenario. If your files are small and you are concerned about not getting large folio allocations, I recommend using the 'huge=within_size' mount option.
>

No, this is not my user scenario.

Based on your previous patch [1], this scenario can be easily reproduced as
follows.

$ mount -t tmpfs -o size=1024K,huge=always tmpfs /xxx/test
$ echo hello > /xxx/test/README
$ df -h
tmpfs           1.0M  4.0K 1020K   1% /xxx/test

The code logic is as follows:

shmem_get_folio_gfp()
    orders = shmem_allowable_huge_orders()
    shmem_alloc_and_add_folio(orders) return -ENOSPC;
        shmem_alloc_folio() alloc 2MB
        shmem_inode_acct_blocks()
            percpu_counter_limited_add() goto unacct;
        filemap_remove_folio()
    shmem_alloc_and_add_folio(order = 0)


As long as the tmpfs remaining space is too little and the system can allocate
memory 2MB, the above path will be triggered.

[1] https://lore.kernel.org/linux-mm/10e7ac6cebe6535c137c064d5c5a235643eebb4a.1756888965.git.baolin.wang@linux.alibaba.com/

>> Fixes: acd7ccb284b8 ("mm: shmem: add large folio support for tmpfs")
>
> No, this doesn't fix anything.
>
>> Signed-off-by: Vernon Yang <yanglincheng@kylinos.cn>
>> ---
>> mm/shmem.c | 13 +++++++++++++
>> 1 file changed, 13 insertions(+)
>> diff --git a/mm/shmem.c b/mm/shmem.c
>> index 8c592c6db2a0..b20affd57b23 100644
>> --- a/mm/shmem.c
>> +++ b/mm/shmem.c
>> @@ -1820,6 +1820,7 @@ static unsigned long shmem_suitable_orders(struct inode *inode, struct vm_fault
>>  					   unsigned long orders)
>>  {
>>  	struct vm_area_struct *vma = vmf ? vmf->vma : NULL;
>> +	struct shmem_sb_info *sbinfo = SHMEM_SB(inode->i_sb);
>>  	pgoff_t aligned_index;
>>  	unsigned long pages;
>>  	int order;
>> @@ -1835,6 +1836,18 @@ static unsigned long shmem_suitable_orders(struct inode *inode, struct vm_fault
>>  	while (orders) {
>>  		pages = 1UL << order;
>>  		aligned_index = round_down(index, pages);
>> +
>> +		/*
>> +		 * Check whether the remaining space of tmpfs is sufficient for
>> +		 * allocation. If there is too little space left, try smaller
>> +		 * large folio.
>> +		 */
>> +		if (sbinfo->max_blocks && percpu_counter_read(&sbinfo->used_blocks)
>> +						+ pages > sbinfo->max_blocks) {
>> +			order = next_order(&orders, order);
>> +			continue;
>> +		}
>> +
>>  		/*
>>  		 * Check for conflict before waiting on a huge allocation.
>>  		 * Conflict might be that a huge page has just been allocated
>

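
Editorial sketch (not part of the thread): the fallback path described in the call chain above can be modelled roughly as follows. It assumes memory is plentiful, so the large folio allocation itself always succeeds and only the block accounting can fail. `struct model_tmpfs`, `model_acct_blocks()` and `model_get_folio()` are hypothetical names, not kernel APIs, and the real code also handles locking, page-cache insertion and folio splitting, which are left out.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical userspace model of a size-limited tmpfs mount. */
struct model_tmpfs {
	unsigned long max_blocks;	/* 0 means no size limit */
	unsigned long used_blocks;
};

/* Charge @pages against the size limit, or fail with -ENOSPC. */
static int model_acct_blocks(struct model_tmpfs *fs, unsigned long pages)
{
	if (fs->max_blocks && fs->used_blocks + pages > fs->max_blocks)
		return -ENOSPC;
	fs->used_blocks += pages;
	return 0;
}

/*
 * Model of the path in the call chain: try the highest allowed order
 * first; if accounting fails, retry once with a single page (order 0).
 * Returns the order that was charged, or a negative errno.
 */
static int model_get_folio(struct model_tmpfs *fs, int highest_order)
{
	int err;

	assert(highest_order >= 0);

	/* First attempt: the allocation succeeds, accounting may not. */
	if (highest_order > 0 &&
	    model_acct_blocks(fs, 1UL << highest_order) == 0)
		return highest_order;

	/* Fallback: no intermediate order is tried, only order 0. */
	err = model_acct_blocks(fs, 1);
	return err ? err : 0;
}
```

With `max_blocks = 256` and nothing used, a request starting at order 9 fails the accounting step and ends up with a single 4KB page, which is the behaviour the patch set out to change.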
* Re: [PATCH] mm: shmem: fix too little space for tmpfs only fallback 4KB
@ 2025-09-22 1:46 Baolin Wang
From: Baolin Wang @ 2025-09-22 1:46 UTC
To: Vernon Yang; +Cc: hughd, akpm, da.gomez, linux-mm, linux-kernel, Vernon Yang

On 2025/9/9 20:29, Vernon Yang wrote:
>
>
>> On Sep 9, 2025, at 13:58, Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
>>
>>
>>
>> On 2025/9/8 20:31, Vernon Yang wrote:
>>> From: Vernon Yang <yanglincheng@kylinos.cn>
>>> When the system memory is sufficient, allocating memory is always
>>> successful, but when tmpfs size is low (e.g. 1MB), it falls back
>>> directly from 2MB to 4KB, and other small granularity (8KB ~ 1024KB)
>>> will not be tried.
>>> Therefore add check whether the remaining space of tmpfs is sufficient
>>> for allocation. If there is too little space left, try smaller large
>>> folio.
>>
>> I don't think so.
>>
>> For a tmpfs mount with 'huge=within_size' and 'size=1M', if you try to write 1M data, it will allocate an order 8 large folio and will not fallback to order 0.
>>
>> For a tmpfs mount with 'huge=always' and 'size=1M', if you try to write 1M data, it will not completely fallback to order 0 either, instead, it will still allocate some order 1 to order 7 large folios.
>>
>> I'm not sure if this is your actual user scenario. If your files are small and you are concerned about not getting large folio allocations, I recommend using the 'huge=within_size' mount option.
>>
>
> No, this is not my user scenario.
>
> Based on your previous patch [1], this scenario can be easily reproduced as
> follows.
>
> $ mount -t tmpfs -o size=1024K,huge=always tmpfs /xxx/test
> $ echo hello > /xxx/test/README
> $ df -h
> tmpfs           1.0M  4.0K 1020K   1% /xxx/test
>
> The code logic is as follows:
>
> shmem_get_folio_gfp()
>     orders = shmem_allowable_huge_orders()
>     shmem_alloc_and_add_folio(orders) return -ENOSPC;
>         shmem_alloc_folio() alloc 2MB
>         shmem_inode_acct_blocks()
>             percpu_counter_limited_add() goto unacct;
>         filemap_remove_folio()
>     shmem_alloc_and_add_folio(order = 0)
>
>
> As long as the tmpfs remaining space is too little and the system can allocate
> memory 2MB, the above path will be triggered.

In your scenario, wouldn't allocating 4K be more reasonable? Using a 1M
large folio would waste memory. Moreover, if you want to use a large folio,
I think you could increase the 'size' mount option. To me, this doesn't seem
like a real-world usage scenario, instead it looks more like a contrived
test case for a specific situation.

Sorry, this still doesn't convince me.

* Re: [PATCH] mm: shmem: fix too little space for tmpfs only fallback 4KB
@ 2025-09-22 2:51 Vernon Yang
From: Vernon Yang @ 2025-09-22 2:51 UTC
To: Baolin Wang; +Cc: hughd, akpm, da.gomez, linux-mm, linux-kernel, Vernon Yang

On Mon, Sep 22, 2025 at 09:46:53AM +0800, Baolin Wang wrote:
>
>
> On 2025/9/9 20:29, Vernon Yang wrote:
> >
> >
> > > On Sep 9, 2025, at 13:58, Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
> > >
> > >
> > >
> > > On 2025/9/8 20:31, Vernon Yang wrote:
> > > > From: Vernon Yang <yanglincheng@kylinos.cn>
> > > > When the system memory is sufficient, allocating memory is always
> > > > successful, but when tmpfs size is low (e.g. 1MB), it falls back
> > > > directly from 2MB to 4KB, and other small granularity (8KB ~ 1024KB)
> > > > will not be tried.
> > > > Therefore add check whether the remaining space of tmpfs is sufficient
> > > > for allocation. If there is too little space left, try smaller large
> > > > folio.
> > >
> > > I don't think so.
> > >
> > > For a tmpfs mount with 'huge=within_size' and 'size=1M', if you try to write 1M data, it will allocate an order 8 large folio and will not fallback to order 0.
> > >
> > > For a tmpfs mount with 'huge=always' and 'size=1M', if you try to write 1M data, it will not completely fallback to order 0 either, instead, it will still allocate some order 1 to order 7 large folios.
> > >
> > > I'm not sure if this is your actual user scenario. If your files are small and you are concerned about not getting large folio allocations, I recommend using the 'huge=within_size' mount option.
> > >
> >
> > No, this is not my user scenario.
> >
> > Based on your previous patch [1], this scenario can be easily reproduced as
> > follows.
> >
> > $ mount -t tmpfs -o size=1024K,huge=always tmpfs /xxx/test
> > $ echo hello > /xxx/test/README
> > $ df -h
> > tmpfs           1.0M  4.0K 1020K   1% /xxx/test
> >
> > The code logic is as follows:
> >
> > shmem_get_folio_gfp()
> >     orders = shmem_allowable_huge_orders()
> >     shmem_alloc_and_add_folio(orders) return -ENOSPC;
> >         shmem_alloc_folio() alloc 2MB
> >         shmem_inode_acct_blocks()
> >             percpu_counter_limited_add() goto unacct;
> >         filemap_remove_folio()
> >     shmem_alloc_and_add_folio(order = 0)
> >
> >
> > As long as the tmpfs remaining space is too little and the system can allocate
> > memory 2MB, the above path will be triggered.
>
> In your scenario, wouldn't allocating 4K be more reasonable? Using a 1M
> large folio would waste memory. Moreover, if you want to use a large folio,
> I think you could increase the 'size' mount option. To me, this doesn't seem
> like a real-world usage scenario, instead it looks more like a contrived
> test case for a specific situation.

The previous example is just an easy demo to reproduce, and if someone
uses this example in the real world, of course the best method is to
increase the 'size'.

But the scenario I want to express here is that when the tmpfs space is
*consumed* to less than 2MB, only 4KB will be allocated, you can imagine
that when a tmpfs is constantly consumed, but someone is reclaiming or
freeing memory, causing often tmpfs space to remain in the range of
[0~2MB), then tmpfs will always only allocate 4KB.

> Sorry, this still doesn't convince me.

* Re: [PATCH] mm: shmem: fix too little space for tmpfs only fallback 4KB
@ 2025-09-22 3:09 Baolin Wang
From: Baolin Wang @ 2025-09-22 3:09 UTC
To: Vernon Yang; +Cc: hughd, akpm, da.gomez, linux-mm, linux-kernel, Vernon Yang

On 2025/9/22 10:51, Vernon Yang wrote:
> On Mon, Sep 22, 2025 at 09:46:53AM +0800, Baolin Wang wrote:
>>
>>
>> On 2025/9/9 20:29, Vernon Yang wrote:
>>>
>>>
>>>> On Sep 9, 2025, at 13:58, Baolin Wang <baolin.wang@linux.alibaba.com> wrote:
>>>>
>>>>
>>>>
>>>> On 2025/9/8 20:31, Vernon Yang wrote:
>>>>> From: Vernon Yang <yanglincheng@kylinos.cn>
>>>>> When the system memory is sufficient, allocating memory is always
>>>>> successful, but when tmpfs size is low (e.g. 1MB), it falls back
>>>>> directly from 2MB to 4KB, and other small granularity (8KB ~ 1024KB)
>>>>> will not be tried.
>>>>> Therefore add check whether the remaining space of tmpfs is sufficient
>>>>> for allocation. If there is too little space left, try smaller large
>>>>> folio.
>>>>
>>>> I don't think so.
>>>>
>>>> For a tmpfs mount with 'huge=within_size' and 'size=1M', if you try to write 1M data, it will allocate an order 8 large folio and will not fallback to order 0.
>>>>
>>>> For a tmpfs mount with 'huge=always' and 'size=1M', if you try to write 1M data, it will not completely fallback to order 0 either, instead, it will still allocate some order 1 to order 7 large folios.
>>>>
>>>> I'm not sure if this is your actual user scenario. If your files are small and you are concerned about not getting large folio allocations, I recommend using the 'huge=within_size' mount option.
>>>>
>>>
>>> No, this is not my user scenario.
>>>
>>> Based on your previous patch [1], this scenario can be easily reproduced as
>>> follows.
>>>
>>> $ mount -t tmpfs -o size=1024K,huge=always tmpfs /xxx/test
>>> $ echo hello > /xxx/test/README
>>> $ df -h
>>> tmpfs           1.0M  4.0K 1020K   1% /xxx/test
>>>
>>> The code logic is as follows:
>>>
>>> shmem_get_folio_gfp()
>>>     orders = shmem_allowable_huge_orders()
>>>     shmem_alloc_and_add_folio(orders) return -ENOSPC;
>>>         shmem_alloc_folio() alloc 2MB
>>>         shmem_inode_acct_blocks()
>>>             percpu_counter_limited_add() goto unacct;
>>>         filemap_remove_folio()
>>>     shmem_alloc_and_add_folio(order = 0)
>>>
>>>
>>> As long as the tmpfs remaining space is too little and the system can allocate
>>> memory 2MB, the above path will be triggered.
>>
>> In your scenario, wouldn't allocating 4K be more reasonable? Using a 1M
>> large folio would waste memory. Moreover, if you want to use a large folio,
>> I think you could increase the 'size' mount option. To me, this doesn't seem
>> like a real-world usage scenario, instead it looks more like a contrived
>> test case for a specific situation.
>
> The previous example is just an easy demo to reproduce, and if someone
> uses this example in the real world, of course the best method is to
> increase the 'size'.
>
> But the scenario I want to express here is that when the tmpfs space is
> *consumed* to less than 2MB, only 4KB will be allocated, you can imagine
> that when a tmpfs is constantly consumed, but someone is reclaiming or
> freeing memory, causing often tmpfs space to remain in the range of
> [0~2MB), then tmpfs will always only allocate 4KB.

Please increase your 'size' mount option for testing. I don't see why we need to add more such logic without a solid reason.

Andrew, please drop this patch.