* Re: [Bug 202089] New: transparent hugepage not compatable with madvise(MADV_DONTNEED)
From: Andrew Morton @ 2018-12-29 20:53 UTC
To: linux-mm; +Cc: bugzilla-daemon, jianpanlanyue
(switched to email. Please respond via emailed reply-to-all, not via the
bugzilla web interface).
On Sat, 29 Dec 2018 09:00:22 +0000 bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=202089
>
> Bug ID: 202089
> Summary: transparent hugepage not compatable with
> madvise(MADV_DONTNEED)
> Product: Memory Management
> Version: 2.5
> Kernel Version: 4.4.0-117
> Hardware: x86-64
> OS: Linux
> Tree: Mainline
> Status: NEW
> Severity: high
> Priority: P1
> Component: Other
> Assignee: akpm@linux-foundation.org
> Reporter: jianpanlanyue@163.com
> Regression: No
>
> environment:
> 1. kernel 4.4.0 on x86_64
> 2. echo always > /sys/kernel/mm/transparent_hugepage/enabled
> echo always > /sys/kernel/mm/transparent_hugepage/defrag
> echo 2000000 > /sys/kernel/mm/transparent_hugepage/khugepaged/pages_to_scan
> (scan more pages per pass so the problem reproduces faster)
>
> problem:
> 1. Use mmap() to allocate 4096 bytes 1024*512 times (4096*1024*512 = 2G).
> 2. Use madvise(MADV_DONTNEED) to free most of those pages but keep a few
> (via if(i%33==0) continue;). The process's physical memory first drops,
> but after a few seconds it rises back to 2G and never comes down again.
> 3. If I delete the condition (if(i%33==0) continue;) or disable
> transparent_hugepage by setting 'enabled' and 'defrag' to never, everything
> works and the physical memory comes down as expected.
>
> It seems that transparent_hugepage has problems with non-contiguous
> madvise(MADV_DONTNEED).
>
>
> Below is the test code:
>
> #include <stdio.h>
> #include <string.h>
> #include <stdlib.h>
> #include <sys/mman.h>
> #include <errno.h>
> #include <assert.h>
>
> #define PAGE_SIZE 4096
> #define PAGE_COUNT (1024*512)
> int main(void)
> {
>     void **table = malloc(sizeof(void *) * PAGE_COUNT);
>     assert(table != NULL);
>     printf("begin mmap...\n");
>
>     /* map each page separately and touch it so it becomes resident */
>     for (int i = 0; i < PAGE_COUNT; i++) {
>         table[i] = mmap(NULL, PAGE_SIZE, PROT_READ|PROT_WRITE,
>                         MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
>         assert(table[i] != MAP_FAILED);
>         memset(table[i], 1, PAGE_SIZE);
>     }
>
>     printf("mmap ok, press enter to free most of them\n");
>     getchar();
>
>     /* unexpected behaviour: after most pages are freed, THP makes RSS rise to 2G again */
>     for (int i = 0; i < PAGE_COUNT; i++) {
>         if (i % 33 == 0) continue;   /* keep every 33rd page */
>         if (madvise(table[i], PAGE_SIZE, MADV_DONTNEED) != 0)
>             printf("madvise error, errno:%d\n", errno);
>     }
>
>     printf("madvise finish\n");
>     free(table);
>     getchar();
>     getchar();
>     return 0;
> }
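To watch the drop and the later rise described in the report without an external tool, the test could print its own resident set size at each pause. A minimal sketch, assuming /proc is mounted; `resident_bytes` is a hypothetical helper and not part of the reporter's program:

```c
#define _DEFAULT_SOURCE
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: return the resident set size of the calling
 * process in bytes, or -1 on error. The second field of
 * /proc/self/statm is the number of resident pages.
 */
static long resident_bytes(void)
{
    long size_pages, resident_pages;
    FILE *f = fopen("/proc/self/statm", "r");

    if (!f)
        return -1;
    int n = fscanf(f, "%ld %ld", &size_pages, &resident_pages);
    fclose(f);
    if (n != 2)
        return -1;
    return resident_pages * sysconf(_SC_PAGESIZE);
}
```

Calling it once after the madvise() loop and again a few seconds later should be enough to see whether khugepaged has repopulated the range.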
* Re: [Bug 202089] New: transparent hugepage not compatable with madvise(MADV_DONTNEED)
From: Kirill A. Shutemov @ 2018-12-29 22:48 UTC
To: Andrew Morton; +Cc: linux-mm, bugzilla-daemon, jianpanlanyue
On Sat, Dec 29, 2018 at 12:53:16PM -0800, Andrew Morton wrote:
>
> (switched to email. Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
>
> On Sat, 29 Dec 2018 09:00:22 +0000 bugzilla-daemon@bugzilla.kernel.org wrote:
> > It seems like transparent_hugepage has problems with non-contiguous
> > madvise(MADV_DONTNEED).
It's expected behaviour.
MADV_DONTNEED doesn't guarantee that the range will not be repopulated
(with or without direct action on application behalf). It's just a hint
for the kernel.
For sparse mappings, consider using MADV_NOHUGEPAGE.
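A minimal sketch of that suggestion, assuming the application knows up front that a region will be sparsely populated. `map_sparse_region` is a hypothetical helper name; only mmap() and madvise() are real interfaces here:

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <stdio.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: map `len` bytes of private anonymous memory and
 * opt the whole range out of transparent hugepages before it is touched,
 * so khugepaged will not collapse the holes left by later
 * MADV_DONTNEED calls. Returns NULL if the mapping fails.
 */
static void *map_sparse_region(size_t len)
{
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED)
        return NULL;
    /* The call may fail on kernels built without THP support; in that
     * case there is nothing to opt out of, so only report it. */
    if (madvise(p, len, MADV_NOHUGEPAGE) != 0)
        perror("madvise(MADV_NOHUGEPAGE)");
    return p;
}
```

Advising one large region once, rather than each 4096-byte mapping separately, keeps the number of madvise() calls small.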
--
Kirill A. Shutemov
* Re: [Bug 202089] New: transparent hugepage not compatable with madvise(MADV_DONTNEED)
From: Michal Hocko @ 2019-01-03 9:44 UTC
To: Kirill A. Shutemov, jianpanlanyue
Cc: Andrew Morton, linux-mm, bugzilla-daemon
On Sun 30-12-18 01:48:43, Kirill A. Shutemov wrote:
> On Sat, Dec 29, 2018 at 12:53:16PM -0800, Andrew Morton wrote:
> >
> > (switched to email. Please respond via emailed reply-to-all, not via the
> > bugzilla web interface).
> >
> > On Sat, 29 Dec 2018 09:00:22 +0000 bugzilla-daemon@bugzilla.kernel.org wrote:
> > > It seems like transparent_hugepage has problems with non-contiguous
> > > madvise(MADV_DONTNEED).
>
> It's expected behaviour.
>
> MADV_DONTNEED doesn't guarantee that the range will not be repopulated
> (with or without direct action on application behalf). It's just a hint
> for the kernel.
I agree with Kirill here but I would be interested in the underlying
usecase that triggered this. The test case is clearly artificial but is
any userspace actually relying on MADV_DONTNEED reducing the rss
longterm?
> For sparse mappings, consider using MADV_NOHUGEPAGE.
Yes or use a high threshold for khugepaged for collapsing.
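The message does not name the tunable. One plausible reading, and it is only an assumption, is khugepaged's max_ptes_none, which limits how many unmapped small pages khugepaged may fill in when collapsing a range; a lower value makes collapsing stricter. A sketch of setting such a knob from C, where `write_tunable` is a hypothetical helper:

```c
#include <stdio.h>

/*
 * Hypothetical helper: write `value` to the file at `path`.
 * Returns 0 on success, -1 on failure. For sysfs files the write
 * error, if any, usually only shows up when the stream is flushed,
 * so the result of fclose() is checked as well.
 */
static int write_tunable(const char *path, const char *value)
{
    FILE *f = fopen(path, "w");

    if (!f)
        return -1;
    int ok = fputs(value, f) >= 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Under the assumption above, a privileged process could call `write_tunable("/sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none", "0")` so that only fully populated ranges are collapsed. This is a system-wide setting, unlike the per-range MADV_NOHUGEPAGE.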
--
Michal Hocko
SUSE Labs
* Re: [Bug 202089] New: transparent hugepage not compatable with madvise(MADV_DONTNEED)
From: Matthew Wilcox @ 2019-01-03 14:35 UTC
To: Michal Hocko
Cc: Kirill A. Shutemov, jianpanlanyue, Andrew Morton, linux-mm,
bugzilla-daemon
On Thu, Jan 03, 2019 at 10:44:22AM +0100, Michal Hocko wrote:
> On Sun 30-12-18 01:48:43, Kirill A. Shutemov wrote:
> > On Sat, Dec 29, 2018 at 12:53:16PM -0800, Andrew Morton wrote:
> > > > 1. use mmap() to allocate 4096 bytes for 1024*512 times (4096*1024*512=2G).
> > > > 2. use madvise(MADV_DONTNEED) to free most of the above pages, but reserve a
> > > > few pages(by if(i%33==0) continue;), then process's physical memory firstly
> > > > come down, but after a few seconds, it rise back to 2G again, and can't come
> > > > down forever.
> > > > > 3. if i delete this condition(if(i%33==0) continue;) or disable
> > > > > transparent_hugepage by setting 'enabled' and 'defrag' to never, all go well and
> > > > > the physical memory can come down as expected.
> > > > >
> > > > > It seems like transparent_hugepage has problems with non-contiguous
> > > > > madvise(MADV_DONTNEED).
> >
> > It's expected behaviour.
> >
> > MADV_DONTNEED doesn't guarantee that the range will not be repopulated
> > (with or without direct action on application behalf). It's just a hint
> > for the kernel.
>
> I agree with Kirill here but I would be interested in the underlying
> usecase that triggered this. The test case is clearly artificial but is
> any userspace actually relying on MADV_DONTNEED reducing the rss
> longterm?
>
> > For sparse mappings, consider using MADV_NOHUGEPAGE.
Should the MADV_DONTNEED hint imply MADV_NOHUGEPAGE? It'd prevent
coalescing elsewhere in the VMA, so that might negatively affect other
programs.
* Re: [Bug 202089] New: transparent hugepage not compatable with madvise(MADV_DONTNEED)
From: Michal Hocko @ 2019-01-03 14:41 UTC
To: Matthew Wilcox
Cc: Kirill A. Shutemov, jianpanlanyue, Andrew Morton, linux-mm,
bugzilla-daemon
On Thu 03-01-19 06:35:02, Matthew Wilcox wrote:
> On Thu, Jan 03, 2019 at 10:44:22AM +0100, Michal Hocko wrote:
> > On Sun 30-12-18 01:48:43, Kirill A. Shutemov wrote:
> > > On Sat, Dec 29, 2018 at 12:53:16PM -0800, Andrew Morton wrote:
> > > > > 1. use mmap() to allocate 4096 bytes for 1024*512 times (4096*1024*512=2G).
> > > > > 2. use madvise(MADV_DONTNEED) to free most of the above pages, but reserve a
> > > > > few pages(by if(i%33==0) continue;), then process's physical memory firstly
> > > > > come down, but after a few seconds, it rise back to 2G again, and can't come
> > > > > down forever.
> > > > > > 3. if i delete this condition(if(i%33==0) continue;) or disable
> > > > > > transparent_hugepage by setting 'enabled' and 'defrag' to never, all go well and
> > > > > > the physical memory can come down as expected.
> > > > > >
> > > > > > It seems like transparent_hugepage has problems with non-contiguous
> > > > > > madvise(MADV_DONTNEED).
> > >
> > > It's expected behaviour.
> > >
> > > MADV_DONTNEED doesn't guarantee that the range will not be repopulated
> > > (with or without direct action on application behalf). It's just a hint
> > > for the kernel.
> >
> > I agree with Kirill here but I would be interested in the underlying
> > usecase that triggered this. The test case is clearly artificial but is
> > any userspace actually relying on MADV_DONTNEED reducing the rss
> > longterm?
> >
> > > For sparse mappings, consider using MADV_NOHUGEPAGE.
>
> Should the MADV_DONTNEED hint imply MADV_NOHUGEPAGE? It'd prevent
> coalescing elsewhere in the VMA, so that might negatively affect other
> programs.
I really do not think this is a good idea. MADV_DONTNEED doesn't really
imply anything to future rss. It only wipes out the current content.
In other words do we want to stop fault around/readahead or any other
optimistic faulting on MADV_DONTNEED?
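The "wipes out the current content" part can be seen from userspace. A small sketch for a private anonymous mapping on Linux; `byte_after_dontneed` is a hypothetical name used only for this illustration:

```c
#define _DEFAULT_SOURCE
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical demo: dirty one page, drop it with MADV_DONTNEED, then
 * read it back. Returns the first byte seen after the call, or -1 if
 * a step failed. The mapping itself stays valid; the next access
 * simply faults in a fresh zero-filled page.
 */
static int byte_after_dontneed(void)
{
    size_t len = (size_t)sysconf(_SC_PAGESIZE);
    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED)
        return -1;
    memset(p, 1, len);                  /* page is now resident and dirty */
    if (madvise(p, len, MADV_DONTNEED) != 0) {
        munmap(p, len);
        return -1;
    }
    int v = p[0];                       /* re-faults the page */
    munmap(p, len);
    return v;
}
```

Nothing in this sequence tells the kernel how the range should be populated later, which is the point made above.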
--
Michal Hocko
SUSE Labs
* Re: [Bug 202089] New: transparent hugepage not compatable with madvise(MADV_DONTNEED)
From: Kirill A. Shutemov @ 2019-01-03 14:53 UTC
To: Matthew Wilcox
Cc: Michal Hocko, jianpanlanyue, Andrew Morton, linux-mm, bugzilla-daemon
On Thu, Jan 03, 2019 at 06:35:02AM -0800, Matthew Wilcox wrote:
> On Thu, Jan 03, 2019 at 10:44:22AM +0100, Michal Hocko wrote:
> > On Sun 30-12-18 01:48:43, Kirill A. Shutemov wrote:
> > > On Sat, Dec 29, 2018 at 12:53:16PM -0800, Andrew Morton wrote:
> > > > > 1. use mmap() to allocate 4096 bytes for 1024*512 times (4096*1024*512=2G).
> > > > > 2. use madvise(MADV_DONTNEED) to free most of the above pages, but reserve a
> > > > > few pages(by if(i%33==0) continue;), then process's physical memory firstly
> > > > > come down, but after a few seconds, it rise back to 2G again, and can't come
> > > > > down forever.
> > > > > > 3. if i delete this condition(if(i%33==0) continue;) or disable
> > > > > > transparent_hugepage by setting 'enabled' and 'defrag' to never, all go well and
> > > > > > the physical memory can come down as expected.
> > > > > >
> > > > > > It seems like transparent_hugepage has problems with non-contiguous
> > > > > > madvise(MADV_DONTNEED).
> > >
> > > It's expected behaviour.
> > >
> > > MADV_DONTNEED doesn't guarantee that the range will not be repopulated
> > > (with or without direct action on application behalf). It's just a hint
> > > for the kernel.
> >
> > I agree with Kirill here but I would be interested in the underlying
> > usecase that triggered this. The test case is clearly artificial but is
> > any userspace actually relying on MADV_DONTNEED reducing the rss
> > longterm?
> >
> > > For sparse mappings, consider using MADV_NOHUGEPAGE.
>
> Should the MADV_DONTNEED hint imply MADV_NOHUGEPAGE? It'd prevent
> coalescing elsewhere in the VMA, so that might negatively affect other
> programs.
MADV_NOHUGEPAGE often creates a new VMA (or two) and it has performance
implications. And creating a new VMA would require down_write(mmap_sem)
which is no-go for MADV_DONTNEED.
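The VMA split can be observed from userspace by counting the lines of /proc/self/maps around the call. A rough sketch, assuming /proc is mounted; both function names are hypothetical. On a THP-enabled kernel I would expect the count to grow by two when only the middle page of a three-page mapping is advised, but that depends on the kernel configuration:

```c
#define _DEFAULT_SOURCE
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/* Count VMAs as lines in /proc/self/maps. Uses read() with a stack
 * buffer so that counting does not allocate memory itself.
 * Returns -1 on error. */
static int count_vmas(void)
{
    char buf[4096];
    int lines = 0;
    ssize_t n;
    int fd = open("/proc/self/maps", O_RDONLY);

    if (fd < 0)
        return -1;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        for (ssize_t i = 0; i < n; i++)
            if (buf[i] == '\n')
                lines++;
    close(fd);
    return n < 0 ? -1 : lines;
}

/* Map three pages, advise only the middle one, and return the change
 * in VMA count, or -1 if any step failed. */
static int vmas_added_by_nohugepage(void)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    char *p = mmap(NULL, 3 * page, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    if (p == MAP_FAILED)
        return -1;
    int before = count_vmas();
    int rc = madvise(p + page, page, MADV_NOHUGEPAGE);
    int after = count_vmas();
    munmap(p, 3 * page);
    if (rc != 0 || before < 0 || after < 0)
        return -1;
    return after - before;
}
```

MADV_DONTNEED on the same middle page leaves the VMA layout alone, which is why it does not need the write lock.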
--
Kirill A. Shutemov
Thread overview: 6+ messages
[not found] <bug-202089-27@https.bugzilla.kernel.org/>
2018-12-29 20:53 ` [Bug 202089] New: transparent hugepage not compatable with madvise(MADV_DONTNEED) Andrew Morton
2018-12-29 22:48 ` Kirill A. Shutemov
2019-01-03 9:44 ` Michal Hocko
2019-01-03 14:35 ` Matthew Wilcox
2019-01-03 14:41 ` Michal Hocko
2019-01-03 14:53 ` Kirill A. Shutemov