* [PATCH 0/2] Fixes for hugetlbfs-related problems on shared memory
From: Mel Gorman @ 2009-05-27 11:12 UTC
To: Ingo Molnar, Andrew Morton, stable, Linux Memory Management List
Cc: Linux Kernel Mailing List, Hugh Dickins, Lee Schermerhorn,
KOSAKI Motohiro, starlight, Eric B Munson, Adam Litke,
Andy Whitcroft, wli

The following two patches are required to fix problems reported by
starlight@binnacle.cx. The test cases both involve two processes interacting
with shared memory segments backed by hugetlbfs.

Patch 1 fixes an x86-specific problem where regions sharing page tables
are not being reference counted properly. The page tables get freed early
resulting in bad PMD messages printed to the kernel log and the hugetlb
counters getting corrupted. Strictly speaking, this affects mainline but
the problem is masked by UNEVICTABLE_LRU as it never leaves VM_LOCKED set for
hugetlbfs-backed mappings. This does affect the stable branch of 2.6.27 and
distributions based on that kernel such as SLES 11. This patch is required
for 2.6.27-stable and while it is optional for mainline, it should be merged
so that the stable branch does not contain patches that are not in mainline.

Patch 2 fixes a general hugetlbfs problem where it is using VM_SHARED instead
of VM_MAYSHARE to detect if the mapping was MAP_SHARED or MAP_PRIVATE. This
causes hugetlbfs to attempt reserving more pages than is required for
MAP_SHARED and mmap() fails when it should succeed. This patch is needed
for 2.6.30 and -stable. It rejects against 2.6.27.24 but the reject is
trivially resolved by changing the last VM_SHARED in hugetlb_reserve_pages()
to VM_MAYSHARE.

Starlight, if you are still watching, can you reconfirm that these patches
fix the problems you were having?

 arch/x86/mm/hugetlbpage.c |    6 +++++-
 mm/hugetlb.c              |   26 +++++++++++++-------------
 2 files changed, 18 insertions(+), 14 deletions(-)
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

* [PATCH 1/2] x86: Ignore VM_LOCKED when determining if hugetlb-backed page tables can be shared or not
From: Mel Gorman @ 2009-05-27 11:12 UTC
To: Ingo Molnar, Andrew Morton, stable, Linux Memory Management List
Cc: Linux Kernel Mailing List, Hugh Dickins, Lee Schermerhorn,
KOSAKI Motohiro, starlight, Eric B Munson, Adam Litke,
Andy Whitcroft, wli

On x86 and x86-64, it is possible that page tables are shared between shared
mappings backed by hugetlbfs. As part of this, page_table_shareable() checks
a pair of vma->vm_flags and they must match if they are to be shared. All
VMA flags are taken into account, including VM_LOCKED.

The problem is that VM_LOCKED is cleared on fork(). When a process with a
shared memory segment forks() to exec() a helper, there will be shared VMAs
with different flags. The impact is that the shared segment is sometimes
considered shareable and other times not, depending on what process is
checking.

What happens is that the segment page tables are being shared but the count is
inaccurate depending on the ordering of events. As the page tables are freed
with put_page(), bad pmd's are found when some of the children exit. The
hugepage counters also get corrupted and the Total and Free count will
no longer match even when all the hugepage-backed regions are freed. This
requires a reboot of the machine to "fix".

This patch addresses the problem by comparing all flags except VM_LOCKED when
deciding if pagetables should be shared or not for hugetlbfs-backed mapping.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Acked-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
---
 arch/x86/mm/hugetlbpage.c |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c
index 8f307d9..f46c340 100644
--- a/arch/x86/mm/hugetlbpage.c
+++ b/arch/x86/mm/hugetlbpage.c
@@ -26,12 +26,16 @@ static unsigned long page_table_shareable(struct vm_area_struct *svma,
 	unsigned long sbase = saddr & PUD_MASK;
 	unsigned long s_end = sbase + PUD_SIZE;
 
+	/* Allow segments to share if only one is marked locked */
+	unsigned long vm_flags = vma->vm_flags & ~VM_LOCKED;
+	unsigned long svm_flags = svma->vm_flags & ~VM_LOCKED;
+
 	/*
 	 * match the virtual addresses, permission and the alignment of the
 	 * page table page.
 	 */
 	if (pmd_index(addr) != pmd_index(saddr) ||
-	    vma->vm_flags != svma->vm_flags ||
+	    vm_flags != svm_flags ||
 	    sbase < svma->vm_start || svma->vm_end < s_end)
 		return 0;
 
-- 
1.5.6.5
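
The comparison change in patch 1 can be tried outside the kernel with a
minimal user-space sketch. The flag values below are illustrative stand-ins
and the two helper names are hypothetical; only the masking idea is taken
from the patch itself.

```c
#include <stdbool.h>

/*
 * Illustrative stand-ins for the kernel's VMA flag bits. Treat the
 * numeric values as assumptions, not as a copy of the kernel headers.
 */
#define VM_READ   0x0001UL
#define VM_WRITE  0x0002UL
#define VM_SHARED 0x0008UL
#define VM_LOCKED 0x2000UL

/* Pre-patch behaviour: every flag bit must match, VM_LOCKED included. */
static inline bool flags_match_strict(unsigned long a, unsigned long b)
{
	return a == b;
}

/* Patched behaviour: compare all flags except VM_LOCKED. */
static inline bool flags_match_ignoring_locked(unsigned long a, unsigned long b)
{
	return (a & ~VM_LOCKED) == (b & ~VM_LOCKED);
}
```

With a parent VMA that has VM_LOCKED set and a forked child whose copy has
it cleared, the strict check reports a mismatch while the masked check still
treats the two as shareable. Any other differing bit, such as VM_WRITE,
still prevents sharing.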

* Re: [PATCH 1/2] x86: Ignore VM_LOCKED when determining if hugetlb-backed page tables can be shared or not
From: Eric B Munson @ 2009-05-27 16:38 UTC
To: Mel Gorman
Cc: Ingo Molnar, Andrew Morton, stable, Linux Memory Management List,
Linux Kernel Mailing List, Hugh Dickins, Lee Schermerhorn,
KOSAKI Motohiro, starlight, Adam Litke, Andy Whitcroft, wli

On Wed, 27 May 2009, Mel Gorman wrote:

> On x86 and x86-64, it is possible that page tables are shared between shared
> mappings backed by hugetlbfs. As part of this, page_table_shareable() checks
> a pair of vma->vm_flags and they must match if they are to be shared. All
> VMA flags are taken into account, including VM_LOCKED.
>
> The problem is that VM_LOCKED is cleared on fork(). When a process with a
> shared memory segment forks() to exec() a helper, there will be shared VMAs
> with different flags. The impact is that the shared segment is sometimes
> considered shareable and other times not, depending on what process is
> checking.
>
> What happens is that the segment page tables are being shared but the count is
> inaccurate depending on the ordering of events. As the page tables are freed
> with put_page(), bad pmd's are found when some of the children exit. The
> hugepage counters also get corrupted and the Total and Free count will
> no longer match even when all the hugepage-backed regions are freed. This
> requires a reboot of the machine to "fix".
>
> This patch addresses the problem by comparing all flags except VM_LOCKED when
> deciding if pagetables should be shared or not for hugetlbfs-backed mapping.
>
> Signed-off-by: Mel Gorman <mel@csn.ul.ie>
> Acked-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>

I tested this patch using 2.6.30-rc7 and the libhugetlbfs test suite on
x86_64. Everything looks good to me.

Acked-by: Eric B Munson <ebmunson@us.ibm.com>
Tested-by: Eric B Munson <ebmunson@us.ibm.com>

* Re: [PATCH 1/2] x86: Ignore VM_LOCKED when determining if hugetlb-backed page tables can be shared or not
From: Ingo Molnar @ 2009-05-27 23:18 UTC
To: Mel Gorman
Cc: Andrew Morton, stable, Linux Memory Management List,
Linux Kernel Mailing List, Hugh Dickins, Lee Schermerhorn,
KOSAKI Motohiro, starlight, Eric B Munson, Adam Litke,
Andy Whitcroft, wli

* Mel Gorman <mel@csn.ul.ie> wrote:

> On x86 and x86-64, it is possible that page tables are shared
> between shared mappings backed by hugetlbfs. As part of this,
> page_table_shareable() checks a pair of vma->vm_flags and they
> must match if they are to be shared. All VMA flags are taken into
> account, including VM_LOCKED.
>
> The problem is that VM_LOCKED is cleared on fork(). When a process
> with a shared memory segment forks() to exec() a helper, there
> will be shared VMAs with different flags. The impact is that the
> shared segment is sometimes considered shareable and other times
> not, depending on what process is checking.
>
> What happens is that the segment page tables are being shared but
> the count is inaccurate depending on the ordering of events. As
> the page tables are freed with put_page(), bad pmd's are found
> when some of the children exit. The hugepage counters also get
> corrupted and the Total and Free count will no longer match even
> when all the hugepage-backed regions are freed. This requires a
> reboot of the machine to "fix".
>
> This patch addresses the problem by comparing all flags except
> VM_LOCKED when deciding if pagetables should be shared or not for
> hugetlbfs-backed mapping.
>
> Signed-off-by: Mel Gorman <mel@csn.ul.ie>
> Acked-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
> ---
>  arch/x86/mm/hugetlbpage.c |    6 +++++-
>  1 files changed, 5 insertions(+), 1 deletions(-)

i suspect it would be best to do this via -mm, due to the (larger)
mm/hugetlb.c cross section, right?

	Ingo

* Re: [PATCH 1/2] x86: Ignore VM_LOCKED when determining if hugetlb-backed page tables can be shared or not
From: Mel Gorman @ 2009-05-28 8:55 UTC
To: Ingo Molnar
Cc: Andrew Morton, stable, Linux Memory Management List,
Linux Kernel Mailing List, Hugh Dickins, Lee Schermerhorn,
KOSAKI Motohiro, starlight, Eric B Munson, Adam Litke,
Andy Whitcroft, wli

On Thu, May 28, 2009 at 01:18:03AM +0200, Ingo Molnar wrote:
>
> * Mel Gorman <mel@csn.ul.ie> wrote:
>
> > On x86 and x86-64, it is possible that page tables are shared
> > between shared mappings backed by hugetlbfs. As part of this,
> > page_table_shareable() checks a pair of vma->vm_flags and they
> > must match if they are to be shared. All VMA flags are taken into
> > account, including VM_LOCKED.
> >
> > The problem is that VM_LOCKED is cleared on fork(). When a process
> > with a shared memory segment forks() to exec() a helper, there
> > will be shared VMAs with different flags. The impact is that the
> > shared segment is sometimes considered shareable and other times
> > not, depending on what process is checking.
> >
> > What happens is that the segment page tables are being shared but
> > the count is inaccurate depending on the ordering of events. As
> > the page tables are freed with put_page(), bad pmd's are found
> > when some of the children exit. The hugepage counters also get
> > corrupted and the Total and Free count will no longer match even
> > when all the hugepage-backed regions are freed. This requires a
> > reboot of the machine to "fix".
> >
> > This patch addresses the problem by comparing all flags except
> > VM_LOCKED when deciding if pagetables should be shared or not for
> > hugetlbfs-backed mapping.
> >
> > Signed-off-by: Mel Gorman <mel@csn.ul.ie>
> > Acked-by: Hugh Dickins <hugh.dickins@tiscali.co.uk>
> > ---
> >  arch/x86/mm/hugetlbpage.c |    6 +++++-
> >  1 files changed, 5 insertions(+), 1 deletions(-)
>
> i suspect it would be best to do this via -mm, due to the (larger)
> mm/hugetlb.c cross section, right?
>

I'm happy with that approach. Almost all hugetlbfs-related patches have
gone through -mm to date AFAIK even when they have been arch specific like
this.

Thanks

-- 
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab

* [PATCH 2/2] mm: Account for MAP_SHARED mappings using VM_MAYSHARE and not VM_SHARED in hugetlbfs
From: Mel Gorman @ 2009-05-27 11:12 UTC
To: Ingo Molnar, Andrew Morton, stable, Linux Memory Management List
Cc: Linux Kernel Mailing List, Hugh Dickins, Lee Schermerhorn,
KOSAKI Motohiro, starlight, Eric B Munson, Adam Litke,
Andy Whitcroft, wli

hugetlbfs reserves huge pages but does not fault them at mmap() time to ensure
that future faults succeed. The reservation behaviour differs depending on
whether the mapping was mapped MAP_SHARED or MAP_PRIVATE. For MAP_SHARED
mappings, hugepages are reserved when mmap() is first called and are tracked
based on information associated with the inode. Other processes mapping
MAP_SHARED use the same reservation. MAP_PRIVATE mappings track the
reservations based on the VMA created as part of the mmap() operation. Each
process mapping MAP_PRIVATE must make its own reservation.

hugetlbfs currently checks if a VMA is MAP_SHARED with the VM_SHARED flag and
not VM_MAYSHARE. For file-backed mappings, such as hugetlbfs, VM_SHARED is
set only if the mapping is MAP_SHARED and the file was opened read-write. If a
shared memory mapping was mapped shared-read-write for populating of data and
mapped shared-read-only by other processes, then hugetlbfs would account for
the mapping as if it was MAP_PRIVATE. This causes processes to fail to map
the file MAP_SHARED even though it should succeed as the reservation is there.

This patch alters mm/hugetlb.c and replaces VM_SHARED with VM_MAYSHARE when
the intent of the code was to check whether the VMA was mapped MAP_SHARED
or MAP_PRIVATE.

Signed-off-by: Mel Gorman <mel@csn.ul.ie>
---
 mm/hugetlb.c |   26 +++++++++++++-------------
 1 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 28c655b..e83ad2c 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -316,7 +316,7 @@ static void resv_map_release(struct kref *ref)
 static struct resv_map *vma_resv_map(struct vm_area_struct *vma)
 {
 	VM_BUG_ON(!is_vm_hugetlb_page(vma));
-	if (!(vma->vm_flags & VM_SHARED))
+	if (!(vma->vm_flags & VM_MAYSHARE))
 		return (struct resv_map *)(get_vma_private_data(vma) &
 							~HPAGE_RESV_MASK);
 	return NULL;
@@ -325,7 +325,7 @@ static struct resv_map *vma_resv_map(struct vm_area_struct *vma)
 static void set_vma_resv_map(struct vm_area_struct *vma, struct resv_map *map)
 {
 	VM_BUG_ON(!is_vm_hugetlb_page(vma));
-	VM_BUG_ON(vma->vm_flags & VM_SHARED);
+	VM_BUG_ON(vma->vm_flags & VM_MAYSHARE);
 
 	set_vma_private_data(vma, (get_vma_private_data(vma) &
 				HPAGE_RESV_MASK) | (unsigned long)map);
@@ -334,7 +334,7 @@ static void set_vma_resv_map(struct vm_area_struct *vma, struct resv_map *map)
 static void set_vma_resv_flags(struct vm_area_struct *vma, unsigned long flags)
 {
 	VM_BUG_ON(!is_vm_hugetlb_page(vma));
-	VM_BUG_ON(vma->vm_flags & VM_SHARED);
+	VM_BUG_ON(vma->vm_flags & VM_MAYSHARE);
 
 	set_vma_private_data(vma, get_vma_private_data(vma) | flags);
 }
@@ -353,7 +353,7 @@ static void decrement_hugepage_resv_vma(struct hstate *h,
 	if (vma->vm_flags & VM_NORESERVE)
 		return;
 
-	if (vma->vm_flags & VM_SHARED) {
+	if (vma->vm_flags & VM_MAYSHARE) {
 		/* Shared mappings always use reserves */
 		h->resv_huge_pages--;
 	} else if (is_vma_resv_set(vma, HPAGE_RESV_OWNER)) {
@@ -369,14 +369,14 @@ static void decrement_hugepage_resv_vma(struct hstate *h,
 void reset_vma_resv_huge_pages(struct vm_area_struct *vma)
 {
 	VM_BUG_ON(!is_vm_hugetlb_page(vma));
-	if (!(vma->vm_flags & VM_SHARED))
+	if (!(vma->vm_flags & VM_MAYSHARE))
 		vma->vm_private_data = (void *)0;
 }
 
 /* Returns true if the VMA has associated reserve pages */
 static int vma_has_reserves(struct vm_area_struct *vma)
 {
-	if (vma->vm_flags & VM_SHARED)
+	if (vma->vm_flags & VM_MAYSHARE)
 		return 1;
 	if (is_vma_resv_set(vma, HPAGE_RESV_OWNER))
 		return 1;
@@ -924,7 +924,7 @@ static long vma_needs_reservation(struct hstate *h,
 	struct address_space *mapping = vma->vm_file->f_mapping;
 	struct inode *inode = mapping->host;
 
-	if (vma->vm_flags & VM_SHARED) {
+	if (vma->vm_flags & VM_MAYSHARE) {
 		pgoff_t idx = vma_hugecache_offset(h, vma, addr);
 		return region_chg(&inode->i_mapping->private_list,
 							idx, idx + 1);
@@ -949,7 +949,7 @@ static void vma_commit_reservation(struct hstate *h,
 	struct address_space *mapping = vma->vm_file->f_mapping;
 	struct inode *inode = mapping->host;
 
-	if (vma->vm_flags & VM_SHARED) {
+	if (vma->vm_flags & VM_MAYSHARE) {
 		pgoff_t idx = vma_hugecache_offset(h, vma, addr);
 		region_add(&inode->i_mapping->private_list, idx, idx + 1);
 
@@ -1893,7 +1893,7 @@ retry_avoidcopy:
 	 * at the time of fork() could consume its reserves on COW instead
 	 * of the full address range.
 	 */
-	if (!(vma->vm_flags & VM_SHARED) &&
+	if (!(vma->vm_flags & VM_MAYSHARE) &&
 			is_vma_resv_set(vma, HPAGE_RESV_OWNER) &&
 			old_page != pagecache_page)
 		outside_reserve = 1;
@@ -2000,7 +2000,7 @@ retry:
 		clear_huge_page(page, address, huge_page_size(h));
 		__SetPageUptodate(page);
 
-		if (vma->vm_flags & VM_SHARED) {
+		if (vma->vm_flags & VM_MAYSHARE) {
 			int err;
 			struct inode *inode = mapping->host;
 
@@ -2104,7 +2104,7 @@ int hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 			goto out_mutex;
 		}
 
-		if (!(vma->vm_flags & VM_SHARED))
+		if (!(vma->vm_flags & VM_MAYSHARE))
 			pagecache_page = hugetlbfs_pagecache_page(h,
 								vma, address);
 	}
@@ -2289,7 +2289,7 @@ int hugetlb_reserve_pages(struct inode *inode,
 	 * to reserve the full area even if read-only as mprotect() may be
 	 * called to make the mapping read-write. Assume !vma is a shm mapping
 	 */
-	if (!vma || vma->vm_flags & VM_SHARED)
+	if (!vma || vma->vm_flags & VM_MAYSHARE)
 		chg = region_chg(&inode->i_mapping->private_list, from, to);
 	else {
 		struct resv_map *resv_map = resv_map_alloc();
@@ -2330,7 +2330,7 @@ int hugetlb_reserve_pages(struct inode *inode,
 	 * consumed reservations are stored in the map. Hence, nothing
 	 * else has to be done for private mappings here
 	 */
-	if (!vma || vma->vm_flags & VM_SHARED)
+	if (!vma || vma->vm_flags & VM_MAYSHARE)
 		region_add(&inode->i_mapping->private_list, from, to);
 	return 0;
 }
-- 
1.5.6.5
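
The misclassification described in patch 2 can be reproduced with a small
user-space model. This is only a sketch under stated assumptions: the flag
values are illustrative stand-ins, and model_vm_flags() is a hypothetical
helper that encodes just the rule given in the changelog (VM_SHARED is set
only for MAP_SHARED on a file opened read-write, while VM_MAYSHARE follows
MAP_SHARED alone). It is not the kernel's mmap() code.

```c
#include <stdbool.h>

/* Illustrative stand-ins for the kernel's VMA flag bits (assumed values). */
#define VM_SHARED   0x0008UL
#define VM_MAYSHARE 0x0080UL

/* Hypothetical model of the flags a file-backed mapping ends up with. */
static inline unsigned long model_vm_flags(bool map_shared, bool file_writable)
{
	unsigned long flags = 0;

	if (map_shared) {
		flags |= VM_MAYSHARE;		/* requested MAP_SHARED */
		if (file_writable)
			flags |= VM_SHARED;	/* only when opened read-write */
	}
	return flags;
}

/* Pre-patch test used by hugetlbfs to decide "is this MAP_SHARED?". */
static inline bool is_map_shared_old(unsigned long flags)
{
	return (flags & VM_SHARED) != 0;
}

/* Post-patch test. */
static inline bool is_map_shared_new(unsigned long flags)
{
	return (flags & VM_MAYSHARE) != 0;
}
```

For a MAP_SHARED mapping of a file opened read-only, the old test answers
"private" and so a separate reservation is attempted, while the new test
answers "shared" and the existing inode-based reservation is used.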

* Re: [PATCH 2/2] mm: Account for MAP_SHARED mappings using VM_MAYSHARE and not VM_SHARED in hugetlbfs
From: Eric B Munson @ 2009-05-27 16:40 UTC
To: Mel Gorman
Cc: Ingo Molnar, Andrew Morton, stable, Linux Memory Management List,
Linux Kernel Mailing List, Hugh Dickins, Lee Schermerhorn,
KOSAKI Motohiro, starlight, Adam Litke, Andy Whitcroft, wli

On Wed, 27 May 2009, Mel Gorman wrote:

> hugetlbfs reserves huge pages but does not fault them at mmap() time to ensure
> that future faults succeed. The reservation behaviour differs depending on
> whether the mapping was mapped MAP_SHARED or MAP_PRIVATE. For MAP_SHARED
> mappings, hugepages are reserved when mmap() is first called and are tracked
> based on information associated with the inode. Other processes mapping
> MAP_SHARED use the same reservation. MAP_PRIVATE mappings track the
> reservations based on the VMA created as part of the mmap() operation. Each
> process mapping MAP_PRIVATE must make its own reservation.
>
> hugetlbfs currently checks if a VMA is MAP_SHARED with the VM_SHARED flag and
> not VM_MAYSHARE. For file-backed mappings, such as hugetlbfs, VM_SHARED is
> set only if the mapping is MAP_SHARED and the file was opened read-write. If a
> shared memory mapping was mapped shared-read-write for populating of data and
> mapped shared-read-only by other processes, then hugetlbfs would account for
> the mapping as if it was MAP_PRIVATE. This causes processes to fail to map
> the file MAP_SHARED even though it should succeed as the reservation is there.
>
> This patch alters mm/hugetlb.c and replaces VM_SHARED with VM_MAYSHARE when
> the intent of the code was to check whether the VMA was mapped MAP_SHARED
> or MAP_PRIVATE.
>
> Signed-off-by: Mel Gorman <mel@csn.ul.ie>

I tested this patch on both x86_64 and ppc64 using 2.6.30-rc7 with the
libhugetlbfs test suite and everything looks good.

Acked-by: Eric B Munson <ebmunson@us.ibm.com>
Tested-by: Eric B Munson <ebmunson@us.ibm.com>

* Re: [PATCH 0/2] Fixes for hugetlbfs-related problems on shared memory
From: Andrew Morton @ 2009-05-27 20:14 UTC
To: Mel Gorman
Cc: mingo, stable, linux-mm, linux-kernel, hugh.dickins,
Lee.Schermerhorn, kosaki.motohiro, starlight, ebmunson, agl, apw, wli

On Wed, 27 May 2009 12:12:27 +0100
Mel Gorman <mel@csn.ul.ie> wrote:

> The following two patches are required to fix problems reported by
> starlight@binnacle.cx. The test cases both involve two processes interacting
> with shared memory segments backed by hugetlbfs.

Thanks.

Both of these address http://bugzilla.kernel.org/show_bug.cgi?id=13302, yes?
I added that info to the changelogs, to close the loop.

Ingo, I'd propose merging both these together rather than routing one via
the x86 tree, OK?

Question is: when? Are we confident enough to merge it into 2.6.30 now, or
should we hold off for 2.6.30.1? I guess we have a week or more, and if
the changes do break something, we can fix that in 2.6.30.1 ;)

* Re: [PATCH 0/2] Fixes for hugetlbfs-related problems on shared memory
From: Ingo Molnar @ 2009-05-27 23:19 UTC
To: Andrew Morton
Cc: Mel Gorman, stable, linux-mm, linux-kernel, hugh.dickins,
Lee.Schermerhorn, kosaki.motohiro, starlight, ebmunson, agl, apw, wli

* Andrew Morton <akpm@linux-foundation.org> wrote:

> On Wed, 27 May 2009 12:12:27 +0100
> Mel Gorman <mel@csn.ul.ie> wrote:
>
> > The following two patches are required to fix problems reported by
> > starlight@binnacle.cx. The test cases both involve two processes interacting
> > with shared memory segments backed by hugetlbfs.
>
> Thanks.
>
> Both of these address
> http://bugzilla.kernel.org/show_bug.cgi?id=13302, yes? I added
> that info to the changelogs, to close the loop.
>
> Ingo, I'd propose merging both these together rather than routing
> one via the x86 tree, OK?

sure.

> Question is: when? Are we confident enough to merge it into
> 2.6.30 now, or should we hold off for 2.6.30.1? I guess we have a
> week or more, and if the changes do break something, we can fix
> that in 2.6.30.1 ;)

With an Acked-by from Hugh i feel pretty confident about it - and as
long as it gets into -rc8 i think we should do it in .30.

	Ingo

* QUESTION: can netdev_alloc_skb() errors be reduced by tuning?
From: starlight @ 2009-06-16 0:19 UTC
To: linux-kernel, Mel Gorman, linux-mm, hugh.dickins, Lee.Schermerhorn,
kosaki.motohiro, ebmunson, agl, apw, wli

Hello,

I submitted testcase for a hugepages bug that has been
successfully resolved. Have an apparently obscure question
related to MM, and so I am asking anyone who might have some idea
on this. Nothing much turned up via Google and digging into
the KMEM code looks daunting.

Running Intel 82598/ixgbe 10 gig Ethernet under heavy stress.
Generally is working well after tuning IRQ affinities, but a
fair number of buffer allocation failures are occurring in the
'ixgbe' device driver and are reported via 'ethtool' statistics.
This may be causing data loss.

The kernel primitive returning the error is netdev_alloc_skb().

Are any tuneable parameters available that can reduce or
eliminate these allocation failures? Have about eleven
gigabytes of free memory, though most of that is consumed
by non-dirty file cache data. Total system memory is 16GB with
4GB allocated to hugepages. Zero swap usage and activity though
swap is enabled. Most application memory is hugepage or is
'mlock()'ed.

Thank you.


System rebooted before test run.

Dual Xeon E5430, 16GB FB-DIMM RAM.


$ cat /proc/meminfo
MemTotal:         16443828 kB
MemFree:            281176 kB
Buffers:             53896 kB
Cached:           11331924 kB
SwapCached:              0 kB
Active:             200740 kB
Inactive:         11284312 kB
HighTotal:               0 kB
HighFree:                0 kB
LowTotal:         16443828 kB
LowFree:            281176 kB
SwapTotal:         2031608 kB
SwapFree:          2031400 kB
Dirty:                   4 kB
Writeback:               0 kB
AnonPages:          104464 kB
Mapped:              14644 kB
Slab:               440452 kB
PageTables:           4032 kB
NFS_Unstable:            0 kB
Bounce:                  0 kB
CommitLimit:       8156368 kB
Committed_AS:       122452 kB
VmallocTotal:  34359738367 kB
VmallocUsed:        266872 kB
VmallocChunk:  34359471043 kB
HugePages_Total:      2048
HugePages_Free:        735
HugePages_Rsvd:          0
Hugepagesize:         2048 kB


# ethtool -S eth2 | egrep -v ': 0$'
NIC statistics:
     rx_packets: 724246449
     tx_packets: 229847
     rx_bytes: 152691992335
     tx_bytes: 10573426
     multicast: 725997241
     broadcast: 6
     rx_csum_offload_good: 723051776
     alloc_rx_buff_failed: 7119
     tx_queue_0_packets: 229847
     tx_queue_0_bytes: 10573426
     rx_queue_0_packets: 340698332
     rx_queue_0_bytes: 70844299683
     rx_queue_1_packets: 385298923
     rx_queue_1_bytes: 82276167594


ixgbe driver fragment
=====================
	struct sk_buff *skb = netdev_alloc_skb(adapter->netdev, bufsz);

	if (!skb) {
		adapter->alloc_rx_buff_failed++;
		goto no_buffers;
	}
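
As a quick cross-check of the figures in the report, the hugepage pool size
can be recomputed from the HugePages_Total, HugePages_Free and Hugepagesize
values shown above. The helper names in this sketch are hypothetical; only
the input numbers come from the /proc/meminfo output.

```c
#include <stdint.h>

/* Size of the hugepage pool in kB: number of pages times page size in kB. */
static inline uint64_t hugepage_pool_kb(uint64_t total_pages, uint64_t page_kb)
{
	return total_pages * page_kb;
}

/* Huge pages currently in use: total minus free. */
static inline uint64_t hugepages_in_use(uint64_t total_pages, uint64_t free_pages)
{
	return total_pages - free_pages;
}
```

With 2048 pages of 2048 kB each, the pool is 4194304 kB, which matches the
"4GB allocated to hugepages" in the text. With 735 pages free, 1313 pages
are in use.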

* Re: QUESTION: can netdev_alloc_skb() errors be reduced by tuning?
From: Eric Dumazet @ 2009-06-16 2:26 UTC
To: starlight
Cc: linux-kernel, Mel Gorman, linux-mm, hugh.dickins, Lee.Schermerhorn,
kosaki.motohiro, ebmunson, agl, apw, wli

starlight@binnacle.cx wrote:
> Hello,
>
> I submitted testcase for a hugepages bug that has been
> successfully resolved. Have an apparently obscure question
> related to MM, and so I am asking anyone who might have some idea
> on this. Nothing much turned up via Google and digging into
> the KMEM code looks daunting.
>
> Running Intel 82598/ixgbe 10 gig Ethernet under heavy stress.
> Generally is working well after tuning IRQ affinities, but a
> fair number of buffer allocation failures are occurring in the
> 'ixgbe' device driver and are reported via 'ethtool' statistics.
> This may be causing data loss.
>
> The kernel primitive returning the error is netdev_alloc_skb().
>
> Are any tuneable parameters available that can reduce or
> eliminate these allocation failures? Have about eleven
> gigabytes of free memory, though most of that is consumed
> by non-dirty file cache data. Total system memory is 16GB with
> 4GB allocated to hugepages. Zero swap usage and activity though
> swap is enabled. Most application memory is hugepage or is
> 'mlock()'ed.
>
> Thank you.
>
>
> System rebooted before test run.
>
> Dual Xeon E5430, 16GB FB-DIMM RAM.
>
>
> $ cat /proc/meminfo
> MemTotal:         16443828 kB
> MemFree:            281176 kB
> Buffers:             53896 kB
> Cached:           11331924 kB
> SwapCached:              0 kB
> Active:             200740 kB
> Inactive:         11284312 kB
> HighTotal:               0 kB
> HighFree:                0 kB
> LowTotal:         16443828 kB
> LowFree:            281176 kB
> SwapTotal:         2031608 kB
> SwapFree:          2031400 kB
> Dirty:                   4 kB
> Writeback:               0 kB
> AnonPages:          104464 kB
> Mapped:              14644 kB
> Slab:               440452 kB
> PageTables:           4032 kB
> NFS_Unstable:            0 kB
> Bounce:                  0 kB
> CommitLimit:       8156368 kB
> Committed_AS:       122452 kB
> VmallocTotal:  34359738367 kB
> VmallocUsed:        266872 kB
> VmallocChunk:  34359471043 kB
> HugePages_Total:      2048
> HugePages_Free:        735
> HugePages_Rsvd:          0
> Hugepagesize:         2048 kB
>
>
> # ethtool -S eth2 | egrep -v ': 0$'
> NIC statistics:
>      rx_packets: 724246449
>      tx_packets: 229847
>      rx_bytes: 152691992335
>      tx_bytes: 10573426
>      multicast: 725997241
>      broadcast: 6
>      rx_csum_offload_good: 723051776
>      alloc_rx_buff_failed: 7119
>      tx_queue_0_packets: 229847
>      tx_queue_0_bytes: 10573426
>      rx_queue_0_packets: 340698332
>      rx_queue_0_bytes: 70844299683
>      rx_queue_1_packets: 385298923
>      rx_queue_1_bytes: 82276167594
>
>
> ixgbe driver fragment
> =====================
> 	struct sk_buff *skb = netdev_alloc_skb(adapter->netdev, bufsz);
>
> 	if (!skb) {
> 		adapter->alloc_rx_buff_failed++;
> 		goto no_buffers;
> 	}
>

152691992335/724246449 = 210 bytes per rx packet on average

It could make sense to add copybreak feature in this driver to reduce memory
needs, but that also would consume more cpu cycles, and slow down forwarding
setups.

Maybe this packet trimming could be done generically in UDP stack input path,
before queueing packet into a receive queue, if amount of available memory
is under a given threshold.
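
The average quoted above is a plain integer division of the rx_bytes counter
by the rx_packets counter. A minimal sketch, with a hypothetical helper
name, for redoing the calculation from any pair of 'ethtool -S' counters:

```c
#include <stdint.h>

/*
 * Average payload per received packet, truncated to whole bytes.
 * Returns 0 when no packets have been counted, to avoid dividing by zero.
 */
static inline uint64_t avg_bytes_per_packet(uint64_t rx_bytes, uint64_t rx_packets)
{
	return rx_packets ? rx_bytes / rx_packets : 0;
}
```

For the counters in the report (152691992335 bytes over 724246449 packets)
this yields 210.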
* Re: QUESTION: can netdev_alloc_skb() errors be reduced by tuning?
2009-06-16 2:26 ` Eric Dumazet
@ 2009-06-16 4:12 ` starlight
2009-06-16 6:12 ` Eric Dumazet
0 siblings, 1 reply; 20+ messages in thread
From: starlight @ 2009-06-16 4:12 UTC (permalink / raw)
To: Eric Dumazet
Cc: linux-kernel, Mel Gorman, linux-mm, hugh.dickins, Lee.Schermerhorn,
kosaki.motohiro, ebmunson, agl, apw, wli
Eric,

Great thought--thank you. Running a similar server with
82571/e1000e and it does not exhibit the problem. 'e1000e' has
default copybreak=256 while 'ixgbe' has no copybreak. Rationale
given is

http://osdir.com/ml/linux.drivers.e1000.devel/2008-01/msg00103.html

But the comparison is a bit apples-and-oranges since the 'e1000e'
system is dual Opteron 2354 while the 'ixgbe' system is Xeon
E5430 (a painful choice thus far). Also the 'e1000e' system passes
data via a PACKET socket while the 'ixgbe' system passes data
via UDP (a configurable option).

I'm not fully up on how this all works: am I to understand that
the error could result from RX ring-queue buffers not freeing
quickly enough because they have a use-count held non-zero as
the packet travels the stack?

I've just doubled some SLAB tuneables that seem relevant, but
if the cause is the aforementioned, this won't help. Will
have the answer on the tweaks by the end of Tuesday.

David

At 04:26 AM 6/16/2009 +0200, Eric Dumazet wrote:
>
>152691992335/724246449 = 210 bytes per rx packet on average
>
>It could make sense to add copybreak feature in this driver to
>reduce memory needs, but that also would consume more cpu
>cycles, and slow down forwarding setups.
>
>Maybe this packet trimming could be done generically in UDP
>stack input path, before queueing packet into a receive queue,
>if amount of available memory is under a given threshold.
* Re: QUESTION: can netdev_alloc_skb() errors be reduced by tuning?
2009-06-16 4:12 ` starlight
@ 2009-06-16 6:12 ` Eric Dumazet
2009-07-05 3:44 ` Herbert Xu
0 siblings, 1 reply; 20+ messages in thread
From: Eric Dumazet @ 2009-06-16 6:12 UTC (permalink / raw)
To: starlight
Cc: Eric Dumazet, linux-kernel, Mel Gorman, linux-mm, hugh.dickins,
Lee.Schermerhorn, kosaki.motohiro, ebmunson, agl, apw, wli,
Linux Netdev List
Please don't top post, we prefer the other way around :)

starlight@binnacle.cx wrote:
> Eric,
>
> Great thought--thank you. Running a similar server with
> 82571/e1000e and it does not exhibit the problem. 'e1000e' has
> default copybreak=256 while 'ixgbe' has no copybreak. Rationale
> given is
>
> http://osdir.com/ml/linux.drivers.e1000.devel/2008-01/msg00103.html
>
> But the comparison is a bit apples-and-oranges since the 'e1000e'
> system is dual Opteron 2354 while the 'ixgbe' system is Xeon
> E5430 (a painful choice thus far). Also 'e1000e' system passes
> data via a PACKET socket while the 'ixgbe' system passes data
> via UDP (a configurable option).
>
> I'm not fully up on how this all works: am I to understand that
> the error could result from RX ring-queue buffers not freeing
> quickly enough because they have a use-count held non-zero as
> the packet travels the stack?

Well, the error is normal in a stress situation, when no more kernel
memory is available.

cat /proc/net/udp can show you (in the last column) sockets where
packets were dropped by the UDP stack because their receive queue was
full.

>
> I've just doubled some SLAB tuneables that seem relevant, but
> if the cause is the aforementioned, this won't help. Will
> have the answer on the tweaks by the end of Tuesday.
>
> David

copybreak in the drivers themselves is nice because the driver can
recycle its rx skbs much faster, but that is suboptimal in forwarding
(router) workloads. It's also a lot of duplicated code in every driver.

So we could do the skb trimming (i.e. reallocating the data portion to
exactly the size of the packet) in the core network stack, when we know
the packet must be handled by an application, and not dropped or
forwarded by the kernel.

Because of slab rounding, this reallocation should be done only if the
resulting data portion is really smaller (50 %) than the original skb.

>
>
>
> At 04:26 AM 6/16/2009 +0200, Eric Dumazet wrote:
>> 152691992335/724246449 = 210 bytes per rx packet on average
>>
>> It could make sense to add copybreak feature in this driver to
>> reduce memory needs, but that also would consume more cpu
>> cycles, and slow down forwarding setups.
>>
>> Maybe this packet trimming could be done generically in UDP
>> stack input path, before queueing packet into a receive queue,
>> if amount of available memory is under a given threshold.
>
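The 50 % rule above can be illustrated with a small userspace C sketch. It assumes, purely for illustration, that the allocator rounds every request up to a power-of-two size class with a 32-byte minimum; real slab caches have other size classes as well, and `slab_round_up()` and `worth_trimming()` are hypothetical helper names, not kernel functions.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Assumed allocator model: requests are rounded up to the next power of
 * two, starting at 32 bytes. This is a simplification of slab rounding.
 */
static size_t slab_round_up(size_t n)
{
	size_t c = 32;

	/* Stop doubling before the shift could overflow size_t. */
	while (c < n && c <= SIZE_MAX / 2)
		c <<= 1;
	return c;
}

/*
 * Trim only when the rounded size of the packet is at most half of the
 * original allocation; otherwise the copy saves little or no memory.
 */
static bool worth_trimming(size_t orig_alloc, size_t pkt_len)
{
	return slab_round_up(pkt_len) <= orig_alloc / 2;
}
```

Under this model a 210-byte packet in a 2048-byte buffer rounds to 256 bytes and is worth trimming, while a 1500-byte packet rounds back up to 2048 bytes and is left alone, since copying it would free nothing.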
* Re: QUESTION: can netdev_alloc_skb() errors be reduced by tuning?
2009-06-16 6:12 ` Eric Dumazet
@ 2009-07-05 3:44 ` Herbert Xu
0 siblings, 0 replies; 20+ messages in thread
From: Herbert Xu @ 2009-07-05 3:44 UTC (permalink / raw)
To: Eric Dumazet
Cc: starlight, linux-kernel, mel, linux-mm, hugh.dickins,
Lee.Schermerhorn, kosaki.motohiro, ebmunson, agl, apw, wli, netdev
Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
> Because of slab rounding, this reallocation should be done only if resulting data
> portion is really smaller (50 %) than original skb.

If we're going to do this in the core then we should only do it
in the spots where the packet may be held indefinitely.

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
* Re: QUESTION: can netdev_alloc_skb() errors be reduced by tuning?
2009-06-16 0:19 ` QUESTION: can netdev_alloc_skb() errors be reduced by tuning? starlight
2009-06-16 2:26 ` Eric Dumazet
@ 2009-06-16 9:19 ` Mel Gorman
2009-06-16 15:25 ` starlight
1 sibling, 1 reply; 20+ messages in thread
From: Mel Gorman @ 2009-06-16 9:19 UTC (permalink / raw)
To: starlight
Cc: linux-kernel, linux-mm, hugh.dickins, Lee.Schermerhorn,
kosaki.motohiro, ebmunson, agl, apw, wli
On Mon, Jun 15, 2009 at 08:19:33PM -0400, starlight@binnacle.cx wrote:
> Hello,
>
> I submitted testcase for a hugepages bug that has been
> successfully resolved. Have an apparently obscure question
> related to MM, and so I am asking anyone who might have some idea
> on this. Nothing much turned up via Google and digging into
> the KMEM code looks daunting.
>
> Running Intel 82598/ixgbe 10 gig Ethernet under heavy stress.
> Generally is working well after tuning IRQ affinities, but a
> fair number of buffer allocation failures are occurring in the
> 'ixgbe' device driver and are reported via 'ethtool' statistics.
> This may be causing data loss.
>

Can you give an example of an allocation failure? Specifically, I want to
see what sort of allocation it was and what order.

For reliable protocols, an allocation failure should recover and the
data get through but obviously there is a drop in network performance
when this happens.

> The kernel primitive returning the error is netdev_alloc_skb().
>
> Are any tuneable parameters available that can reduce or
> eliminate these allocation failures? Have about eleven
> gigabytes of free memory, though most of that is consumed
> by non-dirty file cache data. Total system memory is 16GB with
> 4GB allocated to hugepages. Zero swap usage and activity though
> swap is enabled. Most application memory is hugepage or is
> 'mlock()'ed.
>

If the allocations are high-order and atomic, increasing min_free_kbytes
can help, particularly in situations where there is a burst of network
traffic. I won't know if they are atomic until I see an error message
though.

> Thank you.
>
>
>
>
>
> System rebooted before test run.
>
> Dual Xeon E5430, 16GB FB-DIMM RAM.
>
>
> $ cat /proc/meminfo
> MemTotal:     16443828 kB
> MemFree:        281176 kB
> Buffers:         53896 kB
> Cached:       11331924 kB
> SwapCached:          0 kB
> Active:         200740 kB
> Inactive:     11284312 kB
> HighTotal:           0 kB
> HighFree:            0 kB
> LowTotal:     16443828 kB
> LowFree:        281176 kB
> SwapTotal:     2031608 kB
> SwapFree:      2031400 kB
> Dirty:               4 kB
> Writeback:           0 kB
> AnonPages:      104464 kB
> Mapped:          14644 kB
> Slab:           440452 kB
> PageTables:       4032 kB
> NFS_Unstable:        0 kB
> Bounce:              0 kB
> CommitLimit:   8156368 kB
> Committed_AS:   122452 kB
> VmallocTotal: 34359738367 kB
> VmallocUsed:    266872 kB
> VmallocChunk: 34359471043 kB
> HugePages_Total:  2048
> HugePages_Free:    735
> HugePages_Rsvd:      0
> Hugepagesize:     2048 kB
>
>
> # ethtool -S eth2 | egrep -v ': 0$'
> NIC statistics:
>      rx_packets: 724246449
>      tx_packets: 229847
>      rx_bytes: 152691992335
>      tx_bytes: 10573426
>      multicast: 725997241
>      broadcast: 6
>      rx_csum_offload_good: 723051776
>      alloc_rx_buff_failed: 7119
>      tx_queue_0_packets: 229847
>      tx_queue_0_bytes: 10573426
>      rx_queue_0_packets: 340698332
>      rx_queue_0_bytes: 70844299683
>      rx_queue_1_packets: 385298923
>      rx_queue_1_bytes: 82276167594
>
>
> ixgbe driver fragment
> =====================
> struct sk_buff *skb = netdev_alloc_skb(adapter->netdev, bufsz);
>
> if (!skb) {
>         adapter->alloc_rx_buff_failed++;
>         goto no_buffers;
> }
>

--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
* Re: QUESTION: can netdev_alloc_skb() errors be reduced by tuning?
2009-06-16 9:19 ` Mel Gorman
@ 2009-06-16 15:25 ` starlight
0 siblings, 0 replies; 20+ messages in thread
From: starlight @ 2009-06-16 15:25 UTC (permalink / raw)
To: Mel Gorman
Cc: linux-kernel, linux-mm, hugh.dickins, Lee.Schermerhorn,
kosaki.motohiro, ebmunson, agl, apw, wli
At 10:19 AM 6/16/2009 +0100, Mel Gorman wrote:
>Can you give an example of an allocation failure? Specifically, I want to
>see what sort of allocation it was and what order.

I think it's just the basic buffer allocation for Ethernet frames
arriving in the 'ixgbe' driver. Seems like it's one allocation
per frame. Per the original message the allocations are made
with the 'netdev_alloc_skb()' kernel call. The function where
this code appears is named 'ixgbe_alloc_rx_buffers()' and the
comment is "Replace used receive buffers."

The code path in question does not generate an error. It just
increments the 'alloc_rx_buff_failed' counter for the ethX
device. In addition it appears that the frame is dropped only
if the PCIe hardware ring-queue associated with each interface
is full. So on the next interrupt the allocation is retried
and appears to be successful 99% of the time.

>For reliable protocols, an allocation failure should recover and the
>data get through but obviously there is a drop in network performance
>when this happens.

This is for a specialized high-volume UDP multicast application
where data loss of any kind is unacceptable.

>If the allocations are high-order and atomic, increasing min_free_kbytes
>can help, particularly in situations where there is a burst of network
>traffic. I won't know if they are atomic until I see an error message
>though.

Doesn't the use of the 'netdev_alloc_skb()' kernel primitive imply
what the nature of the allocation is? I followed the call graph
down into "kmem" land, but it's a complex place and so I
abandoned the review.

My impression is that 'min_free_kbytes' relates mainly to
systems where significant paging pressure exists. The servers
have zero paging pressure and lots of free memory, though mostly
in the form of instantly discardable file data cache pages.
In the past disabling the program that generates the cache
pressure has had no effect on data loss, though I haven't tried
it in relation to this specific issue.

Tried increasing a few /proc/slabinfo tuneable parameters today
and this appears to have fixed the issue so far today.
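The refill behaviour described in this message (count the failure, stop, retry on the next interrupt, and lose a frame only when no buffer is available for it) can be modelled with a short userspace C sketch. It is a reading of the description above, not the actual 'ixgbe' code: `struct rx_ring`, `refill()`, `rx_frame()` and the `budget_alloc()` stand-in allocator are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one receive ring. */
struct rx_ring {
	unsigned int size;          /* number of buffer slots in the ring */
	unsigned int posted;        /* slots that currently hold a buffer */
	unsigned long alloc_failed; /* models alloc_rx_buff_failed */
};

/* Allocator callback; returns NULL on failure. */
typedef void *(*alloc_fn)(void *ctx);

/*
 * Stand-in allocator for experiments: succeeds *budget times, then
 * fails, which models a temporary shortage of memory.
 */
static void *budget_alloc(void *ctx)
{
	unsigned int *budget = ctx;

	if (*budget == 0)
		return NULL;
	(*budget)--;
	return ctx; /* any non-NULL pointer represents a buffer here */
}

/*
 * Replace used receive buffers. On the first failed allocation the
 * counter is bumped and the loop stops (the "goto no_buffers" case);
 * the empty slots are retried on the next call.
 */
static unsigned int refill(struct rx_ring *r, alloc_fn alloc, void *ctx)
{
	unsigned int added = 0;

	while (r->posted < r->size) {
		if (!alloc(ctx)) {
			r->alloc_failed++;
			break;
		}
		r->posted++;
		added++;
	}
	return added;
}

/* A frame arrives: it is lost only if no buffer is posted for it. */
static bool rx_frame(struct rx_ring *r)
{
	if (r->posted == 0)
		return false;
	r->posted--;
	return true;
}
```

In this model a failed allocation costs nothing as long as the ring still holds enough posted buffers to absorb traffic until the next successful refill, which matches the observation that the retry succeeds most of the time.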
* Re: [PATCH 0/2] Fixes for hugetlbfs-related problems on shared memory
2009-05-27 20:14 ` [PATCH 0/2] Fixes for hugetlbfs-related problems on shared memory Andrew Morton
2009-05-27 23:19 ` Ingo Molnar
@ 2009-05-28 8:56 ` Mel Gorman
1 sibling, 0 replies; 20+ messages in thread
From: Mel Gorman @ 2009-05-28 8:56 UTC (permalink / raw)
To: Andrew Morton
Cc: mingo, stable, linux-mm, linux-kernel, hugh.dickins,
Lee.Schermerhorn, kosaki.motohiro, starlight, ebmunson, agl, apw, wli
On Wed, May 27, 2009 at 01:14:37PM -0700, Andrew Morton wrote:
> On Wed, 27 May 2009 12:12:27 +0100
> Mel Gorman <mel@csn.ul.ie> wrote:
>
> > The following two patches are required to fix problems reported by
> > starlight@binnacle.cx. The tests cases both involve two processes interacting
> > with shared memory segments backed by hugetlbfs.
>
> Thanks.
>
> Both of these address http://bugzilla.kernel.org/show_bug.cgi?id=13302, yes?
> I added that info to the changelogs, to close the loop.
>

Yes. I'm sorry, I should have included that information in the leader. I
had a niggling feeling I was forgetting something to add to the changelog
- this was it :)

> Ingo, I'd propose merging both these together rather than routing one
> via the x86 tree, OK?
>
> Question is: when? Are we confident enough to merge it into 2.6.30
> now, or should we hold off for 2.6.30.1? I guess we have a week or
> more, and if the changes do break something, we can fix that in
> 2.6.30.1 ;)
>

FWIW, I'm reasonably confident based on libhugetlbfs regression testing
that I haven't broken something new. If they make it into 2.6.30-rc8,
so much the better.

Thanks.

--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
* Re: [PATCH 0/2] Fixes for hugetlbfs-related problems on shared memory
2009-05-27 11:12 [PATCH 0/2] Fixes for hugetlbfs-related problems on shared memory Mel Gorman
` (2 preceding siblings ...)
2009-05-27 20:14 ` [PATCH 0/2] Fixes for hugetlbfs-related problems on shared memory Andrew Morton
@ 2009-06-08 1:25 ` starlight
2009-06-08 10:24 ` Mel Gorman
3 siblings, 1 reply; 20+ messages in thread
From: starlight @ 2009-06-08 1:25 UTC (permalink / raw)
To: Mel Gorman, Ingo Molnar, Andrew Morton, stable,
Linux Memory Management List
Cc: Linux Kernel Mailing List, Hugh Dickins, Lee Schermerhorn,
KOSAKI Motohiro, Eric B Munson, Adam Litke, Andy Whitcroft, wli
Mel,

Tried out the two new patches on 2.6.26.4 and everything is
working now. The application that uncovered the issue works
perfectly and hugepages function sanely.

Thank you for the fix.

Regards
* Re: [PATCH 0/2] Fixes for hugetlbfs-related problems on shared memory
2009-06-08 1:25 ` starlight
@ 2009-06-08 10:24 ` Mel Gorman
0 siblings, 0 replies; 20+ messages in thread
From: Mel Gorman @ 2009-06-08 10:24 UTC (permalink / raw)
To: starlight
Cc: Ingo Molnar, Andrew Morton, stable, Linux Memory Management List,
Linux Kernel Mailing List, Hugh Dickins, Lee Schermerhorn,
KOSAKI Motohiro, Eric B Munson, Adam Litke, Andy Whitcroft, wli
On Sun, Jun 07, 2009 at 09:25:06PM -0400, starlight@binnacle.cx wrote:
> Mel,
>
> Tried out the two new patches on 2.6.26.4 and everything is
> working now. The application that uncovered the issue works
> perfectly and hugepages function sanely.
>

Very cool. Thanks for testing.

> Thank you for the fix.
>

Thank you for persisting with the problem and coming up with the test
cases that reproduce it. Without both, this fix would not have been
forthcoming. It's very much appreciated.

--
Mel Gorman
Part-time Phd Student                          Linux Technology Center
University of Limerick                         IBM Dublin Software Lab
* Re: QUESTION: can netdev_alloc_skb() errors be reduced by tuning?
@ 2009-06-16 17:24 starlight
0 siblings, 0 replies; 20+ messages in thread
From: starlight @ 2009-06-16 17:24 UTC (permalink / raw)
To: Mel Gorman
Cc: linux-kernel, linux-mm, hugh.dickins, Lee.Schermerhorn,
kosaki.motohiro, ebmunson, agl, apw, wli
>Tried increasing a few /proc/slabinfo tuneable parameters today
>and this appears to have fixed the issue so far today.
Spoke too soon. A burst of allocation failures appeared
and some incoming data was lost. The 'e1000e' system had
no problem.
end of thread, other threads:[~2009-07-05 3:19 UTC | newest]
Thread overview: 20+ messages
2009-05-27 11:12 [PATCH 0/2] Fixes for hugetlbfs-related problems on shared memory Mel Gorman
2009-05-27 11:12 ` [PATCH 1/2] x86: Ignore VM_LOCKED when determining if hugetlb-backed page tables can be shared or not Mel Gorman
2009-05-27 16:38 ` Eric B Munson
2009-05-27 23:18 ` Ingo Molnar
2009-05-28 8:55 ` Mel Gorman
2009-05-27 11:12 ` [PATCH 2/2] mm: Account for MAP_SHARED mappings using VM_MAYSHARE and not VM_SHARED in hugetlbfs Mel Gorman
2009-05-27 16:40 ` Eric B Munson
2009-05-27 20:14 ` [PATCH 0/2] Fixes for hugetlbfs-related problems on shared memory Andrew Morton
2009-05-27 23:19 ` Ingo Molnar
2009-06-16 0:19 ` QUESTION: can netdev_alloc_skb() errors be reduced by tuning? starlight
2009-06-16 2:26 ` Eric Dumazet
2009-06-16 4:12 ` starlight
2009-06-16 6:12 ` Eric Dumazet
2009-07-05 3:44 ` Herbert Xu
2009-06-16 9:19 ` Mel Gorman
2009-06-16 15:25 ` starlight
2009-05-28 8:56 ` [PATCH 0/2] Fixes for hugetlbfs-related problems on shared memory Mel Gorman
2009-06-08 1:25 ` starlight
2009-06-08 10:24 ` Mel Gorman
2009-06-16 17:24 QUESTION: can netdev_alloc_skb() errors be reduced by tuning? starlight