* Re: FW: [PATCH 0/3] Demand faulting for huge pages
       [not found] <B05667366EE6204181EABE9C1B1C0EB5086AF0DF@scsmsx401.amr.corp.intel.com>
@ 2005-10-07 21:28 ` Rohit Seth
  2005-10-08  7:57   ` Chen, Kenneth W
  0 siblings, 1 reply; 7+ messages in thread
From: Rohit Seth @ 2005-10-07 21:28 UTC
To: hugh, agl, linux-kernel; +Cc: linux-mm, akpm

On Fri, 2005-10-07 at 10:47 -0700, Adam Litke wrote:
> If I were to spend time coding up a patch to remove truncation support
> for hugetlbfs, would it be something other people would want to see
> merged as well?

In its current form, there is very little use of hugetlb truncate
functionality.  Currently it only allows reducing the size of the
hugetlb backing file.

IMO it will be useful to keep and enhance this capability so that apps
can dynamically reduce or increase the size of backing files (for
example based on availability of memory at any time).

-rohit

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
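For readers following along, the shrink-or-grow use case Rohit describes might look roughly like this from user space. This is only a sketch under stated assumptions: the 2 MB huge page size and every sketch_* name are hypothetical, and nothing here comes from the thread's authors. The code runs on any file system. On hugetlbfs as it stood in 2.6.14-rc3, the grow case would fail with EINVAL, because of the check that Ken's patch in the next message removes.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption: 2 MB huge pages. The real size depends on the
 * architecture and kernel configuration. */
#define SKETCH_HPAGE_SIZE ((off_t)2 * 1024 * 1024)

/* Round a byte count up to the next huge page multiple, since the
 * kernel code in this thread expects an aligned truncate offset. */
static off_t sketch_round_up_hpage(off_t size)
{
	assert(size >= 0);
	return (size + SKETCH_HPAGE_SIZE - 1) / SKETCH_HPAGE_SIZE
		* SKETCH_HPAGE_SIZE;
}

/* Open a backing file, creating it if needed. On a real system the
 * path would live under a hugetlbfs mount point. */
static int sketch_open_backing_file(const char *path)
{
	return open(path, O_RDWR | O_CREAT, 0600);
}

/* Shrink or grow the backing file to the rounded-up size.
 * Returns the resulting file size, or -1 with errno set. */
static off_t sketch_resize_backing_file(int fd, off_t wanted)
{
	struct stat st;

	if (ftruncate(fd, sketch_round_up_hpage(wanted)) != 0)
		return -1;
	if (fstat(fd, &st) != 0)
		return -1;
	return st.st_size;
}
```

A caller would open the file once and then call sketch_resize_backing_file() whenever its memory budget changes, treating a -1 return with errno set to EINVAL as "this kernel does not support the resize".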
* RE: FW: [PATCH 0/3] Demand faulting for huge pages
  2005-10-07 21:28 ` FW: [PATCH 0/3] Demand faulting for huge pages Rohit Seth
@ 2005-10-08  7:57   ` Chen, Kenneth W
  2005-10-09 12:27     ` Hugh Dickins
  0 siblings, 1 reply; 7+ messages in thread
From: Chen, Kenneth W @ 2005-10-08  7:57 UTC
To: Seth, Rohit, hugh, agl, linux-kernel; +Cc: linux-mm, akpm

Rohit Seth wrote on Friday, October 07, 2005 2:29 PM
> On Fri, 2005-10-07 at 10:47 -0700, Adam Litke wrote:
> > If I were to spend time coding up a patch to remove truncation
> > support for hugetlbfs, would it be something other people would
> > want to see merged as well?
>
> In its current form, there is very little use of hugetlb truncate
> functionality.  Currently it only allows reducing the size of the
> hugetlb backing file.
>
> IMO it will be useful to keep and enhance this capability so that
> apps can dynamically reduce or increase the size of backing files
> (for example based on availability of memory at any time).

Yup, here is a patch to enhance that capability.  It brings ftruncate
on hugetlbfs files a step closer to the semantics for files on other
file systems.

---
Add expanding ftruncate to hugetlbfs.

Signed-off-by: Ken Chen <kenneth.w.chen@intel.com>

--- linux-2.6.14-rc3/fs/hugetlbfs/inode.c.orig	2005-10-07 18:07:38.131373873 -0700
+++ linux-2.6.14-rc3/fs/hugetlbfs/inode.c	2005-10-08 00:31:15.951404405 -0700
@@ -327,20 +327,20 @@ hugetlb_vmtruncate_list(struct prio_tree
 	}
 }

-/*
- * Expanding truncates are not allowed.
- */
 static int hugetlb_vmtruncate(struct inode *inode, loff_t offset)
 {
 	unsigned long pgoff;
 	struct address_space *mapping = inode->i_mapping;
-
-	if (offset > inode->i_size)
-		return -EINVAL;
+	struct vm_area_struct *vma;
+	struct prio_tree_iter iter;
+	int ret = 0;

 	BUG_ON(offset & ~HPAGE_MASK);
 	pgoff = offset >> HPAGE_SHIFT;

+	if (offset > inode->i_size)
+		goto do_expand;
+
 	inode->i_size = offset;
 	spin_lock(&mapping->i_mmap_lock);
 	if (!prio_tree_empty(&mapping->i_mmap))
@@ -348,6 +348,18 @@ static int hugetlb_vmtruncate(struct ino
 	spin_unlock(&mapping->i_mmap_lock);
 	truncate_hugepages(mapping, offset);
 	return 0;
+
+do_expand:
+	spin_lock(&mapping->i_mmap_lock);
+	vma_prio_tree_foreach(vma, &iter, &mapping->i_mmap, pgoff, ULONG_MAX) {
+		ret = hugetlb_prefault(mapping, vma);
+		if (ret == 0)
+			inode->i_size = offset;
+		else
+			break;
+	}
+	spin_unlock(&mapping->i_mmap_lock);
+	return ret;
 }

 static int hugetlbfs_setattr(struct dentry *dentry, struct iattr *attr)
--- linux-2.6.14-rc3/mm/hugetlb.c.orig	2005-10-07 23:16:42.789349826 -0700
+++ linux-2.6.14-rc3/mm/hugetlb.c	2005-10-07 23:25:04.175085872 -0700
@@ -340,7 +340,7 @@ void zap_hugepage_range(struct vm_area_s

 int hugetlb_prefault(struct address_space *mapping, struct vm_area_struct *vma)
 {
-	struct mm_struct *mm = current->mm;
+	struct mm_struct *mm = vma->vm_mm;
 	unsigned long addr;
 	int ret = 0;

@@ -360,6 +360,8 @@ int hugetlb_prefault(struct address_spac
 			ret = -ENOMEM;
 			goto out;
 		}
+		if (pte_present(*pte))
+			continue;

 		idx = ((addr - vma->vm_start) >> HPAGE_SHIFT)
 			+ (vma->vm_pgoff >> (HPAGE_SHIFT - PAGE_SHIFT));
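The offset arithmetic in the patch above can be tried out in user space. The following is a minimal sketch, not kernel code: the shift values assume 4 KB base pages and 2 MB huge pages, and the sketch_* names are hypothetical stand-ins that mirror the BUG_ON() alignment check, the pgoff computation in hugetlb_vmtruncate(), and the idx computation in hugetlb_prefault().

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions: 4 KB base pages and 2 MB huge pages. The kernel takes
 * these values from the architecture; they are fixed here only to
 * make the arithmetic concrete. */
#define SKETCH_PAGE_SHIFT  12
#define SKETCH_HPAGE_SHIFT 21
#define SKETCH_HPAGE_SIZE  (UINT64_C(1) << SKETCH_HPAGE_SHIFT)
#define SKETCH_HPAGE_MASK  (~(SKETCH_HPAGE_SIZE - 1))

/* Mirrors the condition behind BUG_ON(offset & ~HPAGE_MASK):
 * the truncate offset must be a multiple of the huge page size. */
static int sketch_hpage_aligned(uint64_t offset)
{
	return (offset & ~SKETCH_HPAGE_MASK) == 0;
}

/* Mirrors pgoff = offset >> HPAGE_SHIFT: the byte offset expressed
 * in units of huge pages. */
static uint64_t sketch_hpage_pgoff(uint64_t offset)
{
	assert(sketch_hpage_aligned(offset));
	return offset >> SKETCH_HPAGE_SHIFT;
}

/* Mirrors the idx computation in hugetlb_prefault(): the huge page
 * index in the file for a faulting address. vm_pgoff is counted in
 * base pages, so it is scaled down to huge page units. */
static uint64_t sketch_hpage_index(uint64_t addr, uint64_t vm_start,
				   uint64_t vm_pgoff)
{
	return ((addr - vm_start) >> SKETCH_HPAGE_SHIFT)
		+ (vm_pgoff >> (SKETCH_HPAGE_SHIFT - SKETCH_PAGE_SHIFT));
}
```

For example, under these assumptions an address 6 MB into a mapping whose vm_pgoff is 1024 base pages (4 MB into the file) lands on huge page index 3 + 2 = 5.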
* RE: FW: [PATCH 0/3] Demand faulting for huge pages
  2005-10-08  7:57 ` Chen, Kenneth W
@ 2005-10-09 12:27   ` Hugh Dickins
  2005-10-10  6:51     ` Chen, Kenneth W
  2005-10-10 16:17     ` Adam Litke
  0 siblings, 2 replies; 7+ messages in thread
From: Hugh Dickins @ 2005-10-09 12:27 UTC
To: Chen, Kenneth W
Cc: Seth, Rohit, William Irwin, agl, linux-kernel, linux-mm, akpm

On Sat, 8 Oct 2005, Chen, Kenneth W wrote:
> Rohit Seth wrote on Friday, October 07, 2005 2:29 PM
> > On Fri, 2005-10-07 at 10:47 -0700, Adam Litke wrote:
> > > If I were to spend time coding up a patch to remove truncation
> > > support for hugetlbfs, would it be something other people would
> > > want to see merged as well?
> >
> > In its current form, there is very little use of hugetlb truncate
> > functionality.  Currently it only allows reducing the size of the
> > hugetlb backing file.

And is that functionality actually used?

> > IMO it will be useful to keep and enhance this capability so that
> > apps can dynamically reduce or increase the size of backing files
> > (for example based on availability of memory at any time).

And is that functionality actually being asked for?

> Yup, here is a patch to enhance that capability.  It brings ftruncate
> on hugetlbfs files a step closer to the semantics for files on other
> file systems.

Well, it's peculiar semantics that extending a file slots its pages
into existing mmaps, as in your patch.  Though that may indeed match
the existing prefault semantics for hugetlb mmaps and files.  But in
those existing peculiar semantics, the file can already be extended,
by mmapping further, so you're not really adding new capability.

But please don't expect me to decide one way or another.  We all seem
to have different agendas for hugetlb.  I'm interested in fixing the
existing bugs with truncation (see -mm), and getting the locking to
fit with my page_table_lock patches.  Prohibiting truncation is an
attractively easy and efficient way of fixing several such problems.
Adam is interested in fault on demand, which needs further work if
truncation is allowed.  You and Rohit are interested in enhancing
the generality of hugetlbfs.

I'd imagine supporting "read" and "write" would be the first priorities
if you were really trying to make hugetlbfs more like an ordinary fs.
But I thought it was intentionally kept at the minimum to do its job.

Hugh
* RE: FW: [PATCH 0/3] Demand faulting for huge pages
  2005-10-09 12:27 ` Hugh Dickins
@ 2005-10-10  6:51   ` Chen, Kenneth W
  2005-10-10  9:32     ` Andi Kleen
  2005-10-10 16:17   ` Adam Litke
  1 sibling, 1 reply; 7+ messages in thread
From: Chen, Kenneth W @ 2005-10-10  6:51 UTC
To: 'Hugh Dickins'
Cc: Seth, Rohit, William Irwin, agl, linux-kernel, linux-mm, akpm

Hugh Dickins wrote on Sunday, October 09, 2005 5:27 AM
> We all seem
> to have different agendas for hugetlb.  I'm interested in fixing the
> existing bugs with truncation (see -mm), and getting the locking to
> fit with my page_table_lock patches.  Prohibiting truncation is an
> attractively easy and efficient way of fixing several such problems.
> Adam is interested in fault on demand, which needs further work if
> truncation is allowed.  You and Rohit are interested in enhancing
> the generality of hugetlbfs.

IMO, these three things do not contradict each other; they are
orthogonal.  Even though we may all be touching the same lines of
code, in the end everyone is working toward better and more robust
hugetlb code.

Demand paging is one aspect of enhancing the generality of hugetlb.
Intel initially proposed the feature 18 months ago [* see link below]
along with SGI.  Christoph Lameter at SGI scratched that subject in
Oct 2004.  And now, Adam at IBM is attempting it again.  There is a
growing need to make hugetlb easier to use, with more transparency in
using hugetlb pages, etc.  All of this requires the hugetlb code to
become more general, instead of losing functionality.

Granted, the patch I posted on expanding ftruncate will be replaced
once demand paging goes in.  I wanted to demonstrate that it is a
feature we should implement, instead of cutting back further on the
already thin functionality in hugetlbfs.  (With demand paging,
expanding ftruncate should be really easy and clean, instead of
"peculiar semantics" all because of prefaulting.)

- Ken

[*] http://marc.theaimsgroup.com/?l=linux-ia64&m=108189860401704&w=2
* Re: FW: [PATCH 0/3] Demand faulting for huge pages
  2005-10-10  6:51 ` Chen, Kenneth W
@ 2005-10-10  9:32   ` Andi Kleen
  0 siblings, 0 replies; 7+ messages in thread
From: Andi Kleen @ 2005-10-10  9:32 UTC
To: Chen, Kenneth W
Cc: 'Hugh Dickins', Seth, Rohit, William Irwin, agl, linux-kernel, linux-mm, akpm

On Monday 10 October 2005 08:51, Chen, Kenneth W wrote:
> Demand paging is one aspect of enhancing the generality of hugetlb.
> Intel initially proposed the feature 18 months ago [* see link below]
> along with SGI.  Christoph Lameter at SGI scratched that subject in
> Oct 2004.  And now, Adam at IBM is attempting it again.  There is a
> growing need to make hugetlb easier to use, with more transparency in
> using hugetlb pages, etc.  All of this requires the hugetlb code to
> become more general, instead of losing functionality.

It's also badly needed to make hugetlbfs NUMA policy aware.
mbind requires allocation on demand, because it runs after mmap
and cannot fix up the policy when the pages are already allocated.

> Granted, the patch I posted on expanding ftruncate will be replaced
> once demand paging goes in.  I wanted to demonstrate that it is a
> feature we should implement, instead of cutting back further on the
> already thin functionality in hugetlbfs.  (With demand paging,
> expanding ftruncate should be really easy and clean, instead of
> "peculiar semantics" all because of prefaulting.)

I would like to have it.  I remember hating to implement extending
truncate by hand when I did the test programs for the hugetlbfs
NUMA policy.

-Andi
* RE: FW: [PATCH 0/3] Demand faulting for huge pages
  2005-10-09 12:27 ` Hugh Dickins
  2005-10-10  6:51   ` Chen, Kenneth W
@ 2005-10-10 16:17   ` Adam Litke
  2005-10-11  3:10     ` Andrew Morton
  1 sibling, 1 reply; 7+ messages in thread
From: Adam Litke @ 2005-10-10 16:17 UTC
To: Hugh Dickins
Cc: Chen, Kenneth W, Seth, Rohit, William Irwin, linux-kernel, linux-mm, akpm

On Sun, 2005-10-09 at 13:27 +0100, Hugh Dickins wrote:
> On Sat, 8 Oct 2005, Chen, Kenneth W wrote:
> > Rohit Seth wrote on Friday, October 07, 2005 2:29 PM
> > > On Fri, 2005-10-07 at 10:47 -0700, Adam Litke wrote:
> > > > If I were to spend time coding up a patch to remove truncation
> > > > support for hugetlbfs, would it be something other people would
> > > > want to see merged as well?
> > >
> > > In its current form, there is very little use of hugetlb truncate
> > > functionality.  Currently it only allows reducing the size of the
> > > hugetlb backing file.
>
> And is that functionality actually used?
>
> > > IMO it will be useful to keep and enhance this capability so that
> > > apps can dynamically reduce or increase the size of backing files
> > > (for example based on availability of memory at any time).
>
> And is that functionality actually being asked for?
>
> > Yup, here is a patch to enhance that capability.  It brings ftruncate
> > on hugetlbfs files a step closer to the semantics for files on other
> > file systems.
>
> Well, it's peculiar semantics that extending a file slots its pages
> into existing mmaps, as in your patch.  Though that may indeed match
> the existing prefault semantics for hugetlb mmaps and files.  But in
> those existing peculiar semantics, the file can already be extended,
> by mmapping further, so you're not really adding new capability.
>
> But please don't expect me to decide one way or another.  We all seem
> to have different agendas for hugetlb.  I'm interested in fixing the
> existing bugs with truncation (see -mm), and getting the locking to
> fit with my page_table_lock patches.  Prohibiting truncation is an
> attractively easy and efficient way of fixing several such problems.
> Adam is interested in fault on demand, which needs further work if
> truncation is allowed.  You and Rohit are interested in enhancing
> the generality of hugetlbfs.
>
> I'd imagine supporting "read" and "write" would be the first priorities
> if you were really trying to make hugetlbfs more like an ordinary fs.
> But I thought it was intentionally kept at the minimum to do its job.

Honestly, I think there is an even more fundamental issue at hand.  If
the goal is transparent and flexible use of huge pages it seems to me
that there are two ways to go:

1) Continue with hugetlbfs and work to finish implementing all of the
operations (that make sense) properly (like read, write, truncate, etc).

2) Recognize that trying to use hugetlbfs files to transparently replace
normal memory is ultimately a hack.  Normal memory is not implemented as
a file system so using hugetlb pages here will always cause headaches as
implemented.  So work towards removing filesystem-like behaviour and
treating huge pages more like regular memory.

If we can all agree on 1 or 2 then it should be easier to make decisions
like this thread calls for.  I'll put my vote in for #2.

Thoughts?

--
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center
* Re: FW: [PATCH 0/3] Demand faulting for huge pages
  2005-10-10 16:17 ` Adam Litke
@ 2005-10-11  3:10   ` Andrew Morton
  0 siblings, 0 replies; 7+ messages in thread
From: Andrew Morton @ 2005-10-11  3:10 UTC
To: Adam Litke; +Cc: hugh, kenneth.w.chen, rohit.seth, wli, linux-kernel, linux-mm

Adam Litke <agl@us.ibm.com> wrote:
>
> Honestly, I think there is an even more fundamental issue at hand.  If
> the goal is transparent and flexible use of huge pages it seems to me
> that there are two ways to go:
>
> 1) Continue with hugetlbfs and work to finish implementing all of the
> operations (that make sense) properly (like read, write, truncate, etc).

hugetlbfs provides the API by which applications may obtain
hugetlb-page-backed memory.  In fact the filesystem didn't even exist in
the initial version of the patch - the first version used specific
syscalls to obtain the hugepage memory.

So.  Given that hugetlbfs is purely there as a means by which applications
can access (and share) hugepage memory, it doesn't make sense to flesh that
filesystem out any further.  IOW: no need for read() and write().

> 2) Recognize that trying to use hugetlbfs files to transparently replace
> normal memory is ultimately a hack.  Normal memory is not implemented as
> a file system so using hugetlb pages here will always cause headaches as
> implemented.  So work towards removing filesystem-like behaviour and
> treating huge pages more like regular memory.

Early Linus diktat was that we shouldn't attempt to make the core MM aware
of multiple page sizes in the manner which you suggest.  Trying to sneak
this in via "improved integration of hugepage support" would likely create
a mess.

The design approach for hugepage integration was that the MM would continue
to be focussed on a fixed page size and that hugepages would be some
non-intrusive thing off to the side - more like a mmappable device driver
than some core part of the MM system.

This is not all meant to say "don't do it".  But I am saying that you'll
need to review several years' worth of discussion on the topic and
understand the downsides and objections, and be prepared for a big project.
One which risks causing Hugh a ton of grief in ongoing core MM improvements.

Aside: one problem with the kernel's hugepage support is that it doesn't
have a single person who performs the overall maintenance function.  Bill
Irwin was doing this for a while, but now seems to have gone quiet.
Consequently various people come in and attempt various
this-is-a-change-i-need operations.  Problem is, with no single person
keeping track of who the affected stakeholders are, and what the likely
effects of each change upon the stakeholders will be, things proceed slowly
and various people end up maintaining various out-of-tree things (I think).

I attempt to plug the gaps, but the intervals between flurries of hugetlb
activity are long and I forget who's doing what.