* Re: news about IDE PIO HIGHMEM bug (was: Re: 2.6.9-mm1)
       [not found] <58cb370e041027074676750027@mail.gmail.com>
@ 2004-10-27 15:14 ` Jeff Garzik
  2004-10-27 15:52   ` Martin J. Bligh
  0 siblings, 1 reply; 13+ messages in thread
From: Jeff Garzik @ 2004-10-27 15:14 UTC
To: Linux Kernel, linux-mm
Cc: Bartlomiej Zolnierkiewicz, Randy.Dunlap, William Lee Irwin III, Jens Axboe

Bartlomiej Zolnierkiewicz wrote:
> We have struct page of the first page and an offset.
> We need to obtain struct page of the current page and map it.

Opening this question to a wider audience.

struct scatterlist gives us struct page*, and an offset+length pair. The struct page* is the _starting_ page of a potentially multi-page run of data.

The question: how does one get struct page* for the second, and successive pages in a known-contiguous multi-page run, if one only knows the first page?

Jeff

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majordomo@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"aart@kvack.org"> aart@kvack.org </a>

* Re: news about IDE PIO HIGHMEM bug (was: Re: 2.6.9-mm1)
  2004-10-27 15:14 ` news about IDE PIO HIGHMEM bug (was: Re: 2.6.9-mm1) Jeff Garzik
@ 2004-10-27 15:52   ` Martin J. Bligh
  2004-10-27 15:59     ` Jeff Garzik
  2004-10-27 16:01     ` Martin J. Bligh
  0 siblings, 2 replies; 13+ messages in thread
From: Martin J. Bligh @ 2004-10-27 15:52 UTC
To: Jeff Garzik, Linux Kernel, linux-mm
Cc: Bartlomiej Zolnierkiewicz, Randy.Dunlap, William Lee Irwin III, Jens Axboe

> Bartlomiej Zolnierkiewicz wrote:
>> We have struct page of the first page and an offset.
>> We need to obtain struct page of the current page and map it.
>
> Opening this question to a wider audience.
>
> struct scatterlist gives us struct page*, and an offset+length pair. The struct page* is the _starting_ page of a potentially multi-page run of data.
>
> The question: how does one get struct page* for the second, and successive pages in a known-contiguous multi-page run, if one only knows the first page?

If it's a higher order allocation, just page+1 should be safe. If it just happens to be contig, it might cross a discontig boundary, and not obey that rule. Very unlikely, but possible.

M.

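The distinction Martin draws between "page+1" and a run that crosses a discontig boundary can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct page` here is a toy stand-in, and `toy_mem` and every `toy_*` function are hypothetical names invented for the illustration, not kernel interfaces. The model places two per-node page arrays apart in memory, so that stepping the pointer and stepping the page frame number disagree exactly at the node boundary.

```c
/* Hypothetical stand-in for the kernel's struct page. */
struct page {
    unsigned long flags;
};

#define TOY_PAGES_PER_NODE 4UL

/*
 * Two per-node page arrays that are NOT adjacent in memory, as can happen
 * with discontiguous memory. Node 0 holds pfns 0..3, node 1 holds pfns 4..7.
 * The gap member guarantees that the two arrays never touch.
 */
static struct {
    struct page node0[TOY_PAGES_PER_NODE];
    struct page gap[16];
    struct page node1[TOY_PAGES_PER_NODE];
} toy_mem;

/* Map a page frame number to its descriptor; NULL if out of range. */
static struct page *toy_pfn_to_page(unsigned long pfn)
{
    if (pfn < TOY_PAGES_PER_NODE)
        return &toy_mem.node0[pfn];
    if (pfn < 2 * TOY_PAGES_PER_NODE)
        return &toy_mem.node1[pfn - TOY_PAGES_PER_NODE];
    return 0;
}

/* Inverse mapping; the caller must pass a descriptor from node0 or node1. */
static unsigned long toy_page_to_pfn(const struct page *page)
{
    if (page < toy_mem.node0 + TOY_PAGES_PER_NODE)
        return (unsigned long)(page - toy_mem.node0);
    return TOY_PAGES_PER_NODE + (unsigned long)(page - toy_mem.node1);
}

/* "page + 1" style: plain pointer arithmetic on the descriptor. */
static struct page *toy_next_by_pointer(struct page *page)
{
    return page + 1;
}

/* Round trip through the pfn, as in pfn_to_page(page_to_pfn(page) + n). */
static struct page *toy_nth_page(struct page *page, unsigned long n)
{
    return toy_pfn_to_page(toy_page_to_pfn(page) + n);
}
```

Inside one node both helpers return the same descriptor. Starting from the last page of node 0, only the pfn round trip lands on the first descriptor of node 1; the pointer step ends up in the gap.
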
* Re: news about IDE PIO HIGHMEM bug (was: Re: 2.6.9-mm1)
  2004-10-27 15:52   ` Martin J. Bligh
@ 2004-10-27 15:59     ` Jeff Garzik
  2004-10-27 17:36       ` Martin J. Bligh
  2004-10-27 16:01     ` Martin J. Bligh
  1 sibling, 1 reply; 13+ messages in thread
From: Jeff Garzik @ 2004-10-27 15:59 UTC
To: Martin J. Bligh
Cc: Linux Kernel, linux-mm, Bartlomiej Zolnierkiewicz, Randy.Dunlap, William Lee Irwin III, Jens Axboe, Andrew Morton

Martin J. Bligh wrote:
>>Bartlomiej Zolnierkiewicz wrote:
>>
>>>We have struct page of the first page and an offset.
>>>We need to obtain struct page of the current page and map it.
>>
>>
>>Opening this question to a wider audience.
>>
>>struct scatterlist gives us struct page*, and an offset+length pair. The struct page* is the _starting_ page of a potentially multi-page run of data.
>>
>>The question: how does one get struct page* for the second, and successive pages in a known-contiguous multi-page run, if one only knows the first page?
>
>
> If it's a higher order allocation, just page+1 should be safe. If it just
> happens to be contig, it might cross a discontig boundary, and not obey
> that rule. Very unlikely, but possible.

Unfortunately, it's not.

The block layer just tells us "it's a contiguous run of memory", which implies nothing really about the allocation size.

Bart and I (and others?) essentially need a "page+1" thing (for 2.4.x too!), that won't break in the face of NUMA/etc.

Alternatively (or additionally), we may need to make sure the block layer doesn't merge across zones or NUMA boundaries or whatnot.

Jeff

* Re: news about IDE PIO HIGHMEM bug (was: Re: 2.6.9-mm1)
  2004-10-27 15:59     ` Jeff Garzik
@ 2004-10-27 17:36       ` Martin J. Bligh
  0 siblings, 0 replies; 13+ messages in thread
From: Martin J. Bligh @ 2004-10-27 17:36 UTC
To: Jeff Garzik
Cc: Linux Kernel, linux-mm, Bartlomiej Zolnierkiewicz, Randy.Dunlap, William Lee Irwin III, Jens Axboe, Andrew Morton

> Unfortunately, it's not.
>
> The block layer just tells us "it's a contiguous run of memory", which implies nothing really about the allocation size.
>
> Bart and I (and others?) essentially need a "page+1" thing (for 2.4.x too!), that won't break in the face of NUMA/etc.
>
> Alternatively (or additionally), we may need to make sure the block layer doesn't merge across zones or NUMA boundaries or whatnot.

The latter would be rather more efficient. I don't know how often you end up doing each operation though ... the page+1 vs the attempted merge. Depends on the ratio, I guess.

M.

* Re: news about IDE PIO HIGHMEM bug (was: Re: 2.6.9-mm1)
  2004-10-27 15:52   ` Martin J. Bligh
  2004-10-27 15:59     ` Jeff Garzik
@ 2004-10-27 16:01     ` Martin J. Bligh
  2004-10-27 16:35       ` [PATCH] " Jeff Garzik
  2004-10-27 18:08       ` Christoph Hellwig
  1 sibling, 2 replies; 13+ messages in thread
From: Martin J. Bligh @ 2004-10-27 16:01 UTC
To: Jeff Garzik, Linux Kernel, linux-mm
Cc: Bartlomiej Zolnierkiewicz, Randy.Dunlap, William Lee Irwin III, Jens Axboe

--"Martin J. Bligh" <mbligh@aracnet.com> wrote (on Wednesday, October 27, 2004 08:52:39 -0700):

>> Bartlomiej Zolnierkiewicz wrote:
>>> We have struct page of the first page and an offset.
>>> We need to obtain struct page of the current page and map it.
>>
>>
>> Opening this question to a wider audience.
>>
>> struct scatterlist gives us struct page*, and an offset+length pair. The struct page* is the _starting_ page of a potentially multi-page run of data.
>>
>> The question: how does one get struct page* for the second, and successive pages in a known-contiguous multi-page run, if one only knows the first page?
>
> If it's a higher order allocation, just page+1 should be safe. If it just
> happens to be contig, it might cross a discontig boundary, and not obey
> that rule. Very unlikely, but possible.

To repeat what I said in IRC ... ;-)

Actually, you could check this with the pfns being the same when >> MAX_ORDER-1.
We should be aligned on a MAX_ORDER boundary, I think.

However, pfn_to_page(page_to_pfn(page) + 1) might be safer. If rather slower.

M.

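The check Martin describes, comparing pfns after shifting right by MAX_ORDER-1, might look roughly like the following. This is a sketch, not code from the thread: `TOY_MAX_ORDER` and `toy_same_max_order_block()` are hypothetical names, and the value 11 is an assumption chosen for illustration, since the real MAX_ORDER depends on kernel configuration.

```c
#include <stdbool.h>

/* Assumed value for illustration; the real MAX_ORDER is configuration dependent. */
#define TOY_MAX_ORDER 11

/*
 * True when both pfns fall inside the same naturally aligned block of
 * 2^(TOY_MAX_ORDER - 1) pages, that is, when they still agree after the
 * low TOY_MAX_ORDER - 1 bits have been shifted out.
 */
static bool toy_same_max_order_block(unsigned long first_pfn,
                                     unsigned long last_pfn)
{
    return (first_pfn >> (TOY_MAX_ORDER - 1)) ==
           (last_pfn >> (TOY_MAX_ORDER - 1));
}
```

Under that assumption a block is 1024 pages, so pfns 0 and 1023 pass the check while 1023 and 1024 do not. A caller that fails the check would fall back to the pfn_to_page(page_to_pfn(page) + 1) form from the same message.
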
* [PATCH] Re: news about IDE PIO HIGHMEM bug (was: Re: 2.6.9-mm1)
  2004-10-27 16:01     ` Martin J. Bligh
@ 2004-10-27 16:35       ` Jeff Garzik
  2004-10-27 21:29         ` Andrew Morton
  2004-10-27 18:08       ` Christoph Hellwig
  1 sibling, 1 reply; 13+ messages in thread
From: Jeff Garzik @ 2004-10-27 16:35 UTC
To: Martin J. Bligh, Andrew Morton
Cc: Linux Kernel, linux-mm, Bartlomiej Zolnierkiewicz, Randy.Dunlap, William Lee Irwin III, Jens Axboe

[-- Attachment #1: Type: text/plain, Size: 494 bytes --]

Martin J. Bligh wrote:
> To repeat what I said in IRC ... ;-)
>
> Actually, you could check this with the pfns being the same when >> MAX_ORDER-1.
> We should be aligned on a MAX_ORDER boundary, I think.
>
> However, pfn_to_page(page_to_pfn(page) + 1) might be safer. If rather slower.

Is this patch acceptable to everyone? Andrew?

It uses the publicly-exported pfn_to_page/page_to_pfn abstraction, which seems to be the only way to accomplish what we want to do in IDE/libata.

Jeff

[-- Attachment #2: patch --]
[-- Type: text/plain, Size: 401 bytes --]

===== include/linux/mm.h 1.193 vs edited =====
--- 1.193/include/linux/mm.h 2004-10-20 04:37:06 -04:00
+++ edited/include/linux/mm.h 2004-10-27 12:33:28 -04:00
@@ -41,6 +41,8 @@
 #define MM_VM_SIZE(mm) TASK_SIZE
 #endif
 
+#define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + n)
+
 /*
  * Linux kernel virtual memory manager primitives.
  * The idea being to have a "virtual" mm in the same way

* Re: [PATCH] Re: news about IDE PIO HIGHMEM bug (was: Re: 2.6.9-mm1)
  2004-10-27 16:35       ` [PATCH] " Jeff Garzik
@ 2004-10-27 21:29         ` Andrew Morton
  2004-10-27 21:31           ` Jeff Garzik
  2004-10-27 21:34           ` William Lee Irwin III
  0 siblings, 2 replies; 13+ messages in thread
From: Andrew Morton @ 2004-10-27 21:29 UTC
To: Jeff Garzik
Cc: mbligh, linux-kernel, linux-mm, bzolnier, rddunlap, wli, axboe

Jeff Garzik <jgarzik@pobox.com> wrote:
>
> > However, pfn_to_page(page_to_pfn(page) + 1) might be safer. If rather slower.
>
>
> Is this patch acceptable to everyone? Andrew?

spose so. The scatterlist API is being a bit silly there.

It might be worthwhile doing:

#ifdef CONFIG_DISCONTIGMEM
#define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + n)
#else
#define nth_page(page,n) ((page)+(n))
#endif

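One detail of the macro as posted is that its `n` argument is substituted without parentheses, whereas the `((page)+(n))` branch parenthesizes it. The sketch below shows what that can mean for a caller who passes an expression. It uses hypothetical names throughout: `toy_lookup()` stands in for pfn_to_page() and simply returns its argument, and both `TOY_NTH_*` macros are illustrations rather than proposed kernel code.

```c
/* Hypothetical stand-in for pfn_to_page(); returns its argument unchanged. */
static unsigned long toy_lookup(unsigned long pfn)
{
    return pfn;
}

/* Same shape as the posted macro: n is pasted in without parentheses. */
#define TOY_NTH_AS_POSTED(pfn, n)      toy_lookup((pfn) + n)

/* Variant with the argument parenthesized. */
#define TOY_NTH_PARENTHESIZED(pfn, n)  toy_lookup((pfn) + (n))
```

With a plain constant or variable for `n` the two forms agree. With an argument such as `1 << 2`, the first form expands to `(pfn) + 1 << 2`, which C parses as `((pfn) + 1) << 2` because `+` binds tighter than `<<`.
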
* Re: [PATCH] Re: news about IDE PIO HIGHMEM bug (was: Re: 2.6.9-mm1)
  2004-10-27 21:29         ` Andrew Morton
@ 2004-10-27 21:31           ` Jeff Garzik
  2004-10-27 21:34           ` William Lee Irwin III
  1 sibling, 0 replies; 13+ messages in thread
From: Jeff Garzik @ 2004-10-27 21:31 UTC
To: Andrew Morton
Cc: mbligh, linux-kernel, linux-mm, bzolnier, rddunlap, wli, axboe

On Wed, Oct 27, 2004 at 02:29:14PM -0700, Andrew Morton wrote:
> spose so. The scatterlist API is being a bit silly there.

Well, it depends on your perspective :)

Each scatterlist entry is supposed to map to a physical segment to be passed to h/w. Hardware S/G tables just want to see an addr/len pair, and don't care about machine page size. scatterlist follows a similar model.

dma_map_sg() and other helpers create a favorable situation, where >90% of the drivers don't have to care about the VM-size details.

Unfortunately those drivers that need to do their own data transfer (like ATA's PIO, instead of DMA) need direct access to each member of an s/g list.

Jeff

* Re: [PATCH] Re: news about IDE PIO HIGHMEM bug (was: Re: 2.6.9-mm1)
  2004-10-27 21:29         ` Andrew Morton
  2004-10-27 21:31           ` Jeff Garzik
@ 2004-10-27 21:34           ` William Lee Irwin III
  1 sibling, 0 replies; 13+ messages in thread
From: William Lee Irwin III @ 2004-10-27 21:34 UTC
To: Andrew Morton
Cc: Jeff Garzik, mbligh, linux-kernel, linux-mm, bzolnier, rddunlap, axboe

Jeff Garzik <jgarzik@pobox.com> wrote:
>> However, pfn_to_page(page_to_pfn(page) + 1) might be safer. If
>> rather slower. Is this patch acceptable to everyone? Andrew?

On Wed, Oct 27, 2004 at 02:29:14PM -0700, Andrew Morton wrote:
> spose so. The scatterlist API is being a bit silly there.
> It might be worthwhile doing:
> #ifdef CONFIG_DISCONTIGMEM
> #define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + n)
> #else
> #define nth_page(page,n) ((page)+(n))
> #endif

This is actually not quite good enough. Zones are not guaranteed to have adjacent mem_map[]'s even with CONFIG_DISCONTIGMEM=n. It may make sense to prevent merging from spanning zones, but frankly the overhead of the pfn_to_page()/page_to_pfn() is negligible in comparison to the data movement and (when applicable) virtual windowing, where in the merging code cpu overhead is a greater concern, particularly for devices that don't require manual data movement.

-- wli

* Re: news about IDE PIO HIGHMEM bug (was: Re: 2.6.9-mm1)
  2004-10-27 16:01     ` Martin J. Bligh
  2004-10-27 16:35       ` [PATCH] " Jeff Garzik
@ 2004-10-27 18:08       ` Christoph Hellwig
  2004-10-27 18:33         ` news about IDE PIO HIGHMEM bug Jeff Garzik
  1 sibling, 1 reply; 13+ messages in thread
From: Christoph Hellwig @ 2004-10-27 18:08 UTC
To: Martin J. Bligh
Cc: Jeff Garzik, Linux Kernel, linux-mm, Bartlomiej Zolnierkiewicz, Randy.Dunlap, William Lee Irwin III, Jens Axboe

> To repeat what I said in IRC ... ;-)
>
> Actually, you could check this with the pfns being the same when >> MAX_ORDER-1.
> We should be aligned on a MAX_ORDER boundary, I think.
>
> However, pfn_to_page(page_to_pfn(page) + 1) might be safer. If rather slower.

I think this is the wrong level of interface exposed. Just add two helpers, kmap_atomic_sg/kunmap_atomic_sg, that guarantee to map/unmap a sg list entry, even if it's bigger than a page.

* Re: news about IDE PIO HIGHMEM bug
  2004-10-27 18:08       ` Christoph Hellwig
@ 2004-10-27 18:33         ` Jeff Garzik
  2004-10-27 18:48           ` William Lee Irwin III
  2004-10-28  0:18           ` William Lee Irwin III
  0 siblings, 2 replies; 13+ messages in thread
From: Jeff Garzik @ 2004-10-27 18:33 UTC
To: Christoph Hellwig
Cc: Martin J. Bligh, Linux Kernel, linux-mm, Bartlomiej Zolnierkiewicz, Randy.Dunlap, William Lee Irwin III, Jens Axboe

Christoph Hellwig wrote:
>>To repeat what I said in IRC ... ;-)
>>
>>Actually, you could check this with the pfns being the same when >> MAX_ORDER-1.
>>We should be aligned on a MAX_ORDER boundary, I think.
>>
>>However, pfn_to_page(page_to_pfn(page) + 1) might be safer. If rather slower.
>
>
> I think this is the wrong level of interface exposed. Just add two helpers,
> kmap_atomic_sg/kunmap_atomic_sg, that guarantee to map/unmap a sg list entry,
> even if it's bigger than a page.

Why bother mapping anything larger than a page, when none of the users need it?

Jeff

P.S. In your scheme you would need four helpers; you forgot kmap_sg() and kunmap_sg().

* Re: news about IDE PIO HIGHMEM bug
  2004-10-27 18:33         ` news about IDE PIO HIGHMEM bug Jeff Garzik
@ 2004-10-27 18:48           ` William Lee Irwin III
  2004-10-28  0:18           ` William Lee Irwin III
  1 sibling, 0 replies; 13+ messages in thread
From: William Lee Irwin III @ 2004-10-27 18:48 UTC
To: Jeff Garzik
Cc: Christoph Hellwig, Linux Kernel, linux-mm, Bartlomiej Zolnierkiewicz, Randy.Dunlap, Jens Axboe, James Bottomley

Christoph Hellwig wrote:
>> I think this is the wrong level of interface exposed. Just add two helpers,
>> kmap_atomic_sg/kunmap_atomic_sg, that guarantee to map/unmap a sg list entry,
>> even if it's bigger than a page.

On Wed, Oct 27, 2004 at 02:33:45PM -0400, Jeff Garzik wrote:
> Why bother mapping anything larger than a page, when none of the users
> need it?
> P.S. In your scheme you would need four helpers; you forgot kmap_sg()
> and kunmap_sg().

This is all a non-issue. The page structure just represents little more than a physical address to the block layer in the context of merging, so the pfn_to_page(page_to_pfn(...) + ...) bits calculate this properly. There is just nothing interesting going on here. Generate the page structure for the piece of the segment, kmap_atomic() it, and it's done.

-- wli

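The procedure wli outlines (find the page for the current piece of the segment, map that one page, move the data) comes down to some offset arithmetic, sketched here in userspace C. Everything in it is an assumption made for illustration: the 4 KiB page size, the `toy_chunk` structure, and the `toy_*` functions are hypothetical, and the mapping step itself is only indicated by a comment because kmap_atomic() cannot be called outside the kernel.

```c
#define TOY_PAGE_SHIFT 12                     /* assumed 4 KiB pages */
#define TOY_PAGE_SIZE  (1UL << TOY_PAGE_SHIFT)

/* One single-page piece of a scatterlist segment. */
struct toy_chunk {
    unsigned long page_index; /* n for an nth_page()-style lookup */
    unsigned long offset;     /* byte offset inside that page */
    unsigned long len;        /* bytes to move within that page */
};

/*
 * Describe the piece that starts `done` bytes into a segment which begins
 * sg_offset bytes into its first page and is sg_len bytes long.
 * The caller must ensure done < sg_len.
 */
static struct toy_chunk toy_chunk_at(unsigned long sg_offset,
                                     unsigned long sg_len,
                                     unsigned long done)
{
    unsigned long pos = sg_offset + done;
    unsigned long left = sg_len - done;
    struct toy_chunk c;

    c.page_index = pos >> TOY_PAGE_SHIFT;
    c.offset = pos & (TOY_PAGE_SIZE - 1);
    c.len = TOY_PAGE_SIZE - c.offset;   /* room left in this page */
    if (c.len > left)
        c.len = left;                   /* do not run past the segment */
    return c;
}

/* Walk a whole segment and count the single-page mappings it needs. */
static unsigned long toy_count_chunks(unsigned long sg_offset,
                                      unsigned long sg_len)
{
    unsigned long done = 0, chunks = 0;

    while (done < sg_len) {
        struct toy_chunk c = toy_chunk_at(sg_offset, sg_len, done);

        /*
         * A driver would look up the page at index c.page_index from the
         * segment's first page, map it, move c.len bytes starting at
         * c.offset, and unmap it again.
         */
        done += c.len;
        chunks++;
    }
    return chunks;
}
```

For a segment that starts 512 bytes into its first page and is 8192 bytes long, the walk yields three pieces of 3584, 4096, and 512 bytes, so no mapping ever needs to cover more than one page.
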
* Re: news about IDE PIO HIGHMEM bug
  2004-10-27 18:33         ` news about IDE PIO HIGHMEM bug Jeff Garzik
  2004-10-27 18:48           ` William Lee Irwin III
@ 2004-10-28  0:18           ` William Lee Irwin III
  1 sibling, 0 replies; 13+ messages in thread
From: William Lee Irwin III @ 2004-10-28 0:18 UTC
To: Jeff Garzik
Cc: Christoph Hellwig, Martin J. Bligh, Linux Kernel, linux-mm, Bartlomiej Zolnierkiewicz, Randy.Dunlap, Jens Axboe

Christoph Hellwig wrote:
>> I think this is the wrong level of interface exposed. Just add two helpers,
>> kmap_atomic_sg/kunmap_atomic_sg, that guarantee to map/unmap a sg list entry,
>> even if it's bigger than a page.

On Wed, Oct 27, 2004 at 02:33:45PM -0400, Jeff Garzik wrote:
> Why bother mapping anything larger than a page, when none of the users
> need it?
> P.S. In your scheme you would need four helpers; you forgot kmap_sg()
> and kunmap_sg().

The scheme hch suggested is highly invasive in the area of architecture-specific fixmap layout and introduces a dependency of fixmap layout on maximum segment size, which may make current normal maximum segment sizes use prohibitive amounts of vmallocspace on 32-bit architectures. So I'd drop that suggestion, though it's not particularly farfetched.

-- wli

Thread overview: 13+ messages
     [not found] <58cb370e041027074676750027@mail.gmail.com>
2004-10-27 15:14 ` news about IDE PIO HIGHMEM bug (was: Re: 2.6.9-mm1) Jeff Garzik
2004-10-27 15:52   ` Martin J. Bligh
2004-10-27 15:59     ` Jeff Garzik
2004-10-27 17:36       ` Martin J. Bligh
2004-10-27 16:01     ` Martin J. Bligh
2004-10-27 16:35       ` [PATCH] " Jeff Garzik
2004-10-27 21:29         ` Andrew Morton
2004-10-27 21:31           ` Jeff Garzik
2004-10-27 21:34           ` William Lee Irwin III
2004-10-27 18:08       ` Christoph Hellwig
2004-10-27 18:33         ` news about IDE PIO HIGHMEM bug Jeff Garzik
2004-10-27 18:48           ` William Lee Irwin III
2004-10-28  0:18           ` William Lee Irwin III