* Re: + fs-break-generic_file_buffered_read-up-into-multiple-functions.patch added to -mm tree
  [not found] <20201025220817.XxXVE%akpm@linux-foundation.org>
@ 2020-10-27 13:35 ` Matthew Wilcox
  2020-10-28 22:22   ` Andrew Morton
  0 siblings, 1 reply; 10+ messages in thread
From: Matthew Wilcox @ 2020-10-27 13:35 UTC (permalink / raw)
  To: akpm; +Cc: linux-mm, axboe, kent.overstreet

On Sun, Oct 25, 2020 at 03:08:17PM -0700, akpm@linux-foundation.org wrote:
> The patch titled
>      Subject: mm/filemap/c: break generic_file_buffered_read up into multiple functions
> has been added to the -mm tree.  Its filename is
>      fs-break-generic_file_buffered_read-up-into-multiple-functions.patch

Can we back this out?  It really makes the THP patchset unhappy.  I think
we can do something like this afterwards, but doing it this way round is
ridiculously hard.

Long explanation ...

If we readahead a THP and hit a read error, the iomap code currently
tries to split the page to handle the error on a more fine-grained
boundary.  We can't split a page while someone else holds a reference
to it, and generic/273 has 500 threads livelocking on the same page,
all looking it up, trying to lock it, sleeping, eventually getting the
lock, discovering somebody else has the lock, and retrying.

I changed from using wait_on_page_locked_killable() to
put_and_wait_on_page_locked(), so the threads all sleep without holding
a reference to the page.  It worked great, and was a nice simplification:

 mm/filemap.c | 69 ++++++++++++++++++-------------------------------------------------------------
 1 file changed, 20 insertions(+), 49 deletions(-)

With this reorganisation, not only do we try to take the page lock a
long way down the chain, we've also got a batch of pages that we're
holding a refcount on, so we need to call put_page() on the rest of the
batch and then call put_and_wait_on_page_locked() on the one page that
we need to wait for.  I'm sure it's possible to make this work, but
it's going to be completely unlike what I've been testing.  I basically
have to start again on doing these changes.

commit 9fd2f26c6d00c7f0c6c730ec10a421780fedaa79
Author: Matthew Wilcox (Oracle) <willy@infradead.org>
Date:   Mon Oct 12 16:36:11 2020 -0400

    mm/filemap: Don't hold a page reference while waiting for unlock

    In the upcoming THP patch series, if we find a !Uptodate page, it is
    because of a read error.  In this case, we want to split the THP into
    smaller pages so we can handle the error in as granular a fashion as
    possible.  But xfstests generic/273 defeats this strategy by having
    500 threads all sleeping on the same page, each waiting for their
    turn to split the page.  None of them will ever succeed because
    splitting a page requires that you hold the only reference to it.

    To fix this, use put_and_wait_on_page_locked() to sleep without
    holding a reference to the page.  Each of the readers will then go
    back and retry the page lookup after the page is unlocked.

    Since we now get the page lock a little earlier in
    generic_file_buffered_read(), we can eliminate a number of duplicate
    checks.  The original intent (commit ebded02788b5 ("avoid unnecessary
    calls to lock_page when waiting for IO to complete during a read"))
    behind getting the page lock later was to avoid re-locking the page
    after it has been brought uptodate by another thread.  We will still
    avoid that because we go through the normal lookup path again after
    the winning thread has brought the page uptodate.

    Using the "fail 10% of readaheads" patch, which will induce the
    !Uptodate case, I can detect no significant difference in the
    generic/273 test from applying this patch.

    Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>

diff --git a/mm/filemap.c b/mm/filemap.c
index e496833d3ede..a6a2132087cd 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1358,14 +1358,6 @@ static int __wait_on_page_locked_async(struct page *page,
 	return ret;
 }
 
-static int wait_on_page_locked_async(struct page *page,
-				     struct wait_page_queue *wait)
-{
-	if (!PageLocked(page))
-		return 0;
-	return __wait_on_page_locked_async(compound_head(page), wait, false);
-}
-
 /**
  * put_and_wait_on_page_locked - Drop a reference and wait for it to be unlocked
  * @page: The page to wait for.
@@ -2279,34 +2271,37 @@ ssize_t generic_file_buffered_read(struct kiocb *iocb,
 				put_page(page);
 				goto out;
 			}
-			error = wait_on_page_locked_async(page,
-							iocb->ki_waitq);
+			error = lock_page_async(page, iocb->ki_waitq);
+			if (error)
+				goto readpage_error;
+		} else if (iocb->ki_flags & IOCB_NOWAIT) {
+			put_page(page);
+			goto would_block;
 		} else {
-			if (iocb->ki_flags & IOCB_NOWAIT) {
-				put_page(page);
-				goto would_block;
+			if (!trylock_page(page)) {
+				put_and_wait_on_page_locked(page,
+							TASK_KILLABLE);
+				continue;
 			}
-			error = wait_on_page_locked_killable(page);
 		}
-		if (unlikely(error))
-			goto readpage_error;
 		if (PageUptodate(page))
-			goto page_ok;
+			goto uptodate;
+		if (!page->mapping) {
+			unlock_page(page);
+			put_page(page);
+			continue;
+		}
 
 		if (inode->i_blkbits == PAGE_SHIFT ||
 				!mapping->a_ops->is_partially_uptodate)
-			goto page_not_up_to_date;
+			goto readpage;
 		/* pipes can't handle partially uptodate pages */
 		if (unlikely(iov_iter_is_pipe(iter)))
-			goto page_not_up_to_date;
-		if (!trylock_page(page))
-			goto page_not_up_to_date;
-		/* Did it get truncated before we got the lock? */
-		if (!page->mapping)
-			goto page_not_up_to_date_locked;
+			goto readpage;
 		if (!mapping->a_ops->is_partially_uptodate(page,
 						offset, iter->count))
-			goto page_not_up_to_date_locked;
+			goto readpage;
+uptodate:
 		unlock_page(page);
 	}
 page_ok:
@@ -2372,30 +2367,6 @@ ssize_t generic_file_buffered_read(struct kiocb *iocb,
 			goto out;
 		}
 		continue;
-
-page_not_up_to_date:
-		/* Get exclusive access to the page ... */
-		if (iocb->ki_flags & IOCB_WAITQ)
-			error = lock_page_async(page, iocb->ki_waitq);
-		else
-			error = lock_page_killable(page);
-		if (unlikely(error))
-			goto readpage_error;
-
-page_not_up_to_date_locked:
-		/* Did it get truncated before we got the lock? */
-		if (!page->mapping) {
-			unlock_page(page);
-			put_page(page);
-			continue;
-		}
-
-		/* Did somebody else fill it already? */
-		if (PageUptodate(page)) {
-			unlock_page(page);
-			goto page_ok;
-		}
-
 readpage:
 		if (iocb->ki_flags & (IOCB_NOIO | IOCB_NOWAIT)) {
 			unlock_page(page);

^ permalink raw reply	[flat|nested] 10+ messages in thread
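The heart of the change is the wait pattern in the lookup loop. A condensed
before/after sketch, distilled from the diff above (error handling and the
IOCB_WAITQ/IOCB_NOWAIT cases are elided):

	/* Before: sleep on the page lock while still holding a page
	 * reference.  A sleeping reader's reference alone is enough to
	 * make split_huge_page() fail in the thread that holds the lock.
	 */
	error = wait_on_page_locked_killable(page);

	/* After: drop the reference before sleeping, then redo the page
	 * cache lookup from scratch once the page is unlocked.
	 */
	if (!trylock_page(page)) {
		put_and_wait_on_page_locked(page, TASK_KILLABLE);
		continue;	/* back to the top of the lookup loop */
	}

The retry after put_and_wait_on_page_locked() is what makes the scheme safe:
by the time a sleeping reader runs again, the page it slept on may have been
split or replaced, so it must not keep using the old struct page pointer.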
* Re: + fs-break-generic_file_buffered_read-up-into-multiple-functions.patch added to -mm tree
  2020-10-27 13:35 ` + fs-break-generic_file_buffered_read-up-into-multiple-functions.patch added to -mm tree Matthew Wilcox
@ 2020-10-28 22:22   ` Andrew Morton
  2020-10-28 22:26     ` Jens Axboe
  2020-10-28 22:50     ` Matthew Wilcox
  0 siblings, 2 replies; 10+ messages in thread
From: Andrew Morton @ 2020-10-28 22:22 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: linux-mm, axboe, kent.overstreet

On Tue, 27 Oct 2020 13:35:51 +0000 Matthew Wilcox <willy@infradead.org> wrote:

> On Sun, Oct 25, 2020 at 03:08:17PM -0700, akpm@linux-foundation.org wrote:
> > The patch titled
> >      Subject: mm/filemap/c: break generic_file_buffered_read up into multiple functions
> > has been added to the -mm tree.  Its filename is
> >      fs-break-generic_file_buffered_read-up-into-multiple-functions.patch
>
> Can we back this out?  It really makes the THP patchset unhappy.  I think
> we can do something like this afterwards, but doing it this way round is
> ridiculously hard.

Two concerns:

: On my test box, 4k buffered random reads go from ~150k to ~250k iops,
: and the improvements to big sequential reads are even bigger.

That's a big improvement!  We want that improvement.  Throwing it away
on behalf of an as-yet-unmerged feature patchset hurts.  Can we expect
that this improvement will be available post-that-patchset?  And when?

(This improvement is rather hard to believe, really - more details on
the test environment would be useful.  Can we expect that people will
in general see similar benefits, or was there something special about
the testing?)

Secondly,

: This incorporates the fix for IOCB_WAITQ handling that Jens just
: posted as well

Is this (mysterious, unreferenced!) fix against v1 of Kent's patchset,
or is it against mainline?  If it's against mainline then it's
presumably first priority.  Can we please get that sent out in a
standalone form asap?

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: + fs-break-generic_file_buffered_read-up-into-multiple-functions.patch added to -mm tree
  2020-10-28 22:22   ` Andrew Morton
@ 2020-10-28 22:26     ` Jens Axboe
  2020-10-29 13:57       ` Jens Axboe
  0 siblings, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2020-10-28 22:26 UTC (permalink / raw)
  To: Andrew Morton, Matthew Wilcox; +Cc: linux-mm, kent.overstreet

On 10/28/20 4:22 PM, Andrew Morton wrote:
> On Tue, 27 Oct 2020 13:35:51 +0000 Matthew Wilcox <willy@infradead.org> wrote:
>
>> On Sun, Oct 25, 2020 at 03:08:17PM -0700, akpm@linux-foundation.org wrote:
>>> The patch titled
>>>      Subject: mm/filemap/c: break generic_file_buffered_read up into multiple functions
>>> has been added to the -mm tree.  Its filename is
>>>      fs-break-generic_file_buffered_read-up-into-multiple-functions.patch
>>
>> Can we back this out?  It really makes the THP patchset unhappy.  I think
>> we can do something like this afterwards, but doing it this way round is
>> ridiculously hard.
>
> Two concerns:
>
> : On my test box, 4k buffered random reads go from ~150k to ~250k iops,
> : and the improvements to big sequential reads are even bigger.
>
> That's a big improvement!  We want that improvement.  Throwing it away
> on behalf of an as-yet-unmerged feature patchset hurts.  Can we expect
> that this improvement will be available post-that-patchset?  And when?
>
> (This improvement is rather hard to believe, really - more details on
> the test environment would be useful.  Can we expect that people will
> in general see similar benefits, or was there something special about
> the testing?)

I did see some wins when I tested this. I'll try and run some testing
tomorrow and report back. If there's something specifically you want to
see tested, let me know.

> Secondly,
>
> : This incorporates the fix for IOCB_WAITQ handling that Jens just
> : posted as well
>
> Is this (mysterious, unreferenced!) fix against v1 of Kent's patchset,
> or is it against mainline?  If it's against mainline then it's
> presumably first priority.  Can we please get that sent out in a
> standalone form asap?

It's this one:

commit 13bd691421bc191a402d2e0d3da5f248d170a632
Author: Jens Axboe <axboe@kernel.dk>
Date:   Sat Oct 17 08:31:29 2020 -0600

    mm: mark async iocb read as NOWAIT once some data has been copied

which is in Linus's tree. Kent spotted the issue and had it fixed in his
5.9-based patchset, which is probably where this confusion comes from.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 10+ messages in thread
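The gist of that commit, roughly paraphrased (a sketch of the idea, not a
verbatim quote of the mainline hunk): once an async (IOCB_WAITQ) buffered
read has copied some data, it can no longer safely go to sleep and return
-EIOCBQUEUED, because the caller's retry would start over and copy the same
data again, so the read is treated as NOWAIT from that point on:

	/* In generic_file_buffered_read(): if we've already copied some
	 * data for an IOCB_WAITQ request, don't go async on a later page.
	 * Return the partial byte count and let the caller retry from the
	 * new file offset instead.
	 */
	if (written && (iocb->ki_flags & IOCB_WAITQ))
		iocb->ki_flags |= IOCB_NOWAIT;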
* Re: + fs-break-generic_file_buffered_read-up-into-multiple-functions.patch added to -mm tree
  2020-10-28 22:26     ` Jens Axboe
@ 2020-10-29 13:57       ` Jens Axboe
  2020-10-29 14:57         ` Matthew Wilcox
  0 siblings, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2020-10-29 13:57 UTC (permalink / raw)
  To: Andrew Morton, Matthew Wilcox; +Cc: linux-mm, kent.overstreet

On 10/28/20 4:26 PM, Jens Axboe wrote:
> On 10/28/20 4:22 PM, Andrew Morton wrote:
>> On Tue, 27 Oct 2020 13:35:51 +0000 Matthew Wilcox <willy@infradead.org> wrote:
>>
>>> On Sun, Oct 25, 2020 at 03:08:17PM -0700, akpm@linux-foundation.org wrote:
>>>> The patch titled
>>>>      Subject: mm/filemap/c: break generic_file_buffered_read up into multiple functions
>>>> has been added to the -mm tree.  Its filename is
>>>>      fs-break-generic_file_buffered_read-up-into-multiple-functions.patch
>>>
>>> Can we back this out?  It really makes the THP patchset unhappy.  I think
>>> we can do something like this afterwards, but doing it this way round is
>>> ridiculously hard.
>>
>> Two concerns:
>>
>> : On my test box, 4k buffered random reads go from ~150k to ~250k iops,
>> : and the improvements to big sequential reads are even bigger.
>>
>> That's a big improvement!  We want that improvement.  Throwing it away
>> on behalf of an as-yet-unmerged feature patchset hurts.  Can we expect
>> that this improvement will be available post-that-patchset?  And when?
>>
>> (This improvement is rather hard to believe, really - more details on
>> the test environment would be useful.  Can we expect that people will
>> in general see similar benefits, or was there something special about
>> the testing?)
>
> I did see some wins when I tested this. I'll try and run some testing
> tomorrow and report back. If there's something specifically you want to
> see tested, let me know.

I did some testing, unfortunately it's _very_ hard to produce somewhat
consistent and good numbers as it quickly becomes a game of kswapd.
Here's a basic case of 4 threads doing 32k random reads:

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
    462 root      20   0       0      0      0 R  65.5   0.0   0:08.02 kswapd0
   2287 axboe     20   0 1303448   2176   1072 R  46.6   0.0   0:05.35 fio
   2289 axboe     20   0 1303456   2196   1092 D  46.6   0.0   0:05.34 fio
   2290 axboe     20   0 1303460   2216   1112 D  46.6   0.0   0:05.37 fio
   2288 axboe     20   0 1303452   2224   1120 R  45.9   0.0   0:05.33 fio

Sad face... Unfortunately once kswapd kicks in, performance also
plummets. This box only has 32G of ram, and you can fill that in less
than 10 seconds doing buffered reads like that.

I ran 4k and 32k testing, and using 1 and 4 threads. But given the above
sadness, it quickly ends up looking the same for me.

What I noticed in my initial testing of Kent's patches (which was
focused on correctness) was that a read+write verify workload had
consistently better read throughput.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: + fs-break-generic_file_buffered_read-up-into-multiple-functions.patch added to -mm tree
  2020-10-29 13:57       ` Jens Axboe
@ 2020-10-29 14:57         ` Matthew Wilcox
  2020-10-29 15:02           ` Jens Axboe
  0 siblings, 1 reply; 10+ messages in thread
From: Matthew Wilcox @ 2020-10-29 14:57 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Andrew Morton, linux-mm, kent.overstreet

On Thu, Oct 29, 2020 at 07:57:34AM -0600, Jens Axboe wrote:
> On 10/28/20 4:26 PM, Jens Axboe wrote:
> > I did see some wins when I tested this. I'll try and run some testing
> > tomorrow and report back. If there's something specifically you want to
> > see tested, let me know.
>
> I did some testing, unfortunately it's _very_ hard to produce somewhat
> consistent and good numbers as it quickly becomes a game of kswapd.
> Here's a basic case of 4 threads doing 32k random reads:
>
>     PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
>     462 root      20   0       0      0      0 R  65.5   0.0   0:08.02 kswapd0
>    2287 axboe     20   0 1303448   2176   1072 R  46.6   0.0   0:05.35 fio
>    2289 axboe     20   0 1303456   2196   1092 D  46.6   0.0   0:05.34 fio
>    2290 axboe     20   0 1303460   2216   1112 D  46.6   0.0   0:05.37 fio
>    2288 axboe     20   0 1303452   2224   1120 R  45.9   0.0   0:05.33 fio
>
> Sad face... Unfortunately once kswapd kicks in, performance also
> plummets. This box only has 32G of ram, and you can fill that in less
> than 10 seconds doing buffered reads like that.
>
> I ran 4k and 32k testing, and using 1 and 4 threads. But given the above
> sadness, it quickly ends up looking the same for me.

What if your workload actually fits in memory?  That would seem to be
the situation where Kent's patches would make a difference.

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: + fs-break-generic_file_buffered_read-up-into-multiple-functions.patch added to -mm tree
  2020-10-29 14:57         ` Matthew Wilcox
@ 2020-10-29 15:02           ` Jens Axboe
  2020-10-29 15:03             ` Matthew Wilcox
  0 siblings, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2020-10-29 15:02 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Andrew Morton, linux-mm, kent.overstreet

On 10/29/20 8:57 AM, Matthew Wilcox wrote:
> On Thu, Oct 29, 2020 at 07:57:34AM -0600, Jens Axboe wrote:
>> On 10/28/20 4:26 PM, Jens Axboe wrote:
>>> I did see some wins when I tested this. I'll try and run some testing
>>> tomorrow and report back. If there's something specifically you want to
>>> see tested, let me know.
>>
>> I did some testing, unfortunately it's _very_ hard to produce somewhat
>> consistent and good numbers as it quickly becomes a game of kswapd.
>> Here's a basic case of 4 threads doing 32k random reads:
>>
>>     PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
>>     462 root      20   0       0      0      0 R  65.5   0.0   0:08.02 kswapd0
>>    2287 axboe     20   0 1303448   2176   1072 R  46.6   0.0   0:05.35 fio
>>    2289 axboe     20   0 1303456   2196   1092 D  46.6   0.0   0:05.34 fio
>>    2290 axboe     20   0 1303460   2216   1112 D  46.6   0.0   0:05.37 fio
>>    2288 axboe     20   0 1303452   2224   1120 R  45.9   0.0   0:05.33 fio
>>
>> Sad face... Unfortunately once kswapd kicks in, performance also
>> plummets. This box only has 32G of ram, and you can fill that in less
>> than 10 seconds doing buffered reads like that.
>>
>> I ran 4k and 32k testing, and using 1 and 4 threads. But given the above
>> sadness, it quickly ends up looking the same for me.
>
> What if your workload actually fits in memory?  That would seem to be
> the situation where Kent's patches would make a difference.

That was my point, if I do multi-page reads then memory is filled in
seconds, which makes it pretty hard to provide any accurate numbers. I
don't have anything slow in this test box, I'll see if I can find
something to stick in it.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: + fs-break-generic_file_buffered_read-up-into-multiple-functions.patch added to -mm tree
  2020-10-29 15:02           ` Jens Axboe
@ 2020-10-29 15:03             ` Matthew Wilcox
  2020-10-29 15:05               ` Jens Axboe
  0 siblings, 1 reply; 10+ messages in thread
From: Matthew Wilcox @ 2020-10-29 15:03 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Andrew Morton, linux-mm, kent.overstreet

On Thu, Oct 29, 2020 at 09:02:31AM -0600, Jens Axboe wrote:
> On 10/29/20 8:57 AM, Matthew Wilcox wrote:
> > On Thu, Oct 29, 2020 at 07:57:34AM -0600, Jens Axboe wrote:
> >> On 10/28/20 4:26 PM, Jens Axboe wrote:
> >>> I did see some wins when I tested this. I'll try and run some testing
> >>> tomorrow and report back. If there's something specifically you want to
> >>> see tested, let me know.
> >>
> >> I did some testing, unfortunately it's _very_ hard to produce somewhat
> >> consistent and good numbers as it quickly becomes a game of kswapd.
> >> Here's a basic case of 4 threads doing 32k random reads:
> >>
> >>     PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
> >>     462 root      20   0       0      0      0 R  65.5   0.0   0:08.02 kswapd0
> >>    2287 axboe     20   0 1303448   2176   1072 R  46.6   0.0   0:05.35 fio
> >>    2289 axboe     20   0 1303456   2196   1092 D  46.6   0.0   0:05.34 fio
> >>    2290 axboe     20   0 1303460   2216   1112 D  46.6   0.0   0:05.37 fio
> >>    2288 axboe     20   0 1303452   2224   1120 R  45.9   0.0   0:05.33 fio
> >>
> >> Sad face... Unfortunately once kswapd kicks in, performance also
> >> plummets. This box only has 32G of ram, and you can fill that in less
> >> than 10 seconds doing buffered reads like that.
> >>
> >> I ran 4k and 32k testing, and using 1 and 4 threads. But given the above
> >> sadness, it quickly ends up looking the same for me.
> >
> > What if your workload actually fits in memory?  That would seem to be
> > the situation where Kent's patches would make a difference.
>
> That was my point, if I do multi-page reads then memory is filled in
> seconds, which makes it pretty hard to provide any accurate numbers. I
> don't have anything slow in this test box, I'll see if I can find
> something to stick in it.

I meant re-reading files which fit in memory, so you take ten seconds
to fill the page cache, then read from the same files over and over.

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: + fs-break-generic_file_buffered_read-up-into-multiple-functions.patch added to -mm tree
  2020-10-29 15:03             ` Matthew Wilcox
@ 2020-10-29 15:05               ` Jens Axboe
  2020-10-29 15:17                 ` Jens Axboe
  0 siblings, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2020-10-29 15:05 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Andrew Morton, linux-mm, kent.overstreet

On 10/29/20 9:03 AM, Matthew Wilcox wrote:
> On Thu, Oct 29, 2020 at 09:02:31AM -0600, Jens Axboe wrote:
>> On 10/29/20 8:57 AM, Matthew Wilcox wrote:
>>> On Thu, Oct 29, 2020 at 07:57:34AM -0600, Jens Axboe wrote:
>>>> On 10/28/20 4:26 PM, Jens Axboe wrote:
>>>>> I did see some wins when I tested this. I'll try and run some testing
>>>>> tomorrow and report back. If there's something specifically you want to
>>>>> see tested, let me know.
>>>>
>>>> I did some testing, unfortunately it's _very_ hard to produce somewhat
>>>> consistent and good numbers as it quickly becomes a game of kswapd.
>>>> Here's a basic case of 4 threads doing 32k random reads:
>>>>
>>>>     PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
>>>>     462 root      20   0       0      0      0 R  65.5   0.0   0:08.02 kswapd0
>>>>    2287 axboe     20   0 1303448   2176   1072 R  46.6   0.0   0:05.35 fio
>>>>    2289 axboe     20   0 1303456   2196   1092 D  46.6   0.0   0:05.34 fio
>>>>    2290 axboe     20   0 1303460   2216   1112 D  46.6   0.0   0:05.37 fio
>>>>    2288 axboe     20   0 1303452   2224   1120 R  45.9   0.0   0:05.33 fio
>>>>
>>>> Sad face... Unfortunately once kswapd kicks in, performance also
>>>> plummets. This box only has 32G of ram, and you can fill that in less
>>>> than 10 seconds doing buffered reads like that.
>>>>
>>>> I ran 4k and 32k testing, and using 1 and 4 threads. But given the above
>>>> sadness, it quickly ends up looking the same for me.
>>>
>>> What if your workload actually fits in memory?  That would seem to be
>>> the situation where Kent's patches would make a difference.
>>
>> That was my point, if I do multi-page reads then memory is filled in
>> seconds, which makes it pretty hard to provide any accurate numbers. I
>> don't have anything slow in this test box, I'll see if I can find
>> something to stick in it.
>
> I meant re-reading files which fit in memory, so you take ten seconds
> to fill the page cache, then read from the same files over and over.

That I can certainly try.

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 10+ messages in thread
* Re: + fs-break-generic_file_buffered_read-up-into-multiple-functions.patch added to -mm tree
  2020-10-29 15:05               ` Jens Axboe
@ 2020-10-29 15:17                 ` Jens Axboe
  0 siblings, 0 replies; 10+ messages in thread
From: Jens Axboe @ 2020-10-29 15:17 UTC (permalink / raw)
  To: Matthew Wilcox; +Cc: Andrew Morton, linux-mm, kent.overstreet

On 10/29/20 9:05 AM, Jens Axboe wrote:
> On 10/29/20 9:03 AM, Matthew Wilcox wrote:
>> On Thu, Oct 29, 2020 at 09:02:31AM -0600, Jens Axboe wrote:
>>> On 10/29/20 8:57 AM, Matthew Wilcox wrote:
>>>> On Thu, Oct 29, 2020 at 07:57:34AM -0600, Jens Axboe wrote:
>>>>> On 10/28/20 4:26 PM, Jens Axboe wrote:
>>>>>> I did see some wins when I tested this. I'll try and run some testing
>>>>>> tomorrow and report back. If there's something specifically you want to
>>>>>> see tested, let me know.
>>>>>
>>>>> I did some testing, unfortunately it's _very_ hard to produce somewhat
>>>>> consistent and good numbers as it quickly becomes a game of kswapd.
>>>>> Here's a basic case of 4 threads doing 32k random reads:
>>>>>
>>>>>     PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
>>>>>     462 root      20   0       0      0      0 R  65.5   0.0   0:08.02 kswapd0
>>>>>    2287 axboe     20   0 1303448   2176   1072 R  46.6   0.0   0:05.35 fio
>>>>>    2289 axboe     20   0 1303456   2196   1092 D  46.6   0.0   0:05.34 fio
>>>>>    2290 axboe     20   0 1303460   2216   1112 D  46.6   0.0   0:05.37 fio
>>>>>    2288 axboe     20   0 1303452   2224   1120 R  45.9   0.0   0:05.33 fio
>>>>>
>>>>> Sad face... Unfortunately once kswapd kicks in, performance also
>>>>> plummets. This box only has 32G of ram, and you can fill that in less
>>>>> than 10 seconds doing buffered reads like that.
>>>>>
>>>>> I ran 4k and 32k testing, and using 1 and 4 threads. But given the above
>>>>> sadness, it quickly ends up looking the same for me.
>>>>
>>>> What if your workload actually fits in memory?  That would seem to be
>>>> the situation where Kent's patches would make a difference.
>>>
>>> That was my point, if I do multi-page reads then memory is filled in
>>> seconds, which makes it pretty hard to provide any accurate numbers. I
>>> don't have anything slow in this test box, I'll see if I can find
>>> something to stick in it.
>>
>> I meant re-reading files which fit in memory, so you take ten seconds
>> to fill the page cache, then read from the same files over and over.
>
> That I can certainly try.

Reading a 16G file randomly for 10 seconds, using 1 or 4 threads and
either 4k or 32k reads:

test                      5.10-rc1     5.10-rc1+kent
-------------------------------------------------------------
1 thread, 4k                  976K     1030K  (+5.5%)
4 threads, 4k                3462K     3453K  (-0.3%)
1 thread, 32k                 299K      322K  (+7.7%)
4 threads, 32k                769K      785K  (+2.0%)

-- 
Jens Axboe

^ permalink raw reply	[flat|nested] 10+ messages in thread
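For reference, a cached re-read test of that shape can be expressed as a
small fio job. This is a sketch under stated assumptions: the filename,
ioengine, and pre-warming approach here are illustrative, not Jens's
actual invocation, which the thread does not show.

[global]
filename=/data/testfile   ; a pre-written 16G file that fits in RAM
size=16g
rw=randread
time_based=1
runtime=10
ioengine=psync
invalidate=0              ; don't drop the file's page cache at job start

[randread]
bs=4k                     ; or bs=32k for the 32k rows
numjobs=4                 ; or numjobs=1 for the single-threaded rows

One throwaway run populates the page cache; the timed runs that follow
then read entirely from cache, which is the case Kent's batched-lookup
changes target.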
* Re: + fs-break-generic_file_buffered_read-up-into-multiple-functions.patch added to -mm tree
  2020-10-28 22:22   ` Andrew Morton
  2020-10-28 22:26     ` Jens Axboe
@ 2020-10-28 22:50     ` Matthew Wilcox
  1 sibling, 0 replies; 10+ messages in thread
From: Matthew Wilcox @ 2020-10-28 22:50 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-mm, axboe, kent.overstreet

On Wed, Oct 28, 2020 at 03:22:35PM -0700, Andrew Morton wrote:
> On Tue, 27 Oct 2020 13:35:51 +0000 Matthew Wilcox <willy@infradead.org> wrote:
>
> > On Sun, Oct 25, 2020 at 03:08:17PM -0700, akpm@linux-foundation.org wrote:
> > > The patch titled
> > >      Subject: mm/filemap/c: break generic_file_buffered_read up into multiple functions
> > > has been added to the -mm tree.  Its filename is
> > >      fs-break-generic_file_buffered_read-up-into-multiple-functions.patch
> >
> > Can we back this out?  It really makes the THP patchset unhappy.  I think
> > we can do something like this afterwards, but doing it this way round is
> > ridiculously hard.
>
> Two concerns:
>
> : On my test box, 4k buffered random reads go from ~150k to ~250k iops,
> : and the improvements to big sequential reads are even bigger.
>
> That's a big improvement!  We want that improvement.  Throwing it away
> on behalf of an as-yet-unmerged feature patchset hurts.  Can we expect
> that this improvement will be available post-that-patchset?  And when?

I agree we want that improvement!  I don't understand how this improves
4k buffered reads though.  It should improve 8k to 60k io sizes, but it
should be the same for 4k.  Honestly, it should be better for 4097 bytes
on up (but nobody tests 4097-byte block sizes).

I've been working on redoing my patchset on top of current -next for the
past two days.  It's mostly there, but I've hit a problem when testing
with the readahead error injector.

The good news is that the performance improvements should be cumulative --
Kent's improvements mean that we get pages out of the page cache in
batches, while mine will do fewer atomic ops (on page->refcount).

Funnily, this has led to finding a bug in copy_page_to_iter_iovec() and,
honestly, a lot of the other iov_iter code, but I don't think that's
going to hold us up.

^ permalink raw reply	[flat|nested] 10+ messages in thread
Thread overview: 10+ messages
[not found] <20201025220817.XxXVE%akpm@linux-foundation.org>
2020-10-27 13:35 ` + fs-break-generic_file_buffered_read-up-into-multiple-functions.patch added to -mm tree Matthew Wilcox
2020-10-28 22:22   ` Andrew Morton
2020-10-28 22:26     ` Jens Axboe
2020-10-29 13:57       ` Jens Axboe
2020-10-29 14:57         ` Matthew Wilcox
2020-10-29 15:02           ` Jens Axboe
2020-10-29 15:03             ` Matthew Wilcox
2020-10-29 15:05               ` Jens Axboe
2020-10-29 15:17                 ` Jens Axboe
2020-10-28 22:50   ` Matthew Wilcox