linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	linux-mm <linux-mm@kvack.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Jens Axboe <axboe@kernel.dk>, Michal Hocko <mhocko@suse.com>,
	Mel Gorman <mgorman@suse.de>,
	Johannes Weiner <hannes@cmpxchg.org>, Tejun Heo <tj@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH RFC] mm: implement write-behind policy for sequential file writes
Date: Mon, 2 Oct 2017 23:58:45 +0300	[thread overview]
Message-ID: <dcb23e5d-81b9-9a6c-b7ac-bbad2ef77fd8@yandex-team.ru> (raw)
In-Reply-To: <CA+55aFyXrxN8Dqw9QK9NPWk+ZD52fT=q2y7ByPt9pooOrio3Nw@mail.gmail.com>

On 02.10.2017 22:54, Linus Torvalds wrote:
> On Mon, Oct 2, 2017 at 2:54 AM, Konstantin Khlebnikov
> <khlebnikov@yandex-team.ru> wrote:
>>
>> This patch implements write-behind policy which tracks sequential writes
>> and starts background writeback when have enough dirty pages in a row.
> 
> This looks lovely to me.
> 
> I do wonder if you also looked at finishing the background
> write-behind at close() time, because it strikes me that once you
> start doing that async writeout, it would probably be good to make
> sure you try to do the whole file.

Smaller files or tails is lesser problem and forced writeback here
might add bigger overhead due to small requests or too random IO.
Also open+append+close pattern could generate too much IO.

> 
> I'm thinking of filesystems that do delayed allocation etc - I'd
> expect that you'd want the whole file to get allocated on disk
> together, rather than have the "first 256kB aligned chunks" allocated
> thanks to write-behind, and then the final part allocated much later
> (after other files may have triggered their own write-behind). Think
> loads like copying lots of pictures around, for example.

As far as I know ext4 preallocates space beyond file end for writing
patterns like append + fsync. Thus allocated extents should be bigger
than 256k. I haven't looked into this yet.

> 
> I don't have any particularly strong feelings about this, but I do
> suspect that once you have started that IO, you do want to finish it
> all up as the file write is done. No?

I'm aiming into continuous file operations like downloading huge file
or writing verbose log. Original motivation came from low-latency server
workloads which suffers from parallel bulk operations which generates
tons of dirty pages. Probably for general-purpose usage thresholds
should be increased significantly to cover only really bulky patterns.

> 
> It would also be really nice to see some numbers. Perhaps a comparison
> of "vmstat 1" or similar when writing a big file to some slow medium
> like a USB stick (which is something we've done very very badly at,
> and this should help smooth out)?

I'll try to find out some real cases with numbers.

For now I see that massive write + fdatasync (dd conf=fdatasync, fio)
always ends earlier because writeback now starts earlier too.
Without fdatasync it's obviously slower.

Cp to usb stick + umount should show same result, plus cp could be
interrupted at any point without contaminating cache with dirty pages.

Kernel compilation tooks almost the same time because most files are
smaller than 256k.

> 
>                  Linus
> 

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2017-10-02 20:58 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-10-02  9:54 Konstantin Khlebnikov
2017-10-02 11:23 ` Florian Weimer
2017-10-02 11:55   ` Konstantin Khlebnikov
2017-10-02 19:54 ` Linus Torvalds
2017-10-02 20:58   ` Konstantin Khlebnikov [this message]
2017-10-02 22:29     ` Andreas Dilger
2017-10-02 22:45   ` Dave Chinner
2017-10-02 23:08     ` Linus Torvalds
2017-10-03  0:08       ` Dave Chinner
2017-10-02 20:00 ` Jens Axboe
2017-10-02 21:50   ` Konstantin Khlebnikov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=dcb23e5d-81b9-9a6c-b7ac-bbad2ef77fd8@yandex-team.ru \
    --to=khlebnikov@yandex-team.ru \
    --cc=akpm@linux-foundation.org \
    --cc=axboe@kernel.dk \
    --cc=hannes@cmpxchg.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mgorman@suse.de \
    --cc=mhocko@suse.com \
    --cc=tj@kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox