From: "Sorin Faibish" <sfaibish@emc.com>
To: Boaz Harrosh <bharrosh@panasas.com>, Jan Kara <jack@suse.cz>
Cc: lsf-pc@lists.linuxfoundation.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, Wu Fengguang <fengguang.wu@intel.com>
Subject: Re: [LSF/MM TOPIC] Writeback - current state and future
Date: Sun, 06 Feb 2011 10:13:41 -0500
Message-ID: <op.vqhlw3rirwwil4@sfaibish1.corp.emc.com>
In-Reply-To: <4D4E7B48.9020500@panasas.com>

I was thinking of having a special track for all the writeback-related
topics.
I would also like to include a discussion on new cache writeback patterns,
with the goal of preventing cache swaps, which are becoming a bigger
problem on servers with 100s of GB of cache. Swapping is the worst thing
that can happen to the performance of such systems. I will share my latest
findings on cache writeback, continuing my previous discussion at the last
LSF.

/Sorin

On Sun, 06 Feb 2011 05:43:20 -0500, Boaz Harrosh <bharrosh@panasas.com> wrote:

> On 02/04/2011 06:42 PM, Jan Kara wrote:
>>   Hi,
>>
>>   I'd like to have one session about writeback. The content would highly
>> depend on the current state of things but on a general level, I'd like to
>> quickly sum up what went into the kernel (or is mostly ready to go) since
>> last LSF (handling of background writeback, livelock avoidance), what is
>> being worked on - IO-less balance_dirty_pages() (if it won't be in the
>> mostly done section), what other things need to be improved (kswapd
>> writeout, writeback_inodes_sb_if_idle() mess, come to my mind now)
>>
>> 								Honza
>
> Ha, I most certainly want to participate in this talk. I wanted to
> suggest it myself.
>
> Topics that I would like to raise on the matter.
>
> [IO-less balance_dirty_pages]
> As said, I'd really like it if Wu or Jan could explain more about the math
> and IO patterns that went into this tremendous work, and how it should
> affect us fs maintainers in terms of advantages and disadvantages. If
> digging too deeply into this is not interesting for everybody, perhaps
> a side meeting with fewer people is also possible.
>
> [Aligned write-back]
> I have just finished raid5/6 support in my filesystem and will be sending
> a patch that tries very aggressively to align IO on stripe boundaries.
> I did not take the btrfs way of cutting and pasting the
> write_cache_pages() function to better fit the bill. I used
> wbc->nr_to_write to trim down IO on stripe alignment. Together with some
> internal structure games, I now have a much better situation than with the
> untouched code. By better I mean that with simple linear dd IO on a file,
> I see on the order of 90% aligned IOs, as opposed to 20% before that
> patch. The only remaining issue (I have not fully investigated it yet) is
> this: I do not want any residue left over outside the writepages() call,
> so that I do not need to sync and lock with flush or keep a "flushing"
> flag in my writeout path. What I still get is that sometimes writeback
> catches up with dd and I get short writes for the remainder, which leaves
> the end of this call and the start of the next call unaligned.
>
> I envision simple BDI members, just like ra_pages for readahead, that
> better govern the writeback chunking (and are accounted for in the
> fairness).
>
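
To make the nr_to_write trimming described above concrete, here is a
minimal userspace sketch of the arithmetic only. It is not the patch
mentioned above; trim_to_stripe(), start, and stripe_pages are
hypothetical names, and the real code would take the budget from
wbc->nr_to_write inside the filesystem's writepages() path.

```c
#include <assert.h>

/*
 * Hypothetical helper (illustration only, not kernel code).
 * start        - index of the first page of this writeback pass
 * nr_to_write  - page budget for the pass (as in wbc->nr_to_write)
 * stripe_pages - stripe size expressed in pages
 *
 * Returns a budget trimmed so that the pass ends on a stripe boundary.
 */
static long trim_to_stripe(unsigned long start, long nr_to_write,
                           unsigned long stripe_pages)
{
    unsigned long end, aligned_end;

    assert(stripe_pages > 0);

    if (nr_to_write <= 0)
        return nr_to_write;

    end = start + (unsigned long)nr_to_write;    /* exclusive end index */
    aligned_end = end - (end % stripe_pages);    /* round down to boundary */

    /* Budget too small to reach the next boundary: leave it unchanged. */
    if (aligned_end <= start)
        return nr_to_write;

    return (long)(aligned_end - start);
}
```

With 16-page stripes, a pass that starts at page 5 with a budget of 100
pages is trimmed to 91 pages, so it ends at page 96 and the next pass
starts aligned. The short-write case described above corresponds to the
branch where the budget cannot reach a boundary at all.
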
> [Smarter/more cache eviction patterns]
> I love it when I do a simple dd test in a UML (300 MB of RAM) and halfway
> through I get these fat WARN_ONs from the iSCSI TCP writeback failing to
> allocate network buffers. And I did lower the writeback ratio a lot,
> because the default of 20% has not worked for a long time, since 2.6.35
> or 2.6.36 or so. UML is not the only affected system; any low-memory,
> embedded-like but 64-bit system would be. Now, the IO does complete
> eventually, but the performance is down to 20%.
>
> Now for a dd- or cp-like work pattern I would like the pages to be freed
> much more aggressively, like right after IO completion, because I most
> certainly will not use them again. On the other hand, git for example
> will write a big sequential file and then immediately turn around and
> read it, so cache presence is a win. But I think we can still come up
> with good patterns that take into account the number of files opened on
> an inode and some hot-inode history. (Some of this history we already
> have with the security plugins.)
>
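
For comparison, an application can already approximate the "free right
after IO completion" behaviour from userspace today. The sketch below is
only that: a userspace workaround, not the in-kernel heuristic proposed
above. write_and_drop() is a hypothetical helper name; pwrite(),
fdatasync(), and posix_fadvise() are the standard POSIX calls.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 600   /* for pwrite(), fdatasync(), posix_fadvise() */
#endif
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): write len bytes at offset off,
 * flush them to storage, then hint that the cached pages for that range
 * will not be needed again. Returns 0 on success, -1 on failure.
 */
static int write_and_drop(int fd, const void *buf, size_t len, off_t off)
{
    const char *p = buf;
    size_t done = 0;

    assert(buf != NULL || len == 0);

    while (done < len) {
        ssize_t n = pwrite(fd, p + done, len - done, off + (off_t)done);
        if (n <= 0)
            return -1;
        done += (size_t)n;
    }

    /* Dirty pages are not dropped by the hint, so flush them first. */
    if (fdatasync(fd) != 0)
        return -1;

    /* posix_fadvise() returns an error number instead of setting errno. */
    if (posix_fadvise(fd, off, (off_t)len, POSIX_FADV_DONTNEED) != 0)
        return -1;

    return 0;
}
```

The hint is advisory, and the forced flush per chunk has a cost of its
own, so this suits a streaming copy much better than the git-like
write-then-read pattern mentioned above.
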
> And there are other topics that I had, but can't remember right now.
>
> Thanks
> Boaz



-- 
Best Regards
Sorin Faibish
Corporate Distinguished Engineer
Unified Storage Division

        EMC2
where information lives

Phone: 508-435-1000 x 48545
Cellphone: 617-510-0422
Email : sfaibish@emc.com

