From: Josef Bacik <jbacik@fb.com>
To: Jan Kara <jack@suse.cz>
Cc: linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	kernel-team@fb.com, jack@suse.com, viro@zeniv.linux.org.uk,
	dchinner@redhat.com, hch@lst.de, linux-mm@kvack.org,
	hannes@cmpxchg.org
Subject: Re: [PATCH 2/4] writeback: allow for dirty metadata accounting
Date: Thu, 22 Sep 2016 09:34:40 -0400
Message-ID: <81d2d2e2-6a90-6727-cdb2-8ffcadf93833@fb.com>
In-Reply-To: <20160922111801.GK2834@quack2.suse.cz>

On 09/22/2016 07:18 AM, Jan Kara wrote:
> On Tue 20-09-16 16:57:46, Josef Bacik wrote:
>> Btrfs has no bounds except memory on the amount of dirty memory that we have in
>> use for metadata.  Historically we have used a special inode so we could take
>> advantage of the balance_dirty_pages throttling that comes with using pagecache.
>> However as we'd like to support different blocksizes it would be nice to not
>> have to rely on pagecache, but still get the balance_dirty_pages throttling
>> without having to do it ourselves.
>>
>> So introduce *METADATA_DIRTY_BYTES and *METADATA_WRITEBACK_BYTES.  These are
>> zone and bdi_writeback counters to keep track of how many bytes we have in
>> flight for METADATA.  We need to count in bytes as blocksizes could be
>> fractions of the pagesize.  We simply convert the bytes to a number of pages
>> where it is needed for the throttling.
>>
>> Signed-off-by: Josef Bacik <jbacik@fb.com>
>> ---
>>  arch/tile/mm/pgtable.c           |   3 +-
>>  drivers/base/node.c              |   6 ++
>>  fs/fs-writeback.c                |   2 +
>>  fs/proc/meminfo.c                |   5 ++
>>  include/linux/backing-dev-defs.h |   2 +
>>  include/linux/mm.h               |   9 +++
>>  include/linux/mmzone.h           |   2 +
>>  include/trace/events/writeback.h |  13 +++-
>>  mm/backing-dev.c                 |   5 ++
>>  mm/page-writeback.c              | 157 +++++++++++++++++++++++++++++++++++----
>>  mm/page_alloc.c                  |  16 +++-
>>  mm/vmscan.c                      |   4 +-
>>  12 files changed, 200 insertions(+), 24 deletions(-)
>>
>> diff --git a/arch/tile/mm/pgtable.c b/arch/tile/mm/pgtable.c
>> index 7cc6ee7..9543468 100644
>> --- a/arch/tile/mm/pgtable.c
>> +++ b/arch/tile/mm/pgtable.c
>> @@ -44,12 +44,13 @@ void show_mem(unsigned int filter)
>>  {
>>  	struct zone *zone;
>>
>> -	pr_err("Active:%lu inactive:%lu dirty:%lu writeback:%lu unstable:%lu free:%lu\n slab:%lu mapped:%lu pagetables:%lu bounce:%lu pagecache:%lu swap:%lu\n",
>> +	pr_err("Active:%lu inactive:%lu dirty:%lu metadata_dirty:%lu writeback:%lu unstable:%lu free:%lu\n slab:%lu mapped:%lu pagetables:%lu bounce:%lu pagecache:%lu swap:%lu\n",
>>  	       (global_node_page_state(NR_ACTIVE_ANON) +
>>  		global_node_page_state(NR_ACTIVE_FILE)),
>>  	       (global_node_page_state(NR_INACTIVE_ANON) +
>>  		global_node_page_state(NR_INACTIVE_FILE)),
>>  	       global_node_page_state(NR_FILE_DIRTY),
>> +	       global_node_page_state(NR_METADATA_DIRTY),
>
> Leftover from previous version? Ah, it is tile architecture so I see how it
> could have passed testing ;)
>

Ah, now I understand the kbuild error I got, oops ;)

>> @@ -506,6 +530,10 @@ bool node_dirty_ok(struct pglist_data *pgdat)
>>  	nr_pages += node_page_state(pgdat, NR_FILE_DIRTY);
>>  	nr_pages += node_page_state(pgdat, NR_UNSTABLE_NFS);
>>  	nr_pages += node_page_state(pgdat, NR_WRITEBACK);
>> +	nr_pages += (node_page_state(pgdat, NR_METADATA_DIRTY_BYTES) >>
>> +		     PAGE_SHIFT);
>> +	nr_pages += (node_page_state(pgdat, NR_METADATA_WRITEBACK_BYTES) >>
>> +		     PAGE_SHIFT);
>>
>>  	return nr_pages <= limit;
>>  }
>
> I still don't think this is correct. It currently achieves the same
> behavior as before the patch, but once you start accounting something other
> than pagecache pages in these counters, things will go wrong. This
> function is used to control the distribution of pagecache pages among NUMA
> nodes and as such it should IMHO only account for pagecache pages...
>
>> @@ -3714,7 +3714,9 @@ static unsigned long node_pagecache_reclaimable(struct pglist_data *pgdat)
>>
>>  	/* If we can't clean pages, remove dirty pages from consideration */
>>  	if (!(node_reclaim_mode & RECLAIM_WRITE))
>> -		delta += node_page_state(pgdat, NR_FILE_DIRTY);
>> +		delta += node_page_state(pgdat, NR_FILE_DIRTY) +
>> +			(node_page_state(pgdat, NR_METADATA_DIRTY_BYTES) >>
>> +			 PAGE_SHIFT);
>>
>>  	/* Watch for any possible underflows due to delta */
>>  	if (unlikely(delta > nr_pagecache_reclaimable))
>
> The same comment as above applies here.
>

OK, that sounds reasonable. I'll make this change.  Thanks,

Josef
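
As an illustration of the point settled in this exchange (this is not code
from the patch), here is a small userspace sketch. It assumes 4 KiB pages
and uses a made-up struct node_counters in place of the kernel's per-node
vmstat counters; throttle_dirty_pages() and node_dirty_ok_sketch() are
hypothetical names. It shows the bytes-to-pages shift the patch uses where
page counts are needed, and a per-node check that leaves the metadata byte
counters out, as Jan suggests.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. The kernel's PAGE_SHIFT depends on the arch. */
#define SKETCH_PAGE_SHIFT 12

/* Hypothetical stand-in for the per-node counters discussed in the thread. */
struct node_counters {
	uint64_t file_dirty_pages;         /* cf. NR_FILE_DIRTY */
	uint64_t unstable_nfs_pages;       /* cf. NR_UNSTABLE_NFS */
	uint64_t writeback_pages;          /* cf. NR_WRITEBACK */
	uint64_t metadata_dirty_bytes;     /* cf. NR_METADATA_DIRTY_BYTES */
	uint64_t metadata_writeback_bytes; /* cf. NR_METADATA_WRITEBACK_BYTES */
};

/* Convert a byte counter to whole pages, rounding down like ">> PAGE_SHIFT". */
static uint64_t bytes_to_pages(uint64_t bytes)
{
	return bytes >> SKETCH_PAGE_SHIFT;
}

/*
 * Page count with metadata folded in, for callers that want to throttle on
 * all dirty memory (assumed usage, not taken from the patch).
 */
static uint64_t throttle_dirty_pages(const struct node_counters *c)
{
	return c->file_dirty_pages + c->unstable_nfs_pages +
	       c->writeback_pages +
	       bytes_to_pages(c->metadata_dirty_bytes) +
	       bytes_to_pages(c->metadata_writeback_bytes);
}

/*
 * Per-node check restricted to pagecache pages: the metadata byte counters
 * are deliberately not added here.
 */
static bool node_dirty_ok_sketch(const struct node_counters *c, uint64_t limit)
{
	uint64_t nr_pages = c->file_dirty_pages + c->unstable_nfs_pages +
			    c->writeback_pages;

	return nr_pages <= limit;
}
```

Note that the shift rounds down, so a byte counter holding less than one
page of metadata contributes zero pages to the total.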


