linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Andrew Morton <akpm@linux-foundation.org>
To: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
Cc: Fengguang Wu <fengguang.wu@intel.com>,
	David Cohen <david.a.cohen@linux.intel.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Damien Ramonda <damien.ramonda@intel.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC] mm readahead: Fix the readahead fail in case of empty numa node
Date: Tue, 3 Dec 2013 14:38:41 -0800	[thread overview]
Message-ID: <20131203143841.11b71e387dc1db3a8ab0974c@linux-foundation.org> (raw)
In-Reply-To: <1386066977-17368-1-git-send-email-raghavendra.kt@linux.vnet.ibm.com>

On Tue,  3 Dec 2013 16:06:17 +0530 Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> wrote:

> On a cpu with an empty numa node,

This makes no sense - numa nodes don't reside on CPUs.

I think you mean "on a CPU which resides on a memoryless NUMA node"?

> readahead fails because max_sane_readahead
> returns zero. The reason is we look into number of inactive + free pages 
> available on the current node.
> 
> The following patch tries to fix the behaviour by checking for potential
> empty numa node cases.
> The rationale for the patch is, readahead may be worth doing on a remote
> node instead of incuring costly disk faults later.
> 
> I still feel we may have to sanitize the nr below, (for e.g., nr/8)
> to avoid serious consequences of malicious application trying to do
> a big readahead on a empty numa node causing unnecessary load on remote nodes.
> ( or it may even be that current behaviour is right in not going ahead with
> readahead to avoid the memory load on remote nodes).
> 

I don't recall the rationale for the current code and of course we
didn't document it.  It might be in the changelogs somewhere - could
you please do the git digging and see if you can find out?

I don't immediately see why readahead into a different node is
considered a bad thing.

> --- a/mm/readahead.c
> +++ b/mm/readahead.c
> @@ -243,8 +243,11 @@ int force_page_cache_readahead(struct address_space *mapping, struct file *filp,
>   */
>  unsigned long max_sane_readahead(unsigned long nr)
>  {
> -	return min(nr, (node_page_state(numa_node_id(), NR_INACTIVE_FILE)
> -		+ node_page_state(numa_node_id(), NR_FREE_PAGES)) / 2);
> +	unsigned long numa_free_page;
> +	numa_free_page = (node_page_state(numa_node_id(), NR_INACTIVE_FILE)
> +			   + node_page_state(numa_node_id(), NR_FREE_PAGES));
> +
> +	return numa_free_page ? min(nr, numa_free_page / 2) : nr;

Well even if this CPU's node has very little pagecache at all, what's
wrong with permitting readahead?  We don't know that the new pagecache
will be allocated exclusively from this CPU's node anyway.  All very
odd.

Whatever we do, we should leave behind some good code comments which
explain the rationale(s), please.  Right now it's rather opaque.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2013-12-03 22:38 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-12-03 10:36 Raghavendra K T
2013-12-03 22:38 ` Andrew Morton [this message]
2013-12-04  8:30   ` Raghavendra K T
2013-12-04  8:41     ` Andrew Morton
2013-12-04  9:08       ` Raghavendra K T
2013-12-04 21:48         ` Andrew Morton
2013-12-05  5:57           ` Raghavendra K T
2013-12-11 22:49           ` Jan Kara
2013-12-11 23:05             ` Andrew Morton
2013-12-12 11:14               ` Jan Kara
2013-12-14  0:39               ` Linus Torvalds
2013-12-31 11:07                 ` Raghavendra K T

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20131203143841.11b71e387dc1db3a8ab0974c@linux-foundation.org \
    --to=akpm@linux-foundation.org \
    --cc=damien.ramonda@intel.com \
    --cc=david.a.cohen@linux.intel.com \
    --cc=fengguang.wu@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=raghavendra.kt@linux.vnet.ibm.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox