linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Peter Zijlstra <peterz@infradead.org>
To: Daniel Jordan <daniel.m.jordan@oracle.com>
Cc: akpm@linux-foundation.org, dan.carpenter@oracle.com,
	andrea.parri@amarulasolutions.com, shli@kernel.org,
	ying.huang@intel.com, dave.hansen@linux.intel.com,
	sfr@canb.auug.org.au, osandov@fb.com, tj@kernel.org,
	ak@linux.intel.com, linux-mm@kvack.org,
	kernel-janitors@vger.kernel.org, paulmck@linux.ibm.com,
	stern@rowland.harvard.edu, will.deacon@arm.com
Subject: Re: [PATCH] mm, swap: bounds check swap_info accesses to avoid NULL derefs
Date: Wed, 30 Jan 2019 10:13:16 +0100	[thread overview]
Message-ID: <20190130091316.GC2278@hirez.programming.kicks-ass.net> (raw)
In-Reply-To: <20190115002305.15402-1-daniel.m.jordan@oracle.com>

On Mon, Jan 14, 2019 at 07:23:05PM -0500, Daniel Jordan wrote:
> Dan Carpenter reports a potential NULL dereference in
> get_swap_page_of_type:
> 
>   Smatch complains that the NULL checks on "si" aren't consistent.  This
>   seems like a real bug because we have not ensured that the type is
>   valid and so "si" can be NULL.
> 
> Add the missing check for NULL, taking care to use a read barrier to
> ensure CPU1 observes CPU0's updates in the correct order:
> 
>         CPU0                           CPU1
>         alloc_swap_info()              if (type >= nr_swapfiles)
>           swap_info[type] = p              /* handle invalid entry */
>           smp_wmb()                    smp_rmb()
>           ++nr_swapfiles               p = swap_info[type]
> 
> Without smp_rmb, CPU1 might observe CPU0's write to nr_swapfiles before
> CPU0's write to swap_info[type] and read NULL from swap_info[type].
> 
> Ying Huang noticed that other places don't order these reads properly.
> Introduce swap_type_to_swap_info to encourage correct usage.
> 
> Use READ_ONCE and WRITE_ONCE to follow the Linux Kernel Memory Model
> (see tools/memory-model/Documentation/explanation.txt).
> 
> This ordering need not be enforced in places where swap_lock is held
> (e.g. si_swapinfo) because swap_lock serializes updates to nr_swapfiles
> and the swap_info array.
> 
> This is a theoretical problem, no actual reports of it exist.
> 
> Fixes: ec8acf20afb8 ("swap: add per-partition lock for swapfile")
> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> Signed-off-by: Daniel Jordan <daniel.m.jordan@oracle.com>

A few comments below, but:

Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>

> +static struct swap_info_struct *swap_type_to_swap_info(int type)
> +{
> +	if (type >= READ_ONCE(nr_swapfiles))
> +		return NULL;
> +
> +	smp_rmb();	/* Pairs with smp_wmb in alloc_swap_info. */
> +	return READ_ONCE(swap_info[type]);
> +}

> @@ -2799,9 +2810,9 @@ static void *swap_start(struct seq_file *swap, loff_t *pos)
>  	if (!l)
>  		return SEQ_START_TOKEN;
>  
> -	for (type = 0; type < nr_swapfiles; type++) {
> +	for (type = 0; type < READ_ONCE(nr_swapfiles); type++) {
>  		smp_rmb();	/* read nr_swapfiles before swap_info[type] */
> -		si = swap_info[type];
> +		si = READ_ONCE(swap_info[type]);
>  		if (!(si->flags & SWP_USED) || !si->swap_map)
>  			continue;
>  		if (!--l)
> @@ -2821,9 +2832,9 @@ static void *swap_next(struct seq_file *swap, void *v, loff_t *pos)
>  	else
>  		type = si->type + 1;
>  
> -	for (; type < nr_swapfiles; type++) {
> +	for (; type < READ_ONCE(nr_swapfiles); type++) {
>  		smp_rmb();	/* read nr_swapfiles before swap_info[type] */
> -		si = swap_info[type];
> +		si = READ_ONCE(swap_info[type]);
>  		if (!(si->flags & SWP_USED) || !si->swap_map)
>  			continue;
>  		++*pos;

You could write those like:

	for (; (si = swap_type_to_swap_info(type)); type++)

> @@ -2930,14 +2941,14 @@ static struct swap_info_struct *alloc_swap_info(void)
>  	}
>  	if (type >= nr_swapfiles) {
>  		p->type = type;
> -		swap_info[type] = p;
> +		WRITE_ONCE(swap_info[type], p);
>  		/*
>  		 * Write swap_info[type] before nr_swapfiles, in case a
>  		 * racing procfs swap_start() or swap_next() is reading them.
>  		 * (We never shrink nr_swapfiles, we never free this entry.)
>  		 */
>  		smp_wmb();
> -		nr_swapfiles++;
> +		WRITE_ONCE(nr_swapfiles, nr_swapfiles + 1);
>  	} else {
>  		kvfree(p);
>  		p = swap_info[type];

It is also possible to write this with smp_load_acquire() /
smp_store_release(). ARM64/RISC-V might benefit from that, OTOH ARM
won't like that much.

Dunno what would be better.


  parent reply	other threads:[~2019-01-30  9:13 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-11  9:59 [PATCH] mm, swap: Potential NULL dereference in get_swap_page_of_type() Dan Carpenter
2019-01-11 17:41 ` Daniel Jordan
2019-01-11 23:20   ` Andrea Parri
2019-01-14 22:25     ` Daniel Jordan
2019-01-15  0:23       ` [PATCH] mm, swap: bounds check swap_info accesses to avoid NULL derefs Daniel Jordan
2019-01-15  1:17         ` Andrea Parri
2019-01-30  6:26         ` Andrew Morton
2019-01-31  1:52           ` Daniel Jordan
2019-01-31  2:44             ` [PATCH v2] mm, swap: bounds check swap_info array " Daniel Jordan
2019-01-31  2:48           ` About swapoff race patch (was Re: [PATCH] mm, swap: bounds check swap_info accesses to avoid NULL derefs) Huang, Ying
2019-01-31 20:46             ` Andrew Morton
2019-02-02  7:14               ` Huang, Ying
2019-02-04 21:37               ` Hugh Dickins
2019-02-04 22:26                 ` Matthew Wilcox
2019-02-06  0:14                 ` Huang, Ying
2019-02-06  0:36                   ` Hugh Dickins
2019-02-06  0:58                     ` Huang, Ying
2019-02-08  0:28                 ` Andrea Parri
2019-02-11  1:02                   ` Huang, Ying
2019-01-30  7:28         ` [PATCH] mm, swap: bounds check swap_info accesses to avoid NULL derefs Dan Carpenter
2019-01-31  1:55           ` Daniel Jordan
2019-01-30  9:13         ` Peter Zijlstra [this message]
2019-01-31  2:00           ` Daniel Jordan
2019-01-15  0:28       ` [PATCH] mm, swap: Potential NULL dereference in get_swap_page_of_type() Andrea Parri
2019-01-14  2:12   ` Huang, Ying
2019-01-14  2:12     ` Huang, Ying
2019-01-14  8:43   ` Dan Carpenter
2019-01-14 23:40     ` Daniel Jordan

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190130091316.GC2278@hirez.programming.kicks-ass.net \
    --to=peterz@infradead.org \
    --cc=ak@linux.intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=andrea.parri@amarulasolutions.com \
    --cc=dan.carpenter@oracle.com \
    --cc=daniel.m.jordan@oracle.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=kernel-janitors@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=osandov@fb.com \
    --cc=paulmck@linux.ibm.com \
    --cc=sfr@canb.auug.org.au \
    --cc=shli@kernel.org \
    --cc=stern@rowland.harvard.edu \
    --cc=tj@kernel.org \
    --cc=will.deacon@arm.com \
    --cc=ying.huang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox