linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Peter Zijlstra <a.p.zijlstra@chello.nl>
To: Oleg Nesterov <oleg@tv-sign.ru>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Andrew Morton <akpm@linux-foundation.org>,
	Ingo Molnar <mingo@elte.hu>, Thomas Gleixner <tglx@linutronix.de>,
	Nick Piggin <npiggin@suse.de>
Subject: Re: [PATCH 1/2] scalable rw_mutex
Date: Sat, 12 May 2007 09:39:34 +0200	[thread overview]
Message-ID: <1178955574.6810.50.camel@twins> (raw)
In-Reply-To: <20070511230023.GA449@tv-sign.ru>

On Sat, 2007-05-12 at 03:00 +0400, Oleg Nesterov wrote:
> On 05/11, Peter Zijlstra wrote:
> > 
> > +static inline int __rw_mutex_read_trylock(struct rw_mutex *rw_mutex)
> > +{
> > +	preempt_disable();
> > +	if (likely(!__rw_mutex_reader_slow(rw_mutex))) {
> 
> 	--- WINDOW ---
> 
> > +		percpu_counter_mod(&rw_mutex->readers, 1);
> > +		preempt_enable();
> > +		return 1;
> > +	}
> > +	preempt_enable();
> > +	return 0;
> > +}

Yeah, I found that one when Andrew asked me about that preempt_disable()
thing.

How about:

int __rw_mutex_read_trylock(struct rw_mutex *rw_mutex)
{
	percpu_counter_inc(&rw_mutex->readers);
	if (unlikely(rw_mutex_reader_slow(rw_mutex))) {
		percpu_counter_dec(&rw_mutex->readers);
		/*
		 * possibly wake up a writer waiting for this reference to
		 * disappear
		 */
		wake_up(&rw_mutex->wait_queue);
		return 0;
	}
	return 1;
}


> > [...snip...]
> >
> > +void rw_mutex_write_lock_nested(struct rw_mutex *rw_mutex, int subclass)
> > +{
> > [...snip...]
> > +
> > +	/*
> > +	 * block new readers
> > +	 */
> > +	__rw_mutex_status_set(rw_mutex, RW_MUTEX_READER_SLOW);
> > +	/*
> > +	 * wait for all readers to go away
> > +	 */
> > +	wait_event(rw_mutex->wait_queue,
> > +			(percpu_counter_sum(&rw_mutex->readers) == 0));
> > +}
> 
> This look a bit suspicious, can't mutex_write_lock() set RW_MUTEX_READER_SLOW
> and find percpu_counter_sum() == 0 in that WINDOW above?

Indeed; however with the above having the reverse sequence this has, it
should be closed no?

> > +void rw_mutex_read_unlock(struct rw_mutex *rw_mutex)
> > +{
> > +     rwsem_release(&rw_mutex->dep_map, 1, _RET_IP_);
> > +
> > +     percpu_counter_mod(&rw_mutex->readers, -1);
> > +     if (unlikely(__rw_mutex_reader_slow(rw_mutex)) &&
> > +                     percpu_counter_sum(&rw_mutex->readers) == 0)

I took out the percpu_counter_sum()

> > +             wake_up_all(&rw_mutex->wait_queue);
> > +}
> 
> The same. __rw_mutex_status_set()->wmb() in rw_mutex_write_lock below
> is not enough. percpu_counter_mod() doesn't take fbc->lock if < FBC_BATCH,
> so we don't have a proper serialization.
> 
> write_lock() sets RW_MUTEX_READER_SLOW, finds percpu_counter_sum() != 0,
> and sleeps. rw_mutex_read_unlock() decrements cpu-local var, does not
> see RW_MUTEX_READER_SLOW and skips wake_up_all().

write lock              read lock               read unlock

a) state = slow         1) readers++            I)  readers--
b) wait(readers == 0)   2) if (state == slow)   II) if (state == slow)

That looks pretty safe to me; however are you suggesting the
percpu_counter_inc() needs some sort of barrier in order to be reliably
picked up by the percpu_counter_sum()?

something like this:

percpu_counter_{inc,dec}
smp_wmb()

vs

smp_rmb()
percpu_counter_sum(()

> > +void rw_mutex_write_lock_nested(struct rw_mutex *rw_mutex, int subclass)
> > +{
> > +	might_sleep();
> > +	rwsem_acquire(&rw_mutex->dep_map, subclass, 0, _RET_IP_);
> > +
> > +	mutex_lock_nested(&rw_mutex->write_mutex, subclass);
> > +	mutex_lock_nested(&rw_mutex->read_mutex, subclass);
> > +
> > +	/*
> > +	 * block new readers
> > +	 */
> > +	__rw_mutex_status_set(rw_mutex, RW_MUTEX_READER_SLOW);
> > +	/*
> > +	 * wait for all readers to go away
> > +	 */
> > +	wait_event(rw_mutex->wait_queue,
> > +			(percpu_counter_sum(&rw_mutex->readers) == 0));
> > +}
> > +
> > +void rw_mutex_write_unlock(struct rw_mutex *rw_mutex)
> > +{
> > +	int waiters;
> > +
> > +	rwsem_release(&rw_mutex->dep_map, 1, _RET_IP_);
> > +
> > +	/*
> > +	 * let the readers rip
> > +	 */
> > +	__rw_mutex_status_set(rw_mutex, RW_MUTEX_READER_FAST);
> > +	waiters = atomic_read(&rw_mutex->read_waiters);
> > +	mutex_unlock(&rw_mutex->read_mutex);
> > +	/*
> > +	 * wait for at least 1 reader to get through
> > +	 */
> > +	if (waiters) {
> > +		wait_event(rw_mutex->wait_queue,
> > +			(atomic_read(&rw_mutex->read_waiters) < waiters));
> > +	}
> > +	/*
> > +	 * before we let the writers rip
> > +	 */
> > +	mutex_unlock(&rw_mutex->write_mutex);
> > +}
> 
> Looks like we can have only one task on rw_mutex->wait_queue, and it holds
> ->write_mutex. Can't we use just a "task_struct *write_waiter" instead of
> ->wait_queue ? This makes rw_mutex smaller.

Good point; I'll try and figure out how to sleep and wake a single task
without the waitqueue.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

  reply	other threads:[~2007-05-12  7:39 UTC|newest]

Thread overview: 50+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-05-11 13:15 [PATCH 0/2] convert mmap_sem to a " Peter Zijlstra
2007-05-11 13:15 ` [PATCH 1/2] " Peter Zijlstra
2007-05-11 14:03   ` Christoph Hellwig
2007-05-11 16:31   ` Andrew Morton
2007-05-11 17:07     ` Christoph Lameter
2007-05-11 18:05       ` Andrew Morton
2007-05-12 18:55         ` Andi Kleen
2007-05-12 18:06           ` Andrew Morton
2007-05-12 18:11             ` Andrew Morton
2007-05-16 23:28             ` Andrew Morton
2007-05-16 23:40               ` Christoph Lameter
2007-05-17  0:24                 ` Andrew Morton
2007-05-12 18:12           ` Oleg Nesterov
2007-05-12 19:21             ` Andi Kleen
2007-05-12 21:42               ` Oleg Nesterov
2007-05-11 17:57     ` Peter Zijlstra
2007-05-11 23:00   ` Oleg Nesterov
2007-05-12  7:39     ` Peter Zijlstra [this message]
2007-05-12 13:41     ` Peter Zijlstra
2007-05-12 16:04       ` Oleg Nesterov
2007-05-12 16:57         ` Peter Zijlstra
2007-05-12 18:03           ` Oleg Nesterov
2007-05-14 10:59             ` Peter Zijlstra
2007-05-14 11:36               ` Nick Piggin
2007-05-15  0:36               ` Paul E. McKenney
2007-05-15  7:43                 ` Peter Zijlstra
2007-05-15 15:29                   ` Paul E. McKenney
2007-05-15 16:17                     ` Peter Zijlstra
2007-05-15 18:52                       ` Paul E. McKenney
2007-05-11 13:15 ` [PATCH 2/2] mm: change mmap_sem over to the " Peter Zijlstra
2007-05-11 16:17   ` Andrew Morton
2007-05-11 17:12     ` Peter Zijlstra
2007-05-11 18:08       ` Andrew Morton
2007-05-14 11:54         ` Nick Piggin
2007-05-11 15:56 ` [PATCH 0/2] convert mmap_sem to a " Ingo Molnar
2007-05-11 16:52   ` Eric Dumazet
2007-05-11 17:18     ` Peter Zijlstra
2007-05-14 12:07       ` Nick Piggin
2007-05-14 12:57         ` Peter Zijlstra
2007-05-11 17:08   ` Christoph Lameter
2007-05-14 11:58   ` Nick Piggin
2007-05-14 12:38     ` Peter Zijlstra
2007-05-12  9:27 ` Esben Nielsen
2007-05-12 10:01   ` Peter Zijlstra
2007-05-12 13:44     ` Esben Nielsen
2007-05-12 14:33       ` Ingo Molnar
2007-05-12 15:34         ` Esben Nielsen
2007-05-12 15:42           ` Ingo Molnar
2007-05-12 15:26       ` Eric Dumazet
2007-05-14  8:50         ` Esben Nielsen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1178955574.6810.50.camel@twins \
    --to=a.p.zijlstra@chello.nl \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mingo@elte.hu \
    --cc=npiggin@suse.de \
    --cc=oleg@tv-sign.ru \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox