From: Peter Zijlstra <peterz@infradead.org>
To: Byungchul Park <byungchul.park@lge.com>
Cc: mingo@kernel.org, tglx@linutronix.de, walken@google.com,
	boqun.feng@gmail.com, kirill@shutemov.name,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	iamjoonsoo.kim@lge.com, akpm@linux-foundation.org,
	npiggin@gmail.com
Subject: Re: [PATCH v4 07/15] lockdep: Implement crossrelease feature
Date: Mon, 16 Jan 2017 16:10:01 +0100	[thread overview]
Message-ID: <20170116151001.GD3144@twins.programming.kicks-ass.net> (raw)
In-Reply-To: <1481260331-360-8-git-send-email-byungchul.park@lge.com>

On Fri, Dec 09, 2016 at 02:12:03PM +0900, Byungchul Park wrote:

> @@ -143,6 +149,9 @@ struct lock_class_stats lock_stats(struct lock_class *class);
>  void clear_lock_stats(struct lock_class *class);
>  #endif
>  
> +#ifdef CONFIG_LOCKDEP_CROSSRELEASE
> +struct cross_lock;
> +#endif

That seems like pointless wrappery; unused (forward) declarations are
harmless.

>  /*
>   * Map the lock object (the lock instance) to the lock-class object.
>   * This is embedded into specific lock instances:
> @@ -155,6 +164,9 @@ struct lockdep_map {
>  	int				cpu;
>  	unsigned long			ip;
>  #endif
> +#ifdef CONFIG_LOCKDEP_CROSSRELEASE
> +	struct cross_lock		*xlock;
> +#endif

The use of this escapes me; why does the lockdep_map need a pointer to
this?

>  };
>  
>  static inline void lockdep_copy_map(struct lockdep_map *to,
> @@ -258,7 +270,82 @@ struct held_lock {
>  	unsigned int hardirqs_off:1;
>  	unsigned int references:12;					/* 32 bits */
>  	unsigned int pin_count;
> +#ifdef CONFIG_LOCKDEP_CROSSRELEASE
> +	/*
> +	 * This is used to find out the first plock among plocks having
> +	 * been acquired since a crosslock was held. Crossrelease feature
> +	 * uses chain cache between the crosslock and the first plock to
> +	 * avoid building unnecessary dependencies, like how lockdep uses
> +	 * a sort of chain cache for normal locks.
> +	 */
> +	unsigned int gen_id;
> +#endif
> +};

Makes sense, except we'll have a bunch of different generation numbers
(see below), so this one should be named more explicitly.

> +
> +#ifdef CONFIG_LOCKDEP_CROSSRELEASE
> +#define MAX_PLOCK_TRACE_ENTRIES		5

Why 5? ;-)

> +/*
> + * This is for keeping locks waiting for commit to happen so that
> + * dependencies are actually built later at commit step.
> + *
> + * Every task_struct has an array of pend_lock. Each entiry will be
> + * added with a lock whenever lock_acquire() is called for normal lock.
> + */
> +struct pend_lock {

Like said before, I'm not sure "pending" is the right word, we track
locks that have very much been released. Would something like "lock
history" make more sense?

> +	/*
> +	 * prev_gen_id is used to check whether any other hlock in the
> +	 * current is already dealing with the xlock, with which commit
> +	 * is performed. If so, this plock can be skipped.
> +	 */
> +	unsigned int		prev_gen_id;

Confused..

> +	/*
> +	 * A kind of global timestamp increased and set when this plock
> +	 * is inserted.
> +	 */
> +	unsigned int		gen_id;

Right, except you also have pend_lock::hlock::gen_id, which I think is
the very same generation number, no?

> +
> +	int			hardirq_context;
> +	int			softirq_context;

This would fit in 2 bits; why use 8 bytes?

> +
> +	/*
> +	 * Whenever irq happens, these are updated so that we can
> +	 * distinguish each irq context uniquely.
> +	 */
> +	unsigned int		hardirq_id;
> +	unsigned int		softirq_id;

An alternative approach would be to 'unwind' or discard all historical
events from a nested context once we exit it.

After all, all we care about is the history of the release context, once
the context is gone, we don't care.

> +
> +	/*
> +	 * Seperate stack_trace data. This will be used at commit step.
> +	 */
> +	struct stack_trace	trace;
> +	unsigned long		trace_entries[MAX_PLOCK_TRACE_ENTRIES];
> +
> +	/*
> +	 * Seperate hlock instance. This will be used at commit step.
> +	 */
> +	struct held_lock	hlock;
> +};
> +
> +/*
> + * One cross_lock per one lockdep_map.
> + *
> + * To initialize a lock as crosslock, lockdep_init_map_crosslock() should
> + * be used instead of lockdep_init_map(), where the pointer of cross_lock
> + * instance should be passed as a parameter.
> + */
> +struct cross_lock {
> +	unsigned int		gen_id;

Again, you already have hlock::gen_id for this, no?

> +	struct list_head	xlock_entry;
> +
> +	/*
> +	 * Seperate hlock instance. This will be used at commit step.
> +	 */
> +	struct held_lock	hlock;
> +
> +	int			ref; /* reference count */
>  };

Why not do something like:

struct lockdep_map_cross {
	struct lockdep_map	map;
	struct held_lock	hlock;
};

That saves at least that pointer.

But I still have to figure out why we need this hlock.

Also note that a full hlock contains superfluous information:

 - prev_chain_key: not required, we can compute the 2-entry chain hash
 		   on demand when we need it.
 - instance: we already have the lockdep_map right here, pointers to
 	     self are kinda pointless
 - nest_lock: not sure that makes sense wrt this stuff.
 - class_idx: can recompute if we have lockdep_map
 - reference: see nest_lock

> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 253538f..592ee368 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -1719,6 +1719,11 @@ struct task_struct {
>  	struct held_lock held_locks[MAX_LOCK_DEPTH];
>  	gfp_t lockdep_reclaim_gfp;
>  #endif
> +#ifdef CONFIG_LOCKDEP_CROSSRELEASE
> +#define MAX_PLOCKS_NR 1024UL
> +	int plock_index;
> +	struct pend_lock *plocks;
> +#endif

That's a giant heap of memory.. why 1024?


> diff --git a/kernel/fork.c b/kernel/fork.c
> index 4a7ec0c..91ab81b 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c

> @@ -1443,6 +1451,10 @@ static struct task_struct *copy_process(unsigned long clone_flags,
>  	p->lockdep_depth = 0; /* no locks held yet */
>  	p->curr_chain_key = 0;
>  	p->lockdep_recursion = 0;
> +#ifdef CONFIG_LOCKDEP_CROSSRELEASE
> +	p->plock_index = 0;
> +	p->plocks = vzalloc(sizeof(struct pend_lock) * MAX_PLOCKS_NR);

And while I understand why you need vmalloc for that amount of memory,
do realize that on 32-bit kernels you'll very quickly run out of space
this way.

That pend_lock thing could maybe be shrunk to 128 bytes, at which point
you can fit 64 in two pages, is that not sufficient?


> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index 11580ec..2c8b2c1 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c

> +#ifdef CONFIG_LOCKDEP_CROSSRELEASE
> +
> +static LIST_HEAD(xlocks_head);

Still no explanation of what this list is for...

> +
> +/*
> + * Whenever a crosslock is held, cross_gen_id will be increased.
> + */
> +static atomic_t cross_gen_id; /* Can be wrapped */
> +
> +/* Implement a circular buffer - for internal use */
> +#define cir_p(n, i)		((i) ? (i) - 1 : (n) - 1)

This is broken; I think you'll want something like: (((i)-1) % (n))

> +#define cir_n(n, i)		((i) == (n) - 1 ? 0 : (i) + 1)

Idem

> +/*
> + * Crossrelease needs to distinguish each hardirq context.
> + */
> +static DEFINE_PER_CPU(unsigned int, hardirq_id);
> +void crossrelease_hardirq_start(void)
> +{
> +	per_cpu(hardirq_id, smp_processor_id())++;
> +}
> +
> +/*
> + * Crossrelease needs to distinguish each softirq context.
> + */
> +static DEFINE_PER_CPU(unsigned int, softirq_id);
> +void crossrelease_softirq_start(void)
> +{
> +	per_cpu(softirq_id, smp_processor_id())++;
> +}

See above; I don't think we need to retain the plock stuff once a
context finishes, and therefore we don't need context generation numbers
to differentiate them.

> +/*
> + * To find the earlist crosslock among all crosslocks not released yet.
> + */
> +static unsigned int gen_id_begin(void)
> +{
> +	struct cross_lock *xlock = list_entry_rcu(xlocks_head.next,
> +			struct cross_lock, xlock_entry);
> +
> +	/* If empty */
> +	if (&xlock->xlock_entry == &xlocks_head)
> +		return (unsigned int)atomic_read(&cross_gen_id) + 1;
> +
> +	return READ_ONCE(xlock->gen_id);
> +}
> +
> +/*
> + * To find the latest crosslock among all crosslocks already released.
> + */
> +static inline unsigned int gen_id_done(void)
> +{
> +	return gen_id_begin() - 1;
> +}

I'm not sure about these... if you increment the generation count on any
cross action (both acquire and release) and tag all hlocks with the
current reading, you have all the ordering required, no?

> +/*
> + * No contention. Irq disable is only required.
> + */
> +static void add_plock(struct held_lock *hlock, unsigned int prev_gen_id,
> +		unsigned int gen_id_done)
> +{
> +	struct task_struct *curr = current;
> +	int cpu = smp_processor_id();
> +	struct pend_lock *plock;
> +	/*
> +	 *	CONTEXT 1		CONTEXT 2
> +	 *	---------		---------
> +	 *	acquire A (cross)
> +	 *	X = atomic_inc_return()
> +	 *	~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ serialize
> +	 *				Y = atomic_read_acquire()
> +	 *				acquire B
> +	 *				acquire C
> +	 *
> +	 * For ordering between this and all following LOCKs.
> +	 * This way we ensure the order A -> B -> C when CONTEXT 2
> +	 * can see Y is equal to or greater than X.
> +	 *
> +	 * Pairs with atomic_inc_return() in add_xlock().
> +	 */
> +	unsigned int gen_id = (unsigned int)atomic_read_acquire(&cross_gen_id);

fails to explain why this is important.


