Date: Thu, 11 Nov 2021 15:11:34 -0800
From: Minchan Kim
To: Sebastian Andrzej Siewior
Cc: Andrew Morton, Sergey Senozhatsky, linux-mm, Thomas Gleixner
Subject: Re: [PATCH 7/8] zsmalloc: replace per zpage lock with pool->migrate_lock
References: <20211110185433.1981097-1-minchan@kernel.org> <20211110185433.1981097-8-minchan@kernel.org> <20211111090727.eq67hxfpux23dagd@linutronix.de>
In-Reply-To: <20211111090727.eq67hxfpux23dagd@linutronix.de>

On Thu, Nov 11, 2021 at 10:07:27AM +0100, Sebastian Andrzej Siewior wrote:
> On 2021-11-10 10:54:32 [-0800], Minchan Kim wrote:
> > zsmalloc has used a bit in the zpage handle as a spin lock to keep
> > the zpage object alive during several operations. However, it causes
> > problems for PREEMPT_RT as well as adding too much complexity.
> >
> > This patch replaces the bit spin_lock with the pool->migrate_lock
> > rwlock. It makes the code simpler as well as letting zsmalloc work
> > under PREEMPT_RT.
> >
> > The drawback is that pool->migrate_lock has coarser granularity than
> > the per-zpage lock, so contention would be higher than before when
> > both IO-related operations (i.e., zsmalloc, zsfree, zs_[map|unmap])
> > and compaction (page/zpage migration) run in parallel (*, the
> > migrate_lock is an rwlock and the IO-related functions all take the
> > read side, so there is no contention among them). However, the
> > write side is fast enough (the dominant overhead is just the page
> > copy), so it shouldn't matter much. If the lock granularity becomes
> > more of a problem later, we could introduce table locks keyed by
> > the handle as a hash value.
> >
> > Signed-off-by: Minchan Kim
> …
> > index b8b098be92fa..5d4c4d254679 100644
> > --- a/mm/zsmalloc.c
> > +++ b/mm/zsmalloc.c
> > @@ -1789,6 +1767,11 @@ static void migrate_write_lock(struct zspage *zspage)
> >  	write_lock(&zspage->lock);
> >  }
> >
> > +static void migrate_write_lock_nested(struct zspage *zspage)
> > +{
> > +	write_lock_nested(&zspage->lock, SINGLE_DEPTH_NESTING);
>
> I don't have this in my tree.

I forgot it. I append it at the tail of the thread.
I will also include it in the next revision.
>
> > +}
> > +
> >  static void migrate_write_unlock(struct zspage *zspage)
> >  {
> >  	write_unlock(&zspage->lock);
> …
> > @@ -2077,8 +2043,13 @@ static unsigned long __zs_compact(struct zs_pool *pool,
> >  	struct zspage *dst_zspage = NULL;
> >  	unsigned long pages_freed = 0;
> >
> > +	/* protect the race between zpage migration and zs_free */
> > +	write_lock(&pool->migrate_lock);
> > +	/* protect zpage allocation/free */
> >  	spin_lock(&class->lock);
> >  	while ((src_zspage = isolate_zspage(class, true))) {
> > +		/* protect someone accessing the zspage(i.e., zs_map_object) */
> > +		migrate_write_lock(src_zspage);
> >
> >  		if (!zs_can_compact(class))
> >  			break;
> > @@ -2087,6 +2058,8 @@ static unsigned long __zs_compact(struct zs_pool *pool,
> >  		cc.s_page = get_first_page(src_zspage);
> >
> >  		while ((dst_zspage = isolate_zspage(class, false))) {
> > +			migrate_write_lock_nested(dst_zspage);
> > +
> >  			cc.d_page = get_first_page(dst_zspage);
> >  			/*
> >  			 * If there is no more space in dst_page, resched
>
> Looking at these two chunks, the page here comes from a list, you
> remove that page from that list, and this ensures that you can't lock
> the very same pages in reverse order as in:
>
> 	migrate_write_lock(dst_zspage);
> 	…
> 	migrate_write_lock(src_zspage);
>
> right?

Sure.

From e0bfc5185bbd15c651a7a367b6d053b8c88b1e01 Mon Sep 17 00:00:00 2001
From: Minchan Kim
Date: Tue, 19 Oct 2021 15:34:09 -0700
Subject: [PATCH] locking/rwlocks: introduce write_lock_nested

Signed-off-by: Minchan Kim
---
 include/linux/rwlock.h          | 6 ++++++
 include/linux/rwlock_api_smp.h  | 9 +++++++++
 include/linux/spinlock_api_up.h | 1 +
 kernel/locking/spinlock.c       | 6 ++++++
 4 files changed, 22 insertions(+)

diff --git a/include/linux/rwlock.h b/include/linux/rwlock.h
index 7ce9a51ae5c0..93086de7bf9e 100644
--- a/include/linux/rwlock.h
+++ b/include/linux/rwlock.h
@@ -70,6 +70,12 @@ do {								\
 #define write_lock(lock)	_raw_write_lock(lock)
 #define read_lock(lock)		_raw_read_lock(lock)
 
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+#define write_lock_nested(lock, subclass)	_raw_write_lock_nested(lock, subclass)
+#else
+#define write_lock_nested(lock, subclass)	_raw_write_lock(lock)
+#endif
+
 #if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK)
 
 #define read_lock_irqsave(lock, flags)			\
diff --git a/include/linux/rwlock_api_smp.h b/include/linux/rwlock_api_smp.h
index abfb53ab11be..e0c866177c03 100644
--- a/include/linux/rwlock_api_smp.h
+++ b/include/linux/rwlock_api_smp.h
@@ -17,6 +17,7 @@
 
 void __lockfunc _raw_read_lock(rwlock_t *lock)		__acquires(lock);
 void __lockfunc _raw_write_lock(rwlock_t *lock)		__acquires(lock);
+void __lockfunc _raw_write_lock_nested(rwlock_t *lock, int subclass)	__acquires(lock);
 void __lockfunc _raw_read_lock_bh(rwlock_t *lock)	__acquires(lock);
 void __lockfunc _raw_write_lock_bh(rwlock_t *lock)	__acquires(lock);
 void __lockfunc _raw_read_lock_irq(rwlock_t *lock)	__acquires(lock);
@@ -46,6 +47,7 @@ _raw_write_unlock_irqrestore(rwlock_t *lock, unsigned long flags)
 
 #ifdef CONFIG_INLINE_WRITE_LOCK
 #define _raw_write_lock(lock) __raw_write_lock(lock)
+#define _raw_write_lock_nested(lock, subclass) __raw_write_lock_nested(lock, subclass)
 #endif
 
 #ifdef CONFIG_INLINE_READ_LOCK_BH
@@ -211,6 +213,13 @@ static inline void __raw_write_lock(rwlock_t *lock)
 	LOCK_CONTENDED(lock, do_raw_write_trylock, do_raw_write_lock);
 }
 
+static inline void __raw_write_lock_nested(rwlock_t *lock, int subclass)
+{
+	preempt_disable();
+	rwlock_acquire(&lock->dep_map, subclass, 0, _RET_IP_);
+	LOCK_CONTENDED(lock, do_raw_write_trylock, do_raw_write_lock);
+}
+
 #endif /* !CONFIG_GENERIC_LOCKBREAK || CONFIG_DEBUG_LOCK_ALLOC */
 
 static inline void __raw_write_unlock(rwlock_t *lock)
diff --git a/include/linux/spinlock_api_up.h b/include/linux/spinlock_api_up.h
index d0d188861ad6..b8ba00ccccde 100644
--- a/include/linux/spinlock_api_up.h
+++ b/include/linux/spinlock_api_up.h
@@ -59,6 +59,7 @@
 #define _raw_spin_lock_nested(lock, subclass)	__LOCK(lock)
 #define _raw_read_lock(lock)			__LOCK(lock)
 #define _raw_write_lock(lock)			__LOCK(lock)
+#define _raw_write_lock_nested(lock, subclass)	__LOCK(lock)
 #define _raw_spin_lock_bh(lock)			__LOCK_BH(lock)
 #define _raw_read_lock_bh(lock)			__LOCK_BH(lock)
 #define _raw_write_lock_bh(lock)		__LOCK_BH(lock)
diff --git a/kernel/locking/spinlock.c b/kernel/locking/spinlock.c
index c5830cfa379a..22969ec69288 100644
--- a/kernel/locking/spinlock.c
+++ b/kernel/locking/spinlock.c
@@ -300,6 +300,12 @@ void __lockfunc _raw_write_lock(rwlock_t *lock)
 	__raw_write_lock(lock);
 }
 EXPORT_SYMBOL(_raw_write_lock);
+
+void __lockfunc _raw_write_lock_nested(rwlock_t *lock, int subclass)
+{
+	__raw_write_lock_nested(lock, subclass);
+}
+EXPORT_SYMBOL(_raw_write_lock_nested);
 #endif
 
 #ifndef CONFIG_INLINE_WRITE_LOCK_IRQSAVE
-- 
2.34.0.rc1.387.gb447b232ab-goog
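
For readers following along, here is a minimal usage sketch (not part of either patch; the lock and function names are made up for illustration) of how write_lock_nested() is meant to be used. Once the caller guarantees a fixed acquisition order for two rwlocks of the same lock class (in __zs_compact() that guarantee comes from taking each zspage off a list before locking it), the inner lock is annotated with SINGLE_DEPTH_NESTING so lockdep does not report a false ABBA deadlock:

/*
 * Illustrative sketch only: src_lock/dst_lock and move_objects() are
 * made-up names standing in for two zspage->lock instances of the same
 * lock class; they are not symbols from the patches above.
 */
#include <linux/spinlock.h>
#include <linux/lockdep.h>

static DEFINE_RWLOCK(src_lock);
static DEFINE_RWLOCK(dst_lock);

static void move_objects(void)
{
	/* The outer lock is taken the normal way. */
	write_lock(&src_lock);

	/*
	 * The second lock of the same class is annotated as nested; the
	 * caller has already ensured src != dst and a fixed ordering, so
	 * no ABBA deadlock is possible and lockdep should stay quiet.
	 */
	write_lock_nested(&dst_lock, SINGLE_DEPTH_NESTING);

	/* ... migrate objects from src to dst ... */

	write_unlock(&dst_lock);
	write_unlock(&src_lock);
}

Only lockdep builds need the annotation; as the rwlock.h hunk above shows, the !CONFIG_DEBUG_LOCK_ALLOC case simply falls back to a plain _raw_write_lock().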