From: Sebastian Andrzej Siewior
To: Minchan Kim
Cc: Andrew Morton, Sergey Senozhatsky, linux-mm, Thomas Gleixner
Subject: Re: [PATCH 7/8] zsmalloc: replace per zpage lock with pool->migrate_lock
Date: Thu, 11 Nov 2021 10:07:27 +0100
Message-ID: <20211111090727.eq67hxfpux23dagd@linutronix.de>
In-Reply-To: <20211110185433.1981097-8-minchan@kernel.org>
References: <20211110185433.1981097-1-minchan@kernel.org> <20211110185433.1981097-8-minchan@kernel.org>

On 2021-11-10 10:54:32 [-0800], Minchan Kim wrote:
> zsmalloc has used a bit spin_lock in the zpage handle to keep the
> zpage object alive during several operations. However, it causes
> problems for PREEMPT_RT as well as introducing too much complexity.
>
> This patch replaces the bit spin_lock with the pool->migrate_lock
> rwlock. It could make the code simpler as well as let zsmalloc work
> under PREEMPT_RT.
>
> The drawback is that pool->migrate_lock has coarser granularity than
> the per-zpage lock, so contention would be higher than before when
> both IO-related operations (i.e., zs_malloc, zs_free, zs_[map|unmap])
> and compaction (page/zpage migration) run in parallel (*: the
> migrate_lock is a rwlock and the IO-related functions all take the
> read side, so there is no contention among them). However, the write
> side is fast enough (the dominant overhead is just the page copy) so
> it shouldn't matter much. If the lock granularity becomes more of a
> problem later, we could introduce table locks based on the handle as
> a hash value.
>
> Signed-off-by: Minchan Kim
…
> index b8b098be92fa..5d4c4d254679 100644
> --- a/mm/zsmalloc.c
> +++ b/mm/zsmalloc.c
> @@ -1789,6 +1767,11 @@ static void migrate_write_lock(struct zspage *zspage)
>  	write_lock(&zspage->lock);
>  }
>
> +static void migrate_write_lock_nested(struct zspage *zspage)
> +{
> +	write_lock_nested(&zspage->lock, SINGLE_DEPTH_NESTING);

I don't have this in my tree.

> +}
> +
>  static void migrate_write_unlock(struct zspage *zspage)
>  {
>  	write_unlock(&zspage->lock);
…
> @@ -2077,8 +2043,13 @@ static unsigned long __zs_compact(struct zs_pool *pool,
>  	struct zspage *dst_zspage = NULL;
>  	unsigned long pages_freed = 0;
>
> +	/* protect the race between zpage migration and zs_free */
> +	write_lock(&pool->migrate_lock);
> +	/* protect zpage allocation/free */
>  	spin_lock(&class->lock);
>  	while ((src_zspage = isolate_zspage(class, true))) {
> +		/* protect someone accessing the zspage(i.e., zs_map_object) */
> +		migrate_write_lock(src_zspage);
>
>  		if (!zs_can_compact(class))
>  			break;
> @@ -2087,6 +2058,8 @@ static unsigned long __zs_compact(struct zs_pool *pool,
>  		cc.s_page = get_first_page(src_zspage);
>
>  		while ((dst_zspage = isolate_zspage(class, false))) {
> +			migrate_write_lock_nested(dst_zspage);
> +
>  			cc.d_page = get_first_page(dst_zspage);
>  			/*
>  			 * If there is no more space in dst_page, resched

Looking at these two chunks: the page here comes from a list, you remove
that page from that list, and this ensures that you can't lock the very
same pages in reverse order, as in:

	migrate_write_lock(dst_zspage);
	…
	migrate_write_lock(src_zspage);

right?

Sebastian
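
To spell out how I read the resulting ordering in __zs_compact() — this
is only a sketch pieced together from the hunks quoted above, using the
names that appear there (the parts I snipped, including the unlock side,
are not shown):

	write_lock(&pool->migrate_lock);		/* vs. zs_free() and zpage migration */
	spin_lock(&class->lock);			/* vs. zpage allocation/free in this class */

	src_zspage = isolate_zspage(class, true);	/* taken off the class list under class->lock */
	migrate_write_lock(src_zspage);			/* vs. readers such as zs_map_object() */

	dst_zspage = isolate_zspage(class, false);	/* also off the list, so it can never be src_zspage */
	migrate_write_lock_nested(dst_zspage);		/* second zspage->lock of the same lock class */
	…

If that reading is right, the _nested variant is only there to tell
lockdep that taking two zspage->lock instances of the same lock class in
src -> dst order is intentional (SINGLE_DEPTH_NESTING), and the removal
from the list is what guarantees that no other path can take the same
pair in the opposite order.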