From: Yosry Ahmed <yosry.ahmed@linux.dev>
To: Sergey Senozhatsky
Cc: Andrew Morton, Hillf Danton, Kairui Song, Minchan Kim, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v6 11/17] zsmalloc: make zspage lock preemptible
Date: Thu, 20 Feb 2025 19:19:39 +0000
References: <20250214045208.1388854-1-senozhatsky@chromium.org> <20250214045208.1388854-12-senozhatsky@chromium.org>
On Thu, Feb 20, 2025 at 07:18:40PM +0000, Yosry Ahmed wrote:
> On Fri, Feb 14, 2025 at 01:50:23PM +0900, Sergey Senozhatsky wrote:
> > In order to implement preemptible object mapping we need a zspage lock
> > that satisfies several preconditions:
> > - it should be a reader-writer type of lock
> > - it should be possible to hold it from any context, but also be
> >   preemptible if the context allows it
> > - we never sleep while acquiring, but can sleep while holding in read
> >   mode
> >
> > An rwsemaphore doesn't suffice due to the atomicity requirements, and
> > rwlock doesn't satisfy the reader-preemptibility requirement. It's also
> > worth mentioning that a per-zspage rwsem is a little too memory heavy
> > (we can easily have double-digit megabytes used on rwsemaphores alone).
> >
> > Switch over from rwlock_t to an atomic_t-based implementation of a
> > reader-writer semaphore that satisfies all of the preconditions.
> >
> > The spinlock-based zspage_lock was suggested by Hillf Danton.
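(Aside: the semantics described above -- never sleep to acquire, readers
preemptible and even allowed to sleep while holding, writers excluding
everyone -- can be modeled in userspace roughly as below. A pthread mutex
stands in for the kernel spinlock, and all of the names here are my own
illustration, not the patch's code.)

```c
#include <assert.h>
#include <pthread.h>

#define ZS_PAGE_UNLOCKED   0
#define ZS_PAGE_WRLOCKED  -1

/* Userspace model of the described zspage lock: a pthread mutex stands
 * in for the kernel spinlock that serializes updates to ->cnt. */
struct zspage_lock_model {
	pthread_mutex_t lock;
	int cnt;	/* 0 = unlocked, >0 = readers, -1 = write-locked */
};

static void zsl_init(struct zspage_lock_model *zsl)
{
	pthread_mutex_init(&zsl->lock, NULL);
	zsl->cnt = ZS_PAGE_UNLOCKED;
}

static void zsl_read_lock(struct zspage_lock_model *zsl)
{
	/* A writer keeps ->lock held for its whole critical section, so
	 * merely acquiring it proves the page is not write-locked. */
	pthread_mutex_lock(&zsl->lock);
	zsl->cnt++;
	/* Readers drop the lock right away: they stay counted in ->cnt
	 * but may be preempted, or even sleep, while "holding". */
	pthread_mutex_unlock(&zsl->lock);
}

static void zsl_read_unlock(struct zspage_lock_model *zsl)
{
	pthread_mutex_lock(&zsl->lock);
	zsl->cnt--;
	pthread_mutex_unlock(&zsl->lock);
}

static int zsl_write_trylock(struct zspage_lock_model *zsl)
{
	pthread_mutex_lock(&zsl->lock);
	if (zsl->cnt == ZS_PAGE_UNLOCKED) {
		zsl->cnt = ZS_PAGE_WRLOCKED;
		/* Keep ->lock held until write-unlock, blocking readers. */
		return 1;
	}
	pthread_mutex_unlock(&zsl->lock);
	return 0;
}

static void zsl_write_unlock(struct zspage_lock_model *zsl)
{
	zsl->cnt = ZS_PAGE_UNLOCKED;
	pthread_mutex_unlock(&zsl->lock);
}
```

Note the footprint argument from the changelog: this is one spinlock plus
one int per zspage, versus a full rw_semaphore per zspage.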
> >
> > Suggested-by: Hillf Danton
> > Signed-off-by: Sergey Senozhatsky
> > ---
> >  mm/zsmalloc.c | 246 +++++++++++++++++++++++++++++++++++++++-----------
> >  1 file changed, 192 insertions(+), 54 deletions(-)
> >
> > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
> > index 2e338cde0d21..bc679a3e1718 100644
> > --- a/mm/zsmalloc.c
> > +++ b/mm/zsmalloc.c
> > @@ -226,6 +226,9 @@ struct zs_pool {
> >  	/* protect zspage migration/compaction */
> >  	rwlock_t lock;
> >  	atomic_t compaction_in_progress;
> > +#ifdef CONFIG_DEBUG_LOCK_ALLOC
> > +	struct lock_class_key lock_class;
> > +#endif
> >  };
> >
> >  static inline void zpdesc_set_first(struct zpdesc *zpdesc)
> > @@ -257,6 +260,18 @@ static inline void free_zpdesc(struct zpdesc *zpdesc)
> >  	__free_page(page);
> >  }
> >
> > +#define ZS_PAGE_UNLOCKED	0
> > +#define ZS_PAGE_WRLOCKED	-1
> > +
> > +struct zspage_lock {
> > +	spinlock_t lock;
> > +	int cnt;
> > +
> > +#ifdef CONFIG_DEBUG_LOCK_ALLOC
> > +	struct lockdep_map dep_map;
> > +#endif
> > +};
> > +
> >  struct zspage {
> >  	struct {
> >  		unsigned int huge:HUGE_BITS;
> > @@ -269,7 +284,7 @@ struct zspage {
> >  	struct zpdesc *first_zpdesc;
> >  	struct list_head list; /* fullness list */
> >  	struct zs_pool *pool;
> > -	rwlock_t lock;
> > +	struct zspage_lock zsl;
> >  };
> >
> >  struct mapping_area {
> > @@ -279,6 +294,148 @@ struct mapping_area {
> >  	enum zs_mapmode vm_mm; /* mapping mode */
> >  };
> >
> > +static void zspage_lock_init(struct zspage *zspage)
> > +{
> > +#ifdef CONFIG_DEBUG_LOCK_ALLOC
> > +	lockdep_init_map(&zspage->zsl.dep_map, "zspage->lock",
> > +			 &zspage->pool->lock_class, 0);
> > +#endif
> > +
> > +	spin_lock_init(&zspage->zsl.lock);
> > +	zspage->zsl.cnt = ZS_PAGE_UNLOCKED;
> > +}
> > +
> > +#ifdef CONFIG_DEBUG_LOCK_ALLOC
>
> Instead of the #ifdef and repeating all the functions, can't we do
> something like:
>
> #ifdef CONFIG_DEBUG_LOCK_ALLOC
> #define zspage_lock_map(zsl) (&zsl->dep_map)
> #else
> #define zspage_lock_map(zsl) /* empty or NULL */
> #endif
>
> Then we can just have one version of the functions and use
> zspage_lock_map() instead of zsl->dep_map, right?
>
> > +static inline void __read_lock(struct zspage *zspage)
> > +{
> > +	struct zspage_lock *zsl = &zspage->zsl;
> > +
> > +	rwsem_acquire_read(&zsl->dep_map, 0, 0, _RET_IP_);
> > +
> > +	spin_lock(&zsl->lock);
> > +	zsl->cnt++;
>
> Shouldn't we check if the lock is write-locked?

Never mind, we keep the spinlock held while write-locked, so readers
cannot observe that state.
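FWIW, I imagine the single-version approach would look something like the
below. This is a userspace sketch: struct lockdep_map, rwsem_acquire_read()
and the counter are stubbed/simplified here, and in the kernel the lockdep
annotations compile to no-ops when CONFIG_DEBUG_LOCK_ALLOC is off, so the
empty/NULL branch of the macro costs nothing there.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the lockdep hook; the kernel's version becomes
 * a no-op macro when CONFIG_DEBUG_LOCK_ALLOC is off. */
struct lockdep_map { int dummy; };
static int lockdep_acquisitions;

static void rwsem_acquire_read(struct lockdep_map *map, int subclass,
			       int trylock, unsigned long ip)
{
	(void)subclass; (void)trylock; (void)ip;
	if (map)
		lockdep_acquisitions++;
}

#define CONFIG_DEBUG_LOCK_ALLOC 1	/* remove to exercise the other branch */

struct zspage_lock {
	int cnt;
#ifdef CONFIG_DEBUG_LOCK_ALLOC
	struct lockdep_map dep_map;
#endif
};

/* The suggestion: confine the #ifdef to a single accessor macro... */
#ifdef CONFIG_DEBUG_LOCK_ALLOC
#define zspage_lock_map(zsl)	(&(zsl)->dep_map)
#else
#define zspage_lock_map(zsl)	((struct lockdep_map *)NULL)
#endif

/* ...so each lock function needs only one definition. */
static void __read_lock(struct zspage_lock *zsl)
{
	rwsem_acquire_read(zspage_lock_map(zsl), 0, 0, 0);
	zsl->cnt++;	/* the spinlock is elided in this sketch */
}
```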