From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 714C0C021B3 for ; Thu, 20 Feb 2025 19:18:51 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C08FD2802EE; Thu, 20 Feb 2025 14:18:50 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id BB84A2802A7; Thu, 20 Feb 2025 14:18:50 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A7FB92802EE; Thu, 20 Feb 2025 14:18:50 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 7ECA82802A7 for ; Thu, 20 Feb 2025 14:18:50 -0500 (EST) Received: from smtpin30.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id 2674F54078 for ; Thu, 20 Feb 2025 19:18:50 +0000 (UTC) X-FDA: 83141285220.30.F0456C1 Received: from out-178.mta0.migadu.com (out-178.mta0.migadu.com [91.218.175.178]) by imf10.hostedemail.com (Postfix) with ESMTP id 54150C000A for ; Thu, 20 Feb 2025 19:18:48 +0000 (UTC) Authentication-Results: imf10.hostedemail.com; dkim=pass header.d=linux.dev header.s=key1 header.b=a2VYRko8; spf=pass (imf10.hostedemail.com: domain of yosry.ahmed@linux.dev designates 91.218.175.178 as permitted sender) smtp.mailfrom=yosry.ahmed@linux.dev; dmarc=pass (policy=none) header.from=linux.dev ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1740079128; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=lTJZmh7yY26uVk6h15hnLF5tyEUJH/rJ9jba2hgemVU=; b=IjOu0ztMrpJe2PktCyhl3nZW4gQqtZ4ZIIHrWs1hFOF7mJXKgnSpdXjF66wdY1TA6wvFb1 X0+pZdkIMqysatKl2lW1GyIO+/ZVQKzyaWAhhyRyUwq5UU+241wFiSVV7/QTxM+U7DOIO9 T8WAezWxIy2ekylB9OFvMc+BAYbNxT8= ARC-Authentication-Results: i=1; imf10.hostedemail.com; dkim=pass header.d=linux.dev header.s=key1 header.b=a2VYRko8; spf=pass (imf10.hostedemail.com: domain of yosry.ahmed@linux.dev designates 91.218.175.178 as permitted sender) smtp.mailfrom=yosry.ahmed@linux.dev; dmarc=pass (policy=none) header.from=linux.dev ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1740079128; a=rsa-sha256; cv=none; b=28iFGetCOHAhLx+hwyd96iOxNYlqhJLLoBUJegDvlPwr/alyxlKLmRiWEAcwQwn60fGWir Y00qULALUpvmJHfkTiwS66JS4kTAu4vjHtbVFe2JFfMixpSLkFA2EQgZGvqbBUu4JmHfBO X1slWQOc0HywqPfWbY62WpPflGY3qyg= Date: Thu, 20 Feb 2025 19:18:40 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1740079125; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=lTJZmh7yY26uVk6h15hnLF5tyEUJH/rJ9jba2hgemVU=; b=a2VYRko85aunYYIDWBd3A1Qw66J2pHDgKO2ffcglYtammcMKPNRPknzC6zplhDY/w5z6KW dwJTKKC7yS7OcAMI+25jRRCD6YsGlpaYwHhqdbZNVRlYsSTepBBXE2TzbUZg/R8/k4W5rI +fPV9uKRrAxragWGlAmLTmomi7BGceI= X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Yosry Ahmed To: Sergey Senozhatsky Cc: Andrew Morton , Hillf Danton , Kairui Song , Minchan Kim , linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH v6 11/17] zsmalloc: make zspage lock preemptible Message-ID: References: <20250214045208.1388854-1-senozhatsky@chromium.org> <20250214045208.1388854-12-senozhatsky@chromium.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20250214045208.1388854-12-senozhatsky@chromium.org> X-Migadu-Flow: FLOW_OUT X-Rspamd-Queue-Id: 54150C000A X-Stat-Signature: brmqfd3xudqwwk3k8jgp84mabcfyujxe X-Rspam-User: X-Rspamd-Server: rspam01 X-HE-Tag: 1740079128-71929 X-HE-Meta: U2FsdGVkX1/NGKsocgBUrZnHZTl7/OhaFx/lnJi98WDeoIwcwwHaIuOy3FLY92f72e/M5Qxu63eOG7aUmmxZRQc0yKayS7a70zvHI1fAYnKGE2EihDWWlmHHgmNERdmH/yAI9pnLBFlROPZF2hOxJjsqB+Qwb/BhnPXY0O21Hh4TOMaow6vISp4y5DWoP2ulv7Ymk5JJTeJF/a/NdjIDwekBgswl39U7FkWBIfl6llnO0+JyK/6LPtro0RM1zXmp/v9PpreQe6jD9+7EwXnGx95MwUzc7ROrclNWmPt8NN4NAO6zsyp21PpFZVLRvvloZGROu5MEkxDYq2jan5GgBDCOWDf7JyrOgGyxBjqx64wOFXPMi0GanYycltOPea7dyy7xNGfv7mz50Xs29HnS6CQVr+9QJ9EbkmOVSjkvb3B3l+MA5G2CXNo02zCbxI5cq5jzRKOQLH2MLU957iDB/m42hTt7j9g4R8i7hqcngFzr+9yzqKsDM3NjmGbD0Y6fH0FhRG5ffQUWmIPidI9eWzUH0phOJdclTU7uvnmRMz+P7JDzMO3EiFfqOTs51qa2bU2o4CXkkZq4LkE/VioBqljqvLrqss2jsGTKQMkCzv/3mdvbzqrAKqzlpFEYCctiI9cgkBcuXM+ae3VciOiwWdqQ6wiAGfGxK8F9tQHAlemvxQWj85yftM25o0NupvzYre3IKD8K5qRT6Hz4bwEKNhFtcn995gxBquAoWPadgkbcgNrPKQQ3pZwVKG4wP8s7N9vjvslurk5Ft4ex6VCcCqhBIA4idQKDCU0r5dFRElRWbFmxTkD8xSD/keOW55cgl38UNCoKsbe6ORzy82NHQxKvtrMitCbciW4Msetjs4qceaVUHsfwJ0Lqv5j2Q5CLQlfQPm03Ma1FCKkxjVCfxHKs2J67CD9L+z8tLDFuyw1Xfz2xUUlzfBj9f0cI/5vr4l/cQL5gkJt9m/Uamoh I/pEV/a2 Nj9A1em0ZCnc5BwDtxnvWfocQnAlxIBe3jiQa71dbrilZCYr4jDLapxNLPeWDOLDB0BXWd8faPtoBuS8L6j8j0gd71uznuqHDV2bMixLLNQ9NYCBuKaQdsk1I1L6la1Hn0nkqUFDt9lJLwkcQBOEWrOVWioKEqyUuCeqTiVaORk67nNBkuMN2H02ppA9NPFAkdUvegGpStJp3oCIWLj6fTw/JyUaBYmsT1AdQ+HJ8cQB3RFqXRRAKf630BlbPcQoMFbxU X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Fri, Feb 14, 2025 at 01:50:23PM +0900, Sergey Senozhatsky wrote: > In order to implement preemptible object mapping we need a zspage lock > that satisfies several preconditions: > - it should be reader-write type of a lock > - it should be possible to hold it from any context, but also being > preemptible if the context allows it > - we never sleep while acquiring but can sleep while holding in read > mode > > An rwsemaphore doesn't suffice, due to atomicity requirements, rwlock > doesn't satisfy due to reader-preemptability requirement. It's also > worth to mention, that per-zspage rwsem is a little too memory heavy > (we can easily have double digits megabytes used only on rwsemaphores). > > Switch over from rwlock_t to a atomic_t-based implementation of a > reader-writer semaphore that satisfies all of the preconditions. > > The spin-lock based zspage_lock is suggested by Hillf Danton. > > Suggested-by: Hillf Danton > Signed-off-by: Sergey Senozhatsky > --- > mm/zsmalloc.c | 246 +++++++++++++++++++++++++++++++++++++++----------- > 1 file changed, 192 insertions(+), 54 deletions(-) > > diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c > index 2e338cde0d21..bc679a3e1718 100644 > --- a/mm/zsmalloc.c > +++ b/mm/zsmalloc.c > @@ -226,6 +226,9 @@ struct zs_pool { > /* protect zspage migration/compaction */ > rwlock_t lock; > atomic_t compaction_in_progress; > +#ifdef CONFIG_DEBUG_LOCK_ALLOC > + struct lock_class_key lock_class; > +#endif > }; > > static inline void zpdesc_set_first(struct zpdesc *zpdesc) > @@ -257,6 +260,18 @@ static inline void free_zpdesc(struct zpdesc *zpdesc) > __free_page(page); > } > > +#define ZS_PAGE_UNLOCKED 0 > +#define ZS_PAGE_WRLOCKED -1 > + > +struct zspage_lock { > + spinlock_t lock; > + int cnt; > + > +#ifdef CONFIG_DEBUG_LOCK_ALLOC > + struct lockdep_map dep_map; > +#endif > +}; > + > struct zspage { > struct { > unsigned int huge:HUGE_BITS; > @@ -269,7 +284,7 @@ struct zspage { > struct zpdesc *first_zpdesc; > struct list_head list; /* fullness list */ > struct zs_pool *pool; > - rwlock_t lock; > + struct zspage_lock zsl; > }; > > struct mapping_area { > @@ -279,6 +294,148 @@ struct mapping_area { > enum zs_mapmode vm_mm; /* mapping mode */ > }; > > +static void zspage_lock_init(struct zspage *zspage) > +{ > +#ifdef CONFIG_DEBUG_LOCK_ALLOC > + lockdep_init_map(&zspage->zsl.dep_map, "zspage->lock", > + &zspage->pool->lock_class, 0); > +#endif > + > + spin_lock_init(&zspage->zsl.lock); > + zspage->zsl.cnt = ZS_PAGE_UNLOCKED; > +} > + > +#ifdef CONFIG_DEBUG_LOCK_ALLOC Instead of the #ifdef and repeating all the functions, can't we do something like: #ifdef CONFIG_DEBUG_LOCK_ALLOC #define zspage_lock_map(zsl) (&zsl->dep_map) #else #define zspage_lock_map(zsl) /* empty or NULL */ #endif Then we can just have one version of the functions and use zspage_lock_map() instead of zsl->dep_map, right? > +static inline void __read_lock(struct zspage *zspage) > +{ > + struct zspage_lock *zsl = &zspage->zsl; > + > + rwsem_acquire_read(&zsl->dep_map, 0, 0, _RET_IP_); > + > + spin_lock(&zsl->lock); > + zsl->cnt++; Shouldn't we check if the lock is write locked? > + spin_unlock(&zsl->lock); > + > + lock_acquired(&zsl->dep_map, _RET_IP_); > +} > + > +static inline void __read_unlock(struct zspage *zspage) > +{ > + struct zspage_lock *zsl = &zspage->zsl; > + > + rwsem_release(&zsl->dep_map, _RET_IP_); > + > + spin_lock(&zsl->lock); > + zsl->cnt--; > + spin_unlock(&zsl->lock); > +}