From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 6 Feb 2025 16:01:12 +0900
From: Sergey Senozhatsky <senozhatsky@chromium.org>
To: Andrew Morton
Cc: Sergey Senozhatsky, Minchan Kim, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, Hillf Danton,
	Sebastian Andrzej Siewior, Mike Galbraith
Subject: Re: [PATCHv4 01/17] zram: switch to non-atomic entry locking
References: <20250131090658.3386285-1-senozhatsky@chromium.org>
	<20250131090658.3386285-2-senozhatsky@chromium.org>
	<20250131145536.d7a0483331b0ebfde80a5754@linux-foundation.org>
In-Reply-To: <20250131145536.d7a0483331b0ebfde80a5754@linux-foundation.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc-ing Sebastian and Mike

On (25/01/31 14:55), Andrew Morton wrote:
> > +static void zram_slot_write_lock(struct zram *zram, u32 index)
> > +{
> > +	atomic_t *lock = &zram->table[index].lock;
> > +	int old = atomic_read(lock);
> > +
> > +	do {
> > +		if (old != ZRAM_ENTRY_UNLOCKED) {
> > +			cond_resched();
> > +			old = atomic_read(lock);
> > +			continue;
> > +		}
> > +	} while (!atomic_try_cmpxchg(lock, &old, ZRAM_ENTRY_WRLOCKED));
> > +}
>
> I expect that if the calling userspace process has realtime policy (eg
> SCHED_FIFO) then the cond_resched() won't schedule SCHED_NORMAL tasks
> and this becomes a busy loop.  And if the machine is single-CPU, the
> loop is infinite.
>
> I do agree that for inventing new locking schemes, the bar is set
> really high.

So I completely reworked this bit and we no longer have that problem,
nor the problem of "inventing locking schemes in 2025".

In short, we are going back to bit-locks, which zram used until commit
9518e5bfaae19 ("zram: Replace bit spinlocks with a spinlock_t"); this
time around not bit-spinlocks, which won't work with linux-rt, but
wait_on_bit and friends.

The entry lock is exclusive, just like before, but the lock owner can
now sleep, and any task wishing to lock that same entry will wait to be
woken up by the current lock owner once it unlocks the entry.  For
cases when the lock is taken from atomic context (e.g. slot-free
notification from softirq) we continue to use the TRY variant, which
was introduced by commit 3c9959e025472 ("zram: fix lockdep warning of
free block handling"), so there's nothing new here.  In addition I
added some lockdep annotations, just to be safe.

There shouldn't be too many tasks competing for the same entry.  I can
only think of read/write (or slot-free notification, if zram is used as
a swap device) vs. writeback or recompression (we cannot have writeback
and recompression running simultaneously).
It currently looks like this:

---

struct zram_table_entry {
	unsigned long handle;
	unsigned long flags;
#ifdef CONFIG_DEBUG_LOCK_ALLOC
	struct lockdep_map lockdep_map;
#endif
#ifdef CONFIG_ZRAM_TRACK_ENTRY_ACTIME
	ktime_t ac_time;
#endif
};

/*
 * entry locking rules:
 *
 * 1) Lock is exclusive
 *
 * 2) lock() function can sleep waiting for the lock
 *
 * 3) Lock owner can sleep
 *
 * 4) Use TRY lock variant when in atomic context
 *    - must check return value and handle locking failures
 */
static __must_check bool zram_slot_try_lock(struct zram *zram, u32 index)
{
	unsigned long *lock = &zram->table[index].flags;

	if (!test_and_set_bit_lock(ZRAM_ENTRY_LOCK, lock)) {
#ifdef CONFIG_DEBUG_LOCK_ALLOC
		mutex_acquire(&zram->table[index].lockdep_map, 0, 0, _RET_IP_);
#endif
		return true;
	}
	return false;
}

static void zram_slot_lock(struct zram *zram, u32 index)
{
	unsigned long *lock = &zram->table[index].flags;

	WARN_ON_ONCE(!preemptible());
	wait_on_bit_lock(lock, ZRAM_ENTRY_LOCK, TASK_UNINTERRUPTIBLE);
#ifdef CONFIG_DEBUG_LOCK_ALLOC
	mutex_acquire(&zram->table[index].lockdep_map, 0, 0, _RET_IP_);
#endif
}

static void zram_slot_unlock(struct zram *zram, u32 index)
{
	unsigned long *lock = &zram->table[index].flags;

	clear_and_wake_up_bit(ZRAM_ENTRY_LOCK, lock);
#ifdef CONFIG_DEBUG_LOCK_ALLOC
	mutex_release(&zram->table[index].lockdep_map, _RET_IP_);
#endif
}