From: Yosry Ahmed
Date: Tue, 7 Jan 2025 17:18:39 -0800
Subject: Re: [PATCH v2 2/2] mm: zswap: disable migration while using per-CPU acomp_ctx
To: "Sridhar, Kanchana P"
Cc: Andrew Morton, Johannes Weiner, Nhat Pham, Chengming Zhou, Vitaly Wool, Barry Song, Sam Sun, "linux-mm@kvack.org", "linux-kernel@vger.kernel.org", "stable@vger.kernel.org"
References: <20250107222236.2715883-1-yosryahmed@google.com> <20250107222236.2715883-2-yosryahmed@google.com>

[..]

> > > Couple of possibly related thoughts:
> > >
> > > 1) I have been thinking some more about the purpose of this per-cpu acomp_ctx
> > > mutex. It appears that the main benefit is it causes task blocked errors (which are
> > > useful to detect problems) if any computes in the code section it covers take a
> > > long duration. Other than that, it does not protect a resource, nor prevent
> > > cpu offlining from deleting its containing structure.
> >
> > It does protect resources.
> > Consider this case:
> > - Process A runs on CPU #1, gets the acomp_ctx, and locks it, then is
> > migrated to CPU #2.
> > - Process B runs on CPU #1, gets the same acomp_ctx, and tries to lock
> > it then waits for process A. Without the lock they would be using the
> > same acomp_ctx concurrently.
> >
> > It is also possible that process A simply gets preempted while running
> > on CPU #1 by another task that also tries to use the acomp_ctx. The
> > mutex also protects against this case.
>
> Got it, thanks for the explanations. It seems with this patch, the mutex
> would be redundant in the first example. Would this also be true of the
> second example where process A gets preempted?

I think the mutex is still required in the second example. Migration
being disabled does not prevent other processes from running on the
same CPU and attempting to use the same acomp_ctx.

> If not, is it worth
> figuring out a solution that works for both migration and preemption?

Not sure exactly what you mean here. I suspect you mean have a single
mechanism to protect against concurrent usage and CPU hotunplug rather
than disabling migration and having a mutex. Yeah that would be ideal,
but probably not for a hotfix.

> > >
> > > 2) Seems like the overall problem appears to be applicable to any per-cpu data
> > > that is being used by a process, vis-a-vis cpu hotunplug. Could it be that a
> > > solution in cpu hotunplug can safe-guard more generally? Really not sure
> > > about the specifics of any solution, but it occurred to me that this may not
> > > be a problem unique to zswap.
> >
> > Not really. Static per-CPU data and data allocated with alloc_percpu()
> > should be available for all possible CPUs, regardless of whether they
> > are online or not, so CPU hotunplug is not relevant. It is relevant
> > here because we allocate the memory dynamically for online CPUs only
> > to save memory.
> > I am not sure how important this is as I am not aware
> > what the difference between the number of online and possible CPUs can
> > be in real life deployments.
>
> Thought I would clarify what I meant: the problem of per-cpu data that
> gets allocated dynamically using cpu hotplug and deleted even while in use
> by cpu hotunplug may not be unique to zswap. If so, I was wondering if
> a more generic solution in the cpu hotunplug code would be feasible/worth
> exploring.

I didn't look too closely, if there's something out there or something
can be easily developed I'd be open to updating the zswap code
accordingly, but I don't have time to look into it tbh, and it's too
late in the release cycle to get creative imo.
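
To make the preemption case above concrete, here is a minimal userspace
sketch. It is an analogy only, not zswap code: struct demo_ctx, run_demo()
and the NR_* constants are hypothetical names, pthreads stand in for kernel
tasks, and a fixed slot index stands in for "the CPU this task is pinned to".
Each thread keeps its slot for its whole lifetime (the analogue of migration
being disabled), yet two threads still share every slot, so the mutex is
what keeps them from scribbling over each other's scratch buffer.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define NR_SLOTS 2      /* hypothetical stand-in for the number of CPUs */
#define NR_TASKS 4      /* two tasks end up sharing each slot */
#define NR_ITERS 10000

/* Hypothetical stand-in for a per-CPU context: scratch buffer plus its mutex. */
struct demo_ctx {
	pthread_mutex_t mutex;
	unsigned char buffer[64];
	unsigned long corrupted;	/* bytes seen that belong to another task */
};

struct task_arg {
	int slot;		/* fixed for the task's lifetime: "migration disabled" */
	unsigned char tag;	/* value this task writes into the buffer */
};

static struct demo_ctx ctxs[NR_SLOTS];

static void *task(void *p)
{
	struct task_arg *arg = p;
	struct demo_ctx *ctx = &ctxs[arg->slot];

	for (int i = 0; i < NR_ITERS; i++) {
		/* Pinning alone does not exclude the other task on this slot. */
		pthread_mutex_lock(&ctx->mutex);
		memset(ctx->buffer, arg->tag, sizeof(ctx->buffer));
		for (size_t j = 0; j < sizeof(ctx->buffer); j++)
			if (ctx->buffer[j] != arg->tag)
				ctx->corrupted++;
		pthread_mutex_unlock(&ctx->mutex);
	}
	return NULL;
}

/* Runs all tasks and returns the number of corrupted bytes observed. */
unsigned long run_demo(void)
{
	pthread_t threads[NR_TASKS];
	struct task_arg args[NR_TASKS];
	unsigned long total = 0;

	for (int s = 0; s < NR_SLOTS; s++) {
		pthread_mutex_init(&ctxs[s].mutex, NULL);
		ctxs[s].corrupted = 0;
	}
	for (int t = 0; t < NR_TASKS; t++) {
		int rc;

		args[t].slot = t % NR_SLOTS;
		args[t].tag = (unsigned char)(t + 1);
		rc = pthread_create(&threads[t], NULL, task, &args[t]);
		assert(rc == 0);
		(void)rc;
	}
	for (int t = 0; t < NR_TASKS; t++)
		pthread_join(threads[t], NULL);
	for (int s = 0; s < NR_SLOTS; s++) {
		total += ctxs[s].corrupted;
		pthread_mutex_destroy(&ctxs[s].mutex);
	}
	return total;
}
```

With the lock held around the write and the check, run_demo() should
report no corruption. Removing the lock/unlock pair turns the buffer access
into a data race between the two threads sharing a slot, which is the
situation the mutex guards against even when tasks cannot migrate. Building
it needs something like cc -pthread plus a small main() that calls
run_demo().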