From mboxrd@z Thu Jan 1 00:00:00 1970
From: Takero Funaki
To: Johannes Weiner, Yosry Ahmed, Nhat Pham, Chengming Zhou, Jonathan Corbet, Andrew Morton, Domenico Cerasuolo
Cc: Takero Funaki, linux-mm@kvack.org, linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH v2 6/6] mm: zswap: interrupt shrinker writeback while pagein/out IO
Date: Sat, 6 Jul 2024 02:25:22 +0000
Message-ID: <20240706022523.1104080-7-flintglass@gmail.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240706022523.1104080-1-flintglass@gmail.com>
References: <20240706022523.1104080-1-flintglass@gmail.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

To prevent the zswap global shrinker from writing back pages
simultaneously with IO performed for memory reclaim and faults, delay
the writeback when zswap_store() rejects pages or zswap_load() cannot
find the entry in the pool.

When the zswap shrinker is running and zswap rejects an incoming page,
the simultaneous zswap writeback and writeback of the rejected page lead
to IO contention on the swap device.
In this case, the writeback of the rejected page must take priority, as
it is necessary for actual memory reclaim progress. The zswap global
shrinker can run in the background and should not interfere with memory
reclaim.

The same logic applies to zswap_load(). When zswap cannot find the
requested page in the pool and read IO is performed, the shrinker should
be interrupted.

To avoid IO contention, save the jiffies timestamp when zswap cannot
buffer the pagein/out IO and interrupt the global shrinker. The shrinker
resumes the writeback 500 msec after the saved timestamp.

Signed-off-by: Takero Funaki
---
 mm/zswap.c | 47 +++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 45 insertions(+), 2 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index def0f948a4ab..59ba4663c74f 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -35,6 +35,8 @@
 #include
 #include
 #include
+#include
+#include
 
 #include "swap.h"
 #include "internal.h"
@@ -176,6 +178,14 @@ static bool zswap_next_shrink_changed;
 static struct work_struct zswap_shrink_work;
 static struct shrinker *zswap_shrinker;
 
+/*
+ * To avoid IO contention between pagein/out and global shrinker writeback,
+ * track the last jiffies of pagein/out and delay the writeback.
+ * Default to 500msec in alignment with mq-deadline read timeout.
+ */
+#define ZSWAP_GLOBAL_SHRINKER_DELAY_MS 500
+static unsigned long zswap_shrinker_delay_start;
+
 /*
  * struct zswap_entry
  *
@@ -244,6 +254,14 @@ static inline struct xarray *swap_zswap_tree(swp_entry_t swp)
 	pr_debug("%s pool %s/%s\n", msg, (p)->tfm_name,		\
 		 zpool_get_type((p)->zpools[0]))
 
+static inline void zswap_shrinker_delay_update(void)
+{
+	unsigned long now = jiffies;
+
+	if (now != zswap_shrinker_delay_start)
+		zswap_shrinker_delay_start = now;
+}
+
 /*********************************
 * pool functions
 **********************************/
@@ -1378,6 +1396,8 @@ static void shrink_worker(struct work_struct *w)
 	struct mem_cgroup *memcg;
 	int ret, failures = 0, progress;
 	unsigned long thr;
+	unsigned long now, sleepuntil;
+	const unsigned long delay = msecs_to_jiffies(ZSWAP_GLOBAL_SHRINKER_DELAY_MS);
 
 	/* Reclaim down to the accept threshold */
 	thr = zswap_accept_thr_pages();
@@ -1405,6 +1425,21 @@ static void shrink_worker(struct work_struct *w)
 	 * until the next run of shrink_worker().
 	 */
 	do {
+		/*
+		 * delay shrinking to allow the last rejected page to complete
+		 * its writeback
+		 */
+		sleepuntil = delay + READ_ONCE(zswap_shrinker_delay_start);
+		now = jiffies;
+		/*
+		 * If zswap did not reject pages for long, sleepuntil-now may
+		 * underflow.  We assume the timestamp is valid only if
+		 * now < sleepuntil < now + delay + 1
+		 */
+		if (time_before(now, sleepuntil) &&
+		    time_before(sleepuntil, now + delay + 1))
+			fsleep(jiffies_to_usecs(sleepuntil - now));
+
 		spin_lock(&zswap_shrink_lock);
 
 		/*
@@ -1526,8 +1561,10 @@ bool zswap_store(struct folio *folio)
 	VM_WARN_ON_ONCE(!folio_test_swapcache(folio));
 
 	/* Large folios aren't supported */
-	if (folio_test_large(folio))
+	if (folio_test_large(folio)) {
+		zswap_shrinker_delay_update();
 		return false;
+	}
 
 	if (!zswap_enabled)
 		goto check_old;
@@ -1648,6 +1685,8 @@ bool zswap_store(struct folio *folio)
 	zswap_entry_cache_free(entry);
 reject:
 	obj_cgroup_put(objcg);
+	zswap_shrinker_delay_update();
+
 	if (need_global_shrink)
 		queue_work(shrink_wq, &zswap_shrink_work);
 check_old:
@@ -1691,8 +1730,10 @@ bool zswap_load(struct folio *folio)
 	else
 		entry = xa_load(tree, offset);
 
-	if (!entry)
+	if (!entry) {
+		zswap_shrinker_delay_update();
 		return false;
+	}
 
 	if (entry->length)
 		zswap_decompress(entry, page);
@@ -1835,6 +1876,8 @@ static int zswap_setup(void)
 	if (ret)
 		goto hp_fail;
 
+	zswap_shrinker_delay_update();
+
 	shrink_wq = alloc_workqueue("zswap-shrink",
 			WQ_UNBOUND, 1);
 	if (!shrink_wq)
-- 
2.43.0
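
For readers checking the window test in shrink_worker(), here is a minimal userspace sketch of the same arithmetic. It is not part of the patch. sketch_time_before() and sketch_sleep_ticks() are hypothetical names, and the first is only a stand-in that assumes the usual signed-difference definition of the kernel's time_before(). The sketch returns the number of ticks to sleep, or 0 when the saved timestamp falls outside now < sleepuntil < now + delay + 1.

```c
#include <assert.h>
#include <limits.h>

/*
 * Userspace stand-in (assumption) for the kernel's time_before(a, b):
 * true when a is earlier than b, even if the unsigned counter wrapped.
 * Relies on two's complement conversion to long, as the kernel does.
 */
static int sketch_time_before(unsigned long a, unsigned long b)
{
	return (long)(a - b) < 0;
}

/*
 * Hypothetical helper mirroring the check in shrink_worker().
 * Returns how many ticks to sleep, or 0 if the saved timestamp is
 * stale or the delay has already elapsed.
 */
static unsigned long sketch_sleep_ticks(unsigned long now,
					unsigned long delay_start,
					unsigned long delay)
{
	unsigned long sleepuntil = delay_start + delay;

	/* The wrap-safe comparison only holds below half the counter range. */
	assert(delay < LONG_MAX);

	if (sketch_time_before(now, sleepuntil) &&
	    sketch_time_before(sleepuntil, now + delay + 1))
		return sleepuntil - now;
	return 0;
}
```

The second comparison is what rejects a timestamp that appears to lie far in the future, which is how a very old timestamp looks once the counter has wrapped around it.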