From: Chris Li
Date: Wed, 14 Feb 2024 10:56:59 -0800
Subject: Re: [PATCH v3] mm: swap: async free swap slot cache entries
To: Tim Chen
Cc: Andrew Morton, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 Wei Xu, Yu Zhao, Greg Thelen, Chun-Tse Shao, Yosry Ahmed, Michal Hocko,
 Mel Gorman, Huang Ying, Nhat Pham, Kairui Song, Barry Song
References: <20240213-async-free-v3-1-b89c3cc48384@kernel.org>

On Tue, Feb 13, 2024 at 4:08 PM Tim Chen wrote:
>
> On Tue, 2024-02-13 at 15:20 -0800, Chris Li wrote:
> > We discovered that 1% of swap page faults take 100us+ while 50% of
> > the swap faults are under 20us.
> >
> > Further investigation shows that a large portion of the time is
> > spent in the free_swap_slots() function in the long tail case.
> >
> > The percpu cache of swap slots is freed in a batch of 64 entries
> > inside free_swap_slots().
> > These cache entries are accumulated
> > from previous page faults, which may not be related to the current
> > process.
> >
> > Doing the batch free in the page fault handler causes longer
> > tail latencies and penalizes the current process.
> >
> > Add /sys/kernel/mm/swap/swap_slot_async_free to control the
> > async free behavior. When enabled, use a work queue to async
> > free the swap slots when the swap slot cache is full.
> >
> > Testing:
> >
> > Chun-Tse ran some benchmarks on a Chromebook, showing that
> > zram_wait_metrics improved by about 15% with 80% and 95% confidence.
> >
> > I recently ran some experiments on about 1000 Google production
> > machines. They show that swapin latency drops dramatically in the
> > long tail 100us - 500us bucket.
> >
> > platform  (100-500us)       (0-100us)
> > A         1.12% -> 0.36%    98.47% -> 99.22%
> > B         0.65% -> 0.15%    98.96% -> 99.46%
> > C         0.61% -> 0.23%    98.96% -> 99.38%
> >
> > Signed-off-by: Chris Li
> > ---
> > Changes in v3:
> > - Address feedback from Tim Chen, direct free path will free all swap slots.
> > - Add /sys/kernel/mm/swap/swap_slot_async_free to enable async free. Default is off.
> > - Link to v2: https://lore.kernel.org/r/20240131-async-free-v2-1-525f03e07184@kernel.org
> >
> > Changes in v2:
> > - Add description of the impact of time changing suggested by Ying.
> > - Remove create_workqueue() and use schedule_work()
> > - Link to v1: https://lore.kernel.org/r/20231221-async-free-v1-1-94b277992cb0@kernel.org
> > ---
> >  include/linux/swap_slots.h |  2 ++
> >  mm/swap_slots.c            | 20 ++++++++++++++++++++
> >  mm/swap_state.c            | 23 +++++++++++++++++++++++
> >  3 files changed, 45 insertions(+)
> >
> > diff --git a/include/linux/swap_slots.h b/include/linux/swap_slots.h
> > index 15adfb8c813a..bb9a401d7cae 100644
> > --- a/include/linux/swap_slots.h
> > +++ b/include/linux/swap_slots.h
> > @@ -19,6 +19,7 @@ struct swap_slots_cache {
> >          spinlock_t      free_lock;  /* protects slots_ret, n_ret */
> >          swp_entry_t     *slots_ret;
> >          int             n_ret;
> > +        struct work_struct async_free;
> >  };
> >
> >  void disable_swap_slots_cache_lock(void);
> > @@ -27,5 +28,6 @@ void enable_swap_slots_cache(void);
> >  void free_swap_slot(swp_entry_t entry);
> >
> >  extern bool swap_slot_cache_enabled;
> > +extern uint8_t slot_cache_async_free __read_mostly;
>
> Why wouldn't you enable the async_free always?
> Otherwise the patch looks fine to me.

Thanks for the feedback. Just in case someone doesn't care about this
optimization and wants to opt out of this behavior?

Anyway, I am happy to update the patch without the sysfs control file
as well.

Chris
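
For readers following the thread, the deferral being discussed can be
modeled with a small userspace sketch. This is an illustration under
stated assumptions, not the code from the patch: struct slot_cache,
free_slot(), run_pending_work() and release_batch() are hypothetical
names, the work_pending flag stands in for a work item queued with
schedule_work(), the async_free field stands in for the sysfs knob, and
"releasing" a batch only bumps a counter. When the worker has not run
by the time the cache is needed again, the sketch falls back to
draining the whole cache inline.

```c
#include <stdbool.h>
#include <stddef.h>

#define CACHE_SIZE 64           /* batch size mentioned in the thread */

/* Hypothetical userspace model of a per-CPU cache of returned slots. */
struct slot_cache {
    unsigned long slots[CACHE_SIZE];
    size_t n;                   /* entries currently cached */
    bool async_free;            /* stand-in for the sysfs control */
    bool work_pending;          /* stand-in for a queued work item */
    size_t freed_inline;        /* entries released on the caller's path */
    size_t freed_deferred;      /* entries released by the deferred worker */
};

/* Release every cached entry and charge the cost to *counter. */
static void release_batch(struct slot_cache *c, size_t *counter)
{
    *counter += c->n;
    c->n = 0;
}

/* What the deferred worker would do once it gets to run. */
static void run_pending_work(struct slot_cache *c)
{
    if (!c->work_pending)
        return;
    c->work_pending = false;
    release_batch(c, &c->freed_deferred);
}

/* Hot path: return one slot to the cache. */
static void free_slot(struct slot_cache *c, unsigned long entry)
{
    if (c->n == CACHE_SIZE) {
        /* Cache still full (async off, or worker has not run yet):
         * drain the whole batch inline before caching the new entry. */
        c->work_pending = false;
        release_batch(c, &c->freed_inline);
    }
    c->slots[c->n++] = entry;
    /* Cache just became full: hand the batch free to the worker. */
    if (c->async_free && c->n == CACHE_SIZE)
        c->work_pending = true;
}
```

With async_free off, every 65th call to free_slot() pays for a
64-entry batch on the caller's path, which is the long-tail case
described above. With async_free on and the worker running promptly,
freed_inline stays at zero and the batch cost shows up in
freed_deferred instead.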