Subject: Re: [PATCH v3] mm: swap: async free swap slot cache entries
From: Tim Chen
To: Chris Li, Andrew Morton
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Wei Xu, Yu Zhao, Greg Thelen, Chun-Tse Shao, Yosry Ahmed, Michal Hocko, Mel Gorman, Huang Ying, Nhat Pham, Kairui Song, Barry Song
Date: Tue, 13 Feb 2024 16:08:00 -0800
In-Reply-To: <20240213-async-free-v3-1-b89c3cc48384@kernel.org>
References: <20240213-async-free-v3-1-b89c3cc48384@kernel.org>

On Tue, 2024-02-13 at 15:20 -0800, Chris Li wrote:
> We discovered that 1% swap page fault is 100us+ while 50% of
> the swap fault is under 20us.
>
> Further investigation show that a large portion of the time
> spent in the free_swap_slots() function for the long tail case.
>
> The percpu cache of swap slots is freed in a batch of 64 entries
> inside free_swap_slots(). These cache entries are accumulated
> from previous page faults, which may not be related to the current
> process.
>
> Doing the batch free in the page fault handler causes longer
> tail latencies and penalizes the current process.
>
> Add /sys/kernel/mm/swap/swap_slot_async_free to control the
> async free behavior. When enabled, using work queue to async
> free the swap slot when the swap slot cache is full.
>
> Testing:
>
> Chun-Tse did some benchmark in chromebook, showing that
> zram_wait_metrics improve about 15% with 80% and 95% confidence.
>
> I recently ran some experiments on about 1000 Google production
> machines. It shows swapin latency drops in the long tail
> 100us - 500us bucket dramatically.
>
> platform    (100-500us)       (0-100us)
> A           1.12% -> 0.36%    98.47% -> 99.22%
> B           0.65% -> 0.15%    98.96% -> 99.46%
> C           0.61% -> 0.23%    98.96% -> 99.38%
>
> Signed-off-by: Chris Li
> ---
> Changes in v3:
> - Address feedback from Tim Chen, direct free path will free all swap slots.
> - Add /sys/kernel/mm/swap/swap_slot_async_fee to enable async free. Default is off.
> - Link to v2: https://lore.kernel.org/r/20240131-async-free-v2-1-525f03e07184@kernel.org
>
> Changes in v2:
> - Add description of the impact of time changing suggest by Ying.
> - Remove create_workqueue() and use schedule_work()
> - Link to v1: https://lore.kernel.org/r/20231221-async-free-v1-1-94b277992cb0@kernel.org
> ---
>  include/linux/swap_slots.h |  2 ++
>  mm/swap_slots.c            | 20 ++++++++++++++++++++
>  mm/swap_state.c            | 23 +++++++++++++++++++++++
>  3 files changed, 45 insertions(+)
>
> diff --git a/include/linux/swap_slots.h b/include/linux/swap_slots.h
> index 15adfb8c813a..bb9a401d7cae 100644
> --- a/include/linux/swap_slots.h
> +++ b/include/linux/swap_slots.h
> @@ -19,6 +19,7 @@ struct swap_slots_cache {
>  	spinlock_t	free_lock;  /* protects slots_ret, n_ret */
>  	swp_entry_t	*slots_ret;
>  	int		n_ret;
> +	struct work_struct async_free;
>  };
>
>  void disable_swap_slots_cache_lock(void);
> @@ -27,5 +28,6 @@ void enable_swap_slots_cache(void);
>  void free_swap_slot(swp_entry_t entry);
>
>  extern bool swap_slot_cache_enabled;
> +extern uint8_t slot_cache_async_free __read_mostly;

Why wouldn't you enable the async_free always?
Otherwise the patch looks fine to me.

Tim

>
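
For illustration only, here is a minimal user-space sketch of the deferral
the quoted commit message describes. It is not the patch's code. The names
struct slot_cache, free_slot(), async_free_worker() and simulate() are
hypothetical, and the fallback to an inline free when the cache is still
full is an assumption based on the v3 changelog note about the direct free
path. Only the batch size of 64 comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

#define SLOTS_CACHE_SIZE 64	/* batch size named in the commit message */

/* Hypothetical user-space stand-in for the per-CPU cache of returned slots. */
struct slot_cache {
	unsigned long slots_ret[SLOTS_CACHE_SIZE];
	int n_ret;
	bool async_enabled;	/* models the proposed sysfs knob */
	bool work_pending;	/* models a queued work item */
	long freed_inline;	/* entries freed in the caller's context */
	long freed_async;	/* entries freed by the deferred worker */
};

/* Empty the cache and report how many entries were released. */
static int drain(struct slot_cache *c)
{
	int n = c->n_ret;

	c->n_ret = 0;
	return n;
}

/* Models the work item: runs later, outside the fault path. */
static void async_free_worker(struct slot_cache *c)
{
	if (!c->work_pending)
		return;
	c->freed_async += drain(c);
	c->work_pending = false;
}

/* Models the fault path handing one slot back to the cache. */
static void free_slot(struct slot_cache *c, unsigned long entry)
{
	if (c->n_ret >= SLOTS_CACHE_SIZE) {
		/* Still full (knob off, or worker has not run): free inline. */
		c->freed_inline += drain(c);
		c->work_pending = false;
	}
	assert(c->n_ret < SLOTS_CACHE_SIZE);
	c->slots_ret[c->n_ret++] = entry;
	if (c->async_enabled && c->n_ret == SLOTS_CACHE_SIZE)
		c->work_pending = true;	/* where real code would queue work */
}

/* Return nr_frees slots; the worker gets to run after every call. */
static struct slot_cache simulate(int nr_frees, bool async_enabled)
{
	struct slot_cache c = { .async_enabled = async_enabled };
	int i;

	for (i = 0; i < nr_frees; i++) {
		free_slot(&c, (unsigned long)i);
		async_free_worker(&c);
	}
	return c;
}
```

In this model, simulate(640, false) frees 576 entries inline and leaves 64
cached, while simulate(640, true) frees all 640 in the worker and none in
the caller. The worker here always runs before the next free, which is the
best case. A real workqueue gives no such guarantee, hence the inline
fallback.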