From: Chris Li <chrisl@kernel.org>
Date: Wed, 31 Jan 2024 16:57:44 -0800
Subject: Re: [PATCH] mm: swap: async free swap slot cache entries
To: Yosry Ahmed
Cc: Andrew Morton, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 Wei Xu, Yu Zhao, Greg Thelen, Chun-Tse Shao, Suren Baghdasaryan,
 Brain Geffon, Minchan Kim, Michal Hocko, Mel Gorman, Huang Ying,
 Nhat Pham, Johannes Weiner, Kairui Song, Zhongkun He, Kemeng Shi,
 Barry Song
References: <20231221-async-free-v1-1-94b277992cb0@kernel.org>
Hi Yosry,

On Thu, Dec 28, 2023 at 7:34 AM Yosry Ahmed wrote:
>
> On Thu, Dec 21, 2023 at 10:25 PM Chris Li wrote:
> >
> > We discovered that 1% of swap page faults take 100us+, while 50% of
> > swap faults complete in under 20us.
> >
> > Further investigation shows that for the long-tail cases, a large
> > portion of the time is spent in the free_swap_slots() function.
> >
> > The percpu cache of swap slots is freed in a batch of 64 entries
> > inside free_swap_slots().
> > These cache entries are accumulated
> > from previous page faults, which may not be related to the current
> > process.
> >
> > Doing the batch free in the page fault handler causes longer
> > tail latencies and penalizes the current process.
> >
> > Move free_swap_slots() outside of the swap-in page fault handler into
> > an async work queue to avoid such long tail latencies.
> >
> > Testing:
> >
> > Chun-Tse ran some benchmarks on a Chromebook, showing that
> > zram_wait_metrics improved by about 15% with 80% and 95% confidence.
> >
> > I recently ran some experiments on about 1000 Google production
> > machines. They show that swap-in latency in the long-tail
> > 100us - 500us bucket drops dramatically.
> >
> > platform   (100-500us)        (0-100us)
> > A          1.12% -> 0.36%     98.47% -> 99.22%
> > B          0.65% -> 0.15%     98.96% -> 99.46%
> > C          0.61% -> 0.23%     98.96% -> 99.38%
>
> I recall you mentioning that mem_cgroup_uncharge_swap() is the most
> expensive part of the batched freeing. If that's the case, I am
> curious what happens if we move that call outside of the batching
> (i.e. call it once the swap entry is no longer used and will be
> returned to the cache). This should amortize the cost of memcg
> uncharging and reduce the tail fault latency without extra work.
> Arguably, it could increase the average fault latency, but not
> necessarily in a significant way.
>
> Ying pointed out something similar, if I understand correctly (and
> other operations that can be moved).

If the goal is to let the swap fault return as soon as possible, then
the current approach is better. mem_cgroup_uncharge_swap() is only part
of the cost, not close to all of it.

> Also, if we choose to follow this route, I think we should flush
> the async worker in drain_slots_cache_cpu(), right?

I am not sure I understand this part. drain_slots_cache_cpu() will
free the entries already, and the current lock around cache->free_lock
should protect the entries from concurrent access by async workers.
What do you mean by flushing?

Chris