From: Chris Li <chrisl@kernel.org>
Date: Thu, 3 Oct 2024 15:22:36 -0700
Subject: Re: [PATCH] mm: avoid unconditional one-tick sleep when swapcache_prepare fails
To: Barry Song <21cnbao@gmail.com>
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, Barry Song, Kairui Song, "Huang, Ying",
 Yu Zhao, David Hildenbrand, Hugh Dickins, Johannes Weiner,
 Matthew Wilcox, Michal Hocko, Minchan Kim, Yosry Ahmed, SeongJae Park,
 Kalesh Singh, Suren Baghdasaryan, stable@vger.kernel.org, Oven Liyang
In-Reply-To: <20240926211936.75373-1-21cnbao@gmail.com>
References: <20240926211936.75373-1-21cnbao@gmail.com>
On Thu, Sep 26, 2024 at 2:20 PM Barry Song <21cnbao@gmail.com> wrote:
>
> From: Barry Song
>
> Commit 13ddaf26be32 ("mm/swap: fix race when skipping swapcache")
> introduced an unconditional one-tick sleep when `swapcache_prepare()`
> fails, which has led to reports of UI stuttering on latency-sensitive
> Android devices. To address this, we can use a waitqueue to wake up
> tasks that fail `swapcache_prepare()` sooner, instead of always
> sleeping for a full tick. While tasks may occasionally be woken by an
> unrelated `do_swap_page()`, this method is preferable to two scenarios:
> rapid re-entry into page faults, which can cause livelocks, and
> multiple millisecond sleeps, which visibly degrade user experience.
>
> Oven's testing shows that a single waitqueue resolves the UI
> stuttering issue. If a 'thundering herd' problem becomes apparent
> later, a waitqueue hash similar to `folio_wait_table[PAGE_WAIT_TABLE_SIZE]`
> for page bit locks can be introduced.
>
> Fixes: 13ddaf26be32 ("mm/swap: fix race when skipping swapcache")
> Cc: Kairui Song
> Cc: "Huang, Ying"
> Cc: Yu Zhao
> Cc: David Hildenbrand
> Cc: Chris Li
> Cc: Hugh Dickins
> Cc: Johannes Weiner
> Cc: Matthew Wilcox (Oracle)
> Cc: Michal Hocko
> Cc: Minchan Kim
> Cc: Yosry Ahmed
> Cc: SeongJae Park
> Cc: Kalesh Singh
> Cc: Suren Baghdasaryan
> Cc: stable@vger.kernel.org
> Reported-by: Oven Liyang
> Tested-by: Oven Liyang
> Signed-off-by: Barry Song
> ---
>  mm/memory.c | 13 +++++++++++--
>  1 file changed, 11 insertions(+), 2 deletions(-)
>
> diff --git a/mm/memory.c b/mm/memory.c
> index 2366578015ad..6913174f7f41 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -4192,6 +4192,8 @@ static struct folio *alloc_swap_folio(struct vm_fault *vmf)
>  }
>  #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>
> +static DECLARE_WAIT_QUEUE_HEAD(swapcache_wq);
> +
>  /*
>   * We enter with non-exclusive mmap_lock (to exclude vma changes,
>   * but allow concurrent faults), and pte mapped but not yet locked.
> @@ -4204,6 +4206,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>  {
>         struct vm_area_struct *vma = vmf->vma;
>         struct folio *swapcache, *folio = NULL;
> +       DECLARE_WAITQUEUE(wait, current);
>         struct page *page;
>         struct swap_info_struct *si = NULL;
>         rmap_t rmap_flags = RMAP_NONE;
> @@ -4302,7 +4305,9 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>                                  * Relax a bit to prevent rapid
>                                  * repeated page faults.
>                                  */
> +                               add_wait_queue(&swapcache_wq, &wait);
>                                 schedule_timeout_uninterruptible(1);
> +                               remove_wait_queue(&swapcache_wq, &wait);

There is only one "swapcache_wq". If we don't care about the memory
overhead, ideally each swap entry that fails to grab the HAS_CACHE bit
would have its own wait queue. With all swap entries sharing one wait
queue, a wakeup will likely wake tasks waiting on other swap entries
(if any), which then find that the entry they care about hasn't been
served yet.

Another thing to consider is that, once we use a wait queue, the 1ms
is no longer relevant. The sleep can be longer than 1ms, since the
task gets woken up by the wait queue anyway. You might use an
indefinite sleep here to avoid unnecessary wakeups and the complexity
of the timer.
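
To make the per-entry idea concrete, a hashed wait table in the style
of folio_wait_table could look roughly like the sketch below (untested;
the table size and the swapcache_wqs/swapcache_wq_head names are made
up for illustration):

	#include <linux/hash.h>	/* hash_long() */
	#include <linux/swap.h>	/* swp_entry_t */
	#include <linux/wait.h>	/* wait_queue_head_t */

	#define SWAPCACHE_WAIT_TABLE_BITS	5
	#define SWAPCACHE_WAIT_TABLE_SIZE	(1 << SWAPCACHE_WAIT_TABLE_BITS)
	static wait_queue_head_t swapcache_wqs[SWAPCACHE_WAIT_TABLE_SIZE];

	/*
	 * Hash the swap entry to one bucket so a wake_up() only disturbs
	 * tasks whose entries collide in that bucket, not every task
	 * sleeping on a failed swapcache_prepare().
	 */
	static wait_queue_head_t *swapcache_wq_head(swp_entry_t entry)
	{
		return &swapcache_wqs[hash_long(entry.val,
						SWAPCACHE_WAIT_TABLE_BITS)];
	}

The fault path would then do add_wait_queue(swapcache_wq_head(entry),
&wait) before sleeping, and the out paths wake_up(swapcache_wq_head(entry)).
Each head still needs an init_waitqueue_head() at boot, the way
folio_wait_table gets initialized in pagecache_init().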
>                                 goto out_page;
>                         }
>                         need_clear_cache = true;
>
> @@ -4609,8 +4614,10 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>                 pte_unmap_unlock(vmf->pte, vmf->ptl);
> out:
>         /* Clear the swap cache pin for direct swapin after PTL unlock */
> -       if (need_clear_cache)
> +       if (need_clear_cache) {
>                 swapcache_clear(si, entry, nr_pages);
> +               wake_up(&swapcache_wq);

Agree with Ying that here the common path will need to take a lock to
wake up the wait queue (one way to avoid that on the common path is
sketched below, after the quoted patch).

Chris

> +       }
>         if (si)
>                 put_swap_device(si);
>         return ret;
> @@ -4625,8 +4632,10 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
>                 folio_unlock(swapcache);
>                 folio_put(swapcache);
>         }
> -       if (need_clear_cache)
> +       if (need_clear_cache) {
>                 swapcache_clear(si, entry, nr_pages);
> +               wake_up(&swapcache_wq);
> +       }
>         if (si)
>                 put_swap_device(si);
>         return ret;
> --
> 2.34.1
>
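
To keep the common exit path cheap, one option (minimal sketch,
untested) is to take the wait-queue lock only when somebody is
actually sleeping, via wq_has_sleeper(), which bundles the memory
barrier that the waitqueue_active() comment in <linux/wait.h> asks
for:

	if (need_clear_cache) {
		swapcache_clear(si, entry, nr_pages);
		/* Skip the waitqueue spinlock when there is no sleeper. */
		if (wq_has_sleeper(&swapcache_wq))
			wake_up(&swapcache_wq);
	}

With that, the common case costs one full barrier plus a load of the
wait queue's list head rather than an unconditional lock/unlock of the
queue.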