From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9B1F6CF884F for ; Fri, 4 Oct 2024 15:55:56 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 2AD548E0003; Fri, 4 Oct 2024 11:55:56 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 20EE88D0003; Fri, 4 Oct 2024 11:55:56 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 061A18E0003; Fri, 4 Oct 2024 11:55:55 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id D38C58D0003 for ; Fri, 4 Oct 2024 11:55:55 -0400 (EDT) Received: from smtpin12.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id 022EC141A2B for ; Fri, 4 Oct 2024 15:55:54 +0000 (UTC) X-FDA: 82636370670.12.48FFBFA Received: from mail-vk1-f173.google.com (mail-vk1-f173.google.com [209.85.221.173]) by imf14.hostedemail.com (Postfix) with ESMTP id 9041A100012 for ; Fri, 4 Oct 2024 15:55:51 +0000 (UTC) Authentication-Results: imf14.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=jNmnanKX; spf=pass (imf14.hostedemail.com: domain of 21cnbao@gmail.com designates 209.85.221.173 as permitted sender) smtp.mailfrom=21cnbao@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1728057221; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=CsgdlqI1YjSaf2hjQX57eOCOui/ZtRhCSadX2PuXtI0=; b=GZ5nzUpeJnpRqcbs2IC2ZOotvF2QcBL8IY+uMDTcVN57cD3o4ev2naraTrLD4sIpp17G5y gadtEQPkszs635r8wYZNWwuPihv6LLcTuMWHdcvbFWLOtUxkaRHxk8tdDC7iffEa55I537 Ee1opL034J5fBfORfxeEdkb2zElsmzY= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1728057221; a=rsa-sha256; cv=none; b=ceo6abKGT9i3dx1q3vDoloQ/21W936zaRUeY3bc/mSpcg+Wj4UtMHFa8p0aEUDf6fsAoAQ AP6nqxk2y0pX5aZwP8Vpxd+xogh19QWjQdlcsm4fKOLk+rqP99q5E9O/6AIPF40Nl5xPsM NLTFQEKex2PhtDqqJMCEgMDitKuQRh4= ARC-Authentication-Results: i=1; imf14.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=jNmnanKX; spf=pass (imf14.hostedemail.com: domain of 21cnbao@gmail.com designates 209.85.221.173 as permitted sender) smtp.mailfrom=21cnbao@gmail.com; dmarc=pass (policy=none) header.from=gmail.com Received: by mail-vk1-f173.google.com with SMTP id 71dfb90a1353d-5096d378125so651865e0c.0 for ; Fri, 04 Oct 2024 08:55:51 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1728057350; x=1728662150; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=CsgdlqI1YjSaf2hjQX57eOCOui/ZtRhCSadX2PuXtI0=; b=jNmnanKX+p5NIzaonXC1CtpEQHPtLhmrPCReVVkehsDig14wzuu0fungNhkbW5aU32 nF+j4/yuxGCsAlSSJK62xKiYwqgYgI+KvLOxT7GoA/XbwnNRwc3pLxxgIvWW23i9dSdK ja/+nxD8U2uSTwWjx3VvygkPbDJ4cGg/4cJk8etcfHtuGRffT96fhlFheECde4CbRmTn ct5lhgLm0vbB9ogwclqNyhOh95saxGay0Y3wSfFkyat+I777CNzUQJX2QCtt6efGeTJG 0B+Dag92COxdNYOp+8iqC8JwCRET9szuX7ON01QtgIniWR6yH4LKAI3gA4SbGmYsGZQZ CUAg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1728057350; x=1728662150; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=CsgdlqI1YjSaf2hjQX57eOCOui/ZtRhCSadX2PuXtI0=; b=XaVZFtotAB1KEOBa1ADVqbIXfA+00LTr4TsXvQBD455/MbTEDwSmy/jnJcG2kcIfyB zK4Yw7kA5xc4cRSU9N5VxI54jK4K7/thNvc1e8hBsy7++HcfBT02LZk/7huJUg64IIuD Tjrr5l2nes4Kt70kQcJkj1mOw66KIg81BguTppPEPUv+fI6jodLj7wpnD4uI5FbSDLaU exGRrosSIZnSYIrmAzosWxE5MRHjGHBXJWeaKUhKyuLIOcJrMwj0Ozi8/QILo3KL5aTQ 1wWgR1FFuqviNe3Fnosv4A2CIHjOVUWhRn6eyHXL8Ju5fIC4atXPQDyxeiAG5++e0lqY 3FNw== X-Forwarded-Encrypted: i=1; AJvYcCVBdFCZmcSMfFO5UWjbUxVtiy3gjJYA4LMEwcrobxqTyQB5MMiq74T/eYGUM4tgFVZvMok7ZXy5UA==@kvack.org X-Gm-Message-State: AOJu0YxJmW8ly19RTv2AkeqIPfmXo4tyjJz9U6YCBCyn0tm/l4r6MlUd ggI0YaVgQGfEudCJ01p8B4vegM7txUMkA9tsn5Oxwbz74wQ1vguqkr3IPr0zC+QB3elGsFOfNAA Cy+IkFqy/s63dUrqNnmzeGGYfkH4= X-Google-Smtp-Source: AGHT+IE1+ALhrfxE+7vJ3rstPkNZK5+mqKB3Ed/IXMiEy2VUyBOBJj9QvbRmwLnRt7fU0W2iXr5aihcesw1afW9Zs9U= X-Received: by 2002:a05:6122:169c:b0:50a:bd10:e9a5 with SMTP id 71dfb90a1353d-50c8547f0edmr2338209e0c.4.1728057350482; Fri, 04 Oct 2024 08:55:50 -0700 (PDT) MIME-Version: 1.0 References: <20240926211936.75373-1-21cnbao@gmail.com> In-Reply-To: From: Barry Song <21cnbao@gmail.com> Date: Fri, 4 Oct 2024 23:55:28 +0800 Message-ID: Subject: Re: [PATCH] mm: avoid unconditional one-tick sleep when swapcache_prepare fails To: Chris Li Cc: akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Barry Song , Kairui Song , "Huang, Ying" , Yu Zhao , David Hildenbrand , Hugh Dickins , Johannes Weiner , Matthew Wilcox , Michal Hocko , Minchan Kim , Yosry Ahmed , SeongJae Park , Kalesh Singh , Suren Baghdasaryan , stable@vger.kernel.org, Oven Liyang Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspam-User: X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: 9041A100012 X-Stat-Signature: gg466arfr8e63514h9ff797gqp4u4sii X-HE-Tag: 1728057351-710208 X-HE-Meta: U2FsdGVkX19fWGJBhMy37Dz9RoHJzazV7dD2R8/cgJdt+8Q9v/soc27qoE1WQ0KT/TTWQEIvviQSULp0nyAXte199o18ncrKzqa4G3j2oSF85VQ9EOlwu2ozX/w4QgSqdqn7ZrqCp4J9CtFLBe43pfZTlCks/MVmWYUVYMjo8WBn56ClI1OszoJJxH3SEySrOK74rg8BEEsJ2qm0ig1zhQohDXC/X+JJikeQd3ZbFQ55pn6uw7BEvJfJxxEYJpwk3hDE+7+iw2n3F92ZZqfv7WRBglL64MQt4AymKgLSQLA+iy5NkG+WIMlv5OcRkwISq3FF0hm6BYkJlLPUs6ELHlOr5t8gVNP8EbqcJnPexWySuWWmjzXLI2A1Qwdc+3aHxhhsfEgTXsHsbVcHZNx7ww6T7Vw0hqlG2bSNxp1gYIXdM2EAjw3Dl44IdPAWQQhaLjRFPj00Ki4TxonIFaiOGheWG6ZYMaMIx3ss07REdKTyriYj5ioZB3oRlPlsOqrs26ESC+J+Tj7sHJm1aCgF+8x7O2FJJolSiFRW7yCcSPfAgmeWiVO9Kzzr/XYqviVDwbm7WOYnObYZvNz9QsrPvUsWwFJFnsvOJQPHLGHX4aF0gm7PhSDwdWyA2M5OVKhruRawq2DXQ5D+tZJol/fF+IFYILczAyuUZ6pP0ISlDf2yHEVA20KKuS4PFQD6PqUlZHGRhySN8yePzum9rONUMFEjlsVKUw4UPnCR1ArOn0iKeNxruTWC/vOZsp9Ho2tpi47j73RPyvBb3TYwmHFZXCgUinDSE4+6EbudAahmyFs/BDdy7pcktSOA2q5VZC3d5AIdN3pUPSavGKcOtj+Vb2F4erbaHGmMdw8wlLAk7oN7CUsBVQ20XgphtbMXBIKB2fe0EUuPwWVJC6CmxDbcdlSE9WSqucb62V3DFVpkKtt/UTI58NgE5Exg8iwrycqYG4FDsP0d02I9VMm6sLg PPxs5jDb zuhZRfeKOhVGb2dNHiUqael+akMTJIUcBt76kf009KECWmZ9uwGT2VpN/7GfMYISiMAEr8NE5XRkZ8rz6AAHb2OfYQ5BDgEqslloUmPJbDzrOSwgdIihtIK4vRgh1fh1CFtUIiqyKGMvS00xWKibj2s2afrKaegVgZdiuxOBdUWqHyq4byNXbtUhyM1P3Kleu9y7XhqM21kqIo/kUkaiuSDhrmvDO2L7RHLQsblBcg/tDmoWkGzIrQTFzf6XT7HDsA9m2h+Kqwv1fH1dGjOrkVR3p5KNfYs4g5OJ9u/hIUc91tb2wnzqzwqj98XO8Y2B3RqaCZftUqJc4hwyHGgZZZseIY5mY4VuilyRNRbzRWK3bsptAla0iETqxmCkiBW1MvQzADjvQVQ0X5lSinwxrjpGlyWgOi826NdIZ8mJXq23WNMa0NS/KJDmXsoskxPlapbEBQlj/4o9FvHVASJ0Olc9u+7pa89arI2OOzrDmWNhK2eZgGqiRQ/aa5eZAVZ12NkCly6JcYj0gid9iSV27vjeXx8fSgmfcsPzuVSP+hDIiXqgXt5IczMTQF3eGMFQtwgdhUi4KciPKsjYr9I2Upv+n8Lg5w8rPEFrzlGCp/+EcASKnuzlKeN3sOgLNF5pcJ0M5kuJpz0mXVGCqCq9lbxWQII3OLvSO/p9FrhGk7U3zwxeS8MGFIzCBfJJssanuQb/Hg+iqlBf/lTY= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Fri, Oct 4, 2024 at 6:22=E2=80=AFAM Chris Li wrote: > > On Thu, Sep 26, 2024 at 2:20=E2=80=AFPM Barry Song <21cnbao@gmail.com> wr= ote: > > > > From: Barry Song > > > > Commit 13ddaf26be32 ("mm/swap: fix race when skipping swapcache") > > introduced an unconditional one-tick sleep when `swapcache_prepare()` > > fails, which has led to reports of UI stuttering on latency-sensitive > > Android devices. To address this, we can use a waitqueue to wake up > > tasks that fail `swapcache_prepare()` sooner, instead of always > > sleeping for a full tick. While tasks may occasionally be woken by an > > unrelated `do_swap_page()`, this method is preferable to two scenarios: > > rapid re-entry into page faults, which can cause livelocks, and > > multiple millisecond sleeps, which visibly degrade user experience. > > > > Oven's testing shows that a single waitqueue resolves the UI > > stuttering issue. If a 'thundering herd' problem becomes apparent > > later, a waitqueue hash similar to `folio_wait_table[PAGE_WAIT_TABLE_SI= ZE]` > > for page bit locks can be introduced. > > > > Fixes: 13ddaf26be32 ("mm/swap: fix race when skipping swapcache") > > Cc: Kairui Song > > Cc: "Huang, Ying" > > Cc: Yu Zhao > > Cc: David Hildenbrand > > Cc: Chris Li > > Cc: Hugh Dickins > > Cc: Johannes Weiner > > Cc: Matthew Wilcox (Oracle) > > Cc: Michal Hocko > > Cc: Minchan Kim > > Cc: Yosry Ahmed > > Cc: SeongJae Park > > Cc: Kalesh Singh > > Cc: Suren Baghdasaryan > > Cc: > > Reported-by: Oven Liyang > > Tested-by: Oven Liyang > > Signed-off-by: Barry Song > > --- > > mm/memory.c | 13 +++++++++++-- > > 1 file changed, 11 insertions(+), 2 deletions(-) > > > > diff --git a/mm/memory.c b/mm/memory.c > > index 2366578015ad..6913174f7f41 100644 > > --- a/mm/memory.c > > +++ b/mm/memory.c > > @@ -4192,6 +4192,8 @@ static struct folio *alloc_swap_folio(struct vm_f= ault *vmf) > > } > > #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ > > > > +static DECLARE_WAIT_QUEUE_HEAD(swapcache_wq); > > + > > /* > > * We enter with non-exclusive mmap_lock (to exclude vma changes, > > * but allow concurrent faults), and pte mapped but not yet locked. > > @@ -4204,6 +4206,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) > > { > > struct vm_area_struct *vma =3D vmf->vma; > > struct folio *swapcache, *folio =3D NULL; > > + DECLARE_WAITQUEUE(wait, current); > > struct page *page; > > struct swap_info_struct *si =3D NULL; > > rmap_t rmap_flags =3D RMAP_NONE; > > @@ -4302,7 +4305,9 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) > > * Relax a bit to prevent rapid > > * repeated page faults. > > */ > > + add_wait_queue(&swapcache_wq, &= wait); > > schedule_timeout_uninterruptibl= e(1); > > + remove_wait_queue(&swapcache_wq= , &wait); > > There is only one "swapcache_wq", if we don't care about the memory > overhead, ideally should be per swap entry that fails to grab the > HAS_CACHE bit and has one wait queue. Currently all swap entries using > one wait queue will likely cause other swap entries (if any) get wait > up then find out the swap entry it cares hasn't been served yet. > even page bit locks do have a waitqueue for one page, i believe that case has much serious contention then swap-in. page bit lock depends on a waitqueue hash to decrease unrelated wake-up. if one process is woken-up by unrelated do_swap_page() and its swapcache is not released, it will sleep again after re-checking swapcache_prepare(). Too many unrelated wake-ups would be just a 'thundering herd' but not a livelock. > Another thing to consider is that, if we are using a wait queue, the > 1ms is not relevant any more. It can be longer than 1ms and it is > getting waited up by the wait queue anyway. Here you might use > indefinitely sleep to reduce the unnecessary wait up and the > complexity of the timer. not quite sure what you mean for 1ms, in an embedded system, we never use 1000HZ, the typical/maximum HZ is 250. not quite sure what you mean by "indefinitely sleep", my understanding is that we can't poll the result of swapcache_prepare() as the winner process which does swapcache_prepare() successfully will drop the swap slots. > > > goto out_page; > > } > > need_clear_cache =3D true; > > @@ -4609,8 +4614,10 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) > > pte_unmap_unlock(vmf->pte, vmf->ptl); > > out: > > /* Clear the swap cache pin for direct swapin after PTL unlock = */ > > - if (need_clear_cache) > > + if (need_clear_cache) { > > swapcache_clear(si, entry, nr_pages); > > + wake_up(&swapcache_wq); > > Agree with Ying that here the common path will need to take a lock to > wait up the wait queue. waitqueue_active() might be a good candidate. > > Chris > > > > + } > > if (si) > > put_swap_device(si); > > return ret; > > @@ -4625,8 +4632,10 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) > > folio_unlock(swapcache); > > folio_put(swapcache); > > } > > - if (need_clear_cache) > > + if (need_clear_cache) { > > swapcache_clear(si, entry, nr_pages); > > + wake_up(&swapcache_wq); > > + } > > if (si) > > put_swap_device(si); > > return ret; > > -- > > 2.34.1 > > Thanks Barry