From: Yosry Ahmed
Date: Thu, 28 Dec 2023 07:34:44 -0800
Subject: Re: [PATCH] mm: swap: async free swap slot cache entries
To: Chris Li
Cc: David Rientjes, Andrew Morton, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, Wei Xu, Yu Zhao, Greg Thelen, Chun-Tse Shao,
    Suren Baghdasaryan, Brain Geffon, Minchan Kim, Michal Hocko,
    Mel Gorman, Huang Ying, Nhat Pham, Johannes Weiner, Kairui Song,
    Zhongkun He, Kemeng Shi, Barry Song, Hugh Dickins
References: <20231221-async-free-v1-1-94b277992cb0@kernel.org>
    <20231222115208.ab4d2aeacdafa4158b14e532@linux-foundation.org>
    <0a052cb1-a5c5-4bee-5bd5-fd5569765012@google.com>

On Sun, Dec 24, 2023 at 2:07 PM Chris Li wrote:
>
> On Sun, Dec 24, 2023 at 1:13 PM David Rientjes wrote:
> >
> > On Sun, 24 Dec 2023, Chris Li wrote:
> >
> > > On Sat, Dec 23, 2023 at 7:01 PM David Rientjes wrote:
> > > >
> > > > On Sat, 23 Dec 2023, Chris Li wrote:
> > > >
> > > > > > How do you quantify the impact of the delayed swap_entry_free()?
> > > > > >
> > > > > > Since the free and memcg uncharge are now delayed, is there not the
> > > > > > possibility that we stay under memory pressure for longer? (Assuming at
> > > > > > least some users are swapping because of memory pressure.)
> > > > > >
> > > > > > I would assume that since the free and uncharge itself is delayed that in
> > > > > > the pathological case we'd actually be swapping *more* until the async
> > > > > > worker can run.
> > > > >
> > > > > Thanks for raising this interesting question.
> > > > >
> > > > > First of all, the swap_entry_free() does not impact "memory.current".
> > > > > It reduces "memory.swap.current". Technically it is the swap pressure
> > > > > not memory pressure that suffers the extra delay.
> > > > >
> > > > > Secondly, we are talking about delaying up to 64 swap entries for a
> > > > > few microseconds.
> > > >
> > > > What guarantees that the async freeing happens within a few microseconds?
> > >
> > > Linux kernel typically doesn't provide RT scheduling guarantees. You
> > > can change microseconds to milliseconds, my following reasoning still
> > > holds.
> > >
> >
> > What guarantees that the async freeing happens even within 10s? Your
> > responses are implying that there is some deadline by which this freeing
> > absolutely must happen (us or ms), but I don't know of any strong
> > guarantees.
>
> I think we are in agreement there, there are no such strong guarantees
> in linux scheduling. However, when there are free CPU resources, the
> job will get scheduled to execute in a reasonable time frame. If
> it does not, I consider that a bug if the CPU has idle resources and
> the pending jobs are not able to run for a long time.
> The existing code doesn't have such a guarantee either, see my point
> that follows. I don't know why you want to ask for such a guarantee.
>
> > If there are no absolute guarantees about when the freeing may now occur,
> > I'm asking how the impact of the delayed swap_entry_free() can be
> > quantified.
>
> Presumably each application has their own SLO metrics for monitoring
> their application behavior. I am happy to take a look if any app has
> new SLO violations caused by this change.
> If you have one metric in mind, please name it so we can look at it
> together. During my current experiment and the chromebook benchmark, I
> haven't noticed such ill effects show up in the other metrics drops in
> a statistically significant manner. That is not the same as saying
> such drops don't exist at all. Just I haven't noticed or the SLO
> watching system hasn't caught it.
>
> > The benefit to the current implementation is that there *are* strong
> > guarantees about when the freeing will occur and cannot grow exponentially
> > before the async worker can do the freeing.
>
> I don't understand your point. Please help me. In the current code,
> for the previous swapin fault that releases the swap slots into the
> swap slot caches. Let's say the swap slot remains in the cache for X
> seconds until Nth (N < 64) swapin page fault later, the cache is full
> and finally all 64 swap slot caches are free. Are you suggesting there
> is some kind of guarantee X is less than some fixed bound seconds?
> What is that bound then? 10 seconds? 1 minute?
>
> BTW, there will be no exponential growth, that is guaranteed. Until
> the 64 entries cache were freed. The swapin code will take the direct
> free path for the current swap slot in hand. The direct free path
> existed before my change.

FWIW, the bound is 64 * the number of CPUs, since the swap slot caches
are per-CPU.
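
To put rough numbers on the 64 * nr_cpus figure, here is a back-of-the-envelope sketch. It is illustrative arithmetic only, not kernel code: the function and constant names are made up for this example, and it assumes one 64-entry cache per CPU, 4 KiB pages, and one page per swap entry.

```python
# Illustrative arithmetic only; these names are hypothetical, not kernel symbols.
SLOTS_PER_CPU_CACHE = 64   # per-CPU cache size quoted in this thread
PAGE_SIZE_BYTES = 4096     # assumption: 4 KiB pages, one page per swap entry


def max_deferred_entries(nr_cpus: int) -> int:
    """Upper bound on swap entries sitting in per-CPU caches awaiting free."""
    if nr_cpus < 1:
        raise ValueError("nr_cpus must be at least 1")
    return SLOTS_PER_CPU_CACHE * nr_cpus


def max_deferred_bytes(nr_cpus: int) -> int:
    """The same bound expressed as swap space, under the page-size assumption."""
    return max_deferred_entries(nr_cpus) * PAGE_SIZE_BYTES


if __name__ == "__main__":
    for cpus in (4, 64, 256):
        entries = max_deferred_entries(cpus)
        mib = max_deferred_bytes(cpus) / (1024 * 1024)
        print(f"{cpus} CPUs: up to {entries} entries (~{mib:.0f} MiB of swap)")
```

Under those assumptions the bound grows linearly with the CPU count, so it stays small on a laptop-class machine and reaches tens of MiB of swap only on machines with hundreds of CPUs.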