From: Kairui Song
Date: Mon, 20 Nov 2023 19:17:12 +0800
Subject: Re: [PATCH 08/24] mm/swap: check readahead policy per entry
To: "Huang, Ying"
Cc: linux-mm@kvack.org, Andrew Morton, David Hildenbrand, Hugh Dickins,
	Johannes Weiner, Matthew Wilcox, Michal Hocko, linux-kernel@vger.kernel.org
In-Reply-To: <87r0klarjp.fsf@yhuang6-desk2.ccr.corp.intel.com>
References: <20231119194740.94101-1-ryncsn@gmail.com>
	<20231119194740.94101-9-ryncsn@gmail.com>
	<87r0klarjp.fsf@yhuang6-desk2.ccr.corp.intel.com>

Huang, Ying wrote on Mon, 20 Nov 2023 at 14:07:
>
> Kairui Song writes:
>
> > From: Kairui Song
> >
> > Currently VMA readahead is globally disabled when any rotating disk is
> > used as a swap backend. So when multiple swap devices are enabled, if a
> > slower hard disk is set as a low priority fallback and a high performance
> > SSD is used as the high priority swap device, VMA readahead is disabled
> > globally and the SSD swap device's performance drops by a lot.
> >
> > Check the readahead policy per entry to avoid such a problem.
> >
> > Signed-off-by: Kairui Song
> > ---
> >  mm/swap_state.c | 12 +++++++-----
> >  1 file changed, 7 insertions(+), 5 deletions(-)
> >
> > diff --git a/mm/swap_state.c b/mm/swap_state.c
> > index ff6756f2e8e4..fb78f7f18ed7 100644
> > --- a/mm/swap_state.c
> > +++ b/mm/swap_state.c
> > @@ -321,9 +321,9 @@ static inline bool swap_use_no_readahead(struct swap_info_struct *si, swp_entry_
> >  	return data_race(si->flags & SWP_SYNCHRONOUS_IO) && __swap_count(entry) == 1;
> >  }
> >
> > -static inline bool swap_use_vma_readahead(void)
> > +static inline bool swap_use_vma_readahead(struct swap_info_struct *si)
> >  {
> > -	return READ_ONCE(enable_vma_readahead) && !atomic_read(&nr_rotate_swap);
> > +	return data_race(si->flags & SWP_SOLIDSTATE) && READ_ONCE(enable_vma_readahead);
> >  }
> >
> >  /*
> > @@ -341,7 +341,7 @@ struct folio *swap_cache_get_folio(swp_entry_t entry,
> >
> >  	folio = filemap_get_folio(swap_address_space(entry), swp_offset(entry));
> >  	if (!IS_ERR(folio)) {
> > -		bool vma_ra = swap_use_vma_readahead();
> > +		bool vma_ra = swap_use_vma_readahead(swp_swap_info(entry));
> >  		bool readahead;
> >
> >  		/*
> > @@ -920,16 +920,18 @@ static struct page *swapin_no_readahead(swp_entry_t entry, gfp_t gfp_mask,
> >  struct page *swapin_readahead(swp_entry_t entry, gfp_t gfp_mask,
> >  			      struct vm_fault *vmf, bool *swapcached)
> >  {
> > +	struct swap_info_struct *si;
> >  	struct mempolicy *mpol;
> >  	struct page *page;
> >  	pgoff_t ilx;
> >  	bool cached;
> >
> > +	si = swp_swap_info(entry);
> >  	mpol = get_vma_policy(vmf->vma, vmf->address, 0, &ilx);
> > -	if (swap_use_no_readahead(swp_swap_info(entry), entry)) {
> > +	if (swap_use_no_readahead(si, entry)) {
> >  		page = swapin_no_readahead(entry, gfp_mask, mpol, ilx, vmf->vma->vm_mm);
> >  		cached = false;
> > -	} else if (swap_use_vma_readahead()) {
> > +	} else if (swap_use_vma_readahead(si)) {
>
> It's possible that some pages are swapped out to SSD while others are
> swapped out to HDD in a readahead window.
>
> I suspect that there are practical requirements to use swap on SSD and
> HDD at the same time.

Hi Ying,

Thanks for the review!

For the first issue, the fragmented readahead window, I was planning to
add an extra check in the readahead path to skip readahead entries that
are on a different swap device, which is not hard to do, but this series
is growing too long, so I thought it would be better done later (a rough
sketch of what I mean follows below).

For the second issue, whether there is any practical use for multiple
swap devices, I think there actually is. For example, we are trying to
use multi-layer swap to offload memory of different hotness on servers.
We also tried to implement a mechanism that migrates long-sleeping swap
entries from a high performance SSD/RAMDISK swap device to a cheap HDD
swap device, with more than two layers of swap. That worked except for
this upstream issue: the readahead policy no longer works as expected.
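The extra check I have in mind would look roughly like the sketch below,
inside the swap_vma_readahead() scan loop. This is illustrative only and
not part of this series; `fentry` here stands for the swap entry that
triggered the fault:

	/*
	 * Sketch: skip readahead for PTEs whose swap entry lives on a
	 * different swap device than the faulting entry, so a fault on
	 * the SSD device never queues readahead I/O against the HDD
	 * device (and vice versa).
	 */
	pentry = ptep_get_lockless(pte);
	if (!is_swap_pte(pentry))
		continue;
	entry = pte_to_swp_entry(pentry);
	if (unlikely(non_swap_entry(entry)))
		continue;
	if (swp_type(entry) != swp_type(fentry))
		continue;	/* not on the faulting device, skip */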
> >  		page = swap_vma_readahead(entry, gfp_mask, mpol, ilx, vmf);
> >  		cached = true;
> >  	} else {
>
> --
> Best Regards,
> Huang, Ying