From: Barry Song <21cnbao@gmail.com>
Date: Fri, 29 Nov 2024 12:08:48 +1300
Subject: Re: [RFC PATCH v2 1/1] mm/vmscan: move the written-back folios to the tail of LRU after shrinking
To: chenridong
Cc: Matthew Wilcox, Chris Li, Chen Ridong, akpm@linux-foundation.org,
    mhocko@suse.com, hannes@cmpxchg.org, yosryahmed@google.com,
    yuzhao@google.com, david@redhat.com, ryan.roberts@arm.com,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    wangweiyang2@huawei.com, xieym_ict@hotmail.com
In-Reply-To: <7e617fe7-388f-43a1-b0fa-e2998194b90c@huawei.com>
References: <20241116091658.1983491-1-chenridong@huaweicloud.com> <20241116091658.1983491-2-chenridong@huaweicloud.com> <7e617fe7-388f-43a1-b0fa-e2998194b90c@huawei.com>
On Mon, Nov 25, 2024 at 2:19 PM chenridong wrote:
>
> On 2024/11/18 12:21, Matthew Wilcox wrote:
> > On Mon, Nov 18, 2024 at 05:14:14PM +1300, Barry Song wrote:
> >> On Mon, Nov 18, 2024 at 5:03 PM Matthew Wilcox wrote:
> >>>
> >>> On Sat, Nov 16, 2024 at 09:16:58AM +0000, Chen Ridong wrote:
> >>>> 2. In the shrink_page_list function, if folioN is a THP (2M), it may be
> >>>>    split and added to the swap cache folio by folio. After being added to
> >>>>    the swap cache, IO is submitted to write the folio back to swap, which
> >>>>    is asynchronous. When shrink_page_list finishes, the isolated folios
> >>>>    list is moved back to the head of the inactive LRU. The inactive LRU
> >>>>    may then look like this, with 512 folios having been moved to its head.
> >>>
> >>> I was hoping that we'd be able to stop splitting the folio when adding
> >>> to the swap cache. Ideally, we'd add the whole 2MB and write it back
> >>> as a single unit.
> >>
> >> This is already the case: adding to the swapcache doesn't require splitting
> >> THPs, but failing to allocate 2MB of contiguous swap slots will.
> >
> > Agreed, we need to understand why this is happening. As I've said a few
> > times now, we need to stop requiring contiguity. Real filesystems don't
> > need the contiguity (they become less efficient, but they can scatter a
> > single 2MB folio to multiple places).
> >
> > Maybe Chris has a solution to this in the works?
> >
>
> Hi, Chris, do you have a better idea to solve this issue?

Not Chris, but as I read the code again, evict_folios() already has the
code below to fix up the "missed folio_rotate_reclaimable()" issue:

	/* retry folios that may have missed folio_rotate_reclaimable() */
	list_move(&folio->lru, &clean);

Doesn't it work for you?
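To make the mechanism clearer, here is a rough sketch of how that retry
is wired into evict_folios() in mm/vmscan.c. This is a paraphrase, not
the verbatim upstream loop: "list", "pgdat", "sc" and "stat" are the
locals of that function, and several checks the real code performs are
elided here.

	LIST_HEAD(clean);
	struct folio *folio, *next;
	bool skip_retry = false;

retry:
	reclaimed = shrink_folio_list(&list, pgdat, sc, &stat, false);

	list_for_each_entry_safe_reverse(folio, next, &list, lru) {
		/*
		 * Writeback still in flight: leave the folio in "list" so
		 * it goes back to the LRU and folio_rotate_reclaimable()
		 * can rotate it once writeback completes.
		 */
		if (folio_test_reclaim(folio) &&
		    (folio_test_dirty(folio) || folio_test_writeback(folio)))
			continue;

		/*
		 * The real loop handles more cases here (unevictable or
		 * re-activated folios, etc.); for this sketch only the
		 * retry matters: folios that may have missed
		 * folio_rotate_reclaimable().
		 */
		list_move(&folio->lru, &clean);
	}

	/* after putting "list" back, give the now-clean folios one more pass */
	if (!list_empty(&clean) && !skip_retry) {
		list_splice_init(&clean, &list);
		skip_retry = true;
		goto retry;
	}

The point is that a folio whose writeback completed while it was still
isolated is clean by the time this loop runs, so rather than being put
back at the head of the inactive LRU it gets a second pass through
shrink_folio_list() and is typically freed there, which is where the
~99% figure in the commit below comes from.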
commit 359a5e1416caaf9ce28396a65ed3e386cc5de663
Author: Yu Zhao
Date:   Tue Nov 15 18:38:07 2022 -0700

    mm: multi-gen LRU: retry folios written back while isolated

    The page reclaim isolates a batch of folios from the tail of one of the
    LRU lists and works on those folios one by one.  For a suitable
    swap-backed folio, if the swap device is async, it queues that folio for
    writeback.  After the page reclaim finishes an entire batch, it puts back
    the folios it queued for writeback to the head of the original LRU list.

    In the meantime, the page writeback flushes the queued folios also by
    batches.  Its batching logic is independent from that of the page reclaim.
    For each of the folios it writes back, the page writeback calls
    folio_rotate_reclaimable() which tries to rotate a folio to the tail.

    folio_rotate_reclaimable() only works for a folio after the page reclaim
    has put it back.  If an async swap device is fast enough, the page
    writeback can finish with that folio while the page reclaim is still
    working on the rest of the batch containing it.  In this case, that folio
    will remain at the head and the page reclaim will not retry it before
    reaching there.

    This patch adds a retry to evict_folios().  After evict_folios() has
    finished an entire batch and before it puts back folios it cannot free
    immediately, it retries those that may have missed the rotation.

    Before this patch, ~60% of folios swapped to an Intel Optane missed
    folio_rotate_reclaimable().  After this patch, ~99% of missed folios were
    reclaimed upon retry.

    This problem affects relatively slow async swap devices like Samsung 980
    Pro much less and does not affect sync swap devices like zram or zswap at
    all.

>
> Best regards,
> Ridong

Thanks
Barry