From: Barry Song <21cnbao@gmail.com>
Date: Fri, 18 Oct 2024 18:12:01 +1300
Subject: Re: [PATCH v2] mm: mglru: provide a separate list for lazyfree anon folios
To: Minchan Kim
Cc: Andrew Morton, yuzhao@google.com, linux-mm@kvack.org, david@redhat.com,
    fengbaopeng@honor.com, gaoxu2@honor.com, hailong.liu@oppo.com,
    kaleshsingh@google.com, linux-kernel@vger.kernel.org,
    lokeshgidra@google.com, mhocko@suse.com, ngeoffray@google.com,
    shli@fb.com, surenb@google.com, v-songbaohua@oppo.com,
    yipengxiang@honor.com, Gao Xu
References: <20241016033030.36990-1-21cnbao@gmail.com>
    <20241016155835.8fadc58d913d9df14099514b@linux-foundation.org>

On Fri, Oct 18, 2024 at 6:58 AM Minchan Kim wrote:
>
> On Thu, Oct 17, 2024 at 06:59:09PM +1300, Barry Song wrote:
> > On Thu, Oct 17, 2024 at 11:58 AM Andrew Morton wrote:
> > >
> > > On Wed, 16 Oct 2024 16:30:30 +1300 Barry Song <21cnbao@gmail.com> wrote:
> > >
> > > > To address this, this patch proposes maintaining a separate list
> > > > for lazyfree anon folios while keeping them classified under the
> > > > "file" LRU type to minimize code changes.
> > >
> > > Thanks. I'll await input from other MGLRU developers before adding
> > > this for testing.
> >
> > Thanks!
> >
> > Hi Minchan, Yu,
> >
> > Any comments? I understand that Minchan may have a broader plan
> > to "enable the system to maintain a quickly reclaimable memory
> > pool and provide a knob for admins to control its size." While I
> > have no objection to that plan, I believe improving MADV_FREE
> > performance is a more urgent priority and a low-hanging fruit at this
> > stage.
>
> Hi Barry,
>
> I have no idea why my email didn't send well before. I sent following
> reply on Sep 24. Hope it works this time.

Hi Minchan,

I guess not. This email of yours ended up in my Gmail spam folder, and
my oppo.com account still hasn't received it. Any idea why?

>
> ====== &< ======
>
> My proposal involves the following:
>
> 1. Introduce an "easily reclaimable" LRU list. This list would hold pages
>    that can be quickly freed without significant overhead.

I assume you plan to keep both lazyfree anon pages and 'reclaimed'
file folios (reclaimed in the normal LRU lists but still in the
easily-reclaimable list) in this 'easily reclaimable' LRU list. However,
I'm not sure this will work, as this patch aims to help reclaim lazyfree
anon pages before file folios to reduce both file and anon refaults.
If we place 'reclaimed' file folios and lazyfree anon folios in the
same list, we may need to revisit how to reclaim lazyfree anon folios
before reclaiming the 'reclaimed' file folios.

>
> 2. Implement a parameter to control the size of this list. This allows for
>    system tuning based on available memory and performance requirements.

If we include only 'reclaimed' file folios in this 'easily
reclaimable' LRU list, the parameter makes sense.
However, if we also add lazyfree folios to the list, the parameter
becomes less meaningful, since we can't predict how many lazyfree anon
folios user space might have. I still feel lazyfree anon folios are
different from "reclaimed" file folios (I mean folios reclaimed from the
normal lists but still in the 'easily-reclaimable' list).

>
> 3. Modify kswapd behavior to utilize this list. When kswapd is awakened due
>    to memory pressure, it should attempt to drop those pages first to refill
>    free pages up to the high watermark by first reclaiming.
>
> 4. Before kswapd goes to sleep, it should scan the tail of the LRU list and
>    move cold pages to the easily reclaimable list, unmapping them from the
>    page table.
>
> 5. Whenever page cache hit, move the page into evictable LRU.
>
> This approach allows the system to maintain a pool of readily available
> memory, mitigating the "aging" problem. The trade-off is the potential for
> minor page faults and LRU movement overheads if these pages in ez_reclaimable
> LRU are accessed again.

I believe you're aware of an implementation from Samsung that uses
cleancache. Although cleancache was dropped from the mainline kernel,
it still exists in the Android kernel. Samsung's rbincache, based on
cleancache, maintains a reserved memory region for holding reclaimed
file folios. Instead of LRU movement, rbincache uses memcpy to transfer
data between the pool and the page cache.

>
> Furthermore, we could put some asynchronous writeback pages (e.g., swap
> out or writeback the fs pages) into the list, too.
> Currently, what we are doing is rotate those pages back to head of LRU
> and once writeback is done, move the page to the tail of LRU again.
> We can simply put the page into ez_reclaimable LRU without rotating
> back and forth.

If this is about establishing a pool of easily reclaimable file folios,
I fully support the idea and am eager to try it, especially for Android,
where there are certainly strong use cases.
However, I suspect it may be controversial and could take months to gain
acceptance. Therefore, I'd prefer we first focus on landing a smaller
change to address the madv_free performance issue and treat that idea as
a separate incremental patch set.

My current patch specifically targets the issue of reclaiming lazyfree
anon folios before reclaiming file folios. It appears your proposal is
independent (though related) work, and I don't believe it should delay
resolving the madv_free issue. Additionally, that pool doesn't
effectively address the reclamation priority between file folios and
lazyfree anon folios.

In conclusion:

1. I agree that the pool is valuable, and I'd like to develop it as an
   incremental patch set. However, this is a significant step that will
   require considerable time.

2. It could be quite tricky to include both lazyfree anon folios and
   reclaimed file folios (reclaimed from the normal lists but not yet
   from the 'easily-reclaimable' list) in the same LRU list. I'd prefer
   to start by replacing Samsung's rbincache to reduce file folio I/O if
   we decide to implement the pool.

3. I believe we should first focus on landing this fix patch for the
   madv_free performance issue.

What are your thoughts? I spoke with Yu, and he would like to hear
your opinion.

Thanks
Barry
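
For readers following the thread, the ordering this patch is after (drop
lazyfree anon folios first, then fall back to file folios) can be
pictured with a toy user-space model. This is only a sketch under stated
assumptions: it is not kernel code, it ignores generations, tiers and
refault tracking, and ToyReclaimModel, add_file, add_lazyfree and reclaim
are hypothetical names invented for illustration, not MGLRU APIs.

```python
from collections import deque


class ToyReclaimModel:
    """Toy model of two separate lists (hypothetical, not kernel code)."""

    def __init__(self):
        # The oldest entry sits at the left end of each deque.
        self.lazyfree = deque()  # anon folios marked lazyfree (MADV_FREE)
        self.file = deque()      # ordinary file folios

    def add_file(self, name):
        self.file.append(name)

    def add_lazyfree(self, name):
        self.lazyfree.append(name)

    def reclaim(self, nr_to_reclaim):
        """Drop up to nr_to_reclaim folios, lazyfree list before file list."""
        reclaimed = []
        for lru in (self.lazyfree, self.file):
            while lru and len(reclaimed) < nr_to_reclaim:
                reclaimed.append(lru.popleft())
        return reclaimed


model = ToyReclaimModel()
for name in ("file-a", "file-b"):
    model.add_file(name)
for name in ("lazy-a", "lazy-b"):
    model.add_lazyfree(name)

# Ask for three folios: both lazyfree entries go first, then one file entry.
victims = model.reclaim(3)
print(victims)
```

With a single shared list the victims would instead come out in insertion
order, so file folios could be dropped while lazyfree folios still sit
behind them; keeping the lists separate is what makes the priority
explicit in this toy model.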