From: Barry Song <21cnbao@gmail.com>
Date: Tue, 10 Sep 2024 20:51:58 +1200
Subject: Re: 回复: [PATCH v2] mm: add lazyfree folio to lru tail
To: gaoxu
Cc: Minchan Kim, Lokesh Gidra, Suren Baghdasaryan, Nicolas Geoffray, Michal Hocko, Andrew Morton, "linux-mm@kvack.org", "linux-kernel@vger.kernel.org", Shaohua Li, yipengxiang, fengbaopeng, Kalesh Singh
References: <1757d01334ee4391beba1ea3dcdfed7c@honor.com> <61f07b0979814462bae19b3cf5a34663@honor.com>
In-Reply-To: <61f07b0979814462bae19b3cf5a34663@honor.com>

On Thu, Aug 29, 2024 at 3:55 PM gaoxu wrote:
>
> > On Tue, Aug 27, 2024 at 04:07:57AM +0000, gaoxu wrote:
> > > >
> > > > On Mon, Aug 26, 2024 at 12:55 PM Barry Song <21cnbao@gmail.com> wrote:
> > > > >
> > > > > On Tue, Aug 27, 2024 at 4:37 AM Lokesh Gidra wrote:
> > > > > >
> > > > > > Thanks Suren for looping in
> > > > > >
> > > > > > On Fri, Aug 23, 2024 at 4:39 PM Suren Baghdasaryan wrote:
> > > > > > >
> > > > > > > On Wed, Aug 21, 2024 at 2:47 PM Barry Song <21cnbao@gmail.com> wrote:
> > > > > > > >
> > > > > > > > On Wed, Aug 21, 2024 at 8:46 PM Michal Hocko wrote:
> > > > > > > > >
> > > > > > > > > On Fri 16-08-24 07:48:01, gaoxu wrote:
> > > > > > > > > > Replace lruvec_add_folio with lruvec_add_folio_tail in the lru_lazyfree_fn:
> > > > > > > > > > 1. The lazy-free folio is added to the LRU_INACTIVE_FILE list. If it's
> > > > > > > > > >    moved to the LRU tail, it allows for faster release lazy-free folio and
> > > > > > > > > >    reduces the impact on file refault.
> > > > > > > > >
> > > > > > > > > This has been discussed when MADV_FREE was introduced. The question was
> > > > > > > > > whether this memory has a lower priority than other inactive memory that
> > > > > > > > > has been marked that way longer ago. Also consider several MADV_FREE
> > > > > > > > > users should they be LIFO from the reclaim POV?
> > > > > >
> > > > > > Thinking from the user's perspective, it seems to me that FIFO within
> > > > > > MADV_FREE'ed pages makes more sense. As a user I expect the longer a
> > > > > > MADV_FREE'ed page hasn't been touched, the chances are higher that it
> > > > > > may not be around anymore.
> > > > > >
> > > > >
> > > > > Hi Lokesh,
> > > > > Thanks!
> > > > >
> > > > > > > > The priority of this memory compared to other inactive memory that has been
> > > > > > > > marked for a longer time likely depends on the user's expectations - How soon
> > > > > > > > do users expect MADV_FREE to be reclaimed compared with old file folios.
> > > > > > > >
> > > > > > > > art guys moved to MADV_FREE from MADV_DONTNEED without any
> > > > > > > > useful performance data and reason in the changelog:
> > > > > > > > https://android-review.googlesource.com/c/platform/art/+/2633132
> > > > > > > >
> > > > > > > > Since art is the Android Java heap, it can be quite large. This increases the
> > > > > > > > likelihood of packing the file LRU and reduces the chances of reclaiming
> > > > > > > > anonymous memory, which could result in more file re-faults while helping
> > > > > > > > anonymous folio persist longer in memory.
> > > > > >
> > > > > > Individual heaps of android apps are not big, and even in there we
> > > > > > don't call MADV_FREE on the entire heap.
> > > > >
> > > > > How do you define "Individual heaps of android apps", do you know the usual
> > > > > total_size for a phone with memory pressure by running multiple apps and how
> > > > > much for each app?
> > > > >
> > > > Every app is a separate process and therefore has its own private ART
> > > > heap. Those numbers that you are asking vary drastically. But here's
> > > > what I can tell you:
> > > >
> > > > Max heap size for an app is 512MB typically. But it is rarely entirely
> > > > used. Typical heap usage is 50MB to 250MB. But as I said, not all of
> > > > it is MADV_FREE'ed. Only those pages which are freed after GC
> > > > compaction are.
> > > > > > >
> > > > > > > > I am really curious why art guys have moved to MADV_FREE if we have
> > > > > > > > an approach to reach them.
> > > > > >
> > > > > > Honestly, it makes little sense as a user that calling MADV_FREE on an
> > > > > > anonymous mapping will impact file LRU. That was never the intention
> > > > > > with our ART change.
> > > > > >
> > > > >
> > > > > This is just how MADV_FREE is implemented in the kernel, this kind of lazyfree
> > > > > anon folios are moved to file but *NOT* anon LRU.
> > > > >
> > > > > > From our perspective, once a set of pages are MADV_FREE'ed, they are
> > > > > > like a page-cache. It gives an opportunity, without hurting memory
> > > > > > use, to avoid overhead of page-faults, which happen frequently after
> > > > > > GC is done on running apps.
> > > > > >
> > > > > > IMHO, within LRU_INACTIVE_FILE, MADV_FREE'ed pages should be
> > > > > > prioritized for reclamation over file ones.
> > > > >
> > > > > This is exactly what this patch is doing, putting lazyfree anon folios
> > > > > to the tail of file LRU so that they can be reclaimed earlier than file
> > > > > folios. But the question is: is the requirement "MADV_FREE'ed pages
> > > > > should be prioritized for reclamation over file ones" universally true for
> > > > > all other non-Android users?
> > > > >
> > > > That's definitely an important question to get answered. But putting
> > > > my users hat on again, by explicitly MADV_FREE'ing we ask for that
> > > > behavior. IMHO, MADV_FREE'ed pages should be the first ones to be
> > > > reclaimed on memory pressure.
> > > For non-Android systems, perhaps the author of MADV_FREE can provide a more
> > > reasonable opinion;
> > >
> > > Add Minchan Kim.
> > > Please forgive me for forgetting to add you when sending the patch.
> >
> > AFAIR, there were two concerns:
> >
> > 1. The file LRU would contain pages used only once.
> >
> > While MADV_FREE allows discarding pages under memory pressure, the system would
> > still have non-working set pages within the file LRU (e.g., those used only once).
> >
> >
> > 2. LRU inversion among MADV_FREE users.
> >
> > Consider this time order:
> >
> > 1. A process: MADV_FREE
> > 2. B process: MADV_FREE
> > 3. C process: MADV_FREE
> >
> > The moving tail approach would discard the most recent pages from Process C first,
> > instead of those from Process A.
> >
> > Of course, this isn't universally true for all workloads, but it's the reality.
> After enabling MGLRU, the implementation of age and evict based on gen dilutes the FIFO mechanism. Although the joining time points are different, they are all reclaimed based on the same gen.
> Android has always been plagued by performance issues caused by high IO. Many engineers adjust strategies to prefer reclaiming anon when the system is low on memory. For the same reason,
> we believe lazy free folio should prioritize file reclamation.(If I misunderstood, please correct me.)
>
> For other discussions that lean towards reclaiming anon folio, please refer to:
> https://patchwork.kernel.org/project/linux-mm/cover/20231108065818.19932-1-link@vivo.com/
>
> Adding lazyfree folio to the LRU tail has no impact on the Android system, allowing the system to normally utilize the reuse of MADV_FREE when not in a low mem state.
> If added to the file LRU head, the Android system will encounter various issues such as high IO and heavy kswapd load, forcing us to prohibit Android ART from continuing to use MADV_FREE.
> Adding lazyfree folio to the LRU tail is not the best approach, but it is more acceptable compared to adding it to the LRU head.
>
> >
> > At the time, I proposed introducing an additional "ez_reclaimable" LRU list to
> > store MADV_FREE pages
> > (and potentially other hinted pages in the future).
> > This would allow differentiating priority among LRU lists based on knobs or
> > heuristics.
> This solution looks good, might need to think about how to adapt it to mglru.

That's right. Adapting to MGLRU isn't straightforward. We might need a separate
generation smaller than min_seq for this, or alternatively, it could be handled by
a separate LRU list that isn't tied to any MGLRU generation. both seem hard.

> > However, this idea wasn't well-received.
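
For readers following the ordering argument in this thread, here is a toy
Python model of it. It is only a sketch, not kernel code. It assumes a single
classic (non-MGLRU) inactive file list that reclaim scans strictly from the
tail, two pre-existing file folios, and three processes A, B, C calling
MADV_FREE in that order. The function names and the folio labels are made up
for illustration. The last function models the separate "ez_reclaimable" list
under the extra assumption that it is always drained before the file list;
the thread only says its priority would be set by knobs or heuristics.

```python
from collections import deque


def reclaim_order(arrivals, add_to_tail):
    """Return the reclaim order for a toy inactive file list.

    The deque's left end models the LRU head and its right end the LRU tail.
    Reclaim always takes from the tail (an assumption of this model).
    """
    # Two pre-existing file folios; the older one sits at the tail.
    lru = deque(["file-new", "file-old"])
    for folio in arrivals:
        if add_to_tail:
            # Analogous to lruvec_add_folio_tail(): next in line for reclaim.
            lru.append(folio)
        else:
            # Analogous to lruvec_add_folio(): queued behind existing folios.
            lru.appendleft(folio)
    order = []
    while lru:
        order.append(lru.pop())  # reclaim from the tail
    return order


def reclaim_order_separate_list(arrivals):
    """Toy version of a dedicated list for MADV_FREE folios.

    Assumption for illustration: the dedicated list is FIFO and is fully
    drained before the file list is touched.
    """
    ez = deque()
    file_lru = deque(["file-new", "file-old"])
    for folio in arrivals:
        ez.appendleft(folio)  # newest at the head, oldest at the tail
    order = []
    while ez:
        order.append(ez.pop())
    while file_lru:
        order.append(file_lru.pop())
    return order


arrivals = ["A", "B", "C"]  # A called MADV_FREE first, C last
head_order = reclaim_order(arrivals, add_to_tail=False)
tail_order = reclaim_order(arrivals, add_to_tail=True)
separate_order = reclaim_order_separate_list(arrivals)

print("head insert:  ", head_order)      # file folios go first, then A, B, C
print("tail insert:  ", tail_order)      # C, B, A go first (the inversion)
print("separate list:", separate_order)  # A, B, C go first, then file folios
```

In this model, tail insertion reclaims lazyfree folios before file folios but
in LIFO order (C before A), head insertion keeps FIFO order but only after the
file folios, and a separate list is the only variant that gives both
properties at once.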