From: Kairui Song
Date: Tue, 12 Dec 2023 14:52:08 +0800
Subject: Re: [PATCH mm-unstable v1 1/4] mm/mglru: fix underprotected page cache
To: Yu Zhao
Cc: Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Charan Teja Kalla, Kalesh Singh, stable@vger.kernel.org
References: <20231208061407.2125867-1-yuzhao@google.com>

On Tue, Dec 12, 2023 at 06:07, Yu Zhao wrote:
>
> On Fri, Dec 8, 2023 at 1:24 AM Kairui Song wrote:
> >
> > On Fri, Dec 8, 2023 at 14:14, Yu Zhao wrote:
> > >
> > > Unmapped folios accessed through file descriptors can
> > > be underprotected. Those folios are added to the oldest generation
> > > based on:
> > > 1. The fact that they are less costly to reclaim (no need to walk the
> > >    rmap and flush the TLB) and have less impact on performance (don't
> > >    cause major PFs and can be non-blocking if needed again).
> > > 2. The observation that they are likely to be single-use. E.g., for
> > >    client use cases like Android, its apps parse configuration files
> > >    and store the data in heap (anon); for server use cases like MySQL,
> > >    it reads from InnoDB files and holds the cached data for tables in
> > >    buffer pools (anon).
> > >
> > > However, the oldest generation can be very short lived, and if so, it
> > > doesn't provide the PID controller with enough time to respond to a
> > > surge of refaults. (Note that the PID controller uses weighted
> > > refaults and those from evicted generations only take a half of the
> > > whole weight.) In other words, for a short lived generation, the
> > > moving average smooths out the spike quickly.
> > >
> > > To fix the problem:
> > > 1. For folios that are already on LRU, if they can be beyond the
> > >    tracking range of tiers, i.e., five accesses through file
> > >    descriptors, move them to the second oldest generation to give them
> > >    more time to age. (Note that tiers are used by the PID controller
> > >    to statistically determine whether folios accessed multiple times
> > >    through file descriptors are worth protecting.)
> > > 2. When adding unmapped folios to LRU, adjust the placement of them so
> > >    that they are not too close to the tail. The effect of this is
> > >    similar to the above.
> > >
> > > On Android, launching 55 apps sequentially:
> > >                            Before     After      Change
> > >   workingset_refault_anon  25641024   25598972   0%
> > >   workingset_refault_file  115016834  106178438  -8%
> >
> > Hi Yu,
> >
> > Thank you for your amazing work on MGLRU.
> >
> > I believe this is similar to the issue I was trying to resolve previously:
> > https://lwn.net/Articles/945266/
> > The idea is to use refault distance to decide whether the page should be
> > placed in the oldest generation or some other generation, which, per my
> > tests, worked very well, and we have been using refault distance for
> > MGLRU in multiple workloads.
> >
> > There are a few issues left in my previous RFC series, like anon pages
> > in MGLRU shouldn't be considered. I wanted to collect feedback or test
> > cases, but unfortunately it didn't seem to get much attention
> > upstream.
> >
> > I think both this patch and my previous series are meant to solve the
> > file page underprotection issue. I did a quick test using this
> > series, and for the mongodb test, refault distance still seems to be the
> > better solution (I'm not saying these two optimizations are mutually
> > exclusive, though; they just have some conflicts in implementation and
> > solve a similar problem):
> >
> > Previous result:
> > ==================================================================
> > Execution Results after 905 seconds
> > ------------------------------------------------------------------
> >                   Executed        Time (µs)       Rate
> >   STOCK_LEVEL     2542            27121571486.2   0.09 txn/s
> > ------------------------------------------------------------------
> >   TOTAL           2542            27121571486.2   0.09 txn/s
> >
> > This patch:
> > ==================================================================
> > Execution Results after 900 seconds
> > ------------------------------------------------------------------
> >                   Executed        Time (µs)       Rate
> >   STOCK_LEVEL     1594            27061522574.4   0.06 txn/s
> > ------------------------------------------------------------------
> >   TOTAL           1594            27061522574.4   0.06 txn/s
> >
> > The unpatched version is always around 500.
>
> Thanks for the test results!
>
> > I think there are a few points here:
> > - Refault distance makes use of page shadows, so it can better
> >   distinguish evicted pages with different access patterns (re-access
> >   distance).
> > - Throttled refault distance can help hold part of the workingset when
> >   memory is too small to hold the whole workingset.
> >
> > So maybe part of this patch and bits of the previous series can be
> > combined to work better on this issue. What do you think?
>
> I'll try to find some time this week to look at your RFC. It'd be a

Thanks!

> lot easier for me if you could share
> 1. your latest tree, preferably based on the mainline, and
> 2. your VM image containing the above test.

Sure, I'll update the RFC and try to provide an easier test reproducer.
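
For readers less familiar with the refault-distance idea discussed in this thread, here is a minimal sketch in Python. It is a toy model, not the kernel code and not the RFC's implementation: `Shadow`, `ToyLru`, `gen_for_refault`, and the threshold of one memory size are hypothetical names and choices made only for illustration. It shows the general principle that a shadow entry records when a page was evicted, so that on refault the number of evictions in between can be used to choose a generation other than the oldest.

```python
from dataclasses import dataclass


@dataclass
class Shadow:
    """Hypothetical stand-in for the shadow entry left behind at eviction."""
    eviction_clock: int  # value of the eviction counter when the page left


class ToyLru:
    """Toy model only; all names and thresholds here are assumptions."""

    def __init__(self, nr_pages, nr_gens=4):
        self.nr_pages = nr_pages  # memory size, in pages
        self.nr_gens = nr_gens    # generation 0 is the oldest
        self.eviction_clock = 0   # bumped once per evicted page

    def evict(self):
        """Evict one page and return a shadow remembering the clock value."""
        self.eviction_clock += 1
        return Shadow(self.eviction_clock)

    def refault_distance(self, shadow):
        """Number of pages evicted after the page this shadow belongs to."""
        return self.eviction_clock - shadow.eviction_clock

    def gen_for_refault(self, shadow):
        """Pick a generation index for a refaulting page.

        Assumed policy: a page that comes back within one memory size worth
        of evictions would probably have stayed resident with a little more
        protection, so it goes to the second oldest generation (index 1).
        Anything with a longer distance goes to the oldest (index 0).
        """
        if self.refault_distance(shadow) <= self.nr_pages:
            return min(1, self.nr_gens - 1)
        return 0


if __name__ == "__main__":
    lru = ToyLru(nr_pages=100)
    shadow = lru.evict()
    for _ in range(50):
        lru.evict()
    print("distance:", lru.refault_distance(shadow))
    print("generation:", lru.gen_for_refault(shadow))
```

The design point this is meant to illustrate is the one raised in the reply: because the decision uses per-page eviction history, pages with different re-access distances can be treated differently, whereas placement based only on access counts cannot tell them apart once they have been evicted.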