From: Yu Zhao
Date: Thu, 14 Dec 2023 23:46:29 -0700
Subject: Re: [RFC v2] mm: Multi-Gen LRU: fix use mm/page_idle/bitmap
To: Henry Huang
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, 谈鉴锋, 朱辉(茶水), akpm@linux-foundation.org
References: <20231208071235.17812-1-henry.hj@antgroup.com>
In-Reply-To: <20231208071235.17812-1-henry.hj@antgroup.com>

On Fri, Dec 8, 2023 at 12:12 AM Henry Huang wrote:
>
> Thanks for replying this RFC.
>
> > 1. page_idle/bitmap isn't a capable interface at all -- yes, Google
> > proposed the idea [1], but we don't really use it anymore because of
> > its poor scalability.
>
> In our environment, we use /sys/kernel/mm/page_idle/bitmap to check
> pages whether were accessed during a period of time.

Is it a production environment? If so, what's your
1. scan interval
2. memory size

I'm trying to understand why scalability isn't a problem for you. On
an average server, there are hundreds of millions of PFNs, so it'd be
very expensive to use that ABI even for a time interval of minutes.

> We manage all pages
> idle time in userspace. Then use a prediction algorithm to select pages
> to reclaim. These pages would more likely be idled for a long time.

"There is a system in place now that is based on a user-space process
that reads a bitmap stored in sysfs, but it has a high CPU and memory
overhead, so a new approach is being tried."
https://lwn.net/Articles/787611/

Could you elaborate on how you solved this problem?

> We only need kernel to tell us whether a page is accessed, a boolean
> value in kernel is enough for our case.

How do you define "accessed"? I.e., through page tables or file
descriptors or both?

> > 2. PG_idle/young, being a boolean value, has poor granularity. If
> > anyone must use page_idle/bitmap for some specific reason, I'd
> > recommend exporting generation numbers instead.
>
> Yes, at first time, we try using multi-gen LRU proactive scan and
> exporting generation&refs number to do the same thing.
>
> But there are several problems:
>
> 1. multi-gen LRU only care about self-memcg pages. In our environment,
> it's likely to see that different memcg's process share pages.

This is related to my question above: are those pages mapped into
different memcgs or not?

> multi-gen LRU only update gen of pages in current memcg. It's hard to
> judge a page whether is accessed depends on gen update.

This depends. I'd be glad to elaborate after you clarify the above.

> We still have no ideas how to solve this problem.
>
> 2. We set swappiness 0, and use proactive scan to select cold pages
> & proactive reclaim to swap anon pages. But we can't control passive
> scan(can_swap = false), which would make anon pages cold/hot inversion
> in inc_min_seq.

There is an option to prevent the inversion: IIUC, the force_scan
option is what you are looking for.
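
For reference, a minimal sketch of how force_scan could be passed,
assuming it refers to the optional last field of the aging command
accepted by the MGLRU debugfs file (/sys/kernel/debug/lru_gen), whose
documented form is "+ memcg_id node_id max_gen_nr [can_swap [force_scan]]".
The helper names aging_command() and run_aging() and the example IDs
are hypothetical and only illustrate the command layout; please check
Documentation/admin-guide/mm/multigen_lru.rst for your kernel version
before relying on it.

```python
from typing import Optional

# Assumption: debugfs is mounted at the usual location and the kernel
# was built with CONFIG_LRU_GEN.
LRU_GEN_DEBUGFS = "/sys/kernel/debug/lru_gen"


def aging_command(memcg_id: int, node_id: int, max_gen_nr: int,
                  can_swap: Optional[int] = None,
                  force_scan: Optional[int] = None) -> str:
    """Build '+ memcg_id node_id max_gen_nr [can_swap [force_scan]]'.

    The two optional fields are positional, so force_scan can only be
    given when can_swap is given as well.
    """
    if force_scan is not None and can_swap is None:
        raise ValueError("force_scan requires can_swap to be set too")
    fields = ["+", str(memcg_id), str(node_id), str(max_gen_nr)]
    for opt in (can_swap, force_scan):
        if opt is not None:
            fields.append(str(opt))
    return " ".join(fields)


def run_aging(command: str, path: str = LRU_GEN_DEBUGFS) -> None:
    """Write one command to the debugfs file (needs root)."""
    with open(path, "w") as f:
        f.write(command + "\n")


if __name__ == "__main__":
    # Hypothetical values: memcg 1, node 0, and generation number 7 as
    # it would be read back from the debugfs file beforehand.
    print(aging_command(1, 0, 7, can_swap=1, force_scan=1))  # + 1 0 7 1 1
```

The sketch only builds and prints the command string; run_aging()
is left uncalled because it needs root and a kernel with MGLRU
enabled. Whether this is the right knob for the passive-scan case
described above is an assumption on top of the IIUC.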