From: Yafang Shao
Date: Sun, 22 Dec 2024 10:34:12 +0800
Subject: Re: [RFC PATCH 0/2] memcg: add nomlock to avoid folios beling mlocked in a memcg
To: Michal Hocko
Cc: hannes@cmpxchg.org, roman.gushchin@linux.dev, shakeel.butt@linux.dev, muchun.song@linux.dev, akpm@linux-foundation.org, linux-mm@kvack.org
References: <20241215073415.88961-1-laoar.shao@gmail.com>

On Sat, Dec 21, 2024 at 3:21 PM Michal Hocko wrote:
>
> On Fri 20-12-24 19:52:16, Yafang Shao wrote:
> > On Fri, Dec 20, 2024 at 6:23 PM Michal Hocko wrote:
> > >
> > > On Sun 15-12-24 15:34:13, Yafang Shao wrote:
> > > > Implementation Options
> > > > ----------------------
> > > >
> > > > - Solution A: Allow file caches on the unevictable list to become
> > > >   reclaimable.
> > > >   This approach would require significant refactoring of the page reclaim
> > > >   logic.
> > > >
> > > > - Solution B: Prevent file caches from being moved to the unevictable list
> > > >   during mlock and ignore the VM_LOCKED flag during page reclaim.
> > > >   This is a more straightforward solution and is the one we have chosen.
> > > >   If the file caches are reclaimed from the download-proxy's memcg and
> > > >   subsequently accessed by tasks in the application’s memcg, a filemap
> > > >   fault will occur. A new file cache will be faulted in, charged to the
> > > >   application’s memcg, and locked there.
> > >
> > > Both options are silently breaking userspace because a non failing mlock
> > > doesn't give guarantees it is supposed to AFAICS.
> >
> > It does not bypass the mlock mechanism; rather, it defers the actual
> > locking operation to the page fault path. Could you clarify what you
> > mean by "a non-failing mlock"? From what I can see, mlock can indeed
> > fail if there isn’t sufficient memory available. With this change, we
> > are simply shifting the potential failure point to the page fault path
> > instead.
>
> Your change will cause mlocked pages (as mlock syscall returns success)
> to be reclaimable later on. That breaks the basic mlock contract.

AFAICS, the mlock() behavior was originally designed with only a single
root memory cgroup in mind. In other words, when mlock() was introduced,
all locked pages were confined to the same memcg.

However, this changed with the introduction of memcg support. Now,
mlock() can lock pages that belong to a different memcg than the current
task. This behavior is not explicitly defined in the mlock()
documentation, which could lead to confusion.

To clarify, I propose updating the mlock() documentation as follows:

When memcg is enabled, the page being locked might reside in a
different memcg than the current task.
In such cases, the page might be reclaimed if mlock() is not permitted
in its original memcg. If the locked page is reclaimed, it could be
faulted back into the current task's memcg and then locked again.

Additionally, encountering a single page fault during this process
should be acceptable to most users. If your application cannot tolerate
even a single page fault, you likely wouldn’t enable memcg in the first
place.

-- 
Regards
Yafang