References: <20231119194740.94101-1-ryncsn@gmail.com> <20231119194740.94101-24-ryncsn@gmail.com> <87msv8c1xy.fsf@yhuang6-desk2.ccr.corp.intel.com>
From: Chris Li
Date: Tue, 21 Nov 2023 21:34:37 -0800
Subject: Re: [PATCH 23/24] swap: fix multiple swap leak when after cgroup migrate
To: Kairui Song
Cc: "Huang, Ying", linux-mm@kvack.org, Andrew Morton, David Hildenbrand, Hugh Dickins, Johannes Weiner, Matthew Wilcox, Michal Hocko, linux-kernel@vger.kernel.org, Tejun Heo

Hi Kairui,

On Mon, Nov 20, 2023 at 3:17 AM Kairui Song wrote:
>
> Huang, Ying wrote on Mon, Nov 20, 2023 at 15:37:
> >
> > Kairui Song writes:
> >
> > > From: Kairui Song
> > >
> > > When a process which previously swapped some memory was moved to
> > > another cgroup, and the cgroup it was previously in is dead, then
> > > swapped-in pages will be leaked into rootcg.
> > > Previous commits fixed the bug for the
> > > no-readahead path; this commit fixes the same issue for the readahead path.
> > >
> > > This can be easily reproduced by:
> > > - Set up an SSD or HDD swap.
> > > - Create memory cgroups A, B and C.
> > > - Spawn process P1 in cgroup A and make it swap out some pages.
> > > - Move process P1 to memory cgroup B.
> > > - Destroy cgroup A.
> > > - Do a swapoff in cgroup C.
> > > - Swapped-in pages are accounted to cgroup C.
> > >
> > > This patch fixes it by making the swapped-in pages accounted to cgroup B.
> >
> > According to the "Memory Ownership" section of
> > Documentation/admin-guide/cgroup-v2.rst,
> >
> > "
> > A memory area is charged to the cgroup which instantiated it and stays
> > charged to the cgroup until the area is released.  Migrating a process
> > to a different cgroup doesn't move the memory usages that it
> > instantiated while in the previous cgroup to the new cgroup.
> > "
> >
> > Because we don't move the charge when we move a task from one cgroup to
> > another, it's controversial which cgroup should be charged.
> > According to the above document, it's acceptable to charge cgroup
> > C (the cgroup where swapoff happens).
>
> Hi Ying, thank you very much for the info!
>
> It is controversial indeed, just the original behavior is kind of
> counter-intuitive.
>
> Imagine there is a cgroup P1 with child cgroups C1 and C2. If a process
> swapped out some memory in C1, then moved to C2, and C1 is dead,
> on swapoff the charge will be moved out of P1...
>
> And swapoff often happens in some unlimited cgroup or some cgroup for
> a management agent.
>
> If P1 has a memory limit, it can breach the limit easily; we will see
> a process that never left P1 having a much higher RSS than P1/C1/C2's
> limit.
> And if there is a limit for the management agent cgroup, the agent
> will be OOM instead of OOM in P1.

I think I will reply to another similar email.

If you want OOM in P1, you can have an admin program
fork and execute a new process, add the new process into P1, then swap
off from that new process.

> Simply moving a process between the child cgroups of the same parent
> cgroup won't cause such an issue; things get weird when swapoff is
> involved.
>
> Or maybe we should try to be compatible, and introduce a sysctl or
> cmdline for this?

If the above suggestion works, then you don't need to change swapoff?

If you still want to change the charging model, I would like to see the
bigger picture: what rules it follows and how it works in other
situations.

Chris
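
The fork-and-swapoff suggestion above could look roughly like the sketch
below. It is only an illustration under assumptions, not code from this
thread: the helper name run_in_cgroup is made up, the cgroup path and swap
device in the usage comment are placeholders, and it assumes cgroup v2,
where writing a PID to a cgroup's cgroup.procs file moves that process into
the cgroup. The child joins the cgroup first and only then execs the
command, so the swapoff runs as a member of P1.

```python
import os


def run_in_cgroup(cgroup_dir, argv):
    """Fork a child, move it into cgroup_dir, then exec argv in it.

    Returns the child's exit status, or 127 if joining the cgroup or the
    exec itself failed. (Hypothetical helper, for illustration only.)
    """
    pid = os.fork()
    if pid == 0:
        # Child: join the target cgroup before exec, so that whatever the
        # command does afterwards runs as a member of that cgroup.
        try:
            with open(os.path.join(cgroup_dir, "cgroup.procs"), "w") as f:
                f.write(str(os.getpid()))
            os.execvp(argv[0], argv)
        finally:
            # execvp does not return on success, so this is only reached
            # when the cgroup join or the exec failed.
            os._exit(127)

    # Parent: wait for the child and report how it ended.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)


# Hypothetical usage (needs root; both paths are placeholders):
#   run_in_cgroup("/sys/fs/cgroup/P1", ["swapoff", "/dev/sdX"])
```

The admin program itself stays in its own cgroup; only the short-lived
child is moved, which is the point of doing the fork first.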