From: Kairui Song
Date: Mon, 20 Nov 2023 19:17:37 +0800
Subject: Re: [PATCH 23/24] swap: fix multiple swap leak when after cgroup migrate
To: "Huang, Ying"
Cc: linux-mm@kvack.org, Andrew Morton, David Hildenbrand, Hugh Dickins,
 Johannes Weiner, Matthew Wilcox, Michal Hocko,
 linux-kernel@vger.kernel.org, Tejun Heo
References: <20231119194740.94101-1-ryncsn@gmail.com>
 <20231119194740.94101-24-ryncsn@gmail.com>
 <87msv8c1xy.fsf@yhuang6-desk2.ccr.corp.intel.com>
In-Reply-To: <87msv8c1xy.fsf@yhuang6-desk2.ccr.corp.intel.com>

On Mon, Nov 20, 2023 at 15:37, Huang, Ying wrote:
>
> Kairui Song writes:
>
> > From: Kairui Song
> >
> > When a process which previously swapped some memory was moved to
> > another cgroup, and the cgroup it was previously in is dead, then swapped in
> > pages will be
> > leaked into rootcg. Previous commits fixed the bug for the
> > no-readahead path; this commit fixes the same issue for the readahead path.
> >
> > This can be easily reproduced by:
> > - Set up an SSD or HDD swap.
> > - Create memory cgroups A, B and C.
> > - Spawn process P1 in cgroup A and make it swap out some pages.
> > - Move process P1 to memory cgroup B.
> > - Destroy cgroup A.
> > - Do a swapoff in cgroup C.
> > - Swapped-in pages are accounted to cgroup C.
> >
> > This patch fixes it by making the swapped-in pages accounted to cgroup B.
>
> According to the "Memory Ownership" section of
> Documentation/admin-guide/cgroup-v2.rst,
>
> "
> A memory area is charged to the cgroup which instantiated it and stays
> charged to the cgroup until the area is released. Migrating a process
> to a different cgroup doesn't move the memory usages that it
> instantiated while in the previous cgroup to the new cgroup.
> "
>
> Because we don't move the charge when we move a task from one cgroup to
> another, it's controversial which cgroup should be charged.
> According to the above document, it's acceptable to charge the cgroup
> C (the cgroup where swapoff happens).

Hi Ying, thank you very much for the info!

It is controversial indeed; it's just that the original behavior is kind
of counter-intuitive.

Imagine there is a cgroup P1 with child cgroups C1 and C2. If a process
swapped out some memory in C1, then moved to C2, and C1 is dead, then on
swapoff the charge will be moved out of P1...

And swapoff often happens in some unlimited cgroup or some cgroup for a
management agent.

If P1 has a memory limit, it can breach the limit easily: we will see a
process that never left P1 having a much higher RSS than P1/C1/C2's
limit. And if there is a limit for the management agent cgroup, the
agent will OOM instead of the OOM happening in P1.

Simply moving a process between the child cgroups of the same parent
cgroup won't cause such an issue; things get weird when swapoff is
involved.

Or maybe we should try to be compatible, and introduce a sysctl or
cmdline for this?
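
To make the P1/C1/C2 scenario above a bit more concrete, here is a rough toy model in Python. It is only a sketch and not kernel code: the Cgroup class, the usage() and simulate() helpers, the policy names "charge_caller" and "charge_task", and the page count are all made up for illustration. It assumes the swapoff is run from a separate management agent cgroup, and compares charging the swapped-in pages to that caller (the behavior discussed in this thread) with charging them to the cgroup the process currently lives in (what the patch aims for).

```python
from dataclasses import dataclass
from typing import Optional

SWAPPED_PAGES = 100  # arbitrary page count, for illustration only


@dataclass
class Cgroup:
    """Hypothetical stand-in for a memory cgroup (not a kernel structure)."""
    name: str
    parent: Optional["Cgroup"] = None
    charged: int = 0  # pages charged directly to this cgroup


def usage(cg, groups):
    """Hierarchical usage: pages charged to cg and to all its descendants."""
    total = 0
    for g in groups:
        node = g
        while node is not None:
            if node is cg:
                total += g.charged
                break
            node = node.parent
    return total


def simulate(policy):
    """Model a swapoff after the task moved C1 -> C2 and C1 was destroyed."""
    root = Cgroup("root")
    p1 = Cgroup("P1", root)
    c2 = Cgroup("C2", p1)          # the task lives here now; C1 is already gone
    agent = Cgroup("agent", root)  # management agent that runs swapoff
    groups = [root, p1, c2, agent]

    # Pick who pays for the swapped-in pages under the chosen policy.
    target = agent if policy == "charge_caller" else c2
    target.charged += SWAPPED_PAGES

    return {"P1": usage(p1, groups), "agent": usage(agent, groups)}


if __name__ == "__main__":
    for policy in ("charge_caller", "charge_task"):
        print(policy, simulate(policy))
```

In this model, "charge_caller" leaves P1's hierarchical usage untouched while the agent pays for pages it never used, which is the limit bypass described above. "charge_task" keeps the pages inside P1's subtree.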