Date:
Fri, 17 Apr 2020 13:36:15 -0400
From: Tejun Heo
To: Shakeel Butt
Cc: Jakub Kicinski, Andrew Morton, Linux MM, Kernel Team, Johannes Weiner, Chris Down, Cgroups
Subject: Re: [PATCH 0/3] memcg: Slow down swap allocation as the available space gets depleted
Message-ID: <20200417173615.GB43469@mtj.thefacebook.com>
References: <20200417010617.927266-1-kuba@kernel.org> <20200417162355.GA43469@mtj.thefacebook.com>

Hello,

On Fri, Apr 17, 2020 at 10:18:15AM -0700, Shakeel Butt wrote:
> > There currently are issues with anonymous memory management which makes them
> > different / worse than page cache but I don't follow why swapping
> > necessarily means that isolation is broken. Page refaults don't indicate
> > that memory isolation is broken after all.
>
> Sorry, I meant the performance isolation. Direct reclaim does not
> really differentiate who to stall and whose CPU to use.

Can you please elaborate on concrete scenarios? I'm having a hard time seeing
differences from page cache.

> > > memcg limit reclaim and memcg limits are overcommitted. Shouldn't
> > > running out of swap will trigger the OOM earlier which should be
> > > better than impacting the whole system.
> >
> > The primary scenario which was being considered was undercommitted
> > protections but I don't think that makes any relevant differences.
> >
>
> What is undercommitted protections? Does it mean there is still swap
> available on the system but the memcg is hitting its swap limit?

Hahaha, I assumed you were talking about memory.high/max and was saying that
the primary scenario being considered was usage of memory.low interacting
with swap. Again, can you please give a concrete example so that we don't
misunderstand each other?

> > This is exactly similar to delay injection for memory.high. What's desired
> > is slowing down the workload as the available resource is depleted so that
> > the resource shortage presents as gradual degradation of performance and
> > matching increase in resource PSI. This allows the situation to be detected
> > and handled from userland while avoiding sudden and unpredictable behavior
> > changes.
> >
>
> Let me try to understand this with an example. Memcg 'A' has

Ah, you already went there. Great.

> memory.high = 100 MiB, memory.max = 150 MiB and memory.swap.max = 50
> MiB. When A's usage goes over 100 MiB, it will reclaim the anon, file
> and kmem. The anon will go to swap and increase its swap usage until
> it hits the limit. Now the 'A' reclaim_high has fewer things (file &
> kmem) to reclaim but the mem_cgroup_handle_over_high() will keep A's
> increase in usage in check.
>
> So, my question is: should the slowdown by memory.high depends on the
> reclaimable memory? If there is no reclaimable memory and the job hits
> memory.high, should the kernel slow it down to crawl until the PSI
> monitor comes and decides what to do. If I understand correctly, the
> problem is the kernel slow down is not successful when reclaimable
> memory is very low. Please correct me if I am wrong.

In combination with memory.high, swap slowdown may not be necessary because
memory.high's slowdown mechanism is already there to handle the "can't swap"
scenario, whether that's because swap is disabled wholesale, limited or
depleted. However, please consider the following scenario.

cgroup A has memory.low protection and no other restrictions. cgroup B has
no protection and has access to swap. When B's memory starts bloating and
gets the system under memory contention, it'll start consuming swap until it
can't. When swap becomes depleted for B, there's nothing holding it back and
B will start eating into A's protection.

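To make the A/B scenario above a bit more tangible, here's a rough toy model
in Python. It is only a sketch and not how kernel reclaim actually works: the
names (Cgroup, simulate), the MiB figures and the two reclaim rules are all
made up for illustration, and A's memory is simply assumed to be reclaimable.

```python
from dataclasses import dataclass


@dataclass
class Cgroup:
    name: str
    usage: int    # resident memory, in MiB
    low: int = 0  # memory.low protection, in MiB (0 means unprotected)


def simulate(total_ram, swap_free, a, b, growth, steps):
    """Grow B by `growth` MiB per step and resolve any memory shortage.

    Toy rules (assumptions, not kernel behavior):
      1. While swap is available, the shortage is covered by swapping out B.
      2. Once swap is depleted, B's anonymous memory can't be reclaimed,
         so the shortage is taken from A, even below A's memory.low.
    Returns a list of (step, a.usage, b.usage, swap_free) tuples.
    """
    history = []
    for step in range(steps):
        b.usage += growth
        shortage = max(0, a.usage + b.usage - total_ram)

        # Rule 1: push B's anonymous memory out to swap.
        swapped = min(shortage, swap_free, b.usage)
        b.usage -= swapped
        swap_free -= swapped
        shortage -= swapped

        # Rule 2: whatever shortage remains falls on A.
        taken = min(shortage, a.usage)
        a.usage -= taken

        history.append((step, a.usage, b.usage, swap_free))
    return history


# Hypothetical numbers: 1000 MiB of RAM, 200 MiB of swap.
a = Cgroup("A", usage=400, low=400)
b = Cgroup("B", usage=500)
for step, a_mib, b_mib, swap in simulate(1000, 200, a, b, growth=100, steps=5):
    note = "  <- A is below memory.low" if a_mib < a.low else ""
    print(f"step {step}: A={a_mib} B={b_mib} swap_free={swap}{note}")
```

With these made-up numbers, A keeps its protected 400 MiB for as long as B can
be swapped out. Swap hits zero at step 2, and from step 3 on B's growth comes
straight out of A.
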
The proposed mechanism just plugs another vector for the same condition,
where anonymous memory management breaks down because anonymous pages can no
longer be reclaimed due to swap unavailability.

Thanks.

--
tejun