Date: Tue, 29 Oct 2019 11:46:54 -0400
From: Johannes Weiner
To: Hillf Danton
Cc: linux-mm, Andrew Morton, linux-kernel, Chris Down, Tejun Heo,
    Roman Gushchin, Michal Hocko, Shakeel Butt, Matthew Wilcox,
    Minchan Kim, Mel Gorman
Subject: Re: [RFC v2] memcg: add memcg lru for page reclaiming
Message-ID: <20191029154654.GC33522@cmpxchg.org>
In-Reply-To: <20191026110745.12956-1-hdanton@sina.com>

On Sat, Oct 26, 2019 at 07:07:45PM +0800, Hillf Danton wrote:
>
> Currently soft limit reclaim is frozen, see
> Documentation/admin-guide/cgroup-v2.rst for reasons.
>
> This work adds memcg hook into kswapd's logic to bypass slr,
> paving a brick for its cleanup later.
>
> After b23afb93d317 ("memcg: punt high overage reclaim to
> return-to-userland path"), high limit breachers are reclaimed one
> after another spiraling up through the memcg hierarchy before
> returning to userspace.
>
> We can not add new hook yet if it is infeasible to defer that
> reclaiming a bit further until kswapd becomes active.
>
> It can be defered however because high limit breach looks benign
> in the absence of memory pressure, or we ensure it will be
> reclaimed soon in the presence of kswapd.

I have no idea what this patch is actually trying to do. But this
premise here, as well as the implementation, are seriously flawed.

memory.high needs to be enforced synchronously.
Current users expect workloads to be strictly contained or throttled
by memory.high in order to ensure consistent behavior regardless of
the host environment, as well as prevent interference with other
workloads whose startup time could be slowed down by this lack of
containment.

On the implementation side, it appears you patched out reclaim but
left in the throttling that's supposed to make up for failing
reclaim. That means that once a cgroup tree's cache footprint grows
past its memory.high, instead of simply picking up the cold cache
pages, it'll get throttled heavily and see extreme memory
pressure. It could take ages for it to grow to the point where kswapd
wakes up.

Nacked-by: Johannes Weiner
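
The implementation concern above can be illustrated with a toy model.
This is only a sketch, not kernel code: ToyCgroup, charge() and run()
are hypothetical names, the page counts are made up, and every newly
charged page is assumed to be cold cache that reclaim could free
cheaply. It compares reclaiming at the moment of a memory.high breach
with leaving only the throttling in place.

```python
from dataclasses import dataclass


@dataclass
class ToyCgroup:
    high: int            # stand-in for memory.high, in pages
    usage: int = 0       # pages currently charged
    cold_cache: int = 0  # charged pages assumed cheap to reclaim
    stalls: int = 0      # times the charging task was throttled


def charge(cg: ToyCgroup, pages: int, reclaim_on_breach: bool) -> None:
    """Charge new cache pages, then handle a breach of the high limit."""
    cg.usage += pages
    cg.cold_cache += pages  # assumption: new pages are cold cache

    overage = cg.usage - cg.high
    if overage <= 0:
        return

    if reclaim_on_breach:
        # Synchronous enforcement: free cold cache before returning.
        freed = min(overage, cg.cold_cache)
        cg.usage -= freed
        cg.cold_cache -= freed
        overage -= freed

    if overage > 0:
        # Throttling is meant as the fallback when reclaim falls short.
        cg.stalls += 1


def run(reclaim_on_breach: bool) -> ToyCgroup:
    """Grow a cache footprint to twice the limit in 10-page steps."""
    cg = ToyCgroup(high=100)
    for _ in range(20):
        charge(cg, 10, reclaim_on_breach)
    return cg
```

In this model, run(True) holds usage at the limit without a single
stall, while run(False) lets usage climb to twice the limit and
throttles on every charge past it, even though all of the overage is
cold cache that could have been dropped.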