Date: Fri, 14 Feb 2020 08:57:28 -0500
From: Tejun Heo
To: Michal Hocko
Cc: Johannes Weiner, Andrew Morton, Roman Gushchin, linux-mm@kvack.org,
 cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH v2 3/3] mm: memcontrol: recursive memory.low protection
Message-ID: <20200214135728.GK88887@mtj.thefacebook.com>
References: <20200203215201.GD6380@cmpxchg.org>
 <20200211164753.GQ10636@dhcp22.suse.cz>
 <20200212170826.GC180867@cmpxchg.org>
 <20200213074049.GA31689@dhcp22.suse.cz>
 <20200213135348.GF88887@mtj.thefacebook.com>
 <20200213154731.GE31689@dhcp22.suse.cz>
 <20200213155249.GI88887@mtj.thefacebook.com>
 <20200213163636.GH31689@dhcp22.suse.cz>
 <20200213165711.GJ88887@mtj.thefacebook.com>
 <20200214071537.GL31689@dhcp22.suse.cz>
In-Reply-To: <20200214071537.GL31689@dhcp22.suse.cz>

Hello,

On Fri, Feb 14, 2020 at 08:15:37AM +0100, Michal Hocko wrote:
> > Yes, it can set up the control knobs as directed but it doesn't ship
> > with any material resource configurations or has conventions set up
> > around it.
>
> Right. But services might use those knobs, right? And that means that if
> somebody wants a memory protection then the service file is going to use
> MemoryLow=$FOO and that is likely not going to work properly without
> additional hassle, e.g. propagate upwards, which systemd doesn't do
> unless I am mistaken.

While there are applications where strict protection makes sense, in a
lot of cases, resource decisions have to consider factors global to
the system - how much is there and for what purpose the system is
being set up. Static per-service configuration for sure doesn't work
and neither will dynamic configuration that doesn't consider
system-wide factors.
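
To make the "propagate upwards" point quoted above concrete, here is a
minimal Python sketch of a deliberately simplified model. It assumes only
the documented cgroup2 rule that a cgroup's effective low boundary is
limited by the memory.low of every ancestor; it ignores usage and how
protection is split among siblings. The names `Cgroup`, `effective_low`,
and `foo.service` are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import Optional

GIB = 1 << 30


@dataclass
class Cgroup:
    """Hypothetical stand-in for one cgroup directory."""
    name: str
    low: int  # configured memory.low, in bytes
    parent: Optional["Cgroup"] = None


def effective_low(cg: Cgroup) -> int:
    """Simplified model: protection is capped by every ancestor's memory.low."""
    if cg.parent is None:
        return cg.low
    return min(cg.low, effective_low(cg.parent))


# A service asks for 1 GiB, but its parent slice has no protection set.
unprotected_slice = Cgroup("system.slice", low=0)
leaf = Cgroup("foo.service", low=1 * GIB, parent=unprotected_slice)

# The same service once the protection is also configured on the slice.
protected_slice = Cgroup("system.slice", low=2 * GIB)
leaf_ok = Cgroup("foo.service", low=1 * GIB, parent=protected_slice)
```

In this model `effective_low(leaf)` is 0 while `effective_low(leaf_ok)` is
1 GiB, which is the extra per-level configuration step being discussed.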

Another aspect is that as configuration gets more granular and stricter
with memory knobs, the configuration becomes less work-conserving. The
kernel's MM keeps track of dynamic behavior and adapts to the dynamic
usage; these configurations can't.

So, while individual applications may indicate what their resource
dispositions are, a working configuration is not going to come from
each service declaring how many bytes it wants.

This doesn't mean configurations are more tedious or difficult. In
fact, in a lot of cases, categorizing applications on the system
broadly and assigning ballpark weights and memory protections from the
higher level is sufficient.

> > > Besides that we are talking about memcg features which are available only
> > > on the unified hierarchy and that is what systemd is using already.
> >
> > I'm not quite sure what the above sentence is trying to say.
>
> I meant to say that once the unified hierarchy is used by systemd you
> cannot configure it differently to suit your needs without interfering
> with systemd.

I haven't experienced systemd getting in the way of structuring the
cgroup hierarchy and configuring it. It's pretty flexible and easy to
configure. Do you have any specific constraints in mind?

> > There's a plan to integrate streamlined implementation of oomd into
> > systemd. There was a thread somewhere but the only thing I can find
> > now is a phoronix link.
> >
> > https://www.phoronix.com/scan.php?page=news_item&px=Systemd-Facebook-OOMD
>
> I am not sure I see how that is going to change much wrt. resource
> distribution TBH. Is the existing cgroup hierarchy going to change for
> the OOMD to be deployed?

It's not a hard requirement but it'll be a lot more useful with an
actual resource hierarchy. As more resource control features get
enabled, I think it'll converge that way because that's more useful.
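
As a rough illustration of the "broad categories with ballpark weights
and protections" idea above, here is a small Python sketch. The slice
names and every number in it are made-up assumptions, not recommendations
from this thread; only the knob file names (memory.low, cpu.weight,
io.weight) are real cgroup2 interface files. The code only computes which
values would be written where and does not touch the filesystem.

```python
from pathlib import PurePosixPath

CGROUP_ROOT = PurePosixPath("/sys/fs/cgroup")
GIB = 1 << 30

# Hypothetical top-level categories with ballpark settings (assumed values).
POLICY = {
    "workload.slice": {"memory.low": 8 * GIB, "cpu.weight": 500, "io.weight": 500},
    "system.slice": {"memory.low": 1 * GIB, "cpu.weight": 100, "io.weight": 100},
    "background.slice": {"memory.low": 0, "cpu.weight": 20, "io.weight": 20},
}


def planned_writes(policy):
    """Return (path, value) pairs for the top-level slices only."""
    writes = []
    for slice_name, knobs in policy.items():
        for knob, value in knobs.items():
            writes.append((str(CGROUP_ROOT / slice_name / knob), str(value)))
    return writes


for path, value in planned_writes(POLICY):
    print(f"{value} -> {path}")
```

The point of the sketch is that the policy lives in a handful of
higher-level entries rather than in a byte count declared by every
individual service.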

> > Yeah, exactly, all it needs to do is placing scopes / services
> > according to resource hierarchy and configure overall policy at higher
> > level slices, which is exactly what the memory.low semantics change
> > will allow.
>
> Let me ask more specifically. Is there any plan or existing API to allow
> to configure which services are related resource wise?

At kernel level, no. They seem like pretty high level policy
decisions to me.

> > > That being said, I do not really blame systemd here. We are not making
> > > their life particularly easy TBH.
> >
> > Do you mind elaborating a bit?
>
> I believe I have already expressed the configurability concern elsewhere
> in the email thread. It boils down to necessity to propagate
> protection all the way up the hierarchy properly if you really need to
> protect leaf cgroups that are organized without a resource control in
> mind. Which is what systemd does.

But that doesn't work for other controllers at all. I'm having a
difficult time imagining how making this one control mechanism work
that way makes sense. Memory protection has to be configured together
with IO protection to be actually effective.

As for the cgroup hierarchy being unrelated to how controllers behave,
it frankly reminds me of the cgroup1 memcg flat hierarchy thing. I'm
not sure how that would actually work in terms of resource isolation.
Also, I'm not sure how systemd forces such configurations and I'd think
systemd folks would be happy to fix them if there are such problems.
Is the point you're trying to make "because of systemd, we have to
contort how the memory controller behaves"?

Thanks.

--
tejun