From: Mina Almasry
Date: Tue, 19 Jul 2022 13:16:02 -0700
Subject: Re: cgroup specific sticky resources (was: Re: [PATCH bpf-next 0/5] bpf: BPF specific memory allocator.)
To: Tejun Heo
Cc: Yosry Ahmed, Michal Hocko, Roman Gushchin, Yafang Shao,
    Alexei Starovoitov, Shakeel Butt, Matthew Wilcox, Christoph Hellwig,
    David S. Miller, Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau,
    bpf, Kernel Team, linux-mm, Christoph Lameter, Pekka Enberg,
    David Rientjes, Joonsoo Kim, Andrew Morton, Vlastimil Babka,
    Johannes Weiner

On Tue, Jul 19, 2022 at 12:54 PM Tejun Heo wrote:
>
> Hello,
>
> On Tue, Jul 19, 2022 at 12:47:39PM -0700, Mina Almasry wrote:
> > Hmm, sorry I might be missing something but I don't think we have the
> > same thing in mind?
> >
> > My understanding is that the sysadmin can do something like this which
> > is relatively inexpensive to implement in the kernel:
> >
> >
> > mount -t tmpfs /mnt/mymountpoint
> > echo "/mnt/mymountpoint" > /path/to/cgroup/cgroup.charge_for.tmpfs
> >
> >
> > At that point all tmpfs charges for this tmpfs are directed to
> > /path/to/cgroup/memory.current.
> >
> > Then the sysadmin can do something like:
> >
> >
> > echo "/mnt/mymountpoint" > /path/to/cgroup2/cgroup.charge_for.tmpfs
> >
> >
> > At that point all _future_ charges of that tmpfs will go to
> > cgroup2/memory.current. All existing charges remain at
> > cgroup/memory.current and get uncharged from there. Per my
> > understanding there is no need to move all the _existing_ charges from
> > cgroup/memory.current to cgroup2/memory.current.
>
> So, it's a lot better if the existing charges aren't moved around but it's
> also kinda confusing if something can be moved around the tree arbitrarily
> leaving charges behind. We already do get that from moving processes around
> but most common usages are pretty static at this point and I think it'd be
> better to avoid expanding the interface in that direction.
>

I think I'm flexible in this sense. Would you like the kernel to
prevent reattaching the tmpfs to a different cgroup? To be honest we
have a use case for that, but I'm not going to die on this hill. I
guess the worst case scenario is that I can carry a local patch on our
kernel which allows reattaching to a different cgroup and directs
future charges there...

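To make the reattach semantics concrete, here is a toy user-space model
of what I described above: retargeting only affects future charges, and
each existing page is uncharged from whichever cgroup it was originally
charged to. This is only a sketch, not kernel code. The class and method
names are made up for illustration, and cgroup.charge_for.tmpfs is the
interface proposed in this thread, not an existing kernel file.

```python
from collections import defaultdict


class TmpfsChargeModel:
    """Toy model of per-mount charge targets (hypothetical, user space only)."""

    def __init__(self):
        self.target = {}                # mount point -> cgroup charged for new pages
        self.page_owner = {}            # (mount, page index) -> cgroup that was charged
        self.usage = defaultdict(int)   # cgroup -> charged pages (stand-in for memory.current)

    def set_charge_target(self, mount, cgroup):
        # Models: echo "<mount>" > <cgroup>/cgroup.charge_for.tmpfs
        # Existing charges are deliberately left where they are.
        self.target[mount] = cgroup

    def charge(self, mount, page):
        if (mount, page) in self.page_owner:
            raise ValueError("page is already charged")
        cgroup = self.target[mount]
        self.page_owner[(mount, page)] = cgroup
        self.usage[cgroup] += 1

    def uncharge(self, mount, page):
        # Uncharge from the cgroup recorded at charge time,
        # not from the mount's current target.
        cgroup = self.page_owner.pop((mount, page))
        self.usage[cgroup] -= 1
```

In this model, a page charged before the retarget keeps counting against
the first cgroup until it is freed, which is exactly the "charges left
behind" behavior you find confusing.
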
> I'd much prefer something along the lines of `mount -t tmpfs -o cgroup=XXX`
> where the tmpfs code checks whether the specified cgroup is one of the
> ancestors and the mounting task has enough permission to shift the resource
> there.
>

Actually this is pretty much the same interface I opted for in my
original proposal (except I named it memcg= rather than cgroup=):
https://lore.kernel.org/linux-mm/20211120045011.3074840-1-almasrymina@google.com/

Curious, why do we need to check if the cgroup= is an ancestor? We
actually do have a use case where the cgroups are unrelated and the
common ancestor is root. Again, I'm not sure I want to die on this
hill. At worst I can remove the restriction in a local patch for our
kernel again...

Before I get too excited, implement these changes, and submit another
iteration of my proposal above, I'd love to hear from
Johannes/Michal/Roman. My previous proposal got a pretty strong nack
from Johannes and Michal in particular.

> Thanks.
>
> --
> tejun
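
For reference, the ancestor check discussed above could be sketched in
user space roughly as follows. This assumes "ancestor" means the cgroup
named in cgroup= is the mounting task's own cgroup or one of its
parents in the hierarchy, which is my reading and not something the
thread has confirmed. The function name is hypothetical and the paths
are plain cgroup paths relative to the cgroup root.

```python
from pathlib import PurePosixPath


def is_ancestor_or_self(candidate: str, current: str) -> bool:
    """Return True if `candidate` is `current` or one of its ancestors.

    Both arguments are cgroup paths such as "/workload/job_a".
    Comparison is per path component, so "/a/b" is not treated as an
    ancestor of "/a/bc".
    """
    cand = PurePosixPath(candidate)
    cur = PurePosixPath(current)
    return cand == cur or cand in cur.parents
```

Under this reading, two sibling cgroups fail the check even though root
is a common ancestor, which is the unrelated-cgroups use case I
mentioned.
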