From: Shakeel Butt
Date: Tue, 12 Jul 2022 10:26:22 -0700
Subject: Re: [PATCH bpf-next 0/5] bpf: BPF specific memory allocator.
To: Tejun Heo
Cc: Michal Hocko, Yosry Ahmed, Muchun Song, Johannes Weiner, Yafang Shao, Alexei Starovoitov, Matthew Wilcox, Christoph Hellwig, "David S. Miller", Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, bpf, Kernel Team, linux-mm, Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton, Vlastimil Babka

On Tue, Jul 12, 2022 at 9:32 AM Tejun Heo wrote:
>
> Hello,
>
> On Tue, Jul 12, 2022 at 08:25:24AM -0700, Shakeel Butt wrote:
> > Another very obvious example is the filesystem shared between multiple
> > jobs. We had a similar discussion [1] on LRU reparenting patch series.
>
> Hmm... if I'm understanding correctly, what's discussed in [1] can be solved
> with proper reparenting and nesting, right?
>

To some extent, i.e. the zombies will go away, but the accounting/stats of
the sub-jobs will be nondeterministic until all the possibly shared stuff
is reparented. Let me give a more concrete example below.

> > For this use-case internally we have a memcg= mount option where the
> > given memcg is the common ancestor (think of pod in k8s environment)
> > of the jobs who are sharing the filesystem.
>
> Can you elaborate a bit more on this? We've never really supported correctly
> accounting pages shared across cgroups because it can be very complicating
> and the use cases aren't that wide-spread. What's being shared? How big is
> the shared portion in relation to total memory usage? What's the cgroup
> topology like?
>

One use-case we have is a build & test service which runs independent
builds and tests, but all the build utilities (compiler, linker,
libraries) are shared between those builds and tests. In terms of
topology, the service has a top-level cgroup (P), and all independent
builds and tests run in their own cgroups under P. These builds/tests
continuously come and go.

The service continuously monitors all running builds/tests and may kill
some based on criteria that include memory usage. However, the memory
usage is nondeterministic, and killing a specific build/test may not
really free memory if most of the memory charged to it is from shared
build utilities.
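
To make the nondeterminism concrete, here is a minimal toy model of the topology described above. It is a sketch, not kernel code. It assumes the usual memcg behavior that a page is charged to the cgroup that first brings it in, and it treats a page as "freed" by a kill only if no surviving cgroup still uses it. The function names, cgroup names, and page counts are all hypothetical and chosen only for illustration.

```python
# Toy model only: pages are integers, cgroups are strings.
# Assumption: a page is charged to the first cgroup that touches it.

def charge_by_first_touch(start_order, pages_touched):
    """Return {cgroup: pages charged} given the order in which cgroups start."""
    owner = {}  # page id -> cgroup that was charged for it
    for cg in start_order:
        for page in pages_touched[cg]:
            owner.setdefault(page, cg)  # later users are not charged
    usage = {cg: 0 for cg in start_order}
    for cg in owner.values():
        usage[cg] += 1
    return usage


def freed_by_kill(victim, pages_touched):
    """Count pages used by the victim that no surviving cgroup still uses."""
    still_used = set()
    for cg, pages in pages_touched.items():
        if cg != victim:
            still_used |= pages
    return len(pages_touched[victim] - still_used)


# Hypothetical workload: 100 shared toolchain pages, small private sets.
shared_toolchain = set(range(100))
pages_touched = {
    "P/build_a": shared_toolchain | set(range(100, 110)),  # 10 private pages
    "P/build_b": shared_toolchain | set(range(200, 220)),  # 20 private pages
}

a_first = charge_by_first_touch(["P/build_a", "P/build_b"], pages_touched)
b_first = charge_by_first_touch(["P/build_b", "P/build_a"], pages_touched)

print(a_first)  # build_a is charged for the shared toolchain
print(b_first)  # same workload, but now build_b is charged for it
print(freed_by_kill("P/build_a", pages_touched))
```

In this model the same two builds report very different usage depending only on which one started first, and killing the build that appears largest releases only its private pages. Charging the shared pages to the common ancestor P, as the memcg= mount option mentioned above is described as doing, would make the per-build numbers independent of start order.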