Date: Fri, 27 Jan 2023 15:50:31 -0800
From: Roman Gushchin <roman.gushchin@linux.dev>
To: Leonardo Brás
Cc: Michal Hocko, Marcelo Tosatti, Johannes Weiner, Shakeel Butt,
	Muchun Song, Andrew Morton, cgroups@vger.kernel.org,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 0/5] Introduce memcg_stock_pcp remote draining
In-Reply-To: <029147be35b5173d5eb10c182e124ac9d2f1f0ba.camel@redhat.com>
On Fri, Jan 27, 2023 at 04:29:37PM -0300, Leonardo Brás wrote:
> On Fri, 2023-01-27 at 10:29 +0100, Michal Hocko wrote:
> > On Fri 27-01-23 04:35:22, Leonardo Brás wrote:
> > > On Fri, 2023-01-27 at 08:20 +0100, Michal Hocko wrote:
> > > > On Fri 27-01-23 04:14:19, Leonardo Brás wrote:
> > > > > On Thu, 2023-01-26 at 15:12 -0800, Roman Gushchin wrote:
> > > > [...]
> > > > > > I'd rather opt out of stock draining for isolated cpus: it might
> > > > > > slightly reduce the accuracy of memory limits and slightly
> > > > > > increase the memory footprint (all those dying memcgs...), but
> > > > > > the impact will be limited. Actually it is limited by the number
> > > > > > of cpus.
> > > > >
> > > > > I was discussing this same idea with Marcelo yesterday morning.
> > > > >
> > > > > The questions we had on the topic were:
> > > > > a - About how many pages will the pcp cache hold before draining
> > > > > them itself?
> > > >
> > > > MEMCG_CHARGE_BATCH (64 currently). And one more clarification: the
> > > > cache doesn't really hold any pages. It is a mere counter of how
> > > > many charges have been accounted for the memcg page counter, so it
> > > > is not really consuming a proportional amount of resources. It just
> > > > pins the corresponding memcg. Have a look at consume_stock and
> > > > refill_stock.
> > >
> > > I see. Thanks for pointing that out!
> > >
> > > So in the worst-case scenario the memcg would have reserved
> > > 64 pages * (numcpus - 1)
> >
> > s@numcpus@num_isolated_cpus@
>
> I was thinking of the worst-case scenario being (ncpus - 1) cpus isolated.
>
> > > that are not getting used, and may cause an 'earlier' OOM if this
> > > amount is needed but can't be freed.
> >
> > s@OOM@memcg OOM@
> >
> > > By way of worst case: supposing a big powerpc machine, 256 CPUs,
> > > each holding 64 pages of 64k => 1GB of memory, minus 4MB (the one
> > > cpu actually using its resources).
> > > It's starting to get too big, but still ok for a machine this size.
> >
> > It is more about the memcg limit rather than the size of the machine.
> > Again, let's focus on the actual usecase. What is the usual memcg
> > setup with those isolcpus?
>
> I understand it's about the limit, not the actually allocated memory.
> When I point at the machine size, I mean what is expected to be
> acceptable by a user on a machine of that size.
>
> > > The thing is that it can present an odd behavior:
> > > you have a cgroup created before, now empty, then try to run a given
> > > application in it, and it hits OOM.
> >
> > The application would either consume those cached charges or flush
> > them if it is running in a different memcg. Or what do you have in
> > mind?
>
> 1 - Create a memcg with a VM inside, multiple vcpus pinned to isolated
>     cpus.
> 2 - Run a multi-cpu task inside the VM; it allocates memory for every
>     CPU and keeps the pcp cache.
> 3 - Try to run a single-cpu task (pinned?) inside the VM, which uses
>     almost all the available memory.
> 4 - memcg OOM.
>
> Does it make sense?

It can happen now as well, you just need a competing drain request.

Honestly, I feel the probability of this scenario being a real problem is
fairly low. I don't recall any complaints about spurious OOMs caused by
races in the draining code. Machines which are tight on memory rarely
have so many idle cpus.

Thanks!