From: Shakeel Butt
Date: Fri, 17 Mar 2023 13:08:52 -0700
Subject: Re: [PATCH 2/2] memcg: do not drain charge pcp caches on remote isolated cpus
To: Michal Hocko
Cc: Andrew Morton, Leonardo Bras, Frederic Weisbecker, Peter Zijlstra, Thomas Gleixner, Marcelo Tosatti, Johannes Weiner, Roman Gushchin, Muchun Song, LKML, linux-mm@kvack.org, Michal Hocko, Frederic Weisbecker
References: <20230317134448.11082-1-mhocko@kernel.org> <20230317134448.11082-3-mhocko@kernel.org>
In-Reply-To: <20230317134448.11082-3-mhocko@kernel.org>

On Fri, Mar 17, 2023 at 6:44 AM Michal Hocko wrote:
>
> From: Michal Hocko
>
> Leonardo Bras has noticed that pcp charge cache draining might be
> disruptive on workloads relying on 'isolated cpus', a feature commonly
> used on workloads that are sensitive to interruption and context
> switching such as vRAN and Industrial Control Systems.
>
> There are essentially two ways to approach the issue. We can either
> allow the pcp cache to be drained on a different rather than a local cpu
> or avoid remote flushing on isolated cpus.
>
> The current pcp charge cache is really optimized for high performance
> and it always relies on sticking with its cpu. That means it only
> requires local_lock (preempt_disable on !RT) and draining is handed
> over to pcp WQ to drain locally again.
>
> The former solution (remote draining) would require adding additional
> locking to prevent local charges from racing with the draining. This
> adds an atomic operation to an otherwise simple arithmetic fast path in
> the try_charge path. Another concern is that the remote draining can
> cause lock contention for the isolated workloads and therefore
> interfere with them indirectly via user space interfaces.
>
> Another option is to avoid scheduling draining on isolated cpus
> altogether. That means that those remote cpus would keep their charges
> even after drain_all_stock returns. This is certainly not optimal either
> but it shouldn't really cause any major problems. In the worst case
> (many isolated cpus with charges - each of them with MEMCG_CHARGE_BATCH
> i.e. 64 pages) the memory consumption of a memcg would be artificially
> higher than can be immediately used from other cpus.
>
> Theoretically a memcg OOM killer could be triggered prematurely.
> Currently it is not really clear whether this is a practical problem
> though. A tight memcg limit would be really counterproductive to cpu
> isolated workloads pretty much by definition because any memory
> reclaim induced by the memcg limit could break user space timing
> expectations as those usually expect execution in the userspace most of
> the time.
>
> Also charges could be left behind on memcg removal. Any future charge on
> those isolated cpus will drain that pcp cache so this won't be a
> permanent leak.
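
To put a number on the worst case described above, here is a small
arithmetic sketch. It assumes 4 KiB base pages (ASSUMED_PAGE_SIZE is an
assumption, not something stated in the patch) and takes the 64-page
MEMCG_CHARGE_BATCH value from the description. The helper name
stranded_charge_bytes() is hypothetical and exists only for this
illustration.

```c
#include <assert.h>
#include <limits.h>

/* Value quoted in the patch description: a stock caches up to 64 pages. */
#define MEMCG_CHARGE_BATCH 64UL

/* Assumption for this sketch: 4 KiB base pages. */
#define ASSUMED_PAGE_SIZE 4096UL

/*
 * Hypothetical helper: upper bound, in bytes, on charges that remain
 * cached on isolated CPUs (and so cannot be used from other CPUs) when
 * every one of nr_isolated CPUs holds a full, undrained stock.
 */
static unsigned long stranded_charge_bytes(unsigned long nr_isolated)
{
	/* Refuse inputs that would overflow the multiplication below. */
	assert(nr_isolated <= ULONG_MAX / (MEMCG_CHARGE_BATCH * ASSUMED_PAGE_SIZE));

	return nr_isolated * MEMCG_CHARGE_BATCH * ASSUMED_PAGE_SIZE;
}
```

Under these assumptions one isolated cpu can hold back at most 256 KiB,
and 16 isolated cpus at most 4 MiB, which is small next to any memcg
limit that a latency-sensitive isolated workload could sensibly run
under.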
>
> Considering the pros and cons of both approaches, this patch implements
> the second option and simply does not schedule remote draining if the
> target cpu is isolated. This solution is much simpler. It doesn't
> add any new locking and it is more predictable from the user space
> POV. Should premature memcg OOM become a real life problem, we can
> revisit this decision.
>
> Cc: Leonardo Brás
> Cc: Marcelo Tosatti
> Cc: Shakeel Butt
> Cc: Muchun Song
> Cc: Johannes Weiner
> Cc: Frederic Weisbecker
> Reported-by: Leonardo Bras
> Acked-by: Roman Gushchin
> Suggested-by: Roman Gushchin
> Signed-off-by: Michal Hocko

Acked-by: Shakeel Butt
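
For readers following the thread without the diff at hand, the decision
described in the patch can be modelled in userspace roughly as below.
This is only a sketch and not the kernel code: every name in it
(MODEL_NR_CPUS, struct model_stock, model_drain_all, model_local_drain)
is hypothetical, the isolation state is a plain bool array, and draining
is done inline rather than by scheduling work on the target cpu.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 8	/* hypothetical CPU count for this model */

/* Hypothetical stand-in for the per-cpu charge cache. */
struct model_stock {
	unsigned int nr_pages;	/* pages pre-charged and cached on this CPU */
};

/*
 * Model of the "drain everything" request: the local CPU is always
 * drained, remote CPUs are drained only when they are not isolated.
 * Returns the number of pages handed back to the memcg.
 */
static unsigned int model_drain_all(struct model_stock stocks[MODEL_NR_CPUS],
				    const bool isolated[MODEL_NR_CPUS],
				    int local_cpu)
{
	unsigned int drained = 0;

	assert(local_cpu >= 0 && local_cpu < MODEL_NR_CPUS);

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (stocks[cpu].nr_pages == 0)
			continue;	/* nothing cached here */
		if (cpu != local_cpu && isolated[cpu])
			continue;	/* leave the isolated CPU undisturbed */
		drained += stocks[cpu].nr_pages;
		stocks[cpu].nr_pages = 0;
	}
	return drained;
}

/*
 * Model of a later drain that happens on the isolated CPU itself, which
 * is why the leftover charge is not a permanent leak.
 */
static unsigned int model_local_drain(struct model_stock stocks[MODEL_NR_CPUS],
				      int cpu)
{
	unsigned int drained;

	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);

	drained = stocks[cpu].nr_pages;
	stocks[cpu].nr_pages = 0;
	return drained;
}
```

In this model, calling model_drain_all() from cpu 0 leaves the stocks of
isolated remote cpus intact, and a later model_local_drain() on such a
cpu returns the leftover pages, matching the "not a permanent leak"
argument in the description.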