From: Yosry Ahmed
Date: Tue, 29 Aug 2023 13:20:34 -0700
Subject: Re: [PATCH 3/3] mm: memcg: use non-unified stats flushing for userspace reads
To: Tejun Heo
Cc: Michal Hocko, Andrew Morton, Johannes Weiner, Roman Gushchin, Shakeel Butt, Muchun Song, Ivan Babrou, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org

On Tue, Aug 29, 2023 at 1:12 PM Tejun Heo wrote:
>
> Hello,
>
> On Tue, Aug 29, 2023 at 12:54:06PM -0700, Yosry Ahmed wrote:
> ...
> > > Maybe leave the global lock as-is and gate the userland flushers with a
> > > mutex so that there's only ever one contending on the rstat lock from
> > > userland side?
> >
> > Waiman suggested this as well. We can do that for sure, although I
> > think we should wait until we are sure it's needed.
> >
> > One question. If whoever is holding that mutex is either flushing with
> > the spinlock held or spinning (i.e. not sleepable or preemptable),
> > wouldn't this be equivalent to just replacing the spinlock with a mutex
> > and disabling preemption while holding it?
>
> Well, it creates layering so that userspace can't flood the inner lock, which
> can cause contention issues for kernel side users. Not sleeping while
> actively flushing is a side effect too, but the code at least doesn't look
> as anti-patterny as disabling preemption right after grabbing a mutex.

I see. At most one kernel side flusher will be spinning for the lock
at any given point anyway, but I guess having that one kernel side
flusher competing against one user side flusher is better than having
it compete with N user side flushers.

I will add a mutex on the userspace read side then and spin a v3.
Hopefully this addresses Michal's concern as well. The lock dropping
logic will still exist for the inner lock, but when one userspace
reader drops the inner lock, other userspace readers won't be able to
pick it up, because they are still waiting on the mutex.

>
> I don't have a strong preference. As long as we stay away from introducing a
> new user interface construct and can address the noticed scalability issues,
> it should be fine. Note that there are other ways to address priority
> inversions and contention too - e.g. we can always bounce flushing to a
> [kthread_]kworker and rate limit (or rather latency limit) how often
> different classes of users can trigger flushing. I don't think we have to go
> there yet, but if the simpler measures don't work out, there are still many
> ways to solve the problem within the kernel.

I wholeheartedly agree with the preference to fix the problem within
the kernel with minimal or no userspace involvement.

Thanks!
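
For illustration only, here is a minimal userspace sketch of the layering
discussed above. It is not the planned v3 and not kernel code: pthreads and
C11 atomics stand in for the kernel primitives, and every name in it
(user_gate, inner_lock, user_side_flush(), kernel_side_flush(),
run_user_readers(), total_flushes()) is hypothetical. The idea it shows is
that userspace readers sleep on an outer mutex, so at most one of them ever
contends on the inner spinlock, while kernel side flushers take the inner
lock directly.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

/* Inner lock: a test-and-set spinlock standing in for the global rstat lock. */
static atomic_flag inner_lock = ATOMIC_FLAG_INIT;
/* Outer gate (hypothetical): userspace readers sleep here before they may spin. */
static pthread_mutex_t user_gate = PTHREAD_MUTEX_INITIALIZER;

static long flush_count;        /* protected by inner_lock */
static int users_past_gate;     /* protected by user_gate */
static int max_users_past_gate; /* protected by user_gate */

/* Spin for the inner lock, do the (simulated) flush, release. */
static void do_flush(void)
{
	while (atomic_flag_test_and_set_explicit(&inner_lock, memory_order_acquire))
		; /* spin */
	flush_count++;
	atomic_flag_clear_explicit(&inner_lock, memory_order_release);
}

/* Kernel side flushers bypass the gate and take the inner lock directly. */
void kernel_side_flush(void)
{
	do_flush();
}

/* Userspace readers queue on the mutex; only the holder reaches the spinlock. */
void user_side_flush(void)
{
	pthread_mutex_lock(&user_gate);
	users_past_gate++;
	if (users_past_gate > max_users_past_gate)
		max_users_past_gate = users_past_gate;
	assert(users_past_gate == 1); /* never more than one past the gate */
	do_flush();
	users_past_gate--;
	pthread_mutex_unlock(&user_gate);
}

static void *reader_thread(void *arg)
{
	int iters = *(int *)arg;

	for (int i = 0; i < iters; i++)
		user_side_flush();
	return NULL;
}

/*
 * Run up to 64 concurrent userspace readers, each flushing iters times.
 * Returns the largest number of readers ever past the gate at once.
 */
int run_user_readers(int nreaders, int iters)
{
	pthread_t tids[64];
	int created = 0;

	if (nreaders > 64)
		nreaders = 64;
	for (int i = 0; i < nreaders; i++) {
		if (pthread_create(&tids[created], NULL, reader_thread, &iters) != 0)
			break;
		created++;
	}
	for (int i = 0; i < created; i++)
		pthread_join(tids[i], NULL);
	return max_users_past_gate;
}

/* Only meaningful once all flusher threads have been joined. */
long total_flushes(void)
{
	return flush_count;
}
```

With this shape, a kernel side flusher calling kernel_side_flush() competes
with at most one userspace reader for the inner lock, regardless of how many
readers are queued on the gate. The sketch leaves out the lock dropping logic
for the inner lock entirely.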