From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 44178C6FD20 for ; Fri, 24 Mar 2023 14:12:53 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C6FD56B0072; Fri, 24 Mar 2023 10:12:52 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id BF8546B0074; Fri, 24 Mar 2023 10:12:52 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id A995C6B0075; Fri, 24 Mar 2023 10:12:52 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id 93DA46B0072 for ; Fri, 24 Mar 2023 10:12:52 -0400 (EDT) Received: from smtpin25.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 69EF31C61E2 for ; Fri, 24 Mar 2023 14:12:52 +0000 (UTC) X-FDA: 80603982984.25.145DA01 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf20.hostedemail.com (Postfix) with ESMTP id 911D51C000F for ; Fri, 24 Mar 2023 14:12:50 +0000 (UTC) Authentication-Results: imf20.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b="iF/6sLIE"; spf=pass (imf20.hostedemail.com: domain of longman@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=longman@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1679667170; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=ADiIDdAPnjqVAbV/udpxZCAiMLd8rfZn004mVe2ENYg=; b=HCuDWyWk5eexMnkPhY9xRNhweFneKMvwjcXB4Q4qM94i7mh00wQ0UXs/uvPWxnhDOfCZlU c0OMza6XEGJwR1TB3lA5hqApHvxBiiLWRKq2SCBR0Z+zl920RhUy8qX3iDKVZK/cFOdiUt 2ewlBnPk0DFAbFYQPVuDhTG9B0dm840= ARC-Authentication-Results: i=1; imf20.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b="iF/6sLIE"; spf=pass (imf20.hostedemail.com: domain of longman@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=longman@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1679667170; a=rsa-sha256; cv=none; b=uB89Ikuomq4KL+GzsowlKKqlxfHS2HOIAVBjiSzCOmSDYvbQRz4DNHcDfRBITLlwqA21nk hq93dkL1SBJ4ILCT7r49jiTEQQ6azZP5M/9c+4uEePulUWnGZc7QKX9taIF/gypPsejn8p cRZCnxav2P3e0OOWD367lLh+VusJKIY= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1679667170; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ADiIDdAPnjqVAbV/udpxZCAiMLd8rfZn004mVe2ENYg=; b=iF/6sLIEKNyszkQCXZApGQNt3eCYHsHvIa+duWopLR6LONmnfLrgZoBiRQ93BPJ505Dgu7 uD70VYIZMTYjqxp603EOgR43bkW+GwsNA/QGOaXJ4sb6Pq4WPG8192sa2ZSo9IUFQMe65T o2ATYU5wukbTh6P5Ln921BlGXkNYpko= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-638-1m-g8VaUM1S0ZM79i3uKWA-1; Fri, 24 Mar 2023 10:12:45 -0400 X-MC-Unique: 1m-g8VaUM1S0ZM79i3uKWA-1 Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.rdu2.redhat.com [10.11.54.7]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id E7BAE29AA3B5; Fri, 24 Mar 2023 14:12:43 +0000 (UTC) Received: from [10.22.33.184] (unknown [10.22.33.184]) by smtp.corp.redhat.com (Postfix) with ESMTP id E174D140EBF4; Fri, 24 Mar 2023 14:12:42 +0000 (UTC) Message-ID: <53582a07-81c6-35eb-10bf-7920b27483be@redhat.com> Date: Fri, 24 Mar 2023 10:12:42 -0400 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.7.1 Subject: Re: [RFC PATCH 1/7] cgroup: rstat: only disable interrupts for the percpu lock Content-Language: en-US To: Yosry Ahmed , Tejun Heo Cc: Josef Bacik , Jens Axboe , Zefan Li , Johannes Weiner , Michal Hocko , Roman Gushchin , Shakeel Butt , Muchun Song , Andrew Morton , Vasily Averin , cgroups@vger.kernel.org, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, bpf@vger.kernel.org References: <20230323040037.2389095-1-yosryahmed@google.com> <20230323040037.2389095-2-yosryahmed@google.com> From: Waiman Long In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit X-Scanned-By: MIMEDefang 3.1 on 10.11.54.7 X-Rspam-User: X-Rspamd-Server: rspam03 X-Stat-Signature: tjt4kbcuwyk97ggm9qkgk9yp1opw5hu8 X-Rspamd-Queue-Id: 911D51C000F X-HE-Tag: 1679667170-844046 X-HE-Meta: U2FsdGVkX19r8FPGqnY/P08ifTNdOGnLK1PtokSpSN+fn9yYK7MUTPfce6/eNebxYUBfYkhrybMckdmpUvCn0d93L5M7uId4joAttHO060F95sPaae3+XzwnDEvjWQ1E3AjWVSC0J4CFY0HV+V8K2dqvcDVHGUmTwIYAXBFaP0CX5uLM8GRD6nDYIB5pQrgvtvBkgrDFj1ceXdaPa7c9QL4vpvFWPn1G2CYIN+ozW5Xuh1Zy2FB+pkNqeumzJ+Cko9b/hKPAjJE55EoUMO/EcmNT5R5NOzFenxsYSkLlXV0Y8YiPzTQfUUK/h24IHS+jRbv2ohMAG3TSMTzUiRpF1TFzcfESxj3TuVfMTFc7S9hyFoY3e1S199hhUSdK/IBma736xbJkybahzNMCn79G5GAT7YY0nzCOL7esiXeEiGRB+OoHrjdNm241ous9jQja986mUaMjUG09n9I1QFyS3G5N/VihPnMwsYdZ8lBi16bX49Y1X8WeLcP9XGF4ffT/B9i2dmPFLkjyNdp/LRjM644OgbG1KWT0HZ1zXsXXjMLylWtxoVIgJrsCzV507RTDxsX3G8jDNQi5b/Er9JPsCkJQe1EIppYeLGfGGXTu7j+qWDOryQI5znapOCE7gURD6jaok49PpqPH3gy+xOyCAFm8CWCN7+fqODIuSYer2dCJB/U/3hc3MG2KFD2WoREgSW1yTrfzeqYLPRzxgs7TS8Gq4Omr+pofZdKJOEMsZ06KRfs44Rf38UXyqXIZGeLvj9p1MJsBc5BkyOKOuG6nxORwQuiSX0D0gS4gaCpJRdUWvpDxE5dC9PXdBahYyCYChdjGbBrNZtvRU4n0anATH1imEs/EbhGZ8IfYkeJQHmHpwZLUOw/xbjNNYkqGRkamRuybp0r3hf/tMp5fcryJxSlvIdlQfZ2Je2LM/p5KTQGtTlU2kvUze70nz94+7pibPOJDgQvfg7IunKfLM7i Cb/mFWp5 YRwr7TDORtgGNQC3dQbusZGH2o+HsK5VCMn/bHUL4WGW8i9Jh4N6bocNcgYfkz86X4Iez//DJ+AkY0AXMszVHCHDPxog1nautPoBATRc31sG61wvB9rYkBfYxPrvJwl7/pOt7BMJ7aX7mVygNdOwd/2KousZi4aIjNt5Mlw972LgFSW0gTKDfnhxhkOcYCpsXGn6rh/8Sw6DVCTKn08TKzPJj4vPGSFyEodbKHsajIjUZykk= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On 3/24/23 03:22, Yosry Ahmed wrote: > On Thu, Mar 23, 2023 at 6:39 PM Tejun Heo wrote: >> Hello, >> >> On Thu, Mar 23, 2023 at 04:00:31AM +0000, Yosry Ahmed wrote: >>> Currently, when sleeping is not allowed during rstat flushing, we hold >>> the global rstat lock with interrupts disabled throughout the entire >>> flush operation. Flushing in an O(# cgroups * # cpus) operation, and >>> having interrupts disabled throughout is dangerous. >>> >>> For some contexts, we may not want to sleep, but can be interrupted >>> (e.g. while holding a spinlock or RCU read lock). As such, do not >>> disable interrupts throughout rstat flushing, only when holding the >>> percpu lock. This breaks down the O(# cgroups * # cpus) duration with >>> interrupts disabled to a series of O(# cgroups) durations. >>> >>> Furthermore, if a cpu spinning waiting for the global rstat lock, it >>> doesn't need to spin with interrupts disabled anymore. >> I'm generally not a fan of big spin locks w/o irq protection. They too often >> become a source of unpredictable latency spikes. As you said, the global >> rstat lock can be held for quite a while. Removing _irq makes irq latency >> better on the CPU but on the other hand it makes a lot more likely that the >> lock is gonna be held even longer, possibly significantly so depending on >> the configuration and workload which will in turn stall other CPUs waiting >> for the lock. Sure, irqs are being serviced quicker but if the cost is more >> and longer !irq context multi-cpu stalls, what's the point? >> >> I don't think there's anything which requires the global lock to be held >> throughout the entire flushing sequence and irq needs to be disabled when >> grabbing the percpu lock anyway, so why not just release the global lock on >> CPU boundaries instead? We don't really lose anything significant that way. >> The durations of irq disabled sections are still about the same as in the >> currently proposed solution at O(# cgroups) and we avoid the risk of holding >> the global lock for too long unexpectedly from getting hit repeatedly by >> irqs while holding the global lock. > Thanks for taking a look! > > I think a problem with this approach is that we risk having to contend > for the global lock at every CPU boundary in atomic contexts. Right Isn't it the plan to just do a trylock in atomic contexts so that it won't get stuck spinning for the lock for an indeterminate amount of time? > now we contend for the global lock once, and once we have it we go > through all CPUs to flush, only having to contend with updates taking > the percpu locks at this point. If we unconditionally release & > reacquire the global lock at every CPU boundary then we may contend > for it much more frequently with concurrent flushers. Note that with the use of qspinlock in all the major arches, the impact of thundering herds of lockers are much less serious than before. There are certainly some overhead in doing multiple lock acquires and releases, but that shouldn't been too excessive. I am all in for reducing lock hold time as much as possible as it will improve the response time. Cheers, Longman