Subject: Re: [PATCH 1/2] sched: Compact RSEQ concurrency IDs in batches
From: Gabriele Monaco
To: Mathieu Desnoyers, linux-kernel@vger.kernel.org
Cc: Ingo Molnar, "Paul E. McKenney", Andrew Morton, Ingo Molnar, Peter Zijlstra, linux-mm@kvack.org
Date: Tue, 18 Feb 2025 10:52:12 +0100
In-Reply-To: <6a86f095-4f3b-46e8-8a42-51bff3d03405@efficios.com>
References: <20250217112317.258716-1-gmonaco@redhat.com> <20250217112317.258716-2-gmonaco@redhat.com> <6a86f095-4f3b-46e8-8a42-51bff3d03405@efficios.com>

On Mon, 2025-02-17 at 14:46 -0500, Mathieu Desnoyers wrote:
> On 2025-02-17 06:23, Gabriele Monaco wrote:
> > Currently, the task_mm_cid_work function is called in a task work
> > triggered by a scheduler tick to frequently compact the mm_cids of
> > each process for each core.
> > This can delay the execution of the corresponding thread for the
> > entire duration of the function, negatively affecting the response
> > in case of real time tasks. In practice, we observe
> > task_mm_cid_work increasing the latency of 30-35us on a 128 cores
> > system, this order of magnitude is meaningful under PREEMPT_RT.
> >
> > Run the task_mm_cid_work in batches of up to
> > CONFIG_RSEQ_CID_SCAN_BATCH cpus, this contains the duration of the
> > delay for each scan.
> > Also improve the duration by iterating for all present cpus and not
> > for all possible.
>
> Iterating only on present cpus is not enough on CONFIG_HOTPLUG=y,
> because ACPI can dynamically add/remove CPUs from the set. If we end
> up iterating only on present cpus, then we need to add a cpu hotplug
> callback to handle the removal case, and I'm not sure the added
> complexity is worth it here.
>

Got it, I didn't think of that.

> >
> > The task_mm_cid_work already contains a mechanism to avoid running
> > more frequently than every 100ms, considering the function runs at
> > every tick, assuming ticks every 1ms (HZ=1000 is common on distros)
> > and assuming an unfavorable scenario of 1/10 ticks during task T
> > runtime, we can compact the CIDs for task T in about 130ms by
> > setting CONFIG_RSEQ_CID_SCAN_BATCH to 10 on a 128 cores machine.
> > This value also drastically reduces the task work duration and is a
> > more acceptable latency for the aforementioned machine.
> >
> > Fixes: 223baf9d17f2 ("sched: Fix performance regression introduced
> > by mm_cid")
> > Signed-off-by: Gabriele Monaco
> > ---
> >   include/linux/mm_types.h |  8 ++++++++
> >   init/Kconfig             | 12 ++++++++++++
> >   kernel/sched/core.c      | 27 ++++++++++++++++++++++++---
> >   3 files changed, 44 insertions(+), 3 deletions(-)
> >
> > @@ -10546,6 +10546,15 @@ static void task_mm_cid_work(struct
> > callback_head *work)
> >   	mm = t->mm;
> >   	if (!mm)
> >   		return;
> > +	cpu = from_cpu = READ_ONCE(mm->mm_cid_scan_cpu);
> > +	to_cpu = from_cpu + CONFIG_RSEQ_CID_SCAN_BATCH;
> > +	if (from_cpu > cpumask_last(cpu_present_mask)) {
>
> See explanation about using possible rather than present.
>
> > +		from_cpu = 0;
> > +		to_cpu = CONFIG_RSEQ_CID_SCAN_BATCH;
>
> If the cpu_possible_mask is sparsely populated, this will end
> up doing batches that hit very few cpus. Instead, we should
> count how many cpus are handled within each
> for_each_cpu_from(cpu, cpu_possible_mask) loops below and break
> when reaching CONFIG_RSEQ_CID_SCAN_BATCH.
>
> > +	}
> > [...]
> > +	for_each_cpu_from(cpu, cpu_present_mask) {
> > +		if (cpu == to_cpu)
> > +			break;
> >   		sched_mm_cid_remote_clear_weight(mm, cpu, weight);
> > +	}
>
> Here set mm->mm_cid_scan_cpu to the new next position which is
> the result from the "for each" loop.
>

Mmh, good point. I wonder though if we need to care about multiple
threads scanning the same mm concurrently. In my patch it shouldn't
happen (threads /book/ the range up to to_cpu by writing it before
scanning). To do what you suggest, I'd probably need to create a map
with N elements starting from from_cpu and use that, or have a dry loop
before actually scanning.

Thanks,
Gabriele
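
For illustration only, here is a minimal userspace sketch of the counting
approach suggested above: the batch budget is spent only on CPUs that are
actually set in the mask, and the loop's end position becomes the next
starting point. It is not kernel code. The names scan_batch, cpu_mask_sim
and NR_CPUS_SIM are hypothetical stand-ins (the bool array plays the role of
cpu_possible_mask), and it deliberately ignores the concurrent-scanner
question raised above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS_SIM 16	/* hypothetical: size of the simulated CPU id space */

/* Hypothetical stand-in for cpu_possible_mask: true means "possible". */
struct cpu_mask_sim {
	bool possible[NR_CPUS_SIM];
};

/*
 * Visit up to `batch` possible CPUs starting at `from_cpu`, recording them
 * in `visited` (the caller provides room for `batch` entries).
 * Returns the position the next scan should start from, wrapping to 0 once
 * the end of the mask has been reached.
 */
static int scan_batch(const struct cpu_mask_sim *mask, int from_cpu, int batch,
		      int *visited, int *nr_visited)
{
	int cpu, count = 0;

	assert(batch > 0);
	if (from_cpu < 0 || from_cpu >= NR_CPUS_SIM)
		from_cpu = 0;

	for (cpu = from_cpu; cpu < NR_CPUS_SIM; cpu++) {
		if (!mask->possible[cpu])
			continue;	/* holes in the mask do not consume the budget */
		if (count == batch)
			break;		/* budget used up: resume from this CPU next time */
		visited[count++] = cpu;	/* the real code would clear the remote cid here */
	}

	*nr_visited = count;
	return cpu >= NR_CPUS_SIM ? 0 : cpu;
}
```

With a sparse mask (say only CPUs 0, 4, 8 and 12 set) and a batch of 2, the
first call handles CPUs 0 and 4 and returns 8, and the second handles 8 and
12 and wraps to 0, so each batch does the same amount of work regardless of
the holes. In the kernel the returned position would presumably be stored
back into mm->mm_cid_scan_cpu, which is where the concurrency concern comes
in.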