Subject: Re: [PATCH v6 2/3] sched: Move task_mm_cid_work to mm delayed work
From: Gabriele Monaco <gmonaco@redhat.com>
To: Mathieu Desnoyers, linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, aubrey.li@linux.intel.com, yu.c.chen@intel.com,
 Andrew Morton, Peter Zijlstra, Ingo Molnar, "Paul E. McKenney",
 Shuah Khan
Date: Fri, 14 Feb 2025 07:44:13 +0100
In-Reply-To: <0888d6a3-8dea-455b-893f-d8d929e827e2@efficios.com>
References: <202502131405.1ba0803f-lkp@intel.com>
 <17bda9071b6962414f61668698fa840501819172.camel@redhat.com>
 <0888d6a3-8dea-455b-893f-d8d929e827e2@efficios.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="UTF-8"

On Thu, 2025-02-13 at 12:31 -0500, Mathieu Desnoyers wrote:
> On 2025-02-13 08:25, Gabriele Monaco wrote:
> > On Thu, 2025-02-13 at 14:52 +0800, kernel test robot wrote:
> > > kernel test robot noticed
> > > "WARNING:at_kernel/workqueue.c:#__queue_delayed_work" on:
> > >
> > > [    2.640924][    T0] ------------[ cut here ]------------
> > > [    2.641646][    T0] WARNING: CPU: 0 PID: 0 at kernel/workqueue.c:2495 __queue_delayed_work (kernel/workqueue.c:2495 (discriminator 9))
> > > [    2.642874][    T0] Modules linked in:
> > > [    2.643381][    T0] CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.14.0-rc2-00002-g287adf9e9c1f #1
> > > [    2.644582][    T0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> > > [    2.645943][    T0] RIP: 0010:__queue_delayed_work (kernel/workqueue.c:2495 (discriminator 9))
> >
> > There seem to be major problems with this configuration, I'm trying
> > to understand what's wrong but, for the time being, this patchset
> > is not ready for inclusion.
>
> I'm staring at this now, and I'm thinking we could do a simpler
> change that would solve your RT issues without having to introduce a
> dependency on workqueue.c.
>
> So if the culprit is that task_mm_cid_work() runs for too long on
> large many-cpus systems, why not break it up into smaller iterations ?
>
> So rather than iterating on "for_each_possible_cpu", we could simply
> break this down into iteration on at most N cpus, so:
>
> tick #1: iteration on CPUs 0 ..   N - 1
> tick #2: iteration on CPUs N .. 2*N - 1
> ...
> circling back to 0 when it reaches the number of possible cpus.
>
> This N value could be configurable, e.g. CONFIG_RSEQ_CID_SCAN_BATCH,
> with a sane default. An RT system could decide to make that value
> lower.
>
> Then all we need to do is remember which was that last observed cpu
> number in the mm struct, so the next tick picks up from there.
>
> The main downside of this approach compared to scheduling delayed
> work in a workqueue is that it depends on having the mm be current
> when the scheduler tick happens. But perhaps this is something we
> could fix in a different way that does not add a dependency on
> workqueue.
> I'm not sure how though.
>
> Thoughts ?

Mmh, that's indeed neat. What is not so good about this type of task
work is that it's pure latency: it will happen before scheduling the
task and can't be interrupted. The only acceptable latency is a bounded
one, and your idea is going in that direction.

As you mentioned, this will make the compaction of mm_cid even rarer
and will likely make the test in 3/3 fail even more often. I'm not sure
this is necessarily a bad thing though, since mm_cid compaction is
mainly aesthetic, so we could just increase the duration of the test or
even add a busy loop inside to make the task more likely to run this
compaction.

I gave this whole thing some thought (don't take this too seriously),
and what I see as essentially flawed in this approach is:
1. task_works are set on tick
2. task_works are run returning to userspace

1. is the issue with frequency, and 2. can be mitigated by your idea,
but not essentially solved.

What if we (also) did:
1+. set this task_work while switching in
2+. run this task_work while switching out to sleep (i.e. no
    preemption)

1+. would make sure all threads have this task_work scheduled at a
certain point (perhaps a bit too often, but we have a periodicity check
in place).
2+. can effectively run the task_work at a moment when it is not
problematic for the real-time response: on a preemptible kernel, as
soon as a task with higher priority is ready to run, it will preempt
the currently running one. The fact that current is going to sleep
willingly implies there's no higher priority task ready, so likely no
task that really cares about RT response is going to run right after.

Not all tasks are ever going to sleep, so we must keep the original
TWA_RESUME in the task_work, especially for long-running or
low-priority tasks, both unlikely to be RT tasks.

I'm going to try a patch with this CONFIG_RSEQ_CID_SCAN_BATCH and tune
the test to pass. In the future we can see if those ideas make sense
and perhaps bring them in.
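For reference, the batching described above could be sketched in plain
userspace C roughly as follows. This is only an illustration under my
own assumptions, not kernel code: struct mm_scan_state, struct
cid_scan_window and cid_scan_next() are made-up names, and the batch
argument merely stands in for the proposed CONFIG_RSEQ_CID_SCAN_BATCH.
Each call hands out the next window of at most batch CPUs and remembers
where to resume, circling back to 0 at the end.

```c
#include <assert.h>

/* Hypothetical stand-in for the cursor that would live in the mm struct. */
struct mm_scan_state {
	int next_cpu;	/* first CPU the next tick should scan */
};

/* Half-open range [start, end) of CPUs to scan on this tick. */
struct cid_scan_window {
	int start;
	int end;
};

/*
 * Return the window of at most @batch CPUs for this tick and advance the
 * cursor, resetting it to 0 once the number of possible CPUs is reached.
 */
static struct cid_scan_window cid_scan_next(struct mm_scan_state *st,
					    int nr_cpus, int batch)
{
	struct cid_scan_window w;

	assert(nr_cpus > 0 && batch > 0);

	w.start = st->next_cpu;
	if (w.start >= nr_cpus)	/* defensive: stale or invalid cursor */
		w.start = 0;

	w.end = w.start + batch;
	if (w.end > nr_cpus)	/* last window may be shorter than batch */
		w.end = nr_cpus;

	st->next_cpu = (w.end == nr_cpus) ? 0 : w.end;
	return w;
}
```

A tick handler would then only loop over cpu = w.start .. w.end - 1
instead of every possible CPU, so the work done per tick is bounded by
the batch size regardless of how many CPUs the system has.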

Thanks,
Gabriele