From: Valentin Schneider
To: Frederic Weisbecker
Cc: linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org,
 linux-doc@vger.kernel.org, kvm@vger.kernel.org, linux-mm@kvack.org,
 bpf@vger.kernel.org, x86@kernel.org, rcu@vger.kernel.org,
 linux-kselftest@vger.kernel.org, Nicolas Saenz Julienne, Steven Rostedt,
 Masami Hiramatsu, Jonathan Corbet, Thomas Gleixner, Ingo Molnar,
 Borislav Petkov, Dave Hansen, "H. Peter Anvin", Paolo Bonzini, Wanpeng Li,
 Vitaly Kuznetsov, Andy Lutomirski, Peter Zijlstra, "Paul E. McKenney",
 Neeraj Upadhyay, Joel Fernandes, Josh Triplett, Boqun Feng,
 Mathieu Desnoyers, Lai Jiangshan, Zqiang, Andrew Morton, Uladzislau Rezki,
 Christoph Hellwig, Lorenzo Stoakes, Josh Poimboeuf, Jason Baron, Kees Cook,
 Sami Tolvanen, Ard Biesheuvel, Nicholas Piggin, Juerg Haefliger,
 Nicolas Saenz Julienne, "Kirill A. Shutemov", Nadav Amit, Dan Carpenter,
 Chuang Wang, Yang Jihong, Petr Mladek, "Jason A. Donenfeld", Song Liu,
 Julian Pidancet, Tom Lendacky, Dionna Glaze, Thomas Weißschuh, Juri Lelli,
 Daniel Bristot de Oliveira, Marcelo Tosatti, Yair Podemsky
Subject: Re: [RFC PATCH v2 15/20] context-tracking: Introduce work deferral infrastructure
References: <20230720163056.2564824-1-vschneid@redhat.com> <20230720163056.2564824-16-vschneid@redhat.com>
Date: Mon, 24 Jul 2023 17:55:44 +0100

On 24/07/23 16:52, Frederic Weisbecker wrote:
> On Thu, Jul 20, 2023 at 05:30:51PM +0100, Valentin Schneider wrote:
>> +enum ctx_state {
>> +	/* Following are values */
>> +	CONTEXT_DISABLED	= -1,	/* returned by ct_state() if unknown */
>> +	CONTEXT_KERNEL		= 0,
>> +	CONTEXT_IDLE		= 1,
>> +	CONTEXT_USER		= 2,
>> +	CONTEXT_GUEST		= 3,
>> +	CONTEXT_MAX             = 4,
>> +};
>> +
>> +/*
>> + * We cram three different things within the same atomic variable:
>> + *
>> + *                 CONTEXT_STATE_END                       RCU_DYNTICKS_END
>> + *                         |       CONTEXT_WORK_END                |
>> + *                         |               |                       |
>> + *                         v               v                       v
>> + *         [ context_state ][ context work ][ RCU dynticks counter ]
>> + *         ^                ^               ^
>> + *         |                |               |
>> + *         |       CONTEXT_WORK_START       |
>> + * CONTEXT_STATE_START              RCU_DYNTICKS_START
>
> Should the layout be displayed in reverse? Well at least I always picture
> bitmaps in reverse, that's probably due to the direction of the shift arrows.
> Not sure what is the usual way to picture it though...
>

Surprisingly, I managed to confuse myself with that comment :-) I think I am
subconsciously more used to the reverse as well. I've flipped that and put
"MSB" / "LSB" at either end.

>> + */
>> +
>> +#define CT_STATE_SIZE (sizeof(((struct context_tracking *)0)->state) * BITS_PER_BYTE)
>> +
>> +#define CONTEXT_STATE_START 0
>> +#define CONTEXT_STATE_END   (bits_per(CONTEXT_MAX - 1) - 1)
>
> Since you have non overlapping *_START symbols, perhaps the *_END
> are superfluous?
>

They're only really there to tidy up the GENMASK() further down - it keeps
the range and index definitions in one hunk. I tried defining that directly
within the GENMASK() calls themselves but it got too ugly IMO.

>> +
>> +#define RCU_DYNTICKS_BITS  (IS_ENABLED(CONFIG_CONTEXT_TRACKING_WORK) ? 16 : 31)
>> +#define RCU_DYNTICKS_START (CT_STATE_SIZE - RCU_DYNTICKS_BITS)
>> +#define RCU_DYNTICKS_END   (CT_STATE_SIZE - 1)
>> +#define RCU_DYNTICKS_IDX   BIT(RCU_DYNTICKS_START)
>
> Might be the right time to standardize and fix our naming:
>
> CT_STATE_START,
> CT_STATE_KERNEL,
> CT_STATE_USER,
> ...
> CT_WORK_START,
> CT_WORK_*,
> ...
> CT_RCU_DYNTICKS_START,
> CT_RCU_DYNTICKS_IDX
>

Heh, I have actually already done this for v3, though I hadn't touched the
RCU_DYNTICKS* family. I'll fold that in.

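For readers following the layout discussion, here is a minimal user-space
sketch of how the three fields could be carved out of one 32-bit word. It is
not the patch's code: it assumes a 32-bit state word, assumes
CONFIG_CONTEXT_TRACKING_WORK is enabled (16 dynticks bits), and assumes the
work field simply fills the gap between the other two. GENMASK32() and the
ct_pack() / ct_*_of() helpers are hypothetical names, and the CT_* names only
follow the naming suggested above.

```c
#include <assert.h>
#include <stdint.h>

/* User-space stand-in for the kernel's GENMASK(h, l): bits l..h set. */
#define U32_ONES	((uint32_t)0xffffffffu)
#define GENMASK32(h, l) \
	((uint32_t)((U32_ONES >> (31 - (h))) & (U32_ONES << (l))))

#define CT_STATE_SIZE		32	/* assumed width of the state word */

#define CT_STATE_START		0
#define CT_STATE_END		1	/* bits_per(CONTEXT_MAX - 1) - 1 with CONTEXT_MAX == 4 */

#define CT_RCU_DYNTICKS_BITS	16	/* assumes CONFIG_CONTEXT_TRACKING_WORK=y */
#define CT_RCU_DYNTICKS_START	(CT_STATE_SIZE - CT_RCU_DYNTICKS_BITS)
#define CT_RCU_DYNTICKS_END	(CT_STATE_SIZE - 1)

/* Assumption: the work bits occupy everything between the two fields. */
#define CT_WORK_START		(CT_STATE_END + 1)
#define CT_WORK_END		(CT_RCU_DYNTICKS_START - 1)

#define CT_STATE_MASK		GENMASK32(CT_STATE_END, CT_STATE_START)
#define CT_WORK_MASK		GENMASK32(CT_WORK_END, CT_WORK_START)
#define CT_RCU_DYNTICKS_MASK	GENMASK32(CT_RCU_DYNTICKS_END, CT_RCU_DYNTICKS_START)

/* The three fields must tile the word without overlapping. */
static_assert((CT_STATE_MASK & CT_WORK_MASK) == 0, "state/work overlap");
static_assert((CT_WORK_MASK & CT_RCU_DYNTICKS_MASK) == 0, "work/dynticks overlap");
static_assert((CT_STATE_MASK | CT_WORK_MASK | CT_RCU_DYNTICKS_MASK) == U32_ONES,
	      "fields do not cover the word");

/* Hypothetical helper: build a state word from its three fields. */
static inline uint32_t ct_pack(uint32_t state, uint32_t work, uint32_t dynticks)
{
	return ((state << CT_STATE_START) & CT_STATE_MASK) |
	       ((work << CT_WORK_START) & CT_WORK_MASK) |
	       ((dynticks << CT_RCU_DYNTICKS_START) & CT_RCU_DYNTICKS_MASK);
}

/* Hypothetical helpers: extract one field from a state word. */
static inline uint32_t ct_state_of(uint32_t v)
{
	return (v & CT_STATE_MASK) >> CT_STATE_START;
}

static inline uint32_t ct_work_of(uint32_t v)
{
	return (v & CT_WORK_MASK) >> CT_WORK_START;
}

static inline uint32_t ct_dynticks_of(uint32_t v)
{
	return (v & CT_RCU_DYNTICKS_MASK) >> CT_RCU_DYNTICKS_START;
}
```

Under these assumptions the context state sits in the two least significant
bits and the dynticks counter in the top half, which is the MSB-to-LSB order
the flipped comment is meant to show. The *_END symbols only appear as the
upper bound of each GENMASK32() call.
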
>> +bool ct_set_cpu_work(unsigned int cpu, unsigned int work)
>> +{
>> +	struct context_tracking *ct = per_cpu_ptr(&context_tracking, cpu);
>> +	unsigned int old;
>> +	bool ret = false;
>> +
>> +	preempt_disable();
>> +
>> +	old = atomic_read(&ct->state);
>> +	/*
>> +	 * Try setting the work until either
>> +	 * - the target CPU no longer accepts any more deferred work
>> +	 * - the work has been set
>> +	 *
>> +	 * NOTE: CONTEXT_GUEST intersects with CONTEXT_USER and CONTEXT_IDLE
>> +	 * as they are regular integers rather than bits, but that doesn't
>> +	 * matter here: if any of the context state bit is set, the CPU isn't
>> +	 * in kernel context.
>> +	 */
>> +	while ((old & (CONTEXT_GUEST | CONTEXT_USER | CONTEXT_IDLE)) && !ret)
>
> That may still miss a recent entry to userspace due to the first plain read, ending
> with an undesired interrupt.
>
> You need at least one cmpxchg. Well, of course that stays racy by nature because
> between the cmpxchg() returning CONTEXT_KERNEL and the actual IPI raised and
> received, the remote CPU may have gone to userspace already. But still it limits
> a little the window.
>

I can make that a 'do {} while ()' instead to force at least one execution
of the cmpxchg().

This is only about reducing the race window, right? If we're executing this
just as the target CPU is about to enter userspace, we're going to be in
racy territory anyway. Regardless, I'm happy to do that change.
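
To make the 'do {} while ()' idea concrete, here is a user-space sketch using
C11 atomics in place of the kernel's atomic_t. It is one possible shape, not
the code from the series: ct_set_work_sketch(), CT_WORK_SHIFT and the mask
values are hypothetical, and rewriting a kernel-context snapshot into a
user-context guess before the first compare-exchange is an assumption made
here so that the forced first attempt cannot set work on a CPU that is still
in the kernel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CT_STATE_MASK	0x3u	/* assumed: low two bits hold the context state */
#define CT_STATE_KERNEL	0u	/* mirrors CONTEXT_KERNEL = 0 */
#define CT_STATE_USER	2u	/* mirrors CONTEXT_USER = 2 */
#define CT_WORK_SHIFT	2	/* assumed position of the first work bit */

/*
 * Hypothetical model of the deferral attempt: returns true if the work bits
 * were set while the target was outside kernel context, false if the target
 * was observed in kernel context (the caller would then send an IPI).
 */
static bool ct_set_work_sketch(_Atomic uint32_t *state, uint32_t work)
{
	uint32_t old;
	bool ret;

	assert(state != NULL);

	/* Plain read: may be stale by the time we act on it. */
	old = atomic_load_explicit(state, memory_order_relaxed);

	/*
	 * If the snapshot says "kernel", expect "user" instead. The first
	 * compare-exchange then fails unless the target really left the
	 * kernel, and in both cases it hands back a fresh value.
	 */
	if ((old & CT_STATE_MASK) == CT_STATE_KERNEL)
		old = (old & ~CT_STATE_MASK) | CT_STATE_USER;

	do {
		/* On failure, 'old' is refreshed with the current value. */
		ret = atomic_compare_exchange_strong(state, &old,
						     old | (work << CT_WORK_SHIFT));
	} while (!ret && (old & CT_STATE_MASK) != CT_STATE_KERNEL);

	return ret;
}
```

The loop always executes at least one compare-exchange, so the decision rests
on a value obtained from an atomic read-modify-write rather than on the
initial plain read. As noted in the thread, this only narrows the window: the
target can still change context right after the function returns.
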