From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 26 Oct 2023 17:35:27 -0400
From: Steven Rostedt
To: Mathieu Desnoyers
Cc: Peter Zijlstra, LKML, Thomas Gleixner, Ankur Arora, Linus Torvalds,
 linux-mm@kvack.org, x86@kernel.org, akpm@linux-foundation.org,
 luto@kernel.org, bp@alien8.de, dave.hansen@linux.intel.com, hpa@zytor.com,
 mingo@redhat.com, juri.lelli@redhat.com, vincent.guittot@linaro.org,
 willy@infradead.org, mgorman@suse.de, jon.grimm@amd.com, bharata@amd.com,
 raghavendra.kt@amd.com, boris.ostrovsky@oracle.com, konrad.wilk@oracle.com,
 jgross@suse.com, andrew.cooper3@citrix.com, Joel Fernandes, Youssef Esmat,
 Vineeth Pillai, Suleiman Souhlal, Ingo Molnar, Daniel Bristot de Oliveira
Subject: Re: [POC][RFC][PATCH v2] sched: Extended Scheduler Time Slice
Message-ID: <20231026173527.2ad215cc@gandalf.local.home>
In-Reply-To: <20231026152022.668ca0f3@gandalf.local.home>
References: <20231025235413.597287e1@gandalf.local.home>
 <20231026105944.GJ33965@noisy.programming.kicks-ass.net>
 <20231026071413.4ed47b0e@gandalf.local.home>
 <20231026152022.668ca0f3@gandalf.local.home>
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="MP_/=uzH+ow/_xOFCsWFh3XaCEE"
Sender: owner-linux-mm@kvack.org
Precedence: bulk

--MP_/=uzH+ow/_xOFCsWFh3XaCEE
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit
Content-Disposition: inline

On Thu, 26 Oct 2023 15:20:22 -0400
Steven Rostedt wrote:

> Anyway, I changed the code to use:
>
> static inline unsigned clrbit(volatile unsigned *ptr)
> {
> 	unsigned ret;
>
> 	asm volatile("andb %b1,%0"
> 		     : "+m" (*(volatile char *)ptr)
> 		     : "iq" (0x2)
> 		     : "memory");
>
> 	ret = *ptr;
> 	*ptr = 0;
>
> 	return ret;
> }

Mathieu also told me that glibc's rseq has some extra padding at the end
that happens to be big enough to hold this feature. That means you can run
the code without adding:

  GLIBC_TUNABLES=glibc.pthread.rseq=0

Attached is the updated test program.

-- Steve

--MP_/=uzH+ow/_xOFCsWFh3XaCEE
Content-Type: text/x-c++src
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename=extend-sched.c

// GLIBC_TUNABLES=glibc.pthread.rseq=0 is no longer required:
// init_extend_map() uses glibc's rseq area when it is registered.

/*
 * The archive dropped the <...> header names; the list below is the set
 * of headers the code in this file uses.
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>
#include <pthread.h>
#include <sched.h>
#include <sys/time.h>
#include <sys/syscall.h>
#include "rseq-abi.h"
#include <tracefs.h>

#define rseq(rseq, len, flags, sig)	syscall(SYS_rseq, rseq, len,	\
						flags, sig)

#define __weak __attribute__((weak))

//#define barrier() asm volatile ("" ::: "memory")
#define rmb() asm volatile ("lfence" ::: "memory")
#define wmb() asm volatile ("sfence" ::: "memory")

static pthread_barrier_t pbarrier;

static __thread struct rseq_abi __attribute__((aligned(sizeof(struct rseq_abi)))) rseq_map;
static __thread struct rseq_abi *rseq_ptr;

static bool no_rseq;

static void init_extend_map(void)
{
	extern ptrdiff_t __rseq_offset;
	extern unsigned int __rseq_size;
	int ret;

	if (no_rseq)
		return;

	/* glibc already registered rseq for this thread: reuse its area */
	if (__rseq_size) {
		if (__rseq_size < sizeof(rseq_map)) {
			printf("glibc rseq less than required mapping\n");
			return;
		}
		rseq_ptr = __builtin_thread_pointer() + __rseq_offset;
		printf("Using glibc rseq %p\n", (void *)rseq_ptr);
		return;
	}

	/* Otherwise register our own per-thread area */
	rseq_ptr = &rseq_map;

	ret = rseq(rseq_ptr, sizeof(rseq_map), 0,
		   0);
	if (ret < 0)
		perror("rseq");
	printf("ret = %d (%zu) %p\n", ret, sizeof(rseq_map), (void *)&rseq_map);
	if (ret < 0)
		rseq_ptr = NULL;
}

struct data;

struct thread_data {
	unsigned long long		start_wait;
	unsigned long long		x_count;
	unsigned long long		total;
	unsigned long long		max;
	unsigned long long		min;
	unsigned long long		total_wait;
	unsigned long long		max_wait;
	unsigned long long		min_wait;
	struct data			*data;
};

struct data {
	unsigned long long		x;
	unsigned long			lock;
	struct thread_data		*tdata;
	bool				done;
};

/* Full-width compare-and-exchange; returns the value found in *ptr. */
static inline unsigned long
cmpxchg(volatile unsigned long *ptr, unsigned long old, unsigned long new)
{
	unsigned long prev;

	asm volatile("lock; cmpxchgq %2,%1"
		     : "=a"(prev), "+m"(*ptr)
		     : "r"(new), "0"(old)
		     : "memory");
	return prev;
}

/*
 * Clear bit 0 with a single instruction while keeping bit 1,
 * then read the flags and zero them.
 */
static inline unsigned clrbit(volatile unsigned *ptr)
{
	unsigned ret;

	asm volatile("andb %b1,%0"
		     : "+m" (*(volatile char *)ptr)
		     : "iq" (0x2)
		     : "memory");

	ret = *ptr;
	*ptr = 0;

	return ret;
}

/* Request an extended time slice before entering the critical section */
static void extend(void)
{
	if (!rseq_ptr)
		return;

	rseq_ptr->cr_flags = 1;
}

/* Drop the request; yield if bit 1 was set while it was pending */
static void unextend(void)
{
	unsigned prev;

	if (!rseq_ptr)
		return;

	prev = clrbit(&rseq_ptr->cr_flags);
	if (prev & 2) {
		tracefs_printf(NULL, "Yield!\n");
		sched_yield();
	}
}

#define sec2usec(sec) ((sec) * 1000000ULL)
#define usec2sec(usec) ((usec) / 1000000ULL)

static unsigned long long get_time(void)
{
	struct timeval tv;
	unsigned long long time;

	gettimeofday(&tv, NULL);

	time = sec2usec(tv.tv_sec);
	time += tv.tv_usec;

	return time;
}

static void grab_lock(struct thread_data *tdata, struct data *data)
{
	unsigned long long start, end, delta;
	unsigned long long end_wait;
	unsigned long long last;
	unsigned long prev;

	if (!tdata->start_wait)
		tdata->start_wait = get_time();

	while (data->lock && !data->done)
		rmb();

	extend();
	start = get_time();
	prev = cmpxchg(&data->lock, 0, 1);
	if (prev) {
		/* Lost the race; the wait time keeps accumulating */
		unextend();
		return;
	}
	end_wait = get_time();
	tracefs_printf(NULL, "Have lock!\n");

	delta = end_wait - tdata->start_wait;
	tdata->start_wait = 0;
	if (!tdata->total_wait || tdata->max_wait < delta)
		tdata->max_wait = delta;
	if (!tdata->total_wait || tdata->min_wait > delta)
		tdata->min_wait = delta;
	tdata->total_wait += delta;

	data->x++;
	last = data->x;

	if (data->lock != 1) {
		printf("Failed locking\n");
		exit(-1);
	}
	prev = cmpxchg(&data->lock, 1, 0);
	end = get_time();
	if (prev != 1) {
		printf("Failed unlocking\n");
		exit(-1);
	}
	tracefs_printf(NULL, "released lock!\n");
	unextend();

	delta = end - start;
	if (!tdata->total || tdata->max < delta)
		tdata->max = delta;
	if (!tdata->total || tdata->min > delta)
		tdata->min = delta;
	tdata->total += delta;
	tdata->x_count++;

	/* Let someone else have a turn */
	while (data->x == last && !data->done)
		rmb();
}

static void *run_thread(void *d)
{
	struct thread_data *tdata = d;
	struct data *data = tdata->data;

	init_extend_map();

	pthread_barrier_wait(&pbarrier);

	while (!data->done) {
		grab_lock(tdata, data);
	}
	return NULL;
}

int main (int argc, char **argv)
{
	unsigned long long total_wait = 0;
	unsigned long long secs;
	pthread_t *threads;
	struct data data;
	int cpus;

	memset(&data, 0, sizeof(data));

	cpus = sysconf(_SC_NPROCESSORS_CONF);

	threads = calloc(cpus + 1, sizeof(*threads));
	if (!threads) {
		perror("threads");
		exit(-1);
	}

	data.tdata = calloc(cpus + 1, sizeof(*data.tdata));
	if (!data.tdata) {
		perror("Allocating tdata");
		exit(-1);
	}

	tracefs_print_init(NULL);

	/* cpus + 1 worker threads plus the main thread */
	pthread_barrier_init(&pbarrier, NULL, cpus + 2);

	for (int i = 0; i <= cpus; i++) {
		int ret;

		data.tdata[i].data = &data;
		ret = pthread_create(&threads[i], NULL, run_thread, &data.tdata[i]);
		/* pthread_create() returns an error number, not -1 */
		if (ret) {
			fprintf(stderr, "creating threads: %s\n", strerror(ret));
			exit(-1);
		}
	}

	pthread_barrier_wait(&pbarrier);
	sleep(5);

	printf("Finish up\n");
	data.done = true;
	wmb();

	for (int i = 0; i <= cpus; i++) {
		pthread_join(threads[i], NULL);
		printf("thread %i:\n", i);
		printf(" count:\t%llu\n", data.tdata[i].x_count);
		printf(" total:\t%llu\n", data.tdata[i].total);
		printf(" max:\t%llu\n", data.tdata[i].max);
		printf(" min:\t%llu\n", data.tdata[i].min);
		printf(" total wait:\t%llu\n", data.tdata[i].total_wait);
		printf(" max wait:\t%llu\n", data.tdata[i].max_wait);
		printf(" min wait:\t%llu\n", data.tdata[i].min_wait);
		total_wait += data.tdata[i].total_wait;
	}

	secs = usec2sec(total_wait);

	printf("Ran for %llu times\n", data.x);
	printf("Total wait time: %llu.%06llu\n", secs, total_wait - sec2usec(secs));
	return 0;
}

--MP_/=uzH+ow/_xOFCsWFh3XaCEE--
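
For anyone reading along without the patched kernel: the handshake that extend() and unextend() in the attachment rely on can be tried out as a standalone, single-threaded simulation. This is only a sketch under stated assumptions. The bit meanings are inferred from the test program (`cr_flags = 1` to request, `prev & 2` to decide whether to yield), and `flags`, `slice_extend()`, `slice_unextend()` and `fake_kernel_tick()` are hypothetical names that exist neither in the patch nor in rseq.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed bit layout, inferred from the test program only. */
#define SLICE_EXTEND_REQ	0x1u	/* task asks for a longer time slice */
#define SLICE_YIELD_REQ		0x2u	/* set when the extension was granted */

/* Hypothetical stand-in for rseq_ptr->cr_flags. */
static volatile unsigned flags;

/* Task side: request an extension before a critical section. */
static void slice_extend(void)
{
	flags = SLICE_EXTEND_REQ;
}

/*
 * Hypothetical model of a scheduler tick hitting the task.
 * Returns true when the task is allowed to keep running.
 */
static bool fake_kernel_tick(void)
{
	if (!(flags & SLICE_EXTEND_REQ))
		return false;		/* no request pending: preempt as usual */
	flags |= SLICE_YIELD_REQ;	/* grant, but ask for a yield afterwards */
	return true;
}

/*
 * Task side: drop the request after the critical section.
 * Returns true when the caller should call sched_yield().
 * Plain read-then-clear is enough here because nothing runs concurrently
 * in this simulation.
 */
static bool slice_unextend(void)
{
	unsigned prev = flags;

	assert(!(prev & ~(SLICE_EXTEND_REQ | SLICE_YIELD_REQ)));
	flags = 0;
	return (prev & SLICE_YIELD_REQ) != 0;
}
```

A task that was granted extra time sees `slice_unextend()` return true and is expected to yield right away; a task whose critical section finished before any tick sees false and simply continues.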