From: "Eric W. Biederman"
To: Mateusz Guzik
Cc: Kees Cook, Peter Zijlstra, Josh Triplett, Alexander Viro,
    linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH] fs/exec.c: Add fast path for ENOENT on PATH search before allocating mm
Date: Wed, 08 Nov 2023 18:17:53 -0600
Message-ID: <87msvnwzim.fsf@email.froward.int.ebiederm.org>
In-Reply-To: (Mateusz Guzik's message of "Wed, 8 Nov 2023 20:35:55 +0100")
References: <5c7333ea4bec2fad1b47a8fa2db7c31e4ffc4f14.1663334978.git.josh@joshtriplett.org>
    <202311071228.27D22C00@keescook>
    <20231107205151.qkwlw7aarjvkyrqs@f>
    <202311071445.53E5D72C@keescook>
    <202311081129.9E1EC8D34@keescook>

Mateusz Guzik writes:

> On 11/8/23, Kees Cook wrote:
>> On Wed, Nov 08, 2023 at 01:03:33AM +0100, Mateusz Guzik wrote:
>>> I'm getting around 3.4k execs/s. However, if I "taskset -c 3
>>> ./static-doexec 1" the number goes up to about 9.5k and lock
>>> contention disappears from the profile. So offhand it looks like the
>>> task is walking around the box when it perhaps could be avoided -- it
>>> is idle apart from running the test. Again this is going to require a
>>> serious look instead of ad hoc pokes.
>>
>> Peter, is this something you can speak to? It seems like execve() forces
>> a change in running CPU. Is this really something we want to be doing?
>> Or is there some better way to keep it on the same CPU unless there is
>> contention?
>>
>
> sched_exec causes migration for only a few % of execs in the bench,
> but when it does happen there is tons of overhead elsewhere.
>
> I expect real programs which get past execve will be prone to
> migrating anyway, regardless of what sched_exec is doing.
>
> That is to say, while sched_exec buggering off here would be nice, I
> think for real-world wins the thing to investigate is the overhead
> which comes from migration to begin with.

I have a vague memory that the idea is that there is a point during exec
when it should be much less expensive than normal to allow migration
between CPUs, because all of the old state has gone away.

Assuming that is the rationale, if we are getting lock contention
then either there is a global lock in there, or there is the potential
to pick a less expensive location within exec.

Just to confirm my memory I dug a little deeper and found the original
commit that added sched_exec (in tglx's git tree of the BitKeeper
history).

commit f01419fd6d4e5b32fef19d206bc3550cc04567a9
Author: Martin J. Bligh
Date:   Wed Jan 15 19:46:10 2003 -0800

    [PATCH] (2/3) Initial load balancing

    Patch from Michael Hohnbaum

    This adds a hook, sched_balance_exec(), to the exec code, to make it
    place the exec'ed task on the least loaded queue. We have less state
    to move at exec time than fork time, so this is the cheapest point
    to cross-node migrate. Experience in Dynix/PTX and testing on Linux
    has confirmed that this is the cheapest time to move tasks between nodes.

    It also macro-wraps changes to nr_running, to allow us to keep track of
    per-node nr_running as well. Again, no impact on non-NUMA machines.

Eric
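
For readers who want the "least loaded queue" rule from the quoted commit message in concrete form, here is a toy Python model. It is only a sketch under stated assumptions and is not the kernel's code: the function name pick_exec_cpu, the nr_running list, the tie-break in favour of the current CPU, and the choice not to count the exec'ing task against its own queue are all hypothetical, chosen for illustration.

```python
from typing import Sequence


def pick_exec_cpu(nr_running: Sequence[int], current_cpu: int) -> int:
    """Toy model: return the index of the least loaded run queue.

    nr_running[i] is a hypothetical count of runnable tasks on CPU i.
    Assumption (not taken from the kernel): the task calling exec is not
    counted against its own CPU, and the current CPU wins ties, so the
    task only moves when another queue is strictly shorter.
    """
    if not 0 <= current_cpu < len(nr_running):
        raise ValueError("current_cpu out of range")

    best_cpu = current_cpu
    # Discount the exec'ing task itself from its own queue length.
    best_load = max(nr_running[current_cpu] - 1, 0)
    for cpu, load in enumerate(nr_running):
        # Strict comparison: equal load never triggers a migration.
        if load < best_load:
            best_cpu, best_load = cpu, load
    return best_cpu
```

Under these assumptions a task running alone on an otherwise idle machine stays where it is, which is the behaviour Kees asks about above; it moves only once some other queue is strictly shorter than its own.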