From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Eric W. Biederman"
To: Mateusz Guzik
Cc: Peter Zijlstra, Kees Cook, Josh Triplett, Alexander Viro,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	linux-kernel@vger.kernel.org
References: <5c7333ea4bec2fad1b47a8fa2db7c31e4ffc4f14.1663334978.git.josh@joshtriplett.org>
	<202311071228.27D22C00@keescook>
	<20231107205151.qkwlw7aarjvkyrqs@f>
	<202311071445.53E5D72C@keescook>
	<202311081129.9E1EC8D34@keescook>
	<87msvnwzim.fsf@email.froward.int.ebiederm.org>
Date: Thu, 09 Nov 2023 23:26:23 -0600
In-Reply-To: (Mateusz Guzik's message of "Thu, 9 Nov 2023 13:21:04 +0100")
Message-ID: <87a5rmw54w.fsf@email.froward.int.ebiederm.org>
Subject: Re: [PATCH] fs/exec.c: Add fast path for ENOENT on PATH search before allocating mm

Mateusz Guzik writes:

> On 11/9/23, Eric W. Biederman wrote:
>> Mateusz Guzik writes:
>>> sched_exec causes migration only for only few % of execs in the bench,
>>> but when it does happen there is tons of overhead elsewhere.
>>>
>>> I expect real programs which get past execve will be prone to
>>> migrating anyway, regardless of what sched_exec is doing.
>>>
>>> That is to say, while sched_exec buggering off here would be nice, I
>>> think for real-world wins the thing to investigate is the overhead
>>> which comes from migration to begin with.
>>
>> I have a vague memory that the idea is that there is a point during exec
>> when it should be much less expensive than normal to allow migration
>> between cpus because all of the old state has gone away.
>>
>> Assuming that is the rationale, if we are getting lock contention
>> then either there is a global lock in there, or there is the potential
>> to pick a less expensive location within exec.
>>
>
> Given the commit below I think the term "migration cost" is overloaded here.
>
> By migration cost in my previous mail I meant the immediate cost
> (stop_one_cpu and so on), but also the aftermath -- for example tlb
> flushes on another CPU when tearing down your now-defunct mm after you
> switched.
>
> For testing purposes I verified commenting out sched_exec and not
> using taskset still gives me about 9.5k ops/s.
>
> I 100% agree should the task be moved between NUMA domains, it makes
> sense to do it when it has the smallest footprint. I don't know what
> the original patch did, the current code just picks a CPU and migrates
> to it, regardless of NUMA considerations. I will note that the goal
> would still be achieved by comparing domains and doing nothing if they
> match.
>
> I think this would be nice to fix, but it is definitely not a big
> deal. I guess the question is to Peter Zijlstra if this sounds
> reasonable.

Perhaps I misread the trace. My point was simply that the sched_exec
seemed to be causing lock contention because what was on one cpu is now
on another cpu, and we are now getting cross cpu lock ping-pongs.

If the sched_exec is causing exec to cause cross cpu lock ping-pongs,
then we can move sched_exec to a better place within exec. It has
already happened once, shortly after it was introduced.

Ultimately we want the sched_exec to be in the cheapest place within
exec that we can find.

Eric