From: Mateusz Guzik
Date: Wed, 8 Nov 2023 01:03:33 +0100
Subject: Re: [PATCH] fs/exec.c: Add fast path for ENOENT on PATH search before allocating mm
To: Kees Cook
Cc: Kees Cook, Josh Triplett, Eric Biederman, Alexander Viro, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
References: <5c7333ea4bec2fad1b47a8fa2db7c31e4ffc4f14.1663334978.git.josh@joshtriplett.org> <202311071228.27D22C00@keescook> <20231107205151.qkwlw7aarjvkyrqs@f> <202311071445.53E5D72C@keescook>

On 11/8/23, Kees Cook wrote:
>
>
> On November 7, 2023 3:08:47 PM PST, Mateusz Guzik wrote:
>>On 11/7/23, Kees Cook wrote:
>>> On Tue, Nov 07, 2023 at 10:23:16PM +0100, Mateusz Guzik wrote:
>>>> If the patch which dodges second lookup still somehow appears slower a
>>>> flamegraph or other profile would be nice. I can volunteer to take a
>>>> look at what's going on provided above measurements will be done and
>>>> show funkyness.
>>>
>>> When I looked at this last, it seemed like all the work done in
>>> do_filp_open() (my patch, which moved the lookup earlier) was heavier
>>> than the duplicate filename_lookup().
>>>
>>> What I didn't test was moving the sched_exec() before the mm creation,
>>> which Peter confirmed shouldn't be a problem, but I think that might be
>>> only a tiny benefit, if at all.
>>>
>>> If you can do some comparisons, that would be great; it always takes me
>>> a fair bit of time to get set up for flame graph generation, etc. :)
>>>
>>
>>So I spawned *one* process executing one statically linked binary in a
>>loop, test case from http://apollo.backplane.com/DFlyMisc/doexec.c .
>>
>>The profile is definitely not what I expected:
>>   5.85%  [kernel]  [k] asm_exc_page_fault
>>   5.84%  [kernel]  [k] __pv_queued_spin_lock_slowpath
>>[snip]
>>
>>I'm going to have to recompile with lock profiling, meanwhile
>>according to bpftrace
>>(bpftrace -e 'kprobe:__pv_queued_spin_lock_slowpath { @[kstack()] =
>> count(); }')
>>top hits would be:
>>
>>@[
>>    __pv_queued_spin_lock_slowpath+1
>>    _raw_spin_lock+37
>>    __schedule+192
>>    schedule_idle+38
>>    do_idle+366
>>    cpu_startup_entry+38
>>    start_secondary+282
>>    secondary_startup_64_no_verify+381
>>]: 181
>>@[
>>    __pv_queued_spin_lock_slowpath+1
>>    _raw_spin_lock_irq+43
>>    wait_for_completion+141
>>    stop_one_cpu+127
>>    sched_exec+165
>
> There's the suspicious sched_exec() I was talking about! :)
>
> I think it needs to be moved, and perhaps _later_ instead of earlier?
> Hmm...
>

I'm getting around 3.4k execs/s. However, if I pin the test with
"taskset -c 3 ./static-doexec 1", the number goes up to about 9.5k and
lock contention disappears from the profile.
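
For anyone who wants to repeat the pinned vs. unpinned comparison without
building doexec.c, a rough userspace sketch follows. It is not the test
case used above: the helper name execs_per_second and its parameters are
made up for illustration, it measures fork+exec of an arbitrary command
from Python (so absolute numbers will be much lower than with a static C
binary), and it assumes Linux, since os.sched_setaffinity is only
available there.

```python
import os
import subprocess
import time


def execs_per_second(argv, duration=1.0, cpu=None):
    """Spawn argv repeatedly for roughly `duration` seconds; return spawns/s.

    If `cpu` is given, the calling process is pinned to that CPU first.
    Children inherit the affinity mask, which approximates launching the
    whole benchmark under `taskset -c <cpu>`.
    """
    saved = os.sched_getaffinity(0)  # remember the original CPU mask
    if cpu is not None:
        os.sched_setaffinity(0, {cpu})
    try:
        count = 0
        start = time.perf_counter()
        while time.perf_counter() - start < duration:
            # Each iteration is one fork+exec+wait of the target command.
            subprocess.run(argv, stdout=subprocess.DEVNULL, check=True)
            count += 1
        elapsed = time.perf_counter() - start
    finally:
        # Restore the original mask so later measurements are unaffected.
        os.sched_setaffinity(0, saved)
    return count / elapsed
```

Calling execs_per_second(["./static-doexec-target"]) and then
execs_per_second(["./static-doexec-target"], cpu=3) (the binary name is a
placeholder) gives the two rates to compare; whether the gap resembles
the one reported here will depend on the machine and scheduler.
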
So offhand it looks like the task is migrating between CPUs when that
could perhaps be avoided -- the box is idle apart from running the
test. Again, this is going to require a serious look instead of ad hoc
pokes.

Side note: I actually read your patch this time around instead of
skimming through it and assuming it did what I thought.

do_filp_open is of course very expensive, and kmalloc + kfree are slow.
On top of that, deallocating a file object even after a failed open was
very expensive due to delegation to task_work (recently fixed).

What I claim should be clear-cut faster is doing the lookup as in the
original patch and only messing with file allocation et al if it
succeeds.

-- 
Mateusz Guzik