From: Jiri Olsa
Date: Mon, 16 Dec 2024 17:06:27 +0100
To: David Laight
Cc: 'Jiri Olsa', Oleg Nesterov, "linux-mm@kvack.org", Peter Zijlstra,
    Andrii Nakryiko, "bpf@vger.kernel.org", Song Liu, Yonghong Song,
    John Fastabend, Hao Luo, Steven Rostedt, Masami Hiramatsu,
    Alan Maguire, "linux-kernel@vger.kernel.org",
    "linux-trace-kernel@vger.kernel.org"
Subject: Re: [PATCH bpf-next 08/13] uprobes/x86: Add support to optimize uprobes
References: <20241211133403.208920-9-jolsa@kernel.org>
    <1521ff93bc0649b0aade9cfc444929ca@AcuMS.aculab.com>
    <20241215141412.GA13580@redhat.com>
    <20241216101258.GA374@redhat.com>
    <0916e24539ba4bae9fb729198b033bd7@AcuMS.aculab.com>
    <20241216122204.GB374@redhat.com>

On Mon, Dec 16, 2024 at 03:08:14PM +0000, David Laight wrote:
> From: Jiri Olsa
> > Sent: 16 December 2024 12:50
> >
> > On Mon, Dec 16, 2024 at 01:22:05PM +0100, Oleg Nesterov wrote:
> > > OK, thanks, I am starting to share your concerns...
> > >
> > > Oleg.
> > >
> > > On 12/16, David Laight wrote:
> > > >
> > > > From: Oleg Nesterov
> > > > > Sent: 16 December 2024 10:13
> > > > >
> > > > > David,
> > > > >
> > > > > let me say first that my understanding of this magic is very limited,
> > > > > please correct me.
> > > >
> > > > I only (half) understand what the 'magic' has to accomplish and
> > > > some of the pitfalls.
> > > >
> > > > I've copied linux-mm - someone there might know more.
> > > >
> > > > > On 12/16, David Laight wrote:
> > > > > >
> > > > > > It all depends on how hard __replace_page() tries to be atomic.
> > > > > > The page has to change from one backed by the executable to a private
> > > > > > one backed by swap - otherwise you can't write to it.
> > > > >
> > > > > This is what uprobe_write_opcode() does,
> > > >
> > > > And will be enough for single byte changes - they'll be picked up
> > > > at some point after the change.
> > > >
> > > > > > But the problems arise when the instruction prefetch unit has read
> > > > > > part of the 5-byte instruction (it might even only read half a cache
> > > > > > line at a time).
> > > > > > I'm not sure how long the pipeline can sit in that state - but I
> > > > > > can do a memory read of a PCIe address that takes ~3000 clocks.
> > > > > > (And a misaligned AVX-512 read is probably eight 8-byte transfers.)
> > > > > >
> > > > > > So I think you need to force an interrupt while the PTE is invalid.
> > > > > > And that need to be simultaneous on all cpu running that process.
> > > > >
> > > > > __replace_page() does ptep_get_and_clear(old_pte) + flush_tlb_page().
> > > > >
> > > > > That's not enough?
> > > >
> > > > I doubt it. As I understand it.
> > > > The hardware page tables will be shared by all the threads of a process.
> > > > So unless you hard synchronise all the cpu (and flush the TLB) while the
> > > > PTE is being changed there is always the possibility of a cpu picking up
> > > > the new PTE before the IPI that (I presume) flush_tlb_page() generates
> > > > is processed.
> > > > If that happens when the instruction you are patching is part-read into
> > > > the instruction decode buffer then you'll execute a mismatch of the two
> > > > instructions.
> >
> > if 5 byte update would be a problem, I guess we could workaround that through
> > partial updates using int3 like we do in text_poke_bp_batch?
> >
> >   - changing nop5 instruction to 'call xxx'
> >     - write int3 to first byte of nop5 instruction
> >     - have poke_int3_handler to emulate nop5 if int3 is triggered
> >     - write rest of the call instruction to nop5 last 4 bytes
> >     - overwrite first byte of nop5 with call opcode
>
> That might work provided there are IPI (to flush the decode pipeline)
> after the write of the 'int3' and one before the write of the 'call'.
> You'll need to ensure the I-cache gets invalidated as well.

ok, seems to be done by text_poke_sync

>
> And if the sequence crosses a page boundary....

that was already limitation for the current change

jirka
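
For illustration only, the write ordering of the int3 sequence quoted above could be sketched in plain user-space C on a byte buffer. This is not the kernel code and it patches no executable text: `fake_sync_all_cpus()` is a hypothetical stand-in for the IPI-based sync (text_poke_sync in the kernel) and only counts calls, `state_is_safe()` and `patch_nop5_to_call()` are made-up helper names, and the int3 handler that would emulate nop5 is not modelled at all. It says nothing about prefetch or TLB behaviour, it only shows that a reader starting at byte 0 sees either the old nop5, an int3, or the complete call.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { INSN_LEN = 5, INT3_OPCODE = 0xCC, CALL_OPCODE = 0xE8 };

/* Standard x86 5-byte nop: nopl 0x0(%rax,%rax,1) */
static const uint8_t nop5[INSN_LEN] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };

/* Hypothetical stand-in for an IPI sync of all CPUs; it only counts calls. */
static int sync_calls;
static void fake_sync_all_cpus(void) { sync_calls++; }

/* Encode "call rel32": opcode 0xE8 followed by a little-endian displacement. */
static void encode_call(uint8_t out[INSN_LEN], int32_t rel32)
{
    uint32_t u = (uint32_t)rel32;

    out[0] = CALL_OPCODE;
    out[1] = (uint8_t)(u & 0xff);
    out[2] = (uint8_t)((u >> 8) & 0xff);
    out[3] = (uint8_t)((u >> 16) & 0xff);
    out[4] = (uint8_t)((u >> 24) & 0xff);
}

/* True if byte 0 is int3, or the buffer holds the whole old or new insn. */
static int state_is_safe(const uint8_t *insn, const uint8_t *call)
{
    return insn[0] == INT3_OPCODE ||
           memcmp(insn, nop5, INSN_LEN) == 0 ||
           memcmp(insn, call, INSN_LEN) == 0;
}

/* Rewrite nop5 into "call rel32" in the order proposed in the thread. */
static void patch_nop5_to_call(uint8_t *insn, int32_t rel32)
{
    uint8_t call[INSN_LEN];

    encode_call(call, rel32);
    assert(memcmp(insn, nop5, INSN_LEN) == 0);

    /* 1) int3 over the first byte, then sync */
    insn[0] = INT3_OPCODE;
    assert(state_is_safe(insn, call));
    fake_sync_all_cpus();

    /* 2) last 4 bytes of the call, first byte is still int3, then sync */
    memcpy(insn + 1, call + 1, INSN_LEN - 1);
    assert(state_is_safe(insn, call));
    fake_sync_all_cpus();

    /* 3) finally replace int3 with the call opcode */
    insn[0] = call[0];
    assert(state_is_safe(insn, call));
}
```

Calling patch_nop5_to_call() on a buffer holding nop5 leaves the 5 call bytes in place, with the two sync points sitting where David asks for the IPIs (after the int3 write and before the call opcode write).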