From: David Laight
To: 'Oleg Nesterov', "linux-mm@kvack.org"
CC: 'Jiri Olsa', Peter Zijlstra, Andrii Nakryiko, "bpf@vger.kernel.org",
 Song Liu, Yonghong Song, John Fastabend, Hao Luo, Steven Rostedt,
 Masami Hiramatsu, Alan Maguire, "linux-kernel@vger.kernel.org",
 "linux-trace-kernel@vger.kernel.org"
Subject: RE: [PATCH bpf-next 08/13] uprobes/x86: Add support to optimize uprobes
Date: Mon, 16 Dec 2024 11:10:23 +0000
Message-ID: <0916e24539ba4bae9fb729198b033bd7@AcuMS.aculab.com>
References: <20241211133403.208920-1-jolsa@kernel.org>
 <20241211133403.208920-9-jolsa@kernel.org>
 <1521ff93bc0649b0aade9cfc444929ca@AcuMS.aculab.com>
 <20241215141412.GA13580@redhat.com>
 <20241216101258.GA374@redhat.com>
In-Reply-To: <20241216101258.GA374@redhat.com>

From: Oleg Nesterov
> Sent: 16 December 2024 10:13
>
> David,
>
> let me say first that my understanding of this magic is very limited,
> please correct me.

I only (half) understand what the 'magic' has to accomplish and
some of the pitfalls.

I've copied linux-mm - someone there might know more.

> On 12/16, David Laight wrote:
> >
> > It all depends on how hard __replace_page() tries to be atomic.
> > The page has to change from one backed by the executable to a private
> > one backed by swap - otherwise you can't write to it.
>
> This is what uprobe_write_opcode() does,

And will be enough for single byte changes - they'll be picked up
at some point after the change.

> > But the problems arise when the instruction prefetch unit has read
> > part of the 5-byte instruction (it might even only read half a cache
> > line at a time).
> > I'm not sure how long the pipeline can sit in that state - but I
> > can do a memory read of a PCIe address that takes ~3000 clocks.
> > (And a misaligned AVX-512 read is probably eight 8-byte transfers.)
> >
> > So I think you need to force an interrupt while the PTE is invalid.
> > And that need to be simultaneous on all cpu running that process.
>
> __replace_page() does ptep_get_and_clear(old_pte) + flush_tlb_page().
>
> That's not enough?

I doubt it. As I understand it.
The hardware page tables will be shared by all the threads of a process.
So unless you hard synchronise all the cpu (and flush the TLB) while the
PTE is being changed there is always the possibility of a cpu picking up
the new PTE before the IPI that (I presume) flush_tlb_page() generates
is processed.
If that happens when the instruction you are patching is part-read into
the instruction decode buffer then you'll execute a mismatch of the two
instructions.

I can't remember the outcome of discussions about live-patching kernel
code - and I'm sure that was aligned 32bit writes.

>
> > Stopping the process using ptrace would do it.
>
> Not an option :/

Thought you'd say that.

	David

>
> Oleg.

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
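
The torn-fetch worry above can be illustrated with a toy model. This is only a sketch under stated assumptions, not kernel code and not a model of any real CPU: it assumes the worst case, where every byte of the 5-byte instruction may independently be fetched before or after a concurrent write. The byte values of NEW (the call displacement) and all function names are made up for illustration. It compares a direct 5-byte overwrite with a breakpoint-first sequence (int3 on the first byte, then the tail, then the first byte), which is similar in spirit to what I believe is done for kernel text on x86, and which assumes all cpus are synchronised between consecutive writes. Whether that synchronisation can be had for a user page is exactly the open question in this thread.

```python
from itertools import product

INT3 = 0xCC  # x86 breakpoint opcode

# 5-byte nop: nopl 0x0(%rax,%rax,1)
OLD = bytes([0x0F, 0x1F, 0x44, 0x00, 0x00])
# call rel32; the displacement bytes are arbitrary (hypothetical target)
NEW = bytes([0xE8, 0x10, 0x20, 0x30, 0x40])


def torn_fetches(before, after):
    """Every byte string a cpu could assemble if each byte is read either
    before or after a concurrent write (worst-case tearing)."""
    choices = [{b, a} for b, a in zip(before, after)]
    return {bytes(combo) for combo in product(*choices)}


def outcome(fetched):
    """Classify what the cpu would end up executing."""
    if fetched[0] == INT3:
        return "trap"      # breakpoint fires; tail bytes are never decoded
    if fetched == OLD:
        return "old"
    if fetched == NEW:
        return "new"
    return "garbage"       # a mismatch of the two instructions


def outcomes(states):
    """Outcomes reachable when a fetch may tear across adjacent states only,
    i.e. all cpus are synchronised between consecutive writes."""
    seen = set()
    for before, after in zip(states, states[1:]):
        seen |= {outcome(f) for f in torn_fetches(before, after)}
    return seen


# One write of all five bytes.
direct = outcomes([OLD, NEW])

# Breakpoint first, then the tail, then the first byte.
staged = outcomes([
    OLD,
    bytes([INT3]) + OLD[1:],
    bytes([INT3]) + NEW[1:],
    NEW,
])

print("direct 5-byte write:", sorted(direct))
print("int3-first sequence:", sorted(staged))
```

In this model the direct write can yield a mixed instruction, while the staged sequence only ever yields the old instruction, the new one, or a trap. The price is a synchronisation point between every step, and a trap handler that knows what to do while the patch is in flight.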