Date: Tue, 10 Dec 2024 10:01:36 +0100
From: Sebastian Andrzej Siewior
To: Alexei Starovoitov
Cc: bpf@vger.kernel.org, andrii@kernel.org, memxor@gmail.com, akpm@linux-foundation.org, peterz@infradead.org, vbabka@suse.cz, rostedt@goodmis.org, houtao1@huawei.com, hannes@cmpxchg.org, shakeel.butt@linux.dev, mhocko@suse.com, willy@infradead.org, tglx@linutronix.de, tj@kernel.org, linux-mm@kvack.org, kernel-team@fb.com
Subject: Re: [PATCH bpf-next v2 1/6] mm, bpf: Introduce __GFP_TRYLOCK for opportunistic page allocation
Message-ID: <20241210090136.DGfYLmeo@linutronix.de>
References: <20241210023936.46871-1-alexei.starovoitov@gmail.com> <20241210023936.46871-2-alexei.starovoitov@gmail.com>
In-Reply-To: <20241210023936.46871-2-alexei.starovoitov@gmail.com>

On 2024-12-09 18:39:31 [-0800], Alexei Starovoitov wrote:
> From: Alexei Starovoitov
>
> Tracing BPF programs execute from tracepoints and kprobes where running
> context is unknown, but they need to request additional memory.
> The prior workarounds were using pre-allocated memory and BPF specific
> freelists to satisfy such allocation requests. Instead, introduce
> __GFP_TRYLOCK flag that makes page allocator accessible from any context.
> It relies on percpu free list of pages that rmqueue_pcplist() should be
> able to pop the page from. If it fails (due to IRQ re-entrancy or list
> being empty) then try_alloc_pages() attempts to spin_trylock zone->lock
> and refill percpu freelist as normal.
> BPF program may execute with IRQs disabled and zone->lock is sleeping in RT,
> so trylock is the only option.

The __GFP_TRYLOCK flag looks reasonable given the challenges for BPF,
where it is not known how much memory will be needed or what the calling
context is.
I hope it does not spread across the kernel: people do GFP_ATOMIC
allocations in preempt/IRQ-off sections on PREEMPT_RT and, once they
learn that this does not work, add this flag to the mix to make it work
instead of spending some time on reworking the code.

Side note: I am in the process of hopefully getting rid of the
preempt_disable() from tracepoints. What remains then is attaching BPF
programs to any code/function with a raw_spinlock_t, and I am not yet
sure what to do here.

Sebastian
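
For illustration, the fallback described in the quoted commit message can
be sketched in userspace C. This is not the kernel code from the patch:
page_pool, local_cache, try_alloc_page() and the two size constants are
hypothetical names, a C11 atomic_flag stands in for zone->lock and a plain
array stands in for the percpu free list. It only shows the control flow:
pop from the local list first, otherwise trylock the shared lock, refill,
and give up instead of waiting if the lock is taken.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define CACHE_MAX  4	/* hypothetical refill batch of the local list */
#define POOL_PAGES 8	/* hypothetical number of pages in the shared pool */

/* Shared pool; "lock" stands in for zone->lock (assumption). */
struct page_pool {
	atomic_flag lock;
	int free_pages[POOL_PAGES];	/* page ids */
	size_t nr_free;
};

/* Local list; stands in for the percpu free list (assumption). */
struct local_cache {
	int pages[CACHE_MAX];
	size_t count;
};

/* Take the lock only if it is free right now; never wait for it. */
bool pool_trylock(struct page_pool *pool)
{
	return !atomic_flag_test_and_set_explicit(&pool->lock,
						  memory_order_acquire);
}

void pool_unlock(struct page_pool *pool)
{
	atomic_flag_clear_explicit(&pool->lock, memory_order_release);
}

void pool_init(struct page_pool *pool)
{
	assert(pool != NULL);
	atomic_flag_clear(&pool->lock);
	for (size_t i = 0; i < POOL_PAGES; i++)
		pool->free_pages[i] = (int)i;
	pool->nr_free = POOL_PAGES;
}

/*
 * Returns a page id, or -1 if no page can be handed out without
 * waiting for the shared lock.
 */
int try_alloc_page(struct page_pool *pool, struct local_cache *cache)
{
	/* Fast path: pop from the local list, no shared lock needed. */
	if (cache->count > 0)
		return cache->pages[--cache->count];

	/* Slow path: the caller may not be allowed to wait, so trylock. */
	if (!pool_trylock(pool))
		return -1;

	/* Refill the local list from the shared pool. */
	while (cache->count < CACHE_MAX && pool->nr_free > 0)
		cache->pages[cache->count++] =
			pool->free_pages[--pool->nr_free];

	pool_unlock(pool);

	return cache->count > 0 ? cache->pages[--cache->count] : -1;
}
```

A caller has to treat -1 as "try again later or do without", which is
the same contract an opportunistic allocation imposes on its users.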