Date: Fri, 19 Aug 2022 15:43:17 -0700
From: Alexei Starovoitov
To: Kumar Kartikeya Dwivedi
Cc: davem@davemloft.net, daniel@iogearbox.net, andrii@kernel.org, tj@kernel.org, delyank@fb.com, linux-mm@kvack.org, bpf@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH v3 bpf-next 13/15] bpf: Prepare bpf_mem_alloc to be used by sleepable bpf programs.
Message-ID: <20220819224317.i3mwmr5atdztudtt@MacBook-Pro-3.local.dhcp.thefacebook.com>
References: <20220819214232.18784-1-alexei.starovoitov@gmail.com> <20220819214232.18784-14-alexei.starovoitov@gmail.com>

On Sat, Aug 20, 2022 at 12:21:46AM +0200, Kumar Kartikeya Dwivedi wrote:
> On Fri, 19 Aug 2022 at 23:43, Alexei Starovoitov wrote:
> >
> > From: Alexei Starovoitov
> >
> > Use call_rcu_tasks_trace() to wait for sleepable progs to finish.
> > Then use call_rcu() to wait for normal progs to finish
> > and finally do free_one() on each element when freeing objects
> > into global memory pool.
> >
> > Signed-off-by: Alexei Starovoitov
> > ---
>
> I fear this can make OOM issues very easy to run into, because one
> sleepable prog that sleeps for a long period of time can hold the
> freeing of elements from another sleepable prog which either does not
> sleep often or sleeps for a very short period of time, and has a high
> update frequency. I'm mostly worried that unrelated sleepable programs
> not even using the same map will begin to affect each other.

'sleep for long time'? 'Sleepable' doesn't mean that a bpf prog can
simply sleep. Sleepable progs can copy_from_user, but they're not
allowed to waste time.
I don't share the OOM concerns at all.
max_entries and memcg limits are still there and enforced.
A dynamic map is strictly better and more memory efficient than full
prealloc.

> Have you considered other options? E.g. we could directly expose
> bpf_rcu_read_lock/bpf_rcu_read_unlock to the program and enforce that
> access to RCU protected map lookups only happens in such read
> sections, and unlock invalidates all RCU protected pointers? Sleepable
> helpers can then not be invoked inside the BPF RCU read section. The
> program uses RCU read section while accessing such maps, and sleeps
> after doing bpf_rcu_read_unlock. They can be kfuncs.

Yes. We can add an explicit bpf_rcu_read_lock and teach the verifier
about RCU CS, but I don't see the value specifically for sleepable progs.
Current sleepable progs can do map lookup without extra kfuncs.
An explicit CS would force progs to be rewritten, which is not great.

> It might also be useful in general, to access RCU protected data from
> sleepable programs (i.e. make some sections of the program RCU
> protected and non-sleepable at runtime). It will allow use of elements

For other cases, sure. We can introduce RCU protected objects
and an explicit bpf_rcu_read_lock.

> from dynamically allocated maps with bpf_mem_alloc while not having to
> wait for RCU tasks trace grace period, which can extend into minutes
> (or even longer if unlucky).

A sleepable bpf prog that lasts minutes? In what kind of situation?
We don't have a bpf_sleep() helper and are not going to add one any time soon.
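
For readers following the thread, the freeing order in the quoted patch description (wait for a tasks trace grace period, then a normal RCU grace period, then free_one()) can be modelled in plain userspace C. This is only a sketch under stated assumptions, not the patch itself: gp_queue, gp_defer(), gp_end(), elem_free_deferred() and free_one_model() are hypothetical names invented for illustration, and a grace period is simulated by an explicit gp_end() call rather than by call_rcu_tasks_trace() or call_rcu().

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy userspace model, NOT kernel code. A "grace period" is a list of
 * callbacks that run only when the caller declares the period over. */
struct cb {
	struct cb *next;
	void (*func)(struct cb *);
};

struct gp_queue {
	struct cb *head;
};

static struct gp_queue tasks_trace_gp;	/* stands in for RCU tasks trace */
static struct gp_queue normal_gp;	/* stands in for normal RCU */
static int freed_count;			/* number of elements actually freed */

/* Queue func to run on cb once the grace period modelled by q ends. */
static void gp_defer(struct gp_queue *q, struct cb *cb,
		     void (*func)(struct cb *))
{
	cb->func = func;
	cb->next = q->head;
	q->head = cb;
}

/* Model the end of one grace period: run everything queued so far. */
static void gp_end(struct gp_queue *q)
{
	struct cb *list = q->head;

	q->head = NULL;
	while (list) {
		/* Save next first: the callback may requeue or free list. */
		struct cb *next = list->next;

		list->func(list);
		list = next;
	}
}

struct elem {
	struct cb cb;	/* first member, so a cb pointer converts back */
	int payload;
};

/* Final stage: the element really goes back to the allocator. */
static void free_one_model(struct cb *cb)
{
	free((struct elem *)cb);
	freed_count++;
}

/* Middle stage: sleepable readers are done, now wait for normal ones. */
static void after_tasks_trace(struct cb *cb)
{
	gp_defer(&normal_gp, cb, free_one_model);
}

/* Entry point: start the two-stage wait for an element being freed. */
static void elem_free_deferred(struct elem *e)
{
	gp_defer(&tasks_trace_gp, &e->cb, after_tasks_trace);
}
```

In this model an element handed to elem_free_deferred() stays allocated until gp_end(&tasks_trace_gp) and then gp_end(&normal_gp) have both run, in that order. The disagreement in the thread is about how long the first of those two waits can last in practice.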