Date: Fri, 19 Aug 2022 10:51:55 -0700
From: Alexei Starovoitov <alexei.starovoitov@gmail.com>
To: Kumar Kartikeya Dwivedi
Cc: davem@davemloft.net, daniel@iogearbox.net, andrii@kernel.org, tj@kernel.org, delyank@fb.com, linux-mm@kvack.org, bpf@vger.kernel.org, kernel-team@fb.com
Subject: Re: [PATCH v2 bpf-next 01/12] bpf: Introduce any context BPF specific memory allocator.
Message-ID: <20220819175155.deyd62m6tscv63td@MacBook-Pro-3.local>
References: <20220817210419.95560-1-alexei.starovoitov@gmail.com> <20220817210419.95560-2-alexei.starovoitov@gmail.com> <20220818003957.t5lcp636n7we37hk@MacBook-Pro-3.local> <20220818223051.ti3gt7po72c5bqjh@MacBook-Pro-3.local.dhcp.thefacebook.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
On Fri, Aug 19, 2022 at 04:31:11PM +0200, Kumar Kartikeya Dwivedi wrote:
> On Fri, 19 Aug 2022 at 00:30, Alexei Starovoitov wrote:
> >
> > Right. We cannot fail in unit_free().
> > With a per-cpu counter both unit_alloc() and free_bulk_nmi() would
> > potentially fail in such an unlikely scenario.
> > Not a big deal for free_bulk_nmi(). It would pick up the element later.
> > For unit_alloc() returning NULL is normal.
> > Especially since it's so unlikely for an nmi to hit right in the middle
> > of llist_del_first().
> >
> > Since we'll add this per-cpu counter to solve interrupted llist_del_first()
> > it feels that the same counter can be used to protect unit_alloc/free/irq_work.
> > Then we don't need free_llist_nmi. A single free_llist would be fine,
> > but unit_free() should not fail. If free_llist cannot be accessed
> > due to the per-cpu counter being busy we have to push somewhere.
> > So it seems two lists are necessary. Maybe it's still better?
> > Roughly I'm thinking of the following:
> >
> > unit_alloc()
> > {
> >         llnode = NULL;
> >         local_irq_save();
> >         if (__this_cpu_inc_return(c->alloc_active) != 1)
> >                 goto out;
> >         llnode = __llist_del_first(&c->free_llist);
> >         if (llnode)
> >                 cnt = --c->free_cnt;
> > out:
> >         __this_cpu_dec(c->alloc_active);
> >         local_irq_restore();
> >         return llnode;
> > }
> >
> > unit_free()
> > {
> >         local_irq_save();
> >         if (__this_cpu_inc_return(c->alloc_active) != 1) {
> >                 llist_add(llnode, &c->free_llist_nmi);
> >                 goto out;
> >         }
> >         __llist_add(llnode, &c->free_llist);
> >         cnt = ++c->free_cnt;
> > out:
> >         __this_cpu_dec(c->alloc_active);
> >         local_irq_restore();
> > }
> >
> > alloc_bulk, free_bulk would be protected by alloc_active as well.
> > alloc_bulk_nmi is gone.
> > free_bulk_nmi is still there to drain an unlucky unit_free,
> > but it's now alone in doing llist_del_first() and it just frees anything
> > that is in free_llist_nmi.
> > The main advantage is that free_llist_nmi doesn't need to be prefilled.
> > It will be empty most of the time.
> > wdyt?
>
> Looks great! The other option would be to not have the overflow
> free_llist_nmi list and just allow llist_add to free_llist from the
> NMI case even if we interrupt llist_del_first, but then the non-NMI
> user needs to use the atomic llist_add version as well (since we may
> interrupt it),

Not only llist_add, but unit_alloc would have to use the atomic
llist_del_first too. So every operation on the list would have to be
done with cmpxchg.

> which won't be great for performance.

Exactly.

> So having the extra list is much better.

Yep. Same thinking.
I'll refactor the patches and send v3 with this approach.
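For readers following the thread, the scheme above can be modeled in plain user-space C. This is a hypothetical single-threaded sketch, not the kernel code: there is no real per-cpu state, irq disabling, or NMI here; `alloc_active`, `free_llist`, and `free_llist_nmi` mirror the names in the pseudocode, and the singly linked lists stand in for the kernel's llist primitives.

```c
#include <stddef.h>

/* Model of the discussed design: a "busy" counter guards the main
 * free list; a free that (conceptually) interrupts the list owner
 * pushes to an overflow list, which free_bulk_nmi would drain later. */

struct node { struct node *next; };

struct cache {
	int alloc_active;              /* models the per-cpu counter */
	struct node *free_llist;       /* main free list */
	struct node *free_llist_nmi;   /* overflow list for reentrant frees */
	int free_cnt;
};

static struct node *unit_alloc(struct cache *c)
{
	struct node *llnode = NULL;

	/* local_irq_save() would go here in the kernel */
	if (++c->alloc_active == 1) {
		/* sole owner: safe to pop from the main list */
		llnode = c->free_llist;
		if (llnode) {
			c->free_llist = llnode->next;
			c->free_cnt--;
		}
	}
	/* else: counter busy, return NULL (acceptable for alloc) */
	c->alloc_active--;
	return llnode;
}

static void unit_free(struct cache *c, struct node *llnode)
{
	if (++c->alloc_active != 1) {
		/* we interrupted the list owner: must not touch
		 * free_llist, so push to the overflow list instead */
		llnode->next = c->free_llist_nmi;
		c->free_llist_nmi = llnode;
	} else {
		llnode->next = c->free_llist;
		c->free_llist = llnode;
		c->free_cnt++;
	}
	c->alloc_active--;
}
```

The key property the sketch demonstrates is that free never fails: when the counter is busy the element lands on the overflow list rather than being dropped, while alloc is allowed to return NULL in the same situation.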