Message-ID: <6994f6c1-29eb-46cb-942e-c2d1e3fe9f5d@bytedance.com>
Date: Fri, 10 May 2024 17:15:58 +0800
Subject: Re: [PATCH 1/2] objpool: enable inlining objpool_push() and objpool_pop() operations
From: "wuqiang.matt"
To: Vlastimil Babka, Andrii Nakryiko, linux-trace-kernel@vger.kernel.org, rostedt@goodmis.org, mhiramat@kernel.org
Cc: bpf@vger.kernel.org, "linux-mm@kvack.org"
References: <20240424215214.3956041-1-andrii@kernel.org> <20240424215214.3956041-2-andrii@kernel.org> <0e8b7482-478e-4efc-ad5f-76d60cf02bfd@suse.cz> <93840eb4-609d-49d3-b48a-9c26bfb5b8ec@suse.cz>
In-Reply-To: <93840eb4-609d-49d3-b48a-9c26bfb5b8ec@suse.cz>
Content-Type: text/plain; charset=UTF-8; format=flowed

On 2024/5/10 16:20, Vlastimil Babka wrote:
> On 5/10/24 9:59 AM, wuqiang.matt wrote:
>> On 2024/5/7 21:55, Vlastimil Babka wrote:
>
>>
>>>> +	} while (!try_cmpxchg_acquire(&slot->tail, &tail, tail + 1));
>>>> +
>>>> +	/* now the tail position is reserved for the given obj */
>>>> +	WRITE_ONCE(slot->entries[tail & slot->mask], obj);
>>>> +	/* update sequence to make this obj available for pop() */
>>>> +	smp_store_release(&slot->last, tail + 1);
>>>> +
>>>> +	return 0;
>>>> +}
>>>>
>>>>  /**
>>>>   * objpool_push() - reclaim the object and return back to objpool
>>>> @@ -134,7 +219,19 @@ void *objpool_pop(struct objpool_head *pool);
>>>>   * return: 0 or error code (it fails only when user tries to push
>>>>   * the same object multiple times or wrong "objects" into objpool)
>>>>   */
>>>> -int objpool_push(void *obj, struct objpool_head *pool);
>>>> +static inline int objpool_push(void *obj, struct objpool_head *pool)
>>>> +{
>>>> +	unsigned long flags;
>>>> +	int rc;
>>>> +
>>>> +	/* disable local irq to avoid preemption & interruption */
>>>> +	raw_local_irq_save(flags);
>>>> +	rc = __objpool_try_add_slot(obj, pool, raw_smp_processor_id());
>>>
>>> And IIUC, we could in theory objpool_pop() on one cpu, then later another
>>> cpu might do objpool_push() and cause the latter cpu's pool to go over
>>> capacity? Is there some implicit requirements of objpool users to take care
>>> of having matched cpu for pop and push? Are the current objpool users
>>> obeying this requirement? (I can see the selftests do, not sure about the
>>> actual users).
>>> Or am I missing something? Thanks.
>>
>> The objects are all pre-allocated along with creation of the new objpool
>> and the total number of objects never exceeds the capacity on local node.
>
> Aha, I see, the capacity of entries is enough to hold objects from all nodes
> in the most unfortunate case they all end up freed from a single cpu.
>
>> So objpool_push() would always find an available slot from the ring-array
>> for the given object to insert back. objpool_pop() would try looping all
>> the percpu slots until an object is found or whole objpool is empty.
>
> So it's correct, but seems rather wasteful to have the whole capacity for
> entries replicated on every cpu? It does make objpool_push() simple and
> fast, but as you say, objpool_pop() still has to search potentially all
> non-local percpu slots, with disabled irqs, which is far from ideal.

Yes, it's a trade-off between performance and memory usage: a slight
increase in memory consumption for a significant improvement in performance.

The reason for disabling local irqs is that objpool uses a 32-bit sequence
number as the state description of each element. In extreme cases it could
overflow and wrap around to the same value. A 64-bit value would eliminate
the collision but seems too heavy.

> And the "abort if the slot was already full" comment for
> objpool_try_add_slot() seems still misleading? Maybe that was your initial
> idea but changed later?

Right, the comments were just left unchanged across iterations. The original
implementation kept each percpu ring-array very compact, and objpool_push()
would loop over all cpu nodes to return the given object to objpool.
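
For readers following the thread, the push path quoted above (reserve a
tail position, write the entry, then publish it through 'last') can be
sketched in user space roughly as below. This is only an illustration, not
the kernel code: C11 atomics stand in for try_cmpxchg_acquire(),
WRITE_ONCE() and smp_store_release(), and the names ring_slot, ring_push,
ring_pop and RING_CAP are hypothetical. The fullness check inside the loop
is an assumption, since the loop body is not visible in the quoted hunk. The
sketch also assumes that pushes to one slot never interleave, which the
quoted patch arranges by disabling local irqs around the call.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one per-CPU slot; not the kernel's layout. */
#define RING_CAP  8u                /* power of two, >= total objects in the pool */
#define RING_MASK (RING_CAP - 1u)

struct ring_slot {
	_Atomic uint32_t head;          /* next position to pop from */
	_Atomic uint32_t tail;          /* next position to reserve for a push */
	_Atomic uint32_t last;          /* positions below this are published */
	_Atomic(void *) entries[RING_CAP];
};

/* Returns 0 on success, -1 if the slot is full (assumed check, see lead-in). */
static int ring_push(struct ring_slot *s, void *obj)
{
	uint32_t tail = atomic_load_explicit(&s->tail, memory_order_relaxed);

	do {
		uint32_t head = atomic_load_explicit(&s->head, memory_order_relaxed);

		/* unsigned subtraction stays correct across 32-bit wraparound */
		if ((uint32_t)(tail - head) >= RING_CAP)
			return -1;
		/* on CAS failure 'tail' is refreshed with the current value */
	} while (!atomic_compare_exchange_weak_explicit(&s->tail, &tail, tail + 1,
							memory_order_acquire,
							memory_order_relaxed));

	/* position 'tail' is now reserved for obj */
	atomic_store_explicit(&s->entries[tail & RING_MASK], obj,
			      memory_order_relaxed);
	/* publish: the release store makes the entry visible to poppers */
	atomic_store_explicit(&s->last, tail + 1, memory_order_release);
	return 0;
}

/* Returns the oldest published object, or NULL if none is available. */
static void *ring_pop(struct ring_slot *s)
{
	uint32_t head = atomic_load_explicit(&s->head, memory_order_acquire);

	while (head != atomic_load_explicit(&s->last, memory_order_acquire)) {
		/* read the entry before giving up the position */
		void *obj = atomic_load_explicit(&s->entries[head & RING_MASK],
						 memory_order_relaxed);

		if (atomic_compare_exchange_weak_explicit(&s->head, &head, head + 1,
							  memory_order_release,
							  memory_order_relaxed))
			return obj;
		/* CAS failed: 'head' was refreshed, retry with the new value */
	}
	return NULL;
}
```

With RING_CAP at least as large as the total number of objects, the full
check never triggers for well-behaved callers, which matches the point
above that objpool_push() always finds room; the cost is that this capacity
is replicated in every per-CPU slot.
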

Actually my new update would remove objpool_try_add_slot() and integrate its
functionality into objpool_push(). I'll submit the new patch when I finish
the verification.

>
>> Currently kretprobe is the only actual usecase of objpool.
>>
>> I'm testing an updated objpool in our HIDS project for critical pathes,
>> which is widely deployed on servers inside my company. The new version
>> eliminates the raw_local_irq_save and raw_local_irq_restore pair of
>> objpool_push and gains up to 5% of performance boost.
>
> Mind Ccing me and linux-mm once you are posting that?

Sure, I'll make sure to let you know.

> Thanks,
> Vlastimil
>

Regards,
Matt Wu
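
P.S. As a footnote to the 32-bit sequence number point earlier in this
message, the arithmetic behind the collision can be shown with a small
stand-alone sketch. It is only an illustration of counter wraparound and
does not model how objpool actually uses its counters; seq32_advance,
seq64_advance and seq32_distance are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Advance a free-running 32-bit sequence number by 'steps' updates. */
static uint32_t seq32_advance(uint32_t seq, uint64_t steps)
{
	return (uint32_t)(seq + steps);     /* wraps modulo 2^32 */
}

/* Same idea with a 64-bit counter, for comparison. */
static uint64_t seq64_advance(uint64_t seq, uint64_t steps)
{
	return seq + steps;
}

/* Distance between two 32-bit counters; correct even across one wrap. */
static uint32_t seq32_distance(uint32_t head, uint32_t tail)
{
	return (uint32_t)(tail - head);
}
```

A code path that holds a snapshot of the 32-bit value and is delayed for
exactly 2^32 updates would see the same number again and could not tell that
anything had changed, while a 64-bit counter would still differ after the
same number of updates. Unsigned subtraction, on the other hand, keeps the
head-to-tail distance correct across a single wrap.
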