From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <7fa57517-cf32-79b9-405d-251997d25414@suse.cz>
Date: Fri, 18 Aug 2023 17:20:16 +0200
Subject: Re: [PATCH net] net: use SLAB_NO_MERGE for kmem_cache skbuff_head_cache
From: Vlastimil Babka
To: Jesper Dangaard Brouer, Matthew Wilcox, Jesper Dangaard Brouer
Cc: brouer@redhat.com, netdev@vger.kernel.org, Eric Dumazet, "David S. Miller", Jakub Kicinski, Paolo Abeni, linux-mm@kvack.org, Andrew Morton, Mel Gorman, Christoph Lameter, roman.gushchin@linux.dev, dsterba@suse.com
References: <169211265663.1491038.8580163757548985946.stgit@firesoul> <0f77001b-8bd3-f72e-7837-cc0d3485aaf8@redhat.com>
In-Reply-To: <0f77001b-8bd3-f72e-7837-cc0d3485aaf8@redhat.com>

On 8/18/23 14:32, Jesper Dangaard Brouer wrote:
>
>
> On 15/08/2023 17.53, Matthew Wilcox wrote:
>> On Tue, Aug 15, 2023 at 05:17:36PM +0200, Jesper Dangaard Brouer wrote:
>>> For the bulk API to perform efficiently the slub fragmentation need to
>>> be low. Especially for the SLUB allocator, the efficiency of bulk free
>>> API depend on objects belonging to the same slab (page).
>>
>> Hey Jesper,
>>
>> You probably haven't seen this patch series from Vlastimil:
>>
>> https://lore.kernel.org/linux-mm/20230810163627.6206-9-vbabka@suse.cz/
>>
>> I wonder if you'd like to give it a try?  It should provide some immunity
>> to this problem, and might even be faster than the current approach.
>> If it isn't, it'd be good to understand why, and if it could be improved.

I didn't Cc Jesper on that yet, as the initial attempt was focused on
the maple tree nodes use case. But you'll notice that using the percpu
array requires the cache to be created with SLAB_NO_MERGE anyway, so
this patch would still be necessary :)

> I took a quick look at:
>  - https://lore.kernel.org/linux-mm/20230810163627.6206-11-vbabka@suse.cz/#Z31mm:slub.c
>
> To Vlastimil, sorry but I don't think this approach with spin_lock will
> be faster than SLUB's normal fast-path using this_cpu_cmpxchg.
>
> My experience is that SLUB this_cpu_cmpxchg trick is faster than spin_lock.
>
> On my testlab CPU E5-1650 v4 @ 3.60GHz:
>  - spin_lock+unlock : 34 cycles(tsc) 9.485 ns
>  - this_cpu_cmpxchg :  5 cycles(tsc) 1.585 ns
>  - locked cmpxchg   : 18 cycles(tsc) 5.006 ns

Hm, that's an unexpected difference between spin_lock+unlock and locked
cmpxchg, as AFAIK spin_lock is basically a locked cmpxchg and unlock is
a simple write. I assume these measurements are on an uncontended lock?

> SLUB does use a cmpxchg_double which I don't have a microbench for.

Yeah, it's possible the _double will be slower. The locking will have
to be considered more thoroughly for the percpu array.

>> No objection to this patch going in for now, of course.
>>
>
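
To make the "spin_lock is basically a locked cmpxchg and unlock a simple
write" remark concrete, here is a minimal userspace sketch using C11
atomics. The names toy_spinlock, toy_lock, toy_unlock and toy_count are
hypothetical and exist only for illustration. This is not the kernel's
qspinlock code and it says nothing about the measured cycle counts; it
only shows the shape of the uncontended path.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical toy lock for illustration; not the kernel's qspinlock. */
struct toy_spinlock {
	atomic_int locked;	/* 0 = free, 1 = held */
};

/* Acquire: a compare-and-swap 0 -> 1, i.e. a locked cmpxchg on x86. */
static void toy_lock(struct toy_spinlock *l)
{
	int expected = 0;

	while (!atomic_compare_exchange_weak_explicit(&l->locked, &expected, 1,
						      memory_order_acquire,
						      memory_order_relaxed))
		expected = 0;	/* a failed CAS stored the observed value here */
}

/* Release: a plain store with release ordering, no locked instruction on x86. */
static void toy_unlock(struct toy_spinlock *l)
{
	atomic_store_explicit(&l->locked, 0, memory_order_release);
}

/* Take and drop the uncontended lock n times; returns the protected counter. */
static long toy_count(long n)
{
	struct toy_spinlock l = { 0 };
	long counter = 0;

	for (long i = 0; i < n; i++) {
		toy_lock(&l);
		counter++;
		toy_unlock(&l);
	}
	return counter;
}
```

Wrapping the loop in toy_count() with a cycle counter would give an
uncontended lock+unlock cost for this toy lock, which need not match the
kernel numbers quoted above.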