Message-ID: <48ae741e-98aa-49d9-b677-6c4f8fd1bcb0@efficios.com>
Date: Wed, 25 Sep 2024 08:35:04 +0200
Subject: Re: [RFC PATCH 1/1] hpref: Hazard Pointers with Reference Counter
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Jonas Oberhauser, Boqun Feng, "Paul E. McKenney"
Cc: linux-kernel@vger.kernel.org, Will Deacon, Peter Zijlstra, Alan Stern,
 John Stultz, Linus Torvalds, Frederic Weisbecker, Joel Fernandes,
 Josh Triplett, Uladzislau Rezki, Steven Rostedt, Lai Jiangshan, Zqiang,
 Ingo Molnar, Waiman Long, Mark Rutland, Thomas Gleixner, Vlastimil Babka,
 maged.michael@gmail.com, Mateusz Guzik, rcu@vger.kernel.org,
 linux-mm@kvack.org, lkmm@lists.linux.dev
References: <20240921164210.256278-1-mathieu.desnoyers@efficios.com>

On 2024-09-25 07:57, Jonas Oberhauser wrote:
> Hi Mathieu,
> 
> interesting idea. Conceptually it looks good.
> 
> There's another approach that uses hazard pointers to optimize shared
> reference counting (making it lock-free in common cases):
> 
> https://github.com/cmuparlay/concurrent_deferred_rc
> 
> It doesn't go as far as what you're doing, but they also use the hazard
> pointer to protect the refcount (as you do in the promote function).

Thanks for the reference. Indeed, one use case of this HP+refcount
scheme would be to implement something resembling C++ shared pointers.
I'm glad to see I'm not alone in seeing potential in this approach.
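
To make the intended usage concrete, here is a minimal reader-side
sketch against the proposed API. Only hpref_hp_get(),
hpref_ctx_pointer() and hpref_put() come from the patch; struct
my_data, gp_head and process() are hypothetical scaffolding for the
example:

#include <urcu/hpref.h>
#include <urcu/compiler.h>	/* caa_container_of() */

struct my_data {			/* hypothetical user type */
	struct hpref_node node;		/* embedded hpref node */
	int payload;
};

static struct hpref_node *gp_head;	/* published via hpref_set_pointer() */

extern void process(struct my_data *data);	/* hypothetical consumer */

static void reader(void)
{
	struct hpref_ctx ctx;

	if (!hpref_hp_get(&gp_head, &ctx))
		return;	/* no object currently published */
	/*
	 * The object is now protected, either by a hazard pointer
	 * slot (fast path) or by a promoted reference count
	 * (emergency-slot fall-back); the caller need not care which.
	 */
	process(caa_container_of(hpref_ctx_pointer(&ctx),
				 struct my_data, node));
	hpref_put(&ctx);	/* release the HP slot or put the reference */
}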
> I haven't read your code in detail but it seems to me you have an ABA
> bug: as I explained elsewhere, you could read the same pointer after ABA
> but you don't synchronize with the newer store that gave you node2,
> leaving you to speculatively read stale values through *ctx->hp.
> (I am assuming here that ctx->hp is essentially an out parameter used to
> let the caller know which node got protected.)

You are of course absolutely right. The following change should fix it:

         cmm_barrier();
-        node2 = uatomic_load(node_p, CMM_RELAXED);     /* Load A */
+        node2 = rcu_dereference(*node_p);              /* Load A */

Please let me know if I'm missing anything.

Thanks,

Mathieu
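
To spell out why the fix works, here is a simplified sketch of the
reader fast path with rcu_dereference() in place. The function name is
made up, the hazard pointer publication is shown as a plain store (the
actual code uses an rseq critical section), and the emergency-slot
fall-back is elided:

static inline
bool hpref_hp_get_sketch(struct hpref_node **node_p, struct hpref_ctx *ctx,
			 struct hpref_slot *slot)
{
	struct hpref_node *node, *node2;

retry:
	node = uatomic_load(node_p, CMM_RELAXED);
	if (!node)
		return false;
	/* Publish the hazard pointer. Store B. */
	uatomic_store(&slot->node, node, CMM_RELAXED);
	/* Memory ordering: Store B before Load A. */
	cmm_smp_mb();
	/*
	 * Load A. rcu_dereference() carries an address dependency to
	 * the later reads through ctx->hp, so even when the pointer
	 * value is unchanged after an ABA, those dereferences are
	 * ordered after this load and observe the contents published
	 * along with node2. A relaxed load provides no such guarantee
	 * and would allow stale speculative reads.
	 */
	node2 = rcu_dereference(*node_p);
	if (node != node2) {
		uatomic_store(&slot->node, NULL, CMM_RELAXED);
		if (!node2)
			return false;
		goto retry;
	}
	ctx->type = HPREF_TYPE_HP;
	ctx->hp = node;
	ctx->slot = slot;
	return true;
}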
> Have fun,
>   jonas
> 
>> +/*
>> + * hpref_hp_get: Obtain a reference to a stable object, protected either
>> + *               by hazard pointer (fast-path) or using reference
>> + *               counter as fall-back.
>> + */
>> +static inline
>> +bool hpref_hp_get(struct hpref_node **node_p, struct hpref_ctx *ctx)
>> +{
>> +    int cpu = rseq_current_cpu_raw();
>> +    struct hpref_percpu_slots *cpu_slots = rseq_percpu_ptr(hpref_percpu_slots, cpu);
>> +    struct hpref_slot *slot = &cpu_slots->slots[cpu_slots->current_slot];
>> +    bool use_refcount = false;
>> +    struct hpref_node *node, *node2;
>> +    unsigned int next_slot;
>> +
>> +retry:
>> +    node = uatomic_load(node_p, CMM_RELAXED);
>> +    if (!node)
>> +        return false;
>> +    /* Use rseq to try setting current slot hp. Store B. */
>> +    if (rseq_load_cbne_store__ptr(RSEQ_MO_RELAXED, RSEQ_PERCPU_CPU_ID,
>> +                (intptr_t *) &slot->node, (intptr_t) NULL,
>> +                (intptr_t) node, cpu)) {
>> +        slot = &cpu_slots->slots[HPREF_EMERGENCY_SLOT];
>> +        use_refcount = true;
>> +        /*
>> +         * This may busy-wait for another reader using the
>> +         * emergency slot to transition to refcount.
>> +         */
>> +        caa_cpu_relax();
>> +        goto retry;
>> +    }
>> +    /* Memory ordering: Store B before Load A. */
>> +    cmm_smp_mb();
>> +    node2 = uatomic_load(node_p, CMM_RELAXED);    /* Load A */
>> +    if (node != node2) {
>> +        uatomic_store(&slot->node, NULL, CMM_RELAXED);
>> +        if (!node2)
>> +            return false;
>> +        goto retry;
>> +    }
>> +    ctx->type = HPREF_TYPE_HP;
>> +    ctx->hp = node;
>                    ^---- here
> 
>> +    ctx->slot = slot;
>> +    if (use_refcount) {
>> +        hpref_promote_hp_to_ref(ctx);
>> +        return true;
>> +    }
>> +    /*
>> +     * Increment current slot (racy increment is OK because it is
>> +     * just a position hint). Skip the emergency slot.
>> +     */
>> +    next_slot = uatomic_load(&cpu_slots->current_slot, CMM_RELAXED) + 1;
>> +    if (next_slot >= HPREF_EMERGENCY_SLOT)
>> +        next_slot = 0;
>> +    uatomic_store(&cpu_slots->current_slot, next_slot, CMM_RELAXED);
>> +    return true;
>> +}
>> +
>> +static inline
>> +void hpref_put(struct hpref_ctx *ctx)
>> +{
>> +    if (ctx->type == HPREF_TYPE_REF) {
>> +        urcu_ref_put(&ctx->hp->refcount, hpref_release);
>> +    } else {
>> +        /* Release HP. */
>> +        uatomic_store(&ctx->slot->node, NULL, CMM_RELEASE);
>> +    }
>> +    ctx->hp = NULL;
>> +}
>> +
>> +/*
>> + * hpref_set_pointer: Store pointer @node to @ptr, with RCU publication
>> + *                    guarantees.
>> + */
>> +static inline
>> +void hpref_set_pointer(struct hpref_node **ptr, struct hpref_node *node)
>> +{
>> +    if (__builtin_constant_p(node) && node == NULL)
>> +        uatomic_store(ptr, NULL, CMM_RELAXED);
>> +    else
>> +        uatomic_store(ptr, node, CMM_RELEASE);
>> +}
>> +
>> +/*
>> + * Return the content of the hpref context hazard pointer field.
>> + */
>> +static inline
>> +struct hpref_node *hpref_ctx_pointer(struct hpref_ctx *ctx)
>> +{
>> +    return ctx->hp;
>> +}
>> +
>> +#ifdef __cplusplus
>> +}
>> +#endif
>> +
>> +#endif  /* _URCU_HPREF_H */
>> diff --git a/src/Makefile.am b/src/Makefile.am
>> index b555c818..7312c9f7 100644
>> --- a/src/Makefile.am
>> +++ b/src/Makefile.am
>> @@ -19,7 +19,8 @@ RCULFHASH = rculfhash.c rculfhash-mm-order.c rculfhash-mm-chunk.c \
>>   lib_LTLIBRARIES = liburcu-common.la \
>>           liburcu.la liburcu-qsbr.la \
>>           liburcu-mb.la liburcu-bp.la \
>> -        liburcu-memb.la liburcu-cds.la
>> +        liburcu-memb.la liburcu-cds.la \
>> +        liburcu-hpref.la
>> 
>>   #
>>   # liburcu-common contains wait-free queues (needed by call_rcu) as well
>> @@ -50,6 +51,9 @@ liburcu_cds_la_SOURCES = rculfqueue.c rculfstack.c lfstack.c \
>>       workqueue.c workqueue.h $(RCULFHASH) $(COMPAT)
>>   liburcu_cds_la_LIBADD = liburcu-common.la
>> 
>> +liburcu_hpref_la_SOURCES = hpref.c
>> +liburcu_hpref_la_LIBADD = -lrseq
>> +
>>   pkgconfigdir = $(libdir)/pkgconfig
>>   pkgconfig_DATA = liburcu-cds.pc liburcu.pc liburcu-bp.pc liburcu-qsbr.pc \
>>       liburcu-mb.pc liburcu-memb.pc
>> diff --git a/src/hpref.c b/src/hpref.c
>> new file mode 100644
>> index 00000000..f63530f5
>> --- /dev/null
>> +++ b/src/hpref.c
>> @@ -0,0 +1,78 @@
>> +// SPDX-FileCopyrightText: 2024 Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
>> +//
>> +// SPDX-License-Identifier: LGPL-2.1-or-later
>> +
>> +/*
>> + * HPREF: Hazard pointers with reference counters
>> + */
>> +
>> +#define _LGPL_SOURCE
>> +#include <urcu/hpref.h>
>> +#include <rseq/mempool.h>    /* Per-CPU memory */
>> +
>> +static struct rseq_mempool *mempool;
>> +struct hpref_percpu_slots *hpref_percpu_slots;
>> +
>> +void hpref_release(struct urcu_ref *ref)
>> +{
>> +    struct hpref_node *node = caa_container_of(ref, struct hpref_node, refcount);
>> +
>> +    node->release(node);
>> +}
>> +
>> +/*
>> + * hpref_synchronize: Wait for any reader possessing a hazard pointer to
>> + *                    @node to clear its hazard pointer slot.
>> + */
>> +void hpref_synchronize(struct hpref_node *node)
>> +{
>> +    int nr_cpus = rseq_get_max_nr_cpus(), cpu;
>> +
>> +    /* Memory ordering: Store A before Load B. */
>> +    cmm_smp_mb();
>> +    /* Scan all CPUs slots. */
>> +    for (cpu = 0; cpu < nr_cpus; cpu++) {
>> +        struct hpref_percpu_slots *cpu_slots = rseq_percpu_ptr(hpref_percpu_slots, cpu);
>> +        struct hpref_slot *slot;
>> +        unsigned int i;
>> +
>> +        for (i = 0; i < HPREF_NR_PERCPU_SLOTS; i++) {
>> +            slot = &cpu_slots->slots[i];
>> +            /* Busy-wait if node is found. */
>> +            while (uatomic_load(&slot->node, CMM_ACQUIRE) == node)    /* Load B */
>> +                caa_cpu_relax();
>> +        }
>> +    }
>> +}
>> +
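
As a side note, the intended update-side pairing with
hpref_synchronize_put() (quoted just below) would look roughly like
this; a sketch only, assuming a single updater and a hypothetical
published pointer gp_head:

static struct hpref_node *gp_head;	/* hypothetical published pointer */

static void update(struct hpref_node *new_node)
{
	struct hpref_node *old = gp_head;	/* plain load: single updater assumed */

	hpref_set_pointer(&gp_head, new_node);	/* store-release publication */
	/*
	 * Wait until no reader hazard pointer slot still protects
	 * "old", then drop the initial reference; the node's release
	 * callback runs once any promoted refcounts are dropped too.
	 */
	hpref_synchronize_put(old);
}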
>> +/*
>> + * hpref_synchronize_put: Wait for any reader possessing a hazard
>> + *                        pointer to clear its slot and put reference
>> + *                        count.
>> + */
>> +void hpref_synchronize_put(struct hpref_node *node)
>> +{
>> +    if (!node)
>> +        return;
>> +    hpref_synchronize(node);
>> +    urcu_ref_put(&node->refcount, hpref_release);
>> +}
>> +
>> +static __attribute__((constructor))
>> +void hpref_init(void)
>> +{
>> +    mempool = rseq_mempool_create("hpref", sizeof(struct hpref_percpu_slots), NULL);
>> +    if (!mempool)
>> +        abort();
>> +    hpref_percpu_slots = rseq_mempool_percpu_zmalloc(mempool);
>> +    if (!hpref_percpu_slots)
>> +        abort();
>> +}
>> +
>> +static __attribute__((destructor))
>> +void hpref_exit(void)
>> +{
>> +    rseq_mempool_percpu_free(hpref_percpu_slots);
>> +    if (rseq_mempool_destroy(mempool))
>> +        abort();
>> +}

-- 
Mathieu Desnoyers
EfficiOS Inc.
https://www.efficios.com