References: <20210115183543.15097-1-vbabka@suse.cz>
In-Reply-To: <20210115183543.15097-1-vbabka@suse.cz>
From: Jann Horn
Date: Fri, 15 Jan 2021 21:16:01 +0100
Subject: Re: [PATCH] mm, slub: splice cpu and page freelists in deactivate_slab()
To: Vlastimil Babka
Cc: Linux-MM, kernel list, Christoph Lameter, Pekka Enberg, David Rientjes, Joonsoo Kim, Andrew Morton

On Fri, Jan 15, 2021 at 7:35 PM Vlastimil Babka wrote:
> In deactivate_slab() we currently move all but one objects on the cpu freelist
> to the page freelist one by one using the costly cmpxchg_double() operation.
> Then we unfreeze the page while moving the last object on page freelist, with
> a final cmpxchg_double().
>
> This can be optimized to avoid the cmpxchg_double() per object. Just count the
> objects on cpu freelist (to adjust page->inuse properly) and also remember the
> last object in the chain. Then splice page->freelist to the last object and
> effectively add the whole cpu freelist to page->freelist while unfreezing the
> page, with a single cmpxchg_double().

This might have some more (good) effects, although these might well be
too minuscule to notice:

 - The old code inverted the direction of the freelist, while the new
   code preserves the direction.
 - We're no longer dirtying the cachelines of objects in the middle of
   the freelist.

In the current code it probably doesn't really matter, since I think
we basically only take this path when handling NUMA mismatches,
PFMEMALLOC stuff, racing new_slab(), and flush_slab() for handling
flushing IPIs? But yeah, if we want to start automatically sending
flush IPIs, it might be a good idea, given that the next accesses to
the page will probably come from a different CPU (unless the page is
entirely unused, in which case it may be freed to the page allocator's
percpu list) and we don't want to create unnecessary cache/memory
traffic.
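To illustrate the two effects above, here is a minimal userspace sketch.
It is not the kernel code: struct obj, move_one_by_one() and
splice_freelists() are made-up names for illustration, and the
cmpxchg_double() retry loop, the page->inuse update and freelist pointer
hardening are all left out. It only models how the two list operations
differ in ordering and in which objects get written.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a free object; the link lives in the object. */
struct obj {
	struct obj *next;
};

/*
 * Simplified model of the old approach: pop each object off the cpu list
 * and push it onto the head of the page list. Every moved object is
 * written, and the moved objects end up in reverse order.
 */
static struct obj *move_one_by_one(struct obj *cpu, struct obj *page)
{
	while (cpu) {
		struct obj *next = cpu->next;

		cpu->next = page;	/* write to every object */
		page = cpu;
		cpu = next;
	}
	return page;
}

/*
 * Simplified model of the new approach: walk the cpu list once to count
 * it and find its tail, then link the tail to the old page list. Only
 * the tail object is written, and the original order is preserved.
 */
static struct obj *splice_freelists(struct obj *cpu, struct obj *page,
				    int *nr_moved)
{
	struct obj *tail = NULL;
	int n = 0;

	assert(nr_moved != NULL);

	for (struct obj *p = cpu; p; p = p->next) {
		tail = p;
		n++;
	}
	*nr_moved = n;

	if (!tail)
		return page;	/* empty cpu list: nothing to splice */

	tail->next = page;	/* the only write to a list object */
	return cpu;
}
```

In the sketch the walk in splice_freelists() only reads the objects, so
the ones in the middle of the list are never written; the count it
returns corresponds to the amount by which page->inuse would have to be
adjusted.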
(And it's a good cleanup regardless, I think.)

> Signed-off-by: Vlastimil Babka

Reviewed-by: Jann Horn

[...]
>         /*
> -        * Stage two: Ensure that the page is unfrozen while the
> -        * list presence reflects the actual number of objects
> -        * during unfreeze.
> +        * Stage two: Unfreeze the page while splicing the per-cpu
> +        * freelist to the head of page's freelist.
> +        *
> +        * Ensure that the page is unfrozen while the list presence
> +        * reflects the actual number of objects during unfreeze.

(my computer complains about trailing whitespace here)

>          *
>          * We setup the list membership and then perform a cmpxchg
>          * with the count. If there is a mismatch then the page