From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.8 required=3.0 tests=DKIM_SIGNED,DKIM_VALID, DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI,SIGNED_OFF_BY, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 51328C10DCE for ; Wed, 18 Mar 2020 15:26:20 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id F3DDD2076D for ; Wed, 18 Mar 2020 15:26:19 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="iDy1plYE" DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org F3DDD2076D Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 915C86B0078; Wed, 18 Mar 2020 11:26:19 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 8C5666B007D; Wed, 18 Mar 2020 11:26:19 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7B4826B007E; Wed, 18 Mar 2020 11:26:19 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0114.hostedemail.com [216.40.44.114]) by kanga.kvack.org (Postfix) with ESMTP id 633C76B0078 for ; Wed, 18 Mar 2020 11:26:19 -0400 (EDT) Received: from smtpin24.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 2A66EF96 for ; Wed, 18 Mar 2020 15:26:19 +0000 (UTC) X-FDA: 76608859278.24.dogs18_1a0647c41ea31 X-HE-Tag: dogs18_1a0647c41ea31 X-Filterd-Recvd-Size: 7344 Received: from mail-io1-f65.google.com (mail-io1-f65.google.com [209.85.166.65]) by imf41.hostedemail.com (Postfix) with ESMTP for ; Wed, 18 Mar 2020 15:26:18 +0000 (UTC) Received: by mail-io1-f65.google.com with SMTP id v3so25303855iom.13 for ; Wed, 18 Mar 2020 08:26:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=h30qXLlqp9/WnBdqhsneMWkECxHjTJItk58Wshn9ME4=; b=iDy1plYEEB3TZh/nofEBowyZLbMP1kbRaC7/nPXyuboD8nKqjm4ZEmnini+x21yTgE SHHsRJwnjL/gq86opSBwuvSAy2yvpEDU32e6Hi+/knM5y5aLW2a3ysM1ECUwGJtuapEr 2bJ16Wxee1xp9R2cXpRSZKGsVLRg5U3QzWjT09tDM4DHx0Y0GS3aKJhCkui16b/lqjEY UONKpFzyQFXW+FcWd4JVDDGQS51Hi9IAKfz2j7/RJq65pndFpzHxFIciNTwYAcUA4uNJ 42KYEexBASN9Co9Vrjogx+fqCXh8vxnVePhzs11/OVmrxO0eyqWcPqiIzrOx3mJ8mraG HwFQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=h30qXLlqp9/WnBdqhsneMWkECxHjTJItk58Wshn9ME4=; b=rzrWaS0VmZebhHr2G+HymjEcOsA9mdqZ9jtF0nx0rvoU5ycceTnOu0eOyd4nWQdsrb 4wUU2Rinq72DEnH6NCiIo43VIyFrsvL0GQKOAXlHUUA17II623pMyb6VnUHEj5GJEaoG XNgJ8jJvIyFzvL79luztvizond4VV0qV5Qn5o3ylq5pQ43Q3TR/IyxWpASVO+7Eep9w9 jbph0CV300rvNPwKhz16/WSrzc79PhLI8Xc7VLxNjEq2yrRnNEDorc/DGtdbkDpHPtWS gb2zKiCWEh/gQpYsN+T8gSgvT179uwV9qdtucVpcVxYdH69TzT5LEZbsFV9/AKQByrSr Y3Uw== X-Gm-Message-State: ANhLgQ2jT0Tcu0OOh9/jsYVYZ+HXWQFVeKZKOS09aQ6hM/HTIi5V+HiK zbDlR6a2LDda66fbM6M+q7ektWVnKfTJeTuUnYM= X-Google-Smtp-Source: ADFU+vtX+iwXOjRonZi/n7mYJClLvx6RLkPjzWRDkMTW72s8HWqpTZoTi7m7iul52lCB4bw+n4h5FLJuGC0q5a1oAt0= X-Received: by 2002:a02:6016:: with SMTP id i22mr1575546jac.87.1584545177782; Wed, 18 Mar 2020 08:26:17 -0700 (PDT) MIME-Version: 1.0 References: <20200317135035.GA19442@SDF.ORG> <202003171435.41F7F0DF9@keescook> <20200317230612.GB19442@SDF.ORG> <202003171619.23210A7E0@keescook> <20200318014410.GA2281@SDF.ORG> In-Reply-To: <20200318014410.GA2281@SDF.ORG> From: Alexander Duyck Date: Wed, 18 Mar 2020 08:26:06 -0700 Message-ID: Subject: Re: [PATCH v2] mm/shuffle.c: Fix races in add_to_free_area_random() To: George Spelvin Cc: Kees Cook , Dan Williams , linux-mm , Andrew Morton Content-Type: text/plain; charset="UTF-8" X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On Tue, Mar 17, 2020 at 6:44 PM George Spelvin wrote: > > The old code had separate "rand" and "rand_count" variables, > which could get out of sync with bad results. > > In the worst case, two threads would see rand_count == 1 and > both decrement it, resultint in rand_count = 255 and rand being > filled with zeros for the next 255 calls. > > Instead, pack them both into a single, atomically updatable, > variable. This makes it a lot easier to reason about race > conditions. They are still there - the code deliberately eschews > locking - but basically harmless on the rare occasions that > they happen. > > Second, use READ_ONCE and WRITE_ONCE. Without them, we are deep > in the land of nasal demons. The compiler would be free to spill > temporaries to the static variables in arbitrary perverse ways > and create hard-to-find bugs. > > (Alternatively, we could declare the static variable "volatile", > one of the few places in the Linux kernel that would be correct, > but it would probably annoy Linus.) > > Third, use long rather than u64. This not only keeps the > state atomically updatable, it also speeds up the fast path > on 32-bit machines. Saving at least three instructions on > the fast path (one load, one add-with-carry, and one store) > is worth exchanging one call to get_random_u64 for two > calls to get_random_u32. The fast path of get_random_* is > less than the 3*64 = 192 instructions saved, and the slow > path happens every 64 bytes so isn't affectrd by the change. > > I've tried a few variants. Keeping random lsbits with > a most-significant end marker, and using an explicit bool > flag rather than testing r both increase code size slightly. > > x86_64 i386 > This code 94 95 > Explicit bool 103 99 > Lsbits 99 101 > Both 96 100 > > Signed-off-by: George Spelvin > Cc: Dan Williams > Cc: Kees Cook > Cc: Andrew Morton > Cc: linux-mm@kvack.org > --- > mm/shuffle.c | 26 ++++++++++++++++---------- > 1 file changed, 16 insertions(+), 10 deletions(-) > > diff --git a/mm/shuffle.c b/mm/shuffle.c > index e0ed247f8d90..4ba3ba84764d 100644 > --- a/mm/shuffle.c > +++ b/mm/shuffle.c > @@ -186,22 +186,28 @@ void __meminit __shuffle_free_memory(pg_data_t *pgdat) > void add_to_free_area_random(struct page *page, struct free_area *area, > int migratetype) > { > - static u64 rand; > - static u8 rand_bits; > + static long rand; /* 0..BITS_PER_LONG-1 buffered random bits */ > + unsigned long r = READ_ONCE(rand), rshift = r << 1;; > > /* > - * The lack of locking is deliberate. If 2 threads race to > - * update the rand state it just adds to the entropy. > + * rand holds some random msbits, with a 1 bit appended, followed > + * by zero-padding in the lsbits. This allows us to maintain > + * the pre-generated bits and the count of bits in a single, > + * atomically updatable, variable. > + * > + * The lack of locking is deliberate. If two threads race to > + * update the rand state it just adds to the entropy. The > + * worst that can happen is a random bit is used twice, or > + * get_random_long is called redundantly. > */ > - if (rand_bits == 0) { > - rand_bits = 64; > - rand = get_random_u64(); > + if (unlikely(rshift == 0)) { > + r = get_random_long(); > + rshift = r << 1 | 1; You might want to wrap the "r << 1" in parenthesis. Also you could probably use a + 1 instead of an | 1. > } > + WRITE_ONCE(rand, rshift); > > - if (rand & 1) > + if ((long)r < 0) One trick you might be able to get away with here is to actually compare r to rshift. "If (rshift <= r)" should give you the same result. This works since what you are essentially doing is just adding r to itself so if you overflow rshift will be equal to at most r - 1. However with the addition of the single bit in the rshift == 0 case it could potentially be equal in the unlikely case of r being all 1's. > add_to_free_area(page, area, migratetype); > else > add_to_free_area_tail(page, area, migratetype); > - rand_bits--; > - rand >>= 1; > } > -- > 2.26.0.rc2 >