Date: Wed, 6 Nov 2019 20:11:49 +0100
From: Marco Elver
To: Dmitry Vyukov
Cc: LKMM Maintainers -- Akira Yokosawa, Alan Stern, Alexander Potapenko,
	Andrea Parri, Andrey Konovalov, Andy Lutomirski, Ard Biesheuvel,
	Arnd Bergmann, Boqun Feng, Borislav Petkov, Daniel Axtens,
	Daniel Lustig, Dave Hansen, David Howells, "H. Peter Anvin",
	Ingo Molnar, Jade Alglave, Joel Fernandes, Jonathan Corbet,
	Josh Poimboeuf, Luc Maranget, Mark Rutland, Nicholas Piggin,
	paulmck@kernel.org, Peter Zijlstra, Thomas Gleixner, Will Deacon,
	kasan-dev, linux-arch, "open list:DOCUMENTATION",
	linux-efi@vger.kernel.org, "open list:KERNEL BUILD + fi...",
	LKML, Linux-MM, the arch/x86 maintainers
Subject: Re: [PATCH v3 1/9] kcsan: Add Kernel Concurrency Sanitizer infrastructure
Message-ID: <20191106191149.GA126960@google.com>
References: <20191104142745.14722-1-elver@google.com> <20191104142745.14722-2-elver@google.com>

On Wed, 06 Nov 2019, Dmitry Vyukov wrote:
> On Mon, Nov 4, 2019 at 3:28 PM Marco Elver wrote:
> >
> > Kernel Concurrency Sanitizer (KCSAN) is a dynamic data-race detector for
> > kernel space. KCSAN is a sampling watchpoint-based data-race detector.
> > See the included Documentation/dev-tools/kcsan.rst for more details.
> ...
> > +static inline atomic_long_t *find_watchpoint(unsigned long addr, size_t size,
> > +					       bool expect_write,
> > +					       long *encoded_watchpoint)
> > +{
> > +	const int slot = watchpoint_slot(addr);
> > +	const unsigned long addr_masked = addr & WATCHPOINT_ADDR_MASK;
> > +	atomic_long_t *watchpoint;
> > +	unsigned long wp_addr_masked;
> > +	size_t wp_size;
> > +	bool is_write;
> > +	int i;
> > +
> > +	BUILD_BUG_ON(CONFIG_KCSAN_NUM_WATCHPOINTS < CHECK_NUM_SLOTS);
> > +
> > +	for (i = 0; i < CHECK_NUM_SLOTS; ++i) {
> > +		watchpoint = &watchpoints[SLOT_IDX(slot, i)];
>
> The fast path code became much nicer!
> I did another pass looking at how we can optimize the fast path.
> Currently we still have 2 push/pop pairs on the fast path because of
> register pressure. The logic in SLOT_IDX seems to be the main culprit.
> We discussed several options offline:
> 1. Just check 1 slot and ignore all corner cases (we would miss racing
> unaligned accesses to different addresses that nevertheless overlap
> and cross pages, which sounds pretty esoteric).
> 2. Check 3 slots in order and without wraparound (watchpoints[slot +
> i], where i = -1, 0, 1); this requires adding dummy slots around the
> array.
> 3. An interesting option is to check just 2 slots (that's enough!); to
> make this work, we would slightly offset the bucket number when
> setting a watchpoint (namely, if an access goes to the very end of a
> page, we set the watchpoint in the bucket corresponding to the
> _next_ page).
> All of these options remove the push/pop pairs in my experiments.
> Obviously, checking fewer slots reduces dynamic overhead even more.
>
> > +		*encoded_watchpoint = atomic_long_read(watchpoint);
> > +		if (!decode_watchpoint(*encoded_watchpoint, &wp_addr_masked,
> > +				       &wp_size, &is_write))
> > +			continue;
> > +
> > +		if (expect_write && !is_write)
> > +			continue;
> > +
> > +		/* Check if the watchpoint matches the access.
> > +		 */
> > +		if (matching_access(wp_addr_masked, wp_size, addr_masked, size))
> > +			return watchpoint;
> > +	}
> > +
> > +	return NULL;
> > +}
> > +
> > +static inline atomic_long_t *insert_watchpoint(unsigned long addr, size_t size,
> > +					       bool is_write)
> > +{
> > +	const int slot = watchpoint_slot(addr);
> > +	const long encoded_watchpoint = encode_watchpoint(addr, size, is_write);
> > +	atomic_long_t *watchpoint;
> > +	int i;
> > +
> > +	for (i = 0; i < CHECK_NUM_SLOTS; ++i) {
> > +		long expect_val = INVALID_WATCHPOINT;
> > +
> > +		/* Try to acquire this slot. */
> > +		watchpoint = &watchpoints[SLOT_IDX(slot, i)];
>
> If we do this SLOT_IDX trickery to catch unaligned accesses crossing
> pages, then I think we should not use it in insert_watchpoint at all
> and only set the watchpoint at the exact index. Otherwise, we will
> actually miss the corner cases, which defeats the whole purpose of
> SLOT_IDX and the 3 iterations.

Just for the record, there are actually 2 reasons I decided to do this:

1. If the address's slot is already occupied, check whether any
   adjacent slots are free.
2. Accesses that straddle a slot boundary, because their size exceeds a
   slot's range, may check adjacent slots for a matching watchpoint.

In /sys/kernel/debug/kcsan I can see that, with the current version,
no_capacity stays below 10 for a kernel boot. When I use just 1 slot,
no_capacity events exceed 90000, because point (1) is no longer
addressed. This is a problem that would impair our ability to detect
races. One reason this happens is locality: it is simply much more
likely that multiple threads access the same pages during some phase of
execution.

To avoid blowing up no_capacity events, insert_watchpoint should not
change. Instead, I will change the iteration order in the fast path
(avoiding the complicated logic) and add additional overflow entries to
the watchpoint array. AFAIK this generates better code, while still
addressing points (1) and (2) above.
This should be the best trade-off between absolute performance and our
ability to detect data races.

Thanks,
-- Marco