References: <20210403051325.683071-1-pcc@google.com>
From: Marco Elver
Date: Sun, 4 Apr 2021 00:30:47 +0200
Subject: Re: [PATCH] kfence: unpoison pool region before use
To: Peter Collingbourne
Cc: Dmitry Vyukov, Alexander Potapenko, Evgenii Stepanov, Andrey Konovalov,
 Linux Memory Management List, LKML, Andrew Morton
Content-Type: text/plain; charset="UTF-8"

On Sat, 3 Apr 2021 at 22:40, Peter Collingbourne wrote:
> On Sat, Apr 3, 2021 at 3:03 AM Marco Elver wrote:
> > On Sat, 3 Apr 2021 at 07:13, Peter Collingbourne wrote:
> > > If the memory region allocated by KFENCE had previously been poisoned,
> > > any validity checks done using kasan_byte_accessible() will fail. Fix
> > > it by unpoisoning the memory before using it as the pool region.
> > >
> > > Link: https://linux-review.googlesource.com/id/I0af99e9f1c25eaf7e1ec295836b5d148d76940c5
> > > Signed-off-by: Peter Collingbourne
> >
> > Thanks, at a high level this seems reasonable, because we always want
> > to ensure that KFENCE memory remains unpoisoned with KASAN on. FWIW I
> > subjected a config with KFENCE+KASAN (generic, SW_TAGS, and HW_TAGS)
> > to syzkaller testing and ran kfence_test:
> >
> > Tested-by: Marco Elver
> >
> >
> > However, it is unclear to me under which circumstances we actually
> > need this, i.e. something would grab some memblock memory, somehow
> > poison it, and then release the memory back during early boot (note,
> > kfence_alloc_pool() is called before slab setup). If we can somehow
> > understand what actually did this, perhaps it'd help tell us if this
> > actually needs fixing in KFENCE or it's the other thing that needs a
> > fix.
> >
> > Given all this is happening during really early boot, I'd expect no or
> > very few calls to kasan_poison() until kfence_alloc_pool() is called.
> > We can probably debug it more by having kasan_poison() do a "if
> > (!__kfence_pool) dump_stack();" somewhere. Can you try this on the
> > system where you can repro the problem? I tried this just now on the
> > latest mainline kernel, and saw 0 calls until kfence_alloc_pool().
>
> I looked into the issue some more, and it turned out that the memory
> wasn't getting poisoned by kasan_poison() but rather by the calls to
> kasan_map_populate() in kasan_init_shadow(). Starting with the patch
> "kasan: initialize shadow to TAG_INVALID for SW_TAGS",
> KASAN_SHADOW_INIT is set to 0xFE rather than 0xFF, which caused the
> failure. The Android kernel branch for 5.10 (and the downstream kernel
> I was working with) already have this patch, but it isn't in the
> mainline kernel yet.
>
> Now that I understand the cause of the issue, I can reproduce it using
> the KFENCE unit tests on a db845c board, using both the Android 5.10
> and mainline branches if I cherry-pick that change. Here's an example
> crash from the unit tests (the failure was originally also observed
> from ksize in the downstream kernel):
>
> [ 46.692195][ T175] BUG: KASAN: invalid-access in test_krealloc+0x1c4/0xf98
> [ 46.699282][ T175] Read of size 1 at addr ffffff80e9e7b000 by task kunit_try_catch/175
> [ 46.707400][ T175] Pointer tag: [ff], memory tag: [fe]
> [ 46.712710][ T175]
> [ 46.714955][ T175] CPU: 4 PID: 175 Comm: kunit_try_catch Tainted: G B 5.12.0-rc5-mainline-09505-ga2ab5b26d445-dirty #1
> [ 46.727193][ T175] Hardware name: Thundercomm Dragonboard 845c (DT)
> [ 46.733636][ T175] Call trace:
> [ 46.736841][ T175] dump_backtrace+0x0/0x2f8
> [ 46.741295][ T175] show_stack+0x2c/0x3c
> [ 46.745388][ T175] dump_stack+0x124/0x1bc
> [ 46.749668][ T175] print_address_description+0x7c/0x308
> [ 46.755178][ T175] __kasan_report+0x1a8/0x398
> [ 46.759816][ T175] kasan_report+0x50/0x7c
> [ 46.764103][ T175] __kasan_check_byte+0x3c/0x54
> [ 46.768916][ T175] ksize+0x4c/0x94
> [ 46.772573][ T175] test_krealloc+0x1c4/0xf98
> [ 46.777108][ T175] kunit_try_run_case+0x94/0x1c4
> [ 46.781990][ T175] kunit_generic_run_threadfn_adapter+0x30/0x44
> [ 46.788196][ T175] kthread+0x20c/0x234
> [ 46.792213][ T175] ret_from_fork+0x10/0x30
>
> Since "kasan: initialize shadow to TAG_INVALID for SW_TAGS" hasn't
> landed in mainline yet, it seems like we should insert this patch
> before that one rather than adding a Fixes: tag.

Thanks for getting to the bottom of it.

However, given the above, I think we need to explain this in the commit
message (which also makes the dependency between these 2 patches clear)
and add a comment above the new kasan_unpoison_range().

That is, if we still think this is the right fix -- I'm not entirely
sure it is.

What I gather from "kasan: initialize shadow to TAG_INVALID for
SW_TAGS" is the requirement that "0xFF pointer tag is a match-all tag,
it doesn't matter what tag the accessed memory has".

While KFENCE memory is accessible through the slab API, and in this
case it is ksize() calling kasan_check_byte() that leads to the failure,
kasan_check_byte() is part of the public KASAN API. This means that if
some subsystem decides to memblock_alloc() some memory and wishes to
use kasan_check_byte() on that memory with an untagged pointer, it will
hit the same problem as KFENCE: with generic and HW_TAGS mode
everything is fine, but with SW_TAGS mode things break.

To me this indicates the fix is not with KFENCE, but should be in
mm/kasan/sw_tags.c:kasan_byte_accessible(), which should not load the
shadow when the pointer is untagged.

Thanks,
-- Marco
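
To make the match-all argument concrete, here is a small userspace
model of the tag check. It is only a sketch, not the code in
mm/kasan/sw_tags.c: every model_* name, the 16-byte granule, the shadow
array size, and the "strict" variant are assumptions chosen to
reproduce the reported symptom (pointer tag 0xFF, memory tag 0xFE).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative constants for this model; not the kernel's definitions. */
#define MODEL_TAG_SHIFT     56     /* tag kept in the top byte of the address */
#define MODEL_TAG_MATCH_ALL 0xFFu  /* tag of an untagged ("match-all") pointer */
#define MODEL_TAG_INVALID   0xFEu  /* shadow value marking inaccessible memory */
#define MODEL_GRANULE       16u    /* bytes covered by one shadow byte */
#define MODEL_GRANULES      64u    /* size of the modelled region in granules */

static uint8_t model_shadow[MODEL_GRANULES];

static inline uint8_t model_get_tag(uint64_t addr)
{
	return (uint8_t)(addr >> MODEL_TAG_SHIFT);
}

static inline uint64_t model_strip_tag(uint64_t addr)
{
	return addr & ((1ULL << MODEL_TAG_SHIFT) - 1);
}

static inline uint64_t model_set_tag(uint64_t addr, uint8_t tag)
{
	return model_strip_tag(addr) | ((uint64_t)tag << MODEL_TAG_SHIFT);
}

/* Fill the whole shadow with one value, like an early-boot shadow init. */
static inline void model_shadow_init(uint8_t value)
{
	memset(model_shadow, value, sizeof(model_shadow));
}

/* Set the memory tag of the granule that contains addr. */
static inline void model_set_mem_tag(uint64_t addr, uint8_t tag)
{
	size_t idx = (size_t)(model_strip_tag(addr) / MODEL_GRANULE);

	assert(idx < MODEL_GRANULES);
	model_shadow[idx] = tag;
}

/* Assumed "strict" check: memory tagged invalid is rejected for everyone. */
static inline bool model_accessible_strict(uint64_t addr)
{
	size_t idx = (size_t)(model_strip_tag(addr) / MODEL_GRANULE);
	uint8_t tag = model_get_tag(addr);

	if (idx >= MODEL_GRANULES)
		return false;
	if (model_shadow[idx] == MODEL_TAG_INVALID)
		return false;
	return tag == MODEL_TAG_MATCH_ALL || tag == model_shadow[idx];
}

/* Suggested behaviour: an untagged pointer never loads the shadow. */
static inline bool model_accessible_match_all(uint64_t addr)
{
	size_t idx = (size_t)(model_strip_tag(addr) / MODEL_GRANULE);
	uint8_t tag = model_get_tag(addr);

	if (idx >= MODEL_GRANULES)
		return false;
	if (tag == MODEL_TAG_MATCH_ALL)
		return true;
	return tag == model_shadow[idx];
}
```

With the shadow initialised to 0xFE, an untagged pointer fails the
strict variant but passes the match-all variant, while a tagged pointer
is still checked against the shadow in both.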