Subject: Re: [PATCH v2 4/4] powerpc: Book3S 64-bit "heavyweight" KASAN support
To: Daniel Axtens, Balbir Singh, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, linuxppc-dev@lists.ozlabs.org, linux-s390@vger.kernel.org,
 linux-xtensa@linux-xtensa.org, linux-arch@vger.kernel.org,
 linux-arm-kernel@lists.infradead.org, kasan-dev@googlegroups.com,
 christophe.leroy@c-s.fr, aneesh.kumar@linux.ibm.com, Dmitry Vyukov
References: <20191210044714.27265-1-dja@axtens.net>
 <20191210044714.27265-5-dja@axtens.net>
 <71751e27-e9c5-f685-7a13-ca2e007214bc@gmail.com>
 <875zincu8a.fsf@dja-thinkpad.axtens.net>
 <2e0f21e6-7552-815b-1bf3-b54b0fc5caa9@gmail.com>
 <87wob3aqis.fsf@dja-thinkpad.axtens.net>
From: Andrey Ryabinin
Message-ID: <023d59f1-c007-e153-9893-3231a4caf7d1@virtuozzo.com>
Date: Thu, 12 Dec 2019 12:56:56 +0300
In-Reply-To: <87wob3aqis.fsf@dja-thinkpad.axtens.net>

On 12/11/19 5:24 PM, Daniel Axtens wrote:
> Hi Balbir,
>
>>>>> +Discontiguous memory can occur when you have a machine with memory spread
>>>>> +across multiple nodes. For example, on a Talos II with 64GB of RAM:
>>>>> +
>>>>> + - 32GB runs from 0x0 to 0x0000_0008_0000_0000,
>>>>> + - then there's a gap,
>>>>> + - then the final 32GB runs from 0x0000_2000_0000_0000 to 0x0000_2008_0000_0000
>>>>> +
>>>>> +This can create _significant_ issues:
>>>>> +
>>>>> + - If we try to treat the machine as having 64GB of _contiguous_ RAM, we would
>>>>> +   assume that ran from 0x0 to 0x0000_0010_0000_0000.
>>>>> +   We'd then reserve the
>>>>> +   last 1/8th - 0x0000_000e_0000_0000 to 0x0000_0010_0000_0000 as the shadow
>>>>> +   region. But when we try to access any of that, we'll try to access pages
>>>>> +   that are not physically present.
>>>>> +
>>>>
>>>> If we reserved memory for KASAN from each node (discontig region), we might survive
>>>> this no? May be we need NUMA aware KASAN? That might be a generic change, just thinking
>>>> out loud.
>>>
>>> The challenge is that - AIUI - in inline instrumentation, the compiler
>>> doesn't generate calls to things like __asan_loadN and
>>> __asan_storeN. Instead it uses -fasan-shadow-offset to compute the
>>> checks, and only calls the __asan_report* family of functions if it
>>> detects an issue. This also matches what I can observe with objdump
>>> across outline and inline instrumentation settings.
>>>
>>> This means that for this sort of thing to work we would need to either
>>> drop back to out-of-line calls, or teach the compiler how to use a
>>> nonlinear, NUMA aware mem-to-shadow mapping.
>>
>> Yes, out of line is expensive, but seems to work well for all use cases.
>
> I'm not sure this is true. Looking at scripts/Makefile.kasan, allocas,
> stacks and globals will only be instrumented if you can provide
> KASAN_SHADOW_OFFSET. In the case you're proposing, we can't provide a
> static offset. I _think_ this is a compiler limitation, where some of
> those instrumentations only work/make sense with a static offset, but
> perhaps that's not right? Dmitry and Andrey, can you shed some light on
> this?
>

There is no code in the kernel that poisons/unpoisons redzones/variables
on the stack. That is always done by the compiler, which inserts code in
the prologue/epilogue of every function. So the compiler needs to know the
shadow offset that will be used to poison/unpoison stack frames.

There is no such limitation on globals instrumentation.
The only reason globals instrumentation depends on -fasan-shadow-offset is
that there was a bug related to globals in an old gcc version that didn't
support -fasan-shadow-offset.

If you want stack instrumentation with a non-standard mem-to-shadow mapping,
the options are:
 1. Patch the compiler so that stack frames can be poisoned/unpoisoned via
    function calls.
 2. Use outline instrumentation and do whatever mem-to-shadow mapping you
    want, but keep all kernel stacks in some special place for which the
    standard mem-to-shadow mapping ((addr >> 3) + offset) works.

> Also, as it currently stands, the speed difference between inline and
> outline is approximately 2x, and given that we'd like to run this
> full-time in syzkaller I think there is value in trading off speed for
> some limitations.
>
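
For reference, the standard mapping mentioned in option 2 and the numbers
from the quoted Talos II example can be sketched in plain userspace C. This
is only an illustration, not kernel code: mem_to_shadow(), shadow_span() and
check_talos_example() are hypothetical helper names, and the offset is left
as a parameter because no real KASAN_SHADOW_OFFSET value is assumed here.

```c
#include <assert.h>
#include <stdint.h>

/* One shadow byte covers 8 bytes of memory, hence the shift by 3. */
#define SHADOW_SCALE_SHIFT 3

/*
 * Standard linear mapping: shadow = (addr >> 3) + offset.
 * The offset is supplied by the caller (hypothetical; no real value assumed).
 */
static inline uint64_t mem_to_shadow(uint64_t addr, uint64_t offset)
{
	return (addr >> SHADOW_SCALE_SHIFT) + offset;
}

/* Bytes of shadow needed to cover [start, end) under the linear mapping. */
static inline uint64_t shadow_span(uint64_t start, uint64_t end)
{
	return (end >> SHADOW_SCALE_SHIFT) - (start >> SHADOW_SCALE_SHIFT);
}

/* Check the addresses quoted from the patch description. */
void check_talos_example(void)
{
	const uint64_t contig_end = 0x0000001000000000ULL; /* 64GB, contiguous view */
	const uint64_t real_end   = 0x0000200800000000ULL; /* top of the second node */

	/* Contiguous view: 8GB of shadow, i.e. the last 1/8th of 64GB. */
	assert(shadow_span(0, contig_end) == 0x0000000200000000ULL);
	assert(contig_end - shadow_span(0, contig_end) == 0x0000000e00000000ULL);

	/* Real layout: a linear mapping has to span about 4TB of shadow addresses. */
	assert(shadow_span(0, real_end) == 0x0000040100000000ULL);
}
```

The last assertion is the arithmetic behind the problem discussed above: with
a single linear mapping, the shadow for the second node sits about 4TB away
from the shadow for the first, although only 64GB of RAM is present.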