From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <7ab8839b-8f88-406e-b6e1-2c69c8967d4e@huawei.com>
Date: Thu, 19 Oct 2023 15:26:48 +0800
Subject: Re: [PATCH -rfc 0/3] mm: kasan: fix softlock when populate or depopulate pte
From: Kefeng Wang <wangkefeng.wang@huawei.com>
To: Uladzislau Rezki
CC: Marco Elver, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov,
 Dmitry Vyukov, Vincenzo Frascino, Andrew Morton, Christoph Hellwig,
 Lorenzo Stoakes, linux-mm@kvack.org
References: <20230906124234.134200-1-wangkefeng.wang@huawei.com>
 <4e2e075f-b74c-4daf-bf1a-f83fced742c4@huawei.com>
 <5b33515b-5fd2-4dc7-9778-e321484d2427@huawei.com>
Content-Type: text/plain; charset="UTF-8"; format=flowed
On 2023/10/19 14:17, Uladzislau Rezki wrote:
> On Thu, Oct 19, 2023 at 09:40:10AM +0800, Kefeng Wang wrote:
>>
>>
>> On 2023/10/19 0:37, Marco Elver wrote:
>>> On Wed, 18 Oct 2023 at 16:16, 'Kefeng Wang' via kasan-dev wrote:
>>>>
>>>> The issue is easy to reproduce with a large vmalloc, kindly ping...
>>>>
>>>> On 2023/9/15 8:58, Kefeng Wang wrote:
>>>>> Hi all, any suggestions or comments? Many thanks.
>>>>>
>>>>> On 2023/9/6 20:42, Kefeng Wang wrote:
>>>>>> This is an RFC; patch 3 is even a hack to fix the softlockup issue
>>>>>> when populating or depopulating PTEs over a large region. Looking
>>>>>> forward to your replies and advice, thanks.
>>>>>
>>>>> Here is the full stack. For PTE population:
>>>>>
>>>>> [ C3] watchdog: BUG: soft lockup - CPU#3 stuck for 26s! [insmod:458]
>>>>> [ C3] Modules linked in: test(OE+)
>>>>> [ C3] irq event stamp: 320776
>>>>> [ C3] hardirqs last enabled at (320775): [] _raw_spin_unlock_irqrestore+0x98/0xb8
>>>>> [ C3] hardirqs last disabled at (320776): [] el1_interrupt+0x38/0xa8
>>>>> [ C3] softirqs last enabled at (318174): [] __do_softirq+0x658/0x7ac
>>>>> [ C3] softirqs last disabled at (318169): [] ____do_softirq+0x18/0x30
>>>>> [ C3] CPU: 3 PID: 458 Comm: insmod Tainted: G OE 6.5.0+ #595
>>>>> [ C3] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
>>>>> [ C3] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>>>> [ C3] pc : _raw_spin_unlock_irqrestore+0x50/0xb8
>>>>> [ C3] lr : _raw_spin_unlock_irqrestore+0x98/0xb8
>>>>> [ C3] sp : ffff800093386d70
>>>>> [ C3] x29: ffff800093386d70 x28: 0000000000000801 x27: ffff0007ffffa9c0
>>>>> [ C3] x26: 0000000000000000 x25: 000000000000003f x24: fffffc0004353708
>>>>> [ C3] x23: ffff0006d476bad8 x22: fffffc0004353748 x21: 0000000000000000
>>>>> [ C3] x20: ffff0007ffffafc0 x19: 0000000000000000 x18: 0000000000000000
>>>>> [ C3] x17: ffff80008024e7fc x16: ffff80008055a8f0 x15: ffff80008024ec60
>>>>> [ C3] x14: ffff80008024ead0 x13: ffff80008024e7fc x12: ffff6000fffff5f9
>>>>> [ C3] x11: 1fffe000fffff5f8 x10: ffff6000fffff5f8 x9 : 1fffe000fffff5f8
>>>>> [ C3] x8 : dfff800000000000 x7 : 00000000f2000000 x6 : dfff800000000000
>>>>> [ C3] x5 : 00000000f2f2f200 x4 : dfff800000000000 x3 : ffff700012670d70
>>>>> [ C3] x2 : 0000000000000001 x1 : c9a5dbfae610fa24 x0 : 000000000004e507
>>>>> [ C3] Call trace:
>>>>> [ C3]  _raw_spin_unlock_irqrestore+0x50/0xb8
>>>>> [ C3]  rmqueue_bulk+0x434/0x6b8
>>>>> [ C3]  get_page_from_freelist+0xdd4/0x1680
>>>>> [ C3]  __alloc_pages+0x244/0x508
>>>>> [ C3]  alloc_pages+0xf0/0x218
>>>>> [ C3]  __get_free_pages+0x1c/0x50
>>>>> [ C3]  kasan_populate_vmalloc_pte+0x30/0x188
>>>>> [ C3]  __apply_to_page_range+0x3ec/0x650
>>>>> [ C3]  apply_to_page_range+0x1c/0x30
>>>>> [ C3]  kasan_populate_vmalloc+0x60/0x70
>>>>> [ C3]  alloc_vmap_area.part.67+0x328/0xe50
>>>>> [ C3]  alloc_vmap_area+0x4c/0x78
>>>>> [ C3]  __get_vm_area_node.constprop.76+0x130/0x240
>>>>> [ C3]  __vmalloc_node_range+0x12c/0x340
>>>>> [ C3]  __vmalloc_node+0x8c/0xb0
>>>>> [ C3]  vmalloc+0x2c/0x40
>>>>> [ C3]  show_mem_init+0x1c/0xff8 [test]
>>>>> [ C3]  do_one_initcall+0xe4/0x500
>>>>> [ C3]  do_init_module+0x100/0x358
>>>>> [ C3]  load_module+0x2e64/0x2fc8
>>>>> [ C3]  init_module_from_file+0xec/0x148
>>>>> [ C3]  idempotent_init_module+0x278/0x380
>>>>> [ C3]  __arm64_sys_finit_module+0x88/0xf8
>>>>> [ C3]  invoke_syscall+0x64/0x188
>>>>> [ C3]  el0_svc_common.constprop.1+0xec/0x198
>>>>> [ C3]  do_el0_svc+0x48/0xc8
>>>>> [ C3]  el0_svc+0x3c/0xe8
>>>>> [ C3]  el0t_64_sync_handler+0xa0/0xc8
>>>>> [ C3]  el0t_64_sync+0x188/0x190
>>>>>
>>>>> And for PTE depopulation:
>>>>>
>>>>> [ C6] watchdog: BUG: soft lockup - CPU#6 stuck for 48s! [kworker/6:1:59]
>>>>> [ C6] Modules linked in: test(OE+)
>>>>> [ C6] irq event stamp: 39458
>>>>> [ C6] hardirqs last enabled at (39457): [] _raw_spin_unlock_irqrestore+0x98/0xb8
>>>>> [ C6] hardirqs last disabled at (39458): [] el1_interrupt+0x38/0xa8
>>>>> [ C6] softirqs last enabled at (39420): [] __do_softirq+0x658/0x7ac
>>>>> [ C6] softirqs last disabled at (39415): [] ____do_softirq+0x18/0x30
>>>>> [ C6] CPU: 6 PID: 59 Comm: kworker/6:1 Tainted: G OEL 6.5.0+ #595
>>>>> [ C6] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
>>>>> [ C6] Workqueue: events drain_vmap_area_work
>>>>> [ C6] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>>>> [ C6] pc : _raw_spin_unlock_irqrestore+0x50/0xb8
>>>>> [ C6] lr : _raw_spin_unlock_irqrestore+0x98/0xb8
>>>>> [ C6] sp : ffff80008fe676b0
>>>>> [ C6] x29: ffff80008fe676b0 x28: fffffc000601d310 x27: ffff000edf5dfa80
>>>>> [ C6] x26: ffff000edf5dfad8 x25: 0000000000000000 x24: 0000000000000006
>>>>> [ C6] x23: ffff000edf5dfad4 x22: 0000000000000000 x21: 0000000000000006
>>>>> [ C6] x20: ffff0007ffffafc0 x19: 0000000000000000 x18: 0000000000000000
>>>>> [ C6] x17: ffff8000805544b8 x16: ffff800080553d94 x15: ffff8000805c11b0
>>>>> [ C6] x14: ffff8000805baeb0 x13: ffff800080047e10 x12: ffff6000fffff5f9
>>>>> [ C6] x11: 1fffe000fffff5f8 x10: ffff6000fffff5f8 x9 : 1fffe000fffff5f8
>>>>> [ C6] x8 : dfff800000000000 x7 : 00000000f2000000 x6 : dfff800000000000
>>>>> [ C6] x5 : 00000000f2f2f200 x4 : dfff800000000000 x3 : ffff700011fcce98
>>>>> [ C6] x2 : 0000000000000001 x1 : cf09d5450e2b4f7f x0 : 0000000000009a21
>>>>> [ C6] Call trace:
>>>>> [ C6]  _raw_spin_unlock_irqrestore+0x50/0xb8
>>>>> [ C6]  free_pcppages_bulk+0x2bc/0x3e0
>>>>> [ C6]  free_unref_page_commit+0x1fc/0x290
>>>>> [ C6]  free_unref_page+0x184/0x250
>>>>> [ C6]  __free_pages+0x154/0x1a0
>>>>> [ C6]  free_pages+0x88/0xb0
>>>>> [ C6]  kasan_depopulate_vmalloc_pte+0x58/0x80
>>>>> [ C6]  __apply_to_page_range+0x3ec/0x650
>>>>> [ C6]  apply_to_existing_page_range+0x1c/0x30
>>>>> [ C6]  kasan_release_vmalloc+0xa4/0x118
>>>>> [ C6]  __purge_vmap_area_lazy+0x4f4/0xe30
>>>>> [ C6]  drain_vmap_area_work+0x60/0xc0
>>>>> [ C6]  process_one_work+0x4cc/0xa38
>>>>> [ C6]  worker_thread+0x240/0x638
>>>>> [ C6]  kthread+0x1c8/0x1e0
>>>>> [ C6]  ret_from_fork+0x10/0x20
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> Kefeng Wang (3):
>>>>>>   mm: kasan: shadow: add cond_resched() in kasan_populate_vmalloc_pte()
>>>>>>   mm: kasan: shadow: move free_page() out of page table lock
>>>>>>   mm: kasan: shadow: HACK add cond_resched_lock() in
>>>>>>     kasan_depopulate_vmalloc_pte()
>>>
>>> The first 2 patches look ok, but yeah, the last is a hack. I also
>>> don't have any better suggestions, only more questions.
>>
>> Thanks Marco. Maybe we could convert free_vmap_area_lock from a
>> spinlock to a mutex only if KASAN is enabled?
>>
> I do not think it is a good suggestion. Could you please clarify the
> reason for such a conversion?

See the call trace of the softlockup: when mapping/unmapping the vmalloc
buffer, KASAN has to populate and depopulate the vmalloc shadow PTEs,
which takes much more time than on a non-KASAN kernel. For unmap, there
is already a cond_resched_lock() in __purge_vmap_area_lazy(), but much
more time is spent under the free_vmap_area_lock spinlock, and we cannot
add a cond_resched_lock() inside kasan_depopulate_vmalloc_pte(). If the
spinlock were converted to a mutex, we could add a cond_resched() into
the KASAN depopulate path; that is why I suggested the conversion when
KASAN is enabled. The conversion may not be correct, though. Any better
solution?

>
>>>
>>> Does this only happen on arm64?
>>
>> Our test case runs on arm64 QEMU (the host is x86), so it runs much
>> slower than a real board.
>>> Do you have a minimal reproducer you can share?
>> Here is the code in the test driver:
>>
>>     void *buf = vmalloc(40UL << 30);
>>     vfree(buf);
>>
> What is a test driver? Why do you need 40G of memory, and for which
> purpose?
This is just to accelerate reproducing the above issue; the main cause
is the large amount of time spent in kasan_populate_vmalloc() and
kasan_release_vmalloc(). Thanks.

>
> --
> Uladzislau Rezki
>