Message-ID: <4e2e075f-b74c-4daf-bf1a-f83fced742c4@huawei.com>
Date: Fri, 15 Sep 2023 08:58:16 +0800
Subject: Re: [PATCH -rfc 0/3] mm: kasan: fix softlock when populate or depopulate pte
From: Kefeng Wang
To: Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, Andrew Morton, Uladzislau Rezki, Christoph Hellwig, Lorenzo Stoakes
References: <20230906124234.134200-1-wangkefeng.wang@huawei.com>
In-Reply-To: <20230906124234.134200-1-wangkefeng.wang@huawei.com>

Hi all,

Any suggestions or comments? Many thanks.

On 2023/9/6 20:42, Kefeng Wang wrote:
> This is an RFC (patch 3 is even a hack) to fix the soft lockup issue
> when populating or depopulating PTEs over a large region. Looking
> forward to your replies and advice, thanks.

Here is the full stack for populating PTEs:

[ C3] watchdog: BUG: soft lockup - CPU#3 stuck for 26s! [insmod:458]
[ C3] Modules linked in: test(OE+)
[ C3] irq event stamp: 320776
[ C3] hardirqs last enabled at (320775): [] _raw_spin_unlock_irqrestore+0x98/0xb8
[ C3] hardirqs last disabled at (320776): [] el1_interrupt+0x38/0xa8
[ C3] softirqs last enabled at (318174): [] __do_softirq+0x658/0x7ac
[ C3] softirqs last disabled at (318169): [] ____do_softirq+0x18/0x30
[ C3] CPU: 3 PID: 458 Comm: insmod Tainted: G OE 6.5.0+ #595
[ C3] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
[ C3] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ C3] pc : _raw_spin_unlock_irqrestore+0x50/0xb8
[ C3] lr : _raw_spin_unlock_irqrestore+0x98/0xb8
[ C3] sp : ffff800093386d70
[ C3] x29: ffff800093386d70 x28: 0000000000000801 x27: ffff0007ffffa9c0
[ C3] x26: 0000000000000000 x25: 000000000000003f x24: fffffc0004353708
[ C3] x23: ffff0006d476bad8 x22: fffffc0004353748 x21: 0000000000000000
[ C3] x20: ffff0007ffffafc0 x19: 0000000000000000 x18: 0000000000000000
[ C3] x17: ffff80008024e7fc x16: ffff80008055a8f0 x15: ffff80008024ec60
[ C3] x14: ffff80008024ead0 x13: ffff80008024e7fc x12: ffff6000fffff5f9
[ C3] x11: 1fffe000fffff5f8 x10: ffff6000fffff5f8 x9 : 1fffe000fffff5f8
[ C3] x8 : dfff800000000000 x7 : 00000000f2000000 x6 : dfff800000000000
[ C3] x5 : 00000000f2f2f200 x4 : dfff800000000000 x3 : ffff700012670d70
[ C3] x2 : 0000000000000001 x1 : c9a5dbfae610fa24 x0 : 000000000004e507
[ C3] Call trace:
[ C3] _raw_spin_unlock_irqrestore+0x50/0xb8
[ C3] rmqueue_bulk+0x434/0x6b8
[ C3] get_page_from_freelist+0xdd4/0x1680
[ C3] __alloc_pages+0x244/0x508
[ C3] alloc_pages+0xf0/0x218
[ C3] __get_free_pages+0x1c/0x50
[ C3] kasan_populate_vmalloc_pte+0x30/0x188
[ C3] __apply_to_page_range+0x3ec/0x650
[ C3] apply_to_page_range+0x1c/0x30
[ C3] kasan_populate_vmalloc+0x60/0x70
[ C3] alloc_vmap_area.part.67+0x328/0xe50
[ C3] alloc_vmap_area+0x4c/0x78
[ C3] __get_vm_area_node.constprop.76+0x130/0x240
[ C3] __vmalloc_node_range+0x12c/0x340
[ C3] __vmalloc_node+0x8c/0xb0
[ C3] vmalloc+0x2c/0x40
[ C3] show_mem_init+0x1c/0xff8 [test]
[ C3] do_one_initcall+0xe4/0x500
[ C3] do_init_module+0x100/0x358
[ C3] load_module+0x2e64/0x2fc8
[ C3] init_module_from_file+0xec/0x148
[ C3] idempotent_init_module+0x278/0x380
[ C3] __arm64_sys_finit_module+0x88/0xf8
[ C3] invoke_syscall+0x64/0x188
[ C3] el0_svc_common.constprop.1+0xec/0x198
[ C3] do_el0_svc+0x48/0xc8
[ C3] el0_svc+0x3c/0xe8
[ C3] el0t_64_sync_handler+0xa0/0xc8
[ C3] el0t_64_sync+0x188/0x190

and for depopulating PTEs:

[ C6] watchdog: BUG: soft lockup - CPU#6 stuck for 48s! [kworker/6:1:59]
[ C6] Modules linked in: test(OE+)
[ C6] irq event stamp: 39458
[ C6] hardirqs last enabled at (39457): [] _raw_spin_unlock_irqrestore+0x98/0xb8
[ C6] hardirqs last disabled at (39458): [] el1_interrupt+0x38/0xa8
[ C6] softirqs last enabled at (39420): [] __do_softirq+0x658/0x7ac
[ C6] softirqs last disabled at (39415): [] ____do_softirq+0x18/0x30
[ C6] CPU: 6 PID: 59 Comm: kworker/6:1 Tainted: G OEL 6.5.0+ #595
[ C6] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
[ C6] Workqueue: events drain_vmap_area_work
[ C6] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ C6] pc : _raw_spin_unlock_irqrestore+0x50/0xb8
[ C6] lr : _raw_spin_unlock_irqrestore+0x98/0xb8
[ C6] sp : ffff80008fe676b0
[ C6] x29: ffff80008fe676b0 x28: fffffc000601d310 x27: ffff000edf5dfa80
[ C6] x26: ffff000edf5dfad8 x25: 0000000000000000 x24: 0000000000000006
[ C6] x23: ffff000edf5dfad4 x22: 0000000000000000 x21: 0000000000000006
[ C6] x20: ffff0007ffffafc0 x19: 0000000000000000 x18: 0000000000000000
[ C6] x17: ffff8000805544b8 x16: ffff800080553d94 x15: ffff8000805c11b0
[ C6] x14: ffff8000805baeb0 x13: ffff800080047e10 x12: ffff6000fffff5f9
[ C6] x11: 1fffe000fffff5f8 x10: ffff6000fffff5f8 x9 : 1fffe000fffff5f8
[ C6] x8 : dfff800000000000 x7 : 00000000f2000000 x6 : dfff800000000000
[ C6] x5 : 00000000f2f2f200 x4 : dfff800000000000 x3 : ffff700011fcce98
[ C6] x2 : 0000000000000001 x1 : cf09d5450e2b4f7f x0 : 0000000000009a21
[ C6] Call trace:
[ C6] _raw_spin_unlock_irqrestore+0x50/0xb8
[ C6] free_pcppages_bulk+0x2bc/0x3e0
[ C6] free_unref_page_commit+0x1fc/0x290
[ C6] free_unref_page+0x184/0x250
[ C6] __free_pages+0x154/0x1a0
[ C6] free_pages+0x88/0xb0
[ C6] kasan_depopulate_vmalloc_pte+0x58/0x80
[ C6] __apply_to_page_range+0x3ec/0x650
[ C6] apply_to_existing_page_range+0x1c/0x30
[ C6] kasan_release_vmalloc+0xa4/0x118
[ C6] __purge_vmap_area_lazy+0x4f4/0xe30
[ C6] drain_vmap_area_work+0x60/0xc0
[ C6] process_one_work+0x4cc/0xa38
[ C6] worker_thread+0x240/0x638
[ C6] kthread+0x1c8/0x1e0
[ C6] ret_from_fork+0x10/0x20

>
> Kefeng Wang (3):
>   mm: kasan: shadow: add cond_resched() in kasan_populate_vmalloc_pte()
>   mm: kasan: shadow: move free_page() out of page table lock
>   mm: kasan: shadow: HACK add cond_resched_lock() in
>     kasan_depopulate_vmalloc_pte()
>
>  include/linux/kasan.h |  9 ++++++---
>  mm/kasan/shadow.c     | 20 +++++++++++++-------
>  mm/vmalloc.c          |  7 ++++---
>  3 files changed, 23 insertions(+), 13 deletions(-)
>