From: Kefeng Wang
To: Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov, Dmitry Vyukov, Vincenzo Frascino, Andrew Morton, Uladzislau Rezki, Christoph Hellwig, Lorenzo Stoakes
Date: Wed, 18 Oct 2023 22:16:02 +0800
Subject: Re: [PATCH -rfc 0/3] mm: kasan: fix softlock when populate or depopulate pte
References: <20230906124234.134200-1-wangkefeng.wang@huawei.com> <4e2e075f-b74c-4daf-bf1a-f83fced742c4@huawei.com>
In-Reply-To: <4e2e075f-b74c-4daf-bf1a-f83fced742c4@huawei.com>

The issue is easy to reproduce with a large vmalloc, kindly ping...

On 2023/9/15 8:58, Kefeng Wang wrote:
> Hi All, any suggestions or comments? Many thanks.
>
> On 2023/9/6 20:42, Kefeng Wang wrote:
>> This is an RFC; even patch 3 is a hack to fix the soft lockup issue when
>> populating or depopulating PTEs over a large region. Looking forward to
>> your reply and advice, thanks.
>
> Here is the full stack for populate pte:
>
> [    C3] watchdog: BUG: soft lockup - CPU#3 stuck for 26s! [insmod:458]
> [    C3] Modules linked in: test(OE+)
> [    C3] irq event stamp: 320776
> [    C3] hardirqs last  enabled at (320775): [] _raw_spin_unlock_irqrestore+0x98/0xb8
> [    C3] hardirqs last disabled at (320776): [] el1_interrupt+0x38/0xa8
> [    C3] softirqs last  enabled at (318174): [] __do_softirq+0x658/0x7ac
> [    C3] softirqs last disabled at (318169): [] ____do_softirq+0x18/0x30
> [    C3] CPU: 3 PID: 458 Comm: insmod Tainted: G           OE 6.5.0+ #595
> [    C3] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
> [    C3] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [    C3] pc : _raw_spin_unlock_irqrestore+0x50/0xb8
> [    C3] lr : _raw_spin_unlock_irqrestore+0x98/0xb8
> [    C3] sp : ffff800093386d70
> [    C3] x29: ffff800093386d70 x28: 0000000000000801 x27: ffff0007ffffa9c0
> [    C3] x26: 0000000000000000 x25: 000000000000003f x24: fffffc0004353708
> [    C3] x23: ffff0006d476bad8 x22: fffffc0004353748 x21: 0000000000000000
> [    C3] x20: ffff0007ffffafc0 x19: 0000000000000000 x18: 0000000000000000
> [    C3] x17: ffff80008024e7fc x16: ffff80008055a8f0 x15: ffff80008024ec60
> [    C3] x14: ffff80008024ead0 x13: ffff80008024e7fc x12: ffff6000fffff5f9
> [    C3] x11: 1fffe000fffff5f8 x10: ffff6000fffff5f8 x9 : 1fffe000fffff5f8
> [    C3] x8 : dfff800000000000 x7 : 00000000f2000000 x6 : dfff800000000000
> [    C3] x5 : 00000000f2f2f200 x4 : dfff800000000000 x3 : ffff700012670d70
> [    C3] x2 : 0000000000000001 x1 : c9a5dbfae610fa24 x0 : 000000000004e507
> [    C3] Call trace:
> [    C3]  _raw_spin_unlock_irqrestore+0x50/0xb8
> [    C3]  rmqueue_bulk+0x434/0x6b8
> [    C3]  get_page_from_freelist+0xdd4/0x1680
> [    C3]  __alloc_pages+0x244/0x508
> [    C3]  alloc_pages+0xf0/0x218
> [    C3]  __get_free_pages+0x1c/0x50
> [    C3]  kasan_populate_vmalloc_pte+0x30/0x188
> [    C3]  __apply_to_page_range+0x3ec/0x650
> [    C3]  apply_to_page_range+0x1c/0x30
> [    C3]  kasan_populate_vmalloc+0x60/0x70
> [    C3]  alloc_vmap_area.part.67+0x328/0xe50
> [    C3]  alloc_vmap_area+0x4c/0x78
> [    C3]  __get_vm_area_node.constprop.76+0x130/0x240
> [    C3]  __vmalloc_node_range+0x12c/0x340
> [    C3]  __vmalloc_node+0x8c/0xb0
> [    C3]  vmalloc+0x2c/0x40
> [    C3]  show_mem_init+0x1c/0xff8 [test]
> [    C3]  do_one_initcall+0xe4/0x500
> [    C3]  do_init_module+0x100/0x358
> [    C3]  load_module+0x2e64/0x2fc8
> [    C3]  init_module_from_file+0xec/0x148
> [    C3]  idempotent_init_module+0x278/0x380
> [    C3]  __arm64_sys_finit_module+0x88/0xf8
> [    C3]  invoke_syscall+0x64/0x188
> [    C3]  el0_svc_common.constprop.1+0xec/0x198
> [    C3]  do_el0_svc+0x48/0xc8
> [    C3]  el0_svc+0x3c/0xe8
> [    C3]  el0t_64_sync_handler+0xa0/0xc8
> [    C3]  el0t_64_sync+0x188/0x190
>
> and for depopulate pte:
>
> [    C6] watchdog: BUG: soft lockup - CPU#6 stuck for 48s! [kworker/6:1:59]
> [    C6] Modules linked in: test(OE+)
> [    C6] irq event stamp: 39458
> [    C6] hardirqs last  enabled at (39457): [] _raw_spin_unlock_irqrestore+0x98/0xb8
> [    C6] hardirqs last disabled at (39458): [] el1_interrupt+0x38/0xa8
> [    C6] softirqs last  enabled at (39420): [] __do_softirq+0x658/0x7ac
> [    C6] softirqs last disabled at (39415): [] ____do_softirq+0x18/0x30
> [    C6] CPU: 6 PID: 59 Comm: kworker/6:1 Tainted: G           OEL 6.5.0+ #595
> [    C6] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
> [    C6] Workqueue: events drain_vmap_area_work
> [    C6] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> [    C6] pc : _raw_spin_unlock_irqrestore+0x50/0xb8
> [    C6] lr : _raw_spin_unlock_irqrestore+0x98/0xb8
> [    C6] sp : ffff80008fe676b0
> [    C6] x29: ffff80008fe676b0 x28: fffffc000601d310 x27: ffff000edf5dfa80
> [    C6] x26: ffff000edf5dfad8 x25: 0000000000000000 x24: 0000000000000006
> [    C6] x23: ffff000edf5dfad4 x22: 0000000000000000 x21: 0000000000000006
> [    C6] x20: ffff0007ffffafc0 x19: 0000000000000000 x18: 0000000000000000
> [    C6] x17: ffff8000805544b8 x16: ffff800080553d94 x15: ffff8000805c11b0
> [    C6] x14: ffff8000805baeb0 x13: ffff800080047e10 x12: ffff6000fffff5f9
> [    C6] x11: 1fffe000fffff5f8 x10: ffff6000fffff5f8 x9 : 1fffe000fffff5f8
> [    C6] x8 : dfff800000000000 x7 : 00000000f2000000 x6 : dfff800000000000
> [    C6] x5 : 00000000f2f2f200 x4 : dfff800000000000 x3 : ffff700011fcce98
> [    C6] x2 : 0000000000000001 x1 : cf09d5450e2b4f7f x0 : 0000000000009a21
> [    C6] Call trace:
> [    C6]  _raw_spin_unlock_irqrestore+0x50/0xb8
> [    C6]  free_pcppages_bulk+0x2bc/0x3e0
> [    C6]  free_unref_page_commit+0x1fc/0x290
> [    C6]  free_unref_page+0x184/0x250
> [    C6]  __free_pages+0x154/0x1a0
> [    C6]  free_pages+0x88/0xb0
> [    C6]  kasan_depopulate_vmalloc_pte+0x58/0x80
> [    C6]  __apply_to_page_range+0x3ec/0x650
> [    C6]  apply_to_existing_page_range+0x1c/0x30
> [    C6]  kasan_release_vmalloc+0xa4/0x118
> [    C6]  __purge_vmap_area_lazy+0x4f4/0xe30
> [    C6]  drain_vmap_area_work+0x60/0xc0
> [    C6]  process_one_work+0x4cc/0xa38
> [    C6]  worker_thread+0x240/0x638
> [    C6]  kthread+0x1c8/0x1e0
> [    C6]  ret_from_fork+0x10/0x20
>
>>
>> Kefeng Wang (3):
>>    mm: kasan: shadow: add cond_resched() in kasan_populate_vmalloc_pte()
>>    mm: kasan: shadow: move free_page() out of page table lock
>>    mm: kasan: shadow: HACK add cond_resched_lock() in
>>      kasan_depopulate_vmalloc_pte()
>>
>>   include/linux/kasan.h |  9 ++++++---
>>   mm/kasan/shadow.c     | 20 +++++++++++++-------
>>   mm/vmalloc.c          |  7 ++++---
>>   3 files changed, 23 insertions(+), 13 deletions(-)
>>
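
For a rough sense of scale, here is a back-of-the-envelope sketch (not taken from the patches) of how many shadow pages one large vmalloc() makes kasan_populate_vmalloc_pte() allocate, one page per PTE callback as in the populate trace. It assumes generic KASAN (one shadow byte per 8 bytes of memory) and 4 KiB pages. The helper name shadow_pages and the 16 GiB region size are hypothetical; the thread does not state the size used in the test module.

```python
# Sketch only: estimate how many shadow pages back a vmalloc region.
# Assumptions (not stated in the thread): generic KASAN, where one shadow
# byte covers 8 bytes of memory, and 4 KiB pages.

PAGE_SIZE = 4096              # assumed page size in bytes
KASAN_SHADOW_SCALE_SHIFT = 3  # generic KASAN: 1 shadow byte per 2**3 bytes


def shadow_pages(region_bytes: int,
                 page_size: int = PAGE_SIZE,
                 scale_shift: int = KASAN_SHADOW_SCALE_SHIFT) -> int:
    """Return the number of shadow pages needed for an aligned region.

    Each shadow page corresponds to one PTE callback, i.e. one
    single-page allocation on populate or one free on depopulate.
    """
    granule = 1 << scale_shift
    # Round up to whole shadow bytes, then to whole shadow pages.
    shadow_bytes = (region_bytes + granule - 1) // granule
    return (shadow_bytes + page_size - 1) // page_size


if __name__ == "__main__":
    region = 16 << 30  # hypothetical 16 GiB vmalloc region
    print(f"{region} bytes -> {shadow_pages(region)} shadow pages")
```

Under these assumptions a 16 GiB region needs 524288 single-page allocations within one apply_to_page_range() walk. If nothing in that walk reschedules, this is at least consistent with a CPU staying busy long enough to trip the soft lockup watchdog.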