From: Uladzislau Rezki
Date: Thu, 19 Oct 2023 08:17:22 +0200
To: Kefeng Wang
Cc: Marco Elver, Andrey Ryabinin, Alexander Potapenko, Andrey Konovalov,
    Dmitry Vyukov, Vincenzo Frascino, Andrew Morton, Uladzislau Rezki,
    Christoph Hellwig, Lorenzo Stoakes, kasan-dev@googlegroups.com,
    linux-mm@kvack.org
Subject: Re: [PATCH -rfc 0/3] mm: kasan: fix softlock when populate or depopulate pte
References: <20230906124234.134200-1-wangkefeng.wang@huawei.com>
    <4e2e075f-b74c-4daf-bf1a-f83fced742c4@huawei.com>
    <5b33515b-5fd2-4dc7-9778-e321484d2427@huawei.com>
In-Reply-To: <5b33515b-5fd2-4dc7-9778-e321484d2427@huawei.com>

On Thu, Oct 19, 2023 at 09:40:10AM +0800, Kefeng Wang wrote:
>
>
> On 2023/10/19 0:37, Marco Elver wrote:
> > On Wed, 18 Oct 2023 at 16:16, 'Kefeng Wang' via kasan-dev
> > wrote:
> > >
> > > The issue is easy to reproduce with large vmalloc, kindly ping...
> > >
> > > On 2023/9/15 8:58, Kefeng Wang wrote:
> > > > Hi All, any suggestions or comments, many thanks.
> > > >
> > > > On 2023/9/6 20:42, Kefeng Wang wrote:
> > > > > This is an RFC, even patch3 is a hack to fix the softlock issue when
> > > > > populate or depopulate pte with large region, looking forward to your
> > > > > reply and advice, thanks.
> > > >
> > > > Here is the full stack for populate pte,
> > > >
> > > > [ C3] watchdog: BUG: soft lockup - CPU#3 stuck for 26s! [insmod:458]
> > > > [ C3] Modules linked in: test(OE+)
> > > > [ C3] irq event stamp: 320776
> > > > [ C3] hardirqs last enabled at (320775): []
> > > > _raw_spin_unlock_irqrestore+0x98/0xb8
> > > > [ C3] hardirqs last disabled at (320776): []
> > > > el1_interrupt+0x38/0xa8
> > > > [ C3] softirqs last enabled at (318174): []
> > > > __do_softirq+0x658/0x7ac
> > > > [ C3] softirqs last disabled at (318169): []
> > > > ____do_softirq+0x18/0x30
> > > > [ C3] CPU: 3 PID: 458 Comm: insmod Tainted: G OE 6.5.0+ #595
> > > > [ C3] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
> > > > [ C3] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > > > [ C3] pc : _raw_spin_unlock_irqrestore+0x50/0xb8
> > > > [ C3] lr : _raw_spin_unlock_irqrestore+0x98/0xb8
> > > > [ C3] sp : ffff800093386d70
> > > > [ C3] x29: ffff800093386d70 x28: 0000000000000801 x27: ffff0007ffffa9c0
> > > > [ C3] x26: 0000000000000000 x25: 000000000000003f x24: fffffc0004353708
> > > > [ C3] x23: ffff0006d476bad8 x22: fffffc0004353748 x21: 0000000000000000
> > > > [ C3] x20: ffff0007ffffafc0 x19: 0000000000000000 x18: 0000000000000000
> > > > [ C3] x17: ffff80008024e7fc x16: ffff80008055a8f0 x15: ffff80008024ec60
> > > > [ C3] x14: ffff80008024ead0 x13: ffff80008024e7fc x12: ffff6000fffff5f9
> > > > [ C3] x11: 1fffe000fffff5f8 x10: ffff6000fffff5f8 x9 : 1fffe000fffff5f8
> > > > [ C3] x8 : dfff800000000000 x7 : 00000000f2000000 x6 : dfff800000000000
> > > > [ C3] x5 : 00000000f2f2f200 x4 : dfff800000000000 x3 : ffff700012670d70
> > > > [ C3] x2 : 0000000000000001 x1 : c9a5dbfae610fa24 x0 : 000000000004e507
> > > > [ C3] Call trace:
> > > > [ C3] _raw_spin_unlock_irqrestore+0x50/0xb8
> > > > [ C3] rmqueue_bulk+0x434/0x6b8
> > > > [ C3] get_page_from_freelist+0xdd4/0x1680
> > > > [ C3] __alloc_pages+0x244/0x508
> > > > [ C3] alloc_pages+0xf0/0x218
> > > > [ C3] __get_free_pages+0x1c/0x50
> > > > [ C3] kasan_populate_vmalloc_pte+0x30/0x188
> > > > [ C3] __apply_to_page_range+0x3ec/0x650
> > > > [ C3] apply_to_page_range+0x1c/0x30
> > > > [ C3] kasan_populate_vmalloc+0x60/0x70
> > > > [ C3] alloc_vmap_area.part.67+0x328/0xe50
> > > > [ C3] alloc_vmap_area+0x4c/0x78
> > > > [ C3] __get_vm_area_node.constprop.76+0x130/0x240
> > > > [ C3] __vmalloc_node_range+0x12c/0x340
> > > > [ C3] __vmalloc_node+0x8c/0xb0
> > > > [ C3] vmalloc+0x2c/0x40
> > > > [ C3] show_mem_init+0x1c/0xff8 [test]
> > > > [ C3] do_one_initcall+0xe4/0x500
> > > > [ C3] do_init_module+0x100/0x358
> > > > [ C3] load_module+0x2e64/0x2fc8
> > > > [ C3] init_module_from_file+0xec/0x148
> > > > [ C3] idempotent_init_module+0x278/0x380
> > > > [ C3] __arm64_sys_finit_module+0x88/0xf8
> > > > [ C3] invoke_syscall+0x64/0x188
> > > > [ C3] el0_svc_common.constprop.1+0xec/0x198
> > > > [ C3] do_el0_svc+0x48/0xc8
> > > > [ C3] el0_svc+0x3c/0xe8
> > > > [ C3] el0t_64_sync_handler+0xa0/0xc8
> > > > [ C3] el0t_64_sync+0x188/0x190
> > > >
> > > > and for depopulate pte,
> > > >
> > > > [ C6] watchdog: BUG: soft lockup - CPU#6 stuck for 48s! [kworker/6:1:59]
> > > > [ C6] Modules linked in: test(OE+)
> > > > [ C6] irq event stamp: 39458
> > > > [ C6] hardirqs last enabled at (39457): []
> > > > _raw_spin_unlock_irqrestore+0x98/0xb8
> > > > [ C6] hardirqs last disabled at (39458): []
> > > > el1_interrupt+0x38/0xa8
> > > > [ C6] softirqs last enabled at (39420): []
> > > > __do_softirq+0x658/0x7ac
> > > > [ C6] softirqs last disabled at (39415): []
> > > > ____do_softirq+0x18/0x30
> > > > [ C6] CPU: 6 PID: 59 Comm: kworker/6:1 Tainted: G OEL
> > > > 6.5.0+ #595
> > > > [ C6] Hardware name: QEMU QEMU Virtual Machine, BIOS 0.0.0 02/06/2015
> > > > [ C6] Workqueue: events drain_vmap_area_work
> > > > [ C6] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > > > [ C6] pc : _raw_spin_unlock_irqrestore+0x50/0xb8
> > > > [ C6] lr : _raw_spin_unlock_irqrestore+0x98/0xb8
> > > > [ C6] sp : ffff80008fe676b0
> > > > [ C6] x29: ffff80008fe676b0 x28: fffffc000601d310 x27: ffff000edf5dfa80
> > > > [ C6] x26: ffff000edf5dfad8 x25: 0000000000000000 x24: 0000000000000006
> > > > [ C6] x23: ffff000edf5dfad4 x22: 0000000000000000 x21: 0000000000000006
> > > > [ C6] x20: ffff0007ffffafc0 x19: 0000000000000000 x18: 0000000000000000
> > > > [ C6] x17: ffff8000805544b8 x16: ffff800080553d94 x15: ffff8000805c11b0
> > > > [ C6] x14: ffff8000805baeb0 x13: ffff800080047e10 x12: ffff6000fffff5f9
> > > > [ C6] x11: 1fffe000fffff5f8 x10: ffff6000fffff5f8 x9 : 1fffe000fffff5f8
> > > > [ C6] x8 : dfff800000000000 x7 : 00000000f2000000 x6 : dfff800000000000
> > > > [ C6] x5 : 00000000f2f2f200 x4 : dfff800000000000 x3 : ffff700011fcce98
> > > > [ C6] x2 : 0000000000000001 x1 : cf09d5450e2b4f7f x0 : 0000000000009a21
> > > > [ C6] Call trace:
> > > > [ C6] _raw_spin_unlock_irqrestore+0x50/0xb8
> > > > [ C6] free_pcppages_bulk+0x2bc/0x3e0
> > > > [ C6] free_unref_page_commit+0x1fc/0x290
> > > > [ C6] free_unref_page+0x184/0x250
> > > > [ C6] __free_pages+0x154/0x1a0
> > > > [ C6] free_pages+0x88/0xb0
> > > > [ C6] kasan_depopulate_vmalloc_pte+0x58/0x80
> > > > [ C6] __apply_to_page_range+0x3ec/0x650
> > > > [ C6] apply_to_existing_page_range+0x1c/0x30
> > > > [ C6] kasan_release_vmalloc+0xa4/0x118
> > > > [ C6] __purge_vmap_area_lazy+0x4f4/0xe30
> > > > [ C6] drain_vmap_area_work+0x60/0xc0
> > > > [ C6] process_one_work+0x4cc/0xa38
> > > > [ C6] worker_thread+0x240/0x638
> > > > [ C6] kthread+0x1c8/0x1e0
> > > > [ C6] ret_from_fork+0x10/0x20
> > > >
> > > > >
> > > > > Kefeng Wang (3):
> > > > > mm: kasan: shadow: add cond_resched() in kasan_populate_vmalloc_pte()
> > > > > mm: kasan: shadow: move free_page() out of page table lock
> > > > > mm: kasan: shadow: HACK add cond_resched_lock() in
> > > > > kasan_depopulate_vmalloc_pte()
> >
> > The first 2 patches look ok, but yeah, the last is a hack. I also
> > don't have any better suggestions, only more questions.
>
> Thanks Marco, maybe we could convert free_vmap_area_lock from a spinlock
> to a mutex only if KASAN is enabled?
>
I do not think it is a good suggestion. Could you please clarify the
reason for such a conversion?

> >
> > Does this only happen on arm64?
>
> Our test case runs on arm64 qemu (host is x86), so it runs much slower
> than a real board.
> > Do you have a minimal reproducer you can share?
> Here is the code in the test driver,
>
> void *buf = vmalloc(40UL << 30);
> vfree(buf);
>
What is a test driver? Why do you need 42G of memory, for which purpose?

--
Uladzislau Rezki