From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 48D3FC48BC3 for ; Tue, 20 Feb 2024 15:52:29 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id A6CEE6B0074; Tue, 20 Feb 2024 10:52:28 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id A1C2E6B0075; Tue, 20 Feb 2024 10:52:28 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 90A3D6B0078; Tue, 20 Feb 2024 10:52:28 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 80E5B6B0074 for ; Tue, 20 Feb 2024 10:52:28 -0500 (EST) Received: from smtpin30.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id 44E1B4073B for ; Tue, 20 Feb 2024 15:52:28 +0000 (UTC) X-FDA: 81812624376.30.35E7E53 Received: from mail-ot1-f47.google.com (mail-ot1-f47.google.com [209.85.210.47]) by imf22.hostedemail.com (Postfix) with ESMTP id 56CCAC0009 for ; Tue, 20 Feb 2024 15:52:26 +0000 (UTC) Authentication-Results: imf22.hostedemail.com; dkim=pass header.d=chromium.org header.s=google header.b=ZZMv8gOw; dmarc=pass (policy=none) header.from=chromium.org; spf=pass (imf22.hostedemail.com: domain of svenva@chromium.org designates 209.85.210.47 as permitted sender) smtp.mailfrom=svenva@chromium.org ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1708444346; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=DdFor41F3IC1r56a0HLGglXn70d2B2IHDR4s07Fdmrs=; b=MZ7WnqZvl/raW0A/FdcjXDuJvAl70+AHUCM36KqwVVDvxsS/UaxuEVbbwlrs7TOTrfBsyI FIo6nG4hg8sk6OeBI+kCtYWKdYFSei/RdUbYnHcAQT+Ifo6bZ40kU+zajHF0XSAg1c5Dhk TGU11y2qV2X9MOYCE7PwdKhNb7XjQzQ= ARC-Authentication-Results: i=1; imf22.hostedemail.com; dkim=pass header.d=chromium.org header.s=google header.b=ZZMv8gOw; dmarc=pass (policy=none) header.from=chromium.org; spf=pass (imf22.hostedemail.com: domain of svenva@chromium.org designates 209.85.210.47 as permitted sender) smtp.mailfrom=svenva@chromium.org ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1708444346; a=rsa-sha256; cv=none; b=fYFMb4QuMHAdRVFaQOx/LsxfQ3DSrOICMmnMlHc9+RJ5piK7x8Creywzp7ndMpRtysPG7t wX5IWaLMU9JCnSCqbaDKoYsUfkyEWYxLt02n2LYK1eDfsQOdjdtFOSBeAGKM29PqNcqIQY H+yqGyQoUIXtPZFMVIDC98e/Qe6luDQ= Received: by mail-ot1-f47.google.com with SMTP id 46e09a7af769-6e0f43074edso3170304a34.1 for ; Tue, 20 Feb 2024 07:52:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; t=1708444345; x=1709049145; darn=kvack.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=DdFor41F3IC1r56a0HLGglXn70d2B2IHDR4s07Fdmrs=; b=ZZMv8gOwdMVbunEAcTkmM7KZNP5LUWztAeWDDwcdvcaAC2eqfGHR7zYjvEOnEKCHSz zwjl7d+/6P/i6KQgNqA9OpC/amNZ7LpNQu4wmjftFBCrROASMLx0YbtIyr2TSpITo3aV tR97Qiit9GU8Z32uk5Qf1d7SN4VmnWFWFh/GA= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1708444345; x=1709049145; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=DdFor41F3IC1r56a0HLGglXn70d2B2IHDR4s07Fdmrs=; b=AWoYer5ZOSx40H1i9sXnHvlxC278PDH7M5+RKeGi+iyJsJXRWzQSNFwK+kCvY48my3 n+G1XOUH2qY2RrZEo5krXdcBLA4i+jeUveVCSNOJc3rdHPXda5/RsobpYuzYZB3rRfEc nHfqk0lZioAwLC3KOJrMHagusUbRSAB8LXvUieslbRTmRjQ8VFQe0MKsBnzYDB8hyOtJ IrFDFMPDGOpU7ECb/ri6HTSepx5mYZL0xllv5jyBLqU0ny8+EG/F/nrxF0Stc+S+6KdL hfhqr2gmkxM9PHYcyLKacwJn0i+uweYRo24AINUN5s1le3NK9yw38WPn1jQKoILHU39s TSHA== X-Forwarded-Encrypted: i=1; AJvYcCXtQjz1Dohm/20MOfdC8HX7qV4ysBFFgvS2mR9LUjgMXXCW4R9/hlYftbBmJ7dF8f29XugrpA1cHG1R7KxKgNFHDkM= X-Gm-Message-State: AOJu0YxFttDZcO2LfaqBZoRK9JWuQMAYVMAd0EjFCtG1fvuBMJJRaiyi xrUDulL/Wd2aOG14sZG7idadmJlTbPij5PD7eJx0xpy05hmVZu4nCVUjM5OdDvNGm8UZXSmzjWI bgBG3NPzeKeU0sPuNUJTnCTFMDzCMvqEpm3zOtczcGjPcMarHcQ== X-Google-Smtp-Source: AGHT+IEWG+l/QRVtfNEh+sJn3QHhpIF8SajAdIrjIPZq0YcYxQ5EHhhZkodoUJbW1yo55UqEMkUo3jXBI9O+TBqnA7c= X-Received: by 2002:a9d:6b08:0:b0:6e4:34ba:add2 with SMTP id g8-20020a9d6b08000000b006e434baadd2mr12014050otp.5.1708444345325; Tue, 20 Feb 2024 07:52:25 -0800 (PST) MIME-Version: 1.0 References: <20240214170720.v1.1.Ic3de2566a7fd3de8501b2f18afa9f94eadb2df0a@changeid> <87jzn0ofdb.wl-tiwai@suse.de> <235ab5aa-90a4-4dd7-b2c6-70469605bcfb@suse.cz> In-Reply-To: <235ab5aa-90a4-4dd7-b2c6-70469605bcfb@suse.cz> From: Sven van Ashbrook Date: Tue, 20 Feb 2024 10:52:14 -0500 Message-ID: Subject: Re: Stall at page allocations with __GFP_RETRY_MAYFAIL (Re: [PATCH v1] ALSA: memalloc: Fix indefinite hang in non-iommu case) To: Vlastimil Babka Cc: Takashi Iwai , Karthikeyan Ramasubramanian , LKML , Brian Geffon , stable@vger.kernel.org, Curtis Malainey , Jaroslav Kysela , Takashi Iwai , linux-sound@vger.kernel.org, linux-mm@kvack.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Rspam-User: X-Rspamd-Server: rspam12 X-Rspamd-Queue-Id: 56CCAC0009 X-Stat-Signature: jazyiqz4x3d5r6x5ntogpxpx9oaa644a X-HE-Tag: 1708444346-131561 X-HE-Meta: U2FsdGVkX1/kA7bWHhqU5TEDb4vLe7O1AYTaTCQdzFWjZYietz+PQrpKT6GTy0qe/HCAfbYK5/zROOwQW7xIxm0RK8JIlBbHG43Z2rnxnHqp9YzZIbk66ha2yqvfBidx24lEtThLq5fqbmCPDlYuaBBpWTVBa/SmE5+2fBjnle8HJjK6/PvAJ4Jg9LjWeUSJPQj07J6mFhnqQm/k4rQIN7GDE7J3HxAjzVTbgyjt7pZPSDIdrqNRrrh2G2nOUUYjZt4KHDbOzrQA2idhqZzdDZx2SO9llJureTLOhtEy+pReYMp7Embl2AtN5T7+JyzspNm4qesYYG54MbsJEuLPP5jDIXXBZUdYdSih7LoZla6EjpozhKXusDIWJR/DpOm9Cz1wCBub3LlRxWIEF5EUoTdq1qroMAKIIW6kpUnTVIRwy1JR7/FKlL7S3kIK8ofFe76fJjCYj3SgLyAqjWc7zUUXSx/SP2ZRZvf/jZTM6ASF7gQ9ZDaBO0jh2RURq2d+IW21uzBZnHULfvrHGvMg2mqmU9UD9D1nDgmfcfN6Dg8lvqnD0J3eKAVPX3hw96CR5qw5kitFBkjeU4hlK4v5rReeeyyTXFpwXgUynGKpxs95Es7jOBT0FVdCcT8c7AC63TZxF5qthWH23hPDq68L9GVxoJGygnDOEBdID9pbBHWNHI4oMDigwbXVGBOpug7zgb/ylkJ+WCp3vz2XqthYNlu4jy3HksM+l2gZS0nzQnUNaXwZDL30wFMecXouId/RsdDUpAcJ3TEG/3oFBNiXVzTk2eWoIARRBgRYegD5CB3tu1jGyglZTUibQ4oUp8GvPDampDlN00jgEabm4kC5phrCo9/E/pI6W94+VDozyVNs6vy9AKCT1pBi0NAvH2tUovWuW9jUjWqzNj3sDFlDx3RiVoSaU3tVnJavdbBdLFiKj3c0F/e+0lt9PgwVdC+HuTyJhJq9D7YetO67Iiz 462p8wfp MVCMiw5zl8jRuTtXDo1zV12XwzFRMVvQnCVNi3Z45d3KW2Sc42bJwh45O9lUyVoWp1uKBYJpXGQ7b8xGNACVJGuM1YiUYko8sYZNfrO7vlfa8dNoaNam1TG152kxTVM3SrknRBAyDuE78ohkNYYGwSW0eLjsUMuVVT8xEYhYfZAbzELLMLpI1C+f7rX3eX71PA64k94RUf1l8TXaPtf9/G4oMbsfNQgb5c43nCIB6ZBbRbZfcMPkFNy2lUZS0FeUSf8IxAEGZj/SXV9VCNJLnrLXpd6awQLeN8hBB4hAYGHmQy7qrZn+bNIoK6bFnb7gQHRh2Jw2jSTp2JOSFJ+PmngKkmjF2L1GrFJdUz9zW1/JRb3JQysnk4qna8KvAdwdfJlClSasTGCyGUrE= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000001, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: Takaski, Vlastimil: thanks so much for the engagement! See below. > On 2/19/24 12:36, Takashi Iwai wrote: > > > > Karthikeyan, Sven, and co: could you guys show the stack trace at the > > stall? This may give us more clear light. Here are two typical stack traces at the stall. Note that the timer interru= pt is just a software watchdog that fires to generate the stack trace. This is running the v6.1 kernel. We should be able to reproduce this on v6.6 as well if need be. <4>[310289.546429] <4>[310289.546431] asm_sysvec_apic_timer_interrupt+0x16/0x20 <4>[310289.546434] RIP: 0010:super_cache_count+0xc/0xea <4>[310289.546438] Code: ff ff e8 48 ac e3 ff 4c 89 e0 48 83 c4 20 5b 41 5c 41 5d 41 5e 41 5f 5d c3 cc cc cc cc cc 0f 1f 44 00 00 f6 87 23 fc ff ff 20 <75> 08 31 c0 c3 cc cc cc cc cc 55 48 89 e5 41 57 41 56 41 54 53 49 <4>[310289.546440] RSP: 0018:ffffa64e8aed35c0 EFLAGS: 00000202 <4>[310289.546443] RAX: 0000000000000080 RBX: 0000000000000400 RCX: 0000000000000000 <4>[310289.546445] RDX: ffffffffa6d66bc8 RSI: ffffa64e8aed3610 RDI: ffff9fd2873dbc30 <4>[310289.546447] RBP: ffffa64e8aed3660 R08: 0000000000000064 R09: 0000000000000000 <4>[310289.546449] R10: ffffffffa6e3b260 R11: ffffffffa5163a52 R12: ffff9fd2873dbc50 <4>[310289.546451] R13: 0000000000046c00 R14: 0000000000000000 R15: 0000000000000000 <4>[310289.546453] ? super_cache_scan+0x199/0x199 <4>[310289.546457] shrink_slab+0xb3/0x37e <4>[310289.546460] shrink_node+0x377/0x110e <4>[310289.546464] ? sysvec_apic_timer_interrupt+0x17/0x80 <4>[310289.546467] ? asm_sysvec_apic_timer_interrupt+0x16/0x20 <4>[310289.546471] try_to_free_pages+0x46e/0x857 <4>[310289.546475] ? psi_task_change+0x7f/0x9c <4>[310289.546478] __alloc_pages_slowpath+0x4e2/0xe5c <4>[310289.546482] __alloc_pages+0x225/0x2a2 <4>[310289.546486] __dma_direct_alloc_pages+0xed/0x1cb <4>[310289.546489] dma_direct_alloc_pages+0x21/0xa3 <4>[310289.546493] dma_alloc_noncontiguous+0xd1/0x144 <4>[310289.546496] snd_dma_noncontig_alloc+0x45/0xe3 <4>[310289.546499] snd_dma_alloc_dir_pages+0x4f/0x81 <4>[310289.546502] hda_cl_stream_prepare+0x66/0x15e [snd_sof_intel_hda_common (HASH:1255 1)] <4>[310289.546510] hda_dsp_cl_boot_firmware+0xc4/0x2ca [snd_sof_intel_hda_common (HASH:1255 1)] <4>[310289.546518] snd_sof_run_firmware+0xca/0x2d7 [snd_sof (HASH:ecd9 2)] <4>[310289.546526] ? hda_dsp_resume+0x97/0x1a7 [snd_sof_intel_hda_common (HASH:1255 1)] <4>[310289.546534] sof_resume+0x155/0x251 [snd_sof (HASH:ecd9 2)] <4>[310289.546542] ? pci_pm_suspend+0x1e7/0x1e7 <4>[310289.546546] dpm_run_callback+0x3c/0x132 <4>[310289.546549] device_resume+0x1f7/0x282 <4>[310289.546552] ? dpm_watchdog_set+0x54/0x54 <4>[310289.546555] async_resume+0x1f/0x5b <4>[310289.546558] async_run_entry_fn+0x2b/0xc5 <4>[310289.546561] process_one_work+0x1be/0x381 <4>[310289.546564] worker_thread+0x20b/0x35b <4>[310289.546568] kthread+0xde/0xf7 <4>[310289.546571] ? pr_cont_work+0x54/0x54 <4>[310289.546574] ? kthread_blkcg+0x32/0x32 <4>[310289.546577] ret_from_fork+0x1f/0x30 <4>[310289.546580] <4>[171032.151834] <4>[171032.151835] asm_sysvec_apic_timer_interrupt+0x16/0x20 <4>[171032.151839] RIP: 0010:_raw_spin_unlock_irq+0x10/0x28 <4>[171032.151842] Code: 2c 70 74 06 c3 cc cc cc cc cc 55 48 89 e5 e8 7e 30 2b ff 5d c3 cc cc cc cc cc 0f 1f 44 00 00 c6 07 00 fb 65 ff 0d af b1 2c 70 <74> 06 c3 cc cc cc cc cc 55 48 89 e5 e8 56 30 2b ff 5d c3 cc cc cc <4>[171032.151844] RSP: 0018:ffff942447b334d8 EFLAGS: 00000286 <4>[171032.151847] RAX: 0000000000000031 RBX: 0000000000000001 RCX: 0000000000000034 <4>[171032.151849] RDX: 0000000000000031 RSI: 0000000000000002 RDI: ffffffff9103b1b0 <4>[171032.151851] RBP: ffff942447b33660 R08: 0000000000000032 R09: 0000000000000010 <4>[171032.151853] R10: ffffffff9103b370 R11: 00000000ffffffff R12: ffffffff9103b160 <4>[171032.151855] R13: ffffd055000111c8 R14: 0000000000000000 R15: 0000000000000031 <4>[171032.151858] evict_folios+0xf9e/0x1307 <4>[171032.151861] ? asm_sysvec_apic_timer_interrupt+0x16/0x20 <4>[171032.151866] shrink_node+0x2e8/0x110e <4>[171032.151870] ? common_interrupt+0x1c/0x95 <4>[171032.151872] ? common_interrupt+0x1c/0x95 <4>[171032.151875] ? asm_common_interrupt+0x22/0x40 <4>[171032.151878] ? __compaction_suitable+0x7c/0x9d <4>[171032.151882] try_to_free_pages+0x46e/0x857 <4>[171032.151885] ? psi_task_change+0x7f/0x9c <4>[171032.151889] __alloc_pages_slowpath+0x4e2/0xe5c <4>[171032.151893] __alloc_pages+0x225/0x2a2 <4>[171032.151896] __dma_direct_alloc_pages+0xed/0x1cb <4>[171032.151900] dma_direct_alloc_pages+0x21/0xa3 <4>[171032.151903] dma_alloc_noncontiguous+0xd1/0x144 <4>[171032.151907] snd_dma_noncontig_alloc+0x45/0xe3 <4>[171032.151910] snd_dma_alloc_dir_pages+0x4f/0x81 <4>[171032.151913] hda_cl_stream_prepare+0x66/0x15e [snd_sof_intel_hda_common (HASH:7df0 1)] <4>[171032.151921] hda_dsp_cl_boot_firmware+0xc4/0x2ca [snd_sof_intel_hda_common (HASH:7df0 1)] <4>[171032.151929] snd_sof_run_firmware+0xca/0x2d7 [snd_sof (HASH:9f20 2)] <4>[171032.151937] ? hda_dsp_resume+0x97/0x1a7 [snd_sof_intel_hda_common (HASH:7df0 1)] <4>[171032.151945] sof_resume+0x155/0x251 [snd_sof (HASH:9f20 2)] <4>[171032.151953] ? pci_pm_suspend+0x1e7/0x1e7 <4>[171032.151957] dpm_run_callback+0x3c/0x132 <4>[171032.151960] device_resume+0x1f7/0x282 <4>[171032.151962] ? dpm_watchdog_set+0x54/0x54 <4>[171032.151965] async_resume+0x1f/0x5b <4>[171032.151968] async_run_entry_fn+0x2b/0xc5 <4>[171032.151971] process_one_work+0x1be/0x381 <4>[171032.151975] worker_thread+0x20b/0x35b <4>[171032.151978] kthread+0xde/0xf7 <4>[171032.151981] ? pr_cont_work+0x54/0x54 <4>[171032.151984] ? kthread_blkcg+0x32/0x32 <4>[171032.151987] ret_from_fork+0x1f/0x30 <4>[171032.151991] On Mon, Feb 19, 2024 at 6:40=E2=80=AFAM Vlastimil Babka wr= ote: > > Yeah, if the inifinite loop with __GFP_RETRY_MAYFAIL happens in a call to > __alloc_pages and not in some retry loop around it in an upper layer (I > tried to check the dma functions but got lost quickly so the exact call > stack would be useful), we definitely want to know the details. It should= n't > happen for costly orders (>3) because the retries are hard limited for th= ose > despite apparent progress or reclaim or compaction. Here are our notes of the indefinite stall we saw on v5.10 with iommu SoCs. We did not pursue debugging the stall at the time, in favour of a work-arou= nd with the gfp flags. Therefore we only have partial confidence in the notes below. Take them with a block of salt, but they may point in a useful direc= tion. 1. try to do a "costly" allocation (order > PAGE_ALLOC_COSTLY_ORDER) with __GFP_RETRY_MAYFAIL set. 2. page alloc's __alloc_pages_slowpath [1] tries to get a page from the freelist. This fails because there is nothing free of that costly order. 3. page alloc tries to reclaim by calling __alloc_pages_direct_reclaim, whi= ch bails out [2] because a zone is ready to be compacted; it pretends to have made a single page of progress. 4. page alloc tries to compact, but this always bails out early [3] because __GFP_IO is not set (it's not passed by the snd allocator, and even if it were, we are suspending so the __GFP_IO flag would be cleared anyway). 5. page alloc believes reclaim progress was made (because of the pretense in item 3) and so it checks whether it should retry compaction. The compaction retry logic [4] thinks it should try again, because: a) reclaim is needed because of the early bail-out in item 4 b) a zonelist is suitable for compaction 6. goto 2. indefinite stall. > > > Also, Vlastimil suggested that tracepoints would be helpful if that's > > really in the page allocator, too. > > We might be able to generate traces by bailing out of the indefinite stall using a timer, which should hopefully give us a device that's "alive enough" to read the traces. Can you advise which tracepoints you'd like to see? Is trace-cmd [5] suitable to capture this? [1] https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src= /third_party/kernel/v5.10/mm/page_alloc.c;l=3D4654;drc=3Da16293af64a1f558da= b9a5dd7fb05fdbc2b7c5c0 [2] https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src= /third_party/kernel/v5.10/mm/vmscan.c;drc=3D44452e4236561f6e36ec587805a52b6= 83e2804c9;l=3D6177 [3] https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src= /third_party/kernel/v5.10/mm/compaction.c;l=3D2479;drc=3Dd7b105aa1559e6c287= f3f372044c21c7400b7784 [4] https://source.chromium.org/chromiumos/chromiumos/codesearch/+/main:src= /third_party/kernel/v5.10/mm/page_alloc.c;l=3D4171;drc=3Da16293af64a1f558da= b9a5dd7fb05fdbc2b7c5c0 [5] https://chromium.googlesource.com/chromiumos/docs/+/HEAD/kernel_develop= ment.md#ftrace-debugging