From: Zhiwei Jiang
To: viro@zeniv.linux.org.uk
Cc: brauner@kernel.org, jack@suse.cz, akpm@linux-foundation.org,
	peterx@redhat.com, axboe@kernel.dk, asml.silence@gmail.com,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, io-uring@vger.kernel.org,
	Zhiwei Jiang
Subject: [PATCH v2 0/2] Fix 100% CPU usage issue in IOU worker threads
Date: Tue, 22 Apr 2025 16:29:11 +0000
Message-Id: <20250422162913.1242057-1-qq282012236@gmail.com>

In the Firecracker VM scenario, we sporadically encountered threads stuck in
the UN state with the following call stack:

[<0>] io_wq_put_and_exit+0xa1/0x210
[<0>] io_uring_clean_tctx+0x8e/0xd0
[<0>] io_uring_cancel_generic+0x19f/0x370
[<0>] __io_uring_cancel+0x14/0x20
[<0>] do_exit+0x17f/0x510
[<0>] do_group_exit+0x35/0x90
[<0>] get_signal+0x963/0x970
[<0>] arch_do_signal_or_restart+0x39/0x120
[<0>] syscall_exit_to_user_mode+0x206/0x260
[<0>] do_syscall_64+0x8d/0x170
[<0>] entry_SYSCALL_64_after_hwframe+0x78/0x80

The cause is a large number of IOU kernel threads saturating the CPU and not
exiting. When the issue occurs, CPU usage reaches 100% and can only be
resolved by rebooting.
Each thread's call trace appears as follows:

iou-wrk-44588  [kernel.kallsyms]  [k] ret_from_fork_asm
iou-wrk-44588  [kernel.kallsyms]  [k] ret_from_fork
iou-wrk-44588  [kernel.kallsyms]  [k] io_wq_worker
iou-wrk-44588  [kernel.kallsyms]  [k] io_worker_handle_work
iou-wrk-44588  [kernel.kallsyms]  [k] io_wq_submit_work
iou-wrk-44588  [kernel.kallsyms]  [k] io_issue_sqe
iou-wrk-44588  [kernel.kallsyms]  [k] io_write
iou-wrk-44588  [kernel.kallsyms]  [k] blkdev_write_iter
iou-wrk-44588  [kernel.kallsyms]  [k] iomap_file_buffered_write
iou-wrk-44588  [kernel.kallsyms]  [k] iomap_write_iter
iou-wrk-44588  [kernel.kallsyms]  [k] fault_in_iov_iter_readable
iou-wrk-44588  [kernel.kallsyms]  [k] fault_in_readable
iou-wrk-44588  [kernel.kallsyms]  [k] asm_exc_page_fault
iou-wrk-44588  [kernel.kallsyms]  [k] exc_page_fault
iou-wrk-44588  [kernel.kallsyms]  [k] do_user_addr_fault
iou-wrk-44588  [kernel.kallsyms]  [k] handle_mm_fault
iou-wrk-44588  [kernel.kallsyms]  [k] hugetlb_fault
iou-wrk-44588  [kernel.kallsyms]  [k] hugetlb_no_page
iou-wrk-44588  [kernel.kallsyms]  [k] hugetlb_handle_userfault
iou-wrk-44588  [kernel.kallsyms]  [k] handle_userfault
iou-wrk-44588  [kernel.kallsyms]  [k] schedule
iou-wrk-44588  [kernel.kallsyms]  [k] __schedule
iou-wrk-44588  [kernel.kallsyms]  [k] __raw_spin_unlock_irq
iou-wrk-44588  [kernel.kallsyms]  [k] io_wq_worker_sleeping

I tracked the address that triggered the fault and the related function
graph, as well as the wake-up side of the user fault, and found the
following.

When an IOU worker faults in a user-space page that is associated with a
userfaultfd, it does not sleep. During scheduling, a check in the IOU worker
context leads to an early return. Meanwhile, the listener on the userfaultfd
user side never performs a COPY in response, so the page table entry remains
empty. Because of the early return, the worker does not sleep and wait to be
woken as it would in a normal user fault. Instead it faults on the same
address over and over, which produces the CPU loop.
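
As a rough illustration of the loop described above, here is a toy
user-space model. It is not kernel code: `toy_access`, `struct toy_stats`
and all parameters are hypothetical names made up for this sketch, and the
"budget" only exists so that the spinning case terminates. It counts how
many times one access faults, depending on whether the faulting thread
sleeps and whether the userfaultfd listener ever fills the page.

```c
#include <stdbool.h>

/* Toy model only; none of these names exist in the kernel. */
struct toy_stats {
	unsigned long faults;	/* number of page faults taken */
	bool completed;		/* the access finally succeeded */
	bool blocked;		/* the thread went to sleep waiting for a wake-up */
};

/*
 * Model one access to a page whose fault is handed to a userfaultfd
 * listener.
 *
 * sleeps_on_fault:   true  = normal user fault, the thread sleeps;
 *                    false = early return, the thread retries at once.
 * listener_responds: true  = the listener fills the page (COPY);
 *                    false = the page table entry stays empty.
 * budget:            upper bound on faults, so the model terminates.
 */
static struct toy_stats toy_access(bool sleeps_on_fault,
				   bool listener_responds,
				   unsigned long budget)
{
	struct toy_stats st = { 0, false, false };
	bool pte_present = false;

	for (;;) {
		if (pte_present) {
			st.completed = true;
			return st;
		}
		if (st.faults == budget)
			return st;	/* still spinning when the budget ran out */

		st.faults++;		/* the access faults on the same address */

		if (listener_responds) {
			pte_present = true;	/* listener filled the page */
			continue;
		}
		if (sleeps_on_fault) {
			st.blocked = true;	/* sleeps, uses no more CPU */
			return st;
		}
		/* Early return without sleeping: retry immediately. */
	}
}
```

With a silent listener, `toy_access(false, false, 1000)` uses the whole
budget on repeated faults, while `toy_access(true, false, 1000)` takes a
single fault and then blocks. That difference is the CPU loop in this
report.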

Therefore, I believe it is necessary to handle user faults specifically, by
setting a new flag that allows the schedule function to continue in such
cases and ensures that the thread sleeps.

Patch 1 io_uring: Add new functions to handle user fault scenarios
Patch 2 userfaultfd: Set the corresponding flag in IOU worker context

Changes since v1:
- Optimized the code, following Jens Axboe's suggestion, to reduce the
  exposure of the IO worker structure.

 fs/userfaultfd.c |  4 ++++
 io_uring/io-wq.c | 35 +++++++++++++++++++++++++++++------
 io_uring/io-wq.h | 12 ++++++++++--
 3 files changed, 43 insertions(+), 8 deletions(-)

-- 
2.34.1