Subject: Re: Report a huge page issue in kernel version v5.19.xx
From: Muchun Song
Date: Mon, 3 Jul 2023 16:51:07 +0800
To: Fumin Gao
Cc: "mike.kravetz@oracle.com", Muchun Song, "linux-mm@kvack.org"
Message-Id: <12889800-1784-4ECE-88F3-3F88E99688A7@linux.dev>

> On Jul 3, 2023, at 11:07, Fumin Gao wrote:
>
> Hi,
> What’s the issue?
> Recently in our product, I found an issue in kernel version v5.19.xx; this issue was fixed in kernel version v6.xx.
> The issue is that I can’t find out which node a huge page is on with the move_pages system call.
> How to reproduce this issue?
> I attached my test programme file in email.
> virtaddr = mmap(NULL, ONE_GIG, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
> *(char *)virtaddr = 0;
> if (syscall(SYS_move_pages, 0, 1, &virtaddr, NULL, &NumaNode, 0) != 0)
> {
>     printf("Get virtual address %p on NumaNode failed\n", virtaddr);
> }
> printf("create shared memory with mmap, virtaddr %p on Node %d, errno %d\n", virtaddr, NumaNode, errno);
> When tested with kernel v5.19.xx, the value of NumaNode is -2 (-ENOENT).
> My analysis of this issue.
> Based on the following trace and kernel source code, I can see the function call chain:
> kernel_move_pages -> do_pages_stat -> do_pages_stat_array -> follow_page -> follow_page_mask
> -> follow_p4d_mask -> follow_pud_mask -> follow_huge_pud
> [001] ..... 510329749178328: sys_move_pages(pid: 0, nr_pages: 1, pages: 7fffa23a2c90, nodes: 0, status: 7fffa23a2c9c, flags: 0)
> [001] ..... 510329749179360: sys_enter: NR 279 (0, 1, 7fffa23a2c90, 0, 7fffa23a2c9c, 0)
> [001] ...1. 510329749185448: mmap_lock_start_locking: mm=00000000e0f35bcd memcg_path=/user.slice/user-1000.slice/session-1.scope write=false
> [001] ...1. 510329749187872: mmap_lock_acquire_returned: mm=00000000e0f35bcd memcg_path=/user.slice/user-1000.slice/session-1.scope write=false success=true
> [001] ..... 510329749196628: p_follow_page_0: (follow_page+0x0/0xe0)
> [001] ..... 510329749199690: p_vma_is_secretmem_0: (vma_is_secretmem+0x0/0x20)
> [001] ..... 510329749202194: p_follow_page_mask_0: (follow_page_mask+0x0/0x160)
> [001] ..... 510329749206928: p_follow_huge_addr_0: (follow_huge_addr+0x0/0x20)
> [001] ..... 510329749210628: myretprobe: (follow_page_mask+0x38/0x160 <- follow_huge_addr) ret=0xffffffffffffffea
> [001] ..... 510329749216464: p_follow_pud_mask_isra_0_0: (follow_pud_mask.isra.0+0x0/0x1e0)
> [001] ..... 510329749221108: p_follow_huge_pud_0: (follow_huge_pud+0x0/0x80)
> [001] ..... 510329749221902: myretprobe: (follow_pud_mask.isra.0+0x1c8/0x1e0 <- follow_huge_pud) ret=0x0
> [001] ..... 510329749223462: myretprobe: (follow_page_mask+0x147/0x160 <- follow_pud_mask.isra.0) ret=0x0
> [001] ..... 510329749224838: myretprobe: (do_pages_stat+0x18b/0x330 <- follow_page) ret=0x0
> [001] ...1. 510329749226096: mmap_lock_released: mm=00000000e0f35bcd memcg_path=/user.slice/user-1000.slice/session-1.scope write=false
> [001] ..... 510329749228348: sys_move_pages -> 0x0
> [001] ..... 510329749229224: sys_exit: NR 279 = 0
> Compared with v5.18.xx, kernel version v5.19.xx adds the FOLL_GET flag in do_pages_stat_array:
> page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);
> But in the function follow_huge_pud, if the flags include FOLL_GET, it returns NULL. This is why the status we get from move_pages is -ENOENT (-2).
> Is my analysis correct?

Correct! If you want v5.19 to work properly, you can apply commit 831568214883
("mm: migration: fix the FOLL_GET failure on following huge page") to fix the issue.

Thanks.
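
For anyone checking whether their kernel is affected, here is a minimal sketch of the query side of the reproducer. It assumes Linux with glibc. The helper names query_node() and describe_status() are made up for illustration and are not from the attached test programme. The point it shows is that move_pages itself returns 0 (as in the trace above), so the failure only shows up in the per-page status value.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the kernel which NUMA node backs one address of
 * the calling process. With nodes == NULL, move_pages() only reports the
 * per-page status and does not migrate anything.
 * Returns 0 if the syscall succeeded, -1 otherwise (errno is set).
 */
int query_node(void *addr, int *status)
{
    assert(status != NULL);
    *status = -EFAULT; /* placeholder, overwritten when the kernel fills it in */
    long rc = syscall(SYS_move_pages, 0, 1UL, &addr, NULL, status, 0);
    return rc == 0 ? 0 : -1;
}

/*
 * Hypothetical helper: interpret the per-page status. A value >= 0 is a
 * node number; a negative value is a negated errno for that page.
 */
const char *describe_status(int status)
{
    if (status >= 0)
        return "page is resident on this node";
    if (status == -ENOENT)
        return "page not present (or the v5.19 FOLL_GET hugetlb issue)";
    return "other per-page error";
}
```

To use it for this report, call query_node(virtaddr, &node) after touching the MAP_HUGETLB mapping. If it returns 0 but node is -ENOENT for a page you have just written to, you are most likely seeing the behaviour described above.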