From: James Houghton
Date: Fri, 12 May 2023 14:26:49 -0700
Subject: Re: [PATCH] mm: fix hugetlb page unmap count balance issue
To: Junxiao Chang
Cc: akpm@linux-foundation.org, kirill.shutemov@linux.intel.com, mhocko@suse.com, jmarchan@redhat.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, mike.kravetz@oracle.com, muchun.song@linux.dev
In-Reply-To: <20230512072036.1027784-1-junxiao.chang@intel.com>

On Fri, May 12, 2023 at 12:20 AM Junxiao Chang wrote:
>
> hugetlb page usually is mapped with pmd, but occasionally it might be
> mapped with pte. QEMU can use udma-buf to create host dmabufs for guest
> framebuffers. When QEMU is launched with parameter "hugetlb=on",
> udmabuffer driver maps hugetlb page with pte in page fault handler.
> Call chain looks like:
>
> page_add_file_rmap
> do_set_pte
> finish_fault
> __do_fault -> udmabuf_vm_fault, it maps hugetlb page here.
> do_read_fault
>
> In function page_add_file_rmap, compound is false since it is pte mapping.
>
> When qemu exits and page is unmapped in function page_remove_rmap, the
> hugetlb page should not be handled in pmd way.
>
> This change is to check compound parameter as well as hugetlb flag. It
> fixes below kernel bug which is reproduced with 6.3 kernel:
>
> [ 114.027754] BUG: Bad page cache in process qemu-system-x86 pfn:37aa00
> [ 114.034288] page:000000000dd2153b refcount:514 mapcount:-4 mapping:000000004b01ca30 index:0x13800 pfn:0x37aa00
> [ 114.044277] head:000000000dd2153b order:9 entire_mapcount:-4 nr_pages_mapped:4 pincount:512
> [ 114.052623] aops:hugetlbfs_aops ino:6f93
> [ 114.056552] flags: 0x17ffffc0010001(locked|head|node=0|zone=2|lastcpupid=0x1fffff)
> [ 114.064115] raw: 0017ffffc0010001 fffff7338deb0008 fffff7338dea0008 ffff98dc855ea870
> [ 114.071847] raw: 000000000000009c 0000000000000002 00000202ffffffff 0000000000000000
> [ 114.079572] page dumped because: still mapped when deleted
> [ 114.085048] CPU: 0 PID: 3122 Comm: qemu-system-x86 Tainted: G BU W E 6.3.0-v3+ #62
> [ 114.093566] Hardware name: Intel Corporation Alder Lake Client Platform DDR5 SODIMM SBS RVP, BIOS ADLPFWI1.R00.3084.D89.2303211034 03/21/2023
> [ 114.106839] Call Trace:
> [ 114.109291]
> [ 114.111405] dump_stack_lvl+0x4c/0x70
> [ 114.115073] dump_stack+0x14/0x20
> [ 114.118395] filemap_unaccount_folio+0x159/0x220
> [ 114.123021] filemap_remove_folio+0x54/0x110
> [ 114.127295] remove_inode_hugepages+0x111/0x5b0
> [ 114.131834] hugetlbfs_evict_inode+0x23/0x50
> [ 114.136111] evict+0xcd/0x1e0
> [ 114.139083] iput.part.0+0x183/0x1e0
> [ 114.142663] iput+0x20/0x30
> [ 114.145466] dentry_unlink_inode+0xcc/0x130
> [ 114.149655] __dentry_kill+0xec/0x1a0
> [ 114.153325] dput+0x1ca/0x3c0
> [ 114.156293] __fput+0xf4/0x280
> [ 114.159357] ____fput+0x12/0x20
> [ 114.162502] task_work_run+0x62/0xa0
> [ 114.166088] do_exit+0x352/0xae0
> [ 114.169321] do_group_exit+0x39/0x90
> [ 114.172892] get_signal+0xa09/0xa30
> [ 114.176391] arch_do_signal_or_restart+0x33/0x280
> [ 114.181098] exit_to_user_mode_prepare+0x11f/0x190
> [ 114.185893] syscall_exit_to_user_mode+0x2a/0x50
> [ 114.190509] do_syscall_64+0x4c/0x90
> [ 114.194095] entry_SYSCALL_64_after_hwframe+0x72/0xdc
>
> Fixes: 53f9263baba6 ("mm: rework mapcount accounting to enable 4k mapping of THPs")
> Signed-off-by: Junxiao Chang
> ---
> mm/rmap.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 19392e090bec6..b42fc0389c243 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1377,9 +1377,9 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
>
>         VM_BUG_ON_PAGE(compound && !PageHead(page), page);
>
> -       /* Hugetlb pages are not counted in NR_*MAPPED */
> -       if (unlikely(folio_test_hugetlb(folio))) {
> -               /* hugetlb pages are always mapped with pmds */
> +       /* Hugetlb pages usually are not counted in NR_*MAPPED */
> +       if (unlikely(folio_test_hugetlb(folio) && compound)) {
> +               /* hugetlb pages are mapped with pmds */
>                 atomic_dec(&folio->_entire_mapcount);
>                 return;
>         }

This alone doesn't fix mapcounting for PTE-mapped HugeTLB pages. You need
something like [1]. I can resend it if that's what we should be doing,
but this mapcounting scheme doesn't work when the page structs have
been freed.

It seems like it was a mistake to include support for hugetlb memfds in udmabuf.

[1]: https://lore.kernel.org/linux-mm/20230306230004.1387007-2-jthoughton@google.com/

- James