From: Yang Shi
Date: Fri, 31 May 2024 11:13:22 -0700
Subject: Re: [linus:master] [mm] efa7df3e3b: kernel_BUG_at_include/linux/page_ref.h
To: David Hildenbrand
Cc: kernel test robot, Peter Xu, Jason Gunthorpe, Vivek Kasireddy, Rik van Riel, oe-lkp@lists.linux.dev, lkp@intel.com, linux-kernel@vger.kernel.org, Andrew Morton, Matthew Wilcox, Christopher Lameter, linux-mm@kvack.org
References: <202405311534.86cd4043-lkp@intel.com> <890e5a79-8574-4a24-90ab-b9888968d5e5@redhat.com>

On Fri, May 31, 2024 at 11:07 AM Yang Shi wrote:
>
> On Fri, May 31, 2024 at 10:46 AM David Hildenbrand wrote:
> >
> > On 31.05.24 18:50, Yang Shi wrote:
> > > On Fri, May 31, 2024 at 1:24 AM kernel test robot wrote:
> > >>
> > >>
> > >>
> > >> Hello,
> > >>
> > >> kernel test robot noticed "kernel_BUG_at_include/linux/page_ref.h" on:
> > >>
> > >> commit: efa7df3e3bb5da8e6abbe37727417f32a37fba47 ("mm: align larger anonymous mappings on THP boundaries")
> > >> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
> > >>
> > >> [test failed on linus/master      e0cce98fe279b64f4a7d81b7f5c3a23d80b92fbc]
> > >> [test failed on linux-next/master 6dc544b66971c7f9909ff038b62149105272d26a]
> > >>
> > >> in testcase: trinity
> > >> version: trinity-x86_64-6a17c218-1_20240527
> > >> with following parameters:
> > >>
> > >>         runtime: 300s
> > >>         group: group-00
> > >>         nr_groups: 5
> > >>
> > >>
> > >>
> > >> compiler: gcc-13
> > >> test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G
> > >>
> > >> (please refer to attached dmesg/kmsg for entire log/backtrace)
> > >>
> > >>
> > >> we noticed the issue does not always happen. 34 times out of 50 runs as below.
> > >> the parent is clean.
> > >>
> > >> 1803d0c5ee1a3bbe efa7df3e3bb5da8e6abbe377274
> > >> ---------------- ---------------------------
> > >>        fail:runs  %reproduction    fail:runs
> > >>            |             |             |
> > >>            :50          68%          34:50     dmesg.Kernel_panic-not_syncing:Fatal_exception
> > >>            :50          68%          34:50     dmesg.RIP:try_get_folio
> > >>            :50          68%          34:50     dmesg.invalid_opcode:#[##]
> > >>            :50          68%          34:50     dmesg.kernel_BUG_at_include/linux/page_ref.h
> > >>
> > >>
> > >>
> > >> If you fix the issue in a separate patch/commit (i.e. not just a new version of
> > >> the same patch/commit), kindly add following tags
> > >> | Reported-by: kernel test robot
> > >> | Closes: https://lore.kernel.org/oe-lkp/202405311534.86cd4043-lkp@intel.com
> > >>
> > >>
> > >> [  275.267158][ T4335] ------------[ cut here ]------------
> > >> [  275.267949][ T4335] kernel BUG at include/linux/page_ref.h:275!
> > >> [  275.268526][ T4335] invalid opcode: 0000 [#1] KASAN PTI
> > >> [  275.269001][ T4335] CPU: 0 PID: 4335 Comm: trinity-c3 Not tainted 6.7.0-rc4-00061-gefa7df3e3bb5 #1
> > >> [  275.269787][ T4335] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> > >> [  275.270679][ T4335] RIP: 0010:try_get_folio (include/linux/page_ref.h:275 (discriminator 3) mm/gup.c:79 (discriminator 3))
> > >> [  275.271159][ T4335] Code: c3 cc cc cc cc 44 89 e6 48 89 df e8 e4 54 11 00 eb ae 90 0f 0b 90 31 db eb d5 9c 58 0f 1f 40 00 f6 c4 02 0f 84 46 ff ff ff 90 <0f> 0b 48 c7 c6 a0 54 d2 87 48 89 df e8 a9 e9 ff ff 90 0f 0b be 04
> > >
> > > If I read this BUG correctly, it is:
> > >
> > > VM_BUG_ON(!in_atomic() && !irqs_disabled());
> > >
> >
> > Yes, that seems to be the one.
> >
> > > try_grab_folio() actually assumes it is in an atomic context (irq
> > > disabled or preempt disabled) for this call path. This is achieved by
> > > disabling irq in gup fast or calling it in rcu critical section in
> > > page cache lookup path
> >
> > try_grab_folio()->try_get_folio()->folio_ref_try_add_rcu()
> >
> > Is called (mm-unstable) from:
> >
> > (1) gup_fast function, here IRQs are disable
> > (2) gup_hugepte(), possibly problematic
> > (3) memfd_pin_folios(), possibly problematic
> > (4) __get_user_pages(), likely problematic
> >
> > (1) should be fine.
> >
> > (2) is possibly problematic on the !fast path. If so, due to commit
> >      a12083d721d7 ("mm/gup: handle hugepd for follow_page()") ? CCing Peter.
> >
> > (3) is possibly wrong. CCing Vivek.
> >
> > (4) is what we hit here
> >
> > >
> > > And try_grab_folio() is used when the folio is a large folio. The
> >
> >
> > We come via process_vm_rw()->pin_user_pages_remote()->__get_user_pages()->try_grab_folio()
> >
> > That code was added in
> >
> > commit 57edfcfd3419b4799353d8cbd6ce49da075cfdbd
> > Author: Peter Xu
> > Date:   Wed Jun 28 17:53:07 2023 -0400
> >
> >      mm/gup: accelerate thp gup even for "pages != NULL"
> >
> >      The acceleration of THP was done with ctx.page_mask, however it'll be
> >      ignored if **pages is non-NULL.
> >
> >
> > Likely the try_grab_folio() in __get_user_pages() is wrong?
> >
> > As documented, we already hold a refcount. Likely we should better do a
> > folio_ref_add() and sanity check the refcount.
>
> Yes, a plain folio_ref_add() seems ok for these cases.
>
> In addition, the comment of folio_try_get_rcu() says, which is just a
> wrapper of folio_ref_try_add_rcu():
>
>     You can also use this function if you're holding a lock that prevents
>     pages being frozen & removed; eg the i_pages lock for the page cache
>     or the mmap_lock or page table lock for page tables. In this case, it
>     will always succeed, and you could have used a plain folio_get(), but
>     it's sometimes more convenient to have a common function called from
>     both locked and RCU-protected contexts.
>
> So IIUC we can use the plain folio_get() at least for
> process_vm_readv/writev since mmap_lock is held in this path.
>
> >
> >
> >
> > In essence, I think: try_grab_folio() should only be called from GUP-fast where
> > IRQs are disabled.
>
> Yes, I agree. Just the fast path should need to call try_grab_folio().

try_grab_folio() also handles FOLL_PIN and FOLL_GET, so we may just
keep calling it and add a flag to try_grab_folio, just like:

if flag is true
    folio_ref_add()
else
    try_get_folio()

>
> >
> > (2), (3) and (4) are possible offenders of that.
> >
> >
> > Or am I getting it all wrong? :)
> >
> > --
> > Cheers,
> >
> > David / dhildenb
> >
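
To make the flag idea above concrete, here is a rough userspace sketch in C11. It is only a toy model and not kernel code: struct toy_folio, the toy_* helpers and the ref_held parameter are all made-up names for illustration, the refcount is a bare atomic_int, and the FOLL_PIN/FOLL_GET handling that the real try_grab_folio() does is left out. The speculative branch stands in for try_get_folio() (take references only if the count is not already zero), and the plain branch stands in for folio_ref_add() plus the refcount sanity check suggested earlier in the thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct folio: nothing but a reference count. */
struct toy_folio {
    atomic_int refcount;
};

/*
 * Models the speculative path (try_get_folio() in the thread):
 * add refs only if the count has not already dropped to zero.
 */
static bool toy_try_get_folio(struct toy_folio *folio, int refs)
{
    int old = atomic_load(&folio->refcount);

    while (old != 0) {
        /* On failure, 'old' is reloaded with the current count. */
        if (atomic_compare_exchange_weak(&folio->refcount, &old, old + refs))
            return true;
    }
    return false;
}

/*
 * Models the plain path (folio_ref_add() in the thread): the caller
 * already holds a reference, so the add cannot race with a free.
 */
static void toy_folio_ref_add(struct toy_folio *folio, int refs)
{
    int old = atomic_fetch_add(&folio->refcount, refs);

    /* Sanity check: the caller claimed to hold a reference. */
    assert(old > 0);
}

/*
 * Hypothetical flag-based entry point. 'ref_held' is an assumed
 * parameter name: true when the caller already holds a reference
 * (slow path), false for the speculative fast path.
 */
static struct toy_folio *toy_try_grab_folio(struct toy_folio *folio,
                                            int refs, bool ref_held)
{
    if (ref_held) {
        toy_folio_ref_add(folio, refs);
        return folio;
    }
    return toy_try_get_folio(folio, refs) ? folio : NULL;
}
```

With a folio whose count is already nonzero, both branches succeed and simply bump the count; with a count of zero, only the speculative branch can be used safely, and it returns NULL without touching the count.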