Date: Thu, 10 Jul 2025 19:57:00 +0300
From: Alexey Dobriyan
To: Lorenzo Stoakes
Cc: akpm@linux-foundation.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, David Hildenbrand, "Liam R. Howlett", Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko
Subject: Re: [PATCH] mm: implement "memory.oops_if_bad_pte=1" boot option
Message-ID: <72956765-39e0-445a-b381-6bbc54046544@p183>
References: <4e1b7d2d-ed54-4e0a-a0a4-906b14d9cd41@p183> <525a4060-2c8b-40c5-b4bd-b9c47de94f0f@lucifer.local>
In-Reply-To: <525a4060-2c8b-40c5-b4bd-b9c47de94f0f@lucifer.local>

On Thu, Jul 10, 2025 at 05:16:52PM +0100, Lorenzo Stoakes wrote:
> Sorry but no - this seems to me to just be a hack. And it also appears to
> violate the rules on BUG_ON() (see [0]) so this is just a no.
>
> [0]:https://lore.kernel.org/linux-mm/CAHk-=wjO1xL_ZRKUG_SJuh6sPTQ-6Lem3a3pGoo26CXEsx_w0g@mail.gmail.com/
>
> On Wed, Jul 09, 2025 at 09:10:59PM +0300, Alexey Dobriyan wrote:
> > Implement
> >
> > 	memory.oops_if_bad_pte=1
>
> This is a totally new paradigm afaict - introducing an oops based on user
> input, I really don't think that's sensible.
>
> Unless kernel.panic_on_oops is set this won't necessarily cause anything to
> halt. Really you want a panic_on_bad_pte here, but that would be way way
> too specific.
>
> So it seems like a hack just so you can get a vmcore?
>
> You seem to be using BUG_ON() to _maybe_ cause a panic, maybe not, but by
> doing this you're inferring that there's unrecoverable system instability,
> which is clearly not the case.
>
> All of the bad PTE handling seems to be intended to be recoverable and
> handled by calling code.
>
> Additionally we have uses like zap_present_folio_ptes() which use it to
> output PTE state in the instance of an invalid mapcount value - I don't
> think oopsing there would really be what you wanted right?
>
> >
> > boot option which oopses the machine instead of dreadful
> >
> > 	BUG: Bad page map in process
> >
> > message.
>
> I'm not sure what's so dreadful about it?

Because the root cause is unknown, it happened at an unknown time, dmesg
has rotated away, and nobody bothered to coredump the machine because it
didn't oops!

> And why an oops is better?

I apologize for stating the obvious, but the less time between the bug
and coredump collection, the better.

> > This is intended
> > for people who want to panic at the slightest provocation and
> > for people who ruled out hardware problems which in turn means that
> > delaying vmcore collection is counter-productive.
>
> Seems to be a specific edge case.

Yes, but the option is not enabled by default and costs 2 instructions
on the coldest code path, so...

> > Linux doesn't (never?) panicked on PTE corruption and even implemented
> > ratelimited version of the message meaning it can go for minutes and
> > even hours without anyone noticing which is exactly the opposite of what
> > should be done to facilitate debugging.
>
> But are there real situations you can cite where this has been problematic?
>
> >
> > Not enabled by default.
>
> Yeah, obviously.
>
> >
> > Not advertised.
>
> Umm why? Seems like you just want to add this for your own very specific
> purpose?

Sort of. I don't want to patch and unpatch things every time.

> > +/*
> > + * Oops instead of printing "Bad page map in process" message and
> > + * trying to continue.
> > + */
> > +static bool oops_if_bad_pte __ro_after_init = false;
> > +module_param(oops_if_bad_pte, bool, 0444);
> > +
> >  /*
> >   * This function is called to print an error when a bad pte
> >   * is found. For example, we might have a PFN-mapped pte in
> > @@ -490,6 +498,13 @@ static inline void add_mm_rss_vec(struct mm_struct *mm, int *rss)
> >  static void print_bad_pte(struct vm_area_struct *vma, unsigned long addr,
> >  			  pte_t pte, struct page *page)
> >  {
> > +	/*
> > +	 * This line is a formality to collect vmcore ASAP. Real bug
> > +	 * (hardware or software) happened earlier, current registers and
> > +	 * backtrace aren't interesting.
> > +	 */
> > +	BUG_ON(oops_if_bad_pte);
>
> Except that it won't without panic_on_oops?

Yes, I'll update the comment. It is supposed to be used with
panic_on_oops=1 for maximum effect.

> I mean we can't just go around putting arbitrary BUG_ON()'s like this for
> cases we want data on.

Yes, we can!

> And far worse here - this is a print_xxx() function, and you're making it
> oops? That's really bad.

It's fine because it is a conditional BUG_ON().

> Note that other page table levels can be 'bad' as well, see pgd_bad() et
> al. - none of these will be caught.

Sure, I didn't think much about spreading this option to other places.
It can be spread independently.

> Overall I suspect there's one single case you're worried about, that really
> you want to put a WARN_ON_ONCE() against - then you can panic_on_warn and
> get what you want.

Ehh, no. WARN is for home users who can maybe photograph the oops and
fish it out of dmesg and make a bug report -- so that the system
survives until their data are flushed to disk.

I suspect users are very bifurcated: some want to panic always, some
want to panic during QA but not in the field, and then there are users
whose only hope is a cellphone camera.

> If you can make an argument in favour of this that's convincing then that
> would be a potentially upstreamable patch, but this one isn't, in my view.
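
For readers following the thread, the trade-off being argued can be modelled in user space. The sketch below is not kernel code and does not reproduce mm/memory.c: every name in it (bad_pte_policy, bad_pte_decide, the BAD_PTE_* values) is hypothetical, and the fixed report budget is a simplified stand-in for the ratelimiting mentioned in the patch description. It only illustrates the ordering in the quoted hunk, where the BUG_ON() check comes first in print_bad_pte(), ahead of any reporting.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout: a user-space model, not mm/memory.c. */
enum bad_pte_action {
	BAD_PTE_REPORT,      /* print "Bad page map" and try to continue */
	BAD_PTE_SUPPRESSED,  /* report dropped by the (simplified) ratelimit */
	BAD_PTE_OOPS,        /* stands in for BUG_ON(oops_if_bad_pte) firing */
};

struct bad_pte_policy {
	bool oops_if_bad_pte;      /* models the proposed boot option */
	unsigned int burst;        /* assumed report budget per window */
	unsigned int printed;      /* reports printed in this window */
	unsigned long suppressed;  /* reports silently dropped */
};

/* Decide what happens when a bad PTE is found under the given policy. */
enum bad_pte_action bad_pte_decide(struct bad_pte_policy *p)
{
	assert(p != NULL);

	/* Option set: stop at the first bad PTE, before any ratelimiting. */
	if (p->oops_if_bad_pte)
		return BAD_PTE_OOPS;

	/* Option clear: report until the budget is spent, then go quiet. */
	if (p->printed >= p->burst) {
		p->suppressed++;
		return BAD_PTE_SUPPRESSED;
	}
	p->printed++;
	return BAD_PTE_REPORT;
}
```

With the option clear and a budget of 2, the third call returns BAD_PTE_SUPPRESSED, which corresponds to the "nobody noticed" case described above. With the option set, the very first call returns BAD_PTE_OOPS. Whether a real oops then turns into a panic and a vmcore still depends on panic_on_oops, as both sides of the thread point out.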