From: Grzegorz Halat
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	linux-doc@vger.kernel.org, ghalat@redhat.com, ssaner@redhat.com,
	atomlin@redhat.com, oleksandr@redhat.com, vbendel@redhat.com,
	kirill@shutemov.name, khlebnikov@yandex-team.ru,
	borntraeger@de.ibm.com, Andrew Morton, Iurii Zaikin, Kees Cook,
	Luis Chamberlain, Jonathan Corbet
Subject: [PATCH 1/1] mm: sysctl: add panic_on_mm_error sysctl
Date: Mon, 27 Jan 2020 11:11:00 +0100
Message-Id: <20200127101100.92588-1-ghalat@redhat.com>

The memory management subsystem performs various checks at runtime. If
an inconsistency is detected, the event is logged and the kernel
continues to run. While debugging such problems it is helpful to collect
a memory dump as early as possible. Currently, there is no easy way to
panic the kernel when such an error is detected.

It was proposed[1] to panic the kernel if panic_on_oops is set, but this
approach was not accepted. One of the alternative proposals was the
introduction of a new sysctl.

The patch adds the panic_on_mm_error sysctl.
If the sysctl is set, the kernel panics when memory management detects
an inconsistency. Currently this means a panic when a bad page or a bad
PTE is detected (this may be extended to other places in MM).

Another use case for this sysctl is security-sensitive environments,
where crashing the machine may be preferable to continuing to run with
potentially damaged data structures.

[1] https://marc.info/?l=linux-mm&m=142649500728327&w=2

Signed-off-by: Grzegorz Halat
---
 Documentation/admin-guide/sysctl/kernel.rst | 12 ++++++++++++
 include/linux/kernel.h                      |  1 +
 kernel/sysctl.c                             |  9 +++++++++
 mm/memory.c                                 |  7 +++++++
 mm/page_alloc.c                             |  4 +++-
 5 files changed, 32 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index def074807cee..2fecd6b2547e 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst
@@ -61,6 +61,7 @@ show up in /proc/sys/kernel:
 - overflowgid
 - overflowuid
 - panic
+- panic_on_mm_error
 - panic_on_oops
 - panic_on_stackoverflow
 - panic_on_unrecovered_nmi
@@ -611,6 +612,17 @@ an IO error.
 and you can use this option to take a crash dump.
 
 
+panic_on_mm_error:
+==============
+
+Controls the kernel's behaviour when inconsistency is detected
+by memory management code, for example bad page state or bad PTE.
+
+0: try to continue operation.
+
+1: panic immediately.
+
+
 panic_on_oops:
 ==============
 
diff --git a/include/linux/kernel.h b/include/linux/kernel.h
index 0d9db2a14f44..5f9d408512ff 100644
--- a/include/linux/kernel.h
+++ b/include/linux/kernel.h
@@ -518,6 +518,7 @@ extern int oops_in_progress;		/* If set, an oops, panic(), BUG() or die() is in
 extern int panic_timeout;
 extern unsigned long panic_print;
 extern int panic_on_oops;
+extern int panic_on_mm_error;
 extern int panic_on_unrecovered_nmi;
 extern int panic_on_io_nmi;
 extern int panic_on_warn;
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 70665934d53e..6477e1cce28b 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1238,6 +1238,15 @@ static struct ctl_table kern_table[] = {
 		.extra1		= SYSCTL_ZERO,
 		.extra2		= SYSCTL_ONE,
 	},
+	{
+		.procname	= "panic_on_mm_error",
+		.data		= &panic_on_mm_error,
+		.maxlen		= sizeof(int),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= SYSCTL_ZERO,
+		.extra2		= SYSCTL_ONE,
+	},
 #if defined(CONFIG_SMP) && defined(CONFIG_NO_HZ_COMMON)
 	{
 		.procname	= "timer_migration",
diff --git a/mm/memory.c b/mm/memory.c
index 45442d9a4f52..cce74ff39447 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -71,6 +71,7 @@
 #include
 #include
 #include
+#include
 
 #include
 
@@ -88,6 +89,8 @@
 #warning Unfortunate NUMA and NUMA Balancing config, growing page-frame for last_cpupid.
 #endif
 
+int panic_on_mm_error __read_mostly;
+
 #ifndef CONFIG_NEED_MULTIPLE_NODES
 /* use the per-pgdat data instead for discontigmem - mbligh */
 unsigned long max_mapnr;
@@ -543,6 +546,10 @@ static void print_bad_pte(struct vm_area_struct *vma, unsigned long addr,
 		 vma->vm_ops ? vma->vm_ops->fault : NULL,
 		 vma->vm_file ? vma->vm_file->f_op->mmap : NULL,
 		 mapping ? mapping->a_ops->readpage : NULL);
+
+	print_modules();
+	if (panic_on_mm_error)
+		panic("Bad page map detected");
 	dump_stack();
 	add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
 }
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d047bf7d8fd4..2ea6a65ba011 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -643,9 +643,11 @@ static void bad_page(struct page *page, const char *reason,
 	if (bad_flags)
 		pr_alert("bad because of flags: %#lx(%pGp)\n",
 						bad_flags, &bad_flags);
-	dump_page_owner(page);
 
+	dump_page_owner(page);
 	print_modules();
+	if (panic_on_mm_error)
+		panic("Bad page state detected");
 	dump_stack();
 out:
 	/* Leave bad fields for debug, except PageBuddy could make trouble */
-- 
2.21.1
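
For anyone who wants to try the knob from userspace, here is a minimal
sketch of how it could be toggled. It assumes a kernel built with this
patch, so that /proc/sys/kernel/panic_on_mm_error exists, and a caller
with permission to write it. The helper names set_sysctl_flag() and
get_sysctl_flag() are hypothetical and only for illustration; they are
not part of the patch. The path is passed as a parameter so the helpers
can also be exercised against a regular file.

```c
#include <stdio.h>

/* Path the patch would create; only present on a kernel carrying this patch. */
#define PANIC_ON_MM_ERROR_PATH "/proc/sys/kernel/panic_on_mm_error"

/*
 * Hypothetical helper: write 0 or 1 to a sysctl file.
 * Returns 0 on success, -1 on a rejected value or an I/O error.
 */
int set_sysctl_flag(const char *path, int value)
{
	FILE *f;
	int ok;

	/* Mirror the SYSCTL_ZERO..SYSCTL_ONE range of the ctl_table entry. */
	if (value != 0 && value != 1)
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	ok = fprintf(f, "%d\n", value) > 0;
	/* stdio buffers the write, so a kernel-side error may only show up here. */
	if (fclose(f) != 0)
		ok = 0;

	return ok ? 0 : -1;
}

/* Hypothetical helper: read the current value back. Returns 0 on success. */
int get_sysctl_flag(const char *path, int *value)
{
	FILE *f = fopen(path, "r");
	int ok;

	if (!f)
		return -1;

	ok = fscanf(f, "%d", value) == 1;
	fclose(f);

	return ok ? 0 : -1;
}
```

On a patched kernel, set_sysctl_flag(PANIC_ON_MM_ERROR_PATH, 1) run as
root should be equivalent to writing 1 to that file from a shell.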