From: James Morse <james.morse@arm.com>
To: Borislav Petkov <bp@alien8.de>
Cc: linux-acpi@vger.kernel.org, kvmarm@lists.cs.columbia.edu,
	linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
	Marc Zyngier <marc.zyngier@arm.com>,
	Christoffer Dall <christoffer.dall@arm.com>,
	Will Deacon <will.deacon@arm.com>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	Rafael Wysocki <rjw@rjwysocki.net>, Len Brown <lenb@kernel.org>,
	Tony Luck <tony.luck@intel.com>,
	Dongjiu Geng <gengdongjiu@huawei.com>,
	Xie XiuQi <xiexiuqi@huawei.com>, Fan Wu <wufan@codeaurora.org>
Subject: Re: [PATCH v7 04/25] ACPI / APEI: Make hest.c manage the estatus memory pool
Date: Thu, 10 Jan 2019 18:20:35 +0000
Message-ID: <7f1621ac-09ba-71c0-d47d-e9ad61660307@arm.com>
In-Reply-To: <20181219144234.GA31643@zn.tnic>

Hi Boris,

On 19/12/2018 14:42, Borislav Petkov wrote:
> On Fri, Dec 14, 2018 at 01:56:16PM +0000, James Morse wrote:
>> /me digs a bit,
>>
>> ghes_estatus_pool_init() allocates memory from hest_ghes_dev_register().
>> Its caller is behind an 'if (!ghes_disable)' in acpi_hest_init(), and is after
>> another 2 calls to apei_hest_parse().
>>
>> If ghes_disable is set, we don't call this thing.
>> If hest_disable is set, acpi_hest_init() exits early.
>> If we don't have a HEST table, acpi_hest_init() exits early.
>>
>> ... if the HEST table doesn't have any GHES entries, hest_ghes_dev_register() is
>> called with ghes_count==0, and does nothing useful. (kmalloc_array(0,...)
>> great!) But we do call ghes_estatus_pool_init().
>>
>> I think a check that ghes_count is non-zero before calling
>> hest_ghes_dev_register() is the cleanest way to avoid this.
> 
> Grrr, what an effing mess that code is! There's hest_disable *and*
> ghes_disable. Do we really need them both?

ghes_disable lets you ignore the firmware-first notifications, but still 'use'
the other error sources:
drivers/pci/pcie/aer.c picks out the three AER types, and uses apei_hest_parse()
to know if firmware is controlling AER, even if ghes_disable is set.

x86's arch_apei_enable_cmcff() looks like it disables MCE to get firmware to
handle them. hest_disable would stop this, but instead ghes_disable keeps that,
and stops the NOTIFY_NMI being registered.
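
To make the difference between the two switches concrete, here is a rough
standalone model of the gating described above. It is a sketch, not the
kernel's acpi_hest_init(): struct hest_model, model_hest_init() and the STEP_*
bits are names made up for illustration, and the ghes_count check is the one
proposed earlier in the thread rather than existing code.

```c
#include <stdbool.h>

/* Hypothetical flags: each bit records one thing the model left enabled. */
enum {
	STEP_HEST_USABLE = 1 << 0, /* other users (e.g. AER) can parse the HEST */
	STEP_CMCFF       = 1 << 1, /* firmware-first machine-check setup ran */
	STEP_GHES        = 1 << 2, /* GHES devices and estatus pool were set up */
};

/* Hypothetical inputs standing in for the cmdline switches and the table. */
struct hest_model {
	bool hest_disable;       /* models hest_disable */
	bool ghes_disable;       /* models ghes_disable */
	bool have_hest;          /* firmware provided a HEST table */
	unsigned int ghes_count; /* GHES entries found in that table */
};

static unsigned int model_hest_init(const struct hest_model *m)
{
	unsigned int done = 0;

	/* hest_disable, or no table at all: exit early, nothing is usable. */
	if (m->hest_disable || !m->have_hest)
		return done;

	/* The table is parsed regardless of ghes_disable. */
	done |= STEP_HEST_USABLE | STEP_CMCFF;

	/* ghes_disable only skips the firmware-first notifications. */
	if (m->ghes_disable)
		return done;

	/* Proposed check: with no GHES entries there is nothing to register. */
	if (m->ghes_count == 0)
		return done;

	done |= STEP_GHES;
	return done;
}
```

With ghes_disable set and a valid table, the model returns STEP_HEST_USABLE |
STEP_CMCFF, which is the case the AER user cares about. With hest_disable set
it returns 0.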


> With my simplifier hat on I wanna say, we should have a single switch -
> apei_disable - and kill those other two. What a damn mess that is.

(do you consider cmdline arguments as ABI, or hard to justify and hard to remove?)

I don't think it's broken enough to justify ripping them out. A user of
ghes_disable would be someone with broken firmware-first handling of AER. They
need to know firmware is changing the register values behind their back (so need
to parse the HEST), but want to ignore the junk notifications. It doesn't sound
like an unlikely scenario.


>> I wanted the estatus pool to be initialised before creating the platform devices
>> in case the order of these things is changed in the future and they get probed
>> immediately, before the pool is initialised.
> 
> Hmmm.
> 
> Actually, I meant flipping those two calls:
> 
>         rc = ghes_estatus_pool_init(ghes_count);
>         if (rc)
>                 goto out;
> 
>         rc = apei_hest_parse(hest_parse_ghes, &ghes_arr);
>         if (rc)
>                 goto err;
> 
> to
> 
>         rc = apei_hest_parse(hest_parse_ghes, &ghes_arr);
>         if (rc)
>                 goto err;
> 
>         rc = ghes_estatus_pool_init(ghes_count);
>         if (rc)
>                 goto out;
> 
> so as not to alloc the pool unnecessarily if the parsing fails.
> 
> Also, AFAICT, the order you have them in now might be a problem anyway
> if
> 
> 	apei_hest_parse(hest_parse_ghes, &ghes_arr);
> 
> fails because then you goto err and that pool leaks, right?

Right, yes. I've been ignoring errors like this on the probe path as it implies
you've got busted ACPI tables, or so little memory you're never going to make it
to user-space. I was more worried about ghes_probe() trying to use the pool
memory before it's been allocated. It doesn't seem right to register the device if
the driver wouldn't work yet. But one is a subsys_initcall(), the driver's is a
device_initcall(), which is obvious enough.

Fixed.
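
For reference, the reordered flow looks roughly like the standalone model
below. This is only a sketch of the parse-then-allocate ordering, not the
actual hest.c change: every model_* name, the live_allocs counter and the
64-byte per-entry size are invented for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

static int live_allocs;  /* outstanding allocations, to spot leaks */
static void *model_pool; /* stands in for the estatus pool */

static void *model_alloc(size_t n)
{
	void *p = malloc(n);

	if (p)
		live_allocs++;
	return p;
}

static void model_free(void *p)
{
	if (p) {
		live_allocs--;
		free(p);
	}
}

/* Stand-in for the pool init; the per-entry size is arbitrary. */
static int model_estatus_pool_init(unsigned int ghes_count)
{
	model_pool = model_alloc((size_t)ghes_count * 64);
	return model_pool ? 0 : -1;
}

/* Stand-in for parsing the table; fails when the tables are busted. */
static int model_hest_parse(bool tables_ok)
{
	return tables_ok ? 0 : -1;
}

static int model_ghes_dev_register(unsigned int ghes_count, bool tables_ok)
{
	void **devs;
	int rc;

	/* No GHES entries: nothing to do (the thread suggests checking
	 * this in the caller instead). */
	if (!ghes_count)
		return 0;

	devs = model_alloc(ghes_count * sizeof(*devs));
	if (!devs)
		return -1;

	/* Parse first: if this fails, no pool has been allocated yet. */
	rc = model_hest_parse(tables_ok);
	if (rc)
		goto out;

	/* Only allocate the pool once parsing has succeeded. */
	rc = model_estatus_pool_init(ghes_count);
out:
	model_free(devs); /* the scratch array is dropped on every path */
	return rc;
}
```

Calling model_ghes_dev_register(4, false) fails and leaves live_allocs at zero,
so a failed parse no longer has a pool to leak. On success the only
outstanding allocation is the pool itself.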


Thanks,

James
