linux-mm.kvack.org archive mirror
From: Jane Chu <jane.chu@oracle.com>
To: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Petr Mladek <pmladek@suse.com>,
	"rostedt@goodmis.org" <rostedt@goodmis.org>,
	"senozhatsky@chromium.org" <senozhatsky@chromium.org>,
	"linux@rasmusvillemoes.dk" <linux@rasmusvillemoes.dk>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Haakon Bugge <haakon.bugge@oracle.com>,
	John Haxby <john.haxby@oracle.com>,
	Jane Chu <jane.chu@oracle.com>
Subject: Re: [PATCH] vsprintf: protect kernel from panic due to non-canonical pointer dereference
Date: Wed, 19 Oct 2022 20:16:17 +0000
Message-ID: <a555035a-0905-7c7c-bc8e-d5512ca8a84a@oracle.com>
In-Reply-To: <Y1BPc8JsEoApKJkL@smile.fi.intel.com>

On 10/19/2022 12:26 PM, Andy Shevchenko wrote:
> On Wed, Oct 19, 2022 at 06:36:07PM +0000, Jane Chu wrote:
>> On 10/18/2022 1:49 PM, Andy Shevchenko wrote:
>>> On Tue, Oct 18, 2022 at 08:30:01PM +0000, Jane Chu wrote:
>>>> On 10/18/2022 1:07 PM, Andy Shevchenko wrote:
>>>>> On Tue, Oct 18, 2022 at 06:56:31PM +0000, Jane Chu wrote:
>>>>>> On 10/18/2022 5:45 AM, Petr Mladek wrote:
>>>>>>> On Mon 2022-10-17 19:31:53, Jane Chu wrote:
>>>>>>>> On 10/17/2022 12:25 PM, Andy Shevchenko wrote:
>>>>>>>>> On Mon, Oct 17, 2022 at 01:16:11PM -0600, Jane Chu wrote:
>>>>>>>>>> While debugging a separate issue, it was found that an invalid string
>>>>>>>>>> pointer could very well contain a non-canonical address, such as
>>>>>>>>>> 0x7665645f63616465. In that case, this line of defense isn't enough
>>>>>>>>>> to protect the kernel from crashing due to a general protection fault:
>>>>>>>>>>
>>>>>>>>>> 	if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
>>>>>>>>>>                      return "(efault)";
>>>>>>>>>>
>>>>>>>>>> So instead, use kern_addr_valid() to validate the string pointer.
>>>>>>>>>
>>>>>>>>> How did you check that value of the (invalid string) pointer?
>>>>>>>>>
>>>>>>>>
>>>>>>>> In the bug scenario, the invalid string pointer was an out-of-bound
>>>>>>>> string pointer. While the OOB referencing is fixed,
>>>>>>>
>>>>>>> Could you please provide more details about the fixed OOB?
>>>>>>> What exact vsprintf()/printk() call was broken and eventually
>>>>>>> how it was fixed, please?
>>>>>>
>>>>>> For sensitivity reasons, I'd like to avoid mentioning the specific name
>>>>>> of the sysfs attribute in the bug; instead, I'll just call it
>>>>>> "devX_attrY[]" and describe the precise nature of the issue.
>>>>>>
>>>>>> devX_attrY[] is a string array, declared and filled at compile time,
>>>>>> like
>>>>>>       const char * const devX_attrY[] = {
>>>>>> 	[ATTRY_A] = "Dev X AttributeY A",
>>>>>> 	[ATTRY_B] = "Dev X AttributeY B",
>>>>>> 	...
>>>>>> 	[ATTRY_G] = "Dev X AttributeY G",
>>>>>>       };
>>>>>> such that, when a user runs "cat /sys/devices/systems/.../attry_1",
>>>>>> "Dev X AttributeY B" shows up in the terminal.
>>>>>> That's it, no more references to the pointer devX_attrY[ATTRY_B] after that.
>>>>>>
>>>>>> The bug was that the index into the array was computed incorrectly,
>>>>>> leading to an OOB access, e.g. devX_attrY[11].  The fix was to correct
>>>>>> the calculation, and that is not an upstream fix.
>>>>>>
>>>>>>>
>>>>>>>> the lingering issue
>>>>>>>> is that the kernel ought to be able to protect itself, as the pointer
>>>>>>>> contains a non-canonical address.
>>>>>>>
>>>>>>> Was the pointer used only by the vsprintf()?
>>>>>>> Or was it accessed also by another code, please?
>>>>>>
>>>>>> The OOB pointer was used only by vsprintf() for the "cat" sysfs case.
>>>>>> No other code uses the OOB pointer, verified both by code examination
>>>>>> and test.
>>>>>
>>>>> So, then the vsprintf() is _the_ point to crash and why should we hide that?
>>>>> Because of the crash you found the culprit, right? The efault will hide very
>>>>> important details.
>>>>>
>>>>> So to me it sounds like I like this change less and less...
>>>>
>>>> What about the existing check
>>>>     	if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
>>>>                        return "(efault)";
>>>> ?
>>>
>>> Because it's _special_. We know that the first page is equivalent to a NULL
>>> pointer and the last one is dedicated to so-called error pointers. There are
>>> no more special exceptions to the addresses in the Linux kernel (I don't talk
>>> about the alignment requirements of certain architectures).
>>>
>>>> In an experiment just to print the raw OOB pointer values, I saw the
>>>> output below (the devX_attrY names are substitutes for the real
>>>> attributes; the other values and strings are copied verbatim from "dmesg"):
>>>>
>>>> [ 3002.772329] devX_attrY[26]: (ffffffff84d60ad3) Dev X AttributeY E
>>>> [ 3002.772346] devX_attrY[27]: (ffffffff84d60ae4) Dev X AttributeY F
>>>> [ 3002.772347] devX_attrY[28]: (ffffffff84d60aee) Dev X AttributeY G
>>>> [ 3002.772349] devX_attrY[29]: (0) (null)
>>>> [ 3002.772350] devX_attrY[30]: (0) (null)
>>>> [ 3002.772351] devX_attrY[31]: (0) (null)
>>>> [ 3002.772352] devX_attrY[32]: (7665645f63616465) (einval)
>>>> [ 3002.772354] devX_attrY[33]: (646e61685f656369) (einval)
>>>> [ 3002.772355] devX_attrY[34]: (6f635f65755f656c) (einval)
>>>> [ 3002.772355] devX_attrY[35]: (746e75) (einval)
>>>>
>>>> where starting from index 29 are all OOB pointers.
>>>>
>>>> As you can see, when the OOB pointers are NULL, "(null)" is printed
>>>> thanks to the existing check. When the OOB pointers are non-canonical,
>>>> the fact that the pointer value deviates from
>>>>      (ffffffff84d60aee + 4 * sizeof(void *))
>>>> evidently shows that the OOBs are detectable.
>>>>
>>>> The question then is why should the non-canonical OOBs be treated
>>>> differently from NULL and ERR_VALUE?
>>>
>>> Obviously, to see the crash. And let the kernel _crash_. Isn't that what we
>>> need in order to see a bug as early as possible?
>>>
>>
>> If the purpose is to see the bug as early as possible, then getting
>> "(efault)" from reading a sysfs attribute would serve the purpose, right?
>>
>> The fact that an OOB pointer has already been turned into either NULL or
>> a non-canonical value implies that *if* kernel code other than
>> vsprintf() references the pointer, it'll crash elsewhere;
> 
> No, not the case for error pointers and NULL.

Sorry, I don't understand. What about the Oops from a NULL pointer dereference?

> 
>> but *if* no
>> other code references the pointer, why crash?
> 
> Because how else can you see the bug?! The trace will give you essential
> information about registers, etc. that gives you a hint about the _cause_ of
> the crash. And we need that cause. The "(efault)" is not even a bit close to
> what a crash gives us.
> 
> So, this is my last message in the discussion.
> 
> Here is a formal NAK. Up to maintainers to decide what to do with this.
> 

Sigh, but thanks for taking the time to articulate your point of view.

-jane
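
For readers following the thread, the two kinds of check being argued about
can be sketched in user-space C. This is only an illustration under stated
assumptions, not the patch itself and not kernel code: it assumes x86-64
with 48 implemented virtual address bits (4-level paging) and a 4 KiB page,
and the names is_special() and is_canonical() are made up for this sketch.
is_special() mirrors the existing "first page or error pointer" test quoted
above; is_canonical() tests whether bits 63..47 are all copies of bit 47.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumptions for this sketch: 4 KiB pages, 48 implemented VA bits. */
#define SKETCH_PAGE_SIZE 4096ULL
#define SKETCH_MAX_ERRNO 4095ULL
#define SKETCH_VA_BITS   48

/*
 * Mirrors the existing vsprintf guard: true for addresses in the first
 * page (NULL-like) or in the last 4095 bytes (error pointers).
 */
static bool is_special(uint64_t addr)
{
	return addr < SKETCH_PAGE_SIZE ||
	       addr >= (uint64_t)0 - SKETCH_MAX_ERRNO;
}

/*
 * An address is canonical when bits 63..47 are all zeros or all ones,
 * i.e. the upper bits are a sign extension of bit 47.
 */
static bool is_canonical(uint64_t addr)
{
	uint64_t top = addr >> (SKETCH_VA_BITS - 1);	/* 17 upper bits */

	return top == 0 || top == 0x1ffffULL;
}
```

With the values from the dmesg excerpt, 0xffffffff84d60aee is canonical
while 0x7665645f63616465 is neither special nor canonical, so only a
canonicality-style test would catch it. Note that 0x746e75 is canonical
under this test yet was still reported as "(einval)" in the experiment,
which suggests the check used there is stricter than canonicality alone.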




