linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Jiangfeng Xiao <xiaojiangfeng@huawei.com>
To: "Russell King (Oracle)" <linux@armlinux.org.uk>
Cc: <arnd@arndb.de>, <keescook@chromium.org>, <haibo.li@mediatek.com>,
	<angelogioacchino.delregno@collabora.com>,
	<amergnat@baylibre.com>, <akpm@linux-foundation.org>,
	<dave.hansen@linux.intel.com>, <douzhaolei@huawei.com>,
	<gustavoars@kernel.org>, <jpoimboe@kernel.org>,
	<kepler.chenxin@huawei.com>, <kirill.shutemov@linux.intel.com>,
	<linux-hardening@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<linux-mm@kvack.org>, <linux-arm-kernel@lists.infradead.org>,
	<nixiaoming@huawei.com>, <peterz@infradead.org>,
	<wangbing6@huawei.com>, <wangfangpeng1@huawei.com>,
	<jannh@google.com>, <willy@infradead.org>,
	<David.Laight@aculab.com>
Subject: Re: [PATCH v2] ARM: unwind: improve unwinders for noreturn case
Date: Thu, 21 Mar 2024 17:44:07 +0800	[thread overview]
Message-ID: <bad25937-fc98-8e11-4279-ab6b3a727c1f@huawei.com> (raw)
In-Reply-To: <Zfs7sT6Pxy5yjnPC@shell.armlinux.org.uk>



On 2024/3/21 3:40, Russell King (Oracle) wrote:
> On Wed, Mar 20, 2024 at 11:30:05PM +0800, Jiangfeng Xiao wrote:
>>
>>
>> On 2024/3/20 16:45, Russell King (Oracle) wrote:
>>> On Wed, Mar 20, 2024 at 11:44:38AM +0800, Jiangfeng Xiao wrote:
>>>> This is an off-by-one bug which is common in unwinders,
>>>> due to the fact that the address on the stack points
>>>> to the return address rather than the call address.
>>>>
>>>> So, for example, when the last instruction of a function
>>>> is a function call (e.g., to a noreturn function), it can
>>>> cause the unwinder to incorrectly try to unwind from
>>>> the function after the callee.
>>>>
>>>> foo:
>>>> ...
>>>> 	bl	bar
>>>> ... end of function and thus next function ...
>>>>
>>>> which results in LR pointing into the next function.
>>>>
>>>> Fixed this by subtracting 1 from frmae->pc in the call frame
>>>> (but not exception frames) like ORC on x86 does.
>>>
>>> The reason that I'm not accepting this patch is because the above says
>>> that it fixes it by subtracting 1 from the PC value, but the patch is
>>> *way* more complicated than that and there's no explanation why.
>>>
>>> For example, the following are unexplained:
>>>
>>> - Why do we always need ex_frame
>>
>> ```
>> bar:
>> ...
>> ... end of function bar ...
>>
>> foo:
>>     BUG();
>> ... end of function foo ...
>> ```
>>
>> For example, when the first instruction of function 'foo'
>> is a undefined instruction, after function 'foo' is executed
>> to trigger an exception, 'arm_get_current_stackframe' assigns
>> 'regs->ARM_pc' to 'frame.pc'.
>>
>> If we always decrement frame.pc by 1, unwinder will incorrectly
>> try to unwind from the function 'bar' before the function 'foo'.
>>
>> So we need to 'ext_frame' to distinguish this case
>> where we don't need to subtract 1.
> 
> This just sounds wrong to me. We have two different cases:
> 
> 1) we're unwinding a frame where PC points at the offending instruction.
>    This may or may not be in the exception text.
> 2) we're unwinding a frame that has been created because of a branch,
>    where the PC points at the next instruction _after_ that callsite.
> 
> While we're unwinding, we will mostly hit the second type of frames, but
> we'll only hit the first type on the initial frame. Some exception
> entries will have the PC pointing at the next instruction to be
> executed, others will have it pointing at the offending instruction
> (e.g. if it needs to be retried.)


Thank you for your summary.

Now we try to enumerate all cases:

1) When we hit the first type of frames

1.1) If the pc pointing at the next instruction
     of the offending instruction

1.1.1) If the offending instruction is the first instruction
       of the function
       pc:   cause to unwind from current function
       pc-1: casue to unwind from current function

1.1.2) If the offending instruction is neither the first
       instruction nor the last instruction of the function
       pc:   cause to unwind from current function
       pc-1: casue to unwind from current function

1.1.3) If the offending instruction is the last instruction
       of the function
       pc:   cause to unwind from next    function
       pc-1: casue to unwind from current function

1.2) If the pc pointing at the offending instruction

1.2.1) If the offending instruction is the first instruction
       of the function
       pc:   cause to unwind from current  function
       pc-1: casue to unwind from previous function

1.2.2) If the offending instruction is neither the first
       instruction nor the last instruction of the function
       pc:   cause to unwind from current function
       pc-1: casue to unwind from current function

1.2.3) If the offending instruction is the last instruction
       of the function
       pc:   cause to unwind from current function
       pc-1: casue to unwind from current function

2) When we hit the second type of frames
2.1) pc always pointing at the next instruction after that callsite
2.1.1) If the callsite is the first instruction
       pc:   cause to unwind from current function
       pc-1: casue to unwind from current function

2.1.2) If the callsite is neither the first nor last instruction
       pc:   cause to unwind from current function
       pc-1: casue to unwind from current function

2.1.3) If the callsite is the last instruction
       pc:   cause to unwind from next    function
       pc-1: casue to unwind from current function


All in all, I think you are right.

In case 2), We can always unwind by 'pc-1'.

In case 1), If we unwind by 'pc', case 1.1.3) is problematic.
If we unwind by 'pc-1', 1.2.1) is problematic.


> 
> So, I don't see what being in the exception/entry text really has much
> to do with any decision making here. I think you've found that it works
> for your case, but it won't _always_ work and you're just shifting the
> "bug" with how these traces work from one issue to a different but
> similar issue.



  reply	other threads:[~2024-03-21  9:44 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-03-04  1:39 [PATCH] usercopy: delete __noreturn from usercopy_abort Jiangfeng Xiao
2024-03-04 15:15 ` Jann Horn
2024-03-04 17:40   ` Kees Cook
2024-03-05  3:31     ` Jiangfeng Xiao
2024-03-05  9:32       ` Kees Cook
2024-03-05 11:38         ` Jiangfeng Xiao
2024-03-05 17:58           ` Josh Poimboeuf
2024-03-06  4:00             ` Jiangfeng Xiao
2024-03-06  9:52             ` Russell King (Oracle)
2024-03-06 16:02               ` Josh Poimboeuf
2024-03-09 14:58               ` David Laight
2024-03-18  4:01             ` Jiangfeng Xiao
2024-03-05  2:54   ` Jiangfeng Xiao
2024-03-05  3:12     ` Jiangfeng Xiao
2024-03-20  2:19 ` [PATCH] ARM: unwind: improve unwinders for noreturn case Jiangfeng Xiao
2024-03-20  2:46   ` Kees Cook
2024-03-20  3:30     ` Jiangfeng Xiao
2024-03-20  3:34       ` Matthew Wilcox
2024-03-20  3:46         ` Jiangfeng Xiao
2024-03-20  3:44 ` [PATCH v2] " Jiangfeng Xiao
2024-03-20  8:45   ` Russell King (Oracle)
2024-03-20 15:30     ` Jiangfeng Xiao
2024-03-20 19:40       ` Russell King (Oracle)
2024-03-21  9:44         ` Jiangfeng Xiao [this message]
2024-03-21 10:22           ` David Laight
2024-03-21 11:23             ` Russell King (Oracle)
2024-03-21 12:07               ` David Laight
2024-03-21 12:22                 ` Russell King (Oracle)
2024-03-21 12:57                   ` David Laight
2024-03-21 13:08                     ` Russell King (Oracle)
2024-03-21 14:37                       ` David Laight
2024-03-21 14:56                         ` Russell King (Oracle)
2024-03-21 15:20                           ` David Laight
2024-03-21 15:33                             ` Russell King (Oracle)
2024-03-21 22:43               ` Ard Biesheuvel
2024-03-22  0:08                 ` Russell King (Oracle)
2024-03-22  9:24                   ` David Laight
2024-03-22  9:52                     ` Russell King (Oracle)
2024-03-22 12:54                       ` Jiangfeng Xiao
2024-03-22 14:16                       ` David Laight
2024-03-20 15:41 ` [PATCH v3] " Jiangfeng Xiao
2024-03-20 19:42   ` Russell King (Oracle)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bad25937-fc98-8e11-4279-ab6b3a727c1f@huawei.com \
    --to=xiaojiangfeng@huawei.com \
    --cc=David.Laight@aculab.com \
    --cc=akpm@linux-foundation.org \
    --cc=amergnat@baylibre.com \
    --cc=angelogioacchino.delregno@collabora.com \
    --cc=arnd@arndb.de \
    --cc=dave.hansen@linux.intel.com \
    --cc=douzhaolei@huawei.com \
    --cc=gustavoars@kernel.org \
    --cc=haibo.li@mediatek.com \
    --cc=jannh@google.com \
    --cc=jpoimboe@kernel.org \
    --cc=keescook@chromium.org \
    --cc=kepler.chenxin@huawei.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-arm-kernel@lists.infradead.org \
    --cc=linux-hardening@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux@armlinux.org.uk \
    --cc=nixiaoming@huawei.com \
    --cc=peterz@infradead.org \
    --cc=wangbing6@huawei.com \
    --cc=wangfangpeng1@huawei.com \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox