linux-mm.kvack.org archive mirror
From: Zhou Yanjie <zhouyanjie@wanyeetech.com>
To: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Hugh Dickins <hughd@google.com>,
	"Eric W. Biederman" <ebiederm@xmission.com>,
	stable@vger.kernel.org
Subject: Re: [PATCH v3] mm: fix race by making init_zero_pfn() early_initcall
Date: Tue, 30 Mar 2021 12:59:27 +0800
Message-ID: <51e3affb-ea09-65a4-99e1-daba968e6dc8@wanyeetech.com>
In-Reply-To: <20210330044208.8305-1-ilya.lipnitskiy@gmail.com>

Hi Ilya,

On 2021/3/30 12:42 PM, Ilya Lipnitskiy wrote:
> There are code paths that rely on zero_pfn to be fully initialized
> before core_initcall. For example, wq_sysfs_init() is a core_initcall
> function that eventually results in a call to kernel_execve, which
> causes a page fault with a subsequent mmput. If zero_pfn is not
> initialized by then it may not get cleaned up properly and result in an
> error:
>    BUG: Bad rss-counter state mm:(ptrval) type:MM_ANONPAGES val:1
>
> Here is an analysis of the race as seen on a MIPS device. On this
> particular MT7621 device (Ubiquiti ER-X), zero_pfn is PFN 0 until
> initialized, at which point it becomes PFN 5120:
>    1. wq_sysfs_init calls into kobject_uevent_env at core_initcall:
>         [<80340dc8>] kobject_uevent_env+0x7e4/0x7ec
>         [<8033f8b8>] kset_register+0x68/0x88
>         [<803cf824>] bus_register+0xdc/0x34c
>         [<803cfac8>] subsys_virtual_register+0x34/0x78
>         [<8086afb0>] wq_sysfs_init+0x1c/0x4c
>         [<80001648>] do_one_initcall+0x50/0x1a8
>         [<8086503c>] kernel_init_freeable+0x230/0x2c8
>         [<8066bca0>] kernel_init+0x10/0x100
>         [<80003038>] ret_from_kernel_thread+0x14/0x1c
>
>    2. kobject_uevent_env() calls call_usermodehelper_exec() which executes
>       kernel_execve asynchronously.
>
>    3. Memory allocations in kernel_execve cause a page fault, bumping the
>       MM_ANONPAGES RSS counter:
>         [<8015adb4>] add_mm_counter_fast+0xb4/0xc0
>         [<80160d58>] handle_mm_fault+0x6e4/0xea0
>         [<80158aa4>] __get_user_pages.part.78+0x190/0x37c
>         [<8015992c>] __get_user_pages_remote+0x128/0x360
>         [<801a6d9c>] get_arg_page+0x34/0xa0
>         [<801a7394>] copy_string_kernel+0x194/0x2a4
>         [<801a880c>] kernel_execve+0x11c/0x298
>         [<800420f4>] call_usermodehelper_exec_async+0x114/0x194
>
>    4. In case zero_pfn has not been initialized yet, zap_pte_range does
>       not decrement the MM_ANONPAGES RSS counter and the BUG message is
>       triggered shortly afterwards when __mmdrop checks the RSS counters:
>         [<800285e8>] __mmdrop+0x98/0x1d0
>         [<801a6de8>] free_bprm+0x44/0x118
>         [<801a86a8>] kernel_execve+0x160/0x1d8
>         [<800420f4>] call_usermodehelper_exec_async+0x114/0x194
>         [<80003198>] ret_from_kernel_thread+0x14/0x1c
>
> To avoid races such as described above, initialize init_zero_pfn at
> early_initcall level. Depending on the architecture, ZERO_PAGE is either
> constant or gets initialized even earlier, at paging_init, so there is
> no issue with initializing zero_pfn earlier.
>
> Discussion: https://lkml.kernel.org/r/CALCv0x2YqOXEAy2Q=hafjhHCtTHVodChv1qpM=niAXOpqEbt7w@mail.gmail.com
>
> Signed-off-by: Ilya Lipnitskiy <ilya.lipnitskiy@gmail.com>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: "Eric W. Biederman" <ebiederm@xmission.com>
> Cc: stable@vger.kernel.org
> ---
>   mm/memory.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)


Tested-by: 周琰杰 (Zhou Yanjie) <zhouyanjie@wanyeetech.com> # on CU1000-Neo/X1000E and CU1830-Neo/X1830


> diff --git a/mm/memory.c b/mm/memory.c
> index 5c3b29d3af66..e66b11ac1659 100644
> --- a/mm/memory.c
> +++ b/mm/memory.c
> @@ -166,7 +166,7 @@ static int __init init_zero_pfn(void)
>   	zero_pfn = page_to_pfn(ZERO_PAGE(0));
>   	return 0;
>   }
> -core_initcall(init_zero_pfn);
> +early_initcall(init_zero_pfn);
>   
>   void mm_trace_rss_stat(struct mm_struct *mm, int member, long count)
>   {


Thread overview: 4+ messages
2021-03-29  5:24 [PATCH] " Ilya Lipnitskiy
2021-03-29  5:29 ` [PATCH v2] " Ilya Lipnitskiy
2021-03-30  4:42   ` [PATCH v3] " Ilya Lipnitskiy
2021-03-30  4:59     ` Zhou Yanjie [this message]
