Date: Thu, 14 Apr 2022 12:30:06 -0400
From: Peter Xu
To: Marek Szyprowski
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Mike Kravetz, Nadav Amit, Matthew Wilcox, Mike Rapoport, David Hildenbrand, Hugh Dickins, Jerome Glisse, "Kirill A . Shutemov", Andrea Arcangeli, Andrew Morton, Axel Rasmussen, Alistair Popple
Subject: Re: [PATCH v8 03/23] mm: Check against orig_pte for finish_fault()
References: <20220405014646.13522-1-peterx@redhat.com> <20220405014836.14077-1-peterx@redhat.com> <710c48c9-406d-e4c5-a394-10501b951316@samsung.com> <6ccf5f5f-8dc5-16cc-f06c-78401b822a54@samsung.com>
In-Reply-To: <6ccf5f5f-8dc5-16cc-f06c-78401b822a54@samsung.com>

On Thu, Apr 14, 2022 at 09:51:01AM +0200, Marek Szyprowski wrote:
> Hi Peter,
>
> On 13.04.2022 18:43, Peter Xu wrote:
> > On Wed, Apr 13, 2022 at 04:03:28PM +0200, Marek Szyprowski wrote:
> >> On 05.04.2022 03:48, Peter Xu wrote:
> >>> We used to check against none pte in finish_fault(), with the assumption
> >>> that the
> >>> orig_pte is always none pte.
> >>>
> >>> This change prepares us to be able to call do_fault() on !none ptes. For
> >>> example, we should allow that to happen for pte markers so that we can
> >>> restore information out of the pte markers.
> >>>
> >>> Let's change the "pte_none" check into detecting changes since we fetched
> >>> orig_pte. One trivial thing to take care of here is, when pmd==NULL for
> >>> the pgtable we may not initialize orig_pte at all in handle_pte_fault().
> >>>
> >>> By default orig_pte will be all zeros; however, the problem is that not
> >>> all architectures use all-zeros for a none pte. pte_clear() will be the
> >>> right thing to use here so that we'll always have a valid orig_pte value
> >>> for the whole handle_pte_fault() call.
> >>>
> >>> Signed-off-by: Peter Xu
> >> This patch landed in today's linux next-20220413 as commit fa6009949163
> >> ("mm: check against orig_pte for finish_fault()"). Unfortunately it
> >> causes serious system instability on some ARM 32bit machines. I've
> >> observed it on all tested boards (various Samsung Exynos based,
> >> Raspberry Pi 3b and 4b, even QEMU's virt 32bit machine) when the kernel
> >> was compiled from multi_v7_defconfig.
> > Thanks for the report.
> >
> >> Here is a crash log from QEMU's ARM 32bit virt machine:
> >>
> >> 8<--- cut here ---
> >> Unable to handle kernel paging request at virtual address e093263c
> >> [e093263c] *pgd=42083811, *pte=00000000, *ppte=00000000
> >> Internal error: Oops: 807 [#1] SMP ARM
> >> Modules linked in:
> >> CPU: 1 PID: 37 Comm: kworker/u4:0 Not tainted
> >> 5.18.0-rc2-00176-gfa6009949163 #11684
> >> Hardware name: Generic DT based system
> >> PC is at cpu_ca15_set_pte_ext+0x4c/0x58
> >> LR is at handle_mm_fault+0x46c/0xbb0
> > I had a feeling that for some reason the pte_clear() isn't working right
> > there when it's applied to a kernel stack variable for arm32.
> > I'm a total newbie to arm32, so what I'm reading is this:
> >
> > https://people.kernel.org/linusw/arm32-page-tables
> >
> > Especially:
> >
> > https://protect2.fireeye.com/v1/url?k=35bc90ac-6a27a9bd-35bd1be3-0cc47a31cdbc-b032cb1d178dc691&q=1&e=c82daefb-c86b-4ca1-8db1-cadbdc124ed2&u=https%3A%2F%2Fdflund.se%2F%7Etriad%2Fimages%2Fclassic-mmu-page-table.jpg
> >
> > It does match what I read in arm32's proc-v7-2level.S, in the comment
> > above cpu_v7_set_pte_ext:
> >
> >  *  - ptep - pointer to level 2 translation table entry
> >  *           (hardware version is stored at +2048 bytes)   <----------
> >
> > So it seems to me that arm32 needs to store some metadata at offset 0x800
> > of any pte_t* pointer passed over to pte_clear(), so it must be a real
> > pgtable or it'll write to random places in the kernel, am I correct?
> >
> > Does it mean that all pte_*() operations upon a kernel stack var will be
> > wrong? I thought it could happen easily in the rest of mm too, but I
> > haven't checked much yet. The facts suggest that the current code most
> > likely just works well with arm32 and no such violation has occurred yet.
> >
> > That does sound a bit tricky, IMHO. I don't have an immediate solution
> > to make it less tricky, though I have a thought for a workaround: simply
> > not calling pte_clear() on the stack var.
> >
> > Would you try the attached patch to replace this problematic patch? So we
> > need to revert commit fa6009949163 and apply the new one. Please let me
> > know whether it solves the problem. So far I've only compile tested it,
> > but I'll run some more tests to make sure the uffd-wp scenarios work
> > right with the new version.
>
> I've reverted fa6009949163 and applied the attached patch on top of
> linux next-20220314. The ARM 32bit issues went away. :)
>
> Feel free to add:
>
> Reported-by: Marek Szyprowski
>
> Tested-by: Marek Szyprowski

Thanks, Marek, for the fast feedback!
I've also verified it for the uffd-wp case, so the whole series keeps running
as usual and nothing else shows up after the new patch replaced the old one.

Andrew, any suggestion on how we proceed with the replacement patch? E.g. do
you want me to post it separately to the list?

Please let me know your preference, thanks.

-- 
Peter Xu
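
For readers who want to see why a pte_t on the stack is a problem here, below is a small user-space sketch of the "+2048 bytes" layout quoted above. It is only a model under stated assumptions, not kernel code: every sim_* name is hypothetical, pte_t is reduced to a plain 32-bit value, and the bounds check exists purely so the demo can refuse the bad store instead of corrupting memory (the real set_pte_ext helpers have no such check and also translate Linux bits into hardware bits, which is left out).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef uint32_t pte_t;                 /* stand-in for the arm32 2-level pte_t */

enum {
    SIM_PAGE_SIZE     = 4096,           /* one page backing a PTE table */
    SIM_HW_PTE_OFFSET = 2048            /* from the quoted proc-v7-2level.S comment */
};

/* Does the hardware copy of *ptep still land inside [obj, obj + size)? */
static int sim_hw_slot_fits(const void *obj, size_t size, const void *ptep)
{
    uintptr_t start = (uintptr_t)obj;
    uintptr_t slot  = (uintptr_t)ptep + SIM_HW_PTE_OFFSET;

    return (uintptr_t)ptep >= start && slot + sizeof(pte_t) <= start + size;
}

/*
 * Simplified model of a set_pte_ext()-style helper: store the Linux view at
 * ptep and a second copy 2048 bytes further on.
 * Returns 0 on success, -1 if the second store would leave the object.
 */
static int sim_set_pte_ext(void *obj, size_t size, void *ptep, pte_t val)
{
    if (!sim_hw_slot_fits(obj, size, ptep))
        return -1;                      /* hypothetical guard, for the demo only */
    memcpy(ptep, &val, sizeof(val));
    memcpy((unsigned char *)ptep + SIM_HW_PTE_OFFSET, &val, sizeof(val));
    return 0;
}

/* Read back the hardware copy that belongs to ptep. */
static pte_t sim_read_hw(const void *ptep)
{
    pte_t v;

    memcpy(&v, (const unsigned char *)ptep + SIM_HW_PTE_OFFSET, sizeof(v));
    return v;
}
```

Calling sim_set_pte_ext() with a pointer into a full 4096-byte buffer succeeds and the value can be read back through sim_read_hw(). Calling it on a lone pte_t local (as with orig_pte) is rejected, because the second store would land 2048 bytes past a 4-byte object, which matches the kind of stray write suspected in the oops above.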