Subject: Re: [PATCH] mm/hugetlb: avoid get wrong ptep caused by race
From: "Longpeng (Mike)"
To: Sean Christopherson
Date: Thu, 20 Feb 2020 10:32:21 +0800
Message-ID: <6d4d2b59-5b40-49da-a6f7-e8ea34ed30e6@huawei.com>
In-Reply-To: <20200219162231.GE15888@linux.intel.com>
References: <1582027825-112728-1-git-send-email-longpeng2@huawei.com> <20200218203717.GE28156@linux.intel.com> <20200219015836.GM28156@linux.intel.com> <6ccbde03-953c-c006-a07e-8146b84389d9@huawei.com> <20200219162231.GE15888@linux.intel.com>

On 2020/2/20 0:22, Sean Christopherson wrote:
> On Wed, Feb 19, 2020 at 08:21:26PM +0800, Longpeng (Mike) wrote:
>> On 2020/2/19 9:58, Sean Christopherson wrote:
>>> FWIW, I'd be in favor of going the READ/WRITE_ONCE() route for x86, e.g.
>>> convert everything as a follow-up patch (or patches). I'm fairly confident
>>> that KVM's usage of lookup_address_in_mm() is safe, but I wouldn't exactly
>>> bet my life on it. I'd much rather the failing scenario be that KVM uses
>>> a sub-optimal page size as opposed to exploding on a bad pointer.
>>>
>> Um... our testcase starts 50 VMs with 2U4G (using 1G hugepages) and then does
>> live-upgrade (a private feature that just modifies qemu and libvirt) and
>> live-migrate in turn for each one. However, our live-upgraded new QEMU won't do
>> touch_all_pages.
>> Suppose we start a VM without touch_all_pages in QEMU; the VM's guest memory is
>> not mapped in the CR3 pagetable at that moment. When the 2 vcpus are running, they
>> could access some pages belonging to the same 1G hugepage. Both of them will vmexit
>> due to ept_violation and then call gup-->follow_hugetlb_page-->hugetlb_fault, so
>> the race may occur, right?
>
> Yep. The code I'm referring to is similar but different code that just
> happened to go into KVM for kernel 5.6. It has no effect on the gup() flow
> that leads to this bug. I mentioned it above as an example of code outside
> of hugetlb_fault() that would also benefit from moving to READ/WRITE_ONCE().
>
>
I understand better now, thanks for your patience. :)

-- 
Regards,
Longpeng(Mike)