From: Vitaly Kuznetsov
To: Peter Zijlstra, Ajay Kaher
Cc: gregkh@linuxfoundation.org, stable@vger.kernel.org,
 torvalds@linux-foundation.org, punit.agrawal@arm.com,
 akpm@linux-foundation.org, kirill.shutemov@linux.intel.com,
 willy@infradead.org, will.deacon@arm.com, mszeredi@redhat.com,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, srivatsab@vmware.com,
 srivatsa@csail.mit.edu, amakhalov@vmware.com, srinidhir@vmware.com,
 bvikas@vmware.com, anishs@vmware.com, vsirnapalli@vmware.com,
 srostedt@vmware.com, Vlastimil Babka, Oscar Salvador, Thomas Gleixner,
 Ingo Molnar, Juergen Gross, Borislav Petkov, Dave Hansen, Andy Lutomirski
Subject: Re: [PATCH v3 8/8] x86, mm, gup: prevent get_page() race with munmap in paravirt guest
In-Reply-To: <20191216130443.GN2844@hirez.programming.kicks-ass.net>
References: <1576529149-14269-1-git-send-email-akaher@vmware.com>
 <1576529149-14269-9-git-send-email-akaher@vmware.com>
 <20191216130443.GN2844@hirez.programming.kicks-ass.net>
Date: Mon, 16 Dec 2019 14:30:44 +0100
Message-ID: <87lfrc9z3v.fsf@vitty.brq.redhat.com>

Peter Zijlstra writes:

> On Tue, Dec 17, 2019 at 02:15:48AM +0530, Ajay Kaher wrote:
>> From: Vlastimil Babka
>>
>> The x86 version of get_user_pages_fast() relies on disabled interrupts to
>> synchronize gup_pte_range() between gup_get_pte(ptep); and get_page() against
>> a parallel munmap. The munmap side nulls the pte, then flushes TLBs, then
>> releases the page. As TLB flush is done synchronously via IPI disabling
>> interrupts blocks the page release, and get_page(), which assumes existing
>> reference on page, is thus safe.
>> However when TLB flush is done by a hypercall, e.g. in a Xen PV guest, there is
>> no blocking thanks to disabled interrupts, and get_page() can succeed on a page
>> that was already freed or even reused.
>>
>> We have recently seen this happen with our 4.4 and 4.12 based kernels, with
>> userspace (java) that exits a thread, where mm_release() performs a futex_wake()
>> on tsk->clear_child_tid, and another thread in parallel unmaps the page where
>> tsk->clear_child_tid points to. The spurious get_page() succeeds, but futex code
>> immediately releases the page again, while it's already on a freelist. Symptoms
>> include a bad page state warning, general protection faults accessing a poisoned
>> list prev/next pointer in the freelist, or free page pcplists of two cpus joined
>> together in a single list. Oscar has also reproduced this scenario, with a
>> patch inserting delays before the get_page() to make the race window larger.
>>
>> Fix this by removing the dependency on TLB flush interrupts the same way as the
>
> This is supposed to be fixed by:
>
> arch/x86/Kconfig:       select HAVE_RCU_TABLE_FREE              if PARAVIRT
>

Yes, but HAVE_RCU_TABLE_FREE was enabled on x86 only in 4.14:

commit 9e52fc2b50de3a1c08b44f94c610fbe998c0031a
Author: Vitaly Kuznetsov
Date:   Mon Aug 28 10:22:51 2017 +0200

    x86/mm: Enable RCU based page table freeing (CONFIG_HAVE_RCU_TABLE_FREE=y)

and, if I understood correctly, Ajay is suggesting the patch for older
stable kernels (4.9 and 4.4 I would guess).

-- 
Vitaly
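
For readers following the thread: the race in the quoted commit message
comes from get_page() assuming that a reference on the page already exists.
One common way to drop that assumption in a lockless lookup is to take the
reference only while the count is still non-zero, and then re-read the page
table entry to confirm it did not change in the meantime. The block below is
a userspace sketch of that pattern in C11, not the patch under discussion.
The names fake_page, fake_pte, try_get_ref(), put_ref() and pin_from_slot()
are hypothetical, and the sketch assumes the page descriptor itself stays
allocated after the page is freed, so reading its refcount is always valid.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a page descriptor and a pte slot.
 * These are NOT kernel types; they only model the refcount and the slot. */
struct fake_page {
    atomic_int refcount;                /* 0 means the page has been freed */
};

struct fake_pte {
    _Atomic(struct fake_page *) page;   /* NULL once the unmap side cleared it */
};

/* Take a reference only if the count is still non-zero.
 * Unlike an unconditional increment, this cannot revive a freed page. */
static bool try_get_ref(struct fake_page *p)
{
    int old = atomic_load(&p->refcount);

    while (old != 0) {
        /* On failure, 'old' is refreshed with the current value. */
        if (atomic_compare_exchange_weak(&p->refcount, &old, old + 1))
            return true;
    }
    return false;
}

/* Drop a reference; dropping one that was never held is a bug. */
static void put_ref(struct fake_page *p)
{
    int old = atomic_fetch_sub(&p->refcount, 1);

    assert(old > 0);
}

/* Lockless lookup: read the slot, try to pin the page, then re-read the
 * slot. If the slot changed while we were pinning, back off. */
static struct fake_page *pin_from_slot(struct fake_pte *pte)
{
    struct fake_page *p = atomic_load(&pte->page);

    if (p == NULL)
        return NULL;            /* already unmapped */
    if (!try_get_ref(p))
        return NULL;            /* page was freed under us */
    if (atomic_load(&pte->page) != p) {
        put_ref(p);             /* slot changed: undo the pin */
        return NULL;
    }
    return p;
}
```

With this shape, a caller that loses the race gets NULL back and can fall
back to a slower, locked path instead of touching a page that is already on
a freelist.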