From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <1b8696ec-e2be-7b7b-705c-e2dcabb2e8e5@redhat.com>
Date: Mon, 23 Jan 2023 12:07:05 +0100
Subject: Re: [PATCH] mm/khugepaged: Fix ->anon_vma race
To: "Kirill A. Shutemov"
Cc: Jann Horn, Andrew Morton, linux-mm@kvack.org, "Kirill A. Shutemov",
 Zach O'Keefe, linux-kernel@vger.kernel.org, Yang Shi
References: <20230111133351.807024-1-jannh@google.com>
 <20230112085649.gvriasb2t5xwmxkm@box.shutemov.name>
 <20230115190654.mehtlyz2rxtg34sl@box.shutemov.name>
 <20230116123403.fiyv22esqgh7bzp3@box.shutemov.name>
 <5a7fdfa7-5b25-0ed4-2479-661d387b397b@redhat.com>
 <20230116134710.n4dgtrutt6rqif62@box.shutemov.name>
From: David Hildenbrand
Organization: Red Hat
In-Reply-To: <20230116134710.n4dgtrutt6rqif62@box.shutemov.name>

On 16.01.23 14:47, Kirill A. Shutemov wrote:
> On Mon, Jan 16, 2023 at 02:07:41PM +0100, David Hildenbrand wrote:
>> On 16.01.23 13:34, Kirill A. Shutemov wrote:
>>> On Mon, Jan 16, 2023 at 01:06:59PM +0100, Jann Horn wrote:
>>>> On Sun, Jan 15, 2023 at 8:07 PM Kirill A. Shutemov wrote:
>>>>> On Fri, Jan 13, 2023 at 08:28:59PM +0100, Jann Horn wrote:
>>>>>> No, that lockdep assert has to be there. Page table traversal is
>>>>>> allowed under any one of the mmap lock, the anon_vma lock (if the VMA
>>>>>> is associated with an anon_vma), and the mapping lock (if the VMA is
>>>>>> associated with a mapping); and so to be able to remove page tables,
>>>>>> we must hold all three of them.
>>>>>
>>>>> Okay, that's fair. I agree with the patch now. Maybe adjust the commit
>>>>> message a bit?
>>>>
>>>> Just to make sure we're on the same page: Are you suggesting that I
>>>> add this text?
>>>> "Page table traversal is allowed under any one of the mmap lock, the
>>>> anon_vma lock (if the VMA is associated with an anon_vma), and the
>>>> mapping lock (if the VMA is associated with a mapping); and so to be
>>>> able to remove page tables, we must hold all three of them."
>>>> Or something else?
>>>
>>> Looks good to me.
>>>
>>>>> Anyway:
>>>>>
>>>>> Acked-by: Kirill A. Shutemov
>>>>
>>>> Thanks!
>>>>
>>>>> BTW, I've noticed that you recently added tlb_remove_table_sync_one().
>>>>> I'm not sure why it is needed. Why IPI in pmdp_collapse_flush() is not
>>>>> good enough to serialize against GUP fast?
>>>>
>>>> If that sent an IPI, it would be good enough; but
>>>> pmdp_collapse_flush() is not guaranteed to send an IPI.
>>>> It does a TLB flush, but on some architectures (including arm64 and
>>>> also virtualized x86), a remote TLB flush can be done without an IPI.
>>>> For example, arm64 has some fancy hardware support for remote TLB
>>>> invalidation without IPIs ("broadcast TLB invalidation"), and
>>>> virtualized x86 has (depending on the hypervisor) things like TLB
>>>> shootdown hypercalls (under Hyper-V, see hyperv_flush_tlb_multi) or
>>>> TLB shootdown signalling for preempted CPUs through shared memory
>>>> (under KVM, see kvm_flush_tlb_multi).
>>>
>>> I think such architectures must provide proper pmdp_collapse_flush()
>>> with the required serialization. Power and S390 already do that.
>>>
>>
>> The plan is to eventually move away from (ab)using IPI to synchronize with
>> GUP-fast. Moving further into that direction is wrong.
>>
>> The flush was added as a quick fix for all architectures by Jann, until
>> we can do better.
>>
>> Even for ppc64, see:
>>
>> commit bedf03416913d88c796288f9dca109a53608c745
>> Author: Yang Shi
>> Date: Wed Sep 7 11:01:44 2022 -0700
>>
>> powerpc/64s/radix: don't need to broadcast IPI for radix pmd collapse flush
>> The IPI broadcast is used to serialize against fast-GUP, but fast-GUP will
>> move to use RCU instead of disabling local interrupts in fast-GUP. Using
>> an IPI is the old-styled way of serializing against fast-GUP although it
>> still works as expected now.
>> And fast-GUP now fixed the potential race with THP collapse by checking
>> whether PMD is changed or not. So IPI broadcast in radix pmd collapse
>> flush is not necessary anymore. But it is still needed for hash TLB.
>
> Okay. But I think tlb_remove_table_sync_one() belongs inside
> pmdp_collapse_flush().
> Collapsing pmd table into huge page without
> serialization is a bug. They should not be separate.

Agreed. But I wonder if it should be moved into a generic
pmdp_collapse_flush() that calls an arch-specific __pmdp_collapse_flush().

--
Thanks,

David / dhildenb
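
For illustration only, a minimal user-space model of the layering suggested
above: a generic pmdp_collapse_flush() that calls an arch hook and then always
performs the serialization step. This is a sketch, not kernel code. Only the
names pmdp_collapse_flush(), __pmdp_collapse_flush() and
tlb_remove_table_sync_one() come from this thread; the one-argument
signatures, the pmd_t typedef, the stub bodies and the counters are
simplifying assumptions made so the call ordering can be observed.

```c
/* User-space model only; stub bodies and counters are hypothetical. */
#include <assert.h>
#include <stdint.h>

typedef uint64_t pmd_t;	/* assumption: a PMD entry modelled as a plain integer */

/* Counters standing in for real side effects, so ordering is observable. */
static int arch_flush_calls;
static int sync_calls;
static int sync_seen_after_flush;

/*
 * Stand-in for the arch hook: clear the PMD and flush the TLB.
 * Per the thread, this flush is not guaranteed to involve an IPI.
 */
static pmd_t __pmdp_collapse_flush(pmd_t *pmdp)
{
	pmd_t old = *pmdp;

	*pmdp = 0;
	arch_flush_calls++;
	return old;
}

/* Stand-in for the step that serializes against GUP-fast. */
static void tlb_remove_table_sync_one(void)
{
	/* Record whether an arch flush has already happened for this call. */
	sync_seen_after_flush = (arch_flush_calls > sync_calls);
	sync_calls++;
}

/*
 * Generic wrapper: callers get the flush and the serialization together,
 * so the two can no longer be separated by accident.
 */
static pmd_t pmdp_collapse_flush(pmd_t *pmdp)
{
	pmd_t old;

	assert(pmdp);
	old = __pmdp_collapse_flush(pmdp);
	tlb_remove_table_sync_one();
	return old;
}
```

In this model an architecture would only supply __pmdp_collapse_flush(), and
the serialization would live in one generic place. Whether that fits the real
arch overrides is exactly the open question raised above.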