Message-ID: <5a7fdfa7-5b25-0ed4-2479-661d387b397b@redhat.com>
Date: Mon, 16 Jan 2023 14:07:41 +0100
Subject: Re: [PATCH] mm/khugepaged: Fix ->anon_vma race
From: David Hildenbrand
Organization: Red Hat
To: "Kirill A. Shutemov", Jann Horn
Cc: Andrew Morton, linux-mm@kvack.org, "Kirill A. Shutemov", Zach O'Keefe, linux-kernel@vger.kernel.org, Yang Shi
References: <20230111133351.807024-1-jannh@google.com> <20230112085649.gvriasb2t5xwmxkm@box.shutemov.name> <20230115190654.mehtlyz2rxtg34sl@box.shutemov.name> <20230116123403.fiyv22esqgh7bzp3@box.shutemov.name>
In-Reply-To: <20230116123403.fiyv22esqgh7bzp3@box.shutemov.name>

On 16.01.23 13:34, Kirill A. Shutemov wrote:
> On Mon, Jan 16, 2023 at 01:06:59PM +0100, Jann Horn wrote:
>> On Sun, Jan 15, 2023 at 8:07 PM Kirill A. Shutemov wrote:
>>> On Fri, Jan 13, 2023 at 08:28:59PM +0100, Jann Horn wrote:
>>>> No, that lockdep assert has to be there. Page table traversal is
>>>> allowed under any one of the mmap lock, the anon_vma lock (if the VMA
>>>> is associated with an anon_vma), and the mapping lock (if the VMA is
>>>> associated with a mapping); and so to be able to remove page tables,
>>>> we must hold all three of them.
>>>
>>> Okay, that's fair. I agree with the patch now. Maybe adjust the commit
>>> message a bit?
>>
>> Just to make sure we're on the same page: Are you suggesting that I
>> add this text?
>> "Page table traversal is allowed under any one of the mmap lock, the
>> anon_vma lock (if the VMA is associated with an anon_vma), and the
>> mapping lock (if the VMA is associated with a mapping); and so to be
>> able to remove page tables, we must hold all three of them."
>> Or something else?
>
> Looks good to me.
>
>>> Anyway:
>>>
>>> Acked-by: Kirill A. Shutemov
>>
>> Thanks!
>>
>>> BTW, I've noticed that you recently added tlb_remove_table_sync_one().
>>> I'm not sure why it is needed. Why is the IPI in pmdp_collapse_flush()
>>> not good enough to serialize against GUP fast?
>>
>> If that sent an IPI, it would be good enough; but
>> pmdp_collapse_flush() is not guaranteed to send an IPI.
>> It does a TLB flush, but on some architectures (including arm64 and
>> also virtualized x86), a remote TLB flush can be done without an IPI.
>> For example, arm64 has some fancy hardware support for remote TLB
>> invalidation without IPIs ("broadcast TLB invalidation"), and
>> virtualized x86 has (depending on the hypervisor) things like TLB
>> shootdown hypercalls (under Hyper-V, see hyperv_flush_tlb_multi) or
>> TLB shootdown signalling for preempted CPUs through shared memory
>> (under KVM, see kvm_flush_tlb_multi).
>
> I think such architectures must provide proper pmdp_collapse_flush()
> with the required serialization. Power and S390 already do that.
>

The plan is to eventually move away from (ab)using IPIs to synchronize
with GUP-fast. Moving further in that direction (relying on IPIs) is
wrong.

Jann added the tlb_remove_table_sync_one() call as a quick fix for all
architectures, until we can do better.

Even for ppc64, see:

commit bedf03416913d88c796288f9dca109a53608c745
Author: Yang Shi
Date:   Wed Sep 7 11:01:44 2022 -0700

    powerpc/64s/radix: don't need to broadcast IPI for radix pmd collapse flush

    The IPI broadcast is used to serialize against fast-GUP, but fast-GUP
    will move to use RCU instead of disabling local interrupts in fast-GUP.
    Using an IPI is the old-styled way of serializing against fast-GUP
    although it still works as expected now.

    And fast-GUP now fixed the potential race with THP collapse by checking
    whether PMD is changed or not. So IPI broadcast in radix pmd collapse
    flush is not necessary anymore. But it is still needed for hash TLB.

-- 
Thanks,

David / dhildenb
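For readers following along, the "checking whether PMD is changed or not"
idea from the quoted commit message can be sketched roughly as below. This
is only a user-space model under stated assumptions, not kernel code: every
name in it (model_pmd, model_page, model_gup_fast, model_collapse, the
race_hook parameter) is hypothetical and made up for illustration. The
lockless reader snapshots the PMD value, takes a reference, and then
re-reads the PMD; if the value changed in between, it drops the reference
and reports failure so the caller can fall back to a slow path.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins; none of these are kernel APIs. */
struct model_page {
    atomic_int refcount;        /* models the page reference count */
};

struct model_pmd {
    _Atomic uintptr_t val;      /* models a PMD entry; 0 means "cleared" */
};

/* Models a collapse path clearing the PMD entry. */
static void model_collapse(struct model_pmd *pmd)
{
    atomic_store(&pmd->val, 0);
}

/*
 * Models a lockless fast-GUP style reader:
 *   1. snapshot the PMD value,
 *   2. take a reference on the page,
 *   3. re-read the PMD and back off if it no longer matches the snapshot.
 * race_hook is a test-only assumption: it runs between steps 2 and 3 to
 * simulate a concurrent collapse in a single-threaded demo.
 */
static bool model_gup_fast(struct model_pmd *pmd, struct model_page *page,
                           void (*race_hook)(struct model_pmd *))
{
    assert(pmd != NULL && page != NULL);

    uintptr_t snap = atomic_load(&pmd->val);
    if (snap == 0)
        return false;                         /* nothing mapped */

    atomic_fetch_add(&page->refcount, 1);     /* step 2: grab reference */

    if (race_hook)
        race_hook(pmd);                       /* simulated concurrent change */

    if (atomic_load(&pmd->val) != snap) {     /* step 3: recheck */
        atomic_fetch_sub(&page->refcount, 1); /* undo and back off */
        return false;
    }
    return true;
}
```

In this model the reader never depends on the writer interrupting it; it
only depends on the recheck observing the changed entry, which is the
property that makes the extra IPI unnecessary in the radix case described
above.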