From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <30e2f5e1-a2de-0036-6242-b6f7021a8692@redhat.com>
Date: Tue, 6 Sep 2022 14:50:46 +0200
Subject: Re: [PATCH] mm: gup: fix the fast GUP race against THP collapse
To: John Hubbard, Yang Shi, peterx@redhat.com,
 kirill.shutemov@linux.intel.com, jgg@nvidia.com, hughd@google.com,
 akpm@linux-foundation.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20220901222707.477402-1-shy828301@gmail.com>
From: David Hildenbrand
Organization: Red Hat

> OK, I believe you're referring to this:
>
>     folio = try_grab_folio(page, 1, flags);
>
> just earlier in gup_pte_range(). Yes that's true...but it's hidden, which
> is unfortunate. Maybe a comment could help.
>

Most certainly.

>>
>> If we still intend to change that code, we should fixup all GUP-fast
>> functions in a similar way. But again, I don't think we need a change here.
>>
>
> It's really rough, having to play this hide-and-seek game of "who did
> the memory barrier". And I'm tempted to suggest adding READ_ONCE() to
> any and all reads of the page table entries, just to help stay out of
> trouble. It's a visual reminder that page table reads are always a
> lockless read and are inherently volatile.
>
> Of course, I realize that adding extra READ_ONCE() calls is not a good
> thing. It might be a performance hit, although, again, these are
> volatile reads by nature, so you probably had a membar anyway.
>
> And looking in reverse, there are actually a number of places here where
> we could probably get away with *removing* READ_ONCE()!
>
> Overall, I would be inclined to load up on READ_ONCE() calls, yes.
> But I sort of expect to be overridden on that, due to potential performance
> concerns, and that's reasonable.
>
> At a minimum we should add a few short comments about what memory
> barriers are used, and why we don't need a READ_ONCE() or something
> stronger when reading the pte.

Adding more unnecessary memory barriers doesn't necessarily improve the
situation.

Messing with memory barriers is and remains absolutely disgusting.

IMHO, only clear documentation and ASCII art can keep it somehow
maintainable for human beings.

>
>
>>
>>>> -     * After this gup_fast can't run anymore. This also removes
>>>> -     * any huge TLB entry from the CPU so we won't allow
>>>> -     * huge and small TLB entries for the same virtual address
>>>> -     * to avoid the risk of CPU bugs in that area.
>>>> +     * This removes any huge TLB entry from the CPU so we won't allow
>>>> +     * huge and small TLB entries for the same virtual address to
>>>> +     * avoid the risk of CPU bugs in that area.
>>>> +     *
>>>> +     * Parallel fast GUP is fine since fast GUP will back off when
>>>> +     * it detects PMD is changed.
>>>>       */
>>>>      _pmd = pmdp_collapse_flush(vma, address, pmd);
>>>
>>> To follow up on David Hildenbrand's note about this in the nearby thread...
>>> I'm also not sure if pmdp_collapse_flush() implies a memory barrier on
>>> all arches. It definitely does do an atomic op with a return value on x86,
>>> but that's just one arch.
>>>
>>
>> I think a ptep/pmdp clear + TLB flush really has to imply a memory
>> barrier, otherwise TLB flushing code might easily mess up with
>> surrounding code. But we should better double-check.
>
> Let's document the function as such, once it's verified: "This is a
> guaranteed memory barrier".

Yes. Hopefully it indeed is on all architectures :)

-- 
Thanks,

David / dhildenb
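
For anyone trying to picture the "back off when the PMD changed" pattern
discussed above, here is a rough userspace sketch using C11 atomics. It is
not kernel code and makes no claim about what any architecture guarantees.
Every name in it (toy_table, toy_gup_fast, toy_collapse, race_hook) is
hypothetical, and it deliberately uses sequentially consistent atomics
everywhere instead of reasoning about which barrier is implied where. The
fully ordered refcount increment only stands in for the role the thread
attributes to try_grab_folio().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these are kernel types. */
struct toy_page { atomic_int refcount; };

struct toy_table {
    _Atomic uint64_t pmd;   /* upper-level entry, cleared by the collapse side */
    _Atomic uint64_t pte;   /* leaf entry */
    struct toy_page page;
};

static void toy_table_init(struct toy_table *t, uint64_t pmd, uint64_t pte)
{
    assert(t != NULL);
    atomic_init(&t->pmd, pmd);
    atomic_init(&t->pte, pte);
    atomic_init(&t->page.refcount, 1);
}

/* Collapse side: clear the pmd with an ordered exchange, return the old value. */
static uint64_t toy_collapse(struct toy_table *t)
{
    return atomic_exchange(&t->pmd, 0);
}

/*
 * Lockless reader: snapshot both entries, take a reference, then recheck.
 * race_hook is a hypothetical test hook that runs inside the race window
 * so a concurrent collapse can be simulated deterministically.
 */
static bool toy_gup_fast(struct toy_table *t, void (*race_hook)(struct toy_table *))
{
    uint64_t pmd = atomic_load(&t->pmd);
    uint64_t pte = atomic_load(&t->pte);

    if (pmd == 0 || pte == 0)
        return false;                           /* nothing mapped */

    atomic_fetch_add(&t->page.refcount, 1);     /* ordered RMW before the recheck */

    if (race_hook)
        race_hook(t);

    if (atomic_load(&t->pmd) != pmd || atomic_load(&t->pte) != pte) {
        atomic_fetch_sub(&t->page.refcount, 1); /* back off: drop the reference */
        return false;
    }
    return true;
}

/* Adapter so toy_collapse() can be passed as a race_hook. */
static void toy_collapse_hook(struct toy_table *t)
{
    (void)toy_collapse(t);
}
```

Calling toy_gup_fast(&t, NULL) on a populated table succeeds and leaves one
extra reference. Passing toy_collapse_hook as the hook makes the recheck
fail, so the reader drops its reference again and returns false.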