From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <20211217113049.23850-1-david@redhat.com>
 <20211217113049.23850-7-david@redhat.com>
 <9c3ba92e-9e36-75a9-9572-a08694048c1d@redhat.com>
 <02cf4dcf-74e8-9cbd-ffbf-8888f18a9e8a@redhat.com>
 <0aa27d7d-0db6-94ee-ca16-91d19997286b@redhat.com>
 <0de1a3cb-8286-15bd-aec1-2b284bf8918a@redhat.com>
 <719D2770-97EF-4CF5-81E6-056B0B55A996@vmware.com>
In-Reply-To: <719D2770-97EF-4CF5-81E6-056B0B55A996@vmware.com>
From: Linus Torvalds
Date: Fri, 17 Dec 2021 20:02:54 -0800
Subject: Re: [PATCH v1 06/11] mm: support GUP-triggered unsharing via
 FAULT_FLAG_UNSHARE (!hugetlb)
To: Nadav Amit
Cc: David Hildenbrand, Linux Kernel Mailing List, Andrew Morton,
 Hugh Dickins, David Rientjes, Shakeel Butt, John Hubbard,
 Jason Gunthorpe, Mike Kravetz, Mike Rapoport, Yang Shi,
 "Kirill A . Shutemov", Matthew Wilcox, Vlastimil Babka, Jann Horn,
 Michal Hocko, Rik van Riel, Roman Gushchin, Andrea Arcangeli, Peter Xu,
 Donald Dutile, Christoph Hellwig, Oleg Nesterov, Jan Kara, Linux-MM,
 "open list:KERNEL SELFTEST FRAMEWORK", "open list:DOCUMENTATION"
Content-Type: text/plain; charset="UTF-8"

On Fri, Dec 17, 2021 at 3:53 PM Nadav Amit wrote:
>
> I understand the discussion mainly revolves correctness, which is
> obviously the most important property, but I would like to mention
> that having transient get_page() calls causing unnecessary COWs can
> cause hard-to-analyze and hard-to-avoid performance degradation.

Note that the COW itself is pretty cheap. Yes, there's the page
allocation and copy, but it's mostly a local thing.

So that falls under the "good to avoid" heading, but in the end it's
not an immense deal.

In contrast, the page lock has been an actual big user-visible latency
issue, to the point of correctness.

A couple of years ago, we literally had NMI watchdog timeouts due to
the page wait-queues growing basically boundlessly.
This was some customer internal benchmark code that I never saw, so it
wasn't *quite* clear exactly what was going on, but we ended up having
to split up the page wait list traversal using bookmark entries,
because it was such a huge latency issue.

That was mostly NUMA balancing faults, I think, but the point I'm
making is that avoiding the page lock can be a *much* bigger deal than
avoiding some local allocation and copying of a page of data. There
are real loads where the page-lock gets insanely bad, and I think it's
because we use it much too much.

See commit 2554db916586 ("sched/wait: Break up long wake list walk")
for some of that saga.

So I really think that having to serialize with the page lock in order
to do some "exact page use counting" is a false economy. Yes, maybe
you'd be able to avoid a COW or two, but at what locking cost?

             Linus