From: Linus Torvalds
Date: Fri, 17 Dec 2021 17:53:45 -0800
Subject: Re: [PATCH v1 06/11] mm: support GUP-triggered unsharing via FAULT_FLAG_UNSHARE (!hugetlb)
To: Nadav Amit
Cc: Jason Gunthorpe, David Hildenbrand, Linux Kernel Mailing List,
    Andrew Morton, Hugh Dickins, David Rientjes, Shakeel Butt,
    John Hubbard, Mike Kravetz, Mike Rapoport, Yang Shi,
    "Kirill A . Shutemov", Matthew Wilcox, Vlastimil Babka, Jann Horn,
    Michal Hocko, Rik van Riel, Roman Gushchin, Andrea Arcangeli,
    Peter Xu, Donald Dutile, Christoph Hellwig, Oleg Nesterov, Jan Kara,
    Linux-MM, "open list:KERNEL SELFTEST FRAMEWORK",
    "open list:DOCUMENTATION"
References: <20211217113049.23850-1-david@redhat.com>
    <20211217113049.23850-7-david@redhat.com>
    <54c492d7-ddcd-dcd0-7209-efb2847adf7c@redhat.com>
    <20211217204705.GF6385@nvidia.com>
    <2E28C79D-F79C-45BE-A16C-43678AD165E9@vmware.com>
In-Reply-To: <2E28C79D-F79C-45BE-A16C-43678AD165E9@vmware.com>

[ Going back in the thread to this one ]

On Fri, Dec 17, 2021 at 1:15 PM Nadav Amit wrote:
>
> I think that there is an assumption that once a page is COW-broken,
> it would never have another write-fault that might lead to COW
> breaking later.

Right. I do think there are problems in the current code, I just think
that the patches are a step back.

The problems with the current code are of two kinds:

 - I think the largepage code (either THP or explicit hugetlb) doesn't
   do as good a job of this whole COW handling as the regular pages do

 - some of the "you can make pages read-only again explicitly" kinds of loads.

But honestly, at least for the second case, if somebody does a GUP,
and then starts playing mprotect games on the same virtual memory area
that they did a GUP on, and are surprised when they get another COW
fault that breaks their own connection with a page they did a GUP on
earlier, that's their own fault.

So I think there's some of "If you broke it, you get to keep both
pieces". Literally, in this case.
You have your GUP page that you looked up, and you have your virtual
address page that you caused COW on with mprotect() by making it
read-only and then read-write again, then you have two different
pages, and at some point it really is just "Well, don't do that then".

But yes, there's also some of "some code probably didn't get fully
converted to the new world order". So if VFIO only uses FOLL_LONGTERM,
and didn't ask for the COW breaking, then yes, VFIO will see page
incoherencies. But that should be an issue of "VFIO should do the
right thing".

So part of it is a combination of "if you do crazy things, you'll get
crazy results". And some of it is some kernel pinning code that
doesn't do the right thing to actually make sure it gets a shared page
to be pinned.

And then there's THP and HUGETLB, that I do think needs fixing and
aren't about those two kinds of cases.

I think we never got around to just doing the same thing we did for
regular pages. I think the hugepage code simply doesn't follow that
"COW on GUP, mark to not COW later" pattern.

               Linus
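
For reference, the "mprotect games" sequence discussed above can be sketched from userspace roughly as follows. This is only an illustration under stated assumptions: the function name mprotect_cycle() is hypothetical, the GUP step is not reproduced (it would need a kernel path such as a driver pinning the page, so it appears only as a comment), and whether the final write reuses the page or copies it is kernel-internal and depends on the kernel version. The calling process always sees coherent data through its own mapping; the incoherency in question would only be visible to whoever holds the earlier GUP reference.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: runs the read-only / read-write cycle on one
 * private anonymous page.
 * Returns 0 if the mapping's contents are as expected afterwards,
 * 1 if they are not, and -1 if a system call failed.
 */
int mprotect_cycle(void)
{
    int ok;
    long page_size = sysconf(_SC_PAGESIZE);
    assert(page_size > 1);
    size_t len = (size_t)page_size;

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    /* The first write faults the page in. */
    memset(p, 'A', len);

    /*
     * Assumption: at this point some kernel user would have done a GUP
     * on this address range. That step is not shown here.
     */

    /* Make the mapping read-only, then read-write again. */
    if (mprotect(p, len, PROT_READ) != 0 ||
        mprotect(p, len, PROT_READ | PROT_WRITE) != 0) {
        munmap(p, len);
        return -1;
    }

    /*
     * This write may take a write-protect fault. If the kernel decides
     * to copy, the virtual address now maps a different page than the
     * one looked up by the earlier GUP.
     */
    p[0] = 'B';

    ok = (p[0] == 'B' && p[1] == 'A');
    munmap(p, len);
    return ok ? 0 : 1;
}
```

Calling mprotect_cycle() from a small main() and checking for a zero return is enough to exercise the sequence; observing an actual page split would additionally require a real GUP user holding a reference across the cycle.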