From: Yang Shi
Date: Thu, 23 Sep 2021 20:08:31 -0700
Subject: Re: [PATCH v2 1/2] mm, thp: check page mapping when truncating page cache
To: Andrew Morton
Cc: Rongwei Wang, Matthew Wilcox, Linux MM, Linux Kernel Mailing List, song@kernel.org, william.kucharski@oracle.com, Hugh Dickins
References: <20210906121200.57905-1-rongwei.wang@linux.alibaba.com> <20210922070645.47345-2-rongwei.wang@linux.alibaba.com> <20210923194343.ca0f29e1c4d361170343a6f2@linux-foundation.org>
In-Reply-To: <20210923194343.ca0f29e1c4d361170343a6f2@linux-foundation.org>

On Thu, Sep 23, 2021 at 7:43 PM Andrew Morton wrote:
>
> On Thu, 23 Sep 2021 01:04:54 +0800 Rongwei Wang wrote:
>
> >
> >
> > > On Sep 22, 2021, at 7:37 PM, Matthew Wilcox wrote:
> > >
> > > On Wed, Sep 22, 2021 at 03:06:44PM +0800, Rongwei Wang wrote:
> > >> Transparent huge page has supported read-only non-shmem files. The file-
> > >> backed THP is collapsed by khugepaged and truncated when written (for
> > >> shared libraries).
> > >>
> > >> However, there is race in two possible places.
> > >>
> > >> 1) multiple writers truncate the same page cache concurrently;
> > >> 2) collapse_file rolls back when writer truncates the page cache;
> > >
> > > As I've said before, the bug here is that somehow there is a writable fd
> > > to a file with THPs. That's what we need to track down and fix.
> > Hi, Matthew
> > I am not sure get your means. We know “mm, thp: relax the VM_DENYWRITE constraint on file-backed THPs"
> > Introduced file-backed THPs for DSO. It is possible {very rarely} for DSO to be opened in writeable way.
> >
> > ...
> >
> > > https://lore.kernel.org/linux-mm/YUdL3lFLFHzC80Wt@casper.infradead.org/
> > All in all, what you mean is that we should solve this race at the source?
>
> Matthew is being pretty clear here: we shouldn't be permitting
> userspace to get a writeable fd for a thp-backed file.

No, that is not what he means, IIRC. We actually had the same
conversation on another patch, quoted below:

"
> > Things have already gone wrong before we get to this point. See
> > do_dentry_open(). You aren't supposed to be able to get a writable file
> > descriptor on a file which has had huge pages added to the page cache
> > without the filesystem's knowledge. That's the problem that needs to
> > be fixed.
>
> I don't quite understand your point here. Do you mean do_dentry_open()
> should fail for such cases instead of truncating the page cache?

No, do_dentry_open() should have truncated the page cache when it was
called and found that there were THPs in the cache. Then khugepaged
should see that someone has the file open for write and decline to create
new THPs. So it shouldn't be possible to get here with THPs in the cache."

Please see https://lore.kernel.org/linux-mm/YUkCI2I085Sos%2F64@casper.infradead.org/

But "mm, thp: relax the VM_DENYWRITE constraint on file-backed THPs"
actually did exactly that.

>
> Why are we permitting the DSO to be opened writeably? If there's a
> legitimate case for doing this then presumably "mm, thp: relax the
> VM_DENYWRITE constraint on file-backed THPs: should be fixed or
> reverted.

Unfortunately, we can't revert that commit anymore, since VM_DENYWRITE is
gone as of commit 8d0920bde5eb ("mm: remove VM_DENYWRITE").

>
> If there is no legitimate use case for returning a writeable fd for a
> thp-backed file then we should fail such an attempt at open(). This
> approach has back-compatibility issues which need to be thought about.
> Perhaps we should permit the open-writeably attempt to appear to
> succeed, but to really return a read-only fd?
>
>
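
To make the intended behaviour quoted above concrete, here is a toy userspace model of it: opening for write drops any THPs from the cache, and the collapse path declines while a writer exists. This is only a sketch, not kernel code. The struct and the toy_* helpers are hypothetical names invented for illustration, and the model is single-threaded, so it does not capture the ordering or locking that the races in this thread are actually about.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the per-file state discussed in the thread. */
struct toy_file {
	int writers;	/* number of open writable descriptors */
	int nr_thps;	/* huge pages currently in the page cache */
};

/* Models khugepaged: decline to create a THP while a writer exists. */
static bool toy_collapse(struct toy_file *f)
{
	if (f->writers > 0)
		return false;
	f->nr_thps++;
	return true;
}

/* Models open-for-write: register the writer first, then drop THPs. */
static void toy_open_for_write(struct toy_file *f)
{
	f->writers++;
	if (f->nr_thps > 0)
		f->nr_thps = 0;	/* stands in for truncating the page cache */
	assert(f->nr_thps == 0);
}

/* Models closing a writable descriptor. */
static void toy_close_writer(struct toy_file *f)
{
	assert(f->writers > 0);
	f->writers--;
}
```

In this model a writer never coexists with THPs in the cache, which is the invariant Matthew describes. The open question in the thread is how that invariant gets violated when the two paths run concurrently.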