From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-10.1 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH,MAILING_LIST_MULTI, NICE_REPLY_A,SPF_HELO_NONE,SPF_PASS,USER_AGENT_SANE_1 autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A8367C4320E for ; Wed, 11 Aug 2021 14:45:21 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 4DEC760F21 for ; Wed, 11 Aug 2021 14:45:21 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 4DEC760F21 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=kvack.org Received: by kanga.kvack.org (Postfix) id B82E98D0001; Wed, 11 Aug 2021 10:45:20 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id B33986B0072; Wed, 11 Aug 2021 10:45:20 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 9FB188D0001; Wed, 11 Aug 2021 10:45:20 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0133.hostedemail.com [216.40.44.133]) by kanga.kvack.org (Postfix) with ESMTP id 85DF86B0071 for ; Wed, 11 Aug 2021 10:45:20 -0400 (EDT) Received: from smtpin18.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay05.hostedemail.com (Postfix) with ESMTP id 23FEC18017815 for ; Wed, 11 Aug 2021 14:45:20 +0000 (UTC) X-FDA: 78463072800.18.5B08667 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf02.hostedemail.com (Postfix) with ESMTP id AA5527003BDE for ; Wed, 11 Aug 2021 14:45:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1628693119; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=eBcHxr3b5mworuPq4PqU+l4xARfNUKjm75tM9rIDjPo=; b=UlgKGqKA0XquDA7M45NOqhQMsgB9qaigBwu6Anbb8NYMBwvSK0AcVA4z3WpvfKqnndGvV/ y7fEbdUbVTnku6JFk01Eb13tvpqQCpfGrIEIQz+wzgc/EOZn1YqUahv3mus54L7yU2ifdU wfZImY07Ec17H1YfCH77YhhwSemmlGM= Received: from mail-wm1-f69.google.com (mail-wm1-f69.google.com [209.85.128.69]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-510-K_cEuCH6MQi32-RlEzeUGg-1; Wed, 11 Aug 2021 10:45:17 -0400 X-MC-Unique: K_cEuCH6MQi32-RlEzeUGg-1 Received: by mail-wm1-f69.google.com with SMTP id b196-20020a1c80cd0000b02902e677003785so2176855wmd.7 for ; Wed, 11 Aug 2021 07:45:17 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:cc:references:from:organization :message-id:date:user-agent:mime-version:in-reply-to :content-language:content-transfer-encoding; bh=eBcHxr3b5mworuPq4PqU+l4xARfNUKjm75tM9rIDjPo=; b=UaW/Zil18KAhTjlmlbFIwwMCSKE6YSlSsDAPcspnPyBZ5P7xc4ByqekES2QbMrJFFT TqkHDNdmnvnm5xDetytebwFazDmmtyFQRCx/kvtH0FwV83ypSXj33xzY21wdh6ppVZVD GzjcLxPgOsXlFTor4SQ8xSJ69GuRnQVvC5O8JMSDqwO7x/rWGHLuBoloWxvJ4cNkxUwi EZtTJD4TGnMDE0BXXuiakJYRm5lcnbBmdtBtOp5hKqWK52rFc5oL/DATJzDmlZ94seX2 +auFOavjHV4XckbGm1+PExSfhoBx01NX0g+r60JXlm3oGMBtKE/MCfUhK4cPl11djvph rS2g== X-Gm-Message-State: AOAM530HK+s5Q3eF4HG8ZBG6NMTW1wFOlnOhytM1EAssZ0SLy6hRNlcX Gdw/C+YAJnc+7QUOp0sBRQ532tWWaEnCptBTnq7mnlJgLlQPbvQMElRAgcrRdrDjlXwlqAFN1Ip 65i9tEhANcaU= X-Received: by 2002:a1c:f019:: with SMTP id a25mr10501115wmb.96.1628693116112; Wed, 11 Aug 2021 07:45:16 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyH29GuiTfEW6pGrDbGgwByHckYOSWZlO1OckR66NtZ86Hv7seISYduYOnbWzd/HniZiX+U4w== X-Received: by 2002:a1c:f019:: with SMTP id a25mr10501089wmb.96.1628693115840; Wed, 11 Aug 2021 07:45:15 -0700 (PDT) Received: from [192.168.3.132] (p5b0c64a0.dip0.t-ipconnect.de. [91.12.100.160]) by smtp.gmail.com with ESMTPSA id g6sm9982769wrm.73.2021.08.11.07.45.14 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Wed, 11 Aug 2021 07:45:14 -0700 (PDT) Subject: Re: mmap denywrite mess (Was: [GIT PULL] overlayfs fixes for 5.14-rc6) To: Miklos Szeredi , Linus Torvalds Cc: =?UTF-8?Q?Christian_K=c3=b6nig?= , Matthew Wilcox , linux-mm , linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-unionfs@vger.kernel.org References: From: David Hildenbrand Organization: Red Hat Message-ID: Date: Wed, 11 Aug 2021 16:45:14 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.11.0 MIME-Version: 1.0 In-Reply-To: X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Authentication-Results: imf02.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=UlgKGqKA; spf=none (imf02.hostedemail.com: domain of david@redhat.com has no SPF policy when checking 170.10.133.124) smtp.mailfrom=david@redhat.com; dmarc=pass (policy=none) header.from=redhat.com X-Rspamd-Server: rspam06 X-Rspamd-Queue-Id: AA5527003BDE X-Stat-Signature: ejft6gme8qkkihtqcm3nfrkc75p61iao X-HE-Tag: 1628693119-477960 Content-Transfer-Encoding: quoted-printable X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On 11.08.21 16:11, Miklos Szeredi wrote: > On Mon, Aug 09, 2021 at 02:25:17PM -0700, Linus Torvalds wrote: >=20 >> Ugh. Th edances with denywrite and mapping_unmap_writable are really >> really annoying. >=20 > Attached version has error and success paths separated. Was that your > complaint? >=20 >> I get the feeling that the whole thing with deny_write_access and >> mapping_map_writable could possibly be done after-the-fact somehow as >> part of actually inserting the vma in the vma tree, rather than done >> as the vma is prepared. >=20 > I don't know if that's doable or not. The final denywrite count is obt= ained in > __vma_link_file(), called after __vma_link(). The questions are: >=20 > - does the order of those helper calls matter? >=20 > - if it does, could the __vma_link() be safely undone after an unsucc= essful > __vmal_link_file()? >=20 >> And most users of vma_set_file() probably really don't want that whole >> thing at all (ie the DRM stuff that just switches out a local thing. >> They also don't check for the new error cases you've added. >=20 > Christian K=C3=B6nig wants to follow up with those checks (which should= be asserts, > if the code wasn't buggy in the first place). >=20 >> So I really think this is quite questionable, and those cases should >> probably have been done entirely inside ovlfs rather than polluting >> the cases that don't care and don't check. >=20 > I don't get that. mmap_region() currently drops the deny counts from t= he > original file. That doesn't work for overlayfs since it needs to take = new temp > counts on the override file. >=20 > So mmap_region() is changed to drop the counts on vma->vm_file, but the= n all > callers of vma_set_file() will need to do that switch of temp counts, t= here's no > way around that. >=20 > Thanks, > Miklos >=20 > For reference, here's the previous discussion: >=20 > https://lore.kernel.org/linux-mm/YNHXzBgzRrZu1MrD@miu.piliscsaba.redhat= .com/ >=20 > --- > fs/overlayfs/file.c | 4 +++- > include/linux/mm.h | 2 +- > mm/mmap.c | 2 +- > mm/util.c | 31 ++++++++++++++++++++++++++++++- > 4 files changed, 35 insertions(+), 4 deletions(-) >=20 > --- a/fs/overlayfs/file.c > +++ b/fs/overlayfs/file.c > @@ -475,7 +475,9 @@ static int ovl_mmap(struct file *file, s > if (WARN_ON(file !=3D vma->vm_file)) > return -EIO; > =20 > - vma_set_file(vma, realfile); > + ret =3D vma_set_file(vma, realfile); > + if (ret) > + return ret; > =20 > old_cred =3D ovl_override_creds(file_inode(file)->i_sb); > ret =3D call_mmap(vma->vm_file, vma); > --- a/include/linux/mm.h > +++ b/include/linux/mm.h > @@ -2780,7 +2780,7 @@ static inline void vma_set_page_prot(str > } > #endif > =20 > -void vma_set_file(struct vm_area_struct *vma, struct file *file); > +int /* __must_check */ vma_set_file(struct vm_area_struct *vma, struct= file *file); > =20 > #ifdef CONFIG_NUMA_BALANCING > unsigned long change_prot_numa(struct vm_area_struct *vma, > --- a/mm/mmap.c > +++ b/mm/mmap.c > @@ -1806,6 +1806,7 @@ unsigned long mmap_region(struct file *f > */ > vma->vm_file =3D get_file(file); > error =3D call_mmap(file, vma); > + file =3D vma->vm_file; > if (error) > goto unmap_and_free_vma; > =20 > @@ -1867,7 +1868,6 @@ unsigned long mmap_region(struct file *f > if (vm_flags & VM_DENYWRITE) > allow_write_access(file); > } > - file =3D vma->vm_file; > out: > perf_event_mmap(vma); > =20 > --- a/mm/util.c > +++ b/mm/util.c > @@ -314,12 +314,41 @@ int vma_is_stack_for_current(struct vm_a > /* > * Change backing file, only valid to use during initial VMA setup. > */ > -void vma_set_file(struct vm_area_struct *vma, struct file *file) > +int vma_set_file(struct vm_area_struct *vma, struct file *file) > { > + vm_flags_t vm_flags =3D vma->vm_flags; > + int err; > + > + /* Get temporary denial counts on replacement */ > + if (vm_flags & VM_DENYWRITE) { > + err =3D deny_write_access(file); > + if (err) > + return err; > + } > + if (vm_flags & VM_SHARED) { > + err =3D mapping_map_writable(file->f_mapping); > + if (err) > + goto undo_denywrite; > + } > + > /* Changing an anonymous vma with this is illegal */ > get_file(file); > swap(vma->vm_file, file); > + > + /* Undo temporary denial counts on replaced */ > + if (vm_flags & VM_SHARED) > + mapping_unmap_writable(file->f_mapping); > + > + if (vm_flags & VM_DENYWRITE) > + allow_write_access(file); > + > fput(file); > + return 0; > + > +undo_denywrite: > + if (vm_flags & VM_DENYWRITE) > + allow_write_access(file); > + return err; > } > EXPORT_SYMBOL(vma_set_file); > =20 >=20 I proposed a while ago to get rid of VM_DENYWRITE completely: https://lkml.kernel.org/r/20210423131640.20080-1-david@redhat.com I haven't looked how much it still applies to current upstream, but=20 maybe that might help cleaning up that code. --=20 Thanks, David / dhildenb