linux-mm.kvack.org archive mirror
From: "zhangpeng (AS)" <zhangpeng362@huawei.com>
To: Matthew Wilcox <willy@infradead.org>
Cc: <linux-mm@kvack.org>, <linux-fsdevel@vger.kernel.org>,
	<netdev@vger.kernel.org>, <akpm@linux-foundation.org>,
	<edumazet@google.com>, <davem@davemloft.net>,
	<dsahern@kernel.org>, <kuba@kernel.org>, <pabeni@redhat.com>,
	<arjunroy@google.com>, <wangkefeng.wang@huawei.com>
Subject: Re: [RFC PATCH] filemap: add mapping_mapped check in filemap_unaccount_folio()
Date: Sat, 20 Jan 2024 14:46:49 +0800
Message-ID: <5106a58e-04da-372a-b836-9d3d0bd2507b@huawei.com>
In-Reply-To: <Zap7t9GOLTM1yqjT@casper.infradead.org>

On 2024/1/19 21:40, Matthew Wilcox wrote:

> On Fri, Jan 19, 2024 at 05:20:24PM +0800, Peng Zhang wrote:
>> Recently, we discovered a syzkaller issue that triggers
>> VM_BUG_ON_FOLIO in filemap_unaccount_folio() with CONFIG_DEBUG_VM
>> enabled, or bad page without CONFIG_DEBUG_VM.
>>
>> The specific scenario is as follows:
>> (1) mmap: Use a socket fd to create a TCP VMA.
>> (2) open(O_CREAT) + fallocate + sendfile: Read the ext4 file and create
>> the page cache. The mapping of the page cache is the ext4
>> inode->i_mapping. Send the ext4 page cache to the socket fd through
>> sendfile.
>> (3) getsockopt TCP_ZEROCOPY_RECEIVE: Receive the ext4 page cache and use
>> vm_insert_pages() to insert the ext4 page cache into the TCP VMA. In
>> this case, mapcount changes from -1 to 0. The page cache mapping is the
>> ext4 inode->i_mapping, but the VMA of the page cache is the TCP VMA and
>> folio->mapping->i_mmap is empty.
> I think this is the bug.  We shouldn't be incrementing the mapcount
> in this scenario.  Assuming we want to support doing this at all and
> we don't want to include something like ...
>
> 	if (folio->mapping) {
> 		if (folio->mapping != vma->vm_file->f_mapping)
> 			return -EINVAL;
> 		if (page_to_pgoff(page) != linear_page_index(vma, address))
> 			return -EINVAL;
> 	}
>
> But maybe there's a reason for networking needing to map pages in this
> scenario?

Agreed, and I'm also curious why.
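
For reference, here is a minimal userspace sketch of the check quoted
above. It is only a model: the model_* structures and helpers are
illustrative stand-ins, not the kernel definitions, and the kernel's
page_to_pgoff() is approximated by a plain index field.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified stand-ins for kernel structures (assumptions, not real layouts). */
struct model_address_space { int unused; };
struct model_file { struct model_address_space *f_mapping; };
struct model_vma {
	struct model_file *vm_file;
	unsigned long vm_start;	/* first user address covered by the VMA */
	unsigned long vm_pgoff;	/* file offset of vm_start, in pages */
};
struct model_folio {
	struct model_address_space *mapping;	/* NULL: not in any page cache */
	unsigned long index;			/* offset within mapping, in pages */
};

#define MODEL_PAGE_SHIFT 12

/* Page offset in the file that 'address' corresponds to within the VMA. */
static unsigned long model_linear_page_index(const struct model_vma *vma,
					     unsigned long address)
{
	return ((address - vma->vm_start) >> MODEL_PAGE_SHIFT) + vma->vm_pgoff;
}

/* Reject page cache folios that belong to another file or another offset. */
static int model_validate_insert(const struct model_folio *folio,
				 const struct model_vma *vma,
				 unsigned long address)
{
	if (folio->mapping) {
		if (folio->mapping != vma->vm_file->f_mapping)
			return -EINVAL;
		if (folio->index != model_linear_page_index(vma, address))
			return -EINVAL;
	}
	return 0;
}

/* Usage: an ext4 page cache folio offered to a VMA backed by a socket. */
static void model_check_demo(void)
{
	struct model_address_space ext4_mapping = { 0 }, sock_mapping = { 0 };
	struct model_file sock_file = { &sock_mapping };
	struct model_vma tcp_vma = { &sock_file, 0x7f0000ff9000UL, 0 };
	struct model_folio ext4_folio = { &ext4_mapping, 0 };

	assert(model_validate_insert(&ext4_folio, &tcp_vma,
				     0x7f0000ffb000UL) == -EINVAL);
}
```

In this model the insertion in step (3) is refused because the folio's
mapping is not the mapping of the file backing the VMA. Folios with no
mapping pass through unchanged.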

>> (4) open(O_TRUNC): Deletes the ext4 page cache. In this case, the pages
>> are still in the mapping->i_pages xarray and should also be deleted.
>> However, folio->mapping->i_mmap is empty, so
>> truncate_cleanup_folio()->unmap_mapping_folio() finds no VMA to unmap
>> the folio from. In filemap_unaccount_folio(), the mapcount of the folio
>> is still 0 (i.e. the folio is still mapped), which triggers the
>> VM_BUG_ON_FOLIO.
>>
>> Syz log that can be used to reproduce the issue:
>> r3 = socket$inet_tcp(0x2, 0x1, 0x0)
>> mmap(&(0x7f0000ff9000/0x4000)=nil, 0x4000, 0x0, 0x12, r3, 0x0)
>> r4 = socket$inet_tcp(0x2, 0x1, 0x0)
>> bind$inet(r4, &(0x7f0000000000)={0x2, 0x4e24, @multicast1}, 0x10)
>> connect$inet(r4, &(0x7f00000006c0)={0x2, 0x4e24, @empty}, 0x10)
>> r5 = openat$dir(0xffffffffffffff9c, &(0x7f00000000c0)='./file0\x00',
>> 0x181e42, 0x0)
>> fallocate(r5, 0x0, 0x0, 0x85b8)
>> sendfile(r4, r5, 0x0, 0x8ba0)
>> getsockopt$inet_tcp_TCP_ZEROCOPY_RECEIVE(r4, 0x6, 0x23,
>> &(0x7f00000001c0)={&(0x7f0000ffb000/0x3000)=nil, 0x3000, 0x0, 0x0, 0x0,
>> 0x0, 0x0, 0x0, 0x0}, &(0x7f0000000440)=0x40)
>> r6 = openat$dir(0xffffffffffffff9c, &(0x7f00000000c0)='./file0\x00',
>> 0x181e42, 0x0)
>>
>> In the current TCP zerocopy scenario, the folio is released normally
>> when the process exits. However, if the page cache is truncated before
>> the process exits, the BUG_ON or "Bad page" report occurs, which is not
>> the expected behavior.
>> To fix this issue, add a mapping_mapped() check to
>> filemap_unaccount_folio(). To limit the performance impact, no lock is
>> taken when mapping_mapped() is checked.
> NAK this patch, you're just preventing the assertion from firing.
> I think there's a deeper problem here.
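
To make the state at step (4) concrete, here is a toy userspace model of
the mapcount transitions described above. It is a sketch under stated
assumptions: the model_* names are hypothetical, the i_mmap tree is
reduced to a counter, and only the -1 bias of mapcount is taken from the
description in the original report.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: one folio and its owning mapping. Not kernel code. */
struct model_mapping {
	int i_mmap_vmas;	/* VMAs in i_mmap that currently map the folio */
};

struct model_folio {
	struct model_mapping *mapping;
	int mapcount;		/* stored biased: -1 means "not mapped" */
};

static bool model_folio_mapped(const struct model_folio *folio)
{
	return folio->mapcount >= 0;
}

/* Step (3): insertion into a VMA that is not in folio->mapping->i_mmap. */
static void model_insert_into_foreign_vma(struct model_folio *folio)
{
	folio->mapcount++;
}

/* Step (4), unmap part: only VMAs reachable through i_mmap are unmapped. */
static void model_unmap_mapping_folio(struct model_folio *folio)
{
	folio->mapcount -= folio->mapping->i_mmap_vmas;
	folio->mapping->i_mmap_vmas = 0;
}

/* Condition the unaccount step expects: the folio is no longer mapped. */
static bool model_unaccount_ok(const struct model_folio *folio)
{
	return !model_folio_mapped(folio);
}

/* Walk through steps (3) and (4) for the ext4 folio from the report. */
static void model_truncate_demo(void)
{
	struct model_mapping ext4 = { .i_mmap_vmas = 0 };
	struct model_folio folio = { .mapping = &ext4, .mapcount = -1 };

	model_insert_into_foreign_vma(&folio);	/* mapcount: -1 -> 0 */
	model_unmap_mapping_folio(&folio);	/* i_mmap empty: no change */

	assert(folio.mapcount == 0);
	assert(!model_unaccount_ok(&folio));	/* the assertion would fire */
}
```

In this model, skipping the assertion does not change the state: the
folio leaves the page cache with mapcount 0 and nothing in i_mmap that
can unmap it.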

-- 
Best Regards,
Peng



Thread overview: 13+ messages
2024-01-19  9:20 Peng Zhang
2024-01-19 13:40 ` Matthew Wilcox
2024-01-20  6:46   ` zhangpeng (AS) [this message]
2024-01-22 16:04     ` SECURITY PROBLEM: Any user can crash the kernel with TCP ZEROCOPY Matthew Wilcox
2024-01-22 16:30       ` Eric Dumazet
2024-01-22 17:12         ` Matthew Wilcox
2024-01-22 17:39           ` Eric Dumazet
2024-01-24  9:30             ` zhangpeng (AS)
2024-01-24 10:11               ` Eric Dumazet
2024-01-25  2:18                 ` zhangpeng (AS)
2024-01-25  8:57                   ` Eric Dumazet
2024-01-25  9:22                     ` zhangpeng (AS)
2024-01-25 10:31                       ` Eric Dumazet
