From: Jingbo Xu
Date: Fri, 1 Nov 2024 19:44:13 +0800
Subject: Re: [PATCH v2 2/2] fuse: remove tmp folio for writebacks and internal rb tree
To: Joanne Koong, Shakeel Butt
Cc: Bernd Schubert, Miklos Szeredi, linux-fsdevel@vger.kernel.org, josef@toxicpanda.com, hannes@cmpxchg.org, linux-mm@kvack.org, kernel-team@meta.com, Vlastimil Babka
Message-ID: <43aeed1a-0572-4bcc-8c06-49522459f7d2@linux.alibaba.com>
References: <023c4bab-0eb6-45c5-9a42-d8fda0abec02@fastmail.fm> <4hwdxhdxgjyxgxutzggny4isnb45jxtump7j7tzzv6paaqg2lr@55sguz7y4hu7>

Hi Joanne,

Thanks for keeping pushing this forward.

On 11/1/24 5:52 AM, Joanne Koong wrote:
> On Thu, Oct 31, 2024 at 1:06 PM Shakeel Butt wrote:
>>
>> On Thu, Oct 31, 2024 at 12:06:49PM GMT, Joanne Koong wrote:
>>> On Wed, Oct 30, 2024 at 5:30 PM Shakeel Butt wrote:
>> [...]
>>>>
>>>> Memory pool is a bit confusing term here. Most probably you are asking
>>>> about the migrate type of the page block from which tmp page is
>>>> allocated from. In a normal system, tmp page would be allocated from page
>>>> block with MIGRATE_UNMOVABLE migrate type while the page cache page, it
>>>> depends on what gfp flag was used for its allocation. What does fuse fs
>>>> use? GFP_HIGHUSER_MOVABLE or something else? Under low memory situation
>>>> allocations can get mixed up with different migrate types.
>>>>
>>>
>>> I believe it's GFP_HIGHUSER_MOVABLE for the page cache pages since
>>> fuse doesn't set any additional gfp masks on the inode mapping.
>>>
>>> Could we just allocate the fuse writeback pages with GFP_HIGHUSER
>>> instead of GFP_HIGHUSER_MOVABLE? That would be in fuse_write_begin()
>>> where we pass in the gfp mask to __filemap_get_folio(). I think this
>>> would give us the same behavior memory-wise as what the tmp pages
>>> currently do,
>>
>> I don't think it would be the same behavior. From what I understand the
>> liftime of the tmp page is from the start of the writeback till the ack
>> from the fuse server that writeback is done. While the lifetime of the
>> page of the page cache can be arbitrarily large. We should just make it
>> unmovable for its lifetime. I think it is fine to make the page
>> unmovable during the writeback. We should not try to optimize for the
>> bad or buggy behavior of fuse server.
>>
>> Regarding the avoidance of wait on writeback for fuse folios, I think we
>> can handle the migration similar to how you are handling reclaim and in
>> addition we can add a WARN() in folio_wait_writeback() if the kernel ever
>> sees a fuse folio in that function.
>
> Awesome, this is what I'm planning to do in v3 to address migration then:
>
> 1) in migrate_folio_unmap(), only call "folio_wait_writeback(src);" if
> src->mapping does not have the AS_NO_WRITEBACK_WAIT bit set on it (eg
> fuse folios will have that AS_NO_WRITEBACK_WAIT bit set)

I think it's generally okay to skip FUSE pages under writeback when a
synchronous migrate_pages() is called in a low-memory context, since that
path only tries to migrate as many pages as possible (i.e. best effort).

More caution may be needed, though, when a synchronous migrate_pages() is
called with an implicit expectation that the migration cannot fail. For
example,

```
offline_pages
	while {
		scan_movable_pages
		do_migrate_range
	}
```

If a malicious server never completes the writeback IO, the while loop
above makes no progress, and I'm afraid it will then loop forever.

>
> 2) in the fuse filesystem's implementation of the
> mapping->a_ops->migrate_folio callback, return -EAGAIN if the folio is
> under writeback.

Is there any possibility that a_ops->migrate_folio() may be called with
the folio under writeback?

- for most pages without AS_NO_WRITEBACK_WAIT, a_ops->migrate_folio()
  will be called only when PG_writeback is cleared;
- for AS_NO_WRITEBACK_WAIT pages, they are skipped if they are under
  writeback

-- 
Thanks,
Jingbo
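
P.S. To make the concern about the offline_pages loop concrete, here is a
minimal userspace model in C. It is only a sketch, not kernel code: the
struct and the model_* functions are hypothetical names made up for
illustration, and the max_passes bound exists only so that the model
terminates. The point it tries to show is that a folio which is skipped
(rather than waited on) while under writeback is never migrated if the
writeback never completes, so a retry loop without a bound would spin.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a folio; NOT the kernel's struct folio. */
struct model_folio {
	bool under_writeback;   /* models the writeback flag being set */
	bool no_writeback_wait; /* models AS_NO_WRITEBACK_WAIT on the mapping */
	bool migrated;
};

/*
 * One migration pass over the range. A folio under writeback is skipped
 * when its mapping opts out of waiting; otherwise the model assumes the
 * wait succeeds and the writeback completes.
 * Returns the number of folios that are still not migrated.
 */
static size_t model_migrate_range(struct model_folio *folios, size_t n)
{
	size_t remaining = 0;

	for (size_t i = 0; i < n; i++) {
		struct model_folio *f = &folios[i];

		if (f->migrated)
			continue;
		if (f->under_writeback) {
			if (f->no_writeback_wait) {
				remaining++;	/* skipped, left for a later pass */
				continue;
			}
			/* Assumption: ordinary writeback finishes once waited on. */
			f->under_writeback = false;
		}
		f->migrated = true;
	}
	return remaining;
}

/*
 * Model of the retry loop. Returns the number of passes needed to
 * migrate everything, or -1 if max_passes is reached first.
 */
static int model_offline_range(struct model_folio *folios, size_t n,
			       int max_passes)
{
	assert(max_passes > 0);

	for (int pass = 1; pass <= max_passes; pass++) {
		if (model_migrate_range(folios, n) == 0)
			return pass;
	}
	return -1;
}
```

With two ordinary folios (one of them under writeback) the model finishes
in a single pass. With a folio that has no_writeback_wait set and whose
writeback never completes, every pass skips it and the loop only stops
because of the artificial bound.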