From: Joanne Koong
Date: Wed, 6 Nov 2024 15:37:11 -0800
Subject: Re: [PATCH v2 2/2] fuse: remove tmp folio for writebacks and internal rb tree
To: Shakeel Butt
Cc: Bernd Schubert, Jingbo Xu, Miklos Szeredi, linux-fsdevel@vger.kernel.org, josef@toxicpanda.com, hannes@cmpxchg.org, linux-mm@kvack.org, kernel-team@meta.com, Vlastimil Babka
References: <4hwdxhdxgjyxgxutzggny4isnb45jxtump7j7tzzv6paaqg2lr@55sguz7y4hu7>

On Thu, Oct 31, 2024 at 3:38 PM Shakeel Butt wrote:
>
> On Thu, Oct 31, 2024 at 02:52:57PM GMT, Joanne Koong wrote:
> > On Thu, Oct 31, 2024 at 1:06 PM Shakeel Butt wrote:
> > >
> > > On Thu, Oct 31, 2024 at 12:06:49PM GMT, Joanne Koong wrote:
> > > > On Wed, Oct 30, 2024 at 5:30 PM Shakeel Butt wrote:
> > > [...]
> > > > >
> > > > > Memory pool is a bit confusing term here.
> > > > > Most probably you are asking
> > > > > about the migrate type of the page block from which tmp page is
> > > > > allocated from. In a normal system, tmp page would be allocated from page
> > > > > block with MIGRATE_UNMOVABLE migrate type while the page cache page, it
> > > > > depends on what gfp flag was used for its allocation. What does fuse fs
> > > > > use? GFP_HIGHUSER_MOVABLE or something else? Under low memory situation
> > > > > allocations can get mixed up with different migrate types.
> > > > >
> > > >
> > > > I believe it's GFP_HIGHUSER_MOVABLE for the page cache pages since
> > > > fuse doesn't set any additional gfp masks on the inode mapping.
> > > >
> > > > Could we just allocate the fuse writeback pages with GFP_HIGHUSER
> > > > instead of GFP_HIGHUSER_MOVABLE? That would be in fuse_write_begin()
> > > > where we pass in the gfp mask to __filemap_get_folio(). I think this
> > > > would give us the same behavior memory-wise as what the tmp pages
> > > > currently do,
> > >
> > > I don't think it would be the same behavior. From what I understand the
> > > lifetime of the tmp page is from the start of the writeback till the ack
> > > from the fuse server that writeback is done. While the lifetime of the
> > > page of the page cache can be arbitrarily large. We should just make it
> > > unmovable for its lifetime. I think it is fine to make the page
> > > unmovable during the writeback. We should not try to optimize for the
> > > bad or buggy behavior of fuse server.
> > >
> > > Regarding the avoidance of wait on writeback for fuse folios, I think we
> > > can handle the migration similar to how you are handling reclaim and in
> > > addition we can add a WARN() in folio_wait_writeback() if the kernel ever
> > > sees a fuse folio in that function.
> >
> > Awesome, this is what I'm planning to do in v3 to address migration then:
> >
> > 1) in migrate_folio_unmap(), only call "folio_wait_writeback(src);" if
> > src->mapping does not have the AS_NO_WRITEBACK_WAIT bit set on it (eg
> > fuse folios will have that AS_NO_WRITEBACK_WAIT bit set)
> >
> > 2) in the fuse filesystem's implementation of the
> > mapping->a_ops->migrate_folio callback, return -EAGAIN if the folio is
> > under writeback.
>
> 3) Add WARN_ONCE() in folio_wait_writeback() if folio->mapping has
> AS_NO_WRITEBACK_WAIT set and return without waiting.

For v3, I'm going to change AS_NO_WRITEBACK_RECLAIM to
AS_WRITEBACK_MAY_BLOCK and skip 3) because 3) may be too restrictive.
For example, for the sync_file_range() syscall, we do want to wait on
writeback - it's ok in this case to call folio_wait_writeback() on a
fuse folio since the caller would have intentionally passed in a fuse
fd to sync_file_range().

Thanks,
Joanne

>
> >
> > Does this sound good?
>
> Yes.
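
For readers following the thread, steps 1) and 2) above can be modelled in
plain userspace C. This is only a sketch under stated assumptions, not kernel
code and not the patch itself: every model_* type and function and the
MODEL_AS_NO_WRITEBACK_WAIT constant are hypothetical stand-ins for the folio,
the address_space flag, folio_wait_writeback() and the migrate_folio callback
discussed in the thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these are the kernel's own types. */
enum { MODEL_AS_NO_WRITEBACK_WAIT = 1 << 0 };

struct model_mapping {
	unsigned int flags;	/* stand-in for the address_space flag bits */
};

struct model_folio {
	struct model_mapping *mapping;
	bool under_writeback;	/* stand-in for the folio writeback state */
	int waits;		/* how many times we "waited" on writeback */
};

/* Stand-in for folio_wait_writeback(): record the wait, writeback ends. */
void model_wait_writeback(struct model_folio *folio)
{
	folio->waits++;
	folio->under_writeback = false;
}

/* Step 1: the unmap path waits only when the mapping does not opt out. */
void model_migrate_unmap(struct model_folio *src)
{
	if (!src->under_writeback)
		return;
	if (src->mapping->flags & MODEL_AS_NO_WRITEBACK_WAIT)
		return;	/* do not block; the filesystem callback decides */
	model_wait_writeback(src);
}

/* Step 2: the fuse-like callback refuses to migrate during writeback. */
int model_fuse_migrate_folio(const struct model_folio *src)
{
	return src->under_writeback ? -EAGAIN : 0;
}

/* Glue: run the unmap step, then ask the filesystem callback. */
int model_migrate(struct model_folio *src)
{
	assert(src && src->mapping);
	model_migrate_unmap(src);
	return model_fuse_migrate_folio(src);
}
```

In this model a folio whose mapping has the opt-out bit set is never waited
on; migration just reports -EAGAIN until writeback has finished, while a
folio on any other mapping waits as before and then migrates.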