Date: Fri, 26 Apr 2024 15:49:19 +0000
From: "Pankaj Raghav (Samsung)"
To: Matthew Wilcox
Cc: djwong@kernel.org, brauner@kernel.org, david@fromorbit.com,
	chandan.babu@oracle.com, akpm@linux-foundation.org,
	linux-fsdevel@vger.kernel.org, hare@suse.de,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-xfs@vger.kernel.org, mcgrof@kernel.org, gost.dev@samsung.com,
	p.raghav@samsung.com
Subject: Re: [PATCH v4 05/11] mm: do not split a folio if it has minimum folio order requirement
Message-ID: <20240426154919.hupoxurihhbfj67x@quentin>
References: <20240425113746.335530-1-kernel@pankajraghav.com>
	<20240425113746.335530-6-kernel@pankajraghav.com>

On Thu, Apr 25, 2024 at 09:10:16PM +0100, Matthew Wilcox wrote:
> On Thu, Apr 25, 2024 at 01:37:40PM +0200, Pankaj Raghav (Samsung) wrote:
> > From: Pankaj Raghav
> >
> > Splitting a larger folio with a base order is supported using
> > split_huge_page_to_list_to_order() API. However, using that API for LBS
> > is resulting in an NULL ptr dereference error in the writeback path [1].
> >
> > Refuse to split a folio if it has minimum folio order requirement until
> > we can start using split_huge_page_to_list_to_order() API. Splitting the
> > folio can be added as a later optimization.
> >
> > [1] https://gist.github.com/mcgrof/d12f586ec6ebe32b2472b5d634c397df
>
> Obviously this has to be tracked down and fixed before this patchset can
> be merged ... I think I have some ideas. Let me look a bit. How
> would I go about reproducing this?

I am able to reproduce it in a VM with 4G RAM by running generic/447
(sometimes you have to run it twice) with a 16K BS on a 4K PS system.

I suspect this series:
https://lore.kernel.org/linux-fsdevel/20240215063649.2164017-1-hch@lst.de/
but I am still unsure why this is happening when we split with LBS
configurations.

If you have kdevops installed, then go with Luis's suggestion; otherwise,
this is my local config.

This is the diff I applied instead of this patch:

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 9859aa4f7553..63ee7b6ed03d 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3041,6 +3041,10 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
 {
 	struct folio *folio = page_folio(page);
 	struct deferred_split *ds_queue = get_deferred_split_queue(folio);
+	unsigned int mapping_min_order = mapping_min_folio_order(folio->mapping);
+
+	if (!folio_test_anon(folio))
+		new_order = max_t(unsigned int, mapping_min_order, new_order);
 	/* reset xarray order to new order after split */
 	XA_STATE_ORDER(xas, &folio->mapping->i_pages, folio->index, new_order);
 	struct anon_vma *anon_vma = NULL;
@@ -3117,6 +3121,8 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
 			goto out;
 		}
 
+		// XXX: Remove it later
+		VM_WARN_ON_FOLIO((new_order < mapping_min_order), folio);
 		gfp = current_gfp_context(mapping_gfp_mask(mapping) &
 							GFP_RECLAIM_MASK);
(END)

xfstests is based on https://github.com/kdave/xfstests/tree/v2024.04.14

xfstests config:

[default]
FSTYP=xfs
RESULT_BASE=/root/results/
DUMP_CORRUPT_FS=1
CANON_DEVS=yes
RECREATE_TEST_DEV=true
TEST_DEV=/dev/nvme0n1
TEST_DIR=/media/test
SCRATCH_DEV=/dev/vdb
SCRATCH_MNT=/media/scratch
LOGWRITES_DEV=/dev/vdc

[16k_4ks]
MKFS_OPTIONS='-f -m reflink=1,rmapbt=1, -i sparse=1, -b size=16k, -s size=4k'

[nix-shell:~]# lsblk
NAME    MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
vdb     254:16   0  32G  0 disk /media/scratch
vdc     254:32   0  32G  0 disk
nvme0n1 259:0    0  32G  0 disk /media/test

$ ./check -s 16k_4ks generic/447

BT:

[   74.170698] BUG: KASAN: null-ptr-deref in filemap_get_folios_tag+0x14b/0x510
[   74.170938] Write of size 4 at addr 0000000000000036 by task kworker/u16:6/284
[   74.170938]
[   74.170938] CPU: 0 PID: 284 Comm: kworker/u16:6 Not tainted 6.9.0-rc4-00011-g4676d00b6f6f #7
[   74.170938] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014
[   74.170938] Workqueue: writeback wb_workfn (flush-254:16)
[   74.170938] Call Trace:
[   74.170938]
[   74.170938]  dump_stack_lvl+0x51/0x70
[   74.170938]  kasan_report+0xab/0xe0
[   74.170938]  ? filemap_get_folios_tag+0x14b/0x510
[   74.170938]  kasan_check_range+0x35/0x1b0
[   74.170938]  filemap_get_folios_tag+0x14b/0x510
[   74.170938]  ? __pfx_filemap_get_folios_tag+0x10/0x10
[   74.170938]  ? srso_return_thunk+0x5/0x5f
[   74.170938]  writeback_iter+0x508/0xcc0
[   74.170938]  ? __pfx_iomap_do_writepage+0x10/0x10
[   74.170938]  write_cache_pages+0x80/0x100
[   74.170938]  ? __pfx_write_cache_pages+0x10/0x10
[   74.170938]  ? srso_return_thunk+0x5/0x5f
[   74.170938]  ? srso_return_thunk+0x5/0x5f
[   74.170938]  ? srso_return_thunk+0x5/0x5f
[   74.170938]  ? _raw_spin_lock+0x87/0xe0
[   74.170938]  iomap_writepages+0x85/0xe0
[   74.170938]  xfs_vm_writepages+0xe3/0x140 [xfs]
[   74.170938]  ? __pfx_xfs_vm_writepages+0x10/0x10 [xfs]
[   74.170938]  ? kasan_save_track+0x10/0x30
[   74.170938]  ? srso_return_thunk+0x5/0x5f
[   74.170938]  ? __kasan_kmalloc+0x7b/0x90
[   74.170938]  ? srso_return_thunk+0x5/0x5f
[   74.170938]  ? virtqueue_add_split+0x605/0x1b00
[   74.170938]  do_writepages+0x176/0x740
[   74.170938]  ? __pfx_do_writepages+0x10/0x10
[   74.170938]  ? __pfx_virtqueue_add_split+0x10/0x10
[   74.170938]  ? __pfx_update_sd_lb_stats.constprop.0+0x10/0x10
[   74.170938]  ? srso_return_thunk+0x5/0x5f
[   74.170938]  ? virtqueue_add_sgs+0xfe/0x130
[   74.170938]  ? srso_return_thunk+0x5/0x5f
[   74.170938]  ? virtblk_add_req+0x15c/0x280
[   74.170938]  __writeback_single_inode+0x9f/0x840
[   74.170938]  ? wbc_attach_and_unlock_inode+0x345/0x5d0
[   74.170938]  writeback_sb_inodes+0x491/0xce0
[   74.170938]  ? __pfx_wb_calc_thresh+0x10/0x10
[   74.170938]  ? __pfx_writeback_sb_inodes+0x10/0x10
[   74.170938]  ? __wb_calc_thresh+0x1a0/0x3c0
[   74.170938]  ? __pfx_down_read_trylock+0x10/0x10
[   74.170938]  ? wb_over_bg_thresh+0x16b/0x5e0
[   74.170938]  ? __pfx_move_expired_inodes+0x10/0x10
[   74.170938]  __writeback_inodes_wb+0xb7/0x200
[   74.170938]  wb_writeback+0x2c4/0x660
[   74.170938]  ? __pfx_wb_writeback+0x10/0x10
[   74.170938]  ? __pfx__raw_spin_lock_irq+0x10/0x10
[   74.170938]  wb_workfn+0x54e/0xaf0
[   74.170938]  ? srso_return_thunk+0x5/0x5f
[   74.170938]  ? __pfx_wb_workfn+0x10/0x10
[   74.170938]  ? __pfx___schedule+0x10/0x10
[   74.170938]  ? __pfx__raw_spin_lock_irq+0x10/0x10
[   74.170938]  process_one_work+0x622/0x1020
[   74.170938]  worker_thread+0x844/0x10e0
[   74.170938]  ? srso_return_thunk+0x5/0x5f
[   74.170938]  ? __kthread_parkme+0x82/0x150
[   74.170938]  ? __pfx_worker_thread+0x10/0x10
[   74.170938]  kthread+0x2b4/0x380
[   74.170938]  ? __pfx_kthread+0x10/0x10
[   74.170938]  ret_from_fork+0x30/0x70
[   74.170938]  ? __pfx_kthread+0x10/0x10
[   74.170938]  ret_from_fork_asm+0x1a/0x30
[   74.170938]
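
For anyone following along without a tree at hand, here is a rough userspace
sketch of what the clamp in the diff above is meant to do. It is not kernel
code. min_order_for_block() and clamp_split_order() are hypothetical names
made up for illustration, and the sketch assumes that the mapping's minimum
folio order is derived from the filesystem block size and the page size (both
powers of two), so a 16K BS on a 4K PS system would give a minimum order of 2.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical helper, not a kernel API: smallest folio order whose size
 * covers one filesystem block. Assumes both sizes are powers of two.
 */
static unsigned int min_order_for_block(unsigned int block_size,
					unsigned int page_size)
{
	unsigned int order = 0;

	assert(block_size != 0 && page_size != 0);
	while ((page_size << order) < block_size)
		order++;
	return order;
}

/*
 * Userspace stand-in for the clamp in the diff: a file-backed folio is not
 * split below the mapping's minimum order, while an anonymous folio keeps
 * the order that was requested.
 */
static unsigned int clamp_split_order(unsigned int new_order,
				      unsigned int mapping_min_order,
				      bool is_anon)
{
	if (!is_anon && new_order < mapping_min_order)
		return mapping_min_order;
	return new_order;
}
```

With the reproducer config, a request to split a file-backed folio down to
order 0 would be raised to order 2 under these assumptions, and a request for
order 3 would go through unchanged. The VM_WARN_ON_FOLIO() in the second hunk
checks the same condition after the clamp, so it should never fire for
file-backed folios.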