From: Qu Wenruo <wqu@suse.com>
Date: Tue, 26 Nov 2024 19:13:37 +1030
Subject: Re: [syzbot] [btrfs?] kernel BUG in __folio_start_writeback
To: Aleksandr Nogikh
Cc: Matthew Wilcox, syzbot, akpm@linux-foundation.org, clm@fb.com,
 dsterba@suse.com, josef@toxicpanda.com, linux-btrfs@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, syzkaller-bugs@googlegroups.com
References: <67432dee.050a0220.1cc393.0041.GAE@google.com>

On 2024/11/25 21:14, Aleksandr Nogikh wrote:
> On Mon, Nov 25, 2024 at 1:30 AM 'Qu Wenruo' via syzkaller-bugs
> wrote:
>>
>> On 2024/11/25 07:56, Matthew Wilcox wrote:
>>> On Sun, Nov 24, 2024 at 05:45:18AM -0800, syzbot wrote:
>>>>
>>>>  __fput+0x5ba/0xa50 fs/file_table.c:458
>>>>  task_work_run+0x24f/0x310 kernel/task_work.c:239
>>>>  resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
>>>>  exit_to_user_mode_loop kernel/entry/common.c:114 [inline]
>>>>  exit_to_user_mode_prepare include/linux/entry-common.h:329 [inline]
>>>>  __syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
>>>>  syscall_exit_to_user_mode+0x13f/0x340 kernel/entry/common.c:218
>>>>  do_syscall_64+0x100/0x230 arch/x86/entry/common.c:89
>>>>  entry_SYSCALL_64_after_hwframe+0x77/0x7f
>>>
>>> This is:
>>>
>>>         VM_BUG_ON_FOLIO(folio_test_writeback(folio), folio);
>>>
>>> ie we've called __folio_start_writeback() on a folio which is already
>>> under writeback.
>>>
>>> Higher up in the trace, we have the useful information:
>>>
>>>  page: refcount:6 mapcount:0 mapping:ffff888077139710 index:0x3 pfn:0x72ae5
>>>  memcg:ffff888140adc000
>>>  aops:btrfs_aops ino:105 dentry name(?):"file2"
>>>  flags: 0xfff000000040ab(locked|waiters|uptodate|lru|private|writeback|node=0|zone=1|lastcpupid=0x7ff)
>>>  raw: 00fff000000040ab ffffea0001c8f408 ffffea0000939708 ffff888077139710
>>>  raw: 0000000000000003 0000000000000001 00000006ffffffff ffff888140adc000
>>>  page dumped because: VM_BUG_ON_FOLIO(folio_test_writeback(folio))
>>>  page_owner tracks the page as allocated
>>>
>>> The interesting part of the page_owner stacktrace is:
>>>
>>>  filemap_alloc_folio_noprof+0xdf/0x500
>>>  __filemap_get_folio+0x446/0xbd0
>>>  prepare_one_folio+0xb6/0xa20
>>>  btrfs_buffered_write+0x6bd/0x1150
>>>  btrfs_direct_write+0x52d/0xa30
>>>  btrfs_do_write_iter+0x2a0/0x760
>>>  do_iter_readv_writev+0x600/0x880
>>>  vfs_writev+0x376/0xba0
>>>
>>> (ie not very interesting)
>>>
>>>> Workqueue: btrfs-delalloc btrfs_work_helper
>>>> RIP: 0010:__folio_start_writeback+0xc06/0x1050 mm/page-writeback.c:3119
>>>> Call Trace:
>>>>
>>>>  process_one_folio fs/btrfs/extent_io.c:187 [inline]
>>>>  __process_folios_contig+0x31c/0x540 fs/btrfs/extent_io.c:216
>>>>  submit_one_async_extent fs/btrfs/inode.c:1229 [inline]
>>>>  submit_compressed_extents+0xdb3/0x16e0 fs/btrfs/inode.c:1632
>>>>  run_ordered_work fs/btrfs/async-thread.c:245 [inline]
>>>>  btrfs_work_helper+0x56b/0xc50 fs/btrfs/async-thread.c:324
>>>>  process_one_work kernel/workqueue.c:3229 [inline]
>>>
>>> This looks like a race?
>>>
>>> process_one_folio() calls
>>> btrfs_folio_clamp_set_writeback calls
>>> btrfs_subpage_set_writeback:
>>>
>>>         spin_lock_irqsave(&subpage->lock, flags);
>>>         bitmap_set(subpage->bitmaps, start_bit, len >> fs_info->sectorsize_bits);
>>>         if (!folio_test_writeback(folio))
>>>                 folio_start_writeback(folio);
>>>         spin_unlock_irqrestore(&subpage->lock, flags);
>>>
>>> so somebody else set writeback after we tested for writeback here.
>>
>> The test VM is using X86_64, thus we won't go into the subpage routine,
>> but directly call folio_start_writeback().
>>
>>>
>>> One thing that comes to mind is that _usually_ we take folio_lock()
>>> first, then start writeback, then call folio_unlock() and btrfs isn't
>>> doing that here (afaict).  Maybe that's not the source of the bug?
>>
>> We still hold the folio locked, do submission then unlock.
>>
>> You can check extent_writepage(), where at the entrance we check if the
>> folio is still locked.
>> Then inside extent_writepage_io() we do the submission, setting the
>> folio writeback inside submit_one_sector().
>> Eventually unlock the folio at the end of extent_writepage(), that's for
>> the uncompressed writes.
>>
>> There are a lot of special handling for async submission (compression),
>> but it still holds the folio locked, do compression and submission, and
>> unlock, just all in another thread (this case).
>>
>> So it looks like something is wrong when transferring the ownership of
>> the page cache folios to the compression path, or some not properly
>> handled error path.
>>
>> Unfortunately I'm not really able to reproduce the case using the
>> reproducer...
>
> I've just tried to reproduce locally using the downloadable assets and
> the kernel crashed ~ after 1 minute of running the attached C repro.
>
> [ 87.616440][ T9044] ------------[ cut here ]------------
> [ 87.617126][ T9044] kernel BUG at mm/page-writeback.c:3119!
> [ 87.619308][ T9044] Oops: invalid opcode: 0000 [#1] PREEMPT SMP KASAN PTI
> [ 87.620174][ T9044] CPU: 1 UID: 0 PID: 9044 Comm: kworker/u10:6 Not tainted 6.12.0-syzkaller-08446-g228a1157fb9f #0
>
> Here are the instructions I followed:
> https://github.com/google/syzkaller/blob/master/docs/syzbot_assets.md#run-a-c-reproducer

Thanks for the confirmation.

I can reproduce it using the exact disk image (in around 1 minute), but
not inside my usual development VM (no crash after more than 5 minutes).

So it will be a lot trickier to debug now...

Thanks,
Qu
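
For anyone following the thread, the invariant behind the BUG can be sketched in user space. This is only a toy model, not kernel code: struct model_folio, the MODEL_* flags and the model_*() helpers are all hypothetical names made up for illustration. The one thing it tries to show is the ordering discussed above (lock, start writeback, unlock, end writeback) and that a second "start writeback" on a folio whose writeback flag is still set is the condition where the kernel hits VM_BUG_ON_FOLIO(folio_test_writeback(folio), folio).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
#define MODEL_LOCKED    (1u << 0)
#define MODEL_WRITEBACK (1u << 1)

struct model_folio {
	unsigned int flags;
};

/* Take the folio lock; returns false if it is already held. */
static bool model_lock(struct model_folio *f)
{
	if (f->flags & MODEL_LOCKED)
		return false;
	f->flags |= MODEL_LOCKED;
	return true;
}

/* Drop the folio lock; the caller must be holding it. */
static void model_unlock(struct model_folio *f)
{
	assert(f->flags & MODEL_LOCKED);
	f->flags &= ~MODEL_LOCKED;
}

/*
 * Mark the folio as under writeback.  Returns false in the situation
 * where the kernel would trip the VM_BUG_ON_FOLIO() check, i.e. the
 * writeback flag is already set.
 */
static bool model_start_writeback(struct model_folio *f)
{
	if (f->flags & MODEL_WRITEBACK)
		return false;
	f->flags |= MODEL_WRITEBACK;
	return true;
}

/* Clear the writeback flag once the (modelled) IO has completed. */
static void model_end_writeback(struct model_folio *f)
{
	f->flags &= ~MODEL_WRITEBACK;
}
```

In this model, calling model_start_writeback() twice without a model_end_writeback() in between makes the second call return false, which matches the dumped page flags in the report (locked and writeback both set when __folio_start_writeback() was entered). It says nothing about which btrfs path sets the flag the first time.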