Date: Thu, 16 Jan 2025 15:56:57 +0100
From: Jan Kara
To: Guenter Roeck
Cc: Jan Kara, Jim Zhao, akpm@linux-foundation.org, willy@infradead.org,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [PATCH] mm/page-writeback: Consolidate wb_thresh bumping logic into __wb_calc_thresh
Message-ID: <2xndprbkr5k5qer4zb6ov35fa5ym7c36q6mcyapdh22ypqxivh@ahuvuqs47yd4>
References: <20241121100539.605818-1-jimzhao.ai@gmail.com>
	<64a44636-16ec-4a10-aeb6-e327b7f989c2@roeck-us.net>
	<0e5dc5f1-c2c2-4893-902b-4677c21a38c0@roeck-us.net>
In-Reply-To: <0e5dc5f1-c2c2-4893-902b-4677c21a38c0@roeck-us.net>

On Wed 15-01-25 08:41:43, Guenter Roeck wrote:
> On 1/15/25 08:07, Jan Kara wrote:
> > On Tue 14-01-25 07:01:08, Guenter Roeck wrote:
> > > On 1/14/25 05:19, Jan Kara wrote:
> > > > On Mon 13-01-25 15:05:25, Guenter Roeck wrote:
> > > > > On Thu, Nov 21, 2024 at 06:05:39PM +0800, Jim Zhao wrote:
> > > > > > Address the feedback from "mm/page-writeback: raise wb_thresh to prevent
> > > > > > write blocking with strictlimit"(39ac99852fca98ca44d52716d792dfaf24981f53).
> > > > > > The wb_thresh bumping logic is scattered across wb_position_ratio,
> > > > > > __wb_calc_thresh, and wb_update_dirty_ratelimit. For consistency,
> > > > > > consolidate all wb_thresh bumping logic into __wb_calc_thresh.
> > > > > >
> > > > > > Reviewed-by: Jan Kara
> > > > > > Signed-off-by: Jim Zhao
> > > > >
> > > > > This patch triggers a boot failure with one of my 'sheb' boot tests.
> > > > > It is seen when trying to boot from flash (mtd). The log says
> > > > >
> > > > > ...
> > > > > Starting network: 8139cp 0000:00:02.0 eth0: link down
> > > > > udhcpc: started, v1.33.0
> > > > > EXT2-fs (mtdblock3): error: ext2_check_folio: bad entry in directory #363: : directory entry across blocks - offset=0, inode=27393, rec_len=3072, name_len=2
> > > > > udhcpc: sending discover
> > > > > udhcpc: sending discover
> > > > > udhcpc: sending discover
> > > > > EXT2-fs (mtdblock3): error: ext2_check_folio: bad entry in directory #363: : directory entry across blocks - offset=0, inode=27393, rec_len=3072, name_len=2
> > > >
> > > > Thanks for report! Uh, I have to say I'm very confused by this. It is clear
> > > > than when ext2 detects the directory corruption (we fail checking directory
> > > > inode 363 which is likely /etc/init.d/), the boot fails in interesting
> > > > ways. What is unclear is how the commit can possibly cause ext2 directory
> > > > corruption. If you didn't verify reverting the commit fixes the issue, I'd
> > > > be suspecting bad bisection but that obviously isn't the case :-)
> > > >
> > > > Ext2 is storing directory data in the page cache so at least it uses the
> > > > subsystem which the patch impacts but how writeback throttling can cause
> > > > ext2 directory corruption is beyond me. BTW, do you recreate the root
> > > > filesystem before each boot? How exactly?
> > >
> > > I use pre-built root file systems. For sheb, they are at
> > > https://github.com/groeck/linux-build-test/tree/master/rootfs/sheb
> >
> > Thanks. So the problematic directory is /usr/share/udhcpc/ where we
> > read apparently bogus metadata at the beginning of that directory.
> >
> > > I don't think this is related to ext2 itself. Booting an ext2 image from
> > > ata/ide drive works.
> >
> > Interesting this is specific to mtd. I'll read the patch carefully again if
> > something rings a bell.
> >
>
> Interesting. Is there some endianness issue, by any chance ? I only see the problem
> with sheb (big endian), not with sh (little endian).
> I'd suspect that it is an emulation bug, but it is odd that the problem
> did not show up before.

So far I don't have a good explanation. Let me write down the facts here,
maybe it will trigger the aha effect.

1) Ext2 stores its metadata in little-endian byte order. We observe the
problem with the first directory entry in the folio. Both entry->rec_len
(16-bit) and entry->inode (32-bit) appear to be read with the wrong
endianness.

2) The function that fails is ext2_check_folio(). We kmap_local() the folio
in ext2_get_folio(), then in ext2_check_folio() we do:

	ext2_dirent *p;

	p = (ext2_dirent *)(kaddr + 0);
	rec_len = ext2_rec_len_from_disk(p->rec_len);
	^^^ value 3072 == 0x0c00 seen here instead of correct 0x000c
	this value is invalid so we go to:
	ext2_error(sb, __func__, "bad entry in directory #%lu: : %s - "
		"offset=%llu, inode=%lu, rec_len=%d, name_len=%d",
		dir->i_ino, error, folio_pos(folio) + offs,
		(unsigned long) le32_to_cpu(p->inode),
		rec_len, p->name_len);

Here rec_len is printed, so we see the wrong value. le32_to_cpu(p->inode)
is printed as well, and it also shows a number with swapped byte order (the
message contains inode number 27393 == 0x00006b01 but the proper inode
number is 363 == 0x0000016b). This actually reveals more about the problem,
because only two bytes were swapped in the inode number although we always
treat it as a 32-bit entity (a full 32-bit byteswap would have given
0x6b010000). So this would indeed point more at some architectural issue
than at a problem in the filesystem / MM. Note that to get to this point in
the boot we must have correctly byteswapped many other directory entries in
the filesystem. So the problem must be triggered by some parallel activity
happening in the system or something like that.

3) The problem appears only with MTD storage, not with IDE/SATA on the same
system + filesystem image. It is unclear how the storage influences the
reproduction, other than that it influences the timing of events in the
system.

4) The problem reliably happens with "mm/page-writeback: Consolidate
wb_thresh bumping logic into __wb_calc_thresh", not without it. All this
patch does is possibly change the limit at which processes dirtying pages
in the page cache get throttled. Thus there are fairly limited
opportunities for it to cause damage (I've checked for possible UAF issues
or memory corruption but I don't really see any such possibility there, it
is just crunching numbers from the mm counters and making a decision based
on the result). This change doesn't have a direct effect on the ext2
directory code. The only thing it does is that it possibly changes the code
alignment of the ext2 code if it gets linked afterwards into the vmlinux
image (provided ext2 is built in). Another possibility is again that it
changes the timing of events in the system due to differences in throttling
of processes dirtying the page cache.

So at this point I don't have a better explanation than to blame the HW.
What really tipped my conviction in this direction is the 16-bit byteswap
in a 32-bit entity. Hence I guess I'll ask Andrew to put Jim's patch back
into the tree if you don't object.

								Honza
-- 
Jan Kara
SUSE Labs, CR
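
For reference, the arithmetic in point 2) can be checked with a small
userspace sketch. This is not kernel code and not part of the patch under
discussion. The helper names swab16_sketch(), swab32_sketch(),
swab16_halves_sketch() and check_observation() are made up for
illustration. It only shows that the logged values match a byteswap within
16-bit units and do not match a full 32-bit byteswap:

```c
#include <assert.h>
#include <stdint.h>

/* Swap the two bytes of a 16-bit value, e.g. 0x000c -> 0x0c00. */
uint16_t swab16_sketch(uint16_t v)
{
	return (uint16_t)((v << 8) | (v >> 8));
}

/* Reverse all four bytes of a 32-bit value (a full 32-bit byteswap). */
uint32_t swab32_sketch(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) | ((v & 0x0000ff00u) << 8) |
	       ((v & 0x00ff0000u) >> 8)  | ((v & 0xff000000u) >> 24);
}

/*
 * Swap the bytes inside each 16-bit half, leaving the halves in place.
 * The upper half of inode number 363 is zero, so the logged value cannot
 * tell "both halves swapped" apart from "only the low half swapped".
 */
uint32_t swab16_halves_sketch(uint32_t v)
{
	return ((v & 0x00ff00ffu) << 8) | ((v & 0xff00ff00u) >> 8);
}

/* Compare against the numbers quoted from the boot log. */
void check_observation(void)
{
	/* rec_len: correct 0x000c (12), logged 0x0c00 (3072). */
	assert(swab16_sketch(0x000c) == 3072);
	/* inode: correct 363 (0x0000016b), logged 27393 (0x00006b01). */
	assert(swab16_halves_sketch(363) == 27393);
	/* A full 32-bit swap gives 0x6b010000, which is not what was logged. */
	assert(swab32_sketch(363) == 0x6b010000u);
}
```

Calling check_observation() from a test main() passes silently, which is
consistent with the "16-bit byteswap in a 32-bit entity" reading above.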