Message-ID: <04bbfcd0-6eb1-4a5b-ac21-b3cdf1acdc77@linux.alibaba.com>
Date: Mon, 29 Jul 2024 06:11:54 +0800
Subject: Re: [PATCH] mm/migrate: fix deadlock in migrate_pages_batch() on large folios
From: Gao Xiang
To: Matthew Wilcox
Cc: Andrew Morton, Huang Ying, linux-mm@kvack.org, linux-kernel@vger.kernel.org
References: <20240728154913.4023977-1-hsiangkao@linux.alibaba.com>

Hi,

On 2024/7/29 05:46, Matthew Wilcox wrote:
> On Sun, Jul 28, 2024 at 11:49:13PM +0800, Gao Xiang wrote:
>> It was found by compaction stress
>> test when I explicitly enable EROFS
>> compressed files to use large folios, which case I cannot reproduce with
>> the same workload if large folio support is off (current mainline).
>> Typically, filesystem reads (with locked file-backed folios) could use
>> another bdev/meta inode to load some other I/Os (e.g. inode extent
>> metadata or caching compressed data), so the locking order will be:
>
> Umm.  That is a new constraint to me.  We have two other places which
> take the folio lock in a particular order.  Writeback takes locks on
> folios belonging to the same inode in ascending ->index order.  It
> submits all the folios for write before moving on to lock other inodes,
> so it does not conflict with this new constraint you're proposing.

BTW, I don't believe this order is new or specific to EROFS. If you
consider ext4 or ext2, for example, they also use sb_bread() (buffer
heads on the bdev inode, which trigger some metadata I/Os). Taking ext2
for simplicity:

  ext2_readahead
    mpage_readahead
      ext2_get_block
        ext2_get_blocks
          ext2_get_branch
            sb_bread     <-- get some metadata used for this data I/O

>
> The other place is remap_file_range().  Both inodes in that case must be
> regular files,
>         if (!S_ISREG(inode_in->i_mode) || !S_ISREG(inode_out->i_mode))
>                 return -EINVAL;
> so this new rule is fine.
>
> Does anybody know of any _other_ ordering constraints on folio locks?  I'm
> willing to write them down ...

Personally, I can't think of any particular order between two folio
locks across different inodes, so I think batched folio locking always
needs to be handled with care.
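
To make the batched-locking concern concrete, here is a minimal userspace
sketch, not kernel code. The names sketch_folio, sketch_trylock(),
lock_batch() and unlock_batch() are hypothetical, and the folio lock bit is
modeled with a C11 atomic_bool. The assumed rule is that a batch locker
only ever uses trylock and skips any folio it cannot get, so it never
sleeps while already holding other folio locks:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct folio: only the lock bit is modeled. */
struct sketch_folio {
	atomic_bool locked;
};

/* Returns true if the lock was taken; never sleeps. */
static bool sketch_trylock(struct sketch_folio *folio)
{
	return !atomic_exchange_explicit(&folio->locked, true,
					 memory_order_acquire);
}

static void sketch_unlock(struct sketch_folio *folio)
{
	assert(atomic_load(&folio->locked));
	atomic_store_explicit(&folio->locked, false, memory_order_release);
}

/*
 * Try to lock every folio in batch[] without blocking. Folios that are
 * already locked by someone else are skipped, so the caller can retry
 * them after dropping what it holds. Returns how many entries of held[]
 * were filled in.
 */
static size_t lock_batch(struct sketch_folio **batch, size_t n,
			 struct sketch_folio **held)
{
	size_t nr_held = 0;

	for (size_t i = 0; i < n; i++) {
		if (sketch_trylock(batch[i]))
			held[nr_held++] = batch[i];
	}
	return nr_held;
}

/* Drop every lock taken by lock_batch(). */
static void unlock_batch(struct sketch_folio **held, size_t nr_held)
{
	for (size_t i = 0; i < nr_held; i++)
		sketch_unlock(held[i]);
}
```

If a read path already holds a data folio and the batch contains both that
folio and a metadata folio, lock_batch() takes only the metadata folio and
skips the data folio. Once unlock_batch() runs, the read path can take the
metadata folio and make progress.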
>
>> diff --git a/mm/migrate.c b/mm/migrate.c
>> index 20cb9f5f7446..a912e4b83228 100644
>> --- a/mm/migrate.c
>> +++ b/mm/migrate.c
>> @@ -1483,7 +1483,8 @@ static inline int try_split_folio(struct folio *folio, struct list_head *split_f
>>  {
>>  	int rc;
>>
>> -	folio_lock(folio);
>> +	if (!folio_trylock(folio))
>> +		return -EAGAIN;
>>  	rc = split_folio_to_list(folio, split_folios);
>>  	folio_unlock(folio);
>>  	if (!rc)
>
> This feels like the best quick fix to me since migration is going to
> walk the folios in a different order from writeback.  I'm surprised
> this hasn't already bitten us, to be honest.

My stress workload explicitly triggers compaction alongside other EROFS
read loads. I'm not sure whether others test like this too, but

https://lore.kernel.org/r/20240418001356.95857-1-mcgrof@kernel.org

seems like a similar load.

Thanks,
Gao Xiang

>
> (ie I don't think this is even necessarily connected to the new
> ordering constraint; I think migration and writeback can already
> deadlock)
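
For readers following the quoted hunk, its trylock-and-back-off shape can
be modeled in userspace as below. This is a sketch under stated
assumptions, not the kernel implementation: model_folio,
model_folio_trylock(), model_folio_unlock() and model_split() are
hypothetical stand-ins, and model_split() simply pretends that the split
always succeeds.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct folio: a lock bit and a page count. */
struct model_folio {
	atomic_bool locked;
	int nr_pages;
};

/* Returns true if the lock was taken; never sleeps. */
static bool model_folio_trylock(struct model_folio *folio)
{
	return !atomic_exchange_explicit(&folio->locked, true,
					 memory_order_acquire);
}

static void model_folio_unlock(struct model_folio *folio)
{
	assert(atomic_load(&folio->locked));
	atomic_store_explicit(&folio->locked, false, memory_order_release);
}

/* Placeholder for the real split: assume it always succeeds. */
static int model_split(struct model_folio *folio)
{
	folio->nr_pages = 1;
	return 0;
}

/*
 * Same control flow as the patched hunk: if the folio is already
 * locked, report -EAGAIN instead of sleeping on the lock, and leave
 * the retry decision to the caller.
 */
static int model_try_split_folio(struct model_folio *folio)
{
	int rc;

	if (!model_folio_trylock(folio))
		return -EAGAIN;
	rc = model_split(folio);
	model_folio_unlock(folio);
	return rc;
}
```

A caller that sees -EAGAIN can retry the folio on a later pass; because the
function never sleeps on the lock, it cannot wait on a lock holder that is
itself waiting for something the caller holds.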