Message-ID: <59539c02-d353-4811-bcbe-080b408f445e@suse.com>
Date: Fri, 4 Apr 2025 07:46:59 +1030
Subject: Re: Large folios and filemap_get_folios_contig()
To: Matthew Wilcox, Qu Wenruo
Cc: Linux Memory Management List, "linux-fsdevel@vger.kernel.org", linux-btrfs, vivek.kasireddy@intel.com, Andrew Morton
From: Qu Wenruo

On 2025/4/3 23:05, Matthew Wilcox wrote:
> On Thu, Apr 03, 2025 at 08:06:53PM +1030, Qu Wenruo wrote:
>> Recently I hit a bug when developing the large folios support for btrfs.
>>
>> That we call filemap_get_folios_contig(), then lock each returned folio.
>> (We also have a case where we unlock each returned folio)
>>
>> However since a large folio can be returned several times in the batch,
>> this obviously makes a deadlock, as btrfs is trying to lock the same
>> folio more than once.
>
> Sorry, what?  A large folio should only be returned once.  xas_next()
> moves to the next folio.  How is it possible that
> filemap_get_folios_contig() returns the same folio more than once?

But that's exactly what I got from filemap_get_folios_contig():

 lock_delalloc_folios: r/i=5/260 locked_folio=720896(65536) start=782336 end=819199(36864)
 lock_delalloc_folios: r/i=5/260 found_folios=1
 lock_delalloc_folios: r/i=5/260 i=0 folio=720896(65536)
 lock_delalloc_folios: r/i=5/260 found_folios=8
 lock_delalloc_folios: r/i=5/260 i=0 folio=786432(262144)
 lock_delalloc_folios: r/i=5/260 i=1 folio=786432(262144)
 lock_delalloc_folios: r/i=5/260 i=2 folio=786432(262144)
 lock_delalloc_folios: r/i=5/260 i=3 folio=786432(262144)
 lock_delalloc_folios: r/i=5/260 i=4 folio=786432(262144)
 lock_delalloc_folios: r/i=5/260 i=5 folio=786432(262144)
 lock_delalloc_folios: r/i=5/260 i=6 folio=786432(262144)
 lock_delalloc_folios: r/i=5/260 i=7 folio=786432(262144)

r/i is the root and inode number from btrfs; you can completely ignore it.
@locked_folio is the folio we already hold the lock on; the value inside the
brackets is the folio size.
@start and @end are the range we're searching for; the value inside the
brackets is the search range length.

The first iteration returns the currently locked folio, and since the part
of the search range inside that folio is only 4K, it's returned only once.

The next 8 slots are all inside the same large folio at 786432, resulting
in duplicated entries.

>
>> Then I looked into the caller of filemap_get_folios_contig() inside
>> mm/gup, and it indeed does the correct skip.
>
> ... that code looks wrong to me.

It looks like xas_find() does the correct skip by calling
xas_next_offset() -> xas_move_index() to skip to the next one.

But filemap_get_folios_contig() only calls xas_next(), which just increases
the index and doesn't really skip to the next folio.

Although I can be totally wrong, as I'm not familiar with the xarray
internals at all.

However, I totally agree that the duplicated behavior (and the extra
handling of duplicated entries) looks very wrong.

Thanks,
Qu
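
For illustration only, the numbers in the log above can be reproduced with a
small userspace sketch. This is not kernel code and not the btrfs or mm
implementation: struct fake_folio, FAKE_PAGE_SIZE, slots_in_range() and
dedup_batch() are all hypothetical names, and a 4K page size is assumed.
slots_in_range() counts how many page-sized slots of a folio fall inside the
search range, which is how often a lookup that advances one index at a time
would see that folio. dedup_batch() shows one possible way a caller might
collapse consecutive duplicates before locking.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed page size; the log above is consistent with 4K pages. */
#define FAKE_PAGE_SIZE 4096UL

/* Hypothetical stand-in for struct folio: file offset and size in bytes. */
struct fake_folio {
	unsigned long pos;
	unsigned long size;
};

/*
 * Number of page-sized slots of @folio that fall inside [start, end]
 * (both inclusive, in bytes). A lookup that walks one index at a time
 * visits the folio once per such slot.
 */
unsigned long slots_in_range(const struct fake_folio *folio,
			     unsigned long start, unsigned long end)
{
	unsigned long first = start > folio->pos ? start : folio->pos;
	unsigned long folio_last = folio->pos + folio->size - 1;
	unsigned long last = end < folio_last ? end : folio_last;

	assert(folio->size >= FAKE_PAGE_SIZE);
	if (first > last)
		return 0;
	return last / FAKE_PAGE_SIZE - first / FAKE_PAGE_SIZE + 1;
}

/*
 * Compact @batch in place so that each run of identical consecutive
 * entries is kept only once. Returns the new number of entries.
 */
size_t dedup_batch(struct fake_folio **batch, size_t nr)
{
	size_t out = 0;

	for (size_t i = 0; i < nr; i++) {
		/* Same folio as the previous kept entry: skip it. */
		if (out > 0 && batch[out - 1] == batch[i])
			continue;
		batch[out++] = batch[i];
	}
	return out;
}
```

With start=782336 and end=819199, the folio at 720896 (size 65536) covers one
slot of the range and the folio at 786432 (size 262144) covers eight, which
matches found_folios=1 and found_folios=8 in the log. Deduplicating in the
caller only works around the symptom; whether the lookup itself should skip
to the next folio is the open question in this thread.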