Message-ID: <2b48a095-97e6-43bc-9f7c-13dd31ce00b8@suse.com>
Date: Thu, 18 Jul 2024 18:20:47 +0930
Subject: Re: [PATCH 0/2] mm: skip memcg for certain address space
To: "Vlastimil Babka (SUSE)", Qu Wenruo, Michal Hocko
Cc: linux-btrfs@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, Johannes Weiner, Roman Gushchin, Shakeel Butt, Muchun Song, Cgroups, Matthew Wilcox
References: <8faa191c-a216-4da0-a92c-2456521dcf08@kernel.org> <9c0d7ce7-b17d-4d41-b98a-c50fd0c2c562@gmx.com> <9572fc2b-12b0-41a3-82dc-bb273bfdd51d@kernel.org> <3cc3e652-e058-4995-8347-337ae605ebab@suse.com>
From: Qu Wenruo

On 2024/7/18 17:58, Vlastimil Babka (SUSE) wrote:
> On 7/18/24 9:52 AM, Qu Wenruo wrote:
>>
>>
>> On 2024/7/18 16:47, Vlastimil Babka (SUSE) wrote:
>>> On 7/18/24 12:38 AM, Qu Wenruo wrote:
>> [...]
>>>> Another question is, I only see this hang with larger folios (order 2 vs
>>>> the old order 0) when adding to the same address space.
>>>>
>>>> Does the folio order have anything to do with the problem, or does a
>>>> higher order just make it more likely?
>>>
>>> I didn't spot anything in the memcg charge path that would depend on the
>>> order directly, hm. Also what kernel version was showing these soft lockups?
>>
>> The previous rc kernel. IIRC it's v6.10-rc6.
>>
>> But that needs extra btrfs patches; otherwise btrfs still only does
>> order-0 allocations and then adds the order-0 folio into the filemap.
>>
>> The extra patch just directs btrfs to allocate an order-2 folio (matching
>> the default 16K nodesize), then attach the folio to the metadata filemap.
>>
>> With extra code handling corner cases like different folio sizes etc.
>
> Hm right, but the same code is triggered for high-order folios (at least for
> user mappable page cache) today by some filesystems AFAIK, so we should be
> seeing such lockups already? btrfs case might be special in that it's for the
> internal node as you explain, but that makes no difference for
> filemap_add_folio(), right? Or is it the only user with GFP_NOFS? Also is
> that passed as gfp directly or are there some extra scoped gfp restrictions
> involved? (memalloc_..._save()).

I'm not sure about other fses, but that hang case is very metadata
heavy, and ALL folios in that btree inode filemap are order 2, since
we always allocate the order-2 folios using GFP_NOFAIL, and attach
each folio to the filemap using GFP_NOFAIL too.

Not sure if other fses can hit such a situation.

[...]
>> If I understand it correctly, we have implemented the release_folio()
>> callback, which does the btrfs metadata checks to determine if we can
>> release the current folio, and avoids releasing folios that are still
>> under IO etc.
>
> I see, thanks.
> Sounds like there might be potentially some suboptimal
> handling in that the folio will appear inactive because there's no
> references that folio_check_references() can detect, unless there's some
> folio_mark_accessed() calls involved (I see some FGP_ACCESSED in btrfs so
> maybe that's fine enough) so reclaim could consider it often, only to be
> stopped by release_folio failing.

For the page accessed part, btrfs handles it with the
mark_extent_buffer_accessed() call, which is called every time we try to
grab an extent buffer structure (the structure used to represent a
metadata block inside btrfs).

So the accessed flag part should be fine I guess?

Thanks,
Qu
>
>>>
>>> (sorry if the questions seem noob, I'm not that much familiar with the page
>>> cache side of mm)
>>
>> No worry at all, I'm also a newbie on the whole mm part.
>>
>> Thanks,
>> Qu
>>
>>>
>>>> Thanks,
>>>> Qu
>>>
>