From: Barry Song <21cnbao@gmail.com>
To: usamaarif642@gmail.com
Cc: akpm@linux-foundation.org, baohua@kernel.org, cerasuolodomenico@gmail.com,
 corbet@lwn.net, david@redhat.com, hannes@cmpxchg.org, kernel-team@meta.com,
 linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 riel@surriel.com, roman.gushchin@linux.dev, rppt@kernel.org,
 ryan.roberts@arm.com, shakeel.butt@linux.dev, willy@infradead.org,
 yuzhao@google.com
Subject: Re: [PATCH v3 4/6] mm: Introduce a pageflag for partially mapped folios
Date: Thu, 15 Aug 2024 11:05:33 +1200
Message-Id: <20240814230533.54938-1-21cnbao@gmail.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <88d411c5-6d66-4d41-ae86-e0f943e5fb91@gmail.com>
References: <88d411c5-6d66-4d41-ae86-e0f943e5fb91@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

On Thu, Aug 15, 2024 at 12:37 AM Usama Arif wrote:

[snip]

> >>>>
> >>>> -void deferred_split_folio(struct folio *folio)
> >>>> +void deferred_split_folio(struct folio *folio, bool partially_mapped)
> >>>>  {
> >>>>         struct deferred_split *ds_queue = get_deferred_split_queue(folio);
> >>>>  #ifdef CONFIG_MEMCG
> >>>> @@ -3485,14 +3487,17 @@ void deferred_split_folio(struct folio *folio)
> >>>>         if (folio_test_swapcache(folio))
> >>>>                 return;
> >>>>
> >>>> -       if (!list_empty(&folio->_deferred_list))
> >>>> -               return;
> >>>> -
> >>>>         spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
> >>>> +       if (partially_mapped)
> >>>> +               folio_set_partially_mapped(folio);
> >>>> +       else
> >>>> +               folio_clear_partially_mapped(folio);
> >>>>         if (list_empty(&folio->_deferred_list)) {
> >>>> -               if (folio_test_pmd_mappable(folio))
> >>>> -                       count_vm_event(THP_DEFERRED_SPLIT_PAGE);
> >>>> -               count_mthp_stat(folio_order(folio), MTHP_STAT_SPLIT_DEFERRED);
> >>>> +               if (partially_mapped) {
> >>>> +                       if (folio_test_pmd_mappable(folio))
> >>>> +                               count_vm_event(THP_DEFERRED_SPLIT_PAGE);
> >>>> +                       count_mthp_stat(folio_order(folio), MTHP_STAT_SPLIT_DEFERRED);
> >>>
> >>> This code completely broke MTHP_STAT_SPLIT_DEFERRED for PMD_ORDER. It
> >>> added the folio to the deferred_list as entirely_mapped
> >>> (partially_mapped == false). However, when partially_mapped later
> >>> becomes true, there's no opportunity to add it again, as it is already
> >>> on the list. Are you consistently seeing the counter for PMD_ORDER as 0?
> >>>
> >>
> >> Ah, I see it. This should fix it?
> >>
> >> -void deferred_split_folio(struct folio *folio)
> >> +/* partially_mapped=false won't clear PG_partially_mapped folio flag */
> >> +void deferred_split_folio(struct folio *folio, bool partially_mapped)
> >>  {
> >>         struct deferred_split *ds_queue = get_deferred_split_queue(folio);
> >>  #ifdef CONFIG_MEMCG
> >> @@ -3485,14 +3488,14 @@ void deferred_split_folio(struct folio *folio)
> >>         if (folio_test_swapcache(folio))
> >>                 return;
> >>
> >> -       if (!list_empty(&folio->_deferred_list))
> >> -               return;
> >> -
> >>         spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
> >> -       if (list_empty(&folio->_deferred_list)) {
> >> +       if (partially_mapped) {
> >> +               folio_set_partially_mapped(folio);
> >>                 if (folio_test_pmd_mappable(folio))
> >>                         count_vm_event(THP_DEFERRED_SPLIT_PAGE);
> >>                 count_mthp_stat(folio_order(folio), MTHP_STAT_SPLIT_DEFERRED);
> >> +       }
> >> +       if (list_empty(&folio->_deferred_list)) {
> >>                 list_add_tail(&folio->_deferred_list, &ds_queue->split_queue);
> >>                 ds_queue->split_queue_len++;
> >>  #ifdef CONFIG_MEMCG
> >>
> >
> > Not enough, as deferred_split_folio(true) won't be called if the folio is
> > already on the deferred_list, because of this check in __folio_remove_rmap():
> >
> >         if (partially_mapped && folio_test_anon(folio) &&
> >             list_empty(&folio->_deferred_list))
> >                 deferred_split_folio(folio, true);
> >
> > So you will still see 0.
> >
>
> Ah yes, thanks.
>
> So the below diff over the current v3 series should work for all cases:
>
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index b4d72479330d..482e3ab60911 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -3483,6 +3483,7 @@ void __folio_undo_large_rmappable(struct folio *folio)
>         spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
>  }
>
> +/* partially_mapped=false won't clear PG_partially_mapped folio flag */
>  void deferred_split_folio(struct folio *folio, bool partially_mapped)
>  {
>         struct deferred_split *ds_queue = get_deferred_split_queue(folio);
> @@ -3515,16 +3516,16 @@ void deferred_split_folio(struct folio *folio, bool partially_mapped)
>                 return;
>
>         spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
> -       if (partially_mapped)
> +       if (partially_mapped) {
>                 folio_set_partially_mapped(folio);
> -       else
> -               folio_clear_partially_mapped(folio);
> +               if (folio_test_pmd_mappable(folio))
> +                       count_vm_event(THP_DEFERRED_SPLIT_PAGE);
> +               count_mthp_stat(folio_order(folio), MTHP_STAT_SPLIT_DEFERRED);
> +       } else {
> +               /* partially mapped folios cannot become partially unmapped */
> +               VM_WARN_ON_FOLIO(folio_test_partially_mapped(folio), folio);
> +       }
>         if (list_empty(&folio->_deferred_list)) {
> -               if (partially_mapped) {
> -                       if (folio_test_pmd_mappable(folio))
> -                               count_vm_event(THP_DEFERRED_SPLIT_PAGE);
> -                       count_mthp_stat(folio_order(folio), MTHP_STAT_SPLIT_DEFERRED);
> -               }
>                 list_add_tail(&folio->_deferred_list, &ds_queue->split_queue);
>                 ds_queue->split_queue_len++;
>  #ifdef CONFIG_MEMCG
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 9ad558c2bad0..4c330635aa4e 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1578,7 +1578,7 @@ static __always_inline void __folio_remove_rmap(struct folio *folio,
>          * Check partially_mapped first to ensure it is a large folio.
>          */
>         if (partially_mapped && folio_test_anon(folio) &&
> -           list_empty(&folio->_deferred_list))
> +           !folio_test_partially_mapped(folio))
>                 deferred_split_folio(folio, true);
>
>         __folio_mod_stat(folio, -nr, -nr_pmdmapped);
>

This is an improvement, but there's still an issue: two or more threads can
execute deferred_split_folio() simultaneously, which could lead to
THP_DEFERRED_SPLIT_PAGE and MTHP_STAT_SPLIT_DEFERRED being counted multiple
times for the same folio. We should re-check the partially_mapped status after
acquiring the spinlock:

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index f401ceded697..3d247826fb95 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3607,10 +3607,12 @@ void deferred_split_folio(struct folio *folio, bool partially_mapped)
 	spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
 	if (partially_mapped) {
-		folio_set_partially_mapped(folio);
-		if (folio_test_pmd_mappable(folio))
-			count_vm_event(THP_DEFERRED_SPLIT_PAGE);
-		count_mthp_stat(folio_order(folio), MTHP_STAT_SPLIT_DEFERRED);
+		if (!folio_test_partially_mapped(folio)) {
+			folio_set_partially_mapped(folio);
+			if (folio_test_pmd_mappable(folio))
+				count_vm_event(THP_DEFERRED_SPLIT_PAGE);
+			count_mthp_stat(folio_order(folio), MTHP_STAT_SPLIT_DEFERRED);
+		}
 	} else {
 		/* partially mapped folios cannot become partially unmapped */
 		VM_WARN_ON_FOLIO(folio_test_partially_mapped(folio), folio);
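
To make the race easier to see, here is a minimal user-space sketch of the same
re-check-under-the-lock pattern. This is only an illustration under assumptions
(mock_folio, split_stat and mock_deferred_split are made-up names, and a
pthread mutex stands in for split_queue_lock), not kernel code:

/*
 * Minimal user-space sketch (not kernel code) of the race above: two
 * threads both believe they made the folio partially mapped and call a
 * deferred_split_folio()-like helper. Re-checking the flag only after
 * taking the lock keeps the counter from being bumped twice.
 */
#include <pthread.h>
#include <stdbool.h>
#include <stdio.h>

struct mock_folio {
	bool partially_mapped;		/* stands in for PG_partially_mapped */
	pthread_mutex_t lock;		/* stands in for split_queue_lock */
};

static long split_stat;			/* stands in for MTHP_STAT_SPLIT_DEFERRED */

static void mock_deferred_split(struct mock_folio *folio, bool partially_mapped)
{
	pthread_mutex_lock(&folio->lock);
	if (partially_mapped && !folio->partially_mapped) {
		/* only the first thread to get here sets the flag and counts */
		folio->partially_mapped = true;
		split_stat++;
	}
	pthread_mutex_unlock(&folio->lock);
}

static void *unmap_thread(void *arg)
{
	/* both threads race into the deferred-split path */
	mock_deferred_split(arg, true);
	return NULL;
}

int main(void)
{
	struct mock_folio folio = { .lock = PTHREAD_MUTEX_INITIALIZER };
	pthread_t t1, t2;

	pthread_create(&t1, NULL, unmap_thread, &folio);
	pthread_create(&t2, NULL, unmap_thread, &folio);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);

	/* with the re-check under the lock this always prints 1, never 2 */
	printf("split_stat = %ld\n", split_stat);
	return 0;
}

Built with something like "cc -pthread race.c", this always prints
split_stat = 1, which mirrors what the diff above intends for the THP/mTHP
deferred-split counters.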