From: Zhenyu Zhang
Date: Fri, 25 Nov 2022 15:40:03 +0800
Subject: Re: [PATCH v2] mm: migrate: Fix THP's mapcount on isolation
To: Guowen Shan, linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, akpm@linux-foundation.org,
    william.kucharski@oracle.com, ziy@nvidia.com,
    kirill.shutemov@linux.intel.com, apopple@nvidia.com, hughd@google.com,
    willy@infradead.org, shan.gavin@gmail.com, David Hildenbrand

With the patch applied, I can no longer hit the memory hot-remove failure
in the environment where the issue was initially found.

Tested-by: Zhenyu Zhang

On Thu, Nov 24, 2022 at 10:09 PM David Hildenbrand wrote:
>
> On 24.11.22 14:22, David Hildenbrand wrote:
> > On 24.11.22 13:55, Gavin Shan wrote:
> >> On 11/24/22 6:43 PM, David Hildenbrand wrote:
> >>> On 24.11.22 11:21, Gavin Shan wrote:
> >>>> On 11/24/22 6:09 PM, David Hildenbrand wrote:
> >>>>> On 24.11.22 10:55, Gavin Shan wrote:
> >>>>>> The issue is reported when removing memory through virtio_mem device.
> >>>>>> The transparent huge page, experienced copy-on-write fault, is wrongly
> >>>>>> regarded as pinned. The transparent huge page is escaped from being
> >>>>>> isolated in isolate_migratepages_block(). The transparent huge page
> >>>>>> can't be migrated and the corresponding memory block can't be put
> >>>>>> into offline state.
> >>>>>>
> >>>>>> Fix it by replacing page_mapcount() with total_mapcount(). With this,
> >>>>>> the transparent huge page can be isolated and migrated, and the memory
> >>>>>> block can be put into offline state. Besides, the page's refcount is
> >>>>>> increased a bit earlier to avoid the page is released when the check
> >>>>>> is executed.
> >>>>>
> >>>>> Did you look into handling pages that are in the swapcache case as well?
> >>>>>
> >>>>> See is_refcount_suitable() in mm/khugepaged.c.
> >>>>>
> >>>>> Should be easy to reproduce, let me know if you need inspiration.
> >>>>>
> >>>>
> >>>> Nope, I didn't look into the case. Please elaborate the details so that
> >>>> I can reproduce it firstly.
> >>>
> >>>
> >>> A simple reproducer would be (on a system with ordinary swap (not zram))
> >>>
> >>> 1) mmap a region (MAP_ANON|MAP_PRIVATE) that can hold a THP
> >>>
> >>> 2) Enable THP for that region (MADV_HUGEPAGE)
> >>>
> >>> 3) Populate a THP (e.g., write access)
> >>>
> >>> 4) PTE-map the THP, for example, using MADV_FREE on the last subpage
> >>>
> >>> 5) Trigger swapout of the THP, for example, using MADV_PAGEOUT
> >>>
> >>> 6) Read-access to some subpages to fault them in from the swapcache
> >>>
> >>>
> >>> Now you'd have a THP, which
> >>>
> >>> 1) Is partially PTE-mapped into the page table
> >>> 2) Is in the swapcache (each subpage should have one reference from the swapcache)
> >>>
> >>>
> >>> Now we could test, if alloc_contig_range() will still succeed (e.g., using virtio-mem).
> >>>
> >>
> >> Thanks for the details. Step (4) and (5) can be actually combined. To swap part of
> >> the THP (e.g. one sub-page) will force the THP to be split.
> >>
> >> I followed your steps in the attached program, there is no issue to do memory hot-remove
> >> through virtio-mem with or without this patch.
> >
> > Interesting. But I don't really see how we could pass this check with a
> > page that's in the swapcache, maybe I'm missing something else.
> >
> > I'll try to see if I can reproduce it.
> >
>
> After some unsuccessful attempts and many head-scratches, I realized
> that it's quite simple why we don't have to worry about swapcache pages
> here:
>
> page_mapping() is != NULL for pages in the swapcache: folio_mapping()
> makes this rather obvious:
>
>         if (unlikely(folio_test_swapcache(folio)))
>                 return swap_address_space(folio_swap_entry(folio));
>
>
> I think the get_page_unless_zero() might also be a fix for the
> page_mapping() call, smells like something could blow up on concurrent
> page freeing. (what about concurrent removal from the swapcache? nobody
> knows :) )
>
>
> Thanks Gavin!
>
> Acked-by: David Hildenbrand
>
>
> --
> Thanks,
>
> David / dhildenb
>
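
For reference, the six reproducer steps quoted above could be sketched in C
roughly as follows. This is not the program attached earlier in the thread,
only a minimal illustration under stated assumptions: a 2 MiB PMD-sized THP
(on arm64 with 64K base pages the THP is much larger, so THP_SIZE would need
adjusting), ordinary swap being enabled, and a kernel new enough to support
MADV_PAGEOUT. The names thp_align_up() and swapcache_thp_repro() are made up
for this sketch, and the fallback MADV_* values assume the generic Linux ABI.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Fallbacks for older libc headers (generic Linux ABI values, an assumption). */
#ifndef MADV_FREE
#define MADV_FREE 8
#endif
#ifndef MADV_PAGEOUT
#define MADV_PAGEOUT 21
#endif

/* Assumption: 2 MiB PMD-sized THP (e.g. x86-64 or arm64 with 4K base pages). */
#define THP_SIZE (2UL << 20)

/* Round an address up to the next THP_SIZE boundary. */
uintptr_t thp_align_up(uintptr_t addr)
{
	return (addr + THP_SIZE - 1) & ~(uintptr_t)(THP_SIZE - 1);
}

/*
 * Walk through steps 1) to 6). Returns the THP-aligned address on success
 * (the mapping is left in place), or NULL if the kernel rejected a step.
 */
char *swapcache_thp_repro(void)
{
	size_t psz = (size_t)sysconf(_SC_PAGESIZE);
	size_t len = 2 * THP_SIZE;	/* guarantees one aligned THP-sized window */
	volatile char sink;
	char *map, *thp;

	/* 1) Anonymous private mapping that can hold a THP. */
	map = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED) {
		perror("mmap");
		return NULL;
	}
	thp = (char *)thp_align_up((uintptr_t)map);
	assert(((uintptr_t)thp & (THP_SIZE - 1)) == 0);

	/* 2) Enable THP for the aligned window. */
	if (madvise(thp, THP_SIZE, MADV_HUGEPAGE)) {
		perror("madvise(MADV_HUGEPAGE)");
		goto fail;
	}

	/* 3) Populate it with a write access. */
	memset(thp, 0x5a, THP_SIZE);

	/* 4) PTE-map the THP via MADV_FREE on the last subpage. */
	if (madvise(thp + THP_SIZE - psz, psz, MADV_FREE)) {
		perror("madvise(MADV_FREE)");
		goto fail;
	}

	/* 5) Ask the kernel to swap the range out. */
	if (madvise(thp, THP_SIZE, MADV_PAGEOUT)) {
		perror("madvise(MADV_PAGEOUT)");
		goto fail;
	}

	/* 6) Read a few subpages to fault them back in from the swapcache. */
	for (size_t i = 0; i < 16; i++)
		sink = thp[i * psz];
	(void)sink;

	return thp;
fail:
	munmap(map, len);
	return NULL;
}

#ifdef REPRO_STANDALONE
int main(void)
{
	char *thp = swapcache_thp_repro();

	if (!thp)
		return 1;
	printf("range at %p prepared; try memory hot-remove now\n", (void *)thp);
	pause();	/* keep the mapping alive while testing */
	return 0;
}
#endif
```

Built with something like "cc -O2 -DREPRO_STANDALONE repro.c", the program
keeps the mapping alive so that alloc_contig_range() can be exercised from
outside (e.g. by unplugging memory through virtio-mem). Whether the range
really ends up as a partially PTE-mapped THP in the swapcache depends on the
THP and swap configuration of the system, so this only sets up the
preconditions described in the thread and guarantees nothing beyond that.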