From: David Hildenbrand
Organization: Red Hat
To: Gavin Shan, linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, alexander.h.duyck@linux.intel.com,
    akpm@linux-foundation.org, shan.gavin@gmail.com, Anshuman Khandual
Subject: Re: [RFC PATCH] mm/page_reporting: Adjust threshold according to MAX_ORDER
Date: Tue, 1 Jun 2021 10:01:25 +0200
Message-ID: <76516781-6a70-f2b0-f3e3-da999c84350f@redhat.com>
In-Reply-To: <20210601033319.100737-1-gshan@redhat.com>

On 01.06.21 05:33, Gavin Shan wrote:
> The PAGE_REPORTING_MIN_ORDER is equal to @pageblock_order, taken as
> minimal order (threshold) to trigger page reporting.
> The page reporting is never triggered with the following configurations
> and settings on aarch64. In the particular scenario, the page reporting
> won't be triggered until the largest (2 ^ (MAX_ORDER-1)) free area is
> achieved from the page freeing. The condition is very hard, or even
> impossible to be met.
>
>    CONFIG_ARM64_PAGE_SHIFT:              16
>    CONFIG_HUGETLB_PAGE:                  Y
>    CONFIG_HUGETLB_PAGE_SIZE_VARIABLE:    N
>    pageblock_order:                      13
>    CONFIG_FORCE_MAX_ZONEORDER:           14
>    MAX_ORDER:                            14
>
> The issue can be reproduced in VM, running kernel with above configurations
> and settings. The 'memhog' is used inside the VM to access 512MB anonymous
> area. The QEMU's RSS doesn't drop accordingly after 'memhog' exits.
>
>    /home/gavin/sandbox/qemu.main/build/qemu-system-aarch64 \
>    -accel kvm -machine virt,gic-version=host \
>    -cpu host -smp 8,sockets=2,cores=4,threads=1 -m 4096M,maxmem=64G \
>    -object memory-backend-ram,id=mem0,size=2048M \
>    -object memory-backend-ram,id=mem1,size=2048M \
>    -numa node,nodeid=0,cpus=0-3,memdev=mem0 \
>    -numa node,nodeid=1,cpus=4-7,memdev=mem1 \
>      : \
>    -device virtio-balloon-pci,id=balloon0,free-page-reporting=yes
>
> This tries to fix the issue by adjusting the threshold to the smaller value
> of @pageblock_order and (MAX_ORDER/2). With this applied, the QEMU's RSS
> drops after 'memhog' exits.

IIRC, we use pageblock_order to

a) Reduce the free page reporting overhead. Reporting on small chunks
   can make us report constantly with little system activity.

b) Avoid splitting THP in the hypervisor, avoiding downgraded VM
   performance.

c) Avoid affecting creation of pageblock_order pages while hinting is
   active. I think there are cases where "temporary pulling sub-pageblock
   pages" can negatively affect creation of pageblock_order pages.
   Concurrent compaction would be one of these cases.
The monstrosity called aarch64 64k is really special in that sense,
because a) does not apply because pageblocks are just very big, b)
sometimes does not apply because our VM either isn't backed by (rare)
512MB THP or uses 4k with 2MB THP, and c) similarly doesn't apply in
smallish VMs because we don't really happen to create 512MB THP either way.

For example, going on x86-64 from reporting 2MB to something like 32KB
is absolutely undesirable.

I think if we want to go down that path (and I am not 100% sure yet if
we want to), we really want to treat only the special case in a special
way. Note that even when doing it only for aarch64 with 64k, you will
still end up splitting THP in a hypervisor if it uses 64k base pages
(b), and you can affect creation of THP, for example, when compacting (c),
so there is a negative side to that.

-- 
Thanks,

David / dhildenb