Message-ID: <63a0364b-a2e0-48c2-b255-e976112deeb1@redhat.com>
Date: Fri, 12 Jul 2024 15:39:20 +1000
From: Gavin Shan
To: David Hildenbrand, Matthew Wilcox
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, akpm@linux-foundation.org, william.kucharski@oracle.com, ryan.roberts@arm.com, shan.gavin@gmail.com
Subject: Re: [PATCH] mm/huge_memory: Avoid PMD-size page cache if needed
References: <20240711104840.200573-1-gshan@redhat.com>

On 7/12/24 7:03 AM, David Hildenbrand wrote:
> On 11.07.24 22:46,
> Matthew Wilcox wrote:
>> On Thu, Jul 11, 2024 at 08:48:40PM +1000, Gavin Shan wrote:
>>> +++ b/mm/huge_memory.c
>>> @@ -136,7 +136,8 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
>>>           while (orders) {
>>>               addr = vma->vm_end - (PAGE_SIZE << order);
>>> -            if (thp_vma_suitable_order(vma, addr, order))
>>> +            if (!(vma->vm_file && order > MAX_PAGECACHE_ORDER) &&
>>> +                thp_vma_suitable_order(vma, addr, order))
>>>                   break;
>>
>> Why does 'orders' even contain potential orders that are larger than
>> MAX_PAGECACHE_ORDER?
>>
>> We do this at the top:
>>
>>          orders &= vma_is_anonymous(vma) ?
>>                          THP_ORDERS_ALL_ANON : THP_ORDERS_ALL_FILE;
>>
>> include/linux/huge_mm.h:#define THP_ORDERS_ALL_FILE     (BIT(PMD_ORDER) | BIT(PUD_ORDER))
>>
>> ... and that seems very wrong.  We support all kinds of orders for
>> files, not just PMD order.  We don't support PUD order at all.
>>
>> What the hell is going on here?
>
> yes, that's just absolutely confusing. I mentioned it to Ryan lately that we should clean that up (I wanted to look into that, but am happy if someone else can help).
>
> There should likely be different defines for
>
> DAX (PMD|PUD)
>
> SHMEM (PMD) -- but soon more. Not sure if we want separate ANON_SHMEM for the time being. Hm. But shmem is already handled separately, so maybe we can just ignore shmem here.
>
> PAGECACHE (1 .. MAX_PAGECACHE_ORDER)
>
> ? But it's still unclear to me.
>
> At least DAX must stay special I think, and PAGECACHE should be capped at MAX_PAGECACHE_ORDER.
>

David, I can help to clean it up. Could you please confirm that the
following changes are what you're suggesting? Hopefully there is nothing
I've missed. The original issue is fixed by these changes: with them
applied, madvise(MADV_COLLAPSE) fails with errno 22 (EINVAL) in the test
program. The Fixes tag needs to be adjusted as well.

Fixes: 3485b88390b0 ("mm: thp: introduce multi-size THP sysfs interface")

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 2aa986a5cd1b..45909efb0ef0 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -74,7 +74,12 @@ extern struct kobj_attribute shmem_enabled_attr;
 /*
  * Mask of all large folio orders supported for file THP.
  */
-#define THP_ORDERS_ALL_FILE	(BIT(PMD_ORDER) | BIT(PUD_ORDER))
+#define THP_ORDERS_ALL_FILE_DAX		\
+	((BIT(PMD_ORDER) | BIT(PUD_ORDER)) & (BIT(MAX_PAGECACHE_ORDER + 1) - 1))
+#define THP_ORDERS_ALL_FILE_DEFAULT	\
+	((BIT(MAX_PAGECACHE_ORDER + 1) - 1) & ~BIT(0))
+#define THP_ORDERS_ALL_FILE		\
+	(THP_ORDERS_ALL_FILE_DAX | THP_ORDERS_ALL_FILE_DEFAULT)
 
 /*
  * Mask of all large folio orders supported for THP.
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 2120f7478e55..4690f33afaa6 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -88,9 +88,17 @@ unsigned long __thp_vma_allowable_orders(struct vm_area_struct *vma,
 	bool smaps = tva_flags & TVA_SMAPS;
 	bool in_pf = tva_flags & TVA_IN_PF;
 	bool enforce_sysfs = tva_flags & TVA_ENFORCE_SYSFS;
+	unsigned long supported_orders;
+
 	/* Check the intersection of requested and supported orders. */
-	orders &= vma_is_anonymous(vma) ?
-			THP_ORDERS_ALL_ANON : THP_ORDERS_ALL_FILE;
+	if (vma_is_anonymous(vma))
+		supported_orders = THP_ORDERS_ALL_ANON;
+	else if (vma_is_dax(vma))
+		supported_orders = THP_ORDERS_ALL_FILE_DAX;
+	else
+		supported_orders = THP_ORDERS_ALL_FILE_DEFAULT;
+
+	orders &= supported_orders;
 	if (!orders)
 		return 0;
 

Thanks,
Gavin
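
For illustration only (not part of the patch): a minimal userspace sketch of how the two proposed masks evaluate. The helper names file_dax_orders() and file_default_orders() are hypothetical, and the order values in main() are assumptions picked to resemble a 64KB-page and a 4KB-page configuration; the real PMD_ORDER, PUD_ORDER and MAX_PAGECACHE_ORDER depend on the architecture and kernel config. Under the first set of assumed values, the DAX mask as written comes out empty, because both the PMD and PUD orders are above the page cache cap.

```c
#include <assert.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's BIT() macro. */
#define BIT(n) (1UL << (n))

/*
 * Mirrors the proposed THP_ORDERS_ALL_FILE_DAX: PMD and PUD orders,
 * masked down to orders no larger than max_pagecache_order.
 */
static unsigned long file_dax_orders(unsigned int pmd_order,
				     unsigned int pud_order,
				     unsigned int max_pagecache_order)
{
	return (BIT(pmd_order) | BIT(pud_order)) &
	       (BIT(max_pagecache_order + 1) - 1);
}

/*
 * Mirrors the proposed THP_ORDERS_ALL_FILE_DEFAULT: every order from 1
 * up to max_pagecache_order (order 0 is cleared).
 */
static unsigned long file_default_orders(unsigned int max_pagecache_order)
{
	return (BIT(max_pagecache_order + 1) - 1) & ~BIT(0);
}

#ifdef DEMO_MAIN
int main(void)
{
	/* Assumed values: PMD_ORDER=13, PUD_ORDER=26, MAX_PAGECACHE_ORDER=11. */
	unsigned long dax_a = file_dax_orders(13, 26, 11);
	unsigned long def_a = file_default_orders(11);

	/* Assumed values: PMD_ORDER=9, PUD_ORDER=18, MAX_PAGECACHE_ORDER=9. */
	unsigned long dax_b = file_dax_orders(9, 18, 9);
	unsigned long def_b = file_default_orders(9);

	assert(dax_a == 0UL);      /* both orders are above the cap */
	assert(def_a == 0xFFEUL);  /* orders 1..11 */
	assert(dax_b == BIT(9));   /* only the PMD order survives */
	assert(def_b == 0x3FEUL);  /* orders 1..9 */

	printf("config A: dax=%#lx default=%#lx\n", dax_a, def_a);
	printf("config B: dax=%#lx default=%#lx\n", dax_b, def_b);
	return 0;
}
#endif
```

Build with "cc -DDEMO_MAIN" to run the demo; without that define the file only provides the two helpers.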