From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 65D06D33A28 for ; Tue, 5 Nov 2024 18:46:52 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id DB2856B0088; Tue, 5 Nov 2024 13:46:51 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id D61AB6B0089; Tue, 5 Nov 2024 13:46:51 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C03216B008A; Tue, 5 Nov 2024 13:46:51 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id A0C706B0088 for ; Tue, 5 Nov 2024 13:46:51 -0500 (EST) Received: from smtpin23.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 1DB651C76B2 for ; Tue, 5 Nov 2024 18:46:48 +0000 (UTC) X-FDA: 82752921888.23.0EE0F88 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by imf08.hostedemail.com (Postfix) with ESMTP id B4B7116002D for ; Tue, 5 Nov 2024 18:46:23 +0000 (UTC) Authentication-Results: imf08.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=Jq6Gfpdd; dmarc=pass (policy=none) header.from=redhat.com; spf=pass (imf08.hostedemail.com: domain of peterx@redhat.com designates 170.10.129.124 as permitted sender) smtp.mailfrom=peterx@redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1730832238; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=LYwp8KIugtxkap4ul3M8EPvAlgEl05pvj3U2wYQAfhc=; b=orBgr64uZdenaiA/uzKMnvLn2bmjYEOJUuko2GBxlo5nNbUl2ly3NrUXwzZ9XEmWimBRlZ ccJJXg41mVyJPgMS8HMvESvaf4CW1XcFgsyvfQncfeLmL9LBac9edBcPRBOjII7bSzGqNz NvTuIcxuo0Ryf4b6TV0FfbEIFF4UQwM= ARC-Authentication-Results: i=1; imf08.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=Jq6Gfpdd; dmarc=pass (policy=none) header.from=redhat.com; spf=pass (imf08.hostedemail.com: domain of peterx@redhat.com designates 170.10.129.124 as permitted sender) smtp.mailfrom=peterx@redhat.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1730832238; a=rsa-sha256; cv=none; b=p1BNGkFhYw3Vpzl0R7IZI+HyU+PNc0t5UyCn9C9lkhv+4E1rp7Z4Jb7qldaSCDbpGERDdG fm2hQN51jc8xu5kyqKka6CVveQVR7zmUvC7Yw/0pXdhHI0GLFmaJ+P1XX5TyBlvDcVjf3s fWNEsZYVvT5ddREa1m2v1f09WbklyjM= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1730832405; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=LYwp8KIugtxkap4ul3M8EPvAlgEl05pvj3U2wYQAfhc=; b=Jq6GfpddmgmnZ1BUSLOQQPXxt4sPccmyJT0f50FnlyhF9TTsfqTS08OeksQ3l0USI1Qrkw hn+UMyh9KvENGQfrPWt8tZrlF2pDqzQl9UOwitIWi5nr37neSuzY8DzKLmFKSVe9uFel2s e1kwqUMBrtXssorqkcCS1uAE3TMxwsg= Received: from mail-qv1-f69.google.com (mail-qv1-f69.google.com [209.85.219.69]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-691-2pmk1KT_PKSiATzYoiqGIg-1; Tue, 05 Nov 2024 13:46:44 -0500 X-MC-Unique: 2pmk1KT_PKSiATzYoiqGIg-1 Received: by mail-qv1-f69.google.com with SMTP id 6a1803df08f44-6cbebfc1725so96932666d6.1 for ; Tue, 05 Nov 2024 10:46:44 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1730832404; x=1731437204; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=LYwp8KIugtxkap4ul3M8EPvAlgEl05pvj3U2wYQAfhc=; b=mg5IuRMRWRrQu1xTNNOYJG4GbYKdjeLjF14E1MqbccwP+0gONWmloNXXVhbcmWfNSt 8hXAK/pGRGZDC93bl7Pn1uFGDJAXoOaita92MwZyxsil8nv4J8XJS9FJV5chmidtk21D aQ9mmZdnOrv6Fr+340rEI3JGMroTQDR3ag8elv8ZZ1Tqaw3Ad71FzdCAtk5OGPntE8+0 WrvWZ/8ox6ifDlJk8w6OD6+Ig4m1iPm35jCrcCJqwqi5N9BGfpqBoZ1/Uu8ns5yXq1FA outfCnKjycFmIqiYZHQQuLG0yChKbRqhoFvQVaXMzw/a293dDQUseJeqh3+a8OADhPGB gLBw== X-Forwarded-Encrypted: i=1; AJvYcCW3mhAbCzKRQ6Gcq6GCTfw5CAUw0vV/vA3xbjLiPC8TNCc51ycvWB0NdHNxXREEXbpjwxX3A76ZSA==@kvack.org X-Gm-Message-State: AOJu0YyzXCo1Ek7QVkWHgV3GBWHE9geQ4KCIV4dgT8q7AO1mgdgTteAJ qWKoCpOQQhqzk0/vY+TVxS3MHisDLCH51c+l+0aPJB6DGMVTQdvCS6bnFc8N/n+Ug75N0AoBXl9 vsnMRhqT2GSA0/Pwqnbovgtowkp38h414+CgV/ashaIqfxhBW X-Received: by 2002:a05:6214:16d1:b0:6d3:6577:ab0e with SMTP id 6a1803df08f44-6d36577ad52mr176219956d6.25.1730832403694; Tue, 05 Nov 2024 10:46:43 -0800 (PST) X-Google-Smtp-Source: AGHT+IHJqKYJHWZJB7IdidyerdlnGnmwlSuAXCaI3vemSa5qwsTCH66dfDEtOB+NFfPtBagp0pe3Vg== X-Received: by 2002:a05:6214:16d1:b0:6d3:6577:ab0e with SMTP id 6a1803df08f44-6d36577ad52mr176219756d6.25.1730832403337; Tue, 05 Nov 2024 10:46:43 -0800 (PST) Received: from x1n (pool-99-254-114-190.cpe.net.cable.rogers.com. [99.254.114.190]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-6d354177d2fsm62931026d6.107.2024.11.05.10.46.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 05 Nov 2024 10:46:42 -0800 (PST) Date: Tue, 5 Nov 2024 13:46:39 -0500 From: Peter Xu To: Ackerley Tng Cc: muchun.song@linux.dev, akpm@linux-foundation.org, rientjes@google.com, fvdl@google.com, jthoughton@google.com, david@redhat.com, isaku.yamahata@intel.com, zhiquan1.li@intel.com, fan.du@intel.com, jun.miao@intel.com, tabba@google.com, quic_eberman@quicinc.com, roypat@amazon.co.uk, jgg@nvidia.com, jhubbard@nvidia.com, seanjc@google.com, pbonzini@redhat.com, erdemaktas@google.com, vannapurve@google.com, pgonda@google.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org Subject: Re: [RFC PATCH 2/3] mm: hugetlb: Refactor vma_has_reserves() to should_use_hstate_resv() Message-ID: References: <3d1946d01f63104de913c0979b5a596e2add1672.1728684491.git.ackerleytng@google.com> MIME-Version: 1.0 In-Reply-To: <3d1946d01f63104de913c0979b5a596e2add1672.1728684491.git.ackerleytng@google.com> X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8 Content-Disposition: inline X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: B4B7116002D X-Stat-Signature: 5w6wxhuqo794uu558s6iaq35qt1wh3x5 X-Rspam-User: X-HE-Tag: 1730832383-365756 X-HE-Meta: U2FsdGVkX194ATCuS6krDNOGZ7B2MRAKIv63kJ1FUsMAKyzMj5dpdQZ5Rec087mRHlcyP/ms1CLphYL2LL3BLXIaM+bQRTkLhbwq/V69OxWx6M86A/+LddhI22PDctTZCFLAPsekdvwKNHJXp/q9ltcJ9W8mFDH0ZAxtl9Nh3a8y7PzlRvsVrevmiCBmzgNKHLsCfxWo/NxFey6a7ojPBuJnBgBncuWSvHQbHSYRChRvp5W2q/A0TFp1uNTWFpX44AlNxVmMTcmSqre14tP+sfHxTv82NoLfnI48X+WI6muaVSZumVpjJ8wWvMNOOtzLkLAYT1bs1uReGbRzwABFKxlz//E5QUKf+TLtRgdsDQ+QyjzBXiQm7MMRrMvTNwsROzLSZZXvIjLgFUd1PuHKjHw8Ylb92ZA1aF8hRh3Pk5NKdnqkS+d4qTAsnKtBu0vf8+TxIT5J/k0MxkSuIvG0NEiEfVZTp0nX1ewWgQIGpFGwsFTqJu70eAGco1+pAdx3GvxIDSowD4KzUX77+cxYBqWBfWAlo8CUVZQ0mkjDMkIUH60amSU5tmudAjTfpL+GzJAx5o/mQ5LRhmfAvk6R90szgu/z1oZOe9fuzRMWyRhLBgbqtF7UwQBeLEVfkqNdIWsMUxMwBcoHmutzuOeqIQSPMt8GwcoYMpXVSULqo2mIcYw1c3iEDGzniOOg3dgKEBaYQSLB83v31A+YgLklWmYaj/LTXnYlIygan0TI1/OAjqRub1TAknoiYjjUoTt0Jf9dlFmpf/UwykPP80L4B8DAUXU2ZZIrN9IkXJJzoAmdpI2Vho/j2hfsJbiTcyAT0xPyGFV+w+Z6dkMdsTdh5EgEZ1ORkI77dVCikKVF6g8VjPFNVYVohHUs5PLe2K49sVIWqk5TePt5onPILLc9qjHmxhOz5uTf1os8Tooq8/LDszmTlujQpAhPIcu01UZxofgVdYpZTkJl9G2riJg rqJ/Mqgz jQYgMduqJUseUiXcqiBvF5jbB2Cdqh8IC478OpsiXzaRseAeGNDHELtLc0UUYPLoHTygQYGKYncIs1rtPAG29fEJatBqX2jFju1J1lg+ux9SHXYMqDf8mxK0+nuoGEMHZe3waxmn61Jiw+/cb3Di+QGpIR+hy9EdrJNL1I4M/yoE1JwS2CClrFZwwCjHrKmx3bEidxq4P1P91IU+bzqpzY+iTvL3maQkYbidASrPu3jB8b/DhDzgMER3SENcbex+oNVASAkS43t8oJgvN17fNQ0m/ocWOe1XOVfooF7KFb8ZFSGZIAThSM6KaeaOYwBfM24N7jUWWSVSUCDc7kOc8mSTFlki0cz8U/BblKrTEDNhdVM8rHZRGRaDNPv75uIcQTt+gF73qUrYxbs6QPsVXD7W411M61P6R4EjYE2sk2xtCXQ5UvETtEGD40laL4Sam9XW8 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Fri, Oct 11, 2024 at 11:22:37PM +0000, Ackerley Tng wrote: > With the addition of the chg parameter, vma_has_reserves() no longer > just determines whether the vma has reserves. > > The comment in the vma->vm_flags & VM_NORESERVE block indicates that > this function actually computes whether or not the reserved count > should be decremented. > > This refactoring also takes into account the allocation's request > parameter avoid_reserve, which helps to further simplify the calling > function alloc_hugetlb_folio(). > > Signed-off-by: Ackerley Tng I wonder if this patch could be merged with the 3rd, IIUC this can fundamentally be seen as a movement of what patch 3 was removed. Furthermore, I do feel like should_use_hstate_resv() could be redundant on its own on many things. Let me try to justify. Firstly, after 3 patches applied, now it looks like this (I removed all comments to make things shorter..): static bool should_use_hstate_resv(struct vm_area_struct *vma, long chg, bool avoid_reserve) { if (avoid_reserve) return false; if (vma->vm_flags & VM_NORESERVE) { if (vma->vm_flags & VM_MAYSHARE && chg == 0) return true; else return false; } if (vma->vm_flags & VM_MAYSHARE) { if (chg) return false; else return true; } if (is_vma_resv_set(vma, HPAGE_RESV_OWNER)) { if (chg) return false; else return true; } return false; } Then let's look at chg==0 processing all above: what does it say? I suppose it means "we don't need another global reservation". It means if chg==0 we always will use an existing reservation. From math POV it also is true, so it can already be moved out ahead, IIUC, like this: static bool should_use_hstate_resv(struct vm_area_struct *vma, long chg, bool avoid_reserve) { if (avoid_reserve) return false; if (chg == 0) return true; if (vma->vm_flags & VM_NORESERVE) return false; if (vma->vm_flags & VM_MAYSHARE) return false; if (is_vma_resv_set(vma, HPAGE_RESV_OWNER)) return false; return false; <--------------------- [1] } Move on. If I read it right, above [1] is exactly for avoid_reserve==1 case, where it basically says "it's !NORESERVE, private, and it's not the vma resv owner, either fork() or CoW". If my reading is correct, it means after your patch 2, [1] should never be reachable at all.. I would hope adding a panic() right above would be ok. And irrelevant of whether my understanding is correct.. math-wise above is also already the same as: static bool should_use_hstate_resv(struct vm_area_struct *vma, long chg, bool avoid_reserve) { if (avoid_reserve) return false; if (chg == 0) return true; return false; } Then it makes a lot more sense now, because afaict, gbl_chg is exactly what should_use_hstate_resv() is looking for.. only except avoid_reserve==true. Would above make sense to you? In short, it's about whether a patch on top of your series would further simply this whole thing, like below: ===8<=== diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 60e72214d5bf..4b1c5c4ed7be 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -1245,80 +1245,6 @@ void clear_vma_resv_huge_pages(struct vm_area_struct *vma) hugetlb_dup_vma_private(vma); } -/* - * Returns true if this allocation should use (debit) hstate reservations, based on - * - * @vma: VMA config - * @chg: Whether the page requirement can be satisfied using subpool reservations - * @avoid_reserve: Whether allocation was requested to avoid using reservations - */ -static bool should_use_hstate_resv(struct vm_area_struct *vma, long chg, - bool avoid_reserve) -{ - if (avoid_reserve) - return false; - - if (vma->vm_flags & VM_NORESERVE) { - /* - * This address is already reserved by other process(chg == 0), - * so, we should decrement reserved count. Without decrementing, - * reserve count remains after releasing inode, because this - * allocated page will go into page cache and is regarded as - * coming from reserved pool in releasing step. Currently, we - * don't have any other solution to deal with this situation - * properly, so add work-around here. - */ - if (vma->vm_flags & VM_MAYSHARE && chg == 0) - return true; - else - return false; - } - - /* Shared mappings always use reserves */ - if (vma->vm_flags & VM_MAYSHARE) { - /* - * We know VM_NORESERVE is not set. Therefore, there SHOULD - * be a region map for all pages. The only situation where - * there is no region map is if a hole was punched via - * fallocate. In this case, there really are no reserves to - * use. This situation is indicated if chg != 0. - */ - if (chg) - return false; - else - return true; - } - - /* - * Only the process that called mmap() has reserves for private - * mappings. A child process with MAP_PRIVATE mappings created by their - * parent have no page reserves. - */ - if (is_vma_resv_set(vma, HPAGE_RESV_OWNER)) { - /* - * Like the shared case above, a hole punch or truncate - * could have been performed on the private mapping. - * Examine the value of chg to determine if reserves - * actually exist or were previously consumed. - * Very Subtle - The value of chg comes from a previous - * call to vma_needs_reserves(). The reserve map for - * private mappings has different (opposite) semantics - * than that of shared mappings. vma_needs_reserves() - * has already taken this difference in semantics into - * account. Therefore, the meaning of chg is the same - * as in the shared case above. Code could easily be - * combined, but keeping it separate draws attention to - * subtle differences. - */ - if (chg) - return false; - else - return true; - } - - return false; -} - static void enqueue_hugetlb_folio(struct hstate *h, struct folio *folio) { int nid = folio_nid(folio); @@ -3255,7 +3181,7 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma, } - use_hstate_resv = should_use_hstate_resv(vma, gbl_chg, avoid_reserve); + use_hstate_resv = avoid_reserve || !gbl_chg; /* * charge_cgroup_reservation if this allocation is not consuming a ===8<=== Thanks, -- Peter Xu