From: Peter Xu
To: Ackerley Tng
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, riel@surriel.com, leitao@debian.org, akpm@linux-foundation.org, muchun.song@linux.dev, osalvador@suse.de, roman.gushchin@linux.dev, nao.horiguchi@gmail.com
Subject: Re: [PATCH 4/7] mm/hugetlb: Clean up map/global resv accounting when allocate
Date: Mon, 6 Jan 2025 15:55:58 -0500

On Mon, Jan 06, 2025 at 02:48:12PM +0000, Ackerley Tng wrote:
> Peter Xu writes:
>
> > On Sat, Dec 28, 2024 at 12:06:34AM +0000, Ackerley Tng wrote:
> >> >
> >> > -       /* If this allocation is not consuming a reservation, charge it now.
> >> > +       /*
> >> > +        * If this allocation is not consuming a per-vma reservation,
> >> > +        * charge the hugetlb cgroup now.
> >> >          */
> >> > -       deferred_reserve = map_chg || cow_from_owner;
> >> > -       if (deferred_reserve) {
> >> > +       if (map_chg) {
> >> >                 ret = hugetlb_cgroup_charge_cgroup_rsvd(
> >> >                         idx, pages_per_huge_page(h), &h_cg);
> >>
> >> Should hugetlb_cgroup_charge_cgroup_rsvd() be called when map_chg == MAP_CHG_ENFORCED?
> >
> > This looks like a pretty niche use case, though I would say yes.
> >
> > I don't think I take a lot of consideration here when drafting the patch,
> > as the change here should have kept the old behavior: map_chg grows into
> > the tristate so that we can drop deferred_reserve, OTOH nothing should
> > change from such behavior of cgroup charging.
> >
> > When it happens, it means the owner process CoWed a private hugetlb folio
> > which will enforce bypassing the vma reservation.  Here bypassing the vma
> > check makes sense to me, because the new to-be-cowed folio X will replace
> > another folio Y, which should have consumed the private vma resv at this
> > specific index.  So there's no way the to-be-cowed folio X can have anything
> > to do with the vma reservation..
> >
> > Besides the vma reservation, I don't see why this folio allocation needs to
> > be any more special.  IOW, it should still go through all rest checks and
> > fail the process properly if the check fails, that should include any form
> > of cgroups (either hugetlb or memcg), IMHO.
> >
> > Do you have any specific thought on this path?
>
> I re-read the code, and I hope this understanding is right:
>
> When a user sets "rsvd.max_usage_in_bytes" to X, the user is saying that
> within this cgroup, the maximum memory that can be reserved in the vma
> reservation is X.

Right, and the allocation may or may not attach to a vma reservation at
all.  In this case it skips the vma reservation but will still need to be
accounted; there should be other similar cases where the vma resv doesn't
count, e.g. MAP_NORESERVE.  For those we do not account the reservation
until allocation time.

>
> Hence even when this CoW is performed, this should count towards the
> cgroup's "rsvd.max_usage_in_bytes" and so yes, it should be charged.
>
> I think I misunderstood the context on cgroup charging earlier and hence
> I thought it shouldn't be charged, but I agree with you after
> re-reading.

Thanks.  I'll hold for another 1-2 days, then I'll respin.

-- 
Peter Xu
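
For readers following the thread, the charging rule agreed on above can be sketched in standalone C.  This is only an illustration, not the kernel code: MAP_CHG_ENFORCED is the only state named in the thread, so the other two state names, the numeric values, and both helper functions are assumptions made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Tristate result of the per-vma reservation check.  Only
 * MAP_CHG_ENFORCED appears in the thread; the other names and all
 * numeric values are assumptions for illustration.
 */
enum map_chg_state {
	MAP_CHG_REUSE = 0,	/* an existing per-vma reservation is consumed */
	MAP_CHG_NEEDED = 1,	/* no per-vma reservation covers this index */
	MAP_CHG_ENFORCED = 2,	/* per-vma reservation bypassed (owner CoW) */
};

/* Hypothetical helper mirroring the patched "if (map_chg)" test. */
static inline bool needs_rsvd_charge(enum map_chg_state map_chg)
{
	return map_chg != MAP_CHG_REUSE;
}

/*
 * Hypothetical helper mirroring the pre-patch condition
 * "deferred_reserve = map_chg || cow_from_owner".
 */
static inline bool needs_rsvd_charge_old(bool map_chg, bool cow_from_owner)
{
	return map_chg || cow_from_owner;
}
```

Under these assumptions, the owner-CoW case that used to be signalled by cow_from_owner is carried by MAP_CHG_ENFORCED, so the single test on map_chg charges the hugetlb cgroup in the same cases as before.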