From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id E4414C4167B for ; Tue, 28 Nov 2023 19:48:19 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 6014B6B0338; Tue, 28 Nov 2023 14:48:19 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 5B1AE6B033A; Tue, 28 Nov 2023 14:48:19 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 4790E6B033C; Tue, 28 Nov 2023 14:48:19 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 396496B0338 for ; Tue, 28 Nov 2023 14:48:19 -0500 (EST) Received: from smtpin15.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay03.hostedemail.com (Postfix) with ESMTP id 125ADA037B for ; Tue, 28 Nov 2023 19:48:19 +0000 (UTC) X-FDA: 81508399518.15.0A5BFF4 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by imf16.hostedemail.com (Postfix) with ESMTP id CB41618001F for ; Tue, 28 Nov 2023 19:48:16 +0000 (UTC) Authentication-Results: imf16.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=Fi+bIioY; spf=pass (imf16.hostedemail.com: domain of peterx@redhat.com designates 170.10.129.124 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1701200896; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=FI/qc+HoptbCAYRrip/l9UUtKZw1zfPxYz8AmPHyRKg=; b=Hv08g62Ov0c/Yabvw8q+k0Hd+Q/bRmseHTSdEKHJmFcEA+VowIyDpxm9/T5OyokzN+Dw+7 CTUemag2QQp5zV0nxSDZRpY8CwD95mnCLr4oqPmMsw6eIFikRiefLPzjsaW8RvjHwgVPU7 faPVElvosu+x6no+zC3dTutgJyF9stk= ARC-Authentication-Results: i=1; imf16.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=Fi+bIioY; spf=pass (imf16.hostedemail.com: domain of peterx@redhat.com designates 170.10.129.124 as permitted sender) smtp.mailfrom=peterx@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1701200896; a=rsa-sha256; cv=none; b=oJiEM5lvbXTXpUq3SqrbaOk4OzDZ/MK6sMMdHYK7OSSgY8nuokc/g23Vzcgbii8XNrOnKj OgO+ViC+LFEKoId7cY/OSgd9lQUprDEmmlS+YY6cvTYuwDOWrJ5NNxtQHdLDYmzynnddp4 uYkFmnqwIbzZg3DU2niGjwNrAkvfPN4= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1701200896; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=FI/qc+HoptbCAYRrip/l9UUtKZw1zfPxYz8AmPHyRKg=; b=Fi+bIioY3NP4ooyovpcc/j9g03fDA1SUBg10otdLxsHSe5P9YeeYnZRhXai845Ew/bIagv j0TqN5UCPBVj8Oi6OAF50MdjgJKFgzfvttXU500AKhKPDqJZ5D4puX+tg+cSZe+3NSA9EB HrYVuBFgf/TODFMwY7Jk/vNZqX3lBms= Received: from mail-qk1-f199.google.com (mail-qk1-f199.google.com [209.85.222.199]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-33-YoTBJv9-Mw6BYDP43J9l8A-1; Tue, 28 Nov 2023 14:48:14 -0500 X-MC-Unique: YoTBJv9-Mw6BYDP43J9l8A-1 Received: by mail-qk1-f199.google.com with SMTP id af79cd13be357-77d8df7c8e2so70491485a.1 for ; Tue, 28 Nov 2023 11:48:14 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701200894; x=1701805694; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=FI/qc+HoptbCAYRrip/l9UUtKZw1zfPxYz8AmPHyRKg=; b=Xlps+ECpZrslqrVF/YOB6KY7ZrUFqMjXPVjQm0kRkNA+B4Wcm7mHePBMBadAEXdQy8 rSMGyLzXe25iXAh7RTVU8e1nEXS72hLg2EshsyqFuUx3OsOiXt33qZWbpUAFDY89V1ya 2Dbg77+bKX6ptinmEwXiqo3Ib1/ao28gRHkp5eDxJSXXr0QzckddEJcyEsP2UnHtcvv3 //PRHLzwfmyetjAdErSuK6wQBnbjFrim/6tzCDmv5f5tR4MO6VP+PjBJbAwKPvIZu4Oj Dzrx1u0JT8N4PRVv/7RWIUfV4Q9myanF/Ti5ikfGsHyDCdfNjCeesPa8oEGBZTV9GUYi JYdg== X-Gm-Message-State: AOJu0YwOZjoUPSbXhEcCCx6Ft26ZLtb+UCKPVLEDJ1DG5o0YChi2Ayz5 8A+47pEqFFbu4twmdsVkUQUOkqELgJDTPyA6DJrNLoh1RPDM9gTEVIJJ/lz1MHB/5JjNfCRf9Mq jHrKrWZnq7NbeEPL3gd0= X-Received: by 2002:a05:620a:19a9:b0:773:a8d0:b326 with SMTP id bm41-20020a05620a19a900b00773a8d0b326mr19136523qkb.4.1701200894205; Tue, 28 Nov 2023 11:48:14 -0800 (PST) X-Google-Smtp-Source: AGHT+IGnmfMd1Af6MOd9MBrYMBFGfGQ+U2yn2nxKjVFjmVJPni5vvpTbDhcwoD5Rcmq6sbu5MsnF9w== X-Received: by 2002:a05:620a:19a9:b0:773:a8d0:b326 with SMTP id bm41-20020a05620a19a900b00773a8d0b326mr19136510qkb.4.1701200893900; Tue, 28 Nov 2023 11:48:13 -0800 (PST) Received: from x1n (cpe688f2e2cb7c3-cm688f2e2cb7c0.cpe.net.cable.rogers.com. [99.254.121.117]) by smtp.gmail.com with ESMTPSA id u21-20020ae9c015000000b0076efaec147csm4729883qkk.45.2023.11.28.11.48.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 28 Nov 2023 11:48:13 -0800 (PST) Date: Tue, 28 Nov 2023 14:48:11 -0500 From: Peter Xu To: David Hildenbrand Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, Andrew Morton , Mike Kravetz , Muchun Song Subject: Re: [PATCH v1 2/5] mm/rmap: introduce and use hugetlb_remove_rmap() Message-ID: References: <20231128145205.215026-1-david@redhat.com> <20231128145205.215026-3-david@redhat.com> MIME-Version: 1.0 In-Reply-To: X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset=utf-8 Content-Disposition: inline X-Rspamd-Queue-Id: CB41618001F X-Rspam-User: X-Stat-Signature: t1kkf6d1cp5t43t3bi5e96xn39pbmoea X-Rspamd-Server: rspam01 X-HE-Tag: 1701200896-303588 X-HE-Meta: U2FsdGVkX19fidGAaYYmoiFs/4XYTNASp3Y0d0OYbtIFZ+qFqF+w0kP1GlDQ+B4RDNKEJneeexa8hByNC6RzTNBLezE5lBYA46DhD0CN3h0fGptKK4QDudgmW9xh+Lp8Kmz69mF6IH56CDrpfO2A0vVemOymAt27YKXDa/D48/qX6p04dmACxymAvc69jc5EmF/SvzrY2nCcoA8QxxjmGs9knRtqOtAyjM6U2X1mBGb1T/A6sgxboyqKRFAFeGwUi8274hvxIFvJNkfV/LNxrGT6mqu8SJRhnsCelNiuhnVYv9qKL/297Tw01FaKry2rqaWdpDn7TAf87VwNAhT3Qeogl4GImviNHVZx8Lf+ZIQEUmOu+DX9Q9kSvtmAL4OEeOBFBQZ9ccxIw9VpuLf4kXfdXjImYmxnEJnN7CbFxlJDRy6TjnOKj9U2K+mUgWuFypWQ1dw6zCDur+poYBEyuNTm1Jc5shoCn8mElJx4+FMsovUBsK/SOwsBEtg6onX2XEbK0TzFYWJDgb6OQWXTbx9Zd4dJ3Yb8zfukzJKAOyp+9yUGYwhv5B9osiGTRCN/ecSt11F6tRzj71LtDFwNNctce/tC5OjQXL/cdO0IpglH3eDGr/pUs6cZzzHpRWrmLAAmE3NDrfs6m/sRdjaWOXWF/yTpD0viR79VM75GQd3nYt0fGeWM8LdkqB2BhmRlZZJkiFCirQzup9vzyqWmh6Miko1+u7vtFqA4TtYQBqre5RSQ8iNAlFgTWFprAjdG3X31MzsZqcOloQIjJWPJfgH1liL6V0bJcOFtKHHjrubcfyAJKeLVww1IeL/3yGCBf4nZO6eOHocWDi8TK4njtvhpAZcuOjqUnsM91WltAuBEZvMmp1lynhP85nZX+OrQp4IUMvvoEbPxTN7MGpI0lZvSPoM+F4k47wsKcy4mN73EychQKSQoPd3qmZEemfgRAdfdX4ojMNrJ5g0Q9qj pKSHGF/v LPvH37TUeq01Tp/MCe8+HSNufXYyUr+t8Z6ePpSWiSSOcj8a2w+oiyQ8qg830hUScNlZ9R8YaeQYm4DkY7NLrMF2xkaQuENLgHgkKCzfSMMpNsFi1AEZHDyr8bKglYM+zDWM6afvhDoHByPDqJatwfm0PEpuyJb9NPbR1NppIkv6GT0UC3IDs8rVcIRQmntHI/J9oGu/dvKe7IMFhotSVJU7wSQD7ojQd99Uspq/9A6eoePEwFvU/6BQeZ57xomFOJtFRFKP/VVxjhDb+lWWYXU+jBgmWcjdcii26ExHvQS9zVBBLuW/R2vgthaNvZ1iX9OaOihuui+JNYwU4kL8u80j/r+ZrUmkpXd8i3lXi2zL0dje5UbTYoQtpJ+0BlgTtXZqV/QB1OtHltuFvqATmRaPRI/KXAUqXzsCe X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: On Tue, Nov 28, 2023 at 06:42:44PM +0100, David Hildenbrand wrote: > On 28.11.23 18:13, Peter Xu wrote: > > On Tue, Nov 28, 2023 at 05:39:35PM +0100, David Hildenbrand wrote: > > > Quoting from the cover letter: > > > > > > "We have hugetlb special-casing/checks in the callers in all cases either > > > way already in place: it doesn't make too much sense to call generic-looking > > > functions that end up doing hugetlb specific things from hugetlb > > > special-cases." > > > > I'll take this one as an example: I think one goal (of my understanding of > > the mm community) is to make the generic looking functions keep being > > generic, dropping any function named as "*hugetlb*" if possible one day > > within that generic implementation. I said that in my previous reply. > > Yes, and I am one of the people asking for that. However, only where it > makes sense (e.g., like page table walking, GUP as you said), and only when > it is actually unified. > > I don't think that rmap handling or fault handling will ever be completely > unified to that extreme, and it might also not be desirable. Just like we > have separate paths for anon and file in areas where they are reasonable > different. Yes I haven't check further in that direction, that'll be after the pgtable work for me if that can first move on smoothly, and it'll also depend on whether merging the pgtable changes would be good enough so that we can move on to consider small mappings for hugetlb, until then we need to settle a final mapcount solution for hugetlb. > > What doesn't make sense is using patterns like: > > page_remove_rmap(subpage, vma, folio_test_hugetlb(folio)); > > and then, inside page_remove_rmap() have an initial folio_test_hugetlb() > check that does something completely different. IIUC above "folio_test_hugetlb(folio)" pattern can become "false" for hugetlb, if we decided to do mapcount for small hugetlb mappings. If that happens, I think something like this patch _may_ need to be reverted again more or less. Or we start to copy some of page_remove_rmap() into the new hugetlb rmap api. > > So each and everyone calling page_remove_rmap (and knowing that it's > certainly not a hugetlb folio) has to run through that check. Note that right after this change applied, hugetlb already start to lose something in common with generic compound folios, where page_remove_rmap() had: VM_BUG_ON_PAGE(compound && !PageHead(page), page); That sanity check goes away immediately for hugetlb, which is still logically applicable. > > Then, we have functions like page_add_file_rmap() that look like they can be > used for hugetlb, but hugetlb is smart enough and only calls > page_dup_file_rmap(), because it doesn't want to touch any file/anon > counters. And to handle that we would have to add folio_test_hugetlb() > inside there, which adds the same as above, trying to unify without > unifying. > > Once we're in the area of folio_add_file_rmap_range(), it all stops making > sense, because there is no way we could possibly partially map a folio > today. (and if we can in the future, we might still want separate handling, > because most caller know with which pages they are dealing, below) > > Last but not least, it's all inconsistent right now with > hugetlb_add_anon_rmap()/hugetlb_add_new_anon_rmap() being there because they > differ reasonably well from the "ordinary" counterparts. Taking hugepage_add_new_anon_rmap() as example, IMHO they still share a lot of things with !hugetlb codes, and maybe they can already be cleaned up into something common for a large mapping: void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma, unsigned long address) { int nr; VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma); if (folio_is_hugetlb(folio)) { folio_clear_hugetlb_restore_reserve(folio); } else { __folio_set_swapbacked(folio); atomic_set(&folio->_nr_pages_mapped, COMPOUND_MAPPED); nr = folio_nr_pages(folio); __lruvec_stat_mod_folio(folio, NR_ANON_THPS, nr); __lruvec_stat_mod_folio(folio, NR_ANON_MAPPED, nr); } /* increment count (starts at -1) */ atomic_set(&folio->_entire_mapcount, 0); __folio_set_anon(folio, vma, address, true); SetPageAnonExclusive(&folio->page); } For folio_add_file_rmap_range(): would it work if it just pass the hugetlb folio range into it? Would that make it much more difficult with the recent work on large folios from you or anyone? > I don't think going in the other direction and e.g., removing > hugetlb_add_anon_rmap / hugetlb_add_new_anon_rmap is making a unification > that is not really reasonable. It will only make things slower and the > individual functions more complicated. IIUC so far the performance difference should be minimum on which helper to use. As I mentioned, I sincerely don't know whether this patch is good or not as I don't know enough on everything around that is happening. It's just that I'll still think twice if to create hugetlb its own rmap API, because from the high level it's going the other way round to me. So I still want to raise this as a pure question. Thanks, -- Peter Xu