Date: Thu, 29 Aug 2024 23:11:36 +0100
From: Matthew Wilcox
To: "Pankaj Raghav (Samsung)"
Cc: brauner@kernel.org, akpm@linux-foundation.org, chandan.babu@oracle.com,
	linux-fsdevel@vger.kernel.org, djwong@kernel.org, hare@suse.de,
	gost.dev@samsung.com, linux-xfs@vger.kernel.org, hch@lst.de,
	david@fromorbit.com, Zi Yan, yang@os.amperecomputing.com,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	john.g.garry@oracle.com, cl@os.amperecomputing.com,
	p.raghav@samsung.com, mcgrof@kernel.org, ryan.roberts@arm.com,
	David Howells
Subject: Re: [PATCH v13 04/10] mm: split a folio in minimum folio order chunks
References: <20240822135018.1931258-1-kernel@pankajraghav.com>
	<20240822135018.1931258-5-kernel@pankajraghav.com>
In-Reply-To: <20240822135018.1931258-5-kernel@pankajraghav.com>

On Thu, Aug 22, 2024 at 03:50:12PM +0200, Pankaj Raghav (Samsung) wrote:
> @@ -317,9 +319,10 @@ unsigned long thp_get_unmapped_area_vmflags(struct file *filp, unsigned long add
>  bool can_split_folio(struct folio *folio, int caller_pins, int *pextra_pins);
>  int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
>  		unsigned int new_order);
> +int split_folio_to_list(struct folio *folio, struct list_head *list);
>  static inline int split_huge_page(struct page *page)
>  {
> -	return split_huge_page_to_list_to_order(page, NULL, 0);
> +	return split_folio(page_folio(page));

Oh! You can't do this! split_huge_page() takes a precise page, NOT a
folio. That page is locked. When we return from split_huge_page(), the
new folio which contains the precise page is locked. You've made it so
that the caller's page's folio won't necessarily be locked. More testing
was needed ;-P

>  }
>  void deferred_split_folio(struct folio *folio);
>
> @@ -495,6 +498,12 @@ static inline int split_huge_page(struct page *page)
>  {
>  	return 0;
>  }
> +
> +static inline int split_folio_to_list(struct folio *folio, struct list_head *list)
> +{
> +	return 0;
> +}
> +
>  static inline void deferred_split_folio(struct folio *folio) {}
>  #define split_huge_pmd(__vma, __pmd, __address)	\
>  	do { } while (0)
> @@ -622,7 +631,4 @@ static inline int split_folio_to_order(struct folio *folio, int new_order)
>  	return split_folio_to_list_to_order(folio, NULL, new_order);
>  }
>
> -#define split_folio_to_list(f, l) split_folio_to_list_to_order(f, l, 0)
> -#define split_folio(f) split_folio_to_order(f, 0)
> -
>  #endif /* _LINUX_HUGE_MM_H */
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index cf8e34f62976f..06384b85a3a20 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -3303,6 +3303,9 @@ bool can_split_folio(struct folio *folio, int caller_pins, int *pextra_pins)
>   * released, or if some unexpected race happened (e.g., anon VMA disappeared,
>   * truncation).
>   *
> + * Callers should ensure that the order respects the address space mapping
> + * min-order if one is set for non-anonymous folios.
> + *
>   * Returns -EINVAL when trying to split to an order that is incompatible
>   * with the folio. Splitting to order 0 is compatible with all folios.
>   */
> @@ -3384,6 +3387,7 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
>  		mapping = NULL;
>  		anon_vma_lock_write(anon_vma);
>  	} else {
> +		unsigned int min_order;
>  		gfp_t gfp;
>
>  		mapping = folio->mapping;
> @@ -3394,6 +3398,14 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
>  			goto out;
>  		}
>
> +		min_order = mapping_min_folio_order(folio->mapping);
> +		if (new_order < min_order) {
> +			VM_WARN_ONCE(1, "Cannot split mapped folio below min-order: %u",
> +				     min_order);
> +			ret = -EINVAL;
> +			goto out;
> +		}
> +
>  		gfp = current_gfp_context(mapping_gfp_mask(mapping) &
>  							GFP_RECLAIM_MASK);
>
> @@ -3506,6 +3518,25 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
>  	return ret;
>  }
>
> +int split_folio_to_list(struct folio *folio, struct list_head *list)
> +{
> +	unsigned int min_order = 0;
> +
> +	if (folio_test_anon(folio))
> +		goto out;
> +
> +	if (!folio->mapping) {
> +		if (folio_test_pmd_mappable(folio))
> +			count_vm_event(THP_SPLIT_PAGE_FAILED);
> +		return -EBUSY;
> +	}
> +
> +	min_order = mapping_min_folio_order(folio->mapping);
> +out:
> +	return split_huge_page_to_list_to_order(&folio->page, list,
> +							min_order);
> +}
> +
>  void __folio_undo_large_rmappable(struct folio *folio)
>  {
>  	struct deferred_split *ds_queue;
> @@ -3736,6 +3767,8 @@ static int split_huge_pages_pid(int pid, unsigned long vaddr_start,
>  		struct vm_area_struct *vma = vma_lookup(mm, addr);
>  		struct folio_walk fw;
>  		struct folio *folio;
> +		struct address_space *mapping;
> +		unsigned int target_order = new_order;
>
>  		if (!vma)
>  			break;
> @@ -3753,7 +3786,13 @@ static int split_huge_pages_pid(int pid, unsigned long vaddr_start,
>  		if (!is_transparent_hugepage(folio))
>  			goto next;
>
> -		if (new_order >= folio_order(folio))
> +		if (!folio_test_anon(folio)) {
> +			mapping = folio->mapping;
> +			target_order = max(new_order,
> +					   mapping_min_folio_order(mapping));
> +		}
> +
> +		if (target_order >= folio_order(folio))
>  			goto next;
>
>  		total++;
> @@ -3771,9 +3810,14 @@ static int split_huge_pages_pid(int pid, unsigned long vaddr_start,
>  		folio_get(folio);
>  		folio_walk_end(&fw, vma);
>
> -		if (!split_folio_to_order(folio, new_order))
> +		if (!folio_test_anon(folio) && folio->mapping != mapping)
> +			goto unlock;
> +
> +		if (!split_folio_to_order(folio, target_order))
>  			split++;
>
> +unlock:
> +
>  		folio_unlock(folio);
>  		folio_put(folio);
>
> @@ -3802,6 +3846,8 @@ static int split_huge_pages_in_file(const char *file_path, pgoff_t off_start,
>  	pgoff_t index;
>  	int nr_pages = 1;
>  	unsigned long total = 0, split = 0;
> +	unsigned int min_order;
> +	unsigned int target_order;
>
>  	file = getname_kernel(file_path);
>  	if (IS_ERR(file))
> @@ -3815,6 +3861,8 @@ static int split_huge_pages_in_file(const char *file_path, pgoff_t off_start,
>  		 file_path, off_start, off_end);
>
>  	mapping = candidate->f_mapping;
> +	min_order = mapping_min_folio_order(mapping);
> +	target_order = max(new_order, min_order);
>
>  	for (index = off_start; index < off_end; index += nr_pages) {
>  		struct folio *folio = filemap_get_folio(mapping, index);
> @@ -3829,15 +3877,19 @@ static int split_huge_pages_in_file(const char *file_path, pgoff_t off_start,
>  		total++;
>  		nr_pages = folio_nr_pages(folio);
>
> -		if (new_order >= folio_order(folio))
> +		if (target_order >= folio_order(folio))
>  			goto next;
>
>  		if (!folio_trylock(folio))
>  			goto next;
>
> -		if (!split_folio_to_order(folio, new_order))
> +		if (folio->mapping != mapping)
> +			goto unlock;
> +
> +		if (!split_folio_to_order(folio, target_order))
>  			split++;
>
> +unlock:
>  		folio_unlock(folio);
>  next:
>  		folio_put(folio);
> --
> 2.44.1
>
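
To illustrate the locking objection raised against the split_huge_page()
hunk above, here is a minimal user-space sketch. It is not kernel code.
Every toy_* name, the 16-page folio and the array-based lock state are
assumptions invented for this example. It models only the one rule stated
in the review: after a split, the new folio containing the page that was
passed in stays locked, and the others do not.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_PAGES 16	/* hypothetical: one order-4 folio */

/* Toy stand-in for struct page: remembers the head of its owning folio. */
struct toy_page {
	int folio_head;
};

static struct toy_page toy_pages[TOY_NR_PAGES];
/* Lock state per folio, indexed by the folio's head page. */
static bool toy_folio_locked[TOY_NR_PAGES];

/* Start with one large locked folio covering all pages. */
static void toy_init(void)
{
	for (int i = 0; i < TOY_NR_PAGES; i++) {
		toy_pages[i].folio_head = 0;
		toy_folio_locked[i] = false;
	}
	toy_folio_locked[0] = true;
}

/*
 * Split the large folio into folios of (1 << new_order) pages.
 * Only the new folio containing page @keep remains locked.
 */
static void toy_split(int keep, unsigned int new_order)
{
	int chunk = 1 << new_order;

	assert(keep >= 0 && keep < TOY_NR_PAGES);
	assert(chunk <= TOY_NR_PAGES);

	for (int i = 0; i < TOY_NR_PAGES; i++)
		toy_pages[i].folio_head = i - (i % chunk);

	for (int head = 0; head < TOY_NR_PAGES; head += chunk)
		toy_folio_locked[head] = (head == toy_pages[keep].folio_head);
}

/*
 * Split while passing @passed_page to the splitter, then report whether
 * the folio now holding @caller_page is still locked.
 */
static bool toy_caller_page_locked_after_split(int passed_page,
					       int caller_page,
					       unsigned int new_order)
{
	toy_init();
	toy_split(passed_page, new_order);
	return toy_folio_locked[toy_pages[caller_page].folio_head];
}
```

In this model, passing the caller's own page (5, 5, 0) leaves the caller's
folio locked. Passing the head page instead (0, 5, 0), which is what
&folio->page amounts to, leaves page 5 in an unlocked folio. With a
non-zero order the outcome depends on whether the caller's page happens to
share a chunk with the head page, which matches "won't necessarily be
locked".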