From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <20a21d17-77d8-4120-8643-c575304c39f2@126.com>
Date: Sun, 9 Feb 2025 18:49:15 +0800
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [PATCH] mm/cma: add an API to enable/disable concurrent memory
 allocation for the CMA
To: Barry Song <21cnbao@gmail.com>
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, stable@vger.kernel.org, david@redhat.com,
 baolin.wang@linux.alibaba.com, aisheng.dong@nxp.com, liuzixing@hygon.cn
References: <1737717687-16744-1-git-send-email-yangge1116@126.com>
 <28edc5df-eed5-45b8-ab6d-76e63ef635a9@126.com>
From: Ge Yang
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Sender: owner-linux-mm@kvack.org
Precedence: bulk

On 2025/2/9 5:34, Barry Song wrote:
> On Sat, Feb 8, 2025 at 9:50 PM Ge Yang wrote:
>>
>>
>>
>> On 2025/1/28 17:58, Barry Song wrote:
>>> On Sat, Jan 25, 2025 at 12:21 AM wrote:
>>>>
>>>> From: yangge
>>>>
>>>> Commit 60a60e32cf91 ("Revert "mm/cma.c: remove redundant cma_mutex lock"")
>>>> simply reverts to the original method of using the cma_mutex to ensure
>>>> that alloc_contig_range() runs sequentially. This change was made to avoid
>>>> concurrency allocation failures. However, it can negatively impact
>>>> performance when concurrent allocation of CMA memory is required.
>>>
>>> Do we have some data?
>> Yes, I will add it in the next version, thanks.
>>>
>>>>
>>>> To address this issue, we could introduce an API for concurrency settings,
>>>> allowing users to decide whether their CMA can perform concurrent memory
>>>> allocations or not.
>>>
>>> Who is the intended user of cma_set_concurrency?
>> We have some drivers that use cma_set_concurrency(), but they have not
>> yet been merged into the mainline. The cma_alloc_mem() function in the
>> mainline also supports concurrent allocation of CMA memory. By applying
>> this patch, we can also achieve significant performance improvements in
>> certain scenarios. I will provide performance data in the next version.
>>> I also feel it is somewhat
>>> unsafe since cma->concurr_alloc is not protected by any locks.
>> Ok, thanks.
>>>
>>> Will a user setting cma->concurr_alloc = 1 encounter the original issue that
>>> commit 60a60e32cf91 was attempting to fix?
>>>
>> Yes, if a user encounters the issue described in commit 60a60e32cf91,
>> they will not be able to set cma->concurr_alloc to 1.
>
> A user who hasn't encountered a problem yet doesn't mean they won't
> encounter it; it most likely just means the testing time hasn't been long
> enough.
>
> Is it possible to implement a per-CMA lock or range lock that simultaneously
> improves performance and prevents the original issue that commit
> 60a60e32cf91 aimed to fix?
>
Using per-CMA locks can improve performance and prevent the original
issue. I am currently preparing the patch. Thanks.

> I strongly believe that cma->concurr_alloc is not the right approach. Let's
> not waste our time on this kind of hack or workaround. Instead, we should
> find a proper fix that remains transparent to users.
>
>>>>
>>>> Fixes: 60a60e32cf91 ("Revert "mm/cma.c: remove redundant cma_mutex lock"")
>>>> Signed-off-by: yangge
>>>> Cc:
>>>> ---
>>>>  include/linux/cma.h |  2 ++
>>>>  mm/cma.c            | 22 ++++++++++++++++++++--
>>>>  mm/cma.h            |  1 +
>>>>  3 files changed, 23 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/include/linux/cma.h b/include/linux/cma.h
>>>> index d15b64f..2384624 100644
>>>> --- a/include/linux/cma.h
>>>> +++ b/include/linux/cma.h
>>>> @@ -53,6 +53,8 @@ extern int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data)
>>>>
>>>>  extern void cma_reserve_pages_on_error(struct cma *cma);
>>>>
>>>> +extern bool cma_set_concurrency(struct cma *cma, bool concurrency);
>>>> +
>>>>  #ifdef CONFIG_CMA
>>>>  struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp);
>>>>  bool cma_free_folio(struct cma *cma, const struct folio *folio);
>>>> diff --git a/mm/cma.c b/mm/cma.c
>>>> index de5bc0c..49a7186 100644
>>>> --- a/mm/cma.c
>>>> +++ b/mm/cma.c
>>>> @@ -460,9 +460,17 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count,
>>>>  		spin_unlock_irq(&cma->lock);
>>>>
>>>>  		pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit);
>>>> -		mutex_lock(&cma_mutex);
>>>> +
>>>> +		/*
>>>> +		 * If the user sets the concurr_alloc of CMA to true, concurrent
>>>> +		 * memory allocation is allowed. If the user sets it to false or
>>>> +		 * does not set it, concurrent memory allocation is not allowed.
>>>> +		 */
>>>> +		if (!cma->concurr_alloc)
>>>> +			mutex_lock(&cma_mutex);
>>>>  		ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA, gfp);
>>>> -		mutex_unlock(&cma_mutex);
>>>> +		if (!cma->concurr_alloc)
>>>> +			mutex_unlock(&cma_mutex);
>>>>  		if (ret == 0) {
>>>>  			page = pfn_to_page(pfn);
>>>>  			break;
>>>> @@ -610,3 +618,13 @@ int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data)
>>>>
>>>>  	return 0;
>>>>  }
>>>> +
>>>> +bool cma_set_concurrency(struct cma *cma, bool concurrency)
>>>> +{
>>>> +	if (!cma)
>>>> +		return false;
>>>> +
>>>> +	cma->concurr_alloc = concurrency;
>>>> +
>>>> +	return true;
>>>> +}
>>>> diff --git a/mm/cma.h b/mm/cma.h
>>>> index 8485ef8..30f489d 100644
>>>> --- a/mm/cma.h
>>>> +++ b/mm/cma.h
>>>> @@ -16,6 +16,7 @@ struct cma {
>>>>  	unsigned long *bitmap;
>>>>  	unsigned int order_per_bit; /* Order of pages represented by one bit */
>>>>  	spinlock_t lock;
>>>> +	bool concurr_alloc;
>>>>  #ifdef CONFIG_CMA_DEBUGFS
>>>>  	struct hlist_head mem_head;
>>>>  	spinlock_t mem_head_lock;
>>>> --
>>>> 2.7.4
>>>>
>>>>
>>>
>
> Thanks
> Barry
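
To make the per-CMA lock idea from the discussion concrete, here is a
rough userspace model of it. This is only a sketch under stated
assumptions, not the patch being prepared: struct cma_model, its
alloc_mutex field, cma_model_init(), model_alloc_range() and
cma_model_alloc() are all hypothetical names, a pthread mutex stands in
for a kernel mutex, and a bump pointer stands in for the CMA bitmap and
alloc_contig_range(). The point it shows is that each area carries its
own lock, so allocations within one area stay serialized while different
areas no longer contend on a single global cma_mutex.

```c
#include <assert.h>
#include <pthread.h>

/*
 * Hypothetical userspace stand-in for struct cma. All names here are
 * illustrative assumptions, not kernel definitions.
 */
struct cma_model {
	pthread_mutex_t alloc_mutex;	/* per-area lock, replaces a global mutex */
	unsigned long base_pfn;		/* first page frame number of the area */
	unsigned long count;		/* number of pages in the area */
	unsigned long next_free;	/* bump pointer standing in for the bitmap */
};

static void cma_model_init(struct cma_model *cma, unsigned long base_pfn,
			   unsigned long count)
{
	assert(cma && count > 0);
	pthread_mutex_init(&cma->alloc_mutex, NULL);
	cma->base_pfn = base_pfn;
	cma->count = count;
	cma->next_free = 0;
}

/* Stand-in for alloc_contig_range(): must not run concurrently on one area. */
static int model_alloc_range(struct cma_model *cma, unsigned long nr,
			     unsigned long *pfn)
{
	if (nr == 0 || nr > cma->count - cma->next_free)
		return -1;
	*pfn = cma->base_pfn + cma->next_free;
	cma->next_free += nr;
	return 0;
}

/*
 * Allocations within one area are serialized by that area's mutex;
 * allocations from different areas never wait on a shared lock.
 */
static int cma_model_alloc(struct cma_model *cma, unsigned long nr,
			   unsigned long *pfn)
{
	int ret;

	pthread_mutex_lock(&cma->alloc_mutex);
	ret = model_alloc_range(cma, nr, pfn);
	pthread_mutex_unlock(&cma->alloc_mutex);
	return ret;
}
```

Unlike the concurr_alloc flag, nothing here has to be configured by the
caller: the locking granularity is a property of the area itself, which
is what keeps the fix transparent to users.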