From mboxrd@z Thu Jan 1 00:00:00 1970
Message-ID: <8edbd2be-d495-4bfc-a9f3-6eaae7a66d91@suse.cz>
Date: Mon, 5 May 2025 15:12:55 +0200
MIME-Version: 1.0
User-Agent: Mozilla Thunderbird
Subject: Re: [PATCH] mm/codetag: sub in advance when free non-compound high order pages
Content-Language: en-US
To: David Wang <00107082@163.com>, akpm@linux-foundation.org, surenb@google.com, mhocko@suse.com, jackmanb@google.com, hannes@cmpxchg.org, ziy@nvidia.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Shakeel Butt
References: <20250504061923.66914-1-00107082@163.com>
From: Vlastimil Babka
In-Reply-To: <20250504061923.66914-1-00107082@163.com>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 7bit
Sender: owner-linux-mm@kvack.org
Precedence: bulk

On 5/4/25 08:19, David Wang wrote:
> When page is non-compound, page[0] could be released by other
> thread right after put_page_testzero failed in current thread,
> pgalloc_tag_sub_pages afterwards would manipulate an invalid
> page for accounting remaining pages:
>
> [timeline]   [thread1]                     [thread2]
>   |          alloc_page non-compound
>   V
>   |                                        get_page, rf counter inc
>   V
>   |          in ___free_pages
>   |          put_page_testzero fails
>   V
>   |                                        put_page, page released
>   V
>   |          in ___free_pages,
>   |          pgalloc_tag_sub_pages
>   |          manipulate an invalid page
>   V
>   V
>
> Move the tag page accounting ahead, and only account remaining pages
> for non-compound pages with non-zero order.
>
> Signed-off-by: David Wang <00107082@163.com>

Hmm, I think the problem was introduced by 51ff4d7486f0 ("mm: avoid extra
mem_alloc_profiling_enabled() checks"). Previously we'd get the tag pointer
upfront and avoid the page use-after-free.

It would likely be nicer to fix it by going back to that approach for
___free_pages(), while hopefully keeping the optimisations of 51ff4d7486f0
for the other call sites where it applies?

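To illustrate what "get the tag pointer upfront" means, here is a minimal
userspace sketch, not kernel code and not a proposed patch. All model_*
names, struct layouts and MODEL_PAGE_SIZE are hypothetical stand-ins for
struct page, struct alloc_tag and the pgalloc_tag_* helpers; the only point
is the ordering: read the tag while we still hold our reference, and never
dereference the page again once put_page_testzero() has failed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096L	/* hypothetical stand-in for PAGE_SIZE */

/* Hypothetical stand-in for struct alloc_tag and its byte counter. */
struct model_tag {
	long bytes;
};

/* Hypothetical stand-in for struct page: a refcount plus a tag reference. */
struct model_page {
	atomic_int refcount;
	struct model_tag *tag;
};

/* Models put_page_testzero(): drop one reference, return 1 if it hit zero. */
static int model_put_testzero(struct model_page *page)
{
	return atomic_fetch_sub(&page->refcount, 1) == 1;
}

/* Accounting works on the tag pointer only, never on the page. */
static void model_tag_sub_pages(struct model_tag *tag, unsigned int nr)
{
	if (tag)
		tag->bytes -= MODEL_PAGE_SIZE * nr;
}

/*
 * Models the shape of ___free_pages(). Returns the number of tail pages
 * accounted here (0 if we dropped the last reference or the page is a
 * compound head; in those cases accounting is assumed to happen elsewhere).
 */
static unsigned int model_free_pages(struct model_page *page,
				     unsigned int order, int head)
{
	/* Read the tag while our reference still pins the page. */
	struct model_tag *tag = page->tag;
	unsigned int tail;

	if (model_put_testzero(page))
		return 0;
	if (head)
		return 0;

	/*
	 * Another holder may release the page at any time from here on,
	 * so only the saved tag pointer is used below.
	 */
	tail = (1u << order) - 1;
	model_tag_sub_pages(tag, tail);
	return tail;
}
```

With an order-2 non-compound allocation and one extra reference held
elsewhere, model_free_pages(&page, 2, 0) accounts the three tail pages
through the saved pointer and leaves the first page to whoever drops the
last reference.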
> ---
>  mm/page_alloc.c | 36 +++++++++++++++++++++++++++++++++---
>  1 file changed, 33 insertions(+), 3 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 5669baf2a6fe..c42e41ed35fe 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -1163,12 +1163,25 @@ static inline void pgalloc_tag_sub_pages(struct page *page, unsigned int nr)
>  		this_cpu_sub(tag->counters->bytes, PAGE_SIZE * nr);
>  }
>
> +static inline void pgalloc_tag_add_pages(struct page *page, unsigned int nr)
> +{
> +	struct alloc_tag *tag;
> +
> +	if (!mem_alloc_profiling_enabled())
> +		return;
> +
> +	tag = __pgalloc_tag_get(page);
> +	if (tag)
> +		this_cpu_add(tag->counters->bytes, PAGE_SIZE * nr);
> +}
> +
>  #else /* CONFIG_MEM_ALLOC_PROFILING */
>
>  static inline void pgalloc_tag_add(struct page *page, struct task_struct *task,
>  				   unsigned int nr) {}
>  static inline void pgalloc_tag_sub(struct page *page, unsigned int nr) {}
>  static inline void pgalloc_tag_sub_pages(struct page *page, unsigned int nr) {}
> +static inline void pgalloc_tag_add_pages(struct page *page, unsigned int nr) {}
>
>  #endif /* CONFIG_MEM_ALLOC_PROFILING */
>
> @@ -5065,11 +5078,28 @@ static void ___free_pages(struct page *page, unsigned int order,
>  {
>  	/* get PageHead before we drop reference */
>  	int head = PageHead(page);
> +	/*
> +	 * For remaining pages other than the first page of
> +	 * a non-compound allocation, we decrease its tag
> +	 * pages in advance, in case the first page is released
> +	 * by other thread inbetween our put_page_testzero and any
> +	 * accounting behavior afterwards.
> +	 */
> +	unsigned int remaining_tag_pages = 0;
>
> -	if (put_page_testzero(page))
> +	if (order > 0 && !head) {
> +		if (unlikely(page_ref_count(page) > 1)) {
> +			remaining_tag_pages = (1 << order) - 1;
> +			pgalloc_tag_sub_pages(page, remaining_tag_pages);
> +		}
> +	}
> +
> +	if (put_page_testzero(page)) {
> +		/* no need special treat for remaining pages, add it back. */
> +		if (unlikely(remaining_tag_pages > 0))
> +			pgalloc_tag_add_pages(page, remaining_tag_pages);
>  		__free_frozen_pages(page, order, fpi_flags);
> -	else if (!head) {
> -		pgalloc_tag_sub_pages(page, (1 << order) - 1);
> +	} else if (!head) {
>  		while (order-- > 0)
>  			__free_frozen_pages(page + (1 << order), order,
>  					    fpi_flags);