Date: Wed, 7 May 2025 18:51:23 +0200
From: Mateusz Guzik
To: Linus Walleij
Cc: Andrew Morton, linux-mm@kvack.org, Pasha Tatashin
Subject: Re: [PATCH v2 5/5] fork: zero vmap stack using clear_page() instead of memset()
References: <20250507-fork-fixes-v2-0-82ab1e42cde3@linaro.org> <20250507-fork-fixes-v2-5-82ab1e42cde3@linaro.org>
In-Reply-To: <20250507-fork-fixes-v2-5-82ab1e42cde3@linaro.org>

On Wed, May 07, 2025 at 02:46:31PM +0200, Linus Walleij wrote:
> From: Pasha Tatashin
>
> Do not zero the whole span of the stack, but instead only the
> pages that are part of the vm_area.
>
> As several architectures have optimized implementations of
> clear_page(), this will give the architecture a clear idea
> of what is going on and will speed up the clearing operation.
>
> Signed-off-by: Pasha Tatashin
> Link: https://lore.kernel.org/20240311164638.2015063-7-pasha.tatashin@soleen.com
> [linus.walleij@linaro.org: Rebased]
> Signed-off-by: Linus Walleij
> ---
>  kernel/fork.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>
> diff --git a/kernel/fork.c b/kernel/fork.c
> index 6ac4674fdf04081fbcbf1eb99167a4c990a58506..d3f000b846c634221c7dafe6e64185a276c5a08b 100644
> --- a/kernel/fork.c
> +++ b/kernel/fork.c
> @@ -271,13 +271,15 @@ static int memcg_charge_kernel_stack(struct vm_struct *vm_area)
>  static int alloc_thread_stack_node(struct task_struct *tsk, int node)
>  {
>  	struct vm_struct *vm_area;
> +	int nr_pages;
>  	void *stack;
> -	int i;
> +	int i, j;
>
>  	for (i = 0; i < NR_CACHED_STACKS; i++) {
>  		vm_area = this_cpu_xchg(cached_stacks[i], NULL);
>  		if (!vm_area)
>  			continue;
> +		nr_pages = vm_area->nr_pages;
>
>  		if (memcg_charge_kernel_stack(vm_area)) {
>  			vfree(vm_area->addr);
> @@ -290,7 +292,8 @@ static int alloc_thread_stack_node(struct task_struct *tsk, int node)
>  		stack = kasan_reset_tag(vm_area->addr);
>
>  		/* Clear stale pointers from reused stack. */
> -		memset(stack, 0, THREAD_SIZE);
> +		for (j = 0; j < nr_pages; j++)
> +			clear_page(page_address(vm_area->pages[j]));
>
>  		tsk->stack_vm_area = vm_area;
>  		tsk->stack = stack;
>
> --
> 2.49.0
>
>

I don't know if the logic works here as far as the speedup goes: you
are in fact *taking away* information about what the caller is doing.

In order to actually allow archs to optimize this, you would need a
clearing func which takes the page count.

For example, suppose 'rep stosb' is the optimal way to handle the 16K
stacks. With a clear_pages(addr, n) routine it gets issued once, with
the appropriate size known to the CPU. In the code as provided here you
get 4 invocations instead.
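
To make the invocation count concrete, here is a minimal userspace
sketch, not kernel code. Everything in it is an assumption made for
illustration: 4K pages, a 16K virtually contiguous buffer standing in
for the stack, and the hypothetical helpers clear_range(),
sketch_clear_page() and sketch_clear_pages(). clear_range() stands in
for whatever the optimal low-level primitive is (e.g. a single
'rep stosb') and simply counts how often it gets issued.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumptions for this sketch only: 4K pages, 16K stack. */
#define SKETCH_PAGE_SIZE   4096u
#define SKETCH_STACK_PAGES 4u

/* Number of times the low-level clearing primitive was issued. */
static unsigned int clear_calls;

/* Hypothetical stand-in for the arch primitive (e.g. one 'rep stosb'). */
static void clear_range(void *addr, size_t len)
{
	clear_calls++;
	memset(addr, 0, len);
}

/* Per-page variant: the primitive only ever sees a single page. */
static void sketch_clear_page(void *page)
{
	clear_range(page, SKETCH_PAGE_SIZE);
}

/* Hypothetical multi-page variant: the caller passes the page count. */
static void sketch_clear_pages(void *addr, unsigned int n)
{
	assert(n > 0);
	clear_range(addr, (size_t)n * SKETCH_PAGE_SIZE);
}

/* Clears the buffer page by page; returns primitive invocations. */
static unsigned int clear_stack_per_page(unsigned char *stack)
{
	clear_calls = 0;
	for (unsigned int j = 0; j < SKETCH_STACK_PAGES; j++)
		sketch_clear_page(stack + (size_t)j * SKETCH_PAGE_SIZE);
	return clear_calls;
}

/* Clears the buffer in one go; returns primitive invocations. */
static unsigned int clear_stack_multi_page(unsigned char *stack)
{
	clear_calls = 0;
	sketch_clear_pages(stack, SKETCH_STACK_PAGES);
	return clear_calls;
}
```

Both variants zero the same 16K, but the per-page loop issues the
primitive once per page while the multi-page call issues it once with
the full size. The sketch says nothing about which is faster on real
hardware; it only shows what size information reaches the primitive.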

There was a patchset to support multi-page clearing, but it only covers
x86-64:
https://lore.kernel.org/all/20250414034607.762653-1-ankur.a.arora@oracle.com/

Side note: funnily enough, it so happens that memset on x86-64 is pretty
bad at the moment, never using rep stos regardless of size. But that's
an argument for adding rep stos to memset, after which perf will be
about the same.

tl;dr I think this patch should be dropped. If multi-page clearing shows
up for more archs, then this is a thing to consider.