From: Rong Xu
Date: Mon, 21 Oct 2024 16:32:00 -0700
Subject: Re: [PATCH v4 4/6] AutoFDO: Enable -ffunction-sections for the AutoFDO build
To: Masahiro Yamada
Cc: Alice Ryhl, Andrew Morton, Arnd Bergmann, Bill Wendling,
 Borislav Petkov, Breno Leitao, Brian Gerst, Dave Hansen, David Li,
 Han Shen, Heiko Carstens, "H. Peter Anvin", Ingo Molnar, Jann Horn,
 Jonathan Corbet, Josh Poimboeuf, Juergen Gross, Justin Stitt, Kees Cook,
 "Mike Rapoport (IBM)", Nathan Chancellor, Nick Desaulniers,
 Nicolas Schier, "Paul E. McKenney", Peter Zijlstra, Sami Tolvanen,
 Thomas Gleixner, Wei Yang, workflows@vger.kernel.org, Miguel Ojeda,
 Maksim Panchenko, x86@kernel.org, linux-arch@vger.kernel.org,
 linux-doc@vger.kernel.org, linux-kbuild@vger.kernel.org,
 linux-kernel@vger.kernel.org, llvm@lists.linux.dev, Sriraman Tallam
X-Mailing-List: workflows@vger.kernel.org
References: <20241014213342.1480681-1-xur@google.com>
 <20241014213342.1480681-5-xur@google.com>

The answers are the same as the reply in [PATCH v4 5/6]

On Sun, Oct 20, 2024 at 7:26 PM Masahiro Yamada wrote:
>
> On Tue, Oct 15, 2024 at 6:33 AM Rong Xu wrote:
> >
> > Enable -ffunction-sections by default for the AutoFDO build.
> >
> > With -ffunction-sections, the compiler places each function in its own
> > section named .text.function_name instead of placing all functions in
> > the .text section. In the AutoFDO build, this allows the linker to
> > utilize profile information to reorganize functions for improved
> > utilization of iCache and iTLB.
> >
> > Co-developed-by: Han Shen
> > Signed-off-by: Han Shen
> > Signed-off-by: Rong Xu
> > Suggested-by: Sriraman Tallam
> > ---
> >  include/asm-generic/vmlinux.lds.h | 37 ++++++++++++++++++++++++-------
> >  scripts/Makefile.autofdo          |  2 +-
> >  2 files changed, 30 insertions(+), 9 deletions(-)
> >
> > diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
> > index 5df589c60401..ace617d1af9b 100644
> > --- a/include/asm-generic/vmlinux.lds.h
> > +++ b/include/asm-generic/vmlinux.lds.h
> > @@ -95,18 +95,25 @@
> >   * With LTO_CLANG, the linker also splits sections by default, so we need
> >   * these macros to combine the sections during the final link.
> >   *
> > + * With LTO_CLANG, the linker also splits sections by default, so we need
> > + * these macros to combine the sections during the final link.
> > + *
> >   * RODATA_MAIN is not used because existing code already defines .rodata.x
> >   * sections to be brought in with rodata.
> >   */
> > -#if defined(CONFIG_LD_DEAD_CODE_DATA_ELIMINATION) || defined(CONFIG_LTO_CLANG)
> > +#if defined(CONFIG_LD_DEAD_CODE_DATA_ELIMINATION) || defined(CONFIG_LTO_CLANG) || \
> > +defined(CONFIG_AUTOFDO_CLANG)
> >  #define TEXT_MAIN .text .text.[0-9a-zA-Z_]*
> > +#else
> > +#define TEXT_MAIN .text
> > +#endif
> > +#if defined(CONFIG_LD_DEAD_CODE_DATA_ELIMINATION) || defined(CONFIG_LTO_CLANG)
> >  #define DATA_MAIN .data .data.[0-9a-zA-Z_]* .data..L* .data..compoundliteral* .data.$__unnamed_* .data.$L*
> >  #define SDATA_MAIN .sdata .sdata.[0-9a-zA-Z_]*
> >  #define RODATA_MAIN .rodata .rodata.[0-9a-zA-Z_]* .rodata..L*
> >  #define BSS_MAIN .bss .bss.[0-9a-zA-Z_]* .bss..L* .bss..compoundliteral*
> >  #define SBSS_MAIN .sbss .sbss.[0-9a-zA-Z_]*
> >  #else
> > -#define TEXT_MAIN .text
> >  #define DATA_MAIN .data
> >  #define SDATA_MAIN .sdata
> >  #define RODATA_MAIN .rodata
> > @@ -549,6 +556,20 @@
> >                 __cpuidle_text_end = .;                                 \
> >                 __noinstr_text_end = .;
> >
> > +#ifdef CONFIG_AUTOFDO_CLANG
> > +#define TEXT_HOT                                                       \
> > +               __hot_text_start = .;                                   \
> > +               *(.text.hot .text.hot.*)                                \
> > +               __hot_text_end = .;
> > +#define TEXT_UNLIKELY                                                  \
> > +               __unlikely_text_start = .;                              \
> > +               *(.text.unlikely .text.unlikely.*)                      \
> > +               __unlikely_text_end = .;
> > +#else
> > +#define TEXT_HOT *(.text.hot .text.hot.*)
> > +#define TEXT_UNLIKELY *(.text.unlikely .text.unlikely.*)
> > +#endif
> >
>
> Again, why is this conditional?

The condition is to ensure that we don't change the default kernel
build by any means.
The new code will introduce a few new symbols.

>
>
> The only difference is *_start and *_end symbols are defined
> when CONFIG_AUTOFDO_CLANG=y.
>
> And, where are these symbols used?

These new symbols are currently unreferenced within the kernel source tree.
However, they provide a valuable means of identifying hot and cold
sections of text, and how large they are. I think they are useful
information.

>
>
>
> > +
> >  /*
> >   * .text section. Map to function alignment to avoid address changes
> >   * during second ld run in second ld pass when generating System.map
> > @@ -557,30 +578,30 @@
> >   * code elimination or function-section is enabled. Match these symbols
> >   * first when in these builds.
> >   */
> > -#if defined(CONFIG_LD_DEAD_CODE_DATA_ELIMINATION) || defined(CONFIG_LTO_CLANG)
> > +#if defined(CONFIG_LD_DEAD_CODE_DATA_ELIMINATION) || defined(CONFIG_LTO_CLANG) || \
> > +defined(CONFIG_AUTOFDO_CLANG)
> >  #define TEXT_TEXT                                                      \
> >                 ALIGN_FUNCTION();                                       \
> >                 *(.text.asan.* .text.tsan.*)                            \
> >                 *(.text.unknown .text.unknown.*)                        \
> > -               *(.text.unlikely .text.unlikely.*)                      \
> > +               TEXT_UNLIKELY                                           \
> >                 . = ALIGN(PAGE_SIZE);                                   \
> > -               *(.text.hot .text.hot.*)                                \
> > +               TEXT_HOT                                                \
> >                 *(TEXT_MAIN .text.fixup)                                \
> >                 NOINSTR_TEXT                                            \
> >                 *(.ref.text)
> >  #else
> >  #define TEXT_TEXT                                                      \
> >                 ALIGN_FUNCTION();                                       \
> > -               *(.text.hot .text.hot.*)                                \
> > +               TEXT_HOT                                                \
> >                 *(TEXT_MAIN .text.fixup)                                \
> > -               *(.text.unlikely .text.unlikely.*)                      \
> > +               TEXT_UNLIKELY                                           \
> >                 *(.text.unknown .text.unknown.*)                        \
> >                 NOINSTR_TEXT                                            \
> >                 *(.ref.text)                                            \
> >                 *(.text.asan.* .text.tsan.*)
> >  #endif
> >
> > -
> >  /* sched.text is aling to function alignment to secure we have same
> >   * address even at second ld pass when generating System.map */
> >  #define SCHED_TEXT                                                     \
> > diff --git a/scripts/Makefile.autofdo b/scripts/Makefile.autofdo
> > index 1c9f224bc221..9c9a530ef090 100644
> > --- a/scripts/Makefile.autofdo
> > +++ b/scripts/Makefile.autofdo
> > @@ -10,7 +10,7 @@ ifndef CONFIG_DEBUG_INFO
> >  endif
> >
> >  ifdef CLANG_AUTOFDO_PROFILE
> > -  CFLAGS_AUTOFDO_CLANG += -fprofile-sample-use=$(CLANG_AUTOFDO_PROFILE)
> > +  CFLAGS_AUTOFDO_CLANG += -fprofile-sample-use=$(CLANG_AUTOFDO_PROFILE) -ffunction-sections
> >  endif
> >
> >  ifdef CONFIG_LTO_CLANG_THIN
> > --
> > 2.47.0.rc1.288.g06298d1525-goog
> >
>
>
> --
> Best Regards
> Masahiro Yamada
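
For illustration only (this is not part of the patch): one way the
*_start and *_end symbols could be used to see how large the hot and
unlikely text regions are. This is a minimal sketch that assumes the
symbols show up in System.map (or `nm vmlinux` output) as ordinary
"<hex address> <type> <name>" lines. The helper names and the sample
addresses below are made up for the example.

```python
# Symbol pairs defined by TEXT_HOT / TEXT_UNLIKELY when CONFIG_AUTOFDO_CLANG=y.
REGIONS = {
    "hot": ("__hot_text_start", "__hot_text_end"),
    "unlikely": ("__unlikely_text_start", "__unlikely_text_end"),
}


def parse_symbols(map_text):
    """Return {symbol name: address} from '<hex addr> <type> <name>' lines."""
    symbols = {}
    for line in map_text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        try:
            symbols[parts[2]] = int(parts[0], 16)
        except ValueError:
            continue  # first column was not a hex address
    return symbols


def region_sizes(map_text):
    """Return {region: size in bytes} for every region whose two symbols exist."""
    symbols = parse_symbols(map_text)
    sizes = {}
    for region, (start, end) in REGIONS.items():
        if start in symbols and end in symbols:
            sizes[region] = symbols[end] - symbols[start]
    return sizes


# Hypothetical System.map excerpt; the addresses are invented for the example.
SAMPLE_MAP = """\
ffffffff81000000 T _stext
ffffffff81001000 T __unlikely_text_start
ffffffff81002800 T __unlikely_text_end
ffffffff81003000 T __hot_text_start
ffffffff81013000 T __hot_text_end
"""

if __name__ == "__main__":
    for region, size in sorted(region_sizes(SAMPLE_MAP).items()):
        print(f"{region}: {size} bytes")
```

On a kernel built without CONFIG_AUTOFDO_CLANG the symbols are absent, so
region_sizes() simply returns an empty dict.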