From: Mateusz Guzik
Date: Tue, 15 Apr 2025 22:32:08 +0200
Subject: Re: [PATCH v3 1/4] x86/clear_page: extend clear_page*() for multi-page clearing
To: Ankur Arora
Cc: Peter Zijlstra, Ingo Molnar, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, x86@kernel.org, torvalds@linux-foundation.org,
 akpm@linux-foundation.org, bp@alien8.de, dave.hansen@linux.intel.com,
 hpa@zytor.com, mingo@redhat.com, luto@kernel.org, paulmck@kernel.org,
 rostedt@goodmis.org, tglx@linutronix.de, willy@infradead.org,
 jon.grimm@amd.com, bharata@amd.com, raghavendra.kt@amd.com,
 boris.ostrovsky@oracle.com, konrad.wilk@oracle.com
In-Reply-To: <87cyddxkgl.fsf@oracle.com>
References: <20250414034607.762653-1-ankur.a.arora@oracle.com>
 <20250414034607.762653-2-ankur.a.arora@oracle.com>
 <20250414110259.GF5600@noisy.programming.kicks-ass.net>
 <87h62qymrp.fsf@oracle.com> <87cyddxkgl.fsf@oracle.com>

On Tue, Apr 15, 2025 at 10:02 PM Ankur Arora wrote:
>
>
> Mateusz Guzik writes:
>
> > On Tue, Apr 15, 2025 at 8:14 AM Ankur Arora wrote:
> >>
> >>
> >> Mateusz Guzik writes:
> >> > With that sucker out of the way, an optional quest is to figure out if
> >> > rep stosq vs rep stosb makes any difference for pages -- for all I know
> >> > rep stosq is the way. This would require testing on quite a few uarchs
> >> > and I'm not going to blame anyone for not being interested.
> >>
> >> IIRC some recent AMD models (Rome?) did expose REP_GOOD but not ERMS.
> >>
> >
> > The uarch does not have it or the bit magically fails to show up?
> > Worst case, should rep stosb be faster on that uarch, the kernel can
> > pretend the bit is set.
>
> It's a synthetic bit so the uarch has both. I think REP STOSB is optimized
> post FSRS (AIUI Zen3)
>
>         if (c->x86 >= 0x10)
>                 set_cpu_cap(c, X86_FEATURE_REP_GOOD);
>
>         /* AMD FSRM also implies FSRS */
>         if (cpu_has(c, X86_FEATURE_FSRM))
>                 set_cpu_cap(c, X86_FEATURE_FSRS);
>
>
> >> > Let's say nobody bothered OR rep stosb provides a win. In that case this
> >> > can trivially ALTERNATIVE between rep stosb and rep stosq based on ERMS,
> >> > no func calls necessary.
> >>
> >> We shouldn't need any function calls for ERMS and REP_GOOD.
> >>
> >> I think something like this untested code should work:
> >>
> >>         asm volatile(
> >>             ALTERNATIVE_2("call clear_pages_orig",
> >>                           "rep stosb", X86_FEATURE_REP_GOOD,
> >>                           "shrl $3,%ecx; rep stosq", X86_FEATURE_ERMS,
> >>                           : "+c" (size), "+D" (addr), ASM_CALL_CONSTRAINT
> >>                           : "a" (0)))
> >>
> >
> > That's what I'm suggesting, with one difference: whack
> > clear_pages_orig altogether.
>
> What do we gain by getting rid of it? Maybe there's old hardware with
> unoptimized rep; stos*.
>

The string routines (memset, memcpy et al) need a lot of love and
preferably nobody would bother spending time placating non-rep users
while sorting them out.

According to wiki the AMD CPUs started with REP_GOOD in 2007, meaning
you would need something even older than that to not have it. Intel is
presumably in a similar boat.

It so happens gcc spent several years emitting inlined rep stosq and
rep movsq, so either users don't care or there are no users (well,
realistically someone somewhere has a machine like that in the garage,
but fringe cases are not an argument).

rep_movs_alternative has been punting to rep movs, ignoring the issue
of REP_GOOD, for some time now (admittedly, I removed the non-rep
support :P) and again there are no pitchforks (that I have seen).

So I think it would be best for everyone in the long run to completely
rip out the REP_GOOD thing. For all I know the kernel stopped booting
on machines with such uarchs a long time ago for unrelated reasons.

As far as this specific patchset goes, keeping the routine just means
wasted testing effort to make sure it still works, but I can't
*insist* on removing it. I guess it is the x86 maintainers' call
whether to whack this.

-- 
Mateusz Guzik
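
For illustration only, here is a minimal userspace sketch of the rep-only
selection discussed above: rep stosb when ERMS is reported, rep stosq
otherwise, with no out-of-line fallback routine. It is not the kernel
implementation. It assumes x86-64, GCC or Clang (for <cpuid.h> and extended
asm), and 4 KiB pages. All sketch_* names and SKETCH_PAGE_SIZE are
hypothetical, and the runtime CPUID check merely stands in for what the
kernel would patch in at boot via ALTERNATIVE.

```c
/* Userspace sketch only; not the kernel's clear_page()/clear_pages(). */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <cpuid.h>              /* GCC/Clang wrapper around CPUID */

#define SKETCH_PAGE_SIZE 4096u  /* hypothetical name; assumes 4 KiB pages */

/* ERMS is reported in CPUID.(EAX=7,ECX=0):EBX bit 9. */
int sketch_cpu_has_erms(void)
{
        unsigned int eax, ebx, ecx, edx;

        if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
                return 0;       /* leaf 7 not supported */
        return (ebx >> 9) & 1;
}

/* Zero len bytes, one byte per iteration as far as the ISA is concerned. */
void sketch_clear_stosb(void *addr, size_t len)
{
        __asm__ volatile("rep stosb"
                         : "+c" (len), "+D" (addr)
                         : "a" ((uint64_t)0)
                         : "memory");
}

/* Zero len bytes in 8-byte units; len must be a multiple of 8. */
void sketch_clear_stosq(void *addr, size_t len)
{
        size_t qwords = len / 8;

        assert(len % 8 == 0);
        __asm__ volatile("rep stosq"
                         : "+c" (qwords), "+D" (addr)
                         : "a" ((uint64_t)0)
                         : "memory");
}

/* Pick the variant from the ERMS bit alone; there is no non-rep path. */
void sketch_clear_pages(void *addr, size_t npages)
{
        size_t len = npages * SKETCH_PAGE_SIZE;

        if (sketch_cpu_has_erms())
                sketch_clear_stosb(addr, len);
        else
                sketch_clear_stosq(addr, len);
}
```

Both branches are a single inline rep instruction, which is the point of
dropping the call-based fallback: nothing besides the two string
instructions needs to be kept working. Whether rep stosb or rep stosq is
actually faster for whole pages on a given uarch is not something this
sketch answers.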