From: Andy Lutomirski
Subject: Re: [PATCH 5/8] x86/clear_page: add clear_page_uncached()
Date: Wed, 14 Oct 2020 14:07:30 -0700
Message-Id: <22E29783-F1F5-43DA-B35F-D75FB247475D@amacapital.net>
References: <20201014195823.GC18196@zn.tnic>
In-Reply-To: <20201014195823.GC18196@zn.tnic>
To: Borislav Petkov
Cc: Andy Lutomirski, Ankur Arora, LKML, Linux-MM, "Kirill A. Shutemov",
    Michal Hocko, Boris Ostrovsky, Konrad Rzeszutek Wilk, Thomas Gleixner,
    Ingo Molnar, X86 ML, "H. Peter Anvin", Arnd Bergmann, Andrew Morton,
    Ira Weiny, linux-arch

> On Oct 14, 2020, at 12:58 PM, Borislav Petkov wrote:
>
> On Wed, Oct 14, 2020 at 08:45:37AM -0700, Andy Lutomirski wrote:
>>> On Wed, Oct 14, 2020 at 1:33 AM Ankur Arora wrote:
>>>
>>> Define clear_page_uncached() as an alternative_call() to clear_page_nt()
>>> if the CPU sets X86_FEATURE_NT_GOOD and fallback to clear_page() if it
>>> doesn't.
>>>
>>> Similarly define clear_page_uncached_flush() which provides an SFENCE
>>> if the CPU sets X86_FEATURE_NT_GOOD.
>>
>> As long as you keep "NT" or "MOVNTI" in the names and keep functions
>> in arch/x86, I think it's reasonable to expect that callers understand
>> that MOVNTI has bizarre memory ordering rules.
>> But once you give
>> something a generic name like "clear_page_uncached" and stick it in
>> generic code, I think the semantics should be more obvious.
>
> Why does it have to be a separate call? Why isn't it behind the
> clear_page() alternative machinery so that the proper function is
> selected at boot? IOW, why does a user of clear_page functionality need
> to know at all about an "uncached" variant?
>
>

I assume it's for a little optimization of clearing more than one page
per SFENCE.

In any event, based on the benchmark data upthread, we only want to do
NT clears when they're rather large, so this shouldn't be just an
alternative. I assume this is because a page or two will fit in cache
and, for most uses that allocate zeroed pages, we prefer cache-hot
pages. When clearing 1G, on the other hand, cache-hot is impossible and
we prefer the improved bandwidth and less cache trashing of NT clears.

Perhaps SFENCE is so fast that this is a silly optimization, though, and
we don't lose anything measurable by SFENCEing once per page.
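To make the two points above concrete, here is a rough userspace sketch,
not the code from the patch series. It assumes SSE2 intrinsics stand in
for the kernel's MOVNTI loop, and every name in it (clear_pages,
clear_page_nt_nofence, NT_THRESHOLD_BYTES) is hypothetical. The 1 MiB
cutoff is a placeholder, not a number taken from the benchmarks upthread.
Small clears go through the cache; large ones use non-temporal stores
with a single SFENCE after the whole range rather than one per page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__x86_64__) || defined(_M_X64)
#include <emmintrin.h>   /* SSE2: _mm_setzero_si128, _mm_stream_si128 */
#include <xmmintrin.h>   /* SSE:  _mm_sfence */
#define HAVE_NT_STORES 1
#else
#define HAVE_NT_STORES 0 /* non-x86 builds fall back to plain memset */
#endif

#define PAGE_BYTES 4096u
/* Hypothetical cutoff; a real value would have to come from measurement. */
#define NT_THRESHOLD_BYTES (1024u * 1024u)

/* Zero one page with non-temporal stores. Deliberately issues no fence,
 * so the caller can batch several pages under one SFENCE. */
static void clear_page_nt_nofence(void *page)
{
#if HAVE_NT_STORES
    __m128i zero = _mm_setzero_si128();
    __m128i *p = (__m128i *)page;

    for (size_t i = 0; i < PAGE_BYTES / sizeof(__m128i); i++)
        _mm_stream_si128(p + i, zero);
#else
    memset(page, 0, PAGE_BYTES);
#endif
}

/* Zero npages pages starting at buf (which must be 16-byte aligned).
 * Returns 1 if the non-temporal path was taken, 0 if the cached path was. */
static int clear_pages(void *buf, size_t npages)
{
    size_t bytes = npages * PAGE_BYTES;
    unsigned char *p = buf;

    assert(((uintptr_t)buf % 16) == 0);

    /* Small ranges: keep the pages cache-hot for whoever touches them next. */
    if (!HAVE_NT_STORES || bytes < NT_THRESHOLD_BYTES) {
        memset(buf, 0, bytes);
        return 0;
    }

    /* Large ranges: bypass the cache, then order the weakly-ordered
     * stores once for the whole range instead of once per page. */
    for (size_t i = 0; i < npages; i++)
        clear_page_nt_nofence(p + i * PAGE_BYTES);
#if HAVE_NT_STORES
    _mm_sfence();
#endif
    return 1;
}
```

A caller would pass a page-aligned buffer and a page count, e.g.
clear_pages(buf, 2) for the cached path. Whether fencing once per range
is measurably better than fencing once per page is exactly the open
question in the last paragraph; the sketch only shows where the fence
would go in either case.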