From: Arvind Sankar
Date: Wed, 11 Mar 2020 14:32:41 -0400
To: "Kirill A. Shutemov"
Cc: Arvind Sankar, Cannon Matthews, Matthew Wilcox, Andi Kleen,
	Michal Hocko, Mike Kravetz, Andrew Morton, David Rientjes,
	Greg Thelen, Salman Qazi, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org, x86@kernel.org
Subject: Re: [PATCH] mm: clear 1G pages with streaming stores on x86
Message-ID: <20200311183240.GA3880414@rani.riverdale.lan>
References: <20200307010353.172991-1-cannonmatthews@google.com>
	<20200309000820.f37opzmppm67g6et@box>
	<20200309090630.GC8447@dhcp22.suse.cz>
	<20200309153831.GK1454533@tassilo.jf.intel.com>
	<20200309183704.GA1573@bombadil.infradead.org>
	<20200311005447.jkpsaghrpk3c4rwu@box>
	<20200311033552.GA3657254@rani.riverdale.lan>
	<20200311081607.3ahlk4msosj4qjsj@box>
In-Reply-To: <20200311081607.3ahlk4msosj4qjsj@box>

On Wed, Mar 11, 2020 at 11:16:07AM +0300, Kirill A. Shutemov wrote:
> On Tue, Mar 10, 2020 at 11:35:54PM -0400, Arvind Sankar wrote:
> >
> > The rationale for MOVNTI instruction is supposed to be that it avoids
> > cache pollution. Aside from the bench that shows MOVNTI to be faster for
> > the move itself, shouldn't it have an additional benefit in not trashing
> > the CPU caches?
> >
> > As string instructions improve, why wouldn't the same improvements be
> > applied to MOVNTI?
>
> String instructions inherently more flexible. Implementation can choose
> caching strategy depending on the operation size (cx) and other factors.
> Like if operation is large enough and cache is full of dirty cache lines
> that expensive to free up, it can choose to bypass cache. MOVNTI is more
> strict on semantics and more opaque to CPU.

But with today's processors, wouldn't writing 1G via the string
operations empty out almost the whole cache? Or are there already
optimizations to prevent one thread from hogging the L3?

If we do want to just use the string operations, it seems like the
clear_page routines should just call memset instead of duplicating it.

>
> And more importantly string instructions, unlike MOVNTI, is something that
> generated often by compiler and used in standard libraries a lot. It is
> and will be focus of optimization of CPU architects.
>
> --
> Kirill A. Shutemov
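
To make the two options in this thread concrete, here is a rough userspace
sketch, not the kernel's clear_page code and not the code from the patch.
It assumes x86-64 and a compiler that ships <emmintrin.h>; the function
names clear_cached() and clear_streaming() are made up for illustration.
The first variant just calls memset() and leaves the caching strategy to
the libc/CPU; the second issues 64-bit non-temporal stores (MOVNTI, via the
_mm_stream_si64 intrinsic) followed by SFENCE.

```c
#include <emmintrin.h>  /* SSE2: _mm_stream_si64 (x86-64 only), _mm_sfence */
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Cached variant: plain memset(). How this is implemented (REP STOS,
 * vector stores, ...) is up to the libc and the CPU, which is the
 * flexibility argued for in the quoted text.
 */
static void clear_cached(void *dst, size_t len)
{
	memset(dst, 0, len);
}

/*
 * Streaming variant: one MOVNTI per 8 bytes, bypassing the cache
 * hierarchy as far as the hardware honours the non-temporal hint.
 * Illustrative restriction (an assumption of this sketch): dst must be
 * 8-byte aligned and len a multiple of 8; returns -1 otherwise.
 */
static int clear_streaming(void *dst, size_t len)
{
	long long *p = dst;
	size_t i;

	if (((uintptr_t)dst & 7) != 0 || (len & 7) != 0)
		return -1;

	for (i = 0; i < len / 8; i++)
		_mm_stream_si64(p + i, 0);

	/* Non-temporal stores are weakly ordered; fence before the data is used. */
	_mm_sfence();
	return 0;
}
```

Both functions leave the buffer zeroed; what differs is how much of the
cache the write stream is allowed to displace, which would have to be
measured on the hardware in question rather than inferred from this sketch.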