Date: Mon, 13 Apr 2020 21:03:52 +0530
From: Prathu Baronia
To: Alexander Duyck
Cc: Chintan Pandya, "Huang, Ying", Michal Hocko,
 "akpm@linux-foundation.org", "linux-mm@kvack.org",
 "gregkh@linuxfoundation.org", "gthelen@google.com", "jack@suse.cz",
 Ken Lin, Gasine Xu
Subject: Re: [RFC] mm/memory.c: Optimizing THP zeroing routine for !HIGHMEM cases
Message-ID: <20200413153351.GB13136@oneplus.com>
References: <20200403081812.GA14090@oneplus.com>
 <20200403085201.GX22681@dhcp22.suse.cz>
 <20200409152913.GA9878@oneplus.com>
 <20200409154538.GR18386@dhcp22.suse.cz>
 <87lfn390db.fsf@yhuang-dev.intel.com>
User-Agent: Mutt/1.9.4 (2018-02-28)

The 04/11/2020 13:47, Alexander Duyck wrote:
>
> This is an interesting data point. So running things in reverse seems
> much more expensive than running them forward. As such I would imagine
> process_huge_page is going to be significantly more expensive then on
> ARM64 since it will wind through the pages in reverse order from the
> end of the page all the way down to wherever the page was accessed.
>
> I wonder if we couldn't simply process_huge_page to process pages in
> two passes? The first being from the addr_hint + some offset to the
> end, and then loop back around to the start of the page for the second
> pass and just process up to where we started the first pass. The idea
> would be that the offset would be enough so that we have the 4K that
> was accessed plus some range before and after the address hopefully
> still in the L1 cache after we are done.

That's a great idea. We were working on a similar approach for the v2
patch, and your suggestion reassures us that it is the right direction.
It keeps the benefits of the optimized memset while leaving the cache
hot around the faulting address. We had initially set this offset to
0.5MB; after your response we reduced it to 32KB. We understand there is
a trade-off in making this value too large, so we would really
appreciate it if you could suggest a method for deriving an appropriate
offset from the L1 cache size.

>
>
> An additional thing I was just wondering is if this also impacts the
> copy operations as well? Looking through the code the two big users
> for process_huge_page are clear_huge_page and copy_user_huge_page. One
> thing that might make more sense than just splitting the code at a
> high level would be to look at possibly refactoring process_huge_page
> and the users for it.

You are right, we had not considered refactoring process_huge_page
earlier. We will incorporate this in the v2 patch, which we will send
soon.

Thanks a lot for the interesting insights!

--
Prathu Baronia
OnePlus RnD
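
For reference, the two-pass ordering discussed above could look roughly
like the user-space sketch below. It is only an illustration, not the v2
patch and not the kernel's process_huge_page(): the names
two_pass_order(), clear_two_pass() and SKETCH_PAGE_SIZE are hypothetical,
a 4KB base page is assumed, and kernel details such as kmap handling and
rescheduling points are left out. The first pass runs forward from the
hint page plus an offset to the end of the huge page; the second pass
runs forward from the start up to where the first pass began, so the
pages around the faulting address are touched last.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u /* assumption: 4KB base pages */

/*
 * Hypothetical helper: record the order in which subpages would be
 * visited. 'order' must have room for nr_pages entries.
 * Returns the number of entries written (always nr_pages).
 */
size_t two_pass_order(size_t nr_pages, size_t hint_idx,
                      size_t offset_pages, size_t *order)
{
    size_t start, i, n = 0;

    assert(hint_idx < nr_pages);

    /* Start of the first pass, clamped to the end of the huge page. */
    start = hint_idx + offset_pages;
    if (start > nr_pages)
        start = nr_pages;

    /* Pass 1: forward from (hint + offset) to the end. */
    for (i = start; i < nr_pages; i++)
        order[n++] = i;

    /* Pass 2: forward from the start up to where pass 1 began. */
    for (i = 0; i < start; i++)
        order[n++] = i;

    return n;
}

/*
 * Hypothetical helper: zero a contiguous buffer of nr_pages subpages in
 * the same two-pass order, using one forward memset per pass.
 */
void clear_two_pass(unsigned char *base, size_t nr_pages,
                    size_t hint_idx, size_t offset_pages)
{
    size_t start;

    assert(hint_idx < nr_pages);

    start = hint_idx + offset_pages;
    if (start > nr_pages)
        start = nr_pages;

    memset(base + start * SKETCH_PAGE_SIZE, 0,
           (nr_pages - start) * SKETCH_PAGE_SIZE);
    memset(base, 0, start * SKETCH_PAGE_SIZE);
}
```

With a 32KB offset and 4KB pages the offset is 8 subpages, so for a huge
page of 512 subpages and a fault in subpage 100, two_pass_order(512, 100,
8, order) visits subpages 108..511 first and 0..107 last. Whether 8
subpages is the right window relative to the L1 size is exactly the open
question above.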