From: "Huang, Ying"
To: "Yin, Fengwei"
Cc: Matthew Wilcox, Yang Shi, kernel test robot, Rik van Riel,
 Linux Memory Management List, Andrew Morton, Christopher Lameter
Subject: Re: [linux-next:master] [mm] 1111d46b5c: stress-ng.pthread.ops_per_sec -84.3% regression
In-Reply-To: (Fengwei Yin's message of "Fri, 22 Dec 2023 09:06:03 +0800")
References: <202312192310.56367035-oliver.sang@intel.com>
Date: Fri, 22 Dec 2023 10:23:14 +0800
Message-ID: <87y1dngelp.fsf@yhuang6-desk2.ccr.corp.intel.com>

"Yin, Fengwei" writes:

> On 12/22/2023 2:14 AM, Matthew Wilcox wrote:
>> On Thu, Dec 21, 2023 at 10:07:09AM -0800, Yang Shi wrote:
>>> On Wed, Dec 20, 2023 at 8:49 PM Matthew Wilcox wrote:
>>>>
>>>> On Thu, Dec 21, 2023 at 08:58:42AM +0800, Yin Fengwei wrote:
>>>>> Yes. MAP_STACK is also mentioned in the mmap manpage. I did a test
>>>>> that filters out the MAP_STACK mappings on top of this patch, and
>>>>> the regression in stress-ng.pthread was gone. I suppose this is
>>>>> reasonably safe because the madvise call is only applied to the
>>>>> glibc-allocated stack.
>>>>>
>>>>> What I am not sure about is whether such a change is worthwhile,
>>>>> as the regression is only clearly visible in a micro-benchmark.
>>>>> No evidence shows that the other regressions in this report are
>>>>> related to madvise, at least from the perf statistics. I need to
>>>>> check stream/ramspeed further.
>>>>
>>>> FWIW, we had a customer report a significant performance problem when
>>>> inadvertently using 2MB pages for stacks. They were able to avoid it
>>>> by using 2044KiB sized stacks ...
>>>
>>> Thanks for the report. This provides more justification for honoring
>>> MAP_STACK on Linux. Some applications, for example pthread, just
>>> allocate a fixed-size area for the stack. This confuses the kernel,
>>> because the kernel recognizes stacks by VM_GROWSDOWN | VM_GROWSUP.
>>>
>>> But I'm still a little confused about why THP for stacks could cause
>>> significant performance problems, unless the applications resize the
>>> stack quite often.
>>
>> We didn't delve into what was causing the problem, only that it was
>> happening. The application had many threads, so it could have been as
>> simple as consuming all the available THPs and leaving fewer available
>> for other uses. Or it could have been a memory consumption problem:
>> maybe the app would only have been using 16-32kB per thread but was
>> now using 2MB per thread, and if there were, say, 100 threads, that's
>> an extra 199MB of memory in use.
>
> One thing I know is related to the memory zeroing. This is from the
> perf data in this report:
>
>   0.00  +16.7  16.69 ± 7%
>   perf-profile.calltrace.cycles-pp.clear_page_erms.clear_huge_page.__do_huge_pmd_anonymous_page.__handle_mm_fault.handle_mm_fault
>
> Zeroing 2M of memory costs much more CPU than zeroing 16-32KB if there
> are many threads.

Using a 2M stack may hurt the performance of short-lived threads with
shallow stack depth. Imagine a network server which creates a new thread
for each incoming connection. I understand that performance will not be
great in such a design anyway; IIUC we just should not make it too bad.
But whether this matters depends on whether that use case is important.
TBH, I don't know.

--
Best Regards,
Huang, Ying
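
A minimal sketch of the 2044 KiB-stack workaround mentioned above,
assuming glibc pthreads; the stack size macro, worker function, and
error handling here are illustrative only, not taken from the report.
The point is that a thread stack smaller than 2 MiB cannot contain a
naturally aligned 2 MiB range, so it cannot be backed by a PMD-sized
THP and the first-touch fault does not pay the clear_huge_page()
zeroing cost seen in the profile above.

#include <pthread.h>
#include <stdio.h>
#include <string.h>

/* Just under 2 MiB: too small to hold an aligned 2 MiB huge page. */
#define WORKER_STACK_SIZE	(2044UL * 1024)

static void *worker(void *arg)
{
	/* Short-lived work with shallow stack usage. */
	return NULL;
}

int main(void)
{
	pthread_attr_t attr;
	pthread_t tid;
	int err;

	pthread_attr_init(&attr);

	/* Ask glibc for a 2044 KiB stack instead of the library default. */
	err = pthread_attr_setstacksize(&attr, WORKER_STACK_SIZE);
	if (err) {
		fprintf(stderr, "pthread_attr_setstacksize: %s\n", strerror(err));
		return 1;
	}

	err = pthread_create(&tid, &attr, worker, NULL);
	if (err) {
		fprintf(stderr, "pthread_create: %s\n", strerror(err));
		return 1;
	}

	pthread_join(tid, NULL);
	pthread_attr_destroy(&attr);
	return 0;
}

Honoring MAP_STACK in the kernel would presumably have a similar effect
for glibc-allocated stacks without applications having to pick an odd
stack size; madvise(stack, size, MADV_NOHUGEPAGE) on a caller-provided
stack is another per-application alternative.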