Date: Mon, 6 Mar 2023 19:15:29 -0500
From: Peter Xu <peterx@redhat.com>
To: Mike Kravetz
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: THP backed thread stacks
In-Reply-To: <20230306235730.GA31451@monkey>
References: <20230306235730.GA31451@monkey>
On Mon, Mar 06, 2023 at 03:57:30PM -0800, Mike Kravetz wrote:
> One of our product teams recently experienced 'memory bloat' in their
> environment. The application in this environment is the JVM, which
> creates hundreds of threads. Threads are ultimately created via
> pthread_create, which also creates the thread stacks. pthread
> attributes are modified so that stacks are 2MB in size. It just so
> happens that, due to allocation patterns, all their stacks are at 2MB
> boundaries. The system has THP set to always, so a huge page is
> allocated at the first (write) fault when libpthread initializes the
> stack.
>
> It would seem that this is expected behavior. If you set THP to
> always, you may get huge pages anywhere.
>
> However, I can't help but think that backing stacks with huge pages by
> default may not be the right thing to do. Stacks by their very nature
> grow in somewhat unpredictable ways over time. Using a large virtual
> space so that memory is allocated only as needed is the desired
> behavior.
>
> The only way to address their 'memory bloat' via thread stacks today
> is by switching THP to madvise.
>
> Just wondering if there is anything better or more selective that can
> be done. Does it make sense to have THP-backed stacks by default? If
> not, who would be best placed to disable them? A couple of thoughts:
> - The kernel could disable huge pages on stacks. libpthread/glibc pass
>   the otherwise-unused flag MAP_STACK; we could key off this and
>   disable huge pages. However, I'm sure there is somebody somewhere
>   today who is getting better performance because they have huge pages
>   backing their stacks.
> - We could push this to glibc/libpthread and have them use
>   MADV_NOHUGEPAGE on thread stacks. However, this also has the
>   potential of regressing performance if somebody somewhere is getting
>   better performance due to huge pages.

Yes, it never seems safe to me to change a default behavior, and I
can't really tell why stacks must be treated differently here.

I assume the problem is the wasted space, which multiplies easily with
N threads. But IIUC the same will be true for THP on normal memory:
e.g., there can be a per-thread mmap() of 2MB even if only 4KB of each
is used; if such an mmap() is populated by THP for each thread, there
will also be a huge waste.

> - Other thoughts?
>
> Perhaps this is just expected behavior of THP always, which is
> unfortunate in this situation.

I would think it proper for the app to explicitly choose what it wants
where possible, and we do have the interfaces. So, would
pthread_attr_getstack() plus MADV_NOHUGEPAGE work, applied from the JVM
framework level? Something like the untested sketches below is what I
have in mind.
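For illustration only, here is a rough sketch in C, assuming the
framework allocates the stack itself (the wrapper name
create_thread_nohuge() and the fixed 2MB size are hypothetical, not an
existing glibc or JVM API):

#define _GNU_SOURCE
#include <pthread.h>
#include <stdio.h>
#include <sys/mman.h>

/* 2MB, matching the stack size in the report above. */
#define STACK_SIZE	(2UL * 1024 * 1024)

/*
 * Hypothetical wrapper: allocate the stack ourselves so that
 * MADV_NOHUGEPAGE is in place before the first write fault has a
 * chance to instantiate a huge page.
 */
static int create_thread_nohuge(pthread_t *tid,
				void *(*fn)(void *), void *arg)
{
	pthread_attr_t attr;
	int ret;

	void *stack = mmap(NULL, STACK_SIZE, PROT_READ | PROT_WRITE,
			   MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
	if (stack == MAP_FAILED)
		return -1;

	/* Opt this range out of THP; non-fatal if it fails. */
	if (madvise(stack, STACK_SIZE, MADV_NOHUGEPAGE))
		perror("madvise(MADV_NOHUGEPAGE)");

	pthread_attr_init(&attr);
	pthread_attr_setstack(&attr, stack, STACK_SIZE);
	ret = pthread_create(tid, &attr, fn, arg);
	pthread_attr_destroy(&attr);
	/* A real implementation would munmap() the stack after join. */
	return ret;
}

If the stacks keep being allocated by libpthread and the framework can
only run code inside the thread, the literal pthread_attr_getstack()
route would look more like this (same includes as above;
pthread_getattr_np() is a GNU extension):

static void stack_nohuge_self(void)
{
	pthread_attr_t attr;
	void *addr;
	size_t size;

	if (pthread_getattr_np(pthread_self(), &attr))
		return;
	if (pthread_attr_getstack(&attr, &addr, &size) == 0)
		madvise(addr, size, MADV_NOHUGEPAGE);
	pthread_attr_destroy(&attr);
}

One caveat with the second variant: MADV_NOHUGEPAGE only affects
faults and collapses that happen after the call, and it does not split
a huge page that was already faulted in while the thread was starting
up. That is why the first sketch applies the advice up front, before
pthread_create() touches the new stack.

Thanks,

-- 
Peter Xu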