References: <6469324e-afa2-18b4-81fb-9e96466c1bf3@suse.cz>
From: Pavel Tatashin
Date: Wed, 2 Sep 2020 08:42:13 -0400
Subject: Re: [PATCH v2 00/28] The new cgroup slab memory controller
To: David Hildenbrand
Cc: Vlastimil Babka, Roman Gushchin, Bharata B Rao,
    "linux-mm@kvack.org", Andrew Morton, Michal Hocko, Johannes Weiner,
    Shakeel Butt, Vladimir Davydov, "linux-kernel@vger.kernel.org",
    Kernel Team, Yafang Shao, stable, Linus Torvalds, Sasha Levin,
    Greg Kroah-Hartman, David Hildenbrand
Sender: owner-linux-mm@kvack.org

> > On 02.09.2020 at 11:53, Vlastimil Babka wrote:
> >
> > On 8/28/20 6:47 PM, Pavel Tatashin wrote:
> >> There appears to be another problem that is related to the
> >> cgroup_mutex -> mem_hotplug_lock deadlock described above.
> >>
> >> In the original deadlock that I described, the workaround is to
> >> replace crash dump from piping to Linux traditional save to files
> >> method. However, after trying this workaround, I still observed
> >> hardware watchdog resets during machine shutdown.
> >>
> >> The new problem occurs for the following reason: upon shutdown systemd
> >> calls a service that hot-removes memory, and if hot-removing fails for
> >
> > Why is that hotremove even needed if we're shutting down? Are there any
> > (virtualization?) platforms where it makes some difference over plain
> > shutdown/restart?
>
> If all it's doing is offlining random memory that sounds unnecessary
> and dangerous. Any pointers to this service so we can figure out what
> it's doing and why? (Arch? Hypervisor?)
Hi David,

This is how we are using it at Microsoft: there is a very large number
of small-memory machines (8G each) with low downtime requirements (a
reboot must complete in under a second). There is also a large amount
of state, ~2G of memory, that we need to transfer across the reboot,
because it is very expensive to recreate otherwise.

We have 2G of system memory reserved as pmem in the device tree, and
use it to pass information across reboots. Once the information is no
longer needed, we hot-add that memory and use it during runtime. Before
shutdown we hot-remove the 2G, save the program state on it, and do the
reboot.

Pasha
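
For illustration only, here is a minimal sketch of the offlining half of
the "hot-remove the 2G before shutdown" step, written against the generic
sysfs memory-block interface (/sys/devices/system/memory/memoryN/state and
block_size_bytes). It is an assumption that this interface is the one
involved; the service discussed in this thread may work differently (for
example through a dax/kmem device). The function names and the region
address in the usage note are hypothetical. The sketch reports blocks that
refuse to go offline instead of retrying forever, since a hot-remove that
never completes is what leads to the watchdog resets described above.

```python
from pathlib import Path

# Standard sysfs location of the memory block directories.
SYSFS_MEMORY = Path("/sys/devices/system/memory")


def blocks_for_range(start, size, block_size):
    """Return the memory block ids covering [start, start + size).

    Assumes block id == physical address // block size, and that the
    range is aligned to whole memory blocks.
    """
    if size <= 0 or block_size <= 0:
        raise ValueError("size and block_size must be positive")
    if start % block_size or size % block_size:
        raise ValueError("range must be aligned to the memory block size")
    first = start // block_size
    return list(range(first, first + size // block_size))


def read_block_size(root=SYSFS_MEMORY):
    """Read block_size_bytes, which sysfs exposes as a hex string."""
    return int((root / "block_size_bytes").read_text().strip(), 16)


def offline_range(start, size, root=SYSFS_MEMORY):
    """Try to offline every block in the range; return the ids that failed.

    Hypothetical helper: it only asks the kernel to offline each block
    and does not perform the later removal step.
    """
    block_size = read_block_size(root)
    failed = []
    for idx in blocks_for_range(start, size, block_size):
        state = root / f"memory{idx}" / "state"
        try:
            state.write_text("offline")
        except OSError:
            # For example EBUSY when pages in the block cannot be migrated.
            failed.append(idx)
    return failed
```

A shutdown service could call offline_range(0x100000000, 2 << 30) (the
start address is made up) and, if the returned list is not empty, skip
saving state to that region and reboot anyway rather than block shutdown.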