References: <20210816180909.3603-1-shy828301@gmail.com> <08a5ad43-7922-8cf8-31ed-4f6e0c346516@redhat.com>
In-Reply-To: <08a5ad43-7922-8cf8-31ed-4f6e0c346516@redhat.com>
From: Yang Shi
Date: Mon, 16 Aug 2021 12:37:27 -0700
Subject: Re: [PATCH 1/2] mm: hwpoison: don't drop slab caches for offlining non-LRU page
To: David Hildenbrand
Cc:
 HORIGUCHI NAOYA(堀口 直也), Oscar Salvador, tdmackey@twitter.com,
 Andrew Morton, Jonathan Corbet, Linux MM, Linux Kernel Mailing List

On Mon, Aug 16, 2021 at 12:15 PM David Hildenbrand wrote:
>
> On 16.08.21 20:09, Yang Shi wrote:
> > In the current implementation of soft offline, if non-LRU page is met,
> > all the slab caches will be dropped to free the page then offline. But
> > if the page is not slab page all the effort is wasted in vain. Even
> > though it is a slab page, it is not guaranteed the page could be freed
> > at all.
>
> ... but there is a chance it could be and the current behavior is
> actually helpful in some setups.

I don't disagree that it is helpful in some cases, but the question is
how likely it is to help and whether the cost is worth it. For a
non-slab page (and, of course, non-LRU too), dropping slab caches
doesn't make any sense. Even if it is a slab page, it must be a
reclaimable slab. And even if it is a reclaimable slab, dropping slab
caches can't guarantee that all objects on the same page are freed.

IMHO the likelihood is not worth the cost and side effects, for example
an unusable system.

>
> [...]
>
> > The lockup made the machine is quite unusable. And it also made the
> > most workingset gone, the reclaimabled slab caches were reduced from 12G
> > to 300MB, the page caches were decreased from 17G to 4G.
> >
> > But the most disappointing thing is all the effort doesn't make the page
> > offline, it just returns:
> >
> > soft_offline: 0x1469f2: unknown non LRU page type 5ffff0000000000 ()
> >
>
> In your example, yes. I had a look at the introducing commit:
> facb6011f399 ("HWPOISON: Add soft page offline support")
>
> "
> When the page is not free or LRU we try to free pages
> from slab and other caches. The slab freeing is currently
> quite dumb and does not try to focus on the specific slab
> cache which might own the page. This could be potentially
> improved later.
> "
>
> I wonder, if instead of removing it altogether, we could actually
> improve it as envisioned.
>
> To be precise, for alloc_contig_range() it would also make sense to be
> able to shrink only in a specific physical memory range; this here seems
> to be a similar thing. (actually, alloc_contig_range(), actual memory
> offlining and hw poisoning/soft-offlining have a lot in common)
>
> Unfortunately, the last time I took a brief look at teaching shrinkers
> to be range-aware, it turned out to be a lot of work ... so maybe this
> is really a long term goal to be mitigated in the meantime by disabling
> it, if it turns out to be more of a problem than actually help.

Do you mean a physical page range? Yes, it would need a lot of work.
TBH, I don't think it is feasible for the time being. The problem is
that the slabs handled by shrinkers are managed by object rather than by
page. For example, dentry and inode objects (the biggest consumers of
reclaimable slab) are linked on LRU lists, and the shrinkers traverse
those lists to reclaim the objects. The objects in a given range of an
LRU list are not guaranteed to be in the same range of physical pages.

>
> --
> Thanks,
>
> David / dhildenb
>
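
The object-versus-page point above can be illustrated with a toy
user-space model. This is only a sketch and not kernel code: the lru[]
array, the helpers fill_interleaved(), fill_clustered() and
reclaimed_before_page_free(), and all sizes are made up for
illustration. It also optimistically assumes that every object is
reclaimable, which real dentries and inodes are not.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy model (hypothetical, not kernel code): lru[i] holds the index of
 * the slab page backing the i-th coldest object on an LRU list.
 */

/* Objects that are neighbours on the LRU live on different pages. */
void fill_interleaved(unsigned int lru[], size_t npages, size_t objs_per_page)
{
	for (size_t i = 0; i < npages * objs_per_page; i++)
		lru[i] = (unsigned int)(i % npages);
}

/* Best case: LRU order happens to match page placement. */
void fill_clustered(unsigned int lru[], size_t npages, size_t objs_per_page)
{
	for (size_t i = 0; i < npages * objs_per_page; i++)
		lru[i] = (unsigned int)(i / objs_per_page);
}

/*
 * Walk the LRU from the cold end, reclaiming every object, and return
 * how many objects had to be reclaimed before the target page no longer
 * holds any object (and so could be freed).
 */
size_t reclaimed_before_page_free(const unsigned int lru[], size_t nobjs,
				  unsigned int target_page)
{
	size_t live = 0;

	/* Count the objects that currently pin the target page. */
	for (size_t i = 0; i < nobjs; i++)
		if (lru[i] == target_page)
			live++;
	assert(live > 0);	/* the target page must hold an object */

	/* Reclaim in LRU order until the last such object is gone. */
	for (size_t i = 0; i < nobjs; i++)
		if (lru[i] == target_page && --live == 0)
			return i + 1;

	return nobjs;		/* not reached when live > 0 */
}
```

With 8 pages of 4 objects each, the interleaved layout needs 25 of the
32 objects reclaimed before page 0 is empty, while the clustered layout
needs only 4. An LRU walk has no way to prefer the second case, which is
why shrinking by physical range does not fall out of the current
shrinker design.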