Date: Mon, 25 Apr 2022 15:58:33 -0700
From: Minchan Kim
To: Mel Gorman
Cc: Nicolas Saenz Julienne, Marcelo Tosatti, Vlastimil Babka, Michal Hocko, LKML, Linux-MM
Subject: Re: [RFC PATCH 0/6] Drain remote per-cpu directly
References: <20220420095906.27349-1-mgorman@techsingularity.net>
In-Reply-To: <20220420095906.27349-1-mgorman@techsingularity.net>

On Wed, Apr 20, 2022 at 10:59:00AM +0100, Mel Gorman wrote:
> This series has the same intent as Nicolas' series "mm/page_alloc: Remote
> per-cpu lists drain support" -- avoid interference of a high priority
> task due to a workqueue item draining per-cpu page lists. While many
> workloads can tolerate a brief interruption, it may cause a real-time
> task running on a NOHZ_FULL CPU to miss a deadline and at minimum,
> the draining is non-deterministic.

Yeah, the non-determinism is a problem. I have seen the kworker-based
draining take 100+ ms (up to 300 ms observed) at times in
alloc_contig_range when CPUs are heavily loaded. I am not sure whether
Nicolas has already observed this.

It is not only a problem for per_cpu_pages; lru_pvecs (pagevec) draining
is affected too. Do we need to introduce a similar solution (allowing
remote draining under a spin_lock) for pagevec?

>
> Currently an IRQ-safe local_lock protects the page allocator per-cpu lists.
> The local_lock on its own prevents migration and the IRQ disabling protects
> from corruption due to an interrupt arriving while a page allocation is
> in progress. The locking is inherently unsafe for remote access unless
> the CPU is hot-removed.
>
> This series adjusts the locking. A spin-lock is added to struct
> per_cpu_pages to protect the list contents while local_lock_irq continues
> to prevent migration and IRQ reentry. This allows a remote CPU to safely
> drain a remote per-cpu list.
>
> This series is a partial series.
> Follow-on work would allow the
> local_irq_save to be converted to a local_irq to avoid IRQs being
> disabled/enabled in most cases. However, there are enough corner cases
> that it deserves a series on its own separated by one kernel release and
> the priority right now is to avoid interference of high priority tasks.
>
> Patch 1 is a cosmetic patch to clarify when page->lru is storing buddy pages
>   and when it is storing per-cpu pages.
>
> Patch 2 shrinks per_cpu_pages to make room for a spin lock. Strictly speaking
>   this is not necessary but it avoids per_cpu_pages consuming another
>   cache line.
>
> Patch 3 is a preparation patch to avoid code duplication.
>
> Patch 4 is a simple micro-optimisation that improves code flow necessary for
>   a later patch to avoid code duplication.
>
> Patch 5 uses a spin_lock to protect the per_cpu_pages contents while still
>   relying on local_lock to prevent migration, stabilise the pcp
>   lookup and prevent IRQ reentrancy.
>
> Patch 6 remote drains per-cpu pages directly instead of using a workqueue.
>
>  include/linux/mm_types.h |   5 +
>  include/linux/mmzone.h   |  12 +-
>  mm/page_alloc.c          | 333 ++++++++++++++++++++++++---------------
>  3 files changed, 222 insertions(+), 128 deletions(-)
>
> --
> 2.34.1
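
For readers following the thread, the locking split described in the cover
letter (patches 5 and 6) can be pictured with a small userspace sketch in C.
This is not the code from the series. The names pcp_sketch, pcp_init,
pcp_free_page, pcp_alloc_page, pcp_drain, pcp_drain_all and NR_CPUS_SKETCH
are all hypothetical, a plain counter stands in for the per-cpu page lists,
and a C11 atomic_flag stands in for the kernel spin lock. The local_lock
half of the scheme (preventing migration and IRQ reentry) has no direct
userspace equivalent and is left out.

```c
#include <assert.h>
#include <stdatomic.h>

#define NR_CPUS_SKETCH 4	/* hypothetical CPU count for this sketch */

/* Hypothetical stand-in for struct per_cpu_pages. */
struct pcp_sketch {
	atomic_flag lock;	/* plays the role of the added spin lock */
	int count;		/* pages currently cached on this "CPU" */
};

static struct pcp_sketch pcp[NR_CPUS_SKETCH];

static void pcp_lock(struct pcp_sketch *p)
{
	/* Spin until the current holder clears the flag. */
	while (atomic_flag_test_and_set_explicit(&p->lock, memory_order_acquire))
		;
}

static void pcp_unlock(struct pcp_sketch *p)
{
	atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

/* Put every list into a known empty, unlocked state. */
static void pcp_init(void)
{
	for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++) {
		atomic_flag_clear(&pcp[cpu].lock);
		pcp[cpu].count = 0;
	}
}

/* Local path: the owning CPU caches one freed page. */
static void pcp_free_page(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	pcp_lock(&pcp[cpu]);
	pcp[cpu].count++;
	pcp_unlock(&pcp[cpu]);
}

/* Local path: take one cached page; returns 1 on success, 0 if empty. */
static int pcp_alloc_page(int cpu)
{
	int got = 0;

	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	pcp_lock(&pcp[cpu]);
	if (pcp[cpu].count > 0) {
		pcp[cpu].count--;
		got = 1;
	}
	pcp_unlock(&pcp[cpu]);
	return got;
}

/*
 * Remote path: any caller may empty another CPU's list directly,
 * because the list contents are protected by the lock rather than
 * by "only the owning CPU touches this". Returns the pages drained.
 */
static int pcp_drain(int cpu)
{
	int drained;

	assert(cpu >= 0 && cpu < NR_CPUS_SKETCH);
	pcp_lock(&pcp[cpu]);
	drained = pcp[cpu].count;
	pcp[cpu].count = 0;
	pcp_unlock(&pcp[cpu]);
	return drained;
}

/* Drain every list from the caller's context; no work is queued. */
static int pcp_drain_all(void)
{
	int total = 0;

	for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		total += pcp_drain(cpu);
	return total;
}
```

The point of the sketch is that pcp_drain runs in the caller's context, so
the time it takes depends on lock hold times rather than on when a kworker
gets scheduled on a busy CPU. Whether the same shape would suit lru_pvecs is
the open question raised above; the sketch does not try to answer it.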