From: Andy Lutomirski
Subject: Re: [RFC v2 2/2] [MOCKUP] sched/mm: Lightweight lazy mm refcounting
Date: Fri, 4 Dec 2020 06:37:04 -0800
To: Nicholas Piggin
Cc: Andy Lutomirski, Anton Blanchard, Arnd Bergmann, Catalin Marinas, Dave Hansen, Jann Horn, linux-arch, LKML, Linux-MM, linuxppc-dev, Mathieu Desnoyers, Nadav Amit, Rik van Riel, Will Deacon, X86 ML
In-Reply-To: <1607065599.ecww2w3xq3.astroid@bobo.none>

> On Dec 3, 2020, at 11:54 PM, Nicholas Piggin wrote:
>
> Excerpts from Andy Lutomirski's message of December 4, 2020 3:26 pm:
>> This is a mockup.  It's designed to illustrate the algorithm and how the
>> code might be structured.  There are several things blatantly wrong with
>> it:
>>
>> The coding style is not up to kernel standards.  I have prototypes in the
>> wrong places and other hacks.
>>
>> There's a problem with mm_cpumask() not being reliable.
>
> Interesting, this might be a way to reduce those IPIs with fairly
> minimal fast path cost. Would be interesting to see how much performance
> advantage it has over my dumb simple shoot-lazies.

My real motivation isn’t performance per se.  I think there’s
considerable value in keeping the core algorithms the same across all
architectures, and I think my approach can manage that with only a
single hint from the architecture as to which CPUs to scan.

With shoot-lazies, in contrast, enabling it everywhere would either
malfunction or have very poor performance or even DoS issues on arches
like arm64 and s390x that don’t track mm_cpumask at all.  I’m sure we
could come up with some way to mitigate that, but I think that my
approach may be better overall for keeping the core code uniform and
relatively straightforward.

>
> For powerpc I don't think we'd be inclined to go that way, so don't feel
> the need to add this complexity for us alone -- we'd be more inclined to
> move the exit lazy to the final TLB shootdown path, which we're slowly
> getting more infrastructure in place to do.
>
>
> There's a few nits but I don't think I can see a fundamental problem
> yet.

Thanks!

I can polish the patch, but I want to be sure the memory ordering parts
are clear.

>
> Thanks,
> Nick