From: Uladzislau Rezki
Date: Mon, 15 May 2023 20:17:17 +0200
To: Thomas Gleixner
Cc: Andrew Morton, linux-mm@kvack.org, Christoph Hellwig,
    Uladzislau Rezki, Lorenzo Stoakes, Peter Zijlstra, Baoquan He,
    John Ogness, linux-arm-kernel@lists.infradead.org, Russell King,
    Mark Rutland, Marc Zyngier
Subject: Re: Excessive TLB flush ranges
In-Reply-To: <87a5y5a6kj.ffs@tglx>
References: <87a5y5a6kj.ffs@tglx>

On Mon, May 15, 2023 at 06:43:40PM +0200, Thomas Gleixner wrote:
> Folks!
>
> We're observing massive latencies and slowdowns on ARM32 machines due to
> excessive TLB flush ranges.
>
> Those can be observed when tearing down a process, which has a seccomp
> BPF filter installed. ARM32 uses the vmalloc area for module space.
>
> bpf_prog_free_deferred()
>   vfree()
>     _vm_unmap_aliases()
>        collect_per_cpu_vmap_blocks: start:0x95c8d000 end:0x95c8e000 size:0x1000
>     __purge_vmap_area_lazy(start:0x95c8d000, end:0x95c8e000)
>
>          va_start:0xf08a1000 va_end:0xf08a5000 size:0x00004000 gap:0x5ac13000 (371731 pages)
>          va_start:0xf08a5000 va_end:0xf08a9000 size:0x00004000 gap:0x00000000 (     0 pages)
>          va_start:0xf08a9000 va_end:0xf08ad000 size:0x00004000 gap:0x00000000 (     0 pages)
>          va_start:0xf08ad000 va_end:0xf08b1000 size:0x00004000 gap:0x00000000 (     0 pages)
>          va_start:0xf08b3000 va_end:0xf08b7000 size:0x00004000 gap:0x00002000 (     2 pages)
>          va_start:0xf08b7000 va_end:0xf08bb000 size:0x00004000 gap:0x00000000 (     0 pages)
>          va_start:0xf08bb000 va_end:0xf08bf000 size:0x00004000 gap:0x00000000 (     0 pages)
>          va_start:0xf0a15000 va_end:0xf0a17000 size:0x00002000 gap:0x00156000 (   342 pages)
>
>       flush_tlb_kernel_range(start:0x95c8d000, end:0xf0a17000)
>
>          Does 372106 flush operations where only 31 are useful
>
> So for all architectures which lack a mechanism to do a full TLB flush
> in flush_tlb_kernel_range() this takes ages (4-8ms) and slows down
> realtime processes on the other CPUs by a factor of two and larger.
>
> So while ARM32, CSKY, NIOS, PPC (some variants), _should_ arguably have
> a fallback to tlb_flush_all() when the range is too large, there is
> another issue. I've seen a couple of instances where _vm_unmap_aliases()
> collects one page and the actual va list has only 2 pages, which might
> be eventually worth to flush one by one.
>
> I'm not sure whether that's worth it as checking for those gaps might be
> too expensive for the case where a large number of va entries needs to
> be flushed.
>
> We'll experiment with a tlb_flush_all() fallback on that ARM32 system in
> the next days and see how that works out.
>
For systems which lack a full TLB flush, and where flushing a long range
is a problem (it takes time), we can probably flush the VAs one by one.
That is because we currently calculate a single flush range [min:max],
and that range can include space that is not mapped at all. Like below:

      VA_1                                VA_2
 |....|-------------------------|............|
10   12                         60           68

. mapped;
- not mapped.

So we flush from 10 until 68. Instead, we can probably flush the VA_1
range and the VA_2 range separately.

On modern systems with many CPUs, it could be a big slowdown.

Just some thoughts.

--
Uladzislau Rezki
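
To put numbers on the [min:max] versus per-VA comparison above, here is
a minimal user-space sketch in C. It is not kernel code: struct va_range,
merged_flush_pages() and per_va_flush_pages() are hypothetical names made
up for illustration, and a 4 KiB page size is assumed, as in the trace.
It only counts how many pages each strategy would cover.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12UL	/* assumption: 4 KiB pages, as in the trace */

/* Hypothetical stand-in for a vmap area: [start, end) in bytes. */
struct va_range {
	unsigned long start;
	unsigned long end;
};

static unsigned long range_pages(unsigned long start, unsigned long end)
{
	assert(end >= start);
	return (end - start) >> PAGE_SHIFT;
}

/* Pages covered when one merged [min:max] range is flushed. */
unsigned long merged_flush_pages(const struct va_range *va, size_t n)
{
	unsigned long min, max;
	size_t i;

	if (n == 0)
		return 0;

	min = va[0].start;
	max = va[0].end;
	for (i = 1; i < n; i++) {
		if (va[i].start < min)
			min = va[i].start;
		if (va[i].end > max)
			max = va[i].end;
	}
	return range_pages(min, max);
}

/* Pages covered when every area is flushed on its own. */
unsigned long per_va_flush_pages(const struct va_range *va, size_t n)
{
	unsigned long pages = 0;
	size_t i;

	for (i = 0; i < n; i++)
		pages += range_pages(va[i].start, va[i].end);
	return pages;
}
```

Reading the diagram's 10, 12, 60 and 68 as page numbers, the merged
range covers 58 pages and the per-VA variant covers 10. Fed with the
ranges from the trace (the one collected page plus the eight va
entries), it gives 372106 and 31, which matches the numbers Thomas
reported.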