Date: Mon, 29 Jan 2024 11:00:36 +0800
From: Ming Lei
To: Matthew Wilcox
Cc: Andrew Morton, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Mike Snitzer, Don Dutile, Raghavendra K T, ming.lei@redhat.com
Subject: Re: [RFC PATCH] mm/readahead: readahead aggressively if read drops in willneed range
References: <20240128142522.1524741-1-ming.lei@redhat.com>

On Sun, Jan 28, 2024 at 10:02:49PM +0000, Matthew Wilcox wrote:
> On Sun, Jan 28, 2024 at 10:25:22PM +0800, Ming Lei wrote:
> > Since commit 6d2be915e589 ("mm/readahead.c: fix readahead failure for
> > memoryless NUMA nodes and limit readahead max_pages"), ADV_WILLNEED
> > only tries to readahead 512 pages, and the remained part in the advised
> > range fallback on normal readahead.
>
> Does the MAINTAINERS file mean nothing any more?

I simply forgot to Cc you, sorry.

> > If bdi->ra_pages is set as small, readahead will perform not efficient
> > enough. Increasing read ahead may not be an option since workload may
> > have mixed random and sequential I/O.
>
> I think there needs to be a lot more explanation than this about what's
> going on before we jump to "And therefore this patch is the right
> answer".

Both commit 6d2be915e589 and this patch's commit log provide background
on the issue, but let me explain it in more detail:

1) Before commit 6d2be915e589, the madvise/fadvise(WILLNEED)/readahead
syscalls tried to read ahead the whole specified range, memory
permitting. For each readahead in this range, the ra size was set to the
max sectors of the block device, see force_page_cache_ra().

2) Since commit 6d2be915e589, only 2MB is loaded by these syscalls, and
the remaining bytes fall back to normal readahead later, when the data
is read through the page cache or an mmap'ed buffer.

3) This patch wires the advise(WILLNEED) range info into normal
readahead for both the mmap fault and the buffered read code paths, so
each readahead can use the max sectors of the block device as the ra
size. This is basically the same approach as before commit
6d2be915e589.

> > @@ -972,6 +974,7 @@ struct file_ra_state {
> > 	unsigned int ra_pages;
> > 	unsigned int mmap_miss;
> > 	loff_t prev_pos;
> > +	struct maple_tree *need_mt;
>
> No. Embed the struct maple tree. Don't allocate it. What made you
> think this was the right approach?

Can you explain why it has to be embedded? core-api/maple_tree.rst
mentions that it is fine to call "mt_init() for dynamically allocated
ones".

The maple tree provides an easy way to record the advised WILLNEED
range, so the readahead code path can use this info to speed up
readahead.

Thanks,
Ming