From: Zhaoyang Huang
Date: Fri, 7 Jun 2024 16:30:41 +0800
Subject: Re: [Resend PATCHv4 1/1] mm: fix incorrect vbq reference in purge_fragmented_block
To: "zhaoyang.huang"
Cc: Andrew Morton, Uladzislau Rezki, Christoph Hellwig, Lorenzo Stoakes, Baoquan He, Thomas Gleixner, hailong liu, linux-mm@kvack.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org, steve.kang@unisoc.com
In-Reply-To: <20240607023116.1720640-1-zhaoyang.huang@unisoc.com>
References: <20240607023116.1720640-1-zhaoyang.huang@unisoc.com>
PATCHv4 was updated based on Hailong's and Uladzislau's comments: vbq is now obtained from vb->cpu, which avoids disabling preemption. Baoquan's suggestion was not adopted because it would make vbq accesses completely interleaved across all CPUs, defeating the purpose of the per-CPU design.
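To make that bookkeeping concrete, here is a minimal userspace sketch (plain C, compilable outside the kernel; NR_QUEUES, block_create and the block_purge_* helpers are hypothetical stand-ins for the per-CPU vmap_block_queue and the new_vmap_block()/purge paths, not kernel APIs). It contrasts purging via "whatever CPU we run on now" with purging via the home CPU recorded at creation:

#include <stdio.h>

#define NR_QUEUES 4  /* stands in for the per-CPU queue instances */

struct block {
        int home;    /* queue the block was enqueued on (cf. vb->cpu) */
};

static int nqueued[NR_QUEUES];

/* Creation path: remember which queue the block actually went onto. */
static void block_create(struct block *b, int cpu_at_creation)
{
        b->home = cpu_at_creation;
        nqueued[b->home]++;
}

/* Buggy purge: assumes the block lives on the current CPU's queue. */
static void block_purge_current_cpu(struct block *b, int current_cpu)
{
        (void)b;
        nqueued[current_cpu]--;
}

/* Fixed purge: uses the recorded home queue, like vb->cpu in the patch. */
static void block_purge_home(struct block *b)
{
        nqueued[b->home]--;
}

int main(void)
{
        struct block b = { 0 };

        block_create(&b, 2);            /* created while running on "CPU 2" */
        block_purge_current_cpu(&b, 0); /* task migrated to CPU 0 */
        /* queue 0 underflows and queue 2 leaks a block: */
        printf("buggy: q0=%d q2=%d\n", nqueued[0], nqueued[2]);

        nqueued[0] = nqueued[2] = 0;    /* reset the toy accounting */

        block_create(&b, 2);
        block_purge_home(&b);           /* drains queue 2 regardless of CPU */
        printf("fixed: q0=%d q2=%d\n", nqueued[0], nqueued[2]);
        return 0;
}

The patch below does the analogous thing with vb->cpu = raw_smp_processor_id() at block creation and per_cpu(vmap_block_queue, vb->cpu) at purge time, so nothing needs to disable preemption.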
On Fri, Jun 7, 2024 at 10:31 AM zhaoyang.huang wrote:
>
> From: Zhaoyang Huang
>
> The vmalloc area runs out on our ARM64 system during an erofs test,
> with vm_map_ram() failing[1]. Following the debug log, we found that
> vm_map_ram()->vb_alloc() keeps allocating a new vb->va, each
> corresponding to a 4MB vmalloc area, because list_for_each_entry_rcu
> returns immediately when vbq->free->next points back to vbq->free.
> That is to say, once the list is broken, 65536 such page faults are
> enough to run out of the whole vmalloc area. The breakage is caused
> by a vbq->free->next pointing back to vbq->free, which prevents
> list_for_each_entry_rcu from iterating the list and exposes the BUG.
>
> [1]
> PID: 1 TASK: ffffff80802b4e00 CPU: 6 COMMAND: "init"
>  #0 [ffffffc08006afe0] __switch_to at ffffffc08111d5cc
>  #1 [ffffffc08006b040] __schedule at ffffffc08111dde0
>  #2 [ffffffc08006b0a0] schedule at ffffffc08111e294
>  #3 [ffffffc08006b0d0] schedule_preempt_disabled at ffffffc08111e3f0
>  #4 [ffffffc08006b140] __mutex_lock at ffffffc08112068c
>  #5 [ffffffc08006b180] __mutex_lock_slowpath at ffffffc08111f8f8
>  #6 [ffffffc08006b1a0] mutex_lock at ffffffc08111f834
>  #7 [ffffffc08006b1d0] reclaim_and_purge_vmap_areas at ffffffc0803ebc3c
>  #8 [ffffffc08006b290] alloc_vmap_area at ffffffc0803e83fc
>  #9 [ffffffc08006b300] vm_map_ram at ffffffc0803e78c0
>
> Fixes: fc1e0d980037 ("mm/vmalloc: prevent stale TLBs in fully utilized blocks")
>
> For a detailed explanation of how the list gets broken, please refer to:
> https://lore.kernel.org/all/20240531024820.5507-1-hailong.liu@oppo.com/
>
> Suggested-by: Hailong.Liu
> Signed-off-by: Zhaoyang Huang
> ---
> v2: introduce cpu in vmap_block to record the right CPU number
> v3: use get_cpu/put_cpu to prevent scheduling between cores
> v4: replace get_cpu/put_cpu with another API to avoid disabling preemption
> ---
> ---
>  mm/vmalloc.c | 21 +++++++++++++++------
>  1 file changed, 15 insertions(+), 6 deletions(-)
>
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 22aa63f4ef63..89eb034f4ac6 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2458,6 +2458,7 @@ struct vmap_block {
>         struct list_head free_list;
>         struct rcu_head rcu_head;
>         struct list_head purge;
> +       unsigned int cpu;
>  };
>
>  /* Queue of free and dirty vmap blocks, for allocation and flushing purposes */
> @@ -2585,8 +2586,15 @@ static void *new_vmap_block(unsigned int order, gfp_t gfp_mask)
>                 free_vmap_area(va);
>                 return ERR_PTR(err);
>         }
> -
> -       vbq = raw_cpu_ptr(&vmap_block_queue);
> +       /*
> +        * list_add_tail_rcu can happen on a core other than
> +        * vb->cpu due to task migration, which is safe as
> +        * list_add_tail_rcu will ensure the list's integrity
> +        * together with list_for_each_entry_rcu on the read
> +        * side.
> +        */
> +       vb->cpu = raw_smp_processor_id();
> +       vbq = per_cpu_ptr(&vmap_block_queue, vb->cpu);
>         spin_lock(&vbq->lock);
>         list_add_tail_rcu(&vb->free_list, &vbq->free);
>         spin_unlock(&vbq->lock);
> @@ -2614,9 +2622,10 @@ static void free_vmap_block(struct vmap_block *vb)
>  }
>
>  static bool purge_fragmented_block(struct vmap_block *vb,
> -               struct vmap_block_queue *vbq, struct list_head *purge_list,
> -               bool force_purge)
> +               struct list_head *purge_list, bool force_purge)
>  {
> +       struct vmap_block_queue *vbq = &per_cpu(vmap_block_queue, vb->cpu);
> +
>         if (vb->free + vb->dirty != VMAP_BBMAP_BITS ||
>             vb->dirty == VMAP_BBMAP_BITS)
>                 return false;
> @@ -2664,7 +2673,7 @@ static void purge_fragmented_blocks(int cpu)
>                         continue;
>
>                 spin_lock(&vb->lock);
> -               purge_fragmented_block(vb, vbq, &purge, true);
> +               purge_fragmented_block(vb, &purge, true);
>                 spin_unlock(&vb->lock);
>         }
>         rcu_read_unlock();
> @@ -2801,7 +2810,7 @@ static void _vm_unmap_aliases(unsigned long start, unsigned long end, int flush)
>                          * not purgeable, check whether there is dirty
>                          * space to be flushed.
>                          */
> -                       if (!purge_fragmented_block(vb, vbq, &purge_list, false) &&
> +                       if (!purge_fragmented_block(vb, &purge_list, false) &&
>                             vb->dirty_max && vb->dirty != VMAP_BBMAP_BITS) {
>                                 unsigned long va_start = vb->va->va_start;
>                                 unsigned long s, e;
> --
> 2.25.1
>
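As a closing note, the list-breakage symptom from the commit message is easy to model in isolation. The following self-contained userspace sketch (the list_head type and helpers are simplified rewrites of the kernel's <linux/list.h> primitives; RCU is deliberately not modeled) shows that once the head's next pointer aims back at the head, the walk that vb_alloc() performs sees an empty free list:

#include <stdio.h>

struct list_head { struct list_head *next, *prev; };

static void list_init(struct list_head *h) { h->next = h->prev = h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
        n->prev = h->prev;
        n->next = h;
        h->prev->next = n;
        h->prev = n;
}

int main(void)
{
        struct list_head vbq_free;          /* stands in for vbq->free */
        struct list_head block_a, block_b;  /* two queued free blocks  */

        list_init(&vbq_free);
        list_add_tail(&block_a, &vbq_free);
        list_add_tail(&block_b, &vbq_free);

        /* Healthy list: the walk done by vb_alloc() sees both blocks. */
        int visible = 0;
        for (struct list_head *p = vbq_free.next; p != &vbq_free; p = p->next)
                visible++;
        printf("before corruption: %d blocks visible\n", visible); /* 2 */

        /* Corruption: head->next points back at the head itself. */
        vbq_free.next = &vbq_free;

        visible = 0;
        for (struct list_head *p = vbq_free.next; p != &vbq_free; p = p->next)
                visible++;
        printf("after corruption:  %d blocks visible\n", visible); /* 0 */

        /*
         * With zero visible free blocks, every vm_map_ram() fault
         * allocates a brand-new 4MB block; per the commit message,
         * 65536 of those exhaust the whole vmalloc area.
         */
        return 0;
}

Built with any C99 compiler, it prints 2 blocks visible before the corruption and 0 after, matching the described failure mode.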