From: Jann Horn
Subject: [PATCH 0/2] mm/memory: fix memory tearing on threaded fork
Date: Tue, 03 Jun 2025 20:21:01 +0200
Message-Id: <20250603-fork-tearing-v1-0-a7f64b7cfc96@google.com>
To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, "Liam R. Howlett", Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko, linux-mm@kvack.org
Cc: Peter Xu, linux-kernel@vger.kernel.org, Jann Horn, stable@vger.kernel.org

The first patch is a fix with an explanation of the issue; you should read
that first.
The second patch adds a comment to document the rules, because figuring this
out from scratch causes brain pain.

Accidentally hitting this issue and getting negative consequences from it
would require several stars to line up just right; but if someone out there
is using a malloc() implementation that uses lockless data structures across
threads or such, this could actually be a problem.
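To spell out what "tearing" means in terms of observable state: suppose a
single thread stores the same increasing counter to several memory locations,
in order, and then increments it. At any instant, an earlier location is never
behind a later one, and the first and last locations differ by at most 1. A
snapshot that violates this is not a state that ever existed in the writer's
address space. Below is a minimal sketch of such a check; the helper name and
signature are hypothetical and not part of this series, and counter wraparound
is ignored.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of the patch series.
 *
 * Assumes one writer stores the same increasing counter to vals[0],
 * vals[1], ..., vals[n-1] in that order and then increments it.
 * Counter wraparound is not handled.
 */
static bool snapshot_is_coherent(const unsigned long *vals, size_t n)
{
	if (n == 0)
		return true;

	for (size_t i = 1; i < n; i++) {
		/* A later slot must never be ahead of an earlier slot. */
		if (vals[i] > vals[i - 1])
			return false;
	}

	/* The first slot is at most one round ahead of the last slot. */
	return vals[0] - vals[n - 1] <= 1;
}
```

A forked child could run something like this over the values it reads from
its copy of the memory; the expectation is that a coherent fork snapshot never
makes it return false for a writer of the kind described above.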

In case someone wants a testcase, here's a very artificial one:
```
#include <pthread.h>
#include <err.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/uio.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <linux/io_uring.h>

/* Abort with a message if a syscall-style expression returns -1. */
#define SYSCHK(x) ({          \
  typeof(x) __res = (x);      \
  if (__res == (typeof(x))-1) \
    err(1, "SYSCHK(" #x ")"); \
  __res;                      \
})

#define NUM_SQ_PAGES 4

static int uring_init(struct io_uring_sqe **sqesp, void **cqesp) {
  struct io_uring_sqe *sqes = SYSCHK(mmap(NULL, NUM_SQ_PAGES*0x1000,
      PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS, -1, 0));
  void *cqes = SYSCHK(mmap(NULL, NUM_SQ_PAGES*0x1000,
      PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS, -1, 0));
  *(volatile unsigned int *)(cqes+4) = 64 * NUM_SQ_PAGES;
  struct io_uring_params params = {
    .flags = IORING_SETUP_NO_MMAP|IORING_SETUP_NO_SQARRAY,
    .sq_off = { .user_addr = (unsigned long)sqes },
    .cq_off = { .user_addr = (unsigned long)cqes }
  };
  int uring_fd = SYSCHK(syscall(__NR_io_uring_setup, /*entries=*/10, &params));
  if (sqesp)
    *sqesp = sqes;
  if (cqesp)
    *cqesp = cqes;
  return uring_fd;
}

/* Three page-aligned pages; a byte array, so the offsets below are in bytes. */
static char bufmem[0x3000] __attribute__((aligned(0x1000)));

/* Writer thread: store the same counter to two spots in each page, forever. */
static void *thread_fn(void *dummy) {
  unsigned long i = 0;
  while (1) {
    *(volatile unsigned long *)(bufmem + 0x0000) = i;
    *(volatile unsigned long *)(bufmem + 0x0f00) = i;
    *(volatile unsigned long *)(bufmem + 0x1000) = i;
    *(volatile unsigned long *)(bufmem + 0x1f00) = i;
    *(volatile unsigned long *)(bufmem + 0x2000) = i;
    *(volatile unsigned long *)(bufmem + 0x2f00) = i;
    i++;
  }
}

int main(void) {
#if 1
  /* Register the first two pages as an io_uring fixed buffer. */
  int uring_fd = uring_init(NULL, NULL);
  struct iovec reg_iov = { .iov_base = bufmem, .iov_len = 0x2000 };
  SYSCHK(syscall(__NR_io_uring_register, uring_fd, IORING_REGISTER_BUFFERS,
         &reg_iov, 1));
#endif

  pthread_t thread;
  if (pthread_create(&thread, NULL, thread_fn, NULL))
    errx(1, "pthread_create");

  sleep(1);
  int child = SYSCHK(fork());
  if (child == 0) {
    /* The writer thread does not exist in the child; print the snapshot. */
    printf("bufmem values:\n");
    printf("  0x0000: 0x%lx\n", *(volatile unsigned long *)(bufmem + 0x0000));
    printf("  0x0f00: 0x%lx\n", *(volatile unsigned long *)(bufmem + 0x0f00));
    printf("  0x1000: 0x%lx\n", *(volatile unsigned long *)(bufmem + 0x1000));
    printf("  0x1f00: 0x%lx\n", *(volatile unsigned long *)(bufmem + 0x1f00));
    printf("  0x2000: 0x%lx\n", *(volatile unsigned long *)(bufmem + 0x2000));
    printf("  0x2f00: 0x%lx\n", *(volatile unsigned long *)(bufmem + 0x2f00));
    return 0;
  }
  int wstatus;
  SYSCHK(wait(&wstatus));
  return 0;
}
```

Without this series, the child will usually print values that differ by more
than 1, which is not a state that ever occurred in the parent; in my opinion,
that counts as a bug.
If you change the "#if 1" to "#if 0", the bug won't manifest.

Signed-off-by: Jann Horn
---
Jann Horn (2):
      mm/memory: ensure fork child sees coherent memory snapshot
      mm/memory: Document how we make a coherent memory snapshot

 kernel/fork.c | 34 ++++++++++++++++++++++++++++++++++
 mm/memory.c   | 18 ++++++++++++++++++
 2 files changed, 52 insertions(+)
---
base-commit: 8477ab143069c6b05d6da4a8184ded8b969240f5
change-id: 20250530-fork-tearing-71da211a50cf

-- 
Jann Horn