From: Yu Zhao
Date: Tue, 10 Sep 2024 14:17:28 -0600
Subject: Re: Hugepage program taking forever to exit
To: Jens Axboe
Cc: Linux-MM, Johannes Weiner, Andrew Morton, Muchun Song
In-Reply-To: <02ffa542-ce49-4755-9d2b-29841f9973e0@kernel.dk>

On Tue, Sep 10, 2024 at 12:21 PM Jens Axboe wrote:
>
> Hi,
>
> Investigating another issue, I wrote the following simple program that allocates
> and faults in 500 1GB huge pages, and then registers them with io_uring. Each
> step is timed:
>
> Got 500 huge pages (each 1024MB) in 0 msec
> Faulted in 500 huge pages in 38632 msec
> Registered 500 pages in 867 msec
>
> and as expected, faulting in the pages takes (by far) the longest. From
> the above, you'd also expect the total runtime to be around ~39 seconds.
> But it is not...
> In fact it takes 82 seconds in total for this program
> to have exited. Looking at why, I see:
>
> [<0>] __wait_rcu_gp+0x12b/0x160
> [<0>] synchronize_rcu_normal.part.0+0x2a/0x30
> [<0>] hugetlb_vmemmap_restore_folios+0x22/0xe0
> [<0>] update_and_free_pages_bulk+0x4c/0x220
> [<0>] return_unused_surplus_pages+0x80/0xa0
> [<0>] hugetlb_acct_memory.part.0+0x2dd/0x3b0
> [<0>] hugetlb_vm_op_close+0x160/0x180
> [<0>] remove_vma+0x20/0x60
> [<0>] exit_mmap+0x199/0x340
> [<0>] mmput+0x49/0x110
> [<0>] do_exit+0x261/0x9b0
> [<0>] do_group_exit+0x2c/0x80
> [<0>] __x64_sys_exit_group+0x14/0x20
> [<0>] x64_sys_call+0x714/0x720
> [<0>] do_syscall_64+0x5b/0x160
> [<0>] entry_SYSCALL_64_after_hwframe+0x4b/0x53
>
> and yes, it does look like the program is mostly idle for most of the
> time while returning these huge pages. It's also telling us exactly why
> we're just sitting idle - RCU grace period.
>
> The below quick change means the runtime of the program is pretty much
> just the time it takes to execute the parts of it, as you can see from
> the full output after the change:
>
> axboe@r7525 ~> time sudo ./reg-huge
> Got 500 huge pages (each 1024MB) in 0 msec
> Faulted in 500 huge pages in 38632 msec
> Registered 500 pages in 867 msec
>
> ________________________________________________________
> Executed in   39.53 secs      fish           external
>    usr time    4.88 millis  238.00 micros    4.64 millis
>    sys time    0.00 millis    0.00 micros    0.00 millis
>
> where 38632+876 == 39.51s.
>
> Looks like this was introduced by:
>
> commit bd225530a4c717714722c3731442b78954c765b3
> Author: Yu Zhao
> Date:   Thu Jun 27 16:27:05 2024 -0600

Fixes are in
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-stable

c2a967f6ab0e mm/hugetlb_vmemmap: don't synchronize_rcu() without HVO
c0f398c3b2cf mm/hugetlb_vmemmap: batch HVO work when demoting

Additional improvements from mm-stable that may or may not help your test case:

e98337d11bbd mm/contig_alloc: support __GFP_COMP
463586e9ff39 mm/cma: add cma_{alloc,free}_folio()
cf54f310d0d3 mm/hugetlb: use __GFP_COMP for gigantic folios
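
For anyone re-running the test against mm-stable, the arithmetic in the report can be sketched as below. This is only an illustration in Python and is not part of the original test program: the helper names (step_times_ms, unaccounted_ms) are hypothetical, the timing lines are copied from the quoted output, and the 82 second total is the approximate figure from the report. It sums the timed steps and treats whatever wall-clock time is left over as time spent in process exit, which is where the RCU grace period waits showed up in the stack trace.

```python
import re

# Timing lines exactly as quoted in the report.
REPORT = """\
Got 500 huge pages (each 1024MB) in 0 msec
Faulted in 500 huge pages in 38632 msec
Registered 500 pages in 867 msec
"""

# Total wall-clock time before the fix, as stated in the report (approximate).
TOTAL_WALL_MS = 82 * 1000


def step_times_ms(report: str) -> list[int]:
    """Return the per-step durations (msec) found at the end of each line."""
    pattern = re.compile(r"in (\d+) msec$", re.MULTILINE)
    return [int(m.group(1)) for m in pattern.finditer(report)]


def unaccounted_ms(report: str, total_wall_ms: int) -> int:
    """Wall-clock time not covered by any timed step (here: process exit)."""
    return total_wall_ms - sum(step_times_ms(report))


if __name__ == "__main__":
    steps = step_times_ms(REPORT)
    print("timed steps:", sum(steps), "msec")
    print("unaccounted:", unaccounted_ms(REPORT, TOTAL_WALL_MS), "msec")
```

With the fixes applied, the leftover should shrink to roughly the difference between the "Executed in" figure and the sum of the steps, assuming nothing else in the exit path dominates.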