* [PATCH] mm: vmscan: always allow writeback during memcg reclaim
@ 2025-12-13 8:36 Deepanshu Kartikey
2025-12-14 23:49 ` Andrew Morton
2025-12-15 4:12 ` Johannes Weiner
0 siblings, 2 replies; 20+ messages in thread
From: Deepanshu Kartikey @ 2025-12-13 8:36 UTC (permalink / raw)
To: akpm, axelrasmussen, yuanchu, weixugc, hannes, david, mhocko,
zhengqi.arch, shakeel.butt, lorenzo.stoakes, yuzhao, heftig,
oleksandr, bgeffon
Cc: linux-mm, linux-kernel, Deepanshu Kartikey, syzbot+90fcab4d88cffed6d0d8
When laptop_mode is enabled, may_writepage is set to 0 in
try_to_free_mem_cgroup_pages(). This triggers a warning in MGLRU's
lru_gen_shrink_lruvec():
VM_WARN_ON_ONCE(!sc->may_writepage || !sc->may_unmap);
The warning occurs because MGLRU expects full reclaim capabilities to
function correctly. The call path is:
mem_cgroup_resize_max()
try_to_free_mem_cgroup_pages()
do_try_to_free_pages()
shrink_node()
shrink_lruvec()
lru_gen_shrink_lruvec() <-- WARNING
Unlike kswapd or direct reclaim where laptop_mode's disk-saving behavior
is a reasonable optimization, memcg limit enforcement is a hard
requirement - memory MUST be freed when a cgroup exceeds its limit.
The may_unmap field is already set unconditionally to 1 in this path,
acknowledging that memcg reclaim needs full capabilities.
Set may_writepage unconditionally to 1 for memcg reclaim to ensure
MGLRU works correctly and memory limits are properly enforced.
Fixes: bd74fdaea146 ("mm: multi-gen LRU: support page table walks")
Reported-by: syzbot+90fcab4d88cffed6d0d8@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=90fcab4d88cffed6d0d8
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
Note: Only compile-tested. No reproducer available from syzbot.
---
mm/vmscan.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 900c74b6aa62..5e1c99d9cbd7 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -6669,7 +6669,7 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *memcg,
.reclaim_idx = MAX_NR_ZONES - 1,
.target_mem_cgroup = memcg,
.priority = DEF_PRIORITY,
- .may_writepage = !laptop_mode,
+ .may_writepage = 1,
.may_unmap = 1,
.may_swap = !!(reclaim_options & MEMCG_RECLAIM_MAY_SWAP),
.proactive = !!(reclaim_options & MEMCG_RECLAIM_PROACTIVE),
--
2.43.0
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] mm: vmscan: always allow writeback during memcg reclaim
2025-12-13 8:36 [PATCH] mm: vmscan: always allow writeback during memcg reclaim Deepanshu Kartikey
@ 2025-12-14 23:49 ` Andrew Morton
2025-12-15 4:12 ` Johannes Weiner
1 sibling, 0 replies; 20+ messages in thread
From: Andrew Morton @ 2025-12-14 23:49 UTC (permalink / raw)
To: Deepanshu Kartikey
Cc: axelrasmussen, yuanchu, weixugc, hannes, david, mhocko,
zhengqi.arch, shakeel.butt, lorenzo.stoakes, yuzhao, heftig,
oleksandr, bgeffon, linux-mm, linux-kernel,
syzbot+90fcab4d88cffed6d0d8

On Sat, 13 Dec 2025 14:06:39 +0530 Deepanshu Kartikey <kartikey406@gmail.com> wrote:

> When laptop_mode is enabled, may_writepage is set to 0 in
> try_to_free_mem_cgroup_pages(). This triggers a warning in MGLRU's
> lru_gen_shrink_lruvec():
>
> VM_WARN_ON_ONCE(!sc->may_writepage || !sc->may_unmap);
>
> The warning occurs because MGLRU expects full reclaim capabilities to
> function correctly. The call path is:
>
> mem_cgroup_resize_max()
> try_to_free_mem_cgroup_pages()
> do_try_to_free_pages()
> shrink_node()
> shrink_lruvec()
> lru_gen_shrink_lruvec() <-- WARNING
>
> Unlike kswapd or direct reclaim where laptop_mode's disk-saving behavior
> is a reasonable optimization, memcg limit enforcement is a hard
> requirement - memory MUST be freed when a cgroup exceeds its limit.
> The may_unmap field is already set unconditionally to 1 in this path,
> acknowledging that memcg reclaim needs full capabilities.
>
> Set may_writepage unconditionally to 1 for memcg reclaim to ensure
> MGLRU works correctly and memory limits are properly enforced.
>

Thanks, I'll add this to mm.git's mm-new branch for testing. I expect
a few days after that I'll quietly move it into the mm-unstable branch
where it will receive linux-next exposure. Further progress into
mm.git's non-rebasing for-next-merge-window mm-stable branch will
depend upon review outcomes.
^ permalink raw reply [flat|nested] 20+ messages in thread
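The condition described in the patch can be modeled in userspace. What follows is a minimal sketch, not kernel code: struct scan_control_model and the three functions are hypothetical names made up for illustration, and only the `!laptop_mode` initializer and the warning expression are taken from the patch description.

```c
#include <assert.h>	/* only needed for checks like the ones in the note below */
#include <stdbool.h>

/* Userspace model only: a two-field stand-in for the kernel's struct scan_control. */
struct scan_control_model {
	bool may_writepage;
	bool may_unmap;
};

/* Mirrors the pre-patch initializer quoted in the diff for
 * try_to_free_mem_cgroup_pages(): writeback is allowed only when
 * laptop_mode is zero, unmapping is always allowed. */
struct scan_control_model memcg_reclaim_sc(int laptop_mode)
{
	struct scan_control_model sc = {
		.may_writepage = !laptop_mode,
		.may_unmap = true,
	};
	return sc;
}

/* Same expression as the VM_WARN_ON_ONCE() quoted from lru_gen_shrink_lruvec(). */
bool mglru_would_warn(const struct scan_control_model *sc)
{
	return !sc->may_writepage || !sc->may_unmap;
}

/* Convenience wrapper: does memcg limit reclaim trip the check? */
bool memcg_reclaim_warns(int laptop_mode)
{
	struct scan_control_model sc = memcg_reclaim_sc(laptop_mode);

	return mglru_would_warn(&sc);
}
```

With this model, memcg_reclaim_warns(0) is false and any nonzero laptop_mode value makes it true, which matches the report: the warning depends only on the sysctl being set, not on the state of the cgroup.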
* Re: [PATCH] mm: vmscan: always allow writeback during memcg reclaim
2025-12-13 8:36 [PATCH] mm: vmscan: always allow writeback during memcg reclaim Deepanshu Kartikey
2025-12-14 23:49 ` Andrew Morton
@ 2025-12-15 4:12 ` Johannes Weiner
2025-12-15 4:51 ` Deepanshu Kartikey
` (2 more replies)
1 sibling, 3 replies; 20+ messages in thread
From: Johannes Weiner @ 2025-12-15 4:12 UTC (permalink / raw)
To: Deepanshu Kartikey
Cc: akpm, axelrasmussen, yuanchu, weixugc, david, mhocko,
zhengqi.arch, shakeel.butt, lorenzo.stoakes, yuzhao, heftig,
oleksandr, bgeffon, linux-mm, linux-kernel,
syzbot+90fcab4d88cffed6d0d8

On Sat, Dec 13, 2025 at 02:06:39PM +0530, Deepanshu Kartikey wrote:
> When laptop_mode is enabled, may_writepage is set to 0 in
> try_to_free_mem_cgroup_pages(). This triggers a warning in MGLRU's
> lru_gen_shrink_lruvec():
>
> VM_WARN_ON_ONCE(!sc->may_writepage || !sc->may_unmap);
>
> The warning occurs because MGLRU expects full reclaim capabilities to
> function correctly. The call path is:
>
> mem_cgroup_resize_max()
> try_to_free_mem_cgroup_pages()
> do_try_to_free_pages()
> shrink_node()
> shrink_lruvec()
> lru_gen_shrink_lruvec() <-- WARNING
>
> Unlike kswapd or direct reclaim where laptop_mode's disk-saving behavior
> is a reasonable optimization, memcg limit enforcement is a hard
> requirement - memory MUST be freed when a cgroup exceeds its limit.

That reasoning doesn't make sense to me. Reclaim is always in response
to an allocation need. The laptop_mode idea applies to cgroup reclaim
as much as any other reclaim.

Now obviously all of this is pretty dated. Reclaim doesn't do
filesystem writes anymore, and I'm not sure there are a whole lot of
laptops with rotational drives left, either. Also I doubt anybody is
still using zone_reclaim_mode (which is where the may_unmap is from).

But let's not introduce more inconsistencies, please.

The only thing weird here is the MGLRU warning. What is it trying to assert?
Clearly whatever assumption was made here has never been true. And
what is the zone_reclaim_mode (may_unmap) assert doing in the cgroup
limit reclaim path? It seems to me both the warning in cgroup reclaim,
and the goto done in root reclaim, are kind of unnecessary and
gratuitously breaking both laptop_mode and zone_reclaim_mode -
obsolete as they may be. But why even add this code? Can somebody with
MGLRU context please take a look whether we can remove these?

> Set may_writepage unconditionally to 1 for memcg reclaim to ensure
> MGLRU works correctly and memory limits are properly enforced.
>
> Fixes: bd74fdaea146 ("mm: multi-gen LRU: support page table walks")

That seems unrelated?

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] mm: vmscan: always allow writeback during memcg reclaim
2025-12-15 4:12 ` Johannes Weiner
@ 2025-12-15 4:51 ` Deepanshu Kartikey
2025-12-15 19:42 ` Yuanchu Xie
2025-12-15 6:59 ` retiring laptop_mode? was " Christoph Hellwig
2025-12-15 17:49 ` Michal Hocko
2 siblings, 1 reply; 20+ messages in thread
From: Deepanshu Kartikey @ 2025-12-15 4:51 UTC (permalink / raw)
To: Johannes Weiner
Cc: akpm, axelrasmussen, yuanchu, weixugc, david, mhocko,
zhengqi.arch, shakeel.butt, lorenzo.stoakes, yuzhao, heftig,
oleksandr, bgeffon, linux-mm, linux-kernel,
syzbot+90fcab4d88cffed6d0d8

On Mon, Dec 15, 2025 at 9:42 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
> > Fixes: bd74fdaea146 ("mm: multi-gen LRU: support page table walks")
>
> That seems unrelated?

Sorry for the wrong fixes. Correct Fixes: ee814fe23daf ("mm: vmscan:
clean up struct scan_control")

I'll wait for input from someone with MGLRU context on the broader discussion.

Thanks
Deepanshu

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] mm: vmscan: always allow writeback during memcg reclaim
2025-12-15 4:51 ` Deepanshu Kartikey
@ 2025-12-15 19:42 ` Yuanchu Xie
2025-12-15 20:22 ` Johannes Weiner
2025-12-19 5:13 ` Kairui Song
0 siblings, 2 replies; 20+ messages in thread
From: Yuanchu Xie @ 2025-12-15 19:42 UTC (permalink / raw)
To: Deepanshu Kartikey
Cc: Johannes Weiner, akpm, axelrasmussen, weixugc, david, mhocko,
zhengqi.arch, shakeel.butt, lorenzo.stoakes, yuzhao, heftig,
oleksandr, bgeffon, linux-mm, linux-kernel,
syzbot+90fcab4d88cffed6d0d8

On Sun, Dec 14, 2025 at 10:52 PM Deepanshu Kartikey <kartikey406@gmail.com> wrote:
>
> On Mon, Dec 15, 2025 at 9:42 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
> > > Fixes: bd74fdaea146 ("mm: multi-gen LRU: support page table walks")
> >
> > That seems unrelated?
>
> Sorry for the wrong fixes. Correct Fixes: ee814fe23daf ("mm: vmscan:
> clean up struct scan_control")
>
> I'll wait for input from someone with MGLRU context on the broader discussion.
>

This warning came from commit e9d4e1ee7880 ("mm: multi-gen LRU:
clarify scan_control flags") [1].

The original rationale:
> 4. sc->may_writepage and sc->may_unmap, which indicates opportunistic
> reclaim, are rejected, since unmapped clean folios are already
> prioritized. Scanning for more of them is likely futile and can
> cause high reclaim latency when there is a large number of memcgs.

As far as I can tell this was a sanity check to ensure
`lru_gen_shrink_lruvec` avoids extra work for minimal gain. Perhaps
this shouldn't be a warning? Always setting may_writepage in this case
would free more folios. I'm not against removing the warning either.

@Wei Xu @Axel Rasmussen Any opinions?

[1] https://lore.kernel.org/all/20221222041905.2431096-8-yuzhao@google.com/T/#u

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] mm: vmscan: always allow writeback during memcg reclaim
2025-12-15 19:42 ` Yuanchu Xie
@ 2025-12-15 20:22 ` Johannes Weiner
2025-12-19 5:13 ` Kairui Song
1 sibling, 0 replies; 20+ messages in thread
From: Johannes Weiner @ 2025-12-15 20:22 UTC (permalink / raw)
To: Yuanchu Xie
Cc: Deepanshu Kartikey, akpm, axelrasmussen, weixugc, david, mhocko,
zhengqi.arch, shakeel.butt, lorenzo.stoakes, yuzhao, heftig,
oleksandr, bgeffon, linux-mm, linux-kernel,
syzbot+90fcab4d88cffed6d0d8

On Mon, Dec 15, 2025 at 01:42:27PM -0600, Yuanchu Xie wrote:
> On Sun, Dec 14, 2025 at 10:52 PM Deepanshu Kartikey
> <kartikey406@gmail.com> wrote:
> >
> > On Mon, Dec 15, 2025 at 9:42 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
> > > > Fixes: bd74fdaea146 ("mm: multi-gen LRU: support page table walks")
> > >
> > > That seems unrelated?
> >
> > Sorry for the wrong fixes. Correct Fixes: ee814fe23daf ("mm: vmscan:
> > clean up struct scan_control")
> >
> > I'll wait for input from someone with MGLRU context on the broader discussion.
> >
> This warning came from commit e9d4e1ee7880 ("mm: multi-gen LRU:
> clarify scan_control flags") [1].
>
> The original rationale:
> > 4. sc->may_writepage and sc->may_unmap, which indicates opportunistic
> > reclaim, are rejected, since unmapped clean folios are already
> > prioritized. Scanning for more of them is likely futile and can
> > cause high reclaim latency when there is a large number of memcgs.
>
> As far as I can tell this was a sanity check to ensure
> `lru_gen_shrink_lruvec` avoids extra work for minimal gain. Perhaps
> this shouldn't be a warning? Always setting may_writepage in this case
> would free more folios. I'm not against removing the warning either.

The premise doesn't seem correct.
Aside from laptop_mode, they're used in those scenarios:

- zone_reclaim_mode: local node is full and user would prefer clean
  and/or unmapped pages over spilling to remote nodes

- watermark_boost: the page allocator finds itself in a situation
  where it needs to fragment pageblocks, and it calls for additional
  reclaim to get out of that situation

Neither of them are opportunistic. It's user-requested behavior.

^ permalink raw reply [flat|nested] 20+ messages in thread
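The two scenarios Johannes lists can be sketched as a simplified userspace model. Several things here are assumptions: the RECLAIM_* bit values follow the documented vm.zone_reclaim_mode sysctl, the two initializers are written from memory of mm/vmscan.c and may not match the current tree exactly, and struct reclaim_flags, node_reclaim_flags() and kswapd_flags() are hypothetical names used only for illustration.

```c
#include <assert.h>	/* only needed for checks like the ones in the note below */
#include <stdbool.h>

/* Bit meanings as documented for the vm.zone_reclaim_mode sysctl. */
#define RECLAIM_ZONE  (1 << 0)	/* node reclaim enabled */
#define RECLAIM_WRITE (1 << 1)	/* node reclaim may write out dirty pages */
#define RECLAIM_UNMAP (1 << 2)	/* node reclaim may unmap/swap pages */

/* The two scan_control toggles under discussion in this thread. */
struct reclaim_flags {
	bool may_writepage;
	bool may_unmap;
};

/* Node (zone) reclaim: both toggles come straight from the sysctl bits,
 * so clearing them is something the administrator asked for. */
struct reclaim_flags node_reclaim_flags(int node_reclaim_mode)
{
	struct reclaim_flags f = {
		.may_writepage = !!(node_reclaim_mode & RECLAIM_WRITE),
		.may_unmap = !!(node_reclaim_mode & RECLAIM_UNMAP),
	};
	return f;
}

/* kswapd: writeback is withheld under laptop_mode and also while the
 * reclaim is only servicing a watermark boost. */
struct reclaim_flags kswapd_flags(int laptop_mode, unsigned long nr_boost_reclaim)
{
	struct reclaim_flags f = {
		.may_writepage = !laptop_mode && !nr_boost_reclaim,
		.may_unmap = true,
	};
	return f;
}
```

In this model, node_reclaim_flags(RECLAIM_ZONE) clears both toggles and kswapd_flags(0, 16) clears may_writepage with laptop_mode off, so a check of the form `!may_writepage || !may_unmap` would fire in both scenarios without laptop_mode being involved.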
* Re: [PATCH] mm: vmscan: always allow writeback during memcg reclaim
2025-12-15 19:42 ` Yuanchu Xie
2025-12-15 20:22 ` Johannes Weiner
@ 2025-12-19 5:13 ` Kairui Song
1 sibling, 0 replies; 20+ messages in thread
From: Kairui Song @ 2025-12-19 5:13 UTC (permalink / raw)
To: Yuanchu Xie
Cc: Deepanshu Kartikey, Johannes Weiner, akpm, axelrasmussen, weixugc,
david, mhocko, zhengqi.arch, shakeel.butt, lorenzo.stoakes, yuzhao,
heftig, oleksandr, bgeffon, linux-mm, linux-kernel,
syzbot+90fcab4d88cffed6d0d8

On Tue, Dec 16, 2025 at 3:52 AM Yuanchu Xie <yuanchu@google.com> wrote:
>
> On Sun, Dec 14, 2025 at 10:52 PM Deepanshu Kartikey
> <kartikey406@gmail.com> wrote:
> >
> > On Mon, Dec 15, 2025 at 9:42 AM Johannes Weiner <hannes@cmpxchg.org> wrote:
> > > > Fixes: bd74fdaea146 ("mm: multi-gen LRU: support page table walks")
> > >
> > > That seems unrelated?
> >
> > Sorry for the wrong fixes. Correct Fixes: ee814fe23daf ("mm: vmscan:
> > clean up struct scan_control")
> >
> > I'll wait for input from someone with MGLRU context on the broader discussion.
> >
> This warning came from commit e9d4e1ee7880 ("mm: multi-gen LRU:
> clarify scan_control flags") [1].
>
> The original rationale:
> > 4. sc->may_writepage and sc->may_unmap, which indicates opportunistic
> > reclaim, are rejected, since unmapped clean folios are already
> > prioritized. Scanning for more of them is likely futile and can
> > cause high reclaim latency when there is a large number of memcgs.
>
> As far as I can tell this was a sanity check to ensure
> `lru_gen_shrink_lruvec` avoids extra work for minimal gain. Perhaps
> this shouldn't be a warning? Always setting may_writepage in this case
> would free more folios. I'm not against removing the warning either.
>
> @Wei Xu @Axel Rasmussen Any opinions?
>
> [1] https://lore.kernel.org/all/20221222041905.2431096-8-yuzhao@google.com/T/#u
>

Hi All,

We are also hitting this warning in our test environment.
Simply removing that WARN seems OK for us, shrink_folio_list will
bounce these folios back and everything behaves just fine. Meanwhile,
is it a good idea to add back the !sc->may_unmap check in
isolate_folio or improve isolate_folios accordingly? That might help
reduce the overhead in the worst scenarios.

^ permalink raw reply [flat|nested] 20+ messages in thread
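Kairui's suggestion of filtering at isolation time can be illustrated with a purely hypothetical sketch. It is not taken from any posted patch: isolate_allowed() and its two boolean parameters are invented stand-ins for a folio's mapped state and sc->may_unmap, and a real isolate_folio() considers far more than this.

```c
#include <assert.h>	/* only needed for checks like the ones in the note below */
#include <stdbool.h>

/*
 * Hypothetical userspace model of an early "unmapping inhibited" filter.
 * folio_is_mapped stands in for the folio's mapped state and may_unmap
 * for sc->may_unmap. Returning false means the folio would be skipped
 * during isolation instead of being isolated and later bounced back.
 */
bool isolate_allowed(bool folio_is_mapped, bool may_unmap)
{
	/* Reclaim may not unmap, so a mapped folio cannot be freed anyway. */
	if (!may_unmap && folio_is_mapped)
		return false;

	return true;
}
```

Only the combination of a mapped folio and may_unmap == 0 is rejected, so isolate_allowed(true, false) is false and every other input is true. The intended saving is the isolate-and-putback round trip for folios that reclaim could not free in any case.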
* retiring laptop_mode? was Re: [PATCH] mm: vmscan: always allow writeback during memcg reclaim
2025-12-15 4:12 ` Johannes Weiner
2025-12-15 4:51 ` Deepanshu Kartikey
@ 2025-12-15 6:59 ` Christoph Hellwig
2025-12-15 16:33 ` Jens Axboe
2025-12-15 20:08 ` Johannes Weiner
2025-12-15 17:49 ` Michal Hocko
2 siblings, 2 replies; 20+ messages in thread
From: Christoph Hellwig @ 2025-12-15 6:59 UTC (permalink / raw)
To: Johannes Weiner, Jens Axboe
Cc: Deepanshu Kartikey, akpm, linux-mm, linux-kernel, linux-block

On Sun, Dec 14, 2025 at 11:12:00PM -0500, Johannes Weiner wrote:
> That reasoning doesn't make sense to me. Reclaim is always in response
> to an allocation need. The laptop_mode idea applies to cgroup reclaim
> as much as any other reclaim.
>
> Now obviously all of this is pretty dated. Reclaim doesn't do
> filesystem writes anymore, and I'm not sure there are a whole lot of
> laptops with rotational drives left, either. Also I doubt anybody is
> still using zone_reclaim_mode (which is where the may_unmap is from).

Yeah. I wonder if we should retire laptop_mode. It was a cute hack
back then, but it has its ugly fingers in way too many places and
should be mostly obsolete by how writeback works these days.

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: retiring laptop_mode? was Re: [PATCH] mm: vmscan: always allow writeback during memcg reclaim
2025-12-15 6:59 ` retiring laptop_mode? was " Christoph Hellwig
@ 2025-12-15 16:33 ` Jens Axboe
2025-12-15 20:08 ` Johannes Weiner
1 sibling, 0 replies; 20+ messages in thread
From: Jens Axboe @ 2025-12-15 16:33 UTC (permalink / raw)
To: Christoph Hellwig, Johannes Weiner
Cc: Deepanshu Kartikey, akpm, linux-mm, linux-kernel, linux-block

On 12/14/25 11:59 PM, Christoph Hellwig wrote:
> On Sun, Dec 14, 2025 at 11:12:00PM -0500, Johannes Weiner wrote:
>> That reasoning doesn't make sense to me. Reclaim is always in response
>> to an allocation need. The laptop_mode idea applies to cgroup reclaim
>> as much as any other reclaim.
>>
>> Now obviously all of this is pretty dated. Reclaim doesn't do
>> filesystem writes anymore, and I'm not sure there are a whole lot of
>> laptops with rotational drives left, either. Also I doubt anybody is
>> still using zone_reclaim_mode (which is where the may_unmap is from).
>
> Yeah. I wonder if we should retire laptop_mode. It was a cute hack
> back then, but it has its ugly fingers in way too many places and
> should be mostly obsolete by how writeback works these days.

I'd be all for that.

--
Jens Axboe

^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: retiring laptop_mode? was Re: [PATCH] mm: vmscan: always allow writeback during memcg reclaim
2025-12-15 6:59 ` retiring laptop_mode? was " Christoph Hellwig
2025-12-15 16:33 ` Jens Axboe
@ 2025-12-15 20:08 ` Johannes Weiner
2025-12-16 2:23 ` Jens Axboe
2025-12-16 7:41 ` Christoph Hellwig
1 sibling, 2 replies; 20+ messages in thread
From: Johannes Weiner @ 2025-12-15 20:08 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jens Axboe, Deepanshu Kartikey, akpm, linux-mm, linux-kernel, linux-block

On Sun, Dec 14, 2025 at 10:59:11PM -0800, Christoph Hellwig wrote:
> On Sun, Dec 14, 2025 at 11:12:00PM -0500, Johannes Weiner wrote:
> > That reasoning doesn't make sense to me. Reclaim is always in response
> > to an allocation need. The laptop_mode idea applies to cgroup reclaim
> > as much as any other reclaim.
> >
> > Now obviously all of this is pretty dated. Reclaim doesn't do
> > filesystem writes anymore, and I'm not sure there are a whole lot of
> > laptops with rotational drives left, either. Also I doubt anybody is
> > still using zone_reclaim_mode (which is where the may_unmap is from).
>
> Yeah. I wonder if we should retire laptop_mode. It was a cute hack
> back then, but it has its ugly fingers in way too many places and
> should be mostly obsolete by how writeback works these days.

Yes, that makes sense to me. How about the below?

It doesn't actually get rid of the reclaim toggles - I added comments
for the other usecases. But it's a nice diffstat nonetheless.

Debated whether to add some sort of deprecation sysctl handler, but at
least systemd-sysctl just prints a warning and still applies other
settings from the same config file.
--- From 868f67e9d0d4465a6c22d8a147084944e7569c8d Mon Sep 17 00:00:00 2001 From: Johannes Weiner <hannes@cmpxchg.org> Date: Mon, 15 Dec 2025 12:57:53 -0500 Subject: [PATCH] mm/block/fs: remove laptop_mode Laptop mode was introduced to save battery, by delaying and consolidating writes and maximize the time rotating hard drives wouldn't have to spin. Needless to say, this is a scenario of the (in)glorious past. The footprint of the feature is small, but nevertheless it's a complicating factor in mm, block, filesystems. Developers don't think about it, and the decision-making in reclaim looks dubious. It likely hasn't been tested in years while the surrounding code has evolved. Suggested-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> --- .../admin-guide/laptops/laptop-mode.rst | 770 ------------------ Documentation/admin-guide/sysctl/vm.rst | 8 - block/blk-mq.c | 3 - fs/ext4/inode.c | 3 +- fs/sync.c | 2 - fs/xfs/xfs_super.c | 9 - include/linux/backing-dev-defs.h | 3 - include/linux/writeback.h | 4 - include/trace/events/writeback.h | 1 - include/uapi/linux/sysctl.h | 2 +- mm/backing-dev.c | 3 - mm/page-writeback.c | 66 +- mm/vmscan.c | 30 +- 13 files changed, 11 insertions(+), 893 deletions(-) delete mode 100644 Documentation/admin-guide/laptops/laptop-mode.rst diff --git a/Documentation/admin-guide/laptops/laptop-mode.rst b/Documentation/admin-guide/laptops/laptop-mode.rst deleted file mode 100644 index 66eb9cd918b5..000000000000 --- a/Documentation/admin-guide/laptops/laptop-mode.rst +++ /dev/null @@ -1,770 +0,0 @@ -=============================================== -How to conserve battery power using laptop-mode -=============================================== - -Document Author: Bart Samwel (bart@samwel.tk) - -Date created: January 2, 2004 - -Last modified: December 06, 2004 - -Introduction ------------- - -Laptop mode is used to minimize the time that the hard disk needs to be spun up, -to conserve battery power on 
laptops. It has been reported to cause significant -power savings. - -.. Contents - - * Introduction - * Installation - * Caveats - * The Details - * Tips & Tricks - * Control script - * ACPI integration - * Monitoring tool - - -Installation ------------- - -To use laptop mode, you don't need to set any kernel configuration options -or anything. Simply install all the files included in this document, and -laptop mode will automatically be started when you're on battery. For -your convenience, a tarball containing an installer can be downloaded at: - - http://www.samwel.tk/laptop_mode/laptop_mode/ - -To configure laptop mode, you need to edit the configuration file, which is -located in /etc/default/laptop-mode on Debian-based systems, or in -/etc/sysconfig/laptop-mode on other systems. - -Unfortunately, automatic enabling of laptop mode does not work for -laptops that don't have ACPI. On those laptops, you need to start laptop -mode manually. To start laptop mode, run "laptop_mode start", and to -stop it, run "laptop_mode stop". (Note: The laptop mode tools package now -has experimental support for APM, you might want to try that first.) - - -Caveats -------- - -* The downside of laptop mode is that you have a chance of losing up to 10 - minutes of work. If you cannot afford this, don't use it! The supplied ACPI - scripts automatically turn off laptop mode when the battery almost runs out, - so that you won't lose any data at the end of your battery life. - -* Most desktop hard drives have a very limited lifetime measured in spindown - cycles, typically about 50.000 times (it's usually listed on the spec sheet). - Check your drive's rating, and don't wear down your drive's lifetime if you - don't need to. - -* If you mount some of your ext3 filesystems with the -n option, then - the control script will not be able to remount them correctly. 
You must set - DO_REMOUNTS=0 in the control script, otherwise it will remount them with the - wrong options -- or it will fail because it cannot write to /etc/mtab. - -* If you have your filesystems listed as type "auto" in fstab, like I did, then - the control script will not recognize them as filesystems that need remounting. - You must list the filesystems with their true type instead. - -* It has been reported that some versions of the mutt mail client use file access - times to determine whether a folder contains new mail. If you use mutt and - experience this, you must disable the noatime remounting by setting the option - DO_REMOUNT_NOATIME to 0 in the configuration file. - - -The Details ------------ - -Laptop mode is controlled by the knob /proc/sys/vm/laptop_mode. This knob is -present for all kernels that have the laptop mode patch, regardless of any -configuration options. When the knob is set, any physical disk I/O (that might -have caused the hard disk to spin up) causes Linux to flush all dirty blocks. The -result of this is that after a disk has spun down, it will not be spun up -anymore to write dirty blocks, because those blocks had already been written -immediately after the most recent read operation. The value of the laptop_mode -knob determines the time between the occurrence of disk I/O and when the flush -is triggered. A sensible value for the knob is 5 seconds. Setting the knob to -0 disables laptop mode. - -To increase the effectiveness of the laptop_mode strategy, the laptop_mode -control script increases dirty_expire_centisecs and dirty_writeback_centisecs in -/proc/sys/vm to about 10 minutes (by default), which means that pages that are -dirtied are not forced to be written to disk as often. The control script also -changes the dirty background ratio, so that background writeback of dirty pages -is not done anymore. 
Combined with a higher commit value (also 10 minutes) for -ext3 filesystem (also done automatically by the control script), -this results in concentration of disk activity in a small time interval which -occurs only once every 10 minutes, or whenever the disk is forced to spin up by -a cache miss. The disk can then be spun down in the periods of inactivity. - - -Configuration -------------- - -The laptop mode configuration file is located in /etc/default/laptop-mode on -Debian-based systems, or in /etc/sysconfig/laptop-mode on other systems. It -contains the following options: - -MAX_AGE: - -Maximum time, in seconds, of hard drive spindown time that you are -comfortable with. Worst case, it's possible that you could lose this -amount of work if your battery fails while you're in laptop mode. - -MINIMUM_BATTERY_MINUTES: - -Automatically disable laptop mode if the remaining number of minutes of -battery power is less than this value. Default is 10 minutes. - -AC_HD/BATT_HD: - -The idle timeout that should be set on your hard drive when laptop mode -is active (BATT_HD) and when it is not active (AC_HD). The defaults are -20 seconds (value 4) for BATT_HD and 2 hours (value 244) for AC_HD. The -possible values are those listed in the manual page for "hdparm" for the -"-S" option. - -HD: - -The devices for which the spindown timeout should be adjusted by laptop mode. -Default is /dev/hda. If you specify multiple devices, separate them by a space. - -READAHEAD: - -Disk readahead, in 512-byte sectors, while laptop mode is active. A large -readahead can prevent disk accesses for things like executable pages (which are -loaded on demand while the application executes) and sequentially accessed data -(MP3s). - -DO_REMOUNTS: - -The control script automatically remounts any mounted journaled filesystems -with appropriate commit interval options. When this option is set to 0, this -feature is disabled. 
- -DO_REMOUNT_NOATIME: - -When remounting, should the filesystems be remounted with the noatime option? -Normally, this is set to "1" (enabled), but there may be programs that require -access time recording. - -DIRTY_RATIO: - -The percentage of memory that is allowed to contain "dirty" or unsaved data -before a writeback is forced, while laptop mode is active. Corresponds to -the /proc/sys/vm/dirty_ratio sysctl. - -DIRTY_BACKGROUND_RATIO: - -The percentage of memory that is allowed to contain "dirty" or unsaved data -after a forced writeback is done due to an exceeding of DIRTY_RATIO. Set -this nice and low. This corresponds to the /proc/sys/vm/dirty_background_ratio -sysctl. - -Note that the behaviour of dirty_background_ratio is quite different -when laptop mode is active and when it isn't. When laptop mode is inactive, -dirty_background_ratio is the threshold percentage at which background writeouts -start taking place. When laptop mode is active, however, background writeouts -are disabled, and the dirty_background_ratio only determines how much writeback -is done when dirty_ratio is reached. - -DO_CPU: - -Enable CPU frequency scaling when in laptop mode. (Requires CPUFreq to be setup. -See Documentation/admin-guide/pm/cpufreq.rst for more info. Disabled by default.) - -CPU_MAXFREQ: - -When on battery, what is the maximum CPU speed that the system should use? Legal -values are "slowest" for the slowest speed that your CPU is able to operate at, -or a value listed in /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies. - - -Tips & Tricks -------------- - -* Bartek Kania reports getting up to 50 minutes of extra battery life (on top - of his regular 3 to 3.5 hours) using a spindown time of 5 seconds (BATT_HD=1). - -* You can spin down the disk while playing MP3, by setting disk readahead - to 8MB (READAHEAD=16384). Effectively, the disk will read a complete MP3 at - once, and will then spin down while the MP3 is playing. (Thanks to Bartek - Kania.) 
- -* Drew Scott Daniels observed: "I don't know why, but when I decrease the number - of colours that my display uses it consumes less battery power. I've seen - this on powerbooks too. I hope that this is a piece of information that - might be useful to the Laptop Mode patch or its users." - -* In syslog.conf, you can prefix entries with a dash `-` to omit syncing the - file after every logging. When you're using laptop-mode and your disk doesn't - spin down, this is a likely culprit. - -* Richard Atterer observed that laptop mode does not work well with noflushd - (http://noflushd.sourceforge.net/), it seems that noflushd prevents laptop-mode - from doing its thing. - -* If you're worried about your data, you might want to consider using a USB - memory stick or something like that as a "working area". (Be aware though - that flash memory can only handle a limited number of writes, and overuse - may wear out your memory stick pretty quickly. Do _not_ use journalling - filesystems on flash memory sticks.) - - -Configuration file for control and ACPI battery scripts -------------------------------------------------------- - -This allows the tunables to be changed for the scripts via an external -configuration file - -It should be installed as /etc/default/laptop-mode on Debian, and as -/etc/sysconfig/laptop-mode on Red Hat, SUSE, Mandrake, and other work-alikes. - -Config file:: - - # Maximum time, in seconds, of hard drive spindown time that you are - # comfortable with. Worst case, it's possible that you could lose this - # amount of work if your battery fails you while in laptop mode. - #MAX_AGE=600 - - # Automatically disable laptop mode when the number of minutes of battery - # that you have left goes below this threshold. - MINIMUM_BATTERY_MINUTES=10 - - # Read-ahead, in 512-byte sectors. You can spin down the disk while playing MP3/OGG - # by setting the disk readahead to 8MB (READAHEAD=16384). 
Effectively, the disk - # will read a complete MP3 at once, and will then spin down while the MP3/OGG is - # playing. - #READAHEAD=4096 - - # Shall we remount journaled fs. with appropriate commit interval? (1=yes) - #DO_REMOUNTS=1 - - # And shall we add the "noatime" option to that as well? (1=yes) - #DO_REMOUNT_NOATIME=1 - - # Dirty synchronous ratio. At this percentage of dirty pages the process - # which - # calls write() does its own writeback - #DIRTY_RATIO=40 - - # - # Allowed dirty background ratio, in percent. Once DIRTY_RATIO has been - # exceeded, the kernel will wake flusher threads which will then reduce the - # amount of dirty memory to dirty_background_ratio. Set this nice and low, - # so once some writeout has commenced, we do a lot of it. - # - #DIRTY_BACKGROUND_RATIO=5 - - # kernel default dirty buffer age - #DEF_AGE=30 - #DEF_UPDATE=5 - #DEF_DIRTY_BACKGROUND_RATIO=10 - #DEF_DIRTY_RATIO=40 - #DEF_XFS_AGE_BUFFER=15 - #DEF_XFS_SYNC_INTERVAL=30 - #DEF_XFS_BUFD_INTERVAL=1 - - # This must be adjusted manually to the value of HZ in the running kernel - # on 2.4, until the XFS people change their 2.4 external interfaces to work in - # centisecs. This can be automated, but it's a work in progress that still - # needs# some fixes. On 2.6 kernels, XFS uses USER_HZ instead of HZ for - # external interfaces, and that is currently always set to 100. So you don't - # need to change this on 2.6. - #XFS_HZ=100 - - # Should the maximum CPU frequency be adjusted down while on battery? - # Requires CPUFreq to be setup. - # See Documentation/admin-guide/pm/cpufreq.rst for more info - #DO_CPU=0 - - # When on battery what is the maximum CPU speed that the system should - # use? Legal values are "slowest" for the slowest speed that your - # CPU is able to operate at, or a value listed in: - # /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies - # Only applicable if DO_CPU=1. 
- #CPU_MAXFREQ=slowest - - # Idle timeout for your hard drive (man hdparm for valid values, -S option) - # Default is 2 hours on AC (AC_HD=244) and 20 seconds for battery (BATT_HD=4). - #AC_HD=244 - #BATT_HD=4 - - # The drives for which to adjust the idle timeout. Separate them by a space, - # e.g. HD="/dev/hda /dev/hdb". - #HD="/dev/hda" - - # Set the spindown timeout on a hard drive? - #DO_HD=1 - - -Control script --------------- - -Please note that this control script works for the Linux 2.4 and 2.6 series (thanks -to Kiko Piris). - -Control script:: - - #!/bin/bash - - # start or stop laptop_mode, best run by a power management daemon when - # ac gets connected/disconnected from a laptop - # - # install as /sbin/laptop_mode - # - # Contributors to this script: Kiko Piris - # Bart Samwel - # Micha Feigin - # Andrew Morton - # Herve Eychenne - # Dax Kelson - # - # Original Linux 2.4 version by: Jens Axboe - - ############################################################################# - - # Source config - if [ -f /etc/default/laptop-mode ] ; then - # Debian - . /etc/default/laptop-mode - elif [ -f /etc/sysconfig/laptop-mode ] ; then - # Others - . /etc/sysconfig/laptop-mode - fi - - # Don't raise an error if the config file is incomplete - # set defaults instead: - - # Maximum time, in seconds, of hard drive spindown time that you are - # comfortable with. Worst case, it's possible that you could lose this - # amount of work if your battery fails you while in laptop mode. - MAX_AGE=${MAX_AGE:-'600'} - - # Read-ahead, in kilobytes - READAHEAD=${READAHEAD:-'4096'} - - # Shall we remount journaled fs. with appropriate commit interval? (1=yes) - DO_REMOUNTS=${DO_REMOUNTS:-'1'} - - # And shall we add the "noatime" option to that as well? (1=yes) - DO_REMOUNT_NOATIME=${DO_REMOUNT_NOATIME:-'1'} - - # Shall we adjust the idle timeout on a hard drive? - DO_HD=${DO_HD:-'1'} - - # Adjust idle timeout on which hard drive? 
- HD="${HD:-'/dev/hda'}" - - # spindown time for HD (hdparm -S values) - AC_HD=${AC_HD:-'244'} - BATT_HD=${BATT_HD:-'4'} - - # Dirty synchronous ratio. At this percentage of dirty pages the process which - # calls write() does its own writeback - DIRTY_RATIO=${DIRTY_RATIO:-'40'} - - # cpu frequency scaling - # See Documentation/admin-guide/pm/cpufreq.rst for more info - DO_CPU=${CPU_MANAGE:-'0'} - CPU_MAXFREQ=${CPU_MAXFREQ:-'slowest'} - - # - # Allowed dirty background ratio, in percent. Once DIRTY_RATIO has been - # exceeded, the kernel will wake flusher threads which will then reduce the - # amount of dirty memory to dirty_background_ratio. Set this nice and low, - # so once some writeout has commenced, we do a lot of it. - # - DIRTY_BACKGROUND_RATIO=${DIRTY_BACKGROUND_RATIO:-'5'} - - # kernel default dirty buffer age - DEF_AGE=${DEF_AGE:-'30'} - DEF_UPDATE=${DEF_UPDATE:-'5'} - DEF_DIRTY_BACKGROUND_RATIO=${DEF_DIRTY_BACKGROUND_RATIO:-'10'} - DEF_DIRTY_RATIO=${DEF_DIRTY_RATIO:-'40'} - DEF_XFS_AGE_BUFFER=${DEF_XFS_AGE_BUFFER:-'15'} - DEF_XFS_SYNC_INTERVAL=${DEF_XFS_SYNC_INTERVAL:-'30'} - DEF_XFS_BUFD_INTERVAL=${DEF_XFS_BUFD_INTERVAL:-'1'} - - # This must be adjusted manually to the value of HZ in the running kernel - # on 2.4, until the XFS people change their 2.4 external interfaces to work in - # centisecs. This can be automated, but it's a work in progress that still needs - # some fixes. On 2.6 kernels, XFS uses USER_HZ instead of HZ for external - # interfaces, and that is currently always set to 100. So you don't need to - # change this on 2.6. - XFS_HZ=${XFS_HZ:-'100'} - - ############################################################################# - - KLEVEL="$(uname -r | - { - IFS='.' read a b c - echo $a.$b - } - )" - case "$KLEVEL" in - "2.4"|"2.6") - ;; - *) - echo "Unhandled kernel version: $KLEVEL ('uname -r' = '$(uname -r)')" >&2 - exit 1 - ;; - esac - - if [ ! 
-e /proc/sys/vm/laptop_mode ] ; then - echo "Kernel is not patched with laptop_mode patch." >&2 - exit 1 - fi - - if [ ! -w /proc/sys/vm/laptop_mode ] ; then - echo "You do not have enough privileges to enable laptop_mode." >&2 - exit 1 - fi - - # Remove an option (the first parameter) of the form option=<number> from - # a mount options string (the rest of the parameters). - parse_mount_opts () { - OPT="$1" - shift - echo ",$*," | sed \ - -e 's/,'"$OPT"'=[0-9]*,/,/g' \ - -e 's/,,*/,/g' \ - -e 's/^,//' \ - -e 's/,$//' - } - - # Remove an option (the first parameter) without any arguments from - # a mount option string (the rest of the parameters). - parse_nonumber_mount_opts () { - OPT="$1" - shift - echo ",$*," | sed \ - -e 's/,'"$OPT"',/,/g' \ - -e 's/,,*/,/g' \ - -e 's/^,//' \ - -e 's/,$//' - } - - # Find out the state of a yes/no option (e.g. "atime"/"noatime") in - # fstab for a given filesystem, and use this state to replace the - # value of the option in another mount options string. The device - # is the first argument, the option name the second, and the default - # value the third. The remainder is the mount options string. - # - # Example: - # parse_yesno_opts_wfstab /dev/hda1 atime atime defaults,noatime - # - # If fstab contains, say, "rw" for this filesystem, then the result - # will be "defaults,atime". - parse_yesno_opts_wfstab () { - L_DEV="$1" - OPT="$2" - DEF_OPT="$3" - shift 3 - L_OPTS="$*" - PARSEDOPTS1="$(parse_nonumber_mount_opts $OPT $L_OPTS)" - PARSEDOPTS1="$(parse_nonumber_mount_opts no$OPT $PARSEDOPTS1)" - # Watch for a default atime in fstab - FSTAB_OPTS="$(awk '$1 == "'$L_DEV'" { print $4 }' /etc/fstab)" - if echo "$FSTAB_OPTS" | grep "$OPT" > /dev/null ; then - # option specified in fstab: extract the value and use it - if echo "$FSTAB_OPTS" | grep "no$OPT" > /dev/null ; then - echo "$PARSEDOPTS1,no$OPT" - else - # no$OPT not found -- so we must have $OPT. 
- echo "$PARSEDOPTS1,$OPT" - fi - else - # option not specified in fstab -- choose the default. - echo "$PARSEDOPTS1,$DEF_OPT" - fi - } - - # Find out the state of a numbered option (e.g. "commit=NNN") in - # fstab for a given filesystem, and use this state to replace the - # value of the option in another mount options string. The device - # is the first argument, and the option name the second. The - # remainder is the mount options string in which the replacement - # must be done. - # - # Example: - # parse_mount_opts_wfstab /dev/hda1 commit defaults,commit=7 - # - # If fstab contains, say, "commit=3,rw" for this filesystem, then the - # result will be "rw,commit=3". - parse_mount_opts_wfstab () { - L_DEV="$1" - OPT="$2" - shift 2 - L_OPTS="$*" - PARSEDOPTS1="$(parse_mount_opts $OPT $L_OPTS)" - # Watch for a default commit in fstab - FSTAB_OPTS="$(awk '$1 == "'$L_DEV'" { print $4 }' /etc/fstab)" - if echo "$FSTAB_OPTS" | grep "$OPT=" > /dev/null ; then - # option specified in fstab: extract the value, and use it - echo -n "$PARSEDOPTS1,$OPT=" - echo ",$FSTAB_OPTS," | sed \ - -e 's/.*,'"$OPT"'=//' \ - -e 's/,.*//' - else - # option not specified in fstab: set it to 0 - echo "$PARSEDOPTS1,$OPT=0" - fi - } - - deduce_fstype () { - MP="$1" - # My root filesystem unfortunately has - # type "unknown" in /etc/mtab. If we encounter - # "unknown", we try to get the type from fstab. - cat /etc/fstab | - grep -v '^#' | - while read FSTAB_DEV FSTAB_MP FSTAB_FST FSTAB_OPTS FSTAB_DUMP FSTAB_DUMP ; do - if [ "$FSTAB_MP" = "$MP" ]; then - echo $FSTAB_FST - exit 0 - fi - done - } - - if [ $DO_REMOUNT_NOATIME -eq 1 ] ; then - NOATIME_OPT=",noatime" - fi - - case "$1" in - start) - AGE=$((100*$MAX_AGE)) - XFS_AGE=$(($XFS_HZ*$MAX_AGE)) - echo -n "Starting laptop_mode" - - if [ -d /proc/sys/vm/pagebuf ] ; then - # (For 2.4 and early 2.6.) - # This only needs to be set, not reset -- it is only used when - # laptop mode is enabled. 
- echo $XFS_AGE > /proc/sys/vm/pagebuf/lm_flush_age - echo $XFS_AGE > /proc/sys/fs/xfs/lm_sync_interval - elif [ -f /proc/sys/fs/xfs/lm_age_buffer ] ; then - # (A couple of early 2.6 laptop mode patches had these.) - # The same goes for these. - echo $XFS_AGE > /proc/sys/fs/xfs/lm_age_buffer - echo $XFS_AGE > /proc/sys/fs/xfs/lm_sync_interval - elif [ -f /proc/sys/fs/xfs/age_buffer ] ; then - # (2.6.6) - # But not for these -- they are also used in normal - # operation. - echo $XFS_AGE > /proc/sys/fs/xfs/age_buffer - echo $XFS_AGE > /proc/sys/fs/xfs/sync_interval - elif [ -f /proc/sys/fs/xfs/age_buffer_centisecs ] ; then - # (2.6.7 upwards) - # And not for these either. These are in centisecs, - # not USER_HZ, so we have to use $AGE, not $XFS_AGE. - echo $AGE > /proc/sys/fs/xfs/age_buffer_centisecs - echo $AGE > /proc/sys/fs/xfs/xfssyncd_centisecs - echo 3000 > /proc/sys/fs/xfs/xfsbufd_centisecs - fi - - case "$KLEVEL" in - "2.4") - echo 1 > /proc/sys/vm/laptop_mode - echo "30 500 0 0 $AGE $AGE 60 20 0" > /proc/sys/vm/bdflush - ;; - "2.6") - echo 5 > /proc/sys/vm/laptop_mode - echo "$AGE" > /proc/sys/vm/dirty_writeback_centisecs - echo "$AGE" > /proc/sys/vm/dirty_expire_centisecs - echo "$DIRTY_RATIO" > /proc/sys/vm/dirty_ratio - echo "$DIRTY_BACKGROUND_RATIO" > /proc/sys/vm/dirty_background_ratio - ;; - esac - if [ $DO_REMOUNTS -eq 1 ]; then - cat /etc/mtab | while read DEV MP FST OPTS DUMP PASS ; do - PARSEDOPTS="$(parse_mount_opts "$OPTS")" - if [ "$FST" = 'unknown' ]; then - FST=$(deduce_fstype $MP) - fi - case "$FST" in - "ext3") - PARSEDOPTS="$(parse_mount_opts commit "$OPTS")" - mount $DEV -t $FST $MP -o remount,$PARSEDOPTS,commit=$MAX_AGE$NOATIME_OPT - ;; - "xfs") - mount $DEV -t $FST $MP -o remount,$OPTS$NOATIME_OPT - ;; - esac - if [ -b $DEV ] ; then - blockdev --setra $(($READAHEAD * 2)) $DEV - fi - done - fi - if [ $DO_HD -eq 1 ] ; then - for THISHD in $HD ; do - /sbin/hdparm -S $BATT_HD $THISHD > /dev/null 2>&1 - /sbin/hdparm -B 1 $THISHD > /dev/null 
2>&1 - done - fi - if [ $DO_CPU -eq 1 -a -e /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq ]; then - if [ $CPU_MAXFREQ = 'slowest' ]; then - CPU_MAXFREQ=`cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq` - fi - echo $CPU_MAXFREQ > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq - fi - echo "." - ;; - stop) - U_AGE=$((100*$DEF_UPDATE)) - B_AGE=$((100*$DEF_AGE)) - echo -n "Stopping laptop_mode" - echo 0 > /proc/sys/vm/laptop_mode - if [ -f /proc/sys/fs/xfs/age_buffer -a ! -f /proc/sys/fs/xfs/lm_age_buffer ] ; then - # These need to be restored, if there are no lm_*. - echo $(($XFS_HZ*$DEF_XFS_AGE_BUFFER)) > /proc/sys/fs/xfs/age_buffer - echo $(($XFS_HZ*$DEF_XFS_SYNC_INTERVAL)) > /proc/sys/fs/xfs/sync_interval - elif [ -f /proc/sys/fs/xfs/age_buffer_centisecs ] ; then - # These need to be restored as well. - echo $((100*$DEF_XFS_AGE_BUFFER)) > /proc/sys/fs/xfs/age_buffer_centisecs - echo $((100*$DEF_XFS_SYNC_INTERVAL)) > /proc/sys/fs/xfs/xfssyncd_centisecs - echo $((100*$DEF_XFS_BUFD_INTERVAL)) > /proc/sys/fs/xfs/xfsbufd_centisecs - fi - case "$KLEVEL" in - "2.4") - echo "30 500 0 0 $U_AGE $B_AGE 60 20 0" > /proc/sys/vm/bdflush - ;; - "2.6") - echo "$U_AGE" > /proc/sys/vm/dirty_writeback_centisecs - echo "$B_AGE" > /proc/sys/vm/dirty_expire_centisecs - echo "$DEF_DIRTY_RATIO" > /proc/sys/vm/dirty_ratio - echo "$DEF_DIRTY_BACKGROUND_RATIO" > /proc/sys/vm/dirty_background_ratio - ;; - esac - if [ $DO_REMOUNTS -eq 1 ] ; then - cat /etc/mtab | while read DEV MP FST OPTS DUMP PASS ; do - # Reset commit and atime options to defaults. 
- if [ "$FST" = 'unknown' ]; then - FST=$(deduce_fstype $MP) - fi - case "$FST" in - "ext3") - PARSEDOPTS="$(parse_mount_opts_wfstab $DEV commit $OPTS)" - PARSEDOPTS="$(parse_yesno_opts_wfstab $DEV atime atime $PARSEDOPTS)" - mount $DEV -t $FST $MP -o remount,$PARSEDOPTS - ;; - "xfs") - PARSEDOPTS="$(parse_yesno_opts_wfstab $DEV atime atime $OPTS)" - mount $DEV -t $FST $MP -o remount,$PARSEDOPTS - ;; - esac - if [ -b $DEV ] ; then - blockdev --setra 256 $DEV - fi - done - fi - if [ $DO_HD -eq 1 ] ; then - for THISHD in $HD ; do - /sbin/hdparm -S $AC_HD $THISHD > /dev/null 2>&1 - /sbin/hdparm -B 255 $THISHD > /dev/null 2>&1 - done - fi - if [ $DO_CPU -eq 1 -a -e /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq ]; then - echo `cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq` > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq - fi - echo "." - ;; - *) - echo "Usage: $0 {start|stop}" 2>&1 - exit 1 - ;; - - esac - - exit 0 - - -ACPI integration ----------------- - -Dax Kelson submitted this so that the ACPI acpid daemon will -kick off the laptop_mode script and run hdparm. The part that -automatically disables laptop mode when the battery is low was -written by Jan Topinski. - -/etc/acpi/events/ac_adapter:: - - event=ac_adapter - action=/etc/acpi/actions/ac.sh %e - -/etc/acpi/events/battery:: - - event=battery.* - action=/etc/acpi/actions/battery.sh %e - -/etc/acpi/actions/ac.sh:: - - #!/bin/bash - - # ac on/offline event handler - - status=`awk '/^state: / { print $2 }' /proc/acpi/ac_adapter/$2/state` - - case $status in - "on-line") - /sbin/laptop_mode stop - exit 0 - ;; - "off-line") - /sbin/laptop_mode start - exit 0 - ;; - esac - - -/etc/acpi/actions/battery.sh:: - - #! /bin/bash - - # Automatically disable laptop mode when the battery almost runs out. 
- - BATT_INFO=/proc/acpi/battery/$2/state - - if [[ -f /proc/sys/vm/laptop_mode ]] - then - LM=`cat /proc/sys/vm/laptop_mode` - if [[ $LM -gt 0 ]] - then - if [[ -f $BATT_INFO ]] - then - # Source the config file only now that we know we need - if [ -f /etc/default/laptop-mode ] ; then - # Debian - . /etc/default/laptop-mode - elif [ -f /etc/sysconfig/laptop-mode ] ; then - # Others - . /etc/sysconfig/laptop-mode - fi - MINIMUM_BATTERY_MINUTES=${MINIMUM_BATTERY_MINUTES:-'10'} - - ACTION="`cat $BATT_INFO | grep charging | cut -c 26-`" - if [[ ACTION -eq "discharging" ]] - then - PRESENT_RATE=`cat $BATT_INFO | grep "present rate:" | sed "s/.* \([0-9][0-9]* \).*/\1/" ` - REMAINING=`cat $BATT_INFO | grep "remaining capacity:" | sed "s/.* \([0-9][0-9]* \).*/\1/" ` - fi - if (($REMAINING * 60 / $PRESENT_RATE < $MINIMUM_BATTERY_MINUTES)) - then - /sbin/laptop_mode stop - fi - else - logger -p daemon.warning "You are using laptop mode and your battery interface $BATT_INFO is missing. This may lead to loss of data when the battery runs out. Check kernel ACPI support and /proc/acpi/battery folder, and edit /etc/acpi/battery.sh to set BATT_INFO to the correct path." - fi - fi - fi - - -Monitoring tool ---------------- - -Bartek Kania submitted this, it can be used to measure how much time your disk -spends spun up/down. See tools/laptop/dslm/dslm.c diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst index 4d71211fdad8..af14345ba94b 100644 --- a/Documentation/admin-guide/sysctl/vm.rst +++ b/Documentation/admin-guide/sysctl/vm.rst @@ -41,7 +41,6 @@ files can be found in mm/swap.c. - extfrag_threshold - highmem_is_dirtyable - hugetlb_shm_group -- laptop_mode - legacy_va_layout - lowmem_reserve_ratio - max_map_count @@ -363,13 +362,6 @@ hugetlb_shm_group contains group id that is allowed to create SysV shared memory segment using hugetlb page. -laptop_mode -=========== - -laptop_mode is a knob that controls "laptop mode". 
All the things that are
-controlled by this knob are discussed in Documentation/admin-guide/laptops/laptop-mode.rst.
-
-
 legacy_va_layout
 ================

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 1978eef95dca..6d739bd9459d 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -811,9 +811,6 @@ void blk_mq_free_request(struct request *rq)

 	blk_mq_finish_request(rq);

-	if (unlikely(laptop_mode && !blk_rq_is_passthrough(rq)))
-		laptop_io_completion(q->disk->bdi);
-
 	rq_qos_done(q, rq);

 	WRITE_ONCE(rq->state, MQ_RQ_IDLE);
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 0c466ccbed69..15eb463d5a9b 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3305,8 +3305,7 @@ int ext4_alloc_da_blocks(struct inode *inode)
 	/*
 	 * We do something simple for now. The filemap_flush() will
 	 * also start triggering a write of the data blocks, which is
-	 * not strictly speaking necessary (and for users of
-	 * laptop_mode, not even desirable). However, to do otherwise
+	 * not strictly speaking necessary. However, to do otherwise
 	 * would require replicating code paths in:
 	 *
 	 * ext4_writepages() ->
diff --git a/fs/sync.c b/fs/sync.c
index 431fc5f5be06..6330150792f6 100644
--- a/fs/sync.c
+++ b/fs/sync.c
@@ -104,8 +104,6 @@ void ksys_sync(void)
 	iterate_supers(sync_fs_one_sb, &wait);
 	sync_bdevs(false);
 	sync_bdevs(true);
-	if (unlikely(laptop_mode))
-		laptop_sync_completion();
 }

 SYSCALL_DEFINE0(sync)
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index bc71aa9dcee8..a2014fb1bc66 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -845,15 +845,6 @@ xfs_fs_sync_fs(
 	if (error)
 		return error;

-	if (laptop_mode) {
-		/*
-		 * The disk must be active because we're syncing.
-		 * We schedule log work now (now that the disk is
-		 * active) instead of later (when it might not be).
-		 */
-		flush_delayed_work(&mp->m_log->l_work);
-	}
-
 	/*
 	 * If we are called with page faults frozen out, it means we are about
 	 * to freeze the transaction subsystem. Take the opportunity to shut
diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h
index 0217c1073735..c88fd4d37d1f 100644
--- a/include/linux/backing-dev-defs.h
+++ b/include/linux/backing-dev-defs.h
@@ -46,7 +46,6 @@ enum wb_reason {
 	WB_REASON_VMSCAN,
 	WB_REASON_SYNC,
 	WB_REASON_PERIODIC,
-	WB_REASON_LAPTOP_TIMER,
 	WB_REASON_FS_FREE_SPACE,
 	/*
 	 * There is no bdi forker thread any more and works are done
@@ -204,8 +203,6 @@ struct backing_dev_info {
 	char dev_name[64];
 	struct device *owner;

-	struct timer_list laptop_mode_wb_timer;
-
 #ifdef CONFIG_DEBUG_FS
 	struct dentry *debug_dir;
 #endif
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index f48e8ccffe81..e530112c4b3a 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -328,9 +328,6 @@ struct dirty_throttle_control {
 	bool dirty_exceeded;
 };

-void laptop_io_completion(struct backing_dev_info *info);
-void laptop_sync_completion(void);
-void laptop_mode_timer_fn(struct timer_list *t);
 bool node_dirty_ok(struct pglist_data *pgdat);
 int wb_domain_init(struct wb_domain *dom, gfp_t gfp);
 #ifdef CONFIG_CGROUP_WRITEBACK
@@ -342,7 +339,6 @@ extern struct wb_domain global_wb_domain;
 /* These are exported to sysctl. */
 extern unsigned int dirty_writeback_interval;
 extern unsigned int dirty_expire_interval;
-extern int laptop_mode;

 void global_dirty_limits(unsigned long *pbackground, unsigned long *pdirty);
 unsigned long wb_calc_thresh(struct bdi_writeback *wb, unsigned long thresh);
diff --git a/include/trace/events/writeback.h b/include/trace/events/writeback.h
index 311a341e6fe4..b6f94e97788a 100644
--- a/include/trace/events/writeback.h
+++ b/include/trace/events/writeback.h
@@ -42,7 +42,6 @@
 	EM( WB_REASON_VMSCAN,		"vmscan")		\
 	EM( WB_REASON_SYNC,		"sync")			\
 	EM( WB_REASON_PERIODIC,		"periodic")		\
-	EM( WB_REASON_LAPTOP_TIMER,	"laptop_timer")		\
 	EM( WB_REASON_FS_FREE_SPACE,	"fs_free_space")	\
 	EM( WB_REASON_FORKER_THREAD,	"forker_thread")	\
 	EMe(WB_REASON_FOREIGN_FLUSH,	"foreign_flush")
diff --git a/include/uapi/linux/sysctl.h b/include/uapi/linux/sysctl.h
index 63d1464cb71c..6ea9ea8413fa 100644
--- a/include/uapi/linux/sysctl.h
+++ b/include/uapi/linux/sysctl.h
@@ -183,7 +183,7 @@ enum
 	VM_LOWMEM_RESERVE_RATIO=20,/* reservation ratio for lower memory zones */
 	VM_MIN_FREE_KBYTES=21,	/* Minimum free kilobytes to maintain */
 	VM_MAX_MAP_COUNT=22,	/* int: Maximum number of mmaps/address-space */
-	VM_LAPTOP_MODE=23,	/* vm laptop mode */
+
 	VM_BLOCK_DUMP=24,	/* block dump mode */
 	VM_HUGETLB_GROUP=25,	/* permitted hugetlb group */
 	VM_VFS_CACHE_PRESSURE=26, /* dcache/icache reclaim pressure */
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index c5740c6d37a2..a0e26d1b717f 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -1034,7 +1034,6 @@ struct backing_dev_info *bdi_alloc(int node_id)
 	bdi->capabilities = BDI_CAP_WRITEBACK;
 	bdi->ra_pages = VM_READAHEAD_PAGES;
 	bdi->io_pages = VM_READAHEAD_PAGES;
-	timer_setup(&bdi->laptop_mode_wb_timer, laptop_mode_timer_fn, 0);
 	return bdi;
 }
 EXPORT_SYMBOL(bdi_alloc);
@@ -1156,8 +1155,6 @@ static void bdi_remove_from_list(struct backing_dev_info *bdi)

 void bdi_unregister(struct backing_dev_info *bdi)
 {
-	timer_delete_sync(&bdi->laptop_mode_wb_timer);
-
 	/* make sure nobody finds us on the bdi_list anymore */
 	bdi_remove_from_list(bdi);
 	wb_shutdown(&bdi->wb);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index ccdeb0e84d39..0c0f048d12bb 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -109,14 +109,6 @@ EXPORT_SYMBOL_GPL(dirty_writeback_interval);
  */
 unsigned int dirty_expire_interval = 30 * 100; /* centiseconds */

-/*
- * Flag that puts the machine in "laptop mode". Doubles as a timeout in jiffies:
- * a full sync is triggered after this time elapses without any disk activity.
- */
-int laptop_mode;
-
-EXPORT_SYMBOL(laptop_mode);
-
 /* End of sysctl-exported parameters */

 struct wb_domain global_wb_domain;
@@ -1843,17 +1835,7 @@ static int balance_dirty_pages(struct bdi_writeback *wb,
 			balance_domain_limits(mdtc, strictlimit);
 		}

-		/*
-		 * In laptop mode, we wait until hitting the higher threshold
-		 * before starting background writeout, and then write out all
-		 * the way down to the lower threshold. So slow writers cause
-		 * minimal disk activity.
-		 *
-		 * In normal mode, we start background writeout at the lower
-		 * background_thresh, to keep the amount of dirty memory low.
-		 */
-		if (!laptop_mode && nr_dirty > gdtc->bg_thresh &&
-		    !writeback_in_progress(wb))
+		if (nr_dirty > gdtc->bg_thresh && !writeback_in_progress(wb))
 			wb_start_background_writeback(wb);

 		/*
@@ -1876,10 +1858,6 @@ static int balance_dirty_pages(struct bdi_writeback *wb,
 			break;
 		}

-		/* Start writeback even when in laptop mode */
-		if (unlikely(!writeback_in_progress(wb)))
-			wb_start_background_writeback(wb);
-
 		mem_cgroup_flush_foreign(wb);

 		/*
@@ -2198,41 +2176,6 @@ static int dirty_writeback_centisecs_handler(const struct ctl_table *table, int
 }
 #endif

-void laptop_mode_timer_fn(struct timer_list *t)
-{
-	struct backing_dev_info *backing_dev_info =
-		timer_container_of(backing_dev_info, t, laptop_mode_wb_timer);
-
-	wakeup_flusher_threads_bdi(backing_dev_info, WB_REASON_LAPTOP_TIMER);
-}
-
-/*
- * We've spun up the disk and we're in laptop mode: schedule writeback
- * of all dirty data a few seconds from now. If the flush is already scheduled
- * then push it back - the user is still using the disk.
- */
-void laptop_io_completion(struct backing_dev_info *info)
-{
-	mod_timer(&info->laptop_mode_wb_timer, jiffies + laptop_mode);
-}
-
-/*
- * We're in laptop mode and we've just synced. The sync's writes will have
- * caused another writeback to be scheduled by laptop_io_completion.
- * Nothing needs to be written back anymore, so we unschedule the writeback.
- */
-void laptop_sync_completion(void)
-{
-	struct backing_dev_info *bdi;
-
-	rcu_read_lock();
-
-	list_for_each_entry_rcu(bdi, &bdi_list, bdi_list)
-		timer_delete(&bdi->laptop_mode_wb_timer);
-
-	rcu_read_unlock();
-}
-
 /*
  * If ratelimit_pages is too high then we can get into dirty-data overload
  * if a large number of processes all perform writes at the same time.
@@ -2327,13 +2270,6 @@ static const struct ctl_table vm_page_writeback_sysctls[] = {
 		.extra2		= SYSCTL_ONE,
 	},
 #endif
-	{
-		.procname	= "laptop_mode",
-		.data		= &laptop_mode,
-		.maxlen		= sizeof(laptop_mode),
-		.mode		= 0644,
-		.proc_handler	= proc_dointvec_jiffies,
-	},
 };
 #endif

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 670fe9fae5ba..a1ad50c0c9aa 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -104,13 +104,13 @@ struct scan_control {
 	unsigned int force_deactivate:1;
 	unsigned int skipped_deactivate:1;

-	/* Writepage batching in laptop mode; RECLAIM_WRITE */
+	/* zone_reclaim_mode, boost reclaim */
 	unsigned int may_writepage:1;

-	/* Can mapped folios be reclaimed? */
+	/* zone_reclaim_mode */
 	unsigned int may_unmap:1;

-	/* Can folios be swapped as part of reclaim? */
+	/* zone_reclaim_mode, boost reclaim, cgroup restrictions */
 	unsigned int may_swap:1;

 	/* Not allow cache_trim_mode to be turned on as part of reclaim? */
@@ -6366,13 +6366,6 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,

 		if (sc->compaction_ready)
 			break;
-
-		/*
-		 * If we're getting trouble reclaiming, start doing
-		 * writepage even in laptop mode.
-		 */
-		if (sc->priority < DEF_PRIORITY - 2)
-			sc->may_writepage = 1;
 	} while (--sc->priority >= 0);

 	last_pgdat = NULL;
@@ -6581,7 +6574,7 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
 		.order = order,
 		.nodemask = nodemask,
 		.priority = DEF_PRIORITY,
-		.may_writepage = !laptop_mode,
+		.may_writepage = 1,
 		.may_unmap = 1,
 		.may_swap = 1,
 	};
@@ -6625,7 +6618,7 @@ unsigned long mem_cgroup_shrink_node(struct mem_cgroup *memcg,
 	struct scan_control sc = {
 		.nr_to_reclaim = SWAP_CLUSTER_MAX,
 		.target_mem_cgroup = memcg,
-		.may_writepage = !laptop_mode,
+		.may_writepage = 1,
 		.may_unmap = 1,
 		.reclaim_idx = MAX_NR_ZONES - 1,
 		.may_swap = !noswap,
@@ -6671,7 +6664,7 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *memcg,
 		.reclaim_idx = MAX_NR_ZONES - 1,
 		.target_mem_cgroup = memcg,
 		.priority = DEF_PRIORITY,
-		.may_writepage = !laptop_mode,
+		.may_writepage = 1,
 		.may_unmap = 1,
 		.may_swap = !!(reclaim_options & MEMCG_RECLAIM_MAY_SWAP),
 		.proactive = !!(reclaim_options & MEMCG_RECLAIM_PROACTIVE),
@@ -7052,7 +7045,7 @@ static int balance_pgdat(pg_data_t *pgdat, int order, int highest_zoneidx)
 		 * from reclaim context. If no pages are reclaimed, the
 		 * reclaim will be aborted.
 		 */
-		sc.may_writepage = !laptop_mode && !nr_boost_reclaim;
+		sc.may_writepage = !nr_boost_reclaim;
 		sc.may_swap = !nr_boost_reclaim;

 		/*
@@ -7062,13 +7055,6 @@ static int balance_pgdat(pg_data_t *pgdat, int order, int highest_zoneidx)
 		 */
 		kswapd_age_node(pgdat, &sc);

-		/*
-		 * If we're getting trouble reclaiming, start doing writepage
-		 * even in laptop mode.
-		 */
-		if (sc.priority < DEF_PRIORITY - 2)
-			sc.may_writepage = 1;
-
 		/* Call soft limit reclaim before calling shrink_node. */
 		sc.nr_scanned = 0;
 		nr_soft_scanned = 0;
@@ -7789,7 +7775,7 @@ int user_proactive_reclaim(char *buf,
 			.reclaim_idx = gfp_zone(gfp_mask),
 			.proactive_swappiness = swappiness == -1 ? NULL : &swappiness,
 			.priority = DEF_PRIORITY,
-			.may_writepage = !laptop_mode,
+			.may_writepage = 1,
 			.nr_to_reclaim = max(batch_size, SWAP_CLUSTER_MAX),
 			.may_unmap = 1,
 			.may_swap = 1,
-- 
2.52.0

* Re: retiring laptop_mode? was Re: [PATCH] mm: vmscan: always allow writeback during memcg reclaim
2025-12-15 20:08 ` Johannes Weiner
@ 2025-12-16 2:23 ` Jens Axboe
2025-12-16 7:41 ` Christoph Hellwig
1 sibling, 0 replies; 20+ messages in thread
From: Jens Axboe @ 2025-12-16 2:23 UTC (permalink / raw)
To: Johannes Weiner, Christoph Hellwig
Cc: Deepanshu Kartikey, akpm, linux-mm, linux-kernel, linux-block

On 12/15/25 1:08 PM, Johannes Weiner wrote:
> On Sun, Dec 14, 2025 at 10:59:11PM -0800, Christoph Hellwig wrote:
>> On Sun, Dec 14, 2025 at 11:12:00PM -0500, Johannes Weiner wrote:
>>> That reasoning doesn't make sense to me. Reclaim is always in response
>>> to an allocation need. The laptop_mode idea applies to cgroup reclaim
>>> as much as any other reclaim.
>>>
>>> Now obviously all of this is pretty dated. Reclaim doesn't do
>>> filesystem writes anymore, and I'm not sure there are a whole lot of
>>> laptops with rotational drives left, either. Also I doubt anybody is
>>> still using zone_reclaim_mode (which is where the may_unmap is from).
>>
>> Yeah. I wonder if we should retire laptop_mode. It was a cute hack
>> back then, but it has its ugly fingers in way too many places and
>> should be mostly obsolete by how writeback works these days.
>
> Yes, that makes sense to me. How about the below?
>
> It doesn't actually get rid of the reclaim toggles - I added comments
> for the other usecases. But it's a nice diffstat nonetheless.
>
> Debated whether to add some sort of deprecation sysctl handler, but at
> least systemd-sysctl just prints a warning and still applies other
> settings from the same config file.
>
> ---
>
> From 868f67e9d0d4465a6c22d8a147084944e7569c8d Mon Sep 17 00:00:00 2001
> From: Johannes Weiner <hannes@cmpxchg.org>
> Date: Mon, 15 Dec 2025 12:57:53 -0500
> Subject: [PATCH] mm/block/fs: remove laptop_mode
>
> Laptop mode was introduced to save battery, by delaying and
> consolidating writes and maximize the time rotating hard drives
> wouldn't have to spin. Needless to say, this is a scenario of the
> (in)glorious past.
>
> The footprint of the feature is small, but nevertheless it's a
> complicating factor in mm, block, filesystems. Developers don't think
> about it, and the decision-making in reclaim looks dubious. It likely
> hasn't been tested in years while the surrounding code has evolved.

From a quick glance, looks good to me:

Acked-by: Jens Axboe <axboe@kernel.dk>

-- 
Jens Axboe

* Re: retiring laptop_mode? was Re: [PATCH] mm: vmscan: always allow writeback during memcg reclaim
2025-12-15 20:08 ` Johannes Weiner
2025-12-16 2:23 ` Jens Axboe
@ 2025-12-16 7:41 ` Christoph Hellwig
2025-12-16 18:52 ` Johannes Weiner
1 sibling, 1 reply; 20+ messages in thread
From: Christoph Hellwig @ 2025-12-16 7:41 UTC (permalink / raw)
To: Johannes Weiner
Cc: Christoph Hellwig, Jens Axboe, Deepanshu Kartikey, akpm, linux-mm,
linux-kernel, linux-block

On Mon, Dec 15, 2025 at 03:08:38PM -0500, Johannes Weiner wrote:
> Debated whether to add some sort of deprecation sysctl handler, but at
> least systemd-sysctl just prints a warning and still applies other
> settings from the same config file.

In general dropping sysctl will break things. So I think we'll need
a stub, at which point it might as well warn for a while.

> Laptop mode was introduced to save battery, by delaying and
> consolidating writes and maximize the time rotating hard drives
> wouldn't have to spin. Needless to say, this is a scenario of the
> (in)glorious past.

Maybe expand on this a bit by mentioning that reclaim now never does
file system writeback, and fs writeback is already very lumpy by
design. And of course that hard disks with their high spinup latency
and extra power draw are a thing of the past in laptops or other mobile
devices.

Otherwise this looks good to me.

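Since the concern above is that existing configurations still write the knob, here is a minimal sketch of how one might look for leftover settings before the stub eventually goes away. It is not part of the patch: the function name `find_laptop_mode_settings` is hypothetical, and the paths in the usage note are an assumption about where sysctl configuration usually lives. It relies on `grep -r`, which GNU and BSD grep provide.

```shell
#!/bin/sh
# Hypothetical helper (not part of the patch): print the names of
# sysctl config files that still assign vm.laptop_mode.
find_laptop_mode_settings() {
	# Arguments: files or directories to search. Missing paths are skipped.
	for path in "$@"; do
		[ -e "$path" ] || continue
		# Match "vm.laptop_mode" or "vm/laptop_mode" assignments,
		# optionally indented or prefixed with "-" (ignore-errors),
		# while skipping commented-out lines. Always succeed, even
		# when nothing matches.
		grep -rlE '^[[:space:]]*-?vm[./]laptop_mode[[:space:]]*=' "$path" 2>/dev/null || true
	done
}
```

Typical use would be something like `find_laptop_mode_settings /etc/sysctl.conf /etc/sysctl.d`, with the list of locations adjusted for the distribution at hand.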
* Re: retiring laptop_mode? was Re: [PATCH] mm: vmscan: always allow writeback during memcg reclaim
2025-12-16 7:41 ` Christoph Hellwig
@ 2025-12-16 18:52 ` Johannes Weiner
2025-12-16 18:54 ` Jens Axboe
` (3 more replies)
0 siblings, 4 replies; 20+ messages in thread
From: Johannes Weiner @ 2025-12-16 18:52 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Jens Axboe, Deepanshu Kartikey, akpm, linux-mm, linux-kernel,
linux-block

On Mon, Dec 15, 2025 at 11:41:07PM -0800, Christoph Hellwig wrote:
> On Mon, Dec 15, 2025 at 03:08:38PM -0500, Johannes Weiner wrote:
> > Debated whether to add some sort of deprecation sysctl handler, but at
> > least systemd-sysctl just prints a warning and still applies other
> > settings from the same config file.
>
> In general dropping sysctl will break things. So I think we'll need
> a stub, at which point it might as well warn for a while.

Fair enough, I added that.

Jens, that change seemed small enough that I carried your Ack, but
please let me know if you feel otherwise ;)

> > Laptop mode was introduced to save battery, by delaying and
> > consolidating writes and maximize the time rotating hard drives
> > wouldn't have to spin. Needless to say, this is a scenario of the
> > (in)glorious past.
>
> Maybe expand on this a bit by mentioning that reclaim now never does
> file system writeback, and fs writeback is already very lumpy by
> design. And of course that hard disks with their high spinup latency
> and extra power draw are a thing of the past in laptops or other mobile
> devices.

Sounds good. Can you take a look at the new version below?

Andrew, absent any further objections, would you be able to take this
through the -mm tree?

Thanks!

From 087f10b8046864f71ebc3a3f3316b097932cbded Mon Sep 17 00:00:00 2001
From: Johannes Weiner <hannes@cmpxchg.org>
Date: Mon, 15 Dec 2025 12:57:53 -0500
Subject: [PATCH] mm/block/fs: remove laptop_mode

Laptop mode was introduced to save battery, by delaying and
consolidating writes and thereby maximizing the time rotating hard
drives wouldn't have to spin.

Luckily, rotating hard drives, with their high spin-up times and power
draw, are a thing of the past for battery-powered devices. Reclaim has
also since changed to not write single filesystem pages anymore, and
regular filesystem writeback is lumpy by design.

The juice doesn't appear worth the squeeze anymore. The footprint of
the feature is small, but nevertheless it's a complicating factor in
mm, block, filesystems. Developers don't think about it, and it likely
hasn't been tested with new reclaim and writeback changes in years.

Let's sunset it. Keep the sysctl with a deprecation warning around for
a few more cycles, but remove all functionality behind it.
Suggested-by: Christoph Hellwig <hch@infradead.org> Message-ID: <aT-xv1BNYabnZB_n@infradead.org> Acked-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> --- .../admin-guide/laptops/laptop-mode.rst | 770 ------------------ Documentation/admin-guide/sysctl/vm.rst | 8 - block/blk-mq.c | 3 - fs/ext4/inode.c | 3 +- fs/sync.c | 2 - fs/xfs/xfs_super.c | 9 - include/linux/backing-dev-defs.h | 3 - include/linux/writeback.h | 4 - include/trace/events/writeback.h | 1 - include/uapi/linux/sysctl.h | 2 +- mm/backing-dev.c | 3 - mm/page-writeback.c | 74 +- mm/vmscan.c | 30 +- 13 files changed, 25 insertions(+), 887 deletions(-) delete mode 100644 Documentation/admin-guide/laptops/laptop-mode.rst diff --git a/Documentation/admin-guide/laptops/laptop-mode.rst b/Documentation/admin-guide/laptops/laptop-mode.rst deleted file mode 100644 index 66eb9cd918b5..000000000000 --- a/Documentation/admin-guide/laptops/laptop-mode.rst +++ /dev/null @@ -1,770 +0,0 @@ -=============================================== -How to conserve battery power using laptop-mode -=============================================== - -Document Author: Bart Samwel (bart@samwel.tk) - -Date created: January 2, 2004 - -Last modified: December 06, 2004 - -Introduction ------------- - -Laptop mode is used to minimize the time that the hard disk needs to be spun up, -to conserve battery power on laptops. It has been reported to cause significant -power savings. - -.. Contents - - * Introduction - * Installation - * Caveats - * The Details - * Tips & Tricks - * Control script - * ACPI integration - * Monitoring tool - - -Installation ------------- - -To use laptop mode, you don't need to set any kernel configuration options -or anything. Simply install all the files included in this document, and -laptop mode will automatically be started when you're on battery. 
For -your convenience, a tarball containing an installer can be downloaded at: - - http://www.samwel.tk/laptop_mode/laptop_mode/ - -To configure laptop mode, you need to edit the configuration file, which is -located in /etc/default/laptop-mode on Debian-based systems, or in -/etc/sysconfig/laptop-mode on other systems. - -Unfortunately, automatic enabling of laptop mode does not work for -laptops that don't have ACPI. On those laptops, you need to start laptop -mode manually. To start laptop mode, run "laptop_mode start", and to -stop it, run "laptop_mode stop". (Note: The laptop mode tools package now -has experimental support for APM, you might want to try that first.) - - -Caveats -------- - -* The downside of laptop mode is that you have a chance of losing up to 10 - minutes of work. If you cannot afford this, don't use it! The supplied ACPI - scripts automatically turn off laptop mode when the battery almost runs out, - so that you won't lose any data at the end of your battery life. - -* Most desktop hard drives have a very limited lifetime measured in spindown - cycles, typically about 50.000 times (it's usually listed on the spec sheet). - Check your drive's rating, and don't wear down your drive's lifetime if you - don't need to. - -* If you mount some of your ext3 filesystems with the -n option, then - the control script will not be able to remount them correctly. You must set - DO_REMOUNTS=0 in the control script, otherwise it will remount them with the - wrong options -- or it will fail because it cannot write to /etc/mtab. - -* If you have your filesystems listed as type "auto" in fstab, like I did, then - the control script will not recognize them as filesystems that need remounting. - You must list the filesystems with their true type instead. - -* It has been reported that some versions of the mutt mail client use file access - times to determine whether a folder contains new mail. 
If you use mutt and - experience this, you must disable the noatime remounting by setting the option - DO_REMOUNT_NOATIME to 0 in the configuration file. - - -The Details ------------ - -Laptop mode is controlled by the knob /proc/sys/vm/laptop_mode. This knob is -present for all kernels that have the laptop mode patch, regardless of any -configuration options. When the knob is set, any physical disk I/O (that might -have caused the hard disk to spin up) causes Linux to flush all dirty blocks. The -result of this is that after a disk has spun down, it will not be spun up -anymore to write dirty blocks, because those blocks had already been written -immediately after the most recent read operation. The value of the laptop_mode -knob determines the time between the occurrence of disk I/O and when the flush -is triggered. A sensible value for the knob is 5 seconds. Setting the knob to -0 disables laptop mode. - -To increase the effectiveness of the laptop_mode strategy, the laptop_mode -control script increases dirty_expire_centisecs and dirty_writeback_centisecs in -/proc/sys/vm to about 10 minutes (by default), which means that pages that are -dirtied are not forced to be written to disk as often. The control script also -changes the dirty background ratio, so that background writeback of dirty pages -is not done anymore. Combined with a higher commit value (also 10 minutes) for -ext3 filesystem (also done automatically by the control script), -this results in concentration of disk activity in a small time interval which -occurs only once every 10 minutes, or whenever the disk is forced to spin up by -a cache miss. The disk can then be spun down in the periods of inactivity. - - -Configuration -------------- - -The laptop mode configuration file is located in /etc/default/laptop-mode on -Debian-based systems, or in /etc/sysconfig/laptop-mode on other systems. 
It -contains the following options: - -MAX_AGE: - -Maximum time, in seconds, of hard drive spindown time that you are -comfortable with. Worst case, it's possible that you could lose this -amount of work if your battery fails while you're in laptop mode. - -MINIMUM_BATTERY_MINUTES: - -Automatically disable laptop mode if the remaining number of minutes of -battery power is less than this value. Default is 10 minutes. - -AC_HD/BATT_HD: - -The idle timeout that should be set on your hard drive when laptop mode -is active (BATT_HD) and when it is not active (AC_HD). The defaults are -20 seconds (value 4) for BATT_HD and 2 hours (value 244) for AC_HD. The -possible values are those listed in the manual page for "hdparm" for the -"-S" option. - -HD: - -The devices for which the spindown timeout should be adjusted by laptop mode. -Default is /dev/hda. If you specify multiple devices, separate them by a space. - -READAHEAD: - -Disk readahead, in 512-byte sectors, while laptop mode is active. A large -readahead can prevent disk accesses for things like executable pages (which are -loaded on demand while the application executes) and sequentially accessed data -(MP3s). - -DO_REMOUNTS: - -The control script automatically remounts any mounted journaled filesystems -with appropriate commit interval options. When this option is set to 0, this -feature is disabled. - -DO_REMOUNT_NOATIME: - -When remounting, should the filesystems be remounted with the noatime option? -Normally, this is set to "1" (enabled), but there may be programs that require -access time recording. - -DIRTY_RATIO: - -The percentage of memory that is allowed to contain "dirty" or unsaved data -before a writeback is forced, while laptop mode is active. Corresponds to -the /proc/sys/vm/dirty_ratio sysctl. - -DIRTY_BACKGROUND_RATIO: - -The percentage of memory that is allowed to contain "dirty" or unsaved data -after a forced writeback is done due to an exceeding of DIRTY_RATIO. Set -this nice and low. 
This corresponds to the /proc/sys/vm/dirty_background_ratio -sysctl. - -Note that the behaviour of dirty_background_ratio is quite different -when laptop mode is active and when it isn't. When laptop mode is inactive, -dirty_background_ratio is the threshold percentage at which background writeouts -start taking place. When laptop mode is active, however, background writeouts -are disabled, and the dirty_background_ratio only determines how much writeback -is done when dirty_ratio is reached. - -DO_CPU: - -Enable CPU frequency scaling when in laptop mode. (Requires CPUFreq to be setup. -See Documentation/admin-guide/pm/cpufreq.rst for more info. Disabled by default.) - -CPU_MAXFREQ: - -When on battery, what is the maximum CPU speed that the system should use? Legal -values are "slowest" for the slowest speed that your CPU is able to operate at, -or a value listed in /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies. - - -Tips & Tricks -------------- - -* Bartek Kania reports getting up to 50 minutes of extra battery life (on top - of his regular 3 to 3.5 hours) using a spindown time of 5 seconds (BATT_HD=1). - -* You can spin down the disk while playing MP3, by setting disk readahead - to 8MB (READAHEAD=16384). Effectively, the disk will read a complete MP3 at - once, and will then spin down while the MP3 is playing. (Thanks to Bartek - Kania.) - -* Drew Scott Daniels observed: "I don't know why, but when I decrease the number - of colours that my display uses it consumes less battery power. I've seen - this on powerbooks too. I hope that this is a piece of information that - might be useful to the Laptop Mode patch or its users." - -* In syslog.conf, you can prefix entries with a dash `-` to omit syncing the - file after every logging. When you're using laptop-mode and your disk doesn't - spin down, this is a likely culprit. 
- -* Richard Atterer observed that laptop mode does not work well with noflushd - (http://noflushd.sourceforge.net/), it seems that noflushd prevents laptop-mode - from doing its thing. - -* If you're worried about your data, you might want to consider using a USB - memory stick or something like that as a "working area". (Be aware though - that flash memory can only handle a limited number of writes, and overuse - may wear out your memory stick pretty quickly. Do _not_ use journalling - filesystems on flash memory sticks.) - - -Configuration file for control and ACPI battery scripts -------------------------------------------------------- - -This allows the tunables to be changed for the scripts via an external -configuration file - -It should be installed as /etc/default/laptop-mode on Debian, and as -/etc/sysconfig/laptop-mode on Red Hat, SUSE, Mandrake, and other work-alikes. - -Config file:: - - # Maximum time, in seconds, of hard drive spindown time that you are - # comfortable with. Worst case, it's possible that you could lose this - # amount of work if your battery fails you while in laptop mode. - #MAX_AGE=600 - - # Automatically disable laptop mode when the number of minutes of battery - # that you have left goes below this threshold. - MINIMUM_BATTERY_MINUTES=10 - - # Read-ahead, in 512-byte sectors. You can spin down the disk while playing MP3/OGG - # by setting the disk readahead to 8MB (READAHEAD=16384). Effectively, the disk - # will read a complete MP3 at once, and will then spin down while the MP3/OGG is - # playing. - #READAHEAD=4096 - - # Shall we remount journaled fs. with appropriate commit interval? (1=yes) - #DO_REMOUNTS=1 - - # And shall we add the "noatime" option to that as well? (1=yes) - #DO_REMOUNT_NOATIME=1 - - # Dirty synchronous ratio. At this percentage of dirty pages the process - # which - # calls write() does its own writeback - #DIRTY_RATIO=40 - - # - # Allowed dirty background ratio, in percent. 
Once DIRTY_RATIO has been - # exceeded, the kernel will wake flusher threads which will then reduce the - # amount of dirty memory to dirty_background_ratio. Set this nice and low, - # so once some writeout has commenced, we do a lot of it. - # - #DIRTY_BACKGROUND_RATIO=5 - - # kernel default dirty buffer age - #DEF_AGE=30 - #DEF_UPDATE=5 - #DEF_DIRTY_BACKGROUND_RATIO=10 - #DEF_DIRTY_RATIO=40 - #DEF_XFS_AGE_BUFFER=15 - #DEF_XFS_SYNC_INTERVAL=30 - #DEF_XFS_BUFD_INTERVAL=1 - - # This must be adjusted manually to the value of HZ in the running kernel - # on 2.4, until the XFS people change their 2.4 external interfaces to work in - # centisecs. This can be automated, but it's a work in progress that still - # needs# some fixes. On 2.6 kernels, XFS uses USER_HZ instead of HZ for - # external interfaces, and that is currently always set to 100. So you don't - # need to change this on 2.6. - #XFS_HZ=100 - - # Should the maximum CPU frequency be adjusted down while on battery? - # Requires CPUFreq to be setup. - # See Documentation/admin-guide/pm/cpufreq.rst for more info - #DO_CPU=0 - - # When on battery what is the maximum CPU speed that the system should - # use? Legal values are "slowest" for the slowest speed that your - # CPU is able to operate at, or a value listed in: - # /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies - # Only applicable if DO_CPU=1. - #CPU_MAXFREQ=slowest - - # Idle timeout for your hard drive (man hdparm for valid values, -S option) - # Default is 2 hours on AC (AC_HD=244) and 20 seconds for battery (BATT_HD=4). - #AC_HD=244 - #BATT_HD=4 - - # The drives for which to adjust the idle timeout. Separate them by a space, - # e.g. HD="/dev/hda /dev/hdb". - #HD="/dev/hda" - - # Set the spindown timeout on a hard drive? - #DO_HD=1 - - -Control script --------------- - -Please note that this control script works for the Linux 2.4 and 2.6 series (thanks -to Kiko Piris). 
- -Control script:: - - #!/bin/bash - - # start or stop laptop_mode, best run by a power management daemon when - # ac gets connected/disconnected from a laptop - # - # install as /sbin/laptop_mode - # - # Contributors to this script: Kiko Piris - # Bart Samwel - # Micha Feigin - # Andrew Morton - # Herve Eychenne - # Dax Kelson - # - # Original Linux 2.4 version by: Jens Axboe - - ############################################################################# - - # Source config - if [ -f /etc/default/laptop-mode ] ; then - # Debian - . /etc/default/laptop-mode - elif [ -f /etc/sysconfig/laptop-mode ] ; then - # Others - . /etc/sysconfig/laptop-mode - fi - - # Don't raise an error if the config file is incomplete - # set defaults instead: - - # Maximum time, in seconds, of hard drive spindown time that you are - # comfortable with. Worst case, it's possible that you could lose this - # amount of work if your battery fails you while in laptop mode. - MAX_AGE=${MAX_AGE:-'600'} - - # Read-ahead, in kilobytes - READAHEAD=${READAHEAD:-'4096'} - - # Shall we remount journaled fs. with appropriate commit interval? (1=yes) - DO_REMOUNTS=${DO_REMOUNTS:-'1'} - - # And shall we add the "noatime" option to that as well? (1=yes) - DO_REMOUNT_NOATIME=${DO_REMOUNT_NOATIME:-'1'} - - # Shall we adjust the idle timeout on a hard drive? - DO_HD=${DO_HD:-'1'} - - # Adjust idle timeout on which hard drive? - HD="${HD:-'/dev/hda'}" - - # spindown time for HD (hdparm -S values) - AC_HD=${AC_HD:-'244'} - BATT_HD=${BATT_HD:-'4'} - - # Dirty synchronous ratio. At this percentage of dirty pages the process which - # calls write() does its own writeback - DIRTY_RATIO=${DIRTY_RATIO:-'40'} - - # cpu frequency scaling - # See Documentation/admin-guide/pm/cpufreq.rst for more info - DO_CPU=${CPU_MANAGE:-'0'} - CPU_MAXFREQ=${CPU_MAXFREQ:-'slowest'} - - # - # Allowed dirty background ratio, in percent. 
Once DIRTY_RATIO has been - # exceeded, the kernel will wake flusher threads which will then reduce the - # amount of dirty memory to dirty_background_ratio. Set this nice and low, - # so once some writeout has commenced, we do a lot of it. - # - DIRTY_BACKGROUND_RATIO=${DIRTY_BACKGROUND_RATIO:-'5'} - - # kernel default dirty buffer age - DEF_AGE=${DEF_AGE:-'30'} - DEF_UPDATE=${DEF_UPDATE:-'5'} - DEF_DIRTY_BACKGROUND_RATIO=${DEF_DIRTY_BACKGROUND_RATIO:-'10'} - DEF_DIRTY_RATIO=${DEF_DIRTY_RATIO:-'40'} - DEF_XFS_AGE_BUFFER=${DEF_XFS_AGE_BUFFER:-'15'} - DEF_XFS_SYNC_INTERVAL=${DEF_XFS_SYNC_INTERVAL:-'30'} - DEF_XFS_BUFD_INTERVAL=${DEF_XFS_BUFD_INTERVAL:-'1'} - - # This must be adjusted manually to the value of HZ in the running kernel - # on 2.4, until the XFS people change their 2.4 external interfaces to work in - # centisecs. This can be automated, but it's a work in progress that still needs - # some fixes. On 2.6 kernels, XFS uses USER_HZ instead of HZ for external - # interfaces, and that is currently always set to 100. So you don't need to - # change this on 2.6. - XFS_HZ=${XFS_HZ:-'100'} - - ############################################################################# - - KLEVEL="$(uname -r | - { - IFS='.' read a b c - echo $a.$b - } - )" - case "$KLEVEL" in - "2.4"|"2.6") - ;; - *) - echo "Unhandled kernel version: $KLEVEL ('uname -r' = '$(uname -r)')" >&2 - exit 1 - ;; - esac - - if [ ! -e /proc/sys/vm/laptop_mode ] ; then - echo "Kernel is not patched with laptop_mode patch." >&2 - exit 1 - fi - - if [ ! -w /proc/sys/vm/laptop_mode ] ; then - echo "You do not have enough privileges to enable laptop_mode." >&2 - exit 1 - fi - - # Remove an option (the first parameter) of the form option=<number> from - # a mount options string (the rest of the parameters). 
- parse_mount_opts () { - OPT="$1" - shift - echo ",$*," | sed \ - -e 's/,'"$OPT"'=[0-9]*,/,/g' \ - -e 's/,,*/,/g' \ - -e 's/^,//' \ - -e 's/,$//' - } - - # Remove an option (the first parameter) without any arguments from - # a mount option string (the rest of the parameters). - parse_nonumber_mount_opts () { - OPT="$1" - shift - echo ",$*," | sed \ - -e 's/,'"$OPT"',/,/g' \ - -e 's/,,*/,/g' \ - -e 's/^,//' \ - -e 's/,$//' - } - - # Find out the state of a yes/no option (e.g. "atime"/"noatime") in - # fstab for a given filesystem, and use this state to replace the - # value of the option in another mount options string. The device - # is the first argument, the option name the second, and the default - # value the third. The remainder is the mount options string. - # - # Example: - # parse_yesno_opts_wfstab /dev/hda1 atime atime defaults,noatime - # - # If fstab contains, say, "rw" for this filesystem, then the result - # will be "defaults,atime". - parse_yesno_opts_wfstab () { - L_DEV="$1" - OPT="$2" - DEF_OPT="$3" - shift 3 - L_OPTS="$*" - PARSEDOPTS1="$(parse_nonumber_mount_opts $OPT $L_OPTS)" - PARSEDOPTS1="$(parse_nonumber_mount_opts no$OPT $PARSEDOPTS1)" - # Watch for a default atime in fstab - FSTAB_OPTS="$(awk '$1 == "'$L_DEV'" { print $4 }' /etc/fstab)" - if echo "$FSTAB_OPTS" | grep "$OPT" > /dev/null ; then - # option specified in fstab: extract the value and use it - if echo "$FSTAB_OPTS" | grep "no$OPT" > /dev/null ; then - echo "$PARSEDOPTS1,no$OPT" - else - # no$OPT not found -- so we must have $OPT. - echo "$PARSEDOPTS1,$OPT" - fi - else - # option not specified in fstab -- choose the default. - echo "$PARSEDOPTS1,$DEF_OPT" - fi - } - - # Find out the state of a numbered option (e.g. "commit=NNN") in - # fstab for a given filesystem, and use this state to replace the - # value of the option in another mount options string. The device - # is the first argument, and the option name the second. 
The - # remainder is the mount options string in which the replacement - # must be done. - # - # Example: - # parse_mount_opts_wfstab /dev/hda1 commit defaults,commit=7 - # - # If fstab contains, say, "commit=3,rw" for this filesystem, then the - # result will be "rw,commit=3". - parse_mount_opts_wfstab () { - L_DEV="$1" - OPT="$2" - shift 2 - L_OPTS="$*" - PARSEDOPTS1="$(parse_mount_opts $OPT $L_OPTS)" - # Watch for a default commit in fstab - FSTAB_OPTS="$(awk '$1 == "'$L_DEV'" { print $4 }' /etc/fstab)" - if echo "$FSTAB_OPTS" | grep "$OPT=" > /dev/null ; then - # option specified in fstab: extract the value, and use it - echo -n "$PARSEDOPTS1,$OPT=" - echo ",$FSTAB_OPTS," | sed \ - -e 's/.*,'"$OPT"'=//' \ - -e 's/,.*//' - else - # option not specified in fstab: set it to 0 - echo "$PARSEDOPTS1,$OPT=0" - fi - } - - deduce_fstype () { - MP="$1" - # My root filesystem unfortunately has - # type "unknown" in /etc/mtab. If we encounter - # "unknown", we try to get the type from fstab. - cat /etc/fstab | - grep -v '^#' | - while read FSTAB_DEV FSTAB_MP FSTAB_FST FSTAB_OPTS FSTAB_DUMP FSTAB_DUMP ; do - if [ "$FSTAB_MP" = "$MP" ]; then - echo $FSTAB_FST - exit 0 - fi - done - } - - if [ $DO_REMOUNT_NOATIME -eq 1 ] ; then - NOATIME_OPT=",noatime" - fi - - case "$1" in - start) - AGE=$((100*$MAX_AGE)) - XFS_AGE=$(($XFS_HZ*$MAX_AGE)) - echo -n "Starting laptop_mode" - - if [ -d /proc/sys/vm/pagebuf ] ; then - # (For 2.4 and early 2.6.) - # This only needs to be set, not reset -- it is only used when - # laptop mode is enabled. - echo $XFS_AGE > /proc/sys/vm/pagebuf/lm_flush_age - echo $XFS_AGE > /proc/sys/fs/xfs/lm_sync_interval - elif [ -f /proc/sys/fs/xfs/lm_age_buffer ] ; then - # (A couple of early 2.6 laptop mode patches had these.) - # The same goes for these. 
- echo $XFS_AGE > /proc/sys/fs/xfs/lm_age_buffer - echo $XFS_AGE > /proc/sys/fs/xfs/lm_sync_interval - elif [ -f /proc/sys/fs/xfs/age_buffer ] ; then - # (2.6.6) - # But not for these -- they are also used in normal - # operation. - echo $XFS_AGE > /proc/sys/fs/xfs/age_buffer - echo $XFS_AGE > /proc/sys/fs/xfs/sync_interval - elif [ -f /proc/sys/fs/xfs/age_buffer_centisecs ] ; then - # (2.6.7 upwards) - # And not for these either. These are in centisecs, - # not USER_HZ, so we have to use $AGE, not $XFS_AGE. - echo $AGE > /proc/sys/fs/xfs/age_buffer_centisecs - echo $AGE > /proc/sys/fs/xfs/xfssyncd_centisecs - echo 3000 > /proc/sys/fs/xfs/xfsbufd_centisecs - fi - - case "$KLEVEL" in - "2.4") - echo 1 > /proc/sys/vm/laptop_mode - echo "30 500 0 0 $AGE $AGE 60 20 0" > /proc/sys/vm/bdflush - ;; - "2.6") - echo 5 > /proc/sys/vm/laptop_mode - echo "$AGE" > /proc/sys/vm/dirty_writeback_centisecs - echo "$AGE" > /proc/sys/vm/dirty_expire_centisecs - echo "$DIRTY_RATIO" > /proc/sys/vm/dirty_ratio - echo "$DIRTY_BACKGROUND_RATIO" > /proc/sys/vm/dirty_background_ratio - ;; - esac - if [ $DO_REMOUNTS -eq 1 ]; then - cat /etc/mtab | while read DEV MP FST OPTS DUMP PASS ; do - PARSEDOPTS="$(parse_mount_opts "$OPTS")" - if [ "$FST" = 'unknown' ]; then - FST=$(deduce_fstype $MP) - fi - case "$FST" in - "ext3") - PARSEDOPTS="$(parse_mount_opts commit "$OPTS")" - mount $DEV -t $FST $MP -o remount,$PARSEDOPTS,commit=$MAX_AGE$NOATIME_OPT - ;; - "xfs") - mount $DEV -t $FST $MP -o remount,$OPTS$NOATIME_OPT - ;; - esac - if [ -b $DEV ] ; then - blockdev --setra $(($READAHEAD * 2)) $DEV - fi - done - fi - if [ $DO_HD -eq 1 ] ; then - for THISHD in $HD ; do - /sbin/hdparm -S $BATT_HD $THISHD > /dev/null 2>&1 - /sbin/hdparm -B 1 $THISHD > /dev/null 2>&1 - done - fi - if [ $DO_CPU -eq 1 -a -e /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq ]; then - if [ $CPU_MAXFREQ = 'slowest' ]; then - CPU_MAXFREQ=`cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq` - fi - echo $CPU_MAXFREQ 
> /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq - fi - echo "." - ;; - stop) - U_AGE=$((100*$DEF_UPDATE)) - B_AGE=$((100*$DEF_AGE)) - echo -n "Stopping laptop_mode" - echo 0 > /proc/sys/vm/laptop_mode - if [ -f /proc/sys/fs/xfs/age_buffer -a ! -f /proc/sys/fs/xfs/lm_age_buffer ] ; then - # These need to be restored, if there are no lm_*. - echo $(($XFS_HZ*$DEF_XFS_AGE_BUFFER)) > /proc/sys/fs/xfs/age_buffer - echo $(($XFS_HZ*$DEF_XFS_SYNC_INTERVAL)) > /proc/sys/fs/xfs/sync_interval - elif [ -f /proc/sys/fs/xfs/age_buffer_centisecs ] ; then - # These need to be restored as well. - echo $((100*$DEF_XFS_AGE_BUFFER)) > /proc/sys/fs/xfs/age_buffer_centisecs - echo $((100*$DEF_XFS_SYNC_INTERVAL)) > /proc/sys/fs/xfs/xfssyncd_centisecs - echo $((100*$DEF_XFS_BUFD_INTERVAL)) > /proc/sys/fs/xfs/xfsbufd_centisecs - fi - case "$KLEVEL" in - "2.4") - echo "30 500 0 0 $U_AGE $B_AGE 60 20 0" > /proc/sys/vm/bdflush - ;; - "2.6") - echo "$U_AGE" > /proc/sys/vm/dirty_writeback_centisecs - echo "$B_AGE" > /proc/sys/vm/dirty_expire_centisecs - echo "$DEF_DIRTY_RATIO" > /proc/sys/vm/dirty_ratio - echo "$DEF_DIRTY_BACKGROUND_RATIO" > /proc/sys/vm/dirty_background_ratio - ;; - esac - if [ $DO_REMOUNTS -eq 1 ] ; then - cat /etc/mtab | while read DEV MP FST OPTS DUMP PASS ; do - # Reset commit and atime options to defaults. 
- if [ "$FST" = 'unknown' ]; then - FST=$(deduce_fstype $MP) - fi - case "$FST" in - "ext3") - PARSEDOPTS="$(parse_mount_opts_wfstab $DEV commit $OPTS)" - PARSEDOPTS="$(parse_yesno_opts_wfstab $DEV atime atime $PARSEDOPTS)" - mount $DEV -t $FST $MP -o remount,$PARSEDOPTS - ;; - "xfs") - PARSEDOPTS="$(parse_yesno_opts_wfstab $DEV atime atime $OPTS)" - mount $DEV -t $FST $MP -o remount,$PARSEDOPTS - ;; - esac - if [ -b $DEV ] ; then - blockdev --setra 256 $DEV - fi - done - fi - if [ $DO_HD -eq 1 ] ; then - for THISHD in $HD ; do - /sbin/hdparm -S $AC_HD $THISHD > /dev/null 2>&1 - /sbin/hdparm -B 255 $THISHD > /dev/null 2>&1 - done - fi - if [ $DO_CPU -eq 1 -a -e /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_min_freq ]; then - echo `cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_max_freq` > /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq - fi - echo "." - ;; - *) - echo "Usage: $0 {start|stop}" 2>&1 - exit 1 - ;; - - esac - - exit 0 - - -ACPI integration ----------------- - -Dax Kelson submitted this so that the ACPI acpid daemon will -kick off the laptop_mode script and run hdparm. The part that -automatically disables laptop mode when the battery is low was -written by Jan Topinski. - -/etc/acpi/events/ac_adapter:: - - event=ac_adapter - action=/etc/acpi/actions/ac.sh %e - -/etc/acpi/events/battery:: - - event=battery.* - action=/etc/acpi/actions/battery.sh %e - -/etc/acpi/actions/ac.sh:: - - #!/bin/bash - - # ac on/offline event handler - - status=`awk '/^state: / { print $2 }' /proc/acpi/ac_adapter/$2/state` - - case $status in - "on-line") - /sbin/laptop_mode stop - exit 0 - ;; - "off-line") - /sbin/laptop_mode start - exit 0 - ;; - esac - - -/etc/acpi/actions/battery.sh:: - - #! /bin/bash - - # Automatically disable laptop mode when the battery almost runs out. 
- - BATT_INFO=/proc/acpi/battery/$2/state - - if [[ -f /proc/sys/vm/laptop_mode ]] - then - LM=`cat /proc/sys/vm/laptop_mode` - if [[ $LM -gt 0 ]] - then - if [[ -f $BATT_INFO ]] - then - # Source the config file only now that we know we need - if [ -f /etc/default/laptop-mode ] ; then - # Debian - . /etc/default/laptop-mode - elif [ -f /etc/sysconfig/laptop-mode ] ; then - # Others - . /etc/sysconfig/laptop-mode - fi - MINIMUM_BATTERY_MINUTES=${MINIMUM_BATTERY_MINUTES:-'10'} - - ACTION="`cat $BATT_INFO | grep charging | cut -c 26-`" - if [[ ACTION -eq "discharging" ]] - then - PRESENT_RATE=`cat $BATT_INFO | grep "present rate:" | sed "s/.* \([0-9][0-9]* \).*/\1/" ` - REMAINING=`cat $BATT_INFO | grep "remaining capacity:" | sed "s/.* \([0-9][0-9]* \).*/\1/" ` - fi - if (($REMAINING * 60 / $PRESENT_RATE < $MINIMUM_BATTERY_MINUTES)) - then - /sbin/laptop_mode stop - fi - else - logger -p daemon.warning "You are using laptop mode and your battery interface $BATT_INFO is missing. This may lead to loss of data when the battery runs out. Check kernel ACPI support and /proc/acpi/battery folder, and edit /etc/acpi/battery.sh to set BATT_INFO to the correct path." - fi - fi - fi - - -Monitoring tool ---------------- - -Bartek Kania submitted this, it can be used to measure how much time your disk -spends spun up/down. See tools/laptop/dslm/dslm.c diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst index 4d71211fdad8..af14345ba94b 100644 --- a/Documentation/admin-guide/sysctl/vm.rst +++ b/Documentation/admin-guide/sysctl/vm.rst @@ -41,7 +41,6 @@ files can be found in mm/swap.c. - extfrag_threshold - highmem_is_dirtyable - hugetlb_shm_group -- laptop_mode - legacy_va_layout - lowmem_reserve_ratio - max_map_count @@ -363,13 +362,6 @@ hugetlb_shm_group contains group id that is allowed to create SysV shared memory segment using hugetlb page. -laptop_mode -=========== - -laptop_mode is a knob that controls "laptop mode". 
All the things that are
-controlled by this knob are discussed in Documentation/admin-guide/laptops/laptop-mode.rst.
-
-
 legacy_va_layout
 ================
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 1978eef95dca..6d739bd9459d 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -811,9 +811,6 @@ void blk_mq_free_request(struct request *rq)

 	blk_mq_finish_request(rq);

-	if (unlikely(laptop_mode && !blk_rq_is_passthrough(rq)))
-		laptop_io_completion(q->disk->bdi);
-
 	rq_qos_done(q, rq);

 	WRITE_ONCE(rq->state, MQ_RQ_IDLE);
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 0c466ccbed69..15eb463d5a9b 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3305,8 +3305,7 @@ int ext4_alloc_da_blocks(struct inode *inode)
 	/*
 	 * We do something simple for now. The filemap_flush() will
 	 * also start triggering a write of the data blocks, which is
-	 * not strictly speaking necessary (and for users of
-	 * laptop_mode, not even desirable). However, to do otherwise
+	 * not strictly speaking necessary. However, to do otherwise
 	 * would require replicating code paths in:
 	 *
 	 * ext4_writepages() ->
diff --git a/fs/sync.c b/fs/sync.c
index 431fc5f5be06..6330150792f6 100644
--- a/fs/sync.c
+++ b/fs/sync.c
@@ -104,8 +104,6 @@ void ksys_sync(void)
 	iterate_supers(sync_fs_one_sb, &wait);
 	sync_bdevs(false);
 	sync_bdevs(true);
-	if (unlikely(laptop_mode))
-		laptop_sync_completion();
 }

 SYSCALL_DEFINE0(sync)
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index bc71aa9dcee8..a2014fb1bc66 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -845,15 +845,6 @@ xfs_fs_sync_fs(
 	if (error)
 		return error;

-	if (laptop_mode) {
-		/*
-		 * The disk must be active because we're syncing.
-		 * We schedule log work now (now that the disk is
-		 * active) instead of later (when it might not be).
-		 */
-		flush_delayed_work(&mp->m_log->l_work);
-	}
-
 	/*
 	 * If we are called with page faults frozen out, it means we are about
 	 * to freeze the transaction subsystem. Take the opportunity to shut
diff --git a/include/linux/backing-dev-defs.h b/include/linux/backing-dev-defs.h
index 0217c1073735..c88fd4d37d1f 100644
--- a/include/linux/backing-dev-defs.h
+++ b/include/linux/backing-dev-defs.h
@@ -46,7 +46,6 @@ enum wb_reason {
 	WB_REASON_VMSCAN,
 	WB_REASON_SYNC,
 	WB_REASON_PERIODIC,
-	WB_REASON_LAPTOP_TIMER,
 	WB_REASON_FS_FREE_SPACE,
 	/*
 	 * There is no bdi forker thread any more and works are done
@@ -204,8 +203,6 @@ struct backing_dev_info {
 	char dev_name[64];
 	struct device *owner;

-	struct timer_list laptop_mode_wb_timer;
-
 #ifdef CONFIG_DEBUG_FS
 	struct dentry *debug_dir;
 #endif
diff --git a/include/linux/writeback.h b/include/linux/writeback.h
index f48e8ccffe81..e530112c4b3a 100644
--- a/include/linux/writeback.h
+++ b/include/linux/writeback.h
@@ -328,9 +328,6 @@ struct dirty_throttle_control {
 	bool dirty_exceeded;
 };

-void laptop_io_completion(struct backing_dev_info *info);
-void laptop_sync_completion(void);
-void laptop_mode_timer_fn(struct timer_list *t);
 bool node_dirty_ok(struct pglist_data *pgdat);
 int wb_domain_init(struct wb_domain *dom, gfp_t gfp);
 #ifdef CONFIG_CGROUP_WRITEBACK
@@ -342,7 +339,6 @@ extern struct wb_domain global_wb_domain;
 /* These are exported to sysctl. */
 extern unsigned int dirty_writeback_interval;
 extern unsigned int dirty_expire_interval;
-extern int laptop_mode;

 void global_dirty_limits(unsigned long *pbackground, unsigned long *pdirty);
 unsigned long wb_calc_thresh(struct bdi_writeback *wb, unsigned long thresh);
diff --git a/include/trace/events/writeback.h b/include/trace/events/writeback.h
index 311a341e6fe4..b6f94e97788a 100644
--- a/include/trace/events/writeback.h
+++ b/include/trace/events/writeback.h
@@ -42,7 +42,6 @@
 	EM( WB_REASON_VMSCAN, "vmscan") \
 	EM( WB_REASON_SYNC, "sync") \
 	EM( WB_REASON_PERIODIC, "periodic") \
-	EM( WB_REASON_LAPTOP_TIMER, "laptop_timer") \
 	EM( WB_REASON_FS_FREE_SPACE, "fs_free_space") \
 	EM( WB_REASON_FORKER_THREAD, "forker_thread") \
 	EMe(WB_REASON_FOREIGN_FLUSH, "foreign_flush")
diff --git a/include/uapi/linux/sysctl.h b/include/uapi/linux/sysctl.h
index 63d1464cb71c..6ea9ea8413fa 100644
--- a/include/uapi/linux/sysctl.h
+++ b/include/uapi/linux/sysctl.h
@@ -183,7 +183,7 @@ enum
 	VM_LOWMEM_RESERVE_RATIO=20,/* reservation ratio for lower memory zones */
 	VM_MIN_FREE_KBYTES=21, /* Minimum free kilobytes to maintain */
 	VM_MAX_MAP_COUNT=22, /* int: Maximum number of mmaps/address-space */
-	VM_LAPTOP_MODE=23, /* vm laptop mode */
+
 	VM_BLOCK_DUMP=24, /* block dump mode */
 	VM_HUGETLB_GROUP=25, /* permitted hugetlb group */
 	VM_VFS_CACHE_PRESSURE=26, /* dcache/icache reclaim pressure */
diff --git a/mm/backing-dev.c b/mm/backing-dev.c
index c5740c6d37a2..a0e26d1b717f 100644
--- a/mm/backing-dev.c
+++ b/mm/backing-dev.c
@@ -1034,7 +1034,6 @@ struct backing_dev_info *bdi_alloc(int node_id)
 	bdi->capabilities = BDI_CAP_WRITEBACK;
 	bdi->ra_pages = VM_READAHEAD_PAGES;
 	bdi->io_pages = VM_READAHEAD_PAGES;
-	timer_setup(&bdi->laptop_mode_wb_timer, laptop_mode_timer_fn, 0);
 	return bdi;
 }
 EXPORT_SYMBOL(bdi_alloc);
@@ -1156,8 +1155,6 @@ static void bdi_remove_from_list(struct backing_dev_info *bdi)

 void bdi_unregister(struct backing_dev_info *bdi)
 {
-	timer_delete_sync(&bdi->laptop_mode_wb_timer);
-
 	/* make sure nobody finds us on the bdi_list anymore */
 	bdi_remove_from_list(bdi);
 	wb_shutdown(&bdi->wb);
diff --git a/mm/page-writeback.c b/mm/page-writeback.c
index ccdeb0e84d39..601a5e048d12 100644
--- a/mm/page-writeback.c
+++ b/mm/page-writeback.c
@@ -109,14 +109,6 @@ EXPORT_SYMBOL_GPL(dirty_writeback_interval);
  */
 unsigned int dirty_expire_interval = 30 * 100; /* centiseconds */

-/*
- * Flag that puts the machine in "laptop mode". Doubles as a timeout in jiffies:
- * a full sync is triggered after this time elapses without any disk activity.
- */
-int laptop_mode;
-
-EXPORT_SYMBOL(laptop_mode);
-
 /* End of sysctl-exported parameters */

 struct wb_domain global_wb_domain;
@@ -1843,17 +1835,7 @@ static int balance_dirty_pages(struct bdi_writeback *wb,
 			balance_domain_limits(mdtc, strictlimit);
 		}

-		/*
-		 * In laptop mode, we wait until hitting the higher threshold
-		 * before starting background writeout, and then write out all
-		 * the way down to the lower threshold. So slow writers cause
-		 * minimal disk activity.
-		 *
-		 * In normal mode, we start background writeout at the lower
-		 * background_thresh, to keep the amount of dirty memory low.
-		 */
-		if (!laptop_mode && nr_dirty > gdtc->bg_thresh &&
-		    !writeback_in_progress(wb))
+		if (nr_dirty > gdtc->bg_thresh && !writeback_in_progress(wb))
 			wb_start_background_writeback(wb);

 		/*
@@ -1876,10 +1858,6 @@ static int balance_dirty_pages(struct bdi_writeback *wb,
 			break;
 		}

-		/* Start writeback even when in laptop mode */
-		if (unlikely(!writeback_in_progress(wb)))
-			wb_start_background_writeback(wb);
-
 		mem_cgroup_flush_foreign(wb);

 		/*
@@ -2198,41 +2176,6 @@ static int dirty_writeback_centisecs_handler(const struct ctl_table *table, int
 }
 #endif

-void laptop_mode_timer_fn(struct timer_list *t)
-{
-	struct backing_dev_info *backing_dev_info =
-		timer_container_of(backing_dev_info, t, laptop_mode_wb_timer);
-
-	wakeup_flusher_threads_bdi(backing_dev_info, WB_REASON_LAPTOP_TIMER);
-}
-
-/*
- * We've spun up the disk and we're in laptop mode: schedule writeback
- * of all dirty data a few seconds from now. If the flush is already scheduled
- * then push it back - the user is still using the disk.
- */
-void laptop_io_completion(struct backing_dev_info *info)
-{
-	mod_timer(&info->laptop_mode_wb_timer, jiffies + laptop_mode);
-}
-
-/*
- * We're in laptop mode and we've just synced. The sync's writes will have
- * caused another writeback to be scheduled by laptop_io_completion.
- * Nothing needs to be written back anymore, so we unschedule the writeback.
- */
-void laptop_sync_completion(void)
-{
-	struct backing_dev_info *bdi;
-
-	rcu_read_lock();
-
-	list_for_each_entry_rcu(bdi, &bdi_list, bdi_list)
-		timer_delete(&bdi->laptop_mode_wb_timer);
-
-	rcu_read_unlock();
-}
-
 /*
  * If ratelimit_pages is too high then we can get into dirty-data overload
  * if a large number of processes all perform writes at the same time.
@@ -2263,6 +2206,19 @@ static int page_writeback_cpu_online(unsigned int cpu)

 #ifdef CONFIG_SYSCTL

+static int laptop_mode;
+static int laptop_mode_handler(const struct ctl_table *table, int write,
+			       void *buffer, size_t *lenp, loff_t *ppos)
+{
+	int ret = proc_dointvec_jiffies(table, write, buffer, lenp, ppos);
+
+	if (!ret && write)
+		pr_warn("%s: vm.laptop_mode is deprecated. Ignoring setting.\n",
+			current->comm);
+
+	return ret;
+}
+
 /* this is needed for the proc_doulongvec_minmax of vm_dirty_bytes */
 static const unsigned long dirty_bytes_min = 2 * PAGE_SIZE;

@@ -2332,7 +2288,7 @@ static const struct ctl_table vm_page_writeback_sysctls[] = {
 		.data = &laptop_mode,
 		.maxlen = sizeof(laptop_mode),
 		.mode = 0644,
-		.proc_handler = proc_dointvec_jiffies,
+		.proc_handler = laptop_mode_handler,
 	},
 };
 #endif
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 670fe9fae5ba..a1ad50c0c9aa 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -104,13 +104,13 @@ struct scan_control {
 	unsigned int force_deactivate:1;
 	unsigned int skipped_deactivate:1;

-	/* Writepage batching in laptop mode; RECLAIM_WRITE */
+	/* zone_reclaim_mode, boost reclaim */
 	unsigned int may_writepage:1;

-	/* Can mapped folios be reclaimed? */
+	/* zone_reclaim_mode */
 	unsigned int may_unmap:1;

-	/* Can folios be swapped as part of reclaim? */
+	/* zome_reclaim_mode, boost reclaim, cgroup restrictions */
 	unsigned int may_swap:1;

 	/* Not allow cache_trim_mode to be turned on as part of reclaim? */
@@ -6366,13 +6366,6 @@ static unsigned long do_try_to_free_pages(struct zonelist *zonelist,

 		if (sc->compaction_ready)
 			break;
-
-		/*
-		 * If we're getting trouble reclaiming, start doing
-		 * writepage even in laptop mode.
-		 */
-		if (sc->priority < DEF_PRIORITY - 2)
-			sc->may_writepage = 1;
 	} while (--sc->priority >= 0);

 	last_pgdat = NULL;
@@ -6581,7 +6574,7 @@ unsigned long try_to_free_pages(struct zonelist *zonelist, int order,
 		.order = order,
 		.nodemask = nodemask,
 		.priority = DEF_PRIORITY,
-		.may_writepage = !laptop_mode,
+		.may_writepage = 1,
 		.may_unmap = 1,
 		.may_swap = 1,
 	};
@@ -6625,7 +6618,7 @@ unsigned long mem_cgroup_shrink_node(struct mem_cgroup *memcg,
 	struct scan_control sc = {
 		.nr_to_reclaim = SWAP_CLUSTER_MAX,
 		.target_mem_cgroup = memcg,
-		.may_writepage = !laptop_mode,
+		.may_writepage = 1,
 		.may_unmap = 1,
 		.reclaim_idx = MAX_NR_ZONES - 1,
 		.may_swap = !noswap,
@@ -6671,7 +6664,7 @@ unsigned long try_to_free_mem_cgroup_pages(struct mem_cgroup *memcg,
 		.reclaim_idx = MAX_NR_ZONES - 1,
 		.target_mem_cgroup = memcg,
 		.priority = DEF_PRIORITY,
-		.may_writepage = !laptop_mode,
+		.may_writepage = 1,
 		.may_unmap = 1,
 		.may_swap = !!(reclaim_options & MEMCG_RECLAIM_MAY_SWAP),
 		.proactive = !!(reclaim_options & MEMCG_RECLAIM_PROACTIVE),
@@ -7052,7 +7045,7 @@ static int balance_pgdat(pg_data_t *pgdat, int order, int highest_zoneidx)
 		 * from reclaim context. If no pages are reclaimed, the
 		 * reclaim will be aborted.
 		 */
-		sc.may_writepage = !laptop_mode && !nr_boost_reclaim;
+		sc.may_writepage = !nr_boost_reclaim;
 		sc.may_swap = !nr_boost_reclaim;

 		/*
@@ -7062,13 +7055,6 @@ static int balance_pgdat(pg_data_t *pgdat, int order, int highest_zoneidx)
 		 */
 		kswapd_age_node(pgdat, &sc);

-		/*
-		 * If we're getting trouble reclaiming, start doing writepage
-		 * even in laptop mode.
-		 */
-		if (sc.priority < DEF_PRIORITY - 2)
-			sc.may_writepage = 1;
-
 		/* Call soft limit reclaim before calling shrink_node. */
 		sc.nr_scanned = 0;
 		nr_soft_scanned = 0;
@@ -7789,7 +7775,7 @@ int user_proactive_reclaim(char *buf,
 			.reclaim_idx = gfp_zone(gfp_mask),
 			.proactive_swappiness = swappiness == -1 ? NULL : &swappiness,
 			.priority = DEF_PRIORITY,
-			.may_writepage = !laptop_mode,
+			.may_writepage = 1,
 			.nr_to_reclaim = max(batch_size, SWAP_CLUSTER_MAX),
 			.may_unmap = 1,
 			.may_swap = 1,
-- 
2.52.0
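As a reading aid for the mm/vmscan.c hunks above: the following is a small userspace model, not kernel code, of the escalation loop that the patch removes from do_try_to_free_pages(). The function name first_writepage_priority() is made up for illustration, and DEF_PRIORITY is assumed to be 12 as in the kernel headers. It reports the first scan priority at which a shrink pass would have run with may_writepage set.

```c
#include <assert.h>

#define DEF_PRIORITY 12	/* assumed to match the kernel's definition */

/*
 * Hypothetical model of the old do_try_to_free_pages() loop: returns the
 * first priority at which a shrink pass runs with may_writepage set,
 * or -1 if that never happens.
 */
static int first_writepage_priority(int laptop_mode)
{
	int may_writepage = !laptop_mode;	/* old initializer */
	int priority = DEF_PRIORITY;

	assert(laptop_mode >= 0);	/* the model treats it as flag/timeout */

	do {
		/* A shrink pass at this priority would run here. */
		if (may_writepage)
			return priority;

		/* Removed hunk: stop holding back writepage once reclaim struggles. */
		if (priority < DEF_PRIORITY - 2)
			may_writepage = 1;
	} while (--priority >= 0);

	return -1;
}
```

Under this model, first_writepage_priority(0) is 12 and first_writepage_priority(1) is 8. With the patch applied the initializer is simply 1, so every caller behaves like the laptop_mode == 0 case.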
* Re: retiring laptop_mode? was Re: [PATCH] mm: vmscan: always allow writeback during memcg reclaim
2025-12-16 18:52 ` Johannes Weiner
@ 2025-12-16 18:54 ` Jens Axboe
2025-12-16 23:23 ` Shakeel Butt
` (2 subsequent siblings)
3 siblings, 0 replies; 20+ messages in thread
From: Jens Axboe @ 2025-12-16 18:54 UTC (permalink / raw)
To: Johannes Weiner, Christoph Hellwig
Cc: Deepanshu Kartikey, akpm, linux-mm, linux-kernel, linux-block

On 12/16/25 11:52 AM, Johannes Weiner wrote:
> On Mon, Dec 15, 2025 at 11:41:07PM -0800, Christoph Hellwig wrote:
>> On Mon, Dec 15, 2025 at 03:08:38PM -0500, Johannes Weiner wrote:
>>> Debated whether to add some sort of deprecation sysctl handler, but at
>>> least systemd-sysctl just prints a warning and still applies other
>>> settings from the same config file.
>>
>> In general dropping sysctl will break things. So I think we'll need
>> a stub, at which point it might as well warn for a while.
>
> Fair enough, I added that.
>
> Jens, that change seemed small enough that I carried your Ack, but
> please let me know if you feel otherwise ;)

That's fine, fwiw I fully agree with that change.

-- 
Jens Axboe
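To make the "stub that warns" idea from this exchange concrete, here is a minimal userspace sketch. It is only a model under stated assumptions: laptop_mode_stub_write(), laptop_mode_stub_read() and nr_warnings are hypothetical names, and the seconds-to-jiffies conversion that proc_dointvec_jiffies performs in the real handler is left out. The point it illustrates is that a write still succeeds and the value can still be read back, so existing sysctl configuration keeps applying, while nothing else consumes the value.

```c
#include <assert.h>
#include <stdio.h>

static int laptop_mode_value;	/* stored, but consulted by nothing else */
static int nr_warnings;		/* hypothetical counter, for illustration only */

/* Models a write to the deprecated knob: accept it, warn, change nothing. */
static int laptop_mode_stub_write(const char *comm, int value)
{
	assert(comm != NULL);

	laptop_mode_value = value;
	nr_warnings++;
	fprintf(stderr, "%s: vm.laptop_mode is deprecated. Ignoring setting.\n",
		comm);
	return 0;
}

/* Models a read: the last written value is still reported back. */
static int laptop_mode_stub_read(void)
{
	return laptop_mode_value;
}
```

A caller such as a sysctl loader would see laptop_mode_stub_write("sysctl", 5) return 0, which is why the remaining settings in the same config file are unaffected.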
* Re: retiring laptop_mode? was Re: [PATCH] mm: vmscan: always allow writeback during memcg reclaim
2025-12-16 18:52 ` Johannes Weiner
2025-12-16 18:54 ` Jens Axboe
@ 2025-12-16 23:23 ` Shakeel Butt
2025-12-17 19:59 ` Johannes Weiner
2025-12-17 19:34 ` Michal Hocko
2025-12-18 6:00 ` Christoph Hellwig
3 siblings, 1 reply; 20+ messages in thread
From: Shakeel Butt @ 2025-12-16 23:23 UTC (permalink / raw)
To: Johannes Weiner
Cc: Christoph Hellwig, Jens Axboe, Deepanshu Kartikey, akpm, linux-mm, linux-kernel, linux-block

On Tue, Dec 16, 2025 at 01:52:01PM -0500, Johannes Weiner wrote:
[...]
>
> From 087f10b8046864f71ebc3a3f3316b097932cbded Mon Sep 17 00:00:00 2001
> From: Johannes Weiner <hannes@cmpxchg.org>
> Date: Mon, 15 Dec 2025 12:57:53 -0500
> Subject: [PATCH] mm/block/fs: remove laptop_mode
>
> Laptop mode was introduced to save battery, by delaying and
> consolidating writes and thereby maximize the time rotating hard
> drives wouldn't have to spin.
>
> Luckily, rotating hard drives, with their high spin-up times and power
> draw, are a thing of the past for battery-powered devices. Reclaim has
> also since changed to not write single filesystem pages anymore, and
> regular filesystem writeback is lumpy by design.
>
> The juice doesn't appear worth the squeeze anymore. The footprint of
> the feature is small, but nevertheless it's a complicating factor in
> mm, block, filesystems. Developers don't think about it, and it likely
> hasn't been tested with new reclaim and writeback changes in years.
>
> Let's sunset it. Keep the sysctl with a deprecation warning around for
> a few more cycles, but remove all functionality behind it.
>
> Suggested-by: Christoph Hellwig <hch@infradead.org>
> Message-ID: <aT-xv1BNYabnZB_n@infradead.org>

Is there a need for above message ID? Why not put the whole lore link?

> Acked-by: Jens Axboe <axboe@kernel.dk>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

One nit below and other than that you can add:

Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>

> --- a/include/uapi/linux/sysctl.h
> +++ b/include/uapi/linux/sysctl.h
> @@ -183,7 +183,7 @@ enum
>  	VM_LOWMEM_RESERVE_RATIO=20,/* reservation ratio for lower memory zones */
>  	VM_MIN_FREE_KBYTES=21, /* Minimum free kilobytes to maintain */
>  	VM_MAX_MAP_COUNT=22, /* int: Maximum number of mmaps/address-space */
> -	VM_LAPTOP_MODE=23, /* vm laptop mode */
> +

There are 8 earlier enums here with names like VM_UNUSED* along with
the information on what were they. Should we have something similar for
this one? Something like:

VM_UNUSED10=23, /* was vm laptop mode */
* Re: retiring laptop_mode? was Re: [PATCH] mm: vmscan: always allow writeback during memcg reclaim
2025-12-16 23:23 ` Shakeel Butt
@ 2025-12-17 19:59 ` Johannes Weiner
2025-12-18 7:21 ` Shakeel Butt
0 siblings, 1 reply; 20+ messages in thread
From: Johannes Weiner @ 2025-12-17 19:59 UTC (permalink / raw)
To: Shakeel Butt
Cc: Christoph Hellwig, Jens Axboe, Deepanshu Kartikey, akpm, linux-mm, linux-kernel, linux-block

On Tue, Dec 16, 2025 at 03:23:53PM -0800, Shakeel Butt wrote:
> On Tue, Dec 16, 2025 at 01:52:01PM -0500, Johannes Weiner wrote:

> Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>

Thanks!

> > --- a/include/uapi/linux/sysctl.h
> > +++ b/include/uapi/linux/sysctl.h
> > @@ -183,7 +183,7 @@ enum
> >  	VM_LOWMEM_RESERVE_RATIO=20,/* reservation ratio for lower memory zones */
> >  	VM_MIN_FREE_KBYTES=21, /* Minimum free kilobytes to maintain */
> >  	VM_MAX_MAP_COUNT=22, /* int: Maximum number of mmaps/address-space */
> > -	VM_LAPTOP_MODE=23, /* vm laptop mode */
> > +
>
> There are 8 earlier enums here with names like VM_UNUSED* along with
> the information on what were they. Should we have something similar for
> this one? Something like:
>
> VM_UNUSED10=23, /* was vm laptop mode */

The other enums in that file leave holes, the VM ones have a mix of
VM_UNUSED and holes. I don't think it matters either way since the
sysctl syscall has been removed and nothing new should be compiled
against the definitions in this file, right?
* Re: retiring laptop_mode? was Re: [PATCH] mm: vmscan: always allow writeback during memcg reclaim
2025-12-17 19:59 ` Johannes Weiner
@ 2025-12-18 7:21 ` Shakeel Butt
0 siblings, 0 replies; 20+ messages in thread
From: Shakeel Butt @ 2025-12-18 7:21 UTC (permalink / raw)
To: Johannes Weiner
Cc: Christoph Hellwig, Jens Axboe, Deepanshu Kartikey, akpm, linux-mm, linux-kernel, linux-block

On Wed, Dec 17, 2025 at 02:59:20PM -0500, Johannes Weiner wrote:
> On Tue, Dec 16, 2025 at 03:23:53PM -0800, Shakeel Butt wrote:
> > On Tue, Dec 16, 2025 at 01:52:01PM -0500, Johannes Weiner wrote:
>
> > Reviewed-by: Shakeel Butt <shakeel.butt@linux.dev>
>
> Thanks!
>
> > > --- a/include/uapi/linux/sysctl.h
> > > +++ b/include/uapi/linux/sysctl.h
> > > @@ -183,7 +183,7 @@ enum
> > >  	VM_LOWMEM_RESERVE_RATIO=20,/* reservation ratio for lower memory zones */
> > >  	VM_MIN_FREE_KBYTES=21, /* Minimum free kilobytes to maintain */
> > >  	VM_MAX_MAP_COUNT=22, /* int: Maximum number of mmaps/address-space */
> > > -	VM_LAPTOP_MODE=23, /* vm laptop mode */
> > > +
> >
> > There are 8 earlier enums here with names like VM_UNUSED* along with
> > the information on what were they. Should we have something similar for
> > this one? Something like:
> >
> > VM_UNUSED10=23, /* was vm laptop mode */
>
> The other enums in that file leave holes, the VM ones have a mix of
> VM_UNUSED and holes. I don't think it matters either way since the
> sysctl syscall has been removed and nothing new should be compiled
> against the definitions in this file, right?

Yes, you are right.
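The point settled in this sub-thread, that a hole and a VM_UNUSED-style placeholder are equivalent for the surviving entries, can be checked with a small standalone sketch. The enum and identifier names below are hypothetical (prefixed so they cannot be mistaken for the real uapi names); only the numeric values 22, 23 and 24 are taken from the quoted hunk.

```c
#include <assert.h>

/* Variant 1: leave a hole at 23, as the patch does. */
enum vm_with_hole {
	HOLE_VM_MAX_MAP_COUNT = 22,
	/* 23 was VM_LAPTOP_MODE */
	HOLE_VM_BLOCK_DUMP = 24,
};

/* Variant 2: keep a named placeholder, as suggested in review. */
enum vm_with_placeholder {
	PH_VM_MAX_MAP_COUNT = 22,
	PH_VM_UNUSED10 = 23,	/* was vm laptop mode */
	PH_VM_BLOCK_DUMP = 24,
};

/* Compile-time check that the placeholder occupies the freed slot. */
static_assert(PH_VM_UNUSED10 == 23, "placeholder keeps slot 23");

/* Returns 1 if both variants give the surviving entries the same numbers. */
static int numbering_is_identical(void)
{
	return (int)HOLE_VM_MAX_MAP_COUNT == (int)PH_VM_MAX_MAP_COUNT &&
	       (int)HOLE_VM_BLOCK_DUMP == (int)PH_VM_BLOCK_DUMP;
}
```

Because every enumerator carries an explicit value, removing one entry cannot renumber the rest; the placeholder only documents what slot 23 used to be.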
* Re: retiring laptop_mode? was Re: [PATCH] mm: vmscan: always allow writeback during memcg reclaim
2025-12-16 18:52 ` Johannes Weiner
2025-12-16 18:54 ` Jens Axboe
2025-12-16 23:23 ` Shakeel Butt
@ 2025-12-17 19:34 ` Michal Hocko
2025-12-18 6:00 ` Christoph Hellwig
3 siblings, 0 replies; 20+ messages in thread
From: Michal Hocko @ 2025-12-17 19:34 UTC (permalink / raw)
To: Johannes Weiner
Cc: Christoph Hellwig, Jens Axboe, Deepanshu Kartikey, akpm, linux-mm, linux-kernel, linux-block

On Tue 16-12-25 13:52:01, Johannes Weiner wrote:
> On Mon, Dec 15, 2025 at 11:41:07PM -0800, Christoph Hellwig wrote:
> > On Mon, Dec 15, 2025 at 03:08:38PM -0500, Johannes Weiner wrote:
> > > Debated whether to add some sort of deprecation sysctl handler, but at
> > > least systemd-sysctl just prints a warning and still applies other
> > > settings from the same config file.
> >
> > In general dropping sysctl will break things. So I think we'll need
> > a stub, at which point it might as well warn for a while.
>
> Fair enough, I added that.
>
> Jens, that change seemed small enough that I carried your Ack, but
> please let me know if you feel otherwise ;)
>
> > > Laptop mode was introduced to save battery, by delaying and
> > > consolidating writes and maximize the time rotating hard drives
> > > wouldn't have to spin. Needless to say, this is a scenario of the
> > > (in)glorious past.
> >
> > Maybe expand on this a bit by mentioning that reclaim now never does
> > file system writeback, and fs writeback is already very lumpy by
> > design. And of cours that hard disk with their high spinup latency
> > and extra power draw are a thing of the past in laptops or other mobile
> > devices.
>
> Sounds good. Can you take a look at the new version below?
>
> Andrew, absent any further objections, would you be able to take this
> through the -mm tree?
>
> Thanks!
>
> >From 087f10b8046864f71ebc3a3f3316b097932cbded Mon Sep 17 00:00:00 2001
> From: Johannes Weiner <hannes@cmpxchg.org>
> Date: Mon, 15 Dec 2025 12:57:53 -0500
> Subject: [PATCH] mm/block/fs: remove laptop_mode
>
> Laptop mode was introduced to save battery, by delaying and
> consolidating writes and thereby maximize the time rotating hard
> drives wouldn't have to spin.
>
> Luckily, rotating hard drives, with their high spin-up times and power
> draw, are a thing of the past for battery-powered devices. Reclaim has
> also since changed to not write single filesystem pages anymore, and
> regular filesystem writeback is lumpy by design.
>
> The juice doesn't appear worth the squeeze anymore. The footprint of
> the feature is small, but nevertheless it's a complicating factor in
> mm, block, filesystems. Developers don't think about it, and it likely
> hasn't been tested with new reclaim and writeback changes in years.
>
> Let's sunset it. Keep the sysctl with a deprecation warning around for
> a few more cycles, but remove all functionality behind it.
>
> Suggested-by: Christoph Hellwig <hch@infradead.org>
> Message-ID: <aT-xv1BNYabnZB_n@infradead.org>
> Acked-by: Jens Axboe <axboe@kernel.dk>
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>

Acked-by: Michal Hocko <mhocko@suse.com>
Thanks!

-- 
Michal Hocko
SUSE Labs
* Re: retiring laptop_mode? was Re: [PATCH] mm: vmscan: always allow writeback during memcg reclaim
2025-12-16 18:52 ` Johannes Weiner
` (2 preceding siblings ...)
2025-12-17 19:34 ` Michal Hocko
@ 2025-12-18 6:00 ` Christoph Hellwig
3 siblings, 0 replies; 20+ messages in thread
From: Christoph Hellwig @ 2025-12-18 6:00 UTC (permalink / raw)
To: Johannes Weiner
Cc: Christoph Hellwig, Jens Axboe, Deepanshu Kartikey, akpm, linux-mm, linux-kernel, linux-block

On Tue, Dec 16, 2025 at 01:52:01PM -0500, Johannes Weiner wrote:
> On Mon, Dec 15, 2025 at 11:41:07PM -0800, Christoph Hellwig wrote:
> > On Mon, Dec 15, 2025 at 03:08:38PM -0500, Johannes Weiner wrote:
> > > Debated whether to add some sort of deprecation sysctl handler, but at
> > > least systemd-sysctl just prints a warning and still applies other
> > > settings from the same config file.
> >
> > In general dropping sysctl will break things. So I think we'll need
> > a stub, at which point it might as well warn for a while.
>
> Fair enough, I added that.
>
> Jens, that change seemed small enough that I carried your Ack, but
> please let me know if you feel otherwise ;)
>
> > > Laptop mode was introduced to save battery, by delaying and
> > > consolidating writes and maximize the time rotating hard drives
> > > wouldn't have to spin. Needless to say, this is a scenario of the
> > > (in)glorious past.
> >
> > Maybe expand on this a bit by mentioning that reclaim now never does
> > file system writeback, and fs writeback is already very lumpy by
> > design. And of cours that hard disk with their high spinup latency
> > and extra power draw are a thing of the past in laptops or other mobile
> > devices.
>
> Sounds good. Can you take a look at the new version below?

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>
* Re: [PATCH] mm: vmscan: always allow writeback during memcg reclaim
2025-12-15 4:12 ` Johannes Weiner
2025-12-15 4:51 ` Deepanshu Kartikey
2025-12-15 6:59 ` retiring laptop_mode? was " Christoph Hellwig
@ 2025-12-15 17:49 ` Michal Hocko
2 siblings, 0 replies; 20+ messages in thread
From: Michal Hocko @ 2025-12-15 17:49 UTC (permalink / raw)
To: Johannes Weiner
Cc: Deepanshu Kartikey, akpm, axelrasmussen, yuanchu, weixugc, david, zhengqi.arch, shakeel.butt, lorenzo.stoakes, yuzhao, heftig, oleksandr, bgeffon, linux-mm, linux-kernel, syzbot+90fcab4d88cffed6d0d8

On Sun 14-12-25 23:12:00, Johannes Weiner wrote:
> On Sat, Dec 13, 2025 at 02:06:39PM +0530, Deepanshu Kartikey wrote:
> > When laptop_mode is enabled, may_writepage is set to 0 in
> > try_to_free_mem_cgroup_pages(). This triggers a warning in MGLRU's
> > lru_gen_shrink_lruvec():
> >
> > VM_WARN_ON_ONCE(!sc->may_writepage || !sc->may_unmap);
> >
> > The warning occurs because MGLRU expects full reclaim capabilities to
> > function correctly. The call path is:
> >
> > mem_cgroup_resize_max()
> > try_to_free_mem_cgroup_pages()
> > do_try_to_free_pages()
> > shrink_node()
> > shrink_lruvec()
> > lru_gen_shrink_lruvec() <-- WARNING
> >
> > Unlike kswapd or direct reclaim where laptop_mode's disk-saving behavior
> > is a reasonable optimization, memcg limit enforcement is a hard
> > requirement - memory MUST be freed when a cgroup exceeds its limit.
>
> That reasoning doesn't make sense to me. Reclaim is always in response
> to an allocation need. The laptop_mode idea applies to cgroup reclaim
> as much as any other reclaim.
>
> Now obviously all of this is pretty dated. Reclaim doesn't do
> filesystem writes anymore, and I'm not sure there are a whole lot of
> laptops with rotational drives left, either. Also I doubt anybody is
> still using zone_reclaim_mode (which is where the may_unmap is from).
>
> But let's not introduce more inconsistencies, please. The only thing
> weird here is the MGLRU warning. What is it trying to assert? Clearly
> whatever assumption was made here has never been true.

Completely agreed. This patch seems to just paper over a warning that
seems dubious while doing something that doesn't make much sense in
itself.

Dropping laptop_mode from the memory reclaim seems like the right
direction anyway. I seriously doubt that it makes any practical or
measurable difference even on slow rotating storage laptops these days.

-- 
Michal Hocko
SUSE Labs
end of thread, other threads:[~2025-12-19 5:14 UTC | newest]

Thread overview: 20+ messages
2025-12-13 8:36 [PATCH] mm: vmscan: always allow writeback during memcg reclaim Deepanshu Kartikey
2025-12-14 23:49 ` Andrew Morton
2025-12-15 4:12 ` Johannes Weiner
2025-12-15 4:51 ` Deepanshu Kartikey
2025-12-15 19:42 ` Yuanchu Xie
2025-12-15 20:22 ` Johannes Weiner
2025-12-19 5:13 ` Kairui Song
2025-12-15 6:59 ` retiring laptop_mode? was " Christoph Hellwig
2025-12-15 16:33 ` Jens Axboe
2025-12-15 20:08 ` Johannes Weiner
2025-12-16 2:23 ` Jens Axboe
2025-12-16 7:41 ` Christoph Hellwig
2025-12-16 18:52 ` Johannes Weiner
2025-12-16 18:54 ` Jens Axboe
2025-12-16 23:23 ` Shakeel Butt
2025-12-17 19:59 ` Johannes Weiner
2025-12-18 7:21 ` Shakeel Butt
2025-12-17 19:34 ` Michal Hocko
2025-12-18 6:00 ` Christoph Hellwig
2025-12-15 17:49 ` Michal Hocko