* re: is hibernation usable?
@ 2020-02-11 19:50 Chris Murphy
  2020-02-11 22:23 ` Luigi Semenzato
  0 siblings, 1 reply; 13+ messages in thread

From: Chris Murphy @ 2020-02-11 19:50 UTC (permalink / raw)
To: linux-mm; +Cc: semenzato

Original thread:
https://lore.kernel.org/linux-mm/CAA25o9RSWPX8L3s=r6A+4oSdQyvGfWZ1bhKfGvSo5nN-X58HQA@mail.gmail.com/

This whole thread is a revelation. I have no doubt most users have no
idea that hibernation image creation is expected to fail if more than
50% of RAM is in use. Please bear with me while I ask some possibly
rudimentary questions to ensure I understand this in simple terms.

Example system: 32G RAM, all of it used, plus 2G of page outs (into
the swap device).

+ 2G already paged out to swap
+ 16GB that needs to be paged out to swap, to free up enough memory to
  create the hibernation image
+ 8-16GB for the (compressed) hibernation image, written to a
  *contiguous* range within the swap device

This suggests a 26G-34G swap device, correct? (I realize that this
swap device could, in another example, already contain more than 2G of
page outs, and that would only increase this requirement.)

Is there now (or planned) an automatic kernel facility that will do
the eviction automatically, to free up enough memory, so that the
hibernation image can always be successfully created in memory? If
not, does this suggest that some facility needs to be created, maybe
in systemd, coordinating with the desktop environment? I don't need to
understand the details, but I do want to understand whether this
exists, will exist, and where it will exist.

One idea floated on Fedora devel@ a few months ago by a systemd
developer is to activate a swap device at hibernation time. That way
the system is constrained to a smaller swap device during normal use,
e.g. swap on /dev/zram, but can still hibernate by activating a
suitably sized swap device on demand. Do you anticipate any problems
with this idea? Could it be subject to race conditions?
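[The sizing arithmetic above can be sketched as a quick shell calculation. This is a back-of-envelope estimate, not a kernel-documented formula; the assumption that the compressed image occupies 25-50% of RAM is carried over from the example figures.]

```shell
# Rough hibernation swap sizing (illustrative arithmetic only).
ram_gib=32          # total RAM in the example
paged_out_gib=2     # already paged out to swap
evict_gib=$(( ram_gib / 2 ))       # ~16G to evict so the image fits in free RAM
image_min_gib=$(( ram_gib / 4 ))   # compressed image, low estimate: 8G
image_max_gib=$(( ram_gib / 2 ))   # compressed image, high estimate: 16G
swap_min_gib=$(( paged_out_gib + evict_gib + image_min_gib ))
swap_max_gib=$(( paged_out_gib + evict_gib + image_max_gib ))
echo "suggested swap device: ${swap_min_gib}G-${swap_max_gib}G"
```

[For the example system this prints the same 26G-34G range as estimated above.]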
Is there any difference in hibernation reliability between swap
partitions versus swapfiles? I note there isn't a standard interface
for all file systems; notably, Btrfs has a unique requirement. [1]

Are there any prospects for signed hibernation images, in order to
support hibernation when UEFI Secure Boot is enabled?

What about the signing of swap? If there's a trust concern with the
hibernation image, and I agree that there is in the context of UEFI
SB, then it seems there's likewise a concern about active pages in
swap. Yes? No?

[1] https://lore.kernel.org/linux-btrfs/CAJCQCtSLYY-AY8b1WZ1D4neTrwMsm_A61-G-8e6-H3Dmfue_vQ@mail.gmail.com/

Thanks!

--
Chris Murphy

^ permalink raw reply	[flat|nested] 13+ messages in thread
* Re: is hibernation usable?
  2020-02-11 19:50 is hibernation usable? Chris Murphy
@ 2020-02-11 22:23 ` Luigi Semenzato
  2020-02-20  2:54   ` Chris Murphy
  0 siblings, 1 reply; 13+ messages in thread

From: Luigi Semenzato @ 2020-02-11 22:23 UTC (permalink / raw)
To: Chris Murphy; +Cc: Linux Memory Management List

On Tue, Feb 11, 2020 at 11:50 AM Chris Murphy <lists@colorremedies.com> wrote:
>
> Original thread:
> https://lore.kernel.org/linux-mm/CAA25o9RSWPX8L3s=r6A+4oSdQyvGfWZ1bhKfGvSo5nN-X58HQA@mail.gmail.com/
>
> This whole thread is a revelation. I have no doubt most users have no
> idea that hibernation image creation is expected to fail if more than
> 50% RAM is used. Please bear with me while I ask some possibly
> rudimentary questions to ensure I understand this in simple terms.

To be clear, I am not completely sure of this. Other developers are
not in agreement with this (as you can see from the thread). However,
I can easily and consistently reproduce the memory allocation failure
when anon is >50% of total. According to others, the image allocation
should reclaim pages by forcing anon pages to swap. I don't
understand if/how the swap partition accommodates both swapped pages
and the hibernation image, but in any case, in my experiments I
allocate a swap disk the same size as RAM, which should be sufficient
(again, according to the threads).

> Example system: 32G RAM, all of it used, plus 2G of page outs (into
> the swap device).
>
> + 2G already paged out to swap
> + 16GB needs to be paged out to swap, to free up enough memory to
> create the hibernation image
> + 8-16GB for the (compressed) hibernation image to be written to a
> *contiguous* range within swap device
>
> This suggests a 26G-34G swap device, correct? (I realize that this
> swap device could, in another example, contain more than 2G of page
> outs already, and that would only increase this requirement.)
>
> Is there now (or planned) an automatic kernel facility that will do
> the eviction automatically, to free up enough memory, so that the
> hibernation image can always be successfully created in-memory? If
> not, does this suggest some facility needs to be created, maybe in
> systemd, coordinating with the desktop environment? I don't need to
> understand the details but I do want to understand if this exists,
> will exist, and where it will exist.

I have a workaround, but it needs memcgroups. You can

  echo $limit > .../$cgroup/memory.limit_in_bytes

and if your current usage is greater than $limit, and you have swap,
the operation will block until enough pages have been swapped out to
satisfy the limit. Even this isn't guaranteed to work, even with
enough free swap. The limit adjustment invokes
mem_cgroup_resize_limit(), which contains a loop with multiple retries
of a call to do_try_to_free_pages(). The number of retries looks like
a heuristic, and I've seen the resizing fail.

> One idea floated on Fedora devel@ a few months ago by a systemd
> developer, is to activate a swap device at hibernation time. That way
> the system is constrained to a smaller swap device, e.g. swap on
> /dev/zram during normal use, but can still hibernate by activating a
> suitably sized swap device on-demand. Do you anticipate any problems
> with this idea? Could it be subject to race conditions?
>
> Is there any difference in hibernation reliability between swap
> partitions, versus swapfiles? I note there isn't a standard interface
> for all file systems, notably Btrfs has a unique requirement [1]
>
> Are there any prospects for signed hibernation images, in order to
> support hibernation when UEFI Secure Boot is enabled?
>
> What about the signing of swap? If there's a trust concern with the
> hibernation image, and I agree that there is in the context of UEFI
> SB, then it seems there's likewise a concern about active pages in
> swap. Yes? No?
>
>
> [1]
> https://lore.kernel.org/linux-btrfs/CAJCQCtSLYY-AY8b1WZ1D4neTrwMsm_A61-G-8e6-H3Dmfue_vQ@mail.gmail.com/
>
> Thanks!
>
> --
> Chris Murphy
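[The memcgroup eviction workaround described in the reply above can be sketched in shell. Hedged: the cgroup v1 mount point /sys/fs/cgroup/memory and the cgroup name "hibernate" are illustrative assumptions, not from the thread; only the unprivileged MemTotal arithmetic is meant to be exercised as-is.]

```shell
#!/bin/sh
# Compute a target limit of half of MemTotal, in bytes, from a
# meminfo-format file (pass /proc/meminfo on a real system).
target_bytes() {
    awk '/^MemTotal:/ { print int($2 * 1024 / 2) }' "$1"
}

# Apply the limit to a hypothetical "hibernate" cgroup (cgroup v1 path;
# requires root). The write blocks until usage drops below the limit,
# or fails after the kernel's internal retries, as noted above.
force_swap_out() {
    limit=$1
    echo "$limit" > /sys/fs/cgroup/memory/hibernate/memory.limit_in_bytes
}
```

[For the 2985944 kB VM used later in this thread, target_bytes computes 1528803328 bytes.]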
* Re: is hibernation usable?
  2020-02-11 22:23 ` Luigi Semenzato
@ 2020-02-20  2:54   ` Chris Murphy
  2020-02-20  2:56     ` Chris Murphy
  0 siblings, 1 reply; 13+ messages in thread

From: Chris Murphy @ 2020-02-20 2:54 UTC (permalink / raw)
To: Luigi Semenzato; +Cc: Linux Memory Management List

On Tue, Feb 11, 2020 at 3:23 PM Luigi Semenzato <semenzato@google.com> wrote:
>
> On Tue, Feb 11, 2020 at 11:50 AM Chris Murphy <lists@colorremedies.com> wrote:
> >
> > Original thread:
> > https://lore.kernel.org/linux-mm/CAA25o9RSWPX8L3s=r6A+4oSdQyvGfWZ1bhKfGvSo5nN-X58HQA@mail.gmail.com/
> >
> > This whole thread is a revelation. I have no doubt most users have no
> > idea that hibernation image creation is expected to fail if more than
> > 50% RAM is used. Please bear with me while I ask some possibly
> > rudimentary questions to ensure I understand this in simple terms.
>
> To be clear, I am not completely sure of this. Other developers are
> not in agreement with this (as you can see from the thread). However,
> I can easily and consistently reproduce the memory allocation failure
> when anon is >50% of total. According to others, the image allocation
> should reclaim pages by forcing anon pages to swap. I don't
> understand if/how the swap partition accommodates both swapped pages
> and the hibernation image, but in any case, in my experiments, I
> allocate a swap disk the same size of RAM, which should be sufficient
> (again, according to the threads).

I'm testing with this method:

# echo reboot > /sys/power/disk
# echo disk > /sys/power/state

About 2/3 of the time on a test system, hibernation entry fails. It's
fatal. The last journal entry is:

[  349.732372] PM: hibernation: hibernation entry

The screen is blank, the system gets hot, fans go to high, and it
doesn't recover after 15 minutes. After forcing power off and
rebooting, there is no hibernation signature reported in the swap
partition, so I don't think the kernel ever reached reboot.
Shifting over to a qemu-kvm guest with PM support enabled, this is
working. If I fill up pretty much all of RAM and a small amount of
swap is used, the above two commands succeed, the VM reboots, and the
hibernation image is resumed without error. AnonPages is 73% of total.
Upon successful resume, it appears quite a lot of pages were pushed to
swap. It looks like about 1GiB was paged out.

Before hibernation:

$ cat /proc/meminfo
MemTotal:        2985944 kB
MemFree:          148376 kB
MemAvailable:     220428 kB
Buffers:             172 kB
Cached:           366100 kB
SwapCached:         4632 kB
Active:          1962088 kB
Inactive:         592576 kB
Active(anon):    1842560 kB
Inactive(anon):   467904 kB
Active(file):     119528 kB
Inactive(file):   124672 kB
Unevictable:        1628 kB
Mlocked:            1628 kB
SwapTotal:       3117052 kB
SwapFree:        2899952 kB
Dirty:              6248 kB
Writeback:             0 kB
AnonPages:       2187236 kB
Mapped:           245800 kB
Shmem:            120504 kB
KReclaimable:      58016 kB
Slab:             203260 kB
SReclaimable:      58016 kB
SUnreclaim:       145244 kB
KernelStack:       13712 kB
PageTables:        23364 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     4610024 kB
Committed_AS:    6019396 kB
VmallocTotal:   34359738367 kB
VmallocUsed:       27528 kB
VmallocChunk:          0 kB
Percpu:             4016 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
FileHugePages:         0 kB
FilePmdMapped:         0 kB
CmaTotal:              0 kB
CmaFree:               0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:               0 kB
DirectMap4k:      238332 kB
DirectMap2M:     2904064 kB

After resume:

[chris@vm ~]$ cat /proc/meminfo
MemTotal:        2985944 kB
MemFree:         1007132 kB
MemAvailable:    1069576 kB
Buffers:              76 kB
Cached:           400464 kB
SwapCached:       296112 kB
Active:           755856 kB
Inactive:         955624 kB
Active(anon):     731668 kB
Inactive(anon):   683352 kB
Active(file):      24188 kB
Inactive(file):   272272 kB
Unevictable:        1632 kB
Mlocked:            1632 kB
SwapTotal:       3117052 kB
SwapFree:        1874788 kB
Dirty:              2716 kB
Writeback:             0 kB
AnonPages:       1182108 kB
Mapped:           225352 kB
Shmem:            102480 kB
KReclaimable:      48968 kB
Slab:             183104 kB
SReclaimable:      48968 kB
SUnreclaim:       134136 kB
KernelStack:       14000 kB
PageTables:        22924 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:     4610024 kB
Committed_AS:    5937732 kB
VmallocTotal:   34359738367 kB
VmallocUsed:       27800 kB
VmallocChunk:          0 kB
Percpu:             4016 kB
HardwareCorrupted:     0 kB
AnonHugePages:         0 kB
ShmemHugePages:        0 kB
ShmemPmdMapped:        0 kB
FileHugePages:         0 kB
FilePmdMapped:         0 kB
CmaTotal:              0 kB
CmaFree:               0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
Hugetlb:               0 kB
DirectMap4k:      238332 kB
DirectMap2M:     2904064 kB

There must be some other cause for the 50% limitation. Is it possible
it only starts once there's a certain amount of RAM present? E.g.
maybe it can only page out 4GiB of anon pages to swap? And after that
point, if at least 50% of RAM isn't available, hibernation image
creation fails?

--
Chris Murphy
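[The before/after comparison done by eye above can be automated with a short awk pass over two saved meminfo snapshots. This is a convenience sketch; the field names are the standard /proc/meminfo keys.]

```shell
# Print the SwapFree and AnonPages deltas between two /proc/meminfo
# snapshots, e.g. saved with `cat /proc/meminfo > before.txt`.
# Usage: meminfo_delta before.txt after.txt
meminfo_delta() {
    awk '
        FNR == NR { before[$1] = $2; next }   # first file: pre-hibernation
        { after[$1] = $2 }                    # second file: post-resume
        END {
            printf "swap used delta: %d kB\n", before["SwapFree:"] - after["SwapFree:"]
            printf "AnonPages delta: %d kB\n", before["AnonPages:"] - after["AnonPages:"]
        }' "$1" "$2"
}
```

[Fed the two snapshots above, it reports 2899952 - 1874788 = 1025164 kB of additional swap use, matching the "about 1GiB paged out" observation.]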
* Re: is hibernation usable?
  2020-02-20  2:54 ` Chris Murphy
@ 2020-02-20  2:56   ` Chris Murphy
  [not found]       ` <CAA25o9T2wwqoopoNRySdZoYkD+vtqRPsB1YPnag=TkOp5D9sYA@mail.gmail.com>
  0 siblings, 1 reply; 13+ messages in thread

From: Chris Murphy @ 2020-02-20 2:56 UTC (permalink / raw)
To: Linux Memory Management List; +Cc: Luigi Semenzato

Also, is this the correct list for hibernation/swap discussion? Or linux-pm@?

Thanks,

Chris Murphy
[parent not found: <CAA25o9T2wwqoopoNRySdZoYkD+vtqRPsB1YPnag=TkOp5D9sYA@mail.gmail.com>]
* Re: is hibernation usable?
  [not found] ` <CAA25o9T2wwqoopoNRySdZoYkD+vtqRPsB1YPnag=TkOp5D9sYA@mail.gmail.com>
@ 2020-02-20 17:38 ` Luigi Semenzato
  2020-02-21  8:49   ` Michal Hocko
  [not found]       ` <CAJCQCtScZg1CP2WTDoOy4-urPbvP_5Hw0H-AKTwHugN9YhdxLg@mail.gmail.com>
  1 sibling, 1 reply; 13+ messages in thread

From: Luigi Semenzato @ 2020-02-20 17:38 UTC (permalink / raw)
To: Chris Murphy; +Cc: Linux Memory Management List, Linux PM

I was forgetting: forcing swap by eating up memory is dangerous
because it can lead to unexpected OOM kills, but you can mitigate that
by giving the memory-eaters a higher OOM kill score. Still, some way
of calling try_to_free_pages() directly from user level would be
preferable. I wonder if such an API has been discussed.

On Thu, Feb 20, 2020 at 9:16 AM Luigi Semenzato <semenzato@google.com> wrote:
>
> I think this is the right group for the memory issues.
>
> I suspect that the problem with failed allocations (ENOMEM) boils down
> to the unreliability of the page allocator. In my experience, under
> pressure (i.e. pages must be swapped out to be reclaimed) allocations
> can fail even when in theory they should succeed. (I wish I were
> wrong and that someone would convincingly correct me.)
>
> I have a workaround in which I use memcgroups to free pages before
> starting hibernation. The cgroup request "echo $limit >
> .../memory.limit_in_bytes" blocks until memory usage in the chosen
> cgroup is below $limit. However, I have seen this request fail even
> when there is extra available swap space.
>
> The callback for the operation is mem_cgroup_resize_limit() (BTW I am
> looking at kernel version 4.3.5) and that code has a loop where
> try_to_free_pages() is called up to retry_count, which is at least 5.
> Why 5? One suspects that the writer of that code must have also
> realized that the page freeing request is unreliable and it's worth
> trying multiple times.
>
> So you could try something similar.
> I don't know if there are
> interfaces to try_to_free_pages() other than those in cgroups. If
> not, and you aren't using cgroups, one way might be to start several
> memory-eating processes (such as "dd if=/dev/zero bs=1G count=1 |
> sleep infinity") and monitor allocation, then when they use more than
> 50% of RAM kill them and immediately hibernate before the freed pages
> are reused. If you can build your custom kernel, maybe it's worth
> adding a sysfs entry to invoke try_to_free_pages(). You could also
> change the hibernation code to do that, but having the user-level hook
> may be more flexible.
>
>
> On Wed, Feb 19, 2020 at 6:56 PM Chris Murphy <lists@colorremedies.com> wrote:
> >
> > Also, is this the correct list for hibernation/swap discussion? Or linux-pm@?
> >
> > Thanks,
> >
> > Chris Murphy
* Re: is hibernation usable?
  2020-02-20 17:38 ` Luigi Semenzato
@ 2020-02-21  8:49   ` Michal Hocko
  2020-02-21  9:04     ` Rafael J. Wysocki
  0 siblings, 1 reply; 13+ messages in thread

From: Michal Hocko @ 2020-02-21 8:49 UTC (permalink / raw)
To: Luigi Semenzato; +Cc: Chris Murphy, Linux Memory Management List, Linux PM

On Thu 20-02-20 09:38:06, Luigi Semenzato wrote:
> I was forgetting: forcing swap by eating up memory is dangerous
> because it can lead to unexpected OOM kills

Could you be more specific about what you have in mind? swapoff
causing the OOM killer?

> , but you can mitigate that
> by giving the memory-eaters a higher OOM kill score. Still, some way
> of calling try_to_free_pages() directly from user-level would be
> preferable. I wonder if such API has been discussed.

No, there is no API to trigger the global memory reclaim. You could
start the reclaim by increasing min_free_kbytes, but I wouldn't really
recommend that unless you know exactly what you are doing, and also I
fail to see the point. If s2disk fails due to insufficient swap space,
then how can a pro-active reclaim help in the first place?

--
Michal Hocko
SUSE Labs
* Re: is hibernation usable?
  2020-02-21  8:49 ` Michal Hocko
@ 2020-02-21  9:04   ` Rafael J. Wysocki
  2020-02-21  9:36     ` Michal Hocko
  2020-02-21  9:46     ` Chris Murphy
  0 siblings, 2 replies; 13+ messages in thread

From: Rafael J. Wysocki @ 2020-02-21 9:04 UTC (permalink / raw)
To: Michal Hocko
Cc: Luigi Semenzato, Chris Murphy, Linux Memory Management List, Linux PM

On Fri, Feb 21, 2020 at 9:49 AM Michal Hocko <mhocko@kernel.org> wrote:
>
> On Thu 20-02-20 09:38:06, Luigi Semenzato wrote:
> > I was forgetting: forcing swap by eating up memory is dangerous
> > because it can lead to unexpected OOM kills
>
> Could you be more specific what you have in mind? swapoff causing the
> OOM killer?
>
> > , but you can mitigate that
> > by giving the memory-eaters a higher OOM kill score. Still, some way
> > of calling try_to_free_pages() directly from user-level would be
> > preferable. I wonder if such API has been discussed.
>
> No, there is no API to trigger the global memory reclaim. You could
> start the reclaim by increasing min_free_kbytes but I wouldn't really
> recommend that unless you know exactly what you are doing and also I
> fail to see the point. If s2disk fails due to insufficient swap space
> then how can a pro-active reclaim help in the first place?

My understanding of the problem is that the size of swap is
(theoretically) sufficient, but it is not used as expected during the
preallocation of image memory.

It was stated in one of the previous messages (not in this thread,
cannot find it now) that swap (of the same size as RAM) was activated
(swapon) right before hibernation, so theoretically that should be
sufficient AFAICS.
* Re: is hibernation usable?
  2020-02-21  9:04 ` Rafael J. Wysocki
@ 2020-02-21  9:36   ` Michal Hocko
  2020-02-21 17:13     ` Luigi Semenzato
  0 siblings, 1 reply; 13+ messages in thread

From: Michal Hocko @ 2020-02-21 9:36 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Luigi Semenzato, Chris Murphy, Linux Memory Management List, Linux PM

On Fri 21-02-20 10:04:18, Rafael J. Wysocki wrote:
> On Fri, Feb 21, 2020 at 9:49 AM Michal Hocko <mhocko@kernel.org> wrote:
> >
> > On Thu 20-02-20 09:38:06, Luigi Semenzato wrote:
> > > I was forgetting: forcing swap by eating up memory is dangerous
> > > because it can lead to unexpected OOM kills
> >
> > Could you be more specific what you have in mind? swapoff causing the
> > OOM killer?
> >
> > > , but you can mitigate that
> > > by giving the memory-eaters a higher OOM kill score. Still, some way
> > > of calling try_to_free_pages() directly from user-level would be
> > > preferable. I wonder if such API has been discussed.
> >
> > No, there is no API to trigger the global memory reclaim. You could
> > start the reclaim by increasing min_free_kbytes but I wouldn't really
> > recommend that unless you know exactly what you are doing and also I
> > fail to see the point. If s2disk fails due to insufficient swap space
> > then how can a pro-active reclaim help in the first place?
>
> My understanding of the problem is that the size of swap is
> (theoretically) sufficient, but it is not used as expected during the
> preallocation of image memory.
>
> It was stated in one of the previous messages (not in this thread,
> cannot find it now) that swap (of the same size as RAM) was activated
> (swapon) right before hibernation, so theoretically that should be
> sufficient AFAICS.

Hmm, this is interesting. Let me have a closer look...

pm_restrict_gfp_mask, which would completely rule out any IO, happens
after hibernate_preallocate_memory is done, and my limited
understanding tells me that this is where all the reclaim happens
(via shrink_all_memory). It is quite possible that the MM decides not
to swap in that path - depending on the memory usage - and misses its
target. More details would be needed. E.g. vmscan tracepoints could
tell us more.

--
Michal Hocko
SUSE Labs
* Re: is hibernation usable?
  2020-02-21  9:36 ` Michal Hocko
@ 2020-02-21 17:13   ` Luigi Semenzato
  0 siblings, 0 replies; 13+ messages in thread

From: Luigi Semenzato @ 2020-02-21 17:13 UTC (permalink / raw)
To: Michal Hocko
Cc: Rafael J. Wysocki, Chris Murphy, Linux Memory Management List, Linux PM

On Fri, Feb 21, 2020 at 1:36 AM Michal Hocko <mhocko@kernel.org> wrote:
>
> On Fri 21-02-20 10:04:18, Rafael J. Wysocki wrote:
> > On Fri, Feb 21, 2020 at 9:49 AM Michal Hocko <mhocko@kernel.org> wrote:
> > >
> > > On Thu 20-02-20 09:38:06, Luigi Semenzato wrote:
> > > > I was forgetting: forcing swap by eating up memory is dangerous
> > > > because it can lead to unexpected OOM kills
> > >
> > > Could you be more specific what you have in mind? swapoff causing the
> > > OOM killer?

No, not swapoff, just fast allocation. Also, in some earlier
experiments I tried gradually increasing min_free_kbytes (precisely as
suggested) and this would randomly trigger OOM kills when swap space
was still available.

> > > > , but you can mitigate that
> > > > by giving the memory-eaters a higher OOM kill score. Still, some way
> > > > of calling try_to_free_pages() directly from user-level would be
> > > > preferable. I wonder if such API has been discussed.
> > >
> > > No, there is no API to trigger the global memory reclaim. You could
> > > start the reclaim by increasing min_free_kbytes but I wouldn't really
> > > recommend that unless you know exactly what you are doing and also I
> > > fail to see the point. If s2disk fails due to insufficient swap space
> > > then how can a pro-active reclaim help in the first place?
> >
> > My understanding of the problem is that the size of swap is
> > (theoretically) sufficient, but it is not used as expected during the
> > preallocation of image memory.
> >
> > It was stated in one of the previous messages (not in this thread,
> > cannot find it now) that swap (of the same size as RAM) was activated
> > (swapon) right before hibernation, so theoretically that should be
> > sufficient AFAICS.

Correct, those were my experiments. Search the archives for
"semenzato"; there are a couple of threads on the topic.

But really, why not have a user-level interface for reclaim? I find it
very difficult to understand the behavior of the reclaim code, and any
attempt to reclaim from user level (memory-eating processes, raising
min_free_kbytes) can end in the OOM-kill path. Using cgroups'
memory.limit_in_bytes doesn't have this problem, precisely because it
only calls try_to_free_pages(), which doesn't trigger OOM killing. If
I could make that call from user level (without cgroups) it would
greatly simplify my current workaround, and it would be useful in
other situations as well. Something like

echo $page_count > /proc/sys/vm/try_to_free_pages
cat /proc/sys/vm/pages_freed  # the number of pages freed at the latest request

> Hmm, this is interesting. Let me have a closer look...
>
> pm_restrict_gfp_mask, which would completely rule out any IO, happens
> after hibernate_preallocate_memory is done, and my limited
> understanding tells me that this is where all the reclaim happens
> (via shrink_all_memory). It is quite possible that the MM decides not
> to swap in that path - depending on the memory usage - and misses its
> target. More details would be needed. E.g. vmscan tracepoints could
> tell us more.
>
> --
> Michal Hocko
> SUSE Labs
* Re: is hibernation usable?
  2020-02-21  9:04 ` Rafael J. Wysocki
  2020-02-21  9:36   ` Michal Hocko
@ 2020-02-21  9:46   ` Chris Murphy
  1 sibling, 0 replies; 13+ messages in thread

From: Chris Murphy @ 2020-02-21 9:46 UTC (permalink / raw)
To: Rafael J. Wysocki
Cc: Michal Hocko, Luigi Semenzato, Chris Murphy, Linux Memory Management List, Linux PM

On Fri, Feb 21, 2020 at 2:04 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> My understanding of the problem is that the size of swap is
> (theoretically) sufficient, but it is not used as expected during the
> preallocation of image memory.

Right. I have no idea how locality of pages is determined in the swap
device. But if it's sufficiently fragmented that there isn't enough
contiguous free space for a hibernation image, then hibernation could
fail.

> It was stated in one of the previous messages (not in this thread,
> cannot find it now) that swap (of the same size as RAM) was activated
> (swapon) right before hibernation, so theoretically that should be
> sufficient AFAICS.

I mentioned it as an idea floated by systemd developers. I'm not sure
if it's mentioned elsewhere. Some folks wonder if such functionality
could be prone to racing.
https://lore.kernel.org/linux-mm/CAJCQCtSx0FOX7q0p=9XgDLJ6O0+hF_vc-wU4KL=c9xoSGGkstA@mail.gmail.com/T/#m4d47d127da493f998b232d42d81621335358aee1

Another idea that's been suggested for a while is formally separating
hibernation and paging into separate files (or partitions).

a. Guarantees the hibernation image has the necessary contiguous free
   space.
b. Might be easier to create (or even obviate) a sane interface for
   hibernation images in swapfiles; that is, if it were a dedicated
   hibernation file rather than being inserted into a swapfile. Right
   now that interface doesn't exist, so e.g. on Btrfs, while it can
   support swapfiles and hibernation images, the offset has to be
   figured out manually so resume can succeed.

https://github.com/systemd/systemd/issues/11939#issuecomment-471684411

--
Chris Murphy
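[The manual offset discovery mentioned above can be sketched with filefrag(1), the approach the kernel's swap-files documentation describes for conventional filesystems. Hedged: the physical offset of the swapfile's first extent becomes the resume_offset= kernel parameter when the filesystem block size matches the 4 KiB page size; on Btrfs the offsets filefrag reported were historically not trustworthy, which is exactly the unique requirement referenced in this thread, and filefrag's exact output format varies by version.]

```shell
#!/bin/sh
# Extract the physical offset of a swapfile's first extent from
# `filefrag -v` output, e.g.:
#   filefrag -v /swapfile | resume_offset_from_filefrag
# The "0:" row is the first extent; field 4 is its physical offset,
# printed with a trailing ".." that we strip.
resume_offset_from_filefrag() {
    awk '$1 == "0:" { gsub(/\.\./, "", $4); print $4; exit }'
}
```

[The printed value would then go on the kernel command line as resume_offset=, alongside resume= naming the filesystem's block device.]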
[parent not found: <CAJCQCtScZg1CP2WTDoOy4-urPbvP_5Hw0H-AKTwHugN9YhdxLg@mail.gmail.com>]
* Re: is hibernation usable?
  [not found] ` <CAJCQCtScZg1CP2WTDoOy4-urPbvP_5Hw0H-AKTwHugN9YhdxLg@mail.gmail.com>
@ 2020-02-20 19:44 ` Luigi Semenzato
  2020-02-20 21:48   ` Chris Murphy
  2020-02-27  6:43   ` Chris Murphy
  0 siblings, 2 replies; 13+ messages in thread

From: Luigi Semenzato @ 2020-02-20 19:44 UTC (permalink / raw)
To: Chris Murphy; +Cc: Linux Memory Management List, Linux PM

On Thu, Feb 20, 2020 at 11:09 AM Chris Murphy <lists@colorremedies.com> wrote:
>
> On Thu, Feb 20, 2020 at 10:16 AM Luigi Semenzato <semenzato@google.com> wrote:
> >
> > I think this is the right group for the memory issues.
> >
> > I suspect that the problem with failed allocations (ENOMEM) boils down
> > to the unreliability of the page allocator. In my experience, under
> > pressure (i.e. pages must be swapped out to be reclaimed) allocations
> > can fail even when in theory they should succeed. (I wish I were
> > wrong and that someone would convincingly correct me.)
>
> What is vm.swappiness set to on your system? A fellow Fedora
> contributor who has consistently reproduced what you describe has
> discovered he has vm.swappiness=0, and even if it's set to 1, the
> problem no longer happens. And this is not a documented consequence of
> using a value of 0.

I am using the default value of 60.

A zero value should cause all file pages to be discarded before any
anonymous pages are swapped. I wonder if the fellow Fedora
contributor's workload has a lot of file pages, so that discarding
them is enough for the image allocator to succeed. In that case "sync;
echo 1 > /proc/sys/vm/drop_caches" would be a better way of achieving
the same result. (By the way, in my experiments I do that just before
hibernating.)

> > I have a workaround in which I use memcgroups to free pages before
> > starting hibernation. The cgroup request "echo $limit >
> > .../memory.limit_in_bytes" blocks until memory usage in the chosen
> > cgroup is below $limit. However, I have seen this request fail even
> > when there is extra available swap space.
> >
> > The callback for the operation is mem_cgroup_resize_limit() (BTW I am
> > looking at kernel version 4.3.5) and that code has a loop where
> > try_to_free_pages() is called up to retry_count, which is at least 5.
> > Why 5? One suspects that the writer of that code must have also
> > realized that the page freeing request is unreliable and it's worth
> > trying multiple times.
> >
> > So you could try something similar. I don't know if there are
> > interfaces to try_to_free_pages() other than those in cgroups. If
> > not, and you aren't using cgroups, one way might be to start several
> > memory-eating processes (such as "dd if=/dev/zero bs=1G count=1 |
> > sleep infinity") and monitor allocation, then when they use more than
> > 50% of RAM kill them and immediately hibernate before the freed pages
> > are reused. If you can build your custom kernel, maybe it's worth
> > adding a sysfs entry to invoke try_to_free_pages(). You could also
> > change the hibernation code to do that, but having the user-level hook
> > may be more flexible.
>
> Fedora 31+ now uses cgroups v2. In any case, my use case is making
> sure this works correctly, sanely, with mainline kernels, because
> Fedora doesn't do custom things with the kernel.
>
>
> --
> Chris Murphy
* Re: is hibernation usable?
  2020-02-20 19:44 ` Luigi Semenzato
@ 2020-02-20 21:48   ` Chris Murphy
  0 siblings, 0 replies; 13+ messages in thread

From: Chris Murphy @ 2020-02-20 21:48 UTC (permalink / raw)
To: Luigi Semenzato; +Cc: Chris Murphy, Linux Memory Management List, Linux PM

On Thu, Feb 20, 2020 at 12:45 PM Luigi Semenzato <semenzato@google.com> wrote:
>
> On Thu, Feb 20, 2020 at 11:09 AM Chris Murphy <lists@colorremedies.com> wrote:
> >
> > On Thu, Feb 20, 2020 at 10:16 AM Luigi Semenzato <semenzato@google.com> wrote:
> > >
> > > I think this is the right group for the memory issues.
> > >
> > > I suspect that the problem with failed allocations (ENOMEM) boils down
> > > to the unreliability of the page allocator. In my experience, under
> > > pressure (i.e. pages must be swapped out to be reclaimed) allocations
> > > can fail even when in theory they should succeed. (I wish I were
> > > wrong and that someone would convincingly correct me.)
> >
> > What is vm.swappiness set to on your system? A fellow Fedora
> > contributor who has consistently reproduced what you describe, has
> > discovered he has vm.swappiness=0, and even if it's set to 1, the
> > problem no longer happens. And this is not a documented consequence of
> > using a value of 0.
>
> I am using the default value of 60.
>
> A zero value should cause all file pages to be discarded before any
> anonymous pages are swapped. I wonder if the fellow Fedora
> contributor's workload has a lot of file pages, so that discarding
> them is enough for the image allocator to succeed. In that case "sync;
> echo 1 > /proc/sys/vm/drop_caches" would be a better way of achieving
> the same result. (By the way, in my experiments I do that just before
> hibernating.)

Unfortunately, I can't reproduce the graceful failure you describe
myself. I either get a successful hibernation/resume or some kind of
non-deterministic and fatal failure to enter hibernation - and any
dmesg/journal that might contain evidence of the failure is lost.

I've had better success with qemu-kvm testing, but even in that case I
see, about 1/4 of the time (with a ridiculously small sample size), a
failure to complete hibernation entry. I can't tell if the failure
happens during page out, hibernation image creation, or hibernation
image write out - but the result is a black screen (virt-manager
console), and the VM never shuts down or reboots; it just hangs and
spins at ~400% CPU (even though it's only assigned 3 CPUs). It's
sufficiently unreliable that I can't really consider it supported or
supportable.

Microsoft and Apple have put more emphasis lately on S0 low-power
idle, faster booting, and application state saving. The behavior in
Windows 10 with hiberfil.sys is a limited environment, essentially
that of the login window (no user environment state is saved in it),
and it is used both for resuming from S4 and for fast boot. A separate
file, pagefile.sys, is used for paging, so there's never a conflict
where a use case that depends on significant page out can prevent
hibernation from succeeding. It's also Secure Boot compatible, whereas
on Linux x86_64 it isn't. Between kernel, ACPI, and firmware bugs,
it's going to take a lot more effort to make it reliable and
trustworthy for the general case. Or it should just be abandoned; it
seems to be mostly that way already.

--
Chris Murphy
* Re: is hibernation usable?
  2020-02-20 19:44 ` Luigi Semenzato
  2020-02-20 21:48   ` Chris Murphy
@ 2020-02-27  6:43   ` Chris Murphy
  0 siblings, 0 replies; 13+ messages in thread

From: Chris Murphy @ 2020-02-27 6:43 UTC (permalink / raw)
To: Luigi Semenzato; +Cc: Chris Murphy, Linux Memory Management List, Linux PM

On Thu, Feb 20, 2020 at 12:45 PM Luigi Semenzato <semenzato@google.com> wrote:
>
> On Thu, Feb 20, 2020 at 11:09 AM Chris Murphy <lists@colorremedies.com> wrote:
> >
> > On Thu, Feb 20, 2020 at 10:16 AM Luigi Semenzato <semenzato@google.com> wrote:
> > >
> > > I think this is the right group for the memory issues.
> > >
> > > I suspect that the problem with failed allocations (ENOMEM) boils down
> > > to the unreliability of the page allocator. In my experience, under
> > > pressure (i.e. pages must be swapped out to be reclaimed) allocations
> > > can fail even when in theory they should succeed. (I wish I were
> > > wrong and that someone would convincingly correct me.)
> >
> > What is vm.swappiness set to on your system? A fellow Fedora
> > contributor who has consistently reproduced what you describe, has
> > discovered he has vm.swappiness=0, and even if it's set to 1, the
> > problem no longer happens. And this is not a documented consequence of
> > using a value of 0.
>
> I am using the default value of 60.
>
> A zero value should cause all file pages to be discarded before any
> anonymous pages are swapped. I wonder if the fellow Fedora
> contributor's workload has a lot of file pages, so that discarding
> them is enough for the image allocator to succeed. In that case "sync;
> echo 1 > /proc/sys/vm/drop_caches" would be a better way of achieving
> the same result. (By the way, in my experiments I do that just before
> hibernating.)

He reports hibernation failure even if he drops caches beforehand.
https://lists.fedoraproject.org/archives/list/desktop@lists.fedoraproject.org/message/XYWYF33RFVISVZTPYSJRRXP7TFXPV4GD/

--
Chris Murphy
Thread overview: 13+ messages
2020-02-11 19:50 is hibernation usable? Chris Murphy
2020-02-11 22:23 ` Luigi Semenzato
2020-02-20 2:54 ` Chris Murphy
2020-02-20 2:56 ` Chris Murphy
[not found] ` <CAA25o9T2wwqoopoNRySdZoYkD+vtqRPsB1YPnag=TkOp5D9sYA@mail.gmail.com>
2020-02-20 17:38 ` Luigi Semenzato
2020-02-21 8:49 ` Michal Hocko
2020-02-21 9:04 ` Rafael J. Wysocki
2020-02-21 9:36 ` Michal Hocko
2020-02-21 17:13 ` Luigi Semenzato
2020-02-21 9:46 ` Chris Murphy
[not found] ` <CAJCQCtScZg1CP2WTDoOy4-urPbvP_5Hw0H-AKTwHugN9YhdxLg@mail.gmail.com>
2020-02-20 19:44 ` Luigi Semenzato
2020-02-20 21:48 ` Chris Murphy
2020-02-27 6:43 ` Chris Murphy