0

I run a batch (snakemake -j 1) of memory-heavy operations in Python: subtracting two arrays of up to 15 GB each, then calculating norms of the difference. Surprisingly, my system started to misbehave:

  • Thunderbird crashes,
  • my graphical environment (XFCE with lightdm) crashes (effectively killing my screen sessions with the batch running),
  • after the graphical environment respawned, it swapped my monitors (pun intended) and did not allow me to re-swap them with Display settings; a service lightdm restart was necessary,
  • my snakemake pipeline (bash + Python + numpy + pandas) tends to fail with segmentation faults when processing the biggest arrays,
  • yesterday I discovered I lost audio from Firefox,
  • recently, after a pipeline and graphical session crash, one of the bash processes went wild (100% CPU usage).

I have plenty (932 GB) of swap available, so it is not that my system suddenly ran out of memory. The RAM chips also seem to work (17 passes of Memtest86+ revealed no errors).

I am asking about the reason behind the crashes/misbehaviour of other programs (Thunderbird, screen session, graphical environment). Even if my programs were poorly written, I would expect their impact to be limited to extensive swapping. A total XFCE session restart is something that definitely should not happen. And by restart I mean restart, not freezing or slowdown due to swapping.

abukaj
  • 485

2 Answers

3

Active memory pages are not swappable ... They have to be inactive, i.e. not currently used or needed by any process, in order to be candidates for being moved to swap.

Therefore, swap is not necessarily going to be used for all memory pages, and the availability of free swap space doesn't mean that your system's physical memory is safe from filling up in the meantime.
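One way to watch this while the pipeline runs is to read the kernel's own counters rather than only looking at free swap. The snippet below is a minimal sketch, assuming a Linux system that exposes /proc/meminfo; the helper names `parse_meminfo` and `read_meminfo` are made up for illustration.

```python
def parse_meminfo(text):
    """Parse the contents of /proc/meminfo into a dict of integers.

    Most values are reported by the kernel in kB.
    """
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not a "Key: value" line
        fields = rest.split()
        if fields:
            info[key.strip()] = int(fields[0])
    return info


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live counters (Linux only)."""
    with open(path) as fh:
        return parse_meminfo(fh.read())


if __name__ == "__main__":
    mem = read_meminfo()
    # Plenty of SwapFree together with a tiny MemAvailable is the
    # situation described above.
    for key in ("MemAvailable", "SwapTotal", "SwapFree"):
        print(key, mem.get(key), "kB")
```

Logging these values periodically next to the pipeline may show whether physical memory really runs low shortly before a crash.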

The segmentation faults are most likely due to applications failing to allocate memory because there is not enough free memory.

Bottom line: swap is managed only by the kernel, not by userspace applications, and the kernel only swaps inactive memory pages ... When one application actively uses a large amount of memory at once, the kernel will not selectively swap any of that memory ... So, fixing your application's memory usage is the way around this issue.
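As an illustration of what "fixing memory usage" could look like for the workload in the question, here is a minimal sketch that computes norms of a difference chunk by chunk, so the full difference array never exists in memory. It assumes two 1-D NumPy arrays of equal length; the function name `diff_norms` and the chunk size are made up for this example and are not taken from the asker's pipeline.

```python
import numpy as np


def diff_norms(a, b, chunk_size=1_000_000):
    """Return the (L1, L2, Linf) norms of a - b, processing one chunk at a time."""
    if a.ndim != 1 or a.shape != b.shape:
        raise ValueError("expected two 1-D arrays of equal length")
    l1 = 0.0
    sum_sq = 0.0
    linf = 0.0
    for start in range(0, a.shape[0], chunk_size):
        stop = start + chunk_size
        # Only this chunk of the difference is held in memory.
        d = np.asarray(a[start:stop], dtype=np.float64) - b[start:stop]
        abs_d = np.abs(d)
        l1 += abs_d.sum()
        sum_sq += np.dot(d, d)
        linf = max(linf, abs_d.max())
    return float(l1), float(np.sqrt(sum_sq)), float(linf)
```

If the arrays are stored as .npy files, opening them with `np.load(path, mmap_mode="r")` instead of loading them fully should let the same function read the data from disk in pieces; whether that fits the actual pipeline is an assumption.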

Maybe it's time to stop thinking about fixing Ubuntu to work with your application and start thinking about fixing your own application's memory usage. A good deep look with a memory profiler is a good starting point ... See for example:

Check memory usage of process which exits immediately
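For a quick first look from inside Python, the standard-library `tracemalloc` module can report the peak amount of memory allocated while a function runs. This is only a rough sketch: the wrapper name `peak_traced_bytes` is hypothetical, and the number covers allocations that tracemalloc can see, which is not the same as the resident memory of the whole process.

```python
import tracemalloc


def peak_traced_bytes(func, *args, **kwargs):
    """Run func and return (result, peak bytes traced by tracemalloc)."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        # get_traced_memory() returns (current, peak) in bytes.
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


if __name__ == "__main__":
    # Allocating a 10 MB buffer should show up as a peak of at least 10 MB.
    size, peak = peak_traced_bytes(lambda: len(bytearray(10_000_000)))
    print(size, peak)
```

Wrapping a single pipeline step this way may help to find which operation creates the large temporary arrays.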

Raffa
  • 34,963
1

It might, when the swap partition contains bad blocks.

Memory load leads to heavy swapping, which drastically increases the chance of hitting a bad block.

(Thanks to matigo, whose comment advising me to check the RAM inspired me to check the swap partition too.)

abukaj
  • 485