1

We have a workstation PC with an Intel i9-13900K, running Ubuntu 22.10.

When all cores are at maximum load, the system becomes unresponsive: it rejects all SSH connections, and the keyboard and mouse stop working.

We tried to troubleshoot it but could not, because we cannot get any information out of the system while it is at full load.

What could be the reasons? How should we approach the problem?

Processes at hand: 31 pybullet simulations.

batu
  • 11

1 Answer

2

I'd be inclined to monitor the system while ramping up the workload. Rather than starting all 31 pybullet simulations at once, build up to that number gradually. Before you start, install dstat and leave it running in a terminal. Then start the workload and watch the dstat output.

sudo apt install dstat

Run in a terminal:

dstat --time --load -cdngy --top-cpu --top-mem --top-io

It'll look a bit like this:

----system---- ---load-avg--- --total-cpu-usage-- -dsk/total- -net/total- ---paging-- ---system-- -most-expensive- --most-expensive- ----most-expensive----
     time     | 1m   5m  15m |usr sys idl wai stl| read  writ| recv  send|  in   out | int   csw |  cpu process   |  memory process |     i/o process      
05-05 11:43:10|1.99 2.02 2.17| 10   2  87   1   0|2106k 2208k|   0     0 |7742B   24k|8606    13k|msedge       2.8|msedge       920M|msedge       11M  170k
05-05 11:43:11|1.99 2.02 2.17| 11   2  87   0   0|   0   896k|1174B 1117B|   0     0 |8305    10k|msedge       6.3|msedge       918M|msedge       29M   63B
05-05 11:43:12|1.83 1.99 2.16|  6   2  91   1   0|   0  1168k|2098B 2197B|   0     0 |6219  8362 |msedge       4.4|msedge       918M|msedge       20M   54B
05-05 11:43:13|1.83 1.99 2.16|  7   3  90   0   0|   0     0 |1524B 2136B|   0     0 |8403    12k|msedge       2.1|msedge       917M|msedge     3874k 5092B
05-05 11:43:14|1.83 1.99 2.16|  5   2  92   1   0|   0     0 |2200B  554B|   0     0 |6227  8621 |msedge       2.2|msedge       918M|btm         189k 6355B
05-05 11:43:15|1.83 1.99 2.16|  9   2  89   0   0|   0  8192B| 248k 5862B|   0     0 |7933    11k|msedge       4.8|msedge       919M|msedge       19M  260k
05-05 11:43:16|1.83 1.99 2.16|  9   2  88   1   0|   0  1488k|1194B  997B|   0     0 |6372  8858 |msedge       6.3|msedge       918M|msedge       28M  204B
05-05 11:43:17|2.00 2.02 2.17|  9   2  89   0   0|   0  3368k|2385B  447B|   0     0 |5557  7323 |msedge       6.2|msedge       918M|msedge       29M   49B
05-05 11:43:18|2.00 2.02 2.17|  9   2  88   1   0|   0    80k| 999B  351B|   0     0 |8337    12k|msedge       6.2|msedge       918M|msedge       27M 7521

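You can optionally write it to a file in the background, in case the GUI locks up. The trailing "1 5" means five samples at one-second intervals; leave out the 5 if you want dstat to keep logging until it is stopped: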

dstat --time --load -cdngy --top-cpu --top-mem --top-io --output report.csv 1 5 &

Start building up the workload, and note the time at each step. The dstat output is timestamped, so you can correlate it with what you started and when. You'll likely see the CPU usage and the load average rise.
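One way to do the ramp-up and the note-taking together is a small wrapper script. This is only a sketch: the function name ramp_up, the log file name and the script name my_simulation.py are placeholders I made up, not anything from the question. It starts one instance at a time, writes a timestamped line for each start, and pauses between starts so each step shows up separately in the dstat output.

```shell
#!/usr/bin/env bash
# Launch a workload one instance at a time and log when each one starts,
# so the start times can be matched against the dstat timestamps.
# Assumption: ramp_up, the log file and the example command are hypothetical.

ramp_up() {
    local count="$1" delay="$2" log="$3"
    shift 3                                  # the remaining arguments are the command to run
    local i
    for ((i = 1; i <= count; i++)); do
        "$@" &                               # start one more instance in the background
        printf '%s started instance %d (pid %d)\n' \
            "$(date '+%d-%m %H:%M:%S')" "$i" "$!" >> "$log"
        sleep "$delay"                       # let the system settle before adding more load
    done
    wait                                     # block until every instance has exited
}

# Example (hypothetical script name): add one simulation every 60 seconds.
# ramp_up 31 60 ramp.log python3 my_simulation.py
```

If the machine locks up at, say, instance 20, the last line in the log tells you how far it got, and the dstat report from around that time shows what the CPU, memory and paging columns looked like just before.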

popey
  • 24,549