OK, first: English is not my native language, so I apologize for any poor phrasing.

Second, I'm still learning Linux, specifically Ubuntu 18.04. This is the first time I have used this OS, and my knowledge of the terminal, commands, and packages is still very basic. Even for the things I CAN do, I'm not sure I understand them entirely. Patience, please.

So, to the problem. I recently decided to try some gaming, and to my surprise some games I could play before on Windows 10 are unplayable due to a performance issue. These games start at 35-45 fps and all of a sudden drop to 1-15 fps (yes, ONE fps!).

At first I thought it could be something related to the GPU, maybe the drivers, but no: I installed the latest drivers from a PPA. I also experimented with the graphics quality of the games, but then I started to notice that only the more CPU-demanding games had this trouble.

Then I started searching for ways to keep track of CPU usage. The System Monitor wasn't enough, so I found some watch commands that track CPU speed and CPU temperature. I had to install lm-sensors for the temperature.
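For reference, one way to sample the per-core clock speed is sketched below. It assumes the kernel exposes "cpu MHz" lines in /proc/cpuinfo, as it does on x86 systems; the function name cpu_mhz is made up for illustration.

```shell
# cpu_mhz: print the current clock of every core, one per line.
# $1 (optional): file to parse; defaults to /proc/cpuinfo.
# Assumption: the file contains lines of the form "cpu MHz : 2228.000".
cpu_mhz() {
    awk -F': *' '/^cpu MHz/ { printf "core %d: %.0f MHz\n", n++, $2 }' \
        "${1:-/proc/cpuinfo}"
}
```

A simple loop such as `while sleep 1; do clear; cpu_mhz; sensors; done` then shows clock speed and temperature side by side (sensors comes from the lm-sensors package).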

Finally, it appears the thinness of my laptop makes it overheat, and then the CPU throttles, causing these abysmal drops in fps. I concluded this from sudden drops in CPU speed as the temperatures climbed. But I can't actually say whether the temperatures I got were THAT high: the maximum I saw was around 80°C on the CPU. Also, the fan appears to be working properly; it reached around 5000 RPM.

To improve this I tried changing the cpufreq governor from powersave to performance. Although it didn't fix the performance drop, I noticed some improvement: the CPU speed dropped to 1600 MHz instead of 600 MHz. That got me wondering whether I should set a minimum CPU frequency or disable the scaling. But I fear that could lead to overheating and melt it. I could also try a cooling pad, but I'm not sure how effective those are.
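Before changing any of these settings, it can help to see what is currently configured. The sketch below only reads the standard cpufreq files under /sys/devices/system/cpu and changes nothing; the function name show_cpufreq is hypothetical, and the sysfs values are assumed to be in kHz, as the kernel reports them.

```shell
# show_cpufreq: print governor and frequency limits for every CPU.
# $1 (optional): sysfs CPU directory; defaults to /sys/devices/system/cpu.
show_cpufreq() {
    root=${1:-/sys/devices/system/cpu}
    for d in "$root"/cpu[0-9]*/cpufreq; do
        # Skip CPUs that expose no cpufreq interface.
        [ -r "$d/scaling_governor" ] || continue
        cpu=${d%/cpufreq}
        cpu=${cpu##*/}
        gov=$(cat "$d/scaling_governor")
        # sysfs reports kHz; convert to MHz for readability.
        min=$(( $(cat "$d/scaling_min_freq") / 1000 ))
        max=$(( $(cat "$d/scaling_max_freq") / 1000 ))
        echo "$cpu: governor=$gov min=${min}MHz max=${max}MHz"
    done
}
```

Running show_cpufreq with no argument prints one line per CPU, which makes it easy to confirm whether a governor change actually took effect.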

So, can anyone shed some light on this?

sudo lshw -c cpu

   description: CPU
   product: Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz
   vendor: Intel Corp.
   physical id: 36
   bus info: cpu@0
   version: Intel(R) Core(TM) i5-7200U CPU @ 2.50GHz
   serial: To Be Filled By O.E.M.
   slot: U3E1
   size: 2228MHz
   capacity: 3100MHz
   width: 64 bits
   clock: 100MHz

lspci | grep -i VGA

00:02.0 VGA compatible controller: Intel Corporation HD Graphics 620 (rev 02)

lspci | grep -i 3D

01:00.0 3D controller: NVIDIA Corporation GM108M [GeForce 940MX] (rev a2)

xrandr | grep connected

eDP-1-1 connected primary 1920x1080+0+0 (normal left inverted right x axis y axis) 309mm x 174mm

tomaz@tomaz-Inspiron-7460:~$ stress-ng -t 5m -v --tz -c 4

stress-ng: debug: [15960] 4 processors online, 4 processors configured
stress-ng: info:  [15960] dispatching hogs: 4 cpu
stress-ng: debug: [15960] cache allocate: default cache size: 3072K
stress-ng: debug: [15960] starting stressors
stress-ng: debug: [15961] stress-ng-cpu: started [15961] (instance 0)
stress-ng: debug: [15962] stress-ng-cpu: started [15962] (instance 1)
stress-ng: debug: [15960] 4 stressors spawned
stress-ng: debug: [15961] stress-ng-cpu using method 'all'
stress-ng: debug: [15964] stress-ng-cpu: started [15964] (instance 3)
stress-ng: debug: [15963] stress-ng-cpu: started [15963] (instance 2)
stress-ng: debug: [15964] stress-ng-cpu using method 'all'
stress-ng: debug: [15963] stress-ng-cpu using method 'all'
stress-ng: debug: [15962] stress-ng-cpu using method 'all'
stress-ng: debug: [15961] stress-ng-cpu: exited [15961] (instance 0)
stress-ng: debug: [15960] process [15961] terminated
stress-ng: debug: [15963] stress-ng-cpu: exited [15963] (instance 2)
stress-ng: debug: [15962] stress-ng-cpu: exited [15962] (instance 1)
stress-ng: debug: [15960] process [15962] terminated
stress-ng: debug: [15960] process [15963] terminated
stress-ng: debug: [15964] stress-ng-cpu: exited [15964] (instance 3)
stress-ng: debug: [15960] process [15964] terminated
stress-ng: info:  [15960] successful run completed in 300.05s (5 mins, 0.05 secs)
stress-ng: info:  [15960] cpu:
stress-ng: info:  [15960]          pch_skylake   59.25 °C
stress-ng: info:  [15960]                 B0D4   62.12 °C
stress-ng: info:  [15960]      INT3400 Thermal   48.08 °C
stress-ng: info:  [15960]                 SEN2   50.81 °C
stress-ng: info:  [15960]                 TMEM   50.15 °C
stress-ng: info:  [15960]         x86_pkg_temp   54.88 °C
stress-ng: info:  [15960]               acpitz   50.61 °C
stress-ng: info:  [15960]                 SEN1   50.78 °C
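The zone names in this summary appear to match the thermal zones the kernel exposes under /sys/class/thermal, so the same sensors can be read directly while a game is running. This is a minimal sketch under that assumption; the function name list_thermal_zones is made up, and the temp files are assumed to hold millidegrees Celsius.

```shell
# list_thermal_zones: print "<zone type> <temperature> °C" for each zone.
# $1 (optional): thermal sysfs directory; defaults to /sys/class/thermal.
list_thermal_zones() {
    root=${1:-/sys/class/thermal}
    for z in "$root"/thermal_zone*; do
        # Skip zones without a readable temperature file.
        [ -r "$z/temp" ] || continue
        # The temp file holds millidegrees Celsius; divide by 1000.
        awk -v name="$(cat "$z/type")" \
            '{ printf "%s %.2f °C\n", name, $1 / 1000 }' "$z/temp"
    done
}
```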

1 Answer

You should be able to limit the upper CPU frequency, for example to 65%, using this command:

echo 65 | sudo tee /sys/devices/system/cpu/intel_pstate/max_perf_pct

The above assumes you are using the intel_pstate CPU frequency scaling driver, which you should be by default. To check:

doug@s15:~/temp$ cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_driver
intel_pstate
intel_pstate
intel_pstate
intel_pstate
intel_pstate
intel_pstate
intel_pstate
intel_pstate
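The check and the write can be combined so that the limit is only applied when the intel_pstate interface is actually present. This is a rough sketch, not a tested tool: the function name set_max_perf_pct is hypothetical, it must run as root on a real system (the tee form above works with sudo), and the kernel may still reject some values.

```shell
# set_max_perf_pct: write a percentage to intel_pstate's max_perf_pct.
# $1: percentage, 1-100.
# $2 (optional): intel_pstate sysfs directory.
# Returns 2 for a bad argument, 1 if the interface is missing.
set_max_perf_pct() {
    pct=$1
    dir=${2:-/sys/devices/system/cpu/intel_pstate}
    # Reject empty or non-numeric input.
    case $pct in
        ''|*[!0-9]*) echo "percentage must be a number" >&2; return 2 ;;
    esac
    if [ "$pct" -lt 1 ] || [ "$pct" -gt 100 ]; then
        echo "percentage must be between 1 and 100" >&2
        return 2
    fi
    # The file only exists when the intel_pstate driver is in use.
    if [ ! -e "$dir/max_perf_pct" ]; then
        echo "intel_pstate interface not found in $dir" >&2
        return 1
    fi
    printf '%s\n' "$pct" > "$dir/max_perf_pct"
}
```

The setting does not survive a reboot, so it is safe to experiment with: writing 100 restores the default.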

A very good tool for monitoring all of this is turbostat, which is included in the linux-tools-common package.

Below is an example where my computer is heavily loaded and the processor package temperature has reached its highest point. In another terminal, I then limit the CPU frequency (shown further down), and you can observe the frequency, temperature and wattage drop:

doug@s15:~/temp$ sudo turbostat --quiet --Summary --show Busy%,Bzy_MHz,PkgTmp,PkgWatt --interval 15
Busy%   Bzy_MHz PkgTmp  PkgWatt
100.00  3500    79      63.91
100.00  3500    78      63.91
100.00  3500    78      63.91
100.00  3500    78      63.88
100.00  3500    79      63.89
100.00  3500    79      63.90
100.00  2755    70      45.86
100.00  2500    68      39.42
100.00  2500    67      39.26
100.00  2500    66      39.10
100.00  2500    65      39.07
100.00  2500    65      38.94
100.00  2500    64      38.93
100.00  2500    65      38.92

What was done in another terminal:

doug@s15:~$ cat /sys/devices/system/cpu/intel_pstate/max_perf_pct
100
doug@s15:~$ echo 65 | sudo tee /sys/devices/system/cpu/intel_pstate/max_perf_pct
65
doug@s15:~$ cat /sys/devices/system/cpu/intel_pstate/max_perf_pct
65

Note: the 2500 MHz CPU frequency is 3800 * 0.65 = 2470 MHz rounded to the nearest 100 MHz (the nearest pstate, 25). But hey, turbostat showed 3500 MHz before. Why? Because all cores were busy, so the processor itself limited the maximum CPU frequency to 3500 MHz. This information is also available from turbostat if you omit the --quiet option. Example:

doug@s15:~/temp$ sudo turbostat --Summary --show Busy%,Bzy_MHz,PkgTmp,PkgWatt --interval 15
... [snip]...
cpu4: MSR_PLATFORM_INFO: 0x100070012200
16 * 100.0 = 1600.0 MHz max efficiency frequency
34 * 100.0 = 3400.0 MHz base frequency
cpu4: MSR_IA32_POWER_CTL: 0x0004005d (C1E auto-promotion: DISabled)
cpu4: MSR_TURBO_RATIO_LIMIT: 0x23242526
35 * 100.0 = 3500.0 MHz max turbo 4 active cores
36 * 100.0 = 3600.0 MHz max turbo 3 active cores
37 * 100.0 = 3700.0 MHz max turbo 2 active cores
38 * 100.0 = 3800.0 MHz max turbo 1 active cores
...[snip]...
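The numbers above can be reproduced by hand. The sketch below decodes the MSR_TURBO_RATIO_LIMIT value one byte per active-core count (lowest byte first) and then applies the percentage cap with rounding to the nearest 100 MHz. Both function names are made up for illustration, the 100 MHz bus clock is taken from the turbostat output, and the rounding simply follows the description in the note rather than the driver source.

```shell
# turbo_ratios: decode the four lowest bytes of MSR_TURBO_RATIO_LIMIT.
# $1: register value, e.g. 0x23242526. Byte 0 is the 1-core ratio.
turbo_ratios() {
    msr=$(( $1 ))
    cores=1
    while [ "$cores" -le 4 ]; do
        ratio=$(( (msr >> (8 * (cores - 1))) & 0xff ))
        # Assumes a 100 MHz bus clock, as turbostat reports here.
        echo "$(( ratio * 100 )) MHz max turbo $cores active cores"
        cores=$(( cores + 1 ))
    done
}

# capped_mhz: apply a max_perf_pct-style cap to a frequency in MHz
# and round the result to the nearest 100 MHz.
# $1: maximum turbo frequency in MHz, $2: percentage.
capped_mhz() {
    echo $(( ($1 * $2 / 100 + 50) / 100 * 100 ))
}
```

For example, `turbo_ratios 0x23242526` lists 3800, 3700, 3600 and 3500 MHz for 1 to 4 active cores, and `capped_mhz 3800 65` gives 2500, matching the turbostat readings.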

There are many ways to automatically keep CPU temperatures below extreme values. One is thermald, and I have an example configuration script in another answer.

Doug Smythies