6

When using ABAQUS 6.14 (but also ABAQUS 2018) on ubuntu 18.04 everything seems to work fine except the termination of the standard process (the process started when performing an implicit analysis -- if you are not familiar with this it doesn't matter).

The analysis indeed works as one can also see in a log file (the .sta file, for those who are familiar with abaqus) the message THE ANALYSIS HAS COMPLETED SUCCESSFULLY. The output database contains the analysis results. However, after the analysis has been completed, the process standard remains in a sleeping status using 0% CPU and keeping the same amount of RAM as when it was running.

From strace I get:

[pid 23191] close(8)                    = 0
[pid 23185] <... select resumed> )      = 0 (Timeout)
[pid 23185] select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=50000} <unfinished ...>
[pid 23193] <... select resumed> )      = 0 (Timeout)
[pid 23193] futex(0x7f3acd917db0, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
[pid 23191] futex(0x7f3acd917db0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
[pid 23193] <... futex resumed> )       = 0
[pid 23191] <... futex resumed> )       = -1 EAGAIN (Resource temporarily unavailable)
[pid 23191] futex(0x7f3acd917db0, FUTEX_WAKE_PRIVATE, 1) = 0
[pid 23193] select(7, [4 5 6], NULL, NULL, {tv_sec=0, tv_usec=20000} <unfinished ...>
[pid 23191] munmap(0x7f3ab130b000, 327680) = 0
[pid 23191] munmap(0x7f3ab136b000, 1114112) = 0
[pid 23191] munmap(0x7f3ab16db000, 1114112) = 0
[pid 23191] munmap(0x7f3ab0fbb000, 1114112) = 0
[pid 23191] munmap(0x7f3ab0ddb000, 1114112) = 0
[pid 23191] munmap(0x7f3ab0a0b000, 1114112) = 0
[pid 23191] munmap(0x7f3ab03fb000, 1114112) = 0
[pid 23191] munmap(0x7f3ab050b000, 1114112) = 0
[pid 23191] munmap(0x7f3ab00cb000, 1114112) = 0
[pid 23191] munmap(0x7f3ab02eb000, 1114112) = 0
[pid 23191] munmap(0x7f3ab14eb000, 1114112) = 0
[pid 23191] futex(0x7f3ab8a5dd44, FUTEX_WAIT_PRIVATE, 8, NULL) = -1 EAGAIN (Resource temporarily unavailable)
[pid 23191] futex(0x7f3ab8a5dd44, FUTEX_WAIT_PRIVATE, 12, NULL <unfinished ...>
[pid 23193] <... select resumed> )      = 0 (Timeout)
[pid 23193] select(7, [4 5 6], NULL, NULL, {tv_sec=0, tv_usec=20000}) = 0 (Timeout)
[pid 23193] select(7, [4 5 6], NULL, NULL, {tv_sec=0, tv_usec=20000} <unfinished ...>
[pid 23185] <... select resumed> )      = 0 (Timeout)
[pid 23185] select(10, [5 6 8 9], NULL, NULL, {tv_sec=0, tv_usec=20000} <unfinished ...>
[pid 23193] <... select resumed> )      = 0 (Timeout)
[pid 23193] select(7, [4 5 6], NULL, NULL, {tv_sec=0, tv_usec=20000} <unfinished ...>
[pid 23185] <... select resumed> )      = 0 (Timeout)
[pid 23185] select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=50000} <unfinished ...>
[pid 23193] <... select resumed> )      = 0 (Timeout)
[pid 23193] select(7, [4 5 6], NULL, NULL, {tv_sec=0, tv_usec=20000}) = 0 (Timeout)
[pid 23193] select(7, [4 5 6], NULL, NULL, {tv_sec=0, tv_usec=20000} <unfinished ...>
[pid 23185] <... select resumed> )      = 0 (Timeout)
[pid 23185] select(10, [5 6 8 9], NULL, NULL, {tv_sec=0, tv_usec=20000} <unfinished ...>
[pid 23193] <... select resumed> )      = 0 (Timeout)
[pid 23193] select(7, [4 5 6], NULL, NULL, {tv_sec=0, tv_usec=20000} <unfinished ...>
[pid 23185] <... select resumed> )      = 0 (Timeout)
[pid 23185] select(0, NULL, NULL, NULL, {tv_sec=0, tv_usec=50000} <unfinished ...>
[pid 23193] <... select resumed> )      = 0 (Timeout)
[pid 23193] select(7, [4 5 6], NULL, NULL, {tv_sec=0, tv_usec=20000}) = 0 (Timeout)
[pid 23193] select(7, [4 5 6], NULL, NULL, {tv_sec=0, tv_usec=20000}) = 0 (Timeout)
[pid 23193] select(7, [4 5 6], NULL, NULL, {tv_sec=0, tv_usec=20000} <unfinished ...>

Like if the two processes were in a deadlock state. Moreover, the commands

pid -p 7002

and

pid -p 7010

do give an empty output. The dirs /proc/7002 and /proc/7010 do not exist.

The only abaqus-related processes executing are

david  6995  0.0  0.1 295428 51388 pts/0    S    17:00   0:00 /opt/abaqus/6.14-1/code/bin/python /opt/abaqus/6.14-1
david  6998  0.0  0.2 368744 97948 pts/0    S    17:00   0:00 /opt/abaqus/6.14-1/code/bin/python std_inst.com
david  7001  0.1  0.0 122076 20096 pts/0    Sl   17:00   0:03 /opt/abaqus/6.14-1/code/bin/eliT_DriverLM -job std_in
david  7008  0.4  0.5 735812 185364 pts/0   Sl   17:00   0:07 /opt/abaqus/6.14-1/code/bin/standard -standard -acade

On ubuntu 16.04 the exact same version works like a charm. Here the same strace on ubuntu 16.04 (with the same kernel version as my 18.04, i.e. 4.15.0-29):

3890  close(8)                          = 0
3892  <... select resumed> )            = 0 (Timeout)
3892  futex(0x7f29e43e1db0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
3890  futex(0x7f29e43e1db0, FUTEX_WAKE_PRIVATE, 1) = 0
3892  <... futex resumed> )             = -1 EAGAIN (Resource temporarily unavailable)
3892  futex(0x7f29e43e1db0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
3890  futex(0x7f29e43e1db0, FUTEX_WAKE_PRIVATE, 1) = 0
3892  <... futex resumed> )             = -1 EAGAIN (Resource temporarily unavailable)
3892  futex(0x7f29e43e1db0, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
3890  futex(0x7f29e43e1db0, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
3892  <... futex resumed> )             = 0
3890  <... futex resumed> )             = -1 EAGAIN (Resource temporarily unavailable)
3890  futex(0x7f29e43e1db0, FUTEX_WAKE_PRIVATE, 1) = 0
3892  select(7, [4 5 6], NULL, NULL, {0, 20000} <unfinished ...>
3890  munmap(0x7f29c7adb000, 327680)    = 0
3890  munmap(0x7f29c7b3b000, 1114112)   = 0
3890  munmap(0x7f29c7eab000, 1114112)   = 0
3890  munmap(0x7f29c778b000, 1114112)   = 0
3890  munmap(0x7f29c75ab000, 1114112)   = 0
3890  munmap(0x7f29c71db000, 1114112)   = 0
3890  munmap(0x7f29c6bcb000, 1114112)   = 0
3890  munmap(0x7f29c6cdb000, 1114112)   = 0
3890  munmap(0x7f29c689b000, 1114112)   = 0
3890  munmap(0x7f29c6abb000, 1114112)   = 0
3890  munmap(0x7f29c7cbb000, 1114112)   = 0
3890  exit_group(0)                     = ?
3891  +++ exited with 0 +++
3893  +++ exited with 0 +++
3892  +++ exited with 0 +++
3890  +++ exited with 0 +++
3880  <... wait4 resumed> [{WIFEXITED(s) && WEXITSTATUS(s) == 0}], 0, NULL) = 3890
3880  --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=3890, si_uid=1000, si_status=0, si_utime=107, si_stime=7} ---

Has someone a good idea how to solve this? Or in which direction should I investigate further.

David
  • 393
  • 1
  • 6
  • 25

4 Answers4

2

I found a solution that circumvents the deadlock by using a singularity container as proposed by Will Furnass here: http://learningpatterns.me/posts-output/2018-01-30-abaqus-singularity/

Although a bit complicated in the first place, it works like a charm when setup properly. I modified my aliases for abaqus on my host system (Manjaro/Arch linux) such that they point to the install in the singularity container and execute the command in the containers environment. However, since I need Intel Fortran Compiler, I generated a basic centos 7 container and modified it afterwards to install compilers and abaqus (v2019 in this case) rather than using the .def script as proposed by Will Furnass.

It takes some time to setup but now I have a container image I can work with on any system that runs singularity, which is quite nice :)

EDIT: I also tested copying a working install to a more recent linux system (and avoiding a fresh install of abaqus), I can confirm that this didn't work in my case (CentOS 7 install copied to Manajaro system).

1

I would like to present my work around for this issue. I've made a python wrapper for the abq2018 solver which checks the .sta file for completeness. Once the .sta file is complete, any process named standard will be killed. I've found that the solver exits gracefully when standard is killed and the analysis is complete.

This work around is not a perfect solution. Current issues with this work around:

  1. can't replace the abq2018 solver call directly
  2. will not work from GUI, must be run from the shell
  3. only parses job= argument
  4. you can only run one analysis at at time since all standard processes are killed
  5. abq will hang forever if .sta file is not created or modified

How to use this workaround:

  1. Create Python file named abq. Code for abq is detailed below. If you are using a solver other than abq2018, replace the line cmd = 'abq20xx.. with the solver that you are using.
  2. Make abq executable and available in your path. I placed abq in the Abaqus commands folder, then ran chmod +x abq
  3. Run an Abaqus standard job by executing abq job=Job-1. This will execute Job-1.inp, then this will kill the standard solver once Job-1.sta is completed.

Code for abq is below

#!/usr/bin/python
import subprocess
import sys
import time
arguments = sys.argv
jobname = arguments[1].split('job=')[-1]
cmd = 'abq2018 cpus=4 ask_delete=OFF background job=' + jobname
p = subprocess.call(cmd, shell=True)

complete = False
termination_criteria = [' THE ANALYSIS HAS COMPLETED SUCCESSFULLY\n',
                        ' THE ANALYSIS HAS NOT BEEN COMPLETED\n']

while complete is False:
    # wait every 5 seconds
    time.sleep(5)
    try:
        with open(jobname + '.sta', 'r') as f:
            last = f.readlines()[-1]
            if last in termination_criteria:
                # this will kill any process named standard
                subprocess.call('pgrep standard | xargs kill', shell=True)
                complete = True
    except IOError:
        # model.sta has been deleted or doesn't exist
        # try again in 5 seconds
        time.sleep(5)

1

Dassaults System published a bug-fix this month:

You need to update to Abaqus 2018 to Abaqus 2018-HF16 from https://software.3ds.com/ more details can be found at https://github.com/willfurnass/abaqus-2017-centos-7-singularity/issues/5#issue-713025844

I tried it with updating Abaqus 2020 to Abaqus 2020-HF5 and it worked for Ubuntu 20.04 as well as Fedora 32.

JoKalliauer
  • 1,674
  • 2
  • 20
  • 27
0

I met this problem with Linux mint 19 too. Abaqus 6.14-5 installed in Linux Mint 19. It cannot be terminated automatically but seen form .sta file, the analysis is completed. I think this problem is related to the kernel. By the way, do you find any solutions now?

leileilei
  • 181