For Qiskit, there are two tools you can check out.
The two qubit kak tool does exactly what you want for 2 qubits. If you give it a two qubit unitary, it gives you a list of gates to realize this, using the standard gate set used in Qiskit. This tool uses Vatan and Williams optimal two-qubit circuit. The decomposition algorithm used is explained in Drury and Love.
Here's an example in code
from qiskit.mapper import two_qubit_kak
import numpy as np
# matrix to decompose
perm = np.array([[0.,0.,0.,1.], [1.,0.,0.,0.], [0.,1.,0.,0.], [0.,0.,1.,0.] ])
#decomposition
permCircuit = two_qubit_kak(perm)
The corresponding result is
[{'name': 'u1', 'args': [0], 'params': (0.0, 0.0, 1.5707963267948968)}, {'name': 'u2', 'args': [1], 'params': (1.5707963267948966, 2.220446049250313e-16, -3.1415926535897922)}, {'name': 'cx', 'args': [1, 0], 'params': ()}, {'name': 'u1', 'args': [0], 'params': (0.0, 0.0, 1.5707963267948968)}, {'name': 'u2', 'args': [1], 'params': (1.5707963267948966, 3.141592653589793, -3.141592653589793)}, {'name': 'cx', 'args': [0, 1], 'params': ()}, {'name': 'u3', 'args': [1], 'params': (3.141592653589793, 0.0, 0.0)}, {'name': 'cx', 'args': [1, 0], 'params': ()}, {'name': 'u2', 'args': [0], 'params': (1.5707963267948966, 1.5707963267948968, -1.5707963267948957)}, {'name': 'u2', 'args': [1], 'params': (1.5707963267948966, -4.71238898038469, -1.5707963267948957)}]
The u1, etc here are single qubit rotations as defined by the OpenQASM standard.
Another tool is arbitary state vector initialization, which can be used for an arbitrary number of qubits. For this you can state what state vector you want, and you'll get a circuit that builds it. Obviously this is not what you asked for, but it may nevertheless be useful depending on your application.
Here's an example in code
qubit = QuantumRegister(2)
bit = ClassicalRegister(2)
circuit = QuantumCircuit(qubit,bit)
vector = [0, 1 / np.sqrt(3), 1 / np.sqrt(3), 1 / np.sqrt(3) ]
circuit.initialize( vector, qubit )
print(circuit)
Here the final print command yields
┌───┐┌────────────┐┌───┐┌────────────┐
q_0: |0>──────────────┤ X ├┤ Ry(0.7854) ├┤ X ├┤ Ry(2.3562) ├
┌────────────┐└─┬─┘└────────────┘└─┬─┘└────────────┘
q_1: |0>┤ Ry(1.9106) ├──■──────────────────■────────────────
└────────────┘
c_0: 0 ════════════════════════════════════════════════════
c_1: 0 ════════════════════════════════════════════════════
Other than these existing Qiskit tools, the only algorithm I know of that is planning an implementation is that of Iten, et al. I have heard that a Mathematica version will be produced soon.