NESSI v1.1.2: The NonEquilibrium Systems SImulation Library
Example programs

# Installation

After downloading or cloning the repository nessi_demo, the example programs are compiled in a similar fashion as the libcntr library. We have prepared a CMake build environment; it is most convenient to write a configuration shell script like the following example:

CC=[C compiler] CXX=[C++ compiler]
cmake \
-DCMAKE_BUILD_TYPE=[Debug|Release] \
-Domp=[ON|OFF] \
-Dhdf5=[ON|OFF] \
-Dmpi=[ON|OFF] \
-DCMAKE_INCLUDE_PATH=[include directory] \
-DCMAKE_LIBRARY_PATH=[library directory] \
-DCMAKE_CXX_FLAGS="[compiling flags]" \
..

The dependencies for the example programs are the same as for libcntr. The eigen3 library and, optionally (enabled by hdf5=ON), the HDF5 library are required. Make sure the corresponding libraries can be found in the library path CMAKE_LIBRARY_PATH, while the corresponding headers should be placed in CMAKE_INCLUDE_PATH. Furthermore, the examples depend on the libcntr library. Therefore, the library path for libcntr should be provided by CMAKE_LIBRARY_PATH, while the header cntr/cntr.hpp should be found in the include path. Create a build directory (for instance, cbuild/), navigate there and run the configure script:

sh ../configure.sh

After successful configuration, compile via

make

# Test of accuracy and scaling analysis

As a first application, we test the accuracy and convergence of the methods described in Section $$\ref{sec:num_vie}$$. To this end, we consider a $$2\times 2$$ matrix-valued time-independent Hamiltonian \begin{align} \label{eq:ham2x2} \epsilon = \begin{pmatrix} \epsilon_1 & i \lambda \\ -i\lambda & \epsilon_2 \end{pmatrix} \ . \end{align} The corresponding numerically exact GF $$G(t,t^\prime)$$ (assuming fermions) is computed using the routine green_from_H. Equivalently, one can also solve the problem by downfolding. To this end, we solve \begin{align} \label{eq:downfold1} \left(i \partial_t - \epsilon_1 \right) g_1(t,t^\prime) = \delta_\mathcal{C}(t,t^\prime) + \int_{\mathcal{C}}d \bar{t}\, \Sigma(t,\bar t) g_1(\bar t,t^\prime) \end{align} with the embedding self-energy $$\Sigma(t,t^\prime) = |\lambda|^2 g_2(t,t^\prime)$$. Here, $$g_2(t,t^\prime)$$ is the free GF defined by \begin{align} \left(i \partial_t - \epsilon_2 \right) g_2(t,t^\prime) = \delta_\mathcal{C}(t,t^\prime) \ . \end{align} The solution of the downfolding Dyson equation $$\eqref{eq:downfold1}$$ must then be identical to the $$(1,1)$$ matrix element of $$G$$: $$G_{1,1}(t,t^\prime)=g_1(t,t^\prime)$$. The test programs test_equilibrium.x and test_nonequilibrium.x solve the equilibrium or the nonequilibrium problem, respectively, and compute the deviation from the exact solution. In the equilibrium case, we define \begin{align} \label{eq:test_eq_err} \mathrm{err.} = \frac{1}{\beta} \int^\beta_0\! d\tau\, |G_{1,1}(\tau)-g_1(\tau)| \ , \end{align} whereas \begin{align} \label{eq:test_noneq_err} \mathrm{err.} &= \frac{1}{T^2} \int^T_0\! d t \!\int^t_0\! d t^\prime |G^<_{1,1}(t^\prime,t)-g^<_1(t^\prime,t)| \nonumber \\ &\quad + \frac{1}{T^2} \int^T_0\! d t \!\int^t_0\! d t^\prime |G^{\mathrm{R}}_{1,1}(t,t^\prime)-g^{\mathrm{R}}_1(t,t^\prime)| \nonumber \\ &\quad + \frac{1}{T \beta} \int^T_0\! d t \!\int^\beta_0\! d \tau\, |G^\rceil_{1,1}(t,\tau)-g^\rceil_1(t,\tau)| \end{align} for the nonequilibrium case.

## Equilibrium

For convenience, we define the types

#define GREEN cntr::herm_matrix<double>
#define CFUNC cntr::function<double>
#define CINTEG integration::I<double>

for double-precision objects. In the main part of the C++ program, the parameters of the Hamiltonian are defined as constants. In particular, we fix $$\epsilon_1=-1$$, $$\epsilon_2=1$$, $$\lambda=0.5$$. The chemical potential is set to $$\mu=0$$ and the inverse temperature is fixed to $$\beta=20$$. The input variables read from file are Ntau ($$N_\tau$$) and SolverOrder ($$k=1,\dots,5$$). After reading these variables from file via

find_param(argv[1],"__Ntau=",Ntau);
find_param(argv[1],"__SolverOrder=",SolverOrder);

we can define all quantities. First we define the $$2\times 2$$ Hamiltonian Eq. \eqref{eq:ham2x2} as an eigen3 complex matrix:

cdmatrix eps_2x2(2,2);
eps_2x2(0,0) = eps1;
eps_2x2(1,1) = eps2;
eps_2x2(0,1) = I*lam;
eps_2x2(1,0) = -I*lam;

The $$1\times 1$$ Hamiltonian representing $$\epsilon_1$$ is constructed as

CFUNC eps_11_func(-1,1);
eps_11_func.set_constant(eps1*MatrixXcd::Identity(1,1));

Here, eps_11_func is a contour function entering the solvers below. Note the first argument in the constructor of CFUNC: the number of real-time points $$N_t$$ is set to -1. In this case, only the Matsubara part is addressed. Its value is fixed to the constant $$1\times 1$$ matrix by the last line. With the Hamiltonians defined, we can initialize and construct the free $$2\times 2$$ exact GF by

GREEN G2x2(-1,Ntau,2,FERMION);
cntr::green_from_H(G2x2,mu,eps_2x2,beta,h);

Including the libcntr header provides a number of constants for convenience; here, we have used FERMION=-1 (bosons would be described by BOSON=+1). The time step h is a dummy argument here, as the real-time components are not addressed. From the exact GF, we extract the submatrix $$G_{1,1}$$ by

GREEN G_exact(-1,Ntau,1,FERMION);
G_exact.set_matrixelement(-1,0,0,G2x2);

Finally, we define the embedding self-energy by

GREEN Sigma(-1,Ntau,1,FERMION);
cdmatrix eps_22=eps2*MatrixXcd::Identity(1,1);
cntr::green_from_H(Sigma, mu, eps_22, beta, h);
Sigma.smul(-1,lam*lam);

The last line performs the multiplication of $$\mathcal{T}[\Sigma]_{-1}$$ with the scalar $$\lambda^2$$. After initializing the approximate GF  G_approx, we can solve the Matsubara Dyson equation and compute the average error:

cntr::dyson_mat(G_approx, Sigma, mu, eps_11_func, CINTEG(SolverOrder),beta, CNTR_MAT_FOURIER);
err_fourier = cntr::distance_norm2(-1,G_exact,G_approx) / Ntau;
cntr::dyson_mat(G_approx, Sigma, mu, eps_11_func, CINTEG(SolverOrder),beta, CNTR_MAT_FIXPOINT);
err_fixpoint = cntr::distance_norm2(-1,G_exact,G_approx) / Ntau;

The error is then written to file.

For convenience, we provide a driver python3 script for creating the input file, running the program and plotting the results. To run the equilibrium test, go to the directory where the example programs have been installed and run

python3 utils/test_equilibrium.py k

where k=1,...,5 is the integration order. The test solves the Matsubara Dyson equation for $$N_\tau=10^x$$ for 20 values of $$x\in [1,3]$$. The results are plotted using  matplotlib. The figure below shows the corresponding plots for $$k=1$$ and $$k=5$$.

As the figure demonstrates, the Fourier method scales as $$\mathcal{O}(h^{2}_\tau)$$, while solving the Dyson equation in integral form results in approximately $$\mathcal{O}(h^{k+2}_\tau)$$ scaling of the average error for small enough $$h_\tau$$.

## Nonequilibrium

Testing the accuracy of the dyson and vie2 solvers can be done analogously. We adopt the same parameters as in the equilibrium case. To obtain the NEGFs, the Dyson equation $$\eqref{eq:downfold1}$$ is propagated in time. Equivalently, one can also solve the Dyson equation in integral form \begin{align} \label{eq:downfold1_vie} & g_1(t,t^\prime) + [F\ast g_1](t,t^\prime) = g^{(0)}_1(t,t^\prime) \ , \\ & g_1(t,t^\prime) + [g_1\ast F^\ddagger](t,t^\prime) = g^{(0)}_1(t,t^\prime) \ , \end{align} where $$F= -\Sigma\ast g_1^{(0)}$$ and $$F^\ddagger=-g_1^{(0)}\ast \Sigma$$, as explained in subsection [secygxeqw02]. The free GF $$g_1^{(0)}(t,t^\prime)$$ is known analytically and is computed by calling the routine green_from_H.

The structure of the test program is analogous to the equilibrium case. First, the input variables $$N_t$$, $$N_\tau$$, $$T_\mathrm{max}$$ and $$k$$ are read from the input file:

find_param(flin,"__Nt=",Nt);
find_param(flin,"__Ntau=",Ntau);
find_param(flin,"__Tmax=",Tmax);
find_param(flin,"__SolverOrder=",SolverOrder);
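
Here flin is the name of the input file, a plain text file with one `__name=value` entry per line. A minimal example (the values are only illustrative) could read:

```
__Nt=200
__Ntau=400
__Tmax=5.0
__SolverOrder=5
```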

The time step is fixed by $$h=T_\mathrm{max}/N_t$$. After initializing the Hamiltonian and the GFs, the embedding self-energy is constructed via

cntr::green_from_H(Sigma, mu, eps_22, beta, h);
for(tstp=-1; tstp<=Nt; tstp++) {
Sigma.smul(tstp,lam*lam);
}

The generic procedure to solve a Dyson equation in the time domain in libcntr is

1. solve the equilibrium problem by solving the corresponding Matsubara Dyson equation,
2. compute the NEGFs for time steps $$n=0,\dots, k$$ by using the starting algorithm (bootstrapping), and
3. perform the time stepping for $$n=k+1,\dots,N_t$$.

For Eq. $$\eqref{eq:downfold1}$$, this task is accomplished by

GREEN G_approx(Nt, Ntau, 1, FERMION);
// equilibrium
cntr::dyson_mat(G_approx, Sigma, mu, eps_11_func, CINTEG(SolverOrder),beta);
// start
cntr::dyson_start(G_approx, mu, eps_11_func, Sigma, CINTEG(SolverOrder), beta, h);
// time stepping
for (tstp=SolverOrder+1; tstp<=Nt; tstp++) {
cntr::dyson_timestep(tstp, G_approx, mu, eps_11_func, Sigma, CINTEG(SolverOrder), beta, h);
}

The deviation of the nonequilibrium Keldysh components from the exact solution is then calculated by

err_dyson=0.0;
for(tstp=0; tstp<=Nt; tstp++){
err_dyson += cntr::distance_norm2_les(tstp, G_exact, G_approx) / (Nt*Nt);
err_dyson += cntr::distance_norm2_ret(tstp, G_exact, G_approx) / (Nt*Nt);
err_dyson += cntr::distance_norm2_tv(tstp, G_exact, G_approx) / (Nt*Ntau);
}

The solution of the corresponding integral formulation $$\eqref{eq:downfold1_vie}$$ is performed by the following lines of source code:

// noninteracting 1x1 Green's function (Sigma=0)
GREEN G0(Nt,Ntau,1,FERMION);
cdmatrix eps_11=eps1*MatrixXcd::Identity(1,1);
cntr::green_from_H(G0, mu, eps_11, beta, h);
GREEN G_approx(Nt,Ntau,1,FERMION);
GREEN F(Nt,Ntau,1,FERMION);
GREEN Fcc(Nt,Ntau,1,FERMION);
// equilibrium
GenKernel(-1, G0, Sigma, F, Fcc, beta, h, SolverOrder);
cntr::vie2_mat(G_approx, F, Fcc, G0, beta, CINTEG(SolverOrder));
// start
for(tstp=0; tstp <= SolverOrder; tstp++){
GenKernel(tstp, G0, Sigma, F, Fcc, beta, h, SolverOrder);
}
cntr::vie2_start(G_approx, F, Fcc, G0, CINTEG(SolverOrder), beta, h);
// time stepping
for (tstp=SolverOrder+1; tstp<=Nt; tstp++) {
GenKernel(tstp, G0, Sigma, F, Fcc, beta, h, SolverOrder);
cntr::vie2_timestep(tstp, G_approx, F, Fcc, G0, CINTEG(SolverOrder), beta, h);
}

For convenience, we have defined the routine GenKernel which calculates the convolution kernels $$F$$ and $$F^\ddagger$$:

void GenKernel(int tstp, GREEN &G0, GREEN &Sigma, GREEN &F, GREEN &Fcc, const double beta, const double h, const int kt){
cntr::convolution_timestep(tstp, F, G0, Sigma, CINTEG(kt), beta, h);
cntr::convolution_timestep(tstp, Fcc, Sigma, G0, CINTEG(kt), beta, h);
F.smul(tstp,-1);
Fcc.smul(tstp,-1);
}

The python3 driver script test_nonequilibrium.py provides an easy-to-use interface for running the accuracy test. In the  nessi_demo/ directory, run

python3 utils/test_nonequilibrium.py k

where k is the solver order.

The average error of the numerical solution of Eq. $$\eqref{eq:downfold1_vie}$$ is computed analogously to the Dyson equation in integro-differential form. The output of test_nonequilibrium.py is shown in the figure below. As this figure confirms, the average error of solving the Dyson equation in the integro-differential form scales as $$\mathcal{O}(h^{k+1})$$, while the corresponding integral form yields a $$\mathcal{O}(h^{k+2})$$ scaling.

# Hubbard chain

The Hubbard model is one of the most basic models describing correlation effects. It allows one to demonstrate the performance, strengths and also shortcomings of the NEGF treatment. Here, we consider a one-dimensional (1D) configuration with the Hamiltonian \begin{align} \label{eq:hubbmodel1} \hat{H}_0 = -J \sum_{\langle i,j \rangle, \sigma} \hat{c}^\dagger_{i\sigma} \hat{c}_{j \sigma} + \frac{U}{2}\sum_{i,\sigma} \hat{n}_{i\sigma} \hat{n}_{i\bar\sigma} \ , \end{align} where $$\langle i,j\rangle$$ constrains the lattice sites $$i,j$$ to nearest neighbors, while $$\bar\sigma=\uparrow,\downarrow$$ for $$\sigma=\downarrow,\uparrow$$. We consider $$M$$ lattice sites with open boundary conditions. Furthermore, we restrict ourselves to the paramagnetic case with an equal number of spin-up ($$N_\uparrow$$) and spin-down ($$N_\downarrow$$) particles. The number of particles is most conveniently characterized by the filling factor $$n=N_\uparrow/M$$. In analogy to Ref. , the system is excited with an instantaneous quench of the on-site potential of the first lattice site to $$w_0$$: \begin{align} \label{eq:hubbchain_quench} \hat{H}(t) = \hat{H}_0 + \theta(t) w_0 \sum_\sigma \hat{c}^\dagger_{1\sigma} \hat{c}_{1\sigma} \ . \end{align} In this example, we treat the dynamics with respect to the Hamiltonian $$\eqref{eq:hubbchain_quench}$$ within the second-Born (2B), $$GW$$, and $$T$$-matrix (particle-particle ladder) approximations. A detailed description of these approximations can be found, for instance, in Ref. . The numerical representation of the respective self-energy expressions is implemented in the C++ module hubbard_chain_selfen_impl.cpp. Below we explain the key routines. One remark on naming conventions: here and in what follows, we denote the time step by $$\Delta t$$ (variable dt) to avoid confusion with the single-particle Hamiltonian of the noninteracting system ($$h^0$$, represented by the matrix h0).


M. P. von Friesen, C. Verdozzi, C.-O. Almbladh, Successes and Failures of Kadanoff-Baym Dynamics in Hubbard Nanoclusters, Phys. Rev. Lett. 103 (17).

## Self-energy approximation: second-Born

The 2B approximation corresponds to the second-order expansion in terms of the Hubbard repulsion $$U(t)$$, which we treat as time-dependent here for generality. Defining the GF with respect to the lattice basis, $$G_{ij,\sigma}(t,t^\prime)=-i\langle T_\mathcal{C}\{\hat{c}_{i\sigma}(t) \hat{c}^\dagger_{j\sigma}(t^\prime)\}\rangle$$, the 2B self-energy is defined by \begin{align} \label{eq:sigma_hubb_2b} \Sigma_{ij,\sigma}(t,t^\prime) = U(t)U(t^\prime) G_{ij,\sigma}(t,t^\prime) G_{ij,\bar\sigma}(t,t^\prime)G_{ji,\bar\sigma}(t^\prime,t) \ . \end{align} The 2B self-energy $$\eqref{eq:sigma_hubb_2b}$$ is implemented in two steps. (i) The (per-spin) polarization $$P_{ij,\sigma}(t,t^\prime)= -i G_{ij,\sigma}(t,t^\prime)G_{ji,\sigma}(t^\prime,t)$$ is computed using the routine Bubble1 and a subsequent multiplication by $$-1$$. (ii) The self-energy is then given by $$\Sigma_{ij,\sigma}(t,t^\prime) = i U(t) U(t^\prime) G_{ij,\sigma}(t,t^\prime) P_{ij,\bar\sigma}(t,t^\prime)$$, which corresponds to a bubble diagram computed by the routine Bubble2. Inspecting the Keldysh components of the GFs, one notices that the polarization $$P_{ij,\sigma}(t,t^\prime)$$ is needed on a time slice only. Dropping the spin dependence, the 2B self-energy is computed by the routine Sigma_2B as follows:

void Sigma_2B(int tstp, GREEN &G, CFUNC &U, GREEN &Sigma){
int nsites=G.size1();
int ntau=G.ntau();
GREEN_TSTP Pol(tstp,ntau,nsites,BOSON);
Polarization(tstp, G, Pol);
Pol.right_multiply(U);
Pol.left_multiply(U);
for(int i=0; i<nsites; i++){
for(int j=0; j<nsites; j++){
cntr::Bubble2(tstp,Sigma,i,j,G,i,j,Pol,i,j);
}
}
}

First, the polarization Pol, which represents $$P_{ij}(t,t^\prime)$$, is defined as a temporary time step. After computing $$P_{ij}(t,t^\prime)$$ by the function

void Polarization(int tstp, GREEN &G, GREEN_TSTP &Pol){
int nsites=G.size1();
for(int i=0; i<nsites; i++){
for(int j=0; j<nsites; j++){
cntr::Bubble1(tstp,Pol,i,j,G,i,j,G,i,j);
}
}
Pol.smul(-1.0);
}

the lines

Pol.right_multiply(U);
Pol.left_multiply(U);

perform the operation $$P_{ij}(t,t^\prime) \rightarrow P_{ij}(t,t^\prime) U(t^\prime)$$ and $$P_{ij}(t,t^\prime) \rightarrow U(t)P_{ij}(t,t^\prime)$$, respectively. Finally, cntr::Bubble2 computes $$\Sigma_{ij}(t,t^\prime)$$.

## Self-energy approximation: GW

As the next approximation to the self-energy, we consider the $$GW$$ approximation. We remark that we formally treat the Hubbard interaction as spin-independent (as in Ref. ), while the spin summation in the polarization $$P$$ (which is forbidden by the Pauli principle) is excluded by the corresponding prefactor. The analogous approximation for the explicitly spin-dependent interaction (spin-$$GW$$) is also discussed in Ref. .

Within the same setup as above, the $$GW$$ approximation is defined by \begin{align} \Sigma_{ij}(t,t^\prime) = i G_{ij}(t,t^\prime) \delta W_{ij}(t,t^\prime) \ , \end{align} where $$\delta W_{ij}(t,t^\prime)$$ denotes the dynamical part of the screened interaction $$W_{ij}(t,t^\prime) = U \delta_{ij}\delta_\mathcal{C}(t,t^\prime) + \delta W_{ij}(t,t^\prime)$$. We compute $$\delta W_{ij}(t,t^\prime)$$ from the charge susceptibility $$\chi_{ij}(t,t^\prime)$$ by $$\delta W_{ij}(t,t^\prime) = U(t) \chi_{ij}(t,t^\prime) U(t^\prime)$$. In turn, the susceptibility obeys the Dyson equation \begin{align} \label{eq:dyson_chi} \chi = P + P\ast U \ast \chi \ , \end{align} where $$P$$ stands for the irreducible polarization $$P_{ij}(t,t^\prime)=-i G_{ij}(t,t^\prime)G_{ji}(t^\prime,t)$$. The strategy to compute the $$GW$$ self-energy with libcntr thus consists of three steps:

1. Computing the polarization $$P_{ij}(t,t^\prime)$$ by Bubble1.

2. Solving the Dyson equation $$\eqref{eq:dyson_chi}$$ as a VIE. By defining the kernel $$K_{ij}=-P_{ij}(t,t^\prime)U(t^\prime)$$ and its hermitian conjugate, Eq. $$\eqref{eq:dyson_chi}$$ amounts to $$[1+K]\ast \chi=P$$, which is solved using vie2.

3. Computing the self-energy by Bubble2.

The implementation of step 1 has already been shown above. For step 2, we distinguish between the equilibrium and time-stepping phases on the one hand, and the starting phase on the other. For the former, we have defined the routine

void GenChi(int tstp, double dt, double beta, GREEN &Pol,
CFUNC &U, GREEN &PxU, GREEN &UxP, GREEN &Chi, int order){
PxU.set_timestep(tstp, Pol);
UxP.set_timestep(tstp, Pol);
PxU.right_multiply(tstp, U);
UxP.left_multiply(tstp, U);
PxU.smul(tstp,-1.0);
UxP.smul(tstp,-1.0);
if(tstp==-1){
cntr::vie2_mat(Chi,PxU,UxP,Pol,beta,CINTEG(order));
} else{
cntr::vie2_timestep(tstp,Chi,PxU,UxP,Pol,CINTEG(order),beta,dt);
}
}

Here, PxU and UxP correspond to the kernel $$K_{ij}$$ and its hermitian conjugate, respectively. Analogously, the starting routine is implemented as

void GenChi(double dt, double beta, GREEN &Pol, CFUNC &U,
GREEN &PxU, GREEN &UxP, GREEN &Chi, int order){
for(int n = 0; n <= order; n++){
PxU.set_timestep(n, Pol);
UxP.set_timestep(n, Pol);
PxU.right_multiply(n, U);
UxP.left_multiply(n, U);
PxU.smul(n,-1.0);
UxP.smul(n,-1.0);
}
cntr::vie2_start(Chi,PxU,UxP,Pol,CINTEG(order),beta,dt);
}

Finally, the self-energy is computed by

void Sigma_GW(int tstp, GREEN &G, CFUNC &U, GREEN &Chi, GREEN &Sigma){
int nsites=G.size1();
int ntau=G.ntau();
GREEN_TSTP deltaW(tstp,ntau,nsites,CNTR_BOSON);
Chi.get_timestep(tstp,deltaW);
deltaW.left_multiply(U);
deltaW.right_multiply(U);
for(int i=0; i<nsites; i++){
for(int j=0; j<nsites; j++){
cntr::Bubble2(tstp,Sigma,i,j,G,i,j,deltaW,i,j);
}
}
}

## Self-energy approximation: T-matrix

The particle-particle ladder $$T_{ij}(t,t^\prime)$$ represents an effective particle-particle interaction, which defines the corresponding self-energy by \begin{align} \Sigma_{ij}(t,t^\prime) = i U(t) T_{ij}(t,t^\prime) U(t^\prime) G_{ji}(t^\prime,t) \ . \end{align} The $$T$$-matrix, in turn, is obtained by solving the Dyson equation $$T=\Phi - \Phi \ast U \ast T$$, where $$\Phi$$ corresponds to the particle-particle bubble $$\Phi_{ij}(t,t^\prime) = -i G_{ij}(t,t^\prime) G_{ij}(t,t^\prime)$$. Hence, the procedure for numerically computing $$\Sigma_{ij}(t,t^\prime)$$ is analogous to the $$GW$$ approximation:

1. Compute $$\Phi_{ij}(t,t^\prime)$$ by Bubble2 and multiply by $$-1$$.

2. Calculate the kernel $$K_{ij}(t,t^\prime) = \Phi_{ij}(t,t^\prime) U(t^\prime)$$ and its hermitian conjugate and solve the VIE $$[1+K]\ast T = \Phi$$ by vie2.

3. Perform the operation $$T_{ij}(t,t^\prime) \rightarrow U(t) T_{ij}(t,t^\prime) U(t^\prime)$$ and compute the self-energy by Bubble1.
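
The T-matrix routine itself is not reproduced here; combining the building blocks already shown above (Bubble1, get_timestep, left_multiply, right_multiply), a schematic sketch of step 3 could look as follows. This is an illustration only, not the actual implementation in hubbard_chain_selfen_impl.cpp, and it assumes TPP already holds the particle-particle ladder obtained from $$[1+K]\ast T = \Phi$$ via vie2, in analogy to GenChi in the $$GW$$ case. Sign conventions follow the description of Bubble1 given for the 2B approximation.

```cpp
// Schematic sketch only: step 3 of the T-matrix self-energy at time step tstp.
void Sigma_TPP(int tstp, GREEN &G, CFUNC &U, GREEN &TPP, GREEN &Sigma){
int nsites=G.size1();
int ntau=G.ntau();
// effective interaction U(t) T_{ij}(t,t') U(t') on the current time slice
GREEN_TSTP UTU(tstp,ntau,nsites,BOSON);
TPP.get_timestep(tstp,UTU);
UTU.left_multiply(U);
UTU.right_multiply(U);
// Sigma_{ij}(t,t') ~ i U(t) T_{ij}(t,t') U(t') G_{ji}(t',t): a Bubble1-type
// diagram, since the second GF enters with reversed time arguments
for(int i=0; i<nsites; i++){
for(int j=0; j<nsites; j++){
cntr::Bubble1(tstp,Sigma,i,j,UTU,i,j,G,i,j);
}
}
}
```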

## Mean-field Hamiltonian and onsite quench

So far, we have described how to compute the dynamical contribution to the self-energy. The total self-energy furthermore includes the Hartree-Fock (HF) contribution, which we incorporate into the mean-field Hamiltonian $$h^{\mathrm{MF}}_{ij}(t) = h^{0}_{ij}(t) + U n_i$$ with the occupation (per spin) $$n_i= \langle \hat{c}^\dagger_i \hat{c}_i \rangle$$. In the example program, the mean-field Hamiltonian is represented by the contour function hmf. Updating hmf is accomplished by computing the density matrix using the herm_matrix class routine density_matrix.

The general procedure to implement a quench of some parameter $$\lambda$$ at $$t=0$$ is to represent $$\lambda$$ by a contour function $$\lambda_n$$: $$\lambda_{-1}$$ corresponds to the pre-quench value which determines the thermal equilibrium, while $$\lambda_n$$ with $$n\ge 0$$ governs the time evolution. In the example program, we simplify this procedure by redefining $$h^{0}_{ij} \rightarrow h^{0}_{ij} + w_0 \delta_{i,1}\delta_{j,1}$$ after the Matsubara Dyson equation has been solved.

## Generic structure of the example program

The programs are structured similarly to the previous examples. After reading variables from file and initializing the variables and classes, the Matsubara Dyson equation is solved in a self-consistent fashion. The example below illustrates this procedure for the 2B approximation.

tstp=-1;
gtemp = GREEN(SolverOrder,Ntau,Nsites,FermBos);
gtemp.set_timestep(tstp,G);
for(int iter=0;iter<=MatsMaxIter;iter++){
// update mean field
hubb::Ham_MF(tstp, G, Ut, h0, hmf);
// update self-energy
hubb::Sigma_2B(tstp, G, Ut, Sigma);
// solve Dyson equation
cntr::dyson_mat(G, Sigma, MuChem, hmf, CINTEG(SolverOrder), beta);
// self-consistency check
err = cntr::distance_norm2(tstp,G,gtemp);
if(err<MatsMaxErr){
break;
}
gtemp.set_timestep(tstp,G);
}

Updating the mean-field Hamiltonian (hubb::Ham_MF), the self-energy (hubb::Sigma_2B) and solving the corresponding Dyson equation (cntr::dyson_mat) is repeated until self-consistency has been reached, which in practice means that the deviation between the previous and updated GF is smaller than MatsMaxErr. For other self-energy approximations, the steps described above (updating auxiliary quantities) have to be performed before the self-energy can be updated.

Once the Matsubara Dyson equation has been solved up to the required convergence threshold, the starting algorithm for time steps $$n=0,\dots,k$$ can be applied. To reach self-consistency for the first few time steps, we employ the bootstrapping loop:

for (int iter = 0; iter <= BootstrapMaxIter; iter++) {
// update mean field
for(tstp=0; tstp<=SolverOrder; tstp++){
hubb::Ham_MF(tstp, G, Ut, h0, hmf);
}
// update self-energy
for(tstp=0; tstp<=SolverOrder; tstp++){
hubb::Sigma_2B(tstp, G, Ut, Sigma);
}
// solve Dyson equation
cntr::dyson_start(G, MuChem, hmf, Sigma, CINTEG(SolverOrder), beta, dt);
// self-consistency check
err=0.0;
for(tstp=0; tstp<=SolverOrder; tstp++) {
err += cntr::distance_norm2(tstp,G,gtemp);
}
if(err<BootstrapMaxErr && iter>2){
break;
}
for(tstp=0; tstp<=SolverOrder; tstp++) {
gtemp.set_timestep(tstp,G);
}
}

Finally, after the bootstrapping iteration has converged, the time propagation for time steps $$n>k$$ is launched. The self-consistency at each time step is accomplished by iterating the update of the mean-field Hamiltonian, GF and self-energy over a fixed number of CorrectorSteps. As an initial guess, we employ a polynomial extrapolation of the GF from time step $$n-1$$ to $$n$$, as implemented in the routine cntr::extrapolate_timestep. Thus, the time propagation loop takes the form

for(tstp = SolverOrder+1; tstp <= Nt; tstp++){
// Predictor: extrapolation
cntr::extrapolate_timestep(tstp-1,G,CINTEG(SolverOrder));
// Corrector
for (int iter=0; iter < CorrectorSteps; iter++){
// update mean field
hubb::Ham_MF(tstp, G, Ut, h0, hmf);
// update self-energy
hubb::Sigma_2B(tstp, G, Ut, Sigma);
// solve Dyson equation
cntr::dyson_timestep(tstp,G,MuChem,hmf,Sigma,CINTEG(SolverOrder),beta,dt);
}
}

After the GF has been computed for all required time steps, we compute the observables. In particular, the conservation of the total energy provides a good criterion to assess the accuracy of the calculation. The total energy for the Hubbard model $$\eqref{eq:hubbmodel1}$$ is given in terms of the Galitskii-Migdal formula \begin{align} E = \frac{1}{2}\mathrm{Tr}\left[\rho(t)\left(h^{(0)} +h^{\mathrm{MF}}\right)\right] - \frac{i}{2} \mathrm{Tr}\left[\Sigma\ast G\right]^<(t,t) \ .\label{Eq:Total_ener} \end{align} The last term, known as the correlation energy, is most conveniently computed by the routine:

Ecorr = cntr::correlation_energy(tstp, G, Sigma, I, beta, dt);

Here, I is an instance of the integrator class.

## Running the example programs

There are three programs for the 2B, $$GW$$ and $$T$$-matrix approximations, respectively: hubbard_chain_2b.x, hubbard_chain_gw.x, hubbard_chain_tpp.x. The driver script demo_hubbard_chain.py located in the utils/ directory provides a simple interface to these programs. After defining the model and convergence parameters, the script creates the corresponding input file and launches all three programs in a loop. The occupation of the first lattice site $$n_1(t)$$ and the kinetic and total energy are then read from the output files and plotted. The script demo_hubbard_chain.py also allows one to pass reference data as an optional argument, which can be used to compare, for instance, to exact results.

## Discussion

Following Ref. , we have selected two prominent examples illustrating the shortcomings of weak-coupling diagrammatic treatments for finite systems and strong excitations. The regimes where the discussed approximations work well are systematically explored in, for instance, refs. .

For the Hubbard dimer ($$M=2$$) at half filling ($$n=1/2$$), a strong excitation (here $$w_0=5$$) leads to the phenomenon of artificial damping: although the density $$n_1(t)$$ exhibits an oscillatory behavior for all times in an exact treatment, the approximate NEGF treatment, with either self-energy approximation considered here, shows damping until an unphysical steady state is reached (see the figure below (a)–(b)). It is instructive to look at the total energy, shown as dashed lines in (b). The conservation of total energy is illustrated in (c)–(d). For the relatively large time step $$h=0.025$$, the energy is conserved up to $$4\times 10^{-5}$$ in the considered time interval, while halving the time step to $$h=0.0125$$ improves the accuracy of the energy conservation by two orders of magnitude.

In (e) in the figure below we show the corresponding dynamics of the occupation for $$M=4$$ and quarter filling. In the regime of small filling, the $$T$$-matrix approximation is known to provide a good description for any strength of the interaction. This is confirmed by (e), where the 2B and $$GW$$ approximation lead to artificial damping, while the $$n_1(t)$$ within the $$T$$-matrix agrees rather well with the exact result.

# Holstein impurity model coupled to a bath

In this example, we consider an electron-phonon (el-ph) coupled system, which consists of two electron sites and a phonon mode coupled to one of the electron sites: \begin{align} \hat{H}(t)&=-J(t)\sum_{\sigma}[{\hat{c}}_{0\sigma}^{\dagger}{\hat{c}}_{1\sigma}+{\hat{c}}_{1\sigma}^{\dagger}{\hat{c}}_{0\sigma}] + \sum_{i} \epsilon_i {\hat{n}}_{i} \nonumber\\ &+\omega_0 {\hat{a}}^{\dagger} {\hat{a}}+g(t) ({\hat{a}}^{\dagger}+{\hat{a}}){\hat{n}}_0.\label{eq:Holstein_imp} \end{align} Here, $${\hat{c}}^\dagger_{i\sigma}$$ is the creation operator of an electron at site $$i=0,1$$, $$\hat{n}_i = \sum_{\sigma}{\hat{c}}_{i\sigma}^{\dagger}{\hat{c}}_{i\sigma}$$, and $${\hat{a}}^\dagger$$ is the creation operator of the Einstein phonon coupled to the $$0$$th site. $$J(t)$$ is the hopping parameter of the electrons, $$\epsilon_i$$ is the energy level of each site, $$\omega_0$$ is the phonon frequency and $$g(t)$$ is the el-ph coupling. If we regard the $$0$$th site as an impurity, this model is nothing but a Holstein-type impurity model with a single bath site.

Now we introduce the GFs for electrons and phonons as \begin{align} \hat{G}_{\sigma}(t,t') &= -i \Bigl\langle T_{\mathcal{C}}{\hat{\Psi}}_\sigma(t) {\hat{\Psi}}^{\dagger}_\sigma (t')\Bigr\rangle, \\ D(t,t') &= -i\langle T_{\mathcal{C}} \Delta{\hat{X}}(t) \Delta{\hat{X}}(t') \rangle, \end{align} where $${\hat{\Psi}}_{\sigma}^{\dagger}=[{\hat{c}}^\dagger_{0\sigma}\;\; {\hat{c}}^\dagger_{1\sigma}]$$, $${\hat{X}}= {\hat{a}}^\dagger + {\hat{a}}$$ and $$\Delta{\hat{X}}(t) = {\hat{X}}(t) - \langle {\hat{X}}(t) \rangle$$. We note that the phonon propagator defined in this way consists only of connected diagrams when expressed in terms of Feynman diagrams.

In the present model, since the system is coupled to the phonon only at site 0, one can trace out the contribution from the bath site (site 1) and focus on the GF at site 0, $$G_{00}$$. Then the GFs are determined by evaluating the corresponding Dyson equations, \begin{align} \label{eq:Dyson_g} [i\partial_{t}-\epsilon_0-\Sigma^{\rm MF}(t)]G(t,t')-[(\Delta + \Sigma^{\rm corr})* G](t,t')=\delta_{\mathcal C}(t,t') \ , \end{align} \begin{align} \label{eq:D_dyson} D(t,t^\prime) = D_{0}(t,t^\prime) + [D_{0}\ast \Pi \ast D](t,t^\prime) \ , \end{align} and the phonon displacement, $$X(t)=\langle {\hat{X}}(t)\rangle$$, which is described by \begin{align} \label{eq:X} X(t) = -\frac{2 g(0)}{\omega_0} n_0(0) + \int^t_0 d\bar{t}\, D_0^R(t,\bar{t})[g({\bar t})n_0(\bar{t})-g(0)n_0(0)]. \end{align} Here the mean-field contribution $$\Sigma^{\rm MF}(t)$$ is given by \begin{align} \Sigma^{\rm MF}(t) &= g(t)X(t),\label{eq:H_mf_Holstein} \end{align} and $$\Delta(t,t')$$ is the hybridization function, which is expressed as \begin{align} \Delta(t,t') = J_0(t) G_{0,11}(t,t') J_0(t') \label{eq:Hyb_imp} \end{align} with $$[i\partial_t -\epsilon_j]G_{0,jj}(t,t')=\delta_\mathcal{C}(t,t')$$. $$\Sigma^{\rm corr}(t,\bar{t})$$ is the self-energy beyond the mean field, $$D_{0}(t,t')\equiv-i\langle T_{\mathcal{C}} \Delta{\hat{X}}(t) \Delta{\hat{X}}(t') \rangle_0$$ is the free phonon propagator, $$\Pi$$ is the phonon self-energy and $$n_0(t)= \langle {\hat{n}}_0(t) \rangle$$.

For the general impurity model, the hybridization function $$\Delta(t,t')$$ attains a more general form, but the rest (Eqs. $$\eqref{eq:Dyson_g}$$-$$\eqref{eq:H_mf_Holstein}$$) keeps the same form. In addition, the same type of impurity problem is solved in nonequilibrium DMFT, where the hybridization function is determined self-consistently.

Once $$G_{00}$$ and $$\Sigma$$ are obtained, the remaining electron GFs and the energies can be calculated as follows. For the remaining GFs, \begin{align} \label{eq:G_rest_Hol_imp} &G_{10} = -G_{0,11} * J * G_{00}\\ &G_{01} = -G_{00} * J * G_{0,11}\\ & [i\partial_t -\epsilon_1]G_{11}(t,t^\prime) -[J \ast \widetilde{G}_{00} \ast J \ast G_{11}](t,t^\prime) =\delta_\mathcal{C}(t,t^\prime) \end{align} with \begin{align} [i\partial_t-\epsilon_0-\Sigma^{\mathrm{MF}}(t)]\widetilde{G}_{00}(t,t') -[\Sigma^{\rm corr} \ast \widetilde{G}_{00}](t,t^\prime) =\delta_\mathcal{C}(t,t'). \end{align}

The expressions for the energies are

\begin{align*} E_{\rm kin}(t)&=-J(t)\sum_{\sigma}[\langle{\hat{c}}_{0\sigma}^{\dagger}(t){\hat{c}}_{1\sigma}(t)\rangle+\langle{\hat{c}}_{1\sigma}^{\dagger}(t){\hat{c}}_{0\sigma}(t)\rangle] + \sum_{i} \epsilon_i \langle{\hat{n}}_{i}(t) \rangle \nonumber \\ &=-2i[\Delta \ast G_{00}+G_{00} \ast \Delta]^<(t,t) + \sum_{i} \epsilon_i \langle{\hat{n}}_{i}(t) \rangle . \end{align*}

\begin{align*} E_{\rm nX}(t)&=g(t)\langle X(t){\hat{n}}_0(t)\rangle=E^{\rm MF}_{\rm nX}(t)+E^{\rm corr}_{\rm nX}(t)\\ E^{\rm MF}_{\rm nX}(t)&=g(t)\langle X(t) \rangle \langle {\hat{n}}_0(t)\rangle\\ E^{\rm corr}_{\rm nX}(t) & = - 2i[\Sigma^{\rm corr} * G_{00}]^<(t,t). \end{align*}

\begin{align*} E_{\rm ph}(t)=\frac{\omega_{0}}{4} [iD^<(t,t) + \langle \hat{X}(t)\rangle^2]+\frac{\omega_{0}}{4} [iD_{\rm PP}^<(t,t) + \langle \hat{P}(t)\rangle^2]. \end{align*} Here $$D_{\rm PP}(t,t')=-i\langle T_\mathcal{C} \Delta\hat{P}(t) \Delta\hat{P}(t')\rangle$$ with $$\hat{P}=\frac{1}{i}({\hat{a}}-{\hat{a}}^\dagger)$$ and $$\Delta\hat{P}(t) \equiv \hat{P}(t) - \langle \hat{P}(t)\rangle$$.

Now the question is how to evaluate the self-energies. In this example, we consider two types of weak-coupling expansions as impurity solvers, namely the self-consistent Migdal approximation (sMig) and the unrenormalized Migdal approximation (uMig). The numerical evaluation of the respective self-energy expressions is implemented in the C++ module Holstein_impurity_impl.cpp.

## Self-consistent Migdal approximation as an impurity solver: sMig

The self-consistent Migdal approximation is the lowest-order ($$\mathcal{O}(g^2)$$) self-consistent approximation. Within this approximation, one can treat the dynamics and renormalization of the phonons through the electron-phonon coupling. Even though the phonons can absorb energy from the electrons, the total energy is conserved in this approximation. The impurity self-energies for the electron and the phonon are approximated as \begin{align} \Sigma^{\rm sMig,corr}(t,t') & =ig(t)g(t') D(t,t')G_{00}(t,t'),\label{eq:sMig_el} \end{align} \begin{align} \Pi^{\rm sMig}(t,t')&=-2 i g(t)g(t') G_{00}(t,t') G_{00}(t',t).\label{eq:sMig_ph} \end{align}

In the sample program, the sMig self-energy is computed by the routine Sigma_Mig. We provide two interfaces: one for $$0\leq {\rm tstp}\leq {\rm SolverOrder}$$ (the bootstrapping part) and one for $${\rm tstp}=-1$$ or $${\rm tstp}> {\rm SolverOrder}$$ (the Matsubara part and the time-stepping part). Here we show the latter as an example:

void Sigma_Mig(int tstp, GREEN &G, GREEN &Sigma, GREEN &D0, GREEN &D, GREEN &Pi,
               GREEN &D0_Pi, GREEN &Pi_D0, CFUNC &g_el_ph, double beta, double h,
               int SolverOrder, int MAT_METHOD){
  int Norb=G.size1();
  int Ntau=G.ntau();
  // step 1: phonon self-energy Pi = -2i gGgG
  GREEN_TSTP gGg(tstp,Ntau,Norb,FERMION);
  G.get_timestep(tstp,gGg);
  gGg.right_multiply(g_el_ph);
  gGg.left_multiply(g_el_ph);
  GREEN_TSTP Pol(tstp,Ntau,1,BOSON);
  Pi.set_timestep_zero(tstp);
  Bubble1(tstp,Pol,0,0,gGg,0,0,G,0,0);
  Pi.incr_timestep(tstp,Pol,-2.0);
  // step 2: solve the phonon Dyson equation to evaluate D
  step_D(tstp, D0, D, Pi, D0_Pi, Pi_D0, beta, h, SolverOrder, MAT_METHOD);
  // step 3: evaluate the electron self-energy
  Bubble2(tstp,Sigma,0,0,D,0,0,gGg,0,0);
}

This routine is separated into three steps:

1. Evaluating the phonon self-energy Eq. $$\eqref{eq:sMig_ph}$$ using Bubble1.

2. Solving the Dyson equation of the phonon Green’s function Eq. $$\eqref{eq:D_dyson}$$ using a routine step_D, see below. Here MAT_METHOD specifies which method is used to solve the Dyson equation of the Matsubara part.

3. Evaluating the electron self-energy Eq. $$\eqref{eq:sMig_el}$$ using Bubble2.

For the bootstrapping part, we perform these steps for all $$0\leq{\rm tstp}\leq {\rm SolverOrder}$$ at once, and start_D is used instead of step_D.

The routine for solving the Dyson equation of the phonons is implemented using the cntr routines vie2_*. Again, for the bootstrapping part and the remaining time steps we provide two routines, start_D and step_D. The latter reads:

void step_D(int tstp, GREEN &D0, GREEN &D, GREEN &Pi, GREEN &D0_Pi, GREEN &Pi_D0,
            double beta, double h, int SolverOrder, int MAT_METHOD){
  // step 1: set D0_Pi = -D0*Pi and Pi_D0 = -Pi*D0
  cntr::convolution_timestep(tstp,D0_Pi,D0,Pi,CINTEG(SolverOrder),beta,h);
  D0_Pi.smul(tstp,-1.0);
  cntr::convolution_timestep(tstp,Pi_D0,Pi,D0,CINTEG(SolverOrder),beta,h);
  Pi_D0.smul(tstp,-1.0);
  // step 2: solve [1-D0*Pi]*D = [1+D0_Pi]*D = D0
  if(tstp==-1){
    cntr::vie2_mat(D,D0_Pi,Pi_D0,D0,beta,CINTEG(SolverOrder),MAT_METHOD);
  } else {
    cntr::vie2_timestep(tstp,D,D0_Pi,Pi_D0,D0,CINTEG(SolverOrder),beta,h);
  }
}

This routine is separated into two steps:

1. Preparing $$-D_0 \ast \Pi$$, which is the kernel of Eq. $$\eqref{eq:D_dyson}$$, and the corresponding $$-\Pi \ast D_0$$ using cntr::convolution_timestep.
2. Solving Eq. $$\eqref{eq:D_dyson}$$ as a VIE with vie2_* for the corresponding time step. In start_D, we perform these steps for $$0 \leq {\rm tstp} \leq {\rm SolverOrder}$$ at once, using cntr::vie2_start.

Here we note that the free phonon GF, $$D_{0}(t,t')$$, can be prepared initially using a cntr routine as

cntr::green_single_pole_XX(D0,Phfreq_w0,beta,dt);
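For reference, with the convention $$D_{0}(t,t')\equiv-i\langle T_\mathcal{C}\Delta{\hat{X}}(t) \Delta{\hat{X}}(t') \rangle_0$$, the retarded component of the free phonon GF is analytically $$D_0^R(t,t')=-2\theta(t-t')\sin[\omega_0(t-t')]$$. A standalone sketch in plain C++ (not using libcntr; the function name is ours):

```cpp
#include <cmath>

// Retarded free-phonon propagator for X = a + a^dag at frequency w0:
// D0^R(t,t') = -2 theta(t - t') sin(w0 (t - t')).
double D0_ret(double t, double tp, double w0) {
    return t >= tp ? -2.0 * std::sin(w0 * (t - tp)) : 0.0;
}
```

This closed form is convenient for checking the Matsubara and retarded components returned by green_single_pole_XX on a coarse grid.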

# DMFT for Holstein model

We now consider the lattice version of the previous problem, the so-called Holstein model, which is a fundamental model for el-ph coupled systems. The Hamiltonian of the single-band Holstein model is \begin{align} H(t)&=-J(t)\sum_{\langle i,j\rangle,\sigma}{\hat{c}}_{i,\sigma}^{\dagger}{\hat{c}}_{j,\sigma}-\mu\sum_i {\hat{n}}_i +\omega_0\sum_i {\hat{a}}^{\dagger}_i {\hat{a}}_i+g(t)\sum_i ({\hat{a}}_i^{\dagger}+{\hat{a}}_i){\hat{n}}_i.\label{eq:Holstein} \end{align} Here $$\mu$$ is the chemical potential and the rest is the same as in the previous section. For simplicity, in the following we consider the Bethe lattice (with infinite coordination number), which has the semi-circular density of states $$\rho_0(\epsilon) = \frac{1}{2\pi J^{*2}} \sqrt{4J^{*2}-\epsilon^2}$$ for the free problem. Here we take $$J^*=1.0$$. The GFs are introduced as \begin{align} \hat{G}_{ij}(t,t') &= -i \langle T_{\mathcal{C}}{\hat{c}}_{i,\sigma}(t) {\hat{c}}_{j,\sigma}^{\dagger}(t') \rangle, \\ D_{ij}(t,t') &= -i\langle T_{\mathcal{C}} \Delta{\hat{X}}_i(t) \Delta{\hat{X}}_j(t') \rangle, \end{align} where we have assumed spin symmetry.
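As a quick sanity check, this semicircular DOS is normalized to one over the band $$|\epsilon|\le 2J^*$$. A minimal standalone sketch (plain C++, independent of the example programs; the function names are ours):

```cpp
#include <cmath>

// Semicircular density of states of the infinite-coordination Bethe lattice,
// rho0(eps) = sqrt(4 J*^2 - eps^2) / (2 pi J*^2), nonzero for |eps| <= 2 J*.
double bethe_dos(double eps, double Jstar) {
    double band = 4.0 * Jstar * Jstar - eps * eps;
    return band > 0.0 ? std::sqrt(band) / (2.0 * M_PI * Jstar * Jstar) : 0.0;
}

// Trapezoidal check that rho0 integrates to 1 over the band [-2 J*, 2 J*].
double bethe_dos_norm(int n, double Jstar) {
    double a = -2.0 * Jstar, h = 4.0 * Jstar / n, sum = 0.0;
    for (int i = 0; i <= n; ++i)
        sum += ((i == 0 || i == n) ? 0.5 : 1.0) * bethe_dos(a + i * h, Jstar);
    return sum * h;
}
```

The band edges at $$\pm 2J^*$$ and the value $$\rho_0(0)=1/(\pi J^*)$$ for $$J^*=1$$ are easy reference points when inspecting equilibrium spectra.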

In this example, we treat the dynamics of the Holstein model using the dynamical mean-field theory (DMFT). In DMFT, the lattice model is mapped onto an effective impurity model coupled to free electron baths. The effective impurity model for the Holstein model is the model considered in the previous section with a general hybridization $$\Delta(t,t')$$ (more than one bath site). The hybridization function is determined self-consistently such that the impurity Green’s function ($$G_{\rm imp}(t,t')$$) and the impurity self-energy ($$\Sigma_{\rm imp}$$) are identical to the local Green’s function ($$G_{\rm loc}=G_{ii}$$) and the local self-energy of the lattice problem. For the Bethe lattice, the DMFT self-consistency simplifies and the hybridization function is written as \begin{align} \Delta(t,t')=J^*(t)J^*(t') G_{\rm loc}(t,t').\label{eq:bethe_condition} \end{align}

When the impurity self-energy is given, the impurity GFs are determined by solving Eqs. $$\eqref{eq:Dyson_g}$$, $$\eqref{eq:D_dyson}$$ and $$\eqref{eq:X}$$, identifying $$G_{00}$$ with $$G_{\rm imp}$$ and $$D$$ with $$D_{\rm imp}$$. For the impurity self-energy, we can again use the self-consistent Migdal approximation or the unrenormalized Migdal approximation. Therefore, the main difference between the DMFT treatment of the Holstein model on the Bethe lattice and the Holstein impurity problem of the previous section is whether the hybridization function is updated according to Eq. $$\eqref{eq:bethe_condition}$$.

After the NEGFs are obtained, we can calculate observables such as energies. The expressions for the energies (per site) are the same as for the impurity, except for the kinetic energy, which is now expressed as \begin{align} E_{\rm kin}(t)&=\frac{1}{N}\sum_{\langle i,j\rangle,\sigma}-J(t)\langle c_{i,\sigma}^{\dagger}(t)c_{j,\sigma}(t)\rangle =-2i[\Delta \ast G_{\rm loc}]^<(t,t). \end{align}

We also note that, because of the attractive interaction mediated by the phonons, this model shows an s-wave superconducting (SC) state at low enough temperatures. We can treat this situation as well using the Nambu formalism. Even though we skip the detailed explanation of the formulation for the SC phase, we also provide sample programs for these cases.

## Generic structure of the example program

The corresponding programs are implemented in Holstein_bethe_Migdal.cpp and Holstein_bethe_uMig.cpp for normal states and in Holstein_bethe_Nambu_Migdal.cpp and Holstein_bethe_Nambu_uMig.cpp for superconducting states. The structure of the code is almost the same as that for the impurity problem; the major difference is the update of the hybridization function. To clarify this, we show the time-propagation part of Holstein_bethe_Migdal.cpp:

for(tstp = SolverOrder+1; tstp <= Nt; tstp++){
  // Predictor: extrapolation
  cntr::extrapolate_timestep(tstp-1,G,CINTEG(SolverOrder));
  cntr::extrapolate_timestep(tstp-1,Hyb,CINTEG(SolverOrder));
  // Corrector
  for (int iter=0; iter < CorrectorSteps; iter++){
    cdmatrix rho_M(1,1), Xph_tmp(1,1);
    cdmatrix g_elph_tmp(1,1), h0_imp_MF_tmp(1,1);
    // update self-energy
    Hols::Sigma_Mig(tstp, G, Sigma, D0, D, Pi, D0_Pi, Pi_D0, g_elph_t, beta, dt, SolverOrder);
    // update phonon displacement
    G.density_matrix(tstp,rho_M);
    rho_M *= 2.0; // spin factor 2
    n_tot_t.set_value(tstp,rho_M);
    Hols::get_phonon_displace(tstp, Xph_t, n_tot_t, g_elph_t, D0, Phfreq_w0, SolverOrder, dt);
    // solve the impurity Dyson equation
    Xph_t.get_value(tstp,Xph_tmp);
    g_elph_t.get_value(tstp,g_elph_tmp);
    h0_imp_MF_tmp = h0_imp + Xph_tmp*g_elph_tmp;
    h0_imp_MF_t.set_value(tstp,h0_imp_MF_tmp);
    Hyb_Sig.set_timestep(tstp,Hyb);
    Hyb_Sig.incr_timestep(tstp,Sigma,1.0);
    cntr::dyson_timestep(tstp, G, 0.0, h0_imp_MF_t, Hyb_Sig, CINTEG(SolverOrder), beta, dt);
    // update hybridization
    Hyb.set_timestep(tstp,G);
    Hyb.right_multiply(tstp,J_hop_t);
    Hyb.left_multiply(tstp,J_hop_t);
  }
}

At the beginning of each time step, we extrapolate the local GF and the hybridization, which serves as a predictor. Next, we iterate the DMFT self-consistency loop (corrector) until convergence is reached. In this loop, we first evaluate the impurity self-energy using the Migdal approximation. Secondly, using the self-energy obtained, we solve the Dyson equation for the impurity GF. Then we update the hybridization function by Eq. $$\eqref{eq:bethe_condition}$$.

## Running the example programs

For the normal states, there are two programs for sMig and uMig, respectively: Holstein_bethe_Migdal.x and Holstein_bethe_uMig.x. For the superconducting states, we also provide Holstein_bethe_Nambu_Migdal.x and Holstein_bethe_Nambu_uMig.x. In these programs, we use $$\mu_{mf}\equiv \mu_{0} - gX(0)$$ as an input parameter instead of $$\mu$$ ($$\mu$$ is determined in a post-processing step). For the normal phase, excitation by modulating the hopping and the el-ph coupling is implemented, while for the superconducting phase excitation by a pairing potential (the excitation term is $$F_{\rm sc}(t) \sum_i (\hat{c}^\dagger_{i,\uparrow} \hat{c}^\dagger_{i,\downarrow} + h.c. )$$) and by modulating the hopping is implemented. The driver scripts demo_Holstein.py and demo_Holstein_sc.py located in the utils/ directory provide a simple interface to these programs. They have essentially the same structure as demo_Holstein_impurity.py in the previous section.

## Discussion

In the figure below, we show the spectral functions in the normal phase and the superconducting phase in equilibrium using DMFT + sMig.

One can observe the opening of the SC gap below the transition temperature. Within DMFT + sMig, the phonon spectrum is renormalized, see (b). In the normal phase, the peak of the phonon spectrum is shifted from the bare value and acquires a finite width. When the electron-phonon coupling is sufficiently strong, a peculiar peak around the energy scale of the SC gap shows up in the superconducting phase. In (c)-(d), we show the time evolution of the kinetic energy after excitation by modulating the hopping, using DMFT + sMig (c) and DMFT + uMig (d). Within DMFT + sMig the kinetic energy approaches a value above the equilibrium value, while within DMFT + uMig it returns to the original value. In the former case, the total energy is conserved, and after the excitation the system is expected to approach an equilibrium state with an elevated temperature. In the latter case, the phonons act as a heat bath that absorbs the energy injected by the excitation, hence the system approaches the original equilibrium state.

# Hubbard model - translational invariant case

As in the section above, we consider the Hubbard Hamiltonian. Here, we assume a translationally invariant system with periodic boundary conditions. The translational invariance implies that all observables and propagators are diagonal in momentum space. Hence, all quantities can be labeled by the crystal momentum $${\mathbf{k}}$$ within the first Brillouin zone (BZ). This reduces the computational and storage complexity from $$\mathcal{O}(N_k^2)$$ for the real-space formalism introduced in the example of the Hubbard chain to $$\mathcal{O}(N_k).$$ Moreover, the Dyson equation is diagonal in momentum space and can be efficiently parallelized using distributed-memory parallelization with the Message Passing Interface (MPI).

We will consider a 1D chain described by the Hubbard model, see Eq. $$\eqref{eq:hubbmodel1}$$. The single particle part of the Hamiltonian can be diagonalized as \begin{align} \hat{H}_0 = \sum_{{\mathbf{k}}} \epsilon({\mathbf{k}}) \hat{c}_{{\mathbf{k}}}^{\dagger} \hat{c}_{{\mathbf{k}}}, \end{align} where we have introduced the free-electron dispersion $$\epsilon({\mathbf{k}})=-2 J \cos({\mathbf{k}}_x).$$ We will use a vector notation since the generalization to higher dimensional systems is straightforward. For the 1D chain used in the demonstration, the momentum has only one component $${\mathbf{k}}_x=k_x$$.

The system is excited via an electromagnetic field, which in translationally invariant systems is conveniently introduced using the Peierls substitution. The latter attaches the vector potential $${\mathbf{A}}({\mathbf{r}},t)$$ as a phase factor to the hopping \begin{align} J\rightarrow J \exp[-\frac{i q}{\hbar} \int_{{\mathbf{R}}_i}^{{\mathbf{R}}_j} d{\mathbf{r}}\, {\mathbf{A}}({\mathbf{r}},t)], \end{align} where $$q$$ is the charge of the electron. Using the Fourier transform and the fact that the vector potential varies only slowly on the length scale of the unit cell (dipole approximation), we arrive at the time-dependent single-particle dispersion $$\epsilon({\mathbf{k}},t)=\epsilon({\mathbf{k}}-\tfrac{q}{\hbar}{\mathbf{A}}(t))$$. The vector potential is obtained from the electric field as $${\mathbf{A}}(t)=-\int_0^t {\mathbf{E}}(\bar t ) d\bar t.$$
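For the 1D chain, the Peierls substitution thus amounts to a shift of the momentum argument of the free dispersion. A minimal sketch (with $$\hbar=1$$ and lattice constant $$a=1$$; the function names are ours, not library routines):

```cpp
#include <cmath>

// Free dispersion of the 1D chain: eps(k) = -2 J cos(k).
double eps_free(double k, double J) { return -2.0 * J * std::cos(k); }

// Peierls substitution: the vector potential enters as a momentum shift,
// eps(k, t) = eps(k - q A(t)) with charge q (units with hbar = 1, a = 1).
double eps_peierls(double k, double A, double J, double q) {
    return eps_free(k - q * A, J);
}
```

A constant vector potential simply rigidly shifts the band: the minimum moves from $$k=0$$ to $$k=qA$$.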

In this example, we treat the dynamics within the $$GW$$ approximation. The numerical evaluation of the respective self-energy expressions is implemented in the C++ module gw_selfen_impl.cpp. Below we explain the key routines.

## Self-energy approximation: GW

In momentum space, the GW self-energy can be written as \begin{align} \label{Eq:Sigmak} \Sigma_{{\mathbf{k}}}(t,t^\prime)=\frac{i}{N_k} \sum_{{\mathbf{q}}} G_{{\mathbf{k}}-{\mathbf{q}}}(t,t^\prime) W_{{\mathbf{q}}}(t,t^\prime), \end{align} where we have introduced the Fourier transform of a propagator, $$X_{{\mathbf{q}}}(t,t^\prime)=(1/N_k)\sum_{i,j} \exp(i ({\mathbf{r}}_i-{\mathbf{r}}_j)\cdot{\mathbf{q}}) X_{ij}(t,t^\prime),$$ see also Section [Sec:Chain] for the definition of the propagators. Due to the translational invariance, the propagators and the corresponding Dyson equations are diagonal in momentum space. This speeds up the calculations, as the most complex operation, namely the solution of the VIEs, can be performed in parallel.

As each momentum point is independent, we have introduced a class kpoint in the C++ module gw_kpoints_impl.cpp. This class includes all propagators at a given momentum point $${\mathbf{k}}$$ and the corresponding methods, like the solution of the Dyson equation for the single-particle propagator $$G_{{\mathbf{k}}}(t,t^\prime)$$, see Eq. $$\eqref{eq:dyson_kspace}$$, and for the retarded interaction $$W_{{\mathbf{k}}}(t,t^\prime)$$. The retarded interaction is obtained as the solution of the Dyson-like equation \begin{align} W_{{\mathbf{k}}}(t,t')=U\delta_\mathcal{C}(t,t^\prime)+ U [ \Pi_{{\mathbf{k}}} \ast W_{{\mathbf{k}}} ](t,t^\prime), \end{align} where the Fourier transform of the polarization is given by \begin{align} \Pi_{{\mathbf{k}}}(t,t')= \frac{-i}{N_k} \sum_{{\mathbf{q}}} G_{{\mathbf{k}}+{\mathbf{q}}}(t,t') G_{{\mathbf{q}}}(t',t). \label{Eq:Pol} \end{align} As the momentum points are independent, we can construct an arbitrary lattice as a collection of these points. This structure allows for an easy adaptation of the code to arbitrary lattice geometries. In particular, we provide an implementation of a 1D chain geometry in the class lattice_1d_1b within the C++ module gw_lattice_impl.cpp. The routine add_kpoints simply evaluates the sum or difference of two vectors $${\mathbf{k}}\pm{\mathbf{q}},$$ where some care has to be taken to map the result back to the first BZ. To adapt the code to other lattices and interaction vertices, the user has to define the first BZ, the single-particle dispersion $$\epsilon({\mathbf{k}})$$, the interaction vertex $$U$$ and the addition of momentum vectors. Note that the generalization to the long-range interaction \begin{align} \hat{H}_{\text{int}}=U \sum_i \hat n_{i\uparrow} \hat n_{i\downarrow}+ \frac{1}{2}\sum_{i,j} V({\mathbf{r}}_i-{\mathbf{r}}_j) \hat n_i \hat n_j \end{align} is straightforward.
For the purpose of demonstration, we have included the nearest-neighbor interaction $$V({\mathbf{r}}_i-{\mathbf{r}}_j)=\delta(|{\mathbf{r}}_i-{\mathbf{r}}_j|=1)V$$ into the example program (input parameter V).
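In momentum space, the combined on-site and nearest-neighbor vertex of the 1D chain then becomes $$U(q)=U+2V\cos q$$, the Fourier transform of $$U\delta_{ij}+V\delta_{|i-j|,1}$$. A minimal sketch of this vertex (our illustration, with lattice constant $$a=1$$; not the library's implementation):

```cpp
#include <cmath>

// Momentum-space interaction vertex of the 1D chain with on-site U and
// nearest-neighbor V: U(q) = U + 2 V cos(q)  (lattice constant a = 1).
double vertex_q(double q, double U, double V) {
    return U + 2.0 * V * std::cos(q);
}
```

At $$q=0$$ the vertex is maximal ($$U+2V$$), while at the zone boundary $$q=\pi$$ it is reduced to $$U-2V$$.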

While the Dyson equation is diagonal in momentum space, the evaluation of the self-energy diagrams mixes information between different momentum points, see Eq. $$\eqref{Eq:Sigmak}$$, and hence requires communication between different ranks, which is performed using the distributed_timestep_array. The strategy to compute the $$GW$$ self-energy $$\mathcal{T}[\Sigma_{{\mathbf{k}}}]_n$$ at time step $$n$$ thus consists of two steps:

1. At time $$t_n$$, communicate the latest time-slice of the GF $$\mathcal{T}[G_{{\mathbf{k}}}]_n$$ and retarded interactions $$\mathcal{T}[W_{{\mathbf{k}}}]_n$$ for all momentum points between all MPI ranks

2. Evaluate the self-energy diagram $$\mathcal{T}[\Sigma_{{\mathbf{k}}_{\text{rank}}}]_n$$ in Eq. $$\eqref{Eq:Sigmak}$$ for a subset of momentum points $${\mathbf{k}}_{\text{rank}}$$ present on a given rank using the routine Bubble2

Step 1 is implemented as

void gather_gk_timestep(int tstp, int Nk_rank, DIST_TIMESTEP &gk_all_timesteps,
                        std::vector<kpoint> &corrK_rank, std::vector<int> &kindex_rank){
  gk_all_timesteps.reset_tstp(tstp);
  for(int k=0;k<Nk_rank;k++){
    gk_all_timesteps.G()[kindex_rank[k]].get_data(corrK_rank[k].G_);
  }
  gk_all_timesteps.mpi_bcast_all();
}

and an analogous routine is used for the bosonic counterpart. We gather the information from all ranks into an object of type distributed_timestep_array named gk_all_timesteps. We have introduced a map kindex_rank which maps an index of momenta from a subset on a given rank to the full BZ, like $$\mathbf{k}_{\text{rank}}\rightarrow \mathbf{k}$$. The variable corrK_rank is just a collection of kpoints on a given rank. mpi_bcast_all() is a wrapper around the MPI routine Allgather adjusted to the type distributed_timestep_array.
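A round-robin distribution is one simple way to build such a map; a minimal sketch of this construction (the distribution actually used by the library may differ):

```cpp
#include <vector>

// Round-robin assignment of the Nk momentum indices of the full BZ to
// `size` MPI ranks; the returned vector plays the role of kindex_rank:
// entry j is the full-BZ index of the j-th point stored on rank `rank`.
std::vector<int> make_kindex_rank(int Nk, int size, int rank) {
    std::vector<int> kindex;
    for (int k = rank; k < Nk; k += size) kindex.push_back(k);
    return kindex;
}
```

The union of the subsets over all ranks covers the full BZ exactly once, which is the property the gather/broadcast step relies on.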

Step 2 is implemented as

void sigma_GW(int tstp, int kk, GREEN &S, DIST_TIMESTEP &gk_all_timesteps,
              DIST_TIMESTEP &wk_all_timesteps, lattice_1d_1b &lattice,
              int Ntau, int Norb){
  assert(tstp==gk_all_timesteps.tstp());
  assert(tstp==wk_all_timesteps.tstp());
  GREEN_TSTP stmp(tstp,Ntau,Norb,FERMION);
  S.set_timestep_zero(tstp);
  for(int q=0;q<lattice.nk_;q++){
    double wk=lattice.kweight_[q];
    // index of k-q, folded back into the first BZ via add_kpoints
    int kq=lattice.add_kpoints(kk,1,q,-1);
    stmp.clear();
    for(int i1=0;i1<Norb;i1++){
      for(int i2=0;i2<Norb;i2++){
        cntr::Bubble2(tstp,stmp,i1,i2,gk_all_timesteps.G()[kq],gk_all_timesteps.G()[kq],i1,i2,
                      wk_all_timesteps.G()[q],wk_all_timesteps.G()[q],i1,i2);
      }
    }
    S.incr_timestep(tstp,stmp,wk);
  }
}

As each rank holds only a subset of momentum points $$\mathbf{k}_{\text{rank}}$$, we only evaluate the self-energy diagrams $$\Sigma_{\mathbf{k}_{\text{rank}}}$$ for this subset. All ranks carry information about the latest time step for all momentum points, and the internal sum over the momentum $$\mathbf{q}$$ in Eq. $$\eqref{Eq:Sigmak}$$ is performed on each rank.

## Generic structure of the example program

As the generic structure is similar to the above examples, we will focus on the peculiarities connected to the usage of MPI. First, we need to initialize the MPI session

MPI::Init(argc,argv);
tid=MPI::COMM_WORLD.Get_rank();
tid_root=0;

and the cntr::distributed_timestep_array for the electronic and bosonic propagators

DIST_TIMESTEP gk_all_timesteps(Nk,Nt,Ntau,Norb,FERMION,true);
DIST_TIMESTEP wk_all_timesteps(Nk,Nt,Ntau,Norb,BOSON,true);

To simplify the mapping between the subset of points on a given rank and the full BZ, we introduce a map kindex_rank, which at position $$j=0,\ldots,N_{\text{rank}}-1$$ holds the index of the corresponding momentum in the full BZ.

The program consists of three main parts, namely solving the Matsubara Dyson equation, bootstrapping (tstp $$\leq$$ SolverOrder) and time propagation for tstp $$>$$ SolverOrder. The self-consistency iterations include the communication of all fermionic and bosonic propagators between the different ranks using the routine gather_gk_timestep and the determination of the local propagators.

for(int iter=0;iter<=MatsMaxIter;iter++){
  // update propagators via MPI
  diag::gather_gk_timestep(tstp,Nk_rank,gk_all_timesteps,corrK_rank,kindex_rank);
  diag::gather_wk_timestep(tstp,Nk_rank,wk_all_timesteps,corrK_rank,kindex_rank);
  diag::set_density_k(-1,Norb,Nk,gk_all_timesteps,lattice,density_k,kindex_rank,rho_loc);
  diag::get_loc(-1,Ntau,Norb,Nk,lattice,Gloc,gk_all_timesteps);
  diag::get_loc(-1,Ntau,Norb,Nk,lattice,Wloc,wk_all_timesteps);
}

As each MPI rank knows the momentum-dependent single-particle density matrix $$\rho({\mathbf{k}})$$ for the whole BZ, the evaluation of the HF contribution is done as in Section [Sec:Chain]. The self-energies $$\Sigma_{k_{\text{rank}}}$$ for the momentum points $$k_{\text{rank}}=0,\ldots,N_{\text{rank}}-1$$ on each rank are obtained by the routine sigma_GW.

// update mean field and self-energy
for(int k=0;k<Nk_rank;k++){
  diag::sigma_Hartree(-1,Norb,corrK_rank[k].SHartree_,lattice,density_k,vertex);
  diag::sigma_Fock(-1,Norb,kindex_rank[k],corrK_rank[k].SFock_,lattice,density_k,vertex);
  diag::sigma_GW(-1,kindex_rank[k],corrK_rank[k].Sigma_,gk_all_timesteps,wk_all_timesteps,lattice,Ntau,Norb);
}

Similarly, the solution of the Dyson equation for the fermionic (bosonic) propagators at each momentum point is obtained by step_dyson_with_error (step_W_with_error), which is just a wrapper around the Dyson solver; the returned error is the difference between the propagators of the previous and the current iteration. The momentum-dependent error for the fermionic propagators is stored in convergence_error_ele and communicated among the ranks.

// solve Dyson equation
for(int k=0;k<Nk_rank;k++){
  convergence_error_ele.block(kindex_rank[k])[0]=corrK_rank[k].step_dyson_with_error(tstp,iter,SolverOrder,lattice);
  diag::get_Polarization_Bubble(tstp,Norb,Ntau,kindex_rank[k],corrK_rank[k].P_,gk_all_timesteps,lattice);
  convergence_error_bos.block(kindex_rank[k])[0]=corrK_rank[k].step_W_with_error(tstp,SolverOrder,lattice);
}
convergence_error_ele.mpi_bcast_all();
convergence_error_bos.mpi_bcast_all();
double err_ele=0.0, err_bos=0.0;
for(int k=0;k<Nk;k++){
  err_ele+=convergence_error_ele.block(k)[0];
  err_bos+=convergence_error_bos.block(k)[0];
}

The bootstrapping and the real-time propagation have the same structure as the Matsubara solver. The main difference lies in the predictor-corrector scheme, as explained in Section [Sec:Chain]. At the beginning of each time step, we extrapolate the momentum-dependent GF $$G_{k}$$ and the retarded interactions $$W_{k}$$, which works as a predictor:

// Predictor: extrapolation
diag::extrapolate_timestep_G(tstp-1,Nk_rank,SolverOrder,Nt,corrK_rank);
diag::extrapolate_timestep_W(tstp-1,Nk_rank,SolverOrder,Nt,corrK_rank);

Then we make several iterations at a given time step until convergence, which acts as a corrector.

After the NEGFs are obtained, we evaluate the kinetic energy $$E_{\text{kin}}(t)=\frac{1}{2 N_k}\sum_{{\mathbf{k}}}\text{Tr}[ \rho_{{\mathbf{k}}}(t) \epsilon_{{\mathbf{k}}}(t)]$$ using the routine diag::KineticEnergy. The correlation energy is obtained from the Galitskii-Migdal formula

\begin{align*} E_{\text{corr}}(t)=\frac{1}{2 N_k} \sum_{{\mathbf{k}}} \left ( \mathrm{Tr}\left[\rho_{{\mathbf{k}}}(t) h^{\mathrm{MF}}_{{\mathbf{k}}}\right]- \frac{i}{2} \left[\Sigma_{{\mathbf{k}}}\ast G_{{\mathbf{k}}}\right]^<(t,t) \right), \end{align*}

using the routine diag::CorrelationEnergy. The two operations include the MPI reduction as the momentum sum is performed over the whole BZ.
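For a single band, the momentum average in the kinetic-energy formula reduces to a weighted sum over the momentum grid. A minimal sketch of the per-rank reduction before the final MPI sum (plain arrays stand in for the library's data structures; the function name is ours):

```cpp
#include <vector>

// Kinetic energy E_kin = (1/(2 Nk)) sum_k Tr[rho_k eps_k] for a single band,
// where rho[k] and eps[k] are the momentum-resolved density matrix and
// dispersion; each rank would sum its own subset before an MPI reduction.
double kinetic_energy(const std::vector<double> &rho,
                      const std::vector<double> &eps) {
    double sum = 0.0;
    for (std::size_t k = 0; k < rho.size(); ++k) sum += rho[k] * eps[k];
    return 0.5 * sum / static_cast<double>(rho.size());
}
```

For a half-filled band with a dispersion symmetric around zero, occupying only the negative-energy states gives a negative kinetic energy, as expected.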

## Running the example program

There is one program for the GW calculation, called gw.x. The driver script demo_gw.py located in the utils/ directory provides a simple interface to this program. As in the previous cases, the script creates an input file and launches the program. The user can specify the shape of the electric pulse; by default, we use a single-frequency pulse with a Gaussian envelope

\begin{align} E(t)=E_0 \exp(-4.6(t-t_0)^2/t_0^2) \sin(\omega (t-t_0)), \label{Eq:pulse} \end{align}

where $$t_0=2\pi N_{p}/\omega$$ is determined by the number of cycles $$N_p$$. After the simulation, the time evolution of the kinetic and potential energies is plotted. The output is controlled by two optional parameters. First, if SaveGreen is true, the local single-particle ($$G$$) and two-particle ($$W$$) propagators are stored to disk. Second, if SaveMomentum is true, the momentum-dependent propagators are also stored to disk. As the full momentum- and time-dependent propagators would require a lot of memory, we only save selected time slices, with an interval determined by the parameter output. For example, if output=100, every 100th time slice is stored to disk.
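The pulse and the corresponding vector potential $$A(t)=-\int_0^t E(\bar t)\,d\bar t$$ entering the Peierls substitution can be sketched as follows (a standalone illustration using a simple trapezoidal rule; the driver script constructs the actual pulse):

```cpp
#include <cmath>
#include <vector>

// Single-frequency pulse with a Gaussian envelope, Eq. (pulse):
// E(t) = E0 exp(-4.6 (t - t0)^2 / t0^2) sin(w (t - t0)), t0 = 2 pi Np / w.
double efield(double t, double E0, double w, int Np) {
    double t0 = 2.0 * M_PI * Np / w, x = (t - t0) / t0;
    return E0 * std::exp(-4.6 * x * x) * std::sin(w * (t - t0));
}

// Vector potential A(t) = -int_0^t E(tbar) dtbar on an equidistant grid,
// accumulated with the trapezoidal rule (step dt, Nt + 1 grid points).
std::vector<double> afield(double E0, double w, int Np, int Nt, double dt) {
    std::vector<double> A(Nt + 1, 0.0);
    for (int n = 1; n <= Nt; ++n)
        A[n] = A[n - 1] - 0.5 * dt * (efield((n - 1) * dt, E0, w, Np)
                                      + efield(n * dt, E0, w, Np));
    return A;
}
```

By construction, the field has a node at the envelope center $$t=t_0$$, and the envelope prefactor $$4.6$$ suppresses the field to below one percent of $$E_0$$ at $$t=0$$ and $$t=2t_0$$.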

## Discussion

The equilibrium momentum-dependent spectral function $$A_k(\omega)=-\frac{1}{\pi}\text{Im}\left[ G_{{\mathbf{k}}}(\omega)\right]$$ and its local part $$A_{\text{loc}}(\omega)=\frac{1}{N_k}\sum_{{\mathbf{k}}}A_k(\omega)$$ are presented in the figure below.

The local spectral function $$A_{\text{loc}}(\omega)$$ shows the van Hove singularities typical of 1D systems at $$\omega \approx \pm 2.$$ The comparison of two system sizes, namely $$N_k=128$$ and $$N_k=256$$, shows that the spectrum is converged. The momentum-dependent spectral function $$A_k(\omega)$$ closely follows the single-particle dispersion $$\epsilon_{{\mathbf{k}}}$$. The broadening due to many-body effects is small close to the Fermi surface points ($$\pm \pi/2$$) due to the restricted scattering, but increases with increasing energy. Note that the GW approximation cannot capture the peculiarities of the 1D system, like the absence of a Fermi surface as described by Tomonaga-Luttinger liquid theory. However, this is a peculiarity of low-dimensional systems, and we consider this case mainly to avoid lengthy calculations in the example program. Another interesting observation is the presence of a shadow band, which is clearly visible at energies away from the chemical potential. The origin of this shadow band is the feedback of the two-particle excitations on the single-particle spectrum.

The information about the two-particle excitation spectrum is contained in the bosonic correlator $$W$$. As the latter is antisymmetric in frequency, $$\text{Im}[W(\omega)]=-\text{Im}[W(-\omega)]$$, we only present results for positive frequencies. The local two-particle correlator Im[$$W_{\text{loc}}(\omega)$$] is presented in (c) in the figure above for the two system sizes $$N_k=128$$ and $$N_k=256$$. It shows a strong peak around $$\omega\approx 4$$, which corresponds to particle-hole excitations between the two van Hove singularities in the single-particle spectrum. As we have restricted the calculation to a short-range interaction, we do not expect massive collective excitations (plasmons). The effective interaction is rather governed by the particle-hole continuum, which for small momenta is linear in frequency (zero-sound mode). The latter is confirmed by the momentum-dependent bosonic correlator Im[$$W_{{\mathbf{k}}}(\omega)$$], see (d). At larger momenta, the deviation from a linear dependence is evident, and close to the edge of the BZ the intensity of the bosonic propagator is maximal, as it corresponds to the transition between the two van Hove singularities in the single-particle spectrum.

Now, we turn to the dynamics after the photo-excitation. We excite the system with a short oscillating electric pulse, see Eq. $$\eqref{Eq:pulse}$$. From now on we will restrict the discussion to the single pulse case $$N_p=1.$$ The amplitude of the excitation $$E_0$$ determines the amount of the absorbed energy. In the figure below, we present the time evolution of the kinetic energy for the two excitation strengths $$E_0=3$$ and $$E_0=5$$.

As the excitation increases the kinetic energy of the system, a generic quantum many-body system would subsequently relax the kinetic energy towards a new thermal value. However, in the 1D case, the phase-space restriction due to the conservation of energy and momentum leads to a very inefficient relaxation. Note that this is only the case for electron-electron scattering; for systems coupled to local bosonic excitations, like optical phonons, this is no longer the case. For the strongest excitations, there is a clear relaxation towards the final state (see inset), accompanied by strongly damped oscillations.

In practice, the main bottleneck to reaching longer propagation times is the memory restriction imposed by the hardware. The MPI parallelization scheme over momentum points alleviates this issue, as the memory is distributed among different nodes. This is beneficial as long as the number of momentum points is larger than the number of nodes. The usage of cntr::distributed_timestep_array keeps the overlap of the stored information between different nodes minimal, which in all practical cases leads to a linear reduction of the memory requirements per MPI rank.

Moreover, the MPI parallelization also speeds up the execution of the program. We have performed a scaling analysis for a system with a fixed number of momentum points, $$N_k=128,$$ parallelized over up to 128 processors, see below.

For all tests we have fixed the number of tasks per node, as in a real-world scenario we want to distribute the memory usage as much as possible. We can see that the scaling is almost perfect up to 128 processors, where a slight deviation from optimal scaling is observed. The main reason for this behavior is the communication overhead, as a substantial amount of data, namely time steps of the propagators for all momentum points, has to be communicated among all nodes. We have tested different communication schemes, and the current version of the library includes the scheme with the best scaling. Of course, we cannot exclude that the scaling for a large number of processors can be improved further, and this will be an important task for a future update of the library. While the current version can already be directly applied to higher-dimensional systems (2D, 3D), future applications to the realistic modeling of periodic systems will rely on an efficient parallelization scheme.
