Evaluation of three approaches for simulating 3-D time-domain electromagnetic data

We implemented and compared the implicit Euler time-stepping approach, the inverse Fourier transform-based approach and the Rational Arnoldi method for simulating 3-D transient electromagnetic data. We utilize the ﬁnite-element method with unstructured tetrahedral meshes for the spatial discretization supporting irregular survey geometries and anisotropic material parameters. Both, switch-on and switch-off current waveforms, can be used in combination with direct current solutions of Poisson problems as initial conditions. Moreover, we address important topics such as the incorporation of source currents and opportunities to simulate impulse as well as step response magnetic ﬁeld data with all approaches for supporting a great variety of applications. Three examples ranging from simple to complex real-world geometries and validations against external codes provide insight into the numerical accuracy, computational performance and unique characteristics of the three applied methods. We further present an application of logarithmic Fourier transforms to convert transient data into the frequency domain. We made all approaches available in the open-source Python toolbox custEM , which previously supported only frequency-domain electromagnetic data. The object-oriented software implementation is suited for further elaboration on distinct modelling topics and the presented examples can serve for benchmarking other codes.


I N T RO D U C T I O N
The transient electromagnetic (TEM) method is widely used for the exploration of marine hydrocarbon reservoirs (Constable 2010;Key 2012b), mineral deposits (Smith 2014;Guo et al. 2020) and groundwater (Siemon et al. 2009;Yogeshwar & Tezkan 2017). Nowadays, applications comprise different scales in airborne, land-based, marine and mixed setups. Fast and robust forward modelling techniques are not only required for the inversion of the recorded data for the increasingly complex, large-scale 3-D setups (Börner 2010), but also for investigating the effects of geological structures and optimizing survey designs.
Even though there are numerous publications in the field of geophysical 3-D EM modelling, there is still a great demand for opensource 3-D codes which support complex setups and serve to validate individual numerical implementations (Miensopust 2017). In this context, the term open-source is referred to as software tools which are independently accessible and usable by scientists without requesting authors' permission explicitly. The multigeophysics modelling and inversion tools provided by SimPEG (Cockett et al. 2015) and pyGIMLi (Rücker et al. 2017) were already made available to the community several years ago. Recently, Castillo-Reyes et al. (2018), Rochlitz et al. (2018) and Werthmüller et al. (2019b) presented the open-source codes PETGEM (FE), custEM (FE) and emg3-D (FD) as tools for simulations in the frequency domain. For 3-D time-domain EM modellings, there are tools in the P223F software suite (Raiche et al. 2007) or in SimPEG (Heagy et al. 2020) openly available. Most recently,  presents time-domain modelling techniques using emg3d, which were in simultaneous development to our work.
We present the first direct comparison of three established techniques for simulating TEM data. In particular, we considered the implicit Euler time-stepping method, an approach based on Fouriertransformed frequency-domain solutions, as well as a Rational Arnoldi method. We analyse the three time-domain approaches, focus on supporting complex 3-D geometries with unstructured tetrahedral meshes and compare results obtained with all three approaches in terms or accuracy and computational performance.
The following section introduces the underlying methodology of the three time-domain modelling approaches. Afterwards, we describe details about the implementation. The numerical examples provide insights about the accuracy and required computational resources of our implemented approaches. The first example deals with a land-based TEM setup and a loop transmitter (Tx) over a simple 1-D structure, enabling a comparison between our solutions and semi-analytic references obtained with empymod (Werthmüller 2017). The second example uses a long-offset TEM (LOTEM) setup on a larger scale with a grounded wire source over a blocky 3-D structure. Our results are compared with solutions of the SLDMEM software, which is based on the work by Druskin & Knizhnerman (1994). The third shows the capability of custEM to handle configurations which might be relevant to potential needs of the exploration industry. We computed EM responses of the Marlim R3D model (Correa & Menezes 2019) using the Rational Arnoldi method and compared the Fourier-transformed results to the reference solutions in the frequency domain. Before drawing our conclusions, we discuss the pros and cons of all approaches as well as potential optimization possibilities and general limitations.

M E T H O D O L O G Y
In this section, we provide the fundamentals of the three considered numerical approaches for simulating time-domain EM data. At first, we define the underlying partial differential equations for the given EM problem. We introduce a numerically feasible spatial discretization of the Maxwell equations with FE methods afterwards. For this purpose, we closely follow Ward & Hohmann (1988) and Jin (2015).
We consider a bounded domain ⊂ R 3 with the outer unit normal n and the spatially dependent, general electric conductivity tensor σ = σ i j (r), defined on . Neglecting displacement currents, the derived time-dependent partial differential equation for the electric field e = e(r, t) reads where μ = μ(r) denotes the magnetic permeability, and j t e the source current density function with q being the spatial pattern of the source current and s the temporal current waveform. The latter simplifies to a Heaviside step function H(t) or H( − t) for a transmitter switch-on or switchoff, respectively, at time t = 0 which is convenient for approximating the current waveforms of many TEM systems. Along the boundary ∂ of , we impose the inhomogeneous Dirichlet boundary condition where the boundary data g : ∂ → R 3 can be modelled as the solution of simplified models such as a layered half-space. With the implicit assumption that the boundaries are sufficiently far away from the source, eq. (3) reduces to the homogeneous Dirichlet condition In addition, Neumann boundary conditions can be defined as The combination of eqs (1) and (3) results in the initial-value problem n × e = n × g on ∂ × (0, ∞).
The same problem can be formulated in the frequency domain. We apply the Fourier transform, denoted by F, and introduce the electric field E in the frequency domain as a function of the angular frequency ω e(r, t) : We obtain the frequency-domain problem with the scaling and derivative laws of the Fourier transform with complex-valued electric fields E : → C 3 and boundary data G : ∂ → C 3 . The spatial approximation of both formulations is based on a FE discretization using first (p1) and second (p2) order Nédélecelements (Nédélec 1980) on unstructured tetrahedral meshes.
We use a standard variational formulation for Maxwell's equations and define the restricted Sobolev space H 0 (curl; ) = u ∈ L 2 ( ) : ∇ × u ∈ L 2 ( ), where L 2 ( ) denotes the function space of square-integrable vector fields on . The weak formulation in T D is obtained by multiplying eq. (1) by a stationary and smooth vector-valued test function φ ∈ H 0 (curl; ) and applying integration by parts. This yields the weak form of the variational problem in the time domain for all φ ∈ H 0 (curl; ), where ( ·, ·) denotes the inner product on L 2 ( ), that is with v as the complex conjugate of v. The weak form can be made numerically computable by further restricting the Sobolev space to a finite-dimensional subspace consisting of Nédélec finite elements on a tetrahedral mesh T h . On each tetrahedron K ∈ T h , the functions in the finite-dimensional subspace of H 0 (curl; ) consist of vector polynomials.
Using the Galerkin method and expanding the discrete solution of the above variational problem in a basis [φ 1 , . . . , φ N ], solving eq. (1) becomes the solution of the semi-discrete initial-value problem of the ordinary differential equation where the vector u(t) contains the N coefficients of the finite element approximation of e(t) inside the computational domain with respect to the Nédélec basis at time t > 0. The mass matrix M, the curl-curl matrix C, as well as the vector s are given in terms of the Nédélec basis by

Implicit Euler method
Following (Um et al. 2010) or Cai et al. (2017) with geophysical applications, we use an implicit Euler (IE) time-integration scheme to solve eq. (1) in time. The advantage over explicit Euler methods is that the solution is unconditionally stable (e.g. Jin 2015). With u(t) as a numerical approximation to e(t) and discrete time steps t, the first-order backward Euler scheme reads The solution u n+1 at time t n + 1 = t n + t depends only on the solution at time t n . Using the second-order accurate scheme, which is superior in case of a non-linear behaviour of j t e (Um et al. 2010), the solution u n+2 is based on the results u n+1 , u n of two previous time steps With b denoting the magnetic flux density and • the first time derivative, magnetic fields can be derived using the relation Most land-based transient CSEM methods require a time discretization of several decades between 1 μs and 1 s or even later times in marine setups. Related to the decay behaviour of electric fields, logarithmically increasing times steps appear to be adequate for covering these time ranges. However, Um et al. (2010) point out that piece-wise constant time steps can significantly increase the performance, because the system matrix factorization can be reused for all these steps. In contrast to different concepts of adaptive time stepping techniques (Um et al. 2010(Um et al. , 2012Cai et al. 2017), we considered a generally practical approach. We first divided the complete time range into n log logarithmic intervals and afterwards, each of them into n lin uniform linear steps. This resulted in only n log matrix factorization and n log · n lin solution phases with the direct solver MUMPS. Electric and magnetic fields were interpolated and exported at each logarithmic step. This is the default setting, but user-defined time stepping sequences and custom points in time for the stored fields are supported.

Fourier transform based methods
Fourier transform (FT) based methods convert frequency-domain data into the time domain. The FD solutions are, for instance, obtained based on eq. (8) (Rochlitz et al. 2019). The majority of publications dealing with T D CSEM forward simulations used such Fourier-based approaches based on sine/cosine, Laplace or Hankel transforms. The greatest challenge of these approaches is to minimize the amount of computationally expensive FD forward modelling to obtain a single T D solution. The transforms themselves are of negligible computational effort. Ghosh (1971) introduces the first application of the digital linear filter (DLF) method for Fast Hankel transforms (FHT) of Schlumberger and Wenner DC resistivity sounding data. Johansen & Sørensen (1979), Andersen (1989), Christensen (1990) and Mohsen & Hashish (1994)  We utilize the FHT for evaluating integrals of the form In general, these kinds of integrals are called Hankel Transforms, with a function f(k) to be transformed and J ν (kr) being the Bessel function of the first kind and order ν > −1. Because of the oscillatory behaviour of the Bessel function, the evaluation of such an integral would be numerically quite expensive for large arguments kr. These numerical costs can be significantly reduced by the FHT, which uses digital linear filter coefficients (Ĥ ) for a fast computation of the Hankel integral by casting it into a discrete sum Both, computational speed and numerical accuracy requirements, determine the choice of the digital filters when using the FHT. Typical filter sets contain a minimum of 50-100, on average 100-200 and in some cases up to several hundred coefficients (Key 2012a;Werthmüller et al. 2019a). More coefficients usually provide a higher accuracy, but it is also possible to achieve sufficiently accurate transform results with only a few tens of filter coefficients ) which significantly influences the computation times. In this work, we use a digital linear filter set for the Evaluation of TEM modelling approaches 1983 sine transform (ν = − 1 2 ) that consists of n = 80 coefficients (Appendix A). They were optimized based on the work by Christensen (1990). Using only 80 filter coefficients allows for a comparatively fast computation of the T D solution yet providing a sufficient accuracy (Seidel 2019). The coefficients were designed for 10 time channels per decade and are well suited for most land-based CSEM applications.

Rational Arnoldi method
After discretization in space, the explicit solution of the semidiscrete time-domain problem (14) is given in terms of the matrixexponential function (Börner et al. 2015) with the initial solution When the matrices C ∈ R N ×N and M ∈ R N ×N are large and sparse, Krylov subspace methods can be used to obtain an efficient approximation of the product of the matrix-exponential in eq. (23) with a vector.
We use a rational Krylov orthogonalization algorithm to construct a sequence of orthonormal vectors {v j } j≥1 . Starting this sequence with v 1 = u 0 / u 0 , we determine in each step a vector v j+1 which is orthogonal to the previously generated vectors v 1 , . . . , v j . This leads to the recursion where h i, j are the elements of an upper Hessenberg matrix computed by the Arnoldi algorithm, and the coefficients {ξ j } j = 1, . . . , m are known as poles.
The numerical effort of recursion in eq. (25) is dominated by the repeated solution of large linear systems of the form Once the set of basis vectors V m+1 = [v 1 , v 2 , . . . , v m+1 ] has been computed, we can define the rational Arnoldi approximation for f (A)u 0 of order m as where , or, more precisely, computing the expression exp(−tA m+1 ) becomes feasible, whereas the effort to compute exp(−tA) is prohibitive. It remains to determine the coefficients ξ j in eq. (25). Börner et al. (2015) have proposed a surrogate problem to find optimal values for the coefficients ξ j . Their suggested method reduced the problem of finding m parameters to that of finding merely parameters corresponding to only < m distinct shifts in the linear system solves in eq. (25). Reusing the shifts for the Arnold iterations makes the usage of direct solvers suitable.
In this work, we used a new set of = 2 cyclically repeated distinct poles ξ 1 = 323.1722, ξ 2 = 6.0128 × 10 5 and m = 48 rational Krylov subspace iterations to cover time intervals of four to five decades with one simulation. The logarithmic range [t 0 , t 1 ] (e.g. 10 −4 to 1 s) can be adjusted by multiplying the poles by 10 −6−log 10 (t 0 ) . Accordingly, the FE simulations require only three matrix factorizations, that is one for the computation of u 0 and two associated with the poles ξ 1 and ξ 2 .

Static fields
We consider switch-on as well as switch-off transients of the common 50 per cent duty-cycle square-wave current function. Switchon and switch-off transients can be converted into each other in case of knowing the static electric or magnetic field. If the switchoff response of b is of interest, the static magnetic field b static of grounded or ungrounded sources can be calculated analytically with the Biot-Savart law. Obtaining switch-off e-field transients can be achieved by calculating or approximating the direct current (DC) field e dc .
The DC field can be obtained by solving the Poisson problem with the FE method (e.g. Um et al. 2010) where is the electric potential. In addition to the commonly preferred secondary field formulation of this problem (Um et al. 2010), we considered a total-field formulation by incorporating the divergence of the source current directly on the two grounding nodes of the Tx. The total-field formulation supports arbitrary model geometries with topography. If e dc is directly projected onto the tetrahedral edges (in the Nédélec FE space) from evaluated at the nodes (in the Lagrange FE space), we found that both approaches lead to almost identical results. We recommend using second order (p2) basis functions for solving the Poisson-problem with a significantly higher accuracy compared to those obtained with p1 basis functions.
Alternatively, e dc and also b static might be approximated by a low-frequency solution (<0.001 Hz) of the quasi-static EM problem. The FT approach provides such a solution automatically, but we found that this approximation is often not accurate enough for reliably deriving the late-time response of switch-off transients. Instead, calculating static fields can be avoided for obtaining switchoff transients with the FT approach by using an additional cosine transform with other filter coefficients. We considered only the sine transform with explicitly calculated static fields in this work.
Our preferred way of simulating a switch-off process with the IE approach is to implement two subsequent source pulses with a switch-on phase in between (e.g. 10 s with 10 linear steps of 1 s for land-based TEM). This method necessitates only one additional matrix factorization and a few back-substitutions during the solution phase, which is computationally cheaper than solving the DC problem.

Magnetic field response
Even though e and • b are commonly the target quantities in TEM surveys, specific applications may require the simulation of the magnetic field. For instance, modern highly sensitive receivers based on superconducting quantum interference devices (SQUID), optically pumped magnetometers (OPM) or specific coil sensors record b and not the time rate of change of the magnetic field. Such instruments utilized for TEM recordings need appropriate modelling and inverse modelling tools (Rochlitz et al. 2018).
Using the presented IE approach, time-integrating eqs (18) and (20)  of eq. (18). The step response b is obtained using the step current function j t e on the right-hand side, whereas the impulse response • b and e can be calculated with an impulse-like current function, which itself is the time derivative of a perfect step excitation or Heaviside function.
Dividing the frequency-domain results by iω before carrying out the fast Hankel transformation leads to the step response b in the FT approach. The only way to obtain b with the RA approach is to integrate • b. This method requires a sufficient sampling of • b over time. We found that 80 frequencies per decade with equidistant spacing along a logarithmic frequency axis led to accurate integrals. The switch-on and switch-off step responses are obtained by integrating • b in normal or reverse order in time, respectively.

custEM
The custEM toolbox was initially designed for the robust and automated modelling of complex 3-D semi-airborne FD CSEM data in an object-oriented environment as described by Rochlitz et al. (2019) and Rochlitz (2020). We focus on the time-domain modelling specific content and refer to Appendix B for additional details about the software and its development over the past years. The following links forward to the documentation and the source code: (i) https://custem.readthedocs.io, (19.07.2021).

Source current incorporation
We adapted the procedure for incorporating time-independent source currents along edges associated with the transmitter wire, exploiting the characteristics of Nédélec basis functions (Rochlitz et al. 2019). The FT method can directly reuse this procedure, which was developed for simulations in FD. Our formulation of the RA method is based on perfect initial conditions without ramp effects. Therefore, the source current is treated as constant and can be incorporated in the same way as in FD.
In contrast, the IE method requires a modification to account for currents varying over time and also a suitable numerical approximation of the perfect switch-on or switch-off case. Either the source current function j t e or it's time derivative • j t e needs to be specified on discrete points in time during the on-time. Fig. 1 illustrates two typical source currents in terms of • j t e as well as j t e , a perfect switchoff current in (a) and a linear switch-off-ramp in (b). Related to the complexity of the time-dependent source functions, a few uniform time steps are appropriate for describing perfect rectangular functions or linear ramps, whereas more steps are required for accurately describing complex source functions. Uniform steps are preferred as they require only one matrix factorization and further inexpensive back-substitutions.
Following Cai et al. (2017) for the commonly considered perfect switch-off case, we consider a Gaussian function (Fig. 1a) to express the source function of a perfect switch-on or switch-off process at time t 0 in terms of We use the following representation of a time-dependent Dirac distribution whereĵ denotes the current strength of the related step-response signal. We found that suitable choices for α, controlling the stretching of the distribution, are one to three orders of magnitude lower than the earliest observation time (e.g. α = 10 ns for t 0 = 0 s and the first observation at 1 μs). The smaller α becomes, the better is the approximation of a perfect switch-off current often used for theoretical analysis. Using 100 uniform time steps for the discretization of the source pulse in the range [t 0 -5 · α, t 0 + 5 · α] showed to result in good accuracy and computational performance.

R E S U LT S
This section demonstrates the numerical accuracy and computational performance of the implemented algorithms by three examples. All computations were run on a Dell R PowerEdge R940 server with four Intel R Xeon R Gold 6154 processors and 48 LRDIMM 64 GB, DDR4-2666 Quad Ranks shared random access memory (RAM). Overall, 144 computing nodes and approximately 3 TB RAM were available. We realized parallelization utilizing both MPI (mpi4py) and OpenMP. MPI is the default parallel-communication library in FEniCSfor organizing the mesh distribution with PETSc and the parallelized internode communication with MUMPS. OpenMP is a shared-memory library used by MUMPS internally for solving the systems of equations, distributing the tasks during this phase over multiple threads. The consumed RAM (Table 1) depends only on the number of parallel MPI processes and is not increased by enabling additional OpenMP threads. A combined usage of up to tens of MPI processes with a few OpenMP threads can often reduce the computation times (CT) and memory requirements in FE problems with comparatively large system sizes, especially for p2 computations (Rochlitz 2020). We used a slightly increased number of MPI processes for the FT approach to account for the larger systems in the frequency-domain.

Example #1: Loop transmitter on top of a three-layered earth
First, we investigate the performance of the three implemented approaches using a simple land-based TEM setup with a loop transmitter as illustrated in Fig. 2. Layered-earth-like geometries usually require a tremendous amount of elements for a sufficiently large computational domain to avoid boundary effects. As a compromise, we used a bounded, 4 ×4 km 2 large fraction of the layered earth embedded within a 10 times larger homogeneous half-space, which led to approximately 17 000 nodes. We set the resistivity of the uppermost layer also for the surrounding boundary-half-space instead of an averaged value of the three layer resistivities, because a homogeneous lateral continuation of the uppermost layer with the strongest resistivity contrast to the air domain usually leads to the smallest artefacts (Rochlitz 2020). Table 1 lists the corresponding computational statistics for all considered approaches. We validated our results against the semianalytic solutions by empymod (Werthmüller 2017).  the standard formulation. An even stronger effect of the domain boundary was visible in the switch-off FT results of b. The increasing late-time errors of the switch-off response could be explained by the insufficient estimation of b static by the lowest-frequency B-field solution, which was influenced by boundary effects as well. If b is obtained via simple numerical integration as in the RA approach, no systematic late-time errors could be observed. Appendix C contains the corresponding switch-on responses and further explanations for interested readers. Aside from the late time, the FT and RA results were almost identical, whereas the IE errors slightly differed below the level of 1 per cent. This behaviour was attributed to effects of the timediscretization in addition to the influence of the spatial discretization, which dominated the error distribution of all three approaches. Fig. 4 shows the horizontal switch-off e y and ∂b y /∂t responses at the out-of-loop Rx positions. Analogous to the presented central loop results, the average error levels amounted to 10/1 per cent for p1/p2 and the time-discretization had an unique effect on the IE results. The sign reversal in the ∂b y /∂t result at 0.02 ms corresponded to the highest relative errors observed.

Example #2: 3-D LOTEM with conductors
After presenting a validation against analytic solutions, we considered a 3-D LOTEM configuration with a 1-km-long dipole source in a half-space model with three conductive bricks. Fig. 5 illustrates the geometry. Table 2 provides details about the incorporated anomalies. Solutions obtained with the established SLDMEM code, which is based on the work by Druskin & Knizhnerman (1994), served as a reference. To calculate the misfit between the two inexact numerical solutions, we utilized the so-called symmetric mean absolute percentage error (SMAPE) 200 · |a − b|/(|a| + |b|).
The parallel simulations were run on a single mesh, whereas the serial SLDMEM calculations used six optimized grids for the three receiver lines with respect to e or ||∂b/∂t||, respectively. The corresponding computational statistics are listed in Table 1. Approximately one hour was required for each of the six SLDMEM computations on a single Intel i7-8665U mobile CPU. Fig. 6a) shows the comparison of the e-field vector magnitude, denoted as ||e||, at the position x = 1000 m, y = -1000 m close to the anomalies. The related misfits between all custEM simulations and the SLDMEM solution at this Rx position were smaller than 3 per cent. Fig. 6(b) presents the corresponding ||∂b/∂t|| results with NRMS values mainly below 1 per cent. Tests revealed that the different overall misfit levels are mainly influenced by the underlying SLDMEM grid discretization. Viewing further Rx locations (plots are available in the data repository: https://gitlab.com/Rochlitz.R/custEM, 30.11.2020), we observed early-time misfits (<1 ms) of 1-10 per cent and a good agreement of less than 2 per cent, often better, at later times. Our three results differed at some locations only at late times (>0.1 s), which was mainly caused by effects of handling the DC level in the different approaches. Fig. 7 provides a comprehensive overview of all misfits at the 18 Rx locations. It shows the median (colour coded patches) of the 41 NRMS values of each transient response for ||e|| (top row) and ||∂b/∂t|| (bottom row). Bluish colours indicate stations with a very good fit, reddish indicate higher misfits of up to a few percents for most of the data points. The overall agreement of the ||∂b/∂t|| results with median NRMS values of about 1 per cent and smaller was very good. The misfit distribution of e varied stronger between the locations and exhibited median values up to 4 per cent, in particular, on the +1 km observation line. We strongly assume that further modifications on the SLDMEM grid for e on this Rx line would lead to a better match.

Example #3: Marlim R3D
After presenting all approached for two land-based setups, our final application is a real-world example about marine CSEM hydrocarbon exploration. Since the RA method has shown to be overall as accurate as the other approaches, but significantly faster, we present only results of this method for this example. Recently   Table 2), a 1 km grounded (x-directed) Tx at y = z = 0 km and three surface Rx lines at x = (-1, 0, 1) km with each 6 positions at y = (0.5, 1, 1.5, 2, 2.5, 3) km. the RA method against this recently published FD reference data set by transforming the transients into frequency domain. For this task, we applied the FFTlog algorithm (Talman 1978;Hamilton 2000) which was reimplemented as pyfftlog software in Python by Dieter Werthmüller (https://github.com/prisae/pyff tlog, 06.10.2020). The transformation parameters used in this work were added in Appendix D. The Marlim R3D model includes bathymetry, six subsurface layers with topography obtained from picking horizons in seismic data, and vertical transverse isotropic (VTI) conductivities. The particular challenge for reproducing these results with our code was the mapping of conductivities from a regular grid on a suitable tetrahedral mesh, taking the geometric constraints of the stratigraphic horizons into account. The latter covered only the central part of the area (≈11 km in x-and y-direction from the origin). Beyond this area, we appended an inner and outer boundary mesh for increasing the overall extent to avoid boundary artefacts. Within the inner boundary mesh, the tetrahedra volume was constrained for ensuring an accurate inter-and extrapolation of the supplied resistivity information. Fig. 8 illustrates a section through the complete domain, showing the layered-earth constraints as well as the horizontal resistivity distribution of our mesh. The vertical resistivities are twice as high aside from the air, water and deep salt (on top of basement) layers (Correa & Menezes 2019). Fig. 9 shows the electric fields on the broadside observation line and the corresponding NRMS values compared with the results by Correa & Menezes (2019), Fig. 4. The strongest E x component exhibited mainly misfits below 10 per cent at all frequencies. The misfits of the weaker E y and E z components were overall below 20 per cent. According to the nature of relative errors, the values were higher at positions of sign reversals or very low signal amplitudes.
In Appendix E, we added a cross-comparison with the custEM FD solution (Werthmüller et al. 2020) for the two frequencies 0.25 and 1 Hz. It revealed that the transformation process from T D to FD appeared to be an additional error source in the range of a few percent.
We used the same mesh for the T D and FD simulations because the mesh size is mainly controlled by approximating the subsurface geometry. The RA approach required ≈5 min computation time with 16 MPI processes and 4 OpenMP threads. Calculating the results directly in FD demanded ≈15 min with equal resources. As the transformation effort was negligible, we achieved a performance benefit of factor 3 for this particular example by simulating FD data in T D with the RA method. This ratio could change with other discretizations and solver types. It could also vary if multiple meshes are designed for specific ranges of times or frequencies in other modelling cases with a less tetrahedra-demanding geometry of the subsurface.

Approach characteristics
All three implemented time-domain EM modelling approaches support the simulation of complex geophysical models with good accuracy, which is mainly determined by the spatial discretization. The IE approach was computationally more efficient than the FT approach. The RA approach clearly outperformed the other two approaches regarding the computation times (Table 1). This behaviour is related to the computational complexity of each method. Following Mulder et al. (2008) and Börner et al. (2015), we can express the computational efforts as O (n dof * n timesteps ), O(2n dof * n f requencies ) (complex-valued systems) and O(n dof * n poles ) for the IE, FT and RA approaches, respectively. Taking into account the amount of expensive matrix factorizations using direct solvers, which are primarily controlling the computation times, the representation above is misleading. With our methodology, the computation times depend mainly on the number of logarithmic time steps +1, the number of frequencies, and the number of cyclically repeated poles +1. Table 3 provides an overview of these and further characteristics of each approach.
In this work, we only considered simple current wave-forms as ramp effects can often be neglected in case of fast-switching transmitters or irrelevant early times. If realistic ramps are of interest, the current function can be arbitrarily discretized in the IE approach and, for instance, adapted in a post-processing step for the FT and RA approaches Fitterman & Anderson (1987). It is not trivial to calculate sensitivities for inverse modellings with the given formulation of the RA approach. To the best of our knowledge, it has not yet been possible to develop an algorithm for calculating the derivatives of the Krylov basis vectors (25) with respect to the electrical conductivity.
Projecting the results at points in time is computationally inexpensive in the RA approach. Therefore, the numerical integration method for obtaining b from • b is efficient. In contrast, numerical integration is unsuited for the other two approaches. It requires a denser logarithmic spacing in time or significantly more frequency solutions to be calculated, which would make the computational performance even worse.
Based on the overall performance, it appears that the RA approach is the method of choice. In combination with the logarithmic median FT ( )= 2.13 % median FT ( )= 0.68 % (b) (a)    Fast Fourier transformation logfft, the RA approach can be seriously considered as an alternative to simulations in the frequency-domain. Even though not considered here, the poles ξ introduced in the RA method can be used to directly compute the frequency-domain response for a given spatial discretization.

Validation
To the best of our knowledge, there is still no report about proposals for a standard to evaluate the accuracy of 3-D geophysical EM simulations. With respect to recently published research, we consider misfits of less than 1 per cent between 3-D numerical and semi-analytic solutions as sufficiently accurate. Accordingly, we achieved good results with p2 basis functions in Example #1, whereas the p1 results were not accurate enough with errors up to 10 per cent. We decided for p1 and p2 computations on the same mesh in this examples for directly comparing accuracy and performance.
We observed increased errors of the p2 results towards the latest observation times in Example #1. The late-time errors were caused by the horizontal limitation of the layered-earth extent of the mesh as a compromise between accuracy and computational performance. Shifting the boundary-mesh to a greater distance would reduce the boundary effects for the cost of longer computation times and higher memory requirements. These boundary effects were not visible in the p1 results due to the greater overall error level caused by the comparatively coarse refinement.
Simulations with p1 can of course provide better results by using finer meshes with higher quality (Rochlitz 2020). However, this would lead to an inferior computational performance compared to the p2 computations (Grayver & Kolev 2015;Castillo-Reyes et al. 2019;Rochlitz et al. 2019). Nevertheless, we regard p1 simulations as a very practical tool for getting a first impression of the EM field behaviour of realistic models and for simulating complicated geometries which already have a large number of tetrahedra after the mesh generation, even without considering further Rx or Tx refinements.
We also expected misfits of around 1 per cent for accurate results of two different numerical codes, simulating exactly the same geometry in Example #2. Our comparison showed error levels of a few percent and less for the majority of the compared data points, but also larger misfits of up to more than 10 per cent at a few Rx locations. The median value was chosen instead of the mean as overall estimate of the fit between two different results to avoid the influence of single data points with very high relative errors due to very low amplitudes or sign reversals. The even better choice for an overall error estimate can be other percentiles, for example the 90th percentile instead of the median as 50th percentile (Rochlitz 2020). This would guaranty that almost all (e.g. 90 per cent) data points are of better quality than this measure. Moreover, the specific percentile can be chosen according to the number of investigated data points and expected outliers.
Aside from varying the spatial discretization and simulation parameters to optimize the fit between our two independent simulations, a validation of the 'true' result would be only possible through a cross-validation with further codes. In contrast to Example #1, a homogeneous subsurface model with multiple single anomalies of arbitrary shape is suited for an unstructured tetrahedral discretization. Table 1 shows comparable computational efforts for the simple 1-D TEM and the 3-D LOTEM model as a strong argument for this statement.
In Example #3, the published FD model results were independently reproduced based on the information by Correa & Menezes (2019). Actually, the unstructured tetrahedral discretization is generally suited to approximate realistic, irregular geometries such as in Marlim R3D with subsurface layer constraints. However, the provided resistivity model on a fine, regular grid with a limited extent required the interpolation of the resistivity distribution on our tetrahedral elements. We assume this procedure to cause the higher misfits up to 20 per cent and are aware of the related discrepancies. Though, this example demonstrates very well the unavoidable issue of a common and realistic model representation between different numerical methods in practical modelling applications.
Due to this problem and as further mesh variations did not yield better matching results, we considered the average error of 10 per cent as sufficient for the Marlim R3D model crosscomparison. This reveals that there is still a requirement of comparing more real-world models between state-of-the-art EM codes instead of validations to semi-analytic solutions for confirming the general accuracy. We think the most challenging part in complex geophysical EM modelling is to represent the true geometry of complicated conductivity structures accurately while keeping the required computational resources low.

Enhancements
The implementation of alternative iterative solvers with appropriate pre-conditioners would reduce the memory requirements significantly. Even though a broad range of pre-conditioners and iterative solution methods are provided by FEniCSand PETSc, this development step would require huge efforts for analysing the approach-specific, poorly conditioned system-matrices arising from the FE formulation to solve EM problems on unstructured tetrahedral meshes with tremendously varying cell sizes.
Adaptive time-stepping techniques as an alternative to our fixed logarithmic spacing with internal linear steps could provide a higher flexibility to enhance the computational performance of the IE approach. The performance of the FT approach can be considerably improved with a smaller number of problem-dependent FHT filter coefficients for specific setups . For instance, reducing the amount of required frequencies from 120 to 20-30 would speed up the FT approach significantly and make it maybe even faster than the IE method. Moreover, we assume that it would be worth investigating if the sequence of frequency-domain simulations in the FT method could be calculated faster with optimized iterative solution strategies for such problems.

C O N C L U S I O N S
We present a direct comparison of three of the most common approaches for modelling 3-D TEM data. Three examples ranging from simple to challenging EM modelling environments illustrate the numerical accuracy, computational performance and unique characteristics of all implemented approaches. The implicit Euler and inverse Fourier Transform based approaches are computationally not competitive to the Rational Arnoldi method in our work using unstructured tetrahedral meshes and the direct solver MUMPS. It is worth mentioning that simulations based on other numerical methods, discretizations and iterative solvers could lead to contrary observations.
We are confident that it would be possible to improve the computational performance of the IE and FT approaches using our discussed options and maybe others, whereas we do not see any straightforward and significant optimization potential for the RA method as used in our work. In particular, where is a great potential to reduce the number of required frequencies for the FT approach which would make it absolutely competitive to the IE approach in terms of computation times.
Nevertheless, the RA method, which requires only three expensive matrix factorizations with our implementation, would still clearly outperform the other approaches. We expect the parallelized version of the computationally most efficient Rational Arnoldi method to be of great use for the simulation of 3-D TEM signals.
Our final example shows that in some cases, calculating frequencydomain responses with the time-domain Rational Arnoldi method might be beneficial over computations in frequency domain. We are convinced that the topic of converting time-domain responses into frequency-domain is worth further investigation.
We provide an open-source, parallelized FE implementation for efficiently simulating realistic 3-D time-domain EM data. We encourage geophysical EM modelers to prefer benchmarking their codes with realistic 3-D models via cross-comparisons over validations to semi-analytic solutions since nowadays, the challenge in 3-D geophysical EM modelling appears to be not anymore the implementation of the numerical methods but the optimum discretization of realistic 3-D survey geometries. In this context, we are open for participating in further benchmark studies or for comparing the results of this work with others.

F U N D I N G
The development of custEM by RR as part of the DESMEX/DESMEX II projects was funded by the Federal Ministry of Education and Research, Germany (BMBF) in the framework of the research and development program Fona-r4 under grants 033R130D/033R130DN.

A C K N O W L E D G E M E N T S
We sincerely thank the developers of FEniCS, TetGen and pyGIMLi for their effort on developing software tools for the community over the years. The authors are especially grateful to Thomas Günther for all the fruitful discussions regarding the solution of methodological and implementation issues, Dieter Werthmüller for his great support on successfully applying the pyfttlog algorithm and Tilman Hanstein for providing the FHT filter coefficients. We highly appreciate the thorough comments from Jochen Kamm, Dieter Werthmüller and an anonymous reviewer that helped a lot to improve this manuscript.

DATA AVA I L A B I L I T Y
The open-source toolbox custEM and all presented results can be accessed on https://gitlab.com/Rochlitz.R/custEM, (23.07.2021). The documentation is available on https://custem.readthedocs.io, (23.07.2021). The software and corresponding data can be used for individual purposes by following the GNU General Public License, version 3.

S U P P O RT I N G I N F O R M AT I O N
Supplementary data are available at GJ I online.

links to custEM.txt
Please note: Oxford University Press is not responsible for the content or functionality of any supporting materials supplied by the authors. Any queries (other than missing material) should be directed to the corresponding author for the paper.

A P P E N D I X A : F H T F I LT E R C O E F F I C I E N T S
Additional to the set of filter weights, a digital filter consists of two more parameters: the shift a and the sampling interval or spacing s. The shift factor a determines the starting point for sampling the input function in the frequency domain and the unit-free spacing s defines the interval for the subsequent samples in the frequency domain down to the lowest frequency. Thus, the set of n sampling frequencies f n , ordered from high to low, is given by: with t 0 , the first time channel. We used the following FHT filter coefficients with the sampling interval s =10 0.1 and the shift factor a = 1.653801536E+03: Reusing established software, custEM relies on the open-source library FEniCS (Logg et al. 2012;Langtangen et al. 2016), providing highly developed libraries for the automated solution of partial differential equations. It further uses TetGen (Si 2015) and pyGIMLi (Rücker et al. 2017) for the mesh generation and COMET (Skibbe et al. 2020) for primary field calculations. The modularized organization allowed reusing existing submodules for the T D modelling approaches and also a total-field formulation for calculating 3-D natural-source (MT) EM data.
In addition to the major extension by adding the three timedomain modelling approaches as well as an MT modelling approach, we mention further important developments after the previously published version 0.93 (Rochlitz et al. 2019). The provided conda package simplifies the installation process for users and reduces incompatibility issues on different computer architectures. Newer conda packages of FEniCSsupport parallel I/O of model data and use recent versions of MUMPS. This leads to a significant computational performance improvement and the system matrix size is no longer limited. Incorporating multiple sources on multiple right-hand sides of the systems of equations was automated in all approaches, allowing to exploit reusing matrix factorization. In this context, we could remove redundant I/O overhead and add alternative parallelized interpolation methods.
The electric-field modelling approaches in FD support now simulations with relative magnetic permeabilities and induced polarization parameters. Common total and secondary field potential approaches for solving the DC problem were added. The mesh generation tools contain new features for automatically incorporating bathymetry information to model coastal or marine environments. Further meshing improvements are the support of connected embedded anomalies, an optimized Tx and Rx discretization, automated forwarding of Tx and Rx information for the FE simulation, and general updates to minimize the amount of required tetrahedra for most CSEM geometries.
The open-source Python toolbox custEM 1.0 has now become a tool for simulating any kind of CSEM, TEM and MT data, including marine, coastal, land-based, semi-airborne, airborne and borehole models with arbitrary geometry. A variety of supplied tutorials and examples can support interested users in simulating their own EM setups. In addition to forward modelling purposes, it is possible to build inverse-modelling routines upon our implemented algorithms. Fig. C1 illustrates the b z switch-on transients of Example #1 at the central loop positions. Aside from the early times, the relative error levels for p1 and p2 are lower than the ones of the impulse response and switch-off transients in Fig. 3. This observation can be attributed to the high amplitude level of the static field, which dominates the transients more and more towards late times. There are no boundary effects visible in comparison to the late-time switch-off transients of b, whose quality depends on an accurate estimation of b static .

A P P E N D I X D : T R A N S F O R M AT I O N O F T R A N S I E N T S
The results of testing different parameters showed that the most accurate transformation results with the fftl subroutine of the pyfftlog algorithm (https://github.com/prisae/pyff tlog, 06.10.2020) were achieved by using a long input time range from 1e-6 to 1e4 s with a fine sampling of 80 logarithmic steps per decade. Furthermore, we set the bias coefficient q = 1. We were required to modify the original RA solution with 80 samples per decade between 1e-2 to 1e2 s for this procedure. First, we determined the signal onset based on a constantly increasing signal and the amplitude level. Secondly, we set all values before the signal onset to the constant value of the onset amplitude divided by 1e6. Third, we used the UnivariateSpline algorithm of the scipy library to extent the original signal in both direction in time. Fig. D1 illustrates the modification result for one example transient.

A P P E N D I X E : M A R L I M R 3 D C RO S S -C O M PA R I S O N
Fig. E1 reveals in many areas higher misfits between both of our and the C&M solutions than between the transformed RA and FD solutions. This indicated a strong influence of the model discretization and resistivity interpolation procedure on the results. In most areas along the profile, the RA -C&M misfits were slightly higher than the FD-C&M misfits, which indicated an overall small error contribution by the transformation process. Between 395 000 and 400 000 m, the fit of the E x component was better between RA -C&M compared to FD-C&M, in particular at 0.25 Hz. This observation demonstrated that also the chosen numerical simulation approach can lead to higher misfits in some areas.