2. Background Theory¶

This section aims to provide the user with a basic review of the physics, discretization, and optimization techniques used to solve the frequency domain electromagnetics problem. It is assumed that the user has some background in these areas. For further reading see ([Nab91]).

Important

This code uses the following coordinate system and Fourier convention to solve Maxwell’s equations:

X = Easting, Y = Northing, Z +ve downward (left-handed)
An \(e^{-i \omega t}\) Fourier convention

2.1. Fundamental Physics¶

Maxwell’s equations provide the starting point from which an understanding of how electromagnetic fields can be used to uncover the substructure of the Earth. In the frequency domain Maxwell’s equations are:

(2.1)¶\[\begin{split}\begin{align} \nabla \times &\mathbf{E} - i\omega\mu \mathbf{H} = 0 \\ \nabla \times &\mathbf{H} - \sigma \mathbf{E} = \mathbf{s} \\ &\mathbf{\hat{n} \times H} = 0 \end{align}\end{split}\]

where \(\mathbf{E}\) and \(\mathbf{H}\) are the electric and magnetic fields, \(\mathbf{s}\) is some external source and \(e^{-i\omega t}\) is suppressed. Symbols \(\mu\), \(\sigma\) and \(\omega\) are the magnetic permeability, conductivity, and angular frequency, respectively. This formulation assumes a quasi-static mode so that the system can be viewed as a diffusion equation (Weaver, 1994; Ward and Hohmann, 1988 in [Nab91]). By doing so, some difficulties arise when solving the system;

the curl operator has a non-trivial null space making the resulting linear system highly ill-conditioned

the conductivity \(\sigma\) varies over several orders of magnitude

the fields can vary significantly near the sources, but smooth out at distance thus high resolution is required near sources

2.2. Octree Mesh¶

By using an Octree discretization of the earth domain, the areas near sources and likely model location can be give a higher resolution while cells grow large at distance. In this manner, the necessary refinement can be obtained without added computational expense. The figure below shows an example of an Octree mesh, with nine cells, eight of which are the base mesh minimum size.

When working with Octree meshes, the underlying mesh is defined as a regular 3D orthogonal grid where the number of cells in each dimension are \(2^{n_1} \times 2^{n_2} \times 2^{n_3}\). The cell widths for the underlying mesh are \(h_1, \; h_2, \; h_3\), respectively. This underlying mesh is the finest possible, so that larger cells have side lengths which increase by powers of 2. The idea is that if the recovered model properties change slowly over a certain volume, the cells bounded by this volume can be merged into one without losing the accuracy in modeling, and are only refined when the model begins to change rapidly.

2.3. Discretization of Operators¶

The operators div, grad, and curl are discretized using a finite volume formulation. Although div and grad do not appear in (2.1), they are required for the solution of the system. The divergence operator is discretized in the usual flux-balance approach, which by Gauss’ theorem considers the current flux through each face of a cell. The nodal gradient (operates on a function with values on the nodes) is obtained by differencing adjacent nodes and dividing by edge length. The discretization of the curl operator is computed similarly to the divergence operator by utilizing Stokes theorem by summing the magnetic field components around the edge of each face. Please see [HHG+12] for a detailed description of the discretization process.

2.4. Forward Problem¶

To solve the forward problem, we must first discretize and solve for the fields in Eq. (2.1), where \(e^{-i\omega t}\) is suppressed. Using finite volume discretization, the electric fields on cell edges (\(\mathbf{u_e}\)) are obtained by solving the following system at every frequency:

(2.2)¶\[\mathbf{A}(\boldsymbol{\sigma}) \, \mathbf{u_e} = - i \omega \mathbf{s_e}\]

where \(\mathbf{s_e}\) is a source term discretized to cell edges and matrix \(\mathbf{A}(\boldsymbol{\sigma})\) is given by:

(2.3)¶\[\mathbf{A}(\boldsymbol{\sigma}) = \mathbf{C^T \, M_\mu \, C} + i\omega \mathbf{M_\sigma}\]

\(\mathbf{C}\) is the curl operator and the mass matricies \(\mathbf{M_\sigma}\) and \(\mathbf{M_\mu}\) are given by:

\[\begin{split}\begin{align} \mathbf{M_\mu} &= diag \big ( \mathbf{A^T_{f2c} V} \, \boldsymbol{\mu^{-1}} \big ) \\ \mathbf{M_\sigma} &= diag \big ( \mathbf{A^T_{e2c} V} \, \boldsymbol{\sigma} \big ) \\ \end{align}\end{split}\]

where \(\mathbf{V}\) is a diagonal matrix containing all cell volumes, \(\mathbf{A_{f2c}}\) averages from faces to cell centres and \(\mathbf{A_{e2c}}\) averages from edges to cell centres. The magnetic permeabilities and conductivities for each cell are contained within vectors \(\boldsymbol{\mu}\) and \(\boldsymbol{\sigma}\), respectively.

2.4.1. Total vs Secondary Field¶

To compute the total field or the secondary field, we define a different right-hand-side for Eq. (2.2)

Total Field Computation:

For total field data, the analytic source current \(\mathbf{s}\) defined in Eq. (2.1) is interpolated to cell edges by a function \(f(\mathbf{s})\). It is then multiplyied by an inner-product matrix \(\mathbf{M_e}\) that lives on cell-edges. Thus for the right-hand-side in Eq. (2.2), the discrete source term \(\mathbf{s_e}\) is given by:

\[\mathbf{s_e} = \mathbf{M_e} \, f(\mathbf{s})\]

where

\[\mathbf{M_e} = diag \big ( \mathbf{A^T_{e2c} v} \big )\]

Secondary Field Computation:

For secondary field data, we compute the analytic electric field in a homogeneous full-space due to a current loop or wire. We do this for a background conductivity \(\sigma_0\) and permeability \(\mu_0\). The analytic solution for our source is computed by taking the analytic solution for an electric dipole and integrating over the path of the wire/loop, i.e.:

\[\mathbf{u_0}(\mathbf{r}) = \int_{s'} \mathbf{E_e} (\mathbf{r}, \mathbf{r'}) d\mathbf{s'}\]

For an electric dipole at the origin and oriented along the \(\hat{x}\) direction, the electric field in a homogeneous full-space is given by:

(2.4)¶\[\begin{split}\mathbf{E_e} = \frac{I ds}{4 \pi (\sigma - i \omega \varepsilon) r^3} e^{ikr} \Bigg [ \Bigg ( \frac{x^2}{r^2} \mathbf{\hat{x}} + & \frac{xy}{r^2} \mathbf{\hat{y}} + \frac{xz}{r^2} \mathbf{\hat{z}} \Bigg ) ... \\ &\big ( -k^2 r^2 - 3ikr +3 \big ) + \big ( k^2 r^2 + ikr -1 \big ) \mathbf{\hat{x}} \Bigg ] .\end{split}\]

where

\[k^2 = \omega^2 \mu \varepsilon + i\omega \mu \sigma\]

Once the analytic background field is computed on cell edges, we construct the linear operator \(\mathbf{A}(\mathbf{\sigma_0})\) from Eq. (2.3) using the background conductivity and permeability. Then we use \(\mathbf{A}(\mathbf{\sigma_0})\) and \(\mathbf{u_0}\) to compute the right-hand-side that is used to solve Eq. (2.2)

\[\mathbf{A}(\boldsymbol{\sigma_0}) \, \mathbf{u_0} = - i \omega \mathbf{s_e}\]

2.4.2. Computing Fields at Receivers¶

Once the electric field on cell edges has been computed, we must project to the receivers. For E3D version 2, closed wire loops are used to measure the average magnetic field perpendicular to the loop. Magnetic field measurements (\(H\)) are obtained by integrating the electric field (\(\mathbf{e}\)) over the path of close loop to compute the EMF. The EMF is then divided by \(i\omega \mu_0 A\), where \(A\) is the cross-sectional area, to represent the quantity in terms of the average magnetic field normal to the receiver. In practice, magnetic field measurements can be approximated accurately by applying a linear projection matrix (\(P\)) to the electric fields computed on cell edges:

\[H = \frac{1}{i\omega \mu_0 A} \int_C \mathbf{e} \cdot d\mathbf{l} \approx \frac{1}{i\omega} P \, \mathbf{u_e}\]

Where (\(\mathbf{P}\)) is the projection matrix that takes the electric fields on cell edges to all receivers, the vector containing all magnetic field measurements is given by:

(2.5)¶\[\mathbf{H} = \frac{1}{i\omega} \mathbf{P \, u_e} = - \mathbf{P \, A}(\sigma)^{-1} \mathbf{s}\]

2.5. Sensitivity¶

The total magnetic field data are split into their real and imaginary components. Thus the data at a particular frequency for a particular reading is organized in a vector of the form:

(2.6)¶\[\mathbf{d} = [\mathbf{H}^\prime, \mathbf{H}^{\prime \prime}]^T\]

where \(\prime\) denotes real components and \(\prime\prime\) denotes imaginary components. To determine the sensitivity of the data (i.e. (2.6)) with respect to the model (\(\boldsymbol{\sigma}\)), we must compute:

\[\frac{\partial \mathbf{d}}{\partial \boldsymbol{\sigma}} = \Bigg [ \dfrac{\partial \mathbf{H}^\prime}{\partial \boldsymbol{\sigma}} , \dfrac{\partial \mathbf{H}^{\prime\prime}}{\partial \boldsymbol{\sigma}} \Bigg ]^T\]

where the conductivity model \(\boldsymbol{\sigma}\) is real-valued. To differentiate the data with with respect to the model, we require the derivative of the electric fields on cell edges (\(\mathbf{u_e}\)) with respect to the model (Eq. (2.5)). This is given by:

(2.7)¶\[\frac{\partial \mathbf{u_e}}{\partial \boldsymbol{\sigma}} = - i\omega \mathbf{A}^{-1} diag(\mathbf{u_e}) \, \mathbf{A_{e2c}^T V }\]

2.6. Inverse Problem¶

We are interested in recovering the conductivity distribution for the Earth. However, the numerical stability of the inverse problem is made more challenging by the fact rock conductivities can span many orders of magnitude. To deal with this, we define the model as the log-conductivity for each cell, e.g.:

\[\mathbf{m} = log (\boldsymbol{\sigma})\]

The inverse problem is solved by minimizing the following global objective function with respect to the model:

(2.8)¶\[\phi (\mathbf{m}) = \phi_d (\mathbf{m}) + \beta \phi_m (\mathbf{m})\]

where \(\phi_d\) is the data misfit, \(\phi_m\) is the model objective function and \(\beta\) is the trade-off parameter. The data misfit ensures the recovered model adequately explains the set of field observations. The model objective function adds geological constraints to the recovered model. The trade-off parameter weights the relative emphasis between fitting the data and imposing geological structures.

2.6.1. Data Misfit¶

Here, the data misfit is represented as the L2-norm of a weighted residual between the observed data (\(d_{obs}\)) and the predicted data for a given conductivity model \(\boldsymbol{\sigma}\), i.e.:

(2.9)¶\[\phi_d = \frac{1}{2} \big \| \mathbf{W_d} \big ( \mathbf{d_{obs}} - \mathbb{F}[\boldsymbol{\sigma}] \big ) \big \|^2\]

where \(W_d\) is a diagonal matrix containing the reciprocals of the uncertainties \(\boldsymbol{\varepsilon}\) for each measured data point, i.e.:

\[\mathbf{W_d} = \textrm{diag} \big [ \boldsymbol{\varepsilon}^{-1} \big ]\]

Important

For a better understanding of the data misfit, see the GIFtools cookbook .

2.6.2. Model Objective Function¶

Due to the ill-posedness of the problem, there are no stable solutions obtained by freely minimizing the data misfit, and thus regularization is needed. The regularization uses penalties for both smoothness, and likeness to a reference model \(m_{ref}\) supplied by the user. The model objective function is given by:

(2.10)¶\[\begin{split}\begin{align} \phi_m = \frac{\alpha_s}{2} \!\int_\Omega w_s | m - & m_{ref} |^2 dV + \frac{\alpha_x}{2} \!\int_\Omega w_x \Bigg | \frac{\partial}{\partial x} \big (m - m_{ref} \big ) \Bigg |^2 dV \\ &+ \frac{\alpha_y}{2} \!\int_\Omega w_y \Bigg | \frac{\partial}{\partial y} \big (m - m_{ref} \big ) \Bigg |^2 dV + \frac{\alpha_z}{2} \!\int_\Omega w_z \Bigg | \frac{\partial}{\partial z} \big (m - m_{ref} \big ) \Bigg |^2 dV \end{align}\end{split}\]

where \(\alpha_s, \alpha_x, \alpha_y\) and \(\alpha_z\) weight the relative emphasis on minimizing differences from the reference model and the smoothness along each gradient direction. And \(w_s, w_x, w_y\) and \(w_z\) are additional user defined weighting functions.

An important consideration comes when discretizing the regularization onto the mesh. The gradient operates on cell centered variables in this instance. Applying a short distance approximation is second order accurate on a domain with uniform cells, but only \(\mathcal{O}(1)\) on areas where cells are non-uniform. To rectify this a higher order approximation is used ([HHG+12]). The second order approximation of the model objective function can be expressed as:

\[\phi_m (\mathbf{m}) = \mathbf{\big (m-m_{ref} \big )^T W^T W \big (m-m_{ref} \big )}\]

where the regularizer is given by:

(2.11)¶\[\begin{split}\begin{align} \mathbf{W^T W} =& \;\;\;\;\alpha_s \textrm{diag} (\mathbf{w_s \odot v}) \\ & + \alpha_x \mathbf{G_x^T} \textrm{diag} (\mathbf{w_x \odot v_x}) \mathbf{G_x} \\ & + \alpha_y \mathbf{G_y^T} \textrm{diag} (\mathbf{w_y \odot v_y}) \mathbf{G_y} \\ & + \alpha_z \mathbf{G_z^T} \textrm{diag} (\mathbf{w_z \odot v_z}) \mathbf{G_z} \end{align}\end{split}\]

The Hadamard product is given by \(\odot\), \(\mathbf{v_x}\) is the volume of each cell averaged to x-faces, \(\mathbf{w_x}\) is the weighting function \(w_x\) evaluated on x-faces and \(\mathbf{G_x}\) computes the x-component of the gradient from cell centers to cell faces. Similarly for y and z.

If we require that the recovered model values lie between \(\mathbf{m_L \preceq m \preceq m_H}\) , the resulting bounded optimization problem we must solve is:

(2.12)¶\[\begin{split}\begin{align} &\min_m \;\; \phi_d (\mathbf{m}) + \beta \phi_m(\mathbf{m}) \\ &\; \textrm{s.t.} \;\; \mathbf{m_L \preceq m \preceq m_H} \end{align}\end{split}\]

A simple Gauss-Newton optimization method is used where the system of equations is solved using ipcg (incomplete preconditioned conjugate gradients) to solve for each G-N step. For more information refer again to [HHG+12] and references therein.

2.6.3. Inversion Parameters and Tolerances¶

2.6.3.1. Cooling Schedule¶

Our goal is to solve Eq. (2.12), i.e.:

\[\begin{split}\begin{align} &\min_m \;\; \phi_d (\mathbf{m}) + \beta \phi_m(\mathbf{m - m_{ref}}) \\ &\; \textrm{s.t.} \;\; \mathbf{m_L \leq m \leq m_H} \end{align}\end{split}\]

but how do we choose an acceptable trade-off parameter \(\beta\)? For this, we use a cooling schedule. This is described in the GIFtools cookbook . The cooling schedule can be defined using the following parameters:

beta_max: The initial value for \(\beta\)

beta_factor: The factor at which \(\beta\) is decrease to a subsequent solution of Eq. (2.12)

nBetas: The number of times the inversion code will decrease \(\beta\) and solve Eq. (2.12) before it quits

Chi Factor: The inversion program stops when the data misfit \(\phi_d \leq N \times Chi \; Factor\), where \(N\) is the number of data observations

2.6.3.2. Gauss-Newton Update¶

For a given trade-off parameter (\(\beta\)), the model \(\mathbf{m}\) is updated using the Gauss-Newton approach. Because the problem is non-linear, several model updates may need to be completed for each \(\beta\). Where \(k\) denotes the Gauss-Newton iteration, we solve:

(2.13)¶\[\mathbf{H}_k \, \mathbf{\delta m}_k = - \nabla \phi_k\]

using the current model \(\mathbf{m}_k\) and update the model according to:

(2.14)¶\[\mathbf{m}_{k+1} = \mathbf{m}_{k} + \alpha \mathbf{\delta m}_k\]

where \(\mathbf{\delta m}_k\) is the step direction, \(\nabla \phi_k\) is the gradient of the global objective function, \(\mathbf{H}_k\) is an approximation of the Hessian and \(\alpha\) is a scaling constant. This process is repeated until any of the following occurs:

The gradient is sufficiently small, i.e.:

\[\| \nabla \phi_k \|^2 < tol \_ nl\]

The smallest component of the model perturbation its small in absolute value, i.e.:

\[\textrm{max} ( |\mathbf{\delta m}_k | ) < mindm\]

A max number of GN iterations have been performed, i.e.

\[k = iter \_ per \_ beta\]

2.6.3.3. Gauss-Newton Solve¶

Here we discuss the details of solving Eq. (2.13) for a particular Gauss-Newton iteration \(k\). Using the data misfit from Eq. (2.9) and the model objective function from Eq. (2.11), we must solve:

(2.15)¶\[\Big [ \mathbf{J^T W_d^T W_d J + \beta \mathbf{W^T W}} \Big ] \mathbf{\delta m}_k = - \Big [ \mathbf{J^T W_d^T W_d } \big ( \mathbf{d_{obs}} - \mathbb{F}[\mathbf{m}_k] \big ) + \beta \mathbf{W^T W m}_k \Big ]\]

where \(\mathbf{J}\) is the sensitivity of the data to the current model \(\mathbf{m}_k\). The system is solved for \(\mathbf{\delta m}_k\) using the incomplete-preconditioned-conjugate gradient (IPCG) method. This method is iterative and exits with an approximation for \(\mathbf{\delta m}_k\). Let \(i\) denote an IPCG iteration and let \(\mathbf{\delta m}_k^{(i)}\) be the solution to (2.15) at the \(i^{th}\) IPCG iteration, then the algorithm quits when:

the system is solved to within some tolerance and additional iterations do not result in significant increases in solution accuracy, i.e.:

\[\| \mathbf{\delta m}_k^{(i-1)} - \mathbf{\delta m}_k^{(i)} \|^2 / \| \mathbf{\delta m}_k^{(i-1)} \|^2 < tol \_ ipcg\]

a maximum allowable number of IPCG iterations has been completed, i.e.:

\[i = max \_ iter \_ ipcg\]