2. Background Theory

This section aims to provide the user with a basic review of the physics, discretization, and optimization techniques used to solve the frequency domain electromagnetics problem. It is assumed that the user has some background in these areas. For further reading see ( [Nab91] ).

Important

This code uses the following coordinate system and Fourier convention to solve Maxwell’s equations:
  • X = Easting, Y = Northing, Z +ve downward (left-handed)

  • An \(e^{-i \omega t}\) Fourier convention

2.1. Fundamental Physics

Maxwell’s equations provide the starting point from which an understanding of how electromagnetic fields can be used to uncover the substructure of the Earth. In the frequency domain Maxwell’s equations are:

(2.1)\[\begin{split}\begin{align} \nabla \times &\mathbf{E} - i\omega\mu \mathbf{H} = 0 \\ \nabla \times &\mathbf{H} - \sigma \mathbf{E} = \mathbf{s} \\ &\mathbf{\hat{n} \times H} = 0 \end{align}\end{split}\]

where \(\mathbf{E}\) and \(\mathbf{H}\) are the electric and magnetic fields, \(\mathbf{s}\) is some external source and \(e^{-i\omega t}\) is suppressed. Symbols \(\mu\), \(\sigma\) and \(\omega\) are the magnetic permeability, conductivity, and angular frequency, respectively. This formulation assumes a quasi-static mode so that the system can be viewed as a diffusion equation (Weaver, 1994; Ward and Hohmann, 1988 in [Nab91]). By doing so, some difficulties arise when solving the system;

  • the curl operator has a non-trivial null space making the resulting linear system highly ill-conditioned

  • the conductivity \(\sigma\) varies over several orders of magnitude

  • the fields can vary significantly near the sources, but smooth out at distance thus high resolution is required near sources

2.2. Octree Mesh

By using an Octree discretization of the earth domain, the areas near sources and likely model location can be give a higher resolution while cells grow large at distance. In this manner, the necessary refinement can be obtained without added computational expense. The figure below shows an example of an Octree mesh, with nine cells, eight of which are the base mesh minimum size.

../_images/OcTree.png

When working with Octree meshes, the underlying mesh is defined as a regular 3D orthogonal grid where the number of cells in each dimension are \(2^{n_1} \times 2^{n_2} \times 2^{n_3}\). The cell widths for the underlying mesh are \(h_1, \; h_2, \; h_3\), respectively. This underlying mesh is the finest possible, so that larger cells have side lengths which increase by powers of 2. The idea is that if the recovered model properties change slowly over a certain volume, the cells bounded by this volume can be merged into one without losing the accuracy in modeling, and are only refined when the model begins to change rapidly.

2.3. Discretization of Operators

The operators div, grad, and curl are discretized using a finite volume formulation. Although div and grad do not appear in (2.1), they are required for the solution of the system. The divergence operator is discretized in the usual flux-balance approach, which by Gauss’ theorem considers the current flux through each face of a cell. The nodal gradient (operates on a function with values on the nodes) is obtained by differencing adjacent nodes and dividing by edge length. The discretization of the curl operator is computed similarly to the divergence operator by utilizing Stokes theorem by summing the magnetic field components around the edge of each face. Please see [HHG+12] for a detailed description of the discretization process.

2.4. Forward Problem

2.4.1. Direct Solver Approach

To solve the forward problem, we must first discretize and solve for the fields in Eq. (2.1), where \(e^{-i\omega t}\) is suppressed. Using finite volume discretization, the electric fields on cell edges (\(\mathbf{u_e}\)) are obtained by solving the following system at every frequency:

(2.2)\[\big [ \mathbf{C^T \, M_\mu \, C} + i\omega \mathbf{M_\sigma} \big ] \, \mathbf{u_e} = - i \omega \mathbf{s_e}\]

where \(\mathbf{s_e}\) is a source term defined on cell edges, \(\mathbf{C}\) is the curl operator and:

\[\begin{split}\begin{align} \mathbf{M_\mu} &= diag \big ( \mathbf{A^T_{f2c} V} \, \boldsymbol{\mu^{-1}} \big ) \\ \mathbf{M_\sigma} &= diag \big ( \mathbf{A^T_{e2c} V} \, \boldsymbol{\sigma} \big ) \\ \end{align}\end{split}\]

where \(\mathbf{V}\) is a diagonal matrix containing all cell volumes, \(\mathbf{A_{f2c}}\) averages from faces to cell centres and \(\mathbf{A_{e2c}}\) averages from edges to cell centres. The magnetic permeabilities and conductivities for each cell are contained within vectors \(\boldsymbol{\mu}\) and \(\boldsymbol{\sigma}\), respectively.

Once the electric field on cell edges has been computed, the electric (\(\mathbf{E}\)) and magnetic (\(\mathbf{H}\)) fields at observation locations can be obtain via the following:

(2.3)\[\begin{split}\begin{align} \mathbf{E} &= \mathbf{Q_e \, u_e} = \mathbf{Q_c \, A_{e2c} \, u_e} \\ \mathbf{H} &= \mathbf{Q_h \, u_e} = \frac{1}{i \omega} \mathbf{Q_c} \, diag(\boldsymbol{\mu}^{-1}) \, \mathbf{A_{f2c} C \, u_e} \end{align}\end{split}\]

where \(\mathbf{Q_c}\) represents the appropriate projection matrix from cell centers to a particular receiver (Ex, Ey, Ez, Hx, Hy or Hz). If we let

(2.4)\[\mathbf{A}(\sigma) = \mathbf{C^T \, M_\mu \, C} + i\omega \mathbf{M_\sigma}\]

then (2.3) can be written as:

(2.5)\[\begin{split}\begin{align} \mathbf{E} &= -i\omega \mathbf{Q_e \, A}(\sigma)^{-1} \, \mathbf{s_e} \\ \mathbf{H} &= -i\omega \mathbf{Q_h \, A}(\sigma)^{-1} \, \mathbf{s_e} \end{align}\end{split}\]

2.4.2. Iterative Solver Approach

For this approach we decompose the electric field as follows:

(2.6)\[\mathbf{u_e} = \mathbf{a} + \mathbf{G} \phi\]

where \(\mathbf{u_e}\) is the fields on cell edges, \(\mathbf{a}\) is the vector potential, \(\phi\) is the scalar potential and \(\mathbf{G}\) is the discrete gradient operator. To compute the electric fields, the BiCGstab algorithm is used to solve the following system:

(2.7)\[\begin{split}\begin{bmatrix} \mathbf{A} (\sigma) + \mathbf{D} & -i\omega \mathbf{M_\sigma G} \\ -i\omega \mathbf{G^T M_\sigma} & -i\omega \mathbf{G^T M_\sigma G} \end{bmatrix} \begin{bmatrix} \mathbf{a} \\ \phi \end{bmatrix} = \begin{bmatrix} -i\omega\mathbf{s_e} \\ -i\omega \mathbf{G^T s} \end{bmatrix}\end{split}\]

where

\[\mathbf{D} = \mathbf{G} \, diag \big ( \mathbf{A^T_{n2c} V} \, \boldsymbol{\mu^{-1}} \big ) \mathbf{G^T}\]

is a matrix that is added to the (1,1) block of Eq. (2.7) to improve the stability of the system and \(\mathbf{A}\) is given by Eq. (2.4). Once Eq. (2.7) is solved, Eq. (2.6) is used to obtain the electric fields on cell edges and Eq. (2.3) computes the fields at the receivers.

Adjustable parameters for solving Eq. (2.7) iteratively using BiCGstab are defined as follows:

  • tol_bicg: relative tolerance (stopping criteria) when solver is used during forward modeling; i.e. solves Eq. (2.2). Ideally, this number is very small (~1e-10).

  • tol_ipcg_bicg: relative tolerance (stopping criteria) when solver needed in computation of \(\delta m\) during Gauss Newton iteration; i.e. must solve Eq. (2.10) to solve Eq. (2.16). This value does not need to be as large as the previous parameter (~1e-5).

  • max_it_bicg: maximum number of BICG iterations (~100)

2.4.3. Total vs Secondary Field

To compute the total field or the secondary field, we define a different right-hand-side for Eq. (2.2) (direct solver) or Eq. (2.7) (iterative solver).

Total Field Computation:

For total field data, the analytic source current \(\mathbf{s}\) defined in Eq. (2.1) is interpolated to cell edges by a function \(f(\mathbf{s})\). It is then multiplyied by an inner-product matrix \(\mathbf{M_e}\) that lives on cell-edges. Thus for the right-hand-side in Eq. (2.2) or Eq. (2.7), the discrete source term \(\mathbf{s_e}\) is given by:

\[\mathbf{s_e} = \mathbf{M_e} \, f(\mathbf{s})\]

where

\[\mathbf{M_e} = diag \big ( \mathbf{A^T_{e2c} v} \big )\]

Secondary Field Computation (Direct Solver):

For secondary field data, we compute the analytic electric field in a homogeneous full-space due to a current loop or wire. We do this for a background conductivity \(\sigma_0\) and permeability \(\mu_0\). The analytic solution for our source is computed by taking the analytic solution for an electric dipole and integrating over the path of the wire/loop, i.e.:

\[\mathbf{u_0}(\mathbf{r}) = \int_{s'} \mathbf{E_e} (\mathbf{r}, \mathbf{r'}) d\mathbf{s'}\]

For an electric dipole at the origin and oriented along the \(\hat{x}\) direction, the electric field in a homogeneous full-space is given by:

(2.8)\[\begin{split}\mathbf{E_e} = \frac{I ds}{4 \pi (\sigma - i \omega \varepsilon) r^3} e^{ikr} \Bigg [ \Bigg ( \frac{x^2}{r^2} \mathbf{\hat{x}} + & \frac{xy}{r^2} \mathbf{\hat{y}} + \frac{xz}{r^2} \mathbf{\hat{z}} \Bigg ) ... \\ &\big ( -k^2 r^2 - 3ikr +3 \big ) + \big ( k^2 r^2 + ikr -1 \big ) \mathbf{\hat{x}} \Bigg ] .\end{split}\]

where

\[k^2 = \omega^2 \mu \varepsilon + i\omega \mu \sigma\]

Once the analytic background field is computed on cell edges, we construct the linear operator \(\mathbf{A}(\mathbf{\sigma_0})\) from Eq. (2.4) using the background conductivity and permeability. Then we use \(\mathbf{A}(\mathbf{\sigma_0})\) and \(\mathbf{u_0}\) to compute the right-hand-side that is used to solve Eq. (2.2)

\[\mathbf{A}(\boldsymbol{\sigma_0}) \, \mathbf{u_0} = - i \omega \mathbf{s_e}\]

Secondary Field Computation (Iterative Solver):

A similar approach is taken in this case. Here, we must compute the analytic vector potential \(\mathbf{a}\) and scalar potential \(\phi\) for the background conductivity and permeabiltiy. The matrix in Eq. (2.7) is computed for the background conductivity and permeabiltiy. The linear operator and analytic potentials are then used to compute the right-hand side.

2.5. Sensitivity

Electric and magnetic field observations are split into their real and imaginary components. Thus the data at a particular frequency for a particular reading is organized in a vector of the form:

(2.9)\[\mathbf{d} = [E^\prime_{x}, E^{\prime \prime}_{x}, E^\prime_{y}, E^{\prime \prime}_{y}, E^\prime_{z}, E^{\prime \prime}_{z}, H^\prime_{x}, H^{\prime \prime}_{x}, H^\prime_{y}, H^{\prime \prime}_{y}, H^\prime_{z}, H^{\prime \prime}_{z}]^T\]

where \(\prime\) denotes real components and \(\prime\prime\) denotes imaginary components. To determine the sensitivity of the data (i.e. (2.9)) with respect to the model (\(\boldsymbol{\sigma}\)), we must compute:

\[\frac{\partial \mathbf{d}}{\partial \boldsymbol{\sigma}} = \Bigg [ \dfrac{\partial E_{x}^\prime}{\partial \boldsymbol{\sigma}} , \dfrac{\partial E_{x}^{\prime\prime}}{\partial \boldsymbol{\sigma}} , \dfrac{\partial E_{y}^\prime}{\partial \boldsymbol{\sigma}} , \dfrac{\partial E_{y}^{\prime\prime}}{\partial \boldsymbol{\sigma}} , \dfrac{\partial E_{z}^\prime}{\partial \boldsymbol{\sigma}} , \dfrac{\partial E_{z}^{\prime\prime}}{\partial \boldsymbol{\sigma}} , \dfrac{\partial H_{x}^\prime}{\partial \boldsymbol{\sigma}} , \dfrac{\partial H_{x}^{\prime\prime}}{\partial \boldsymbol{\sigma}} , \dfrac{\partial H_{y}^\prime}{\partial \boldsymbol{\sigma}} , \dfrac{\partial H_{y}^{\prime\prime}}{\partial \boldsymbol{\sigma}} , \dfrac{\partial H_{z}^\prime}{\partial \boldsymbol{\sigma}} , \dfrac{\partial H_{z}^{\prime\prime}}{\partial \boldsymbol{\sigma}} , \Bigg ]^T\]

where the conductivity model \(\boldsymbol{\sigma}\) is real-valued. To differentiate \(E^\prime_x\) (or any other field component) with respect to the model, we require the derivative of the electric fields on cell edges (\(\mathbf{u_e}\)) with respect to the model. This is given by:

(2.10)\[\frac{\partial \mathbf{u_e}}{\partial \boldsymbol{\sigma}} = - i\omega \mathbf{A}^{-1} diag(\mathbf{u_e}) \, \mathbf{A_{e2c}^T V }\]

Note

Eq. (2.10) defines the sensitivities when using the direct solver formulation. Computations involving the sensitivities will differ if the iterative solver approach is used.

2.6. Inverse Problem

We are interested in recovering the conductivity distribution for the Earth. However, the numerical stability of the inverse problem is made more challenging by the fact rock conductivities can span many orders of magnitude. To deal with this, we define the model as the log-conductivity for each cell, e.g.:

\[\mathbf{m} = log (\boldsymbol{\sigma})\]

The inverse problem is solved by minimizing the following global objective function with respect to the model:

(2.11)\[\phi (\mathbf{m}) = \phi_d (\mathbf{m}) + \beta \phi_m (\mathbf{m})\]

where \(\phi_d\) is the data misfit, \(\phi_m\) is the model objective function and \(\beta\) is the trade-off parameter. The data misfit ensures the recovered model adequately explains the set of field observations. The model objective function adds geological constraints to the recovered model. The trade-off parameter weights the relative emphasis between fitting the data and imposing geological structures.

2.6.1. Data Misfit

Here, the data misfit is represented as the L2-norm of a weighted residual between the observed data (\(d_{obs}\)) and the predicted data for a given conductivity model \(\boldsymbol{\sigma}\), i.e.:

(2.12)\[\phi_d = \frac{1}{2} \big \| \mathbf{W_d} \big ( \mathbf{d_{obs}} - \mathbb{F}[\boldsymbol{\sigma}] \big ) \big \|^2\]

where \(W_d\) is a diagonal matrix containing the reciprocals of the uncertainties \(\boldsymbol{\varepsilon}\) for each measured data point, i.e.:

\[\mathbf{W_d} = \textrm{diag} \big [ \boldsymbol{\varepsilon}^{-1} \big ]\]

Important

For a better understanding of the data misfit, see the GIFtools cookbook .

2.6.2. Model Objective Function

Due to the ill-posedness of the problem, there are no stable solutions obtained by freely minimizing the data misfit, and thus regularization is needed. The regularization uses penalties for both smoothness, and likeness to a reference model \(m_{ref}\) supplied by the user. The model objective function is given by:

(2.13)\[\begin{split}\begin{align} \phi_m = \frac{\alpha_s}{2} \!\int_\Omega w_s | m - & m_{ref} |^2 dV + \frac{\alpha_x}{2} \!\int_\Omega w_x \Bigg | \frac{\partial}{\partial x} \big (m - m_{ref} \big ) \Bigg |^2 dV \\ &+ \frac{\alpha_y}{2} \!\int_\Omega w_y \Bigg | \frac{\partial}{\partial y} \big (m - m_{ref} \big ) \Bigg |^2 dV + \frac{\alpha_z}{2} \!\int_\Omega w_z \Bigg | \frac{\partial}{\partial z} \big (m - m_{ref} \big ) \Bigg |^2 dV \end{align}\end{split}\]

where \(\alpha_s, \alpha_x, \alpha_y\) and \(\alpha_z\) weight the relative emphasis on minimizing differences from the reference model and the smoothness along each gradient direction. And \(w_s, w_x, w_y\) and \(w_z\) are additional user defined weighting functions.

An important consideration comes when discretizing the regularization onto the mesh. The gradient operates on cell centered variables in this instance. Applying a short distance approximation is second order accurate on a domain with uniform cells, but only \(\mathcal{O}(1)\) on areas where cells are non-uniform. To rectify this a higher order approximation is used ([HHG+12]). The second order approximation of the model objective function can be expressed as:

\[\phi_m (\mathbf{m}) = \mathbf{\big (m-m_{ref} \big )^T W^T W \big (m-m_{ref} \big )}\]

where the regularizer is given by:

(2.14)\[\begin{split}\begin{align} \mathbf{W^T W} =& \;\;\;\;\alpha_s \textrm{diag} (\mathbf{w_s \odot v}) \\ & + \alpha_x \mathbf{G_x^T} \textrm{diag} (\mathbf{w_x \odot v_x}) \mathbf{G_x} \\ & + \alpha_y \mathbf{G_y^T} \textrm{diag} (\mathbf{w_y \odot v_y}) \mathbf{G_y} \\ & + \alpha_z \mathbf{G_z^T} \textrm{diag} (\mathbf{w_z \odot v_z}) \mathbf{G_z} \end{align}\end{split}\]

The Hadamard product is given by \(\odot\), \(\mathbf{v_x}\) is the volume of each cell averaged to x-faces, \(\mathbf{w_x}\) is the weighting function \(w_x\) evaluated on x-faces and \(\mathbf{G_x}\) computes the x-component of the gradient from cell centers to cell faces. Similarly for y and z.

If we require that the recovered model values lie between \(\mathbf{m_L \preceq m \preceq m_H}\) , the resulting bounded optimization problem we must solve is:

(2.15)\[\begin{split}\begin{align} &\min_m \;\; \phi_d (\mathbf{m}) + \beta \phi_m(\mathbf{m}) \\ &\; \textrm{s.t.} \;\; \mathbf{m_L \preceq m \preceq m_H} \end{align}\end{split}\]

A simple Gauss-Newton optimization method is used where the system of equations is solved using ipcg (incomplete preconditioned conjugate gradients) to solve for each G-N step. For more information refer again to [HHG+12] and references therein.

2.6.3. Inversion Parameters and Tolerances

2.6.3.1. Cooling Schedule

Our goal is to solve Eq. (2.15), i.e.:

\[\begin{split}\begin{align} &\min_m \;\; \phi_d (\mathbf{m}) + \beta \phi_m(\mathbf{m - m_{ref}}) \\ &\; \textrm{s.t.} \;\; \mathbf{m_L \leq m \leq m_H} \end{align}\end{split}\]

but how do we choose an acceptable trade-off parameter \(\beta\)? For this, we use a cooling schedule. This is described in the GIFtools cookbook . The cooling schedule can be defined using the following parameters:

  • beta_max: The initial value for \(\beta\)

  • beta_factor: The factor at which \(\beta\) is decrease to a subsequent solution of Eq. (2.15)

  • beta_min: The minimum \(\beta\) for which Eq. (2.15) is solved before the inversion will quit

  • Chi Factor: The inversion program stops when the data misfit \(\phi_d \leq N \times Chi \; Factor\), where \(N\) is the number of data observations

2.6.3.2. Gauss-Newton Update

For a given trade-off parameter (\(\beta\)), the model \(\mathbf{m}\) is updated using the Gauss-Newton approach. Because the problem is non-linear, several model updates may need to be completed for each \(\beta\). Where \(k\) denotes the Gauss-Newton iteration, we solve:

(2.16)\[\mathbf{H}_k \, \mathbf{\delta m}_k = - \nabla \phi_k\]

using the current model \(\mathbf{m}_k\) and update the model according to:

(2.17)\[\mathbf{m}_{k+1} = \mathbf{m}_{k} + \alpha \mathbf{\delta m}_k\]

where \(\mathbf{\delta m}_k\) is the step direction, \(\nabla \phi_k\) is the gradient of the global objective function, \(\mathbf{H}_k\) is an approximation of the Hessian and \(\alpha\) is a scaling constant. This process is repeated until any of the following occurs:

  1. The gradient is sufficiently small, i.e.:

    \[\| \nabla \phi_k \|^2 < tol \_ nl\]
  2. The smallest component of the model perturbation its small in absolute value, i.e.:

    \[\textrm{max} ( |\mathbf{\delta m}_k | ) < mindm\]
  3. A max number of GN iterations have been performed, i.e.

    \[k = iter \_ per \_ beta\]

2.6.3.3. Gauss-Newton Solve

Here we discuss the details of solving Eq. (2.16) for a particular Gauss-Newton iteration \(k\). Using the data misfit from Eq. (2.12) and the model objective function from Eq. (2.14), we must solve:

(2.18)\[\Big [ \mathbf{J^T W_d^T W_d J + \beta \mathbf{W^T W}} \Big ] \mathbf{\delta m}_k = - \Big [ \mathbf{J^T W_d^T W_d } \big ( \mathbf{d_{obs}} - \mathbb{F}[\mathbf{m}_k] \big ) + \beta \mathbf{W^T W m}_k \Big ]\]

where \(\mathbf{J}\) is the sensitivity of the data to the current model \(\mathbf{m}_k\). The system is solved for \(\mathbf{\delta m}_k\) using the incomplete-preconditioned-conjugate gradient (IPCG) method. This method is iterative and exits with an approximation for \(\mathbf{\delta m}_k\). Let \(i\) denote an IPCG iteration and let \(\mathbf{\delta m}_k^{(i)}\) be the solution to (2.18) at the \(i^{th}\) IPCG iteration, then the algorithm quits when:

  1. the system is solved to within some tolerance and additional iterations do not result in significant increases in solution accuracy, i.e.:

    \[\| \mathbf{\delta m}_k^{(i-1)} - \mathbf{\delta m}_k^{(i)} \|^2 / \| \mathbf{\delta m}_k^{(i-1)} \|^2 < tol \_ ipcg\]
  2. a maximum allowable number of IPCG iterations has been completed, i.e.:

    \[i = max \_ iter \_ ipcg\]