Next: BCUT (aka Burden Eigenvalues)[2,3,14] Up: Molecular Descriptors Previous: Contents   Contents


Autocorrelation Descriptors[1,13,12,11]

These descriptors are based on an autocorrelation function $AC_l$, defined as

\begin{displaymath}
AC_l = \int_{a}^{b} f(x) \cdot f(x+l) \, dx
\end{displaymath}

where $f(x)$ is a function of the variable $x$, $l$ is the lag (an interval of $x$), and $a$ and $b$ bound the interval over which the function is studied. $f(x)$ is usually a time-dependent function.

For an ordered sequence of $n$ values $f(x_i)$ we can calculate $AC_l$ by

\begin{displaymath}
AC_l = \sum_{i=1}^{n-l} f(x_i) \cdot f(x_{i+l})
\end{displaymath}

where the lag $l$ takes the values from 1 to $L$, and the largest possible value of $L$ is $L_{max} = n-1$. In practice $L$ is usually less than 8.
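The discrete sum can be sketched in a few lines of Python. This is only an illustration: the function name `autocorrelation` and the sample values are assumptions made for the example, not part of any particular descriptor package.

```python
def autocorrelation(values, lag):
    """Discrete autocorrelation AC_l = sum_{i=1}^{n-l} f(x_i) * f(x_{i+l})."""
    n = len(values)
    if not 1 <= lag <= n - 1:
        raise ValueError("lag must satisfy 1 <= lag <= n - 1")
    # zip stops at the shorter sequence, so exactly n - lag products are summed
    return sum(a * b for a, b in zip(values, values[lag:]))


f = [1.0, 2.0, 3.0, 4.0]  # hypothetical ordered sequence f(x_1) .. f(x_4)
ac = [autocorrelation(f, lag) for lag in range(1, len(f))]
print(ac)  # [20.0, 11.0, 4.0]
```

Here $AC_1 = 1 \cdot 2 + 2 \cdot 3 + 3 \cdot 4 = 20$, and the number of terms shrinks by one with each increase of the lag.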

A property of the autocorrelation value is that it does not change when the origin of the $x$ variable is changed.

To get spatial autocorrelation descriptors, $f(x_i)$ is taken to be a physicochemical property calculated for every atom, such as atomic mass or polarizability. The atoms thus represent the discrete points $x_i$, and the atomic property of each atom represents the function value at that point. In this case the lag $l$ is defined as the topological distance $d$, i.e. the topological distance $d_{ij}$ between two graph vertices $v_i$ and $v_j$ is the number of edges in the shortest path between these two vertices.
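Topological distances on an unweighted molecular graph can be obtained by a breadth-first search from each atom. The sketch below assumes the molecule is given as an adjacency list and is connected; the function name and the isobutane heavy-atom example are illustrative choices only.

```python
from collections import deque


def topological_distances(adjacency):
    """All-pairs shortest path lengths (edge counts) via BFS from every vertex."""
    n = len(adjacency)
    dist = [[None] * n for _ in range(n)]  # None marks "not reached yet"
    for start in range(n):
        dist[start][start] = 0
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for u in adjacency[v]:
                if dist[start][u] is None:
                    dist[start][u] = dist[start][v] + 1
                    queue.append(u)
    return dist


# Hypothetical example: heavy atoms of isobutane, atom 1 is the central carbon
isobutane = [[1], [0, 2, 3], [1], [1]]
d = topological_distances(isobutane)
print(d[0])  # [0, 1, 2, 2]
```

Two terminal carbons are separated by two bonds through the central atom, so their topological distance is 2.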


Moreau Broto Autocorrelation

Also known as Autocorrelation of a Topological Structure (ATS). The ATS descriptor describes how a property is distributed along the topological structure. It is a spatial autocorrelation on a molecular graph defined as

\begin{displaymath}
ATS_d = \sum^{A}_{i=1} \sum^{A}_{j=1} \deltaup_{ij} \cdot (w_i \cdot w_j)_d
=
\varmathbb{W}^T \cdot \varmathbb{B}^m \cdot \varmathbb{W}
\end{displaymath}

where $w$ is any atomic property, $A$ is the total number of atoms, $d$ is the topological distance considered, $\deltaup_{ij}$ is the Kronecker delta (equal to 1 if $d_{ij} = d$ and 0 otherwise), $\varmathbb{B}^m$ is the $m$'th order binary sparse matrix (a matrix whose elements are equal to 1 only for vertices $v_i$ and $v_j$ at a distance $m$, with $m = d$ here) and $\varmathbb{W}$ is the $A$-dimensional vector of atomic properties.
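As a rough illustration, the double sum and the matrix form can be compared on a small graph. The sketch reads the formula literally, so every pair is counted in both orders, $(i,j)$ and $(j,i)$; some implementations count unordered pairs instead. The distance matrix (isobutane heavy atoms), the property values and the function names are all hypothetical.

```python
# Topological distance matrix for the heavy atoms of isobutane (atom 1 is central)
DIST = [
    [0, 1, 2, 2],
    [1, 0, 1, 1],
    [2, 1, 0, 2],
    [2, 1, 2, 0],
]
W = [1.0, 2.0, 3.0, 4.0]  # hypothetical atomic property values


def ats(dist, w, d):
    """Double sum over all ordered atom pairs (i, j) with d_ij == d."""
    n = len(w)
    return sum(w[i] * w[j] for i in range(n) for j in range(n) if dist[i][j] == d)


def ats_matrix_form(dist, w, d):
    """Same quantity written as W^T . B . W, with B the binary distance-d matrix."""
    n = len(w)
    b = [[1 if dist[i][j] == d else 0 for j in range(n)] for i in range(n)]
    bw = [sum(b[i][j] * w[j] for j in range(n)) for i in range(n)]
    return sum(w[i] * bw[i] for i in range(n))


for d in range(3):
    print(d, ats(DIST, W, d), ats_matrix_form(DIST, W, d))
# 0 30.0 30.0
# 1 32.0 32.0
# 2 38.0 38.0
```

At distance 0 the Kronecker delta selects only the pairs $i = j$, so $ATS_0$ is the sum of the squared property values.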

For each property $w$, the autocorrelation terms for all distances that occur in the graph together form the ATS descriptor,

\begin{displaymath}
\langle ATS_0, ATS_1, ATS_2, \cdots, ATS_D \rangle_{w}
\end{displaymath}

where $D$ is the topological diameter (maximum distance in the graph).

The average spatial autocorrelation descriptors remove the dependence on molecular size. They are obtained by dividing each term by the corresponding number of contributions, i.e.

\begin{displaymath}
\overline{ATS}_d = \frac{1}{\Delta} \cdot \sum^{A}_{i=1} \sum^{A}_{j=1} \deltaup_{ij} \cdot (w_i
\cdot w_j)_d
\end{displaymath}

where $\Delta$ is the sum of the Kronecker deltas, i.e. the number of vertex pairs at distance $d$.
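A minimal sketch of the averaged descriptor, evaluated for every distance from 0 up to the topological diameter, might look as follows. It assumes a connected graph and takes $\Delta$ literally as the sum of the Kronecker deltas in the double sum, so ordered pairs are counted in both the numerator and $\Delta$. The distance matrix, property values and function name are hypothetical.

```python
# Topological distance matrix for the heavy atoms of isobutane (atom 1 is central)
DIST = [
    [0, 1, 2, 2],
    [1, 0, 1, 1],
    [2, 1, 0, 2],
    [2, 1, 2, 0],
]
W = [1.0, 2.0, 3.0, 4.0]  # hypothetical atomic property values


def average_ats_vector(dist, w):
    """Return the averaged ATS terms for d = 0 .. D (D = topological diameter)."""
    n = len(w)
    diameter = max(max(row) for row in dist)
    result = []
    for d in range(diameter + 1):
        pairs = [(i, j) for i in range(n) for j in range(n) if dist[i][j] == d]
        delta = len(pairs)  # sum of the Kronecker deltas at this distance
        total = sum(w[i] * w[j] for i, j in pairs)
        result.append(total / delta)
    return result


avg = average_ats_vector(DIST, W)
print(avg)
```

For this graph there are 4, 6 and 6 ordered pairs at distances 0, 1 and 2, giving averages of 30/4, 32/6 and 38/6.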

ATS descriptors for 3D geometries are based on the geometry matrix, whose entries $G_{ij}$ are the Euclidean distances between atoms $i$ and $j$.

Moran Coefficient

This is an index of spatial correlation defined by

\begin{displaymath}
I(d) = \frac{
\frac{1}{\Delta} \cdot \sum^{A}_{i=1} \sum^{A}_{j=1} \deltaup_{ij} \cdot (w_i - \overline{w}) \cdot (w_j - \overline{w})
}{
\frac{1}{A} \cdot \sum^{A}_{i=1} (w_i - \overline{w})^2
}
\end{displaymath}

where $w_i$ is an atomic property, $\overline{w}$ is its average value over the whole molecule, $A$ is the number of atoms, $d$ is the topological distance, $\deltaup_{ij}$ is the Kronecker delta, and $\Delta$ is the sum of the Kronecker deltas, i.e. the number of vertex pairs at distance $d$.
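The Moran coefficient can be sketched as below, assuming the standard form in which the numerator is the mean product of deviations $(w_i - \overline{w})(w_j - \overline{w})$ over pairs at distance $d$ and the denominator is the variance of $w$ with divisor $A$. Pairs are counted in both orders, which cancels against $\Delta$. The distance matrix, property values and function name are hypothetical.

```python
# Topological distance matrix for the heavy atoms of isobutane (atom 1 is central)
DIST = [
    [0, 1, 2, 2],
    [1, 0, 1, 1],
    [2, 1, 0, 2],
    [2, 1, 2, 0],
]
W = [1.0, 2.0, 3.0, 4.0]  # hypothetical atomic property values


def moran(dist, w, d):
    """Moran coefficient I(d) for atomic property w at topological distance d."""
    n = len(w)
    mean = sum(w) / n
    dev = [x - mean for x in w]
    pairs = [(i, j) for i in range(n) for j in range(n) if dist[i][j] == d]
    if not pairs:
        raise ValueError("no vertex pairs at this distance")
    numerator = sum(dev[i] * dev[j] for i, j in pairs) / len(pairs)
    # The denominator is zero when all property values are equal; not handled here
    denominator = sum(x * x for x in dev) / n
    return numerator / denominator


print(moran(DIST, W, 1), moran(DIST, W, 2))
```

For these values $I(1) = -1/15$ and $I(2) = -0.6$: atoms two bonds apart tend to lie on opposite sides of the mean.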

Geary Coefficient

Defined as

\begin{displaymath}
c(d) = \frac{
\frac{1}{2\Delta} \cdot \sum^{A}_{i=1} \sum^{A}_{j=1} \deltaup_{ij} \cdot (w_i - w_j)^2
}{
\frac{1}{A-1} \cdot \sum^{A}_{i=1} (w_i - \overline{w})^2
}
\end{displaymath}

where the symbols are the same as in the Moran coefficient. This is a distance-type function and its range is $[0,\infty)$.
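A comparable sketch for the Geary coefficient, assuming the standard form with squared differences $(w_i - w_j)^2$ in the numerator and the variance with divisor $A-1$ in the denominator. As before, pairs are counted in both orders and $\Delta$ is the sum of the Kronecker deltas; the distance matrix, property values and function name are hypothetical.

```python
# Topological distance matrix for the heavy atoms of isobutane (atom 1 is central)
DIST = [
    [0, 1, 2, 2],
    [1, 0, 1, 1],
    [2, 1, 0, 2],
    [2, 1, 2, 0],
]
W = [1.0, 2.0, 3.0, 4.0]  # hypothetical atomic property values


def geary(dist, w, d):
    """Geary coefficient c(d) for atomic property w at topological distance d."""
    n = len(w)
    mean = sum(w) / n
    pairs = [(i, j) for i in range(n) for j in range(n) if dist[i][j] == d]
    if not pairs:
        raise ValueError("no vertex pairs at this distance")
    numerator = sum((w[i] - w[j]) ** 2 for i, j in pairs) / (2 * len(pairs))
    # The denominator is zero when all property values are equal; not handled here
    denominator = sum((x - mean) ** 2 for x in w) / (n - 1)
    return numerator / denominator


print(geary(DIST, W, 1), geary(DIST, W, 2))
```

For these values $c(1) = 0.6$ and $c(2) = 1.4$. Because the numerator is a sum of squares, the coefficient can never be negative.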
2003-06-16