CV

Miscellaneous Code

## Calculating Weighted Geometric Median of 3D Points

For a recent project I was asked to write some software to calculate weighted median centers for sets of related 3D points. There are several definitions that can be used to define exactly what median center means, and for this blog entry I am sharing a method known as the center of minimum aggregate travel.

From Wikipedia:

The geometric median is the point to which the population has the smallest possible sum of distances (or equivalently, the smallest average distance). Because of this property, it is also known as the point of minimum aggregate travel. Unfortunately, there is no direct closed-form expression for the geometric median; it is typically computed using iterative methods.

For sample data and a better explanation of the process, I searched the online textbook Geospatial Analysis Online to find a section covering centroids and centers.

## Sample data

Below is a table representing data borrowed from the Geospatial Analysis Online textbook that has been slightly modified for testing and verification purposes.

Name X Y Z Weight
A 3.0 0.0 0.0 1.0
B 14.0 3.0 0.0 1.0
C 10.0 4.0 0.0 1.0
D 13.0 11.0 0.0 1.0
E 4.0 13.0 0.0 1.0
F 0.0 8.0 0.0 1.0

The data is actually 2D and unweighted, but we can add a third dimension and set all weights to 1 to use the data for verification.

## Methodology

This method can be implemented in Python. See the full source on GitHub.

It consists of:
1. A weighted 3D point primitive type.
2. An aggregate type for calculating the weighted geometric mean center. This point serves as a good starting location for minimizing the sum of all distances.
3. An aggregate type for calculating the weighted geometric median (MAT) center.

Note:

These bits of code were originally developed to be used as sqlite3 aggreagate functions, which is why they employ step() and finalize() member methods.

### 1. WeightedPointZ primitive type

For this example I thought it would be best to create a simple primitive type to represent weighted 3D points. In practice this will probably not be very useful due to the fact that it's likely that one could be using more complete spatial types.

```class WeightedPointZ(object):
def __init__(self, x, y, z, weight=1.0):
"""
Point primitive.

:param x: X coordinate of point
:type x: float or integer
:param y: Y coordinate of point
:type y: float or integer
:param z: Z coordinate of point
:type z: float or integer
:param weight: Weight measurement of the point
:type weight: float or integer
"""
self._x = x
self._y = y
self._z = z
self._w = weight

def __eq__(self, other):
return self.__dict__ == other.__dict__

def distance(self, other):
"""
Calculates the distance between the instantiated point and one other.

:param other: Another point
:type other: WeightedPointZ
"""
return math.sqrt((other.x - self.x) ** 2 +
(other.y - self.y) ** 2 +
(other.z - self.z) ** 2)

@property
def x(self):
return self._x

@property
def y(self):
return self._y

@property
def z(self):
return self._z

@property
def weight(self):
return self._w
```

### 2. WeightedMeanCenterZ aggregate

The mean center is used to determine a decent starting point as the code iterates towards the point of minimum aggregate travel.

```class WeightedMeanCenterZ(object):
"""
Weighted mean center.
"""
def __init__(self):
self.x_wsum = 0.0
self.y_wsum = 0.0
self.z_wsum = 0.0
self.w_sum = 0.0

def step(self, point):
self.x_wsum += point.x * point.weight
self.y_wsum += point.y * point.weight
self.z_wsum += point.z * point.weight
self.w_sum += point.weight

def finalize(self):
x_mean = self.x_wsum / self.w_sum
y_mean = self.y_wsum / self.w_sum
z_mean = self.z_wsum / self.w_sum
return WeightedPointZ(x=x_mean, y=y_mean, z=z_mean)
```

### 3. WeightedMatCenterZ aggregate

Epsilon is an arbitrarily small value that can be adjusted as needed.

```class WeightedMatCenterZ(object):
"""
Weighted minimum aggregate travel center.
"""
def __init__(self):
self.epsilon = 0.001  # Arbitrarily small value that may be modified as needed
self.points = []

def step(self, point):
self.points.append(point)

def finalize(self):
if len(self.points) == 1:
# Do nothing, just return the point
return self.points

if len(self.points) == 2:
# Return the weighted mean center
return self._first_approximation()

if len(self.points) > 2:
# First calculate weighted mean center as our first approximate
# point, then iterate until we pass the epsilon condition
approximation = self._first_approximation()
while True:
median = self._next_approximation(approximation)
if median.distance(approximation) < self.epsilon:
return median
# median failed epsilon test and becomes our next approximation
approximation = median

def _first_approximation(self):
mean_center = WeightedMeanCenterZ()
for pt in self.points:
mean_center.step(pt)
return mean_center.finalize()

def _next_approximation(self, approximation):
dw_sum, x_dwsum, y_dwsum, z_dwsum = 0.0, 0.0, 0.0, 0.0
for pt in self.points:
distance = pt.distance(approximation)
if distance:  # This condition ensures that distance is not zero
dweight = pt.weight / distance
dw_sum += dweight
x_dwsum += pt.weight * pt.x / distance
y_dwsum += pt.weight * pt.y / distance
z_dwsum += pt.weight * pt.z / distance
x = x_dwsum / dw_sum
y = y_dwsum / dw_sum
z = z_dwsum / dw_sum
return WeightedPointZ(x=x, y=y, z=z)
```

### Interactive demo

From the text, we expect to get a result equal to (8.58, 5.61).

```>>> from wmatz import WeightedPointZ, WeightedMeanCenterZ, WeightedMatCenterZ
>>> points = {'A': WeightedPointZ(3.0, 0.0, 0.0, 1.0),
...           'B': WeightedPointZ(14.0, 3.0, 0.0, 1.0),
...           'C': WeightedPointZ(10.0, 4.0, 0.0, 1.0),
...           'D': WeightedPointZ(13.0, 11.0, 0.0, 1.0),
...           'E': WeightedPointZ(4.0, 13.0, 0.0, 1.0),
...           'F': WeightedPointZ(0.0, 8.0, 0.0, 1.0)}
>>> mat = WeightedMatCenterZ()
>>> for pt in points:
...     mat.step(points[pt])
...
>>> result = mat.finalize()
>>> result.x, result.y, result.z
(8.58241797604165, 5.607649669684189, 0.0)
```

Looks about right! See the full source on GitHub.