More on Modeling the Last Flight of MH370 with Monte Carlo - Part 1/3

Conor L. Myhrvold


Harvard University, SEAS, IACS , Computational Science & Engineering (CSE)

AM207 : Advanced Scientific Computing: Stochastic Optimization Methods. Monte Carlo Methods for Inference and Data Analysis

Final Project, Spring 2014. Current as of March 31.


Update Explanation: I'm updating from version one, incorporating new info released since I worked on the model. Since the goal is to make this version standalone as well, I've inserted the new material seamlessly without calling too much attention to what came first. I've also eliminated parts which are no longer relevant. For example, I took out plotting all of the runways in the region that a 777 could land on; while cool, the most likely scenario for MH370 now, according to my model and Inmarsat's analysis, is that it crashed and is at the bottom of the Indian Ocean. And, I fixed a few bugs.

Goal: The overall goal of this project is to use publically available information to pinpoint possible locations of flight MH370, a Malaysian Airlines Boeing 777-200ER which went missing in flight on March 8, 2014.

Let's get started. Here are the libraries we need. Note that after installing the Anaconda distribution of Python 2.7, which has IPython, you need to install seaborn and Basemap, which is part of matplotlib but is not included in the default installation. You need Basemap to make the maps -- there's really no other way around it.

In [1]:
#using Anaconda 64 bit distro of Python 2.7, Windows 7 environment (other specs will work though)
#import libraries

#numpy
import numpy as np

#scipy
import scipy as sp
from scipy import stats
from scipy.stats import norm
from scipy.stats import expon

#plotting
%matplotlib inline 
# ^^ make plots appear in IPython notebook instead of separate window
import matplotlib.pyplot as plt
import seaborn as sns
from IPython import display
from mpl_toolkits.basemap import Basemap # use for plotting lat lon & world maps

Overview - MH370

Malaysia Airlines 370 disappeared. A variety of information sources have been obtained, which can clue us into its possible locations. I have read a ton of articles and also reached out to some of the journalists (which I have a background in) and organizations involved, in an effort to get the best data possible to put into our Monte Carlo (MC) simulation. In particular, the New York Times Hong Kong Bureau has been of great assistance to me. Why Monte Carlo? MC is an ideal approach to model where the airplane could be since unlike other air disasters, we don't know a lot about what happened yet. Several weeks have passed and no definitive wreckage has been found, let alone the airplane crash site or how it got there. This is despite a determined search effort.

Getting the Data - Flight MH370 Known Flight Coordinates + Inmarsat Coords

Here are some useful constants we need for converting purposes:

In [2]:
eq_deg_km = 111.32 # number of km/degree at eq Source: http://en.wikipedia.org/wiki/Decimal_degrees
earth_radius = 6371 #in km, http://en.wikipedia.org/wiki/Earth

#Inmarsat satellite information
sat_height = 42170 #Inmarsat satellite height off ground, in meters
elevation_angle = np.radians(40) #elevation angle of satellite; convert degrees to radians. Source: NYT Hong Kong Bureau

Now we'll input the coordinates relevant to us:

In [3]:
# The Inmarsat satellite is at 0,64.5 -- it's geostationary.
inmarsat = [0, 64.5]

#Now we plot the plane's known positions

#Kuala Lumpur International Airport Coordinates: http://www.distancesfrom.com/my/Kuala-Lumpur-Airport-(KUL)-Malaysia-latitude-longitude-Kuala-Lumpur-Airport-(KUL)-Malaysia-latitude-/LatLongHistory/3308940.aspx
kualalumpur = [2.7544829, 101.7011363]

#Pulau Perak coordinates: http://tools.wmflabs.org/geohack/geohack.php?pagename=Pulau_Perak&params=5_40_50_N_98_56_27_E_type:isle_region:MY
# http://en.wikipedia.org/wiki/Perak_Island -> Indonesia military radar detected near island
pulauperak = [5.680556, 98.940833]

# Igari Waypoint. Source: # http://www.fallingrain.com/waypoint/SN/IGARI.html Given in hours,minutes,seconds.
igariwaypoint = [6. + 56./60. + 12./3600., 103. + 35./60. + 6./3600.] 

print "inmarsat lat/lon:", inmarsat[0], inmarsat[1]
print "kuala lumpur int'l airport coord lat/lon:", kualalumpur[0],kualalumpur[1]
print "pulau perak lat/lon:", pulauperak[0],pulauperak[1]
print "igari waypoint lat/lon:", igariwaypoint[0],igariwaypoint[1]
inmarsat lat/lon: 0 64.5
kuala lumpur int'l airport coord lat/lon: 2.7544829 101.7011363
pulau perak lat/lon: 5.680556 98.940833
igari waypoint lat/lon: 6.93666666667 103.585

Note that we don't use the Kuala Lumpur airport coordinates in our actual simulations. We do use the Igari Waypoint coordinate, however, in conjunction with the Pulau Perak coordinates to generate our initial heading. (Which was done in a Mathematica notebook, separate to this file, but could be easily replicable given that I provide the coordinates for both.) By changing the distribution see how sensitive this initial heading will be to our final results. However, I'll point out that irrespective of the sensitivity, this is our best guess of where the plane was last headed.

Plotting the Data

Before we plot the data, let's formally define our latitude and longitude bounds for our map, up front. That way, we can more readily change them across the document without manually editing one by one. These will double duty to make our probabilities as well:

In [4]:
#create lat/long grid of distance

lat_min = -50 #50 S
lat_max = 50  #50 N
lon_min = 50  #50 E
lon_max = 140 #130 E
lat_space = 5 #spacing for plotting latitudes and longitudes
lon_space = 5

Now we'll make a map with those points on it, using Basemap, an extension of Matplotlib:

In [5]:
#Set figure size
fig = plt.figure(figsize=[30,20])

#Setup Basemap
fig = Basemap(width=10000000,height=13000000,projection='lcc',resolution='c',lat_0=10,lon_0=90,suppress_ticks=True)

#Draw coasts
fig.drawcoastlines()

#Draw boundary
fig.drawmapboundary(fill_color='lightblue')

#Fill background
fig.fillcontinents(color='#FFD39B',lake_color='lightblue')

#Draw parallels
parallels = np.arange(lat_min,lat_max,lat_space)
fig.drawparallels(np.arange(lat_min,lat_max,lat_space),labels=[1,1,0,1], fontsize=15)

#Draw meridians
meridians = np.arange(lon_min,lon_max,lon_space)
fig.drawmeridians(np.arange(lon_min,lon_max,lon_space),labels=[1,1,0,1], fontsize=15)

#Translate coords into map coord system to plot

#Known 777 Locs
x,y = fig(kualalumpur[1],kualalumpur[0]) #plotted as lon,lat NOT lat,lon -- watch out!!
x2,y2 = fig(igariwaypoint[1],igariwaypoint[0])
x3,y3 = fig(pulauperak[1],pulauperak[0])

#Inmarsat Satellite Loc
x4,y4 = fig(inmarsat[1],inmarsat[0])

# plot coords w/ filled circles
fig.plot(x,y,'bo',markersize=10,label='MH370 Flight Path')
fig.plot(x2,y2,'bo',markersize=10)
fig.plot(x3,y3,'go',markersize=10,label='MH370 Last Known Coords')
fig.plot(x4,y4,'ro',markersize=10,label='Inmarsat 3-F1')

#Draw arrows showing flight path
arrow1 = plt.arrow(x,y,x2-x,y2-y,linewidth=3,color='blue',linestyle='dashed',label='flight path')
arrow2 = plt.arrow(x2,y2,x3-x2,y3-y2,linewidth=3,color='blue',linestyle='dashed',label='flight path')

#Make legend
legend = plt.legend(loc='lower left',fontsize=10,frameon=True,title='Legend',markerscale=1,prop={'size':15})
legend.get_title().set_fontsize('20')

#Add title
plt.title('Initial Map & Data', fontsize=30)

#Show below
plt.show()

This map is the basis for more complicated maps showing the plane's possible position at a given point in time, along with its flight path. The middle of the South Indian Ocean in this map is basically the middle of nowhere, but from our results this region of the South Indian Ocean, especially further down, is a very interesting area to search.

Satellite Ping Data -- the 5th Ping

Now we plot and incorporate the ping data from the satellite. Basically it's going to be within a big arc from the satellite. We actually don't know to the precision that Malayasia authorities claim to know. That's due to the uncertainty involved in inferring the geolocation from the engine ping. Really, it was never supposed to be used for location -- that data Malaysia Airlines did not subscribe for, so we are left with fairly substantial error in where this places the plane at the last ping mark. The other 4 pings would be critical, but those haven't been publically released, although we are now hearing that the ping time was not every hour but actually got longer over time, which indicates to some extent that MH370 moved away from the satellite (from a radial distance perspective) over time.

In [6]:
#To get the satellite calculate we have to solve several equations which were computed on Mathematica
"""
Computes the ping arc distance from the satellite to the plane.
Returns the angle in degrees.
"""
def satellite_calc(radius,orbit,angle):
    interim = (np.sqrt(-radius**2+orbit**2*(1./np.cos(angle)**2))-orbit*np.tan(angle))/np.float(orbit+radius)
    return np.degrees(2*np.arctan(interim))

ping_arc_dist = satellite_calc(earth_radius,sat_height,elevation_angle)
print "ping arc distance in degrees:", ping_arc_dist

dist_from_sat = earth_radius*np.radians(satellite_calc(earth_radius,sat_height,elevation_angle))
print "distance from satellite", dist_from_sat
ping arc distance in degrees: 43.3540831616
distance from satellite 4820.7540969

There are two types of approximations we can make, for a given radius of the geostationary satellite or the airplane. The first is a simple circle, which is pretty good. However, Earth is not a sphere, and so at large radii this approximation is not ideal. So I write a more complicated function to reflect the ellipse-like shape the arc will really be like. I don't include the spherical trigonometry / mathematical derivations (done in a Mathematica notebook), since they are not in the scope of finding MH370.

In [7]:
"""
write circle function. generates x&y pairs in a circle, 360 degrees
angle in degrees, number of points to put in circle, satellite location lat/lon
"""
def make_circle1(radius,pts,lon_loc,lat_loc): 
    coords_array = np.zeros((pts, 2))
    for i in xrange(pts):
        coords_array[i][0] =  radius*np.cos(np.radians(i)) + lat_loc 
        coords_array[i][1] =  radius*np.sin(np.radians(i)) + lon_loc 
    return coords_array

"""
write ellipse function
since across the earth it's not actually a circle
derived from spherical trigonometry
"""
def make_circle(radius,pts,lon_loc,lat_loc):
    coords_array = np.zeros((pts, 2))
    for i in xrange(pts):
        coords_array[i][0] =  radius*np.cos(np.radians(i)) + lat_loc 
        coords_array[i][1] =  radius*np.sin(np.radians(i))/(np.cos(np.cos(np.radians(radius))*(lat_loc+radius*np.cos(np.radians(i)))*(np.pi/180.))) + lon_loc 
    return coords_array

Now we test the two approximations to better understand how they compare to each other:

In [8]:
#test near the equator, -5 (5 S)

test = make_circle(8.12971613367,360,90,-5)

test_lat = []
test_lon = []

for i in xrange(len(test)):
    test_lat.append(test[i][0])
    test_lon.append(test[i][1])

test1 = make_circle1(8.12971613367,360,90,-5)

test_lat1 = []
test_lon1 = []

for i in xrange(len(test1)):
    test_lat1.append(test1[i][0])
    test_lon1.append(test1[i][1])

#create figure
plt.figure()
plt.plot(test_lon1,test_lat1,color='red',label='simple circle')
plt.plot(test_lon,test_lat,color='blue',label='ellipsoid')
legend = plt.legend(loc='lower left',fontsize=8,frameon=True,markerscale=1,prop={'size':5})
plt.title('comparing circle approximations to each other')
plt.legend()
plt.show()
In [9]:
#test at a low latitude, -40 (40 S)

test = make_circle(8.12971613367,360,90,-40)

test_lat = []
test_lon = []

for i in xrange(len(test)):
    test_lat.append(test[i][0])
    test_lon.append(test[i][1])

test1 = make_circle1(8.12971613367,360,90,-40)

test_lat1 = []
test_lon1 = []

for i in xrange(len(test1)):
    test_lat1.append(test1[i][0])
    test_lon1.append(test1[i][1])

#create figure
plt.figure()
plt.plot(test_lon1,test_lat1,color='red',label='simple circle')
plt.plot(test_lon,test_lat,color='blue',label='ellipsoid')
legend = plt.legend(loc='lower left',fontsize=8,frameon=True,markerscale=1,prop={'size':5})
plt.title('comparing circle approximations to each other')
plt.legend()
plt.show()

Above we see the difference between the two circles. Remember that at the equator, 1 degree of latitude and longitude stay the same, but at the poles, the degrees of latitude contract tremendously. The ellipse we have created is slightly better at the equator, but much better at lower latitudes. And that's where it matters most to find our plane, if our first version is any guide (either one -- even the one with bugs, which basically had the lower latitude plot, near the equator, which caused that lower latitude plot to be out of line for a small fraction of the make_vector dots.)

In [10]:
circle_pts = make_circle(ping_arc_dist,360,64.5,0)
#print circle_pts

circle_lat = []
for i in xrange(len(circle_pts)):
    circle_lat.append(circle_pts[i][0])

circle_lon = []
for i in xrange(len(circle_pts)):
    circle_lon.append(circle_pts[i][1])

print circle_lat[0:10]
print "-------------------"
print circle_lon[0:10]
[43.354083161614668, 43.347480124758938, 43.32767302554074, 43.294667897394362, 43.248474794001574, 43.189107786229151, 43.116584957842761, 43.030928399998473, 42.93216420451359, 42.820322455918792]
-------------------
[64.5, 65.387581023937116, 66.274618176340184, 67.16056873867629, 68.044892293124406, 68.927051859745077, 69.806515017937599, 70.682755007130297, 71.555251801804673, 72.423493156143252]

Incorporating New Info on Ping Distance Error

Right after my model was done, on March 24, excellent analysis from the TMF Associates article Understanding the "satellite ping" conclusion... was published, pointed out that the ping distance error would be 1 or 2 degrees. Essentially, the biggest uncertainty for the ping arc is the lag time, which is apparently on the order of 300 microseconds, which round trip is 600 microseconds according to the linked calculation (recall that the plane "saw" the satellite at a 40 degree angle, which on the graph is 600 microseconds.) It's 1 or 2 degrees because it hasn't been established whether the measurement by the satellite is done one-way or round trip.

In [11]:
print "Percent error 1 way is:", (1./40)*100
print "Percent error round trip is:", (2./40)*100
Percent error 1 way is: 2.5
Percent error round trip is: 5.0

Thus, I will use 2.5 and 5% error. (Fortuitously, I'd picked 5% before! So with the advent of this extra information and analysis, that number is spot on, for one ping. Which also means 10 and 20% were reasonable if you wanted to cover an area that included multiple pings, instead of just using the 5th ping.) Now what we'll do is vary the distance from the satellite (radius) by 5% and 2.5% to build in error in where the actually was (the geostationary satellite is assumed to essentially have no error in its position, relative to the error in knowing where the plane is).

Inmarsat 5% error

In [12]:
err1_5per = 0.95*ping_arc_dist
err2_5per = 1.05*ping_arc_dist

circle_pts_err1_5per = make_circle(err1_5per,360,64.5,0)
circle_pts_err2_5per = make_circle(err2_5per,360,64.5,0)

circle_lon_err1_5per = []
circle_lon_err2_5per = []
circle_lat_err1_5per = []
circle_lat_err2_5per = []

for i in xrange(len(circle_pts_err1_5per)):
    circle_lon_err1_5per.append(circle_pts_err1_5per[i][1])
    
for i in xrange(len(circle_pts_err2_5per)):
    circle_lon_err2_5per.append(circle_pts_err2_5per[i][1])
    
for i in xrange(len(circle_pts_err1_5per)):
    circle_lat_err1_5per.append(circle_pts_err1_5per[i][0])
    
for i in xrange(len(circle_pts_err2_5per)):
    circle_lat_err2_5per.append(circle_pts_err2_5per[i][0])

Inmarsat 2.5% error

In [13]:
err1_2per = 0.975*ping_arc_dist
err2_2per = 1.025*ping_arc_dist

circle_pts_err1_2per = make_circle(err1_2per,360,64.5,0)
circle_pts_err2_2per = make_circle(err2_2per,360,64.5,0)

circle_lon_err1_2per = []
circle_lon_err2_2per = []
circle_lat_err1_2per = []
circle_lat_err2_2per = []

for i in xrange(len(circle_pts_err1_2per)):
    circle_lon_err1_2per.append(circle_pts_err1_2per[i][1])
    
for i in xrange(len(circle_pts_err2_2per)):
    circle_lon_err2_2per.append(circle_pts_err2_2per[i][1])
    
for i in xrange(len(circle_pts_err1_2per)):
    circle_lat_err1_2per.append(circle_pts_err1_2per[i][0])
    
for i in xrange(len(circle_pts_err2_2per)):
    circle_lat_err2_2per.append(circle_pts_err2_2per[i][0])

Now we make a map plot showing these circles. The dashed lines away from the solid red line is the error, 2.5%.

In [14]:
#Set figure size
fig = plt.figure(figsize=[30,20])

#Setup Basemap
fig = Basemap(width=10000000,height=13000000,projection='lcc',resolution='c',lat_0=10,lon_0=90,suppress_ticks=True)

#Draw coasts
fig.drawcoastlines()

#Draw boundary
fig.drawmapboundary(fill_color='lightblue')

#Fill background
fig.fillcontinents(color='#FFD39B',lake_color='lightblue')

#Draw parallels
parallels = np.arange(lat_min,lat_max,lat_space)
fig.drawparallels(np.arange(lat_min,lat_max,lat_space),labels=[1,1,0,1], fontsize=15)

#Draw meridians
meridians = np.arange(lon_min,lon_max,lon_space)
fig.drawmeridians(np.arange(lon_min,lon_max,lon_space),labels=[1,1,0,1], fontsize=15)

#Translate coords into map coord system to plot

#Known 777 Locs
x,y = fig(kualalumpur[1],kualalumpur[0]) #plotted as lon,lat NOT lat,lon -- watch out!!
x2,y2 = fig(igariwaypoint[1],igariwaypoint[0])
x3,y3 = fig(pulauperak[1],pulauperak[0])

#Inmarsat Satellite Loc
x4,y4 = fig(inmarsat[1],inmarsat[0])

#Add circle coords -- 2.5% error
x5,y5 = fig(circle_lon,circle_lat)
x6,y6 = fig(circle_lon_err1_2per,circle_lat_err1_2per)
x7,y7 = fig(circle_lon_err2_2per,circle_lat_err2_2per)
x8,y8 = fig(circle_lon_err1_5per,circle_lat_err1_5per)
x9,y9 = fig(circle_lon_err2_5per,circle_lat_err2_5per)

#Draw circle showing extent of Inmarsat sat radar detection
fig.plot(x5,y5,'r-',markersize=5,label='Inferred MH370 Location from Satellite, 5th Ping')
fig.plot(x6,y6,'r--',markersize=5,label='with 2.5 & 5% error')
fig.plot(x7,y7,'r--',markersize=5)
fig.plot(x8,y8,'r--',markersize=5)
fig.plot(x9,y9,'r--',markersize=5)

# plot coords w/ filled circles
fig.plot(x,y,'bo',markersize=10,label='MH370 Flight Path')
fig.plot(x2,y2,'bo',markersize=10)
fig.plot(x3,y3,'go',markersize=10,label='MH370 Last Known Coords')
fig.plot(x4,y4,'ro',markersize=10,label='Inmarsat 3-F1')

#Draw arrows showing flight path
arrow1 = plt.arrow(x,y,x2-x,y2-y,linewidth=3,color='blue',linestyle='dashed',label='flight path')
arrow2 = plt.arrow(x2,y2,x3-x2,y3-y2,linewidth=3,color='blue',linestyle='dashed',label='flight path')

#Make legend
legend = plt.legend(loc='lower left',fontsize=10,frameon=True,title='Legend',markerscale=1,prop={'size':15})
legend.get_title().set_fontsize('20')

#Add title
plt.title('Inmarsat Ping Estimation', fontsize=30)

#Show below
plt.show()