% Performance benchmark
In this chapter I will explain the approach taken to improve the performance of a simplification algorithm in a web browser via WebAssembly. The go-to library for this kind of operation is Simplify.js. It is a JavaScript implementation of the Douglas-Peucker algorithm with optional radial distance preprocessing. The library will be rebuilt in the C programming language and compiled to WebAssembly with Emscripten. A web page is built to run benchmarks that compare the performance of the two approaches.
\subsection{State of the art: Simplify.js}
% simplifyJS + turf
Simplify.js calls itself a ``tiny high-performance JavaScript polyline simplification library''. It was extracted from Leaflet, the ``leading open-source JavaScript library for mobile-friendly interactive maps''. Due to its usage in Leaflet and Turf.js, a geospatial analysis library, it is the most commonly used library for polyline simplification.
The library itself currently has 20,066 weekly downloads, while the Turf.js derivative \texttt{@turf/simplify} has 30,389.
The Douglas-Peucker algorithm is implemented with an optional radial distance preprocessing routine. This preprocessing trades quality for performance: it speeds up the simplification at the cost of a less accurate result. Consequently, the mode that disables this routine is called ``highest quality''.
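The idea behind radial distance preprocessing can be illustrated with a minimal sketch. It is not the Simplify.js source; the function names \texttt{squaredDistance} and \texttt{radialDistanceFilter} are hypothetical and chosen for illustration only. A point is kept only if it lies farther than the tolerance from the last kept point, so clusters of nearby points are thinned out before the more expensive Douglas-Peucker step runs.
\begin{lstlisting}[language=javascript]
// Hypothetical helper (not taken from Simplify.js):
// squared Euclidean distance between two {x, y} points.
function squaredDistance(a, b) {
  const dx = a.x - b.x;
  const dy = a.y - b.y;
  return dx * dx + dy * dy;
}

// Keeps a point only if it is farther than `tolerance`
// away from the previously kept point.
function radialDistanceFilter(points, tolerance) {
  if (points.length === 0) return [];
  const sqTolerance = tolerance * tolerance;
  let last = points[0];
  const kept = [last];
  for (let i = 1; i < points.length; i++) {
    if (squaredDistance(points[i], last) > sqTolerance) {
      last = points[i];
      kept.push(last);
    }
  }
  // Always preserve the end point of the polyline.
  const end = points[points.length - 1];
  if (kept[kept.length - 1] !== end) kept.push(end);
  return kept;
}
\end{lstlisting}
Comparing squared distances avoids a square root per point, which is a common choice for this kind of filter.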
Interestingly, the library expects coordinates to be a list of objects with \texttt{x} and \texttt{y} properties. \todo{reference object vs array form} GeoJSON and TopoJSON, however, store polylines in nested array form. Luckily, since the library is small and written in JavaScript, any skilled web developer can easily fork and modify the code for their own purposes. This is even pointed out in the source code. The fact that Turf.js, which can be seen as a convenience wrapper for processing GeoJSON data, decided to keep the library as is might indicate a performance benefit of this format. Listing \ref{lst:turf-transformation} shows how Turf.js calls Simplify.js. Instead of altering the source code, the data is transformed back and forth between the two formats on each call. It is questionable whether this practice is advisable at all.
\lstinputlisting[
float, floatplacement=H,
language=javascript,
firstline=116, lastline=122,
caption=Turf.js usage of simplify.js,
label=lst:turf-transformation
]{../lib/turf-simplify/index.js}
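To make the two coordinate representations concrete, the following sketch converts a small polyline between them. It is an illustration only, not the Turf.js code; the helper names \texttt{toObjects} and \texttt{toNested} and the sample coordinates are assumptions.
\begin{lstlisting}[language=javascript]
// A polyline in nested-array form, as used for GeoJSON coordinates.
// The values are made-up sample data.
const nested = [[13.4, 52.5], [13.5, 52.6], [13.7, 52.4]];

// Hypothetical helper: nested arrays -> array of {x, y} objects.
function toObjects(coords) {
  return coords.map(([x, y]) => ({ x, y }));
}

// Hypothetical helper: array of {x, y} objects -> nested arrays.
function toNested(points) {
  return points.map(({ x, y }) => [x, y]);
}

// Each conversion allocates one new object or array per point.
const objects = toObjects(nested);
const roundTrip = toNested(objects);
\end{lstlisting}
Both conversions run in linear time but allocate a new container per point, which is the overhead incurred on every call when the data is converted in both directions.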
Since it is not clear which case is faster, and given how simple the required changes are, two versions of Simplify.js will be tested: the original version, which expects the coordinates to be in array-of-objects form, and the altered version, which operates on nested arrays. Listing \ref{lst:diff-simplify.js} shows an extract of the changes made to the library. Instead of using properties, the coordinate values are accessed by index. Except for the removal of the licensing header, the alterations are restricted to this kind of change. The full list of changes can be viewed in \texttt{lib/simplify-js-alternative/simplify.diff}.
\lstinputlisting[
float, floatplacement=H,
language=diff,
firstline=11, lastline=16,
caption=Snippet of the difference between the original Simplify.js and the alternative version,
label=lst:diff-simplify.js
]{../lib/simplify-js-alternative/simplify.diff}
\subsection{The WebAssembly solution}
Just like Simplify.js, the WebAssembly solution requires the data to be transformed before processing. In this case that means storing bytes into the module heap and loading them back from it. These transformations, however, are costly and not as easy to avoid. In a larger project the data may already be managed inside a WebAssembly module, so the raw execution time might be relevant as well. To allow conclusions about the real-world usage of WebAssembly in this case, there will be separate measurements for the storing and loading of data and for the execution.
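The separation into a storing, an execution and a loading phase can be sketched as follows. This is a simplified illustration under stated assumptions, not the benchmark implementation: a plain \texttt{ArrayBuffer} stands in for the module heap, coordinates are assumed to be stored as consecutive 64-bit floats, and the names \texttt{storeCoords}, \texttt{loadCoords} and \texttt{timed} are hypothetical.
\begin{lstlisting}[language=javascript]
// Stand-in for the linear memory of a WebAssembly module
// (an assumption; a real module exposes its own memory buffer).
const memory = new ArrayBuffer(1024);
const heap = new Float64Array(memory);

// Store: flatten [[x, y], ...] into the heap at element `offset`.
function storeCoords(coords, offset) {
  for (let i = 0; i < coords.length; i++) {
    heap[offset + 2 * i] = coords[i][0];
    heap[offset + 2 * i + 1] = coords[i][1];
  }
  return coords.length;
}

// Load: rebuild the nested array from `count` points at `offset`.
function loadCoords(offset, count) {
  const coords = [];
  for (let i = 0; i < count; i++) {
    coords.push([heap[offset + 2 * i], heap[offset + 2 * i + 1]]);
  }
  return coords;
}

// Times a single phase. Date.now() keeps the sketch portable;
// a real benchmark would use a higher-resolution timer.
function timed(fn) {
  const start = Date.now();
  const result = fn();
  return { result, ms: Date.now() - start };
}

const input = [[0, 0], [1, 1], [2, 0]];
const store = timed(() => storeCoords(input, 0));
// The call into the compiled routine would be timed here.
const load = timed(() => loadCoords(0, store.result));
\end{lstlisting}
Timing each phase on its own makes it possible to report the raw execution time separately from the cost of moving data into and out of the module.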
\subsection{The implementation}