This commit is contained in:
Alfred Melch 2019-07-25 09:53:29 +02:00
parent 283f685dc6
commit de5d54e62c
11 changed files with 223 additions and 61 deletions

\subsubsection{Topological aspects}
\subsection{LineString simplification}
\subsubsection{Positional errors}
\subsubsection{Length errors}
\subsubsection{Area errors}
\subsection{Runtimes on the Web}
\subsubsection{WebAssembly}
\subsection{Coordinate representation}
\paragraph{in JavaScript}
\paragraph{in C}
\paragraph{in C++}
\subsection{Data Formats}
\subsubsection{GeoJSON}
\subsubsection{TopoJSON}

% Performance benchmark
In this chapter I will explain the approach to improving the performance of a simplification algorithm in the web browser via WebAssembly. The go-to library for this kind of operation is Simplify.js. It is a JavaScript implementation of the Douglas-Peucker algorithm with optional radial distance preprocessing. The library will be rebuilt in the C programming language and compiled to WebAssembly with Emscripten. A web page is built to produce benchmarking insights and compare the two approaches performance-wise.
\subsection{State of the art: Simplify.js}
\label{sec:simplify.js}
% Simplify.JS + turf
Simplify.js calls itself a "tiny high-performance JavaScript polyline simplification library"\footnote{\path{https://mourner.github.io/simplify-js/}}. It was extracted from Leaflet, the "leading open-source JavaScript library for mobile-friendly interactive maps"\footnote{\path{https://leafletjs.com/}}. Due to its usage in Leaflet and Turf.js, a geospatial analysis library, it is the most commonly used library for polyline simplification. The library itself currently has 20,066 weekly downloads while the Turf.js derivative @turf/simplify has 30,389. Turf.js maintains an unmodified fork of the library in its own repository. \todo{So numbers can be added} \todo{leaflet downloads}
The Douglas-Peucker algorithm is implemented with an optional radial distance preprocessing routine. This preprocessing trades quality for performance. Thus the mode that disables this routine is called highest quality.
Interestingly the library expects coordinates to be a list of objects with x and y properties. \todo{reference object vs array form} GeoJSON and TopoJSON however store coordinates in nested array form. Luckily, since the library is small and written in JavaScript, any skilled web developer can easily fork and modify the code for their own purposes. This is even pointed out in the source code. The fact that Turf.js, which can be seen as a convenience wrapper for processing GeoJSON data, decided to keep the library as is might indicate some benefit to this format. Listing \ref{lst:turf-transformation} shows how Turf.js calls Simplify.js. Instead of altering the source code, the data is transformed back and forth between the two formats on each call. It is questionable if this practice is advisable at all.
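The two coordinate formats and the conversion performed around each call can be sketched as follows (a minimal illustration with invented helper names, not the actual Turf.js code):

```javascript
// Hypothetical helpers illustrating the two coordinate formats and
// the round trip performed around each Simplify.js call.
function toObjectForm(coords) {
  // nested arrays -> array of { x, y } objects
  return coords.map(([x, y]) => ({ x, y }));
}

function toArrayForm(points) {
  // array of { x, y } objects -> nested arrays
  return points.map((p) => [p.x, p.y]);
}

// A GeoJSON-style line in nested-array form ...
const line = [[0, 0], [1, 0], [1, 1], [0, 0]];
// ... converted on the way in and converted back on the way out.
const roundTrip = toArrayForm(toObjectForm(line));
```

The round trip reproduces the input exactly, but it allocates two intermediate arrays per call, which is precisely why the practice is questionable for large inputs.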
\lstinputlisting[
float=htbp,
label=lst:turf-transformation
]{../lib/turf-simplify/index.js}
Since it is not clear which case is faster, and given how simple the required changes are, two versions of Simplify.js will be tested: the original version, which expects the coordinates in array-of-objects format, and the altered version, which operates on nested arrays. Listing \ref{lst:diff-simplify.js} shows an extract of the changes performed on the library. Instead of using properties, the coordinate values are accessed by index. Except for the removal of the licensing header the alterations are restricted to this kind of change. The full list of changes can be viewed in \path{lib/simplify-js-alternative/simplify.diff}.
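The flavor of these changes can be illustrated with the squared-distance helper (a sketch modeled on the Simplify.js source; the array variant represents the alteration):

```javascript
// Original style: points are objects with x and y properties.
function getSqDistObjects(p1, p2) {
  const dx = p1.x - p2.x;
  const dy = p1.y - p2.y;
  return dx * dx + dy * dy;
}

// Altered style: points are two-element arrays, accessed by index.
function getSqDistArrays(p1, p2) {
  const dx = p1[0] - p2[0];
  const dy = p1[1] - p2[1];
  return dx * dx + dy * dy;
}

// Both variants agree on the same pair of points.
const d1 = getSqDistObjects({ x: 0, y: 0 }, { x: 3, y: 4 });
const d2 = getSqDistArrays([0, 0], [3, 4]);
// d1 === d2 === 25
```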
\lstinputlisting[
float=htbp,
label=lst:diff-simplify.js,
caption={Extract of the alterations made to Simplify.js}
]{../lib/simplify-js-alternative/simplify.diff}
\subsection{The WebAssembly solution}
\label{sec:benchmark-webassembly}
In the scope of this thesis a library will be created that implements the same procedure as Simplify.js in C code. It will be made available on the web platform through WebAssembly. In the style of the model library it will be called Simplify.wasm. The compiler used will be Emscripten, as it is the standard for porting C code to WebAssembly.
As mentioned, the first step is to port Simplify.js to the C programming language. The file \path{lib/simplify-wasm/simplify.c} shows the attempt. It is kept as close to the JavaScript library as possible. This may result in a C-untypical coding style but prevents skewed results from unexpected optimizations to the procedure itself. The entry point is not the \texttt{main} function but a function called \texttt{simplify}. This is specified to the compiler as can be seen in listing \ref{lst:simplify-wasm-compiler-call}.
\lstinputlisting[
float=htpb,
language=bash,
% firstline=2, lastline=3,
label=lst:simplify-wasm-compiler-call,
caption={The compiler call}
]{../lib/simplify-wasm/Makefile}
\todo{More about the compiler call}
Furthermore the functions \texttt{malloc} and \texttt{free} from the standard library are made available to the host environment. Compiling the code through Emscripten produces a binary file in wasm format and the glue code as JavaScript. These files are called \texttt{simplify.wasm} and \texttt{simplify.js} respectively.
An example usage can be seen in \path{lib/simplify-wasm/example.html}. Even though the memory access is abstracted in this example, the process is still unwieldy and far from a drop-in replacement for Simplify.js. Thus in \path{lib/simplify-wasm/index.js} a further abstraction over the Emscripten-emitted code was written. The exported function \texttt{simplifyWasm} handles module instantiation, memory access and the correct call to the exported wasm function. Finding the correct path to the wasm binary is however not always clear when the code is imported from another location. The proposed solution is to leave the resolving of the code path to an asset bundler that processes the file in a preprocessing step.
\lstinputlisting[
float=htpb,
firstline=22, lastline=33,
label=lst:simplify-wasm
]{../lib/simplify-wasm/index.js}
Listing \ref{lst:simplify-wasm} shows the function \texttt{simplifyWasm}. Further explanation will follow regarding the abstractions \texttt{getModule}, \texttt{storeCoords} and \texttt{loadResultAndFreeMemory}.
\paragraph{Module instantiation} will be done on the first call only but requires the function to be asynchronous. For a neater experience in handling Emscripten modules a utility function named \texttt{initEmscripten}\footnote{\path{/lib/wasm-util/initEmscripten.js}} was written to turn the module factory into a JavaScript Promise that resolves on finished compilation. The usage of this function can be seen in listing \ref{lst:simplify-wasm-emscripten-module}. The resulting WebAssembly module is cached in the variable \texttt{emscriptenModule}.
\lstinputlisting[
float=htbp,
caption={Instantiating the Emscripten module},
label=lst:simplify-wasm-emscripten-module
]{../lib/simplify-wasm/index.js}
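The caching idea can be reduced to the following sketch (the Emscripten module factory is stubbed out; names are illustrative, not the actual implementation):

```javascript
// Cache for the resolved module promise; null until the first call.
let cachedModule = null;

// Invoke the factory once, wrap its result in a promise, and hand the
// same promise to every subsequent caller.
function initOnce(moduleFactory) {
  if (!cachedModule) cachedModule = Promise.resolve(moduleFactory());
  return cachedModule;
}

// Stub factory standing in for the Emscripten-emitted one.
let instantiations = 0;
const factory = () => {
  instantiations += 1;
  return { ready: true };
};

const first = initOnce(factory);
const second = initOnce(factory);
// first === second, and the factory ran exactly once.
```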
\paragraph{Storing coordinates} into the module memory is done in the function \texttt{storeCoords}. Emscripten offers multiple views on the module memory. These correspond to the available WebAssembly data types (e.g. HEAP8, HEAPU8, HEAPF32, HEAPF64, ...)\footnote{\path{https://emscripten.org/docs/api_reference/preamble.js.html#type-accessors-for-the-memory-model}}. As JavaScript numbers are always represented in a double-precision 64-bit binary format\footnote{\path{https://www.ecma-international.org/ecma-262/6.0/#sec-4.3.20}} (IEEE 754-2008), the HEAPF64 view is the way to go to not lose precision. Accordingly the data type double is used in C to work with the data. Listing \ref{lst:wasm-util-store-coords} shows the transfer of coordinates into the module memory. In line 3 the memory is allocated using the exported \texttt{malloc} function. A JavaScript TypedArray is used for accessing the buffer such that the loop for storing the values (lines 5 to 8) is trivial.
\lstinputlisting[
float=tbph,
label=lst:wasm-util-store-coords
\todo{Check for coords length < 2}
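A simplified model of this transfer, with the module memory replaced by a plain \texttt{Float64Array} (in the real code the buffer is obtained via the exported \texttt{malloc} and accessed through the HEAPF64 view):

```javascript
// Write an n x 2 coordinate array into a flat double buffer,
// x and y values interleaved, starting at the given index.
function storeCoords(heapF64, offset, coords) {
  for (let i = 0; i < coords.length; i++) {
    heapF64[offset + 2 * i] = coords[i][0];
    heapF64[offset + 2 * i + 1] = coords[i][1];
  }
}

const coords = [[13.4, 52.5], [11.6, 48.1]];
const heap = new Float64Array(8); // stands in for the module heap
storeCoords(heap, 0, coords);
// heap now starts with 13.4, 52.5, 11.6, 48.1
```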
\paragraph{To read the result} back from memory we have to look at how the simplification is returned in the C code. Listing \ref{lst:simplify-wasm-entrypoint} shows the entry point of the C code. This is the function that gets called from JavaScript. As expected, arrays are represented as pointers with a corresponding length. The first block of code (lines 2 to 6) only declares the needed variables. Lines 8 to 12 mark the radial distance preprocessing. The result of this simplification is stored in an auxiliary array named \texttt{resultRdDistance}. In this case \texttt{points} will have to point to the new array and the length is adjusted. Finally the Douglas-Peucker procedure is invoked after reserving enough memory. The auxiliary array can be freed afterwards. The problem now is to return the result pointer and the array length back to the calling code. \todo{Fact check. evtl unsigned} The fact that pointers in Emscripten are represented by an integer will be exploited to return a fixed-size array of two containing the values. A hacky solution but it works. We can now look back at how the JavaScript code reads the result.
\lstinputlisting[
float=tbph,
label=lst:wasm-util-load-result
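How the pointer/length pair could be unpacked on the JavaScript side can be sketched like this (memory simulated with an \texttt{Int32Array}; the real code uses the module's heap views and the address returned by the entry point):

```javascript
// Read the two 32-bit integers (result pointer and length) that the
// C entry point packs into a fixed-size array at resultPtr.
function readResultInfo(heap32, resultPtr) {
  const index = resultPtr >> 2; // byte address -> 32-bit index
  return { pointer: heap32[index], length: heap32[index + 1] };
}

const heap32 = new Int32Array(16); // stands in for the module heap
heap32[2] = 1024; // pretend address of the simplified coordinates
heap32[3] = 12;   // pretend number of doubles written
const info = readResultInfo(heap32, 8); // byte address 8 = index 2
// info.pointer === 1024, info.length === 12
```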
\subsection{The implementation of a web framework}
The performance comparison of the two methods will be realized in a web page. It will be built as a front-end web application that allows the user to specify the input parameters of the benchmark. These parameters are: the polyline to simplify, a range of tolerances to use for simplification and whether the so-called high quality mode shall be used. By building this application it will be possible to test a variety of use cases on multiple devices. Also the behavior of the algorithms can be researched under different preconditions. In the scope of this thesis a few cases will be investigated. The application structure will now be introduced.
\subsubsection{External libraries}
The dynamic aspects of the web page will be built in JavaScript to make it run in the browser. Webpack\footnote{\path{https://webpack.js.org/}} will be used to bundle the application code and to run compilers like Babel\footnote{\path{https://babeljs.io/}} on the source code. As mentioned in section \ref{sec:benchmark-webassembly} the bundler is also useful for handling references to the WebAssembly binary as it resolves the filename to the correct download path. There will intentionally be no transpiling of the JavaScript code to older versions of the ECMA standard. This is often done to increase compatibility with older browsers. Luckily this is not a requirement in this case, and by refraining from this practice there will also be no unintentional impact on the application performance. Libraries in use are Benchmark.js\footnote{\path{https://benchmarkjs.com/}} for statistically significant benchmarking results, React\footnote{\path{https://reactjs.org/}} for building the user interface and Chart.js\footnote{\path{https://www.chartjs.org/}} for drawing graphs.
\subsubsection{The framework}
The web page consists of static and dynamic content. The static parts refer to the header and footer with explanations about the project. Those are written directly into the root HTML document. The dynamic parts are injected by JavaScript. Those will be further discussed in this chapter as they form the main application logic.
\begin{figure}[htbp]
\centering
\fbox{\includegraphics[width=\linewidth]{images/benchmark-uml.jpg}}
\caption{UML diagram of the benchmarking application}
\label{fig:benchmarking-uml}
\end{figure}
The web app is built to test a variety of cases with multiple data points. As mentioned, Benchmark.js will be used for statistically significant results. It is however rather slow, as it needs about 5 to 6 seconds per data point. This is why multiple types of benchmarking methods are implemented. Figure \ref{fig:benchmarking-uml} shows the corresponding UML diagram of the application. One can see the UI components in the top-left corner. The root component is \texttt{App}. It gathers all the internal state of its children and passes state down where it is needed.
In the upper right corner the different use cases are listed. These cases implement a function \texttt{fn} to benchmark. Additional methods for setting up the function and cleaning up afterwards can be implemented as given by the parent class \texttt{BenchmarkCase}. Concrete cases can be created by instantiating one of the benchmark cases with a defined set of parameters. There are three charts that will be rendered using a subset of these cases. These are:
\begin{itemize}
\item \textbf{Simplify.js vs Simplify.wasm} - This chart shows the performance of the simplification by Simplify.js, the altered version of Simplify.js and the newly developed Simplify.wasm. \todo{Cases}
\item \textbf{Simplify.wasm runtime analysis} - To gain further insight into WebAssembly performance, this stacked bar chart shows the runtime of a call to Simplify.wasm. It is partitioned into the time spent preparing the data (\texttt{storeCoords}), the algorithm itself and the time it took to restore the coordinates from memory (\texttt{loadResult}).
\item \textbf{Turf.js method runtime analysis} - The last chart uses a similar structure. This time it analyses the performance impact of the back and forth transformation of data used in Turf.js. \todo{Cases}
\end{itemize}
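The case hierarchy described above can be sketched as follows (a hypothetical reduction of the UML diagram; only the hook structure is taken from the text, the concrete case is illustrative):

```javascript
// Base class: optional setUp/tearDown hooks around the fn under test.
class BenchmarkCase {
  constructor(name) {
    this.name = name;
  }
  setUp() {}
  fn() {
    throw new Error("fn must be implemented by a concrete case");
  }
  tearDown() {}
}

// One concrete case, parameterized with its input data.
class SimplifyJsCase extends BenchmarkCase {
  constructor(points, tolerance) {
    super("Simplify.js");
    this.points = points;
    this.tolerance = tolerance;
  }
  fn() {
    // A real case would call simplify(this.points, this.tolerance);
    // a trivial point filter stands in here.
    return this.points.filter((_, i) => i % 2 === 0);
  }
}

const demo = new SimplifyJsCase([[0, 0], [1, 1], [2, 2]], 0.5);
```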
On the bottom the different types of benchmarks implemented can be seen. They all implement the abstract \texttt{measure} function to return the mean time to run a function specified in the given \texttt{BenchmarkCase}. The \texttt{IterationsBenchmark} runs the function a specified number of times, while the \texttt{OpsPerTimeBenchmark} always runs for a certain amount of milliseconds to fit in as many iterations as possible. Both methods have their benefits and drawbacks. Using the iterations approach one cannot determine the time the benchmark runs beforehand. With fast devices and a small number of iterations one can even fall into the trap of the duration falling under the accuracy of the timer used. Those results would of course be unusable. It is however a very fast way of determining the speed of a function, and it proves valuable for getting a first approximation of how the algorithms perform over the span of data points. The second type, the operations per time benchmark, seems to overcome this problem. It is however prone to garbage collection, engine optimizations and other background processes.\footnote{\path{https://calendar.perfplanet.com/2010/bulletproof-javascript-benchmarks/}}
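The two measuring strategies can be sketched like this (simplified, without warm-up or timer-resolution safeguards; names follow the text, details are assumed):

```javascript
// Strategy 1: fixed iteration count, unknown total duration.
function measureIterations(fn, iterations, now = Date.now) {
  const start = now();
  for (let i = 0; i < iterations; i++) fn();
  return (now() - start) / iterations; // mean ms per call
}

// Strategy 2: fixed time budget, unknown iteration count.
function measureOpsPerTime(fn, budgetMs, now = Date.now) {
  const start = now();
  let ops = 0;
  while (now() - start < budgetMs) {
    fn();
    ops++;
  }
  return (now() - start) / ops; // mean ms per call
}

const meanMs = measureIterations(() => Math.sqrt(2), 1000);
const meanMs2 = measureOpsPerTime(() => Math.sqrt(2), 5);
```

With strategy 1 and a fast function, `meanMs` can collapse to 0 because the elapsed time falls under the timer resolution, which is exactly the trap described above.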
Benchmark.js combines these approaches. In a first step it approximates the runtime in a few cycles. From this value it calculates the number of iterations needed to reach an uncertainty of at most 1\%. Then the samples are gathered.\footnote{\path{http://monsur.hossa.in/2012/12/11/benchmarkjs.html}} \todo{more}
\todo[inline]{BenchmarkType}
\todo[inline]{BenchmarkSuite}
\subsubsection{The user interface}

\section{Compiling an existing C++ library for use on the web}
In this chapter I will explain how an existing C++ library was utilized to compare different simplification algorithms in a web browser. The library is named \textsl{psimpl} and was written in 2011 by Elmar de Koning. It implements various algorithms used for polyline simplification. This library will be compiled to WebAssembly using the Emscripten compiler. Furthermore a web application will be created for interactively exploring the algorithms. The main case of application is simplifying polygons, but polylines will also be supported. The data format used to read in the data will be GeoJSON. To maintain topological correctness an intermediate conversion to TopoJSON will be applied if requested.
\subsection{State of the art: psimpl}
\textsl{psimpl} is a generic C++ library for various polyline simplification algorithms. It consists of a single header file \texttt{psimpl.h}. The algorithms implemented are \textsl{Nth point}, \textsl{distance between points}, \textsl{perpendicular distance}, \textsl{Reumann-Witkam}, \textsl{Opheim}, \textsl{Lang}, \textsl{Douglas-Peucker} and \textsl{Douglas-Peucker variation}. It has to be noted that the \textsl{Douglas-Peucker} implementation uses the \textsl{distance between points} routine, also named the \textsl{radial distance} routine, as a preprocessing step just like Simplify.js (section \ref{sec:simplify.js}). All these algorithms have a similar templated interface. The goal now is to prepare the library for a compiler.
\todo[inline]{Describe the error statistics function of psimpl}
\subsection{Compiling to WebAssembly}
\todo[inline]{object form vs array form}
As in the previous chapter the compiler created by the Emscripten project will be used. This time the code is not directly meant to be consumed by a web application. It is a generic library. There are no entry points defined that Emscripten could export to WebAssembly. So the entry points will be defined in a new package named psimpl-js. It will contain a C++ file that uses the library, the compiled code and the JavaScript files needed for consumption in a JavaScript project. \textsl{psimpl} makes heavy use of C++ template functions which cannot be handled by JavaScript. So entry points have to be written for each exported algorithm. These entry points are the interface between JavaScript and the library. Listing \ref{lst:psimpl-js-entrypoint} shows one example. They all follow the same procedure. First the pointer given by JavaScript is interpreted as a double pointer in line 2. This is the beginning of the coordinates array. \textsl{psimpl} expects iterators to the first and last point, so the pointer to the last point is calculated (line 3). The appropriate function template from psimpl is instantiated and called with the other given parameters (line 5). The result is stored in an intermediate vector.
\lstinputlisting[
float=htb,
language=c++,
firstline=56, lastline=62,
caption=One entrypoint to the C++ code,
label=lst:psimpl-js-entrypoint
]{../lib/psimpl-js/psimpl.cpp}
Since this is C++, the capabilities of Emscripten's Embind can be utilized. Embind is realized in the headers \texttt{bind.h}\footnote{\path{https://emscripten.org/docs/api_reference/bind.h.html#bind-h}} and \texttt{val.h}\footnote{\path{https://emscripten.org/docs/api_reference/val.h.html#val-h}}. \texttt{val.h} is used for transliterating JavaScript to C++. In this case it is used for the type conversion of C++ vectors to JavaScript Typed Arrays as seen at the end of listing \ref{lst:psimpl-js-entrypoint}. \texttt{bind.h} on the other hand is used for binding C++ functions, classes or enumerations to names callable from JavaScript. Aside from providing a better developer experience this also prevents name mangling in cases where functions are overloaded. Instead of listing the exported functions in the compiler call or annotating them with \texttt{EMSCRIPTEN\_KEEPALIVE}, the developer declares bindings for the objects to expose. Listing \ref{lst:psimpl-js-bindings} shows each entry point bound to a readable name and, at last, the registered vector data type. The parameter \texttt{my\_module} is merely for marking a group of related bindings to avoid name conflicts in bigger projects.
\lstinputlisting[
float=htb,
language=c++,
firstline=72, lastline=82,
caption=Emscripten bindings,
label=lst:psimpl-js-bindings
]{../lib/psimpl-js/psimpl.cpp}
\todo[inline]{Compiler call (--bind)}
The library code on the JavaScript side is similar to the one in chapter \ref{sec:benchmark-webassembly}. This time a function is exported per routine.
\todo[inline]{More about javascript glue code with listing callSimplification.}
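The per-routine glue could look roughly like this (function and routine names are illustrative, not the actual psimpl-js API; the Embind-backed dispatcher is replaced by a stub):

```javascript
// Build one exported wrapper per routine: all wrappers share a
// generic call path and differ only in the bound name they dispatch
// to and the parameters they forward.
function makeRoutine(callSimplification, boundName) {
  return (coords, ...params) =>
    callSimplification(boundName, coords, params);
}

// Stub standing in for the real dispatcher that would store the
// coordinates, call the Embind-bound function and read the result.
const calls = [];
const stubCall = (name, coords, params) => {
  calls.push({ name, params });
  return coords; // identity instead of a real simplification
};

const nthPoint = makeRoutine(stubCall, "nthPoint");
const douglasPeucker = makeRoutine(stubCall, "douglasPeucker");
const out = nthPoint([[0, 0], [1, 1]], 3);
```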
\subsection{The implementation}
The implementation is, just as in the last chapter, a web page and thus JavaScript is used for the interaction. The source code is bundled with Webpack. React is the UI component library and Babel is used to transform JSX to JavaScript. MobX\footnote{\path{https://mobx.js.org/}} is introduced as a state management library. It applies functional reactive programming by providing utilities to declare observable variables and triggering the update of derived state and other observers intelligently. To do that, MobX observes the usage of observable variables so that only dependent observers react on updates. In contrast to other state libraries MobX does not require the state to be serializable. Many existing data structures can be observed, like objects, arrays and class instances. It also does not constrain the state to a single centralized store like Redux\footnote{\path{https://redux.js.org/}} does. The final state diagram can be seen in figure \ref{fig:integration-state}. It represents the application state in an object model. Since this has drawbacks in showing the information flow, the observable variables are marked in red and the computed ones in blue.
\begin{figure}[htb]
\centering
\fbox{\includegraphics[width=.8\linewidth]{images/integration-state.jpg}}
\caption{The state model of the application}
\label{fig:integration-state}
\end{figure}
On the bottom, the three main state objects can be seen. They are implemented as singletons since they represent global application state. Each of them is explained in the following.
\paragraph{MapState} holds the state relevant for the map display. An array of TileLayers defines all possible background layers to choose from. The selected one is stored in \texttt{selectedTileLayerId}. The other two variables toggle the display of the two vector layers.
\paragraph{AlgorithmState} stores all the information about the simplification algorithms to choose from. The class \texttt{Algorithm} acts as a generalization interface. Each algorithm defines which fields are used to interact with its parameters. These fields hold their current value, so the algorithm can compute its parameter array at any time. The fields also define additional restrictions in their \texttt{props} attribute, such as the number range from which to choose. An integer field, like the n value in the \textsl{Nth point} algorithm, would for example instantiate a range field with a step value of one. The \texttt{ToleranceRange}, however, which is modeled as its own subclass due to its frequent usage, allows smaller steps to represent decimal numbers.
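The relationship between an algorithm and its parameter fields can be sketched like this. The class and property names besides \texttt{Algorithm}, \texttt{props} and \texttt{ToleranceRange} are illustrative assumptions, not the actual source code:

```javascript
// Illustrative sketch of the AlgorithmState model (field names assumed).
class RangeField {
  constructor(label, props, value) {
    this.label = label;
    this.props = props; // additional restrictions, e.g. { min, max, step }
    this.value = value; // current value, updated by the UI
  }
}

class ToleranceRange extends RangeField {
  // own subclass due to frequent usage; decimal steps allowed
  constructor(value = 1) {
    super('tolerance', { min: 0, max: 5, step: 0.01 }, value);
  }
}

class Algorithm {
  constructor(name, fields) {
    this.name = name;
    this.fields = fields;
  }
  // the algorithm can compute its parameter array at any time
  get parameters() {
    return this.fields.map((field) => field.value);
  }
}

const nthPoint = new Algorithm('Nth point', [
  new RangeField('n', { min: 1, max: 100, step: 1 }, 5),
]);
```

Reading \texttt{parameters} collects the current field values, so the simplification routine can be invoked with a plain array whenever a field changes.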
\paragraph{FeatureState} encapsulates the state of the vector features. Each layer is represented both in the text form and in the object format of the GeoJSON standard. The text form is needed as a serializable representation for detecting whether the map display needs to update on an action. As the original features come from a file or the server, the text representation is the source of truth and the object format is derived from it. The simplified features are calculated asynchronously. This process is outsourced to a debounced reaction that updates the state upon completion.
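The derivation of the object format from the text form can be sketched as follows. The caching is a plain-JavaScript stand-in for MobX's computed values, and the function names are assumptions for illustration:

```javascript
// Sketch: the text representation is the source of truth; the parsed
// object is recomputed only when the text actually changes.
function makeFeatureState() {
  let cachedText = null;
  let cachedObject = null;
  return {
    deriveObject(text) {
      if (text !== cachedText) {
        cachedText = text;
        cachedObject = JSON.parse(text); // object format derived from text
      }
      return cachedObject; // unchanged text yields the identical object
    },
  };
}
```

Because an unchanged text form yields the identical object reference, downstream observers such as the map display can cheaply detect that nothing needs to update.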
\subsection{The user interface}
After the state model, the user interface (UI) shall be explained. The interface is implemented in components which are modeled in a shallow hierarchy. They represent and update the application state. In figure \ref{fig:integration-ui} the resulting web page is shown. The labeled regions correspond to the components. Their behavior is explained in the following.
\todo{Insert final picture.}
\todo{Red boxes around regions}
\todo{Make ui fit description}
\begin{figure}[htb]
\centering
\fbox{\includegraphics[width=\linewidth]{images/integration-ui.jpg}}
\caption{The user interface for the algorithm comparison.}
\label{fig:integration-ui}
\end{figure}
\paragraph{Leaflet Map}
The big region on the left marks the Leaflet map. Its main use is the visualization of features. The layers shown are one background tile layer plus the original and the simplified features. Original refers to the user-specified input features for the simplification. These are drawn in blue with a thin border. The simplified features are laid on top in a red styling. Aside from the default zoom control on the top left, the map contains an info widget on the top right showing the length of the currently specified tolerance.
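Since the two vector layers differ only in styling, a hypothetical style helper could look like the following. The exact color values and line weights are assumptions derived from the description above, not the actual source:

```javascript
// Hypothetical helper returning Leaflet-compatible path styles for the
// two vector layers (colors and weights assumed from the description).
function vectorLayerStyle(kind) {
  if (kind === 'original') {
    return { color: 'blue', weight: 1 }; // blue with a thin border
  }
  return { color: 'red', weight: 3 };    // simplified features on top
}
```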
\paragraph{Background Layers Control}
The first component in the options panel is a simple radio button group for choosing the background layer of the map, or none at all. The layers are provided by the OpenStreetMap (OSM) foundation\footnote{\path{https://wiki.osmfoundation.org/wiki/Main_Page}}. Experience shows that the layer ``OpenStreetMap DE'' provides better loading times in Germany. ``OpenStreetMap Mapnik'' is considered the standard OSM tile layer\footnote{\path{https://wiki.openstreetmap.org/wiki/Featured_tile_layers}}.
\paragraph{Data Selection}
Here the input layer can be specified, either by choosing one of the prepared data sets or by selecting a locally stored GeoJSON file. The prepared data is loaded from the server upon selection via an Ajax call. Ajax stands for Asynchronous JavaScript and XML and describes the method of dispatching an HTTP request from a web application without reloading the page. This way, not all of the data has to be transferred on the initial page load. Alternatively, the user can select a file with an HTML input or via drag \& drop. For the latter, the external package ``file-drop-element'' is used\footnote{\path{https://github.com/GoogleChromeLabs/file-drop#readme}}. It is a custom element based on the rather recent Custom Elements specification\footnote{\path{https://w3c.github.io/webcomponents/spec/custom/}}, which allows the creation of new HTML elements. In this case it is an element called ``file-drop'' that encapsulates the drag \& drop logic and provides a simple interface using attributes and events. Listing \ref{lst:compare-algorithms-file-drop} shows the use of the element. The mime type is restricted to GeoJSON files by the \texttt{accept} attribute.
\begin{lstlisting}[
language=html,
caption=The file-drop element in use,
label=lst:compare-algorithms-file-drop
]
<file-drop accept="application/geo+json">Drop area</file-drop>
\end{lstlisting}
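The handling of a dropped file can be sketched as a guard that mirrors the \texttt{accept} attribute followed by parsing into the object format. The function name is illustrative, not from the actual source:

```javascript
// Sketch: validate the dropped file's mime type (mirroring the accept
// attribute of the file-drop element) and derive the GeoJSON object.
// In the page, the handler would be wired to the element's drop event
// and receive the file's mime type and text content.
function acceptGeoJSON(mimeType, text) {
  if (mimeType !== 'application/geo+json') {
    throw new Error('unsupported file type: ' + mimeType);
  }
  return JSON.parse(text); // object format derived from the text form
}
```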
\paragraph{Layer Control}
This element toggles the display of the vector layers. The original and the simplified features can be shown or hidden independently. Once features have been loaded, the filename is shown here.
\paragraph{Simplification Control}
The last element in this section is the control for the simplification parameters. First, the user can choose whether a conversion to TopoJSON should be performed before the simplification. Then the algorithm itself can be selected. The displayed parameters change to fit the requirements of the algorithm. An update to one of the parameters triggers live changes in the application state, so the user gets direct feedback on how the changes affect the geometries.
\section{Conclusion}
\todo[inline]{Enhancement: line smoothing as a preprocessing step.}
thesis/erklaerung.tex Normal file
\newpage
%{\huge Erkl\"arung}
\chapter*{Erkl\"arung}
Hiermit versichere ich, dass ich diese Masterarbeit selbst\"andig verfasst habe. Ich habe dazu keine anderen als die angegebenen Quellen und Hilfsmittel verwendet.
\newline
\newline
\newline
\newline
\newline
\newline
\newline
Augsburg, Datum \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ Name
\newpage
\documentclass[12pt,a4paper]{extarticle}
\usepackage[utf8]{inputenc}
\usepackage[onehalfspacing]{setspace}
\setlength\parindent{0pt} % disable indentation for paragraphs
\title{Performance comparison of simplification algorithms for polygons in the context of web applications}
\author{Alfred Melch}
\begin{document}
\pagenumbering{gobble} % suppress page numbering
\input{titlepage.tex}
\section*{Abstract}
Abstract goes here

thesis/titlepage.tex Normal file
%%Titlepage
\begin{titlepage}%
\let\footnotesize\small
\let\footnoterule\relax
%\null\vfil
%\vskip 60\p@
\begin{center}%
{\huge Performance comparison of simplification algorithms for polygons in the context of web applications }\\[4mm]
\vspace*{2cm}
\includegraphics[height=2.5cm]{images/uniwappen_neu.jpg}%
\vspace*{2cm}
\begin{center}
\Large Masterarbeit\\
Institut f\"ur Informatik\\
Universit\"at Augsburg
\end{center}
\vfill%
\begin{center}
vorgelegt von\\[3mm]
{\large \bf Alfred Melch}\\
{Matrikelnummer xxx}
\end{center}
\vfill
\begin{center}
\small Augsburg, August 2019
\end{center}
\end{center}
\par
\vspace*{\fill}
\begin{tabular}{ll}
1. Gutachter: &Prof. Dr. Jörg Hähner\\
2. Gutachter: &Prof. Dr. Sabine Timpf\\
Betreuer: &Prof. Dr. Jörg Hähner\\
\end{tabular}
\end{titlepage}%