diff --git a/thesis/chapters/chapter05.tex b/thesis/chapters/chapter05.tex
index 6bb5088..975d02f 100644
--- a/thesis/chapters/chapter05.tex
+++ b/thesis/chapters/chapter05.tex
@@ -1,9 +1,9 @@
 % Performance benchmark
-In this chapter i will explain the approach to improve the performance of a simplification algorithm in a web browser via WebAssembly. The go-to library for this kind of operation is simplifyJS. It is the javascript implementation of the Douglas-Peucker algorithm with optional radial distance preprocessing. The library will be rebuilt in the C programming language and compiled to Webassembly with emscripten. A web page is built to produce benchmarking insights to compare the two approaches performance wise.
+In this chapter I will explain the approach to improve the performance of a simplification algorithm in a web browser via WebAssembly. The go-to library for this kind of operation is Simplify.JS. It is a JavaScript implementation of the Douglas-Peucker algorithm with optional radial distance preprocessing. The library will be rebuilt in the C programming language and compiled to WebAssembly with Emscripten. A web page is built to produce benchmarking insights that compare the two approaches performance-wise.
-\subsection{State of the art: simplifyJS}
-% simplifyJS + turf
+\subsection{State of the art: Simplify.JS}
+% Simplify.JS + turf
 Simplify.JS calls itself a "tiny high-performance JavaScript polyline simplification library. It was extracted from Leaflet, the "leading open-source JavaScript library for mobile-friendly interactive maps". Due to its usage in leaflet and Turf.js, a geospatial analysis library, it is the most common used library for polyline simplification. The library itself currently has 20,066 weekly downloads while the Turf.js derivate @turf/simplify has 30,389. Turf.js maintains an unmodified fork of the library in its own repository.
@@ -31,10 +31,11 @@ Since it is not clear which case is faster, and given how simple the required ch
 ]{../lib/simplify-js-alternative/simplify.diff}
 \subsection{The webassembly solution}
+\label{sec:benchmark-webassembly}
 In scope of this thesis a library will be created that implements the same procedure as simplify.JS in C code. It will be made available on the web platform through WebAssembly. In the style of the model library it will be called simplify.WASM. The compiler to use will be emscripten as it is the standard for porting C code to wasm.
-As mentioned the first step is to port simplify.JS to the C programming language. The file \path{lib/simplify-wasm/simplify.c} shows the attempt. It is kept as close to the Javascript library as possible. This may result in C-untypical coding style but prevents skewed results from unexpected optimizations to the procedure itself. The entrypoint is not the \texttt{main}-function but a function called simplify. This is specified to the compiler as can be seen in \path{lib/simplify-wasm/Makefile}. Furthermore the functions malloc and free from the standard library are made available for the host environment. Compling the code through emscripten produces a wasm file and the glue code in javascript format. These files are called simplify.wasm and simplify.js respectively. An example usage can be seen in \path{lib/simplify-wasm/example.html}. Even through the memory access is abstracted in this example the process is still unhandy and far from a drop-in replacement of simplify.JS. Thus in \path{lib/simplify-wasm/index.js} the a further abstraction to the emscripten emitted code was realised. The exported function \verb simplifyWasm handles module instantiation, memory access and the correct call to the exported wasm code. Finding the correct path to the wasm binary is not always clear however when the code is imported from another location.
The proposed solution is to leave the resolving of the code-path to an asset bundler that processes the file in a preprocessing step.
+As mentioned, the first step is to port Simplify.JS to the C programming language. The file \path{lib/simplify-wasm/simplify.c} shows the attempt. It is kept as close to the JavaScript library as possible. This may result in a coding style that is untypical for C but prevents skewed results from unexpected optimizations to the procedure itself. The entry point is not the \texttt{main} function but a function called \texttt{simplify}. This is specified to the compiler as can be seen in \path{lib/simplify-wasm/Makefile}. Furthermore, the functions \texttt{malloc} and \texttt{free} from the standard library are made available to the host environment. Compiling the code with Emscripten produces a wasm file and the glue code in JavaScript format. These files are called simplify.wasm and simplify.js respectively. An example usage can be seen in \path{lib/simplify-wasm/example.html}. Even though the memory access is abstracted in this example, the process is still unwieldy and far from a drop-in replacement for Simplify.JS. Thus a further abstraction over the code emitted by Emscripten was realised in \path{lib/simplify-wasm/index.js}. The exported function \verb|Simplify.wasm| handles module instantiation, memory access and the correct call to the exported wasm code. However, finding the correct path to the wasm binary is not always straightforward when the code is imported from another location. The proposed solution is to leave the resolving of the code path to an asset bundler that processes the file in a preprocessing step.
 \lstinputlisting[
 float=htpb,
@@ -43,7 +44,7 @@ firstline=22, lastline=33,
 label=lst:simplify-wasm
 ]{../lib/simplify-wasm/index.js}
-Listing \ref{lst:simplify-wasm} shows the function \texttt{simplifyWASM}. Further explanaition will follow regarding the functions \texttt{getModule}, \texttt{storeCoords} and \texttt{loadResultAndFreeMemory}.
+Listing \ref{lst:simplify-wasm} shows the function \texttt{Simplify.wasm}. Further explanation will follow regarding the functions \texttt{getModule}, \texttt{storeCoords} and \texttt{loadResultAndFreeMemory}.
 Module instantiation will be done on the first call only but requires the function to be asynchronous. For a neater experience in handling emscripten modules a utility function named \texttt{initEmscripten}\footnote{/lib/wasm-util/initEmscripten.js} was written to turn the module factory into a Javascript Promise that resolves on finished compilation. The result from this promise can be cached in a global variable. The usage of this function can be seen in listing \ref{lst:simplify-wasm-emscripten-module}.
@@ -56,10 +57,8 @@ caption=My Caption,
 label=lst:simplify-wasm-emscripten-module
 ]{../lib/simplify-wasm/index.js}
-Next clarification is provided about how coordinates will be passed to this module and how the result is returned. Emscripten offers multiple views on the module memory. These correspond to the available WebAssembly datatypes (e.g. HEAP8, HEAPU8, HEAPF32, HEAPF64, ...)\footnotemark. As Javascript numbers are always represented as a double-precision 64-bit binary\footnotemark (IEEE 754-2008) the HEAP64-view is the way to go to not lose precision. Accordingly the datatype double is used in C to work with the data.
+Next, clarification is provided about how coordinates will be passed to this module and how the result is returned. Emscripten offers multiple views on the module memory. These correspond to the available WebAssembly datatypes (e.g. HEAP8, HEAPU8, HEAPF32, HEAPF64, ...)\footnote{\path{https://emscripten.org/docs/api_reference/preamble.js.html#type-accessors-for-the-memory-model}}. As JavaScript numbers are always represented as double-precision 64-bit binary values\footnote{\path{https://www.ecma-international.org/ecma-262/6.0/#sec-4.3.20}} (IEEE 754-2008), the HEAPF64 view is the way to go to not lose precision.
Accordingly, the datatype double is used in C to work with the data.
-\footnotetext{\path{https://emscripten.org/docs/api_reference/preamble.js.html#type-accessors-for-the-memory-model}}
-\footnotetext{\path{https://www.ecma-international.org/ecma-262/6.0/#sec-4.3.20}}
 Listing \ref{lst:wasm-util-store-coords} shows the transfer of coordinates into the module memory. In line 3 the memory is allocated using the exported \texttt{malloc}-function. A Javascript TypedArray is used for accessing the buffer such that the loop for storing the values (lines 5 - 8) is trivial.
@@ -87,7 +86,7 @@ label=lst:simplify-wasm-entrypoint
 Listing \ref{lst:wasm-util-load-result} shows the code to read the values back from module memory. The result pointer and its length are acquired by dereferencing the \texttt{resultInfo}-array. The buffer to use is the heap for unsigned 32-bit integers. This information can then be used to align the Float64Array-view on the 64-bit heap. Constructing the appropriate coordinate representation by reversing the flattening can be looked up in the same file. It is realised in the \texttt{unflattenCoords} function. At last it is important to actually free the memory reserved for both the result and the result-information. The exported method \texttt{free} is the way to go here.
 \lstinputlisting[
-float=tbph,
+float=!tbph,
 language=javascript,
 firstline=29, lastline=43,
 caption=Loading coordinates back from module memory,
@@ -96,4 +95,30 @@ label=lst:wasm-util-load-result
-\subsection{The implementation}
\ No newline at end of file
+\subsection{The implementation of a web framework}
+
+The performance comparison of the two methods will be realized in a web page. It will be built as a front-end web application that allows user input to specify the input parameters of the benchmark. These parameters are the polyline to simplify, a range of tolerances to use for simplification, and whether the so-called high-quality mode shall be used.
By building a full application it will be possible to test a variety of use cases on multiple end devices. The behavior of the algorithms can also be studied under different preconditions. In the scope of this thesis a few cases will be investigated. The application structure will now be introduced.
+
+The dynamic aspects of the web page will be built in JavaScript to make it run in the browser. Webpack\footnote{https://webpack.js.org/} will be used to bundle the application code and to run compilers like Babel\footnote{https://babeljs.io/} on the source code. As mentioned in section \ref{sec:benchmark-webassembly}, the bundler is also useful for handling references to the WebAssembly binary as it resolves the filename to the correct download path. The JavaScript code will intentionally not be transpiled to older versions of the ECMAScript standard. This is often done to increase compatibility with older browsers, which is not a requirement in this case. By refraining from this practice there will also be no unintentional impact on the application performance. The libraries in use are Benchmark.js\footnote{https://benchmarkjs.com/} for statistically significant benchmarking results, React\footnote{https://reactjs.org/} for building the user interface and Chart.js\footnote{https://www.chartjs.org/} for drawing graphs.
+
+The web page consists of static and dynamic content. The static parts are the header and footer with explanations of the project. Those are written directly into the root HTML document. The dynamic parts are injected by JavaScript. Those will be discussed further in this chapter as they contain the main application logic.
+
+The web app is built to test a variety of cases with multiple datapoints. As mentioned, Benchmark.js will be used for statistically significant results. It is, however, rather slow as it needs about 5 to 6 seconds per datapoint. This is why multiple types of benchmarking methods are implemented.
Figure \ref{fig:benchmarking-uml} shows the corresponding UML diagram of the application. One can see the UI components in the top-left corner. The root component is \texttt{App}. It gathers all the internal state of its children and passes state down where it is needed.
+
+\begin{figure}[htb]
+    \centering
+    \fbox{\includegraphics[width=\linewidth]{images/benchmark-uml.jpg}}
+    \caption{UML diagram of the benchmarking application}
+    \label{fig:benchmarking-uml}
+\end{figure}
+
+In the upper right corner the different use cases are listed. These cases implement a function \texttt{fn} to benchmark. Additional methods for setting up the function and cleaning up afterwards can be implemented as given by the parent class \texttt{BenchmarkCase}. Concrete cases can be created by instantiating one of the BenchmarkCases with a defined set of parameters. There are three charts that will be rendered using a subset of these cases. These are:
+
+\begin{itemize}
+    \item \textbf{Simplify.js vs Simplify.wasm} - This chart shows the performance of the simplification by Simplify.js, the altered version of Simplify.js and the newly developed Simplify.wasm.
+    \item \textbf{Simplify.wasm runtime analysis} - To gain further insight into WebAssembly performance, this stacked bar chart shows the runtime of a call to Simplify.wasm. It is partitioned into the time spent preparing the data (\texttt{storeCoords}), the algorithm itself and the time it took to restore the coordinates from memory (\texttt{loadResult}).
+    \item \textbf{Turf.js method runtime analysis} - The last chart will use a similar structure. This time it analyses the performance impact of the back and forth transformation of data used in Turf.js.
+\end{itemize}
+
+\todo[inline]{BenchmarkType}
+\todo[inline]{BenchmarkSuite}
diff --git a/thesis/images/benchmark-uml.jpg b/thesis/images/benchmark-uml.jpg
new file mode 100644
index 0000000..2513a32
Binary files /dev/null and b/thesis/images/benchmark-uml.jpg differ
diff --git a/thesis/main.pdf b/thesis/main.pdf
index ae18962..b33e212 100644
Binary files a/thesis/main.pdf and b/thesis/main.pdf differ
diff --git a/thesis/main.tex b/thesis/main.tex
index effb97a..e4a1ad4 100644
--- a/thesis/main.tex
+++ b/thesis/main.tex
@@ -13,7 +13,8 @@
 \usepackage{graphicx} % for figures
 \usepackage{todonotes} % for todo notes
-\usepackage{url}
+\usepackage{url} % for filepaths and urls
+\usepackage{hyperref} % for hyperlinks
 % configure headers
 \usepackage{fancyhdr} % for headers
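
Review note on the chapter05.tex hunks: the chapter describes how coordinates are flattened, written into module memory through a Float64 view, and later read back and unflattened. That round trip can be illustrated without a compiled module. The following is only a minimal sketch under stated assumptions, not the code from the referenced listings: `createFakeModule`, its bump allocator and `loadCoords` are hypothetical stand-ins introduced here, and a plain `ArrayBuffer` takes the place of the wasm linear memory. The property names `HEAPF64`, `_malloc` and `_free` only mirror the names Emscripten typically exposes.

```javascript
// Hypothetical stand-in for the Emscripten module: a plain ArrayBuffer plays
// the role of the wasm linear memory and a bump allocator the role of malloc.
function createFakeModule(byteSize) {
  const buffer = new ArrayBuffer(byteSize);
  let next = 8; // offset 0 stays unused so it can act as a null pointer
  return {
    HEAPF64: new Float64Array(buffer),
    _malloc(bytes) {
      const pointer = next;
      next += Math.ceil(bytes / 8) * 8; // keep allocations 8-byte aligned
      if (next > byteSize) throw new Error("out of memory");
      return pointer;
    },
    _free() {} // no-op in this sketch; a real allocator releases the block
  };
}

// Flatten [[x, y], ...] and copy the values into module memory.
function storeCoords(module, coords) {
  const flatSize = coords.length * 2;
  const pointer = module._malloc(flatSize * Float64Array.BYTES_PER_ELEMENT);
  // View on exactly the allocated region of the 64-bit float heap.
  const heap = new Float64Array(module.HEAPF64.buffer, pointer, flatSize);
  for (let i = 0; i < coords.length; i++) {
    heap[2 * i] = coords[i][0];
    heap[2 * i + 1] = coords[i][1];
  }
  return pointer;
}

// Read flatSize doubles starting at pointer, rebuild the coordinate pairs
// and release the memory afterwards.
function loadCoords(module, pointer, flatSize) {
  const heap = new Float64Array(module.HEAPF64.buffer, pointer, flatSize);
  const coords = [];
  for (let i = 0; i < flatSize; i += 2) {
    coords.push([heap[i], heap[i + 1]]);
  }
  module._free(pointer);
  return coords;
}

const fakeModule = createFakeModule(1024);
const input = [[13.404954, 52.520008], [0.1, 0.2], [-122.4194, 37.7749]];
const pointer = storeCoords(fakeModule, input);
const output = loadCoords(fakeModule, pointer, input.length * 2);
console.log(output);
```

Because both sides use 64-bit floats, the values survive the round trip unchanged, which is the reason the text gives for choosing the double-precision view over a 32-bit one.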