This commit is contained in:
Alfred Melch 2019-08-25 18:21:33 +02:00
parent 99fc4b146c
commit 9be4c1e214
11 changed files with 57 additions and 34 deletions

View File

@ -28,3 +28,10 @@
publisher={Online},
url={https://github.com/topojson/topojson-specification}
}
@article{wirfs2015ecmascript,
title={ECMAScript 2015 Language Specification},
author={Wirfs-Brock, Allen},
journal={Ecma International,},
year={2015}
}

View File

@ -65,4 +65,20 @@ title = {WebAssembly consensus and end of Browser Preview},
date = {2017-02-28},
url = {https://lists.w3.org/Archives/Public/public-webassembly/2017Feb/0002.html},
urldate= {2019-08-15}
}
@online{zakai2015webassembly,
author = {Zakai, Alon},
title = {WebAssembly},
date = {2015-06-17},
url = {https://groups.google.com/forum/#!topic/emscripten-discuss/k-egXO7AkJY/discussion},
urldate= {2019-08-15}
}
@online{zakai2018emit,
author = {Zakai, Alon},
title = {Emit WebAssembly by default},
date = {2018-05-10},
url = {https://github.com/emscripten-core/emscripten/pull/6419},
urldate= {2019-08-15}
}

View File

@ -12,15 +12,15 @@
Simplification of polygonal data structures is the task of reducing data points while preserving topological characteristics. The simplification often takes the form of removing points that make up the geometry. There are several solutions that tackle the problem in different ways. With the rising trend of moving desktop applications to the web platform geographic information systems have experienced the shift towards web browsers too \parencite{alesheikh2002web}. Performance is critical in these applications. Since simplification is an important factor to performance the solutions will be tested by constructing a web application using a technology called WebAssembly.
\subsection{Binary instruction sets on the web platform}
\subsection{Binary instruction set on the web platform}
The recent development of WebAssembly allows code written in various programming languages to be run natively in web browsers. So far JavaScript was the only native programming language on the web \parencite{reiser2017accelerate}. The goals of WebAssembly are to define a binary instruction format as a compilation target to execute code at native speed and taking advantage of common hardware capabilities \parencite{haas2017bringing}. The integration into the web platform brings portability to a wide range of platforms like mobile and internet of things. The usage of this technology promises performance gains that will be tested by this thesis. The results can give conclusions to whether WebAssembly is worth a consideration for web applications with geographic computational aspects. WebGIS is only one technology that would benefit greatly of such an advancement. Thus far WebAssembly has been shipped to the stable version of the four most used browser engines \parencite{wagner2017support}. The mainly targeted high-level languages for compilation are C and C++. Also a compiler for Rust and a TypeScript subset has been developed.
The recent development of WebAssembly allows code written in various programming languages to be run natively in web browsers. So far JavaScript was the only native programming language on the web \parencite{reiser2017accelerate}. The goals of WebAssembly are to define a binary instruction format as a compilation target to execute code at nearly native speed and taking advantage of common hardware capabilities \parencite{haas2017bringing}. The integration into the web platform brings portability to a wide range of devices like mobile and internet of things. The usage of this technology promises performance gains that will be tested by this thesis. The results can give conclusions to whether WebAssembly is worth a consideration for web applications with geographic computational aspects. WebGIS is only one technology that would benefit greatly of such an advancement. Thus far WebAssembly has been shipped to the stable version of the four most used browser engines \parencite{wagner2017support}. The mainly targeted high-level languages for compilation are C and C++. Also a compiler for Rust and a TypeScript subset has been developed.
\subsection{Performance as important factor for web applications}
There has been a rapid growth of complex applications running in web-browsers. These so called progressive web apps combine the fast reachability of web pages with the feature richness of locally installed applications. Even though these applications can grow quite complex, the requirement for fast page loads and instant user interaction still remains. One way to cope with this need is the use of compression algorithms to reduce the amount of data transmitted and processed. In a way simplification is a form of data compression. Web servers use lossless compression algorithms like gzip to deflate data before transmission. Browsers that implement these algorithms can then fully restore the requested ressources resulting in lower bandwidth usage. The algorithms presented here however remove information from the data in a way that cannot be restored. This is called lossy compression. The most common usage for this on the web is the compression of image data.
\subsection{Topology simplification for rendering performance}
%\subsection{Topology simplification for rendering performance}
While compression is often used to minimize bandwidth usage, the compression of geospatial data can particulary influence rendering performance. The bottleneck for rendering often is the transformation to scalable vector graphics used to display topology on the web. Implementing simplification algorithms for use on the web platform can lead to smoother user experience when working with large geodata sets.
@ -31,9 +31,9 @@ Shi and Cheung analyzed several different polyline simplification algorithms in
\subsection{Structure of this thesis}
This thesis is structured into a theoretical and a practical component. First the theoretical principles will be reviewed. A number of algorithms will be introduced in this section. Each algorithm will be dissected by complexity and characteristics. Topology of polygonal data will be explained as how to describe geodata on the web. An introduction to WebAssembly will follow.
This thesis is structured into a theoretical and a practical component. First the theoretical principles will be reviewed. A number of algorithms will be introduced in this section. Topology of polygonal data will be explained as how to describe geodata on the web. An introduction to WebAssembly will follow.
In chapter 3 the practical implementation will be presented. A web application will be developed to measure the performance of three related algorithms used for polyline simplification.
In chapter 3 the practical implementation will be presented. The developed web application will be described. It is used to measure the performance of three related algorithms used for polyline simplification.
The results of the above methods will be shown in chapter 4. After discussion of the results a conclusion will finish the thesis.

View File

@ -59,7 +59,7 @@ In this chapter several algorithms for polyline simplification will be explained
%\end{figure}
\paragraph{Lang simplification} Lang described this algorithm in 1969. The search area is defined by a specified number of points to look ahead of the key point. A line is constructed from the key point to the last point in the search area. If the perpendicular distance of all intermediate points to this line is below a tolerance limit, they will be removed and the last point is the new key. Otherwise the search area is shrunk by excluding this last point until the requirement is met or there are no more intermediate points. All the algorithms before operated on the line sequentially and have a linear time complexity. This one also operates sequentially, but one of the critics about the Lang algorithm is that it requires too much computer time \parencite{douglas1973algorithms}. The complexity of this algorithm is $\mathcal{O}(m^n)$ with \textsf{m} being the number of positions to look ahead. \parencite{lang1969rules}
\paragraph{Lang simplification} Lang described this algorithm in 1969. The search area is defined by a specified number of points to look ahead of the key point. A line is constructed from the key point to the last point in the search area. If the perpendicular distance of all intermediate points to this line is below a tolerance limit, they will be removed and the last point is the new key. Otherwise the search area is shrunk by excluding this last point until the requirement is met or there are no more intermediate points. All the algorithms before operated on the line sequentially and have a linear time complexity. This one also operates sequentially, but one of the critics about the Lang algorithm is that it requires too much computer time \parencite{douglas1973algorithms}. The worst case complexity of this algorithm is $\mathcal{O}(m^n)$ with \textsf{m} being the number of positions to look ahead. \parencite{lang1969rules}
%\begin{figure}
% \centering
@ -70,7 +70,7 @@ In this chapter several algorithms for polyline simplification will be explained
%\paragraph{Jenks simplification}
\paragraph{Douglas-Peucker simplification} David H. Douglas and Thomas K. Peucker developed this algorithm in 1973 as an improvement to the by then predominant Lang algorithm. It is the first global routine described here. A global routine considers the entire line for the simplification process and comes closest to imitating manual simplification techniques \parencite{clayton1985cartographic}. The algorithm starts with constructing a line between the first point (anchor) and last point (floating point) of the feature. The perpendicular distance of all points in between those two is calculated. The intermediate point furthest away from the line will become the new floating point on the condition that its perpendicular distance is greater than the specified tolerance. Otherwise the line segment is deemed suitable to represent the whole line. In this case the floating point is considered the new anchor and the last point will serve as floating point again (DP). The worst case complexity of this algorithm is $\mathcal{O}(nm)$ with $\mathcal{O}(n\log{}m)$ being the average complexity \parencite{koning2011polyline}. The m here is the number of points in the resulting line which is not known beforehand. \parencite{douglas1973algorithms}
\paragraph{Douglas-Peucker simplification} David H. Douglas and Thomas K. Peucker developed this algorithm in 1973 as an improvement to the by then predominant Lang algorithm. It is the first global routine described here. A global routine considers the entire line for the simplification process and comes closest to imitating manual simplification techniques \parencite{clayton1985cartographic}. The algorithm starts with constructing a line between the first point (anchor) and last point (floating point) of the feature. The perpendicular distance of all points in between those two is calculated. The intermediate point furthest away from the line will become the new floating point on the condition that its perpendicular distance is greater than the specified tolerance. Otherwise the line segment is deemed suitable to represent the whole line. In this case the floating point is considered the new anchor and the last point will serve as floating point again. The worst case complexity of this algorithm is $\mathcal{O}(nm)$ with $\mathcal{O}(n\log{}m)$ being the average complexity \parencite{koning2011polyline}. The m here is the number of points in the resulting line which is not known beforehand. \parencite{douglas1973algorithms}
%\begin{figure}
% \centering
@ -81,7 +81,7 @@ In this chapter several algorithms for polyline simplification will be explained
%\paragraph{with reduction parameter} \todo{O(n*m)}
\paragraph{Visvalingam-Whyatt simplification} This is another global point routine. It was developed in 1993. Visvalingam and Wyatt use a area-based method to rank the points by their significance. To do that the "effective area" of each point has to be calculated. This is the area the point spans up with its adjoining points \parencite{shi2006performance}. Then the points with the least effective area get iteratively eliminated, and its neighbors effective area recalculated, until there are only two points left. At each elimination the point gets stored in a list alongside with its associated area. This is the effective area of that point or the associated area of the previous point in case the latter one is higher. This way the algorithm can be used for scale dependent and scale-independent generalizations. \parencite{visvalingam1993line}
\paragraph{Visvalingam-Whyatt simplification} This is another global point routine. It was developed in 1993. Visvalingam and Wyatt use a area-based method to rank the points by their significance. To do that the "effective area" of each point has to be calculated. This is the area the point spans up with its adjoining points \parencite{shi2006performance}. Then, the points with the least effective area get iteratively eliminated, and its neighbors effective area recalculated, until there are only two points left. At each elimination the point gets stored in a list alongside with its associated area. This is the effective area of that point or the associated area of the previous point in case the latter one is higher. This way the algorithm can be used for scale dependent and scale-independent generalizations. \parencite{visvalingam1993line}
\subsubsection{Summary}

View File

@ -6,11 +6,11 @@ JavaScript was traditionally the only native programming language of web browser
\subsubsection{Introduction to WebAssembly}
WebAssembly\footnote{\url{https://webassembly.org/}} started in April 2015 with an W3C Community Group\footnote{\url{https://www.w3.org/community/webassembly/}} and is designed by engineers from the four major browser vendors (Mozilla, Google, Apple and Microsoft). It is a portable low-level bytecode designed as compilation target of high-level languages. By being an abstraction over modern hardware it is language-, hardware-, and platform-independent. It is intended to be run in a stack-based virtual machine. This way it is not restrained to the Web platform or a JavaScript environment. Some key concepts are the structuring into modules with exported and imported definitions and the linear memory model. Memory is represented as a large array of bytes that can be dynamically grown. Security is ensured by the linear memory being disjoint from code space, the execution stack and the engine's data structures. Another feature of WebAssembly is the possibility of streaming compilation and the parallelization of compilation processes. \parencite{haas2017bringing}
WebAssembly\footnote{\url{https://webassembly.org/}} started in April 2015 with an W3C Community Group\footnote{\url{https://www.w3.org/community/webassembly/}} and is designed by engineers from the four major browser vendors (Mozilla, Google, Apple and Microsoft). It is a portable low-level bytecode designed as compilation target of high-level languages. By being an abstraction over modern hardware it is language-, hardware-, and platform-independent. It is intended to be run in a stack-based virtual machine. This way it is not restrained to the web platform or a JavaScript environment. Some key concepts are the structuring into modules with exported and imported definitions and the linear memory model. Memory is represented as a large array of bytes that can be dynamically grown. Security is ensured by the linear memory being disjoint from code space, the execution stack and the engine's data structures. Another feature of WebAssembly is the possibility of streaming compilation and the parallelization of compilation processes. \parencite{haas2017bringing}
The goals of WebAssembly have been well defined. Its semantics are intended to be safe and fast to execute and to bring portability by language-, hardware- and platform-independence. Furthermore, it should be deterministic and have simple interoperability with the web platform. For its representation, the following goals are declared. It shall be compact and easy to decode, validate and compile. Parallelization and streamable compilation are also mentioned. \parencite{haas2017bringing}
These goals are not specific to WebAssembly. They can be seen as properties that a low-level compilation target for the web should have. In fact there have been previous attempts to run low-level code on the web. Examples are Microsoft's ActiveX, Native Client (NaCl) and Emscripten, each having issues complying with the goals stated. Java and Flash are examples for managed runtime plugins. Their usage is declining however not at least due to falling short on the goals mentioned above. \parencite{haas2017bringing}
These goals are not specific to WebAssembly. They can be seen as properties that a low-level compilation target for the web should have. In fact there have been previous attempts to run low-level code on the web. Examples are Microsoft's ActiveX, Native Client (NaCl) and Emscripten, each having issues complying with at least one of the goals stated. Java and Flash are examples for managed runtime plugins. Their usage is declining however not at least due to falling short on the goals mentioned above. \parencite{haas2017bringing}
It is often stated that WebAssembly can bring performance benefits. It makes sense that statically typed machine code beats scripting languages performance wise. It has to be observed however, if the overhead of switching contexts will neglect this performance gain. JavaScript has made a lot of performance improvements over the past years. Not at least Googles development on the V8 engine has brought JavaScript to an acceptable speed for extensive calculations. Modern engines observe the execution of running JavaScript code and will perform optimizations that can be compared to optimizations of compilers. \parencite{clark2017what}
@ -33,6 +33,6 @@ Emscripten\footnote{\url{https://webassembly.org/}} started with the goal to com
\label{fig:emscripten-chain}
\end{figure}
It is in fact this project that inspired the creation of WebAssembly. It was even called the "natural evolution of asm.js"\footnote{\url{https://groups.google.com/forum/\#!topic/emscripten-discuss/k-egXO7AkJY/discussion}}. As of May 2018 Emscripten changed its default output to WebAssembly\footnote{\url{https://github.com/emscripten-core/emscripten/pull/6419}} while still supporting asm.js. Currently the default backend named \texttt{fastcomp} generates the WebAssembly bytecode from asm.js. A new backend however is about to take its place that compiles directly from LLVM \parencite{zakai2019llvmbackend}.
It is in fact this project that inspired the creation of WebAssembly. It was even called the "natural evolution of asm.js" \parencite{zakai2015webassembly}. As of May 2018 Emscripten changed its default output to WebAssembly while still supporting asm.js \parencite{zakai2018emit}. Currently the default backend named \texttt{fastcomp} generates the WebAssembly bytecode from asm.js. A new backend however is about to take its place that compiles directly from LLVM \parencite{zakai2019llvmbackend}.
The compiler is only one part of the Emscripten toolchain. Part of it are various APIs, for example for file system emulation or network calls, and tools like the compiler mentioned.

View File

@ -18,7 +18,7 @@ Interestingly the library expects coordinates to be a list of objects with x and
float=htbp,
language=javascript,
firstline=116, lastline=122,
caption=Turf.js usage of simplify.js,
caption=Turf.js's usage of simplify.js,
label=lst:turf-transformation
]{../lib/turf-simplify/index.js}
@ -48,7 +48,7 @@ label=lst:simplify-wasm-compiler-call,
caption={The call to compile the C source code to WebAssembly in a Makefile}
]{../lib/simplify-wasm/Makefile}
Furthermore, the functions \texttt{malloc} and \texttt{free} from the standard library are made available for the host environment. Another option specifies the optimisation level. With \texttt{O3} the highest level is chosen. The closure compiler minifies the JavaScript glue code. Compiling the code through Emscripten produces a binary file in wasm format and the glue code as JavaScript. These files are called \texttt{simplify.wasm} and \texttt{simplify.js} respectively.
Furthermore, the functions \texttt{malloc} and \texttt{free} from the standard library are made available for the host environment. Another option specifies the optimisation level. With \texttt{O3} the highest level is chosen. The closure compiler minifies the JavaScript glue code. Compiling the code through Emscripten produces a binary file in WebAssembly format and the glue code as JavaScript. These files are called \texttt{simplify.wasm} and \texttt{simplify.js} respectively.
An example usage can be seen in \path{lib/simplify-wasm/example.html}. Even though the memory access is abstracted in this example the process is still unhandy and far from a drop-in replacement of Simplify.js. Thus in \path{lib/simplify-wasm/index.js} a further abstraction to the Emscripten emitted code was written. The exported function \texttt{simplifyWasm} handles module instantiation, memory access and the correct call to the exported wasm function. Finding the correct path to the wasm binary is not always clear when the code is imported from another location. The proposed solution is to leave the resolving of the code-path to an asset bundler that processes the file in a preprocessing step.
@ -72,7 +72,7 @@ caption=Caching the instantiated Emscripten module,
label=lst:simplify-wasm-emscripten-module
]{../lib/simplify-wasm/index.js}
\paragraph {Storing coordinates} into the module memory is done in the function \texttt{storeCoords}. Emscripten offers multiple views on the module memory. These correspond to the available WebAssembly data types (e.g. HEAP8, HEAPU8, HEAPF32, HEAPF64, ...)\footnote{\url{https://emscripten.org/docs/api_reference/preamble.js.html\#type-accessors-for-the-memory-model}}. As Javascript numbers are always represented as a double-precision 64-bit binary\footnote{\url{https://www.ecma-international.org/ecma-262/6.0/\#sec-4.3.20}} (IEEE 754-2008), the HEAPF64-view is the way to go to not lose precision. Accordingly the datatype double is used in C to work with the data. Listing \ref{lst:wasm-util-store-coords} shows the transfer of coordinates into the module memory. In line 3 the memory is allocated using the exported \texttt{malloc}-function. A JavaScript TypedArray is used for accessing the buffer such that the loop for storing the values (lines 5 - 8) is trivial.
\paragraph {Storing coordinates} into the module memory is done in the function \texttt{storeCoords}. Emscripten offers multiple views on the module memory. These correspond to the available WebAssembly data types (e.g. HEAP8, HEAPU8, HEAPF32, HEAPF64, ...). As JavaScript numbers are always represented as a double-precision 64-bit binary based on the IEEE 754-2008 specification, the HEAPF64-view is the way to go to not lose precision \parencite{wirfs2015ecmascript}. Accordingly the datatype double is used in C to work with the data. Listing \ref{lst:wasm-util-store-coords} shows the transfer of coordinates into the module memory. In line 3 the memory is allocated using the exported \texttt{malloc}-function. A JavaScript TypedArray is used for accessing the buffer such that the loop for storing the values (lines 5 - 8) is trivial.
\lstinputlisting[
float=tbph,
@ -106,9 +106,9 @@ label=lst:wasm-util-load-result
\subsection{File sizes}
\label{ch:file-sizes}
For web applications an important measure is the size of libraries. It defines the cost of including the functionality in terms of how much the application size will grow. When it gets too large, especially users with low bandwidth are discriminated as it might be impossible to load the app at all in a reasonable time. Even with fast internet, loading times are relevant as users expect a fast time to first interaction. Also users with limited data plans are glad when developers keep their bundle size to a minimum.
For web applications developers it is important to keep an eye at size of libraries. It defines the cost of including the functionality in terms of how much the application size will grow. When it gets too large, especially users with low bandwidth are discriminated as it might be impossible to load the app at all in a reasonable time. Even with fast internet, loading times are relevant as users expect a fast time to first interaction. Also users with limited data plans are glad when developers keep their bundle size to a minimum.
The file sizes in this chapter will be given as the gzipped size. gzip is a file format for compressed files based on the DEFLATE algorithm. It is natively supported by all browsers and the most common web server software. So this is the format that files will be transmitted in on production applications.
File sizes mentioned in this chapter represent the size after compressing the files with gzip. This format is used for compressed files based on the DEFLATE algorithm. It is natively supported by all browsers and the most common web server software. So this is the format that files will be transmitted in on production applications.
For JavaScript applications there is also the possibility of reducing filesize by code minification. This is the process of reformating the source code without changing the functionality. Optimization are brought for example by removing unnecessary parts like spaces and comments or reducing variable names to single letters. Minification is often done in asset bundlers that process the JavaScript source files and produce the bundled application code.
@ -146,12 +146,12 @@ In the upper right corner the different Use-Cases are listed. These cases implem
\begin{itemize}
\item \textbf{Simplify.js vs Simplify.wasm} - This Chart shows the performance of the simplification by Simplify.js, the altered version of Simplify.js and the newly developed Simplify.wasm.
\item \textbf{Simplify.wasm runtime analysis} - To further gain insights to WebAssembly performance this stacked barchart shows the runtime of a call to Simplify.wasm. It is partitioned into time spent for preparing data (\texttt{storeCords}), the algorithm itself and the time it took for the coordinates being restored from memory (\texttt{loadResult}).
\item \textbf{Simplify.wasm runtime analysis} - To further gain insights to WebAssembly performance this stacked barchart shows the run time of a call to Simplify.wasm. It is partitioned into time spent for preparing data (\texttt{storeCords}), the algorithm itself and the time it took for the coordinates being restored from memory (\texttt{loadResult}).
\item \textbf{Turf.js method runtime analysis} - The last chart will use a similar structure. This time it analyses the performance impact of the back and forth transformation of data used in Truf.js.
\end{itemize}
\subsubsection{The different benchmark types}
On the bottom the different types of Benchmarks implemented can be seen. They all implement the abstract \texttt{measure} function to return the mean time to run a function specified in the given \texttt{BenchmarkCase}. The \texttt{IterationsBenchmark} runs the function a specified number of times, while the \texttt{OpsPerTimeBenchmark} always runs a certain amount of milliseconds to run as much iterations as possible. Both methods got their benefits and drawbacks. Using the iterations approach one cannot determine the time the benchmark runs beforehand. With fast devices and a small number of iterations one can even fall in the trap of the duration falling under the accuracy of the timer used. Those results would be unusable of course. It is however a very fast way of determining the speed of a function. And it holds valuable for getting a first approximation of how the algorithms perform over the span of datapoints. The second type, the operations per time benchmark, seems to overcome this problem. It is however prune to garbage collection, engine optimizations and other background processes. \parencite{bynens2010bulletproof}
On the bottom the different types of benchmarks implemented can be seen. They all implement the abstract \texttt{measure} function to return the mean time to run a function specified in the given \texttt{BenchmarkCase}. The \texttt{IterationsBenchmark} runs the function a specified number of times, while the \texttt{OpsPerTimeBenchmark} always runs a certain amount of milliseconds to run as much iterations as possible. Both methods got their benefits and drawbacks. Using the iterations approach one cannot determine the time the benchmark runs beforehand. With fast devices and a small number of iterations one can even fall in the trap of the duration falling under the accuracy of the timer used. Those results would be unusable of course. It is however a very fast way of determining the speed of a function. And it holds valuable for getting a first approximation of how the algorithms perform. The second type, the operations per time benchmark, seems to overcome this problem. It is however prune to garbage collection, engine optimizations and other background processes. \parencite{bynens2010bulletproof}
Benchmark.js combines these approaches. In a first step it approximates the runtime in a few cycles. From this value it calculates the number of iterations to reach an uncertainty of at most 1\%. Then the samples are gathered. \parencite{hossain2012benchmark}
@ -160,9 +160,9 @@ For running multiple benchmarks the class \texttt{BenchmarkSuite} was created. I
\begin{figure}[htb]
\centering
\label{fig:benchmarking-statemachine}
\fbox{\includegraphics[width=.8\linewidth]{images/benchmark-statemachine.jpg}}
\caption{The state machine for the benchmark suite}
\label{fig:benchmarking-statemachine}
\end{figure}
Figure \ref{fig:benchmarking-statemachine} shows the state machine of the suite. Based on this diagram the user interface component shows action buttons so the user can interact with the state. While running, the suite checks if a state change was requested and acts accordingly by pausing the benchmarks or resetting all statistics gathered when stopping.
@ -182,7 +182,7 @@ The user interface has three sections. One for configuring input parameters. One
\paragraph{Run Benchmark} This is the control that displays the status of the benchmark suite. Here benchmarks can be started, stopped, paused and resumed. It also shows the progress of the benchmarks completed in percentage and absolute numbers.
\paragraph{Chart} The chart shows a live diagram of the results. The title represents the selected chart. The legend gives information on which benchmark cases will run. Also the algorithm parameters (dataset and high quality mode) and current platform description can be found here. The tolerance range maps over the x-Axis. On the y-Axis two scales can be seen. The left hand shows by which unit the performance is displayed. This scale corresponds to the colored lines. Every chart will show the number of positions in the result as a grey line. Its scale is displayed on the right. This information is important for selecting a proper tolerance range as it shows if a appropriate order of magnitude has been chosen. Below the chart additional control elements are placed to adjust the visualization. The first selection lets the user choose between a linear or logarithmic y-Axis. The second one changes the unit of measure for performance. The two options are the mean time in milliseconds per operation (ms) and the number of operations that can be run in one second (hz). These options are only available for the chart "Simplify.wasm vs. Simplify.js" as the other two charts are stacked bar charts where changing the default options won't make sense. Finally the result can be saved via a download button. A separate page can be fed with this file to display the diagram only.
\paragraph{Chart} The chart shows a live diagram of the results. The title represents the selected chart. The legend gives information on which benchmark cases will run. Also the algorithm parameters (dataset and high quality mode) and current platform description can be found here. The tolerance range maps over the x-Axis. On the y-Axis two scales can be seen. The left hand shows by which unit the performance is displayed. This scale corresponds to the colored lines. Every chart will show the number of positions in the result as a grey line. Its scale is displayed on the right. This information is important for selecting a proper tolerance range as it shows if a appropriate order of magnitude has been chosen. Below the chart additional control elements are placed to adjust the visualization. The first selection lets the user choose between a linear or logarithmic y-Axis. The second one changes the unit of measure for performance. The two options are the mean time in milliseconds per operation (ms) and the number of operations that can be run in one second (hz). These options are only available for the chart "Simplify.wasm vs. Simplify.js" as the other two charts are stacked bar charts where changing the default options won't make sense. Finally, the result can be saved via a download button. A separate page can be fed with this file to display the diagram only.
\subsection{The test data}
@ -192,7 +192,7 @@ Here the test data will be shown. There are two data sets chosen to operate on.
\paragraph{Simplify.js example}
This is the polyline used by Simplify.js to demonstrate its capabilities. Figure \ref{fig:dataset-simplify} shows the widget on its homepage. The user can modify the parameters with the interactive elements and view the live result. The data comes from a 10.700 mile car route from Lisboa, Portugal to Singapore and is based on OpenStreetMap data. The line is defined by 73,752 positions. Even with low tolerances this number reduces drastically. This example shows perfectly why it is important to generalize polylines before rendering them.
This is the polyline used by Simplify.js to demonstrate its capabilities. Figure \ref{fig:dataset-simplify} shows the widget on its homepage. The user can modify the parameters with the interactive elements and view the live result. The data comes from a 10,700 mile car route from Lisboa, Portugal to Singapore and is based on OpenStreetMap data. The line is defined by 73,752 positions. Even with low tolerances this number reduces drastically. This example shows perfectly why it is important to generalize polylines before rendering them.
\begin{figure}[htb]
\centering

View File

@ -2,7 +2,7 @@
In this chapter the results are presented. There were a multiple tests performed. Multiple devices were used to run several benchmarks on different browsers and under various parameters. To decide which benchmarks had to run, first all the problem dimensions were clarified. Devices will be categorized into desktop and mobile devices. The browsers to test will come from the four major browser vendors which were involved in WebAssembly development. These are Firefox from Mozilla, Chrome from Google, Edge from Microsoft and Safari from Apple. For either of the two data sets a fixed range of tolerances is set to maintain consistency across the diagrams. The other parameter "high quality" can be either switched on or off. The three chart types are explained in chapter \ref{ch:benchmark-cases}.
All benchmark results shown here can be interactively explored at the web page provided together with this thesis. The static files lie in the \path{build} folder. The results can be found when following the "show prepared results"-link on the home page.
All benchmark results shown here can be interactively explored at the web page provided together with this thesis. It is available online\footnote{\url{https://mt.melch.pro}} or in the form of static files. The static files lie in the \path{build} folder. The results can be found when following the "show prepared results"-link on the home page.
Each section in this chapter describes a set of benchmarks run on the same system. A table in the beginning will indicate the problem dimensions chosen to inspect. After a description of the system and a short summary of the case the results will be presented in the form of graphs. Those are the graphs produced from the application described in chapter \ref{ch:benchmark-app}. Here the results will only be briefly characterized. A further analysis will follow in the next chapter.
@ -37,7 +37,7 @@ In the case of the high quality mode enabled however the original and the WebAss
\input{./results-benchmark/win_chro_simplify_vs_false.tex}
\input{./results-benchmark/win_chro_simplify_vs_true.tex}
Figure \ref{fig:win_chro_simplify_vs_false} and \ref{fig:win_chro_simplify_vs_true} show the results under Chrome for the same setting. Here the performance seem to be switched around with the original being the slowest method in both cases. This version has however very inconsistent results. There is no clear curvature which indicates some outside influence to the results. Either there is a flaw in the implementation or a special case of engine optimization was hit.
Figure \ref{fig:win_chro_simplify_vs_false} and \ref{fig:win_chro_simplify_vs_true} show the results under Chrome for the same setting. Here the performance seem to be switched around with the original being the slowest method in both cases. This version has however very inconsistent results. There is no clear curvature which indicates some outside influence to the results. Either there is a flaw in the implementation or a special case of engine optimization or garbage collection was hit.
Without high quality mode the Simplify.wasm gets overtaken by the Simplify.js alternative at 0.4 tolerance. From there on the WebAssembly solution stagnates while the JavaScript one continues to get faster. With high quality enabled the performance gain of WebAssembly is more clear than in Firefox. Here the Simplify.js alternative is the second fastest followed by its original.
@ -63,7 +63,7 @@ For this case the same device as in the former case is used. To compare the resu
\input{./results-benchmark/win_edge_simplify_stack_false.tex}
\input{./results-benchmark/win_edge_simplify_stack_true.tex}
The bar charts visualize where the time is spent in the Simplify.wasm implementation. Each data point contains a stacked column to represent the proportion of time spent for each task. The blue section represents the time spent to initialize the memory, the red one the execution of the compiled WebAssembly code. At last the green part will show the time spent for getting the coordinates back in the right format.
The bar charts visualize where the time is spent in the Simplify.wasm implementation. Each data point contains a stacked column to represent the proportion of time taken for each task. The blue section represents the time spent to initialize the memory, the red one the execution of the compiled WebAssembly code. At last the green part will how long it took to get the coordinates back in the right format.
Inspecting figures \ref{fig:win_edge_simplify_stack_false} and \ref{fig:win_edge_simplify_stack_true} one immediately notices that the time spent for the memory preparation does not vary in either of the two cases. Also very little time is needed to load the result back from memory especially as the tolerance gets higher. Further analysis of that will follow in chapter \ref{ch:discussion} as mentioned.
@ -80,7 +80,7 @@ In the case of high quality disabled, the results show a very steep curve of the
\label{tbl:dimensions-3}
\end{table}
A 2018 MacBook Pro 15" will be used to test the safari browser. For comparison the benchmarks will also be held under Firefox on MacOS. This time the bavarian boundary will be simplified with both preprocessing enabled and disabled.
A 2018 MacBook Pro 15" will be used to test the safari browser. For comparison the benchmarks will also be held under Firefox on MacOS. This time the bavarian boundary will be simplified with both preprocessing enabled and disabled. Table \ref{tbl:dimensions-3} illustrates this.
\input{./results-benchmark/mac_ffox_bavaria_vs_false.tex}
\input{./results-benchmark/mac_ffox_bavaria_vs_true.tex}
@ -90,7 +90,7 @@ At first figure \ref{fig:mac_ffox_bavaria_vs_false} and \ref{fig:mac_ffox_bavari
\input{./results-benchmark/mac_safa_bavaria_vs_false.tex}
\input{./results-benchmark/mac_safa_bavaria_vs_true.tex}
The results of the Safari browser with high quality disabled (figure \ref{fig:mac_safa_bavaria_vs_false}) resembles the figure \ref{fig:win_edge_simplify_vs_false} where the Edge browser was tested. Both JavaScript versions with similar performance surpass the WebAssembly version at one point. Unlike the Edge results the original implementation is slightly ahead.
The results of the Safari browser with high quality disabled (figure \ref{fig:mac_safa_bavaria_vs_false}) resembles figure \ref{fig:win_edge_simplify_vs_false} where the Edge browser was tested. Both JavaScript versions with similar performance surpass the WebAssembly version at one point. Unlike the Edge results the original implementation is slightly ahead.
When turning on high quality mode the JavaScript implementations still perform alike. However, Simplify.wasm is clearly faster as seen in figure \ref{fig:mac_safa_bavaria_vs_true}. Simplify.wasm performs here about twice as fast as the algorithms implemented in JavaScript. Those however have a steeper decrease as the tolerance numbers go up.
@ -105,17 +105,17 @@ When turning on high quality mode the JavaScript implementations still perform a
\label{tbl:dimensions-4}
\end{table}
In this case the system is a Lenovo Miix 510 convertible with Ubuntu 19.04 as the operating system. Again the bavarian outline is used for simplification with both quality settings. It will be observed if the Turf.js implementation is reasonable. The third kind of chart is in use here, which is similar to the Simplify.wasm insights. There are also stacked bar charts used to visualize the time spans of subtasks. The results will be compared to the graphs of the Simplify.js vs. Simplify.wasm chart. As the Turf.js method only makes sense when the original version is faster than the alternative, the benchmarks are performed in the Firefox browser.
In this case the system is a Lenovo Miix 510 convertible with Ubuntu 19.04 as the operating system. As it can be seen in table \ref{tbl:dimensions-4} the bavarian outline is used again for simplification with both quality settings. It will be observed if the Turf.js implementation is reasonable. The third kind of chart is in use here, which is similar to the Simplify.wasm insights. There are also stacked bar charts used to visualize the time spans of subtasks. The results will be compared to the graphs of the Simplify.js vs. Simplify.wasm chart. As the Turf.js method only makes sense when the original version is faster than the alternative, the benchmarks are performed in the Firefox browser.
\input{./results-benchmark/ubu_ffox_bavaria_vs_true.tex}
\input{./results-benchmark/ubu_ffox_bavaria_jsstack_true.tex}
Figure \ref{fig:ubu_ffox_bavaria_vs_true} shows how the JavaScript versions perform with high quality enabled. Here it is clear that the original version is prefereable. In figure \ref{fig:ubu_ffox_bavaria_jsstack_true} one can see the runtime of the Turf.js method. The red bar here stands for the runtime of the Simplify.js function call. The blue and green bar is the time taken for the format transformations before and after the algorithm. Again the preparation of the original data takes significantly longer than the modification of the simplified line. When the alternative implementation is so much slower than the original it is actually more performant to transform the data format. More analysis as mentioned follows in the next chapter.
Figure \ref{fig:ubu_ffox_bavaria_vs_true} shows how the JavaScript versions perform with high quality enabled. Here it is clear that the original version is prefereable. In figure \ref{fig:ubu_ffox_bavaria_jsstack_true} one can see the runtime of the Turf.js method. The red bar here stands for the runtime of the Simplify.js function call. The blue and green bar is the time taken for the format transformations before and after the algorithm. Again the preparation of the original data takes significantly longer than the modification of the simplified line. When the alternative implementation is so much slower than the original it is actually more performant to transform the data format. More analysis follows, as mentioned, in the next chapter.
\input{./results-benchmark/ubu_ffox_bavaria_vs_false.tex}
\input{./results-benchmark/ubu_ffox_bavaria_jsstack_false.tex}
The next two figures show the case when high quality is disabled. In figure \ref{fig:ubu_ffox_bavaria_vs_false} two algorithms seem to converge. And when looking at figure \ref{fig:ubu_ffox_bavaria_jsstack_false} one can see that the data preparation gets more costly as the tolerance rises. From a tolerance of 0.0014 on the alternative Simplify.js implementation is faster than the Turf.js method.
The next two figures show the case when high quality is disabled. In figure \ref{fig:ubu_ffox_bavaria_vs_false} the two JavaScript algorithms seem to converge. And when looking at figure \ref{fig:ubu_ffox_bavaria_jsstack_false} one can see that the data preparation takes more proportion of the total time as the tolerance rises. From a tolerance of 0.0014 on the alternative Simplify.js implementation is faster than the Turf.js method.
\FloatBarrier
\subsection{Case 5 - Mobile benchmarking}
@ -128,7 +128,7 @@ The next two figures show the case when high quality is disabled. In figure \ref
\label{tbl:dimensions-5}
\end{table}
At last the results from a mobile device are shown. The device is an iPad Air with iOS version 12.4. The Simplify.js example is being generalized using Safari and the Firefox browser. Again both quality settings are used for the benchmarks.
At last the results from a mobile device are shown. The device is an iPad Air with iOS version 12.4. The Simplify.js example is being generalized using Safari and the Firefox browser. Again both quality settings are used for the benchmarks. See table \ref{tbl:dimensions-5}.
\input{./results-benchmark/ipad_safa_simplify_vs_false.tex}
\input{./results-benchmark/ipad_safa_simplify_vs_true.tex}

View File

@ -1,7 +1,7 @@
\section{Discussion}
\label{ch:discussion}
In this section the results are interpreted. This section is structured in different questions to answer. First it will be analyzed what the browser differences are. One section will deal with the performance of the pure JavaScript implementations while the next will inspect how Simplify.wasm performs. Then further insights to the performance of the WebAssembly implementation will be given. It will be investigated how long it takes to set up the WebAssembly call and how much time is spent to actually execute the simplification routines. Next the case of Turf.js will be addressed and if its format conversions are reasonable under specific circumstances. Finally, the performance of mobile devices will be evaluated.
In this section the results are interpreted. This section is structured in different questions to answer. First it will be analyzed what the browser differences are. One section will deal with the performance of the pure JavaScript implementations while the next will inspect how Simplify.wasm performs. Then further insights to the performance of the WebAssembly implementation will be given. It will be investigated how long it takes to set up the WebAssembly call and how much time is spent to actually execute the simplification routines. Next the case of Turf.js will be addressed and if its format conversions are reasonable under specific circumstances. Finally, the performance of the mobile device will be evaluated.
\subsection{Browser differences for the JavaScript implementations}
@ -24,7 +24,7 @@ The variance is very low when the preprocessing is turned off through the high q
\subsection{Insights into Simplify.wasm}
\label{ch:discussion-wasm-insights}
So far, when the performance of Simplify.wasm was addressed, it meant the time spent for the whole process of preparing memory to running the algorithm as WebAssembly bytecode to loading back the result to JavaScript. This makes sense when comparing it to the JavaScript library with the motive to replace it one by one. It does however not produce meaningful comparisons of WebAssembly performance in contrast to the native JavaScript runtime. Further insights to Simplify.wasm call will be provided here.
So far, when the performance of Simplify.wasm was addressed, it meant the time spent for the whole process of preparing memory to running the algorithm as WebAssembly bytecode to loading back the result to JavaScript. This makes sense when comparing it to the JavaScript library with the motive to replace it one by one. It does however not produce meaningful comparisons of WebAssembly performance in contrast to the native JavaScript run time. Further insights to the Simplify.wasm call will be provided here.
First the parts where JavaScript is run will be examined. Chapter \ref{ch:case2} shows that there is as good as no variance in the memory initialization. This is obvious due to the fact that this step is not dependent on any other parameter than the polyline length. Initial versions of the library produced in this thesis were not as efficient in flattening the coordinate array as the final version. By replacing the built-in \texttt{Array.prototype.flat}-method with a simple \texttt{for} loop, a good optimization was achieved on the JavaScript side of the Simplify.wasm process. The \texttt{flat} method is a rather new feature of ECMAScript and its performance might be enhanced in future browser versions. This example shows that when writing JavaScript code one can quickly deviate from the "fast path" even when dealing with simple problems.

View File

@ -2,7 +2,7 @@
%In this section a conclusion is drawn. First the results will be shortly summarized. The work done will be reflected and possible improvements are suggested. At last there will be an prospect about future work.
In this thesis, the performance of simplification algorithms in the context of web applications was analyzed. The dominant library for this task in the JavaScript ecosystem is Simplify.js. It implements the Douglas-Peucker algorithm with optional radial distance preprocessing. By using a technology called WebAssembly, this library was recreated with the goal to achieve a better performance. This recreation was called Simplify.wasm. Also a JavaScript alternative to Simplify.js was tested that operates on a different representation of polylines. To perform several benchmarks on different devices a website was built. The results were gathered by using the library Benchmark.js which produces statistically relevant benchmarks.
In this thesis, the performance of simplification algorithms in the context of web applications was analyzed. The dominant library for this task in the JavaScript ecosystem is Simplify.js. It implements the Douglas-Peucker algorithm with optional radial distance preprocessing. By using a technology called WebAssembly, this library was recreated with the goal to achieve a better performance. This recreation was called Simplify.wasm. Also a JavaScript alternative to Simplify.js was tested that operates on a different representation of polylines. To perform several benchmarks on different devices a web application was built. The results were gathered by using the library Benchmark.js which produces statistically relevant benchmarks.
It was shown that the WebAssembly based library showed more stable results across different web browsers. The performance of the JavaScript based ones varied greatly. Not only did the absolute run times vary. There were also differences in which JavaScript variant was the faster one. Generally it can be said that the complexity of the operation defines if Simplify.wasm is preferable to Simplify.js. This comes from the fact that there is an overhead of calling Simplify.wasm. To call the WebAssembly code the coordinates will first have to be stored in a linear memory object. With short run times this overhead can exceed the performance gain through WebAssembly. The pure algorithm run time was always shorter with WebAssembly.

Binary file not shown.

View File

@ -56,7 +56,7 @@
\input{titlepage.tex}
\section*{Abstract}
In this thesis the performance of polyline simplification in web browsers is evaluated. Based on the JavaScript library Simplify.js a WebAssembly solution is built to increase performance. The solutions implement the Douglas-Peucker polyline simplification algorithm with optional radial distance preprocessing. The format for polylines that Simplify.js expects differs from the representation used in major geodata formats. This discrepancy is obvious in another JavaScript library, Turf.js, where it is overcome by format transformations on each call. A slight variant of Simplify.js is proposed in this thesis that can operate directly on the format used in GeoJSON and TopoJSON. The three approaches, Simplify.js, Simplify.js variant and Simplify.wasm are compared across different browsers by creating a web page, that gathers various benchmarking metrics. It is concluded that WebAssembly performance alone supersedes JavaScript performance. A drop-in replacement that includes memory management however bears overhead that can outweigh the performance gain. To fully utilize WebAssembly performance more effort regarding memory management is brought to web development. It is shown that the method used by Turf.js is unfavorable in most cases. Merely one browser shows a performance gain under special circumstances. In the other browsers the use of the Simplify.js variant is preferable.
In this thesis the performance of polyline simplification in web browsers is evaluated. Based on the JavaScript library Simplify.js a WebAssembly solution is built to increase the performance. The solutions implement the Douglas-Peucker polyline simplification algorithm with optional radial distance preprocessing. The format for polylines that Simplify.js expects differs from the representation used in major geodata formats. This discrepancy is obvious in another JavaScript library, Turf.js, where it is overcome by format transformations on each call. A slight variant of Simplify.js is proposed in this thesis that can operate directly on the format used in GeoJSON and TopoJSON. The three approaches, Simplify.js, Simplify.js variant and Simplify.wasm are compared across different browsers by creating a web page, that gathers various benchmarking metrics. It is concluded that WebAssembly performance alone supersedes JavaScript performance. A drop-in replacement that includes memory management however bears overhead that can outweigh the performance gain. To fully utilize WebAssembly performance more effort regarding memory management is brought to web development. It is shown that the method used by Turf.js is unfavorable in most cases. Merely one browser shows a performance gain under special circumstances. In the other browsers the use of the Simplify.js variant is preferable.
\newpage