This commit is contained in:
Alfred Melch 2019-08-22 18:28:26 +02:00
parent 69708c2726
commit 5972aaf373
13 changed files with 38 additions and 34 deletions

File diff suppressed because one or more lines are too long


@ -63,4 +63,12 @@
date = {2019-06-03},
url = {https://developer.apple.com/app-store/review/guidelines/},
urldate= {2019-08-15}
}
}
@inproceedings{alesheikh2002web,
title={Web GIS: technologies and its applications},
author={Alesheikh, Ali Asghar and Helali, Hussein and Behroz, HA},
booktitle={Symposium on geospatial theory, processing and applications},
volume={15},
year={2002}
}


@ -9,20 +9,20 @@
% Why important, who participants, trends,
Simplification of polygonal data structures is the task of reducing data points while preserving topological characteristics. The simplification often takes the form of removing points that make up the geometry. There are several solutions that tackle the problem in different ways. With the rising trend of moving desktop applications to the web platform geographic information systems (GIS) have experienced the shift towards web browsers, too. \footnote{\url{https://www.esri.com/about/newsroom/arcnews/implementing-web-gis/}}\todo{find a better source}. Performance is critical in these applications. Since simplification is an important factor to performance the solutions will be tested by constructing a web application using a technology called WebAssembly. \todo{better wording}
Simplification of polygonal data structures is the task of reducing data points while preserving topological characteristics. The simplification often takes the form of removing points that make up the geometry. There are several solutions that tackle the problem in different ways. With the rising trend of moving desktop applications to the web platform, geographic information systems have experienced the shift towards web browsers too \parencite{alesheikh2002web}. Performance is critical in these applications. Since simplification is an important factor for performance, the solutions will be tested by constructing a web application using a technology called WebAssembly.
\subsection{Binary instruction sets on the web platform}
The recent development of WebAssembly allows code written in various programming languages to be run natively in web browsers. So far JavaScript was the only native programming language on the web\todo{find source}. The goals of WebAssembly are to define a binary instruction format as a compilation target to execute code at native speed and taking advantage of common hardware capabilities \parencite{haas2017bringing}. The integration into the web platform brings portability to a wide range of platforms like mobile and internet of things (IoT). The usage of this technology promises performance gains that will be tested by this thesis. The results can give conclusions to whether WebAssembly is worth a consideration for web applications with geographic computational aspects. Web GIS is only one technology that would benefit greatly of such an advancement. Thus far WebAssembly has been shipped to the stable version of the four most used browser engines \parencite{wagner2017support}. The mainly targeted high-level languages for compilation are C and C++. Also a compiler for Rust and a TypeScript subset has been developed.
The recent development of WebAssembly allows code written in various programming languages to be run natively in web browsers. So far JavaScript was the only native programming language on the web \parencite{reiser2017accelerate}. The goals of WebAssembly are to define a binary instruction format as a compilation target, to execute code at native speed and to take advantage of common hardware capabilities \parencite{haas2017bringing}. The integration into the web platform brings portability to a wide range of platforms like mobile and internet of things. The usage of this technology promises performance gains that will be tested by this thesis. The results allow conclusions about whether WebAssembly is worth considering for web applications with geographic computational aspects. WebGIS is only one technology that would benefit greatly from such an advancement. Thus far WebAssembly has been shipped to the stable version of the four most used browser engines \parencite{wagner2017support}. The mainly targeted high-level languages for compilation are C and C++. Compilers for Rust and a TypeScript subset have also been developed.
\subsection{Performance as important factor for web applications}
There has been a rapid growth of complex applications running in web-browsers. These so called progressive web apps (PWA) combine the fast reachability of web pages with the feature richness of locally installed applications. Even though these applications can grow quite complex, the requirement for fast page loads and instant user interaction still remains. One way to cope with this need is the use of compression algorithms to reduce the amount of data transmitted and processed. In a way simplification is a form of data compression. Web servers use lossless compression algorithms like gzip to deflate data before transmission. Browsers that implement these algorithms can then fully restore the requested ressources resulting in lower bandwidth usage. The algorithms presented here however remove information from the data in a way that cannot be restored. This is called lossy compression. The most common usage for this on the web is the compression of image data.
There has been a rapid growth of complex applications running in web browsers. These so-called progressive web apps combine the fast reachability of web pages with the feature richness of locally installed applications. Even though these applications can grow quite complex, the requirement for fast page loads and instant user interaction still remains. One way to cope with this need is the use of compression algorithms to reduce the amount of data transmitted and processed. In a way simplification is a form of data compression. Web servers use lossless compression algorithms like gzip to deflate data before transmission. Browsers that implement these algorithms can then fully restore the requested resources, resulting in lower bandwidth usage. The algorithms presented here, however, remove information from the data in a way that cannot be restored. This is called lossy compression. The most common usage for this on the web is the compression of image data.
\subsection{Topology simplification for rendering performance}
While compression is often used to minimize bandwidth usage, the compression of geospatial data can particulary influence rendering performance. The bottleneck for rendering often is the SVG \todo{explain abbrevation} transformation used to display topology on the web. Implementing simplification algorithms for use on the web platform can lead to smoother user experience when working with large geodata sets.
While compression is often used to minimize bandwidth usage, the compression of geospatial data can particularly influence rendering performance. The bottleneck for rendering is often the transformation to scalable vector graphics used to display topology on the web. Implementing simplification algorithms for use on the web platform can lead to a smoother user experience when working with large geodata sets.
\subsection{Related work}
There have been previous attempts to speed up applications with WebAssembly. They have all seen great performance benefits when using this technology. Results show that over several source languages the performance is predictably consistent across browsers \parencite{surma2019replacing}. Reiser and Bläser even propose to cross-compile JavaScript to WebAssembly. Through their library Speedy.js one can compile TypeScript, a JavaScript superset, to WebAssembly. The performance gains of critical functions reach up to a factor of four \parencite{reiser2017accelerate}.


@ -81,12 +81,12 @@ In this chapter several algorithms for polyline simplification will be explained
%\paragraph{with reduction parameter} \todo{O(n*m)}
\paragraph{Visvalingam-Whyatt simplification} This is another global point routine. It was developed in 1993. Visvalingam and Wyatt use a area-based method to rank the points by their significance. To do that the "effective area" of each point has to be calculated. This is the area the point spans up with its adjoining points \parencite{shi2006performance}. Then the points with the least effective area get iteratively eliminated, and its neighbors effective area recalculated, until there are only two points left. At each elimination the point gets stored in a list alongside with its associated area. This is the effective area of that point or the associated area of the previous point in case the latter one is higher. \todo{explain why the others not}This way the algorithm can be used for scale dependent and scale-independent generalizations. \parencite{visvalingam1993line}
\paragraph{Visvalingam-Whyatt simplification} This is another global point routine. It was developed in 1993. Visvalingam and Whyatt use an area-based method to rank the points by their significance. To do that the "effective area" of each point has to be calculated. This is the area of the triangle the point spans with its two adjoining points \parencite{shi2006performance}. Then the point with the least effective area is iteratively eliminated, and the effective areas of its neighbors are recalculated, until there are only two points left. At each elimination the point gets stored in a list along with its associated area. This is the effective area of that point or the associated area of the previously eliminated point in case the latter one is higher. This way the algorithm can be used for scale-dependent and scale-independent generalizations \parencite{visvalingam1993line}.
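The elimination procedure described above can be sketched as follows. This is a minimal illustration, not the implementation evaluated in this thesis: the function names \texttt{triangleArea}, \texttt{visvalingamRanking} and \texttt{simplifyByArea} are hypothetical, points are assumed to be \texttt{[x, y]} arrays, and the minimum search is done naively in quadratic time instead of using a priority queue.

```javascript
// Area of the triangle spanned by three [x, y] points (the "effective area"
// of the middle point b with respect to its neighbors a and c).
function triangleArea(a, b, c) {
  return Math.abs(
    (b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])
  ) / 2;
}

// Hypothetical helper: eliminates interior points one by one, always taking
// the point with the smallest effective area, and records each eliminated
// point together with its associated area. Naive O(n^2) search for clarity.
function visvalingamRanking(points) {
  const remaining = points.slice();
  const ranking = [];
  let previousArea = 0;
  while (remaining.length > 2) {
    let minIndex = 1;
    let minArea = Infinity;
    for (let i = 1; i < remaining.length - 1; i++) {
      const area = triangleArea(remaining[i - 1], remaining[i], remaining[i + 1]);
      if (area < minArea) {
        minArea = area;
        minIndex = i;
      }
    }
    // The associated area is the effective area, or the associated area of
    // the previously eliminated point if that one is higher.
    previousArea = Math.max(previousArea, minArea);
    ranking.push({ point: remaining[minIndex], area: previousArea });
    remaining.splice(minIndex, 1);
  }
  return ranking;
}

// Hypothetical helper: keeps every point whose associated area reaches the
// given threshold. The first and last point are never eliminated.
function simplifyByArea(points, minArea) {
  const dropped = new Set(
    visvalingamRanking(points)
      .filter((entry) => entry.area < minArea)
      .map((entry) => entry.point)
  );
  return points.filter((point) => !dropped.has(point));
}
```

Because the ranking is computed once and stored, different thresholds can be applied afterwards without rerunning the elimination, which is what makes the method usable for several scales.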
\subsubsection{Summary}
The algorithms shown here are the most common used simplification algorithms in cartography and GIS. The usage of one algorithm stands out however. It is the Douglas-Peucker algorithm. In \textsf{Performance Evaluation of Line Simplification Algorithms for Vector Generalization} Shi and Cheung conclude that "the Douglas-Peucker algorithm was the most effective to preserve the shape of the line and the most accurate in terms of position" \parencite{shi2006performance}. Its complexity however is not ideal for web-based applications. The solution is to preprocess the line with the linear-time radial distance algorithm to reduce point clusters. This solution will be further discussed in section \ref{ch:simplify.js}.
The algorithms shown here are the most commonly used simplification algorithms in cartography and geographic information systems. The usage of one algorithm stands out, however: the Douglas-Peucker algorithm. In "Performance Evaluation of Line Simplification Algorithms for Vector Generalization" Shi and Cheung conclude that "the Douglas-Peucker algorithm was the most effective to preserve the shape of the line and the most accurate in terms of position" \parencite{shi2006performance}. Its complexity, however, is not ideal for web-based applications. The solution is to preprocess the line with the linear-time radial distance algorithm to reduce point clusters. This solution will be further discussed in section \ref{ch:simplify.js}.
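To illustrate the preprocessing step, the following is a minimal sketch of a radial distance filter. The function names \texttt{squaredDistance} and \texttt{radialDistance} are chosen for illustration and points are assumed to be \texttt{[x, y]} arrays; it is not meant to reproduce the code of any particular library.

```javascript
// Squared Euclidean distance between two [x, y] points.
function squaredDistance(a, b) {
  const dx = a[0] - b[0];
  const dy = a[1] - b[1];
  return dx * dx + dy * dy;
}

// Linear-time pass: a point is kept only if it lies farther than the
// tolerance from the last kept point. The final point is always kept.
function radialDistance(points, tolerance) {
  if (points.length === 0) return [];
  const squaredTolerance = tolerance * tolerance;
  let previous = points[0];
  const result = [previous];
  for (let i = 1; i < points.length; i++) {
    if (squaredDistance(points[i], previous) > squaredTolerance) {
      result.push(points[i]);
      previous = points[i];
    }
  }
  const last = points[points.length - 1];
  if (result[result.length - 1] !== last) result.push(last);
  return result;
}
```

Comparing squared distances avoids a square root per point. The reduced line can then be handed to the more expensive Douglas-Peucker routine.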


@ -17,11 +17,11 @@ label=lst:geojson-example
]{../data/example-simple.geojson}
The feature types differ in the format of their coordinates property. A position is an array of at least two elements representing longitude and latitude. An optional third element can be added to specify altitude. While the coordinates member of a \textsl{Point}-feature is simply a single position, a \textsl{LineString}-feature describes its geometry through an Array of at least two positions. More interesting is the specification for Polygons. It introduces the concept of the \textsl{linear ring} as a closed \textsl{LineString} with at least four positions where the first and last positions are equivalent. The \textsl{Polygon's} coordinates member is an array of linear rings with the first one representing the exterior ring and all others interior rings, also named surface and holes respectively. At last the coordinates member of \textsl{MultiLineStrings} and \textsl{MultiPolygons} is defined as a single array of its singular feature type.
The feature types differ in the format of their coordinates property. A position is an array of at least two elements representing longitude and latitude. An optional third element can be added to specify altitude. While the coordinates member of a Point-feature is simply a single position, a LineString-feature describes its geometry through an array of at least two positions. More interesting is the specification for Polygons. It introduces the concept of the linear ring as a closed LineString with at least four positions where the first and last positions are equivalent. The Polygon's coordinates member is an array of linear rings with the first one representing the exterior ring and all others interior rings, also named surface and holes respectively. Finally, the coordinates member of MultiLineStrings and MultiPolygons is defined as a single array of its singular feature type.
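As a small illustration of the linear ring rule, the following sketch shows a Polygon geometry with one exterior ring and one hole, together with a check for the two conditions named above. The coordinate values and the helper name \texttt{isLinearRing} are made up for this example.

```javascript
// Example Polygon geometry: the first ring is the exterior ring (surface),
// the second one an interior ring (hole). Values are illustrative only.
const polygon = {
  type: "Polygon",
  coordinates: [
    [[0, 0], [10, 0], [10, 10], [0, 10], [0, 0]],
    [[2, 2], [2, 4], [4, 4], [4, 2], [2, 2]]
  ]
};

// Hypothetical helper: a linear ring has at least four positions and its
// first and last positions are equivalent.
function isLinearRing(ring) {
  if (!Array.isArray(ring) || ring.length < 4) return false;
  const first = ring[0];
  const last = ring[ring.length - 1];
  return (
    first.length === last.length &&
    first.every((value, index) => value === last[index])
  );
}
```

A simplification routine operating on such rings has to keep the closing position in place, otherwise the result is no longer a valid Polygon.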
GeoJSON is mainly used for web-based mapping. Since it is based on JSON it inherits its strengths. There is for one the enhanced readability through reduced markup overhead compared to XML-based data types like GML. Interoperability with web applications comes for free since the parsing of JSON-objects is integrated in JavaScript. Unlike the \todo{Introduce shapefiles}Esri Shapefile format a single file is sufficient to store and transmit all relevant data, including feature properties.
GeoJSON is mainly used for web-based mapping. Since it is based on JSON it inherits its strengths. For one, there is the enhanced readability through reduced markup overhead compared to XML-based data types like GML. Interoperability with web applications comes for free since the parsing of JSON objects is integrated in JavaScript. Unlike the Esri Shapefile\footnote{\url{https://doc.arcgis.com/en/arcgis-online/reference/shapefiles.htm}} format, a single file is sufficient to store and transmit all relevant data, including feature properties.
To its downsides count that a text based format cannot store the geometries as efficiently as it would be possible with a binary format. Also only vector-based data types can be represented. Another disadvantage can be the strictly non-topologic approach. Every feature is completely described by one entry. However, when there are features that share common components, like boundaries in neighboring polygons, these data points will be encoded twice in the GeoJSON object. On the one hand this further poses concerns about data size. \todo{other formulation. Two negatives}On the other hand it is more difficult to execute topological analysis on the data set. Luckily there is a related data structure to tackle this problem.
Among its downsides is that a text-based format cannot store the geometries as efficiently as would be possible with a binary format. Also, only vector-based data types can be represented. Another disadvantage can be the strictly non-topological approach. Every feature is completely described by one entry. However, when there are features that share common components, like boundaries in neighboring polygons, these data points will be encoded twice in the GeoJSON object. This poses further concerns about data size. It is also more difficult to execute topological analysis on the data set. Luckily there is a related data structure to tackle this problem.
\paragraph{TopoJSON} is an extension of GeoJSON and aims to encode data structures into a shared topology \parencite{bostock2017topojson}. It supports the same geometry types as GeoJSON. It differs in some additional properties and new object types like "Topology" and "GeometryCollection". Its main feature is that LineStrings, Polygons and their multi-part equivalents must define line segments in a common property called "arcs". The geometries themselves then reference the arcs from which they are made up. This reduces redundancy of data points. Another feature is the quantization of positions. To use it, one can define a "transform" object which specifies a scale and a translation to encode all coordinates. Together with delta-encoding of position arrays one obtains integer values better suited for efficient serialization and reduced file size.
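How quantization and delta-encoding work together can be shown by decoding a single arc. The sketch below assumes a transform object with the members \texttt{scale} and \texttt{translate} and an arc given as an array of integer offset pairs; the function name \texttt{decodeArc} and the sample values are chosen for illustration.

```javascript
// Decodes one quantized, delta-encoded arc into absolute positions.
// Each entry of the arc is an offset to the previous quantized position;
// the running sum is then scaled and translated back to real coordinates.
function decodeArc(arc, transform) {
  const [scaleX, scaleY] = transform.scale;
  const [translateX, translateY] = transform.translate;
  let x = 0;
  let y = 0;
  return arc.map(([deltaX, deltaY]) => {
    x += deltaX;
    y += deltaY;
    return [x * scaleX + translateX, y * scaleY + translateY];
  });
}
```

With a scale of 0.5 in both directions and a translation of 100 along the first axis, the arc \texttt{[[4, 0], [2, -2], [-6, 2]]} decodes to the positions (102, 0), (103, -1) and (100, 0). The stored offsets stay small integers even when the absolute coordinates are large.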


@ -1,7 +1,7 @@
\subsection[Web runtimes]{Running the algorithms on the web platform}
\todo{new sentence plus source}JavaScript has been the only native programming language of web browsers for a long time. With the development of WebAssembly, there seems to be an alternative on its way. This technology, its benefits and drawbacks, will be explained in this chapter.
JavaScript was traditionally the only native programming language of web browsers \parencite{reiser2017accelerate}. With the development of WebAssembly, there seems to be an alternative on its way. This technology, its benefits and drawbacks, will be explained in this chapter.
\subsubsection{Introduction to WebAssembly}
@ -14,7 +14,7 @@ These goals are not specific to WebAssembly. They can be seen as properties that
It is often stated that WebAssembly can bring performance benefits. It makes sense that statically typed machine code beats scripting languages performance-wise. It remains to be observed, however, whether the overhead of switching contexts will negate this performance gain. JavaScript has made a lot of performance improvements over the past years. Not least, Google's development of the V8 engine has brought JavaScript to an acceptable speed for extensive calculations. Modern engines observe the execution of running JavaScript code and perform optimizations that are comparable to those of compilers \parencite{clark2017what}.
The JavaScript ecosystem has rapidly evolved the past years. Thanks to package managers like \todo{format}bower, npm and yarn it is simple to pull code from external sources into ones codebase. Initially thought for server sided JavaScript execution the ecosystem has found its way into front-end development via module bundlers like browserify, webpack and rollup. In course of this growth many algorithms and implementations have been ported to JavaScript for use on the web. With WebAssembly this ecosystem can be broadened even further. By lifting the language barrier, existing work of many more programmers can be reused on the web. Whole libraries exclusive for native development could be imported by a few simple tweaks. Codecs not supported by browsers can be made available for use in any browser supporting WebAssembly. \parencite{surma2018emscripting}
The JavaScript ecosystem has rapidly evolved over the past years. Thanks to package managers like Bower, npm and Yarn it is simple to pull code from external sources into one's codebase. Initially intended for server-side JavaScript execution, the ecosystem has found its way into front-end development via module bundlers like browserify, webpack and rollup. In the course of this growth many algorithms and implementations have been ported to JavaScript for use on the web. With WebAssembly this ecosystem can be broadened even further. By lifting the language barrier, existing work of many more programmers can be reused on the web. Whole libraries exclusive to native development could be imported with a few simple tweaks. Codecs not supported by browsers can be made available for use in any browser supporting WebAssembly \parencite{surma2018emscripting}.
% In this these the C++ library psimpl will be utilized to bring polyline simplification to the web. This library already implements various algorithms for this task. It will be further introduced in chapter \ref{ch:psimpl}.
@ -22,7 +22,7 @@ The JavaScript ecosystem has rapidly evolved the past years. Thanks to package m
There are various compilers with WebAssembly as compilation target. In this thesis the Emscripten toolchain is used. Other notable compilers are wasm-pack\footnote{\url{https://rustwasm.github.io/}} for Rust projects and AssemblyScript\footnote{\url{https://github.com/AssemblyScript/assemblyscript}} for a TypeScript subset. This latter compiler is particularly interesting as TypeScript, itself a superset of JavaScript, is a popular choice among web developers. This reduces the friction for WebAssembly integration as it is not necessary to learn a new language.
Emscripten\footnote{\url{https://webassembly.org/}} started with the goal to compile unmodified C and C++ applications to JavaScript. This is achieved by acting as a compiler backend to \todo{introduce LLVM}LLVM assembly. High level languages compile through a frontend into the LLVM intermediate representation. Well known frontends are Clang and LLVM-GCC. From there it gets passed through a backend to generate the architecture specific machine code. Emscripten hooks in here to generate asm.js, a performant JavaScript subset. In figure \ref{fig:emscripten-chain} one such example chain can be seen. On the left is the original C code which sums up numbers from 1 to 100. The resulting LLVM assembly can be seen in the middle. It is definitely more verbose, but \todo{explain easiness}easier to work on for the backend compiler. Notable are the allocation instructions, the labeled code blocks and code flow moves. The JavaScript representation on the right is the nearly one to one translation of the LLVM assembly. The branching is done via a switch-in-for loop, memory is implemented by a JavaScript array named HEAP and LLVM assembly functions calls become normal JavaScript function calls like \textsf{\_printf()}. Through optimizations the code becomes more compact and only then more performant. \parencite{zakai2011emscripten}
Emscripten\footnote{\url{https://webassembly.org/}} started with the goal to compile unmodified C and C++ applications to JavaScript. This is achieved by acting as a compiler backend to LLVM assembly. High-level languages compile through a frontend into the LLVM intermediate representation. Well-known frontends are Clang and LLVM-GCC. From there it gets passed through a backend to generate the architecture-specific machine code. Emscripten hooks in here to generate asm.js, a performant JavaScript subset. In figure \ref{fig:emscripten-chain} one such example chain can be seen. On the left is the original C code which sums up numbers from 1 to 100. The resulting LLVM assembly can be seen in the middle. It is definitely more verbose, but easier to work on for the backend compiler. Notable are the allocation instructions, the labeled code blocks and code flow moves. The JavaScript representation on the right is a nearly one-to-one translation of the LLVM assembly. The branching is done via a switch-in-for loop, memory is implemented by a JavaScript array named HEAP and LLVM assembly function calls become normal JavaScript function calls like \texttt{\_printf()}. Through optimizations the code becomes more compact and only then more performant \parencite{zakai2011emscripten}.
\begin{figure}
\centering
@ -33,8 +33,6 @@ Emscripten\footnote{\url{https://webassembly.org/}} started with the goal to com
\label{fig:emscripten-chain}
\end{figure}
\todo{reconstruct the figure and add "after (source)"}
It is in fact this project that inspired the creation of WebAssembly. It was even called the "natural evolution of asm.js"\footnote{\url{https://groups.google.com/forum/\#!topic/emscripten-discuss/k-egXO7AkJY/discussion}}. As of May 2018 Emscripten changed its default output to WebAssembly\footnote{\url{https://github.com/emscripten-core/emscripten/pull/6419}} while still supporting asm.js. Currently the default backend named \texttt{fastcomp} generates the WebAssembly bytecode from asm.js. A new backend however is about to take its place that compiles directly from LLVM \parencite{zakai2019llvmbackend}.
It is in fact this project that inspired the creation of WebAssembly. It was even called the "natural evolution of asm.js"\footnote{\url{https://groups.google.com/forum/\#!topic/emscripten-discuss/k-egXO7AkJY/discussion}}. As of May 2018 Emscripten changed its default output to WebAssembly\footnote{\url{https://github.com/emscripten-core/emscripten/pull/6419}} while still supporting asm.js. Currently the default backend named \textsf{fastcomp} generates the WebAssembly bytecode from asm.js. A new backend however is about to take its place that compiles directly from LLVM \parencite{zakai2019llvmbackend}.
The compiler is only one part of the Emscripten toolchain. Part of it are various \todo{abbrevation tools latex}APIs, for example for file system emulation or network calls, and tools like the compiler mentioned.
The compiler is only one part of the Emscripten toolchain. The toolchain also comprises various APIs, for example for file system emulation or network calls, and tools like the compiler mentioned.


@ -38,7 +38,7 @@ Since it is not clear which case is faster, and given how simple the required ch
In scope of this thesis a library will be created that implements the same procedure as Simplify.js in C code. It will be made available on the web platform through WebAssembly. In the style of the model library it will be called Simplify.wasm. The compiler to be used will be Emscripten as it is the standard for porting C code to WebAssembly.
As mentioned, the first step is to port Simplify.js to the C programming language. The file \path{lib/simplify-wasm/simplify.c} shows the attempt. It is kept as close to the JavaScript library as possible. This may result in C-untypical coding style but prevents skewed results from unexpected optimizations to the procedure itself. The entry point is not the \texttt{main}-function but a function called \todo{format}simplify. This is specified to the compiler as can be seen in listing \ref{lst:simplify-wasm-compiler-call}.
As mentioned, the first step is to port Simplify.js to the C programming language. The file \path{lib/simplify-wasm/simplify.c} shows the attempt. It is kept as close to the JavaScript library as possible. This may result in a coding style that is untypical for C but prevents skewed results from unexpected optimizations to the procedure itself. The entry point is not the \texttt{main}-function but a function called \texttt{simplify}. This is specified to the compiler as can be seen in listing \ref{lst:simplify-wasm-compiler-call}.
\lstinputlisting[
float=htpb,
@ -47,8 +47,8 @@ language=bash,
label=lst:simplify-wasm-compiler-call,
caption={The call to compile the C source code to WebAssembly in a Makefile}
]{../lib/simplify-wasm/Makefile}
\todo{more about compiler call}
Furthermore, the functions \todo{format}malloc and free from the standard library are made available for the host environment. Compiling the code through Emscripten produces a binary file in wasm format and the glue code as JavaScript. These files are called \texttt{simplify.wasm} and \texttt{simplify.js} respectively.
Furthermore, the functions \texttt{malloc} and \texttt{free} from the standard library are made available for the host environment. Another option specifies the optimisation level. With \texttt{O3} the highest level is chosen. The closure compiler minifies the JavaScript glue code. Compiling the code through Emscripten produces a binary file in wasm format and the glue code as JavaScript. These files are called \texttt{simplify.wasm} and \texttt{simplify.js} respectively.
An example usage can be seen in \path{lib/simplify-wasm/example.html}. Even though the memory access is abstracted in this example, the process is still cumbersome and far from a drop-in replacement of Simplify.js. Thus in \path{lib/simplify-wasm/index.js} a further abstraction over the code emitted by Emscripten was written. The exported function \texttt{simplifyWasm} handles module instantiation, memory access and the correct call to the exported wasm function. Finding the correct path to the wasm binary is not always straightforward when the code is imported from another location. The proposed solution is to leave the resolving of the code path to an asset bundler that processes the file in a preprocessing step.
@ -62,16 +62,15 @@ caption={The top level function to invoke the WebAssembly simplification.}
Listing \ref{lst:simplify-wasm} shows the function \texttt{simplifyWasm}. Further explanation will follow regarding the abstractions \texttt{getModule}, \texttt{storeCoords} and \texttt{loadResultAndFreeMemory}.
\paragraph {Module instantiation} will be done on the first call only but requires the function to be asynchronous. For a neater experience in handling Emscripten modules, a utility function named \texttt{initEmscripten}\footnote{/lib/wasm-util/initEmscripten.js} was written to turn the module factory into a JavaScript \todo{format}Promise that resolves on finished compilation. The usage of this function can be seen in listing \ref{lst:simplify-wasm-emscripten-module}. The resulting WebAssembly module is cached in the variable \texttt{emscriptenModule}.
\paragraph {Module instantiation} will be done on the first call only but requires the function to be asynchronous. For a neater experience in handling Emscripten modules, a utility function named \texttt{initEmscripten}\footnote{/lib/wasm-util/initEmscripten.js} was written to turn the module factory into a JavaScript \texttt{Promise} that resolves on finished compilation. The usage of this function can be seen in listing \ref{lst:simplify-wasm-emscripten-module}. The resulting WebAssembly module is cached in the variable \texttt{emscriptenModule}.
\lstinputlisting[
float=htbp,
language=javascript,
firstline=35, lastline=40,
caption=My Caption,
caption=Caching the instantiated Emscripten module,
label=lst:simplify-wasm-emscripten-module
]{../lib/simplify-wasm/index.js}
\todo{find a caption for listing 8}
\paragraph {Storing coordinates} into the module memory is done in the function \texttt{storeCoords}. Emscripten offers multiple views on the module memory. These correspond to the available WebAssembly data types (e.g. HEAP8, HEAPU8, HEAPF32, HEAPF64, ...)\footnote{\url{https://emscripten.org/docs/api_reference/preamble.js.html\#type-accessors-for-the-memory-model}}. As JavaScript numbers are always represented as double-precision 64-bit binary format values\footnote{\url{https://www.ecma-international.org/ecma-262/6.0/\#sec-4.3.20}} (IEEE 754-2008), the HEAPF64-view is the way to go to not lose precision. Accordingly the data type double is used in C to work with the data. Listing \ref{lst:wasm-util-store-coords} shows the transfer of coordinates into the module memory. In line 3 the memory is allocated using the exported \texttt{malloc}-function. A JavaScript TypedArray is used for accessing the buffer such that the loop for storing the values (lines 5--8) is trivial.
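Independent of the actual listing, the idea of writing coordinates through a 64-bit float view can be sketched as follows. The sketch does not reproduce the thesis code: \texttt{createFakeModule} is a hypothetical stand-in for a compiled Emscripten module so that the example runs without a wasm binary, and its bump allocator only imitates the exported \texttt{\_malloc}. Coordinates are assumed to be two-dimensional.

```javascript
// Hypothetical stand-in for an Emscripten module: a plain ArrayBuffer with a
// HEAPF64 view and a trivial bump allocator in place of the real _malloc.
function createFakeModule(byteLength) {
  const buffer = new ArrayBuffer(byteLength);
  let nextFree = 8; // keep allocations 8-byte aligned for Float64 access
  return {
    HEAPF64: new Float64Array(buffer),
    _malloc(size) {
      const pointer = nextFree;
      nextFree += Math.ceil(size / 8) * 8;
      return pointer;
    }
  };
}

// Sketch of the transfer: allocate room for all values, then write the
// flattened [x, y] pairs through a Float64Array view on the module memory.
// Returns the byte offset (pointer) of the stored data.
function storeCoords(module, coords) {
  const flatSize = coords.length * 2;
  const pointer = module._malloc(flatSize * Float64Array.BYTES_PER_ELEMENT);
  const heap = new Float64Array(module.HEAPF64.buffer, pointer, flatSize);
  for (let i = 0; i < coords.length; i++) {
    heap[2 * i] = coords[i][0];
    heap[2 * i + 1] = coords[i][1];
  }
  return pointer;
}
```

The returned pointer is a byte offset, so reading a value back through \texttt{HEAPF64} requires dividing it by eight. On the C side the same memory would be read as an array of \texttt{double}.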
@ -110,11 +109,11 @@ For web applications an important measure is the size of libraries. It defines t
The file sizes in this chapter will be given as the gzipped size. gzip is a file format for compressed files based on the DEFLATE algorithm. It is natively supported by all browsers and the most common web server software. So this is the format that files will be transmitted in on production applications.
For JavaScript applications there is also the possibility of reducing filesize by code minification. This is the process of reformating the source code without changing the functionality. \todo{wording}Optimization are brought for example by removing unnecessary parts like spaces and comments or reducing variable names to single letters. Minification is often done in asset bundlers that process the JavaScript source files and produce the bundled application code.
For JavaScript applications there is also the possibility of reducing file size by code minification. This is the process of reformatting the source code without changing the functionality. Savings are achieved, for example, by removing unnecessary parts like spaces and comments or by reducing variable names to single letters. Minification is often done in asset bundlers that process the JavaScript source files and produce the bundled application code.
For the WebAssembly solution there are two files required to work with it. The \textsf{.wasm} bytecode and JavaScript gluecode. The glue code is already minified by the Emscripten compiler. The binary has a size of 3.8KB while the JavaScript code has a total of 3.1KB. Simplify.js on the other hand will merely need a size of 1.1KB. With minification the size shrinks to 638 bytes.
The WebAssembly solution requires two files: the \texttt{.wasm} bytecode and the JavaScript glue code. The glue code is already minified by the Emscripten compiler. The binary has a size of 3.8KB while the JavaScript code has a total of 3.1KB. Simplify.js on the other hand only needs 1.1KB. With minification the size shrinks to 638 bytes.
File size was not the main priority when producing the WebAssembly solution. There are ways to further shrink the size of the bytecode. As of now it contains the logic of the library but also necessary functionality from the C standard library. These were added by Emscripten automatically. The bloat comes from using the memory management functions \todo{format}malloc and free. If the goal was to reduce the file size, one would have to get along without memory management at all. This would even be possible in this case as the simplification process is a self-contained process and the module has no other usage. The input size is known beforehand so instead of creating reserved memory one could just append the result in memory at the location directly after the input feature. The function would merely need to return the result size. After the call is finished and the result is read by JavaScript the memory is not needed anymore. A test build was made which renounced from memory management. The size of the wasm bytecode shrunk to 507 byte and the glue code to 2.8KB. By using vanilla JavaScript API one could even ditch the glue code altogether \parencite{surma2019replacing}.
File size was not the main priority when producing the WebAssembly solution. There are ways to further shrink the size of the bytecode. As of now it contains the logic of the library but also necessary functionality from the C standard library, which was added by Emscripten automatically. The bloat comes from using the memory management functions \texttt{malloc} and \texttt{free}. If the goal was to reduce the file size, one would have to get along without memory management at all. This would even be possible in this case as the simplification is a self-contained process and the module has no other usage. The input size is known beforehand, so instead of reserving memory one could just append the result in memory at the location directly after the input feature. The function would merely need to return the result size. After the call is finished and the result is read by JavaScript the memory is not needed anymore. A test build was made which dispensed with memory management. The size of the wasm bytecode shrank to 507 bytes and the glue code to 2.8KB. By using the vanilla JavaScript API one could even ditch the glue code altogether \parencite{surma2019replacing}.
For simplicity the memory management was left in, as the optimizations would require more careful engineering to ensure correct functionality. The example above shows, however, that there is enormous potential to cut the size. Even file sizes below the JavaScript original are possible.
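The allocation-free memory layout described above can be sketched as follows. This is not the test build itself: a plain \texttt{Float64Array} stands in for the module memory, the function name \texttt{simplifyAppend} is hypothetical, and a simple radial distance filter is used as a placeholder for the actual simplification algorithm. The sketch assumes that the memory is at least twice as large as the input.

```javascript
// Stand-in for the module's linear memory (assumption for illustration).
const memory = new Float64Array(64);

// Reads `pointCount` x/y pairs starting at index 0, appends the kept points
// directly behind the input and returns only the number of kept points.
// Placeholder algorithm: keep the first and last point and every point that
// is farther than `tolerance` away from the previously kept point.
function simplifyAppend(heap, pointCount, tolerance) {
  const resultStart = pointCount * 2; // result begins right after the input
  const sqTolerance = tolerance * tolerance;
  let kept = 0;
  let prevX = 0;
  let prevY = 0;
  for (let i = 0; i < pointCount; i++) {
    const x = heap[2 * i];
    const y = heap[2 * i + 1];
    const dx = x - prevX;
    const dy = y - prevY;
    const isEndpoint = i === 0 || i === pointCount - 1;
    if (isEndpoint || dx * dx + dy * dy > sqTolerance) {
      heap[resultStart + 2 * kept] = x;
      heap[resultStart + 2 * kept + 1] = y;
      kept++;
      prevX = x;
      prevY = y;
    }
  }
  return kept;
}

// Caller side: write the input at the start, call, then read the result.
const input = [0, 0, 0.5, 0, 2, 0, 2.1, 0, 4, 0];
memory.set(input, 0);
const resultCount = simplifyAppend(memory, input.length / 2, 1);
const result = Array.from(
  memory.subarray(input.length, input.length + resultCount * 2)
);
```

Since the caller knows the input length, the start of the result is implied and only the result size has to cross the boundary between WebAssembly and JavaScript.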
@ -156,7 +155,7 @@ On the bottom the different types of Benchmarks implemented can be seen. They al
Benchmark.js combines these approaches. In a first step it approximates the runtime in a few cycles. From this value it calculates the number of iterations needed to reach an uncertainty of at most 1\%. Then the samples are gathered \parencite{hossain2012benchmark}.
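To make the notion of uncertainty more concrete, the sketch below computes a relative margin of error from a set of timing samples and keeps sampling until it falls below a threshold. It is not the Benchmark.js implementation: the function names are hypothetical, and the fixed critical value of 1.96 (normal approximation for a 95\% confidence level) is an assumption made here for brevity.

```javascript
// Relative margin of error in percent for a list of measured run times.
function relativeMarginOfError(samples) {
  const n = samples.length;
  const mean = samples.reduce((sum, value) => sum + value, 0) / n;
  const variance =
    samples.reduce((sum, value) => sum + (value - mean) ** 2, 0) / (n - 1);
  const standardError = Math.sqrt(variance / n);
  const critical = 1.96; // assumed normal approximation, 95% confidence
  return ((critical * standardError) / mean) * 100;
}

// Call `measure` repeatedly until the uncertainty is small enough.
function collectSamples(measure, maxErrorPercent = 1, minSamples = 5, maxSamples = 1000) {
  const samples = [];
  do {
    samples.push(measure());
  } while (
    samples.length < minSamples ||
    (relativeMarginOfError(samples) > maxErrorPercent &&
      samples.length < maxSamples)
  );
  return samples;
}

// A perfectly steady measurement stops as soon as the minimum is reached.
const steadySamples = collectSamples(() => 10);
const noisyError = relativeMarginOfError([9, 11]);
```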
\subsubsection{The benchmark suite}
For running multiple benchmarks the class \texttt{BenchmarkSuite} was created. It takes a list of \texttt{BenchmarkCases} and runs them through a \texttt{BenchmarkType}. The suite manages starting, pausing and stopping of going through list. It updates the statistics gathered on each cycle. By injecting an \textsl{onCycle} method, the \texttt{Runner} component can give live feedback about the progress.
For running multiple benchmarks the class \texttt{BenchmarkSuite} was created. It takes a list of \texttt{BenchmarkCases} and runs them through a \texttt{BenchmarkType}. The suite manages starting, pausing and stopping the iteration over the list. It updates the statistics gathered on each cycle. By injecting an \texttt{onCycle} method, the \texttt{Runner} component can give live feedback about the progress.
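The control flow of such a suite might look roughly like the sketch below. It is a simplified, synchronous illustration and not the class from the application: the name \texttt{BenchmarkSuiteSketch}, the \texttt{run} method of the benchmark type and the arguments passed to \texttt{onCycle} are all assumptions.

```javascript
class BenchmarkSuiteSketch {
  constructor(benchmarkType, cases, onCycle = () => {}) {
    this.benchmarkType = benchmarkType;
    this.cases = cases;
    this.onCycle = onCycle; // injected callback for live feedback
    this.position = 0;
    this.stats = [];
    this.state = "idle";
  }

  // Start or resume going through the list of cases.
  start() {
    this.state = "running";
    while (this.state === "running" && this.position < this.cases.length) {
      const result = this.benchmarkType.run(this.cases[this.position]);
      this.stats.push(result);
      this.position++;
      this.onCycle(this.position, this.cases.length, result);
    }
    if (this.position >= this.cases.length) this.state = "finished";
  }

  // Keep position and statistics so that start() continues where it left off.
  pause() {
    this.state = "paused";
  }

  // Discard all progress.
  stop() {
    this.state = "idle";
    this.position = 0;
    this.stats = [];
  }
}

// Usage with a dummy benchmark type that just doubles the case value.
const progress = [];
const suite = new BenchmarkSuiteSketch(
  { run: (benchmarkCase) => benchmarkCase * 2 },
  [1, 2, 3],
  (done, total) => {
    progress.push(`${done}/${total}`);
    if (done === 2 && progress.length === 2) suite.pause();
  }
);
suite.start(); // pauses itself after the second cycle
const positionAfterPause = suite.position;
suite.start(); // resumes with the remaining case
```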
\begin{figure}[htb]
\centering

View File

@ -18,8 +18,9 @@ Each section in this chapter describes a set of benchmarks run on the same syste
\end{table}
First, it will be observed how the algorithms perform in different browsers. The chart used for this is the ``Simplify.js vs Simplify.wasm'' chart. A Windows system was chosen because it allows running benchmarks in three of the four browsers in question. The dataset is the Simplify.js example, which will be simplified with and without the high quality mode.
\\ % to prevent footnote split
The device is a \textsf{HP Pavilion x360 - 14-ba101ng}\footnote{\url{https://support.hp.com/us-en/product/hp-pavilion-14-ba100-x360-convertible-pc/16851098/model/18280360/document/c05691748}} convertible. It contains an Intel® Core™ i5-8250U Processor with 4 cores and 6MB cache. The operating system is Windows 10 and the browsers are on their newest versions with Chrome 75, Firefox 68 and Edge 44.18362.1.0.
The device is an HP Pavilion x360 - 14-ba101ng\footnote{\url{https://support.hp.com/us-en/product/hp-pavilion-14-ba100-x360-convertible-pc/16851098/model/18280360/document/c05691748}} convertible. It contains an Intel® Core™ i5-8250U processor with 4 cores and 6MB cache. The operating system is Windows 10 and the browsers are on their newest versions with Chrome 75, Firefox 68 and Edge 44.18362.1.0.
Table \ref{tbl:dimensions-1} summarizes the setting. For each problem dimension the chosen characteristics are highlighted in green. The number of benchmark diagrams in a chapter is determined by the number of combinations of the selected characteristics. Here, three browsers are tested, each with two quality options, resulting in six diagrams.
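The count of six diagrams is simply the size of the Cartesian product of the selected characteristics. The sketch below illustrates this; the helper name \texttt{combinations} and the dimension keys are hypothetical and chosen only for this example.

```javascript
// Build every combination of the selected characteristics per dimension.
function combinations(dimensions) {
  return Object.entries(dimensions).reduce(
    (partial, [name, values]) =>
      partial.flatMap((combo) =>
        values.map((value) => ({ ...combo, [name]: value }))
      ),
    [{}]
  );
}

const diagrams = combinations({
  system: ["Windows"],
  browser: ["Chrome", "Firefox", "Edge"],
  dataset: ["Simplify.js example"],
  highQuality: [false, true],
});

console.log(diagrams.length); // 1 * 3 * 1 * 2 combinations
```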
@ -104,9 +105,7 @@ When turning on high quality mode the JavaScript implementations still perform a
\caption{Problem dimensions of Case 4}
\end{table}
\todo{Take red line back in. Titel passt sonst nicht.}
In this case the system is a \textsf{Lenovo Miix 510} convertible with Ubuntu 19.04 as the operating system. Again the bavarian outline is used for simplification with both quality settings. It will be observed if the Turf.js implementation is reasonable. The third kind of chart is in use here, which is similar to the Simplify.wasm insights. There are also stacked bar charts used to visualize the time spans of subtasks. The results will be compared to the graphs of the Simplify.js vs. Simplify.wasm chart. As the Turf.js method only makes sense when the original version is faster than the alternative, the benchmarks are performed in the Firefox browser.
In this case the system is a Lenovo Miix 510 convertible with Ubuntu 19.04 as the operating system. Again the Bavarian outline is used for simplification with both quality settings. It will be examined whether the Turf.js implementation is reasonable. The third kind of chart is in use here, which is similar to the Simplify.wasm insights. Stacked bar charts are also used to visualize the time spans of subtasks. The results will be compared to the graphs of the Simplify.js vs. Simplify.wasm chart. As the Turf.js method only makes sense when the original version is faster than the alternative, the benchmarks are performed in the Firefox browser.
\input{./results-benchmark/ubu_ffox_bavaria_vs_true.tex}
\input{./results-benchmark/ubu_ffox_bavaria_jsstack_true.tex}
@ -134,7 +133,7 @@ At last the results from a mobile device are shown. The device is an iPad Air wi
\input{./results-benchmark/ipad_safa_simplify_vs_false.tex}
\input{./results-benchmark/ipad_safa_simplify_vs_true.tex}
When the high quality parameter is left in its default state the WebAssembly solution is fastest on low tolerance numbers (figure \ref{fig:ipad_safa_simplify_vs_false}). As seen before the JavaScript versions are getting faster when the tolerance increases. The original Simplify.js version surpasses the WebAssembly performance while the alternative tangents it. As it was the case on the desktop system the algorithms perform similarly when high quality is set to \textsf{true}. Figure \ref{fig:ipad_safa_simplify_vs_true} shows that Simplify.wasm is also here the faster method.
When the high quality parameter is left in its default state, the WebAssembly solution is fastest for low tolerance values (figure \ref{fig:ipad_safa_simplify_vs_false}). As seen before, the JavaScript versions get faster as the tolerance increases. The original Simplify.js version surpasses the WebAssembly performance while the alternative only just reaches it. As was the case on the desktop system, the algorithms perform similarly when high quality is set to \texttt{true}. Figure \ref{fig:ipad_safa_simplify_vs_true} shows that Simplify.wasm is the faster method here as well.
\input{./results-benchmark/ipad_ffox_simplify_vs_false.tex}
\input{./results-benchmark/ipad_ffox_simplify_vs_true.tex}

View File

@ -39,7 +39,7 @@ In the results, Simplify.wasm is always faster when the high quality mode is ena
This shows that it is not always ideal to replace a library with a WebAssembly based approach. The cost of the overhead might exceed the performance gain when the execution time is low. In section \ref{ch:discussion-wasm-insights} it is pointed out that the pure execution time of the simplification algorithm is fastest with WebAssembly. When preparing the geodata beforehand, for example by serializing it in a binary representation, one could immediately call the bytecode. This imposes further effort regarding memory management on the web developer. One has to weigh the added complexity against the performance benefit when considering such approaches.
\subsection{Analysis of Turf.js implementation}
%\subsection{Analysis of Turf.js implementation}
In this section the method used by Turf.js is evaluated. As seen when using the Chrome or Edge browser, the original library is the slower JavaScript method for simplification. There the data transformation is clearly unfavorable. In Safari, where the JavaScript versions perform equally, the overhead will still lead to worse run times. Lastly the Firefox browser will be examined. The results from chapter \ref{ch:case4} show that there are indeed cases where the method prevails. These are the ones where the execution time is large enough to justify the overhead, namely when high quality is enabled, or when high quality is disabled and the tolerance values are low.

Binary file not shown.

Binary file not shown.

Binary file not shown.

Binary file not shown.
