diff --git a/_posts/2019-12-01-rle-array.markdown b/_posts/2019-12-01-rle-array.markdown
index 8e73acb..f61be73 100644
--- a/_posts/2019-12-01-rle-array.markdown
+++ b/_posts/2019-12-01-rle-array.markdown
@@ -80,13 +80,13 @@ Run-length encoding is a simple yet powerful technique. Instead of storing array
so-called "runs" --- consecutive elements of the array where the same value is stored. For each run, it then just keeps
its value and length:
-
+
Pandas requires us to be able to do quick [random access](https://en.wikipedia.org/wiki/Random_access), e.g. for
sorting and group-by operations. Instead of the actual run-lengths we store the end positions of each run (this is the
cumulative sum of the lengths):
-
+
This way, we can use [binary search](https://en.wikipedia.org/wiki/Binary_search_algorithm) to implement random access.
@@ -119,7 +119,7 @@ created as followed:
The whole setup can also be visualized:
-
+
You can generate the same data using
[`rle_array.testing.generate_test_dataframe`](https://jdasoftwaregroup.github.io/rle-array/_rst/rle_array.testing.html#rle_array.testing.generate_test_dataframe).
@@ -162,7 +162,7 @@ encouraged to try these and others.
Dictionary encoding replaces the actual payload data with a mapping. The trick is that mapped values can often be more
memory-efficient, especially when the original values are long (e.g. strings) and are repeated multiple times:
-
+
This is what [Pandas Categoricals](https://pandas.pydata.org/pandas-docs/stable/user_guide/categorical.html) implement.
For data-at-rest, this is implemented by
@@ -187,7 +187,7 @@ This distinction between semantics and data size is also made by the
Here is what this looks like in memory (for [big endian machines](https://en.wikipedia.org/wiki/Endianness)):
-
+
In this example, we can easily use 16 bits per element instead of 64, resulting in a 75% memory reduction.
@@ -198,7 +198,7 @@ noticeable exceptions due to the lacking hardware support on most CPUs), it also
### Bit-packing
Bit-packing is similar to [Data Types](#data-types), but allows creating types with non-standard widths:
-
+
The advantage is that you can save even more memory, but it comes with heavy performance penalties, since
CPUs cannot read unaligned data that efficiently. In some cases, however, it can be even faster due to the saved memory
@@ -212,7 +212,7 @@ Often we find columns in our DataFrames where information only occurs for a very
is often more efficient to explicitly store and look-up these few cases --- e.g. by using a
[HashTable](https://en.wikipedia.org/wiki/Hash_table) --- than using a simple array:
-
+
This is what [Pandas SparseArray](https://pandas.pydata.org/pandas-docs/stable/user_guide/sparse.html) implements. Note
that the default value does not need to be `0`, but can be an arbitrary element. One downside of sparse arrays is that
diff --git a/assets/css/main.scss b/assets/css/main.scss
index 433e2e1..acd2bbb 100644
--- a/assets/css/main.scss
+++ b/assets/css/main.scss
@@ -77,6 +77,10 @@ h6
padding-right: 0;
}
+.page__content img[src$=".svg"] {
+ width: 80%;
+}
+
.page__footer
{
background-color: $primary-color;
diff --git a/assets/images/2019-12-01-rle-array.graffle b/assets/images/2019-12-01-rle-array.graffle
index aaeb89b..d3036dd 100644
Binary files a/assets/images/2019-12-01-rle-array.graffle and b/assets/images/2019-12-01-rle-array.graffle differ
diff --git a/assets/images/2019-12-01-rle-array/bit_packing.png b/assets/images/2019-12-01-rle-array/bit_packing.png
deleted file mode 100644
index 6fb1a36..0000000
Binary files a/assets/images/2019-12-01-rle-array/bit_packing.png and /dev/null differ
diff --git a/assets/images/2019-12-01-rle-array/bit_packing.svg b/assets/images/2019-12-01-rle-array/bit_packing.svg
new file mode 100644
index 0000000..7a4f3c3
--- /dev/null
+++ b/assets/images/2019-12-01-rle-array/bit_packing.svg
@@ -0,0 +1,480 @@
+
+
+
diff --git a/assets/images/2019-12-01-rle-array/cube.png b/assets/images/2019-12-01-rle-array/cube.png
deleted file mode 100644
index 16e7c5b..0000000
Binary files a/assets/images/2019-12-01-rle-array/cube.png and /dev/null differ
diff --git a/assets/images/2019-12-01-rle-array/cube.svg b/assets/images/2019-12-01-rle-array/cube.svg
new file mode 100644
index 0000000..0e33828
--- /dev/null
+++ b/assets/images/2019-12-01-rle-array/cube.svg
@@ -0,0 +1,89 @@
+
+
+
diff --git a/assets/images/2019-12-01-rle-array/data_types.png b/assets/images/2019-12-01-rle-array/data_types.png
deleted file mode 100644
index d12833f..0000000
Binary files a/assets/images/2019-12-01-rle-array/data_types.png and /dev/null differ
diff --git a/assets/images/2019-12-01-rle-array/data_types.svg b/assets/images/2019-12-01-rle-array/data_types.svg
new file mode 100644
index 0000000..95f572b
--- /dev/null
+++ b/assets/images/2019-12-01-rle-array/data_types.svg
@@ -0,0 +1,444 @@
+
+
+
diff --git a/assets/images/2019-12-01-rle-array/dictionary_encoding.png b/assets/images/2019-12-01-rle-array/dictionary_encoding.png
deleted file mode 100644
index 131caab..0000000
Binary files a/assets/images/2019-12-01-rle-array/dictionary_encoding.png and /dev/null differ
diff --git a/assets/images/2019-12-01-rle-array/dictionary_encoding.svg b/assets/images/2019-12-01-rle-array/dictionary_encoding.svg
new file mode 100644
index 0000000..c9dcc81
--- /dev/null
+++ b/assets/images/2019-12-01-rle-array/dictionary_encoding.svg
@@ -0,0 +1,174 @@
+
+
+
diff --git a/assets/images/2019-12-01-rle-array/rle_array1.png b/assets/images/2019-12-01-rle-array/rle_array1.png
deleted file mode 100644
index c313ec7..0000000
Binary files a/assets/images/2019-12-01-rle-array/rle_array1.png and /dev/null differ
diff --git a/assets/images/2019-12-01-rle-array/rle_array1.svg b/assets/images/2019-12-01-rle-array/rle_array1.svg
new file mode 100644
index 0000000..28538f1
--- /dev/null
+++ b/assets/images/2019-12-01-rle-array/rle_array1.svg
@@ -0,0 +1,168 @@
+
+
+
diff --git a/assets/images/2019-12-01-rle-array/rle_array2.png b/assets/images/2019-12-01-rle-array/rle_array2.png
deleted file mode 100644
index 184ed2f..0000000
Binary files a/assets/images/2019-12-01-rle-array/rle_array2.png and /dev/null differ
diff --git a/assets/images/2019-12-01-rle-array/rle_array2.svg b/assets/images/2019-12-01-rle-array/rle_array2.svg
new file mode 100644
index 0000000..52f3183
--- /dev/null
+++ b/assets/images/2019-12-01-rle-array/rle_array2.svg
@@ -0,0 +1,192 @@
+
+
+
diff --git a/assets/images/2019-12-01-rle-array/sparse_data.png b/assets/images/2019-12-01-rle-array/sparse_data.png
deleted file mode 100644
index 917a3cb..0000000
Binary files a/assets/images/2019-12-01-rle-array/sparse_data.png and /dev/null differ
diff --git a/assets/images/2019-12-01-rle-array/sparse_data.svg b/assets/images/2019-12-01-rle-array/sparse_data.svg
new file mode 100644
index 0000000..03734ca
--- /dev/null
+++ b/assets/images/2019-12-01-rle-array/sparse_data.svg
@@ -0,0 +1,115 @@
+
+
+