diff --git a/README.md b/README.md
index 1ee67fec8deb680da33cda840a1b5aa48601d775..f705a8e6c6be3a3e68c66932c88346b0f66b5bfe 100644
--- a/README.md
+++ b/README.md
@@ -5,10 +5,15 @@ [benchmarks](https://git.sr.ht/~poldi1405/go-yenc/tree/master/item/testdata/benchmarks/README.md))
## Objective
-The current objective is a single-threaded throughput of at least 10 MiB/s
-without causing a CPU-Meltdown or stealing too much RAM from Chrome.
+~~The current objective is a single-threaded throughput of at least 10 MiB/s
+without causing a CPU-Meltdown or stealing too much RAM from Chrome.~~
+
+I think we can safely say that we managed to hit this goal. Now it's time for
+the actual implementation. The details may change if a faster way occurs to me.
-
+
+
+
## License
diff --git a/testdata/benchmarks/README.md b/testdata/benchmarks/README.md
index fab0644d2fe30587100c01cda0a890d840bdff98..c18e1973b2f809af2c49ab01d111de5e58653a1b 100644
--- a/testdata/benchmarks/README.md
+++ b/testdata/benchmarks/README.md
@@ -28,12 +28,14 @@ Raw speed is calculated by running the benchmark 100 times and taking the
average. This is done to account for variations in CPU Usage as this test is
completed pretty quick.
-| Algorithm | ns/Op Escaped | ns/Op Unescaped | ns/Op (exp. avg.)¹ | *n*th fastest |
-|--------------|---------------|-----------------|--------------------|---------------|
-| naive | 2.40 | 2.39 | 2.39 | 1 |
-| lookup-table | 2.51 | 2.51 | 2.51 | 2 |
-| hashmap | 21.05 | 20.99 | 20.99 | 4
-| bootleg-simd | 13.95 | 8.48 | 8.57 | 3 |
+| Algorithm | ns/Op Escaped | ns/Op Unescaped | ns/Op (exp. avg.)¹ | *n*th fastest |
+|---------------|---------------|-----------------|--------------------|---------------|
+| naive | 2.42 | 2.28 | 2.28 | 2 |
+| naive-pointer | 22.72 | 22.70 | 22.72 | 6 |
+| lookup-table | 2.20 | 2.20 | 2.20 | 1 |
+| hashmap | 20.01 | 19.69 | 19.70 | 5 |
+| bootleg-simd | 16.49 | 10.52 | 10.62 | 3 |
+| simd | 15.42 | 11.83 | 11.77 | 4 |
¹) assuming random distribution of bytes and that 4/256 bytes have to be escaped.
@@ -42,18 +44,21 @@
`[data-throughput/benchmark.sh]`
Data Throughput is calculated by running the encoding function on a set of
-randomly generated data which is compiled into the program.
+randomly generated data that is written to a file. The benchmark runs on a
+ramdisk so that disk speed does not skew the numbers.
-| Algorithm | Duration | Byte | Throughput | *n*th fastest | Speed relative to naive |
-|--------------|----------|------------|---------------|---------------|-------------------------|
-| naive | 3.933 | 1073741824 | 260.36 MiB/s | 2 | 1.00 |
-| lookup-table | 3.300 | 1073741824 | 310.30 MiB/s | 1 | 1.19 |
-| hashmap | 35.236 | 1073741824 | 29.0612 MiB/s | 4 | 0.11 |
-| bootleg-simd | 19.144 | 1073741824 | 53.4893 MiB/s | 3 | 0.21 |
+| Algorithm | Duration | Byte | Throughput | *n*th fastest | Speed relative to naive |
+|---------------|----------|------------|---------------|---------------|-------------------------|
+| naive | 30.516 | 1073741824 | 33.5562 MiB/s | 3 | 1.00 |
+| naive-pointer | 30.752 | 1073741824 | 33.2986 MiB/s | 5 | 0.99 |
+| lookup-table | 30.569 | 1073741824 | 33.498 MiB/s | 4 | 1.00 |
+| hashmap | 63.524 | 1073741824 | 16.1199 MiB/s | 6 | 0.48 |
+| bootleg-simd | 4.310 | 1073741824 | 237.587 MiB/s | 1 | 7.08 |
+| simd | 4.314 | 1073741824 | 237.367 MiB/s | 2 | 7.07 |
<!--
-There was an extreme improvement by removing the fmt.Print() statements. This
-also lead to a new ranking and we have definitely met the 10 MiB/s
+No idea why SIMD changes from coming out almost last to placing first. I'm not
+complaining, but I am confused.
-->
Variations in speed may be due to changes in the input dataset and fluctuations
diff --git a/testdata/benchmarks/benchmark.sh b/testdata/benchmarks/benchmark.sh
index 19487ad4e9f81a0a7078cbf18690fc8749c1a111..0944917ef8819695b26d2558f553fd75de181934 100755
--- a/testdata/benchmarks/benchmark.sh
+++ b/testdata/benchmarks/benchmark.sh
@@ -2,6 +2,7 @@ #!/bin/env bash
runs=100
+rm -rf results.d
mkdir results.d
for i in $(seq $runs); do
@@ -12,5 +13,7 @@
echo " running benchmark: completed"
awk 'NR==FNR{a[$1]=$3+" ";next;} {a[$1]=($1 in a)?a[$1] $3 " ":$3 " "}END{for(x in a)print x, a[x]}' results.d/* | sed 's/$/ /' > results.d/joined
+
+vim results.d/joined
awk 'BEGIN{FS=" "}{ n=0; sum=0; for(i=1;i<NF;++i) { if( $i ) { ++n; sum += $i; } } print $1 ": " sum/n; }' results.d/joined | sort
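
The averaging step at the end of `benchmark.sh` can be sketched on its own, which makes it easier to check against a small hand-written input. This is a minimal sketch, not the script's actual code: the function name `average_samples` is hypothetical, and it assumes each line of the joined file holds one benchmark name followed by whitespace-separated numeric samples.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of benchmark.sh): print the mean of the
# numeric samples on each input line, sorted by benchmark name.
# Assumed line layout: "<name> <sample> <sample> ..."
average_samples() {
    awk '{
        n = 0; sum = 0
        # Field 1 is the benchmark name, so the samples are fields 2..NF.
        for (i = 2; i <= NF; ++i) { ++n; sum += $i }
        # Skip lines that carry a name but no samples.
        if (n > 0) printf "%s: %.2f\n", $1, sum / n
    }' "$@" | sort
}

# Example with inline data instead of results.d/joined:
printf 'naive 2.40 2.38\nlookup-table 2.20 2.20\n' | average_samples
```

With a real run the call would presumably be `average_samples results.d/joined`. Starting the loop at field 2 and running it through `NF` keeps the name out of the sample count and includes the last sample.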