compute the distances beforehand

flake-overlay-tweak
Brian Hicks 2021-08-24 17:08:09 -05:00
parent 3e4dc4167c
commit f0155566e1
2 changed files with 38 additions and 4 deletions


@@ -115,3 +115,36 @@ Benchmark #1: ./target/release/similar-sort benchmark < /usr/share/dict/words
And quite a speedup it is!
About 150ms over the previous improvement.
## Precalculate distances
That's such a big improvement that I wonder if we're doing more work in the parallel code than we really need to.
What if we compute each distance once in a `map`, before sorting, instead of inside the parallel sort's key function?
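Here is a rough, std-only sketch of that idea, to show why it might help.
A sort-by-key routine typically calls its key function on every comparison, so an expensive key like an edit distance gets recomputed on the order of n log n times, while mapping first computes it once per line.
The `levenshtein` function below is a hand-written stand-in for the one the program imports, `sort_by_precomputed_distance` is a hypothetical name, and the sequential `sort_unstable_by_key` stands in for Rayon's `par_sort_unstable_by_key`.

```rust
/// Plain dynamic-programming Levenshtein distance.
/// A stand-in for the library function the real program uses.
fn levenshtein(a: &str, b: &str) -> usize {
    let b_chars: Vec<char> = b.chars().collect();
    // prev[j] = distance between the first i chars of `a` and the first j chars of `b`.
    let mut prev: Vec<usize> = (0..=b_chars.len()).collect();
    for (i, ca) in a.chars().enumerate() {
        let mut curr = Vec::with_capacity(b_chars.len() + 1);
        curr.push(i + 1);
        for (j, cb) in b_chars.iter().enumerate() {
            let substitution = prev[j] + usize::from(ca != *cb);
            let deletion = prev[j + 1] + 1;
            let insertion = curr[j] + 1;
            curr.push(substitution.min(deletion).min(insertion));
        }
        prev = curr;
    }
    prev[b_chars.len()]
}

/// Compute every distance exactly once, then sort on the cached number.
fn sort_by_precomputed_distance(target: &str, words: Vec<String>) -> Vec<String> {
    let mut keyed: Vec<(usize, String)> = words
        .into_iter()
        .map(|word| (levenshtein(target, &word), word))
        .collect();
    // The key function is now a cheap tuple field read.
    keyed.sort_unstable_by_key(|pair| pair.0);
    keyed.into_iter().map(|(_, word)| word).collect()
}

fn main() {
    let words: Vec<String> = vec!["banana", "defiance", "refine", "define"]
        .into_iter()
        .map(String::from)
        .collect();
    for word in sort_by_precomputed_distance("define", words) {
        println!("{}", word);
    }
}
```

The standard library's `sort_by_cached_key` does a similar caching step internally, which may be another option if carrying the `(usize, String)` tuples around is inconvenient.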
```
$ hyperfine './target/release/similar-sort benchmark < /usr/share/dict/words'
Benchmark #1: ./target/release/similar-sort benchmark < /usr/share/dict/words
Time (mean ± σ): 170.1 ms ± 4.5 ms [User: 158.5 ms, System: 12.6 ms]
Range (min … max): 163.5 ms … 181.6 ms 16 runs
```
Yay!
That finally gave us the result we wanted this whole time!
It's way faster!
A real comparison:
```
$ hyperfine './result/bin/similar-sort define < /usr/share/dict/words' './target/release/similar-sort benchmark < /usr/share/dict/words'
Benchmark #1: ./result/bin/similar-sort define < /usr/share/dict/words
Time (mean ± σ): 287.8 ms ± 5.1 ms [User: 254.8 ms, System: 75.3 ms]
Range (min … max): 282.3 ms … 296.2 ms 10 runs
Benchmark #2: ./target/release/similar-sort benchmark < /usr/share/dict/words
Time (mean ± σ): 165.8 ms ± 7.4 ms [User: 154.5 ms, System: 12.2 ms]
Range (min … max): 155.8 ms … 186.2 ms 15 runs
Summary
'./target/release/similar-sort benchmark < /usr/share/dict/words' ran
1.74 ± 0.08 times faster than './result/bin/similar-sort define < /usr/share/dict/words'
```


@@ -25,16 +25,17 @@ fn try_main() -> Result<()> {
     let opts = Opts::from_args();
-    let mut lines: Vec<String> = stdin()
+    let mut lines: Vec<(usize, String)> = stdin()
         .lock()
         .lines()
-        .collect::<io::Result<Vec<String>>>()
+        .map(|line| line.map(|candidate| (levenshtein(&opts.target, &candidate), candidate)))
+        .collect::<io::Result<Vec<(usize, String)>>>()
         .context("could not read lines from stdin")?;
-    lines.par_sort_unstable_by_key(|candidate| levenshtein(&opts.target, candidate));
+    lines.par_sort_unstable_by_key(|x| x.0);
     let mut out = BufWriter::new(stdout());
-    for candidate in lines {
+    for (_, candidate) in lines {
         writeln!(out, "{}", candidate).context("could not write to stdout")?;
     }