From hours to 360ms: over-engineering a puzzle solution (blog.danielh.cc)
toast0 35 days ago [-]
I don't think it would make a large difference in runtime, but the bounds aren't quite right.

The largest possible value isn't 999,999,999, because that's not a legal value. The largest value is 987,654,321

Dividing that by 9, you get 109,739,369, but that's not a legal value either, you'd want to start with 109,738,365 for the gcd as it's a valid possible row. I don't know that all gcds would need to be a valid row, but any gcd larger than 98,765,432 would need to be used at multiples from 1-9; with that gcd or smaller, multiple 10 becomes viable and the gcd doesn't need to be valid with multiple of 1.

amyres 34 days ago [-]
One more nitpick on the upper bound of the GCD: since we know that 0 is going to be in every column, one of the row numbers must start with a 0 and will be an 8-digit number (0 in the first cell). The GCD can't be larger than the smallest row number.

The GCD search should start at 098,765,432 and then go down from there.

Someone 34 days ago [-]
> you'd want to start with 109,738,365 for the gcd as it's a valid possible row

Nitpick: that has two threes.

Also, what’s wrong with 109,738,654 from that argument’s view?

toast0 34 days ago [-]
Yes, I think you're right!
gus_massa 35 days ago [-]
If the floating point trick worth it? When I was young, floating point operations were very slow and I still try to avoid them like the plague.

I remembered there was a trick to avoid popcount, but I didn't remember it. So I found https://stackoverflow.com/questions/51387998/count-bits-1-on...

That version counts, the 1 bit's but here we already know the number is not 0 and we want to know that the inverted number it has exactly 1 bit set, so instead of

  popcnt(n) == 1
we can use

  n & (n-1) == 0

Moreover, n is the inverted number

  n = 0b1111111111 - s
so an alternative is to run the check in the original number, and I think that this does the trick:

  s | (s+1) == 0b1111111111
adgjlsfhk1 34 days ago [-]
floating point has the same bandwidth and ~3x the latency of integer+/-
35 days ago [-]
natessilva 35 days ago [-]
Should look into Knuth’s DLX. My JS implementation solved this case in 1.8ms
gus_massa 35 days ago [-]
If you can write a blog post about it (including source code), I guess it will be good to post here.
aroundtown 35 days ago [-]
For us slow kids in the back what does this mean, “[…] such that the nine 9-digit numbers formed by the rows of the grid has the highest-possible GCD over any such grid”?
genewitch 35 days ago [-]
All the digits in a row become a number. All nine such numbers have a greatest common denominator. The puzzle solution is the center row of the grid where the numbers have the greatest common denominator possible. I. E. None of them are prime.

It took 3 readings and a scan of some of the later words but I think this is the correct reading

So a naive program can just solve the puzzle repeatedly, differently, and ccmpute the GCD of the rows, and output the one with the highest result. There aren't infinitely many solutions to a sodoku style puzzle, so...?

35 days ago [-]
sizediterable 35 days ago [-]
I also wrote my solution in Rust. I was pleased that my approach gave me an opportunity to write a linked list and get the chance to apply some rarely used learning[0].

My solution doesn't use SIMD, but is actually takes about the same amount of time as the the solution in the article, given the same number of cores, though an glaring weakness of my approach is that it can only scale up to 7 cores as written.

Rough outline of how mine works:

- Set current best GCD to 1.

- Search through all valid board configurations.

- Bail out from the search early if the current partially generated board couldn't have a GCD greater than the current maximum found, e.g. if we've only generated two rows of the board, and the GCD of those two rows are already less than the max.

- Update the current best GCD as you find higher ones.

- Share the current best GCD value across multiple threads. That way the longer the program runs, the earlier and earlier the searches will start bailing out.

- Don't always read from the shared variable to avoid contention. Instead, each thread has its own copy of the last value it read which we compare with first before copying and caching the shared value.

- Another interesting property of this approach is that it can be used to validate the correct solution even faster than it takes to find it. Instead of initially setting the max GCD to 1, set it to `solution - 1`. That way branches in our search will bail even sooner from the beginning. This leads to the program running about 20% more quickly.

Source: https://gist.github.com/lazytype/35b45f3ea81b5c1c5555546fe6f...

[0] https://rust-unofficial.github.io/too-many-lists/

amyres 35 days ago [-]
Since we're over-engineering here, one more optimization is to skip all potential GCDs that are divisible by 2 or 5.

Suppose the GCD was divisible by 2, then all rows would be even. Since the last digit of an even integer is in {0,2,4,6,8} and we need 9 unique numbers in the final column, we know that 4 or 5 of the row numbers must be odd. So the GCD can't be even.

Similarly, the GCD can't be divisible by 5. If it were, all rows numbers would need a 0 or 5 in the final digit.

lern_too_spel 35 days ago [-]
We know that the GCD is divisible by 9 because the sum of the digits for every row is 45.
Jtsummers 35 days ago [-]
Not necessarily. The sum of the digits ranges from 36-45, there are 10 possible digits of which 9 will be used. If, say, 5 were the unused number then it would not be divisible by 9.
lern_too_spel 34 days ago [-]
You're right. I misread the question. It's 9 of the 10 digits 0-9 instead of all 9 digits 1-9 as in sudoku.
amyres 34 days ago [-]
Exactly. The actual solution to the puzzle in question has a GCD not divisible by 9.
junon 34 days ago [-]
> My first attempt when solving this puzzle was to encode the constraints as an SMT optimization problem, and using Z3 to find the solution.

This is why choosing the right tool for the job matters. Z3 is hilariously overkill for such a problem (not to undermine how interesting and useful Z3 is, and not to accuse the author of doing anything wrong).

Z3 is only one class of solver; a finite domain constraint solver is much more suited to these sorts of tasks.

https://gist.github.com/Qix-/3580f268e2725848971703e74f6b26b...

    $ time node sudoku.js
    
    1 2 7  6 8 5  4 3 9  
    5 9 4  1 3 7  8 2 6  
    8 3 6  4 2 9  5 7 1  
    
    3 6 1  8 5 4  2 9 7  
    7 4 5  2 9 1  3 6 8  
    2 8 9  3 7 6  1 5 4  
    
    6 5 3  9 1 8  7 4 2  
    4 1 2  7 6 3  9 8 5  
    9 7 8  5 4 2  6 1 3 
    
    0:00.08elapsed
gus_massa 33 days ago [-]
The article is not super clear, but the problem in the article has an additional requests, to maximize the GCD of the rows interpreted as "9" digits numbers, in your case:

GCD(127685439, 594137826, 836429571, ...) = 9

but the author finds one GCD with 8 digits.

Also, for this reason it's important in this version to keep the "0" as "0" instead of mapping it to "1".

foota 35 days ago [-]
Sigh. My friend did this problem on pen and paper and made me feel stupid. Their solution was so wildly clever! It relied on the observation that the sum of the rows of the sudoku board (given the digits in use) is a known fixed value, and went from there (I'll leave the rest as an exercise to the reader to avoid spoilers).
rahimnathwani 35 days ago [-]
The Jane Street page explains the (a?) number theory approach.
toast0 35 days ago [-]
This isn't exactly a sudoku board, because they allow for 0.
Jtsummers 35 days ago [-]
While it's not a true sudoku board, that invariant still holds. Whichever set of digits you end up using (it'll be the same 9 for rows, columns, and blocks) the sum will be same for all of them. It's also unique to the set of 9 numbers you end up using to solve the puzzle.

If you use 1-9, sum is 45. For anything else, it's 45 - (the unused number).

foota 35 days ago [-]
Yep, I'm aware, hence the given digits part. The whole board is still drawn from 9 digits, it's just that precisely which 9 digits is unknown.
cwmoore 32 days ago [-]
This looks like fun for hours and hours, special 100 digit numbers.

The last time I saw a puzzle like this it led me to https://www.kakurokokoro.com

35 days ago [-]
moralestapia 35 days ago [-]
If anything, what a testament of the massive failure Z3 is.
Jtsummers 35 days ago [-]
Unless I missed it, their Z3 solution wasn't even presented. So we can't comment on how good or bad Z3 is without seeing how good or bad their Z3 attempt was.
moralestapia 35 days ago [-]
That is true, indeed.

It could be that the author is just massively unskilled at writing Z3 code.

moralestapia 35 days ago [-]
Oh come on ...

Either one of the two statements is true, no other way around.

AlexeyBelov 34 days ago [-]
Why did you bring such negativity to this thread? "Either one of my two rude statements is true!"

Who asked?

moralestapia 33 days ago [-]
Who asked you, though?

It's a forum, people come here to discuss the submissions.

You should know this ...

fn-mote 35 days ago [-]
Can you say more about this?

On the face of it this comment seems ridiculous to me. Z3 is fabulously successful in other domains. Perhaps the problem fit is not there, or the problem encoding chosen was not appropriate.

moralestapia 35 days ago [-]
(Way) worse than exhaustive brute force search ... you don't really have to say much more.

I think it's the first time I see such thing in the wild.

gus_massa 35 days ago [-]
It's brute force after a complete rewrite of the problem.
moralestapia 33 days ago [-]
???

No, I mean even the most naive brutal force code comes back after ~10 minutes, whereas the Z3 didn't finish even after "several hours".

I am now quite curious about writing a Z3 implementation of this myself and try to figure out why is it taking that much time.

gus_massa 33 days ago [-]
[Sorry for my bad pseudo-python.] My guess is that the implementation in Z3 in he article was like

  max_gcd = 0
  for a1 in range(9):
    for b1 in range(9):
      ...
        *check_valid(a1, b1, ...)*
        n1 = to_number(a1, b1, ...)
        ...

        new_gcd = gcd(n1, n2, ...)
        max_gcd = max(max_gcd, new_gcd)
and the first brute force version was

  for gcd in range(111111111):
  for r1 in range(?):
    for r2 in range(?):
      ...
        a1, a2,... = split(r1*gcd)
        ...
        *check_valid(a1, b1, ...)*
        max_gcd = gcd
(I guess the author was checking earlier, but let's put only one check in may fake version.)

It's very difficult for a compiler to transform one into the other.