Assassin 66

mhparker · Post by **mhparker** » Mon Sep 10, 2007 3:13 pm

Hi again Ed,

Monday is obviously my posting day.

sudokuEd wrote:For example, am having a terrible time with A60 - one we generally felt was very hard and rated at 1.5 - but (current) SSscore is closer to 1.0. That is not a very good correlation. No way it was easier than A66. Was it Para

Don't know what Para's got to do with it (think I missed the point there), but I just took a look at the SudokuSolver (SS) log for A60:

SudokuSolver (version 1.4.1) wrote:50. 45 Rule on n14 - outies r1c456 r7c3 r3c4 total 21
50a. Cage 23(4) n12 restricts combinations with cells r1c4 r1c5 r1c6 containing {178} {269} {278} {289} {349} {359} {367} {368} {379} {457} {458} {459} {467} {469} {478} {567}
50b. Removed candidate 2 from r1c4
50c. Removed candidate 2 from r1c5
50d. Combinations {2579} {2678} no longer valid in cage 23(4) n12

Trouble is, I can't see for the life of me why 23(4) n12 should restrict the combinations I've marked in red above. What's wrong with [4]{289}, [7]{349} or [4]{379} for 23(4)? SS rarely (if ever) makes a mistake, so no doubt it's right and I'm wrong. The thing is, the conclusion allows SS to eliminate the 2 from the 23(4) cage, which appears to be important for the solution path. Bad news for the ratings when the automated solver doing the rating makes a critical move that a human solver can't even readily understand, let alone use in his/her own WT!!

Edit: Glyn has pointed out that the {2489} and {3479} combos for the 23(4) cage in r1 were eliminated in steps 27 and 48, respectively, due to conflict with r1 innies (h22(5)r1), which need one of {24} and one of {34}. Thanks Glyn!
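Glyn's explanation can be checked mechanically. What follows is only a minimal sketch, assuming the 23(4) cage lies entirely in r1, so that it and the h22(5) innies together fill the row (23 + 22 = 45). If the innies must hold at least one of {2,4} and at least one of {3,4}, the cage cannot hold both digits of either pair. The function names are illustrative and have nothing to do with SudokuSolver's internals.

```python
from itertools import combinations

def cage_combos(total, size):
    """All sets of distinct digits 1-9 of the given size that sum to total."""
    return [c for c in combinations(range(1, 10), size) if sum(c) == total]

def survives(combo, reserved_pairs):
    """A combo is blocked if it uses up both digits of a pair that the
    rest of the row (the innies) needs at least one of."""
    return not any(set(pair) <= set(combo) for pair in reserved_pairs)

all_23 = cage_combos(23, 4)
# Assumption from Glyn's note: the r1 innies need one of {2,4} and one of {3,4}.
remaining = [c for c in all_23 if survives(c, [(2, 4), (3, 4)])]

print(all_23)
print(remaining)
```

With both pairs reserved, {2489} and {3479} drop out, which would account for [4]{289}, [7]{349} and [4]{379} being unavailable in step 50a.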

Then there's the next move:

SudokuSolver wrote:51. 45 Rule on n2 - innies r1c456 r3c46 r2c6 total 32
51a. Cage 23(4) n12 restricts combinations with cells r1c4 r1c5 r1c6 containing [357] {157} {158} {159} {178} {179} {346} {347} {348} {349} {356} {359} {367} {368} {379} {389} {456} {457} {458} {459} {467} {469} {478} {479} {567} {578} {579} {678} {679}
51b. Cage 15(3) n23 restricts combinations with cells r2c6 r3c6 containing [69] {58} {49} {47}
51c. Removed candidates 57 from r1c4
51d. Removed candidates 57 from r1c5
51e. Removed candidates 57 from r1c6
51f. Removed candidates 69 from r2c6
51g. Removed candidate 1 from r3c4
51h. Removed candidate 9 from r3c6
51i. Found a hidden cage cage h32(6) at r1c456 r3c46 r2c6
51j. Cage sum in cage h13(2) at r15c6 - removed 68 from r5c6
51k. Combinations {159} {168} no longer valid in cage 15(3) n23
51l. Combinations {238} {247} {256} no longer valid in cage 13(3) n2
51m. Combination {259} no longer valid in cage 16(3) n36

The bit in red (step 51g) is interesting, because it leaves a naked single in r3c4 = 2!

I tried to follow manually what SS was doing here. Not only is it rather difficult (understatement of the year!) for a human to see that there are no permutations for the 32(6) n2 innies with a 1 in r3c4 that aren't blocked by cage 23(4) or cage 15(3) n23, but getting this result turns out (given a pencil and paper) to depend on {289} and {379} being blocked for r1c456 as mentioned above. Since I didn't understand that for step 50, I've thus got no chance of understanding how the 1 can be eliminated from r3c4 in step 51 either!
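The innie total in step 51 itself is plain 45-rule arithmetic. As a sketch, assuming the 13(3) cage listed in step 51l is the only cage lying wholly inside n2 (this fits the log, but the log does not state it):

```python
NONET_TOTAL = sum(range(1, 10))  # digits 1-9 once each: 45

def innie_total(inside_cage_sums):
    """45 rule: innies = 45 minus the sums of cages wholly inside the unit."""
    return NONET_TOTAL - sum(inside_cage_sums)

# Assumed: 13(3) is the only cage entirely within n2.
n2_innies = innie_total([13])
print(n2_innies)  # 32, spread over the 6 cells r1c456 r3c46 r2c6
```

The hard part is not this sum but testing every permutation of those six cells against the 23(4) and 15(3) cages at once.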

However, because these extremely difficult-to-follow steps result in a placement in the critical row 3 (remember those important n3 innies at r3c789?), the puzzle now becomes a whole lot easier than it would otherwise have been. Of course, it won't really help whacking up the scores for these types of moves. The point is that SS has now been able to uncover an easier path that a human solver cannot realistically be expected to find, resulting in a rating that's much lower than it should be.

IMO, the only thing that would help in such cases would be for SS to ignore such moves, at least when in "rating mode".

BTW, if anyone can enlighten me as to the reasoning behind the logic for step 50a, please feel free to inform me!

CathyW · Post by **CathyW** » Mon Sep 10, 2007 3:36 pm

gary w wrote:I haven't been able to look at Para's wt because it's still in tiny text and I have to admit I don't know how to expand to normal font!

Thanks once again..I'm looking forward to the next assassin/variants now I've discovered the site - it's marvellous.

Hi Gary and welcome from me too!

To read the tiny text you can either click the quote button in order to read it (don't forget to click the back button afterwards unless you want to reply with the quote!) or select it, copy and paste into Word or Notepad and then change the font size.

We look forward to further input from you on future puzzles. You'll soon be "Hooked" and then an "Addict"!!

Para · Post by **Para** » Mon Sep 10, 2007 6:05 pm

gary w wrote:Thanks very much Mike for your detailed reply. I have to say that your logic is virtually IDENTICAL to that I employed!!

I haven't been able to look at Para's wt because it's still in tiny text and I have to admit I don't know how to expand to normal font!

Hi Gary

We always post our walk-throughs in tiny text, so there is no obvious spoiler to the puzzle. That way people who are still working on the puzzle can safely read the forum without reading a way to solve the puzzle. These walk-throughs tend to be in tiny text until a week after a puzzle is posted. That is why it was still in tiny text. But I have enlarged it now.

I think the point that Mike wanted to make, is that there is a clearer way to show your step. I would show your elimination through combination analysis of Innies R6 and R7 with the 11(2) cage in R67C5. It isn't the logic that is what is wrong. We just like to describe the steps in such a way that everyone can follow it when reading/following the walk-through.
And Mike, innies R7 can't use a 7 in R7C1 at all; it doesn't matter what the configuration of the rest is. I think that is why Gary eliminated that option in his first post.

greetings

Para

ps. I think Ed asked me if i found Assassin 60 easier than Assassin 66. Definitely not!!!

pps. Mike, I agree with you that some of these 45-tests by Sudoku Solver (which Richard also makes in the more difficult Killers) are a bit over the top. But I guess my one 45-test in A60RP-lite works a bit like the Sudoku Solver, by setting a lot of restrictions and then testing the combinations.

mhparker · Post by **mhparker** » Mon Sep 10, 2007 7:36 pm

Hi Para,

Para wrote:I think the point that Mike wanted to make, is that there is a clearer way to show your step. I would show your elimination through combination analysis of Innies R6 and R7 with the 11(2) cage in R67C5. It isn't the logic that is what is wrong. We just like to describe the steps in such a way that everyone can follow it when reading/following the walk-through.

Yes, that's correct. There were some good ideas there, but no full WT or marks pic. Therefore, I just wanted to check with Gary as to whether that was really what he meant. Another reason for elaborating on the moves was to bring them to the attention of other readers who may otherwise have skated over Gary's post due to potentially not being able to understand his brief notes. I think Gary understands the spirit in which it was intended.

Para wrote:And Mike, innies R7 can't use a 7 in R7C1 at all; it doesn't matter what the configuration of the rest is. I think that is why Gary eliminated that option in his first post.

The only option with a 7 in r7c1 is [782], which is blocked because it would force a 4 into both r7c4 (i/o diff. n7) and r7c6 (i/o diff. n9). Is that what you mean by "innies R7 can't use a 7 in R7C1 at all"? I just noticed that, even without considering another move, a 7 wouldn't work here.

Para wrote:ps. I think Ed asked me if i found Assassin 60 easier than Assassin 66.

Ah, the subtleties of the English language! If only Ed had inserted a comma, as in "Was it, Para?", everything would have been clear. Ed should have known that!!

Para wrote:But I guess my one 45-test in A60RP-lite works a bit like the Sudoku Solver, by setting a lot of restrictions and then testing the combinations.

Don't remind me! Just checking through that one step of yours probably took me longer than it takes Cathy to do a Times Deadly!!

Andrew · Post by **Andrew** » Mon Sep 10, 2007 7:40 pm

mhparker wrote: isn't the rating of previous WTs only going to be accurate if they are optimized first?

sudokuEd wrote:Could be - we'll see I guess. Sudoku Solver works like this too - not optimized at all. Would be so easy if it just did the most productive thing for each step rather than strictly in the routine order - humans are so much better at learning where to look first/next. That's why I'm using an average of different settings to get the current set of (touch-wood emoticon here) accurate ratings.

Might have to average the human WT's too - Andrew & Cathy would average out to 1.16 for A66. Or, may just use the one puzzler (read Andrew) who is the most methodical among us to be the reference point. Or at least to get a feel if there really is a strong link (or should I say now, direct link) between perceived difficulty and number/type/order of solve steps.

Mike has raised an interesting point with his comment. It clearly distinguishes between the difficulty of the actual puzzle and the difficulty found by the person who posted the WT, which is reflected by the WT. I suppose if nobody has posted the optimum WT then it could be argued that the "human" difficulty is that of the best posted WT.

Don't think I agree that the average of human WTs should be used. Surely the rating should be that of the lowest-rated of the human WTs?

I think I'll take the reference to me as a compliment. Please don't use my WTs as the standard for rating unless they happen to be the lowest rated ones. I'll agree that I'm the most methodical of the forum regulars; at the same time I admire the more inspirational solvers and wish I could spot some of their moves.

My early WTs used to be written as how I solved the puzzle. These days I've departed from that to the extent that when I spot something that I ought to have seen earlier I restart and put it in the correct place, then check for any effect on the later steps. My WTs are therefore improved and, one could say, partially optimised although I never go to the extent of trying to write an optimum WT. This can happen several times, as I've commented in the WT I posted last night for Vortex Lite, resulting in me taking much longer than if I just continued as I used to do for earlier (and easier) Assassins.

I'm sure people will notice that there are some steps in my WTs that are never used later, for example cage combinations and innies/outies where I've managed to eliminate some combinations/permutations. These are left in because, at the stage they were done, they might have proved useful later and they also show what I was looking at when I solved the puzzle. Some other WT posters also do this so it's clearly an acceptable part of WTs.

Ruud · Post by **Ruud** » Mon Sep 10, 2007 8:55 pm

The rating of puzzles has always been an awkward subject of discussion, whether it's between programmers on the Sudoku programmers forum or on one of the player's forums.

Objectively, the rating should reflect the effort a player has to put into the puzzle to solve it. When player A with a particular skillset solves one puzzle in an hour, we could give it a rating 1.0. A puzzle that requires the same player 2 hours would then deserve the rating 2.0, and so forth.

No two players are alike, and player B may solve the first puzzle in 45 minutes and the second in an hour, simply because this player can find some shortcuts unknown to player A.
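Put into numbers, the example looks like this. A small sketch only, taking the solving times from the two paragraphs above and treating each player's first puzzle as that player's own 1.0 reference (the function name is made up for illustration):

```python
def relative_rating(minutes, reference_minutes):
    """Rating of a puzzle relative to a reference puzzle for the same player."""
    return minutes / reference_minutes

# Player A: 60 min and 120 min; player B: 45 min and 60 min.
rating_a = relative_rating(120, 60)
rating_b = relative_rating(60, 45)
print(rating_a)            # 2.0
print(round(rating_b, 2))  # 1.33
```

The same second puzzle comes out twice as hard as the first for player A, but only a third harder for player B.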

Objective measurement of solving time for a representative group of players is better, but it can only be used to rate the puzzle after it has been solved by most players. This data then needs to be matched against certain measurable characteristics of the puzzle. Statistics is not my area of expertise.

So we are left with an attempt to rate the puzzle based on characteristics that we associate with difficulty:

1. Hardest technique used. (what is hard?)
2. Length of solving path (optimized or not?)
3. Output from solving programs (how well do they perform?)

I tend to use all three, using 3 different solvers to assess the difficulty. A66 could be solved by SumoCue, but produced a long solving path resulting in a pretty high score. A67 could not be solved by SumoCue, so it required at least one step beyond its capabilities. JSudoku reported several conflicting combinations, so I assumed it to be more difficult than A66. However, two factors spoiled my assessment. The solving path of A66 was considerably longer than A67, with a relatively high percentage of tough steps. Since SumoCue stalled on A67, there were no notes to compare. The second factor was that A67, after I tested it manually, features a sneaky breakthrough in N2, which JSudoku did not catch.

As of now, I will refrain from commenting on the possible difficulty. My comments will be limited to the puzzle making process, the aesthetics of the cage patterns and the announcement of extra versions.

Ruud

mhparker · Post by **mhparker** » Tue Sep 11, 2007 1:06 pm

Hi Ruud,

Ruud wrote:No two players are alike, and player B may solve the first puzzle in 45 minutes and the second in an hour, simply because this player can find some shortcuts unknown to player A.

That's the point I was trying to make above with the SudokuSolver example. Any averaging process is OK as long as all the obstacles are surmountable. Otherwise the player will be forced to take another route through the puzzle, thus invalidating some or all of the subsequent rating steps (depending on whether the intended and taken paths happen to merge again later).

Obviously we're looking for a rating system which comes up with some figure that can be associated with a puzzle (before we have all solved it please!!) and not a puzzle/player combination. However, I would agree with you in that, in order to rate puzzles realistically, at least the average skill set of the intended audience should be taken into account. Without wanting to give Richard any programming nightmares, this would first mean not only associating a skill level with each type of move, but also differentiating between simple and advanced usages of each move type, where necessary. For example, a conflicting combination move may be anything from straightforward to extremely difficult to see, depending on the cage types, number of constraints, whether it's a partial conflict, etc. Once the moves have been classified in this way, any moves that exceed the target difficulty level by more than a chosen threshold amount must then be ignored and circumvented in the rating process.

In practice, this means that the rating program (SS in our case) needs to be run with different parameters when rating a V1 (where any moves up to, say, a 1.75-rating level may be appropriate) than when it is rating a hard V2 (where any moves up to the 2.5-rating level may be considered acceptable).

For example, the moves I mentioned in my above post with the SudokuSolver example can be considered as the sort of moves that should be limited to 2.5-rating puzzles like the A48-Hevvie. They therefore have no place in the rating of a V1 Assassin, but are fine if it's a V2 that's being assessed.
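To make the idea concrete, here is a rough sketch of such a "rating mode" filter. It is not how SudokuSolver works; the per-move levels and move names are invented for illustration, and only the two caps (about 1.75 for a V1, about 2.5 for a hard V2) come from the paragraphs above.

```python
from dataclasses import dataclass

@dataclass
class Move:
    name: str
    level: float  # hypothetical per-move difficulty on the rating scale

# Caps taken from the post: V1 up to about 1.75, hard V2 up to about 2.5.
MOVE_CAPS = {"V1": 1.75, "V2": 2.5}

def usable_moves(moves, version):
    """Keep only the moves a rating run may use for this puzzle version."""
    cap = MOVE_CAPS[version]
    return [m for m in moves if m.level <= cap]

# Illustrative moves with made-up levels, not SudokuSolver output.
candidates = [
    Move("45 rule, single innie", 0.5),
    Move("simple conflicting combination", 1.25),
    Move("multi-cage 45-test with permutation analysis", 2.5),
]

print([m.name for m in usable_moves(candidates, "V1")])
print([m.name for m in usable_moves(candidates, "V2")])
```

In a V1 run the heavy 45-test is skipped, so the solver has to find another route and the rating reflects that longer path; in a V2 run all three moves stay available.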

Ruud wrote:So we are left with an attempt to rate the puzzle based on characteristics that we associate with difficulty:

1. Hardest technique used. (what is hard?)
2. Length of solving path (optimized or not?)
3. Output from solving programs (how well do they perform?)

I would add to this:

4. Narrowness of solving path (how many alternatives are there if the player misses an intended key move?)

The A34 (which is AFAIK widely considered to be one of the hardest-ever V1 Assassins) was a good example of this. The moves used were not really any more difficult than used in many subsequent Assassins that are considered to be easier, but it was hard to find any route at all through the puzzle.

Ruud wrote:The second factor was that A67, after I tested it manually, features a sneaky breakthrough in N2, which JSudoku did not catch.

I assume you mean Para's step 17. In which case, if it corresponds to what J-C told me in a PM I received from him some time ago, it's a feature, not a bug! Details follow in cloaked text:

Select text in box (e.g. via triple-click) to see what I wrote:When considering cages that partially overlap the unit (row, column or nonet) concerned for the purposes of identifying innie/outie difference groups, JSudoku uses the innies if the cage has more outlying cells than inlying ones, otherwise it uses the outies. Note that this means it will always use the outies if the numbers of inlying and outlying cells are identical. In the specific case of A67, this means that the I/O difference group for N2 will contain the innie of the 21(3) cage (R1C6), the outies of the 13(3) and 9(3) cages (R4C46) and - unfortunately in this case - the outie of the 17(2) cage (R4C5). Thus it misses Para's step 17, which requires taking the innie of the 17(2) cage instead.
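The selection rule described in the box can be written down in a few lines. This is only a sketch of the rule as worded above, not JSudoku's actual code. The outie R4C5 of the 17(2) cage and the innie R1C6 of the 21(3) cage come from the post; the other cell positions are assumptions made so the example runs.

```python
def io_cells(cage, unit):
    """Cells a cage contributes to an innie/outie difference group for a unit.

    Rule as described in the post: use the innies only when the cage has
    more outlying than inlying cells; otherwise (ties included) use the outies.
    """
    innies = [c for c in cage if c in unit]
    outies = [c for c in cage if c not in unit]
    return innies if len(outies) > len(innies) else outies

# Nonet 2 is rows 1-3, columns 4-6; cells are (row, column).
N2 = {(r, c) for r in range(1, 4) for c in range(4, 7)}

# 17(2) cage: outie R4C5 is from the post; innie R3C5 is assumed (the adjacent cell).
cage_17 = [(3, 5), (4, 5)]
# 21(3) cage: innie R1C6 is from the post; the two outies are hypothetical.
cage_21 = [(1, 6), (1, 7), (2, 7)]

print(io_cells(cage_17, N2))  # [(4, 5)]: a tie, so the outie is used
print(io_cells(cage_21, N2))  # [(1, 6)]: more outies than innies, so the innie is used
```

The tie case is the one that matters for A67: a rule that preferred the innie on a tie would pick R3C5 for the 17(2) cage instead.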

Ruud wrote:As of now, I will refrain from commenting on the possible difficulty. My comments will be limited to the puzzle making process, the aesthetics of the cage patterns and the announcement of extra versions.

I wouldn't take it so personally if I were you! I don't know how others see it, but it would be a shame if you weren't to refer to the possible difficulty any more. Even if your prediction proves to be wrong, there's no harm done by it, neither to you nor to anyone else, and even in this case it has the advantage of offering a small insight into what you intended. But, as far as the A67 is concerned, perhaps it was (in retrospect) a bit risky to use the words "will definitely" instead of (say) "should"!

I think I've said this before somewhere, but I'll say it again anyway. Although some Assassins may not quite fulfil their (high) expectations, there will be others that turn out to be even better than intended. Averaged over time, the Assassin series has been a huge success. Regardless of what rating system is used!

CathyW · Post by **CathyW** » Tue Sep 11, 2007 3:07 pm

I would certainly second your penultimate sentence Mike. Also proved by the fact that the Assassin thread is the most popular on the Sudocue forum judging by the number of posts! I have learnt a huge amount and have definitely improved my killer solving skills since I started doing the Assassins earlier this year.

Keep up the good work Ruud - it is highly appreciated.

Ruud · Post by **Ruud** » Tue Sep 11, 2007 5:26 pm

Thanks for the kind responses.

I'm also pleased that Gary W wrote:both a66 and a67 took me about 1.5-2 hrs to do. A "deadly" killer in The Times normally takes me about 0.5 hrs so I'd rate both of these as significantly harder

I've always considered the Times Deadly a perfect benchmark. The early Assassins were made to match at least this level. (Solving) times have changed a little since then...

Mike wrote:4. Narrowness of solving path (how many alternatives are there if the player misses an intended key move?)

I'm using it in SudoCue and for rating of the Samurai/Clueless puzzles, but it's easy to make a mistake here. For each alternative move, you should check whether it actually leads to the solution. Many moves, even in plain Sudoku, only produce non-essential eliminations. You could even argue that a puzzle becomes harder when it is full of red herrings, because they lead you away from the correct path.

Don't follow the lights, Frodo!

(PS) As an alternative to tiny text, there is a cloaking method that does not require the reader to copy the text. This may be a better method for short spoiling remarks, as it is easier to reveal. For complete WT's tiny text is better.

If you want to know how this works, triple-click this box to see what I wrote:[quote][color=white]Your solving tip goes here[/color][/quote]
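For anyone who drafts posts offline, the same wrapper is easy to generate. A tiny sketch (the helper name is made up; the tags are exactly the ones shown in the box above):

```python
def cloak(text):
    """Wrap text in the white-on-white quote markup shown above."""
    return f"[quote][color=white]{text}[/color][/quote]"

print(cloak("Your solving tip goes here"))
```

Paste the result into the post body; the reader reveals it by selecting the text in the box.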

Ruud

Para · Post by **Para** » Tue Sep 11, 2007 6:16 pm

Ruud wrote: (PS) As an alternative to tiny text, there is a cloaking method that does not require the reader to copy the text. This may be a better method for short spoiling remarks, as it is easier to reveal. For complete WT's tiny text is better.

If you want to know how this works, triple-click this box to see what I wrote:[quote][color=white]Your solving tip goes here[/color][/quote]

I think this would be really handy for tag-solutions for a V2 when a puzzle is still new. Because then we have fewer steps in a row.

Para

Andrew · Post by **Andrew** » Tue Sep 11, 2007 8:37 pm

Both TT and the cloaking method will work equally well for me when I'm involved in "tag" solutions. Whichever way people post their latest contributions, I'll still copy them to a Word file so that I've got a full record of what has happened so far together with any related comments.

Whichever method is used, message posters should remember to convert to normal text once there is no need for cloaking or TT. I'm normally fairly good at remembering to convert from TT but last week I was reminded that my Chevron Killer TT was still in TT.

It's in normal text now!

mhparker · Post by **mhparker** » Tue Sep 11, 2007 8:46 pm

Ruud wrote:As an alternative to tiny text, there is a cloaking method that does not require the reader to copy the text. This may be a better method for short spoiling remarks, as it is easier to reveal.

Thanks for the tip, Ruud! I've already edited my previous post on this thread to use cloaked text instead of TT.

I've already discovered another advantage with cloaking, namely that formatting (bold, italic, ...) is visible. With TT and copying to Notepad, Word or whatever, all formatting is lost.

Andrew · Post by **Andrew** » Wed Sep 12, 2007 4:30 am

Good point Mike!

Cloaking is clearly better than TT if you want to use bold, italic, ...

However it's not completely true that all formatting is lost if you copy TT to Notepad or Word provided that you go to "quote" mode before copying. Then the formatting commands are still there, one just needs to convert to actual bold or italic which is a bit boring but I've done it up to now.

Of course cloaking doesn't allow the use of different colours in messages so Ruud is quite right to say that TT is better for full walkthroughs.

mhparker · Post by **mhparker** » Wed Sep 12, 2007 12:40 pm

Andrew wrote:Of course cloaking doesn't allow the use of different colours in messages so Ruud is quite right to say that TT is better for full walkthroughs.

I'm also interested in discussing solving techniques for active Assassins without spoiling things for others. I will need at least candidate diagrams and maybe even occasional graphics.

Clearly, both cloaking and TT are unsuitable for this purpose.

One idea I had was to create a separate spoiler thread, and make the posts containing spoiler information on this thread (in normal text). On the relevant Assassin thread, I would just make a smaller post with the following info:

Subject being discussed
Spoiler warning
Link to spoiler thread post containing the actual information

People who have not yet done the puzzle would just not click the link!

When the Assassin is no longer current, I would then simply replace this smaller post with the content from the spoiler thread, deleting/clearing the spoiler thread post.

What do others think of this idea?

P.S. Of course, others could use the spoiler thread, too - it wouldn't necessarily have to be just for me!

Andrew · Post by **Andrew** » Wed Sep 12, 2007 9:38 pm

A separate Spoiler Thread seems a good idea Mike.

As you say, you would move the message over to the appropriate thread once that puzzle was no longer active. If other messages had been posted in the spoiler thread then, rather than saying "Message deleted" you would say "Message moved to ... thread."

If the thread only had the one message and that was moved to another thread, I'm guessing the Spoiler Thread would be automatically deleted. However that wouldn't be a problem; you (or anyone else who wants to post a spoiler) would just create it again next time.