Failure through Fixation

https://www.flickr.com/photos/britishlibrary/11033123455/

Einstellung, SICP EX 4.31 — an Experiential Report

Intro

In her book A Mind for Numbers Barbara Oakley mentions the Einstellung effect (and a great number of other things, too): “The Einstellung effect refers to getting stuck in solving a problem or understanding a concept as a result of becoming fixated on a flawed approach.” And, later on: “Chess players who experience Einstellung truly believe they are scanning the board for a different solution. But careful study of where their eyes are moving shows that they are keeping their focus on the original solution. Not only their eyes, but their mind itself can’t move away enough to see a new approach to the problem.”

I recently experienced Einstellung and, despite realizing what was happening, I had no good strategy prepared for it and ended up handling the event in a suboptimal manner. The rest of this report details the particular Einstellung experience, but also ties in a few other aspects of learning and problem-solving. Although the Einstellung happened during a programming exercise, the particulars of the exercise are mostly incidental: the experience is reported in detail in the hope that it will help you internalize what the Einstellung effect is and spawn ideas for handling it effectively if-when it arises.

The Exercise

At the time of writing I have spent 115 hours over the last six months studying SICP (also affectionately known as the wizard book). I love it: it starts out explaining different fundamentals in-depth, dips its toes into multiple programming paradigms, presents a curated sample of examples amidst the more theory-focused chapters, and it messes around with programming languages and simulators. I’m fairly convinced that when I finish the book, I’ll have learnt more from it than my first two years of university courses.

Getting back on track again after digressing to praise the wizard book: The Einstellung happened when I attempted to solve exercise 4.31. Chapter four of SICP is about metalinguistic abstraction, and develops four different interpreters; subchapter two develops an interpreter for a lazy scheme varian t. The initial approach taken in the text is to change the interpreter developed in the previous subchapter so that the interpreter always uses lazy evaluation. And then comes exercise 4.31:

Exercise 4.31: The approach taken in this section is somewhat unpleasant, because it makes an incompatible change to Scheme. It might be nicer to implement lazy evaluation as an upward-compatible extension , that is, so that ordinary Scheme programs will work as before. We can do this by extending the syntax of procedure declarations to let the user control whether or not arguments are to be delayed. While we’re at it, we may as well also give the user the choice between delaying with and without memoization.
For example, the definition

(define (f a (b lazy) c (d lazy-memo))
. . . )

would define f to be a procedure of four arguments, where the first and third arguments are evaluated when the procedure is called, the second argument is delayed, and the fourth argument is both delayed and memoized. Thus, or-dinary procedure definitions will produce the same behavior as ordinary Scheme, while adding the lazy-memo declaration to each parameter of every compound procedure will produce the behavior of the lazy evaluator defined in this section. Design and implement the changes required to produce such an extension to Scheme. You will have to implement new syntax procedures to handle the new syntax for define . You must also arrange for eval or apply to determine when arguments are to be delayed, and to force or delay arguments accordingly, and you must arrange for forcing to memoize or not, as appropriate.

The exercise raises a few interesting notions regarding compatibility, language standards and extensibility. The interpreter variant developed during exercise 4.31 should be able to run all standard scheme programs (using eager evaluation), but would have the additional capability of applying lazy evaluation to specified parametres. This means the hypothetical users of the interpreter from exercise 4.31 would not be taken by surprise when they tried standard scheme programs, in addition the upwards-compatibility would perhaps also make it easier to maintain a “real” metacircular lazy(-extended) interpreter, as the underlying changes to scheme would “ripple through” better due to the acceptance of standard eager programs. While memoization is an interesting topic, it’s not relevant here.

Description of the Solution Strategy

Having already developed a wholly lazy interpreter earlier in chapter 4.2, the exercise isn’t primarily about the implementation of laziness in general, but about changing the implementation so that the laziness becomes optional and the use of it specified through additional syntax. This also makes the new interpreter more versatile, since it allows for wholly lazy functions, wholly eager functions, and mixed-evaluation functions.

Crucially, the exercise states to “implement new syntax procedures to handle the new syntax for define”, “arrange for eval or apply to determine when arguments are to be delayed” and “force or delay arguments accordingly”. In order to get the laziness itself up and running before also adding in whether to memoize or not, I decided to start off by implementing an interpreter variant only allowing non-memoized laziness. But while the exercise text mentions handling the new syntax for define, is it really a new syntax for define?

After all, while scheme accepts definitions on the following format:
(define (f  . args) )
Those definitions are actually syntactic sugar for:
(define f (lambda args ))

A final note in this section: the mention of “arrang[ing] for eval or apply to determine when arguments are to be delayed” caused me to look through those two functions, and the main change to eval in the lazy interpreter was the dispatch to the actual-value function before handing applications over to the apply function. It didn’t seem like eval needed any further changes to accommodate the optionally-lazy approach. At the same time, it seemed self-evident that a new way of treating argument lists would be necessary, since those might be mixed. (And in the wholly lazy interpreter, the argument lists are handled in apply.)

The Einstellung Happens

I wanted the new implementation to allow the specification of lazy parametres in lambdas as well, since that’s what’s really going on. My worry was that if the laziness got implemented at the define-level, then it might not “carry down” so lambdas could also have lazy parametres. And, having realized that, I ended up pondering just where in the system the laziness would enter. After all, a define is just a declarative variable binding, and a lambda is a function definition but not the actual procedure that eventually gets evaluated — it’s the recipe, but not the meal itself, and since laziness is an evaluation mode, it seemed obvious that the appropriate level for the laziness was at the procedure level.

So I made some very odd changes (I have since overwritten and don’t have a copy of) to the make-procedure function along with a few changes to the way apply handled compound procedures, as well as supplying a set of basic syntax handlers for parametres. Then, thinking I had completed the exercise, I fired up the interpreter for a test run.

;;; L-Eval input: 
(define (sq (b lazy)) (* b b)) 

;;; L-Eval value: 
ok 

;;; L-Eval input: 
(sq (+ 2 2)) 
;Unbound variable b

Apparently not. But where was I wrong? I was obviously wrong about something, since the implementation didn’t work. I tried to backtrack: was there some other place in the system where the laziness could be more appropriately handled than in the procedures themselves? I looked through my code, and no, it made sense that the procedures would be the points of laziness. And my function for handling mixed argument lists looked okay as well. Since the procedure had access to its own parameter list, the information about lazy specification hadn’t just disappeared from the system through the void. So I looked through the list of sub-systems mentioned in the exercise text again, and those were the same ones as I had addressed. Where had I gone wrong?

I spent half an hour pondering the problem, but making no headway: I just walked around in circles, and the only way I managed to envision an alternative to the procedure-focused implementation seemed like they would result in being unable to use the lambda-style function declarations, which wasn’t desirable. I continued looking through the exercise text, and the changes I had implemented, and simply grew more and more confused.

Realizing that I was stuck and suffering from Einstellung, I decided to turn my attention to something else in an attempt to clear my mind and hopefully be able to attack the exercise from a different angle afterwards and fired up a video game. Then, after half an hour, I returned to the exercise and found that… the same obviously wrong solution was still the only one I managed to think of.

At that point I was so very confused and curious that I gave in and looked up a solution online. And to my surprise some of those solutions had made changes to parts of the interpreter not mentioned in the exercise text, e.g. to make-frame. And other solutions again mostly made changes to apply and the mixed argument list handling, meaning they didn’t do anything in particular with how functions, lambdas or procedures were handled apart from during the apply call. But none of those varied angles even occurred to me when I sat there struggling with my Einstellung: the one solution path I had found blocked out all the others. After all, I had a solution approach in mind already, and the steps leading towards it were still firing brightly in my mind.

Analysis

With some days’ distance to the Einstellung now, meaning it has diedin the meantime, and having had some time to ponder the various solutions I found online, I have a few interesting observations: For one, I was fairly close in my solution attempt. The appropriate level is indeed at the procedure level, and the mixed argument list was also a good find. The biggest error with my solution approach was that I forgot to clean up the formal parametres of the procedure so that the variables could be bound in its environment. E.g. “unbound variable b” probably also came with an implicit “but bound variable (b lazy)”. Or, at least, that was the big piece missing. The changes to make-procedure were either superfluous or they messed up things further.

In retrospect it’s obvious that my attempt to get unstuck from the Einstellung was hilariously ineffective. The video game did not require my full attention, so I continued consciously thinking about the programming exercise, which surely contributed to the continued stuckitude. Furthermore I stayed in the same room, which probably also contributed. After all, haven’t you too walked into a room and forgotten why you went there to begin with? Change in environment clears the mind.

However, I now know decidedly what Einstellung feels like, and can hopefully both recognize and handle it better when it happens again. Maybe a heuristics set to work through in an attempt to quickly disintegrate the Einstellung would be a worthwhile endeavour? It could certainly save time, particularly if it saved on context switches. Making effective use of time while letting something else lie vacant to let it tumble around in the subconscious requires focusing on something different, after all. (Perhaps, with the Einstellung making one solution track so pronounced, it would be useful to poke at the premises of the solution track, e.g.: “sure it’s obvious that the laziness has to enter the system at the procedure level, but how might an alternate stupid implementation look like where the procedures are left out of it?”)

I guess I’ll find out the next time I experience Einstellung (and realize it!). If the heuristics run-down fails to push me past it, though, then the best approach might be to sleep on it. In retrospect, I wish that’s what I had done rather than look up the solution so early — but the Einstellung frustrated me and also made me immensely curious as to the structure of the real solution(s), since I myself failed to see any other possible approaches while under its effects.