hash(-1) == hash(-2) in Compilers for Quantum
Maybe we should get the classical computers to make sense before the quantum computers…
This post is about how I (quite painfully) suffered the consequences of the CPython hash(-1) == hash(-2) behavior.
If you know about this behavior already, I sincerely hope you’ve never encountered this in the wild and that this serves as a fun read on a consequence of it.
If you do NOT know about this behavior then let this post be a warning (and still be a fun read)!
Context
Note
While I bring up a couple quantum computing tools in this post this is primarily geared towards the compiler engineering side of things. Therefore it's safe to think of things like "Stim" as just a tool with its own DSL that I try to correctly produce from the DSL that lives inside the SDK I work on.
If you are well-versed in the quantum computing AND compiler space (quite a niche, but a growing one at that!) then I hope you will find even more entertainment from this post (:
One of the things I’ve been working on at QuEra is transpilation from our SQUIN (Structural Quantum Information) eDSL (embedded Domain Specific Language) to our own Stim eDSL, all of which lives inside the Bloqade SDK1. From there we can easily generate the Stim text format that Stim itself can consume.
To be more precise, the workflow I target is the following:
- The user writes a program in
squin - They apply a compiler pass (
SquinToStimPass) to convert the program to thestimeDSL - They can now emit the text form of the
stimeDSL program
You can see a small example of this workflow below, starting with the squin program:
# allocate 10 qubits
=
# apply H gate on all qubits
# apply some 1 and 2Q gates
# measure everything out
The eDSLs I mentioned as well as the ability to spit out files of other forms are all built on top of Kirin, which handles lowering from Python to SSA IR form as well as common compiler optimization passes. We can see what the SSA form looks like immediately by just invoking the .print() method all kernels have:
func.func @example() -> !py.NoneType {
^0(%example_self):
│ %0 = py.constant.constant 10 : !py.int
│ %qs = func.invoke qalloc(%0) : !py.IList[!py.Qubit, !Any] maybe_pure=False
│ %1 = func.invoke h(%qs) : !py.NoneType maybe_pure=False
│ %2 = py.constant.constant 0 : !py.int
│ %3 = py.indexing.getitem(%qs, %2) : !py.Qubit
│ %4 = func.invoke z(%3) : !py.NoneType maybe_pure=False
│ %5 = py.constant.constant 5 : !py.int
│ %6 = py.indexing.getitem(%qs, %5) : !py.Qubit
│ %7 = func.invoke y(%6) : !py.NoneType maybe_pure=False
│ %8 = py.constant.constant 0 : !py.int
│ %9 = py.indexing.getitem(%qs, %8) : !py.Qubit
│ %10 = py.constant.constant 1 : !py.int
│ %11 = py.indexing.getitem(%qs, %10) : !py.Qubit
│ %12 = py.ilist.new(values=(%9, %11)){elem_type=!Any} : !py.IList[!py.Qubit, Literal(2,int)]
│ %13 = py.constant.constant 3 : !py.int
│ %14 = py.indexing.getitem(%qs, %13) : !py.Qubit
│ %15 = py.constant.constant 5 : !py.int
│ %16 = py.indexing.getitem(%qs, %15) : !py.Qubit
│ %17 = py.ilist.new(values=(%14, %16)){elem_type=!Any} : !py.IList[!py.Qubit, Literal(2,int)]
│ %18 = func.invoke cx(%12, %17) : !py.NoneType maybe_pure=False
│ %19 = func.invoke measure(%qs) : !py.IList[!py.MeasurementResult, ~N : !py.int] maybe_pure=False
│ %20 = func.const.none() : !py.NoneType
│ func.return %20
} // func.func example
If I now apply the SquinToStimPass we get the program in our stim eDSL form:
func.func @example() -> !py.NoneType {
^0(%example_self):
│ %0 = stim.aux.constant.int 0 : !py.int
│ %1 = stim.aux.constant.int 1 : !py.int
│ %2 = stim.aux.constant.int 2 : !py.int
│ %3 = stim.aux.constant.int 3 : !py.int
│ %4 = stim.aux.constant.int 4 : !py.int
│ %5 = stim.aux.constant.int 5 : !py.int
│ %6 = stim.aux.constant.int 6 : !py.int
│ %7 = stim.aux.constant.int 7 : !py.int
│ %8 = stim.aux.constant.int 8 : !py.int
│ %9 = stim.aux.constant.int 9 : !py.int
│ stim.gate.H(targets=(%0, %1, %2, %3, %4, %5, %6, %7, %8, %9)){dagger=False : !py.bool}
│ stim.gate.Z(targets=(%0)){dagger=False : !py.bool}
│ stim.gate.Y(targets=(%5)){dagger=False : !py.bool}
│ stim.gate.CX(targets=(%3, %5), controls=(%0, %1)){dagger=False : !py.bool}
│ %10 = stim.aux.constant.float 0.0 : !py.float
│ stim.collapse.MZ(p=%10, targets=(%0, %1, %2, %3, %4, %5, %6, %7, %8, %9))
│ %11 = func.const.none() : !py.NoneType
│ func.return %11
} // func.func example
Which you can then emit to text form:
=
=
return
H 0 1 2 3 4 5 6 7 8 9
Z 0
Y 5
CX 0 3 1 5
MZ(0.00000000) 0 1 2 3 4 5 6 7 8 9
The Problem
At some point as I was developing the SquinToStimPass I ran into a rather peculiar problem.
Given a program that had -1 and -2 as Stim eDSL constant statements, when I ran the CSE (Common Subexpression Elimination) rewrite rule, -1 would disappear only to be replaced by -2 !
I’d have something in the original SQUIN language2 like:
=
=
# get the last two measurements which, in time, are the
# most recent so per Stim convention should be accessible
# via the record index through -1 and -2
Which SHOULD have come out to the following text form from the Stim eDSL:
MZ(0.00000000) 0 1 2 3 4 5 6 7 8 9
DETECTOR(0, 0) rec[-1] rec[-2]
But instead I’d get
MZ(0.00000000) 0 1 2 3 4 5 6 7 8 9
DETECTOR(0, 0) rec[-2] rec[-2]
When I looked at the actual Stim eDSL output there was no -1 constant statement to be seen either.
I was bewildered! Never in my career had I ever seen something so odd. At first I was convinced some rewrite rule I had created as part of the pass was the culprit but it eventually dawned on me that something at the Kirin level must be the culprit.
The Bug
It wasn’t immediately apparent to me CSE was the problem but in retrospect, if I was even remotely aware Python (CPython to be exact) had this odd behavior3 I might have clocked it faster.
Acknowledgements
Before I go further I’d like to give kudos to my colleagues Kai-Hsin Wu and Xiu-Zhe (Roger) Luo who also implemented the additional logic in Kirin to avoid future incidences of this.
At a high level, the Kirin CSE rewrite rule at the time of the bug (Kirin v0.18.0) did the following:
- Generate an
intfrom Python’s built-inhashfunction applied on components of the statement including its type, arguments, attributes (which are compile-time values, like constants that are immediately available), etc. - Check if the
intis present in a “seen” dictionary that records all statements that have been encountered up to that point. - If the
intis present inseen, replace references to the current statement to the previous equivalent statement.
The actual code we'll be focusing on from steps 1 and 2 looks like this:
:
=
...
With the stim constant statements in the program I generated, the only part of those statements that was actually different between the two was the stmts.attributes which would contain the original -1 and -2 Python integers.
When we started digging deeper we ran into this peculiar behavior:
>>>
1
>>>
0
>>> # okay, nothing fishy with hashing positive integers...
>>>
-2
>>> # wait, what?
>>>
-2
>>> ==
True
>>> # alright I give up.
Thus when you invoke CSE on something like:
Simplified SSA Format Ahead!
The SSA form shown here is simplified from what you would actually see, there are some intermediate statements I've cut out to keep things concise here.
If you're really curious, there's a stim.aux.get_rec statement that accepts the negative integer first, and THEN feeds into a detector or observable statement.
...
%0 = stim.aux.constant.int -1
%1 = stim.aux.constant.int -2
%2 = stim.aux.detector(record_indices = (%0, %1))
...
You end up getting:
...
%0 = stim.aux.constant.int -1
%1 = stim.aux.constant.int -2
# notice reference to %0 dropped in favor of %1!
%2 = stim.aux.detector(record_indices = (%1, %1))
...
And when DCE (Dead Code Elimination) gets run, -1 just vanishes altogether:
# bye-bye -1 ):
%1 = stim.aux.constant.int -2
%2 = stim.aux.detector(record_indices = (%1, %1))
Origins
Some googling about this odd behavior led me to Omair Majid’s blog post which dives into the CPython source code to investigate. Omair points out that -1 is specifically avoided as a valid returnable hash value in the implementation because it’s used in C to indicate something went wrong.
If you’re like me and wondering that there has to be a better way, there's a cited Reddit answer that points out C does not have the ability to throw exceptions.
If you’re still like me and think there has to be some way around it, there is! But I don’t think this would be wise to have in the CPython codebase (at least, not in this context)…
The Fix
I had to do some digging on the Kirin GitHub PRs and Slack but the initial fix for this problem was made by Roger and it was just to add some additional logic to the statement attribute __hash__. The additional logic literally being:
# in src/kirin/ir/attrs/py.py
return
# default attribute hash behavior continued if not int
return +
This wasn’t the complete solution though and if we jump to Kirin v0.18.14 (the patch version immediately after the bug was found) the final fix was made a bit more robust (albeit with more overhead for me to wrap my head around).
First, the attribute __hash__ returns to this definition:
# in src/kirin/ir/attrs/py.py
return
BUT, on the CSE side:
- Instead of generating an
intas a key into theseendictionary there’s now anInfoclass that consumes pretty much what you saw before to instantiate it and is used as a key instead. Infohas its own__hash__definition that uses the Python build-inidon individual elements that would have just beenhash’ed before. This is a bump up from the old implementation wherehashwas just called directly on things belonging to the statement.
The logic looks something like this:
# in src/kirin/rewrite/cse.py, as part of the `Info` class
return
If you think carefully though this still runs afoul of the identical hash problem that caused this bug in the first place! Indeed, if I take the program from the Problem section but run it through CSE after rewriting it:
=
=
# SquinToStimPass already has CSE applied inside of it,
# but I'm doing it again outside for clarity.
(With some print statements inside CSE rewrite rule for debugging), the Info-generated hash ends up being exactly the same for the -1 and -2 constant statements.
How can it be the case that CSE behaves correctly this time though?
Info has one additional magic/”dunder” method that’s been explicitly implemented, which is __eq__. When two Python dictionary keys with identical hashes are encountered there’s a fallback to __eq__5. __eq__ in Info explicitly looks at the equality between the same elements that went into producing the custom hash.
Here it should be clear that False gets spit out between the -1 and -2 constant statements and CSE doesn’t touch anything.
The __eq__ implementation looks like this:
return False
return
Conclusion
If you’re as boggled as I am that hash(-1) == hash(-2) has just been a thing in Python, welcome to the club 😅
While the process of writing this post was a lot of fun6 I’m certainly glad this is no longer an issue. I do admit feeling a bit of nostalgia seeing how much Kirin has grown and once again, I’m grateful to work on a team where things like this can be quickly hammered out.
In the grander scheme of things this was also a humbling reminder to me that while current Quantum Computers are noisy and prone to error, even in the domain of classical computation there’s still plenty of problems to be found (:
The Bloqade SDK is really a namespace package. The eDSLs I mention here all belong in the bloqade-circuit package that gets pulled in.
The program I cooked up here should be compatible with the latest version of bloqade-circuit (v0.13.1). The syntax here is actually slightly different from what I remember working with because back then tuples with constants (ex: (1,5)) were the way to go for coordinates. Now lists are preferred, especially considering they offer more flexibility for construction/manipulation.
I actually wonder if other Python implementations have this behavior as well but I haven’t confirmed it.
The latest version of Kirin (v0.22.8 as of the time of writing this) has even more logic in CSE but the core remains pretty much the same.
Hynek Schlawack’s and Roy Williams’ posts on the topic were especially helpful in understanding this behavior!
Ha! You thought I was going to refute that in this footnote, but no (: It was however, also a decent amount of work. I wasn’t the most familiar with the __eq__ fallback behavior before writing this and caught onto it with the two blog posts above as well as help from an embedded print(f"eq invoked on {obj} against {self}") statement.