The most compelling reason to not do this is that (I claim) it’s not super obvio...

zwegner · on March 6, 2019

While I agree with you, I will note that even set union in Python is not commutative. a | b should equal b | a in the sense of __eq__, but the actual objects in the result set depend on the order of the arguments (and in the opposite way from dict + dict). This happens with objects that are distinct but compare/hash equally (x is not y and x == y). Whether that actually matters for any useful program is another story...

Dumb program to illustrate this point:

    class Dummy:
        def __init__(self, value):  self.value = value
        def __repr__(self):         return 'Dummy(%s)' % self.value
        def __hash__(self):         return 0
        def __eq__(self, other):    return True

    a = {Dummy(0)}
    b = {Dummy(1)}
    print(a | b)
    print(b | a)
    print(a | b == b | a)

endgame · on March 6, 2019

Unfortunately, even in Haskell Data.Map.Map's monoid instance is left-biased. There is the monoidal-containers package which newtype-wraps Data.Map.Map to have instance Monoid m => Monoid (MonoidalMap k m), which I think is much more sensible.

dan-robertson · on March 6, 2019

I wasn’t even sure that Haskell had a Monoid instance for Data.Map, though I knew it wasn’t the interface I would naturally expect. I agree that the interface for MonoidalMap is more natural.

sametmax · on March 6, 2019

Besides, anytime somebody compares Python to Haskell, the battle is over. They have completely different use cases and philosophies. If you want something in Haskell, you probably want the opposite in Python.

mjburgess · on March 6, 2019

It's not clear what you're saying here.

The comparison was to say, "this decision is difficult everywhere" -- which lang seems beside the point.

zimablue · on March 6, 2019

Great post. Sets have nice properties that dictionaries don't have. Making them act similarly seems like a trap.

kbd · on March 6, 2019

> it’s not super obvious what to do when the keys are equal

    d1 | d2 | d3 | ...

is equivalent to:

    {**d1, **d2, **d3, ...}

dan-robertson · on March 6, 2019

Now read the above but instead of “it’s not super obvious what

  d1 | d2

should be because losing information/desirable properties/weird errors”, read “it’s not super obvious what

  {**d1, **d2}

should be because losing information/desirable properties/weird errors”.

Except I guess one could throw in something about TOOWTDI too.

kbd · on March 6, 2019

I actually think it is obvious what a dictionary merge should do (overwrite keys on the left with keys on the right), but this is beside the point because it's already been determined for

    {**d1, **d2}

In other words, there are no new semantics to discuss here. I'm just saying the two syntaxes should be equivalent.
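
To make the existing semantics concrete, here is a minimal sketch (the dicts d1 and d2 are made-up example data, and it assumes a Python version where ** unpacking in dict literals is available, i.e. 3.5+). It only shows that the right-hand dict wins on a key collision, so swapping the operands changes the result:

```python
# Hypothetical example data: both dicts share the key 'a'.
d1 = {'a': 1, 'b': 2}
d2 = {'a': 10, 'c': 3}

# Later unpackings overwrite earlier ones, so the rightmost dict wins.
left_then_right = {**d1, **d2}
right_then_left = {**d2, **d1}

print(left_then_right['a'])  # value taken from d2
print(right_then_left['a'])  # value taken from d1
```

Whatever spelling is chosen for the merge, this order dependence comes along with it.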

bocklund · on March 6, 2019

> For the first two choices one loses commutativity which means that code then suddenly has to have previously cared about it (or it will do the wrong thing)

Since this is a new operator, that shouldn’t be an issue.

I think losing commutativity is okay. After all, d1.update(d2) and d2.update(d1) already produce different results if keys conflict.

dan-robertson · on March 6, 2019

What you have written doesn’t look at all symmetrical but d1 | d2 looks very symmetrical. Operators being symmetrical around a vertical axis tends to imply being commutative (although there are many exceptions e.g. a divide symbol (but note fractions aren’t symmetrical) or a minus sign or using ^ for exponentiation (but superscripting is not symmetrical) or matrix multiplication (but maybe one could argue this is an abbreviation of function application))

Secondly, I claim that the issue with using | is that it is not a new operator. It is a new, incompatible meaning for an old operator. Old code might not bother checking that its arg is a set because if it weren’t a set then | or in would fail. New programmers might see dicts as being basically sets and wrongly assume functions for sets would correctly work on dicts.

dstola · on March 6, 2019

In case the keys match you could supply a collision callback to define what to do, e.g. to add the values:

  d1 = {'a': 1}
  d2 = {'a': 2}

  d3 = {**d1, **d2, add_func}

  def add_func(a, b):
      return a+b

Or something along those lines
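
Since the literal syntax above is hypothetical, the same idea can be sketched today as a plain function. This is only an illustration: merge_with and on_collision are made-up names, not anything from the standard library or the PEP.

```python
def merge_with(on_collision, *dicts):
    """Merge dicts left to right; on_collision(old, new) resolves duplicate keys.

    merge_with and on_collision are hypothetical names used for illustration.
    """
    result = {}
    for d in dicts:
        for key, value in d.items():
            if key in result:
                # Key already seen: let the caller decide how to combine.
                result[key] = on_collision(result[key], value)
            else:
                result[key] = value
    return result


def add_func(a, b):
    return a + b


d1 = {'a': 1}
d2 = {'a': 2, 'b': 5}
d3 = merge_with(add_func, d1, d2)
print(d3)  # {'a': 3, 'b': 5}
```

Passing the callback explicitly keeps the collision policy visible at the call site instead of baking one choice into an operator.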

rbanffy · on March 6, 2019

Why not raise a ValueError and let the programmer figure out what The Right Thing To Do is when you add two dicts that have the same key with a different value?

I assume the same key with the same value would be OK, but I'm not really sure it's a good idea for it to be OK.

zimablue · on March 6, 2019

You can't do value comparison without making dict item comparison a passed-in function or making dict values immutable. If you're doing something that really looks like a mathematical union but will raise if there's any overlap, then it's a really confusing abuse of notation. I don't think there's a way out.

dan-robertson · on March 6, 2019

That is one thing you could do to merge dicts. To expand on my last paragraph above, I think I would imagine the following operations (stupid syntax):

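  # inner join: only keys present in both
  a & b = { k: (a[k], b[k]) for k in a.keys() & b.keys() }
  # outer join: keys present in either
  a | b = { k: (a.get(k, None), b.get(k, None)) for k in a.keys() | b.keys() }
  # right join: every key of b
  a |& b = { k: (a.get(k, None), v) for k, v in b.items() }
  # left join: every key of a
  a &| b = { k: (v, b.get(k, None)) for k, v in a.items() }
  # disjoint union: an error if a key is in both
  a |_| b = { k: only(a, b, k) for k in a.keys() | b.keys() }
  def only(a, b, k):
    if k in a and k in b:
      raise DuplicateKey(a, b, k)  # DuplicateKey: a hypothetical exception type
    elif k not in a and k not in b:
      assert False
    elif k in a:
      return a[k]
    else:
      return b[k]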

This doesn’t work well if values can be None, so maybe instead of pairs there should be objects Left(x), Right(y), and Both(x,y).
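
A rough sketch of that last variant, assuming Python 3.7+ for dataclasses. The classes Left, Right and Both and the function outer_merge are hypothetical names for illustration only:

```python
from dataclasses import dataclass
from typing import Any


# Hypothetical tag types recording which side(s) a key came from.
@dataclass(frozen=True)
class Left:
    value: Any


@dataclass(frozen=True)
class Right:
    value: Any


@dataclass(frozen=True)
class Both:
    left: Any
    right: Any


def outer_merge(a, b):
    """Outer join of two dicts; a stored None stays distinguishable from 'missing'."""
    result = {}
    for k in a.keys() | b.keys():
        if k in a and k in b:
            result[k] = Both(a[k], b[k])
        elif k in a:
            result[k] = Left(a[k])
        else:
            result[k] = Right(b[k])
    return result


merged = outer_merge({'x': None, 'y': 1}, {'y': 2, 'z': 3})
```

Here merged['x'] is Left(None), which a (None, None) pair could not distinguish from a key that is absent on both sides.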

mturmon · on March 6, 2019

That syntax doesn't make sense. The

  {**d1, **d2}

idiom is just a clever mashup of Python's dictionary construction literal {} and ** unpacking. (The restriction to string keys applies to keyword-argument unpacking in calls such as dict(d1, **d2), not to unpacking inside a {} literal.)

Adding a third item to the dictionary literal would require special-casing the {} dictionary construction literal.

jessaustin · on March 6, 2019

  >>> { 'a' : 1 } | { 'a' : 2 }

ISTM the most logical result would be:

  { 'a' : { 1, 2 } }

...but I could certainly understand throwing an exception.

xiao_haozi · on March 6, 2019

While I see your point, I don't think this makes sense historically. Dictionaries never supported such behavior before so you'd be introducing a new core concept to a dictionary. But moreover, you'd be changing the type of the value only on duplicated keys, and what about if you were to add another value of 2 to a? Are you making this a set, and why? I think it would come with too many caveats and assumptions in the PEP.

I'm not saying you have a bad idea/logic here, just that I'm not sure it's the best thing for the dict.

dan-robertson · on March 6, 2019

Note that this forgets the order of the arguments, which may not be desirable

kqr · on March 6, 2019

If the property we want to achieve is "a | b == b | a" we necessarily have to forget the order of the arguments.
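
A small sketch of that trade-off, using a hypothetical helper merge_to_sets. Unlike the result suggested upthread, it wraps every value in a set (not just colliding ones) so the value type stays uniform, and it assumes the values are hashable:

```python
def merge_to_sets(a, b):
    """Collect every value seen for a key into a set (values must be hashable).

    merge_to_sets is a hypothetical helper, not a proposed dict method.
    """
    return {k: {d[k] for d in (a, b) if k in d} for k in a.keys() | b.keys()}


x = {'a': 1, 'b': 7}
y = {'a': 2}

# Sets are unordered, so the result cannot record which argument came first.
print(merge_to_sets(x, y) == merge_to_sets(y, x))  # True
```

The merge commutes precisely because the set throws away which operand each value came from.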